NASA Astrophysics Data System (ADS)
Abbasi, Ashkan; Monadjemi, Amirhassan; Fang, Leyuan; Rabbani, Hossein
2018-03-01
We present a nonlocal weighted sparse representation (NWSR) method for reconstruction of retinal optical coherence tomography (OCT) images. To reconstruct a high signal-to-noise ratio and high-resolution OCT images, utilization of efficient denoising and interpolation algorithms are necessary, especially when the original data were subsampled during acquisition. However, the OCT images suffer from the presence of a high level of noise, which makes the estimation of sparse representations a difficult task. Thus, the proposed NWSR method merges sparse representations of multiple similar noisy and denoised patches to better estimate a sparse representation for each patch. First, the sparse representation of each patch is independently computed over an overcomplete dictionary, and then a nonlocal weighted sparse coefficient is computed by averaging representations of similar patches. Since the sparsity can reveal relevant information from noisy patches, combining noisy and denoised patches' representations is beneficial to obtain a more robust estimate of the unknown sparse representation. The denoised patches are obtained by applying an off-the-shelf image denoising method and our method provides an efficient way to exploit information from noisy and denoised patches' representations. The experimental results on denoising and interpolation of spectral domain OCT images demonstrated the effectiveness of the proposed NWSR method over existing state-of-the-art methods.
Joint image restoration and location in visual navigation system
NASA Astrophysics Data System (ADS)
Wu, Yuefeng; Sang, Nong; Lin, Wei; Shao, Yuanjie
2018-02-01
Image location methods are the key technologies of visual navigation, most previous image location methods simply assume the ideal inputs without taking into account the real-world degradations (e.g. low resolution and blur). In view of such degradations, the conventional image location methods first perform image restoration and then match the restored image on the reference image. However, the defective output of the image restoration can affect the result of localization, by dealing with the restoration and location separately. In this paper, we present a joint image restoration and location (JRL) method, which utilizes the sparse representation prior to handle the challenging problem of low-quality image location. The sparse representation prior states that the degraded input image, if correctly restored, will have a good sparse representation in terms of the dictionary constructed from the reference image. By iteratively solving the image restoration in pursuit of the sparest representation, our method can achieve simultaneous restoration and location. Based on such a sparse representation prior, we demonstrate that the image restoration task and the location task can benefit greatly from each other. Extensive experiments on real scene images with Gaussian blur are carried out and our joint model outperforms the conventional methods of treating the two tasks independently.
Video based object representation and classification using multiple covariance matrices.
Zhang, Yurong; Liu, Quan
2017-01-01
Video based object recognition and classification has been widely studied in computer vision and image processing area. One main issue of this task is to develop an effective representation for video. This problem can generally be formulated as image set representation. In this paper, we present a new method called Multiple Covariance Discriminative Learning (MCDL) for image set representation and classification problem. The core idea of MCDL is to represent an image set using multiple covariance matrices with each covariance matrix representing one cluster of images. Firstly, we use the Nonnegative Matrix Factorization (NMF) method to do image clustering within each image set, and then adopt Covariance Discriminative Learning on each cluster (subset) of images. At last, we adopt KLDA and nearest neighborhood classification method for image set classification. Promising experimental results on several datasets show the effectiveness of our MCDL method.
Image fusion based on Bandelet and sparse representation
NASA Astrophysics Data System (ADS)
Zhang, Jiuxing; Zhang, Wei; Li, Xuzhi
2018-04-01
Bandelet transform could acquire geometric regular direction and geometric flow, sparse representation could represent signals with as little as possible atoms on over-complete dictionary, both of which could be used to image fusion. Therefore, a new fusion method is proposed based on Bandelet and Sparse Representation, to fuse Bandelet coefficients of multi-source images and obtain high quality fusion effects. The test are performed on remote sensing images and simulated multi-focus images, experimental results show that the performance of new method is better than tested methods according to objective evaluation indexes and subjective visual effects.
Medical Image Fusion Based on Feature Extraction and Sparse Representation
Wei, Gao; Zongxi, Song
2017-01-01
As a novel multiscale geometric analysis tool, sparse representation has shown many advantages over the conventional image representation methods. However, the standard sparse representation does not take intrinsic structure and its time complexity into consideration. In this paper, a new fusion mechanism for multimodal medical images based on sparse representation and decision map is proposed to deal with these problems simultaneously. Three decision maps are designed including structure information map (SM) and energy information map (EM) as well as structure and energy map (SEM) to make the results reserve more energy and edge information. SM contains the local structure feature captured by the Laplacian of a Gaussian (LOG) and EM contains the energy and energy distribution feature detected by the mean square deviation. The decision map is added to the normal sparse representation based method to improve the speed of the algorithm. Proposed approach also improves the quality of the fused results by enhancing the contrast and reserving more structure and energy information from the source images. The experiment results of 36 groups of CT/MR, MR-T1/MR-T2, and CT/PET images demonstrate that the method based on SR and SEM outperforms five state-of-the-art methods. PMID:28321246
Segmentation of medical images using explicit anatomical knowledge
NASA Astrophysics Data System (ADS)
Wilson, Laurie S.; Brown, Stephen; Brown, Matthew S.; Young, Jeanne; Li, Rongxin; Luo, Suhuai; Brandt, Lee
1999-07-01
Knowledge-based image segmentation is defined in terms of the separation of image analysis procedures and representation of knowledge. Such architecture is particularly suitable for medical image segmentation, because of the large amount of structured domain knowledge. A general methodology for the application of knowledge-based methods to medical image segmentation is described. This includes frames for knowledge representation, fuzzy logic for anatomical variations, and a strategy for determining the order of segmentation from the modal specification. This method has been applied to three separate problems, 3D thoracic CT, chest X-rays and CT angiography. The application of the same methodology to such a range of applications suggests a major role in medical imaging for segmentation methods incorporating representation of anatomical knowledge.
NASA Astrophysics Data System (ADS)
Hu, Ruiguang; Xiao, Liping; Zheng, Wenjuan
2015-12-01
In this paper, multi-kernel learning(MKL) is used for drug-related webpages classification. First, body text and image-label text are extracted through HTML parsing, and valid images are chosen by the FOCARSS algorithm. Second, text based BOW model is used to generate text representation, and image-based BOW model is used to generate images representation. Last, text and images representation are fused with a few methods. Experimental results demonstrate that the classification accuracy of MKL is higher than those of all other fusion methods in decision level and feature level, and much higher than the accuracy of single-modal classification.
Weighted Discriminative Dictionary Learning based on Low-rank Representation
NASA Astrophysics Data System (ADS)
Chang, Heyou; Zheng, Hao
2017-01-01
Low-rank representation has been widely used in the field of pattern classification, especially when both training and testing images are corrupted with large noise. Dictionary plays an important role in low-rank representation. With respect to the semantic dictionary, the optimal representation matrix should be block-diagonal. However, traditional low-rank representation based dictionary learning methods cannot effectively exploit the discriminative information between data and dictionary. To address this problem, this paper proposed weighted discriminative dictionary learning based on low-rank representation, where a weighted representation regularization term is constructed. The regularization associates label information of both training samples and dictionary atoms, and encourages to generate a discriminative representation with class-wise block-diagonal structure, which can further improve the classification performance where both training and testing images are corrupted with large noise. Experimental results demonstrate advantages of the proposed method over the state-of-the-art methods.
Coupled dictionary learning for joint MR image restoration and segmentation
NASA Astrophysics Data System (ADS)
Yang, Xuesong; Fan, Yong
2018-03-01
To achieve better segmentation of MR images, image restoration is typically used as a preprocessing step, especially for low-quality MR images. Recent studies have demonstrated that dictionary learning methods could achieve promising performance for both image restoration and image segmentation. These methods typically learn paired dictionaries of image patches from different sources and use a common sparse representation to characterize paired image patches, such as low-quality image patches and their corresponding high quality counterparts for the image restoration, and image patches and their corresponding segmentation labels for the image segmentation. Since learning these dictionaries jointly in a unified framework may improve the image restoration and segmentation simultaneously, we propose a coupled dictionary learning method to concurrently learn dictionaries for joint image restoration and image segmentation based on sparse representations in a multi-atlas image segmentation framework. Particularly, three dictionaries, including a dictionary of low quality image patches, a dictionary of high quality image patches, and a dictionary of segmentation label patches, are learned in a unified framework so that the learned dictionaries of image restoration and segmentation can benefit each other. Our method has been evaluated for segmenting the hippocampus in MR T1 images collected with scanners of different magnetic field strengths. The experimental results have demonstrated that our method achieved better image restoration and segmentation performance than state of the art dictionary learning and sparse representation based image restoration and image segmentation methods.
NASA Astrophysics Data System (ADS)
Arevalo, John; Cruz-Roa, Angel; González, Fabio A.
2013-11-01
This paper presents a novel method for basal-cell carcinoma detection, which combines state-of-the-art methods for unsupervised feature learning (UFL) and bag of features (BOF) representation. BOF, which is a form of representation learning, has shown a good performance in automatic histopathology image classi cation. In BOF, patches are usually represented using descriptors such as SIFT and DCT. We propose to use UFL to learn the patch representation itself. This is accomplished by applying a topographic UFL method (T-RICA), which automatically learns visual invariance properties of color, scale and rotation from an image collection. These learned features also reveals these visual properties associated to cancerous and healthy tissues and improves carcinoma detection results by 7% with respect to traditional autoencoders, and 6% with respect to standard DCT representations obtaining in average 92% in terms of F-score and 93% of balanced accuracy.
Face recognition via sparse representation of SIFT feature on hexagonal-sampling image
NASA Astrophysics Data System (ADS)
Zhang, Daming; Zhang, Xueyong; Li, Lu; Liu, Huayong
2018-04-01
This paper investigates a face recognition approach based on Scale Invariant Feature Transform (SIFT) feature and sparse representation. The approach takes advantage of SIFT which is local feature other than holistic feature in classical Sparse Representation based Classification (SRC) algorithm and possesses strong robustness to expression, pose and illumination variations. Since hexagonal image has more inherit merits than square image to make recognition process more efficient, we extract SIFT keypoint in hexagonal-sampling image. Instead of matching SIFT feature, firstly the sparse representation of each SIFT keypoint is given according the constructed dictionary; secondly these sparse vectors are quantized according dictionary; finally each face image is represented by a histogram and these so-called Bag-of-Words vectors are classified by SVM. Due to use of local feature, the proposed method achieves better result even when the number of training sample is small. In the experiments, the proposed method gave higher face recognition rather than other methods in ORL and Yale B face databases; also, the effectiveness of the hexagonal-sampling in the proposed method is verified.
Finger vein verification system based on sparse representation.
Xin, Yang; Liu, Zhi; Zhang, Haixia; Zhang, Hong
2012-09-01
Finger vein verification is a promising biometric pattern for personal identification in terms of security and convenience. The recognition performance of this technology heavily relies on the quality of finger vein images and on the recognition algorithm. To achieve efficient recognition performance, a special finger vein imaging device is developed, and a finger vein recognition method based on sparse representation is proposed. The motivation for the proposed method is that finger vein images exhibit a sparse property. In the proposed system, the regions of interest (ROIs) in the finger vein images are segmented and enhanced. Sparse representation and sparsity preserving projection on ROIs are performed to obtain the features. Finally, the features are measured for recognition. An equal error rate of 0.017% was achieved based on the finger vein image database, which contains images that were captured by using the near-IR imaging device that was developed in this study. The experimental results demonstrate that the proposed method is faster and more robust than previous methods.
Target recognition for ladar range image using slice image
NASA Astrophysics Data System (ADS)
Xia, Wenze; Han, Shaokun; Wang, Liang
2015-12-01
A shape descriptor and a complete shape-based recognition system using slice images as geometric feature descriptor for ladar range images are introduced. A slice image is a two-dimensional image generated by three-dimensional Hough transform and the corresponding mathematical transformation. The system consists of two processes, the model library construction and recognition. In the model library construction process, a series of range images are obtained after the model object is sampled at preset attitude angles. Then, all the range images are converted into slice images. The number of slice images is reduced by clustering analysis and finding a representation to reduce the size of the model library. In the recognition process, the slice image of the scene is compared with the slice image in the model library. The recognition results depend on the comparison. Simulated ladar range images are used to analyze the recognition and misjudgment rates, and comparison between the slice image representation method and moment invariants representation method is performed. The experimental results show that whether in conditions without noise or with ladar noise, the system has a high recognition rate and low misjudgment rate. The comparison experiment demonstrates that the slice image has better representation ability than moment invariants.
Part-based deep representation for product tagging and search
NASA Astrophysics Data System (ADS)
Chen, Keqing
2017-06-01
Despite previous studies, tagging and indexing the product images remain challenging due to the large inner-class variation of the products. In the traditional methods, the quantized hand-crafted features such as SIFTs are extracted as the representation of the product images, which are not discriminative enough to handle the inner-class variation. For discriminative image representation, this paper firstly presents a novel deep convolutional neural networks (DCNNs) architect true pre-trained on a large-scale general image dataset. Compared to the traditional features, our DCNNs representation is of more discriminative power with fewer dimensions. Moreover, we incorporate the part-based model into the framework to overcome the negative effect of bad alignment and cluttered background and hence the descriptive ability of the deep representation is further enhanced. Finally, we collect and contribute a well-labeled shoe image database, i.e., the TBShoes, on which we apply the part-based deep representation for product image tagging and search, respectively. The experimental results highlight the advantages of the proposed part-based deep representation.
Low-count PET image restoration using sparse representation
NASA Astrophysics Data System (ADS)
Li, Tao; Jiang, Changhui; Gao, Juan; Yang, Yongfeng; Liang, Dong; Liu, Xin; Zheng, Hairong; Hu, Zhanli
2018-04-01
In the field of positron emission tomography (PET), reconstructed images are often blurry and contain noise. These problems are primarily caused by the low resolution of projection data. Solving this problem by improving hardware is an expensive solution, and therefore, we attempted to develop a solution based on optimizing several related algorithms in both the reconstruction and image post-processing domains. As sparse technology is widely used, sparse prediction is increasingly applied to solve this problem. In this paper, we propose a new sparse method to process low-resolution PET images. Two dictionaries (D1 for low-resolution PET images and D2 for high-resolution PET images) are learned from a group real PET image data sets. Among these two dictionaries, D1 is used to obtain a sparse representation for each patch of the input PET image. Then, a high-resolution PET image is generated from this sparse representation using D2. Experimental results indicate that the proposed method exhibits a stable and superior ability to enhance image resolution and recover image details. Quantitatively, this method achieves better performance than traditional methods. This proposed strategy is a new and efficient approach for improving the quality of PET images.
Locality constrained joint dynamic sparse representation for local matching based face recognition.
Wang, Jianzhong; Yi, Yugen; Zhou, Wei; Shi, Yanjiao; Qi, Miao; Zhang, Ming; Zhang, Baoxue; Kong, Jun
2014-01-01
Recently, Sparse Representation-based Classification (SRC) has attracted a lot of attention for its applications to various tasks, especially in biometric techniques such as face recognition. However, factors such as lighting, expression, pose and disguise variations in face images will decrease the performances of SRC and most other face recognition techniques. In order to overcome these limitations, we propose a robust face recognition method named Locality Constrained Joint Dynamic Sparse Representation-based Classification (LCJDSRC) in this paper. In our method, a face image is first partitioned into several smaller sub-images. Then, these sub-images are sparsely represented using the proposed locality constrained joint dynamic sparse representation algorithm. Finally, the representation results for all sub-images are aggregated to obtain the final recognition result. Compared with other algorithms which process each sub-image of a face image independently, the proposed algorithm regards the local matching-based face recognition as a multi-task learning problem. Thus, the latent relationships among the sub-images from the same face image are taken into account. Meanwhile, the locality information of the data is also considered in our algorithm. We evaluate our algorithm by comparing it with other state-of-the-art approaches. Extensive experiments on four benchmark face databases (ORL, Extended YaleB, AR and LFW) demonstrate the effectiveness of LCJDSRC.
Li, Qing; Liang, Steven Y
2018-04-20
Microstructure images of metallic materials play a significant role in industrial applications. To address image degradation problem of metallic materials, a novel image restoration technique based on K-means singular value decomposition (KSVD) and smoothing penalty sparse representation (SPSR) algorithm is proposed in this work, the microstructure images of aluminum alloy 7075 (AA7075) material are used as examples. To begin with, to reflect the detail structure characteristics of the damaged image, the KSVD dictionary is introduced to substitute the traditional sparse transform basis (TSTB) for sparse representation. Then, due to the image restoration, modeling belongs to a highly underdetermined equation, and traditional sparse reconstruction methods may cause instability and obvious artifacts in the reconstructed images, especially reconstructed image with many smooth regions and the noise level is strong, thus the SPSR (here, q = 0.5) algorithm is designed to reconstruct the damaged image. The results of simulation and two practical cases demonstrate that the proposed method has superior performance compared with some state-of-the-art methods in terms of restoration performance factors and visual quality. Meanwhile, the grain size parameters and grain boundaries of microstructure image are discussed before and after they are restored by proposed method.
A Wavelet Polarization Decomposition Net Model for Polarimetric SAR Image Classification
NASA Astrophysics Data System (ADS)
He, Chu; Ou, Dan; Yang, Teng; Wu, Kun; Liao, Mingsheng; Chen, Erxue
2014-11-01
In this paper, a deep model based on wavelet texture has been proposed for Polarimetric Synthetic Aperture Radar (PolSAR) image classification inspired by recent successful deep learning method. Our model is supposed to learn powerful and informative representations to improve the generalization ability for the complex scene classification tasks. Given the influence of speckle noise in Polarimetric SAR image, wavelet polarization decomposition is applied first to obtain basic and discriminative texture features which are then embedded into a Deep Neural Network (DNN) in order to compose multi-layer higher representations. We demonstrate that the model can produce a powerful representation which can capture some untraceable information from Polarimetric SAR images and show a promising achievement in comparison with other traditional SAR image classification methods for the SAR image dataset.
MToS: A Tree of Shapes for Multivariate Images.
Carlinet, Edwin; Géraud, Thierry
2015-12-01
The topographic map of a gray-level image, also called tree of shapes, provides a high-level hierarchical representation of the image contents. This representation, invariant to contrast changes and to contrast inversion, has been proved very useful to achieve many image processing and pattern recognition tasks. Its definition relies on the total ordering of pixel values, so this representation does not exist for color images, or more generally, multivariate images. Common workarounds, such as marginal processing, or imposing a total order on data, are not satisfactory and yield many problems. This paper presents a method to build a tree-based representation of multivariate images, which features marginally the same properties of the gray-level tree of shapes. Briefly put, we do not impose an arbitrary ordering on values, but we only rely on the inclusion relationship between shapes in the image definition domain. The interest of having a contrast invariant and self-dual representation of multivariate image is illustrated through several applications (filtering, segmentation, and object recognition) on different types of data: color natural images, document images, satellite hyperspectral imaging, multimodal medical imaging, and videos.
New instantaneous frequency estimation method based on the use of image processing techniques
NASA Astrophysics Data System (ADS)
Borda, Monica; Nafornita, Ioan; Isar, Alexandru
2003-05-01
The aim of this paper is to present a new method for the estimation of the instantaneous frequency of a frequency modulated signal, corrupted by additive noise. This method represents an example of fusion of two theories: the time-frequency representations and the mathematical morphology. Any time-frequency representation of a useful signal is concentrated around its instantaneous frequency law and realizes the diffusion of the noise that perturbs the useful signal in the time - frequency plane. In this paper a new time-frequency representation, useful for the estimation of the instantaneous frequency, is proposed. This time-frequency representation is the product of two others time-frequency representations: the Wigner - Ville time-frequency representation and a new one obtained by filtering with a hard thresholding filter the Gabor representation of the signal to be processed. Using the image of this new time-frequency representation the instantaneous frequency of the useful signal can be extracted with the aid of some mathematical morphology operators: the conversion in binary form, the dilation and the skeleton. The simulations of the proposed method have proved its qualities. It is better than other estimation methods, like those based on the use of adaptive notch filters.
OCT despeckling via weighted nuclear norm constrained non-local low-rank representation
NASA Astrophysics Data System (ADS)
Tang, Chang; Zheng, Xiao; Cao, Lijuan
2017-10-01
As a non-invasive imaging modality, optical coherence tomography (OCT) plays an important role in medical sciences. However, OCT images are always corrupted by speckle noise, which can mask image features and pose significant challenges for medical analysis. In this work, we propose an OCT despeckling method by using non-local, low-rank representation with weighted nuclear norm constraint. Unlike previous non-local low-rank representation based OCT despeckling methods, we first generate a guidance image to improve the non-local group patches selection quality, then a low-rank optimization model with a weighted nuclear norm constraint is formulated to process the selected group patches. The corrupted probability of each pixel is also integrated into the model as a weight to regularize the representation error term. Note that each single patch might belong to several groups, hence different estimates of each patch are aggregated to obtain its final despeckled result. Both qualitative and quantitative experimental results on real OCT images show the superior performance of the proposed method compared with other state-of-the-art speckle removal techniques.
Learning viewpoint invariant perceptual representations from cluttered images.
Spratling, Michael W
2005-05-01
In order to perform object recognition, it is necessary to form perceptual representations that are sufficiently specific to distinguish between objects, but that are also sufficiently flexible to generalize across changes in location, rotation, and scale. A standard method for learning perceptual representations that are invariant to viewpoint is to form temporal associations across image sequences showing object transformations. However, this method requires that individual stimuli be presented in isolation and is therefore unlikely to succeed in real-world applications where multiple objects can co-occur in the visual input. This paper proposes a simple modification to the learning method that can overcome this limitation and results in more robust learning of invariant representations.
General tensor discriminant analysis and gabor features for gait recognition.
Tao, Dacheng; Li, Xuelong; Wu, Xindong; Maybank, Stephen J
2007-10-01
The traditional image representations are not suited to conventional classification methods, such as the linear discriminant analysis (LDA), because of the under sample problem (USP): the dimensionality of the feature space is much higher than the number of training samples. Motivated by the successes of the two dimensional LDA (2DLDA) for face recognition, we develop a general tensor discriminant analysis (GTDA) as a preprocessing step for LDA. The benefits of GTDA compared with existing preprocessing methods, e.g., principal component analysis (PCA) and 2DLDA, include 1) the USP is reduced in subsequent classification by, for example, LDA; 2) the discriminative information in the training tensors is preserved; and 3) GTDA provides stable recognition rates because the alternating projection optimization algorithm to obtain a solution of GTDA converges, while that of 2DLDA does not. We use human gait recognition to validate the proposed GTDA. The averaged gait images are utilized for gait representation. Given the popularity of Gabor function based image decompositions for image understanding and object recognition, we develop three different Gabor function based image representations: 1) the GaborD representation is the sum of Gabor filter responses over directions, 2) GaborS is the sum of Gabor filter responses over scales, and 3) GaborSD is the sum of Gabor filter responses over scales and directions. The GaborD, GaborS and GaborSD representations are applied to the problem of recognizing people from their averaged gait images.A large number of experiments were carried out to evaluate the effectiveness (recognition rate) of gait recognition based on first obtaining a Gabor, GaborD, GaborS or GaborSD image representation, then using GDTA to extract features and finally using LDA for classification. The proposed methods achieved good performance for gait recognition based on image sequences from the USF HumanID Database. Experimental comparisons are made with nine state of the art classification methods in gait recognition.
A ganglion-cell-based primary image representation method and its contribution to object recognition
NASA Astrophysics Data System (ADS)
Wei, Hui; Dai, Zhi-Long; Zuo, Qing-Song
2016-10-01
A visual stimulus is represented by the biological visual system at several levels: in the order from low to high levels, they are: photoreceptor cells, ganglion cells (GCs), lateral geniculate nucleus cells and visual cortical neurons. Retinal GCs at the early level need to represent raw data only once, but meet a wide number of diverse requests from different vision-based tasks. This means the information representation at this level is general and not task-specific. Neurobiological findings have attributed this universal adaptation to GCs' receptive field (RF) mechanisms. For the purposes of developing a highly efficient image representation method that can facilitate information processing and interpretation at later stages, here we design a computational model to simulate the GC's non-classical RF. This new image presentation method can extract major structural features from raw data, and is consistent with other statistical measures of the image. Based on the new representation, the performances of other state-of-the-art algorithms in contour detection and segmentation can be upgraded remarkably. This work concludes that applying sophisticated representation schema at early state is an efficient and promising strategy in visual information processing.
Image super-resolution via sparse representation.
Yang, Jianchao; Wright, John; Huang, Thomas S; Ma, Yi
2010-11-01
This paper presents a new approach to single-image super-resolution, based on sparse signal representation. Research on image statistics suggests that image patches can be well-represented as a sparse linear combination of elements from an appropriately chosen over-complete dictionary. Inspired by this observation, we seek a sparse representation for each patch of the low-resolution input, and then use the coefficients of this representation to generate the high-resolution output. Theoretical results from compressed sensing suggest that under mild conditions, the sparse representation can be correctly recovered from the downsampled signals. By jointly training two dictionaries for the low- and high-resolution image patches, we can enforce the similarity of sparse representations between the low resolution and high resolution image patch pair with respect to their own dictionaries. Therefore, the sparse representation of a low resolution image patch can be applied with the high resolution image patch dictionary to generate a high resolution image patch. The learned dictionary pair is a more compact representation of the patch pairs, compared to previous approaches, which simply sample a large amount of image patch pairs, reducing the computational cost substantially. The effectiveness of such a sparsity prior is demonstrated for both general image super-resolution and the special case of face hallucination. In both cases, our algorithm generates high-resolution images that are competitive or even superior in quality to images produced by other similar SR methods. In addition, the local sparse modeling of our approach is naturally robust to noise, and therefore the proposed algorithm can handle super-resolution with noisy inputs in a more unified framework.
Liang, Steven Y.
2018-01-01
Microstructure images of metallic materials play a significant role in industrial applications. To address image degradation problem of metallic materials, a novel image restoration technique based on K-means singular value decomposition (KSVD) and smoothing penalty sparse representation (SPSR) algorithm is proposed in this work, the microstructure images of aluminum alloy 7075 (AA7075) material are used as examples. To begin with, to reflect the detail structure characteristics of the damaged image, the KSVD dictionary is introduced to substitute the traditional sparse transform basis (TSTB) for sparse representation. Then, due to the image restoration, modeling belongs to a highly underdetermined equation, and traditional sparse reconstruction methods may cause instability and obvious artifacts in the reconstructed images, especially reconstructed image with many smooth regions and the noise level is strong, thus the SPSR (here, q = 0.5) algorithm is designed to reconstruct the damaged image. The results of simulation and two practical cases demonstrate that the proposed method has superior performance compared with some state-of-the-art methods in terms of restoration performance factors and visual quality. Meanwhile, the grain size parameters and grain boundaries of microstructure image are discussed before and after they are restored by proposed method. PMID:29677163
Wu, Guorong; Kim, Minjeong; Wang, Qian; Munsell, Brent C.
2015-01-01
Feature selection is a critical step in deformable image registration. In particular, selecting the most discriminative features that accurately and concisely describe complex morphological patterns in image patches improves correspondence detection, which in turn improves image registration accuracy. Furthermore, since more and more imaging modalities are being invented to better identify morphological changes in medical imaging data,, the development of deformable image registration method that scales well to new image modalities or new image applications with little to no human intervention would have a significant impact on the medical image analysis community. To address these concerns, a learning-based image registration framework is proposed that uses deep learning to discover compact and highly discriminative features upon observed imaging data. Specifically, the proposed feature selection method uses a convolutional stacked auto-encoder to identify intrinsic deep feature representations in image patches. Since deep learning is an unsupervised learning method, no ground truth label knowledge is required. This makes the proposed feature selection method more flexible to new imaging modalities since feature representations can be directly learned from the observed imaging data in a very short amount of time. Using the LONI and ADNI imaging datasets, image registration performance was compared to two existing state-of-the-art deformable image registration methods that use handcrafted features. To demonstrate the scalability of the proposed image registration framework image registration experiments were conducted on 7.0-tesla brain MR images. In all experiments, the results showed the new image registration framework consistently demonstrated more accurate registration results when compared to state-of-the-art. PMID:26552069
Wu, Guorong; Kim, Minjeong; Wang, Qian; Munsell, Brent C; Shen, Dinggang
2016-07-01
Feature selection is a critical step in deformable image registration. In particular, selecting the most discriminative features that accurately and concisely describe complex morphological patterns in image patches improves correspondence detection, which in turn improves image registration accuracy. Furthermore, since more and more imaging modalities are being invented to better identify morphological changes in medical imaging data, the development of deformable image registration method that scales well to new image modalities or new image applications with little to no human intervention would have a significant impact on the medical image analysis community. To address these concerns, a learning-based image registration framework is proposed that uses deep learning to discover compact and highly discriminative features upon observed imaging data. Specifically, the proposed feature selection method uses a convolutional stacked autoencoder to identify intrinsic deep feature representations in image patches. Since deep learning is an unsupervised learning method, no ground truth label knowledge is required. This makes the proposed feature selection method more flexible to new imaging modalities since feature representations can be directly learned from the observed imaging data in a very short amount of time. Using the LONI and ADNI imaging datasets, image registration performance was compared to two existing state-of-the-art deformable image registration methods that use handcrafted features. To demonstrate the scalability of the proposed image registration framework, image registration experiments were conducted on 7.0-T brain MR images. In all experiments, the results showed that the new image registration framework consistently demonstrated more accurate registration results when compared to state of the art.
Huang, Jinhong; Guo, Li; Feng, Qianjin; Chen, Wufan; Feng, Yanqiu
2015-07-21
Image reconstruction from undersampled k-space data accelerates magnetic resonance imaging (MRI) by exploiting image sparseness in certain transform domains. Employing image patch representation over a learned dictionary has the advantage of being adaptive to local image structures and thus can better sparsify images than using fixed transforms (e.g. wavelets and total variations). Dictionary learning methods have recently been introduced to MRI reconstruction, and these methods demonstrate significantly reduced reconstruction errors compared to sparse MRI reconstruction using fixed transforms. However, the synthesis sparse coding problem in dictionary learning is NP-hard and computationally expensive. In this paper, we present a novel sparsity-promoting orthogonal dictionary updating method for efficient image reconstruction from highly undersampled MRI data. The orthogonality imposed on the learned dictionary enables the minimization problem in the reconstruction to be solved by an efficient optimization algorithm which alternately updates representation coefficients, orthogonal dictionary, and missing k-space data. Moreover, both sparsity level and sparse representation contribution using updated dictionaries gradually increase during iterations to recover more details, assuming the progressively improved quality of the dictionary. Simulation and real data experimental results both demonstrate that the proposed method is approximately 10 to 100 times faster than the K-SVD-based dictionary learning MRI method and simultaneously improves reconstruction accuracy.
Multiple Sparse Representations Classification
Plenge, Esben; Klein, Stefan S.; Niessen, Wiro J.; Meijering, Erik
2015-01-01
Sparse representations classification (SRC) is a powerful technique for pixelwise classification of images and it is increasingly being used for a wide variety of image analysis tasks. The method uses sparse representation and learned redundant dictionaries to classify image pixels. In this empirical study we propose to further leverage the redundancy of the learned dictionaries to achieve a more accurate classifier. In conventional SRC, each image pixel is associated with a small patch surrounding it. Using these patches, a dictionary is trained for each class in a supervised fashion. Commonly, redundant/overcomplete dictionaries are trained and image patches are sparsely represented by a linear combination of only a few of the dictionary elements. Given a set of trained dictionaries, a new patch is sparse coded using each of them, and subsequently assigned to the class whose dictionary yields the minimum residual energy. We propose a generalization of this scheme. The method, which we call multiple sparse representations classification (mSRC), is based on the observation that an overcomplete, class specific dictionary is capable of generating multiple accurate and independent estimates of a patch belonging to the class. So instead of finding a single sparse representation of a patch for each dictionary, we find multiple, and the corresponding residual energies provides an enhanced statistic which is used to improve classification. We demonstrate the efficacy of mSRC for three example applications: pixelwise classification of texture images, lumen segmentation in carotid artery magnetic resonance imaging (MRI), and bifurcation point detection in carotid artery MRI. We compare our method with conventional SRC, K-nearest neighbor, and support vector machine classifiers. The results show that mSRC outperforms SRC and the other reference methods. In addition, we present an extensive evaluation of the effect of the main mSRC parameters: patch size, dictionary size, and sparsity level. PMID:26177106
Group-sparse representation with dictionary learning for medical image denoising and fusion.
Li, Shutao; Yin, Haitao; Fang, Leyuan
2012-12-01
Recently, sparse representation has attracted a lot of interest in various areas. However, the standard sparse representation does not consider the intrinsic structure, i.e., the nonzero elements occur in clusters, called group sparsity. Furthermore, there is no dictionary learning method for group sparse representation considering the geometrical structure of space spanned by atoms. In this paper, we propose a novel dictionary learning method, called Dictionary Learning with Group Sparsity and Graph Regularization (DL-GSGR). First, the geometrical structure of atoms is modeled as the graph regularization. Then, combining group sparsity and graph regularization, the DL-GSGR is presented, which is solved by alternating the group sparse coding and dictionary updating. In this way, the group coherence of learned dictionary can be enforced small enough such that any signal can be group sparse coded effectively. Finally, group sparse representation with DL-GSGR is applied to 3-D medical image denoising and image fusion. Specifically, in 3-D medical image denoising, a 3-D processing mechanism (using the similarity among nearby slices) and temporal regularization (to perverse the correlations across nearby slices) are exploited. The experimental results on 3-D image denoising and image fusion demonstrate the superiority of our proposed denoising and fusion approaches.
Detecting objects in radiographs for homeland security
NASA Astrophysics Data System (ADS)
Prasad, Lakshman; Snyder, Hans
2005-05-01
We present a general scheme for segmenting a radiographic image into polygons that correspond to visual features. This decomposition provides a vectorized representation that is a high-level description of the image. The polygons correspond to objects or object parts present in the image. This characterization of radiographs allows the direct application of several shape recognition algorithms to identify objects. In this paper we describe the use of constrained Delaunay triangulations as a uniform foundational tool to achieve multiple visual tasks, namely image segmentation, shape decomposition, and parts-based shape matching. Shape decomposition yields parts that serve as tokens representing local shape characteristics. Parts-based shape matching enables the recognition of objects in the presence of occlusions, which commonly occur in radiographs. The polygonal representation of image features affords the efficient design and application of sophisticated geometric filtering methods to detect large-scale structural properties of objects in images. Finally, the representation of radiographs via polygons results in significant reduction of image file sizes and permits the scalable graphical representation of images, along with annotations of detected objects, in the SVG (scalable vector graphics) format that is proposed by the world wide web consortium (W3C). This is a textual representation that can be compressed and encrypted for efficient and secure transmission of information over wireless channels and on the Internet. In particular, our methods described here provide an algorithmic framework for developing image analysis tools for screening cargo at ports of entry for homeland security.
Multi-layer sparse representation for weighted LBP-patches based facial expression recognition.
Jia, Qi; Gao, Xinkai; Guo, He; Luo, Zhongxuan; Wang, Yi
2015-03-19
In this paper, a novel facial expression recognition method based on sparse representation is proposed. Most contemporary facial expression recognition systems suffer from limited ability to handle image nuisances such as low resolution and noise. Especially for low intensity expression, most of the existing training methods have quite low recognition rates. Motivated by sparse representation, the problem can be solved by finding sparse coefficients of the test image by the whole training set. Deriving an effective facial representation from original face images is a vital step for successful facial expression recognition. We evaluate facial representation based on weighted local binary patterns, and Fisher separation criterion is used to calculate the weighs of patches. A multi-layer sparse representation framework is proposed for multi-intensity facial expression recognition, especially for low-intensity expressions and noisy expressions in reality, which is a critical problem but seldom addressed in the existing works. To this end, several experiments based on low-resolution and multi-intensity expressions are carried out. Promising results on publicly available databases demonstrate the potential of the proposed approach.
Example-Based Image Colorization Using Locality Consistent Sparse Representation.
Bo Li; Fuchen Zhao; Zhuo Su; Xiangguo Liang; Yu-Kun Lai; Rosin, Paul L
2017-11-01
Image colorization aims to produce a natural looking color image from a given gray-scale image, which remains a challenging problem. In this paper, we propose a novel example-based image colorization method exploiting a new locality consistent sparse representation. Given a single reference color image, our method automatically colorizes the target gray-scale image by sparse pursuit. For efficiency and robustness, our method operates at the superpixel level. We extract low-level intensity features, mid-level texture features, and high-level semantic features for each superpixel, which are then concatenated to form its descriptor. The collection of feature vectors for all the superpixels from the reference image composes the dictionary. We formulate colorization of target superpixels as a dictionary-based sparse reconstruction problem. Inspired by the observation that superpixels with similar spatial location and/or feature representation are likely to match spatially close regions from the reference image, we further introduce a locality promoting regularization term into the energy formulation, which substantially improves the matching consistency and subsequent colorization results. Target superpixels are colorized based on the chrominance information from the dominant reference superpixels. Finally, to further improve coherence while preserving sharpness, we develop a new edge-preserving filter for chrominance channels with the guidance from the target gray-scale image. To the best of our knowledge, this is the first work on sparse pursuit image colorization from single reference images. Experimental results demonstrate that our colorization method outperforms the state-of-the-art methods, both visually and quantitatively using a user study.
NASA Astrophysics Data System (ADS)
Wu, Wei; Zhao, Dewei; Zhang, Huan
2015-12-01
Super-resolution image reconstruction is an effective method to improve the image quality. It has important research significance in the field of image processing. However, the choice of the dictionary directly affects the efficiency of image reconstruction. A sparse representation theory is introduced into the problem of the nearest neighbor selection. Based on the sparse representation of super-resolution image reconstruction method, a super-resolution image reconstruction algorithm based on multi-class dictionary is analyzed. This method avoids the redundancy problem of only training a hyper complete dictionary, and makes the sub-dictionary more representatives, and then replaces the traditional Euclidean distance computing method to improve the quality of the whole image reconstruction. In addition, the ill-posed problem is introduced into non-local self-similarity regularization. Experimental results show that the algorithm is much better results than state-of-the-art algorithm in terms of both PSNR and visual perception.
Improving Low-dose Cardiac CT Images based on 3D Sparse Representation
NASA Astrophysics Data System (ADS)
Shi, Luyao; Hu, Yining; Chen, Yang; Yin, Xindao; Shu, Huazhong; Luo, Limin; Coatrieux, Jean-Louis
2016-03-01
Cardiac computed tomography (CCT) is a reliable and accurate tool for diagnosis of coronary artery diseases and is also frequently used in surgery guidance. Low-dose scans should be considered in order to alleviate the harm to patients caused by X-ray radiation. However, low dose CT (LDCT) images tend to be degraded by quantum noise and streak artifacts. In order to improve the cardiac LDCT image quality, a 3D sparse representation-based processing (3D SR) is proposed by exploiting the sparsity and regularity of 3D anatomical features in CCT. The proposed method was evaluated by a clinical study of 14 patients. The performance of the proposed method was compared to the 2D spares representation-based processing (2D SR) and the state-of-the-art noise reduction algorithm BM4D. The visual assessment, quantitative assessment and qualitative assessment results show that the proposed approach can lead to effective noise/artifact suppression and detail preservation. Compared to the other two tested methods, 3D SR method can obtain results with image quality most close to the reference standard dose CT (SDCT) images.
Multi-representation ability of students on the problem solving physics
NASA Astrophysics Data System (ADS)
Theasy, Y.; Wiyanto; Sujarwata
2018-03-01
Accuracy in representing knowledge possessed by students will show how the level of student understanding. The multi-representation ability of students on the problem solving of physics has been done through qualitative method of grounded theory model and implemented on physics education student of Unnes academic year 2016/2017. Multiforms of representation used are verbal (V), images/diagrams (D), graph (G), and mathematically (M). High and low category students have an accurate use of graphical representation (G) of 83% and 77.78%, and medium category has accurate use of image representation (D) equal to 66%.
NASA Technical Reports Server (NTRS)
Kweon, In SO; Hebert, Martial; Kanade, Takeo
1989-01-01
A three-dimensional perception system for building a geometrical description of rugged terrain environments from range image data is presented with reference to the exploration of the rugged terrain of Mars. An intermediate representation consisting of an elevation map that includes an explicit representation of uncertainty and labeling of the occluded regions is proposed. The locus method used to convert range image to an elevation map is introduced, along with an uncertainty model based on this algorithm. Both the elevation map and the locus method are the basis of a terrain matching algorithm which does not assume any correspondences between range images. The two-stage algorithm consists of a feature-based matching algorithm to compute an initial transform and an iconic terrain matching algorithm to merge multiple range images into a uniform representation. Terrain modeling results on real range images of rugged terrain are presented. The algorithms considered are a fundamental part of the perception system for the Ambler, a legged locomotor.
Supervoxels for graph cuts-based deformable image registration using guided image filtering
NASA Astrophysics Data System (ADS)
Szmul, Adam; Papież, Bartłomiej W.; Hallack, Andre; Grau, Vicente; Schnabel, Julia A.
2017-11-01
We propose combining a supervoxel-based image representation with the concept of graph cuts as an efficient optimization technique for three-dimensional (3-D) deformable image registration. Due to the pixels/voxels-wise graph construction, the use of graph cuts in this context has been mainly limited to two-dimensional (2-D) applications. However, our work overcomes some of the previous limitations by posing the problem on a graph created by adjacent supervoxels, where the number of nodes in the graph is reduced from the number of voxels to the number of supervoxels. We demonstrate how a supervoxel image representation combined with graph cuts-based optimization can be applied to 3-D data. We further show that the application of a relaxed graph representation of the image, followed by guided image filtering over the estimated deformation field, allows us to model "sliding motion." Applying this method to lung image registration results in highly accurate image registration and anatomically plausible estimations of the deformations. Evaluation of our method on a publicly available computed tomography lung image dataset leads to the observation that our approach compares very favorably with state of the art methods in continuous and discrete image registration, achieving target registration error of 1.16 mm on average per landmark.
Supervoxels for Graph Cuts-Based Deformable Image Registration Using Guided Image Filtering.
Szmul, Adam; Papież, Bartłomiej W; Hallack, Andre; Grau, Vicente; Schnabel, Julia A
2017-10-04
In this work we propose to combine a supervoxel-based image representation with the concept of graph cuts as an efficient optimization technique for 3D deformable image registration. Due to the pixels/voxels-wise graph construction, the use of graph cuts in this context has been mainly limited to 2D applications. However, our work overcomes some of the previous limitations by posing the problem on a graph created by adjacent supervoxels, where the number of nodes in the graph is reduced from the number of voxels to the number of supervoxels. We demonstrate how a supervoxel image representation, combined with graph cuts-based optimization can be applied to 3D data. We further show that the application of a relaxed graph representation of the image, followed by guided image filtering over the estimated deformation field, allows us to model 'sliding motion'. Applying this method to lung image registration, results in highly accurate image registration and anatomically plausible estimations of the deformations. Evaluation of our method on a publicly available Computed Tomography lung image dataset (www.dir-lab.com) leads to the observation that our new approach compares very favorably with state-of-the-art in continuous and discrete image registration methods achieving Target Registration Error of 1.16mm on average per landmark.
Supervoxels for Graph Cuts-Based Deformable Image Registration Using Guided Image Filtering
Szmul, Adam; Papież, Bartłomiej W.; Hallack, Andre; Grau, Vicente; Schnabel, Julia A.
2017-01-01
In this work we propose to combine a supervoxel-based image representation with the concept of graph cuts as an efficient optimization technique for 3D deformable image registration. Due to the pixels/voxels-wise graph construction, the use of graph cuts in this context has been mainly limited to 2D applications. However, our work overcomes some of the previous limitations by posing the problem on a graph created by adjacent supervoxels, where the number of nodes in the graph is reduced from the number of voxels to the number of supervoxels. We demonstrate how a supervoxel image representation, combined with graph cuts-based optimization can be applied to 3D data. We further show that the application of a relaxed graph representation of the image, followed by guided image filtering over the estimated deformation field, allows us to model ‘sliding motion’. Applying this method to lung image registration, results in highly accurate image registration and anatomically plausible estimations of the deformations. Evaluation of our method on a publicly available Computed Tomography lung image dataset (www.dir-lab.com) leads to the observation that our new approach compares very favorably with state-of-the-art in continuous and discrete image registration methods achieving Target Registration Error of 1.16mm on average per landmark. PMID:29225433
Fast Acquisition and Reconstruction of Optical Coherence Tomography Images via Sparse Representation
Li, Shutao; McNabb, Ryan P.; Nie, Qing; Kuo, Anthony N.; Toth, Cynthia A.; Izatt, Joseph A.; Farsiu, Sina
2014-01-01
In this paper, we present a novel technique, based on compressive sensing principles, for reconstruction and enhancement of multi-dimensional image data. Our method is a major improvement and generalization of the multi-scale sparsity based tomographic denoising (MSBTD) algorithm we recently introduced for reducing speckle noise. Our new technique exhibits several advantages over MSBTD, including its capability to simultaneously reduce noise and interpolate missing data. Unlike MSBTD, our new method does not require an a priori high-quality image from the target imaging subject and thus offers the potential to shorten clinical imaging sessions. This novel image restoration method, which we termed sparsity based simultaneous denoising and interpolation (SBSDI), utilizes sparse representation dictionaries constructed from previously collected datasets. We tested the SBSDI algorithm on retinal spectral domain optical coherence tomography images captured in the clinic. Experiments showed that the SBSDI algorithm qualitatively and quantitatively outperforms other state-of-the-art methods. PMID:23846467
Hyperspectral Image Classification via Kernel Sparse Representation
2013-01-01
classification algorithms. Moreover, the spatial coherency across neighboring pixels is also incorporated through a kernelized joint sparsity model , where...joint sparsity model , where all of the pixels within a small neighborhood are jointly represented in the feature space by selecting a few common training...hyperspectral imagery, joint spar- sity model , kernel methods, sparse representation. I. INTRODUCTION HYPERSPECTRAL imaging sensors capture images
Matching shapes with self-intersections: application to leaf classification.
Mokhtarian, Farzin; Abbasi, Sadegh
2004-05-01
We address the problem of two-dimensional (2-D) shape representation and matching in presence of self-intersection for large image databases. This may occur when part of an object is hidden behind another part and results in a darker section in the gray level image of the object. The boundary contour of the object must include the boundary of this part which is entirely inside the outline of the object. The Curvature Scale Space (CSS) image of a shape is a multiscale organization of its inflection points as it is smoothed. The CSS-based shape representation method has been selected for MPEG-7 standardization. We study the effects of contour self-intersection on the Curvature Scale Space image. When there is no self-intersection, the CSS image contains several arch shape contours, each related to a concavity or a convexity of the shape. Self intersections create contours with minima as well as maxima in the CSS image. An efficient shape representation method has been introduced in this paper which describes a shape using the maxima as well as the minima of its CSS contours. This is a natural generalization of the conventional method which only includes the maxima of the CSS image contours. The conventional matching algorithm has also been modified to accommodate the new information about the minima. The method has been successfully used in a real world application to find, for an unknown leaf, similar classes from a database of classified leaf images representing different varieties of chrysanthemum. For many classes of leaves, self-intersection is inevitable during the scanning of the image. Therefore the original contributions of this paper is the generalization of the Curvature Scale Space representation to the class of 2-D contours with self-intersection, and its application to the classification of Chrysanthemum leaves.
Medical Image Retrieval Using Multi-Texton Assignment.
Tang, Qiling; Yang, Jirong; Xia, Xianfu
2018-02-01
In this paper, we present a multi-texton representation method for medical image retrieval, which utilizes the locality constraint to encode each filter bank response within its local-coordinate system consisting of the k nearest neighbors in texton dictionary and subsequently employs spatial pyramid matching technique to implement feature vector representation. Comparison with the traditional nearest neighbor assignment followed by texton histogram statistics method, our strategies reduce the quantization errors in mapping process and add information about the spatial layout of texton distributions and, thus, increase the descriptive power of the image representation. We investigate the effects of different parameters on system performance in order to choose the appropriate ones for our datasets and carry out experiments on the IRMA-2009 medical collection and the mammographic patch dataset. The extensive experimental results demonstrate that the proposed method has superior performance.
Efficient processing of fluorescence images using directional multiscale representations.
Labate, D; Laezza, F; Negi, P; Ozcan, B; Papadakis, M
2014-01-01
Recent advances in high-resolution fluorescence microscopy have enabled the systematic study of morphological changes in large populations of cells induced by chemical and genetic perturbations, facilitating the discovery of signaling pathways underlying diseases and the development of new pharmacological treatments. In these studies, though, due to the complexity of the data, quantification and analysis of morphological features are for the vast majority handled manually, slowing significantly data processing and limiting often the information gained to a descriptive level. Thus, there is an urgent need for developing highly efficient automated analysis and processing tools for fluorescent images. In this paper, we present the application of a method based on the shearlet representation for confocal image analysis of neurons. The shearlet representation is a newly emerged method designed to combine multiscale data analysis with superior directional sensitivity, making this approach particularly effective for the representation of objects defined over a wide range of scales and with highly anisotropic features. Here, we apply the shearlet representation to problems of soma detection of neurons in culture and extraction of geometrical features of neuronal processes in brain tissue, and propose it as a new framework for large-scale fluorescent image analysis of biomedical data.
Efficient processing of fluorescence images using directional multiscale representations
Labate, D.; Laezza, F.; Negi, P.; Ozcan, B.; Papadakis, M.
2017-01-01
Recent advances in high-resolution fluorescence microscopy have enabled the systematic study of morphological changes in large populations of cells induced by chemical and genetic perturbations, facilitating the discovery of signaling pathways underlying diseases and the development of new pharmacological treatments. In these studies, though, due to the complexity of the data, quantification and analysis of morphological features are for the vast majority handled manually, slowing significantly data processing and limiting often the information gained to a descriptive level. Thus, there is an urgent need for developing highly efficient automated analysis and processing tools for fluorescent images. In this paper, we present the application of a method based on the shearlet representation for confocal image analysis of neurons. The shearlet representation is a newly emerged method designed to combine multiscale data analysis with superior directional sensitivity, making this approach particularly effective for the representation of objects defined over a wide range of scales and with highly anisotropic features. Here, we apply the shearlet representation to problems of soma detection of neurons in culture and extraction of geometrical features of neuronal processes in brain tissue, and propose it as a new framework for large-scale fluorescent image analysis of biomedical data. PMID:28804225
Neonatal Atlas Construction Using Sparse Representation
Shi, Feng; Wang, Li; Wu, Guorong; Li, Gang; Gilmore, John H.; Lin, Weili; Shen, Dinggang
2014-01-01
Atlas construction generally includes first an image registration step to normalize all images into a common space and then an atlas building step to fuse the information from all the aligned images. Although numerous atlas construction studies have been performed to improve the accuracy of the image registration step, unweighted or simply weighted average is often used in the atlas building step. In this article, we propose a novel patch-based sparse representation method for atlas construction after all images have been registered into the common space. By taking advantage of local sparse representation, more anatomical details can be recovered in the built atlas. To make the anatomical structures spatially smooth in the atlas, the anatomical feature constraints on group structure of representations and also the overlapping of neighboring patches are imposed to ensure the anatomical consistency between neighboring patches. The proposed method has been applied to 73 neonatal MR images with poor spatial resolution and low tissue contrast, for constructing a neonatal brain atlas with sharp anatomical details. Experimental results demonstrate that the proposed method can significantly enhance the quality of the constructed atlas by discovering more anatomical details especially in the highly convoluted cortical regions. The resulting atlas demonstrates superior performance of our atlas when applied to spatially normalizing three different neonatal datasets, compared with other start-of-the-art neonatal brain atlases. PMID:24638883
Multilinear Graph Embedding: Representation and Regularization for Images.
Chen, Yi-Lei; Hsu, Chiou-Ting
2014-02-01
Given a set of images, finding a compact and discriminative representation is still a big challenge especially when multiple latent factors are hidden in the way of data generation. To represent multifactor images, although multilinear models are widely used to parameterize the data, most methods are based on high-order singular value decomposition (HOSVD), which preserves global statistics but interprets local variations inadequately. To this end, we propose a novel method, called multilinear graph embedding (MGE), as well as its kernelization MKGE to leverage the manifold learning techniques into multilinear models. Our method theoretically links the linear, nonlinear, and multilinear dimensionality reduction. We also show that the supervised MGE encodes informative image priors for image regularization, provided that an image is represented as a high-order tensor. From our experiments on face and gait recognition, the superior performance demonstrates that MGE better represents multifactor images than classic methods, including HOSVD and its variants. In addition, the significant improvement in image (or tensor) completion validates the potential of MGE for image regularization.
NASA Astrophysics Data System (ADS)
Li, Yung-Hui; Zheng, Bo-Ren; Ji, Dai-Yan; Tien, Chung-Hao; Liu, Po-Tsun
2014-09-01
Cross sensor iris matching may seriously degrade the recognition performance because of the sensor mis-match problem of iris images between the enrollment and test stage. In this paper, we propose two novel patch-based heterogeneous dictionary learning method to attack this problem. The first method applies the latest sparse representation theory while the second method tries to learn the correspondence relationship through PCA in heterogeneous patch space. Both methods learn the basic atoms in iris textures across different image sensors and build connections between them. After such connections are built, at test stage, it is possible to hallucinate (synthesize) iris images across different sensors. By matching training images with hallucinated images, the recognition rate can be successfully enhanced. The experimental results showed the satisfied results both visually and in terms of recognition rate. Experimenting with an iris database consisting of 3015 images, we show that the EER is decreased 39.4% relatively by the proposed method.
Multiple Representations-Based Face Sketch-Photo Synthesis.
Peng, Chunlei; Gao, Xinbo; Wang, Nannan; Tao, Dacheng; Li, Xuelong; Li, Jie
2016-11-01
Face sketch-photo synthesis plays an important role in law enforcement and digital entertainment. Most of the existing methods only use pixel intensities as the feature. Since face images can be described using features from multiple aspects, this paper presents a novel multiple representations-based face sketch-photo-synthesis method that adaptively combines multiple representations to represent an image patch. In particular, it combines multiple features from face images processed using multiple filters and deploys Markov networks to exploit the interacting relationships between the neighboring image patches. The proposed framework could be solved using an alternating optimization strategy and it normally converges in only five outer iterations in the experiments. Our experimental results on the Chinese University of Hong Kong (CUHK) face sketch database, celebrity photos, CUHK Face Sketch FERET Database, IIIT-D Viewed Sketch Database, and forensic sketches demonstrate the effectiveness of our method for face sketch-photo synthesis. In addition, cross-database and database-dependent style-synthesis evaluations demonstrate the generalizability of this novel method and suggest promising solutions for face identification in forensic science.
Improving Low-dose Cardiac CT Images based on 3D Sparse Representation
Shi, Luyao; Hu, Yining; Chen, Yang; Yin, Xindao; Shu, Huazhong; Luo, Limin; Coatrieux, Jean-Louis
2016-01-01
Cardiac computed tomography (CCT) is a reliable and accurate tool for diagnosis of coronary artery diseases and is also frequently used in surgery guidance. Low-dose scans should be considered in order to alleviate the harm to patients caused by X-ray radiation. However, low dose CT (LDCT) images tend to be degraded by quantum noise and streak artifacts. In order to improve the cardiac LDCT image quality, a 3D sparse representation-based processing (3D SR) is proposed by exploiting the sparsity and regularity of 3D anatomical features in CCT. The proposed method was evaluated by a clinical study of 14 patients. The performance of the proposed method was compared to the 2D spares representation-based processing (2D SR) and the state-of-the-art noise reduction algorithm BM4D. The visual assessment, quantitative assessment and qualitative assessment results show that the proposed approach can lead to effective noise/artifact suppression and detail preservation. Compared to the other two tested methods, 3D SR method can obtain results with image quality most close to the reference standard dose CT (SDCT) images. PMID:26980176
Sparse representation-based image restoration via nonlocal supervised coding
NASA Astrophysics Data System (ADS)
Li, Ao; Chen, Deyun; Sun, Guanglu; Lin, Kezheng
2016-10-01
Sparse representation (SR) and nonlocal technique (NLT) have shown great potential in low-level image processing. However, due to the degradation of the observed image, SR and NLT may not be accurate enough to obtain a faithful restoration results when they are used independently. To improve the performance, in this paper, a nonlocal supervised coding strategy-based NLT for image restoration is proposed. The novel method has three main contributions. First, to exploit the useful nonlocal patches, a nonnegative sparse representation is introduced, whose coefficients can be utilized as the supervised weights among patches. Second, a novel objective function is proposed, which integrated the supervised weights learning and the nonlocal sparse coding to guarantee a more promising solution. Finally, to make the minimization tractable and convergence, a numerical scheme based on iterative shrinkage thresholding is developed to solve the above underdetermined inverse problem. The extensive experiments validate the effectiveness of the proposed method.
Determining biosonar images using sparse representations.
Fontaine, Bertrand; Peremans, Herbert
2009-05-01
Echolocating bats are thought to be able to create an image of their environment by emitting pulses and analyzing the reflected echoes. In this paper, the theory of sparse representations and its more recent further development into compressed sensing are applied to this biosonar image formation task. Considering the target image representation as sparse allows formulation of this inverse problem as a convex optimization problem for which well defined and efficient solution methods have been established. The resulting technique, referred to as L1-minimization, is applied to simulated data to analyze its performance relative to delay accuracy and delay resolution experiments. This method performs comparably to the coherent receiver for the delay accuracy experiments, is quite robust to noise, and can reconstruct complex target impulse responses as generated by many closely spaced reflectors with different reflection strengths. This same technique, in addition to reconstructing biosonar target images, can be used to simultaneously localize these complex targets by interpreting location cues induced by the bat's head related transfer function. Finally, a tentative explanation is proposed for specific bat behavioral experiments in terms of the properties of target images as reconstructed by the L1-minimization method.
Two-tier tissue decomposition for histopathological image representation and classification.
Gultekin, Tunc; Koyuncu, Can Fahrettin; Sokmensuer, Cenk; Gunduz-Demir, Cigdem
2015-01-01
In digital pathology, devising effective image representations is crucial to design robust automated diagnosis systems. To this end, many studies have proposed to develop object-based representations, instead of directly using image pixels, since a histopathological image may contain a considerable amount of noise typically at the pixel-level. These previous studies mostly employ color information to define their objects, which approximately represent histological tissue components in an image, and then use the spatial distribution of these objects for image representation and classification. Thus, object definition has a direct effect on the way of representing the image, which in turn affects classification accuracies. In this paper, our aim is to design a classification system for histopathological images. Towards this end, we present a new model for effective representation of these images that will be used by the classification system. The contributions of this model are twofold. First, it introduces a new two-tier tissue decomposition method for defining a set of multityped objects in an image. Different than the previous studies, these objects are defined combining texture, shape, and size information and they may correspond to individual histological tissue components as well as local tissue subregions of different characteristics. As its second contribution, it defines a new metric, which we call dominant blob scale, to characterize the shape and size of an object with a single scalar value. Our experiments on colon tissue images reveal that this new object definition and characterization provides distinguishing representation of normal and cancerous histopathological images, which is effective to obtain more accurate classification results compared to its counterparts.
Cognitive representations of AIDS: a phenomenological study.
Anderson, Elizabeth H; Spencer, Margaret Hull
2002-12-01
Cognitive representations of illness determine behavior. How persons living with AIDS image their disease might be key to understanding medication adherence and other health behaviors. The authors' purpose was to describe AIDS patients' cognitive representations of their illness. A purposive sample of 58 men and women with AIDS were interviewed. Using Colaizzi's (1978) phenomenological method, rigor was established through application of verification, validation, and validity. From 175 significant statements, 11 themes emerged. Cognitive representations included imaging AIDS as death, bodily destruction, and just a disease. Coping focused on wiping AIDS out of the mind, hoping for the right drug, and caring for oneself. Inquiring about a patient's image of AIDS might help nurses assess coping processes and enhance nurse-patient relationships.
The role of figure-ground segregation in change blindness.
Landman, Rogier; Spekreijse, Henk; Lamme, Victor A F
2004-04-01
Partial report methods have shown that a large-capacity representation exists for a few hundred milliseconds after a picture has disappeared. However, change blindness studies indicate that very limited information remains available when a changed version of the image is presented subsequently. What happens to the large-capacity representation? New input after the first image may interfere, but this is likely to depend on the characteristics of the new input. In our first experiment, we show that a display containing homogeneous image elements between changing images does not render the large-capacity representation unavailable. Interference occurs when these new elements define objects. On that basis we introduce a new method to produce change blindness: The second experiment shows that change blindness can be induced by redefining figure and background, without an interval between the displays. The local features (line segments) that defined figures and background were swapped, while the contours of the figures remained where they were. Normally, changes are easily detected when there is no interval. However, our paradigm results in massive change blindness. We propose that in a change blindness experiment, there is a large-capacity representation of the original image when it is followed by a homogeneous interval display, but that change blindness occurs whenever the changed image forces resegregation of figures from the background.
MARTA GANs: Unsupervised Representation Learning for Remote Sensing Image Classification
NASA Astrophysics Data System (ADS)
Lin, Daoyu; Fu, Kun; Wang, Yang; Xu, Guangluan; Sun, Xian
2017-11-01
With the development of deep learning, supervised learning has frequently been adopted to classify remotely sensed images using convolutional networks (CNNs). However, due to the limited amount of labeled data available, supervised learning is often difficult to carry out. Therefore, we proposed an unsupervised model called multiple-layer feature-matching generative adversarial networks (MARTA GANs) to learn a representation using only unlabeled data. MARTA GANs consists of both a generative model $G$ and a discriminative model $D$. We treat $D$ as a feature extractor. To fit the complex properties of remote sensing data, we use a fusion layer to merge the mid-level and global features. $G$ can produce numerous images that are similar to the training data; therefore, $D$ can learn better representations of remotely sensed images using the training data provided by $G$. The classification results on two widely used remote sensing image databases show that the proposed method significantly improves the classification performance compared with other state-of-the-art methods.
Self-organized Evaluation of Dynamic Hand Gestures for Sign Language Recognition
NASA Astrophysics Data System (ADS)
Buciu, Ioan; Pitas, Ioannis
Two main theories exist with respect to face encoding and representation in the human visual system (HVS). The first one refers to the dense (holistic) representation of the face, where faces have "holon"-like appearance. The second one claims that a more appropriate face representation is given by a sparse code, where only a small fraction of the neural cells corresponding to face encoding is activated. Theoretical and experimental evidence suggest that the HVS performs face analysis (encoding, storing, face recognition, facial expression recognition) in a structured and hierarchical way, where both representations have their own contribution and goal. According to neuropsychological experiments, it seems that encoding for face recognition, relies on holistic image representation, while a sparse image representation is used for facial expression analysis and classification. From the computer vision perspective, the techniques developed for automatic face and facial expression recognition fall into the same two representation types. Like in Neuroscience, the techniques which perform better for face recognition yield a holistic image representation, while those techniques suitable for facial expression recognition use a sparse or local image representation. The proposed mathematical models of image formation and encoding try to simulate the efficient storing, organization and coding of data in the human cortex. This is equivalent with embedding constraints in the model design regarding dimensionality reduction, redundant information minimization, mutual information minimization, non-negativity constraints, class information, etc. The presented techniques are applied as a feature extraction step followed by a classification method, which also heavily influences the recognition results.
Shu, Ting; Zhang, Bob; Tang, Yuan Yan
2017-01-01
At present, heart disease is the number one cause of death worldwide. Traditionally, heart disease is commonly detected using blood tests, electrocardiogram, cardiac computerized tomography scan, cardiac magnetic resonance imaging, and so on. However, these traditional diagnostic methods are time consuming and/or invasive. In this paper, we propose an effective noninvasive computerized method based on facial images to quantitatively detect heart disease. Specifically, facial key block color features are extracted from facial images and analyzed using the Probabilistic Collaborative Representation Based Classifier. The idea of facial key block color analysis is founded in Traditional Chinese Medicine. A new dataset consisting of 581 heart disease and 581 healthy samples was experimented by the proposed method. In order to optimize the Probabilistic Collaborative Representation Based Classifier, an analysis of its parameters was performed. According to the experimental results, the proposed method obtains the highest accuracy compared with other classifiers and is proven to be effective at heart disease detection.
Facial expression recognition based on weber local descriptor and sparse representation
NASA Astrophysics Data System (ADS)
Ouyang, Yan
2018-03-01
Automatic facial expression recognition has been one of the research hotspots in the area of computer vision for nearly ten years. During the decade, many state-of-the-art methods have been proposed which perform very high accurate rate based on the face images without any interference. Nowadays, many researchers begin to challenge the task of classifying the facial expression images with corruptions and occlusions and the Sparse Representation based Classification framework has been wildly used because it can robust to the corruptions and occlusions. Therefore, this paper proposed a novel facial expression recognition method based on Weber local descriptor (WLD) and Sparse representation. The method includes three parts: firstly the face images are divided into many local patches, and then the WLD histograms of each patch are extracted, finally all the WLD histograms features are composed into a vector and combined with SRC to classify the facial expressions. The experiment results on the Cohn-Kanade database show that the proposed method is robust to occlusions and corruptions.
Complex noise suppression using a sparse representation and 3D filtering of images
NASA Astrophysics Data System (ADS)
Kravchenko, V. F.; Ponomaryov, V. I.; Pustovoit, V. I.; Palacios-Enriquez, A.
2017-08-01
A novel method for the filtering of images corrupted by complex noise composed of randomly distributed impulses and additive Gaussian noise has been substantiated for the first time. The method consists of three main stages: the detection and filtering of pixels corrupted by impulsive noise, the subsequent image processing to suppress the additive noise based on 3D filtering and a sparse representation of signals in a basis of wavelets, and the concluding image processing procedure to clean the final image of the errors emerged at the previous stages. A physical interpretation of the filtering method under complex noise conditions is given. A filtering block diagram has been developed in accordance with the novel approach. Simulations of the novel image filtering method have shown an advantage of the proposed filtering scheme in terms of generally recognized criteria, such as the structural similarity index measure and the peak signal-to-noise ratio, and when visually comparing the filtered images.
Zhang, Xiaodong; Jing, Shasha; Gao, Peiyi; Xue, Jing; Su, Lu; Li, Weiping; Ren, Lijie; Hu, Qingmao
2016-01-01
Segmentation of infarcts at hyperacute stage is challenging as they exhibit substantial variability which may even be hard for experts to delineate manually. In this paper, a sparse representation based classification method is explored. For each patient, four volumetric data items including three volumes of diffusion weighted imaging and a computed asymmetry map are employed to extract patch features which are then fed to dictionary learning and classification based on sparse representation. Elastic net is adopted to replace the traditional L 0 -norm/ L 1 -norm constraints on sparse representation to stabilize sparse code. To decrease computation cost and to reduce false positives, regions-of-interest are determined to confine candidate infarct voxels. The proposed method has been validated on 98 consecutive patients recruited within 6 hours from onset. It is shown that the proposed method could handle well infarcts with intensity variability and ill-defined edges to yield significantly higher Dice coefficient (0.755 ± 0.118) than the other two methods and their enhanced versions by confining their segmentations within the regions-of-interest (average Dice coefficient less than 0.610). The proposed method could provide a potential tool to quantify infarcts from diffusion weighted imaging at hyperacute stage with accuracy and speed to assist the decision making especially for thrombolytic therapy.
NASA Astrophysics Data System (ADS)
Moonon, Altan-Ulzii; Hu, Jianwen; Li, Shutao
2015-12-01
The remote sensing image fusion is an important preprocessing technique in remote sensing image processing. In this paper, a remote sensing image fusion method based on the nonsubsampled shearlet transform (NSST) with sparse representation (SR) is proposed. Firstly, the low resolution multispectral (MS) image is upsampled and color space is transformed from Red-Green-Blue (RGB) to Intensity-Hue-Saturation (IHS). Then, the high resolution panchromatic (PAN) image and intensity component of MS image are decomposed by NSST to high and low frequency coefficients. The low frequency coefficients of PAN and the intensity component are fused by the SR with the learned dictionary. The high frequency coefficients of intensity component and PAN image are fused by local energy based fusion rule. Finally, the fused result is obtained by performing inverse NSST and inverse IHS transform. The experimental results on IKONOS and QuickBird satellites demonstrate that the proposed method provides better spectral quality and superior spatial information in the fused image than other remote sensing image fusion methods both in visual effect and object evaluation.
Unsupervised feature learning for autonomous rock image classification
NASA Astrophysics Data System (ADS)
Shu, Lei; McIsaac, Kenneth; Osinski, Gordon R.; Francis, Raymond
2017-09-01
Autonomous rock image classification can enhance the capability of robots for geological detection and enlarge the scientific returns, both in investigation on Earth and planetary surface exploration on Mars. Since rock textural images are usually inhomogeneous and manually hand-crafting features is not always reliable, we propose an unsupervised feature learning method to autonomously learn the feature representation for rock images. In our tests, rock image classification using the learned features shows that the learned features can outperform manually selected features. Self-taught learning is also proposed to learn the feature representation from a large database of unlabelled rock images of mixed class. The learned features can then be used repeatedly for classification of any subclass. This takes advantage of the large dataset of unlabelled rock images and learns a general feature representation for many kinds of rocks. We show experimental results supporting the feasibility of self-taught learning on rock images.
Predicting perceptual quality of images in realistic scenario using deep filter banks
NASA Astrophysics Data System (ADS)
Zhang, Weixia; Yan, Jia; Hu, Shiyong; Ma, Yang; Deng, Dexiang
2018-03-01
Classical image perceptual quality assessment models usually resort to natural scene statistic methods, which are based on an assumption that certain reliable statistical regularities hold on undistorted images and will be corrupted by introduced distortions. However, these models usually fail to accurately predict degradation severity of images in realistic scenarios since complex, multiple, and interactive authentic distortions usually appear on them. We propose a quality prediction model based on convolutional neural network. Quality-aware features extracted from filter banks of multiple convolutional layers are aggregated into the image representation. Furthermore, an easy-to-implement and effective feature selection strategy is used to further refine the image representation and finally a linear support vector regression model is trained to map image representation into images' subjective perceptual quality scores. The experimental results on benchmark databases present the effectiveness and generalizability of the proposed model.
Knowledge representation for fuzzy inference aided medical image interpretation.
Gal, Norbert; Stoicu-Tivadar, Vasile
2012-01-01
Knowledge defines how an automated system transforms data into information. This paper suggests a representation method of medical imaging knowledge using fuzzy inference systems coded in XML files. The imaging knowledge incorporates features of the investigated objects in linguistic form and inference rules that can transform the linguistic data into information about a possible diagnosis. A fuzzy inference system is used to model the vagueness of the linguistic medical imaging terms. XML files are used to facilitate easy manipulation and deployment of the knowledge into the imaging software. Preliminary results are presented.
USDA-ARS?s Scientific Manuscript database
It is challenging to achieve rapid and accurate processing of large amounts of hyperspectral image data. This research was aimed to develop a novel classification method by employing deep feature representation with the stacked sparse auto-encoder (SSAE) and the SSAE combined with convolutional neur...
Shao, Ling; Yan, Ruomei; Li, Xuelong; Liu, Yan
2014-07-01
Image denoising is a well explored topic in the field of image processing. In the past several decades, the progress made in image denoising has benefited from the improved modeling of natural images. In this paper, we introduce a new taxonomy based on image representations for a better understanding of state-of-the-art image denoising techniques. Within each category, several representative algorithms are selected for evaluation and comparison. The experimental results are discussed and analyzed to determine the overall advantages and disadvantages of each category. In general, the nonlocal methods within each category produce better denoising results than local ones. In addition, methods based on overcomplete representations using learned dictionaries perform better than others. The comprehensive study in this paper would serve as a good reference and stimulate new research ideas in image denoising.
A dictionary learning approach for Poisson image deblurring.
Ma, Liyan; Moisan, Lionel; Yu, Jian; Zeng, Tieyong
2013-07-01
The restoration of images corrupted by blur and Poisson noise is a key issue in medical and biological image processing. While most existing methods are based on variational models, generally derived from a maximum a posteriori (MAP) formulation, recently sparse representations of images have shown to be efficient approaches for image recovery. Following this idea, we propose in this paper a model containing three terms: a patch-based sparse representation prior over a learned dictionary, the pixel-based total variation regularization term and a data-fidelity term capturing the statistics of Poisson noise. The resulting optimization problem can be solved by an alternating minimization technique combined with variable splitting. Extensive experimental results suggest that in terms of visual quality, peak signal-to-noise ratio value and the method noise, the proposed algorithm outperforms state-of-the-art methods.
High resolution OCT image generation using super resolution via sparse representation
NASA Astrophysics Data System (ADS)
Asif, Muhammad; Akram, Muhammad Usman; Hassan, Taimur; Shaukat, Arslan; Waqar, Razi
2017-02-01
In this paper we propose a technique for obtaining a high resolution (HR) image from a single low resolution (LR) image -using joint learning dictionary - on the basis of image statistic research. It suggests that with an appropriate choice of an over-complete dictionary, image patches can be well represented as a sparse linear combination. Medical imaging for clinical analysis and medical intervention is being used for creating visual representations of the interior of a body, as well as visual representation of the function of some organs or tissues (physiology). A number of medical imaging techniques are in use like MRI, CT scan, X-rays and Optical Coherence Tomography (OCT). OCT is one of the new technologies in medical imaging and one of its uses is in ophthalmology where it is being used for analysis of the choroidal thickness in the eyes in healthy and disease states such as age-related macular degeneration, central serous chorioretinopathy, diabetic retinopathy and inherited retinal dystrophies. We have proposed a technique for enhancing the OCT images which can be used for clearly identifying and analyzing the particular diseases. Our method uses dictionary learning technique for generating a high resolution image from a single input LR image. We train two joint dictionaries, one with OCT images and the second with multiple different natural images, and compare the results with previous SR technique. Proposed method for both dictionaries produces HR images which are comparatively superior in quality with the other proposed method of SR. Proposed technique is very effective for noisy OCT images and produces up-sampled and enhanced OCT images.
Lefkimmiatis, Stamatios; Maragos, Petros; Papandreou, George
2009-08-01
We present an improved statistical model for analyzing Poisson processes, with applications to photon-limited imaging. We build on previous work, adopting a multiscale representation of the Poisson process in which the ratios of the underlying Poisson intensities (rates) in adjacent scales are modeled as mixtures of conjugate parametric distributions. Our main contributions include: 1) a rigorous and robust regularized expectation-maximization (EM) algorithm for maximum-likelihood estimation of the rate-ratio density parameters directly from the noisy observed Poisson data (counts); 2) extension of the method to work under a multiscale hidden Markov tree model (HMT) which couples the mixture label assignments in consecutive scales, thus modeling interscale coefficient dependencies in the vicinity of image edges; 3) exploration of a 2-D recursive quad-tree image representation, involving Dirichlet-mixture rate-ratio densities, instead of the conventional separable binary-tree image representation involving beta-mixture rate-ratio densities; and 4) a novel multiscale image representation, which we term Poisson-Haar decomposition, that better models the image edge structure, thus yielding improved performance. Experimental results on standard images with artificially simulated Poisson noise and on real photon-limited images demonstrate the effectiveness of the proposed techniques.
Single image interpolation via adaptive nonlocal sparsity-based modeling.
Romano, Yaniv; Protter, Matan; Elad, Michael
2014-07-01
Single image interpolation is a central and extensively studied problem in image processing. A common approach toward the treatment of this problem in recent years is to divide the given image into overlapping patches and process each of them based on a model for natural image patches. Adaptive sparse representation modeling is one such promising image prior, which has been shown to be powerful in filling-in missing pixels in an image. Another force that such algorithms may use is the self-similarity that exists within natural images. Processing groups of related patches together exploits their correspondence, leading often times to improved results. In this paper, we propose a novel image interpolation method, which combines these two forces-nonlocal self-similarities and sparse representation modeling. The proposed method is contrasted with competitive and related algorithms, and demonstrated to achieve state-of-the-art results.
Chinese Herbal Medicine Image Recognition and Retrieval by Convolutional Neural Network
Sun, Xin; Qian, Huinan
2016-01-01
Chinese herbal medicine image recognition and retrieval have great potential of practical applications. Several previous studies have focused on the recognition with hand-crafted image features, but there are two limitations in them. Firstly, most of these hand-crafted features are low-level image representation, which is easily affected by noise and background. Secondly, the medicine images are very clean without any backgrounds, which makes it difficult to use in practical applications. Therefore, designing high-level image representation for recognition and retrieval in real world medicine images is facing a great challenge. Inspired by the recent progress of deep learning in computer vision, we realize that deep learning methods may provide robust medicine image representation. In this paper, we propose to use the Convolutional Neural Network (CNN) for Chinese herbal medicine image recognition and retrieval. For the recognition problem, we use the softmax loss to optimize the recognition network; then for the retrieval problem, we fine-tune the recognition network by adding a triplet loss to search for the most similar medicine images. To evaluate our method, we construct a public database of herbal medicine images with cluttered backgrounds, which has in total 5523 images with 95 popular Chinese medicine categories. Experimental results show that our method can achieve the average recognition precision of 71% and the average retrieval precision of 53% over all the 95 medicine categories, which are quite promising given the fact that the real world images have multiple pieces of occluded herbal and cluttered backgrounds. Besides, our proposed method achieves the state-of-the-art performance by improving previous studies with a large margin. PMID:27258404
Using fuzzy fractal features of digital images for the material surface analisys
NASA Astrophysics Data System (ADS)
Privezentsev, D. G.; Zhiznyakov, A. L.; Astafiev, A. V.; Pugin, E. V.
2018-01-01
Edge detection is an important task in image processing. There are a lot of approaches in this area: Sobel, Canny operators and others. One of the perspective techniques in image processing is the use of fuzzy logic and fuzzy sets theory. They allow us to increase processing quality by representing information in its fuzzy form. Most of the existing fuzzy image processing methods switch to fuzzy sets on very late stages, so this leads to some useful information loss. In this paper, a novel method of edge detection based on fuzzy image representation and fuzzy pixels is proposed. With this approach, we convert the image to fuzzy form on the first step. Different approaches to this conversion are described. Several membership functions for fuzzy pixel description and requirements for their form and view are given. A novel approach to edge detection based on Sobel operator and fuzzy image representation is proposed. Experimental testing of developed method was performed on remote sensing images.
Wang, Li; Shi, Feng; Li, Gang; Lin, Weili; Gilmore, John H.; Shen, Dinggang
2014-01-01
Segmentation of infant brain MR images is challenging due to insufficient image quality, severe partial volume effect, and ongoing maturation and myelination process. During the first year of life, the signal contrast between white matter (WM) and gray matter (GM) in MR images undergoes inverse changes. In particular, the inversion of WM/GM signal contrast appears around 6–8 months of age, where brain tissues appear isointense and hence exhibit extremely low tissue contrast, posing significant challenges for automated segmentation. In this paper, we propose a novel segmentation method to address the above-mentioned challenge based on the sparse representation of the complementary tissue distribution information from T1, T2 and diffusion-weighted images. Specifically, we first derive an initial segmentation from a library of aligned multi-modality images with ground-truth segmentations by using sparse representation in a patch-based fashion. The segmentation is further refined by the integration of the geometrical constraint information. The proposed method was evaluated on 22 6-month-old training subjects using leave-one-out cross-validation, as well as 10 additional infant testing subjects, showing superior results in comparison to other state-of-the-art methods. PMID:24505729
Wang, Li; Shi, Feng; Li, Gang; Lin, Weili; Gilmore, John H; Shen, Dinggang
2013-01-01
Segmentation of infant brain MR images is challenging due to insufficient image quality, severe partial volume effect, and ongoing maturation and myelination process. During the first year of life, the signal contrast between white matter (WM) and gray matter (GM) in MR images undergoes inverse changes. In particular, the inversion of WM/GM signal contrast appears around 6-8 months of age, where brain tissues appear isointense and hence exhibit extremely low tissue contrast, posing significant challenges for automated segmentation. In this paper, we propose a novel segmentation method to address the above-mentioned challenge based on the sparse representation of the complementary tissue distribution information from T1, T2 and diffusion-weighted images. Specifically, we first derive an initial segmentation from a library of aligned multi-modality images with ground-truth segmentations by using sparse representation in a patch-based fashion. The segmentation is further refined by the integration of the geometrical constraint information. The proposed method was evaluated on 22 6-month-old training subjects using leave-one-out cross-validation, as well as 10 additional infant testing subjects, showing superior results in comparison to other state-of-the-art methods.
Sparse representation based image interpolation with nonlocal autoregressive modeling.
Dong, Weisheng; Zhang, Lei; Lukac, Rastislav; Shi, Guangming
2013-04-01
Sparse representation is proven to be a promising approach to image super-resolution, where the low-resolution (LR) image is usually modeled as the down-sampled version of its high-resolution (HR) counterpart after blurring. When the blurring kernel is the Dirac delta function, i.e., the LR image is directly down-sampled from its HR counterpart without blurring, the super-resolution problem becomes an image interpolation problem. In such cases, however, the conventional sparse representation models (SRM) become less effective, because the data fidelity term fails to constrain the image local structures. In natural images, fortunately, many nonlocal similar patches to a given patch could provide nonlocal constraint to the local structure. In this paper, we incorporate the image nonlocal self-similarity into SRM for image interpolation. More specifically, a nonlocal autoregressive model (NARM) is proposed and taken as the data fidelity term in SRM. We show that the NARM-induced sampling matrix is less coherent with the representation dictionary, and consequently makes SRM more effective for image interpolation. Our extensive experimental results demonstrate that the proposed NARM-based image interpolation method can effectively reconstruct the edge structures and suppress the jaggy/ringing artifacts, achieving the best image interpolation results so far in terms of PSNR as well as perceptual quality metrics such as SSIM and FSIM.
Progressive Label Fusion Framework for Multi-atlas Segmentation by Dictionary Evolution
Song, Yantao; Wu, Guorong; Sun, Quansen; Bahrami, Khosro; Li, Chunming; Shen, Dinggang
2015-01-01
Accurate segmentation of anatomical structures in medical images is very important in neuroscience studies. Recently, multi-atlas patch-based label fusion methods have achieved many successes, which generally represent each target patch from an atlas patch dictionary in the image domain and then predict the latent label by directly applying the estimated representation coefficients in the label domain. However, due to the large gap between these two domains, the estimated representation coefficients in the image domain may not stay optimal for the label fusion. To overcome this dilemma, we propose a novel label fusion framework to make the weighting coefficients eventually to be optimal for the label fusion by progressively constructing a dynamic dictionary in a layer-by-layer manner, where a sequence of intermediate patch dictionaries gradually encode the transition from the patch representation coefficients in image domain to the optimal weights for label fusion. Our proposed framework is general to augment the label fusion performance of the current state-of-the-art methods. In our experiments, we apply our proposed method to hippocampus segmentation on ADNI dataset and achieve more accurate labeling results, compared to the counterpart methods with single-layer dictionary. PMID:26942233
Progressive Label Fusion Framework for Multi-atlas Segmentation by Dictionary Evolution.
Song, Yantao; Wu, Guorong; Sun, Quansen; Bahrami, Khosro; Li, Chunming; Shen, Dinggang
2015-10-01
Accurate segmentation of anatomical structures in medical images is very important in neuroscience studies. Recently, multi-atlas patch-based label fusion methods have achieved many successes, which generally represent each target patch from an atlas patch dictionary in the image domain and then predict the latent label by directly applying the estimated representation coefficients in the label domain. However, due to the large gap between these two domains, the estimated representation coefficients in the image domain may not stay optimal for the label fusion. To overcome this dilemma, we propose a novel label fusion framework to make the weighting coefficients eventually to be optimal for the label fusion by progressively constructing a dynamic dictionary in a layer-by-layer manner, where a sequence of intermediate patch dictionaries gradually encode the transition from the patch representation coefficients in image domain to the optimal weights for label fusion. Our proposed framework is general to augment the label fusion performance of the current state-of-the-art methods. In our experiments, we apply our proposed method to hippocampus segmentation on ADNI dataset and achieve more accurate labeling results, compared to the counterpart methods with single-layer dictionary.
Qiao, Yu; Wang, Wei; Minematsu, Nobuaki; Liu, Jianzhuang; Takeda, Mitsuo; Tang, Xiaoou
2009-10-01
This paper studies phase singularities (PSs) for image representation. We show that PSs calculated with Laguerre-Gauss filters contain important information and provide a useful tool for image analysis. PSs are invariant to image translation and rotation. We introduce several invariant features to characterize the core structures around PSs and analyze the stability of PSs to noise addition and scale change. We also study the characteristics of PSs in a scale space, which lead to a method to select key scales along phase singularity curves. We demonstrate two applications of PSs: object tracking and image matching. In object tracking, we use the iterative closest point algorithm to determine the correspondences of PSs between two adjacent frames. The use of PSs allows us to precisely determine the motions of tracked objects. In image matching, we combine PSs and scale-invariant feature transform (SIFT) descriptor to deal with the variations between two images and examine the proposed method on a benchmark database. The results indicate that our method can find more correct matching pairs with higher repeatability rates than some well-known methods.
Fine-grained visual marine vessel classification for coastal surveillance and defense applications
NASA Astrophysics Data System (ADS)
Solmaz, Berkan; Gundogdu, Erhan; Karaman, Kaan; Yücesoy, Veysel; Koç, Aykut
2017-10-01
The need for capabilities of automated visual content analysis has substantially increased due to presence of large number of images captured by surveillance cameras. With a focus on development of practical methods for extracting effective visual data representations, deep neural network based representations have received great attention due to their success in visual categorization of generic images. For fine-grained image categorization, a closely related yet a more challenging research problem compared to generic image categorization due to high visual similarities within subgroups, diverse applications were developed such as classifying images of vehicles, birds, food and plants. Here, we propose the use of deep neural network based representations for categorizing and identifying marine vessels for defense and security applications. First, we gather a large number of marine vessel images via online sources grouping them into four coarse categories; naval, civil, commercial and service vessels. Next, we subgroup naval vessels into fine categories such as corvettes, frigates and submarines. For distinguishing images, we extract state-of-the-art deep visual representations and train support-vector-machines. Furthermore, we fine tune deep representations for marine vessel images. Experiments address two scenarios, classification and verification of naval marine vessels. Classification experiment aims coarse categorization, as well as learning models of fine categories. Verification experiment embroils identification of specific naval vessels by revealing if a pair of images belongs to identical marine vessels by the help of learnt deep representations. Obtaining promising performance, we believe these presented capabilities would be essential components of future coastal and on-board surveillance systems.
NASA Astrophysics Data System (ADS)
Karimi, Davood; Ward, Rabab K.
2016-03-01
Sparse representation of signals in learned overcomplete dictionaries has proven to be a powerful tool with applications in denoising, restoration, compression, reconstruction, and more. Recent research has shown that learned overcomplete dictionaries can lead to better results than analytical dictionaries such as wavelets in almost all image processing applications. However, a major disadvantage of these dictionaries is that their learning and usage is very computationally intensive. In particular, finding the sparse representation of a signal in these dictionaries requires solving an optimization problem that leads to very long computational times, especially in 3D image processing. Moreover, the sparse representation found by greedy algorithms is usually sub-optimal. In this paper, we propose a novel two-level dictionary structure that improves the performance and the speed of standard greedy sparse coding methods. The first (i.e., the top) level in our dictionary is a fixed orthonormal basis, whereas the second level includes the atoms that are learned from the training data. We explain how such a dictionary can be learned from the training data and how the sparse representation of a new signal in this dictionary can be computed. As an application, we use the proposed dictionary structure for removing the noise and artifacts in 3D computed tomography (CT) images. Our experiments with real CT images show that the proposed method achieves results that are comparable with standard dictionary-based methods while substantially reducing the computational time.
Cross-Modal Retrieval With CNN Visual Features: A New Baseline.
Wei, Yunchao; Zhao, Yao; Lu, Canyi; Wei, Shikui; Liu, Luoqi; Zhu, Zhenfeng; Yan, Shuicheng
2017-02-01
Recently, convolutional neural network (CNN) visual features have demonstrated their powerful ability as a universal representation for various recognition tasks. In this paper, cross-modal retrieval with CNN visual features is implemented with several classic methods. Specifically, off-the-shelf CNN visual features are extracted from the CNN model, which is pretrained on ImageNet with more than one million images from 1000 object categories, as a generic image representation to tackle cross-modal retrieval. To further enhance the representational ability of CNN visual features, based on the pretrained CNN model on ImageNet, a fine-tuning step is performed by using the open source Caffe CNN library for each target data set. Besides, we propose a deep semantic matching method to address the cross-modal retrieval problem with respect to samples which are annotated with one or multiple labels. Extensive experiments on five popular publicly available data sets well demonstrate the superiority of CNN visual features for cross-modal retrieval.
Categorizing biomedicine images using novel image features and sparse coding representation
2013-01-01
Background Images embedded in biomedical publications carry rich information that often concisely summarize key hypotheses adopted, methods employed, or results obtained in a published study. Therefore, they offer valuable clues for understanding main content in a biomedical publication. Prior studies have pointed out the potential of mining images embedded in biomedical publications for automatically understanding and retrieving such images' associated source documents. Within the broad area of biomedical image processing, categorizing biomedical images is a fundamental step for building many advanced image analysis, retrieval, and mining applications. Similar to any automatic categorization effort, discriminative image features can provide the most crucial aid in the process. Method We observe that many images embedded in biomedical publications carry versatile annotation text. Based on the locations of and the spatial relationships between these text elements in an image, we thus propose some novel image features for image categorization purpose, which quantitatively characterize the spatial positions and distributions of text elements inside a biomedical image. We further adopt a sparse coding representation (SCR) based technique to categorize images embedded in biomedical publications by leveraging our newly proposed image features. Results we randomly selected 990 images of the JPG format for use in our experiments where 310 images were used as training samples and the rest were used as the testing cases. We first segmented 310 sample images following the our proposed procedure. This step produced a total of 1035 sub-images. We then manually labeled all these sub-images according to the two-level hierarchical image taxonomy proposed by [1]. Among our annotation results, 316 are microscopy images, 126 are gel electrophoresis images, 135 are line charts, 156 are bar charts, 52 are spot charts, 25 are tables, 70 are flow charts, and the remaining 155 images are of the type "others". A serial of experimental results are obtained. Firstly, each image categorizing results is presented, and next image categorizing performance indexes such as precision, recall, F-score, are all listed. Different features which include conventional image features and our proposed novel features indicate different categorizing performance, and the results are demonstrated. Thirdly, we conduct an accuracy comparison between support vector machine classification method and our proposed sparse representation classification method. At last, our proposed approach is compared with three peer classification method and experimental results verify our impressively improved performance. Conclusions Compared with conventional image features that do not exploit characteristics regarding text positions and distributions inside images embedded in biomedical publications, our proposed image features coupled with the SR based representation model exhibit superior performance for classifying biomedical images as demonstrated in our comparative benchmark study. PMID:24565470
Progressive multi-atlas label fusion by dictionary evolution.
Song, Yantao; Wu, Guorong; Bahrami, Khosro; Sun, Quansen; Shen, Dinggang
2017-02-01
Accurate segmentation of anatomical structures in medical images is important in recent imaging based studies. In the past years, multi-atlas patch-based label fusion methods have achieved a great success in medical image segmentation. In these methods, the appearance of each input image patch is first represented by an atlas patch dictionary (in the image domain), and then the latent label of the input image patch is predicted by applying the estimated representation coefficients to the corresponding anatomical labels of the atlas patches in the atlas label dictionary (in the label domain). However, due to the generally large gap between the patch appearance in the image domain and the patch structure in the label domain, the estimated (patch) representation coefficients from the image domain may not be optimal for the final label fusion, thus reducing the labeling accuracy. To address this issue, we propose a novel label fusion framework to seek for the suitable label fusion weights by progressively constructing a dynamic dictionary in a layer-by-layer manner, where the intermediate dictionaries act as a sequence of guidance to steer the transition of (patch) representation coefficients from the image domain to the label domain. Our proposed multi-layer label fusion framework is flexible enough to be applied to the existing labeling methods for improving their label fusion performance, i.e., by extending their single-layer static dictionary to the multi-layer dynamic dictionary. The experimental results show that our proposed progressive label fusion method achieves more accurate hippocampal segmentation results for the ADNI dataset, compared to the counterpart methods using only the single-layer static dictionary. Copyright © 2016 Elsevier B.V. All rights reserved.
Multi-level tree analysis of pulmonary artery/vein trees in non-contrast CT images
NASA Astrophysics Data System (ADS)
Gao, Zhiyun; Grout, Randall W.; Hoffman, Eric A.; Saha, Punam K.
2012-02-01
Diseases like pulmonary embolism and pulmonary hypertension are associated with vascular dystrophy. Identifying such pulmonary artery/vein (A/V) tree dystrophy in terms of quantitative measures via CT imaging significantly facilitates early detection of disease or a treatment monitoring process. A tree structure, consisting of nodes and connected arcs, linked to the volumetric representation allows multi-level geometric and volumetric analysis of A/V trees. Here, a new theory and method is presented to generate multi-level A/V tree representation of volumetric data and to compute quantitative measures of A/V tree geometry and topology at various tree hierarchies. The new method is primarily designed on arc skeleton computation followed by a tree construction based topologic and geometric analysis of the skeleton. The method starts with a volumetric A/V representation as input and generates its topologic and multi-level volumetric tree representations long with different multi-level morphometric measures. A new recursive merging and pruning algorithms are introduced to detect bad junctions and noisy branches often associated with digital geometric and topologic analysis. Also, a new notion of shortest axial path is introduced to improve the skeletal arc joining two junctions. The accuracy of the multi-level tree analysis algorithm has been evaluated using computer generated phantoms and pulmonary CT images of a pig vessel cast phantom while the reproducibility of method is evaluated using multi-user A/V separation of in vivo contrast-enhanced CT images of a pig lung at different respiratory volumes.
A fractal concentration area method for assigning a color palette for image representation
NASA Astrophysics Data System (ADS)
Cheng, Qiuming; Li, Qingmou
2002-05-01
Displaying the remotely sensed image with a proper color palette is the first task in any kind of image processing and pattern recognition in GIS and image processing environments. The purpose of displaying the image should be not only to provide a visual representation of the variance of the image, although this has been the primary objective of most conventional methods, but also the color palette should reflect real-world features on the ground which must be the primary objective of employing remotely sensed data. Although most conventional methods focus only on the first purpose of image representation, the concentration-area ( C- A plot) fractal method proposed in this paper aims to meet both purposes on the basis of pixel values and pixel value frequency distribution as well as spatial and geometrical properties of image patterns. The C- A method can be used to establish power-law relationships between the area A(⩾ s) with the pixel values greater than s and the pixel value s itself after plotting these values on log-log paper. A number of straight-line segments can be manually or automatically fitted to the points on the log-log paper, each representing a power-law relationship between the area A and the cutoff pixel value for s in a particular range. These straight-line segments can yield a group of cutoff values on the basis of which the image can be classified into discrete classes or zones. These zones usually correspond to the real-world features on the ground. A Windows program has been prepared in ActiveX format for implementing the C- A method and integrating it into other GIS and image processing systems. A case study of Landsat TM band 5 has been used to demonstrate the application of the method and the flexibility of the computer program.
Class Energy Image Analysis for Video Sensor-Based Gait Recognition: A Review
Lv, Zhuowen; Xing, Xianglei; Wang, Kejun; Guan, Donghai
2015-01-01
Gait is a unique perceptible biometric feature at larger distances, and the gait representation approach plays a key role in a video sensor-based gait recognition system. Class Energy Image is one of the most important gait representation methods based on appearance, which has received lots of attentions. In this paper, we reviewed the expressions and meanings of various Class Energy Image approaches, and analyzed the information in the Class Energy Images. Furthermore, the effectiveness and robustness of these approaches were compared on the benchmark gait databases. We outlined the research challenges and provided promising future directions for the field. To the best of our knowledge, this is the first review that focuses on Class Energy Image. It can provide a useful reference in the literature of video sensor-based gait representation approach. PMID:25574935
Diverse Region-Based CNN for Hyperspectral Image Classification.
Zhang, Mengmeng; Li, Wei; Du, Qian
2018-06-01
Convolutional neural network (CNN) is of great interest in machine learning and has demonstrated excellent performance in hyperspectral image classification. In this paper, we propose a classification framework, called diverse region-based CNN, which can encode semantic context-aware representation to obtain promising features. With merging a diverse set of discriminative appearance factors, the resulting CNN-based representation exhibits spatial-spectral context sensitivity that is essential for accurate pixel classification. The proposed method exploiting diverse region-based inputs to learn contextual interactional features is expected to have more discriminative power. The joint representation containing rich spectral and spatial information is then fed to a fully connected network and the label of each pixel vector is predicted by a softmax layer. Experimental results with widely used hyperspectral image data sets demonstrate that the proposed method can surpass any other conventional deep learning-based classifiers and other state-of-the-art classifiers.
Classification of multiple sclerosis lesions using adaptive dictionary learning.
Deshpande, Hrishikesh; Maurel, Pierre; Barillot, Christian
2015-12-01
This paper presents a sparse representation and an adaptive dictionary learning based method for automated classification of multiple sclerosis (MS) lesions in magnetic resonance (MR) images. Manual delineation of MS lesions is a time-consuming task, requiring neuroradiology experts to analyze huge volume of MR data. This, in addition to the high intra- and inter-observer variability necessitates the requirement of automated MS lesion classification methods. Among many image representation models and classification methods that can be used for such purpose, we investigate the use of sparse modeling. In the recent years, sparse representation has evolved as a tool in modeling data using a few basis elements of an over-complete dictionary and has found applications in many image processing tasks including classification. We propose a supervised classification approach by learning dictionaries specific to the lesions and individual healthy brain tissues, which include white matter (WM), gray matter (GM) and cerebrospinal fluid (CSF). The size of the dictionaries learned for each class plays a major role in data representation but it is an even more crucial element in the case of competitive classification. Our approach adapts the size of the dictionary for each class, depending on the complexity of the underlying data. The algorithm is validated using 52 multi-sequence MR images acquired from 13 MS patients. The results demonstrate the effectiveness of our approach in MS lesion classification. Copyright © 2015 Elsevier Ltd. All rights reserved.
Motion representation of the long fingers: a proposal for the definitions of new anatomical frames.
Coupier, Jérôme; Moiseev, Fédor; Feipel, Véronique; Rooze, Marcel; Van Sint Jan, Serge
2014-04-11
Despite the availability of the International Society of Biomechanics (ISB) recommendations for the orientation of anatomical frames, no consensus exists about motion representations related to finger kinematics. This paper proposes novel anatomical frames for motion representation of the phalangeal segments of the long fingers. A three-dimensional model of a human forefinger was acquired from a non-pathological fresh-frozen hand. Medical imaging was used to collect phalangeal discrete positions. Data processing was performed using a customized software interface ("lhpFusionBox") to create a specimen-specific model and to reconstruct the discrete motion path. Five examiners virtually palpated two sets of landmarks. These markers were then used to build anatomical frames following two methods: a reference method following ISB recommendations and a newly-developed method based on the mean helical axis (HA). Motion representations were obtained and compared between examiners. Virtual palpation precision was around 1mm, which is comparable to results from the literature. The comparison of the two methods showed that the helical axis method seemed more reproducible between examiners especially for secondary, or accessory, motions. Computed Root Mean Square distances comparing methods showed that the ISB method displayed a variability 10 times higher than the HA method. The HA method seems to be suitable for finger motion representation using discrete positions from medical imaging. Further investigations are required before being able to use the methodology with continuous tracking of markers set on the subject's hand. Copyright © 2014 Elsevier Ltd. All rights reserved.
Blind image deblurring based on trained dictionary and curvelet using sparse representation
NASA Astrophysics Data System (ADS)
Feng, Liang; Huang, Qian; Xu, Tingfa; Li, Shao
2015-04-01
Motion blur is one of the most significant and common artifacts causing poor image quality in digital photography, in which many factors resulted. In imaging process, if the objects are moving quickly in the scene or the camera moves in the exposure interval, the image of the scene would blur along the direction of relative motion between the camera and the scene, e.g. camera shake, atmospheric turbulence. Recently, sparse representation model has been widely used in signal and image processing, which is an effective method to describe the natural images. In this article, a new deblurring approach based on sparse representation is proposed. An overcomplete dictionary learned from the trained image samples via the KSVD algorithm is designed to represent the latent image. The motion-blur kernel can be treated as a piece-wise smooth function in image domain, whose support is approximately a thin smooth curve, so we employed curvelet to represent the blur kernel. Both of overcomplete dictionary and curvelet system have high sparsity, which improves the robustness to the noise and more satisfies the observer's visual demand. With the two priors, we constructed restoration model of blurred images and succeeded to solve the optimization problem with the help of alternating minimization technique. The experiment results prove the method can preserve the texture of original images and suppress the ring artifacts effectively.
Image Fusion of CT and MR with Sparse Representation in NSST Domain
Qiu, Chenhui; Wang, Yuanyuan; Zhang, Huan
2017-01-01
Multimodal image fusion techniques can integrate the information from different medical images to get an informative image that is more suitable for joint diagnosis, preoperative planning, intraoperative guidance, and interventional treatment. Fusing images of CT and different MR modalities are studied in this paper. Firstly, the CT and MR images are both transformed to nonsubsampled shearlet transform (NSST) domain. So the low-frequency components and high-frequency components are obtained. Then the high-frequency components are merged using the absolute-maximum rule, while the low-frequency components are merged by a sparse representation- (SR-) based approach. And the dynamic group sparsity recovery (DGSR) algorithm is proposed to improve the performance of the SR-based approach. Finally, the fused image is obtained by performing the inverse NSST on the merged components. The proposed fusion method is tested on a number of clinical CT and MR images and compared with several popular image fusion methods. The experimental results demonstrate that the proposed fusion method can provide better fusion results in terms of subjective quality and objective evaluation. PMID:29250134
Image Fusion of CT and MR with Sparse Representation in NSST Domain.
Qiu, Chenhui; Wang, Yuanyuan; Zhang, Huan; Xia, Shunren
2017-01-01
Multimodal image fusion techniques can integrate the information from different medical images to get an informative image that is more suitable for joint diagnosis, preoperative planning, intraoperative guidance, and interventional treatment. Fusing images of CT and different MR modalities are studied in this paper. Firstly, the CT and MR images are both transformed to nonsubsampled shearlet transform (NSST) domain. So the low-frequency components and high-frequency components are obtained. Then the high-frequency components are merged using the absolute-maximum rule, while the low-frequency components are merged by a sparse representation- (SR-) based approach. And the dynamic group sparsity recovery (DGSR) algorithm is proposed to improve the performance of the SR-based approach. Finally, the fused image is obtained by performing the inverse NSST on the merged components. The proposed fusion method is tested on a number of clinical CT and MR images and compared with several popular image fusion methods. The experimental results demonstrate that the proposed fusion method can provide better fusion results in terms of subjective quality and objective evaluation.
Adaptive structured dictionary learning for image fusion based on group-sparse-representation
NASA Astrophysics Data System (ADS)
Yang, Jiajie; Sun, Bin; Luo, Chengwei; Wu, Yuzhong; Xu, Limei
2018-04-01
Dictionary learning is the key process of sparse representation which is one of the most widely used image representation theories in image fusion. The existing dictionary learning method does not use the group structure information and the sparse coefficients well. In this paper, we propose a new adaptive structured dictionary learning algorithm and a l1-norm maximum fusion rule that innovatively utilizes grouped sparse coefficients to merge the images. In the dictionary learning algorithm, we do not need prior knowledge about any group structure of the dictionary. By using the characteristics of the dictionary in expressing the signal, our algorithm can automatically find the desired potential structure information that hidden in the dictionary. The fusion rule takes the physical meaning of the group structure dictionary, and makes activity-level judgement on the structure information when the images are being merged. Therefore, the fused image can retain more significant information. Comparisons have been made with several state-of-the-art dictionary learning methods and fusion rules. The experimental results demonstrate that, the dictionary learning algorithm and the fusion rule both outperform others in terms of several objective evaluation metrics.
Representation of photon limited data in emission tomography using origin ensembles
NASA Astrophysics Data System (ADS)
Sitek, A.
2008-06-01
Representation and reconstruction of data obtained by emission tomography scanners are challenging due to high noise levels in the data. Typically, images obtained using tomographic measurements are represented using grids. In this work, we define images as sets of origins of events detected during tomographic measurements; we call these origin ensembles (OEs). A state in the ensemble is characterized by a vector of 3N parameters Y, where the parameters are the coordinates of origins of detected events in a three-dimensional space and N is the number of detected events. The 3N-dimensional probability density function (PDF) for that ensemble is derived, and we present an algorithm for OE image estimation from tomographic measurements. A displayable image (e.g. grid based image) is derived from the OE formulation by calculating ensemble expectations based on the PDF using the Markov chain Monte Carlo method. The approach was applied to computer-simulated 3D list-mode positron emission tomography data. The reconstruction errors for a 10 000 000 event acquisition for simulated ranged from 0.1 to 34.8%, depending on object size and sampling density. The method was also applied to experimental data and the results of the OE method were consistent with those obtained by a standard maximum-likelihood approach. The method is a new approach to representation and reconstruction of data obtained by photon-limited emission tomography measurements.
Research on simulated infrared image utility evaluation using deep representation
NASA Astrophysics Data System (ADS)
Zhang, Ruiheng; Mu, Chengpo; Yang, Yu; Xu, Lixin
2018-01-01
Infrared (IR) image simulation is an important data source for various target recognition systems. However, whether simulated IR images could be used as training data for classifiers depends on the features of fidelity and authenticity of simulated IR images. For evaluation of IR image features, a deep-representation-based algorithm is proposed. Being different from conventional methods, which usually adopt a priori knowledge or manually designed feature, the proposed method can extract essential features and quantitatively evaluate the utility of simulated IR images. First, for data preparation, we employ our IR image simulation system to generate large amounts of IR images. Then, we present the evaluation model of simulated IR image, for which an end-to-end IR feature extraction and target detection model based on deep convolutional neural network is designed. At last, the experiments illustrate that our proposed method outperforms other verification algorithms in evaluating simulated IR images. Cross-validation, variable proportion mixed data validation, and simulation process contrast experiments are carried out to evaluate the utility and objectivity of the images generated by our simulation system. The optimum mixing ratio between simulated and real data is 0.2≤γ≤0.3, which is an effective data augmentation method for real IR images.
Jia, Yuanyuan; He, Zhongshi; Gholipour, Ali; Warfield, Simon K
2016-11-01
In magnetic resonance (MR), hardware limitation, scanning time, and patient comfort often result in the acquisition of anisotropic 3-D MR images. Enhancing image resolution is desired but has been very challenging in medical image processing. Super resolution reconstruction based on sparse representation and overcomplete dictionary has been lately employed to address this problem; however, these methods require extra training sets, which may not be always available. This paper proposes a novel single anisotropic 3-D MR image upsampling method via sparse representation and overcomplete dictionary that is trained from in-plane high resolution slices to upsample in the out-of-plane dimensions. The proposed method, therefore, does not require extra training sets. Abundant experiments, conducted on simulated and clinical brain MR images, show that the proposed method is more accurate than classical interpolation. When compared to a recent upsampling method based on the nonlocal means approach, the proposed method did not show improved results at low upsampling factors with simulated images, but generated comparable results with much better computational efficiency in clinical cases. Therefore, the proposed approach can be efficiently implemented and routinely used to upsample MR images in the out-of-planes views for radiologic assessment and postacquisition processing.
Adaptive wiener image restoration kernel
Yuan, Ding [Henderson, NV
2007-06-05
A method and device for restoration of electro-optical image data using an adaptive Wiener filter begins with constructing imaging system Optical Transfer Function, and the Fourier Transformations of the noise and the image. A spatial representation of the imaged object is restored by spatial convolution of the image using a Wiener restoration kernel.
Psoriasis image representation using patch-based dictionary learning for erythema severity scoring.
George, Yasmeen; Aldeen, Mohammad; Garnavi, Rahil
2018-06-01
Psoriasis is a chronic skin disease which can be life-threatening. Accurate severity scoring helps dermatologists to decide on the treatment. In this paper, we present a semi-supervised computer-aided system for automatic erythema severity scoring in psoriasis images. Firstly, the unsupervised stage includes a novel image representation method. We construct a dictionary, which is then used in the sparse representation for local feature extraction. To acquire the final image representation vector, an aggregation method is exploited over the local features. Secondly, the supervised phase is where various multi-class machine learning (ML) classifiers are trained for erythema severity scoring. Finally, we compare the proposed system with two popular unsupervised feature extractor methods, namely: bag of visual words model (BoVWs) and AlexNet pretrained model. Root mean square error (RMSE) and F1 score are used as performance measures for the learned dictionaries and the trained ML models, respectively. A psoriasis image set consisting of 676 images, is used in this study. Experimental results demonstrate that the use of the proposed procedure can provide a setup where erythema scoring is accurate and consistent. Also, it is revealed that dictionaries with large number of atoms and small patch sizes yield the best representative erythema severity features. Further, random forest (RF) outperforms other classifiers with F1 score 0.71, followed by support vector machine (SVM) and boosting with 0.66 and 0.64 scores, respectively. Furthermore, the conducted comparative studies confirm the effectiveness of the proposed approach with improvement of 9% and 12% over BoVWs and AlexNet based features, respectively. Crown Copyright © 2018. Published by Elsevier Ltd. All rights reserved.
Bayesian image reconstruction - The pixon and optimal image modeling
NASA Technical Reports Server (NTRS)
Pina, R. K.; Puetter, R. C.
1993-01-01
In this paper we describe the optimal image model, maximum residual likelihood method (OptMRL) for image reconstruction. OptMRL is a Bayesian image reconstruction technique for removing point-spread function blurring. OptMRL uses both a goodness-of-fit criterion (GOF) and an 'image prior', i.e., a function which quantifies the a priori probability of the image. Unlike standard maximum entropy methods, which typically reconstruct the image on the data pixel grid, OptMRL varies the image model in order to find the optimal functional basis with which to represent the image. We show how an optimal basis for image representation can be selected and in doing so, develop the concept of the 'pixon' which is a generalized image cell from which this basis is constructed. By allowing both the image and the image representation to be variable, the OptMRL method greatly increases the volume of solution space over which the image is optimized. Hence the likelihood of the final reconstructed image is greatly increased. For the goodness-of-fit criterion, OptMRL uses the maximum residual likelihood probability distribution introduced previously by Pina and Puetter (1992). This GOF probability distribution, which is based on the spatial autocorrelation of the residuals, has the advantage that it ensures spatially uncorrelated image reconstruction residuals.
Adaptive compressive ghost imaging based on wavelet trees and sparse representation.
Yu, Wen-Kai; Li, Ming-Fei; Yao, Xu-Ri; Liu, Xue-Feng; Wu, Ling-An; Zhai, Guang-Jie
2014-03-24
Compressed sensing is a theory which can reconstruct an image almost perfectly with only a few measurements by finding its sparsest representation. However, the computation time consumed for large images may be a few hours or more. In this work, we both theoretically and experimentally demonstrate a method that combines the advantages of both adaptive computational ghost imaging and compressed sensing, which we call adaptive compressive ghost imaging, whereby both the reconstruction time and measurements required for any image size can be significantly reduced. The technique can be used to improve the performance of all computational ghost imaging protocols, especially when measuring ultra-weak or noisy signals, and can be extended to imaging applications at any wavelength.
NASA Astrophysics Data System (ADS)
Liu, Changjiang; Cheng, Irene; Zhang, Yi; Basu, Anup
2017-06-01
This paper presents an improved multi-scale Retinex (MSR) based enhancement for ariel images under low visibility. For traditional multi-scale Retinex, three scales are commonly employed, which limits its application scenarios. We extend our research to a general purpose enhanced method, and design an MSR with more than three scales. Based on the mathematical analysis and deductions, an explicit multi-scale representation is proposed that balances image contrast and color consistency. In addition, a histogram truncation technique is introduced as a post-processing strategy to remap the multi-scale Retinex output to the dynamic range of the display. Analysis of experimental results and comparisons with existing algorithms demonstrate the effectiveness and generality of the proposed method. Results on image quality assessment proves the accuracy of the proposed method with respect to both objective and subjective criteria.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Li; Gao, Yaozong; Shi, Feng
Purpose: Cone-beam computed tomography (CBCT) is an increasingly utilized imaging modality for the diagnosis and treatment planning of the patients with craniomaxillofacial (CMF) deformities. Accurate segmentation of CBCT image is an essential step to generate three-dimensional (3D) models for the diagnosis and treatment planning of the patients with CMF deformities. However, due to the poor image quality, including very low signal-to-noise ratio and the widespread image artifacts such as noise, beam hardening, and inhomogeneity, it is challenging to segment the CBCT images. In this paper, the authors present a new automatic segmentation method to address these problems. Methods: To segmentmore » CBCT images, the authors propose a new method for fully automated CBCT segmentation by using patch-based sparse representation to (1) segment bony structures from the soft tissues and (2) further separate the mandible from the maxilla. Specifically, a region-specific registration strategy is first proposed to warp all the atlases to the current testing subject and then a sparse-based label propagation strategy is employed to estimate a patient-specific atlas from all aligned atlases. Finally, the patient-specific atlas is integrated into amaximum a posteriori probability-based convex segmentation framework for accurate segmentation. Results: The proposed method has been evaluated on a dataset with 15 CBCT images. The effectiveness of the proposed region-specific registration strategy and patient-specific atlas has been validated by comparing with the traditional registration strategy and population-based atlas. The experimental results show that the proposed method achieves the best segmentation accuracy by comparison with other state-of-the-art segmentation methods. Conclusions: The authors have proposed a new CBCT segmentation method by using patch-based sparse representation and convex optimization, which can achieve considerably accurate segmentation results in CBCT segmentation based on 15 patients.« less
Bag-of-visual-ngrams for histopathology image classification
NASA Astrophysics Data System (ADS)
López-Monroy, A. Pastor; Montes-y-Gómez, Manuel; Escalante, Hugo Jair; Cruz-Roa, Angel; González, Fabio A.
2013-11-01
This paper describes an extension of the Bag-of-Visual-Words (BoVW) representation for image categorization (IC) of histophatology images. This representation is one of the most used approaches in several high-level computer vision tasks. However, the BoVW representation has an important limitation: the disregarding of spatial information among visual words. This information may be useful to capture discriminative visual-patterns in specific computer vision tasks. In order to overcome this problem we propose the use of visual n-grams. N-grams based-representations are very popular in the field of natural language processing (NLP), in particular within text mining and information retrieval. We propose building a codebook of n-grams and then representing images by histograms of visual n-grams. We evaluate our proposal in the challenging task of classifying histopathology images. The novelty of our proposal lies in the fact that we use n-grams as attributes for a classification model (together with visual-words, i.e., 1-grams). This is common practice within NLP, although, to the best of our knowledge, this idea has not been explored yet within computer vision. We report experimental results in a database of histopathology images where our proposed method outperforms the traditional BoVWs formulation.
The Radon cumulative distribution transform and its application to image classification
Kolouri, Soheil; Park, Se Rim; Rohde, Gustavo K.
2016-01-01
Invertible image representation methods (transforms) are routinely employed as low-level image processing operations based on which feature extraction and recognition algorithms are developed. Most transforms in current use (e.g. Fourier, Wavelet, etc.) are linear transforms, and, by themselves, are unable to substantially simplify the representation of image classes for classification. Here we describe a nonlinear, invertible, low-level image processing transform based on combining the well known Radon transform for image data, and the 1D Cumulative Distribution Transform proposed earlier. We describe a few of the properties of this new transform, and with both theoretical and experimental results show that it can often render certain problems linearly separable in transform space. PMID:26685245
Visual Image Sensor Organ Replacement: Implementation
NASA Technical Reports Server (NTRS)
Maluf, A. David (Inventor)
2011-01-01
Method and system for enhancing or extending visual representation of a selected region of a visual image, where visual representation is interfered with or distorted, by supplementing a visual signal with at least one audio signal having one or more audio signal parameters that represent one or more visual image parameters, such as vertical and/or horizontal location of the region; region brightness; dominant wavelength range of the region; change in a parameter value that characterizes the visual image, with respect to a reference parameter value; and time rate of change in a parameter value that characterizes the visual image. Region dimensions can be changed to emphasize change with time of a visual image parameter.
An efficient classification method based on principal component and sparse representation.
Zhai, Lin; Fu, Shujun; Zhang, Caiming; Liu, Yunxian; Wang, Lu; Liu, Guohua; Yang, Mingqiang
2016-01-01
As an important application in optical imaging, palmprint recognition is interfered by many unfavorable factors. An effective fusion of blockwise bi-directional two-dimensional principal component analysis and grouping sparse classification is presented. The dimension reduction and normalizing are implemented by the blockwise bi-directional two-dimensional principal component analysis for palmprint images to extract feature matrixes, which are assembled into an overcomplete dictionary in sparse classification. A subspace orthogonal matching pursuit algorithm is designed to solve the grouping sparse representation. Finally, the classification result is gained by comparing the residual between testing and reconstructed images. Experiments are carried out on a palmprint database, and the results show that this method has better robustness against position and illumination changes of palmprint images, and can get higher rate of palmprint recognition.
Generating virtual training samples for sparse representation of face images and face recognition
NASA Astrophysics Data System (ADS)
Du, Yong; Wang, Yu
2016-03-01
There are many challenges in face recognition. In real-world scenes, images of the same face vary with changing illuminations, different expressions and poses, multiform ornaments, or even altered mental status. Limited available training samples cannot convey these possible changes in the training phase sufficiently, and this has become one of the restrictions to improve the face recognition accuracy. In this article, we view the multiplication of two images of the face as a virtual face image to expand the training set and devise a representation-based method to perform face recognition. The generated virtual samples really reflect some possible appearance and pose variations of the face. By multiplying a training sample with another sample from the same subject, we can strengthen the facial contour feature and greatly suppress the noise. Thus, more human essential information is retained. Also, uncertainty of the training data is simultaneously reduced with the increase of the training samples, which is beneficial for the training phase. The devised representation-based classifier uses both the original and new generated samples to perform the classification. In the classification phase, we first determine K nearest training samples for the current test sample by calculating the Euclidean distances between the test sample and training samples. Then, a linear combination of these selected training samples is used to represent the test sample, and the representation result is used to classify the test sample. The experimental results show that the proposed method outperforms some state-of-the-art face recognition methods.
Image reconstruction from few-view CT data by gradient-domain dictionary learning.
Hu, Zhanli; Liu, Qiegen; Zhang, Na; Zhang, Yunwan; Peng, Xi; Wu, Peter Z; Zheng, Hairong; Liang, Dong
2016-05-21
Decreasing the number of projections is an effective way to reduce the radiation dose exposed to patients in medical computed tomography (CT) imaging. However, incomplete projection data for CT reconstruction will result in artifacts and distortions. In this paper, a novel dictionary learning algorithm operating in the gradient-domain (Grad-DL) is proposed for few-view CT reconstruction. Specifically, the dictionaries are trained from the horizontal and vertical gradient images, respectively and the desired image is reconstructed subsequently from the sparse representations of both gradients by solving the least-square method. Since the gradient images are sparser than the image itself, the proposed approach could lead to sparser representations than conventional DL methods in the image-domain, and thus a better reconstruction quality is achieved. To evaluate the proposed Grad-DL algorithm, both qualitative and quantitative studies were employed through computer simulations as well as real data experiments on fan-beam and cone-beam geometry. The results show that the proposed algorithm can yield better images than the existing algorithms.
Spectral compression algorithms for the analysis of very large multivariate images
Keenan, Michael R.
2007-10-16
A method for spectrally compressing data sets enables the efficient analysis of very large multivariate images. The spectral compression algorithm uses a factored representation of the data that can be obtained from Principal Components Analysis or other factorization technique. Furthermore, a block algorithm can be used for performing common operations more efficiently. An image analysis can be performed on the factored representation of the data, using only the most significant factors. The spectral compression algorithm can be combined with a spatial compression algorithm to provide further computational efficiencies.
Learning of Multimodal Representations With Random Walks on the Click Graph.
Wu, Fei; Lu, Xinyan; Song, Jun; Yan, Shuicheng; Zhang, Zhongfei Mark; Rui, Yong; Zhuang, Yueting
2016-02-01
In multimedia information retrieval, most classic approaches tend to represent different modalities of media in the same feature space. With the click data collected from the users' searching behavior, existing approaches take either one-to-one paired data (text-image pairs) or ranking examples (text-query-image and/or image-query-text ranking lists) as training examples, which do not make full use of the click data, particularly the implicit connections among the data objects. In this paper, we treat the click data as a large click graph, in which vertices are images/text queries and edges indicate the clicks between an image and a query. We consider learning a multimodal representation from the perspective of encoding the explicit/implicit relevance relationship between the vertices in the click graph. By minimizing both the truncated random walk loss as well as the distance between the learned representation of vertices and their corresponding deep neural network output, the proposed model which is named multimodal random walk neural network (MRW-NN) can be applied to not only learn robust representation of the existing multimodal data in the click graph, but also deal with the unseen queries and images to support cross-modal retrieval. We evaluate the latent representation learned by MRW-NN on a public large-scale click log data set Clickture and further show that MRW-NN achieves much better cross-modal retrieval performance on the unseen queries/images than the other state-of-the-art methods.
Agner, Shannon C; Xu, Jun; Madabhushi, Anant
2013-03-01
Segmentation of breast lesions on dynamic contrast enhanced (DCE) magnetic resonance imaging (MRI) is the first step in lesion diagnosis in a computer-aided diagnosis framework. Because manual segmentation of such lesions is both time consuming and highly susceptible to human error and issues of reproducibility, an automated lesion segmentation method is highly desirable. Traditional automated image segmentation methods such as boundary-based active contour (AC) models require a strong gradient at the lesion boundary. Even when region-based terms are introduced to an AC model, grayscale image intensities often do not allow for clear definition of foreground and background region statistics. Thus, there is a need to find alternative image representations that might provide (1) strong gradients at the margin of the object of interest (OOI); and (2) larger separation between intensity distributions and region statistics for the foreground and background, which are necessary to halt evolution of the AC model upon reaching the border of the OOI. In this paper, the authors introduce a spectral embedding (SE) based AC (SEAC) for lesion segmentation on breast DCE-MRI. SE, a nonlinear dimensionality reduction scheme, is applied to the DCE time series in a voxelwise fashion to reduce several time point images to a single parametric image where every voxel is characterized by the three dominant eigenvectors. This parametric eigenvector image (PrEIm) representation allows for better capture of image region statistics and stronger gradients for use with a hybrid AC model, which is driven by both boundary and region information. They compare SEAC to ACs that employ fuzzy c-means (FCM) and principal component analysis (PCA) as alternative image representations. Segmentation performance was evaluated by boundary and region metrics as well as comparing lesion classification using morphological features from SEAC, PCA+AC, and FCM+AC. On a cohort of 50 breast DCE-MRI studies, PrEIm yielded overall better region and boundary-based statistics compared to the original DCE-MR image, FCM, and PCA based image representations. Additionally, SEAC outperformed a hybrid AC applied to both PCA and FCM image representations. Mean dice similarity coefficient (DSC) for SEAC was significantly better (DSC = 0.74 ± 0.21) than FCM+AC (DSC = 0.50 ± 0.32) and similar to PCA+AC (DSC = 0.73 ± 0.22). Boundary-based metrics of mean absolute difference and Hausdorff distance followed the same trends. Of the automated segmentation methods, breast lesion classification based on morphologic features derived from SEAC segmentation using a support vector machine classifier also performed better (AUC = 0.67 ± 0.05; p < 0.05) than FCM+AC (AUC = 0.50 ± 0.07), and PCA+AC (AUC = 0.49 ± 0.07). In this work, we presented SEAC, an accurate, general purpose AC segmentation tool that could be applied to any imaging domain that employs time series data. SE allows for projection of time series data into a PrEIm representation so that every voxel is characterized by the dominant eigenvectors, capturing the global and local time-intensity curve similarities in the data. This PrEIm allows for the calculation of strong tensor gradients and better region statistics than the original image intensities or alternative image representations such as PCA and FCM. The PrEIm also allows for building a more accurate hybrid AC scheme.
NASA Astrophysics Data System (ADS)
Cai, Ailong; Li, Lei; Zheng, Zhizhong; Zhang, Hanming; Wang, Linyuan; Hu, Guoen; Yan, Bin
2018-02-01
In medical imaging many conventional regularization methods, such as total variation or total generalized variation, impose strong prior assumptions which can only account for very limited classes of images. A more reasonable sparse representation frame for images is still badly needed. Visually understandable images contain meaningful patterns, and combinations or collections of these patterns can be utilized to form some sparse and redundant representations which promise to facilitate image reconstructions. In this work, we propose and study block matching sparsity regularization (BMSR) and devise an optimization program using BMSR for computed tomography (CT) image reconstruction for an incomplete projection set. The program is built as a constrained optimization, minimizing the L1-norm of the coefficients of the image in the transformed domain subject to data observation and positivity of the image itself. To solve the program efficiently, a practical method based on the proximal point algorithm is developed and analyzed. In order to accelerate the convergence rate, a practical strategy for tuning the BMSR parameter is proposed and applied. The experimental results for various settings, including real CT scanning, have verified the proposed reconstruction method showing promising capabilities over conventional regularization.
Terrien, Jérémy; Marque, Catherine; Germain, Guy
2008-05-01
Time-frequency representations (TFRs) of signals are increasingly being used in biomedical research. Analysis of such representations is sometimes difficult, however, and is often reduced to the extraction of ridges, or local energy maxima. In this paper, we describe a new ridge extraction method based on the image processing technique of active contours or snakes. We have tested our method on several synthetic signals and for the analysis of uterine electromyogram or electrohysterogram (EHG) recorded during gestation in monkeys. We have also evaluated a postprocessing algorithm that is especially suited for EHG analysis. Parameters are evaluated on real EHG signals in different gestational periods. The presented method gives good results when applied to synthetic as well as EHG signals. We have been able to obtain smaller ridge extraction errors when compared to two other methods specially developed for EHG. The gradient vector flow (GVF) snake method, or GVF-snake method, appears to be a good ridge extraction tool, which could be used on TFR of mono or multicomponent signals with good results.
NASA Astrophysics Data System (ADS)
Tian, Shu; Zhang, Ye; Yan, Yimin; Su, Nan; Zhang, Junping
2016-09-01
Latent low-rank representation (LatLRR) has been attached considerable attention in the field of remote sensing image segmentation, due to its effectiveness in exploring the multiple subspace structures of data. However, the increasingly heterogeneous texture information in the high spatial resolution remote sensing images, leads to more severe interference of pixels in local neighborhood, and the LatLRR fails to capture the local complex structure information. Therefore, we present a local sparse structure constrainted latent low-rank representation (LSSLatLRR) segmentation method, which explicitly imposes the local sparse structure constraint on LatLRR to capture the intrinsic local structure in manifold structure feature subspaces. The whole segmentation framework can be viewed as two stages in cascade. In the first stage, we use the local histogram transform to extract the texture local histogram features (LHOG) at each pixel, which can efficiently capture the complex and micro-texture pattern. In the second stage, a local sparse structure (LSS) formulation is established on LHOG, which aims to preserve the local intrinsic structure and enhance the relationship between pixels having similar local characteristics. Meanwhile, by integrating the LSS and the LatLRR, we can efficiently capture the local sparse and low-rank structure in the mixture of feature subspace, and we adopt the subspace segmentation method to improve the segmentation accuracy. Experimental results on the remote sensing images with different spatial resolution show that, compared with three state-of-the-art image segmentation methods, the proposed method achieves more accurate segmentation results.
NASA Astrophysics Data System (ADS)
Resita, I.; Ertikanto, C.
2018-05-01
This study aims to develop electronic module design based on Learning Content Development System (LCDS) to foster students’ multi representation skills in physics subject material. This study uses research and development method to the product design. This study involves 90 students and 6 physics teachers who were randomly chosen from 3 different Senior High Schools in Lampung Province. The data were collected by using questionnaires and analyzed by using quantitative descriptive method. Based on the data, 95% of the students only use one form of representation in solving physics problems. Representation which is tend to be used by students is symbolic representation. Students are considered to understand the concept of physics if they are able to change from one form to the other forms of representation. Product design of LCDS-based electronic module presents text, image, symbolic, video, and animation representation.
Qian, Jianjun; Yang, Jian; Xu, Yong
2013-09-01
This paper presents a robust but simple image feature extraction method, called image decomposition based on local structure (IDLS). It is assumed that in the local window of an image, the macro-pixel (patch) of the central pixel, and those of its neighbors, are locally linear. IDLS captures the local structural information by describing the relationship between the central macro-pixel and its neighbors. This relationship is represented with the linear representation coefficients determined using ridge regression. One image is actually decomposed into a series of sub-images (also called structure images) according to a local structure feature vector. All the structure images, after being down-sampled for dimensionality reduction, are concatenated into one super-vector. Fisher linear discriminant analysis is then used to provide a low-dimensional, compact, and discriminative representation for each super-vector. The proposed method is applied to face recognition and examined using our real-world face image database, NUST-RWFR, and five popular, publicly available, benchmark face image databases (AR, Extended Yale B, PIE, FERET, and LFW). Experimental results show the performance advantages of IDLS over state-of-the-art algorithms.
NASA Astrophysics Data System (ADS)
Li, Zhengji; Teng, Qizhi; He, Xiaohai; Yue, Guihua; Wang, Zhengyong
2017-09-01
The parameter evaluation of reservoir rocks can help us to identify components and calculate the permeability and other parameters, and it plays an important role in the petroleum industry. Until now, computed tomography (CT) has remained an irreplaceable way to acquire the microstructure of reservoir rocks. During the evaluation and analysis, large samples and high-resolution images are required in order to obtain accurate results. Owing to the inherent limitations of CT, however, a large field of view results in low-resolution images, and high-resolution images entail a smaller field of view. Our method is a promising solution to these data collection limitations. In this study, a framework for sparse representation-based 3D volumetric super-resolution is proposed to enhance the resolution of 3D voxel images of reservoirs scanned with CT. A single reservoir structure and its downgraded model are divided into a large number of 3D cubes of voxel pairs and these cube pairs are used to calculate two overcomplete dictionaries and the sparse-representation coefficients in order to estimate the high frequency component. Future more, to better result, a new feature extract method with combine BM4D together with Laplacian filter are introduced. In addition, we conducted a visual evaluation of the method, and used the PSNR and FSIM to evaluate it qualitatively.
Deep visual-semantic for crowded video understanding
NASA Astrophysics Data System (ADS)
Deng, Chunhua; Zhang, Junwen
2018-03-01
Visual-semantic features play a vital role for crowded video understanding. Convolutional Neural Networks (CNNs) have experienced a significant breakthrough in learning representations from images. However, the learning of visualsemantic features, and how it can be effectively extracted for video analysis, still remains a challenging task. In this study, we propose a novel visual-semantic method to capture both appearance and dynamic representations. In particular, we propose a spatial context method, based on the fractional Fisher vector (FV) encoding on CNN features, which can be regarded as our main contribution. In addition, to capture temporal context information, we also applied fractional encoding method on dynamic images. Experimental results on the WWW crowed video dataset demonstrate that the proposed method outperform the state of the art.
NASA Technical Reports Server (NTRS)
Juday, Richard D. (Editor)
1988-01-01
The present conference discusses topics in pattern-recognition correlator architectures, digital stereo systems, geometric image transformations and their applications, topics in pattern recognition, filter algorithms, object detection and classification, shape representation techniques, and model-based object recognition methods. Attention is given to edge-enhancement preprocessing using liquid crystal TVs, massively-parallel optical data base management, three-dimensional sensing with polar exponential sensor arrays, the optical processing of imaging spectrometer data, hybrid associative memories and metric data models, the representation of shape primitives in neural networks, and the Monte Carlo estimation of moment invariants for pattern recognition.
Li, Wei; Cao, Peng; Zhao, Dazhe; Wang, Junbo
2016-01-01
Computer aided detection (CAD) systems can assist radiologists by offering a second opinion on early diagnosis of lung cancer. Classification and feature representation play critical roles in false-positive reduction (FPR) in lung nodule CAD. We design a deep convolutional neural networks method for nodule classification, which has an advantage of autolearning representation and strong generalization ability. A specified network structure for nodule images is proposed to solve the recognition of three types of nodules, that is, solid, semisolid, and ground glass opacity (GGO). Deep convolutional neural networks are trained by 62,492 regions-of-interest (ROIs) samples including 40,772 nodules and 21,720 nonnodules from the Lung Image Database Consortium (LIDC) database. Experimental results demonstrate the effectiveness of the proposed method in terms of sensitivity and overall accuracy and that it consistently outperforms the competing methods.
Visual saliency detection based on in-depth analysis of sparse representation
NASA Astrophysics Data System (ADS)
Wang, Xin; Shen, Siqiu; Ning, Chen
2018-03-01
Visual saliency detection has been receiving great attention in recent years since it can facilitate a wide range of applications in computer vision. A variety of saliency models have been proposed based on different assumptions within which saliency detection via sparse representation is one of the newly arisen approaches. However, most existing sparse representation-based saliency detection methods utilize partial characteristics of sparse representation, lacking of in-depth analysis. Thus, they may have limited detection performance. Motivated by this, this paper proposes an algorithm for detecting visual saliency based on in-depth analysis of sparse representation. A number of discriminative dictionaries are first learned with randomly sampled image patches by means of inner product-based dictionary atom classification. Then, the input image is partitioned into many image patches, and these patches are classified into salient and nonsalient ones based on the in-depth analysis of sparse coding coefficients. Afterward, sparse reconstruction errors are calculated for the salient and nonsalient patch sets. By investigating the sparse reconstruction errors, the most salient atoms, which tend to be from the most salient region, are screened out and taken away from the discriminative dictionaries. Finally, an effective method is exploited for saliency map generation with the reduced dictionaries. Comprehensive evaluations on publicly available datasets and comparisons with some state-of-the-art approaches demonstrate the effectiveness of the proposed algorithm.
Semantic image segmentation with fused CNN features
NASA Astrophysics Data System (ADS)
Geng, Hui-qiang; Zhang, Hua; Xue, Yan-bing; Zhou, Mian; Xu, Guang-ping; Gao, Zan
2017-09-01
Semantic image segmentation is a task to predict a category label for every image pixel. The key challenge of it is to design a strong feature representation. In this paper, we fuse the hierarchical convolutional neural network (CNN) features and the region-based features as the feature representation. The hierarchical features contain more global information, while the region-based features contain more local information. The combination of these two kinds of features significantly enhances the feature representation. Then the fused features are used to train a softmax classifier to produce per-pixel label assignment probability. And a fully connected conditional random field (CRF) is used as a post-processing method to improve the labeling consistency. We conduct experiments on SIFT flow dataset. The pixel accuracy and class accuracy are 84.4% and 34.86%, respectively.
Iris Image Classification Based on Hierarchical Visual Codebook.
Zhenan Sun; Hui Zhang; Tieniu Tan; Jianyu Wang
2014-06-01
Iris recognition as a reliable method for personal identification has been well-studied with the objective to assign the class label of each iris image to a unique subject. In contrast, iris image classification aims to classify an iris image to an application specific category, e.g., iris liveness detection (classification of genuine and fake iris images), race classification (e.g., classification of iris images of Asian and non-Asian subjects), coarse-to-fine iris identification (classification of all iris images in the central database into multiple categories). This paper proposes a general framework for iris image classification based on texture analysis. A novel texture pattern representation method called Hierarchical Visual Codebook (HVC) is proposed to encode the texture primitives of iris images. The proposed HVC method is an integration of two existing Bag-of-Words models, namely Vocabulary Tree (VT), and Locality-constrained Linear Coding (LLC). The HVC adopts a coarse-to-fine visual coding strategy and takes advantages of both VT and LLC for accurate and sparse representation of iris texture. Extensive experimental results demonstrate that the proposed iris image classification method achieves state-of-the-art performance for iris liveness detection, race classification, and coarse-to-fine iris identification. A comprehensive fake iris image database simulating four types of iris spoof attacks is developed as the benchmark for research of iris liveness detection.
Wang, Li; Shi, Feng; Gao, Yaozong; Li, Gang; Gilmore, John H.; Lin, Weili; Shen, Dinggang
2014-01-01
Segmentation of infant brain MR images is challenging due to poor spatial resolution, severe partial volume effect, and the ongoing maturation and myelination process. During the first year of life, the brain image contrast between white and gray matters undergoes dramatic changes. In particular, the image contrast inverses around 6–8 months of age, where the white and gray matter tissues are isointense in T1 and T2 weighted images and hence exhibit the extremely low tissue contrast, posing significant challenges for automated segmentation. In this paper, we propose a general framework that adopts sparse representation to fuse the multi-modality image information and further incorporate the anatomical constraints for brain tissue segmentation. Specifically, we first derive an initial segmentation from a library of aligned images with ground-truth segmentations by using sparse representation in a patch-based fashion for the multi-modality T1, T2 and FA images. The segmentation result is further iteratively refined by integration of the anatomical constraint. The proposed method was evaluated on 22 infant brain MR images acquired at around 6 months of age by using a leave-one-out cross-validation, as well as other 10 unseen testing subjects. Our method achieved a high accuracy for the Dice ratios that measure the volume overlap between automated and manual segmentations, i.e., 0.889±0.008 for white matter and 0.870±0.006 for gray matter. PMID:24291615
Mobile visual object identification: from SIFT-BoF-RANSAC to Sketchprint
NASA Astrophysics Data System (ADS)
Voloshynovskiy, Sviatoslav; Diephuis, Maurits; Holotyak, Taras
2015-03-01
Mobile object identification based on its visual features find many applications in the interaction with physical objects and security. Discriminative and robust content representation plays a central role in object and content identification. Complex post-processing methods are used to compress descriptors and their geometrical information, aggregate them into more compact and discriminative representations and finally re-rank the results based on the similarity geometries of descriptors. Unfortunately, most of the existing descriptors are not very robust and discriminative once applied to the various contend such as real images, text or noise-like microstructures next to requiring at least 500-1'000 descriptors per image for reliable identification. At the same time, the geometric re-ranking procedures are still too complex to be applied to the numerous candidates obtained from the feature similarity based search only. This restricts that list of candidates to be less than 1'000 which obviously causes a higher probability of miss. In addition, the security and privacy of content representation has become a hot research topic in multimedia and security communities. In this paper, we introduce a new framework for non- local content representation based on SketchPrint descriptors. It extends the properties of local descriptors to a more informative and discriminative, yet geometrically invariant content representation. In particular it allows images to be compactly represented by 100 SketchPrint descriptors without being fully dependent on re-ranking methods. We consider several use cases, applying SketchPrint descriptors to natural images, text documents, packages and micro-structures and compare them with the traditional local descriptors.
Retrieval and classification of food images.
Farinella, Giovanni Maria; Allegra, Dario; Moltisanti, Marco; Stanco, Filippo; Battiato, Sebastiano
2016-10-01
Automatic food understanding from images is an interesting challenge with applications in different domains. In particular, food intake monitoring is becoming more and more important because of the key role that it plays in health and market economies. In this paper, we address the study of food image processing from the perspective of Computer Vision. As first contribution we present a survey of the studies in the context of food image processing from the early attempts to the current state-of-the-art methods. Since retrieval and classification engines able to work on food images are required to build automatic systems for diet monitoring (e.g., to be embedded in wearable cameras), we focus our attention on the aspect of the representation of the food images because it plays a fundamental role in the understanding engines. The food retrieval and classification is a challenging task since the food presents high variableness and an intrinsic deformability. To properly study the peculiarities of different image representations we propose the UNICT-FD1200 dataset. It was composed of 4754 food images of 1200 distinct dishes acquired during real meals. Each food plate is acquired multiple times and the overall dataset presents both geometric and photometric variabilities. The images of the dataset have been manually labeled considering 8 categories: Appetizer, Main Course, Second Course, Single Course, Side Dish, Dessert, Breakfast, Fruit. We have performed tests employing different representations of the state-of-the-art to assess the related performances on the UNICT-FD1200 dataset. Finally, we propose a new representation based on the perceptual concept of Anti-Textons which is able to encode spatial information between Textons outperforming other representations in the context of food retrieval and Classification. Copyright © 2016 Elsevier Ltd. All rights reserved.
Aggoun, Amar; Swash, Mohammad; Grange, Philippe C.R.; Challacombe, Benjamin; Dasgupta, Prokar
2013-01-01
Abstract Background and Purpose Existing imaging modalities of urologic pathology are limited by three-dimensional (3D) representation on a two-dimensional screen. We present 3D-holoscopic imaging as a novel method of representing Digital Imaging and Communications in Medicine data images taken from CT and MRI to produce 3D-holographic representations of anatomy without special eyewear in natural light. 3D-holoscopic technology produces images that are true optical models. This technology is based on physical principles with duplication of light fields. The 3D content is captured in real time with the content viewed by multiple viewers independently of their position, without 3D eyewear. Methods We display 3D-holoscopic anatomy relevant to minimally invasive urologic surgery without the need for 3D eyewear. Results The results have demonstrated that medical 3D-holoscopic content can be displayed on commercially available multiview auto-stereoscopic display. Conclusion The next step is validation studies comparing 3D-Holoscopic imaging with conventional imaging. PMID:23216303
Convolutional neural network features based change detection in satellite images
NASA Astrophysics Data System (ADS)
Mohammed El Amin, Arabi; Liu, Qingjie; Wang, Yunhong
2016-07-01
With the popular use of high resolution remote sensing (HRRS) satellite images, a huge research efforts have been placed on change detection (CD) problem. An effective feature selection method can significantly boost the final result. While hand-designed features have proven difficulties to design features that effectively capture high and mid-level representations, the recent developments in machine learning (Deep Learning) omit this problem by learning hierarchical representation in an unsupervised manner directly from data without human intervention. In this letter, we propose approaching the change detection problem from a feature learning perspective. A novel deep Convolutional Neural Networks (CNN) features based HR satellite images change detection method is proposed. The main guideline is to produce a change detection map directly from two images using a pretrained CNN. This method can omit the limited performance of hand-crafted features. Firstly, CNN features are extracted through different convolutional layers. Then, a concatenation step is evaluated after an normalization step, resulting in a unique higher dimensional feature map. Finally, a change map was computed using pixel-wise Euclidean distance. Our method has been validated on real bitemporal HRRS satellite images according to qualitative and quantitative analyses. The results obtained confirm the interest of the proposed method.
Bimodal Biometric Verification Using the Fusion of Palmprint and Infrared Palm-Dorsum Vein Images
Lin, Chih-Lung; Wang, Shih-Hung; Cheng, Hsu-Yung; Fan, Kuo-Chin; Hsu, Wei-Lieh; Lai, Chin-Rong
2015-01-01
In this paper, we present a reliable and robust biometric verification method based on bimodal physiological characteristics of palms, including the palmprint and palm-dorsum vein patterns. The proposed method consists of five steps: (1) automatically aligning and cropping the same region of interest from different palm or palm-dorsum images; (2) applying the digital wavelet transform and inverse wavelet transform to fuse palmprint and vein pattern images; (3) extracting the line-like features (LLFs) from the fused image; (4) obtaining multiresolution representations of the LLFs by using a multiresolution filter; and (5) using a support vector machine to verify the multiresolution representations of the LLFs. The proposed method possesses four advantages: first, both modal images are captured in peg-free scenarios to improve the user-friendliness of the verification device. Second, palmprint and vein pattern images are captured using a low-resolution digital scanner and infrared (IR) camera. The use of low-resolution images results in a smaller database. In addition, the vein pattern images are captured through the invisible IR spectrum, which improves antispoofing. Third, since the physiological characteristics of palmprint and vein pattern images are different, a hybrid fusing rule can be introduced to fuse the decomposition coefficients of different bands. The proposed method fuses decomposition coefficients at different decomposed levels, with different image sizes, captured from different sensor devices. Finally, the proposed method operates automatically and hence no parameters need to be set manually. Three thousand palmprint images and 3000 vein pattern images were collected from 100 volunteers to verify the validity of the proposed method. The results show a false rejection rate of 1.20% and a false acceptance rate of 1.56%. It demonstrates the validity and excellent performance of our proposed method comparing to other methods. PMID:26703596
Bimodal Biometric Verification Using the Fusion of Palmprint and Infrared Palm-Dorsum Vein Images.
Lin, Chih-Lung; Wang, Shih-Hung; Cheng, Hsu-Yung; Fan, Kuo-Chin; Hsu, Wei-Lieh; Lai, Chin-Rong
2015-12-12
In this paper, we present a reliable and robust biometric verification method based on bimodal physiological characteristics of palms, including the palmprint and palm-dorsum vein patterns. The proposed method consists of five steps: (1) automatically aligning and cropping the same region of interest from different palm or palm-dorsum images; (2) applying the digital wavelet transform and inverse wavelet transform to fuse palmprint and vein pattern images; (3) extracting the line-like features (LLFs) from the fused image; (4) obtaining multiresolution representations of the LLFs by using a multiresolution filter; and (5) using a support vector machine to verify the multiresolution representations of the LLFs. The proposed method possesses four advantages: first, both modal images are captured in peg-free scenarios to improve the user-friendliness of the verification device. Second, palmprint and vein pattern images are captured using a low-resolution digital scanner and infrared (IR) camera. The use of low-resolution images results in a smaller database. In addition, the vein pattern images are captured through the invisible IR spectrum, which improves antispoofing. Third, since the physiological characteristics of palmprint and vein pattern images are different, a hybrid fusing rule can be introduced to fuse the decomposition coefficients of different bands. The proposed method fuses decomposition coefficients at different decomposed levels, with different image sizes, captured from different sensor devices. Finally, the proposed method operates automatically and hence no parameters need to be set manually. Three thousand palmprint images and 3000 vein pattern images were collected from 100 volunteers to verify the validity of the proposed method. The results show a false rejection rate of 1.20% and a false acceptance rate of 1.56%. It demonstrates the validity and excellent performance of our proposed method comparing to other methods.
Multilayer Extreme Learning Machine With Subnetwork Nodes for Representation Learning.
Yang, Yimin; Wu, Q M Jonathan
2016-11-01
The extreme learning machine (ELM), which was originally proposed for "generalized" single-hidden layer feedforward neural networks, provides efficient unified learning solutions for the applications of clustering, regression, and classification. It presents competitive accuracy with superb efficiency in many applications. However, ELM with subnetwork nodes architecture has not attracted much research attentions. Recently, many methods have been proposed for supervised/unsupervised dimension reduction or representation learning, but these methods normally only work for one type of problem. This paper studies the general architecture of multilayer ELM (ML-ELM) with subnetwork nodes, showing that: 1) the proposed method provides a representation learning platform with unsupervised/supervised and compressed/sparse representation learning and 2) experimental results on ten image datasets and 16 classification datasets show that, compared to other conventional feature learning methods, the proposed ML-ELM with subnetwork nodes performs competitively or much better than other feature learning methods.
Computer Vision Research and Its Applications to Automated Cartography
1984-09-01
reflecting from scene surfaces, and the film and digitization processes that result in the computer representation of the image. These models, when...alone. Specifically, intepretations that are in some sense "orthogonal" are preferred. A method for finding such interpretations for right-angle...saturated colors are not precisely representable and the colors recorded with different films or cameras may differ, but the tricomponent representation is t
Spectral embedding-based registration (SERg) for multimodal fusion of prostate histology and MRI
NASA Astrophysics Data System (ADS)
Hwuang, Eileen; Rusu, Mirabela; Karthigeyan, Sudha; Agner, Shannon C.; Sparks, Rachel; Shih, Natalie; Tomaszewski, John E.; Rosen, Mark; Feldman, Michael; Madabhushi, Anant
2014-03-01
Multi-modal image registration is needed to align medical images collected from different protocols or imaging sources, thereby allowing the mapping of complementary information between images. One challenge of multimodal image registration is that typical similarity measures rely on statistical correlations between image intensities to determine anatomical alignment. The use of alternate image representations could allow for mapping of intensities into a space or representation such that the multimodal images appear more similar, thus facilitating their co-registration. In this work, we present a spectral embedding based registration (SERg) method that uses non-linearly embedded representations obtained from independent components of statistical texture maps of the original images to facilitate multimodal image registration. Our methodology comprises the following main steps: 1) image-derived textural representation of the original images, 2) dimensionality reduction using independent component analysis (ICA), 3) spectral embedding to generate the alternate representations, and 4) image registration. The rationale behind our approach is that SERg yields embedded representations that can allow for very different looking images to appear more similar, thereby facilitating improved co-registration. Statistical texture features are derived from the image intensities and then reduced to a smaller set by using independent component analysis to remove redundant information. Spectral embedding generates a new representation by eigendecomposition from which only the most important eigenvectors are selected. This helps to accentuate areas of salience based on modality-invariant structural information and therefore better identifies corresponding regions in both the template and target images. The spirit behind SERg is that image registration driven by these areas of salience and correspondence should improve alignment accuracy. In this work, SERg is implemented using Demons to allow the algorithm to more effectively register multimodal images. SERg is also tested within the free-form deformation framework driven by mutual information. Nine pairs of synthetic T1-weighted to T2-weighted brain MRI were registered under the following conditions: five levels of noise (0%, 1%, 3%, 5%, and 7%) and two levels of bias field (20% and 40%) each with and without noise. We demonstrate that across all of these conditions, SERg yields a mean squared error that is 81.51% lower than that of Demons driven by MRI intensity alone. We also spatially align twenty-six ex vivo histology sections and in vivo prostate MRI in order to map the spatial extent of prostate cancer onto corresponding radiologic imaging. SERg performs better than intensity registration by decreasing the root mean squared distance of annotated landmarks in the prostate gland via both Demons algorithm and mutual information-driven free-form deformation. In both synthetic and clinical experiments, the observed improvement in alignment of the template and target images suggest the utility of parametric eigenvector representations and hence SERg for multimodal image registration.
Representation learning via Dual-Autoencoder for recommendation.
Zhuang, Fuzhen; Zhang, Zhiqiang; Qian, Mingda; Shi, Chuan; Xie, Xing; He, Qing
2017-06-01
Recommendation has provoked vast amount of attention and research in recent decades. Most previous works employ matrix factorization techniques to learn the latent factors of users and items. And many subsequent works consider external information, e.g., social relationships of users and items' attributions, to improve the recommendation performance under the matrix factorization framework. However, matrix factorization methods may not make full use of the limited information from rating or check-in matrices, and achieve unsatisfying results. Recently, deep learning has proven able to learn good representation in natural language processing, image classification, and so on. Along this line, we propose a new representation learning framework called Recommendation via Dual-Autoencoder (ReDa). In this framework, we simultaneously learn the new hidden representations of users and items using autoencoders, and minimize the deviations of training data by the learnt representations of users and items. Based on this framework, we develop a gradient descent method to learn hidden representations. Extensive experiments conducted on several real-world data sets demonstrate the effectiveness of our proposed method compared with state-of-the-art matrix factorization based methods. Copyright © 2017 Elsevier Ltd. All rights reserved.
Deep linear autoencoder and patch clustering-based unified one-dimensional coding of image and video
NASA Astrophysics Data System (ADS)
Li, Honggui
2017-09-01
This paper proposes a unified one-dimensional (1-D) coding framework of image and video, which depends on deep learning neural network and image patch clustering. First, an improved K-means clustering algorithm for image patches is employed to obtain the compact inputs of deep artificial neural network. Second, for the purpose of best reconstructing original image patches, deep linear autoencoder (DLA), a linear version of the classical deep nonlinear autoencoder, is introduced to achieve the 1-D representation of image blocks. Under the circumstances of 1-D representation, DLA is capable of attaining zero reconstruction error, which is impossible for the classical nonlinear dimensionality reduction methods. Third, a unified 1-D coding infrastructure for image, intraframe, interframe, multiview video, three-dimensional (3-D) video, and multiview 3-D video is built by incorporating different categories of videos into the inputs of patch clustering algorithm. Finally, it is shown in the results of simulation experiments that the proposed methods can simultaneously gain higher compression ratio and peak signal-to-noise ratio than those of the state-of-the-art methods in the situation of low bitrate transmission.
Synthesis of atmospheric turbulence point spread functions by sparse and redundant representations
NASA Astrophysics Data System (ADS)
Hunt, Bobby R.; Iler, Amber L.; Bailey, Christopher A.; Rucci, Michael A.
2018-02-01
Atmospheric turbulence is a fundamental problem in imaging through long slant ranges, horizontal-range paths, or uplooking astronomical cases through the atmosphere. An essential characterization of atmospheric turbulence is the point spread function (PSF). Turbulence images can be simulated to study basic questions, such as image quality and image restoration, by synthesizing PSFs of desired properties. In this paper, we report on a method to synthesize PSFs of atmospheric turbulence. The method uses recent developments in sparse and redundant representations. From a training set of measured atmospheric PSFs, we construct a dictionary of "basis functions" that characterize the atmospheric turbulence PSFs. A PSF can be synthesized from this dictionary by a properly weighted combination of dictionary elements. We disclose an algorithm to synthesize PSFs from the dictionary. The algorithm can synthesize PSFs in three orders of magnitude less computing time than conventional wave optics propagation methods. The resulting PSFs are also shown to be statistically representative of the turbulence conditions that were used to construct the dictionary.
Staudacher, Erich M.; Huetteroth, Wolf; Schachtner, Joachim; Daly, Kevin C.
2009-01-01
A central problem facing studies of neural encoding in sensory systems is how to accurately quantify the extent of spatial and temporal responses. In this study, we take advantage of the relatively simple and stereotypic neural architecture found in invertebrates. We combine standard electrophysiological techniques, recently developed population analysis techniques, and novel anatomical methods to form an innovative 4-dimensional view of odor output representations in the antennal lobe of the moth Manduca sexta. This novel approach allows quantification of olfactory responses of characterized neurons with spike time resolution. Additionally, arbitrary integration windows can be used for comparisons with other methods such as imaging. By assigning statistical significance to changes in neuronal firing, this method can visualize activity across the entire antennal lobe. The resulting 4-dimensional representation of antennal lobe output complements imaging and multi-unit experiments yet provides a more comprehensive and accurate view of glomerular activation patterns in spike time resolution. PMID:19464513
Spatial, Temporal and Spectral Satellite Image Fusion via Sparse Representation
NASA Astrophysics Data System (ADS)
Song, Huihui
Remote sensing provides good measurements for monitoring and further analyzing the climate change, dynamics of ecosystem, and human activities in global or regional scales. Over the past two decades, the number of launched satellite sensors has been increasing with the development of aerospace technologies and the growing requirements on remote sensing data in a vast amount of application fields. However, a key technological challenge confronting these sensors is that they tradeoff between spatial resolution and other properties, including temporal resolution, spectral resolution, swath width, etc., due to the limitations of hardware technology and budget constraints. To increase the spatial resolution of data with other good properties, one possible cost-effective solution is to explore data integration methods that can fuse multi-resolution data from multiple sensors, thereby enhancing the application capabilities of available remote sensing data. In this thesis, we propose to fuse the spatial resolution with temporal resolution and spectral resolution, respectively, based on sparse representation theory. Taking the study case of Landsat ETM+ (with spatial resolution of 30m and temporal resolution of 16 days) and MODIS (with spatial resolution of 250m ~ 1km and daily temporal resolution) reflectance, we propose two spatial-temporal fusion methods to combine the fine spatial information of Landsat image and the daily temporal resolution of MODIS image. Motivated by that the images from these two sensors are comparable on corresponding bands, we propose to link their spatial information on available Landsat- MODIS image pair (captured on prior date) and then predict the Landsat image from the MODIS counterpart on prediction date. To well-learn the spatial details from the prior images, we use a redundant dictionary to extract the basic representation atoms for both Landsat and MODIS images based on sparse representation. Under the scenario of two prior Landsat-MODIS image pairs, we build the corresponding relationship between the difference images of MODIS and ETM+ by training a low- and high-resolution dictionary pair from the given prior image pairs. In the second scenario, i.e., only one Landsat- MODIS image pair being available, we directly correlate MODIS and ETM+ data through an image degradation model. Then, the fusion stage is achieved by super-resolving the MODIS image combining the high-pass modulation in a two-layer fusion framework. Remarkably, the proposed spatial-temporal fusion methods form a unified framework for blending remote sensing images with phenology change or land-cover-type change. Based on the proposed spatial-temporal fusion models, we propose to monitor the land use/land cover changes in Shenzhen, China. As a fast-growing city, Shenzhen faces the problem of detecting the rapid changes for both rational city planning and sustainable development. However, the cloudy and rainy weather in region Shenzhen located makes the capturing circle of high-quality satellite images longer than their normal revisit periods. Spatial-temporal fusion methods are capable to tackle this problem by improving the spatial resolution of images with coarse spatial resolution but frequent temporal coverage, thereby making the detection of rapid changes possible. On two Landsat-MODIS datasets with annual and monthly changes, respectively, we apply the proposed spatial-temporal fusion methods to the task of multiple change detection. Afterward, we propose a novel spatial and spectral fusion method for satellite multispectral and hyperspectral (or high-spectral) images based on dictionary-pair learning and sparse non-negative matrix factorization. By combining the spectral information from hyperspectral image, which is characterized by low spatial resolution but high spectral resolution and abbreviated as LSHS, and the spatial information from multispectral image, which is featured by high spatial resolution but low spectral resolution and abbreviated as HSLS, this method aims to generate the fused data with both high spatial and high spectral resolutions. Motivated by the observation that each hyperspectral pixel can be represented by a linear combination of a few endmembers, this method first extracts the spectral bases of LSHS and HSLS images by making full use of the rich spectral information in LSHS data. The spectral bases of these two categories data then formulate a dictionary-pair due to their correspondence in representing each pixel spectra of LSHS data and HSLS data, respectively. Subsequently, the LSHS image is spatially unmixed by representing the HSLS image with respect to the corresponding learned dictionary to derive its representation coefficients. Combining the spectral bases of LSHS data and the representation coefficients of HSLS data, we finally derive the fused data characterized by the spectral resolution of LSHS data and the spatial resolution of HSLS data.
Hierarchical Recurrent Neural Hashing for Image Retrieval With Hierarchical Convolutional Features.
Lu, Xiaoqiang; Chen, Yaxiong; Li, Xuelong
Hashing has been an important and effective technology in image retrieval due to its computational efficiency and fast search speed. The traditional hashing methods usually learn hash functions to obtain binary codes by exploiting hand-crafted features, which cannot optimally represent the information of the sample. Recently, deep learning methods can achieve better performance, since deep learning architectures can learn more effective image representation features. However, these methods only use semantic features to generate hash codes by shallow projection but ignore texture details. In this paper, we proposed a novel hashing method, namely hierarchical recurrent neural hashing (HRNH), to exploit hierarchical recurrent neural network to generate effective hash codes. There are three contributions of this paper. First, a deep hashing method is proposed to extensively exploit both spatial details and semantic information, in which, we leverage hierarchical convolutional features to construct image pyramid representation. Second, our proposed deep network can exploit directly convolutional feature maps as input to preserve the spatial structure of convolutional feature maps. Finally, we propose a new loss function that considers the quantization error of binarizing the continuous embeddings into the discrete binary codes, and simultaneously maintains the semantic similarity and balanceable property of hash codes. Experimental results on four widely used data sets demonstrate that the proposed HRNH can achieve superior performance over other state-of-the-art hashing methods.Hashing has been an important and effective technology in image retrieval due to its computational efficiency and fast search speed. The traditional hashing methods usually learn hash functions to obtain binary codes by exploiting hand-crafted features, which cannot optimally represent the information of the sample. Recently, deep learning methods can achieve better performance, since deep learning architectures can learn more effective image representation features. However, these methods only use semantic features to generate hash codes by shallow projection but ignore texture details. In this paper, we proposed a novel hashing method, namely hierarchical recurrent neural hashing (HRNH), to exploit hierarchical recurrent neural network to generate effective hash codes. There are three contributions of this paper. First, a deep hashing method is proposed to extensively exploit both spatial details and semantic information, in which, we leverage hierarchical convolutional features to construct image pyramid representation. Second, our proposed deep network can exploit directly convolutional feature maps as input to preserve the spatial structure of convolutional feature maps. Finally, we propose a new loss function that considers the quantization error of binarizing the continuous embeddings into the discrete binary codes, and simultaneously maintains the semantic similarity and balanceable property of hash codes. Experimental results on four widely used data sets demonstrate that the proposed HRNH can achieve superior performance over other state-of-the-art hashing methods.
Perceived face size in healthy adults.
D'Amour, Sarah; Harris, Laurence R
2017-01-01
Perceptual body size distortions have traditionally been studied using subjective, qualitative measures that assess only one type of body representation-the conscious body image. Previous research on perceived body size has typically focused on measuring distortions of the entire body and has tended to overlook the face. Here, we present a novel psychophysical method for determining perceived body size that taps into implicit body representation. Using a two-alternative forced choice (2AFC), participants were sequentially shown two life-size images of their own face, viewed upright, upside down, or tilted 90°. In one interval, the width or length dimension was varied, while the other interval contained an undistorted image. Participants reported which image most closely matched their own face. An adaptive staircase adjusted the distorted image to hone in on the image that was equally likely to be judged as matching their perceived face as the accurate image. When viewed upright or upside down, face width was overestimated and length underestimated, whereas perception was accurate for the on-side views. These results provide the first psychophysically robust measurements of how accurately healthy participants perceive the size of their face, revealing distortions of the implicit body representation independent of the conscious body image.
Distant failure prediction for early stage NSCLC by analyzing PET with sparse representation
NASA Astrophysics Data System (ADS)
Hao, Hongxia; Zhou, Zhiguo; Wang, Jing
2017-03-01
Positron emission tomography (PET) imaging has been widely explored for treatment outcome prediction. Radiomicsdriven methods provide a new insight to quantitatively explore underlying information from PET images. However, it is still a challenging problem to automatically extract clinically meaningful features for prognosis. In this work, we develop a PET-guided distant failure predictive model for early stage non-small cell lung cancer (NSCLC) patients after stereotactic ablative radiotherapy (SABR) by using sparse representation. The proposed method does not need precalculated features and can learn intrinsically distinctive features contributing to classification of patients with distant failure. The proposed framework includes two main parts: 1) intra-tumor heterogeneity description; and 2) dictionary pair learning based sparse representation. Tumor heterogeneity is initially captured through anisotropic kernel and represented as a set of concatenated vectors, which forms the sample gallery. Then, given a test tumor image, its identity (i.e., distant failure or not) is classified by applying the dictionary pair learning based sparse representation. We evaluate the proposed approach on 48 NSCLC patients treated by SABR at our institute. Experimental results show that the proposed approach can achieve an area under the characteristic curve (AUC) of 0.70 with a sensitivity of 69.87% and a specificity of 69.51% using a five-fold cross validation.
NASA Technical Reports Server (NTRS)
Jaeckel, Louis A.
1989-01-01
To study the problems of encoding visual images for use with a Sparse Distributed Memory (SDM), I consider a specific class of images- those that consist of several pieces, each of which is a line segment or an arc of a circle. This class includes line drawings of characters such as letters of the alphabet. I give a method of representing a segment of an arc by five numbers in a continuous way; that is, similar arcs have similar representations. I also give methods for encoding these numbers as bit strings in an approximately continuous way. The set of possible segments and arcs may be viewed as a five-dimensional manifold M, whose structure is like a Mobious strip. An image, considered to be an unordered set of segments and arcs, is therefore represented by a set of points in M - one for each piece. I then discuss the problem of constructing a preprocessor to find the segments and arcs in these images, although a preprocessor has not been developed. I also describe a possible extension of the representation.
2012-01-01
Background Computer-based analysis of digitalized histological images has been gaining increasing attention, due to their extensive use in research and routine practice. The article aims to contribute towards the description and retrieval of histological images by employing a structural method using graphs. Due to their expressive ability, graphs are considered as a powerful and versatile representation formalism and have obtained a growing consideration especially by the image processing and computer vision community. Methods The article describes a novel method for determining similarity between histological images through graph-theoretic description and matching, for the purpose of content-based retrieval. A higher order (region-based) graph-based representation of breast biopsy images has been attained and a tree-search based inexact graph matching technique has been employed that facilitates the automatic retrieval of images structurally similar to a given image from large databases. Results The results obtained and evaluation performed demonstrate the effectiveness and superiority of graph-based image retrieval over a common histogram-based technique. The employed graph matching complexity has been reduced compared to the state-of-the-art optimal inexact matching methods by applying a pre-requisite criterion for matching of nodes and a sophisticated design of the estimation function, especially the prognosis function. Conclusion The proposed method is suitable for the retrieval of similar histological images, as suggested by the experimental and evaluation results obtained in the study. It is intended for the use in Content Based Image Retrieval (CBIR)-requiring applications in the areas of medical diagnostics and research, and can also be generalized for retrieval of different types of complex images. Virtual Slides The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1224798882787923. PMID:23035717
Effective Clipart Image Vectorization through Direct Optimization of Bezigons.
Yang, Ming; Chao, Hongyang; Zhang, Chi; Guo, Jun; Yuan, Lu; Sun, Jian
2016-02-01
Bezigons, i.e., closed paths composed of Bézier curves, have been widely employed to describe shapes in image vectorization results. However, most existing vectorization techniques infer the bezigons by simply approximating an intermediate vector representation (such as polygons). Consequently, the resultant bezigons are sometimes imperfect due to accumulated errors, fitting ambiguities, and a lack of curve priors, especially for low-resolution images. In this paper, we describe a novel method for vectorizing clipart images. In contrast to previous methods, we directly optimize the bezigons rather than using other intermediate representations; therefore, the resultant bezigons are not only of higher fidelity compared with the original raster image but also more reasonable because they were traced by a proficient expert. To enable such optimization, we have overcome several challenges and have devised a differentiable data energy as well as several curve-based prior terms. To improve the efficiency of the optimization, we also take advantage of the local control property of bezigons and adopt an overlapped piecewise optimization strategy. The experimental results show that our method outperforms both the current state-of-the-art method and commonly used commercial software in terms of bezigon quality.
Nestor, Adrian; Vettel, Jean M; Tarr, Michael J
2013-11-01
What basic visual structures underlie human face detection and how can we extract such structures directly from the amplitude of neural responses elicited by face processing? Here, we address these issues by investigating an extension of noise-based image classification to BOLD responses recorded in high-level visual areas. First, we assess the applicability of this classification method to such data and, second, we explore its results in connection with the neural processing of faces. To this end, we construct luminance templates from white noise fields based on the response of face-selective areas in the human ventral cortex. Using behaviorally and neurally-derived classification images, our results reveal a family of simple but robust image structures subserving face representation and detection. Thus, we confirm the role played by classical face selective regions in face detection and we help clarify the representational basis of this perceptual function. From a theory standpoint, our findings support the idea of simple but highly diagnostic neurally-coded features for face detection. At the same time, from a methodological perspective, our work demonstrates the ability of noise-based image classification in conjunction with fMRI to help uncover the structure of high-level perceptual representations. Copyright © 2012 Wiley Periodicals, Inc.
Detection of dual-band infrared small target based on joint dynamic sparse representation
NASA Astrophysics Data System (ADS)
Zhou, Jinwei; Li, Jicheng; Shi, Zhiguang; Lu, Xiaowei; Ren, Dongwei
2015-10-01
Infrared small target detection is a crucial and yet still is a difficult issue in aeronautic and astronautic applications. Sparse representation is an important mathematic tool and has been used extensively in image processing in recent years. Joint sparse representation is applied in dual-band infrared dim target detection in this paper. Firstly, according to the characters of dim targets in dual-band infrared images, 2-dimension Gaussian intensity model was used to construct target dictionary, then the dictionary was classified into different sub-classes according to different positions of Gaussian function's center point in image block; The fact that dual-band small targets detection can use the same dictionary and the sparsity doesn't lie in atom-level but in sub-class level was utilized, hence the detection of targets in dual-band infrared images was converted to be a joint dynamic sparse representation problem. And the dynamic active sets were used to describe the sparse constraint of coefficients. Two modified sparsity concentration index (SCI) criteria was proposed to evaluate whether targets exist in the images. In experiments, it shows that the proposed algorithm can achieve better detecting performance and dual-band detection is much more robust to noise compared with single-band detection. Moreover, the proposed method can be expanded to multi-spectrum small target detection.
Nonlinear features for classification and pose estimation of machined parts from single views
NASA Astrophysics Data System (ADS)
Talukder, Ashit; Casasent, David P.
1998-10-01
A new nonlinear feature extraction method is presented for classification and pose estimation of objects from single views. The feature extraction method is called the maximum representation and discrimination feature (MRDF) method. The nonlinear MRDF transformations to use are obtained in closed form, and offer significant advantages compared to nonlinear neural network implementations. The features extracted are useful for both object discrimination (classification) and object representation (pose estimation). We consider MRDFs on image data, provide a new 2-stage nonlinear MRDF solution, and show it specializes to well-known linear and nonlinear image processing transforms under certain conditions. We show the use of MRDF in estimating the class and pose of images of rendered solid CAD models of machine parts from single views using a feature-space trajectory neural network classifier. We show new results with better classification and pose estimation accuracy than are achieved by standard principal component analysis and Fukunaga-Koontz feature extraction methods.
Discriminative feature representation: an effective postprocessing solution to low dose CT imaging
NASA Astrophysics Data System (ADS)
Chen, Yang; Liu, Jin; Hu, Yining; Yang, Jian; Shi, Luyao; Shu, Huazhong; Gui, Zhiguo; Coatrieux, Gouenou; Luo, Limin
2017-03-01
This paper proposes a concise and effective approach termed discriminative feature representation (DFR) for low dose computerized tomography (LDCT) image processing, which is currently a challenging problem in medical imaging field. This DFR method assumes LDCT images as the superposition of desirable high dose CT (HDCT) 3D features and undesirable noise-artifact 3D features (the combined term of noise and artifact features induced by low dose scan protocols), and the decomposed HDCT features are used to provide the processed LDCT images with higher quality. The target HDCT features are solved via the DFR algorithm using a featured dictionary composed by atoms representing HDCT features and noise-artifact features. In this study, the featured dictionary is efficiently built using physical phantom images collected from the same CT scanner as the target clinical LDCT images to process. The proposed DFR method also has good robustness in parameter setting for different CT scanner types. This DFR method can be directly applied to process DICOM formatted LDCT images, and has good applicability to current CT systems. Comparative experiments with abdomen LDCT data validate the good performance of the proposed approach. This research was supported by National Natural Science Foundation under grants (81370040, 81530060), the Fundamental Research Funds for the Central Universities, and the Qing Lan Project in Jiangsu Province.
Zhao, Guangjun; Wang, Xuchu; Niu, Yanmin; Tan, Liwen; Zhang, Shao-Xiang
2016-01-01
Cryosection brain images in Chinese Visible Human (CVH) dataset contain rich anatomical structure information of tissues because of its high resolution (e.g., 0.167 mm per pixel). Fast and accurate segmentation of these images into white matter, gray matter, and cerebrospinal fluid plays a critical role in analyzing and measuring the anatomical structures of human brain. However, most existing automated segmentation methods are designed for computed tomography or magnetic resonance imaging data, and they may not be applicable for cryosection images due to the imaging difference. In this paper, we propose a supervised learning-based CVH brain tissues segmentation method that uses stacked autoencoder (SAE) to automatically learn the deep feature representations. Specifically, our model includes two successive parts where two three-layer SAEs take image patches as input to learn the complex anatomical feature representation, and then these features are sent to Softmax classifier for inferring the labels. Experimental results validated the effectiveness of our method and showed that it outperformed four other classical brain tissue detection strategies. Furthermore, we reconstructed three-dimensional surfaces of these tissues, which show their potential in exploring the high-resolution anatomical structures of human brain. PMID:27057543
Zhao, Guangjun; Wang, Xuchu; Niu, Yanmin; Tan, Liwen; Zhang, Shao-Xiang
2016-01-01
Cryosection brain images in Chinese Visible Human (CVH) dataset contain rich anatomical structure information of tissues because of its high resolution (e.g., 0.167 mm per pixel). Fast and accurate segmentation of these images into white matter, gray matter, and cerebrospinal fluid plays a critical role in analyzing and measuring the anatomical structures of human brain. However, most existing automated segmentation methods are designed for computed tomography or magnetic resonance imaging data, and they may not be applicable for cryosection images due to the imaging difference. In this paper, we propose a supervised learning-based CVH brain tissues segmentation method that uses stacked autoencoder (SAE) to automatically learn the deep feature representations. Specifically, our model includes two successive parts where two three-layer SAEs take image patches as input to learn the complex anatomical feature representation, and then these features are sent to Softmax classifier for inferring the labels. Experimental results validated the effectiveness of our method and showed that it outperformed four other classical brain tissue detection strategies. Furthermore, we reconstructed three-dimensional surfaces of these tissues, which show their potential in exploring the high-resolution anatomical structures of human brain.
Deep learning for neuroimaging: a validation study.
Plis, Sergey M; Hjelm, Devon R; Salakhutdinov, Ruslan; Allen, Elena A; Bockholt, Henry J; Long, Jeffrey D; Johnson, Hans J; Paulsen, Jane S; Turner, Jessica A; Calhoun, Vince D
2014-01-01
Deep learning methods have recently made notable advances in the tasks of classification and representation learning. These tasks are important for brain imaging and neuroscience discovery, making the methods attractive for porting to a neuroimager's toolbox. Success of these methods is, in part, explained by the flexibility of deep learning models. However, this flexibility makes the process of porting to new areas a difficult parameter optimization problem. In this work we demonstrate our results (and feasible parameter ranges) in application of deep learning methods to structural and functional brain imaging data. These methods include deep belief networks and their building block the restricted Boltzmann machine. We also describe a novel constraint-based approach to visualizing high dimensional data. We use it to analyze the effect of parameter choices on data transformations. Our results show that deep learning methods are able to learn physiologically important representations and detect latent relations in neuroimaging data.
Generative Adversarial Networks: An Overview
NASA Astrophysics Data System (ADS)
Creswell, Antonia; White, Tom; Dumoulin, Vincent; Arulkumaran, Kai; Sengupta, Biswa; Bharath, Anil A.
2018-01-01
Generative adversarial networks (GANs) provide a way to learn deep representations without extensively annotated training data. They achieve this through deriving backpropagation signals through a competitive process involving a pair of networks. The representations that can be learned by GANs may be used in a variety of applications, including image synthesis, semantic image editing, style transfer, image super-resolution and classification. The aim of this review paper is to provide an overview of GANs for the signal processing community, drawing on familiar analogies and concepts where possible. In addition to identifying different methods for training and constructing GANs, we also point to remaining challenges in their theory and application.
Sparse and redundant representations for inverse problems and recognition
NASA Astrophysics Data System (ADS)
Patel, Vishal M.
Sparse and redundant representation of data enables the description of signals as linear combinations of a few atoms from a dictionary. In this dissertation, we study applications of sparse and redundant representations in inverse problems and object recognition. Furthermore, we propose two novel imaging modalities based on the recently introduced theory of Compressed Sensing (CS). This dissertation consists of four major parts. In the first part of the dissertation, we study a new type of deconvolution algorithm that is based on estimating the image from a shearlet decomposition. Shearlets provide a multi-directional and multi-scale decomposition that has been mathematically shown to represent distributed discontinuities such as edges better than traditional wavelets. We develop a deconvolution algorithm that allows for the approximation inversion operator to be controlled on a multi-scale and multi-directional basis. Furthermore, we develop a method for the automatic determination of the threshold values for the noise shrinkage for each scale and direction without explicit knowledge of the noise variance using a generalized cross validation method. In the second part of the dissertation, we study a reconstruction method that recovers highly undersampled images assumed to have a sparse representation in a gradient domain by using partial measurement samples that are collected in the Fourier domain. Our method makes use of a robust generalized Poisson solver that greatly aids in achieving a significantly improved performance over similar proposed methods. We will demonstrate by experiments that this new technique is more flexible to work with either random or restricted sampling scenarios better than its competitors. In the third part of the dissertation, we introduce a novel Synthetic Aperture Radar (SAR) imaging modality which can provide a high resolution map of the spatial distribution of targets and terrain using a significantly reduced number of needed transmitted and/or received electromagnetic waveforms. We demonstrate that this new imaging scheme, requires no new hardware components and allows the aperture to be compressed. Also, it presents many new applications and advantages which include strong resistance to countermesasures and interception, imaging much wider swaths and reduced on-board storage requirements. The last part of the dissertation deals with object recognition based on learning dictionaries for simultaneous sparse signal approximations and feature extraction. A dictionary is learned for each object class based on given training examples which minimize the representation error with a sparseness constraint. A novel test image is then projected onto the span of the atoms in each learned dictionary. The residual vectors along with the coefficients are then used for recognition. Applications to illumination robust face recognition and automatic target recognition are presented.
Human inferior colliculus activity relates to individual differences in spoken language learning.
Chandrasekaran, Bharath; Kraus, Nina; Wong, Patrick C M
2012-03-01
A challenge to learning words of a foreign language is encoding nonnative phonemes, a process typically attributed to cortical circuitry. Using multimodal imaging methods [functional magnetic resonance imaging-adaptation (fMRI-A) and auditory brain stem responses (ABR)], we examined the extent to which pretraining pitch encoding in the inferior colliculus (IC), a primary midbrain structure, related to individual variability in learning to successfully use nonnative pitch patterns to distinguish words in American English-speaking adults. fMRI-A indexed the efficiency of pitch representation localized to the IC, whereas ABR quantified midbrain pitch-related activity with millisecond precision. In line with neural "sharpening" models, we found that efficient IC pitch pattern representation (indexed by fMRI) related to superior neural representation of pitch patterns (indexed by ABR), and consequently more successful word learning following sound-to-meaning training. Our results establish a critical role for the IC in speech-sound representation, consistent with the established role for the IC in the representation of communication signals in other animal models.
Images as Representations: Visual Sources on Education and Childhood in the Past
ERIC Educational Resources Information Center
Dekker, Jeroen J.H.
2015-01-01
The challenge of using images for the history of education and childhood will be addressed in this article by looking at them as representations. Central is the relationship between representations and reality. The focus is on the power of paintings as representations of aspects of realities. First the meaning of representation for images as…
Learning a Dictionary of Shape Epitomes with Applications to Image Labeling
Chen, Liang-Chieh; Papandreou, George; Yuille, Alan L.
2015-01-01
The first main contribution of this paper is a novel method for representing images based on a dictionary of shape epitomes. These shape epitomes represent the local edge structure of the image and include hidden variables to encode shift and rotations. They are learnt in an unsupervised manner from groundtruth edges. This dictionary is compact but is also able to capture the typical shapes of edges in natural images. In this paper, we illustrate the shape epitomes by applying them to the image labeling task. In other work, described in the supplementary material, we apply them to edge detection and image modeling. We apply shape epitomes to image labeling by using Conditional Random Field (CRF) Models. They are alternatives to the superpixel or pixel representations used in most CRFs. In our approach, the shape of an image patch is encoded by a shape epitome from the dictionary. Unlike the superpixel representation, our method avoids making early decisions which cannot be reversed. Our resulting hierarchical CRFs efficiently capture both local and global class co-occurrence properties. We demonstrate its quantitative and qualitative properties of our approach with image labeling experiments on two standard datasets: MSRC-21 and Stanford Background. PMID:26321886
Sharma, Harshita; Alekseychuk, Alexander; Leskovsky, Peter; Hellwich, Olaf; Anand, R S; Zerbe, Norman; Hufnagl, Peter
2012-10-04
Computer-based analysis of digitalized histological images has been gaining increasing attention, due to their extensive use in research and routine practice. The article aims to contribute towards the description and retrieval of histological images by employing a structural method using graphs. Due to their expressive ability, graphs are considered as a powerful and versatile representation formalism and have obtained a growing consideration especially by the image processing and computer vision community. The article describes a novel method for determining similarity between histological images through graph-theoretic description and matching, for the purpose of content-based retrieval. A higher order (region-based) graph-based representation of breast biopsy images has been attained and a tree-search based inexact graph matching technique has been employed that facilitates the automatic retrieval of images structurally similar to a given image from large databases. The results obtained and evaluation performed demonstrate the effectiveness and superiority of graph-based image retrieval over a common histogram-based technique. The employed graph matching complexity has been reduced compared to the state-of-the-art optimal inexact matching methods by applying a pre-requisite criterion for matching of nodes and a sophisticated design of the estimation function, especially the prognosis function. The proposed method is suitable for the retrieval of similar histological images, as suggested by the experimental and evaluation results obtained in the study. It is intended for the use in Content Based Image Retrieval (CBIR)-requiring applications in the areas of medical diagnostics and research, and can also be generalized for retrieval of different types of complex images. The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1224798882787923.
Spatially variant morphological restoration and skeleton representation.
Bouaynaya, Nidhal; Charif-Chefchaouni, Mohammed; Schonfeld, Dan
2006-11-01
The theory of spatially variant (SV) mathematical morphology is used to extend and analyze two important image processing applications: morphological image restoration and skeleton representation of binary images. For morphological image restoration, we propose the SV alternating sequential filters and SV median filters. We establish the relation of SV median filters to the basic SV morphological operators (i.e., SV erosions and SV dilations). For skeleton representation, we present a general framework for the SV morphological skeleton representation of binary images. We study the properties of the SV morphological skeleton representation and derive conditions for its invertibility. We also develop an algorithm for the implementation of the SV morphological skeleton representation of binary images. The latter algorithm is based on the optimal construction of the SV structuring element mapping designed to minimize the cardinality of the SV morphological skeleton representation. Experimental results show the dramatic improvement in the performance of the SV morphological restoration and SV morphological skeleton representation algorithms in comparison to their translation-invariant counterparts.
A tuned mesh-generation strategy for image representation based on data-dependent triangulation.
Li, Ping; Adams, Michael D
2013-05-01
A mesh-generation framework for image representation based on data-dependent triangulation is proposed. The proposed framework is a modified version of the frameworks of Rippa and Garland and Heckbert that facilitates the development of more effective mesh-generation methods. As the proposed framework has several free parameters, the effects of different choices of these parameters on mesh quality are studied, leading to the recommendation of a particular set of choices for these parameters. A mesh-generation method is then introduced that employs the proposed framework with these best parameter choices. This method is demonstrated to produce meshes of higher quality (both in terms of squared error and subjectively) than those generated by several competing approaches, at a relatively modest computational and memory cost.
Classification and pose estimation of objects using nonlinear features
NASA Astrophysics Data System (ADS)
Talukder, Ashit; Casasent, David P.
1998-03-01
A new nonlinear feature extraction method called the maximum representation and discrimination feature (MRDF) method is presented for extraction of features from input image data. It implements transformations similar to the Sigma-Pi neural network. However, the weights of the MRDF are obtained in closed form, and offer advantages compared to nonlinear neural network implementations. The features extracted are useful for both object discrimination (classification) and object representation (pose estimation). We show its use in estimating the class and pose of images of real objects and rendered solid CAD models of machine parts from single views using a feature-space trajectory (FST) neural network classifier. We show more accurate classification and pose estimation results than are achieved by standard principal component analysis (PCA) and Fukunaga-Koontz (FK) feature extraction methods.
Sequential Processes in Image Generation: An Objective Measure. Technical Report #6.
ERIC Educational Resources Information Center
Kosslyn, Stephen M.; And Others
This paper investigates the processes by which visual mental images--the precept-like short-term memory representations--are created from information stored in long-term memory. It also presents a new method for studying image generation. Three experiments were conducted using college students as subjects. In the first experiment, a Podgorny and…
A Novel Unsupervised Segmentation Quality Evaluation Method for Remote Sensing Images
Tang, Yunwei; Jing, Linhai; Ding, Haifeng
2017-01-01
The segmentation of a high spatial resolution remote sensing image is a critical step in geographic object-based image analysis (GEOBIA). Evaluating the performance of segmentation without ground truth data, i.e., unsupervised evaluation, is important for the comparison of segmentation algorithms and the automatic selection of optimal parameters. This unsupervised strategy currently faces several challenges in practice, such as difficulties in designing effective indicators and limitations of the spectral values in the feature representation. This study proposes a novel unsupervised evaluation method to quantitatively measure the quality of segmentation results to overcome these problems. In this method, multiple spectral and spatial features of images are first extracted simultaneously and then integrated into a feature set to improve the quality of the feature representation of ground objects. The indicators designed for spatial stratified heterogeneity and spatial autocorrelation are included to estimate the properties of the segments in this integrated feature set. These two indicators are then combined into a global assessment metric as the final quality score. The trade-offs of the combined indicators are accounted for using a strategy based on the Mahalanobis distance, which can be exhibited geometrically. The method is tested on two segmentation algorithms and three testing images. The proposed method is compared with two existing unsupervised methods and a supervised method to confirm its capabilities. Through comparison and visual analysis, the results verified the effectiveness of the proposed method and demonstrated the reliability and improvements of this method with respect to other methods. PMID:29064416
Estimation of signal-dependent noise level function in transform domain via a sparse recovery model.
Yang, Jingyu; Gan, Ziqiao; Wu, Zhaoyang; Hou, Chunping
2015-05-01
This paper proposes a novel algorithm to estimate the noise level function (NLF) of signal-dependent noise (SDN) from a single image based on the sparse representation of NLFs. Noise level samples are estimated from the high-frequency discrete cosine transform (DCT) coefficients of nonlocal-grouped low-variation image patches. Then, an NLF recovery model based on the sparse representation of NLFs under a trained basis is constructed to recover NLF from the incomplete noise level samples. Confidence levels of the NLF samples are incorporated into the proposed model to promote reliable samples and weaken unreliable ones. We investigate the behavior of the estimation performance with respect to the block size, sampling rate, and confidence weighting. Simulation results on synthetic noisy images show that our method outperforms existing state-of-the-art schemes. The proposed method is evaluated on real noisy images captured by three types of commodity imaging devices, and shows consistently excellent SDN estimation performance. The estimated NLFs are incorporated into two well-known denoising schemes, nonlocal means and BM3D, and show significant improvements in denoising SDN-polluted images.
Tongue Motion Averaging from Contour Sequences
ERIC Educational Resources Information Center
Li, Min; Kambhamettu, Chandra; Stone, Maureen
2005-01-01
In this paper, a method to get the best representation of a speech motion from several repetitions is presented. Each repetition is a representation of the same speech captured at different times by sequence of ultrasound images and is composed of a set of 2D spatio-temporal contours. These 2D contours in different repetitions are time aligned…
Representation learning: a unified deep learning framework for automatic prostate MR segmentation.
Liao, Shu; Gao, Yaozong; Oto, Aytekin; Shen, Dinggang
2013-01-01
Image representation plays an important role in medical image analysis. The key to the success of different medical image analysis algorithms is heavily dependent on how we represent the input data, namely features used to characterize the input image. In the literature, feature engineering remains as an active research topic, and many novel hand-crafted features are designed such as Haar wavelet, histogram of oriented gradient, and local binary patterns. However, such features are not designed with the guidance of the underlying dataset at hand. To this end, we argue that the most effective features should be designed in a learning based manner, namely representation learning, which can be adapted to different patient datasets at hand. In this paper, we introduce a deep learning framework to achieve this goal. Specifically, a stacked independent subspace analysis (ISA) network is adopted to learn the most effective features in a hierarchical and unsupervised manner. The learnt features are adapted to the dataset at hand and encode high level semantic anatomical information. The proposed method is evaluated on the application of automatic prostate MR segmentation. Experimental results show that significant segmentation accuracy improvement can be achieved by the proposed deep learning method compared to other state-of-the-art segmentation approaches.
A survey of visual preprocessing and shape representation techniques
NASA Technical Reports Server (NTRS)
Olshausen, Bruno A.
1988-01-01
Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material of this survey is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention).
SDL: Saliency-Based Dictionary Learning Framework for Image Similarity.
Sarkar, Rituparna; Acton, Scott T
2018-02-01
In image classification, obtaining adequate data to learn a robust classifier has often proven to be difficult in several scenarios. Classification of histological tissue images for health care analysis is a notable application in this context due to the necessity of surgery, biopsy or autopsy. To adequately exploit limited training data in classification, we propose a saliency guided dictionary learning method and subsequently an image similarity technique for histo-pathological image classification. Salient object detection from images aids in the identification of discriminative image features. We leverage the saliency values for the local image regions to learn a dictionary and respective sparse codes for an image, such that the more salient features are reconstructed with smaller error. The dictionary learned from an image gives a compact representation of the image itself and is capable of representing images with similar content, with comparable sparse codes. We employ this idea to design a similarity measure between a pair of images, where local image features of one image, are encoded with the dictionary learned from the other and vice versa. To effectively utilize the learned dictionary, we take into account the contribution of each dictionary atom in the sparse codes to generate a global image representation for image comparison. The efficacy of the proposed method was evaluated using three tissue data sets that consist of mammalian kidney, lung and spleen tissue, breast cancer, and colon cancer tissue images. From the experiments, we observe that our methods outperform the state of the art with an increase of 14.2% in the average classification accuracy over all data sets.
High-order statistics of weber local descriptors for image representation.
Han, Xian-Hua; Chen, Yen-Wei; Xu, Gang
2015-06-01
Highly discriminant visual features play a key role in different image classification applications. This study aims to realize a method for extracting highly-discriminant features from images by exploring a robust local descriptor inspired by Weber's law. The investigated local descriptor is based on the fact that human perception for distinguishing a pattern depends not only on the absolute intensity of the stimulus but also on the relative variance of the stimulus. Therefore, we firstly transform the original stimulus (the images in our study) into a differential excitation-domain according to Weber's law, and then explore a local patch, called micro-Texton, in the transformed domain as Weber local descriptor (WLD). Furthermore, we propose to employ a parametric probability process to model the Weber local descriptors, and extract the higher-order statistics to the model parameters for image representation. The proposed strategy can adaptively characterize the WLD space using generative probability model, and then learn the parameters for better fitting the training space, which would lead to more discriminant representation for images. In order to validate the efficiency of the proposed strategy, we apply three different image classification applications including texture, food images and HEp-2 cell pattern recognition, which validates that our proposed strategy has advantages over the state-of-the-art approaches.
Image flows and one-liner graphical image representation.
Makhervaks, Vadim; Barequet, Gill; Bruckstein, Alfred
2002-10-01
This paper introduces a novel graphical image representation consisting of a single curve-the one-liner. The first step of the algorithm involves the detection and ranking of image edges. A new edge exploration technique is used to perform both tasks simultaneously. This process is based on image flows. It uses a gradient vector field and a new operator to explore image edges. Estimation of the derivatives of the image is performed by using local Taylor expansions in conjunction with a weighted least-squares method. This process finds all the possible image edges without any pruning, and collects information that allows the edges found to be prioritized. This enables the most important edges to be selected to form a skeleton of the representation sought. The next step connects the selected edges into one continuous curve-the one-liner. It orders the selected edges and determines the curves connecting them. These two problems are solved separately. Since the abstract graph setting of the first problem is NP-complete, we reduce it to a variant of the traveling salesman problem and compute an approximate solution to it. We solve the second problem by using Dijkstra's shortest-path algorithm. The full software implementation for the entire one-liner determination process is available.
A coarse-to-fine approach for medical hyperspectral image classification with sparse representation
NASA Astrophysics Data System (ADS)
Chang, Lan; Zhang, Mengmeng; Li, Wei
2017-10-01
A coarse-to-fine approach with sparse representation is proposed for medical hyperspectral image classification in this work. Segmentation technique with different scales is employed to exploit edges of the input image, where coarse super-pixel patches provide global classification information while fine ones further provide detail information. Different from common RGB image, hyperspectral image has multi bands to adjust the cluster center with more high precision. After segmentation, each super pixel is classified by recently-developed sparse representation-based classification (SRC), which assigns label for testing samples in one local patch by means of sparse linear combination of all the training samples. Furthermore, segmentation with multiple scales is employed because single scale is not suitable for complicate distribution of medical hyperspectral imagery. Finally, classification results for different sizes of super pixel are fused by some fusion strategy, offering at least two benefits: (1) the final result is obviously superior to that of segmentation with single scale, and (2) the fusion process significantly simplifies the choice of scales. Experimental results using real medical hyperspectral images demonstrate that the proposed method outperforms the state-of-the-art SRC.
Khaligh-Razavi, Seyed-Mahdi; Henriksson, Linda; Kay, Kendrick; Kriegeskorte, Nikolaus
2017-02-01
Studies of the primate visual system have begun to test a wide range of complex computational object-vision models. Realistic models have many parameters, which in practice cannot be fitted using the limited amounts of brain-activity data typically available. Task performance optimization (e.g. using backpropagation to train neural networks) provides major constraints for fitting parameters and discovering nonlinear representational features appropriate for the task (e.g. object classification). Model representations can be compared to brain representations in terms of the representational dissimilarities they predict for an image set. This method, called representational similarity analysis (RSA), enables us to test the representational feature space as is (fixed RSA) or to fit a linear transformation that mixes the nonlinear model features so as to best explain a cortical area's representational space (mixed RSA). Like voxel/population-receptive-field modelling, mixed RSA uses a training set (different stimuli) to fit one weight per model feature and response channel (voxels here), so as to best predict the response profile across images for each response channel. We analysed response patterns elicited by natural images, which were measured with functional magnetic resonance imaging (fMRI). We found that early visual areas were best accounted for by shallow models, such as a Gabor wavelet pyramid (GWP). The GWP model performed similarly with and without mixing, suggesting that the original features already approximated the representational space, obviating the need for mixing. However, a higher ventral-stream visual representation (lateral occipital region) was best explained by the higher layers of a deep convolutional network and mixing of its feature set was essential for this model to explain the representation. We suspect that mixing was essential because the convolutional network had been trained to discriminate a set of 1000 categories, whose frequencies in the training set did not match their frequencies in natural experience or their behavioural importance. The latter factors might determine the representational prominence of semantic dimensions in higher-level ventral-stream areas. Our results demonstrate the benefits of testing both the specific representational hypothesis expressed by a model's original feature space and the hypothesis space generated by linear transformations of that feature space.
a New Protocol for Texture Mapping Process and 2d Representation of Rupestrian Architecture
NASA Astrophysics Data System (ADS)
Carnevali, L.; Carpiceci, M.; Angelini, A.
2018-05-01
The development of the survey techniques for architecture and archaeology requires a general review in the methods used for the representation of numerical data. The possibilities offered by data processing allow to find new paths for studying issues connected to the drawing discipline. The research project aimed at experimenting different approaches for the representation of the rupestrian architecture and the texture mapping process. The nature of the rupestrian architecture does not allow a traditional representation of sections and projections of edges and outlines. The paper presents a method, the Equidistant Multiple Sections (EMS), inspired by cartography and based on the use of isohipses generated from different geometric plane. A specific paragraph is dedicated to the texture mapping process for unstructured surface models. One of the main difficulty in the image projection consists in the recognition of homologous points between image and point cloud, above all in the areas with most deformations. With the aid of the "virtual scan" tool a different procedure was developed for improving the correspondences of the image. The result show a sensible improvement of the entire process above all for the architectural vaults. A detailed study concerned the unfolding of the straight line surfaces; the barrel vault of the analyzed chapel has been unfolded for observing the paintings in the real shapes out of the morphological context.
NASA Astrophysics Data System (ADS)
Lin, Yuchun; Baumketner, Andrij; Deng, Shaozhong; Xu, Zhenli; Jacobs, Donald; Cai, Wei
2009-10-01
In this paper, a new solvation model is proposed for simulations of biomolecules in aqueous solutions that combines the strengths of explicit and implicit solvent representations. Solute molecules are placed in a spherical cavity filled with explicit water, thus providing microscopic detail where it is most needed. Solvent outside of the cavity is modeled as a dielectric continuum whose effect on the solute is treated through the reaction field corrections. With this explicit/implicit model, the electrostatic potential represents a solute molecule in an infinite bath of solvent, thus avoiding unphysical interactions between periodic images of the solute commonly used in the lattice-sum explicit solvent simulations. For improved computational efficiency, our model employs an accurate and efficient multiple-image charge method to compute reaction fields together with the fast multipole method for the direct Coulomb interactions. To minimize the surface effects, periodic boundary conditions are employed for nonelectrostatic interactions. The proposed model is applied to study liquid water. The effect of model parameters, which include the size of the cavity, the number of image charges used to compute reaction field, and the thickness of the buffer layer, is investigated in comparison with the particle-mesh Ewald simulations as a reference. An optimal set of parameters is obtained that allows for a faithful representation of many structural, dielectric, and dynamic properties of the simulated water, while maintaining manageable computational cost. With controlled and adjustable accuracy of the multiple-image charge representation of the reaction field, it is concluded that the employed model achieves convergence with only one image charge in the case of pure water. Future applications to pKa calculations, conformational sampling of solvated biomolecules and electrolyte solutions are briefly discussed.
Robust kernel collaborative representation for face recognition
NASA Astrophysics Data System (ADS)
Huang, Wei; Wang, Xiaohui; Ma, Yanbo; Jiang, Yuzheng; Zhu, Yinghui; Jin, Zhong
2015-05-01
One of the greatest challenges of representation-based face recognition is that the training samples are usually insufficient. In other words, the training set usually does not include enough samples to show varieties of high-dimensional face images caused by illuminations, facial expressions, and postures. When the test sample is significantly different from the training samples of the same subject, the recognition performance will be sharply reduced. We propose a robust kernel collaborative representation based on virtual samples for face recognition. We think that the virtual training set conveys some reasonable and possible variations of the original training samples. Hence, we design a new object function to more closely match the representation coefficients generated from the original and virtual training sets. In order to further improve the robustness, we implement the corresponding representation-based face recognition in kernel space. It is noteworthy that any kind of virtual training samples can be used in our method. We use noised face images to obtain virtual face samples. The noise can be approximately viewed as a reflection of the varieties of illuminations, facial expressions, and postures. Our work is a simple and feasible way to obtain virtual face samples to impose Gaussian noise (and other types of noise) specifically to the original training samples to obtain possible variations of the original samples. Experimental results on the FERET, Georgia Tech, and ORL face databases show that the proposed method is more robust than two state-of-the-art face recognition methods, such as CRC and Kernel CRC.
Numerical computation of diffusion on a surface.
Schwartz, Peter; Adalsteinsson, David; Colella, Phillip; Arkin, Adam Paul; Onsum, Matthew
2005-08-09
We present a numerical method for computing diffusive transport on a surface derived from image data. Our underlying discretization method uses a Cartesian grid embedded boundary method for computing the volume transport in a region consisting of all points a small distance from the surface. We obtain a representation of this region from image data by using a front propagation computation based on level set methods for solving the Hamilton-Jacobi and eikonal equations. We demonstrate that the method is second-order accurate in space and time and is capable of computing solutions on complex surface geometries obtained from image data of cells.
Sensor fusion for synthetic vision
NASA Technical Reports Server (NTRS)
Pavel, M.; Larimer, J.; Ahumada, A.
1991-01-01
Display methodologies are explored for fusing images gathered by millimeter wave sensors with images rendered from an on-board terrain data base to facilitate visually guided flight and ground operations in low visibility conditions. An approach to fusion based on multiresolution image representation and processing is described which facilitates fusion of images differing in resolution within and between images. To investigate possible fusion methods, a workstation-based simulation environment is being developed.
Magnetic Resonance Super-resolution Imaging Measurement with Dictionary-optimized Sparse Learning
NASA Astrophysics Data System (ADS)
Li, Jun-Bao; Liu, Jing; Pan, Jeng-Shyang; Yao, Hongxun
2017-06-01
Magnetic Resonance Super-resolution Imaging Measurement (MRIM) is an effective way of measuring materials. MRIM has wide applications in physics, chemistry, biology, geology, medical and material science, especially in medical diagnosis. It is feasible to improve the resolution of MR imaging through increasing radiation intensity, but the high radiation intensity and the longtime of magnetic field harm the human body. Thus, in the practical applications the resolution of hardware imaging reaches the limitation of resolution. Software-based super-resolution technology is effective to improve the resolution of image. This work proposes a framework of dictionary-optimized sparse learning based MR super-resolution method. The framework is to solve the problem of sample selection for dictionary learning of sparse reconstruction. The textural complexity-based image quality representation is proposed to choose the optimal samples for dictionary learning. Comprehensive experiments show that the dictionary-optimized sparse learning improves the performance of sparse representation.
Apparatus and method for imaging metallic objects using an array of giant magnetoresistive sensors
Chaiken, Alison
2000-01-01
A portable, low-power, metallic object detector and method for providing an image of a detected metallic object. In one embodiment, the present portable low-power metallic object detector an array of giant magnetoresistive (GMR) sensors. The array of GMR sensors is adapted for detecting the presence of and compiling image data of a metallic object. In the embodiment, the array of GMR sensors is arranged in a checkerboard configuration such that axes of sensitivity of alternate GMR sensors are orthogonally oriented. An electronics portion is coupled to the array of GMR sensors. The electronics portion is adapted to receive and process the image data of the metallic object compiled by the array of GMR sensors. The embodiment also includes a display unit which is coupled to the electronics portion. The display unit is adapted to display a graphical representation of the metallic object detected by the array of GMR sensors. In so doing, a graphical representation of the detected metallic object is provided.
NASA Astrophysics Data System (ADS)
Santiago, Daniel; Corredor, Germán.; Romero, Eduardo
2017-11-01
During a diagnosis task, a Pathologist looks over a Whole Slide Image (WSI), aiming to find out relevant pathological patterns. Nonetheless, a virtual microscope captures these structures, but also other cellular patterns with different or none diagnostic meaning. Annotation of these images depends on manual delineation, which in practice becomes a hard task. This article contributes a new method for detecting relevant regions in WSI using the routine navigations in a virtual microscope. This method constructs a sparse representation or dictionary of each navigation path and determines the hidden relevance by maximizing the incoherence between several paths. The resulting dictionaries are then projected onto each other and relevant information is set to the dictionary atoms whose similarity is higher than a custom threshold. Evaluation was performed with 6 pathological images segmented from a skin biopsy already diagnosed with basal cell carcinoma (BCC). Results show that our proposal outperforms the baseline by more than 20%.
Understanding Deep Representations Learned in Modeling Users Likes.
Guntuku, Sharath Chandra; Zhou, Joey Tianyi; Roy, Sujoy; Lin, Weisi; Tsang, Ivor W
2016-08-01
Automatically understanding and discriminating different users' liking for an image is a challenging problem. This is because the relationship between image features (even semantic ones extracted by existing tools, viz., faces, objects, and so on) and users' likes is non-linear, influenced by several subtle factors. This paper presents a deep bi-modal knowledge representation of images based on their visual content and associated tags (text). A mapping step between the different levels of visual and textual representations allows for the transfer of semantic knowledge between the two modalities. Feature selection is applied before learning deep representation to identify the important features for a user to like an image. The proposed representation is shown to be effective in discriminating users based on images they like and also in recommending images that a given user likes, outperforming the state-of-the-art feature representations by ∼ 15 %-20%. Beyond this test-set performance, an attempt is made to qualitatively understand the representations learned by the deep architecture used to model user likes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seyedhosseini, Mojtaba; Kumar, Ritwik; Jurrus, Elizabeth R.
2011-10-01
Automated neural circuit reconstruction through electron microscopy (EM) images is a challenging problem. In this paper, we present a novel method that exploits multi-scale contextual information together with Radon-like features (RLF) to learn a series of discriminative models. The main idea is to build a framework which is capable of extracting information about cell membranes from a large contextual area of an EM image in a computationally efficient way. Toward this goal, we extract RLF that can be computed efficiently from the input image and generate a scale-space representation of the context images that are obtained at the output ofmore » each discriminative model in the series. Compared to a single-scale model, the use of a multi-scale representation of the context image gives the subsequent classifiers access to a larger contextual area in an effective way. Our strategy is general and independent of the classifier and has the potential to be used in any context based framework. We demonstrate that our method outperforms the state-of-the-art algorithms in detection of neuron membranes in EM images.« less
Decomposition and extraction: a new framework for visual classification.
Fang, Yuqiang; Chen, Qiang; Sun, Lin; Dai, Bin; Yan, Shuicheng
2014-08-01
In this paper, we present a novel framework for visual classification based on hierarchical image decomposition and hybrid midlevel feature extraction. Unlike most midlevel feature learning methods, which focus on the process of coding or pooling, we emphasize that the mechanism of image composition also strongly influences the feature extraction. To effectively explore the image content for the feature extraction, we model a multiplicity feature representation mechanism through meaningful hierarchical image decomposition followed by a fusion step. In particularly, we first propose a new hierarchical image decomposition approach in which each image is decomposed into a series of hierarchical semantical components, i.e, the structure and texture images. Then, different feature extraction schemes can be adopted to match the decomposed structure and texture processes in a dissociative manner. Here, two schemes are explored to produce property related feature representations. One is based on a single-stage network over hand-crafted features and the other is based on a multistage network, which can learn features from raw pixels automatically. Finally, those multiple midlevel features are incorporated by solving a multiple kernel learning task. Extensive experiments are conducted on several challenging data sets for visual classification, and experimental results demonstrate the effectiveness of the proposed method.
Representational Distance Learning for Deep Neural Networks
McClure, Patrick; Kriegeskorte, Nikolaus
2016-01-01
Deep neural networks (DNNs) provide useful models of visual representational transformations. We present a method that enables a DNN (student) to learn from the internal representational spaces of a reference model (teacher), which could be another DNN or, in the future, a biological brain. Representational spaces of the student and the teacher are characterized by representational distance matrices (RDMs). We propose representational distance learning (RDL), a stochastic gradient descent method that drives the RDMs of the student to approximate the RDMs of the teacher. We demonstrate that RDL is competitive with other transfer learning techniques for two publicly available benchmark computer vision datasets (MNIST and CIFAR-100), while allowing for architectural differences between student and teacher. By pulling the student's RDMs toward those of the teacher, RDL significantly improved visual classification performance when compared to baseline networks that did not use transfer learning. In the future, RDL may enable combined supervised training of deep neural networks using task constraints (e.g., images and category labels) and constraints from brain-activity measurements, so as to build models that replicate the internal representational spaces of biological brains. PMID:28082889
Representational Distance Learning for Deep Neural Networks.
McClure, Patrick; Kriegeskorte, Nikolaus
2016-01-01
Deep neural networks (DNNs) provide useful models of visual representational transformations. We present a method that enables a DNN (student) to learn from the internal representational spaces of a reference model (teacher), which could be another DNN or, in the future, a biological brain. Representational spaces of the student and the teacher are characterized by representational distance matrices (RDMs). We propose representational distance learning (RDL), a stochastic gradient descent method that drives the RDMs of the student to approximate the RDMs of the teacher. We demonstrate that RDL is competitive with other transfer learning techniques for two publicly available benchmark computer vision datasets (MNIST and CIFAR-100), while allowing for architectural differences between student and teacher. By pulling the student's RDMs toward those of the teacher, RDL significantly improved visual classification performance when compared to baseline networks that did not use transfer learning. In the future, RDL may enable combined supervised training of deep neural networks using task constraints (e.g., images and category labels) and constraints from brain-activity measurements, so as to build models that replicate the internal representational spaces of biological brains.
Online Hierarchical Sparse Representation of Multifeature for Robust Object Tracking
Qu, Shiru
2016-01-01
Object tracking based on sparse representation has given promising tracking results in recent years. However, the trackers under the framework of sparse representation always overemphasize the sparse representation and ignore the correlation of visual information. In addition, the sparse coding methods only encode the local region independently and ignore the spatial neighborhood information of the image. In this paper, we propose a robust tracking algorithm. Firstly, multiple complementary features are used to describe the object appearance; the appearance model of the tracked target is modeled by instantaneous and stable appearance features simultaneously. A two-stage sparse-coded method which takes the spatial neighborhood information of the image patch and the computation burden into consideration is used to compute the reconstructed object appearance. Then, the reliability of each tracker is measured by the tracking likelihood function of transient and reconstructed appearance models. Finally, the most reliable tracker is obtained by a well established particle filter framework; the training set and the template library are incrementally updated based on the current tracking results. Experiment results on different challenging video sequences show that the proposed algorithm performs well with superior tracking accuracy and robustness. PMID:27630710
NASA Astrophysics Data System (ADS)
Mefleh, Fuad N.; Baker, G. Hamilton; Kwartowitz, David M.
2014-03-01
In our previous work we presented a novel image-guided surgery (IGS) system, Kit for Navigation by Image Focused Exploration (KNIFE).1,2 KNIFE has been demonstrated to be effective in guiding mock clinical procedures with the tip of an electromagnetically tracked catheter overlaid onto a pre-captured bi-plane fluoroscopic loop. Representation of the catheter in KNIFE differs greatly from what is captured by the fluoroscope, due to distortions and other properties of fluoroscopic images. When imaged by a fluoroscope, catheters can be visualized due to the inclusion of radiopaque materials (i.e. Bi, Ba, W) in the polymer blend.3 However, in KNIFE catheter location is determined using a single tracking seed located in the catheter tip that is represented as a single point overlaid on pre-captured fluoroscopic images. To bridge the gap in catheter representation between KNIFE and traditional methods we constructed a catheter with five tracking seeds positioned along the distal 70 mm of the catheter. We have currently investigated the use of four spline interpolation methods for estimation of true catheter shape and have assesed the error in their estimation of true catheter shape. In this work we present a method for the evaluation of interpolation algorithms with respect to catheter shape determination.
Detecting Edges in Images by Use of Fuzzy Reasoning
NASA Technical Reports Server (NTRS)
Dominguez, Jesus A.; Klinko, Steve
2003-01-01
A method of processing digital image data to detect edges includes the use of fuzzy reasoning. The method is completely adaptive and does not require any advance knowledge of an image. During initial processing of image data at a low level of abstraction, the nature of the data is indeterminate. Fuzzy reasoning is used in the present method because it affords an ability to construct useful abstractions from approximate, incomplete, and otherwise imperfect sets of data. Humans are able to make some sense of even unfamiliar objects that have imperfect high-level representations. It appears that to perceive unfamiliar objects or to perceive familiar objects in imperfect images, humans apply heuristic algorithms to understand the images
A denoising algorithm for CT image using low-rank sparse coding
NASA Astrophysics Data System (ADS)
Lei, Yang; Xu, Dong; Zhou, Zhengyang; Wang, Tonghe; Dong, Xue; Liu, Tian; Dhabaan, Anees; Curran, Walter J.; Yang, Xiaofeng
2018-03-01
We propose a denoising method of CT image based on low-rank sparse coding. The proposed method constructs an adaptive dictionary of image patches and estimates the sparse coding regularization parameters using the Bayesian interpretation. A low-rank approximation approach is used to simultaneously construct the dictionary and achieve sparse representation through clustering similar image patches. A variable-splitting scheme and a quadratic optimization are used to reconstruct CT image based on achieved sparse coefficients. We tested this denoising technology using phantom, brain and abdominal CT images. The experimental results showed that the proposed method delivers state-of-art denoising performance, both in terms of objective criteria and visual quality.
Accelerated dynamic EPR imaging using fast acquisition and compressive recovery
NASA Astrophysics Data System (ADS)
Ahmad, Rizwan; Samouilov, Alexandre; Zweier, Jay L.
2016-12-01
Electron paramagnetic resonance (EPR) allows quantitative imaging of tissue redox status, which provides important information about ischemic syndromes, cancer and other pathologies. For continuous wave EPR imaging, however, poor signal-to-noise ratio and low acquisition efficiency limit its ability to image dynamic processes in vivo including tissue redox, where conditions can change rapidly. Here, we present a data acquisition and processing framework that couples fast acquisition with compressive sensing-inspired image recovery to enable EPR-based redox imaging with high spatial and temporal resolutions. The fast acquisition (FA) allows collecting more, albeit noisier, projections in a given scan time. The composite regularization based processing method, called spatio-temporal adaptive recovery (STAR), not only exploits sparsity in multiple representations of the spatio-temporal image but also adaptively adjusts the regularization strength for each representation based on its inherent level of the sparsity. As a result, STAR adjusts to the disparity in the level of sparsity across multiple representations, without introducing any tuning parameter. Our simulation and phantom imaging studies indicate that a combination of fast acquisition and STAR (FASTAR) enables high-fidelity recovery of volumetric image series, with each volumetric image employing less than 10 s of scan. In addition to image fidelity, the time constants derived from FASTAR also match closely to the ground truth even when a small number of projections are used for recovery. This development will enhance the capability of EPR to study fast dynamic processes that cannot be investigated using existing EPR imaging techniques.
From experimental imaging techniques to virtual embryology.
Weninger, Wolfgang J; Tassy, Olivier; Darras, Sébastien; Geyer, Stefan H; Thieffry, Denis
2004-01-01
Modern embryology increasingly relies on descriptive and functional three dimensional (3D) and four dimensional (4D) analysis of physically, optically, or virtually sectioned specimens. To cope with the technical requirements, new methods for high detailed in vivo imaging, as well as the generation of high resolution digital volume data sets for the accurate visualisation of transgene activity and gene product presence, in the context of embryo morphology, were recently developed and are under construction. These methods profoundly change the scientific applicability, appearance and style of modern embryo representations. In this paper, we present an overview of the emerging techniques to create, visualise and administrate embryo representations (databases, digital data sets, 3-4D embryo reconstructions, models, etc.), and discuss the implications of these new methods on the work of modern embryologists, including, research, teaching, the selection of specific model organisms, and potential collaborators.
NASA Astrophysics Data System (ADS)
Zhang, Min; Zhou, Xiangrong; Goshima, Satoshi; Chen, Huayue; Muramatsu, Chisako; Hara, Takeshi; Yokoyama, Ryojiro; Kanematsu, Masayuki; Fujita, Hiroshi
2012-03-01
We aim at using a new texton based texture classification method in the classification of pulmonary emphysema in computed tomography (CT) images of the lungs. Different from conventional computer-aided diagnosis (CAD) pulmonary emphysema classification methods, in this paper, firstly, the dictionary of texton is learned via applying sparse representation(SR) to image patches in the training dataset. Then the SR coefficients of the test images over the dictionary are used to construct the histograms for texture presentations. Finally, classification is performed by using a nearest neighbor classifier with a histogram dissimilarity measure as distance. The proposed approach is tested on 3840 annotated regions of interest consisting of normal tissue and mild, moderate and severe pulmonary emphysema of three subtypes. The performance of the proposed system, with an accuracy of about 88%, is comparably higher than state of the art method based on the basic rotation invariant local binary pattern histograms and the texture classification method based on texton learning by k-means, which performs almost the best among other approaches in the literature.
Feature hashing for fast image retrieval
NASA Astrophysics Data System (ADS)
Yan, Lingyu; Fu, Jiarun; Zhang, Hongxin; Yuan, Lu; Xu, Hui
2018-03-01
Currently, researches on content based image retrieval mainly focus on robust feature extraction. However, due to the exponential growth of online images, it is necessary to consider searching among large scale images, which is very timeconsuming and unscalable. Hence, we need to pay much attention to the efficiency of image retrieval. In this paper, we propose a feature hashing method for image retrieval which not only generates compact fingerprint for image representation, but also prevents huge semantic loss during the process of hashing. To generate the fingerprint, an objective function of semantic loss is constructed and minimized, which combine the influence of both the neighborhood structure of feature data and mapping error. Since the machine learning based hashing effectively preserves neighborhood structure of data, it yields visual words with strong discriminability. Furthermore, the generated binary codes leads image representation building to be of low-complexity, making it efficient and scalable to large scale databases. Experimental results show good performance of our approach.
Deep supervised dictionary learning for no-reference image quality assessment
NASA Astrophysics Data System (ADS)
Huang, Yuge; Liu, Xuesong; Tian, Xiang; Zhou, Fan; Chen, Yaowu; Jiang, Rongxin
2018-03-01
We propose a deep convolutional neural network (CNN) for general no-reference image quality assessment (NR-IQA), i.e., accurate prediction of image quality without a reference image. The proposed model consists of three components such as a local feature extractor that is a fully CNN, an encoding module with an inherent dictionary that aggregates local features to output a fixed-length global quality-aware image representation, and a regression module that maps the representation to an image quality score. Our model can be trained in an end-to-end manner, and all of the parameters, including the weights of the convolutional layers, the dictionary, and the regression weights, are simultaneously learned from the loss function. In addition, the model can predict quality scores for input images of arbitrary sizes in a single step. We tested our method on commonly used image quality databases and showed that its performance is comparable with that of state-of-the-art general-purpose NR-IQA algorithms.
Human inferior colliculus activity relates to individual differences in spoken language learning
Chandrasekaran, Bharath; Kraus, Nina
2012-01-01
A challenge to learning words of a foreign language is encoding nonnative phonemes, a process typically attributed to cortical circuitry. Using multimodal imaging methods [functional magnetic resonance imaging-adaptation (fMRI-A) and auditory brain stem responses (ABR)], we examined the extent to which pretraining pitch encoding in the inferior colliculus (IC), a primary midbrain structure, related to individual variability in learning to successfully use nonnative pitch patterns to distinguish words in American English-speaking adults. fMRI-A indexed the efficiency of pitch representation localized to the IC, whereas ABR quantified midbrain pitch-related activity with millisecond precision. In line with neural “sharpening” models, we found that efficient IC pitch pattern representation (indexed by fMRI) related to superior neural representation of pitch patterns (indexed by ABR), and consequently more successful word learning following sound-to-meaning training. Our results establish a critical role for the IC in speech-sound representation, consistent with the established role for the IC in the representation of communication signals in other animal models. PMID:22131377
Dictionary Pair Learning on Grassmann Manifolds for Image Denoising.
Zeng, Xianhua; Bian, Wei; Liu, Wei; Shen, Jialie; Tao, Dacheng
2015-11-01
Image denoising is a fundamental problem in computer vision and image processing that holds considerable practical importance for real-world applications. The traditional patch-based and sparse coding-driven image denoising methods convert 2D image patches into 1D vectors for further processing. Thus, these methods inevitably break down the inherent 2D geometric structure of natural images. To overcome this limitation pertaining to the previous image denoising methods, we propose a 2D image denoising model, namely, the dictionary pair learning (DPL) model, and we design a corresponding algorithm called the DPL on the Grassmann-manifold (DPLG) algorithm. The DPLG algorithm first learns an initial dictionary pair (i.e., the left and right dictionaries) by employing a subspace partition technique on the Grassmann manifold, wherein the refined dictionary pair is obtained through a sub-dictionary pair merging. The DPLG obtains a sparse representation by encoding each image patch only with the selected sub-dictionary pair. The non-zero elements of the sparse representation are further smoothed by the graph Laplacian operator to remove the noise. Consequently, the DPLG algorithm not only preserves the inherent 2D geometric structure of natural images but also performs manifold smoothing in the 2D sparse coding space. We demonstrate that the DPLG algorithm also improves the structural SIMilarity values of the perceptual visual quality for denoised images using the experimental evaluations on the benchmark images and Berkeley segmentation data sets. Moreover, the DPLG also produces the competitive peak signal-to-noise ratio values from popular image denoising algorithms.
A Subdivision-Based Representation for Vector Image Editing.
Liao, Zicheng; Hoppe, Hugues; Forsyth, David; Yu, Yizhou
2012-11-01
Vector graphics has been employed in a wide variety of applications due to its scalability and editability. Editability is a high priority for artists and designers who wish to produce vector-based graphical content with user interaction. In this paper, we introduce a new vector image representation based on piecewise smooth subdivision surfaces, which is a simple, unified and flexible framework that supports a variety of operations, including shape editing, color editing, image stylization, and vector image processing. These operations effectively create novel vector graphics by reusing and altering existing image vectorization results. Because image vectorization yields an abstraction of the original raster image, controlling the level of detail of this abstraction is highly desirable. To this end, we design a feature-oriented vector image pyramid that offers multiple levels of abstraction simultaneously. Our new vector image representation can be rasterized efficiently using GPU-accelerated subdivision. Experiments indicate that our vector image representation achieves high visual quality and better supports editing operations than existing representations.
Methods And Systems For Using Reference Images In Acoustic Image Processing
Moore, Thomas L.; Barter, Robert Henry
2005-01-04
A method and system of examining tissue are provided in which a field, including at least a portion of the tissue and one or more registration fiducials, is insonified. Scattered acoustic information, including both transmitted and reflected waves, is received from the field. A representation of the field, including both the tissue and the registration fiducials, is then derived from the received acoustic radiation.
Plenoptic image watermarking to preserve copyright
NASA Astrophysics Data System (ADS)
Ansari, A.; Dorado, A.; Saavedra, G.; Martinez Corral, M.
2017-05-01
Common camera loses a huge amount of information obtainable from scene as it does not record the value of individual rays passing a point and it merely keeps the summation of intensities of all the rays passing a point. Plenoptic images can be exploited to provide a 3D representation of the scene and watermarking such images can be helpful to protect the ownership of these images. In this paper we propose a method for watermarking the plenoptic images to achieve this aim. The performance of the proposed method is validated by experimental results and a compromise is held between imperceptibility and robustness.
Prostate segmentation by sparse representation based classification
Gao, Yaozong; Liao, Shu; Shen, Dinggang
2012-01-01
Purpose: The segmentation of prostate in CT images is of essential importance to external beam radiotherapy, which is one of the major treatments for prostate cancer nowadays. During the radiotherapy, the prostate is radiated by high-energy x rays from different directions. In order to maximize the dose to the cancer and minimize the dose to the surrounding healthy tissues (e.g., bladder and rectum), the prostate in the new treatment image needs to be accurately localized. Therefore, the effectiveness and efficiency of external beam radiotherapy highly depend on the accurate localization of the prostate. However, due to the low contrast of the prostate with its surrounding tissues (e.g., bladder), the unpredicted prostate motion, and the large appearance variations across different treatment days, it is challenging to segment the prostate in CT images. In this paper, the authors present a novel classification based segmentation method to address these problems. Methods: To segment the prostate, the proposed method first uses sparse representation based classification (SRC) to enhance the prostate in CT images by pixel-wise classification, in order to overcome the limitation of poor contrast of the prostate images. Then, based on the classification results, previous segmented prostates of the same patient are used as patient-specific atlases to align onto the current treatment image and the majority voting strategy is finally adopted to segment the prostate. In order to address the limitations of the traditional SRC in pixel-wise classification, especially for the purpose of segmentation, the authors extend SRC from the following four aspects: (1) A discriminant subdictionary learning method is proposed to learn a discriminant and compact representation of training samples for each class so that the discriminant power of SRC can be increased and also SRC can be applied to the large-scale pixel-wise classification. (2) The L1 regularized sparse coding is replaced by the elastic net in order to obtain a smooth and clear prostate boundary in the classification result. (3) Residue-based linear regression is incorporated to improve the classification performance and to extend SRC from hard classification to soft classification. (4) Iterative SRC is proposed by using context information to iteratively refine the classification results. Results: The proposed method has been comprehensively evaluated on a dataset consisting of 330 CT images from 24 patients. The effectiveness of the extended SRC has been validated by comparing it with the traditional SRC based on the proposed four extensions. The experimental results show that our extended SRC can obtain not only more accurate classification results but also smoother and clearer prostate boundary than the traditional SRC. Besides, the comparison with other five state-of-the-art prostate segmentation methods indicates that our method can achieve better performance than other methods under comparison. Conclusions: The authors have proposed a novel prostate segmentation method based on the sparse representation based classification, which can achieve considerably accurate segmentation results in CT prostate segmentation. PMID:23039673
Sparse subspace clustering for data with missing entries and high-rank matrix completion.
Fan, Jicong; Chow, Tommy W S
2017-09-01
Many methods have recently been proposed for subspace clustering, but they are often unable to handle incomplete data because of missing entries. Using matrix completion methods to recover missing entries is a common way to solve the problem. Conventional matrix completion methods require that the matrix should be of low-rank intrinsically, but most matrices are of high-rank or even full-rank in practice, especially when the number of subspaces is large. In this paper, a new method called Sparse Representation with Missing Entries and Matrix Completion is proposed to solve the problems of incomplete-data subspace clustering and high-rank matrix completion. The proposed algorithm alternately computes the matrix of sparse representation coefficients and recovers the missing entries of a data matrix. The proposed algorithm recovers missing entries through minimizing the representation coefficients, representation errors, and matrix rank. Thorough experimental study and comparative analysis based on synthetic data and natural images were conducted. The presented results demonstrate that the proposed algorithm is more effective in subspace clustering and matrix completion compared with other existing methods. Copyright © 2017 Elsevier Ltd. All rights reserved.
Images of Nature in Greek Primary School Textbooks
ERIC Educational Resources Information Center
Korfiatis, Kostas J.; Stamou, Anastasia G.; Paraskevopoulos, Stephanos
2004-01-01
In this article, the environmental content of the textbooks used for the teaching of natural sciences in Greek primary schools was examined. Specifically, by employing the method of content analysis, both representational (metaphors, depictions, values, etc.) and cognitive ecological concepts) elements, building images of nature, and shaping our…
Restoration for Noise Removal in Quantum Images
NASA Astrophysics Data System (ADS)
Liu, Kai; Zhang, Yi; Lu, Kai; Wang, Xiaoping
2017-09-01
Quantum computation has become increasingly attractive in the past few decades due to its extraordinary performance. As a result, some studies focusing on image representation and processing via quantum mechanics have been done. However, few of them have considered the quantum operations for images restoration. To address this problem, three noise removal algorithms are proposed in this paper based on the novel enhanced quantum representation model, oriented to two kinds of noise pollution (Salt-and-Pepper noise and Gaussian noise). For the first algorithm Q-Mean, it is designed to remove the Salt-and-Pepper noise. The noise points are extracted through comparisons with the adjacent pixel values, after which the restoration operation is finished by mean filtering. As for the second method Q-Gauss, a special mask is applied to weaken the Gaussian noise pollution. The third algorithm Q-Adapt is effective for the source image containing unknown noise. The type of noise can be judged through the quantum statistic operations for the color value of the whole image, and then different noise removal algorithms are used to conduct image restoration respectively. Performance analysis reveals that our methods can offer high restoration quality and achieve significant speedup through inherent parallelism of quantum computation.
The neural basis of precise visual short-term memory for complex recognisable objects.
Veldsman, Michele; Mitchell, Daniel J; Cusack, Rhodri
2017-10-01
Recent evidence suggests that visual short-term memory (VSTM) capacity estimated using simple objects, such as colours and oriented bars, may not generalise well to more naturalistic stimuli. More visual detail can be stored in VSTM when complex, recognisable objects are maintained compared to simple objects. It is not yet known if it is recognisability that enhances memory precision, nor whether maintenance of recognisable objects is achieved with the same network of brain regions supporting maintenance of simple objects. We used a novel stimulus generation method to parametrically warp photographic images along a continuum, allowing separate estimation of the precision of memory representations and the number of items retained. The stimulus generation method was also designed to create unrecognisable, though perceptually matched, stimuli, to investigate the impact of recognisability on VSTM. We adapted the widely-used change detection and continuous report paradigms for use with complex, photographic images. Across three functional magnetic resonance imaging (fMRI) experiments, we demonstrated greater precision for recognisable objects in VSTM compared to unrecognisable objects. This clear behavioural advantage was not the result of recruitment of additional brain regions, or of stronger mean activity within the core network. Representational similarity analysis revealed greater variability across item repetitions in the representations of recognisable, compared to unrecognisable complex objects. We therefore propose that a richer range of neural representations support VSTM for complex recognisable objects. Copyright © 2017 Elsevier Inc. All rights reserved.
Interactive distributed hardware-accelerated LOD-sprite terrain rendering with stable frame rates
NASA Astrophysics Data System (ADS)
Swan, J. E., II; Arango, Jesus; Nakshatrala, Bala K.
2002-03-01
A stable frame rate is important for interactive rendering systems. Image-based modeling and rendering (IBMR) techniques, which model parts of the scene with image sprites, are a promising technique for interactive systems because they allow the sprite to be manipulated instead of the underlying scene geometry. However, with IBMR techniques a frequent problem is an unstable frame rate, because generating an image sprite (with 3D rendering) is time-consuming relative to manipulating the sprite (with 2D image resampling). This paper describes one solution to this problem, by distributing an IBMR technique into a collection of cooperating threads and executable programs across two computers. The particular IBMR technique distributed here is the LOD-Sprite algorithm. This technique uses a multiple level-of-detail (LOD) scene representation. It first renders a keyframe from a high-LOD representation, and then caches the frame as an image sprite. It renders subsequent spriteframes by texture-mapping the cached image sprite into a lower-LOD representation. We describe a distributed architecture and implementation of LOD-Sprite, in the context of terrain rendering, which takes advantage of graphics hardware. We present timing results which indicate we have achieved a stable frame rate. In addition to LOD-Sprite, our distribution method holds promise for other IBMR techniques.
Profile of biology prospective teachers’ representation on plant anatomy learning
NASA Astrophysics Data System (ADS)
Ermayanti; Susanti, R.; Anwar, Y.
2018-04-01
This study aims to obtaining students’ representation ability in understanding the structure and function of plant tissues in plant anatomy course. Thirty students of The Biology Education Department of Sriwijaya University were involved in this study. Data on representation ability were collected using test and observation. The instruments had been validated by expert judgment. Test scores were used to represent students’ ability in 4 categories: 2D-image, 3D-image, spatial, and verbal representations. The results show that students’ representation ability is still low: 2D-image (40.0), 3D-image (25.0), spatial (20.0), and verbal representation (45.0). Based on the results of this study, it is suggested that instructional strategies be developed for plant anatomy course.
Generating Text from Functional Brain Images
Pereira, Francisco; Detre, Greg; Botvinick, Matthew
2011-01-01
Recent work has shown that it is possible to take brain images acquired during viewing of a scene and reconstruct an approximation of the scene from those images. Here we show that it is also possible to generate text about the mental content reflected in brain images. We began with images collected as participants read names of concrete items (e.g., “Apartment’’) while also seeing line drawings of the item named. We built a model of the mental semantic representation of concrete concepts from text data and learned to map aspects of such representation to patterns of activation in the corresponding brain image. In order to validate this mapping, without accessing information about the items viewed for left-out individual brain images, we were able to generate from each one a collection of semantically pertinent words (e.g., “door,” “window” for “Apartment’’). Furthermore, we show that the ability to generate such words allows us to perform a classification task and thus validate our method quantitatively. PMID:21927602
Research on pre-processing of QR Code
NASA Astrophysics Data System (ADS)
Sun, Haixing; Xia, Haojie; Dong, Ning
2013-10-01
QR code encodes many kinds of information because of its advantages: large storage capacity, high reliability, full arrange of utter-high-speed reading, small printing size and high-efficient representation of Chinese characters, etc. In order to obtain the clearer binarization image from complex background, and improve the recognition rate of QR code, this paper researches on pre-processing methods of QR code (Quick Response Code), and shows algorithms and results of image pre-processing for QR code recognition. Improve the conventional method by changing the Souvola's adaptive text recognition method. Additionally, introduce the QR code Extraction which adapts to different image size, flexible image correction approach, and improve the efficiency and accuracy of QR code image processing.
Three-dimensional representation of curved nanowires.
Huang, Z; Dikin, D A; Ding, W; Qiao, Y; Chen, X; Fridman, Y; Ruoff, R S
2004-12-01
Nanostructures, such as nanowires, nanotubes and nanocoils, can be described in many cases as quasi one-dimensional curved objects projecting in three-dimensional space. A parallax method to construct the correct three-dimensional geometry of such one-dimensional nanostructures is presented. A series of scanning electron microscope images was acquired at different view angles, thus providing a set of image pairs that were used to generate three-dimensional representations using a matlab program. An error analysis as a function of the view angle between the two images is presented and discussed. As an example application, the importance of knowing the true three-dimensional shape of boron nanowires is demonstrated; without the nanowire's correct length and diameter, mechanical resonance data cannot provide an accurate estimate of Young's modulus.
Optical aberration correction for simple lenses via sparse representation
NASA Astrophysics Data System (ADS)
Cui, Jinlin; Huang, Wei
2018-04-01
Simple lenses with spherical surfaces are lightweight, inexpensive, highly flexible, and can be easily processed. However, they suffer from optical aberrations that lead to limitations in high-quality photography. In this study, we propose a set of computational photography techniques based on sparse signal representation to remove optical aberrations, thereby allowing the recovery of images captured through a single-lens camera. The primary advantage of the proposed method is that many prior point spread functions calibrated at different depths are successfully used for restoring visual images in a short time, which can be generally applied to nonblind deconvolution methods for solving the problem of the excessive processing time caused by the number of point spread functions. The optical software CODE V is applied for examining the reliability of the proposed method by simulation. The simulation results reveal that the suggested method outperforms the traditional methods. Moreover, the performance of a single-lens camera is significantly enhanced both qualitatively and perceptually. Particularly, the prior information obtained by CODE V can be used for processing the real images of a single-lens camera, which provides an alternative approach to conveniently and accurately obtain point spread functions of single-lens cameras.
NASA Astrophysics Data System (ADS)
Zhang, Peng; Peng, Jing; Sims, S. Richard F.
2005-05-01
In ATR applications, each feature is a convolution of an image with a filter. It is important to use most discriminant features to produce compact representations. We propose two novel subspace methods for dimension reduction to address limitations associated with Fukunaga-Koontz Transform (FKT). The first method, Scatter-FKT, assumes that target is more homogeneous, while clutter can be anything other than target and anywhere. Thus, instead of estimating a clutter covariance matrix, Scatter-FKT computes a clutter scatter matrix that measures the spread of clutter from the target mean. We choose dimensions along which the difference in variation between target and clutter is most pronounced. When the target follows a Gaussian distribution, Scatter-FKT can be viewed as a generalization of FKT. The second method, Optimal Bayesian Subspace, is derived from the optimal Bayesian classifier. It selects dimensions such that the minimum Bayes error rate can be achieved. When both target and clutter follow Gaussian distributions, OBS computes optimal subspace representations. We compare our methods against FKT using character image as well as IR data.
Pasqualotto, Achille; Esenkaya, Tayfun
2016-01-01
Visual-to-auditory sensory substitution is used to convey visual information through audition, and it was initially created to compensate for blindness; it consists of software converting the visual images captured by a video-camera into the equivalent auditory images, or "soundscapes". Here, it was used by blindfolded sighted participants to learn the spatial position of simple shapes depicted in images arranged on the floor. Very few studies have used sensory substitution to investigate spatial representation, while it has been widely used to investigate object recognition. Additionally, with sensory substitution we could study the performance of participants actively exploring the environment through audition, rather than passively localizing sound sources. Blindfolded participants egocentrically learnt the position of six images by using sensory substitution and then a judgment of relative direction task (JRD) was used to determine how this scene was represented. This task consists of imagining being in a given location, oriented in a given direction, and pointing towards the required image. Before performing the JRD task, participants explored a map that provided allocentric information about the scene. Although spatial exploration was egocentric, surprisingly we found that performance in the JRD task was better for allocentric perspectives. This suggests that the egocentric representation of the scene was updated. This result is in line with previous studies using visual and somatosensory scenes, thus supporting the notion that different sensory modalities produce equivalent spatial representation(s). Moreover, our results have practical implications to improve training methods with sensory substitution devices (SSD).
Image fusion using sparse overcomplete feature dictionaries
Brumby, Steven P.; Bettencourt, Luis; Kenyon, Garrett T.; Chartrand, Rick; Wohlberg, Brendt
2015-10-06
Approaches for deciding what individuals in a population of visual system "neurons" are looking for using sparse overcomplete feature dictionaries are provided. A sparse overcomplete feature dictionary may be learned for an image dataset and a local sparse representation of the image dataset may be built using the learned feature dictionary. A local maximum pooling operation may be applied on the local sparse representation to produce a translation-tolerant representation of the image dataset. An object may then be classified and/or clustered within the translation-tolerant representation of the image dataset using a supervised classification algorithm and/or an unsupervised clustering algorithm.
Prostate segmentation by sparse representation based classification.
Gao, Yaozong; Liao, Shu; Shen, Dinggang
2012-10-01
The segmentation of prostate in CT images is of essential importance to external beam radiotherapy, which is one of the major treatments for prostate cancer nowadays. During the radiotherapy, the prostate is radiated by high-energy x rays from different directions. In order to maximize the dose to the cancer and minimize the dose to the surrounding healthy tissues (e.g., bladder and rectum), the prostate in the new treatment image needs to be accurately localized. Therefore, the effectiveness and efficiency of external beam radiotherapy highly depend on the accurate localization of the prostate. However, due to the low contrast of the prostate with its surrounding tissues (e.g., bladder), the unpredicted prostate motion, and the large appearance variations across different treatment days, it is challenging to segment the prostate in CT images. In this paper, the authors present a novel classification based segmentation method to address these problems. To segment the prostate, the proposed method first uses sparse representation based classification (SRC) to enhance the prostate in CT images by pixel-wise classification, in order to overcome the limitation of poor contrast of the prostate images. Then, based on the classification results, previous segmented prostates of the same patient are used as patient-specific atlases to align onto the current treatment image and the majority voting strategy is finally adopted to segment the prostate. In order to address the limitations of the traditional SRC in pixel-wise classification, especially for the purpose of segmentation, the authors extend SRC from the following four aspects: (1) A discriminant subdictionary learning method is proposed to learn a discriminant and compact representation of training samples for each class so that the discriminant power of SRC can be increased and also SRC can be applied to the large-scale pixel-wise classification. (2) The L1 regularized sparse coding is replaced by the elastic net in order to obtain a smooth and clear prostate boundary in the classification result. (3) Residue-based linear regression is incorporated to improve the classification performance and to extend SRC from hard classification to soft classification. (4) Iterative SRC is proposed by using context information to iteratively refine the classification results. The proposed method has been comprehensively evaluated on a dataset consisting of 330 CT images from 24 patients. The effectiveness of the extended SRC has been validated by comparing it with the traditional SRC based on the proposed four extensions. The experimental results show that our extended SRC can obtain not only more accurate classification results but also smoother and clearer prostate boundary than the traditional SRC. Besides, the comparison with other five state-of-the-art prostate segmentation methods indicates that our method can achieve better performance than other methods under comparison. The authors have proposed a novel prostate segmentation method based on the sparse representation based classification, which can achieve considerably accurate segmentation results in CT prostate segmentation.
Khalilzadeh, Mohammad Mahdi; Fatemizadeh, Emad; Behnam, Hamid
2013-06-01
Automatic extraction of the varying regions of magnetic resonance images is required as a prior step in a diagnostic intelligent system. The sparsest representation and high-dimensional feature are provided based on learned dictionary. The classification is done by employing the technique that computes the reconstruction error locally and non-locally of each pixel. The acquired results from the real and simulated images are superior to the best MRI segmentation method with regard to the stability advantages. In addition, it is segmented exactly through a formula taken from the distance and sparse factors. Also, it is done automatically taking sparse factor in unsupervised clustering methods whose results have been improved. Copyright © 2013 Elsevier Inc. All rights reserved.
Infrared and visible image fusion with spectral graph wavelet transform.
Yan, Xiang; Qin, Hanlin; Li, Jia; Zhou, Huixin; Zong, Jing-guo
2015-09-01
Infrared and visible image fusion technique is a popular topic in image analysis because it can integrate complementary information and obtain reliable and accurate description of scenes. Multiscale transform theory as a signal representation method is widely used in image fusion. In this paper, a novel infrared and visible image fusion method is proposed based on spectral graph wavelet transform (SGWT) and bilateral filter. The main novelty of this study is that SGWT is used for image fusion. On the one hand, source images are decomposed by SGWT in its transform domain. The proposed approach not only effectively preserves the details of different source images, but also excellently represents the irregular areas of the source images. On the other hand, a novel weighted average method based on bilateral filter is proposed to fuse low- and high-frequency subbands by taking advantage of spatial consistency of natural images. Experimental results demonstrate that the proposed method outperforms seven recently proposed image fusion methods in terms of both visual effect and objective evaluation metrics.
An Interactive Image Segmentation Method in Hand Gesture Recognition
Chen, Disi; Li, Gongfa; Sun, Ying; Kong, Jianyi; Jiang, Guozhang; Tang, Heng; Ju, Zhaojie; Yu, Hui; Liu, Honghai
2017-01-01
In order to improve the recognition rate of hand gestures a new interactive image segmentation method for hand gesture recognition is presented, and popular methods, e.g., Graph cut, Random walker, Interactive image segmentation using geodesic star convexity, are studied in this article. The Gaussian Mixture Model was employed for image modelling and the iteration of Expectation Maximum algorithm learns the parameters of Gaussian Mixture Model. We apply a Gibbs random field to the image segmentation and minimize the Gibbs Energy using Min-cut theorem to find the optimal segmentation. The segmentation result of our method is tested on an image dataset and compared with other methods by estimating the region accuracy and boundary accuracy. Finally five kinds of hand gestures in different backgrounds are tested on our experimental platform, and the sparse representation algorithm is used, proving that the segmentation of hand gesture images helps to improve the recognition accuracy. PMID:28134818
Segmentation of High Angular Resolution Diffusion MRI using Sparse Riemannian Manifold Clustering
Wright, Margaret J.; Thompson, Paul M.; Vidal, René
2015-01-01
We address the problem of segmenting high angular resolution diffusion imaging (HARDI) data into multiple regions (or fiber tracts) with distinct diffusion properties. We use the orientation distribution function (ODF) to represent HARDI data and cast the problem as a clustering problem in the space of ODFs. Our approach integrates tools from sparse representation theory and Riemannian geometry into a graph theoretic segmentation framework. By exploiting the Riemannian properties of the space of ODFs, we learn a sparse representation for each ODF and infer the segmentation by applying spectral clustering to a similarity matrix built from these representations. In cases where regions with similar (resp. distinct) diffusion properties belong to different (resp. same) fiber tracts, we obtain the segmentation by incorporating spatial and user-specified pairwise relationships into the formulation. Experiments on synthetic data evaluate the sensitivity of our method to image noise and the presence of complex fiber configurations, and show its superior performance compared to alternative segmentation methods. Experiments on phantom and real data demonstrate the accuracy of the proposed method in segmenting simulated fibers, as well as white matter fiber tracts of clinical importance in the human brain. PMID:24108748
Constrained Low-Rank Learning Using Least Squares-Based Regularization.
Li, Ping; Yu, Jun; Wang, Meng; Zhang, Luming; Cai, Deng; Li, Xuelong
2017-12-01
Low-rank learning has attracted much attention recently due to its efficacy in a rich variety of real-world tasks, e.g., subspace segmentation and image categorization. Most low-rank methods are incapable of capturing low-dimensional subspace for supervised learning tasks, e.g., classification and regression. This paper aims to learn both the discriminant low-rank representation (LRR) and the robust projecting subspace in a supervised manner. To achieve this goal, we cast the problem into a constrained rank minimization framework by adopting the least squares regularization. Naturally, the data label structure tends to resemble that of the corresponding low-dimensional representation, which is derived from the robust subspace projection of clean data by low-rank learning. Moreover, the low-dimensional representation of original data can be paired with some informative structure by imposing an appropriate constraint, e.g., Laplacian regularizer. Therefore, we propose a novel constrained LRR method. The objective function is formulated as a constrained nuclear norm minimization problem, which can be solved by the inexact augmented Lagrange multiplier algorithm. Extensive experiments on image classification, human pose estimation, and robust face recovery have confirmed the superiority of our method.
Context-Aware Local Binary Feature Learning for Face Recognition.
Duan, Yueqi; Lu, Jiwen; Feng, Jianjiang; Zhou, Jie
2018-05-01
In this paper, we propose a context-aware local binary feature learning (CA-LBFL) method for face recognition. Unlike existing learning-based local face descriptors such as discriminant face descriptor (DFD) and compact binary face descriptor (CBFD) which learn each feature code individually, our CA-LBFL exploits the contextual information of adjacent bits by constraining the number of shifts from different binary bits, so that more robust information can be exploited for face representation. Given a face image, we first extract pixel difference vectors (PDV) in local patches, and learn a discriminative mapping in an unsupervised manner to project each pixel difference vector into a context-aware binary vector. Then, we perform clustering on the learned binary codes to construct a codebook, and extract a histogram feature for each face image with the learned codebook as the final representation. In order to exploit local information from different scales, we propose a context-aware local binary multi-scale feature learning (CA-LBMFL) method to jointly learn multiple projection matrices for face representation. To make the proposed methods applicable for heterogeneous face recognition, we present a coupled CA-LBFL (C-CA-LBFL) method and a coupled CA-LBMFL (C-CA-LBMFL) method to reduce the modality gap of corresponding heterogeneous faces in the feature level, respectively. Extensive experimental results on four widely used face datasets clearly show that our methods outperform most state-of-the-art face descriptors.
A User Study on Tactile Graphic Generation Methods
ERIC Educational Resources Information Center
Krufka, S. E.; Barner, K. E.
2006-01-01
Methods to automatically convert graphics into tactile representations have been recently investigated, creating either raised-line or relief images. In particular, we briefly review one raised-line method where important features are emphasized. This paper focuses primarily on the effects of such emphasis and on comparing both raised-line and…
ERIC Educational Resources Information Center
Py, Bernard, Ed.
2000-01-01
Articles in this issue include the following: "Representations sociales et discours. Questions epistemologiques et methodologiques" (Social Representations and Discourse. Questions of Epistemology and Methodology); "Aspects theoretiques et methodologiques de la recherche sur le traitement discursif des representations sociales"…
Zhang, Yu; Wu, Jianxin; Cai, Jianfei
2016-05-01
In large-scale visual recognition and image retrieval tasks, feature vectors, such as Fisher vector (FV) or the vector of locally aggregated descriptors (VLAD), have achieved state-of-the-art results. However, the combination of the large numbers of examples and high-dimensional vectors necessitates dimensionality reduction, in order to reduce its storage and CPU costs to a reasonable range. In spite of the popularity of various feature compression methods, this paper shows that the feature (dimension) selection is a better choice for high-dimensional FV/VLAD than the feature (dimension) compression methods, e.g., product quantization. We show that strong correlation among the feature dimensions in the FV and the VLAD may not exist, which renders feature selection a natural choice. We also show that, many dimensions in FV/VLAD are noise. Throwing them away using feature selection is better than compressing them and useful dimensions altogether using feature compression methods. To choose features, we propose an efficient importance sorting algorithm considering both the supervised and unsupervised cases, for visual recognition and image retrieval, respectively. Combining with the 1-bit quantization, feature selection has achieved both higher accuracy and less computational cost than feature compression methods, such as product quantization, on the FV and the VLAD image representations.
Accelerated dynamic EPR imaging using fast acquisition and compressive recovery.
Ahmad, Rizwan; Samouilov, Alexandre; Zweier, Jay L
2016-12-01
Electron paramagnetic resonance (EPR) allows quantitative imaging of tissue redox status, which provides important information about ischemic syndromes, cancer and other pathologies. For continuous wave EPR imaging, however, poor signal-to-noise ratio and low acquisition efficiency limit its ability to image dynamic processes in vivo including tissue redox, where conditions can change rapidly. Here, we present a data acquisition and processing framework that couples fast acquisition with compressive sensing-inspired image recovery to enable EPR-based redox imaging with high spatial and temporal resolutions. The fast acquisition (FA) allows collecting more, albeit noisier, projections in a given scan time. The composite regularization based processing method, called spatio-temporal adaptive recovery (STAR), not only exploits sparsity in multiple representations of the spatio-temporal image but also adaptively adjusts the regularization strength for each representation based on its inherent level of the sparsity. As a result, STAR adjusts to the disparity in the level of sparsity across multiple representations, without introducing any tuning parameter. Our simulation and phantom imaging studies indicate that a combination of fast acquisition and STAR (FASTAR) enables high-fidelity recovery of volumetric image series, with each volumetric image employing less than 10 s of scan. In addition to image fidelity, the time constants derived from FASTAR also match closely to the ground truth even when a small number of projections are used for recovery. This development will enhance the capability of EPR to study fast dynamic processes that cannot be investigated using existing EPR imaging techniques. Copyright © 2016 Elsevier Inc. All rights reserved.
Novel transform for image description and compression with implementation by neural architectures
NASA Astrophysics Data System (ADS)
Ben-Arie, Jezekiel; Rao, Raghunath K.
1991-10-01
A general method for signal representation using nonorthogonal basis functions that are composed of Gaussians are described. The Gaussians can be combined into groups with predetermined configuration that can approximate any desired basis function. The same configuration at different scales forms a set of self-similar wavelets. The general scheme is demonstrated by representing a natural signal employing an arbitrary basis function. The basic methodology is demonstrated by two novel schemes for efficient representation of 1-D and 2- D signals using Gaussian basis functions (BFs). Special methods are required here since the Gaussian functions are nonorthogonal. The first method employs a paradigm of maximum energy reduction interlaced with the A* heuristic search. The second method uses an adaptive lattice system to find the minimum-squared error of the BFs onto the signal, and a lateral-vertical suppression network to select the most efficient representation in terms of data compression.
Geodesic regression for image time-series.
Niethammer, Marc; Huang, Yang; Vialard, François-Xavier
2011-01-01
Registration of image-time series has so far been accomplished (i) by concatenating registrations between image pairs, (ii) by solving a joint estimation problem resulting in piecewise geodesic paths between image pairs, (iii) by kernel based local averaging or (iv) by augmenting the joint estimation with additional temporal irregularity penalties. Here, we propose a generative model extending least squares linear regression to the space of images by using a second-order dynamic formulation for image registration. Unlike previous approaches, the formulation allows for a compact representation of an approximation to the full spatio-temporal trajectory through its initial values. The method also opens up possibilities to design image-based approximation algorithms. The resulting optimization problem is solved using an adjoint method.
An automatic panoramic image reconstruction scheme from dental computed tomography images
Papakosta, Thekla K; Savva, Antonis D; Economopoulos, Theodore L; Gröhndal, H G
2017-01-01
Objectives: Panoramic images of the jaws are extensively used for dental examinations and/or surgical planning because they provide a general overview of the patient's maxillary and mandibular regions. Panoramic images are two-dimensional projections of three-dimensional (3D) objects. Therefore, it should be possible to reconstruct them from 3D radiographic representations of the jaws, produced by CBCT scanning, obviating the need for additional exposure to X-rays, should there be a need of panoramic views. The aim of this article is to present an automated method for reconstructing panoramic dental images from CBCT data. Methods: The proposed methodology consists of a series of sequential processing stages for detecting a fitting dental arch which is used for projecting the 3D information of the CBCT data to the two-dimensional plane of the panoramic image. The detection is based on a template polynomial which is constructed from a training data set. Results: A total of 42 CBCT data sets of real clinical pre-operative and post-operative representations from 21 patients were used. Eight data sets were used for training the system and the rest for testing. Conclusions: The proposed methodology was successfully applied to CBCT data sets, producing corresponding panoramic images, suitable for examining pre-operatively and post-operatively the patients' maxillary and mandibular regions. PMID:28112548
Wu, Guorong; Kim, Minjeong; Sanroma, Gerard; Wang, Qian; Munsell, Brent C.; Shen, Dinggang
2014-01-01
Multi-atlas patch-based label fusion methods have been successfully used to improve segmentation accuracy in many important medical image analysis applications. In general, to achieve label fusion a single target image is first registered to several atlas images, after registration a label is assigned to each target point in the target image by determining the similarity between the underlying target image patch (centered at the target point) and the aligned image patch in each atlas image. To achieve the highest level of accuracy during the label fusion process it’s critical the chosen patch similarity measurement accurately captures the tissue/shape appearance of the anatomical structure. One major limitation of existing state-of-the-art label fusion methods is that they often apply a fixed size image patch throughout the entire label fusion procedure. Doing so may severely affect the fidelity of the patch similarity measurement, which in turn may not adequately capture complex tissue appearance patterns expressed by the anatomical structure. To address this limitation, we advance state-of-the-art by adding three new label fusion contributions: First, each image patch now characterized by a multi-scale feature representation that encodes both local and semi-local image information. Doing so will increase the accuracy of the patch-based similarity measurement. Second, to limit the possibility of the patch-based similarity measurement being wrongly guided by the presence of multiple anatomical structures in the same image patch, each atlas image patch is further partitioned into a set of label-specific partial image patches according to the existing labels. Since image information has now been semantically divided into different patterns, these new label-specific atlas patches make the label fusion process more specific and flexible. Lastly, in order to correct target points that are mislabeled during label fusion, a hierarchically approach is used to improve the label fusion results. In particular, a coarse-to-fine iterative label fusion approach is used that gradually reduces the patch size. To evaluate the accuracy of our label fusion approach, the proposed method was used to segment the hippocampus in the ADNI dataset and 7.0 tesla MR images, sub-cortical regions in LONI LBPA40 dataset, mid-brain regions in SATA dataset from MICCAI 2013 segmentation challenge, and a set of key internal gray matter structures in IXI dataset. In all experiments, the segmentation results of the proposed hierarchical label fusion method with multi-scale feature representations and label-specific atlas patches are more accurate than several well-known state-of-the-art label fusion methods. PMID:25463474
Research on application of LADAR in ground vehicle recognition
NASA Astrophysics Data System (ADS)
Lan, Jinhui; Shen, Zhuoxun
2009-11-01
For the requirement of many practical applications in the field of military, the research of 3D target recognition is active. The representation that captures the salient attributes of a 3D target independent of the viewing angle will be especially useful to the automatic 3D target recognition system. This paper presents a new approach of image generation based on Laser Detection and Ranging (LADAR) data. Range image of target is obtained by transformation of point cloud. In order to extract features of different ground vehicle targets and to recognize targets, zernike moment properties of typical ground vehicle targets are researched in this paper. A technique of support vector machine is applied to the classification and recognition of target. The new method of image generation and feature representation has been applied to the outdoor experiments. Through outdoor experiments, it can be proven that the method of image generation is stability, the moments are effective to be used as features for recognition, and the LADAR can be applied to the field of 3D target recognition.
Jia, Yuanyuan; Gholipour, Ali; He, Zhongshi; Warfield, Simon K
2017-05-01
In magnetic resonance (MR), hardware limitations, scan time constraints, and patient movement often result in the acquisition of anisotropic 3-D MR images with limited spatial resolution in the out-of-plane views. Our goal is to construct an isotropic high-resolution (HR) 3-D MR image through upsampling and fusion of orthogonal anisotropic input scans. We propose a multiframe super-resolution (SR) reconstruction technique based on sparse representation of MR images. Our proposed algorithm exploits the correspondence between the HR slices and the low-resolution (LR) sections of the orthogonal input scans as well as the self-similarity of each input scan to train pairs of overcomplete dictionaries that are used in a sparse-land local model to upsample the input scans. The upsampled images are then combined using wavelet fusion and error backprojection to reconstruct an image. Features are learned from the data and no extra training set is needed. Qualitative and quantitative analyses were conducted to evaluate the proposed algorithm using simulated and clinical MR scans. Experimental results show that the proposed algorithm achieves promising results in terms of peak signal-to-noise ratio, structural similarity image index, intensity profiles, and visualization of small structures obscured in the LR imaging process due to partial volume effects. Our novel SR algorithm outperforms the nonlocal means (NLM) method using self-similarity, NLM method using self-similarity and image prior, self-training dictionary learning-based SR method, averaging of upsampled scans, and the wavelet fusion method. Our SR algorithm can reduce through-plane partial volume artifact by combining multiple orthogonal MR scans, and thus can potentially improve medical image analysis, research, and clinical diagnosis.
Pixel-based meshfree modelling of skeletal muscles.
Chen, Jiun-Shyan; Basava, Ramya Rao; Zhang, Yantao; Csapo, Robert; Malis, Vadim; Sinha, Usha; Hodgson, John; Sinha, Shantanu
2016-01-01
This paper introduces the meshfree Reproducing Kernel Particle Method (RKPM) for 3D image-based modeling of skeletal muscles. This approach allows for construction of simulation model based on pixel data obtained from medical images. The material properties and muscle fiber direction obtained from Diffusion Tensor Imaging (DTI) are input at each pixel point. The reproducing kernel (RK) approximation allows a representation of material heterogeneity with smooth transition. A multiphase multichannel level set based segmentation framework is adopted for individual muscle segmentation using Magnetic Resonance Images (MRI) and DTI. The application of the proposed methods for modeling the human lower leg is demonstrated.
An Accurate Projector Calibration Method Based on Polynomial Distortion Representation
Liu, Miao; Sun, Changku; Huang, Shujun; Zhang, Zonghua
2015-01-01
In structure light measurement systems or 3D printing systems, the errors caused by optical distortion of a digital projector always affect the precision performance and cannot be ignored. Existing methods to calibrate the projection distortion rely on calibration plate and photogrammetry, so the calibration performance is largely affected by the quality of the plate and the imaging system. This paper proposes a new projector calibration approach that makes use of photodiodes to directly detect the light emitted from a digital projector. By analyzing the output sequence of the photoelectric module, the pixel coordinates can be accurately obtained by the curve fitting method. A polynomial distortion representation is employed to reduce the residuals of the traditional distortion representation model. Experimental results and performance evaluation show that the proposed calibration method is able to avoid most of the disadvantages in traditional methods and achieves a higher accuracy. This proposed method is also practically applicable to evaluate the geometric optical performance of other optical projection system. PMID:26492247
Mathematical models used in segmentation and fractal methods of 2-D ultrasound images
NASA Astrophysics Data System (ADS)
Moldovanu, Simona; Moraru, Luminita; Bibicu, Dorin
2012-11-01
Mathematical models are widely used in biomedical computing. The extracted data from images using the mathematical techniques are the "pillar" achieving scientific progress in experimental, clinical, biomedical, and behavioural researches. This article deals with the representation of 2-D images and highlights the mathematical support for the segmentation operation and fractal analysis in ultrasound images. A large number of mathematical techniques are suitable to be applied during the image processing stage. The addressed topics cover the edge-based segmentation, more precisely the gradient-based edge detection and active contour model, and the region-based segmentation namely Otsu method. Another interesting mathematical approach consists of analyzing the images using the Box Counting Method (BCM) to compute the fractal dimension. The results of the paper provide explicit samples performed by various combination of methods.
A Remote Sensing Image Fusion Method based on adaptive dictionary learning
NASA Astrophysics Data System (ADS)
He, Tongdi; Che, Zongxi
2018-01-01
This paper discusses using a remote sensing fusion method, based on' adaptive sparse representation (ASP)', to provide improved spectral information, reduce data redundancy and decrease system complexity. First, the training sample set is formed by taking random blocks from the images to be fused, the dictionary is then constructed using the training samples, and the remaining terms are clustered to obtain the complete dictionary by iterated processing at each step. Second, the self-adaptive weighted coefficient rule of regional energy is used to select the feature fusion coefficients and complete the reconstruction of the image blocks. Finally, the reconstructed image blocks are rearranged and an average is taken to obtain the final fused images. Experimental results show that the proposed method is superior to other traditional remote sensing image fusion methods in both spectral information preservation and spatial resolution.
NASA Astrophysics Data System (ADS)
Jiang, Junjun; Hu, Ruimin; Han, Zhen; Wang, Zhongyuan; Chen, Jun
2013-10-01
Face superresolution (SR), or face hallucination, refers to the technique of generating a high-resolution (HR) face image from a low-resolution (LR) one with the help of a set of training examples. It aims at transcending the limitations of electronic imaging systems. Applications of face SR include video surveillance, in which the individual of interest is often far from cameras. A two-step method is proposed to infer a high-quality and HR face image from a low-quality and LR observation. First, we establish the nonlinear relationship between LR face images and HR ones, according to radial basis function and partial least squares (RBF-PLS) regression, to transform the LR face into the global face space. Then, a locality-induced sparse representation (LiSR) approach is presented to enhance the local facial details once all the global faces for each LR training face are constructed. A comparison of some state-of-the-art SR methods shows the superiority of the proposed two-step approach, RBF-PLS global face regression followed by LiSR-based local patch reconstruction. Experiments also demonstrate the effectiveness under both simulation conditions and some real conditions.
Hippocampus Segmentation Based on Local Linear Mapping
Pang, Shumao; Jiang, Jun; Lu, Zhentai; Li, Xueli; Yang, Wei; Huang, Meiyan; Zhang, Yu; Feng, Yanqiu; Huang, Wenhua; Feng, Qianjin
2017-01-01
We propose local linear mapping (LLM), a novel fusion framework for distance field (DF) to perform automatic hippocampus segmentation. A k-means cluster method is propose for constructing magnetic resonance (MR) and DF dictionaries. In LLM, we assume that the MR and DF samples are located on two nonlinear manifolds and the mapping from the MR manifold to the DF manifold is differentiable and locally linear. We combine the MR dictionary using local linear representation to present the test sample, and combine the DF dictionary using the corresponding coefficients derived from local linear representation procedure to predict the DF of the test sample. We then merge the overlapped predicted DF patch to obtain the DF value of each point in the test image via a confidence-based weighted average method. This approach enabled us to estimate the label of the test image according to the predicted DF. The proposed method was evaluated on brain images of 35 subjects obtained from SATA dataset. Results indicate the effectiveness of the proposed method, which yields mean Dice similarity coefficients of 0.8697, 0.8770 and 0.8734 for the left, right and bi-lateral hippocampus, respectively. PMID:28368016
Hippocampus Segmentation Based on Local Linear Mapping.
Pang, Shumao; Jiang, Jun; Lu, Zhentai; Li, Xueli; Yang, Wei; Huang, Meiyan; Zhang, Yu; Feng, Yanqiu; Huang, Wenhua; Feng, Qianjin
2017-04-03
We propose local linear mapping (LLM), a novel fusion framework for distance field (DF) to perform automatic hippocampus segmentation. A k-means cluster method is propose for constructing magnetic resonance (MR) and DF dictionaries. In LLM, we assume that the MR and DF samples are located on two nonlinear manifolds and the mapping from the MR manifold to the DF manifold is differentiable and locally linear. We combine the MR dictionary using local linear representation to present the test sample, and combine the DF dictionary using the corresponding coefficients derived from local linear representation procedure to predict the DF of the test sample. We then merge the overlapped predicted DF patch to obtain the DF value of each point in the test image via a confidence-based weighted average method. This approach enabled us to estimate the label of the test image according to the predicted DF. The proposed method was evaluated on brain images of 35 subjects obtained from SATA dataset. Results indicate the effectiveness of the proposed method, which yields mean Dice similarity coefficients of 0.8697, 0.8770 and 0.8734 for the left, right and bi-lateral hippocampus, respectively.
Hippocampus Segmentation Based on Local Linear Mapping
NASA Astrophysics Data System (ADS)
Pang, Shumao; Jiang, Jun; Lu, Zhentai; Li, Xueli; Yang, Wei; Huang, Meiyan; Zhang, Yu; Feng, Yanqiu; Huang, Wenhua; Feng, Qianjin
2017-04-01
We propose local linear mapping (LLM), a novel fusion framework for distance field (DF) to perform automatic hippocampus segmentation. A k-means cluster method is propose for constructing magnetic resonance (MR) and DF dictionaries. In LLM, we assume that the MR and DF samples are located on two nonlinear manifolds and the mapping from the MR manifold to the DF manifold is differentiable and locally linear. We combine the MR dictionary using local linear representation to present the test sample, and combine the DF dictionary using the corresponding coefficients derived from local linear representation procedure to predict the DF of the test sample. We then merge the overlapped predicted DF patch to obtain the DF value of each point in the test image via a confidence-based weighted average method. This approach enabled us to estimate the label of the test image according to the predicted DF. The proposed method was evaluated on brain images of 35 subjects obtained from SATA dataset. Results indicate the effectiveness of the proposed method, which yields mean Dice similarity coefficients of 0.8697, 0.8770 and 0.8734 for the left, right and bi-lateral hippocampus, respectively.
Face recognition via edge-based Gabor feature representation for plastic surgery-altered images
NASA Astrophysics Data System (ADS)
Chude-Olisah, Chollette C.; Sulong, Ghazali; Chude-Okonkwo, Uche A. K.; Hashim, Siti Z. M.
2014-12-01
Plastic surgery procedures on the face introduce skin texture variations between images of the same person (intra-subject), thereby making the task of face recognition more difficult than in normal scenario. Usually, in contemporary face recognition systems, the original gray-level face image is used as input to the Gabor descriptor, which translates to encoding some texture properties of the face image. The texture-encoding process significantly degrades the performance of such systems in the case of plastic surgery due to the presence of surgically induced intra-subject variations. Based on the proposition that the shape of significant facial components such as eyes, nose, eyebrow, and mouth remains unchanged after plastic surgery, this paper employs an edge-based Gabor feature representation approach for the recognition of surgically altered face images. We use the edge information, which is dependent on the shapes of the significant facial components, to address the plastic surgery-induced texture variation problems. To ensure that the significant facial components represent useful edge information with little or no false edges, a simple illumination normalization technique is proposed for preprocessing. Gabor wavelet is applied to the edge image to accentuate on the uniqueness of the significant facial components for discriminating among different subjects. The performance of the proposed method is evaluated on the Georgia Tech (GT) and the Labeled Faces in the Wild (LFW) databases with illumination and expression problems, and the plastic surgery database with texture changes. Results show that the proposed edge-based Gabor feature representation approach is robust against plastic surgery-induced face variations amidst expression and illumination problems and outperforms the existing plastic surgery face recognition methods reported in the literature.
Aircraft geometry verification with enhanced computer generated displays
NASA Technical Reports Server (NTRS)
Cozzolongo, J. V.
1982-01-01
A method for visual verification of aerodynamic geometries using computer generated, color shaded images is described. The mathematical models representing aircraft geometries are created for use in theoretical aerodynamic analyses and in computer aided manufacturing. The aerodynamic shapes are defined using parametric bi-cubic splined patches. This mathematical representation is then used as input to an algorithm that generates a color shaded image of the geometry. A discussion of the techniques used in the mathematical representation of the geometry and in the rendering of the color shaded display is presented. The results include examples of color shaded displays, which are contrasted with wire frame type displays. The examples also show the use of mapped surface pressures in terms of color shaded images of V/STOL fighter/attack aircraft and advanced turboprop aircraft.
Visualizing dispersive features in 2D image via minimum gradient method
DOE Office of Scientific and Technical Information (OSTI.GOV)
He, Yu; Wang, Yan; Shen, Zhi -Xun
Here, we developed a minimum gradient based method to track ridge features in a 2D image plot, which is a typical data representation in many momentum resolved spectroscopy experiments. Through both analytic formulation and numerical simulation, we compare this new method with existing DC (distribution curve) based and higher order derivative based analyses. We find that the new method has good noise resilience and enhanced contrast especially for weak intensity features and meanwhile preserves the quantitative local maxima information from the raw image. An algorithm is proposed to extract 1D ridge dispersion from the 2D image plot, whose quantitative applicationmore » to angle-resolved photoemission spectroscopy measurements on high temperature superconductors is demonstrated.« less
Visualizing dispersive features in 2D image via minimum gradient method
He, Yu; Wang, Yan; Shen, Zhi -Xun
2017-07-24
Here, we developed a minimum gradient based method to track ridge features in a 2D image plot, which is a typical data representation in many momentum resolved spectroscopy experiments. Through both analytic formulation and numerical simulation, we compare this new method with existing DC (distribution curve) based and higher order derivative based analyses. We find that the new method has good noise resilience and enhanced contrast especially for weak intensity features and meanwhile preserves the quantitative local maxima information from the raw image. An algorithm is proposed to extract 1D ridge dispersion from the 2D image plot, whose quantitative applicationmore » to angle-resolved photoemission spectroscopy measurements on high temperature superconductors is demonstrated.« less
Automatic face recognition in HDR imaging
NASA Astrophysics Data System (ADS)
Pereira, Manuela; Moreno, Juan-Carlos; Proença, Hugo; Pinheiro, António M. G.
2014-05-01
The gaining popularity of the new High Dynamic Range (HDR) imaging systems is raising new privacy issues caused by the methods used for visualization. HDR images require tone mapping methods for an appropriate visualization on conventional and non-expensive LDR displays. These visualization methods might result in completely different visualization raising several issues on privacy intrusion. In fact, some visualization methods result in a perceptual recognition of the individuals, while others do not even show any identity. Although perceptual recognition might be possible, a natural question that can rise is how computer based recognition will perform using tone mapping generated images? In this paper, a study where automatic face recognition using sparse representation is tested with images that result from common tone mapping operators applied to HDR images. Its ability for the face identity recognition is described. Furthermore, typical LDR images are used for the face recognition training.
Robust multi-atlas label propagation by deep sparse representation
Zu, Chen; Wang, Zhengxia; Zhang, Daoqiang; Liang, Peipeng; Shi, Yonghong; Shen, Dinggang; Wu, Guorong
2016-01-01
Recently, multi-atlas patch-based label fusion has achieved many successes in medical imaging area. The basic assumption in the current state-of-the-art approaches is that the image patch at the target image point can be represented by a patch dictionary consisting of atlas patches from registered atlas images. Therefore, the label at the target image point can be determined by fusing labels of atlas image patches with similar anatomical structures. However, such assumption on image patch representation does not always hold in label fusion since (1) the image content within the patch may be corrupted due to noise and artifact; and (2) the distribution of morphometric patterns among atlas patches might be unbalanced such that the majority patterns can dominate label fusion result over other minority patterns. The violation of the above basic assumptions could significantly undermine the label fusion accuracy. To overcome these issues, we first consider forming label-specific group for the atlas patches with the same label. Then, we alter the conventional flat and shallow dictionary to a deep multi-layer structure, where the top layer (label-specific dictionaries) consists of groups of representative atlas patches and the subsequent layers (residual dictionaries) hierarchically encode the patchwise residual information in different scales. Thus, the label fusion follows the representation consensus across representative dictionaries. However, the representation of target patch in each group is iteratively optimized by using the representative atlas patches in each label-specific dictionary exclusively to match the principal patterns and also using all residual patterns across groups collaboratively to overcome the issue that some groups might be absent of certain variation patterns presented in the target image patch. Promising segmentation results have been achieved in labeling hippocampus on ADNI dataset, as well as basal ganglia and brainstem structures, compared to other counterpart label fusion methods. PMID:27942077
Robust multi-atlas label propagation by deep sparse representation.
Zu, Chen; Wang, Zhengxia; Zhang, Daoqiang; Liang, Peipeng; Shi, Yonghong; Shen, Dinggang; Wu, Guorong
2017-03-01
Recently, multi-atlas patch-based label fusion has achieved many successes in medical imaging area. The basic assumption in the current state-of-the-art approaches is that the image patch at the target image point can be represented by a patch dictionary consisting of atlas patches from registered atlas images. Therefore, the label at the target image point can be determined by fusing labels of atlas image patches with similar anatomical structures. However, such assumption on image patch representation does not always hold in label fusion since (1) the image content within the patch may be corrupted due to noise and artifact; and (2) the distribution of morphometric patterns among atlas patches might be unbalanced such that the majority patterns can dominate label fusion result over other minority patterns. The violation of the above basic assumptions could significantly undermine the label fusion accuracy. To overcome these issues, we first consider forming label-specific group for the atlas patches with the same label. Then, we alter the conventional flat and shallow dictionary to a deep multi-layer structure, where the top layer ( label-specific dictionaries ) consists of groups of representative atlas patches and the subsequent layers ( residual dictionaries ) hierarchically encode the patchwise residual information in different scales. Thus, the label fusion follows the representation consensus across representative dictionaries. However, the representation of target patch in each group is iteratively optimized by using the representative atlas patches in each label-specific dictionary exclusively to match the principal patterns and also using all residual patterns across groups collaboratively to overcome the issue that some groups might be absent of certain variation patterns presented in the target image patch. Promising segmentation results have been achieved in labeling hippocampus on ADNI dataset, as well as basal ganglia and brainstem structures, compared to other counterpart label fusion methods.
System and method for optical fiber based image acquisition suitable for use in turbine engines
Baleine, Erwan; A V, Varun; Zombo, Paul J.; Varghese, Zubin
2017-05-16
A system and a method for image acquisition suitable for use in a turbine engine are disclosed. Light received from a field of view in an object plane is projected onto an image plane through an optical modulation device and is transferred through an image conduit to a sensor array. The sensor array generates a set of sampled image signals in a sensing basis based on light received from the image conduit. Finally, the sampled image signals are transformed from the sensing basis to a representation basis and a set of estimated image signals are generated therefrom. The estimated image signals are used for reconstructing an image and/or a motion-video of a region of interest within a turbine engine.
Function representation with circle inversion map systems
NASA Astrophysics Data System (ADS)
Boreland, Bryson; Kunze, Herb
2017-01-01
The fractals literature develops the now well-known concept of local iterated function systems (using affine maps) with grey-level maps (LIFSM) as an approach to function representation in terms of the associated fixed point of the so-called fractal transform. While originally explored as a method to achieve signal (and 2-D image) compression, more recent work has explored various aspects of signal and image processing using this machinery. In this paper, we develop a similar framework for function representation using circle inversion map systems. Given a circle C with centre õ and radius r, inversion with respect to C transforms the point p˜ to the point p˜', such that p˜ and p˜' lie on the same radial half-line from õ and d(õ, p˜)d(õ, p˜') = r2, where d is Euclidean distance. We demonstrate the results with an example.
Ma, Xu; Cheng, Yongmei; Hao, Shuai
2016-12-10
Automatic classification of terrain surfaces from an aerial image is essential for an autonomous unmanned aerial vehicle (UAV) landing at an unprepared site by using vision. Diverse terrain surfaces may show similar spectral properties due to the illumination and noise that easily cause poor classification performance. To address this issue, a multi-stage classification algorithm based on low-rank recovery and multi-feature fusion sparse representation is proposed. First, color moments and Gabor texture feature are extracted from training data and stacked as column vectors of a dictionary. Then we perform low-rank matrix recovery for the dictionary by using augmented Lagrange multipliers and construct a multi-stage terrain classifier. Experimental results on an aerial map database that we prepared verify the classification accuracy and robustness of the proposed method.
Jaccard distance based weighted sparse representation for coarse-to-fine plant species recognition.
Zhang, Shanwen; Wu, Xiaowei; You, Zhuhong
2017-01-01
Leaf based plant species recognition plays an important role in ecological protection, however its application to large and modern leaf databases has been a long-standing obstacle due to the computational cost and feasibility. Recognizing such limitations, we propose a Jaccard distance based sparse representation (JDSR) method which adopts a two-stage, coarse to fine strategy for plant species recognition. In the first stage, we use the Jaccard distance between the test sample and each training sample to coarsely determine the candidate classes of the test sample. The second stage includes a Jaccard distance based weighted sparse representation based classification(WSRC), which aims to approximately represent the test sample in the training space, and classify it by the approximation residuals. Since the training model of our JDSR method involves much fewer but more informative representatives, this method is expected to overcome the limitation of high computational and memory costs in traditional sparse representation based classification. Comparative experimental results on a public leaf image database demonstrate that the proposed method outperforms other existing feature extraction and SRC based plant recognition methods in terms of both accuracy and computational speed.
Silverstein, Jonathan C; Dech, Fred; Kouchoukos, Philip L
2004-01-01
Radiological volumes are typically reviewed by surgeons using cross-sections and iso-surface reconstructions. Applications that combine collaborative stereo volume visualization with symbolic anatomic information and data fusions would expand surgeons' capabilities in interpretation of data and in planning treatment. Such an application has not been seen clinically. We are developing methods to systematically combine symbolic anatomy (term hierarchies and iso-surface atlases) with patient data using data fusion. We describe our progress toward integrating these methods into our collaborative virtual reality application. The fully combined application will be a feature-rich stereo collaborative volume visualization environment for use by surgeons in which DICOM datasets will self-report underlying anatomy with visual feedback. Using hierarchical navigation of SNOMED-CT anatomic terms integrated with our existing Tele-immersive DICOM-based volumetric rendering application, we will display polygonal representations of anatomic systems on the fly from menus that query a database. The methods and tools involved in this application development are SNOMED-CT, DICOM, VISIBLE HUMAN, volumetric fusion and C++ on a Tele-immersive platform. This application will allow us to identify structures and display polygonal representations from atlas data overlaid with the volume rendering. First, atlas data is automatically translated, rotated, and scaled to the patient data during loading using a public domain volumetric fusion algorithm. This generates a modified symbolic representation of the underlying canonical anatomy. Then, through the use of collision detection or intersection testing of various transparent polygonal representations, the polygonal structures are highlighted into the volumetric representation while the SNOMED names are displayed. Thus, structural names and polygonal models are associated with the visualized DICOM data. This novel juxtaposition of information promises to expand surgeons' abilities to interpret images and plan treatment.
Langley, Keith; Anderson, Stephen J
2010-08-06
To represent the local orientation and energy of a 1-D image signal, many models of early visual processing employ bandpass quadrature filters, formed by combining the original signal with its Hilbert transform. However, representations capable of estimating an image signal's 2-D phase have been largely ignored. Here, we consider 2-D phase representations using a method based upon the Riesz transform. For spatial images there exist two Riesz transformed signals and one original signal from which orientation, phase and energy may be represented as a vector in 3-D signal space. We show that these image properties may be represented by a Singular Value Decomposition (SVD) of the higher-order derivatives of the original and the Riesz transformed signals. We further show that the expected responses of even and odd symmetric filters from the Riesz transform may be represented by a single signal autocorrelation function, which is beneficial in simplifying Bayesian computations for spatial orientation. Importantly, the Riesz transform allows one to weight linearly across orientation using both symmetric and asymmetric filters to account for some perceptual phase distortions observed in image signals - notably one's perception of edge structure within plaid patterns whose component gratings are either equal or unequal in contrast. Finally, exploiting the benefits that arise from the Riesz definition of local energy as a scalar quantity, we demonstrate the utility of Riesz signal representations in estimating the spatial orientation of second-order image signals. We conclude that the Riesz transform may be employed as a general tool for 2-D visual pattern recognition by its virtue of representing phase, orientation and energy as orthogonal signal quantities.
Image gathering and restoration - Information and visual quality
NASA Technical Reports Server (NTRS)
Mccormick, Judith A.; Alter-Gartenberg, Rachel; Huck, Friedrich O.
1989-01-01
A method is investigated for optimizing the end-to-end performance of image gathering and restoration for visual quality. To achieve this objective, one must inevitably confront the problems that the visual quality of restored images depends on perceptual rather than mathematical considerations and that these considerations vary with the target, the application, and the observer. The method adopted in this paper is to optimize image gathering informationally and to restore images interactively to obtain the visually preferred trade-off among fidelity resolution, sharpness, and clarity. The results demonstrate that this method leads to significant improvements in the visual quality obtained by the traditional digital processing methods. These traditional methods allow a significant loss of visual quality to occur because they treat the design of the image-gathering system and the formulation of the image-restoration algorithm as two separate tasks and fail to account for the transformations between the continuous and the discrete representations in image gathering and reconstruction.
Seeland, Marco; Rzanny, Michael; Alaqraa, Nedal; Wäldchen, Jana; Mäder, Patrick
2017-01-01
Steady improvements of image description methods induced a growing interest in image-based plant species classification, a task vital to the study of biodiversity and ecological sensitivity. Various techniques have been proposed for general object classification over the past years and several of them have already been studied for plant species classification. However, results of these studies are selective in the evaluated steps of a classification pipeline, in the utilized datasets for evaluation, and in the compared baseline methods. No study is available that evaluates the main competing methods for building an image representation on the same datasets allowing for generalized findings regarding flower-based plant species classification. The aim of this paper is to comparatively evaluate methods, method combinations, and their parameters towards classification accuracy. The investigated methods span from detection, extraction, fusion, pooling, to encoding of local features for quantifying shape and color information of flower images. We selected the flower image datasets Oxford Flower 17 and Oxford Flower 102 as well as our own Jena Flower 30 dataset for our experiments. Findings show large differences among the various studied techniques and that their wisely chosen orchestration allows for high accuracies in species classification. We further found that true local feature detectors in combination with advanced encoding methods yield higher classification results at lower computational costs compared to commonly used dense sampling and spatial pooling methods. Color was found to be an indispensable feature for high classification results, especially while preserving spatial correspondence to gray-level features. In result, our study provides a comprehensive overview of competing techniques and the implications of their main parameters for flower-based plant species classification. PMID:28234999
Pasqualotto, Achille; Esenkaya, Tayfun
2016-01-01
Visual-to-auditory sensory substitution is used to convey visual information through audition, and it was initially created to compensate for blindness; it consists of software converting the visual images captured by a video-camera into the equivalent auditory images, or “soundscapes”. Here, it was used by blindfolded sighted participants to learn the spatial position of simple shapes depicted in images arranged on the floor. Very few studies have used sensory substitution to investigate spatial representation, while it has been widely used to investigate object recognition. Additionally, with sensory substitution we could study the performance of participants actively exploring the environment through audition, rather than passively localizing sound sources. Blindfolded participants egocentrically learnt the position of six images by using sensory substitution and then a judgment of relative direction task (JRD) was used to determine how this scene was represented. This task consists of imagining being in a given location, oriented in a given direction, and pointing towards the required image. Before performing the JRD task, participants explored a map that provided allocentric information about the scene. Although spatial exploration was egocentric, surprisingly we found that performance in the JRD task was better for allocentric perspectives. This suggests that the egocentric representation of the scene was updated. This result is in line with previous studies using visual and somatosensory scenes, thus supporting the notion that different sensory modalities produce equivalent spatial representation(s). Moreover, our results have practical implications to improve training methods with sensory substitution devices (SSD). PMID:27148000
Interrogation of an object for dimensional and topographical information
McMakin, Douglas L.; Severtsen, Ronald H.; Hall, Thomas E.; Sheen, David M.; Kennedy, Mike O.
2004-03-09
Disclosed are systems, methods, devices, and apparatus to interrogate a clothed individual with electromagnetic radiation to determine one or more body measurements at least partially covered by the individual's clothing. The invention further includes techniques to interrogate an object with electromagnetic radiation in the millimeter and/or microwave range to provide a volumetric representation of the object. This representation can be used to display images and/or determine dimensional information concerning the object.
Interrogation of an object for dimensional and topographical information
McMakin, Doug L [Richland, WA; Severtsen, Ronald H [Richland, WA; Hall, Thomas E [Richland, WA; Sheen, David M [Richland, WA
2003-01-14
Disclosed are systems, methods, devices, and apparatus to interrogate a clothed individual with electromagnetic radiation to determine one or more body measurements at least partially covered by the individual's clothing. The invention further includes techniques to interrogate an object with electromagnetic radiation in the millimeter and/or microwave range to provide a volumetric representation of the object. This representation can be used to display images and/or determine dimensional information concerning the object.
Sparse Representations for Limited Data Tomography (PREPRINT)
2007-11-01
predefined (such as wavelets ) or learned (e.g., by the K-SVD algorithm [8]), as in this work. Due to its highly effectiveness for tasks such as image...from den- tal data produced by the Focus intraoral X-ray source and the Sigma intraoral sensor (Instrumentarium Dental ; courtesy of Maaria Rantala...proposed method (right column). a functional, encouraging a sparse representation of the im- age patches while keeping the data constraints provided by
Mishra, Pankaj; Li, Ruijiang; Mak, Raymond H.; Rottmann, Joerg; Bryant, Jonathan H.; Williams, Christopher L.; Berbeco, Ross I.; Lewis, John H.
2014-01-01
Purpose: In this work the authors develop and investigate the feasibility of a method to estimate time-varying volumetric images from individual MV cine electronic portal image device (EPID) images. Methods: The authors adopt a two-step approach to time-varying volumetric image estimation from a single cine EPID image. In the first step, a patient-specific motion model is constructed from 4DCT. In the second step, parameters in the motion model are tuned according to the information in the EPID image. The patient-specific motion model is based on a compact representation of lung motion represented in displacement vector fields (DVFs). DVFs are calculated through deformable image registration (DIR) of a reference 4DCT phase image (typically peak-exhale) to a set of 4DCT images corresponding to different phases of a breathing cycle. The salient characteristics in the DVFs are captured in a compact representation through principal component analysis (PCA). PCA decouples the spatial and temporal components of the DVFs. Spatial information is represented in eigenvectors and the temporal information is represented by eigen-coefficients. To generate a new volumetric image, the eigen-coefficients are updated via cost function optimization based on digitally reconstructed radiographs and projection images. The updated eigen-coefficients are then multiplied with the eigenvectors to obtain updated DVFs that, in turn, give the volumetric image corresponding to the cine EPID image. Results: The algorithm was tested on (1) Eight digital eXtended CArdiac-Torso phantom datasets based on different irregular patient breathing patterns and (2) patient cine EPID images acquired during SBRT treatments. The root-mean-squared tumor localization error is (0.73 ± 0.63 mm) for the XCAT data and (0.90 ± 0.65 mm) for the patient data. Conclusions: The authors introduced a novel method of estimating volumetric time-varying images from single cine EPID images and a PCA-based lung motion model. This is the first method to estimate volumetric time-varying images from single MV cine EPID images, and has the potential to provide volumetric information with no additional imaging dose to the patient. PMID:25086523
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mishra, Pankaj, E-mail: pankaj.mishra@varian.com; Mak, Raymond H.; Rottmann, Joerg
2014-08-15
Purpose: In this work the authors develop and investigate the feasibility of a method to estimate time-varying volumetric images from individual MV cine electronic portal image device (EPID) images. Methods: The authors adopt a two-step approach to time-varying volumetric image estimation from a single cine EPID image. In the first step, a patient-specific motion model is constructed from 4DCT. In the second step, parameters in the motion model are tuned according to the information in the EPID image. The patient-specific motion model is based on a compact representation of lung motion represented in displacement vector fields (DVFs). DVFs are calculatedmore » through deformable image registration (DIR) of a reference 4DCT phase image (typically peak-exhale) to a set of 4DCT images corresponding to different phases of a breathing cycle. The salient characteristics in the DVFs are captured in a compact representation through principal component analysis (PCA). PCA decouples the spatial and temporal components of the DVFs. Spatial information is represented in eigenvectors and the temporal information is represented by eigen-coefficients. To generate a new volumetric image, the eigen-coefficients are updated via cost function optimization based on digitally reconstructed radiographs and projection images. The updated eigen-coefficients are then multiplied with the eigenvectors to obtain updated DVFs that, in turn, give the volumetric image corresponding to the cine EPID image. Results: The algorithm was tested on (1) Eight digital eXtended CArdiac-Torso phantom datasets based on different irregular patient breathing patterns and (2) patient cine EPID images acquired during SBRT treatments. The root-mean-squared tumor localization error is (0.73 ± 0.63 mm) for the XCAT data and (0.90 ± 0.65 mm) for the patient data. Conclusions: The authors introduced a novel method of estimating volumetric time-varying images from single cine EPID images and a PCA-based lung motion model. This is the first method to estimate volumetric time-varying images from single MV cine EPID images, and has the potential to provide volumetric information with no additional imaging dose to the patient.« less
Hyperspectral image denoising and anomaly detection based on low-rank and sparse representations
NASA Astrophysics Data System (ADS)
Zhuang, Lina; Gao, Lianru; Zhang, Bing; Bioucas-Dias, José M.
2017-10-01
The very high spectral resolution of Hyperspectral Images (HSIs) enables the identification of materials with subtle differences and the extraction subpixel information. However, the increasing of spectral resolution often implies an increasing in the noise linked with the image formation process. This degradation mechanism limits the quality of extracted information and its potential applications. Since HSIs represent natural scenes and their spectral channels are highly correlated, they are characterized by a high level of self-similarity and are well approximated by low-rank representations. These characteristic underlies the state-of-the-art in HSI denoising. However, in presence of rare pixels, the denoising performance of those methods is not optimal and, in addition, it may compromise the future detection of those pixels. To address these hurdles, we introduce RhyDe (Robust hyperspectral Denoising), a powerful HSI denoiser, which implements explicit low-rank representation, promotes self-similarity, and, by using a form of collaborative sparsity, preserves rare pixels. The denoising and detection effectiveness of the proposed robust HSI denoiser is illustrated using semi-real data.
Multiphase flow predictions from carbonate pore space images using extracted network models
NASA Astrophysics Data System (ADS)
Al-Kharusi, Anwar S.; Blunt, Martin J.
2008-06-01
A methodology to extract networks from pore space images is used to make predictions of multiphase transport properties for subsurface carbonate samples. The extraction of the network model is based on the computation of the location and sizes of pores and throats to create a topological representation of the void space of three-dimensional (3-D) rock images, using the concept of maximal balls. In this work, we follow a multistaged workflow. We start with a 2-D thin-section image; convert it statistically into a 3-D representation of the pore space; extract a network model from this image; and finally, simulate primary drainage, waterflooding, and secondary drainage flow processes using a pore-scale simulator. We test this workflow for a reservoir carbonate rock. The network-predicted absolute permeability is similar to the core plug measured value and the value computed on the 3-D void space image using the lattice Boltzmann method. The predicted capillary pressure during primary drainage agrees well with a mercury-air experiment on a core sample, indicating that we have an adequate representation of the rock's pore structure. We adjust the contact angles in the network to match the measured waterflood and secondary drainage capillary pressures. We infer a significant degree of contact angle hysteresis. We then predict relative permeabilities for primary drainage, waterflooding, and secondary drainage that agree well with laboratory measured values. This approach can be used to predict multiphase transport properties when wettability and pore structure vary in a reservoir, where experimental data is scant or missing. There are shortfalls to this approach, however. We compare results from three networks, one of which was derived from a section of the rock containing vugs. Our method fails to predict properties reliably when an unrepresentative image is processed to construct the 3-D network model. This occurs when the image volume is not sufficient to represent the geological variations observed in a core plug sample.
Descriptive and Computer Aided Drawing Perspective on an Unfolded Polyhedral Projection Surface
NASA Astrophysics Data System (ADS)
Dzwierzynska, Jolanta
2017-10-01
The aim of the herby study is to develop a method of direct and practical mapping of perspective on an unfolded prism polyhedral projection surface. The considered perspective representation is a rectilinear central projection onto a surface composed of several flat elements. In the paper two descriptive methods of drawing perspective are presented: direct and indirect. The graphical mapping of the effects of the representation is realized directly on the unfolded flat projection surface. That is due to the projective and graphical connection between points displayed on the polyhedral background and their counterparts received on the unfolded flat surface. For a significant improvement of the construction of line, analytical algorithms are formulated. They draw a perspective image of a segment of line passing through two different points determined by their coordinates in a spatial coordinate system of axis x, y, z. Compared to other perspective construction methods that use information about points, for computer vision and the computer aided design, our algorithms utilize data about lines, which are applied very often in architectural forms. Possibility of drawing lines in the considered perspective enables drawing an edge perspective image of an architectural object. The application of the changeable base elements of perspective as a horizon height and a station point location enable drawing perspective image from different viewing positions. The analytical algorithms for drawing perspective images are formulated in Mathcad software, however, they can be implemented in the majority of computer graphical packages, which can make drawing perspective more efficient and easier. The representation presented in the paper and the way of its direct mapping on the flat unfolded projection surface can find application in presentation of architectural space in advertisement and art.
Learning feature representations with a cost-relevant sparse autoencoder.
Längkvist, Martin; Loutfi, Amy
2015-02-01
There is an increasing interest in the machine learning community to automatically learn feature representations directly from the (unlabeled) data instead of using hand-designed features. The autoencoder is one method that can be used for this purpose. However, for data sets with a high degree of noise, a large amount of the representational capacity in the autoencoder is used to minimize the reconstruction error for these noisy inputs. This paper proposes a method that improves the feature learning process by focusing on the task relevant information in the data. This selective attention is achieved by weighting the reconstruction error and reducing the influence of noisy inputs during the learning process. The proposed model is trained on a number of publicly available image data sets and the test error rate is compared to a standard sparse autoencoder and other methods, such as the denoising autoencoder and contractive autoencoder.
Multiscale skeletal representation of images via Voronoi diagrams
NASA Astrophysics Data System (ADS)
Marston, R. E.; Shih, Jian C.
1995-08-01
Polygonal approximations to skeletal or stroke-based representations of 2D objects may consume less storage and be sufficient to describe their shape for many applications. Multi- scale descriptions of object outlines are well established but corresponding methods for skeletal descriptions have been slower to develop. In this paper we offer a method of generating scale-based skeletal representation via the Voronoi diagram. The method has the advantages of less time complexity, a closer relationship between the skeletons at each scale and better control over simplification of the skeleton at lower scales. This is because the algorithm starts by generating the skeleton at the coarsest scale first, then it produces each finer scale, in an iterative manner, directly from the level below. The skeletal approximations produced by the algorithm also benefit from a strong relationship with the object outline, due to the structure of the Voronoi diagram.
Automatic face naming by learning discriminative affinity matrices from weakly labeled images.
Xiao, Shijie; Xu, Dong; Wu, Jianxin
2015-10-01
Given a collection of images, where each image contains several faces and is associated with a few names in the corresponding caption, the goal of face naming is to infer the correct name for each face. In this paper, we propose two new methods to effectively solve this problem by learning two discriminative affinity matrices from these weakly labeled images. We first propose a new method called regularized low-rank representation by effectively utilizing weakly supervised information to learn a low-rank reconstruction coefficient matrix while exploring multiple subspace structures of the data. Specifically, by introducing a specially designed regularizer to the low-rank representation method, we penalize the corresponding reconstruction coefficients related to the situations where a face is reconstructed by using face images from other subjects or by using itself. With the inferred reconstruction coefficient matrix, a discriminative affinity matrix can be obtained. Moreover, we also develop a new distance metric learning method called ambiguously supervised structural metric learning by using weakly supervised information to seek a discriminative distance metric. Hence, another discriminative affinity matrix can be obtained using the similarity matrix (i.e., the kernel matrix) based on the Mahalanobis distances of the data. Observing that these two affinity matrices contain complementary information, we further combine them to obtain a fused affinity matrix, based on which we develop a new iterative scheme to infer the name of each face. Comprehensive experiments demonstrate the effectiveness of our approach.
Robust kernel representation with statistical local features for face recognition.
Yang, Meng; Zhang, Lei; Shiu, Simon Chi-Keung; Zhang, David
2013-06-01
Factors such as misalignment, pose variation, and occlusion make robust face recognition a difficult problem. It is known that statistical features such as local binary pattern are effective for local feature extraction, whereas the recently proposed sparse or collaborative representation-based classification has shown interesting results in robust face recognition. In this paper, we propose a novel robust kernel representation model with statistical local features (SLF) for robust face recognition. Initially, multipartition max pooling is used to enhance the invariance of SLF to image registration error. Then, a kernel-based representation model is proposed to fully exploit the discrimination information embedded in the SLF, and robust regression is adopted to effectively handle the occlusion in face images. Extensive experiments are conducted on benchmark face databases, including extended Yale B, AR (A. Martinez and R. Benavente), multiple pose, illumination, and expression (multi-PIE), facial recognition technology (FERET), face recognition grand challenge (FRGC), and labeled faces in the wild (LFW), which have different variations of lighting, expression, pose, and occlusions, demonstrating the promising performance of the proposed method.
Sharpening of Hierarchical Visual Feature Representations of Blurred Images.
Abdelhack, Mohamed; Kamitani, Yukiyasu
2018-01-01
The robustness of the visual system lies in its ability to perceive degraded images. This is achieved through interacting bottom-up, recurrent, and top-down pathways that process the visual input in concordance with stored prior information. The interaction mechanism by which they integrate visual input and prior information is still enigmatic. We present a new approach using deep neural network (DNN) representation to reveal the effects of such integration on degraded visual inputs. We transformed measured human brain activity resulting from viewing blurred images to the hierarchical representation space derived from a feedforward DNN. Transformed representations were found to veer toward the original nonblurred image and away from the blurred stimulus image. This indicated deblurring or sharpening in the neural representation, and possibly in our perception. We anticipate these results will help unravel the interplay mechanism between bottom-up, recurrent, and top-down pathways, leading to more comprehensive models of vision.
Robust Multimodal Dictionary Learning
Cao, Tian; Jojic, Vladimir; Modla, Shannon; Powell, Debbie; Czymmek, Kirk; Niethammer, Marc
2014-01-01
We propose a robust multimodal dictionary learning method for multimodal images. Joint dictionary learning for both modalities may be impaired by lack of correspondence between image modalities in training data, for example due to areas of low quality in one of the modalities. Dictionaries learned with such non-corresponding data will induce uncertainty about image representation. In this paper, we propose a probabilistic model that accounts for image areas that are poorly corresponding between the image modalities. We cast the problem of learning a dictionary in presence of problematic image patches as a likelihood maximization problem and solve it with a variant of the EM algorithm. Our algorithm iterates identification of poorly corresponding patches and re-finements of the dictionary. We tested our method on synthetic and real data. We show improvements in image prediction quality and alignment accuracy when using the method for multimodal image registration. PMID:24505674
Dictionary Approaches to Image Compression and Reconstruction
NASA Technical Reports Server (NTRS)
Ziyad, Nigel A.; Gilmore, Erwin T.; Chouikha, Mohamed F.
1998-01-01
This paper proposes using a collection of parameterized waveforms, known as a dictionary, for the purpose of medical image compression. These waveforms, denoted as phi(sub gamma), are discrete time signals, where gamma represents the dictionary index. A dictionary with a collection of these waveforms is typically complete or overcomplete. Given such a dictionary, the goal is to obtain a representation image based on the dictionary. We examine the effectiveness of applying Basis Pursuit (BP), Best Orthogonal Basis (BOB), Matching Pursuits (MP), and the Method of Frames (MOF) methods for the compression of digitized radiological images with a wavelet-packet dictionary. The performance of these algorithms is studied for medical images with and without additive noise.
Dictionary Approaches to Image Compression and Reconstruction
NASA Technical Reports Server (NTRS)
Ziyad, Nigel A.; Gilmore, Erwin T.; Chouikha, Mohamed F.
1998-01-01
This paper proposes using a collection of parameterized waveforms, known as a dictionary, for the purpose of medical image compression. These waveforms, denoted as lambda, are discrete time signals, where y represents the dictionary index. A dictionary with a collection of these waveforms Is typically complete or over complete. Given such a dictionary, the goal is to obtain a representation Image based on the dictionary. We examine the effectiveness of applying Basis Pursuit (BP), Best Orthogonal Basis (BOB), Matching Pursuits (MP), and the Method of Frames (MOF) methods for the compression of digitized radiological images with a wavelet-packet dictionary. The performance of these algorithms is studied for medical images with and without additive noise.
Multiplicative noise removal via a learned dictionary.
Huang, Yu-Mei; Moisan, Lionel; Ng, Michael K; Zeng, Tieyong
2012-11-01
Multiplicative noise removal is a challenging image processing problem, and most existing methods are based on the maximum a posteriori formulation and the logarithmic transformation of multiplicative denoising problems into additive denoising problems. Sparse representations of images have shown to be efficient approaches for image recovery. Following this idea, in this paper, we propose to learn a dictionary from the logarithmic transformed image, and then to use it in a variational model built for noise removal. Extensive experimental results suggest that in terms of visual quality, peak signal-to-noise ratio, and mean absolute deviation error, the proposed algorithm outperforms state-of-the-art methods.
Super-resolution using a light inception layer in convolutional neural network
NASA Astrophysics Data System (ADS)
Mou, Qinyang; Guo, Jun
2018-04-01
Recently, several models based on CNN architecture have achieved great result on Single Image Super-Resolution (SISR) problem. In this paper, we propose an image super-resolution method (SR) using a light inception layer in convolutional network (LICN). Due to the strong representation ability of our well-designed inception layer that can learn richer representation with less parameters, we can build our model with shallow architecture that can reduce the effect of vanishing gradients problem and save computational costs. Our model strike a balance between computational speed and the quality of the result. Compared with state-of-the-art result, we produce comparable or better results with faster computational speed.
Describing a Robot's Workspace Using a Sequence of Views from a Moving Camera.
Hong, T H; Shneier, M O
1985-06-01
This correspondence describes a method of building and maintaining a spatial respresentation for the workspace of a robot, using a sensor that moves about in the world. From the known camera position at which an image is obtained, and two-dimensional silhouettes of the image, a series of cones is projected to describe the possible positions of the objects in the space. When an object is seen from several viewpoints, the intersections of the cones constrain the position and size of the object. After several views have been processed, the representation of the object begins to resemble its true shape. At all times, the spatial representation contains the best guess at the true situation in the world with uncertainties in position and shape explicitly represented. An octree is used as the data structure for the representation. It not only provides a relatively compact representation, but also allows fast access to information and enables large parts of the workspace to be ignored. The purpose of constructing this representation is not so much to recognize objects as to describe the volumes in the workspace that are occupied and those that are empty. This enables trajectory planning to be carried out, and also provides a means of spatially indexing objects without needing to represent the objects at an extremely fine resolution. The spatial representation is one part of a more complex representation of the workspace used by the sensory system of a robot manipulator in understanding its environment.
Volumetric calibration of a plenoptic camera.
Hall, Elise Munz; Fahringer, Timothy W; Guildenbecher, Daniel R; Thurow, Brian S
2018-02-01
The volumetric calibration of a plenoptic camera is explored to correct for inaccuracies due to real-world lens distortions and thin-lens assumptions in current processing methods. Two methods of volumetric calibration based on a polynomial mapping function that does not require knowledge of specific lens parameters are presented and compared to a calibration based on thin-lens assumptions. The first method, volumetric dewarping, is executed by creation of a volumetric representation of a scene using the thin-lens assumptions, which is then corrected in post-processing using a polynomial mapping function. The second method, direct light-field calibration, uses the polynomial mapping in creation of the initial volumetric representation to relate locations in object space directly to image sensor locations. The accuracy and feasibility of these methods is examined experimentally by capturing images of a known dot card at a variety of depths. Results suggest that use of a 3D polynomial mapping function provides a significant increase in reconstruction accuracy and that the achievable accuracy is similar using either polynomial-mapping-based method. Additionally, direct light-field calibration provides significant computational benefits by eliminating some intermediate processing steps found in other methods. Finally, the flexibility of this method is shown for a nonplanar calibration.
Optimal color coding for compression of true color images
NASA Astrophysics Data System (ADS)
Musatenko, Yurij S.; Kurashov, Vitalij N.
1998-11-01
In the paper we present the method that improves lossy compression of the true color or other multispectral images. The essence of the method is to project initial color planes into Karhunen-Loeve (KL) basis that gives completely decorrelated representation for the image and to compress basis functions instead of the planes. To do that the new fast algorithm of true KL basis construction with low memory consumption is suggested and our recently proposed scheme for finding optimal losses of Kl functions while compression is used. Compare to standard JPEG compression of the CMYK images the method provides the PSNR gain from 0.2 to 2 dB for the convenient compression ratios. Experimental results are obtained for high resolution CMYK images. It is demonstrated that presented scheme could work on common hardware.
Multiview alignment hashing for efficient image search.
Liu, Li; Yu, Mengyang; Shao, Ling
2015-03-01
Hashing is a popular and efficient method for nearest neighbor search in large-scale data spaces by embedding high-dimensional feature descriptors into a similarity preserving Hamming space with a low dimension. For most hashing methods, the performance of retrieval heavily depends on the choice of the high-dimensional feature descriptor. Furthermore, a single type of feature cannot be descriptive enough for different images when it is used for hashing. Thus, how to combine multiple representations for learning effective hashing functions is an imminent task. In this paper, we present a novel unsupervised multiview alignment hashing approach based on regularized kernel nonnegative matrix factorization, which can find a compact representation uncovering the hidden semantics and simultaneously respecting the joint probability distribution of data. In particular, we aim to seek a matrix factorization to effectively fuse the multiple information sources meanwhile discarding the feature redundancy. Since the raised problem is regarded as nonconvex and discrete, our objective function is then optimized via an alternate way with relaxation and converges to a locally optimal solution. After finding the low-dimensional representation, the hashing functions are finally obtained through multivariable logistic regression. The proposed method is systematically evaluated on three data sets: 1) Caltech-256; 2) CIFAR-10; and 3) CIFAR-20, and the results show that our method significantly outperforms the state-of-the-art multiview hashing techniques.
Multi-scale Gaussian representation and outline-learning based cell image segmentation
2013-01-01
Background High-throughput genome-wide screening to study gene-specific functions, e.g. for drug discovery, demands fast automated image analysis methods to assist in unraveling the full potential of such studies. Image segmentation is typically at the forefront of such analysis as the performance of the subsequent steps, for example, cell classification, cell tracking etc., often relies on the results of segmentation. Methods We present a cell cytoplasm segmentation framework which first separates cell cytoplasm from image background using novel approach of image enhancement and coefficient of variation of multi-scale Gaussian scale-space representation. A novel outline-learning based classification method is developed using regularized logistic regression with embedded feature selection which classifies image pixels as outline/non-outline to give cytoplasm outlines. Refinement of the detected outlines to separate cells from each other is performed in a post-processing step where the nuclei segmentation is used as contextual information. Results and conclusions We evaluate the proposed segmentation methodology using two challenging test cases, presenting images with completely different characteristics, with cells of varying size, shape, texture and degrees of overlap. The feature selection and classification framework for outline detection produces very simple sparse models which use only a small subset of the large, generic feature set, that is, only 7 and 5 features for the two cases. Quantitative comparison of the results for the two test cases against state-of-the-art methods show that our methodology outperforms them with an increase of 4-9% in segmentation accuracy with maximum accuracy of 93%. Finally, the results obtained for diverse datasets demonstrate that our framework not only produces accurate segmentation but also generalizes well to different segmentation tasks. PMID:24267488
A new image representation for compact and secure communication
DOE Office of Scientific and Technical Information (OSTI.GOV)
Prasad, Lakshman; Skourikhine, A. N.
In many areas of nuclear materials management there is a need for communication, archival, and retrieval of annotated image data between heterogeneous platforms and devices to effectively implement safety, security, and safeguards of nuclear materials. Current image formats such as JPEG are not ideally suited in such scenarios as they are not scalable to different viewing formats, and do not provide a high-level representation of images that facilitate automatic object/change detection or annotation. The new Scalable Vector Graphics (SVG) open standard for representing graphical information, recommended by the World Wide Web Consortium (W3C) is designed to address issues of imagemore » scalability, portability, and annotation. However, until now there has been no viable technology to efficiently field images of high visual quality under this standard. Recently, LANL has developed a vectorized image representation that is compatible with the SVG standard and preserves visual quality. This is based on a new geometric framework for characterizing complex features in real-world imagery that incorporates perceptual principles of processing visual information known from cognitive psychology and vision science, to obtain a polygonal image representation of high fidelity. This representation can take advantage of all textual compression and encryption routines unavailable to other image formats. Moreover, this vectorized image representation can be exploited to facilitate automated object recognition that can reduce time required for data review. The objects/features of interest in these vectorized images can be annotated via animated graphics to facilitate quick and easy display and comprehension of processed image content.« less
Chen, Yang; Budde, Adam; Li, Ke; Li, Yinsheng; Hsieh, Jiang; Chen, Guang-Hong
2017-01-01
When the scan field of view (SFOV) of a CT system is not large enough to enclose the entire cross-section of the patient, or the patient needs to be positioned partially outside the SFOV for certain clinical applications, truncation artifacts often appear in the reconstructed CT images. Many truncation artifact correction methods perform extrapolations of the truncated projection data based on certain a priori assumptions. The purpose of this work was to develop a novel CT truncation artifact reduction method that directly operates on DICOM images. The blooming of pixel values associated with truncation was modeled using exponential decay functions, and based on this model, a discriminative dictionary was constructed to represent truncation artifacts and nonartifact image information in a mutually exclusive way. The discriminative dictionary consists of a truncation artifact subdictionary and a nonartifact subdictionary. The truncation artifact subdictionary contains 1000 atoms with different decay parameters, while the nonartifact subdictionary contains 1000 independent realizations of Gaussian white noise that are exclusive with the artifact features. By sparsely representing an artifact-contaminated CT image with this discriminative dictionary, the image was separated into a truncation artifact-dominated image and a complementary image with reduced truncation artifacts. The artifact-dominated image was then subtracted from the original image with an appropriate weighting coefficient to generate the final image with reduced artifacts. This proposed method was validated via physical phantom studies and retrospective human subject studies. Quantitative image evaluation metrics including the relative root-mean-square error (rRMSE) and the universal image quality index (UQI) were used to quantify the performance of the algorithm. For both phantom and human subject studies, truncation artifacts at the peripheral region of the SFOV were effectively reduced, revealing soft tissue and bony structure once buried in the truncation artifacts. For the phantom study, the proposed method reduced the relative RMSE from 15% (original images) to 11%, and improved the UQI from 0.34 to 0.80. A discriminative dictionary representation method was developed to mitigate CT truncation artifacts directly in the DICOM image domain. Both phantom and human subject studies demonstrated that the proposed method can effectively reduce truncation artifacts without access to projection data. © 2016 American Association of Physicists in Medicine.
Logo image clustering based on advanced statistics
NASA Astrophysics Data System (ADS)
Wei, Yi; Kamel, Mohamed; He, Yiwei
2007-11-01
In recent years, there has been a growing interest in the research of image content description techniques. Among those, image clustering is one of the most frequently discussed topics. Similar to image recognition, image clustering is also a high-level representation technique. However it focuses on the coarse categorization rather than the accurate recognition. Based on wavelet transform (WT) and advanced statistics, the authors propose a novel approach that divides various shaped logo images into groups according to the external boundary of each logo image. Experimental results show that the presented method is accurate, fast and insensitive to defects.
An image understanding system using attributed symbolic representation and inexact graph-matching
NASA Astrophysics Data System (ADS)
Eshera, M. A.; Fu, K.-S.
1986-09-01
A powerful image understanding system using a semantic-syntactic representation scheme consisting of attributed relational graphs (ARGs) is proposed for the analysis of the global information content of images. A multilayer graph transducer scheme performs the extraction of ARG representations from images, with ARG nodes representing the global image features, and the relations between features represented by the attributed branches between corresponding nodes. An efficient dynamic programming technique is employed to derive the distance between two ARGs and the inexact matching of their respective components. Noise, distortion and ambiguity in real-world images are handled through modeling in the transducer mapping rules and through the appropriate cost of error-transformation for the inexact matching of the representation. The system is demonstrated for the case of locating objects in a scene composed of complex overlapped objects, and the case of target detection in noisy and distorted synthetic aperture radar image.
Face recognition by applying wavelet subband representation and kernel associative memory.
Zhang, Bai-Ling; Zhang, Haihong; Ge, Shuzhi Sam
2004-01-01
In this paper, we propose an efficient face recognition scheme which has two features: 1) representation of face images by two-dimensional (2-D) wavelet subband coefficients and 2) recognition by a modular, personalised classification method based on kernel associative memory models. Compared to PCA projections and low resolution "thumb-nail" image representations, wavelet subband coefficients can efficiently capture substantial facial features while keeping computational complexity low. As there are usually very limited samples, we constructed an associative memory (AM) model for each person and proposed to improve the performance of AM models by kernel methods. Specifically, we first applied kernel transforms to each possible training pair of faces sample and then mapped the high-dimensional feature space back to input space. Our scheme using modular autoassociative memory for face recognition is inspired by the same motivation as using autoencoders for optical character recognition (OCR), for which the advantages has been proven. By associative memory, all the prototypical faces of one particular person are used to reconstruct themselves and the reconstruction error for a probe face image is used to decide if the probe face is from the corresponding person. We carried out extensive experiments on three standard face recognition datasets, the FERET data, the XM2VTS data, and the ORL data. Detailed comparisons with earlier published results are provided and our proposed scheme offers better recognition accuracy on all of the face datasets.
The Statistics of Visual Representation
NASA Technical Reports Server (NTRS)
Jobson, Daniel J.; Rahman, Zia-Ur; Woodell, Glenn A.
2002-01-01
The experience of retinex image processing has prompted us to reconsider fundamental aspects of imaging and image processing. Foremost is the idea that a good visual representation requires a non-linear transformation of the recorded (approximately linear) image data. Further, this transformation appears to converge on a specific distribution. Here we investigate the connection between numerical and visual phenomena. Specifically the questions explored are: (1) Is there a well-defined consistent statistical character associated with good visual representations? (2) Does there exist an ideal visual image? And (3) what are its statistical properties?
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tripathi, Ashish; McNulty, Ian; Munson, Todd
We propose a new approach to robustly retrieve the exit wave of an extended sample from its coherent diffraction pattern by exploiting sparsity of the sample's edges. This approach enables imaging of an extended sample with a single view, without ptychography. We introduce nonlinear optimization methods that promote sparsity, and we derive update rules to robustly recover the sample's exit wave. We test these methods on simulated samples by varying the sparsity of the edge-detected representation of the exit wave. Finally, our tests illustrate the strengths and limitations of the proposed method in imaging extended samples.
Classification of Clouds in Satellite Imagery Using Adaptive Fuzzy Sparse Representation.
Jin, Wei; Gong, Fei; Zeng, Xingbin; Fu, Randi
2016-12-16
Automatic cloud detection and classification using satellite cloud imagery have various meteorological applications such as weather forecasting and climate monitoring. Cloud pattern analysis is one of the research hotspots recently. Since satellites sense the clouds remotely from space, and different cloud types often overlap and convert into each other, there must be some fuzziness and uncertainty in satellite cloud imagery. Satellite observation is susceptible to noises, while traditional cloud classification methods are sensitive to noises and outliers; it is hard for traditional cloud classification methods to achieve reliable results. To deal with these problems, a satellite cloud classification method using adaptive fuzzy sparse representation-based classification (AFSRC) is proposed. Firstly, by defining adaptive parameters related to attenuation rate and critical membership, an improved fuzzy membership is introduced to accommodate the fuzziness and uncertainty of satellite cloud imagery; secondly, by effective combination of the improved fuzzy membership function and sparse representation-based classification (SRC), atoms in training dictionary are optimized; finally, an adaptive fuzzy sparse representation classifier for cloud classification is proposed. Experiment results on FY-2G satellite cloud image show that, the proposed method not only improves the accuracy of cloud classification, but also has strong stability and adaptability with high computational efficiency.
Contrast enhancing solution for use in confocal microscopy
Tannous, Zeina; Torres, Abel; Gonzalez, Salvador
2006-10-31
A method of optically detecting a tumor during surgery. The method includes imaging at least one test point defined on the tumor using a first optical imaging system to provide a first tumor image. The method further includes excising a first predetermined layer of the tumor for forming an in-vivo defect area. A predetermined contrast enhancing solution is disposed on the in-vivo defect area, which is adapted to interact with at least one cell anomaly, such as basal cell carcinoma, located on the in-vivo defect area for optically enhancing the cell anomaly. Thereafter the defect area can be optically imaged to provide a clear and bright representation of the cell anomaly to aid a surgeon while surgically removing the cell anomaly.
Al-Shaikhli, Saif Dawood Salman; Yang, Michael Ying; Rosenhahn, Bodo
2016-12-01
This paper presents a novel method for Alzheimer's disease classification via an automatic 3D caudate nucleus segmentation. The proposed method consists of segmentation and classification steps. In the segmentation step, we propose a novel level set cost function. The proposed cost function is constrained by a sparse representation of local image features using a dictionary learning method. We present coupled dictionaries: a feature dictionary of a grayscale brain image and a label dictionary of a caudate nucleus label image. Using online dictionary learning, the coupled dictionaries are learned from the training data. The learned coupled dictionaries are embedded into a level set function. In the classification step, a region-based feature dictionary is built. The region-based feature dictionary is learned from shape features of the caudate nucleus in the training data. The classification is based on the measure of the similarity between the sparse representation of region-based shape features of the segmented caudate in the test image and the region-based feature dictionary. The experimental results demonstrate the superiority of our method over the state-of-the-art methods by achieving a high segmentation (91.5%) and classification (92.5%) accuracy. In this paper, we find that the study of the caudate nucleus atrophy gives an advantage over the study of whole brain structure atrophy to detect Alzheimer's disease. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
New feature extraction method for classification of agricultural products from x-ray images
NASA Astrophysics Data System (ADS)
Talukder, Ashit; Casasent, David P.; Lee, Ha-Woon; Keagy, Pamela M.; Schatzki, Thomas F.
1999-01-01
Classification of real-time x-ray images of randomly oriented touching pistachio nuts is discussed. The ultimate objective is the development of a system for automated non- invasive detection of defective product items on a conveyor belt. We discuss the extraction of new features that allow better discrimination between damaged and clean items. This feature extraction and classification stage is the new aspect of this paper; our new maximum representation and discrimination between damaged and clean items. This feature extraction and classification stage is the new aspect of this paper; our new maximum representation and discriminating feature (MRDF) extraction method computes nonlinear features that are used as inputs to a new modified k nearest neighbor classifier. In this work the MRDF is applied to standard features. The MRDF is robust to various probability distributions of the input class and is shown to provide good classification and new ROC data.
Infrared and visible image fusion method based on saliency detection in sparse domain
NASA Astrophysics Data System (ADS)
Liu, C. H.; Qi, Y.; Ding, W. R.
2017-06-01
Infrared and visible image fusion is a key problem in the field of multi-sensor image fusion. To better preserve the significant information of the infrared and visible images in the final fused image, the saliency maps of the source images is introduced into the fusion procedure. Firstly, under the framework of the joint sparse representation (JSR) model, the global and local saliency maps of the source images are obtained based on sparse coefficients. Then, a saliency detection model is proposed, which combines the global and local saliency maps to generate an integrated saliency map. Finally, a weighted fusion algorithm based on the integrated saliency map is developed to achieve the fusion progress. The experimental results show that our method is superior to the state-of-the-art methods in terms of several universal quality evaluation indexes, as well as in the visual quality.
A four-dimensional motion field atlas of the tongue from tagged and cine magnetic resonance imaging
NASA Astrophysics Data System (ADS)
Xing, Fangxu; Prince, Jerry L.; Stone, Maureen; Wedeen, Van J.; El Fakhri, Georges; Woo, Jonghye
2017-02-01
Representation of human tongue motion using three-dimensional vector fields over time can be used to better understand tongue function during speech, swallowing, and other lingual behaviors. To characterize the inter-subject variability of the tongue's shape and motion of a population carrying out one of these functions it is desirable to build a statistical model of the four-dimensional (4D) tongue. In this paper, we propose a method to construct a spatio-temporal atlas of tongue motion using magnetic resonance (MR) images acquired from fourteen healthy human subjects. First, cine MR images revealing the anatomical features of the tongue are used to construct a 4D intensity image atlas. Second, tagged MR images acquired to capture internal motion are used to compute a dense motion field at each time frame using a phase-based motion tracking method. Third, motion fields from each subject are pulled back to the cine atlas space using the deformation fields computed during the cine atlas construction. Finally, a spatio-temporal motion field atlas is created to show a sequence of mean motion fields and their inter-subject variation. The quality of the atlas was evaluated by deforming cine images in the atlas space. Comparison between deformed and original cine images showed high correspondence. The proposed method provides a quantitative representation to observe the commonality and variability of the tongue motion field for the first time, and shows potential in evaluation of common properties such as strains and other tensors based on motion fields.
A Four-dimensional Motion Field Atlas of the Tongue from Tagged and Cine Magnetic Resonance Imaging.
Xing, Fangxu; Prince, Jerry L; Stone, Maureen; Wedeen, Van J; Fakhri, Georges El; Woo, Jonghye
2017-01-01
Representation of human tongue motion using three-dimensional vector fields over time can be used to better understand tongue function during speech, swallowing, and other lingual behaviors. To characterize the inter-subject variability of the tongue's shape and motion of a population carrying out one of these functions it is desirable to build a statistical model of the four-dimensional (4D) tongue. In this paper, we propose a method to construct a spatio-temporal atlas of tongue motion using magnetic resonance (MR) images acquired from fourteen healthy human subjects. First, cine MR images revealing the anatomical features of the tongue are used to construct a 4D intensity image atlas. Second, tagged MR images acquired to capture internal motion are used to compute a dense motion field at each time frame using a phase-based motion tracking method. Third, motion fields from each subject are pulled back to the cine atlas space using the deformation fields computed during the cine atlas construction. Finally, a spatio-temporal motion field atlas is created to show a sequence of mean motion fields and their inter-subject variation. The quality of the atlas was evaluated by deforming cine images in the atlas space. Comparison between deformed and original cine images showed high correspondence. The proposed method provides a quantitative representation to observe the commonality and variability of the tongue motion field for the first time, and shows potential in evaluation of common properties such as strains and other tensors based on motion fields.
On pictures and stuff: image quality and material appearance
NASA Astrophysics Data System (ADS)
Ferwerda, James A.
2014-02-01
Realistic images are a puzzle because they serve as visual representations of objects while also being objects themselves. When we look at an image we are able to perceive both the properties of the image and the properties of the objects represented by the image. Research on image quality has typically focused improving image properties (resolution, dynamic range, frame rate, etc.) while ignoring the issue of whether images are serving their role as visual representations. In this paper we describe a series of experiments that investigate how well images of different quality convey information about the properties of the objects they represent. In the experiments we focus on the effects that two image properties (contrast and sharpness) have on the ability of images to represent the gloss of depicted objects. We found that different experimental methods produced differing results. Specifically, when the stimulus images were presented using simultaneous pair comparison, observers were influenced by the surface properties of the images and conflated changes in image contrast and sharpness with changes in object gloss. On the other hand, when the stimulus images were presented sequentially, observers were able to disregard the image plane properties and more accurately match the gloss of the objects represented by the different quality images. These findings suggest that in understanding image quality it is useful to distinguish between quality of the imaging medium and the quality of the visual information represented by that medium.
Kavianpour, Hamidreza; Vasighi, Mahdi
2017-02-01
Nowadays, having knowledge about cellular attributes of proteins has an important role in pharmacy, medical science and molecular biology. These attributes are closely correlated with the function and three-dimensional structure of proteins. Knowledge of protein structural class is used by various methods for better understanding the protein functionality and folding patterns. Computational methods and intelligence systems can have an important role in performing structural classification of proteins. Most of protein sequences are saved in databanks as characters and strings and a numerical representation is essential for applying machine learning methods. In this work, a binary representation of protein sequences is introduced based on reduced amino acids alphabets according to surrounding hydrophobicity index. Many important features which are hidden in these long binary sequences can be clearly displayed through their cellular automata images. The extracted features from these images are used to build a classification model by support vector machine. Comparing to previous studies on the several benchmark datasets, the promising classification rates obtained by tenfold cross-validation imply that the current approach can help in revealing some inherent features deeply hidden in protein sequences and improve the quality of predicting protein structural class.
NASA Astrophysics Data System (ADS)
Zhang, Min; Zhou, Xiangrong; Goshima, Satoshi; Chen, Huayue; Muramatsu, Chisako; Hara, Takeshi; Yokoyama, Ryujiro; Kanematsu, Masayuki; Fujita, Hiroshi
2013-03-01
In this paper, we present a texture classification method based on texton learned via sparse representation (SR) with new feature histogram maps in the classification of emphysema. First, an overcomplete dictionary of textons is learned via KSVD learning on every class image patches in the training dataset. In this stage, high-pass filter is introduced to exclude patches in smooth area to speed up the dictionary learning process. Second, 3D joint-SR coefficients and intensity histograms of the test images are used for characterizing regions of interest (ROIs) instead of conventional feature histograms constructed from SR coefficients of the test images over the dictionary. Classification is then performed using a classifier with distance as a histogram dissimilarity measure. Four hundreds and seventy annotated ROIs extracted from 14 test subjects, including 6 paraseptal emphysema (PSE) subjects, 5 centrilobular emphysema (CLE) subjects and 3 panlobular emphysema (PLE) subjects, are used to evaluate the effectiveness and robustness of the proposed method. The proposed method is tested on 167 PSE, 240 CLE and 63 PLE ROIs consisting of mild, moderate and severe pulmonary emphysema. The accuracy of the proposed system is around 74%, 88% and 89% for PSE, CLE and PLE, respectively.
Radar Imaging Using The Wigner-Ville Distribution
NASA Astrophysics Data System (ADS)
Boashash, Boualem; Kenny, Owen P.; Whitehouse, Harper J.
1989-12-01
The need for analysis of time-varying signals has led to the formulation of a class of joint time-frequency distributions (TFDs). One of these TFDs, the Wigner-Ville distribution (WVD), has useful properties which can be applied to radar imaging. This paper first discusses the radar equation in terms of the time-frequency representation of the signal received from a radar system. It then presents a method of tomographic reconstruction for time-frequency images to estimate the scattering function of the aircraft. An optical archi-tecture is then discussed for the real-time implementation of the analysis method based on the WVD.
Sparse signal representation and its applications in ultrasonic NDE.
Zhang, Guang-Ming; Zhang, Cheng-Zhong; Harvey, David M
2012-03-01
Many sparse signal representation (SSR) algorithms have been developed in the past decade. The advantages of SSR such as compact representations and super resolution lead to the state of the art performance of SSR for processing ultrasonic non-destructive evaluation (NDE) signals. Choosing a suitable SSR algorithm and designing an appropriate overcomplete dictionary is a key for success. After a brief review of sparse signal representation methods and the design of overcomplete dictionaries, this paper addresses the recent accomplishments of SSR for processing ultrasonic NDE signals. The advantages and limitations of SSR algorithms and various overcomplete dictionaries widely-used in ultrasonic NDE applications are explored in depth. Their performance improvement compared to conventional signal processing methods in many applications such as ultrasonic flaw detection and noise suppression, echo separation and echo estimation, and ultrasonic imaging is investigated. The challenging issues met in practical ultrasonic NDE applications for example the design of a good dictionary are discussed. Representative experimental results are presented for demonstration. Copyright © 2011 Elsevier B.V. All rights reserved.
Morphology filter bank for extracting nodular and linear patterns in medical images.
Hashimoto, Ryutaro; Uchiyama, Yoshikazu; Uchimura, Keiichi; Koutaki, Gou; Inoue, Tomoki
2017-04-01
Using image processing to extract nodular or linear shadows is a key technique of computer-aided diagnosis schemes. This study proposes a new method for extracting nodular and linear patterns of various sizes in medical images. We have developed a morphology filter bank that creates multiresolution representations of an image. Analysis bank of this filter bank produces nodular and linear patterns at each resolution level. Synthesis bank can then be used to perfectly reconstruct the original image from these decomposed patterns. Our proposed method shows better performance based on a quantitative evaluation using a synthesized image compared with a conventional method based on a Hessian matrix, often used to enhance nodular and linear patterns. In addition, experiments show that our method can be applied to the followings: (1) microcalcifications of various sizes in mammograms can be extracted, (2) blood vessels of various sizes in retinal fundus images can be extracted, and (3) thoracic CT images can be reconstructed while removing normal vessels. Our proposed method is useful for extracting nodular and linear shadows or removing normal structures in medical images.
An infrared-visible image fusion scheme based on NSCT and compressed sensing
NASA Astrophysics Data System (ADS)
Zhang, Qiong; Maldague, Xavier
2015-05-01
Image fusion, as a research hot point nowadays in the field of infrared computer vision, has been developed utilizing different varieties of methods. Traditional image fusion algorithms are inclined to bring problems, such as data storage shortage and computational complexity increase, etc. Compressed sensing (CS) uses sparse sampling without knowing the priori knowledge and greatly reconstructs the image, which reduces the cost and complexity of image processing. In this paper, an advanced compressed sensing image fusion algorithm based on non-subsampled contourlet transform (NSCT) is proposed. NSCT provides better sparsity than the wavelet transform in image representation. Throughout the NSCT decomposition, the low-frequency and high-frequency coefficients can be obtained respectively. For the fusion processing of low-frequency coefficients of infrared and visible images , the adaptive regional energy weighting rule is utilized. Thus only the high-frequency coefficients are specially measured. Here we use sparse representation and random projection to obtain the required values of high-frequency coefficients, afterwards, the coefficients of each image block can be fused via the absolute maximum selection rule and/or the regional standard deviation rule. In the reconstruction of the compressive sampling results, a gradient-based iterative algorithm and the total variation (TV) method are employed to recover the high-frequency coefficients. Eventually, the fused image is recovered by inverse NSCT. Both the visual effects and the numerical computation results after experiments indicate that the presented approach achieves much higher quality of image fusion, accelerates the calculations, enhances various targets and extracts more useful information.
NASA Astrophysics Data System (ADS)
Cruz-Roa, Angel; Arevalo, John; Basavanhally, Ajay; Madabhushi, Anant; González, Fabio
2015-01-01
Learning data representations directly from the data itself is an approach that has shown great success in different pattern recognition problems, outperforming state-of-the-art feature extraction schemes for different tasks in computer vision, speech recognition and natural language processing. Representation learning applies unsupervised and supervised machine learning methods to large amounts of data to find building-blocks that better represent the information in it. Digitized histopathology images represents a very good testbed for representation learning since it involves large amounts of high complex, visual data. This paper presents a comparative evaluation of different supervised and unsupervised representation learning architectures to specifically address open questions on what type of learning architectures (deep or shallow), type of learning (unsupervised or supervised) is optimal. In this paper we limit ourselves to addressing these questions in the context of distinguishing between anaplastic and non-anaplastic medulloblastomas from routine haematoxylin and eosin stained images. The unsupervised approaches evaluated were sparse autoencoders and topographic reconstruct independent component analysis, and the supervised approach was convolutional neural networks. Experimental results show that shallow architectures with more neurons are better than deeper architectures without taking into account local space invariances and that topographic constraints provide useful invariant features in scale and rotations for efficient tumor differentiation.
Sparsity based target detection for compressive spectral imagery
NASA Astrophysics Data System (ADS)
Boada, David Alberto; Arguello Fuentes, Henry
2016-09-01
Hyperspectral imagery provides significant information about the spectral characteristics of objects and materials present in a scene. It enables object and feature detection, classification, or identification based on the acquired spectral characteristics. However, it relies on sophisticated acquisition and data processing systems able to acquire, process, store, and transmit hundreds or thousands of image bands from a given area of interest which demands enormous computational resources in terms of storage, computationm, and I/O throughputs. Specialized optical architectures have been developed for the compressed acquisition of spectral images using a reduced set of coded measurements contrary to traditional architectures that need a complete set of measurements of the data cube for image acquisition, dealing with the storage and acquisition limitations. Despite this improvement, if any processing is desired, the image has to be reconstructed by an inverse algorithm in order to be processed, which is also an expensive task. In this paper, a sparsity-based algorithm for target detection in compressed spectral images is presented. Specifically, the target detection model adapts a sparsity-based target detector to work in a compressive domain, modifying the sparse representation basis in the compressive sensing problem by means of over-complete training dictionaries and a wavelet basis representation. Simulations show that the presented method can achieve even better detection results than the state of the art methods.
Kather, Jakob Nikolas; Marx, Alexander; Reyes-Aldasoro, Constantino Carlos; Schad, Lothar R; Zöllner, Frank Gerrit; Weis, Cleo-Aron
2015-08-07
Blood vessels in solid tumors are not randomly distributed, but are clustered in angiogenic hotspots. Tumor microvessel density (MVD) within these hotspots correlates with patient survival and is widely used both in diagnostic routine and in clinical trials. Still, these hotspots are usually subjectively defined. There is no unbiased, continuous and explicit representation of tumor vessel distribution in histological whole slide images. This shortcoming distorts angiogenesis measurements and may account for ambiguous results in the literature. In the present study, we describe and evaluate a new method that eliminates this bias and makes angiogenesis quantification more objective and more efficient. Our approach involves automatic slide scanning, automatic image analysis and spatial statistical analysis. By comparing a continuous MVD function of the actual sample to random point patterns, we introduce an objective criterion for hotspot detection: An angiogenic hotspot is defined as a clustering of blood vessels that is very unlikely to occur randomly. We evaluate the proposed method in N=11 images of human colorectal carcinoma samples and compare the results to a blinded human observer. For the first time, we demonstrate the existence of statistically significant hotspots in tumor images and provide a tool to accurately detect these hotspots.
Cruz-Roa, Angel; Díaz, Gloria; Romero, Eduardo; González, Fabio A.
2011-01-01
Histopathological images are an important resource for clinical diagnosis and biomedical research. From an image understanding point of view, the automatic annotation of these images is a challenging problem. This paper presents a new method for automatic histopathological image annotation based on three complementary strategies, first, a part-based image representation, called the bag of features, which takes advantage of the natural redundancy of histopathological images for capturing the fundamental patterns of biological structures, second, a latent topic model, based on non-negative matrix factorization, which captures the high-level visual patterns hidden in the image, and, third, a probabilistic annotation model that links visual appearance of morphological and architectural features associated to 10 histopathological image annotations. The method was evaluated using 1,604 annotated images of skin tissues, which included normal and pathological architectural and morphological features, obtaining a recall of 74% and a precision of 50%, which improved a baseline annotation method based on support vector machines in a 64% and 24%, respectively. PMID:22811960
A contour-based shape descriptor for biomedical image classification and retrieval
NASA Astrophysics Data System (ADS)
You, Daekeun; Antani, Sameer; Demner-Fushman, Dina; Thoma, George R.
2013-12-01
Contours, object blobs, and specific feature points are utilized to represent object shapes and extract shape descriptors that can then be used for object detection or image classification. In this research we develop a shape descriptor for biomedical image type (or, modality) classification. We adapt a feature extraction method used in optical character recognition (OCR) for character shape representation, and apply various image preprocessing methods to successfully adapt the method to our application. The proposed shape descriptor is applied to radiology images (e.g., MRI, CT, ultrasound, X-ray, etc.) to assess its usefulness for modality classification. In our experiment we compare our method with other visual descriptors such as CEDD, CLD, Tamura, and PHOG that extract color, texture, or shape information from images. The proposed method achieved the highest classification accuracy of 74.1% among all other individual descriptors in the test, and when combined with CSD (color structure descriptor) showed better performance (78.9%) than using the shape descriptor alone.
Neural network-based feature point descriptors for registration of optical and SAR images
NASA Astrophysics Data System (ADS)
Abulkhanov, Dmitry; Konovalenko, Ivan; Nikolaev, Dmitry; Savchik, Alexey; Shvets, Evgeny; Sidorchuk, Dmitry
2018-04-01
Registration of images of different nature is an important technique used in image fusion, change detection, efficient information representation and other problems of computer vision. Solving this task using feature-based approaches is usually more complex than registration of several optical images because traditional feature descriptors (SIFT, SURF, etc.) perform poorly when images have different nature. In this paper we consider the problem of registration of SAR and optical images. We train neural network to build feature point descriptors and use RANSAC algorithm to align found matches. Experimental results are presented that confirm the method's effectiveness.
Hologram production and representation for corrected image
NASA Astrophysics Data System (ADS)
Jiao, Gui Chao; Zhang, Rui; Su, Xue Mei
2015-12-01
In this paper, a CCD sensor device is used to record the distorted homemade grid images which are taken by a wide angle camera. The distorted images are corrected by using methods of position calibration and correction of gray with vc++ 6.0 and opencv software. Holography graphes for the corrected pictures are produced. The clearly reproduced images are obtained where Fresnel algorithm is used in graph processing by reducing the object and reference light from Fresnel diffraction to delete zero-order part of the reproduced images. The investigation is useful in optical information processing and image encryption transmission.
Online fingerprint verification.
Upendra, K; Singh, S; Kumar, V; Verma, H K
2007-01-01
As organizations search for more secure authentication methods for user access, e-commerce, and other security applications, biometrics is gaining increasing attention. With an increasing emphasis on the emerging automatic personal identification applications, fingerprint based identification is becoming more popular. The most widely used fingerprint representation is the minutiae based representation. The main drawback with this representation is that it does not utilize a significant component of the rich discriminatory information available in the fingerprints. Local ridge structures cannot be completely characterized by minutiae. Also, it is difficult quickly to match two fingerprint images containing different number of unregistered minutiae points. In this study filter bank based representation, which eliminates these weakness, is implemented and the overall performance of the developed system is tested. The results have shown that this system can be used effectively for secure online verification applications.
Towards a multilevel cognitive probabilistic representation of space
NASA Astrophysics Data System (ADS)
Tapus, Adriana; Vasudevan, Shrihari; Siegwart, Roland
2005-03-01
This paper addresses the problem of perception and representation of space for a mobile agent. A probabilistic hierarchical framework is suggested as a solution to this problem. The method proposed is a combination of probabilistic belief with "Object Graph Models" (OGM). The world is viewed from a topological optic, in terms of objects and relationships between them. The hierarchical representation that we propose permits an efficient and reliable modeling of the information that the mobile agent would perceive from its environment. The integration of both navigational and interactional capabilities through efficient representation is also addressed. Experiments on a set of images taken from the real world that validate the approach are reported. This framework draws on the general understanding of human cognition and perception and contributes towards the overall efforts to build cognitive robot companions.
Making data matter: Voxel printing for the digital fabrication of data across scales and domains.
Bader, Christoph; Kolb, Dominik; Weaver, James C; Sharma, Sunanda; Hosny, Ahmed; Costa, João; Oxman, Neri
2018-05-01
We present a multimaterial voxel-printing method that enables the physical visualization of data sets commonly associated with scientific imaging. Leveraging voxel-based control of multimaterial three-dimensional (3D) printing, our method enables additive manufacturing of discontinuous data types such as point cloud data, curve and graph data, image-based data, and volumetric data. By converting data sets into dithered material deposition descriptions, through modifications to rasterization processes, we demonstrate that data sets frequently visualized on screen can be converted into physical, materially heterogeneous objects. Our approach alleviates the need to postprocess data sets to boundary representations, preventing alteration of data and loss of information in the produced physicalizations. Therefore, it bridges the gap between digital information representation and physical material composition. We evaluate the visual characteristics and features of our method, assess its relevance and applicability in the production of physical visualizations, and detail the conversion of data sets for multimaterial 3D printing. We conclude with exemplary 3D-printed data sets produced by our method pointing toward potential applications across scales, disciplines, and problem domains.
Efficient local representations for three-dimensional palmprint recognition
NASA Astrophysics Data System (ADS)
Yang, Bing; Wang, Xiaohua; Yao, Jinliang; Yang, Xin; Zhu, Wenhua
2013-10-01
Palmprints have been broadly used for personal authentication because they are highly accurate and incur low cost. Most previous works have focused on two-dimensional (2-D) palmprint recognition in the past decade. Unfortunately, 2-D palmprint recognition systems lose the shape information when capturing palmprint images. Moreover, such 2-D palmprint images can be easily forged or affected by noise. Hence, three-dimensional (3-D) palmprint recognition has been regarded as a promising way to further improve the performance of palmprint recognition systems. We have developed a simple, but efficient method for 3-D palmprint recognition by using local features. We first utilize shape index representation to describe the geometry of local regions in 3-D palmprint data. Then, we extract local binary pattern and Gabor wavelet features from the shape index image. The two types of complementary features are finally fused at a score level for further improvements. The experimental results on the Hong Kong Polytechnic 3-D palmprint database, which contains 8000 samples from 400 palms, illustrate the effectiveness of the proposed method.
NASA Astrophysics Data System (ADS)
Yan, Yue
2018-03-01
A synthetic aperture radar (SAR) automatic target recognition (ATR) method based on the convolutional neural networks (CNN) trained by augmented training samples is proposed. To enhance the robustness of CNN to various extended operating conditions (EOCs), the original training images are used to generate the noisy samples at different signal-to-noise ratios (SNRs), multiresolution representations, and partially occluded images. Then, the generated images together with the original ones are used to train a designed CNN for target recognition. The augmented training samples can contrapuntally improve the robustness of the trained CNN to the covered EOCs, i.e., the noise corruption, resolution variance, and partial occlusion. Moreover, the significantly larger training set effectively enhances the representation capability for other conditions, e.g., the standard operating condition (SOC), as well as the stability of the network. Therefore, better performance can be achieved by the proposed method for SAR ATR. For experimental evaluation, extensive experiments are conducted on the Moving and Stationary Target Acquisition and Recognition dataset under SOC and several typical EOCs.
Mid-level image representations for real-time heart view plane classification of echocardiograms.
Penatti, Otávio A B; Werneck, Rafael de O; de Almeida, Waldir R; Stein, Bernardo V; Pazinato, Daniel V; Mendes Júnior, Pedro R; Torres, Ricardo da S; Rocha, Anderson
2015-11-01
In this paper, we explore mid-level image representations for real-time heart view plane classification of 2D echocardiogram ultrasound images. The proposed representations rely on bags of visual words, successfully used by the computer vision community in visual recognition problems. An important element of the proposed representations is the image sampling with large regions, drastically reducing the execution time of the image characterization procedure. Throughout an extensive set of experiments, we evaluate the proposed approach against different image descriptors for classifying four heart view planes. The results show that our approach is effective and efficient for the target problem, making it suitable for use in real-time setups. The proposed representations are also robust to different image transformations, e.g., downsampling, noise filtering, and different machine learning classifiers, keeping classification accuracy above 90%. Feature extraction can be performed in 30 fps or 60 fps in some cases. This paper also includes an in-depth review of the literature in the area of automatic echocardiogram view classification giving the reader a through comprehension of this field of study. Copyright © 2015 Elsevier Ltd. All rights reserved.
Unique semantic space in the brain of each beholder predicts perceived similarity
Charest, Ian; Kievit, Rogier A.; Schmitz, Taylor W.; Deca, Diana; Kriegeskorte, Nikolaus
2014-01-01
The unique way in which each of us perceives the world must arise from our brain representations. If brain imaging could reveal an individual’s unique mental representation, it could help us understand the biological substrate of our individual experiential worlds in mental health and disease. However, imaging studies of object vision have focused on commonalities between individuals rather than individual differences and on category averages rather than representations of particular objects. Here we investigate the individually unique component of brain representations of particular objects with functional MRI (fMRI). Subjects were presented with unfamiliar and personally meaningful object images while we measured their brain activity on two separate days. We characterized the representational geometry by the dissimilarity matrix of activity patterns elicited by particular object images. The representational geometry remained stable across scanning days and was unique in each individual in early visual cortex and human inferior temporal cortex (hIT). The hIT representation predicted perceived similarity as reflected in dissimilarity judgments. Importantly, hIT predicted the individually unique component of the judgments when the objects were personally meaningful. Our results suggest that hIT brain representational idiosyncrasies accessible to fMRI are expressed in an individual's perceptual judgments. The unique way each of us perceives the world thus might reflect the individually unique representation in high-level visual areas. PMID:25246586
Sparsity-based Poisson denoising with dictionary learning.
Giryes, Raja; Elad, Michael
2014-12-01
The problem of Poisson denoising appears in various imaging applications, such as low-light photography, medical imaging, and microscopy. In cases of high SNR, several transformations exist so as to convert the Poisson noise into an additive-independent identically distributed. Gaussian noise, for which many effective algorithms are available. However, in a low-SNR regime, these transformations are significantly less accurate, and a strategy that relies directly on the true noise statistics is required. Salmon et al took this route, proposing a patch-based exponential image representation model based on Gaussian mixture model, leading to state-of-the-art results. In this paper, we propose to harness sparse-representation modeling to the image patches, adopting the same exponential idea. Our scheme uses a greedy pursuit with boot-strapping-based stopping condition and dictionary learning within the denoising process. The reconstruction performance of the proposed scheme is competitive with leading methods in high SNR and achieving state-of-the-art results in cases of low SNR.
A Selective-Echo Method for Chemical-Shift Imaging of Two-Component Systems
NASA Astrophysics Data System (ADS)
Gerald, Rex E., II; Krasavin, Anatoly O.; Botto, Robert E.
A simple and effective method for selectively imaging either one of two chemical species in a two-component system is presented and demonstrated experimentally. The pulse sequence employed, selective- echo chemical- shift imaging (SECSI), is a hybrid (frequency-selective/ T1-contrast) technique that is executed in a short period of time, utilizes the full Boltzmann magnetization of each chemical species to form the corresponding image, and requires only hard pulses of quadrature phase. This approach provides a direct and unambiguous representation of the spatial distribution of the two chemical species. In addition, the performance characteristics and the advantages of the SECSI sequence are compared on a common basis to those of other pulse sequences.
NASA Astrophysics Data System (ADS)
Vishnukumar, S.; Wilscy, M.
2017-12-01
In this paper, we propose a single image Super-Resolution (SR) method based on Compressive Sensing (CS) and Improved Total Variation (TV) Minimization Sparse Recovery. In the CS framework, low-resolution (LR) image is treated as the compressed version of high-resolution (HR) image. Dictionary Training and Sparse Recovery are the two phases of the method. K-Singular Value Decomposition (K-SVD) method is used for dictionary training and the dictionary represents HR image patches in a sparse manner. Here, only the interpolated version of the LR image is used for training purpose and thereby the structural self similarity inherent in the LR image is exploited. In the sparse recovery phase the sparse representation coefficients with respect to the trained dictionary for LR image patches are derived using Improved TV Minimization method. HR image can be reconstructed by the linear combination of the dictionary and the sparse coefficients. The experimental results show that the proposed method gives better results quantitatively as well as qualitatively on both natural and remote sensing images. The reconstructed images have better visual quality since edges and other sharp details are preserved.
Applying a visual language for image processing as a graphical teaching tool in medical imaging
NASA Astrophysics Data System (ADS)
Birchman, James J.; Tanimoto, Steven L.; Rowberg, Alan H.; Choi, Hyung-Sik; Kim, Yongmin
1992-05-01
Typical user interaction in image processing is with command line entries, pull-down menus, or text menu selections from a list, and as such is not generally graphical in nature. Although applying these interactive methods to construct more sophisticated algorithms from a series of simple image processing steps may be clear to engineers and programmers, it may not be clear to clinicians. A solution to this problem is to implement a visual programming language using visual representations to express image processing algorithms. Visual representations promote a more natural and rapid understanding of image processing algorithms by providing more visual insight into what the algorithms do than the interactive methods mentioned above can provide. Individuals accustomed to dealing with images will be more likely to understand an algorithm that is represented visually. This is especially true of referring physicians, such as surgeons in an intensive care unit. With the increasing acceptance of picture archiving and communications system (PACS) workstations and the trend toward increasing clinical use of image processing, referring physicians will need to learn more sophisticated concepts than simply image access and display. If the procedures that they perform commonly, such as window width and window level adjustment and image enhancement using unsharp masking, are depicted visually in an interactive environment, it will be easier for them to learn and apply these concepts. The software described in this paper is a visual programming language for imaging processing which has been implemented on the NeXT computer using NeXTstep user interface development tools and other tools in an object-oriented environment. The concept is based upon the description of a visual language titled `Visualization of Vision Algorithms' (VIVA). Iconic representations of simple image processing steps are placed into a workbench screen and connected together into a dataflow path by the user. As the user creates and edits a dataflow path, more complex algorithms can be built on the screen. Once the algorithm is built, it can be executed, its results can be reviewed, and operator parameters can be interactively adjusted until an optimized output is produced. The optimized algorithm can then be saved and added to the system as a new operator. This system has been evaluated as a graphical teaching tool for window width and window level adjustment, image enhancement using unsharp masking, and other techniques.
Quantum realization of the nearest-neighbor interpolation method for FRQI and NEQR
NASA Astrophysics Data System (ADS)
Sang, Jianzhi; Wang, Shen; Niu, Xiamu
2016-01-01
This paper is concerned with the feasibility of the classical nearest-neighbor interpolation based on flexible representation of quantum images (FRQI) and novel enhanced quantum representation (NEQR). Firstly, the feasibility of the classical image nearest-neighbor interpolation for quantum images of FRQI and NEQR is proven. Then, by defining the halving operation and by making use of quantum rotation gates, the concrete quantum circuit of the nearest-neighbor interpolation for FRQI is designed for the first time. Furthermore, quantum circuit of the nearest-neighbor interpolation for NEQR is given. The merit of the proposed NEQR circuit lies in their low complexity, which is achieved by utilizing the halving operation and the quantum oracle operator. Finally, in order to further improve the performance of the former circuits, new interpolation circuits for FRQI and NEQR are presented by using Control-NOT gates instead of a halving operation. Simulation results show the effectiveness of the proposed circuits.
Patwary, Nurmohammed; Preza, Chrysanthe
2015-01-01
A depth-variant (DV) image restoration algorithm for wide field fluorescence microscopy, using an orthonormal basis decomposition of DV point-spread functions (PSFs), is investigated in this study. The efficient PSF representation is based on a previously developed principal component analysis (PCA), which is computationally intensive. We present an approach developed to reduce the number of DV PSFs required for the PCA computation, thereby making the PCA-based approach computationally tractable for thick samples. Restoration results from both synthetic and experimental images show consistency and that the proposed algorithm addresses efficiently depth-induced aberration using a small number of principal components. Comparison of the PCA-based algorithm with a previously-developed strata-based DV restoration algorithm demonstrates that the proposed method improves performance by 50% in terms of accuracy and simultaneously reduces the processing time by 64% using comparable computational resources. PMID:26504634
Groupwise registration of MR brain images with tumors.
Tang, Zhenyu; Wu, Yihong; Fan, Yong
2017-08-04
A novel groupwise image registration framework is developed for registering MR brain images with tumors. Our method iteratively estimates a normal-appearance counterpart for each tumor image to be registered and constructs a directed graph (digraph) of normal-appearance images to guide the groupwise image registration. Particularly, our method maps each tumor image to its normal appearance counterpart by identifying and inpainting brain tumor regions with intensity information estimated using a low-rank plus sparse matrix decomposition based image representation technique. The estimated normal-appearance images are groupwisely registered to a group center image guided by a digraph of images so that the total length of 'image registration paths' to be the minimum, and then the original tumor images are warped to the group center image using the resulting deformation fields. We have evaluated our method based on both simulated and real MR brain tumor images. The registration results were evaluated with overlap measures of corresponding brain regions and average entropy of image intensity information, and Wilcoxon signed rank tests were adopted to compare different methods with respect to their regional overlap measures. Compared with a groupwise image registration method that is applied to normal-appearance images estimated using the traditional low-rank plus sparse matrix decomposition based image inpainting, our method achieved higher image registration accuracy with statistical significance (p = 7.02 × 10 -9 ).
An integration of minimum local feature representation methods to recognize large variation of foods
NASA Astrophysics Data System (ADS)
Razali, Mohd Norhisham bin; Manshor, Noridayu; Halin, Alfian Abdul; Mustapha, Norwati; Yaakob, Razali
2017-10-01
Local invariant features have shown to be successful in describing object appearances for image classification tasks. Such features are robust towards occlusion and clutter and are also invariant against scale and orientation changes. This makes them suitable for classification tasks with little inter-class similarity and large intra-class difference. In this paper, we propose an integrated representation of the Speeded-Up Robust Feature (SURF) and Scale Invariant Feature Transform (SIFT) descriptors, using late fusion strategy. The proposed representation is used for food recognition from a dataset of food images with complex appearance variations. The Bag of Features (BOF) approach is employed to enhance the discriminative ability of the local features. Firstly, the individual local features are extracted to construct two kinds of visual vocabularies, representing SURF and SIFT. The visual vocabularies are then concatenated and fed into a Linear Support Vector Machine (SVM) to classify the respective food categories. Experimental results demonstrate impressive overall recognition at 82.38% classification accuracy based on the challenging UEC-Food100 dataset.
Multiclass Data Segmentation using Diffuse Interface Methods on Graphs
2014-01-01
37] that performs interac- tive image segmentation using the solution to a combinatorial Dirichlet problem. Elmoataz et al . have developed general...izations of the graph Laplacian [25] for image denoising and manifold smoothing. Couprie et al . in [18] define a conve- niently parameterized graph...continuous setting carry over to the discrete graph representation. For general data segmentation, Bresson et al . in [8], present rigorous convergence
Pneumothorax detection in chest radiographs using local and global texture signatures
NASA Astrophysics Data System (ADS)
Geva, Ofer; Zimmerman-Moreno, Gali; Lieberman, Sivan; Konen, Eli; Greenspan, Hayit
2015-03-01
A novel framework for automatic detection of pneumothorax abnormality in chest radiographs is presented. The suggested method is based on a texture analysis approach combined with supervised learning techniques. The proposed framework consists of two main steps: at first, a texture analysis process is performed for detection of local abnormalities. Labeled image patches are extracted in the texture analysis procedure following which local analysis values are incorporated into a novel global image representation. The global representation is used for training and detection of the abnormality at the image level. The presented global representation is designed based on the distinctive shape of the lung, taking into account the characteristics of typical pneumothorax abnormalities. A supervised learning process was performed on both the local and global data, leading to trained detection system. The system was tested on a dataset of 108 upright chest radiographs. Several state of the art texture feature sets were experimented with (Local Binary Patterns, Maximum Response filters). The optimal configuration yielded sensitivity of 81% with specificity of 87%. The results of the evaluation are promising, establishing the current framework as a basis for additional improvements and extensions.
Predefined Redundant Dictionary for Effective Depth Maps Representation
NASA Astrophysics Data System (ADS)
Sebai, Dorsaf; Chaieb, Faten; Ghorbel, Faouzi
2016-01-01
The multi-view video plus depth (MVD) video format consists of two components: texture and depth map, where a combination of these components enables a receiver to generate arbitrary virtual views. However, MVD presents a very voluminous video format that requires a compression process for storage and especially for transmission. Conventional codecs are perfectly efficient for texture images compression but not for intrinsic depth maps properties. Depth images indeed are characterized by areas of smoothly varying grey levels separated by sharp discontinuities at the position of object boundaries. Preserving these characteristics is important to enable high quality view synthesis at the receiver side. In this paper, sparse representation of depth maps is discussed. It is shown that a significant gain in sparsity is achieved when particular mixed dictionaries are used for approximating these types of images with greedy selection strategies. Experiments are conducted to confirm the effectiveness at producing sparse representations, and competitiveness, with respect to candidate state-of-art dictionaries. Finally, the resulting method is shown to be effective for depth maps compression and represents an advantage over the ongoing 3D high efficiency video coding compression standard, particularly at medium and high bitrates.
Breast histopathology image segmentation using spatio-colour-texture based graph partition method.
Belsare, A D; Mushrif, M M; Pangarkar, M A; Meshram, N
2016-06-01
This paper proposes a novel integrated spatio-colour-texture based graph partitioning method for segmentation of nuclear arrangement in tubules with a lumen or in solid islands without a lumen from digitized Hematoxylin-Eosin stained breast histology images, in order to automate the process of histology breast image analysis to assist the pathologists. We propose a new similarity based super pixel generation method and integrate it with texton representation to form spatio-colour-texture map of Breast Histology Image. Then a new weighted distance based similarity measure is used for generation of graph and final segmentation using normalized cuts method is obtained. The extensive experiments carried shows that the proposed algorithm can segment nuclear arrangement in normal as well as malignant duct in breast histology tissue image. For evaluation of the proposed method the ground-truth image database of 100 malignant and nonmalignant breast histology images is created with the help of two expert pathologists and the quantitative evaluation of proposed breast histology image segmentation has been performed. It shows that the proposed method outperforms over other methods. © 2015 The Authors Journal of Microscopy © 2015 Royal Microscopical Society.
Paiton, Dylan M.; Kenyon, Garrett T.; Brumby, Steven P.; Schultz, Peter F.; George, John S.
2015-07-28
An approach to detecting objects in an image dataset may combine texture/color detection, shape/contour detection, and/or motion detection using sparse, generative, hierarchical models with lateral and top-down connections. A first independent representation of objects in an image dataset may be produced using a color/texture detection algorithm. A second independent representation of objects in the image dataset may be produced using a shape/contour detection algorithm. A third independent representation of objects in the image dataset may be produced using a motion detection algorithm. The first, second, and third independent representations may then be combined into a single coherent output using a combinatorial algorithm.
Scattering Removal for Finger-Vein Image Restoration
Yang, Jinfeng; Zhang, Ben; Shi, Yihua
2012-01-01
Finger-vein recognition has received increased attention recently. However, the finger-vein images are always captured in poor quality. This certainly makes finger-vein feature representation unreliable, and further impairs the accuracy of finger-vein recognition. In this paper, we first give an analysis of the intrinsic factors causing finger-vein image degradation, and then propose a simple but effective image restoration method based on scattering removal. To give a proper description of finger-vein image degradation, a biological optical model (BOM) specific to finger-vein imaging is proposed according to the principle of light propagation in biological tissues. Based on BOM, the light scattering component is sensibly estimated and properly removed for finger-vein image restoration. Finally, experimental results demonstrate that the proposed method is powerful in enhancing the finger-vein image contrast and in improving the finger-vein image matching accuracy. PMID:22737028
Brain's tumor image processing using shearlet transform
NASA Astrophysics Data System (ADS)
Cadena, Luis; Espinosa, Nikolai; Cadena, Franklin; Korneeva, Anna; Kruglyakov, Alexey; Legalov, Alexander; Romanenko, Alexey; Zotin, Alexander
2017-09-01
Brain tumor detection is well known research area for medical and computer scientists. In last decades there has been much research done on tumor detection, segmentation, and classification. Medical imaging plays a central role in the diagnosis of brain tumors and nowadays uses methods non-invasive, high-resolution techniques, especially magnetic resonance imaging and computed tomography scans. Edge detection is a fundamental tool in image processing, particularly in the areas of feature detection and feature extraction, which aim at identifying points in a digital image at which the image has discontinuities. Shearlets is the most successful frameworks for the efficient representation of multidimensional data, capturing edges and other anisotropic features which frequently dominate multidimensional phenomena. The paper proposes an improved brain tumor detection method by automatically detecting tumor location in MR images, its features are extracted by new shearlet transform.
NASA Astrophysics Data System (ADS)
Lecun, Yann; Bengio, Yoshua; Hinton, Geoffrey
2015-05-01
Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
LeCun, Yann; Bengio, Yoshua; Hinton, Geoffrey
2015-05-28
Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
Single-view phase retrieval of an extended sample by exploiting edge detection and sparsity
Tripathi, Ashish; McNulty, Ian; Munson, Todd; ...
2016-10-14
We propose a new approach to robustly retrieve the exit wave of an extended sample from its coherent diffraction pattern by exploiting sparsity of the sample's edges. This approach enables imaging of an extended sample with a single view, without ptychography. We introduce nonlinear optimization methods that promote sparsity, and we derive update rules to robustly recover the sample's exit wave. We test these methods on simulated samples by varying the sparsity of the edge-detected representation of the exit wave. Finally, our tests illustrate the strengths and limitations of the proposed method in imaging extended samples.
NASA Astrophysics Data System (ADS)
Wang, Xianmin; Li, Bo; Xu, Qizhi
2016-07-01
The anisotropic scale space (ASS) is often used to enhance the performance of a scale-invariant feature transform (SIFT) algorithm in the registration of synthetic aperture radar (SAR) images. The existing ASS-based methods usually suffer from unstable keypoints and false matches, since the anisotropic diffusion filtering has limitations in reducing the speckle noise from SAR images while building the ASS image representation. We proposed a speckle reducing SIFT match method to obtain stable keypoints and acquire precise matches for the SAR image registration. First, the keypoints are detected in a speckle reducing anisotropic scale space constructed by the speckle reducing anisotropic diffusion, so that speckle noise is greatly reduced and prominent structures of the images are preserved, consequently the stable keypoints can be derived. Next, the probabilistic relaxation labeling approach is employed to establish the matches of the keypoints then the correct match rate of the keypoints is significantly increased. Experiments conducted on simulated speckled images and real SAR images demonstrate the effectiveness of the proposed method.
Disocclusion: a variational approach using level lines.
Masnou, Simon
2002-01-01
Object recognition, robot vision, image and film restoration may require the ability to perform disocclusion. We call disocclusion the recovery of occluded areas in a digital image by interpolation from their vicinity. It is shown in this paper how disocclusion can be performed by means of the level-lines structure, which offers a reliable, complete and contrast-invariant representation of images. Level-lines based disocclusion yields a solution that may have strong discontinuities. The proposed method is compatible with Kanizsa's amodal completion theory.
ERIC Educational Resources Information Center
Thoma, Volker; Hummel, John E.; Davidoff, Jules
2004-01-01
According to the hybrid theory of object recognition (J. E. Hummel, 2001), ignored object images are represented holistically, and attended images are represented both holistically and analytically. This account correctly predicts patterns of visual priming as a function of translation, scale (B. J. Stankiewicz & J. E. Hummel, 2002), and…
Good Practices for Learning to Recognize Actions Using FV and VLAD.
Wu, Jianxin; Zhang, Yu; Lin, Weiyao
2016-12-01
High dimensional representations such as Fisher vectors (FV) and vectors of locally aggregated descriptors (VLAD) have shown state-of-the-art accuracy for action recognition in videos. The high dimensionality, on the other hand, also causes computational difficulties when scaling up to large-scale video data. This paper makes three lines of contributions to learning to recognize actions using high dimensional representations. First, we reviewed several existing techniques that improve upon FV or VLAD in image classification, and performed extensive empirical evaluations to assess their applicability for action recognition. Our analyses of these empirical results show that normality and bimodality are essential to achieve high accuracy. Second, we proposed a new pooling strategy for VLAD and three simple, efficient, and effective transformations for both FV and VLAD. Both proposed methods have shown higher accuracy than the original FV/VLAD method in extensive evaluations. Third, we proposed and evaluated new feature selection and compression methods for the FV and VLAD representations. This strategy uses only 4% of the storage of the original representation, but achieves comparable or even higher accuracy. Based on these contributions, we recommend a set of good practices for action recognition in videos for practitioners in this field.
Signal Sampling for Efficient Sparse Representation of Resting State FMRI Data
Ge, Bao; Makkie, Milad; Wang, Jin; Zhao, Shijie; Jiang, Xi; Li, Xiang; Lv, Jinglei; Zhang, Shu; Zhang, Wei; Han, Junwei; Guo, Lei; Liu, Tianming
2015-01-01
As the size of brain imaging data such as fMRI grows explosively, it provides us with unprecedented and abundant information about the brain. How to reduce the size of fMRI data but not lose much information becomes a more and more pressing issue. Recent literature studies tried to deal with it by dictionary learning and sparse representation methods, however, their computation complexities are still high, which hampers the wider application of sparse representation method to large scale fMRI datasets. To effectively address this problem, this work proposes to represent resting state fMRI (rs-fMRI) signals of a whole brain via a statistical sampling based sparse representation. First we sampled the whole brain’s signals via different sampling methods, then the sampled signals were aggregate into an input data matrix to learn a dictionary, finally this dictionary was used to sparsely represent the whole brain’s signals and identify the resting state networks. Comparative experiments demonstrate that the proposed signal sampling framework can speed-up by ten times in reconstructing concurrent brain networks without losing much information. The experiments on the 1000 Functional Connectomes Project further demonstrate its effectiveness and superiority. PMID:26646924
Volumetric calibration of a plenoptic camera
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hall, Elise Munz; Fahringer, Timothy W.; Guildenbecher, Daniel Robert
Here, the volumetric calibration of a plenoptic camera is explored to correct for inaccuracies due to real-world lens distortions and thin-lens assumptions in current processing methods. Two methods of volumetric calibration based on a polynomial mapping function that does not require knowledge of specific lens parameters are presented and compared to a calibration based on thin-lens assumptions. The first method, volumetric dewarping, is executed by creation of a volumetric representation of a scene using the thin-lens assumptions, which is then corrected in post-processing using a polynomial mapping function. The second method, direct light-field calibration, uses the polynomial mapping in creationmore » of the initial volumetric representation to relate locations in object space directly to image sensor locations. The accuracy and feasibility of these methods is examined experimentally by capturing images of a known dot card at a variety of depths. Results suggest that use of a 3D polynomial mapping function provides a significant increase in reconstruction accuracy and that the achievable accuracy is similar using either polynomial-mapping-based method. Additionally, direct light-field calibration provides significant computational benefits by eliminating some intermediate processing steps found in other methods. Finally, the flexibility of this method is shown for a nonplanar calibration.« less
Volumetric calibration of a plenoptic camera
Hall, Elise Munz; Fahringer, Timothy W.; Guildenbecher, Daniel Robert; ...
2018-02-01
Here, the volumetric calibration of a plenoptic camera is explored to correct for inaccuracies due to real-world lens distortions and thin-lens assumptions in current processing methods. Two methods of volumetric calibration based on a polynomial mapping function that does not require knowledge of specific lens parameters are presented and compared to a calibration based on thin-lens assumptions. The first method, volumetric dewarping, is executed by creation of a volumetric representation of a scene using the thin-lens assumptions, which is then corrected in post-processing using a polynomial mapping function. The second method, direct light-field calibration, uses the polynomial mapping in creationmore » of the initial volumetric representation to relate locations in object space directly to image sensor locations. The accuracy and feasibility of these methods is examined experimentally by capturing images of a known dot card at a variety of depths. Results suggest that use of a 3D polynomial mapping function provides a significant increase in reconstruction accuracy and that the achievable accuracy is similar using either polynomial-mapping-based method. Additionally, direct light-field calibration provides significant computational benefits by eliminating some intermediate processing steps found in other methods. Finally, the flexibility of this method is shown for a nonplanar calibration.« less
Simultaneous Spectral-Spatial Feature Selection and Extraction for Hyperspectral Images.
Zhang, Lefei; Zhang, Qian; Du, Bo; Huang, Xin; Tang, Yuan Yan; Tao, Dacheng
2018-01-01
In hyperspectral remote sensing data mining, it is important to take into account of both spectral and spatial information, such as the spectral signature, texture feature, and morphological property, to improve the performances, e.g., the image classification accuracy. In a feature representation point of view, a nature approach to handle this situation is to concatenate the spectral and spatial features into a single but high dimensional vector and then apply a certain dimension reduction technique directly on that concatenated vector before feed it into the subsequent classifier. However, multiple features from various domains definitely have different physical meanings and statistical properties, and thus such concatenation has not efficiently explore the complementary properties among different features, which should benefit for boost the feature discriminability. Furthermore, it is also difficult to interpret the transformed results of the concatenated vector. Consequently, finding a physically meaningful consensus low dimensional feature representation of original multiple features is still a challenging task. In order to address these issues, we propose a novel feature learning framework, i.e., the simultaneous spectral-spatial feature selection and extraction algorithm, for hyperspectral images spectral-spatial feature representation and classification. Specifically, the proposed method learns a latent low dimensional subspace by projecting the spectral-spatial feature into a common feature space, where the complementary information has been effectively exploited, and simultaneously, only the most significant original features have been transformed. Encouraging experimental results on three public available hyperspectral remote sensing datasets confirm that our proposed method is effective and efficient.
Finding Imaging Patterns of Structural Covariance via Non-Negative Matrix Factorization
Sotiras, Aristeidis; Resnick, Susan M.; Davatzikos, Christos
2015-01-01
In this paper, we investigate the use of Non-Negative Matrix Factorization (NNMF) for the analysis of structural neuroimaging data. The goal is to identify the brain regions that co-vary across individuals in a consistent way, hence potentially being part of underlying brain networks or otherwise influenced by underlying common mechanisms such as genetics and pathologies. NNMF offers a directly data-driven way of extracting relatively localized co-varying structural regions, thereby transcending limitations of Principal Component Analysis (PCA), Independent Component Analysis (ICA) and other related methods that tend to produce dispersed components of positive and negative loadings. In particular, leveraging upon the well known ability of NNMF to produce parts-based representations of image data, we derive decompositions that partition the brain into regions that vary in consistent ways across individuals. Importantly, these decompositions achieve dimensionality reduction via highly interpretable ways and generalize well to new data as shown via split-sample experiments. We empirically validate NNMF in two data sets: i) a Diffusion Tensor (DT) mouse brain development study, and ii) a structural Magnetic Resonance (sMR) study of human brain aging. We demonstrate the ability of NNMF to produce sparse parts-based representations of the data at various resolutions. These representations seem to follow what we know about the underlying functional organization of the brain and also capture some pathological processes. Moreover, we show that these low dimensional representations favorably compare to descriptions obtained with more commonly used matrix factorization methods like PCA and ICA. PMID:25497684
Ugurlu, Devran; Firat, Zeynep; Türe, Uğur; Unal, Gozde
2018-05-01
Accurate digital representation of major white matter bundles in the brain is an important goal in neuroscience image computing since the representations can be used for surgical planning, intra-patient longitudinal analysis and inter-subject population connectivity studies. Reconstructing desired fiber bundles generally involves manual selection of regions of interest by an expert, which is subject to user bias and fatigue, hence an automation is desirable. To that end, we first present a novel anatomical representation based on Neighborhood Resolved Fiber Orientation Distributions (NRFOD) along the fibers. The resolved fiber orientations are obtained by generalized q-sampling imaging (GQI) and a subsequent diffusion decomposition method. A fiber-to-fiber distance measure between the proposed fiber representations is then used in a density-based clustering framework to select the clusters corresponding to the major pathways of interest. In addition, neuroanatomical priors are utilized to constrain the set of candidate fibers before density-based clustering. The proposed fiber clustering approach is exemplified on automation of the reconstruction of the major fiber pathways in the brainstem: corticospinal tract (CST); medial lemniscus (ML); middle cerebellar peduncle (MCP); inferior cerebellar peduncle (ICP); superior cerebellar peduncle (SCP). Experimental results on Human Connectome Project (HCP)'s publicly available "WU-Minn 500 Subjects + MEG2 dataset" and expert evaluations demonstrate the potential of the proposed fiber clustering method in brainstem white matter structure analysis. Copyright © 2018 Elsevier B.V. All rights reserved.
Multi-scale Gaussian representation and outline-learning based cell image segmentation.
Farhan, Muhammad; Ruusuvuori, Pekka; Emmenlauer, Mario; Rämö, Pauli; Dehio, Christoph; Yli-Harja, Olli
2013-01-01
High-throughput genome-wide screening to study gene-specific functions, e.g. for drug discovery, demands fast automated image analysis methods to assist in unraveling the full potential of such studies. Image segmentation is typically at the forefront of such analysis as the performance of the subsequent steps, for example, cell classification, cell tracking etc., often relies on the results of segmentation. We present a cell cytoplasm segmentation framework which first separates cell cytoplasm from image background using novel approach of image enhancement and coefficient of variation of multi-scale Gaussian scale-space representation. A novel outline-learning based classification method is developed using regularized logistic regression with embedded feature selection which classifies image pixels as outline/non-outline to give cytoplasm outlines. Refinement of the detected outlines to separate cells from each other is performed in a post-processing step where the nuclei segmentation is used as contextual information. We evaluate the proposed segmentation methodology using two challenging test cases, presenting images with completely different characteristics, with cells of varying size, shape, texture and degrees of overlap. The feature selection and classification framework for outline detection produces very simple sparse models which use only a small subset of the large, generic feature set, that is, only 7 and 5 features for the two cases. Quantitative comparison of the results for the two test cases against state-of-the-art methods show that our methodology outperforms them with an increase of 4-9% in segmentation accuracy with maximum accuracy of 93%. Finally, the results obtained for diverse datasets demonstrate that our framework not only produces accurate segmentation but also generalizes well to different segmentation tasks.
Contacts de langues et representations (Language Contacts and Representations).
ERIC Educational Resources Information Center
Matthey, Marinette, Ed.
1997-01-01
Essays on language contact and the image of language, entirely in French, include: "Representations 'du' contexte et representations 'en' contexte? Eleves et enseignants face a l'apprentissage de la langue" ("Representations 'of' Context or Representations 'in' Context? Students and Teachers Facing Language Learning" (Laurent…
Multi-modal Registration for Correlative Microscopy using Image Analogies
Cao, Tian; Zach, Christopher; Modla, Shannon; Powell, Debbie; Czymmek, Kirk; Niethammer, Marc
2014-01-01
Correlative microscopy is a methodology combining the functionality of light microscopy with the high resolution of electron microscopy and other microscopy technologies for the same biological specimen. In this paper, we propose an image registration method for correlative microscopy, which is challenging due to the distinct appearance of biological structures when imaged with different modalities. Our method is based on image analogies and allows to transform images of a given modality into the appearance-space of another modality. Hence, the registration between two different types of microscopy images can be transformed to a mono-modality image registration. We use a sparse representation model to obtain image analogies. The method makes use of corresponding image training patches of two different imaging modalities to learn a dictionary capturing appearance relations. We test our approach on backscattered electron (BSE) scanning electron microscopy (SEM)/confocal and transmission electron microscopy (TEM)/confocal images. We perform rigid, affine, and deformable registration via B-splines and show improvements over direct registration using both mutual information and sum of squared differences similarity measures to account for differences in image appearance. PMID:24387943
Image fusion via nonlocal sparse K-SVD dictionary learning.
Li, Ying; Li, Fangyi; Bai, Bendu; Shen, Qiang
2016-03-01
Image fusion aims to merge two or more images captured via various sensors of the same scene to construct a more informative image by integrating their details. Generally, such integration is achieved through the manipulation of the representations of the images concerned. Sparse representation plays an important role in the effective description of images, offering a great potential in a variety of image processing tasks, including image fusion. Supported by sparse representation, in this paper, an approach for image fusion by the use of a novel dictionary learning scheme is proposed. The nonlocal self-similarity property of the images is exploited, not only at the stage of learning the underlying description dictionary but during the process of image fusion. In particular, the property of nonlocal self-similarity is combined with the traditional sparse dictionary. This results in an improved learned dictionary, hereafter referred to as the nonlocal sparse K-SVD dictionary (where K-SVD stands for the K times singular value decomposition that is commonly used in the literature), and abbreviated to NL_SK_SVD. The performance of the NL_SK_SVD dictionary is applied for image fusion using simultaneous orthogonal matching pursuit. The proposed approach is evaluated with different types of images, and compared with a number of alternative image fusion techniques. The resultant superior fused images using the present approach demonstrates the efficacy of the NL_SK_SVD dictionary in sparse image representation.
Log-Gabor Energy Based Multimodal Medical Image Fusion in NSCT Domain
Yang, Yong; Tong, Song; Huang, Shuying; Lin, Pan
2014-01-01
Multimodal medical image fusion is a powerful tool in clinical applications such as noninvasive diagnosis, image-guided radiotherapy, and treatment planning. In this paper, a novel nonsubsampled Contourlet transform (NSCT) based method for multimodal medical image fusion is presented, which is approximately shift invariant and can effectively suppress the pseudo-Gibbs phenomena. The source medical images are initially transformed by NSCT followed by fusing low- and high-frequency components. The phase congruency that can provide a contrast and brightness-invariant representation is applied to fuse low-frequency coefficients, whereas the Log-Gabor energy that can efficiently determine the frequency coefficients from the clear and detail parts is employed to fuse the high-frequency coefficients. The proposed fusion method has been compared with the discrete wavelet transform (DWT), the fast discrete curvelet transform (FDCT), and the dual tree complex wavelet transform (DTCWT) based image fusion methods and other NSCT-based methods. Visually and quantitatively experimental results indicate that the proposed fusion method can obtain more effective and accurate fusion results of multimodal medical images than other algorithms. Further, the applicability of the proposed method has been testified by carrying out a clinical example on a woman affected with recurrent tumor images. PMID:25214889
Classification of Clouds in Satellite Imagery Using Adaptive Fuzzy Sparse Representation
Jin, Wei; Gong, Fei; Zeng, Xingbin; Fu, Randi
2016-01-01
Automatic cloud detection and classification using satellite cloud imagery have various meteorological applications such as weather forecasting and climate monitoring. Cloud pattern analysis is one of the research hotspots recently. Since satellites sense the clouds remotely from space, and different cloud types often overlap and convert into each other, there must be some fuzziness and uncertainty in satellite cloud imagery. Satellite observation is susceptible to noises, while traditional cloud classification methods are sensitive to noises and outliers; it is hard for traditional cloud classification methods to achieve reliable results. To deal with these problems, a satellite cloud classification method using adaptive fuzzy sparse representation-based classification (AFSRC) is proposed. Firstly, by defining adaptive parameters related to attenuation rate and critical membership, an improved fuzzy membership is introduced to accommodate the fuzziness and uncertainty of satellite cloud imagery; secondly, by effective combination of the improved fuzzy membership function and sparse representation-based classification (SRC), atoms in training dictionary are optimized; finally, an adaptive fuzzy sparse representation classifier for cloud classification is proposed. Experiment results on FY-2G satellite cloud image show that, the proposed method not only improves the accuracy of cloud classification, but also has strong stability and adaptability with high computational efficiency. PMID:27999261
Sparse representations via learned dictionaries for x-ray angiogram image denoising
NASA Astrophysics Data System (ADS)
Shang, Jingfan; Huang, Zhenghua; Li, Qian; Zhang, Tianxu
2018-03-01
X-ray angiogram image denoising is always an active research topic in the field of computer vision. In particular, the denoising performance of many existing methods had been greatly improved by the widely use of nonlocal similar patches. However, the only nonlocal self-similar (NSS) patch-based methods can be still be improved and extended. In this paper, we propose an image denoising model based on the sparsity of the NSS patches to obtain high denoising performance and high-quality image. In order to represent the sparsely NSS patches in every location of the image well and solve the image denoising model more efficiently, we obtain dictionaries as a global image prior by the K-SVD algorithm over the processing image; Then the single and effectively alternating directions method of multipliers (ADMM) method is used to solve the image denoising model. The results of widely synthetic experiments demonstrate that, owing to learned dictionaries by K-SVD algorithm, a sparsely augmented lagrangian image denoising (SALID) model, which perform effectively, obtains a state-of-the-art denoising performance and better high-quality images. Moreover, we also give some denoising results of clinical X-ray angiogram images.
Research on polarization imaging information parsing method
NASA Astrophysics Data System (ADS)
Yuan, Hongwu; Zhou, Pucheng; Wang, Xiaolong
2016-11-01
Polarization information parsing plays an important role in polarization imaging detection. This paper focus on the polarization information parsing method: Firstly, the general process of polarization information parsing is given, mainly including polarization image preprocessing, multiple polarization parameters calculation, polarization image fusion and polarization image tracking, etc.; And then the research achievements of the polarization information parsing method are presented, in terms of polarization image preprocessing, the polarization image registration method based on the maximum mutual information is designed. The experiment shows that this method can improve the precision of registration and be satisfied the need of polarization information parsing; In terms of multiple polarization parameters calculation, based on the omnidirectional polarization inversion model is built, a variety of polarization parameter images are obtained and the precision of inversion is to be improve obviously; In terms of polarization image fusion , using fuzzy integral and sparse representation, the multiple polarization parameters adaptive optimal fusion method is given, and the targets detection in complex scene is completed by using the clustering image segmentation algorithm based on fractal characters; In polarization image tracking, the average displacement polarization image characteristics of auxiliary particle filtering fusion tracking algorithm is put forward to achieve the smooth tracking of moving targets. Finally, the polarization information parsing method is applied to the polarization imaging detection of typical targets such as the camouflage target, the fog and latent fingerprints.
Positron emission imaging device and method of using the same
Bingham, Philip R.; Mullens, James Allen
2013-01-15
An imaging system and method of imaging are disclosed. The imaging system can include an external radiation source producing pairs of substantially simultaneous radiation emissions of a picturization emission and a verification emissions at an emission angle. The imaging system can also include a plurality of picturization sensors and at least one verification sensor for detecting the picturization and verification emissions, respectively. The imaging system also includes an object stage is arranged such that a picturization emission can pass through an object supported on said object stage before being detected by one of said plurality of picturization sensors. A coincidence system and a reconstruction system can also be included. The coincidence can receive information from the picturization and verification sensors and determine whether a detected picturization emission is direct radiation or scattered radiation. The reconstruction system can produce a multi-dimensional representation of an object imaged with the imaging system.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Paiton, Dylan M.; Kenyon, Garrett T.; Brumby, Steven P.
An approach to detecting objects in an image dataset may combine texture/color detection, shape/contour detection, and/or motion detection using sparse, generative, hierarchical models with lateral and top-down connections. A first independent representation of objects in an image dataset may be produced using a color/texture detection algorithm. A second independent representation of objects in the image dataset may be produced using a shape/contour detection algorithm. A third independent representation of objects in the image dataset may be produced using a motion detection algorithm. The first, second, and third independent representations may then be combined into a single coherent output using amore » combinatorial algorithm.« less
NASA Astrophysics Data System (ADS)
Díaz, Elkin; Arguello, Henry
2016-05-01
Urban ecosystem studies require monitoring, controlling and planning to analyze building density, urban density, urban planning, atmospheric modeling and land use. In urban planning, there are many methods for building height estimation using optical remote sensing images. These methods however, highly depend on sun illumination and cloud-free weather. In contrast, high resolution synthetic aperture radar provides images independent from daytime and weather conditions, although, these images rely on special hardware and expensive acquisition. Most of the biggest cities around the world have been photographed by Google street view under different conditions. Thus, thousands of images from the principal streets of a city can be accessed online. The availability of this and similar rich city imagery such as StreetSide from Microsoft, represents huge opportunities in computer vision because these images can be used as input in many applications such as 3D modeling, segmentation, recognition and stereo correspondence. This paper proposes a novel algorithm to estimate building heights using public Google Street-View imagery. The objective of this work is to obtain thousands of geo-referenced images from Google Street-View using a representational state transfer system, and estimate their average height using single view metrology. Furthermore, the resulting measurements and image metadata are used to derive a layer of heights in a Google map available online. The experimental results show that the proposed algorithm can estimate an accurate average building height map of thousands of images using Google Street-View Imagery of any city.
Wang, Zhengxia; Zhu, Xiaofeng; Adeli, Ehsan; Zhu, Yingying; Nie, Feiping; Munsell, Brent
2018-01-01
Graph-based transductive learning (GTL) is a powerful machine learning technique that is used when sufficient training data is not available. In particular, conventional GTL approaches first construct a fixed inter-subject relation graph that is based on similarities in voxel intensity values in the feature domain, which can then be used to propagate the known phenotype data (i.e., clinical scores and labels) from the training data to the testing data in the label domain. However, this type of graph is exclusively learned in the feature domain, and primarily due to outliers in the observed features, may not be optimal for label propagation in the label domain. To address this limitation, a progressive GTL (pGTL) method is proposed that gradually finds an intrinsic data representation that more accurately aligns imaging features with the phenotype data. In general, optimal feature-to-phenotype alignment is achieved using an iterative approach that: (1) refines inter-subject relationships observed in the feature domain by using the learned intrinsic data representation in the label domain, (2) updates the intrinsic data representation from the refined inter-subject relationships, and (3) verifies the intrinsic data representation on the training data to guarantee an optimal classification when applied to testing data. Additionally, the iterative approach is extended to multi-modal imaging data to further improve pGTL classification accuracy. Using Alzheimer’s disease and Parkinson’s disease study data, the classification accuracy of the proposed pGTL method is compared to several state-of-the-art classification methods, and the results show pGTL can more accurately identify subjects, even at different progression stages, in these two study data sets. PMID:28551556
Measurement of entanglement entropy in the two-dimensional Potts model using wavelet analysis.
Tomita, Yusuke
2018-05-01
A method is introduced to measure the entanglement entropy using a wavelet analysis. Using this method, the two-dimensional Haar wavelet transform of a configuration of Fortuin-Kasteleyn (FK) clusters is performed. The configuration represents a direct snapshot of spin-spin correlations since spin degrees of freedom are traced out in FK representation. A snapshot of FK clusters loses image information at each coarse-graining process by the wavelet transform. It is shown that the loss of image information measures the entanglement entropy in the Potts model.
Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis.
Suk, Heung-Il; Lee, Seong-Whan; Shen, Dinggang
2014-11-01
For the last decade, it has been shown that neuroimaging can be a potential tool for the diagnosis of Alzheimer's Disease (AD) and its prodromal stage, Mild Cognitive Impairment (MCI), and also fusion of different modalities can further provide the complementary information to enhance diagnostic accuracy. Here, we focus on the problems of both feature representation and fusion of multimodal information from Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET). To our best knowledge, the previous methods in the literature mostly used hand-crafted features such as cortical thickness, gray matter densities from MRI, or voxel intensities from PET, and then combined these multimodal features by simply concatenating into a long vector or transforming into a higher-dimensional kernel space. In this paper, we propose a novel method for a high-level latent and shared feature representation from neuroimaging modalities via deep learning. Specifically, we use Deep Boltzmann Machine (DBM)(2), a deep network with a restricted Boltzmann machine as a building block, to find a latent hierarchical feature representation from a 3D patch, and then devise a systematic method for a joint feature representation from the paired patches of MRI and PET with a multimodal DBM. To validate the effectiveness of the proposed method, we performed experiments on ADNI dataset and compared with the state-of-the-art methods. In three binary classification problems of AD vs. healthy Normal Control (NC), MCI vs. NC, and MCI converter vs. MCI non-converter, we obtained the maximal accuracies of 95.35%, 85.67%, and 74.58%, respectively, outperforming the competing methods. By visual inspection of the trained model, we observed that the proposed method could hierarchically discover the complex latent patterns inherent in both MRI and PET. Copyright © 2014 Elsevier Inc. All rights reserved.
Hierarchical Feature Representation and Multimodal Fusion with Deep Learning for AD/MCI Diagnosis
Suk, Heung-Il; Lee, Seong-Whan; Shen, Dinggang
2014-01-01
For the last decade, it has been shown that neuroimaging can be a potential tool for the diagnosis of Alzheimer’s Disease (AD) and its prodromal stage, Mild Cognitive Impairment (MCI), and also fusion of different modalities can further provide the complementary information to enhance diagnostic accuracy. Here, we focus on the problems of both feature representation and fusion of multimodal information from Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET). To our best knowledge, the previous methods in the literature mostly used hand-crafted features such as cortical thickness, gray matter densities from MRI, or voxel intensities from PET, and then combined these multimodal features by simply concatenating into a long vector or transforming into a higher-dimensional kernel space. In this paper, we propose a novel method for a high-level latent and shared feature representation from neuroimaging modalities via deep learning. Specifically, we use Deep Boltzmann Machine (DBM)1, a deep network with a restricted Boltzmann machine as a building block, to find a latent hierarchical feature representation from a 3D patch, and then devise a systematic method for a joint feature representation from the paired patches of MRI and PET with a multimodal DBM. To validate the effectiveness of the proposed method, we performed experiments on ADNI dataset and compared with the state-of-the-art methods. In three binary classification problems of AD vs. healthy Normal Control (NC), MCI vs. NC, and MCI converter vs. MCI non-converter, we obtained the maximal accuracies of 95.35%, 85.67%, and 74.58%, respectively, outperforming the competing methods. By visual inspection of the trained model, we observed that the proposed method could hierarchically discover the complex latent patterns inherent in both MRI and PET. PMID:25042445
Zakeri, Fahimeh Sadat; Setarehdan, Seyed Kamaledin; Norouzi, Somayye
2017-10-01
Segmentation of the arterial wall boundaries from intravascular ultrasound images is an important image processing task in order to quantify arterial wall characteristics such as shape, area, thickness and eccentricity. Since manual segmentation of these boundaries is a laborious and time consuming procedure, many researchers attempted to develop (semi-) automatic segmentation techniques as a powerful tool for educational and clinical purposes in the past but as yet there is no any clinically approved method in the market. This paper presents a deterministic-statistical strategy for automatic media-adventitia border detection by a fourfold algorithm. First, a smoothed initial contour is extracted based on the classification in the sparse representation framework which is combined with the dynamic directional convolution vector field. Next, an active contour model is utilized for the propagation of the initial contour toward the interested borders. Finally, the extracted contour is refined in the leakage, side branch openings and calcification regions based on the image texture patterns. The performance of the proposed algorithm is evaluated by comparing the results to those manually traced borders by an expert on 312 different IVUS images obtained from four different patients. The statistical analysis of the results demonstrates the efficiency of the proposed method in the media-adventitia border detection with enough consistency in the leakage and calcification regions. Copyright © 2017 Elsevier Ltd. All rights reserved.
Multiclass Data Segmentation Using Diffuse Interface Methods on Graphs
2014-01-01
interac- tive image segmentation using the solution to a combinatorial Dirichlet problem. Elmoataz et al . have developed general- izations of the graph...Laplacian [25] for image denoising and manifold smoothing. Couprie et al . in [18] define a conve- niently parameterized graph-based energy function that...over to the discrete graph representation. For general data segmentation, Bresson et al . in [8], present rigorous convergence results for two algorithms
McKillop, Chris; Parsons, Janet A; Brown, Janet; Scott, Susan; Holness, D Linn
2016-09-27
Immigrant workers who are new to Canada are considered a vulnerable population under the Ontario Ministry of Labour Prevention Strategy for workplace safety. Posters outlining workplace safety rights and responsibilities may not be understandable to new immigrants. To explore visual approaches to making health and safety messages more understandable to new immigrants. This pilot study used arts-based qualitative research methods. Key messages from the Ministry of Labour Health & Safety at Work poster were (re)represented as images by an artist. Recent immigrants engaged in individual interviews and then took part in a focus group, in order to elicit their experiences of health and safety practices, their understanding and feedback concerning the Ministry poster, and the images created. An image-rich version of the poster was developed. The combination of drawings and minimal text was preferred and considered helpful by participants. Barriers to health and safety and work challenges for new immigrants were highlighted. Visual analysis yielded new versions of the poster, as well as a pictorial representation of the research process and study findings. The study demonstrates the value of using image-rich posters with immigrant workers, and the effectiveness of using arts-based methods within the research process.
Speckle noise reduction for optical coherence tomography based on adaptive 2D dictionary
NASA Astrophysics Data System (ADS)
Lv, Hongli; Fu, Shujun; Zhang, Caiming; Zhai, Lin
2018-05-01
As a high-resolution biomedical imaging modality, optical coherence tomography (OCT) is widely used in medical sciences. However, OCT images often suffer from speckle noise, which can mask some important image information, and thus reduce the accuracy of clinical diagnosis. Taking full advantage of nonlocal self-similarity and adaptive 2D-dictionary-based sparse representation, in this work, a speckle noise reduction algorithm is proposed for despeckling OCT images. To reduce speckle noise while preserving local image features, similar nonlocal patches are first extracted from the noisy image and put into groups using a gamma- distribution-based block matching method. An adaptive 2D dictionary is then learned for each patch group. Unlike traditional vector-based sparse coding, we express each image patch by the linear combination of a few matrices. This image-to-matrix method can exploit the local correlation between pixels. Since each image patch might belong to several groups, the despeckled OCT image is finally obtained by aggregating all filtered image patches. The experimental results demonstrate the superior performance of the proposed method over other state-of-the-art despeckling methods, in terms of objective metrics and visual inspection.
NASA Astrophysics Data System (ADS)
Jin, Dakai; Lu, Jia; Zhang, Xiaoliu; Chen, Cheng; Bai, ErWei; Saha, Punam K.
2017-03-01
Osteoporosis is associated with increased fracture risk. Recent advancement in the area of in vivo imaging allows segmentation of trabecular bone (TB) microstructures, which is a known key determinant of bone strength and fracture risk. An accurate biomechanical modelling of TB micro-architecture provides a comprehensive summary measure of bone strength and fracture risk. In this paper, a new direct TB biomechanical modelling method using nonlinear manifold-based volumetric reconstruction of trabecular network is presented. It is accomplished in two sequential modules. The first module reconstructs a nonlinear manifold-based volumetric representation of TB networks from three-dimensional digital images. Specifically, it starts with the fuzzy digital segmentation of a TB network, and computes its surface and curve skeletons. An individual trabecula is identified as a topological segment in the curve skeleton. Using geometric analysis, smoothing and optimization techniques, the algorithm generates smooth, curved, and continuous representations of individual trabeculae glued at their junctions. Also, the method generates a geometrically consistent TB volume at junctions. In the second module, a direct computational biomechanical stress-strain analysis is applied on the reconstructed TB volume to predict mechanical measures. The accuracy of the method was examined using micro-CT imaging of cadaveric distal tibia specimens (N = 12). A high linear correlation (r = 0.95) between TB volume computed using the new manifold-modelling algorithm and that directly derived from the voxel-based micro-CT images was observed. Young's modulus (YM) was computed using direct mechanical analysis on the TB manifold-model over a cubical volume of interest (VOI), and its correlation with the YM, computed using micro-CT based conventional finite-element analysis over the same VOI, was examined. A moderate linear correlation (r = 0.77) was observed between the two YM measures. This preliminary results show the accuracy of the new nonlinear manifold modelling algorithm for TB, and demonstrate the feasibility of a new direct mechanical strain-strain analysis on a nonlinear manifold model of a highly complex biological structure.
Tirado-Ramos, Alfredo; Hu, Jingkun; Lee, K.P.
2002-01-01
Supplement 23 to DICOM (Digital Imaging and Communications for Medicine), Structured Reporting, is a specification that supports a semantically rich representation of image and waveform content, enabling experts to share image and related patient information. DICOM SR supports the representation of textual and coded data linked to images and waveforms. Nevertheless, the medical information technology community needs models that work as bridges between the DICOM relational model and open object-oriented technologies. The authors assert that representations of the DICOM Structured Reporting standard, using object-oriented modeling languages such as the Unified Modeling Language, can provide a high-level reference view of the semantically rich framework of DICOM and its complex structures. They have produced an object-oriented model to represent the DICOM SR standard and have derived XML-exchangeable representations of this model using World Wide Web Consortium specifications. They expect the model to benefit developers and system architects who are interested in developing applications that are compliant with the DICOM SR specification. PMID:11751804
Yuan, Yuan; Lin, Jianzhe; Wang, Qi
2016-12-01
Hyperspectral image (HSI) classification is a crucial issue in remote sensing. Accurate classification benefits a large number of applications such as land use analysis and marine resource utilization. But high data correlation brings difficulty to reliable classification, especially for HSI with abundant spectral information. Furthermore, the traditional methods often fail to well consider the spatial coherency of HSI that also limits the classification performance. To address these inherent obstacles, a novel spectral-spatial classification scheme is proposed in this paper. The proposed method mainly focuses on multitask joint sparse representation (MJSR) and a stepwise Markov random filed framework, which are claimed to be two main contributions in this procedure. First, the MJSR not only reduces the spectral redundancy, but also retains necessary correlation in spectral field during classification. Second, the stepwise optimization further explores the spatial correlation that significantly enhances the classification accuracy and robustness. As far as several universal quality evaluation indexes are concerned, the experimental results on Indian Pines and Pavia University demonstrate the superiority of our method compared with the state-of-the-art competitors.
NASA Astrophysics Data System (ADS)
Fan, Jiayuan; Tan, Hui Li; Toomik, Maria; Lu, Shijian
2016-10-01
Spatial pyramid matching has demonstrated its power for image recognition task by pooling features from spatially increasingly fine sub-regions. Motivated by the concept of feature pooling at multiple pyramid levels, we propose a novel spectral-spatial hyperspectral image classification approach using superpixel-based spatial pyramid representation. This technique first generates multiple superpixel maps by decreasing the superpixel number gradually along with the increased spatial regions for labelled samples. By using every superpixel map, sparse representation of pixels within every spatial region is then computed through local max pooling. Finally, features learned from training samples are aggregated and trained by a support vector machine (SVM) classifier. The proposed spectral-spatial hyperspectral image classification technique has been evaluated on two public hyperspectral datasets, including the Indian Pines image containing 16 different agricultural scene categories with a 20m resolution acquired by AVIRIS and the University of Pavia image containing 9 land-use categories with a 1.3m spatial resolution acquired by the ROSIS-03 sensor. Experimental results show significantly improved performance compared with the state-of-the-art works. The major contributions of this proposed technique include (1) a new spectral-spatial classification approach to generate feature representation for hyperspectral image, (2) a complementary yet effective feature pooling approach, i.e. the superpixel-based spatial pyramid representation that is used for the spatial correlation study, (3) evaluation on two public hyperspectral image datasets with superior image classification performance.
NASA Astrophysics Data System (ADS)
Megherbi, Dalila B.; Lodhi, S. M.; Boulenouar, A. J.
2001-03-01
This work is in the field of automated document processing. This work addresses the problem of representation and recognition of Urdu characters using Fourier representation and a Neural Network architecture. In particular, we show that a two-stage Neural Network scheme is used here to make classification of 36 Urdu characters into seven sub-classes namely subclasses characterized by seven proposed and defined fuzzy features specifically related to Urdu characters. We show that here Fourier Descriptors and Neural Network provide a remarkably simple way to draw definite conclusions from vague, ambiguous, noisy or imprecise information. In particular, we illustrate the concept of interest regions and describe a framing method that provides a way to make the proposed technique for Urdu characters recognition robust and invariant to scaling and translation. We also show that a given character rotation is dealt with by using the Hotelling transform. This transform is based upon the eigenvalue decomposition of the covariance matrix of an image, providing a method of determining the orientation of the major axis of an object within an image. Finally experimental results are presented to show the power and robustness of the proposed two-stage Neural Network based technique for Urdu character recognition, its fault tolerance, and high recognition accuracy.
An automatic indexing method for medical documents.
Wagner, M. M.
1991-01-01
This paper describes MetaIndex, an automatic indexing program that creates symbolic representations of documents for the purpose of document retrieval. MetaIndex uses a simple transition network parser to recognize a language that is derived from the set of main concepts in the Unified Medical Language System Metathesaurus (Meta-1). MetaIndex uses a hierarchy of medical concepts, also derived from Meta-1, to represent the content of documents. The goal of this approach is to improve document retrieval performance by better representation of documents. An evaluation method is described, and the performance of MetaIndex on the task of indexing the Slice of Life medical image collection is reported. PMID:1807564
The receptive field is dead. Long live the receptive field?
Fairhall, Adrienne
2014-01-01
Advances in experimental techniques, including behavioral paradigms using rich stimuli under closed loop conditions and the interfacing of neural systems with external inputs and outputs, reveal complex dynamics in the neural code and require a revisiting of standard concepts of representation. High-throughput recording and imaging methods along with the ability to observe and control neuronal subpopulations allow increasingly detailed access to the neural circuitry that subserves these representations and the computations they support. How do we harness theory to build biologically grounded models of complex neural function? PMID:24618227
Wang, Bigong; Li, Liang
2015-01-01
As an implementation of compressive sensing (CS), dual-dictionary learning (DDL) method provides an ideal access to restore signals of two related dictionaries and sparse representation. It has been proven that this method performs well in medical image reconstruction with highly undersampled data, especially for multimodality imaging like CT-MRI hybrid reconstruction. Because of its outstanding strength, short signal acquisition time, and low radiation dose, DDL has allured a broad interest in both academic and industrial fields. Here in this review article, we summarize DDL's development history, conclude the latest advance, and also discuss its role in the future directions and potential applications in medical imaging. Meanwhile, this paper points out that DDL is still in the initial stage, and it is necessary to make further studies to improve this method, especially in dictionary training.
Recent Development of Dual-Dictionary Learning Approach in Medical Image Analysis and Reconstruction
Wang, Bigong; Li, Liang
2015-01-01
As an implementation of compressive sensing (CS), dual-dictionary learning (DDL) method provides an ideal access to restore signals of two related dictionaries and sparse representation. It has been proven that this method performs well in medical image reconstruction with highly undersampled data, especially for multimodality imaging like CT-MRI hybrid reconstruction. Because of its outstanding strength, short signal acquisition time, and low radiation dose, DDL has allured a broad interest in both academic and industrial fields. Here in this review article, we summarize DDL's development history, conclude the latest advance, and also discuss its role in the future directions and potential applications in medical imaging. Meanwhile, this paper points out that DDL is still in the initial stage, and it is necessary to make further studies to improve this method, especially in dictionary training. PMID:26089956
Cruz-Roa, Angel; Gilmore, Hannah; Basavanhally, Ajay; Feldman, Michael; Ganesan, Shridar; Shih, Natalie; Tomaszewski, John; Madabhushi, Anant; González, Fabio
2018-01-01
Precise detection of invasive cancer on whole-slide images (WSI) is a critical first step in digital pathology tasks of diagnosis and grading. Convolutional neural network (CNN) is the most popular representation learning method for computer vision tasks, which have been successfully applied in digital pathology, including tumor and mitosis detection. However, CNNs are typically only tenable with relatively small image sizes (200 × 200 pixels). Only recently, Fully convolutional networks (FCN) are able to deal with larger image sizes (500 × 500 pixels) for semantic segmentation. Hence, the direct application of CNNs to WSI is not computationally feasible because for a WSI, a CNN would require billions or trillions of parameters. To alleviate this issue, this paper presents a novel method, High-throughput Adaptive Sampling for whole-slide Histopathology Image analysis (HASHI), which involves: i) a new efficient adaptive sampling method based on probability gradient and quasi-Monte Carlo sampling, and, ii) a powerful representation learning classifier based on CNNs. We applied HASHI to automated detection of invasive breast cancer on WSI. HASHI was trained and validated using three different data cohorts involving near 500 cases and then independently tested on 195 studies from The Cancer Genome Atlas. The results show that (1) the adaptive sampling method is an effective strategy to deal with WSI without compromising prediction accuracy by obtaining comparative results of a dense sampling (∼6 million of samples in 24 hours) with far fewer samples (∼2,000 samples in 1 minute), and (2) on an independent test dataset, HASHI is effective and robust to data from multiple sites, scanners, and platforms, achieving an average Dice coefficient of 76%.
Gilmore, Hannah; Basavanhally, Ajay; Feldman, Michael; Ganesan, Shridar; Shih, Natalie; Tomaszewski, John; Madabhushi, Anant; González, Fabio
2018-01-01
Precise detection of invasive cancer on whole-slide images (WSI) is a critical first step in digital pathology tasks of diagnosis and grading. Convolutional neural network (CNN) is the most popular representation learning method for computer vision tasks, which have been successfully applied in digital pathology, including tumor and mitosis detection. However, CNNs are typically only tenable with relatively small image sizes (200 × 200 pixels). Only recently, Fully convolutional networks (FCN) are able to deal with larger image sizes (500 × 500 pixels) for semantic segmentation. Hence, the direct application of CNNs to WSI is not computationally feasible because for a WSI, a CNN would require billions or trillions of parameters. To alleviate this issue, this paper presents a novel method, High-throughput Adaptive Sampling for whole-slide Histopathology Image analysis (HASHI), which involves: i) a new efficient adaptive sampling method based on probability gradient and quasi-Monte Carlo sampling, and, ii) a powerful representation learning classifier based on CNNs. We applied HASHI to automated detection of invasive breast cancer on WSI. HASHI was trained and validated using three different data cohorts involving near 500 cases and then independently tested on 195 studies from The Cancer Genome Atlas. The results show that (1) the adaptive sampling method is an effective strategy to deal with WSI without compromising prediction accuracy by obtaining comparative results of a dense sampling (∼6 million of samples in 24 hours) with far fewer samples (∼2,000 samples in 1 minute), and (2) on an independent test dataset, HASHI is effective and robust to data from multiple sites, scanners, and platforms, achieving an average Dice coefficient of 76%. PMID:29795581
Psychophysical Reverse Correlation with Multiple Response Alternatives
ERIC Educational Resources Information Center
Dai, Huanping; Micheyl, Christophe
2010-01-01
Psychophysical reverse-correlation methods such as the "classification image" technique provide a unique tool to uncover the internal representations and decision strategies of individual participants in perceptual tasks. Over the past 30 years, these techniques have gained increasing popularity among both visual and auditory psychophysicists.…
Navigation domain representation for interactive multiview imaging.
Maugey, Thomas; Daribo, Ismael; Cheung, Gene; Frossard, Pascal
2013-09-01
Enabling users to interactively navigate through different viewpoints of a static scene is a new interesting functionality in 3D streaming systems. While it opens exciting perspectives toward rich multimedia applications, it requires the design of novel representations and coding techniques to solve the new challenges imposed by the interactive navigation. In particular, the encoder must prepare a priori a compressed media stream that is flexible enough to enable the free selection of multiview navigation paths by different streaming media clients. Interactivity clearly brings new design constraints: the encoder is unaware of the exact decoding process, while the decoder has to reconstruct information from incomplete subsets of data since the server generally cannot transmit images for all possible viewpoints due to resource constrains. In this paper, we propose a novel multiview data representation that permits us to satisfy bandwidth and storage constraints in an interactive multiview streaming system. In particular, we partition the multiview navigation domain into segments, each of which is described by a reference image (color and depth data) and some auxiliary information. The auxiliary information enables the client to recreate any viewpoint in the navigation segment via view synthesis. The decoder is then able to navigate freely in the segment without further data request to the server; it requests additional data only when it moves to a different segment. We discuss the benefits of this novel representation in interactive navigation systems and further propose a method to optimize the partitioning of the navigation domain into independent segments, under bandwidth and storage constraints. Experimental results confirm the potential of the proposed representation; namely, our system leads to similar compression performance as classical inter-view coding, while it provides the high level of flexibility that is required for interactive streaming. Because of these unique properties, our new framework represents a promising solution for 3D data representation in novel interactive multimedia services.
Gurunathan, Rajalakshmi; Van Emden, Bernard; Panchanathan, Sethuraman; Kumar, Sudhir
2004-01-01
Background Modern developmental biology relies heavily on the analysis of embryonic gene expression patterns. Investigators manually inspect hundreds or thousands of expression patterns to identify those that are spatially similar and to ultimately infer potential gene interactions. However, the rapid accumulation of gene expression pattern data over the last two decades, facilitated by high-throughput techniques, has produced a need for the development of efficient approaches for direct comparison of images, rather than their textual descriptions, to identify spatially similar expression patterns. Results The effectiveness of the Binary Feature Vector (BFV) and Invariant Moment Vector (IMV) based digital representations of the gene expression patterns in finding biologically meaningful patterns was compared for a small (226 images) and a large (1819 images) dataset. For each dataset, an ordered list of images, with respect to a query image, was generated to identify overlapping and similar gene expression patterns, in a manner comparable to what a developmental biologist might do. The results showed that the BFV representation consistently outperforms the IMV representation in finding biologically meaningful matches when spatial overlap of the gene expression pattern and the genes involved are considered. Furthermore, we explored the value of conducting image-content based searches in a dataset where individual expression components (or domains) of multi-domain expression patterns were also included separately. We found that this technique improves performance of both IMV and BFV based searches. Conclusions We conclude that the BFV representation consistently produces a more extensive and better list of biologically useful patterns than the IMV representation. The high quality of results obtained scales well as the search database becomes larger, which encourages efforts to build automated image query and retrieval systems for spatial gene expression patterns. PMID:15603586
Using image mapping towards biomedical and biological data sharing
2013-01-01
Image-based data integration in eHealth and life sciences is typically concerned with the method used for anatomical space mapping, needed to retrieve, compare and analyse large volumes of biomedical data. In mapping one image onto another image, a mechanism is used to match and find the corresponding spatial regions which have the same meaning between the source and the matching image. Image-based data integration is useful for integrating data of various information structures. Here we discuss a broad range of issues related to data integration of various information structures, review exemplary work on image representation and mapping, and discuss the challenges that these techniques may bring. PMID:24059352
The Thinker versus a Quilting Bee: Contrasting Images.
ERIC Educational Resources Information Center
Thayer-Bacon, Barbara J.
1999-01-01
Offers the image of the quilting bee as a contrasting representation of critical thinking (or constructive thinking), comparing the two images, discussing a quilting bee representation of knowledge construction in terms of the tools used by quilters (knowers), and summarizing the transformation of critical thinking theory that a quilting bee image…
Beyond sensory images: Object-based representation in the human ventral pathway
Pietrini, Pietro; Furey, Maura L.; Ricciardi, Emiliano; Gobbini, M. Ida; Wu, W.-H. Carolyn; Cohen, Leonardo; Guazzelli, Mario; Haxby, James V.
2004-01-01
We investigated whether the topographically organized, category-related patterns of neural response in the ventral visual pathway are a representation of sensory images or a more abstract representation of object form that is not dependent on sensory modality. We used functional MRI to measure patterns of response evoked during visual and tactile recognition of faces and manmade objects in sighted subjects and during tactile recognition in blind subjects. Results showed that visual and tactile recognition evoked category-related patterns of response in a ventral extrastriate visual area in the inferior temporal gyrus that were correlated across modality for manmade objects. Blind subjects also demonstrated category-related patterns of response in this “visual” area, and in more ventral cortical regions in the fusiform gyrus, indicating that these patterns are not due to visual imagery and, furthermore, that visual experience is not necessary for category-related representations to develop in these cortices. These results demonstrate that the representation of objects in the ventral visual pathway is not simply a representation of visual images but, rather, is a representation of more abstract features of object form. PMID:15064396
ERIC Educational Resources Information Center
Ritzhaupt, Albert Dieter; Barron, Ann
2008-01-01
The purpose of this study was to investigate the effect of time-compressed narration and representational adjunct images on a learner's ability to recall and recognize information. The experiment was a 4 Audio Speeds (1.0 = normal vs. 1.5 = moderate vs. 2.0 = fast vs. 2.5 = fastest rate) x Adjunct Image (Image Present vs. Image Absent) factorial…
ERIC Educational Resources Information Center
Mendick, Heather; Moreau, Marie-Pierre
2013-01-01
This paper looks at online representations of women and men in science, engineering and technology. We show that these representations largely re/produce dominant gender discourses. We then focus on the question: How are gender cliched images re/produced online? Drawing on a discursive analysis of data from six interviews with web authors, we…
ERIC Educational Resources Information Center
Halliwell, Emma; Malson, Helen; Tischner, Irmgard
2011-01-01
There has been a shift in the depiction of women in advertising from objectifying representations of women as passive sex objects to agentic sexual representations where the women appear powerful and in control (Gill, 2007a, 2008), and there is substantial evidence that these representations have a negative impact on women's body image. However,…
Conical Perspective Image of an Architectural Object Close to Human Perception
NASA Astrophysics Data System (ADS)
Dzwierzynska, Jolanta
2017-10-01
The aim of the study is to develop a method of computer aided constructing conical perspective of an architectural object, which is close to human perception. The conical perspective considered in the paper is a central projection onto a projection surface being a conical rotary surface or a fragment of it. Whereas, the centre of projection is a stationary point or a point moving on a circular path. The graphical mapping results of the perspective representation is realized directly on an unrolled flat projection surface. The projective relation between a range of points on a line and the perspective image of the same range of points received on a cylindrical projection surface permitted to derive formulas for drawing perspective. Next, the analytical algorithms for drawing perspective image of a straight line passing through any two points were formulated. It enabled drawing a perspective wireframe image of a given 3D object. The use of the moving view point as well as the application of the changeable base elements of perspective as the variables in the algorithms enable drawing conical perspective from different viewing positions. Due to this fact, the perspective drawing method is universal. The algorithms are formulated and tested in Mathcad Professional software, but can be implemented in AutoCAD and majority of computer graphical packages, which makes drawing a perspective image more efficient and easier. The presented conical perspective representation, and the convenient method of its mapping directly on the flat unrolled surface can find application for numerous advertisement and art presentations.
Reweighted mass center based object-oriented sparse subspace clustering for hyperspectral images
NASA Astrophysics Data System (ADS)
Zhai, Han; Zhang, Hongyan; Zhang, Liangpei; Li, Pingxiang
2016-10-01
Considering the inevitable obstacles faced by the pixel-based clustering methods, such as salt-and-pepper noise, high computational complexity, and the lack of spatial information, a reweighted mass center based object-oriented sparse subspace clustering (RMC-OOSSC) algorithm for hyperspectral images (HSIs) is proposed. First, the mean-shift segmentation method is utilized to oversegment the HSI to obtain meaningful objects. Second, a distance reweighted mass center learning model is presented to extract the representative and discriminative features for each object. Third, assuming that all the objects are sampled from a union of subspaces, it is natural to apply the SSC algorithm to the HSI. Faced with the high correlation among the hyperspectral objects, a weighting scheme is adopted to ensure that the highly correlated objects are preferred in the procedure of sparse representation, to reduce the representation errors. Two widely used hyperspectral datasets were utilized to test the performance of the proposed RMC-OOSSC algorithm, obtaining high clustering accuracies (overall accuracy) of 71.98% and 89.57%, respectively. The experimental results show that the proposed method clearly improves the clustering performance with respect to the other state-of-the-art clustering methods, and it significantly reduces the computational time.
Three-dimensional reconstruction of Roman coins from photometric image sets
NASA Astrophysics Data System (ADS)
MacDonald, Lindsay; Moitinho de Almeida, Vera; Hess, Mona
2017-01-01
A method is presented for increasing the spatial resolution of the three-dimensional (3-D) digital representation of coins by combining fine photometric detail derived from a set of photographic images with accurate geometric data from a 3-D laser scanner. 3-D reconstructions were made of the obverse and reverse sides of two ancient Roman denarii by processing sets of images captured under directional lighting in an illumination dome. Surface normal vectors were calculated by a "bounded regression" technique, excluding both shadow and specular components of reflection from the metallic surface. Because of the known difficulty in achieving geometric accuracy when integrating photometric normals to produce a digital elevation model, the low spatial frequencies were replaced by those derived from the point cloud produced by a 3-D laser scanner. The two datasets were scaled and registered by matching the outlines and correlating the surface gradients. The final result was a realistic rendering of the coins at a spatial resolution of 75 pixels/mm (13-μm spacing), in which the fine detail modulated the underlying geometric form of the surface relief. The method opens the way to obtain high quality 3-D representations of coins in collections to enable interactive online viewing.
Effective real-time vehicle tracking using discriminative sparse coding on local patches
NASA Astrophysics Data System (ADS)
Chen, XiangJun; Ye, Feiyue; Ruan, Yaduan; Chen, Qimei
2016-01-01
A visual tracking framework that provides an object detector and tracker, which focuses on effective and efficient visual tracking in surveillance of real-world intelligent transport system applications, is proposed. The framework casts the tracking task as problems of object detection, feature representation, and classification, which is different from appearance model-matching approaches. Through a feature representation of discriminative sparse coding on local patches called DSCLP, which trains a dictionary on local clustered patches sampled from both positive and negative datasets, the discriminative power and robustness has been improved remarkably, which makes our method more robust to a complex realistic setting with all kinds of degraded image quality. Moreover, by catching objects through one-time background subtraction, along with offline dictionary training, computation time is dramatically reduced, which enables our framework to achieve real-time tracking performance even in a high-definition sequence with heavy traffic. Experiment results show that our work outperforms some state-of-the-art methods in terms of speed, accuracy, and robustness and exhibits increased robustness in a complex real-world scenario with degraded image quality caused by vehicle occlusion, image blur of rain or fog, and change in viewpoint or scale.
Learning toward practical head pose estimation
NASA Astrophysics Data System (ADS)
Sang, Gaoli; He, Feixiang; Zhu, Rong; Xuan, Shibin
2017-08-01
Head pose is useful information for many face-related tasks, such as face recognition, behavior analysis, human-computer interfaces, etc. Existing head pose estimation methods usually assume that the face images have been well aligned or that sufficient and precise training data are available. In practical applications, however, these assumptions are very likely to be invalid. This paper first investigates the impact of the failure of these assumptions, i.e., misalignment of face images, uncertainty and undersampling of training data, on head pose estimation accuracy of state-of-the-art methods. A learning-based approach is then designed to enhance the robustness of head pose estimation to these factors. To cope with misalignment, instead of using hand-crafted features, it seeks suitable features by learning from a set of training data with a deep convolutional neural network (DCNN), such that the training data can be best classified into the correct head pose categories. To handle uncertainty and undersampling, it employs multivariate labeling distributions (MLDs) with dense sampling intervals to represent the head pose attributes of face images. The correlation between the features and the dense MLD representations of face images is approximated by a maximum entropy model, whose parameters are optimized on the given training data. To estimate the head pose of a face image, its MLD representation is first computed according to the model based on the features extracted from the image by the trained DCNN, and its head pose is then assumed to be the one corresponding to the peak in its MLD. Evaluation experiments on the Pointing'04, FacePix, Multi-PIE, and CASIA-PEAL databases prove the effectiveness and efficiency of the proposed method.
NASA Astrophysics Data System (ADS)
Behlim, Sadaf Iqbal; Syed, Tahir Qasim; Malik, Muhammad Yameen; Vigneron, Vincent
2016-11-01
Grouping image tokens is an intermediate step needed to arrive at meaningful image representation and summarization. Usually, perceptual cues, for instance, gestalt properties inform token grouping. However, they do not take into account structural continuities that could be derived from other tokens belonging to similar structures irrespective of their location. We propose an image representation that encodes structural constraints emerging from local binary patterns (LBP), which provides a long-distance measure of similarity but in a structurally connected way. Our representation provides a grouping of pixels or larger image tokens that is free of numeric similarity measures and could therefore be extended to nonmetric spaces. The representation lends itself nicely to ubiquitous image processing applications such as connected component labeling and segmentation. We test our proposed representation on the perceptual grouping or segmentation task on the popular Berkeley segmentation dataset (BSD500) that with respect to human segmented images achieves an average F-measure of 0.559. Our algorithm achieves a high average recall of 0.787 and is therefore well-suited to other applications such as object retrieval and category-independent object recognition. The proposed merging heuristic based on levels of singular tree component has shown promising results on the BSD500 dataset and currently ranks 12th among all benchmarked algorithms, but contrary to the others, it requires no data-driven training or specialized preprocessing.
A survey of infrared and visual image fusion methods
NASA Astrophysics Data System (ADS)
Jin, Xin; Jiang, Qian; Yao, Shaowen; Zhou, Dongming; Nie, Rencan; Hai, Jinjin; He, Kangjian
2017-09-01
Infrared (IR) and visual (VI) image fusion is designed to fuse multiple source images into a comprehensive image to boost imaging quality and reduce redundancy information, which is widely used in various imaging equipment to improve the visual ability of human and robot. The accurate, reliable and complementary descriptions of the scene in fused images make these techniques be widely used in various fields. In recent years, a large number of fusion methods for IR and VI images have been proposed due to the ever-growing demands and the progress of image representation methods; however, there has not been published an integrated survey paper about this field in last several years. Therefore, we make a survey to report the algorithmic developments of IR and VI image fusion. In this paper, we first characterize the IR and VI image fusion based applications to represent an overview of the research status. Then we present a synthesize survey of the state of the art. Thirdly, the frequently-used image fusion quality measures are introduced. Fourthly, we perform some experiments of typical methods and make corresponding analysis. At last, we summarize the corresponding tendencies and challenges in IR and VI image fusion. This survey concludes that although various IR and VI image fusion methods have been proposed, there still exist further improvements or potential research directions in different applications of IR and VI image fusion.
Sparks, Rachel; Madabhushi, Anant
2016-01-01
Content-based image retrieval (CBIR) retrieves database images most similar to the query image by (1) extracting quantitative image descriptors and (2) calculating similarity between database and query image descriptors. Recently, manifold learning (ML) has been used to perform CBIR in a low dimensional representation of the high dimensional image descriptor space to avoid the curse of dimensionality. ML schemes are computationally expensive, requiring an eigenvalue decomposition (EVD) for every new query image to learn its low dimensional representation. We present out-of-sample extrapolation utilizing semi-supervised ML (OSE-SSL) to learn the low dimensional representation without recomputing the EVD for each query image. OSE-SSL incorporates semantic information, partial class label, into a ML scheme such that the low dimensional representation co-localizes semantically similar images. In the context of prostate histopathology, gland morphology is an integral component of the Gleason score which enables discrimination between prostate cancer aggressiveness. Images are represented by shape features extracted from the prostate gland. CBIR with OSE-SSL for prostate histology obtained from 58 patient studies, yielded an area under the precision recall curve (AUPRC) of 0.53 ± 0.03 comparatively a CBIR with Principal Component Analysis (PCA) to learn a low dimensional space yielded an AUPRC of 0.44 ± 0.01. PMID:27264985
NASA Astrophysics Data System (ADS)
Shi, R.; Sun, Z.
2018-04-01
GF-3 synthetic aperture radar (SAR) images are rich in information and have obvious sparse features. However, the speckle appears in the GF-3 SAR images due to the coherent imaging system and it hinders the interpretation of images seriously. Recently, Shearlet is applied to the image processing with its best sparse representation. A new Shearlet-transform-based method is proposed in this paper based on the improved non-local means. Firstly, the logarithmic operation and the non-subsampled Shearlet transformation are applied to the GF-3 SAR image. Secondly, in order to solve the problems that the image details are smoothed overly and the weight distribution is affected by the speckle, a new non-local means is used for the transformed high frequency coefficient. Thirdly, the Shearlet reconstruction is carried out. Finally, the final filtered image is obtained by an exponential operation. Experimental results demonstrate that, compared with other despeckling methods, the proposed method can suppress the speckle effectively in homogeneous regions and has better capability of edge preserving.
High compression image and image sequence coding
NASA Technical Reports Server (NTRS)
Kunt, Murat
1989-01-01
The digital representation of an image requires a very large number of bits. This number is even larger for an image sequence. The goal of image coding is to reduce this number, as much as possible, and reconstruct a faithful duplicate of the original picture or image sequence. Early efforts in image coding, solely guided by information theory, led to a plethora of methods. The compression ratio reached a plateau around 10:1 a couple of years ago. Recent progress in the study of the brain mechanism of vision and scene analysis has opened new vistas in picture coding. Directional sensitivity of the neurones in the visual pathway combined with the separate processing of contours and textures has led to a new class of coding methods capable of achieving compression ratios as high as 100:1 for images and around 300:1 for image sequences. Recent progress on some of the main avenues of object-based methods is presented. These second generation techniques make use of contour-texture modeling, new results in neurophysiology and psychophysics and scene analysis.
2010-01-01
Background Change blindness refers to a failure to detect changes between consecutively presented images separated by, for example, a brief blank screen. As an explanation of change blindness, it has been suggested that our representations of the environment are sparse outside focal attention and even that changed features may not be represented at all. In order to find electrophysiological evidence of neural representations of changed features during change blindness, we recorded event-related potentials (ERPs) in adults in an oddball variant of the change blindness flicker paradigm. Methods ERPs were recorded when subjects performed a change detection task in which the modified images were infrequently interspersed (p = .2) among the frequently (p = .8) presented unmodified images. Responses to modified and unmodified images were compared in the time window of 60-100 ms after stimulus onset. Results ERPs to infrequent modified images were found to differ in amplitude from those to frequent unmodified images at the midline electrodes (Fz, Pz, Cz and Oz) at the latency of 60-100 ms even when subjects were unaware of changes (change blindness). Conclusions The results suggest that the brain registers changes very rapidly, and that changed features in images are neurally represented even without participants' ability to report them. PMID:20181126
Efficient space-time sampling with pixel-wise coded exposure for high-speed imaging.
Liu, Dengyu; Gu, Jinwei; Hitomi, Yasunobu; Gupta, Mohit; Mitsunaga, Tomoo; Nayar, Shree K
2014-02-01
Cameras face a fundamental trade-off between spatial and temporal resolution. Digital still cameras can capture images with high spatial resolution, but most high-speed video cameras have relatively low spatial resolution. It is hard to overcome this trade-off without incurring a significant increase in hardware costs. In this paper, we propose techniques for sampling, representing, and reconstructing the space-time volume to overcome this trade-off. Our approach has two important distinctions compared to previous works: 1) We achieve sparse representation of videos by learning an overcomplete dictionary on video patches, and 2) we adhere to practical hardware constraints on sampling schemes imposed by architectures of current image sensors, which means that our sampling function can be implemented on CMOS image sensors with modified control units in the future. We evaluate components of our approach, sampling function and sparse representation, by comparing them to several existing approaches. We also implement a prototype imaging system with pixel-wise coded exposure control using a liquid crystal on silicon device. System characteristics such as field of view and modulation transfer function are evaluated for our imaging system. Both simulations and experiments on a wide range of scenes show that our method can effectively reconstruct a video from a single coded image while maintaining high spatial resolution.
Cost-Sensitive Local Binary Feature Learning for Facial Age Estimation.
Lu, Jiwen; Liong, Venice Erin; Zhou, Jie
2015-12-01
In this paper, we propose a cost-sensitive local binary feature learning (CS-LBFL) method for facial age estimation. Unlike the conventional facial age estimation methods that employ hand-crafted descriptors or holistically learned descriptors for feature representation, our CS-LBFL method learns discriminative local features directly from raw pixels for face representation. Motivated by the fact that facial age estimation is a cost-sensitive computer vision problem and local binary features are more robust to illumination and expression variations than holistic features, we learn a series of hashing functions to project raw pixel values extracted from face patches into low-dimensional binary codes, where binary codes with similar chronological ages are projected as close as possible, and those with dissimilar chronological ages are projected as far as possible. Then, we pool and encode these local binary codes within each face image as a real-valued histogram feature for face representation. Moreover, we propose a cost-sensitive local binary multi-feature learning method to jointly learn multiple sets of hashing functions using face patches extracted from different scales to exploit complementary information. Our methods achieve competitive performance on four widely used face aging data sets.
Anderson, Andrew James; Bruni, Elia; Lopopolo, Alessandro; Poesio, Massimo; Baroni, Marco
2015-10-15
Embodiment theory predicts that mental imagery of object words recruits neural circuits involved in object perception. The degree of visual imagery present in routine thought and how it is encoded in the brain is largely unknown. We test whether fMRI activity patterns elicited by participants reading objects' names include embodied visual-object representations, and whether we can decode the representations using novel computational image-based semantic models. We first apply the image models in conjunction with text-based semantic models to test predictions of visual-specificity of semantic representations in different brain regions. Representational similarity analysis confirms that fMRI structure within ventral-temporal and lateral-occipital regions correlates most strongly with the image models and conversely text models correlate better with posterior-parietal/lateral-temporal/inferior-frontal regions. We use an unsupervised decoding algorithm that exploits commonalities in representational similarity structure found within both image model and brain data sets to classify embodied visual representations with high accuracy (8/10) and then extend it to exploit model combinations to robustly decode different brain regions in parallel. By capturing latent visual-semantic structure our models provide a route into analyzing neural representations derived from past perceptual experience rather than stimulus-driven brain activity. Our results also verify the benefit of combining multimodal data to model human-like semantic representations. Copyright © 2015 Elsevier Inc. All rights reserved.
Contour Tracking in Echocardiographic Sequences via Sparse Representation and Dictionary Learning
Huang, Xiaojie; Dione, Donald P.; Compas, Colin B.; Papademetris, Xenophon; Lin, Ben A.; Bregasi, Alda; Sinusas, Albert J.; Staib, Lawrence H.; Duncan, James S.
2013-01-01
This paper presents a dynamical appearance model based on sparse representation and dictionary learning for tracking both endocardial and epicardial contours of the left ventricle in echocardiographic sequences. Instead of learning offline spatiotemporal priors from databases, we exploit the inherent spatiotemporal coherence of individual data to constraint cardiac contour estimation. The contour tracker is initialized with a manual tracing of the first frame. It employs multiscale sparse representation of local image appearance and learns online multiscale appearance dictionaries in a boosting framework as the image sequence is segmented frame-by-frame sequentially. The weights of multiscale appearance dictionaries are optimized automatically. Our region-based level set segmentation integrates a spectrum of complementary multilevel information including intensity, multiscale local appearance, and dynamical shape prediction. The approach is validated on twenty-six 4D canine echocardiographic images acquired from both healthy and post-infarct canines. The segmentation results agree well with expert manual tracings. The ejection fraction estimates also show good agreement with manual results. Advantages of our approach are demonstrated by comparisons with a conventional pure intensity model, a registration-based contour tracker, and a state-of-the-art database-dependent offline dynamical shape model. We also demonstrate the feasibility of clinical application by applying the method to four 4D human data sets. PMID:24292554
Visual Research Methods: Image, Society, and Representation
ERIC Educational Resources Information Center
Stanczak, Gregory C., Ed.
2007-01-01
Visual research is reemerging across the social sciences as a significant, underutilized resource producing unique lines of inquiry and sparking innovative pedagogies. Stanczak's edited volume crisscrosses disciplines in ways that highlights the multiple manifestations of this newer interdisciplinary trend. As such, this volume will be useful as…
eHUGS: Enhanced Hierarchical Unbiased Graph Shrinkage for Efficient Groupwise Registration
Wu, Guorong; Peng, Xuewei; Ying, Shihui; Wang, Qian; Yap, Pew-Thian; Shen, Dan; Shen, Dinggang
2016-01-01
Effective and efficient spatial normalization of a large population of brain images is critical for many clinical and research studies, but it is technically very challenging. A commonly used approach is to choose a certain image as the template and then align all other images in the population to this template by applying pairwise registration. To avoid the potential bias induced by the inappropriate template selection, groupwise registration methods have been proposed to simultaneously register all images to a latent common space. However, current groupwise registration methods do not make full use of image distribution information for more accurate registration. In this paper, we present a novel groupwise registration method that harnesses the image distribution information by capturing the image distribution manifold using a hierarchical graph with its nodes representing the individual images. More specifically, a low-level graph describes the image distribution in each subgroup, and a high-level graph encodes the relationship between representative images of subgroups. Given the graph representation, we can register all images to the common space by dynamically shrinking the graph on the image manifold. The topology of the entire image distribution is always maintained during graph shrinkage. Evaluations on two datasets, one for 80 elderly individuals and one for 285 infants, indicate that our method can yield promising results. PMID:26800361
Using deep learning for content-based medical image retrieval
NASA Astrophysics Data System (ADS)
Sun, Qinpei; Yang, Yuanyuan; Sun, Jianyong; Yang, Zhiming; Zhang, Jianguo
2017-03-01
Content-Based medical image retrieval (CBMIR) is been highly active research area from past few years. The retrieval performance of a CBMIR system crucially depends on the feature representation, which have been extensively studied by researchers for decades. Although a variety of techniques have been proposed, it remains one of the most challenging problems in current CBMIR research, which is mainly due to the well-known "semantic gap" issue that exists between low-level image pixels captured by machines and high-level semantic concepts perceived by human[1]. Recent years have witnessed some important advances of new techniques in machine learning. One important breakthrough technique is known as "deep learning". Unlike conventional machine learning methods that are often using "shallow" architectures, deep learning mimics the human brain that is organized in a deep architecture and processes information through multiple stages of transformation and representation. This means that we do not need to spend enormous energy to extract features manually. In this presentation, we propose a novel framework which uses deep learning to retrieval the medical image to improve the accuracy and speed of a CBIR in integrated RIS/PACS.
A Study of Hand Back Skin Texture Patterns for Personal Identification and Gender Classification
Xie, Jin; Zhang, Lei; You, Jane; Zhang, David; Qu, Xiaofeng
2012-01-01
Human hand back skin texture (HBST) is often consistent for a person and distinctive from person to person. In this paper, we study the HBST pattern recognition problem with applications to personal identification and gender classification. A specially designed system is developed to capture HBST images, and an HBST image database was established, which consists of 1,920 images from 80 persons (160 hands). An efficient texton learning based method is then presented to classify the HBST patterns. First, textons are learned in the space of filter bank responses from a set of training images using the l1 -minimization based sparse representation (SR) technique. Then, under the SR framework, we represent the feature vector at each pixel over the learned dictionary to construct a representation coefficient histogram. Finally, the coefficient histogram is used as skin texture feature for classification. Experiments on personal identification and gender classification are performed by using the established HBST database. The results show that HBST can be used to assist human identification and gender classification. PMID:23012512
Teng, Santani
2017-01-01
In natural environments, visual and auditory stimulation elicit responses across a large set of brain regions in a fraction of a second, yielding representations of the multimodal scene and its properties. The rapid and complex neural dynamics underlying visual and auditory information processing pose major challenges to human cognitive neuroscience. Brain signals measured non-invasively are inherently noisy, the format of neural representations is unknown, and transformations between representations are complex and often nonlinear. Further, no single non-invasive brain measurement technique provides a spatio-temporally integrated view. In this opinion piece, we argue that progress can be made by a concerted effort based on three pillars of recent methodological development: (i) sensitive analysis techniques such as decoding and cross-classification, (ii) complex computational modelling using models such as deep neural networks, and (iii) integration across imaging methods (magnetoencephalography/electroencephalography, functional magnetic resonance imaging) and models, e.g. using representational similarity analysis. We showcase two recent efforts that have been undertaken in this spirit and provide novel results about visual and auditory scene analysis. Finally, we discuss the limits of this perspective and sketch a concrete roadmap for future research. This article is part of the themed issue ‘Auditory and visual scene analysis’. PMID:28044019
Cichy, Radoslaw Martin; Teng, Santani
2017-02-19
In natural environments, visual and auditory stimulation elicit responses across a large set of brain regions in a fraction of a second, yielding representations of the multimodal scene and its properties. The rapid and complex neural dynamics underlying visual and auditory information processing pose major challenges to human cognitive neuroscience. Brain signals measured non-invasively are inherently noisy, the format of neural representations is unknown, and transformations between representations are complex and often nonlinear. Further, no single non-invasive brain measurement technique provides a spatio-temporally integrated view. In this opinion piece, we argue that progress can be made by a concerted effort based on three pillars of recent methodological development: (i) sensitive analysis techniques such as decoding and cross-classification, (ii) complex computational modelling using models such as deep neural networks, and (iii) integration across imaging methods (magnetoencephalography/electroencephalography, functional magnetic resonance imaging) and models, e.g. using representational similarity analysis. We showcase two recent efforts that have been undertaken in this spirit and provide novel results about visual and auditory scene analysis. Finally, we discuss the limits of this perspective and sketch a concrete roadmap for future research.This article is part of the themed issue 'Auditory and visual scene analysis'. © 2017 The Authors.
Recognition Alters the Spatial Pattern of fMRI Activation in Early Retinotopic Cortex
Vul, E.; Kanwisher, N.
2010-01-01
Early retinotopic cortex has traditionally been viewed as containing a veridical representation of the low-level properties of the image, not imbued by high-level interpretation and meaning. Yet several recent results indicate that neural representations in early retinotopic cortex reflect not just the sensory properties of the image, but also the perceived size and brightness of image regions. Here we used functional magnetic resonance imaging pattern analyses to ask whether the representation of an object in early retinotopic cortex changes when the object is recognized compared with when the same stimulus is presented but not recognized. Our data confirmed this hypothesis: the pattern of response in early retinotopic visual cortex to a two-tone “Mooney” image of an object was more similar to the response to the full grayscale photo version of the same image when observers knew what the two-tone image represented than when they did not. Further, in a second experiment, high-level interpretations actually overrode bottom-up stimulus information, such that the pattern of response in early retinotopic cortex to an identified two-tone image was more similar to the response to the photographic version of that stimulus than it was to the response to the identical two-tone image when it was not identified. Our findings are consistent with prior results indicating that perceived size and brightness affect representations in early retinotopic visual cortex and, further, show that even higher-level information—knowledge of object identity—also affects the representation of an object in early retinotopic cortex. PMID:20071627
Robust algebraic image enhancement for intelligent control systems
NASA Technical Reports Server (NTRS)
Lerner, Bao-Ting; Morrelli, Michael
1993-01-01
Robust vision capability for intelligent control systems has been an elusive goal in image processing. The computationally intensive techniques a necessary for conventional image processing make real-time applications, such as object tracking and collision avoidance difficult. In order to endow an intelligent control system with the needed vision robustness, an adequate image enhancement subsystem capable of compensating for the wide variety of real-world degradations, must exist between the image capturing and the object recognition subsystems. This enhancement stage must be adaptive and must operate with consistency in the presence of both statistical and shape-based noise. To deal with this problem, we have developed an innovative algebraic approach which provides a sound mathematical framework for image representation and manipulation. Our image model provides a natural platform from which to pursue dynamic scene analysis, and its incorporation into a vision system would serve as the front-end to an intelligent control system. We have developed a unique polynomial representation of gray level imagery and applied this representation to develop polynomial operators on complex gray level scenes. This approach is highly advantageous since polynomials can be manipulated very easily, and are readily understood, thus providing a very convenient environment for image processing. Our model presents a highly structured and compact algebraic representation of grey-level images which can be viewed as fuzzy sets.
Image synthesis for SAR system, calibration and processor design
NASA Technical Reports Server (NTRS)
Holtzman, J. C.; Abbott, J. L.; Kaupp, V. H.; Frost, V. S.
1978-01-01
The Point Scattering Method of simulating radar imagery rigorously models all aspects of the imaging radar phenomena. Its computational algorithms operate on a symbolic representation of the terrain test site to calculate such parameters as range, angle of incidence, resolution cell size, etc. Empirical backscatter data and elevation data are utilized to model the terrain. Additionally, the important geometrical/propagation effects such as shadow, foreshortening, layover, and local angle of incidence are rigorously treated. Applications of radar image simulation to a proposed calibrated SAR system are highlighted: soil moisture detection and vegetation discrimination.
Reconstruction of noisy and blurred images using blur kernel
NASA Astrophysics Data System (ADS)
Ellappan, Vijayan; Chopra, Vishal
2017-11-01
Blur is a common in so many digital images. Blur can be caused by motion of the camera and scene object. In this work we proposed a new method for deblurring images. This work uses sparse representation to identify the blur kernel. By analyzing the image coordinates Using coarse and fine, we fetch the kernel based image coordinates and according to that observation we get the motion angle of the shaken or blurred image. Then we calculate the length of the motion kernel using radon transformation and Fourier for the length calculation of the image and we use Lucy Richardson algorithm which is also called NON-Blind(NBID) Algorithm for more clean and less noisy image output. All these operation will be performed in MATLAB IDE.
Differences conception prospective students teacher about limit of function based gender
NASA Astrophysics Data System (ADS)
Usman, Juniati, Dwi; Siswono, Tatag Yuli Eko
2017-08-01
Gender is one of the interesting topics and has continuity to be explored in mathematics education research. The purpose of this study to explore difference on conceptions of students teaching program by gender. It specialized on conception of understanding, representating, and mental images about limit function. This research conducting qualitative explorative method approach. The subject consisted of one man and one woman from the group of highly skilled student and has gone through semester V. Based on data that had been analyzed proved that male student has an understanding about limit function shared by explaining this material using illustrations, while female student explained it through verbal explanation. Due to representating aspect, it revealed that both of male and female students have similarity such as using verbal explanation, graphs, symbols, and tables to representating about limit function. Analyzing Mental image aspect, researcher got that male student using word "to converge" to explained about limit function, while female student using word "to approach". So, there are differences conceptions about limit function between male and female student.
Finding imaging patterns of structural covariance via Non-Negative Matrix Factorization.
Sotiras, Aristeidis; Resnick, Susan M; Davatzikos, Christos
2015-03-01
In this paper, we investigate the use of Non-Negative Matrix Factorization (NNMF) for the analysis of structural neuroimaging data. The goal is to identify the brain regions that co-vary across individuals in a consistent way, hence potentially being part of underlying brain networks or otherwise influenced by underlying common mechanisms such as genetics and pathologies. NNMF offers a directly data-driven way of extracting relatively localized co-varying structural regions, thereby transcending limitations of Principal Component Analysis (PCA), Independent Component Analysis (ICA) and other related methods that tend to produce dispersed components of positive and negative loadings. In particular, leveraging upon the well known ability of NNMF to produce parts-based representations of image data, we derive decompositions that partition the brain into regions that vary in consistent ways across individuals. Importantly, these decompositions achieve dimensionality reduction via highly interpretable ways and generalize well to new data as shown via split-sample experiments. We empirically validate NNMF in two data sets: i) a Diffusion Tensor (DT) mouse brain development study, and ii) a structural Magnetic Resonance (sMR) study of human brain aging. We demonstrate the ability of NNMF to produce sparse parts-based representations of the data at various resolutions. These representations seem to follow what we know about the underlying functional organization of the brain and also capture some pathological processes. Moreover, we show that these low dimensional representations favorably compare to descriptions obtained with more commonly used matrix factorization methods like PCA and ICA. Copyright © 2014 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Viswanath, Satish; Rosen, Mark; Madabhushi, Anant
2008-03-01
Current techniques for localization of prostatic adenocarcinoma (CaP) via blinded trans-rectal ultrasound biopsy are associated with a high false negative detection rate. While high resolution endorectal in vivo Magnetic Resonance (MR) prostate imaging has been shown to have improved contrast and resolution for CaP detection over ultrasound, similarity in intensity characteristics between benign and cancerous regions on MR images contribute to a high false positive detection rate. In this paper, we present a novel unsupervised segmentation method that employs manifold learning via consensus schemes for detection of cancerous regions from high resolution 1.5 Tesla (T) endorectal in vivo prostate MRI. A significant contribution of this paper is a method to combine multiple weak, lower-dimensional representations of high dimensional feature data in a way analogous to classifier ensemble schemes, and hence create a stable and accurate reduced dimensional representation. After correcting for MR image intensity artifacts, such as bias field inhomogeneity and intensity non-standardness, our algorithm extracts over 350 3D texture features at every spatial location in the MR scene at multiple scales and orientations. Non-linear dimensionality reduction schemes such as Locally Linear Embedding (LLE) and Graph Embedding (GE) are employed to create multiple low dimensional data representations of this high dimensional texture feature space. Our novel consensus embedding method is used to average object adjacencies from within the multiple low dimensional projections so that class relationships are preserved. Unsupervised consensus clustering is then used to partition the objects in this consensus embedding space into distinct classes. Quantitative evaluation on 18 1.5 T prostate MR data against corresponding histology obtained from the multi-site ACRIN trials show a sensitivity of 92.65% and a specificity of 82.06%, which suggests that our method is successfully able to detect suspicious regions in the prostate.
A Robust Image Watermarking in the Joint Time-Frequency Domain
NASA Astrophysics Data System (ADS)
Öztürk, Mahmut; Akan, Aydın; Çekiç, Yalçın
2010-12-01
With the rapid development of computers and internet applications, copyright protection of multimedia data has become an important problem. Watermarking techniques are proposed as a solution to copyright protection of digital media files. In this paper, a new, robust, and high-capacity watermarking method that is based on spatiofrequency (SF) representation is presented. We use the discrete evolutionary transform (DET) calculated by the Gabor expansion to represent an image in the joint SF domain. The watermark is embedded onto selected coefficients in the joint SF domain. Hence, by combining the advantages of spatial and spectral domain watermarking methods, a robust, invisible, secure, and high-capacity watermarking method is presented. A correlation-based detector is also proposed to detect and extract any possible watermarks on an image. The proposed watermarking method was tested on some commonly used test images under different signal processing attacks like additive noise, Wiener and Median filtering, JPEG compression, rotation, and cropping. Simulation results show that our method is robust against all of the attacks.
Ghost microscope imaging system from the perspective of coherent-mode representation
NASA Astrophysics Data System (ADS)
Shen, Qian; Bai, Yanfeng; Shi, Xiaohui; Nan, Suqin; Qu, Lijie; Li, Hengxing; Fu, Xiquan
2018-03-01
The coherent-mode representation theory of partially coherent fields is firstly used to analyze a two-arm ghost microscope imaging system. It is shown that imaging quality of the generated images depend crucially on the distribution of the decomposition coefficients of the object imaged when the light source is fixed. This theory is also suitable for demonstrating the effects from the distance the object is moved away from the original plane on imaging quality. Our results are verified theoretically and experimentally.
Color Sparse Representations for Image Processing: Review, Models, and Prospects.
Barthélemy, Quentin; Larue, Anthony; Mars, Jérôme I
2015-11-01
Sparse representations have been extended to deal with color images composed of three channels. A review of dictionary-learning-based sparse representations for color images is made here, detailing the differences between the models, and comparing their results on the real and simulated data. These models are considered in a unifying framework that is based on the degrees of freedom of the linear filtering/transformation of the color channels. Moreover, this allows it to be shown that the scalar quaternionic linear model is equivalent to constrained matrix-based color filtering, which highlights the filtering implicitly applied through this model. Based on this reformulation, the new color filtering model is introduced, using unconstrained filters. In this model, spatial morphologies of color images are encoded by atoms, and colors are encoded by color filters. Color variability is no longer captured in increasing the dictionary size, but with color filters, this gives an efficient color representation.
Efficient content-based low-altitude images correlated network and strips reconstruction
NASA Astrophysics Data System (ADS)
He, Haiqing; You, Qi; Chen, Xiaoyong
2017-01-01
The manual intervention method is widely used to reconstruct strips for further aerial triangulation in low-altitude photogrammetry. Clearly the method for fully automatic photogrammetric data processing is not an expected way. In this paper, we explore a content-based approach without manual intervention or external information for strips reconstruction. Feature descriptors in the local spatial patterns are extracted by SIFT to construct vocabulary tree, in which these features are encoded in terms of TF-IDF numerical statistical algorithm to generate new representation for each low-altitude image. Then images correlated network is reconstructed by similarity measure, image matching and geometric graph theory. Finally, strips are reconstructed automatically by tracing straight lines and growing adjacent images gradually. Experimental results show that the proposed approach is highly effective in automatically rearranging strips of lowaltitude images and can provide rough relative orientation for further aerial triangulation.
Color reproduction and processing algorithm based on real-time mapping for endoscopic images.
Khan, Tareq H; Mohammed, Shahed K; Imtiaz, Mohammad S; Wahid, Khan A
2016-01-01
In this paper, we present a real-time preprocessing algorithm for image enhancement for endoscopic images. A novel dictionary based color mapping algorithm is used for reproducing the color information from a theme image. The theme image is selected from a nearby anatomical location. A database of color endoscopy image for different location is prepared for this purpose. The color map is dynamic as its contents change with the change of the theme image. This method is used on low contrast grayscale white light images and raw narrow band images to highlight the vascular and mucosa structures and to colorize the images. It can also be applied to enhance the tone of color images. The statistic visual representation and universal image quality measures show that the proposed method can highlight the mucosa structure compared to other methods. The color similarity has been verified using Delta E color difference, structure similarity index, mean structure similarity index and structure and hue similarity. The color enhancement was measured using color enhancement factor that shows considerable improvements. The proposed algorithm has low and linear time complexity, which results in higher execution speed than other related works.
NASA Astrophysics Data System (ADS)
Chen, Cunjian; Ross, Arun
2013-05-01
Researchers in face recognition have been using Gabor filters for image representation due to their robustness to complex variations in expression and illumination. Numerous methods have been proposed to model the output of filter responses by employing either local or global descriptors. In this work, we propose a novel but simple approach for encoding Gradient information on Gabor-transformed images to represent the face, which can be used for identity, gender and ethnicity assessment. Extensive experiments on the standard face benchmark FERET (Visible versus Visible), as well as the heterogeneous face dataset HFB (Near-infrared versus Visible), suggest that the matching performance due to the proposed descriptor is comparable against state-of-the-art descriptor-based approaches in face recognition applications. Furthermore, the same feature set is used in the framework of a Collaborative Representation Classification (CRC) scheme for deducing soft biometric traits such as gender and ethnicity from face images in the AR, Morph and CAS-PEAL databases.
Orthogonal Procrustes Analysis for Dictionary Learning in Sparse Linear Representation.
Grossi, Giuliano; Lanzarotti, Raffaella; Lin, Jianyi
2017-01-01
In the sparse representation model, the design of overcomplete dictionaries plays a key role for the effectiveness and applicability in different domains. Recent research has produced several dictionary learning approaches, being proven that dictionaries learnt by data examples significantly outperform structured ones, e.g. wavelet transforms. In this context, learning consists in adapting the dictionary atoms to a set of training signals in order to promote a sparse representation that minimizes the reconstruction error. Finding the best fitting dictionary remains a very difficult task, leaving the question still open. A well-established heuristic method for tackling this problem is an iterative alternating scheme, adopted for instance in the well-known K-SVD algorithm. Essentially, it consists in repeating two stages; the former promotes sparse coding of the training set and the latter adapts the dictionary to reduce the error. In this paper we present R-SVD, a new method that, while maintaining the alternating scheme, adopts the Orthogonal Procrustes analysis to update the dictionary atoms suitably arranged into groups. Comparative experiments on synthetic data prove the effectiveness of R-SVD with respect to well known dictionary learning algorithms such as K-SVD, ILS-DLA and the online method OSDL. Moreover, experiments on natural data such as ECG compression, EEG sparse representation, and image modeling confirm R-SVD's robustness and wide applicability.
Lai, Zongying; Zhang, Xinlin; Guo, Di; Du, Xiaofeng; Yang, Yonggui; Guo, Gang; Chen, Zhong; Qu, Xiaobo
2018-05-03
Multi-contrast images in magnetic resonance imaging (MRI) provide abundant contrast information reflecting the characteristics of the internal tissues of human bodies, and thus have been widely utilized in clinical diagnosis. However, long acquisition time limits the application of multi-contrast MRI. One efficient way to accelerate data acquisition is to under-sample the k-space data and then reconstruct images with sparsity constraint. However, images are compromised at high acceleration factor if images are reconstructed individually. We aim to improve the images with a jointly sparse reconstruction and Graph-based redundant wavelet transform (GBRWT). First, a sparsifying transform, GBRWT, is trained to reflect the similarity of tissue structures in multi-contrast images. Second, joint multi-contrast image reconstruction is formulated as a ℓ 2, 1 norm optimization problem under GBRWT representations. Third, the optimization problem is numerically solved using a derived alternating direction method. Experimental results in synthetic and in vivo MRI data demonstrate that the proposed joint reconstruction method can achieve lower reconstruction errors and better preserve image structures than the compared joint reconstruction methods. Besides, the proposed method outperforms single image reconstruction with joint sparsity constraint of multi-contrast images. The proposed method explores the joint sparsity of multi-contrast MRI images under graph-based redundant wavelet transform and realizes joint sparse reconstruction of multi-contrast images. Experiment demonstrate that the proposed method outperforms the compared joint reconstruction methods as well as individual reconstructions. With this high quality image reconstruction method, it is possible to achieve the high acceleration factors by exploring the complementary information provided by multi-contrast MRI.
Feature-based registration of historical aerial images by Area Minimization
NASA Astrophysics Data System (ADS)
Nagarajan, Sudhagar; Schenk, Toni
2016-06-01
The registration of historical images plays a significant role in assessing changes in land topography over time. By comparing historical aerial images with recent data, geometric changes that have taken place over the years can be quantified. However, the lack of ground control information and precise camera parameters has limited scientists' ability to reliably incorporate historical images into change detection studies. Other limitations include the methods of determining identical points between recent and historical images, which has proven to be a cumbersome task due to continuous land cover changes. Our research demonstrates a method of registering historical images using Time Invariant Line (TIL) features. TIL features are different representations of the same line features in multi-temporal data without explicit point-to-point or straight line-to-straight line correspondence. We successfully determined the exterior orientation of historical images by minimizing the area formed between corresponding TIL features in recent and historical images. We then tested the feasibility of the approach with synthetic and real data and analyzed the results. Based on our analysis, this method shows promise for long-term 3D change detection studies.
Image wavelet decomposition and applications
NASA Technical Reports Server (NTRS)
Treil, N.; Mallat, S.; Bajcsy, R.
1989-01-01
The general problem of computer vision has been investigated for more that 20 years and is still one of the most challenging fields in artificial intelligence. Indeed, taking a look at the human visual system can give us an idea of the complexity of any solution to the problem of visual recognition. This general task can be decomposed into a whole hierarchy of problems ranging from pixel processing to high level segmentation and complex objects recognition. Contrasting an image at different representations provides useful information such as edges. An example of low level signal and image processing using the theory of wavelets is introduced which provides the basis for multiresolution representation. Like the human brain, we use a multiorientation process which detects features independently in different orientation sectors. So, images of the same orientation but of different resolutions are contrasted to gather information about an image. An interesting image representation using energy zero crossings is developed. This representation is shown to be experimentally complete and leads to some higher level applications such as edge and corner finding, which in turn provides two basic steps to image segmentation. The possibilities of feedback between different levels of processing are also discussed.
Tirado-Ramos, Alfredo; Hu, Jingkun; Lee, K P
2002-01-01
Supplement 23 to DICOM (Digital Imaging and Communications for Medicine), Structured Reporting, is a specification that supports a semantically rich representation of image and waveform content, enabling experts to share image and related patient information. DICOM SR supports the representation of textual and coded data linked to images and waveforms. Nevertheless, the medical information technology community needs models that work as bridges between the DICOM relational model and open object-oriented technologies. The authors assert that representations of the DICOM Structured Reporting standard, using object-oriented modeling languages such as the Unified Modeling Language, can provide a high-level reference view of the semantically rich framework of DICOM and its complex structures. They have produced an object-oriented model to represent the DICOM SR standard and have derived XML-exchangeable representations of this model using World Wide Web Consortium specifications. They expect the model to benefit developers and system architects who are interested in developing applications that are compliant with the DICOM SR specification.
Building generic anatomical models using virtual model cutting and iterative registration.
Xiao, Mei; Soh, Jung; Meruvia-Pastor, Oscar; Schmidt, Eric; Hallgrímsson, Benedikt; Sensen, Christoph W
2010-02-08
Using 3D generic models to statistically analyze trends in biological structure changes is an important tool in morphometrics research. Therefore, 3D generic models built for a range of populations are in high demand. However, due to the complexity of biological structures and the limited views of them that medical images can offer, it is still an exceptionally difficult task to quickly and accurately create 3D generic models (a model is a 3D graphical representation of a biological structure) based on medical image stacks (a stack is an ordered collection of 2D images). We show that the creation of a generic model that captures spatial information exploitable in statistical analyses is facilitated by coupling our generalized segmentation method to existing automatic image registration algorithms. The method of creating generic 3D models consists of the following processing steps: (i) scanning subjects to obtain image stacks; (ii) creating individual 3D models from the stacks; (iii) interactively extracting sub-volume by cutting each model to generate the sub-model of interest; (iv) creating image stacks that contain only the information pertaining to the sub-models; (v) iteratively registering the corresponding new 2D image stacks; (vi) averaging the newly created sub-models based on intensity to produce the generic model from all the individual sub-models. After several registration procedures are applied to the image stacks, we can create averaged image stacks with sharp boundaries. The averaged 3D model created from those image stacks is very close to the average representation of the population. The image registration time varies depending on the image size and the desired accuracy of the registration. Both volumetric data and surface model for the generic 3D model are created at the final step. Our method is very flexible and easy to use such that anyone can use image stacks to create models and retrieve a sub-region from it at their ease. Java-based implementation allows our method to be used on various visualization systems including personal computers, workstations, computers equipped with stereo displays, and even virtual reality rooms such as the CAVE Automated Virtual Environment. The technique allows biologists to build generic 3D models of their interest quickly and accurately.
Tensor Fukunaga-Koontz transform for small target detection in infrared images
NASA Astrophysics Data System (ADS)
Liu, Ruiming; Wang, Jingzhuo; Yang, Huizhen; Gong, Chenglong; Zhou, Yuanshen; Liu, Lipeng; Zhang, Zhen; Shen, Shuli
2016-09-01
Infrared small targets detection plays a crucial role in warning and tracking systems. Some novel methods based on pattern recognition technology catch much attention from researchers. However, those classic methods must reshape images into vectors with the high dimensionality. Moreover, vectorizing breaks the natural structure and correlations in the image data. Image representation based on tensor treats images as matrices and can hold the natural structure and correlation information. So tensor algorithms have better classification performance than vector algorithms. Fukunaga-Koontz transform is one of classification algorithms and it is a vector version method with the disadvantage of all vector algorithms. In this paper, we first extended the Fukunaga-Koontz transform into its tensor version, tensor Fukunaga-Koontz transform. Then we designed a method based on tensor Fukunaga-Koontz transform for detecting targets and used it to detect small targets in infrared images. The experimental results, comparison through signal-to-clutter, signal-to-clutter gain and background suppression factor, have validated the advantage of the target detection based on the tensor Fukunaga-Koontz transform over that based on the Fukunaga-Koontz transform.
Varying face occlusion detection and iterative recovery for face recognition
NASA Astrophysics Data System (ADS)
Wang, Meng; Hu, Zhengping; Sun, Zhe; Zhao, Shuhuan; Sun, Mei
2017-05-01
In most sparse representation methods for face recognition (FR), occlusion problems were usually solved via removing the occlusion part of both query samples and training samples to perform the recognition process. This practice ignores the global feature of facial image and may lead to unsatisfactory results due to the limitation of local features. Considering the aforementioned drawback, we propose a method called varying occlusion detection and iterative recovery for FR. The main contributions of our method are as follows: (1) to detect an accurate occlusion area of facial images, an image processing and intersection-based clustering combination method is used for occlusion FR; (2) according to an accurate occlusion map, the new integrated facial images are recovered iteratively and put into a recognition process; and (3) the effectiveness on recognition accuracy of our method is verified by comparing it with three typical occlusion map detection methods. Experiments show that the proposed method has a highly accurate detection and recovery performance and that it outperforms several similar state-of-the-art methods against partial contiguous occlusion.
NASA Astrophysics Data System (ADS)
Shen, Qian; Bai, Yanfeng; Shi, Xiaohui; Nan, Suqin; Qu, Lijie; Li, Hengxing; Fu, Xiquan
2017-07-01
The difference in imaging quality between different ghost imaging schemes is studied by using coherent-mode representation of partially coherent fields. It is shown that the difference mainly relies on the distribution changes of the decomposition coefficients of the object imaged when the light source is fixed. For a new-designed imaging scheme, we only need to give the distribution of the decomposition coefficients and compare them with that of the existing imaging system, thus one can predict imaging quality. By choosing several typical ghost imaging systems, we theoretically and experimentally verify our results.
Augmenting cognitive architectures to support diagrammatic imagination.
Chandrasekaran, Balakrishnan; Banerjee, Bonny; Kurup, Unmesh; Lele, Omkar
2011-10-01
Diagrams are a form of spatial representation that supports reasoning and problem solving. Even when diagrams are external, not to mention when there are no external representations, problem solving often calls for internal representations, that is, representations in cognition, of diagrammatic elements and internal perceptions on them. General cognitive architectures--Soar and ACT-R, to name the most prominent--do not have representations and operations to support diagrammatic reasoning. In this article, we examine some requirements for such internal representations and processes in cognitive architectures. We discuss the degree to which DRS, our earlier proposal for such an internal representation for diagrams, meets these requirements. In DRS, the diagrams are not raw images, but a composition of objects that can be individuated and thus symbolized, while, unlike traditional symbols, the referent of the symbol is an object that retains its perceptual essence, namely, its spatiality. This duality provides a way to resolve what anti-imagists thought was a contradiction in mental imagery: the compositionality of mental images that seemed to be unique to symbol systems, and their support of a perceptual experience of images and some types of perception on them. We briefly review the use of DRS to augment Soar and ACT-R with a diagrammatic representation component. We identify issues for further research. Copyright © 2011 Cognitive Science Society, Inc.
NASA Astrophysics Data System (ADS)
Wang, Guanxi; Tie, Yun; Qi, Lin
2017-07-01
In this paper, we propose a novel approach based on Depth Maps and compute Multi-Scale Histograms of Oriented Gradient (MSHOG) from sequences of depth maps to recognize actions. Each depth frame in a depth video sequence is projected onto three orthogonal Cartesian planes. Under each projection view, the absolute difference between two consecutive projected maps is accumulated through a depth video sequence to form a Depth Map, which is called Depth Motion Trail Images (DMTI). The MSHOG is then computed from the Depth Maps for the representation of an action. In addition, we apply L2-Regularized Collaborative Representation (L2-CRC) to classify actions. We evaluate the proposed approach on MSR Action3D dataset and MSRGesture3D dataset. Promising experimental result demonstrates the effectiveness of our proposed method.
Yoo, Youngjin; Tang, Lisa Y W; Brosch, Tom; Li, David K B; Kolind, Shannon; Vavasour, Irene; Rauscher, Alexander; MacKay, Alex L; Traboulsee, Anthony; Tam, Roger C
2018-01-01
Myelin imaging is a form of quantitative magnetic resonance imaging (MRI) that measures myelin content and can potentially allow demyelinating diseases such as multiple sclerosis (MS) to be detected earlier. Although focal lesions are the most visible signs of MS pathology on conventional MRI, it has been shown that even tissues that appear normal may exhibit decreased myelin content as revealed by myelin-specific images (i.e., myelin maps). Current methods for analyzing myelin maps typically use global or regional mean myelin measurements to detect abnormalities, but ignore finer spatial patterns that may be characteristic of MS. In this paper, we present a machine learning method to automatically learn, from multimodal MR images, latent spatial features that can potentially improve the detection of MS pathology at early stage. More specifically, 3D image patches are extracted from myelin maps and the corresponding T1-weighted (T1w) MRIs, and are used to learn a latent joint myelin-T1w feature representation via unsupervised deep learning. Using a data set of images from MS patients and healthy controls, a common set of patches are selected via a voxel-wise t -test performed between the two groups. In each MS image, any patches overlapping with focal lesions are excluded, and a feature imputation method is used to fill in the missing values. A feature selection process (LASSO) is then utilized to construct a sparse representation. The resulting normal-appearing features are used to train a random forest classifier. Using the myelin and T1w images of 55 relapse-remitting MS patients and 44 healthy controls in an 11-fold cross-validation experiment, the proposed method achieved an average classification accuracy of 87.9% (SD = 8.4%), which is higher and more consistent across folds than those attained by regional mean myelin (73.7%, SD = 13.7%) and T1w measurements (66.7%, SD = 10.6%), or deep-learned features in either the myelin (83.8%, SD = 11.0%) or T1w (70.1%, SD = 13.6%) images alone, suggesting that the proposed method has strong potential for identifying image features that are more sensitive and specific to MS pathology in normal-appearing brain tissues.
Image Quality Assessment Using the Joint Spatial/Spatial-Frequency Representation
NASA Astrophysics Data System (ADS)
Beghdadi, Azeddine; Iordache, Răzvan
2006-12-01
This paper demonstrates the usefulness of spatial/spatial-frequency representations in image quality assessment by introducing a new image dissimilarity measure based on 2D Wigner-Ville distribution (WVD). The properties of 2D WVD are shortly reviewed, and the important issue of choosing the analytic image is emphasized. The WVD-based measure is shown to be correlated with subjective human evaluation, which is the premise towards an image quality assessor developed on this principle.
A high-level 3D visualization API for Java and ImageJ.
Schmid, Benjamin; Schindelin, Johannes; Cardona, Albert; Longair, Mark; Heisenberg, Martin
2010-05-21
Current imaging methods such as Magnetic Resonance Imaging (MRI), Confocal microscopy, Electron Microscopy (EM) or Selective Plane Illumination Microscopy (SPIM) yield three-dimensional (3D) data sets in need of appropriate computational methods for their analysis. The reconstruction, segmentation and registration are best approached from the 3D representation of the data set. Here we present a platform-independent framework based on Java and Java 3D for accelerated rendering of biological images. Our framework is seamlessly integrated into ImageJ, a free image processing package with a vast collection of community-developed biological image analysis tools. Our framework enriches the ImageJ software libraries with methods that greatly reduce the complexity of developing image analysis tools in an interactive 3D visualization environment. In particular, we provide high-level access to volume rendering, volume editing, surface extraction, and image annotation. The ability to rely on a library that removes the low-level details enables concentrating software development efforts on the algorithm implementation parts. Our framework enables biomedical image software development to be built with 3D visualization capabilities with very little effort. We offer the source code and convenient binary packages along with extensive documentation at http://3dviewer.neurofly.de.
Geodesic denoising for optical coherence tomography images
NASA Astrophysics Data System (ADS)
Shahrian Varnousfaderani, Ehsan; Vogl, Wolf-Dieter; Wu, Jing; Gerendas, Bianca S.; Simader, Christian; Langs, Georg; Waldstein, Sebastian M.; Schmidt-Erfurth, Ursula
2016-03-01
Optical coherence tomography (OCT) is an optical signal acquisition method capturing micrometer resolution, cross-sectional three-dimensional images. OCT images are used widely in ophthalmology to diagnose and monitor retinal diseases such as age-related macular degeneration (AMD) and Glaucoma. While OCT allows the visualization of retinal structures such as vessels and retinal layers, image quality and contrast is reduced by speckle noise, obfuscating small, low intensity structures and structural boundaries. Existing denoising methods for OCT images may remove clinically significant image features such as texture and boundaries of anomalies. In this paper, we propose a novel patch based denoising method, Geodesic Denoising. The method reduces noise in OCT images while preserving clinically significant, although small, pathological structures, such as fluid-filled cysts in diseased retinas. Our method selects optimal image patch distribution representations based on geodesic patch similarity to noisy samples. Patch distributions are then randomly sampled to build a set of best matching candidates for every noisy sample, and the denoised value is computed based on a geodesic weighted average of the best candidate samples. Our method is evaluated qualitatively on real pathological OCT scans and quantitatively on a proposed set of ground truth, noise free synthetic OCT scans with artificially added noise and pathologies. Experimental results show that performance of our method is comparable with state of the art denoising methods while outperforming them in preserving the critical clinically relevant structures.
Dictionary Learning Algorithms for Sparse Representation
Kreutz-Delgado, Kenneth; Murray, Joseph F.; Rao, Bhaskar D.; Engan, Kjersti; Lee, Te-Won; Sejnowski, Terrence J.
2010-01-01
Algorithms for data-driven learning of domain-specific overcomplete dictionaries are developed to obtain maximum likelihood and maximum a posteriori dictionary estimates based on the use of Bayesian models with concave/Schur-concave (CSC) negative log priors. Such priors are appropriate for obtaining sparse representations of environmental signals within an appropriately chosen (environmentally matched) dictionary. The elements of the dictionary can be interpreted as concepts, features, or words capable of succinct expression of events encountered in the environment (the source of the measured signals). This is a generalization of vector quantization in that one is interested in a description involving a few dictionary entries (the proverbial “25 words or less”), but not necessarily as succinct as one entry. To learn an environmentally adapted dictionary capable of concise expression of signals generated by the environment, we develop algorithms that iterate between a representative set of sparse representations found by variants of FOCUSS and an update of the dictionary using these sparse representations. Experiments were performed using synthetic data and natural images. For complete dictionaries, we demonstrate that our algorithms have improved performance over other independent component analysis (ICA) methods, measured in terms of signal-to-noise ratios of separated sources. In the overcomplete case, we show that the true underlying dictionary and sparse sources can be accurately recovered. In tests with natural images, learned overcomplete dictionaries are shown to have higher coding efficiency than complete dictionaries; that is, images encoded with an over-complete dictionary have both higher compression (fewer bits per pixel) and higher accuracy (lower mean square error). PMID:12590811
New technologies lead to a new frontier: cognitive multiple data representation
NASA Astrophysics Data System (ADS)
Buffat, S.; Liege, F.; Plantier, J.; Roumes, C.
2005-05-01
The increasing number and complexity of operational sensors (radar, infrared, hyperspectral...) and availability of huge amount of data, lead to more and more sophisticated information presentations. But one key element of the IMINT line cannot be improved beyond initial system specification: the operator.... In order to overcome this issue, we have to better understand human visual object representation. Object recognition theories in human vision balance between matching 2D templates representation with viewpoint-dependant information, and a viewpoint-invariant system based on structural description. Spatial frequency content is relevant due to early vision filtering. Orientation in depth is an important variable to challenge object constancy. Three objects, seen from three different points of view in a natural environment made the original images in this study. Test images were a combination of spatial frequency filtered original images and an additive contrast level of white noise. In the first experiment, the observer's task was a same versus different forced choice with spatial alternative. Test images had the same noise level in a presentation row. Discrimination threshold was determined by modifying the white noise contrast level by means of an adaptative method. In the second experiment, a repetition blindness paradigm was used to further investigate the viewpoint effect on object recognition. The results shed some light on the human visual system processing of objects displayed under different physical descriptions. This is an important achievement because targets which not always match physical properties of usual visual stimuli can increase operational workload.
Apes, skulls and drums: using images to make ethnographic knowledge in imperial Germany.
Petrou, Marissa H
2018-03-01
In this paper, I discuss the development and use of images employed by the Dresden Royal Museum for Zoology, Anthropology and Ethnography to resolve debates about how to use visual representation as a means of making ethnographic knowledge. Through experimentation with techniques of visual representation, the founding director, A.B. Meyer (1840-1911), proposed a historical, non-essentialist approach to understanding racial and cultural difference. Director Meyer's approach was inspired by the new knowledge he had gained through field research in Asia-Pacific as well as new forms of imaging that made highly detailed representations of objects possible. Through a combination of various techniques, he developed new visual methods that emphasized intimate familiarity with variations within any one ethnic group, from skull shape to material ornamentation, as integral to the new disciplines of physical and cultural anthropology. It is well known that photographs were a favoured form of visual documentation among the anthropological and ethnographic sciences at the fin de siècle. However, in the scholarly journals of the Dresden museum, photographs, drawings, tables and etchings were frequently displayed alongside one another. Meyer sought to train the reader's eye through organized arrangements that represented objects from multiple angles and at various levels of magnification. Focusing on chimpanzees, skulls and kettledrums from Asia-Pacific, I track the development of new modes of making and reading images, from zoology and physical anthropology to ethnography, to demonstrate how the museum visually historicized humankind.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xu, Q; Stanford University School of Medicine, Stanford, CA; Liu, H
Purpose: Spectral CT enabled by an energy-resolved photon-counting detector outperforms conventional CT in terms of material discrimination, contrast resolution, etc. One reconstruction method for spectral CT is to generate a color image from a reconstructed component in each energy channel. However, given the radiation dose, the number of photons in each channel is limited, which will result in strong noise in each channel and affect the final color reconstruction. Here we propose a novel dictionary learning method for spectral CT that combines dictionary-based sparse representation method and the patch based low-rank constraint to simultaneously improve the reconstruction in each channelmore » and to address the inter-channel correlations to further improve the reconstruction. Methods: The proposed method has two important features: (1) guarantee of the patch based sparsity in each energy channel, which is the result of the dictionary based sparse representation constraint; (2) the explicit consideration of the correlations among different energy channels, which is realized by patch-by-patch nuclear norm-based low-rank constraint. For each channel, the dictionary consists of two sub-dictionaries. One is learned from the average of the images in all energy channels, and the other is learned from the average of the images in all energy channels except the current channel. With average operation to reduce noise, these two dictionaries can effectively preserve the structural details and get rid of artifacts caused by noise. Combining them together can express all structural information in current channel. Results: Dictionary learning based methods can obtain better results than FBP and the TV-based method. With low-rank constraint, the image quality can be further improved in the channel with more noise. The final color result by the proposed method has the best visual quality. Conclusion: The proposed method can effectively improve the image quality of low-dose spectral CT. This work is partially supported by the National Natural Science Foundation of China (No. 61302136), and the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2014JQ8317).« less
Visual information processing; Proceedings of the Meeting, Orlando, FL, Apr. 20-22, 1992
NASA Technical Reports Server (NTRS)
Huck, Friedrich O. (Editor); Juday, Richard D. (Editor)
1992-01-01
Topics discussed in these proceedings include nonlinear processing and communications; feature extraction and recognition; image gathering, interpolation, and restoration; image coding; and wavelet transform. Papers are presented on noise reduction for signals from nonlinear systems; driving nonlinear systems with chaotic signals; edge detection and image segmentation of space scenes using fractal analyses; a vision system for telerobotic operation; a fidelity analysis of image gathering, interpolation, and restoration; restoration of images degraded by motion; and information, entropy, and fidelity in visual communication. Attention is also given to image coding methods and their assessment, hybrid JPEG/recursive block coding of images, modified wavelets that accommodate causality, modified wavelet transform for unbiased frequency representation, and continuous wavelet transform of one-dimensional signals by Fourier filtering.
Revealing representational content with pattern-information fMRI--an introductory guide.
Mur, Marieke; Bandettini, Peter A; Kriegeskorte, Nikolaus
2009-03-01
Conventional statistical analysis methods for functional magnetic resonance imaging (fMRI) data are very successful at detecting brain regions that are activated as a whole during specific mental activities. The overall activation of a region is usually taken to indicate involvement of the region in the task. However, such activation analysis does not consider the multivoxel patterns of activity within a brain region. These patterns of activity, which are thought to reflect neuronal population codes, can be investigated by pattern-information analysis. In this framework, a region's multivariate pattern information is taken to indicate representational content. This tutorial introduction motivates pattern-information analysis, explains its underlying assumptions, introduces the most widespread methods in an intuitive way, and outlines the basic sequence of analysis steps.
The Influence of Social Comparison on Visual Representation of One's Face
Zell, Ethan; Balcetis, Emily
2012-01-01
Can the effects of social comparison extend beyond explicit evaluation to visual self-representation—a perceptual stimulus that is objectively verifiable, unambiguous, and frequently updated? We morphed images of participants' faces with attractive and unattractive references. With access to a mirror, participants selected the morphed image they perceived as depicting their face. Participants who engaged in upward comparison with relevant attractive targets selected a less attractive morph compared to participants exposed to control images (Study 1). After downward comparison with relevant unattractive targets compared to control images, participants selected a more attractive morph (Study 2). Biased representations were not the products of cognitive accessibility of beauty constructs; comparisons did not influence representations of strangers' faces (Study 3). We discuss implications for vision, social comparison, and body image. PMID:22662124
An effective and efficient compression algorithm for ECG signals with irregular periods.
Chou, Hsiao-Hsuan; Chen, Ying-Jui; Shiau, Yu-Chien; Kuo, Te-Son
2006-06-01
This paper presents an effective and efficient preprocessing algorithm for two-dimensional (2-D) electrocardiogram (ECG) compression to better compress irregular ECG signals by exploiting their inter- and intra-beat correlations. To better reveal the correlation structure, we first convert the ECG signal into a proper 2-D representation, or image. This involves a few steps including QRS detection and alignment, period sorting, and length equalization. The resulting 2-D ECG representation is then ready to be compressed by an appropriate image compression algorithm. We choose the state-of-the-art JPEG2000 for its high efficiency and flexibility. In this way, the proposed algorithm is shown to outperform some existing arts in the literature by simultaneously achieving high compression ratio (CR), low percent root mean squared difference (PRD), low maximum error (MaxErr), and low standard derivation of errors (StdErr). In particular, because the proposed period sorting method rearranges the detected heartbeats into a smoother image that is easier to compress, this algorithm is insensitive to irregular ECG periods. Thus either the irregular ECG signals or the QRS false-detection cases can be better compressed. This is a significant improvement over existing 2-D ECG compression methods. Moreover, this algorithm is not tied exclusively to JPEG2000. It can also be combined with other 2-D preprocessing methods or appropriate codecs to enhance the compression performance in irregular ECG cases.
Donato, Gianluca; Bartlett, Marian Stewart; Hager, Joseph C.; Ekman, Paul; Sejnowski, Terrence J.
2010-01-01
The Facial Action Coding System (FACS) [23] is an objective method for quantifying facial movement in terms of component actions. This system is widely used in behavioral investigations of emotion, cognitive processes, and social interaction. The coding is presently performed by highly trained human experts. This paper explores and compares techniques for automatically recognizing facial actions in sequences of images. These techniques include analysis of facial motion through estimation of optical flow; holistic spatial analysis, such as principal component analysis, independent component analysis, local feature analysis, and linear discriminant analysis; and methods based on the outputs of local filters, such as Gabor wavelet representations and local principal components. Performance of these systems is compared to naive and expert human subjects. Best performances were obtained using the Gabor wavelet representation and the independent component representation, both of which achieved 96 percent accuracy for classifying 12 facial actions of the upper and lower face. The results provide converging evidence for the importance of using local filters, high spatial frequencies, and statistical independence for classifying facial actions. PMID:21188284
Age and gender classification in the wild with unsupervised feature learning
NASA Astrophysics Data System (ADS)
Wan, Lihong; Huo, Hong; Fang, Tao
2017-03-01
Inspired by unsupervised feature learning (UFL) within the self-taught learning framework, we propose a method based on UFL, convolution representation, and part-based dimensionality reduction to handle facial age and gender classification, which are two challenging problems under unconstrained circumstances. First, UFL is introduced to learn selective receptive fields (filters) automatically by applying whitening transformation and spherical k-means on random patches collected from unlabeled data. The learning process is fast and has no hyperparameters to tune. Then, the input image is convolved with these filters to obtain filtering responses on which local contrast normalization is applied. Average pooling and feature concatenation are then used to form global face representation. Finally, linear discriminant analysis with part-based strategy is presented to reduce the dimensions of the global representation and to improve classification performances further. Experiments on three challenging databases, namely, Labeled faces in the wild, Gallagher group photos, and Adience, demonstrate the effectiveness of the proposed method relative to that of state-of-the-art approaches.
Using eye movements to explore mental representations of space.
Fourtassi, Maryam; Rode, Gilles; Pisella, Laure
2017-06-01
Visual mental imagery is a cognitive experience characterised by the activation of the mental representation of an object or scene in the absence of the corresponding stimulus. According to the analogical theory, mental representations have a pictorial nature that preserves the spatial characteristics of the environment that is mentally represented. This cognitive experience shares many similarities with the experience of visual perception, including eye movements. The mental visualisation of a scene is accompanied by eye movements that reflect the spatial content of the mental image, and which can mirror the deformations of this mental image with respect to the real image, such as asymmetries or size reduction. The present article offers a concise overview of the main theories explaining the interactions between eye movements and mental representations, with some examples of the studies supporting them. It also aims to explain how ocular-tracking could be a useful tool in exploring the dynamics of spatial mental representations, especially in pathological situations where these representations can be altered, for instance in unilateral spatial neglect. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
The importance of context: evidence that contextual representations increase intrusive memories.
Pearson, David G; Ross, Fiona D C; Webster, Victoria L
2012-03-01
Intrusive memories appear to enter consciousness via involuntary rather than deliberate recollection. Some clinical accounts of PTSD seek to explain this phenomenon by making a clear distinction between the encoding of sensory-based and contextual representations. Contextual representations have been claimed to actively reduce intrusions by anchoring encoded perceptual data for an event in memory. The current analogue trauma study examined this hypothesis by manipulating contextual information independently from encoded sensory-perceptual information. Participants' viewed images selected from the International Affective Picture System that depicted scenes of violence and bodily injury. Images were viewed either under neutral conditions or paired with contextual information. Two experiments revealed a significant increase in memory intrusions for images paired with contextual information in comparison to the same images viewed under neutral conditions. In contrast to the observed increase in intrusion frequency there was no effect of contextual representations on voluntary memory for the images. The vividness and emotionality of memory intrusions were also unaffected. The analogue trauma paradigm may fail to replicate the effect of extreme stress on encoding postulated to occur during PTSD. These findings question the assertion that intrusive memories develop from a lack of integration between sensory-based and contextual representations in memory. Instead it is argued contextual representations play a causal role in increasing the frequency of intrusions by increasing the sensitivity of memory to involuntary retrieval by associated internal and external cues. Copyright © 2011 Elsevier Ltd. All rights reserved.
A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes
Just, Marcel Adam; Cherkassky, Vladimir L.; Aryal, Sandesh; Mitchell, Tom M.
2010-01-01
This article describes the discovery of a set of biologically-driven semantic dimensions underlying the neural representation of concrete nouns, and then demonstrates how a resulting theory of noun representation can be used to identify simple thoughts through their fMRI patterns. We use factor analysis of fMRI brain imaging data to reveal the biological representation of individual concrete nouns like apple, in the absence of any pictorial stimuli. From this analysis emerge three main semantic factors underpinning the neural representation of nouns naming physical objects, which we label manipulation, shelter, and eating. Each factor is neurally represented in 3–4 different brain locations that correspond to a cortical network that co-activates in non-linguistic tasks, such as tool use pantomime for the manipulation factor. Several converging methods, such as the use of behavioral ratings of word meaning and text corpus characteristics, provide independent evidence of the centrality of these factors to the representations. The factors are then used with machine learning classifier techniques to show that the fMRI-measured brain representation of an individual concrete noun like apple can be identified with good accuracy from among 60 candidate words, using only the fMRI activity in the 16 locations associated with these factors. To further demonstrate the generativity of the proposed account, a theory-based model is developed to predict the brain activation patterns for words to which the algorithm has not been previously exposed. The methods, findings, and theory constitute a new approach of using brain activity for understanding how object concepts are represented in the mind. PMID:20084104
A neurosemantic theory of concrete noun representation based on the underlying brain codes.
Just, Marcel Adam; Cherkassky, Vladimir L; Aryal, Sandesh; Mitchell, Tom M
2010-01-13
This article describes the discovery of a set of biologically-driven semantic dimensions underlying the neural representation of concrete nouns, and then demonstrates how a resulting theory of noun representation can be used to identify simple thoughts through their fMRI patterns. We use factor analysis of fMRI brain imaging data to reveal the biological representation of individual concrete nouns like apple, in the absence of any pictorial stimuli. From this analysis emerge three main semantic factors underpinning the neural representation of nouns naming physical objects, which we label manipulation, shelter, and eating. Each factor is neurally represented in 3-4 different brain locations that correspond to a cortical network that co-activates in non-linguistic tasks, such as tool use pantomime for the manipulation factor. Several converging methods, such as the use of behavioral ratings of word meaning and text corpus characteristics, provide independent evidence of the centrality of these factors to the representations. The factors are then used with machine learning classifier techniques to show that the fMRI-measured brain representation of an individual concrete noun like apple can be identified with good accuracy from among 60 candidate words, using only the fMRI activity in the 16 locations associated with these factors. To further demonstrate the generativity of the proposed account, a theory-based model is developed to predict the brain activation patterns for words to which the algorithm has not been previously exposed. The methods, findings, and theory constitute a new approach of using brain activity for understanding how object concepts are represented in the mind.
Geographical topic learning for social images with a deep neural network
NASA Astrophysics Data System (ADS)
Feng, Jiangfan; Xu, Xin
2017-03-01
The use of geographical tagging in social-media images is becoming a part of image metadata and a great interest for geographical information science. It is well recognized that geographical topic learning is crucial for geographical annotation. Existing methods usually exploit geographical characteristics using image preprocessing, pixel-based classification, and feature recognition. How to effectively exploit the high-level semantic feature and underlying correlation among different types of contents is a crucial task for geographical topic learning. Deep learning (DL) has recently demonstrated robust capabilities for image tagging and has been introduced into geoscience. It extracts high-level features computed from a whole image component, where the cluttered background may dominate spatial features in the deep representation. Therefore, a method of spatial-attentional DL for geographical topic learning is provided and we can regard it as a special case of DL combined with various deep networks and tuning tricks. Results demonstrated that the method is discriminative for different types of geographical topic learning. In addition, it outperforms other sequential processing models in a tagging task for a geographical image dataset.
New false color mapping for image fusion
NASA Astrophysics Data System (ADS)
Toet, Alexander; Walraven, Jan
1996-03-01
A pixel-based color-mapping algorithm is presented that produces a fused false color rendering of two gray-level images representing different sensor modalities. The resulting images have a higher information content than each of the original images and retain sensor-specific image information. The unique component of each image modality is enhanced in the resulting fused color image representation. First, the common component of the two original input images is determined. Second, the common component is subtracted from the original images to obtain the unique component of each image. Third, the unique component of each image modality is subtracted from the image of the other modality. This step serves to enhance the representation of sensor-specific details in the final fused result. Finally, a fused color image is produced by displaying the images resulting from the last step through, respectively, the red and green channels of a color display. The method is applied to fuse thermal and visual images. The results show that the color mapping enhances the visibility of certain details and preserves the specificity of the sensor information. The fused images also have a fairly natural appearance. The fusion scheme involves only operations on corresponding pixels. The resolution of a fused image is therefore directly related to the resolution of the input images. Before fusing, the contrast of the images can be enhanced and their noise can be reduced by standard image- processing techniques. The color mapping algorithm is computationally simple. This implies that the investigated approaches can eventually be applied in real time and that the hardware needed is not too complicated or too voluminous (an important consideration when it has to fit in an airplane, for instance).
Superpixel guided active contour segmentation of retinal layers in OCT volumes
NASA Astrophysics Data System (ADS)
Bai, Fangliang; Gibson, Stuart J.; Marques, Manuel J.; Podoleanu, Adrian
2018-03-01
Retinal OCT image segmentation is a precursor to subsequent medical diagnosis by a clinician or machine learning algorithm. In the last decade, many algorithms have been proposed to detect retinal layer boundaries and simplify the image representation. Inspired by the recent success of superpixel methods for pre-processing natural images, we present a novel framework for segmentation of retinal layers in OCT volume data. In our framework, the region of interest (e.g. the fovea) is located using an adaptive-curve method. The cell layer boundaries are then robustly detected firstly using 1D superpixels, applied to A-scans, and then fitting active contours in B-scan images. Thereafter the 3D cell layer surfaces are efficiently segmented from the volume data. The framework was tested on healthy eye data and we show that it is capable of segmenting up to 12 layers. The experimental results imply the effectiveness of proposed method and indicate its robustness to low image resolution and intrinsic speckle noise.
Spatial transform coding of color images.
NASA Technical Reports Server (NTRS)
Pratt, W. K.
1971-01-01
The application of the transform-coding concept to the coding of color images represented by three primary color planes of data is discussed. The principles of spatial transform coding are reviewed and the merits of various methods of color-image representation are examined. A performance analysis is presented for the color-image transform-coding system. Results of a computer simulation of the coding system are also given. It is shown that, by transform coding, the chrominance content of a color image can be coded with an average of 1.0 bits per element or less without serious degradation. If luminance coding is also employed, the average rate reduces to about 2.0 bits per element or less.
Transductive multi-view zero-shot learning.
Fu, Yanwei; Hospedales, Timothy M; Xiang, Tao; Gong, Shaogang
2015-11-01
Most existing zero-shot learning approaches exploit transfer learning via an intermediate semantic representation shared between an annotated auxiliary dataset and a target dataset with different classes and no annotation. A projection from a low-level feature space to the semantic representation space is learned from the auxiliary dataset and applied without adaptation to the target dataset. In this paper we identify two inherent limitations with these approaches. First, due to having disjoint and potentially unrelated classes, the projection functions learned from the auxiliary dataset/domain are biased when applied directly to the target dataset/domain. We call this problem the projection domain shift problem and propose a novel framework, transductive multi-view embedding, to solve it. The second limitation is the prototype sparsity problem which refers to the fact that for each target class, only a single prototype is available for zero-shot learning given a semantic representation. To overcome this problem, a novel heterogeneous multi-view hypergraph label propagation method is formulated for zero-shot learning in the transductive embedding space. It effectively exploits the complementary information offered by different semantic representations and takes advantage of the manifold structures of multiple representation spaces in a coherent manner. We demonstrate through extensive experiments that the proposed approach (1) rectifies the projection shift between the auxiliary and target domains, (2) exploits the complementarity of multiple semantic representations, (3) significantly outperforms existing methods for both zero-shot and N-shot recognition on three image and video benchmark datasets, and (4) enables novel cross-view annotation tasks.
Matsumoto, Yuji; Takaki, Yasuhiro
2014-06-15
Horizontally scanning holography can enlarge both screen size and viewing zone angle. A microelectromechanical-system spatial light modulator, which can generate only binary images, is used to generate hologram patterns. Thus, techniques to improve gray-scale representation in reconstructed images should be developed. In this study, the error diffusion technique was used for the binarization of holograms. When the Floyd-Steinberg error diffusion coefficients were used, gray-scale representation was improved. However, the linearity in the gray-scale representation was not satisfactory. We proposed the use of a correction table and showed that the linearity was greatly improved.
Picture This... Developing Standards for Electronic Images at the National Library of Medicine
Masys, Daniel R.
1990-01-01
New computer technologies have made it feasible to represent, store, and communicate high resolution biomedical images via electronic means. Traditional two dimensional medical images such as those on printed pages have been supplemented by three dimensional images which can be rendered, rotated, and “dissected” from any point of view. The library of the future will provide electronic access not only to words and numbers, but to pictures, sounds, and other nontextual information. There currently exist few widely-accepted standards for the representation and communication of complex images, yet such standards will be critical to the feasibility and usefulness of digital image collections in the life sciences. The National Library of Medicine is embarked on a project to develop a complete digital volumetric representation of an adult human male and female. This “Visible Human Project” will address the issue of standards for computer representation of biological structure.
Guariglia, Cecilia; Palermo, Liana; Piccardi, Laura; Iaria, Giuseppe; Incoccia, Chiara
2013-01-01
Representational neglect, which is characterized by the failure to report left-sided details of a mental image from memory, can occur after a right hemisphere lesion. In this study, we set out to verify the hypothesis that two distinct forms of representational neglect exist, one involving object representation and the other environmental representation. As representational neglect is considered rare, we also evaluated the prevalence and frequency of its association with perceptual neglect. We submitted a group of 96 unselected, consecutive, chronic, right brain-damaged patients to an extensive neuropsychological evaluation that included two representational neglect tests: the Familiar Square Description Test and the O'Clock Test. Representational neglect, as well as perceptual neglect, was present in about one-third of the sample. Most patients neglected the left side of imagined familiar squares but not the left side of imagined clocks. The present data show that representational neglect is not a rare disorder and also support the hypothesis that two different types of mental representations (i.e. topological and non-topological images) may be selectively damaged in representational neglect. PMID:23874416
Light field rendering with omni-directional camera
NASA Astrophysics Data System (ADS)
Todoroki, Hiroshi; Saito, Hideo
2003-06-01
This paper presents an approach to capture visual appearance of a real environment such as an interior of a room. We propose the method for generating arbitrary viewpoint images by building light field with the omni-directional camera, which can capture the wide circumferences. Omni-directional camera used in this technique is a special camera with the hyperbolic mirror in the upper part of a camera, so that we can capture luminosity in the environment in the range of 360 degree of circumferences in one image. We apply the light field method, which is one technique of Image-Based-Rendering(IBR), for generating the arbitrary viewpoint images. The light field is a kind of the database that records the luminosity information in the object space. We employ the omni-directional camera for constructing the light field, so that we can collect many view direction images in the light field. Thus our method allows the user to explore the wide scene, that can acheive realistic representation of virtual enviroment. For demonstating the proposed method, we capture image sequence in our lab's interior environment with an omni-directional camera, and succesfully generate arbitray viewpoint images for virual tour of the environment.
Local spatial frequency analysis for computer vision
NASA Technical Reports Server (NTRS)
Krumm, John; Shafer, Steven A.
1990-01-01
A sense of vision is a prerequisite for a robot to function in an unstructured environment. However, real-world scenes contain many interacting phenomena that lead to complex images which are difficult to interpret automatically. Typical computer vision research proceeds by analyzing various effects in isolation (e.g., shading, texture, stereo, defocus), usually on images devoid of realistic complicating factors. This leads to specialized algorithms which fail on real-world images. Part of this failure is due to the dichotomy of useful representations for these phenomena. Some effects are best described in the spatial domain, while others are more naturally expressed in frequency. In order to resolve this dichotomy, we present the combined space/frequency representation which, for each point in an image, shows the spatial frequencies at that point. Within this common representation, we develop a set of simple, natural theories describing phenomena such as texture, shape, aliasing and lens parameters. We show these theories lead to algorithms for shape from texture and for dealiasing image data. The space/frequency representation should be a key aid in untangling the complex interaction of phenomena in images, allowing automatic understanding of real-world scenes.
Abnormality detection of mammograms by discriminative dictionary learning on DSIFT descriptors.
Tavakoli, Nasrin; Karimi, Maryam; Nejati, Mansour; Karimi, Nader; Reza Soroushmehr, S M; Samavi, Shadrokh; Najarian, Kayvan
2017-07-01
Detection and classification of breast lesions using mammographic images are one of the most difficult studies in medical image processing. A number of learning and non-learning methods have been proposed for detecting and classifying these lesions. However, the accuracy of the detection/classification still needs improvement. In this paper we propose a powerful classification method based on sparse learning to diagnose breast cancer in mammograms. For this purpose, a supervised discriminative dictionary learning approach is applied on dense scale invariant feature transform (DSIFT) features. A linear classifier is also simultaneously learned with the dictionary which can effectively classify the sparse representations. Our experimental results show the superior performance of our method compared to existing approaches.
Dynamic facial expression recognition based on geometric and texture features
NASA Astrophysics Data System (ADS)
Li, Ming; Wang, Zengfu
2018-04-01
Recently, dynamic facial expression recognition in videos has attracted growing attention. In this paper, we propose a novel dynamic facial expression recognition method by using geometric and texture features. In our system, the facial landmark movements and texture variations upon pairwise images are used to perform the dynamic facial expression recognition tasks. For one facial expression sequence, pairwise images are created between the first frame and each of its subsequent frames. Integration of both geometric and texture features further enhances the representation of the facial expressions. Finally, Support Vector Machine is used for facial expression recognition. Experiments conducted on the extended Cohn-Kanade database show that our proposed method can achieve a competitive performance with other methods.
Deformable MR Prostate Segmentation via Deep Feature Learning and Sparse Patch Matching
Guo, Yanrong; Gao, Yaozong
2016-01-01
Automatic and reliable segmentation of the prostate is an important but difficult task for various clinical applications such as prostate cancer radiotherapy. The main challenges for accurate MR prostate localization lie in two aspects: (1) inhomogeneous and inconsistent appearance around prostate boundary, and (2) the large shape variation across different patients. To tackle these two problems, we propose a new deformable MR prostate segmentation method by unifying deep feature learning with the sparse patch matching. First, instead of directly using handcrafted features, we propose to learn the latent feature representation from prostate MR images by the stacked sparse auto-encoder (SSAE). Since the deep learning algorithm learns the feature hierarchy from the data, the learned features are often more concise and effective than the handcrafted features in describing the underlying data. To improve the discriminability of learned features, we further refine the feature representation in a supervised fashion. Second, based on the learned features, a sparse patch matching method is proposed to infer a prostate likelihood map by transferring the prostate labels from multiple atlases to the new prostate MR image. Finally, a deformable segmentation is used to integrate a sparse shape model with the prostate likelihood map for achieving the final segmentation. The proposed method has been extensively evaluated on the dataset that contains 66 T2-wighted prostate MR images. Experimental results show that the deep-learned features are more effective than the handcrafted features in guiding MR prostate segmentation. Moreover, our method shows superior performance than other state-of-the-art segmentation methods. PMID:26685226
Electrical and fluid transport in consolidated sphere packs
NASA Astrophysics Data System (ADS)
Zhan, Xin; Schwartz, Lawrence M.; Toksöz, M. Nafi
2015-05-01
We calculate geometrical and transport properties (electrical conductivity, permeability, specific surface area, and surface conductivity) of a family of model granular porous media from an image based representation of its microstructure. The models are based on the packing described by Finney and cover a wide range of porosities. Finite difference methods are applied to solve for electrical conductivity and hydraulic permeability. Two image processing methods are used to identify the pore-grain interface and to test correlations linking permeability to electrical conductivity. A three phase conductivity model is developed to compute surface conductivity associated with the grain-pore interface. Our results compare well against empirical models over the entire porosity range studied. We conclude by examining the influence of image resolution on our calculations.
Figure-ground segmentation based on class-independent shape priors
NASA Astrophysics Data System (ADS)
Li, Yang; Liu, Yang; Liu, Guojun; Guo, Maozu
2018-01-01
We propose a method to generate figure-ground segmentation by incorporating shape priors into the graph-cuts algorithm. Given an image, we first obtain a linear representation of an image and then apply directional chamfer matching to generate class-independent, nonparametric shape priors, which provide shape clues for the graph-cuts algorithm. We then enforce shape priors in a graph-cuts energy function to produce object segmentation. In contrast to previous segmentation methods, the proposed method shares shape knowledge for different semantic classes and does not require class-specific model training. Therefore, the approach obtains high-quality segmentation for objects. We experimentally validate that the proposed method outperforms previous approaches using the challenging PASCAL VOC 2010/2012 and Berkeley (BSD300) segmentation datasets.
NASA Astrophysics Data System (ADS)
Zhang, Xiaolei; Zhang, Xiangchao; Yuan, He; Zhang, Hao; Xu, Min
2018-02-01
Digital holography is a promising measurement method in the fields of bio-medicine and micro-electronics. But the captured images of digital holography are severely polluted by the speckle noise because of optical scattering and diffraction. Via analyzing the properties of Fresnel diffraction and the topographies of micro-structures, a novel reconstruction method based on the dual-tree complex wavelet transform (DT-CWT) is proposed. This algorithm is shiftinvariant and capable of obtaining sparse representations for the diffracted signals of salient features, thus it is well suited for multiresolution processing of the interferometric holograms of directional morphologies. An explicit representation of orthogonal Fresnel DT-CWT bases and a specific filtering method are developed. This method can effectively remove the speckle noise without destroying the salient features. Finally, the proposed reconstruction method is compared with the conventional Fresnel diffraction integration and Fresnel wavelet transform with compressive sensing methods to validate its remarkable superiority on the aspects of topography reconstruction and speckle removal.
Visual information processing II; Proceedings of the Meeting, Orlando, FL, Apr. 14-16, 1993
NASA Technical Reports Server (NTRS)
Huck, Friedrich O. (Editor); Juday, Richard D. (Editor)
1993-01-01
Various papers on visual information processing are presented. Individual topics addressed include: aliasing as noise, satellite image processing using a hammering neural network, edge-detetion method using visual perception, adaptive vector median filters, design of a reading test for low-vision image warping, spatial transformation architectures, automatic image-enhancement method, redundancy reduction in image coding, lossless gray-scale image compression by predictive GDF, information efficiency in visual communication, optimizing JPEG quantization matrices for different applications, use of forward error correction to maintain image fidelity, effect of peanoscanning on image compression. Also discussed are: computer vision for autonomous robotics in space, optical processor for zero-crossing edge detection, fractal-based image edge detection, simulation of the neon spreading effect by bandpass filtering, wavelet transform (WT) on parallel SIMD architectures, nonseparable 2D wavelet image representation, adaptive image halftoning based on WT, wavelet analysis of global warming, use of the WT for signal detection, perfect reconstruction two-channel rational filter banks, N-wavelet coding for pattern classification, simulation of image of natural objects, number-theoretic coding for iconic systems.
A multiresolution prostate representation for automatic segmentation in magnetic resonance images.
Alvarez, Charlens; Martínez, Fabio; Romero, Eduardo
2017-04-01
Accurate prostate delineation is necessary in radiotherapy processes for concentrating the dose onto the prostate and reducing side effects in neighboring organs. Currently, manual delineation is performed over magnetic resonance imaging (MRI) taking advantage of its high soft tissue contrast property. Nevertheless, as human intervention is a consuming task with high intra- and interobserver variability rates, (semi)-automatic organ delineation tools have emerged to cope with these challenges, reducing the time spent for these tasks. This work presents a multiresolution representation that defines a novel metric and allows to segment a new prostate by combining a set of most similar prostates in a dataset. The proposed method starts by selecting the set of most similar prostates with respect to a new one using the proposed multiresolution representation. This representation characterizes the prostate through a set of salient points, extracted from a region of interest (ROI) that encloses the organ and refined using structural information, allowing to capture main relevant features of the organ boundary. Afterward, the new prostate is automatically segmented by combining the nonrigidly registered expert delineations associated to the previous selected similar prostates using a weighted patch-based strategy. Finally, the prostate contour is smoothed based on morphological operations. The proposed approach was evaluated with respect to the expert manual segmentation under a leave-one-out scheme using two public datasets, obtaining averaged Dice coefficients of 82% ± 0.07 and 83% ± 0.06, and demonstrating a competitive performance with respect to atlas-based state-of-the-art methods. The proposed multiresolution representation provides a feature space that follows a local salient point criteria and a global rule of the spatial configuration among these points to find out the most similar prostates. This strategy suggests an easy adaptation in the clinical routine, as supporting tool for annotation. © 2017 American Association of Physicists in Medicine.
Cross-modal learning to rank via latent joint representation.
Wu, Fei; Jiang, Xinyang; Li, Xi; Tang, Siliang; Lu, Weiming; Zhang, Zhongfei; Zhuang, Yueting
2015-05-01
Cross-modal ranking is a research topic that is imperative to many applications involving multimodal data. Discovering a joint representation for multimodal data and learning a ranking function are essential in order to boost the cross-media retrieval (i.e., image-query-text or text-query-image). In this paper, we propose an approach to discover the latent joint representation of pairs of multimodal data (e.g., pairs of an image query and a text document) via a conditional random field and structural learning in a listwise ranking manner. We call this approach cross-modal learning to rank via latent joint representation (CML²R). In CML²R, the correlations between multimodal data are captured in terms of their sharing hidden variables (e.g., topics), and a hidden-topic-driven discriminative ranking function is learned in a listwise ranking manner. The experiments show that the proposed approach achieves a good performance in cross-media retrieval and meanwhile has the capability to learn the discriminative representation of multimodal data.
Shared liking and association valence for representational art but not abstract art.
Schepman, Astrid; Rodway, Paul; Pullen, Sarah J; Kirkham, Julie
2015-01-01
We examined the finding that aesthetic evaluations are more similar across observers for representational images than for abstract images. It has been proposed that a difference in convergence of observers' tastes is due to differing levels of shared semantic associations (Vessel & Rubin, 2010). In Experiment 1, student participants rated 20 representational and 20 abstract artworks. We found that their judgments were more similar for representational than abstract artworks. In Experiment 2, we replicated this finding, and also found that valence ratings given to associations and meanings provided in response to the artworks converged more across observers for representational than for abstract art. Our empirical work provides insight into processes that may underlie the observation that taste for representational art is shared across individual observers, while taste for abstract art is more idiosyncratic.
Reducible dictionaries for single image super-resolution based on patch matching and mean shifting
NASA Astrophysics Data System (ADS)
Rasti, Pejman; Nasrollahi, Kamal; Orlova, Olga; Tamberg, Gert; Moeslund, Thomas B.; Anbarjafari, Gholamreza
2017-03-01
A single-image super-resolution (SR) method is proposed. The proposed method uses a generated dictionary from pairs of high resolution (HR) images and their corresponding low resolution (LR) representations. First, HR images and the corresponding LR ones are divided into patches of HR and LR, respectively, and then they are collected into separate dictionaries. Afterward, when performing SR, the distance between every patch of the input LR image and those of available LR patches in the LR dictionary is calculated. The minimum distance between the input LR patch and those in the LR dictionary is taken, and its counterpart from the HR dictionary is passed through an illumination enhancement process. By this technique, the noticeable change of illumination between neighbor patches in the super-resolved image is significantly reduced. The enhanced HR patch represents the HR patch of the super-resolved image. Finally, to remove the blocking effect caused by merging the patches, an average of the obtained HR image and the interpolated image obtained using bicubic interpolation is calculated. The quantitative and qualitative analyses show the superiority of the proposed technique over the conventional and state-of-art methods.
Cascaded K-means convolutional feature learner and its application to face recognition
NASA Astrophysics Data System (ADS)
Zhou, Daoxiang; Yang, Dan; Zhang, Xiaohong; Huang, Sheng; Feng, Shu
2017-09-01
Currently, considerable efforts have been devoted to devise image representation. However, handcrafted methods need strong domain knowledge and show low generalization ability, and conventional feature learning methods require enormous training data and rich parameters tuning experience. A lightened feature learner is presented to solve these problems with application to face recognition, which shares similar topology architecture as a convolutional neural network. Our model is divided into three components: cascaded convolution filters bank learning layer, nonlinear processing layer, and feature pooling layer. Specifically, in the filters learning layer, we use K-means to learn convolution filters. Features are extracted via convoluting images with the learned filters. Afterward, in the nonlinear processing layer, hyperbolic tangent is employed to capture the nonlinear feature. In the feature pooling layer, to remove the redundancy information and incorporate the spatial layout, we exploit multilevel spatial pyramid second-order pooling technique to pool the features in subregions and concatenate them together as the final representation. Extensive experiments on four representative datasets demonstrate the effectiveness and robustness of our model to various variations, yielding competitive recognition results on extended Yale B and FERET. In addition, our method achieves the best identification performance on AR and labeled faces in the wild datasets among the comparative methods.
Decoding representations of face identity that are tolerant to rotation.
Anzellotti, Stefano; Fairhall, Scott L; Caramazza, Alfonso
2014-08-01
In order to recognize the identity of a face we need to distinguish very similar images (specificity) while also generalizing identity information across image transformations such as changes in orientation (tolerance). Recent studies investigated the representation of individual faces in the brain, but it remains unclear whether the human brain regions that were found encode representations of individual images (specificity) or face identity (specificity plus tolerance). In the present article, we use multivoxel pattern analysis in the human ventral stream to investigate the representation of face identity across rotations in depth, a kind of transformation in which no point in the face image remains unchanged. The results reveal representations of face identity that are tolerant to rotations in depth in occipitotemporal cortex and in anterior temporal cortex, even when the similarity between mirror symmetrical views cannot be used to achieve tolerance. Converging evidence from different analysis techniques shows that the right anterior temporal lobe encodes a comparable amount of identity information to occipitotemporal regions, but this information is encoded over a smaller extent of cortex. © The Author 2013. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Menon, Nadia; White, David; Kemp, Richard I
2015-01-01
According to cognitive and neurological models of the face-processing system, faces are represented at two levels of abstraction. First, image-based pictorial representations code a particular instance of a face and include information that is unrelated to identity-such as lighting, pose, and expression. Second, at a more abstract level, identity-specific representations combine information from various encounters with a single face. Here we tested whether identity-level representations mediate unfamiliar face matching performance. Across three experiments we manipulated identity attributions to pairs of target images and measured the effect on subsequent identification decisions. Participants were instructed that target images were either two photos of the same person (1ID condition) or photos of two different people (2ID condition). This manipulation consistently affected performance in sequential matching: 1ID instructions improved accuracy on "match" trials and caused participants to adopt a more liberal response bias than the 2ID condition. However, this manipulation did not affect performance in simultaneous matching. We conclude that identity-level representations, generated in working memory, influence the amount of variation tolerated between images, when making identity judgements in sequential face matching.
NASA Astrophysics Data System (ADS)
Wapenaar, Kees; Thorbecke, Jan; van der Neut, Joost
2016-04-01
Green's theorem plays a fundamental role in a diverse range of wavefield imaging applications, such as holographic imaging, inverse scattering, time-reversal acoustics and interferometric Green's function retrieval. In many of those applications, the homogeneous Green's function (i.e. the Green's function of the wave equation without a singularity on the right-hand side) is represented by a closed boundary integral. In practical applications, sources and/or receivers are usually present only on an open surface, which implies that a significant part of the closed boundary integral is by necessity ignored. Here we derive a homogeneous Green's function representation for the common situation that sources and/or receivers are present on an open surface only. We modify the integrand in such a way that it vanishes on the part of the boundary where no sources and receivers are present. As a consequence, the remaining integral along the open surface is an accurate single-sided representation of the homogeneous Green's function. This single-sided representation accounts for all orders of multiple scattering. The new representation significantly improves the aforementioned wavefield imaging applications, particularly in situations where the first-order scattering approximation breaks down.
Noisy image magnification with total variation regularization and order-changed dictionary learning
NASA Astrophysics Data System (ADS)
Xu, Jian; Chang, Zhiguo; Fan, Jiulun; Zhao, Xiaoqiang; Wu, Xiaomin; Wang, Yanzi
2015-12-01
Noisy low resolution (LR) images are always obtained in real applications, but many existing image magnification algorithms can not get good result from a noisy LR image. We propose a two-step image magnification algorithm to solve this problem. The proposed algorithm takes the advantages of both regularization-based method and learning-based method. The first step is based on total variation (TV) regularization and the second step is based on sparse representation. In the first step, we add a constraint on the TV regularization model to magnify the LR image and at the same time to suppress the noise in it. In the second step, we propose an order-changed dictionary training algorithm to train the dictionaries which is dominated by texture details. Experimental results demonstrate that the proposed algorithm performs better than many other algorithms when the noise is not serious. The proposed algorithm can also provide better visual quality on natural LR images.
Preservice Teachers' Images of Scientists: Do Prior Science Experiences Make a Difference?
NASA Astrophysics Data System (ADS)
Milford, Todd M.; Tippett, Christine D.
2013-06-01
This article presents the results of a mixed methods study that used the Draw-a-Scientist Test as a visual tool for exploring preservice teachers' beliefs about scientists. A questionnaire was also administered to 165 students who were enrolled in elementary (K-8) and secondary (8-12) science methods courses. Taken as a whole, the images drawn by preservice teachers reflected the stereotype of a scientist as a man with a wild hairdo who wears a lab coat and glasses while working in a laboratory setting. However, results indicated statistically significant differences in stereotypical components of representations of scientists depending on preservice teachers' program and previous science experiences. Post degree students in secondary science methods courses created images of scientists with fewer stereotypical elements than drawings created by students in the regular elementary program.
High Resolution Signal Processing
1993-08-19
Donald Tufts, Journal of Visual Communication and Image Representation, Vol.2, No. 4 PP.395-404, December 1991 "* "Iterative Realization of the...Chen and Donald Tufts , Journal of Visual Communication and Image Representation, Vol.2, No. 4 PP.395-404, December 1991. * "Fast Maximum Likelihood
Knowledge Representation Of CT Scans Of The Head
NASA Astrophysics Data System (ADS)
Ackerman, Laurens V.; Burke, M. W.; Rada, Roy
1984-06-01
We have been investigating diagnostic knowledge models which assist in the automatic classification of medical images by combining information extracted from each image with knowledge specific to that class of images. In a more general sense we are trying to integrate verbal and pictorial descriptions of disease via representations of knowledge, study automatic hypothesis generation as related to clinical medicine, evolve new mathematical image measures while integrating them into the total diagnostic process, and investigate ways to augment the knowledge of the physician. Specifically, we have constructed an artificial intelligence knowledge model using the technique of a production system blending pictorial and verbal knowledge about the respective CT scan and patient history. It is an attempt to tie together different sources of knowledge representation, picture feature extraction and hypothesis generation. Our knowledge reasoning and representation system (KRRS) works with data at the conscious reasoning level of the practicing physician while at the visual perceptional level we are building another production system, the picture parameter extractor (PPE). This paper describes KRRS and its relationship to PPE.
Fox, Christopher J; Barton, Jason J S
2007-01-05
The neural representation of facial expression within the human visual system is not well defined. Using an adaptation paradigm, we examined aftereffects on expression perception produced by various stimuli. Adapting to a face, which was used to create morphs between two expressions, substantially biased expression perception within the morphed faces away from the adapting expression. This adaptation was not based on low-level image properties, as a different image of the same person displaying that expression produced equally robust aftereffects. Smaller but significant aftereffects were generated by images of different individuals, irrespective of gender. Non-face visual, auditory, or verbal representations of emotion did not generate significant aftereffects. These results suggest that adaptation affects at least two neural representations of expression: one specific to the individual (not the image), and one that represents expression across different facial identities. The identity-independent aftereffect suggests the existence of a 'visual semantic' for facial expression in the human visual system.
2017-01-01
Although deep learning approaches have had tremendous success in image, video and audio processing, computer vision, and speech recognition, their applications to three-dimensional (3D) biomolecular structural data sets have been hindered by the geometric and biological complexity. To address this problem we introduce the element-specific persistent homology (ESPH) method. ESPH represents 3D complex geometry by one-dimensional (1D) topological invariants and retains important biological information via a multichannel image-like representation. This representation reveals hidden structure-function relationships in biomolecules. We further integrate ESPH and deep convolutional neural networks to construct a multichannel topological neural network (TopologyNet) for the predictions of protein-ligand binding affinities and protein stability changes upon mutation. To overcome the deep learning limitations from small and noisy training sets, we propose a multi-task multichannel topological convolutional neural network (MM-TCNN). We demonstrate that TopologyNet outperforms the latest methods in the prediction of protein-ligand binding affinities, mutation induced globular protein folding free energy changes, and mutation induced membrane protein folding free energy changes. Availability: weilab.math.msu.edu/TDL/ PMID:28749969
Optimizing image registration and infarct definition in stroke research.
Harston, George W J; Minks, David; Sheerin, Fintan; Payne, Stephen J; Chappell, Michael; Jezzard, Peter; Jenkinson, Mark; Kennedy, James
2017-03-01
Accurate representation of final infarct volume is essential for assessing the efficacy of stroke interventions in imaging-based studies. This study defines the impact of image registration methods used at different timepoints following stroke, and the implications for infarct definition in stroke research. Patients presenting with acute ischemic stroke were imaged serially using magnetic resonance imaging. Infarct volume was defined manually using four metrics: 24-h b1000 imaging; 1-week and 1-month T2-weighted FLAIR; and automatically using predefined thresholds of ADC at 24 h. Infarct overlap statistics and volumes were compared across timepoints following both rigid body and nonlinear image registration to the presenting MRI. The effect of nonlinear registration on a hypothetical trial sample size was calculated. Thirty-seven patients were included. Nonlinear registration improved infarct overlap statistics and consistency of total infarct volumes across timepoints, and reduced infarct volumes by 4.0 mL (13.1%) and 7.1 mL (18.2%) at 24 h and 1 week, respectively, compared to rigid body registration. Infarct volume at 24 h, defined using a predetermined ADC threshold, was less sensitive to infarction than b1000 imaging. 1-week T2-weighted FLAIR imaging was the most accurate representation of final infarct volume. Nonlinear registration reduced hypothetical trial sample size, independent of infarct volume, by an average of 13%. Nonlinear image registration may offer the opportunity of improving the accuracy of infarct definition in serial imaging studies compared to rigid body registration, helping to overcome the challenges of anatomical distortions at subacute timepoints, and reducing sample size for imaging-based clinical trials.
A novel framework for change detection in bi-temporal polarimetric SAR images
NASA Astrophysics Data System (ADS)
Pirrone, Davide; Bovolo, Francesca; Bruzzone, Lorenzo
2016-10-01
Last years have seen relevant increase of polarimetric Synthetic Aperture Radar (SAR) data availability, thanks to satellite sensors like Sentinel-1 or ALOS-2 PALSAR-2. The augmented information lying in the additional polarimetric channels represents a possibility for better discriminate different classes of changes in change detection (CD) applications. This work aims at proposing a framework for CD in multi-temporal multi-polarization SAR data. The framework includes both a tool for an effective visual representation of the change information and a method for extracting the multiple-change information. Both components are designed to effectively handle the multi-dimensionality of polarimetric data. In the novel representation, multi-temporal intensity SAR data are employed to compute a polarimetric log-ratio. The multitemporal information of the polarimetric log-ratio image is represented in a multi-dimensional features space, where changes are highlighted in terms of magnitude and direction. This representation is employed to design a novel unsupervised multi-class CD approach. This approach considers a sequential two-step analysis of the magnitude and the direction information for separating non-changed and changed samples. The proposed approach has been validated on a pair of Sentinel-1 data acquired before and after the flood in Tamil-Nadu in 2015. Preliminary results demonstrate that the representation tool is effective and that the use of polarimetric SAR data is promising in multi-class change detection applications.
Zeugin, David; Arfa, Norhan; Notter, Michael; Murray, Micah M; Ionta, Silvio
2017-08-30
Face recognition is an apparently straightforward but, in fact, complex ability, encompassing the activation of at least visual and somatosensory representations. Understanding how identity shapes the interplay between these face-related affordances could clarify the mechanisms of self-other discrimination. To this aim, we exploited the so-called "face inversion effect" (FIE), a specific bias in the mental rotation of face images (of other people): with respect to inanimate objects, face images require longer time to be mentally rotated from the upside-down. Via the FIE, which suggests the activation of somatosensory mechanisms, we assessed identity-related changes in the interplay between visual and somatosensory affordances between self- and other-face representations. Methodologically, to avoid the potential interference of the somatosensory feedback associated with musculoskeletal movements, we introduced the tracking of gaze direction to record participants' response. Response times from twenty healthy participants showed the larger FIE for self- than other-faces, suggesting that the impact of somatosensory affordances on mental representation of faces varies according to identity. The present study lays the foundations of a quantifiable method to implicitly assess self-other discrimination, with possible translational benefits for early diagnosis of face processing disturbances (e.g. prosopagnosia), and for neurophysiological studies on self-other discrimination in ethological settings. Copyright © 2017 Elsevier B.V. All rights reserved.
A Balanced Comparison of Object Invariances in Monkey IT Neurons.
Ratan Murty, N Apurva; Arun, Sripati P
2017-01-01
Our ability to recognize objects across variations in size, position, or rotation is based on invariant object representations in higher visual cortex. However, we know little about how these invariances are related. Are some invariances harder than others? Do some invariances arise faster than others? These comparisons can be made only upon equating image changes across transformations. Here, we targeted invariant neural representations in the monkey inferotemporal (IT) cortex using object images with balanced changes in size, position, and rotation. Across the recorded population, IT neurons generalized across size and position both stronger and faster than to rotations in the image plane as well as in depth. We obtained a similar ordering of invariances in deep neural networks but not in low-level visual representations. Thus, invariant neural representations dynamically evolve in a temporal order reflective of their underlying computational complexity.
A multimodal biometric authentication system based on 2D and 3D palmprint features
NASA Astrophysics Data System (ADS)
Aggithaya, Vivek K.; Zhang, David; Luo, Nan
2008-03-01
This paper presents a new personal authentication system that simultaneously exploits 2D and 3D palmprint features. Here, we aim to improve the accuracy and robustness of existing palmprint authentication systems using 3D palmprint features. The proposed system uses an active stereo technique, structured light, to capture 3D image or range data of the palm and a registered intensity image simultaneously. The surface curvature based method is employed to extract features from 3D palmprint and Gabor feature based competitive coding scheme is used for 2D representation. We individually analyze these representations and attempt to combine them with score level fusion technique. Our experiments on a database of 108 subjects achieve significant improvement in performance (Equal Error Rate) with the integration of 3D features as compared to the case when 2D palmprint features alone are employed.
Embedded sparse representation of fMRI data via group-wise dictionary optimization
NASA Astrophysics Data System (ADS)
Zhu, Dajiang; Lin, Binbin; Faskowitz, Joshua; Ye, Jieping; Thompson, Paul M.
2016-03-01
Sparse learning enables dimension reduction and efficient modeling of high dimensional signals and images, but it may need to be tailored to best suit specific applications and datasets. Here we used sparse learning to efficiently represent functional magnetic resonance imaging (fMRI) data from the human brain. We propose a novel embedded sparse representation (ESR), to identify the most consistent dictionary atoms across different brain datasets via an iterative group-wise dictionary optimization procedure. In this framework, we introduced additional criteria to make the learned dictionary atoms more consistent across different subjects. We successfully identified four common dictionary atoms that follow the external task stimuli with very high accuracy. After projecting the corresponding coefficient vectors back into the 3-D brain volume space, the spatial patterns are also consistent with traditional fMRI analysis results. Our framework reveals common features of brain activation in a population, as a new, efficient fMRI analysis method.
Fractals, Fuzzy Sets And Image Representation
NASA Astrophysics Data System (ADS)
Dodds, D. R.
1988-10-01
This paper addresses some uses of fractals, fuzzy sets and image representation as it pertains to robotic grip planning and autonomous vehicle navigation AVN. The robot/vehicle is assumed to be equipped with multimodal sensors including ultrashort pulse imaging laser rangefinder. With a temporal resolution of 50 femtoseconds a time of flight laser rangefinder can resolve distances within approximately half an inch or 1.25 centimeters. (Fujimoto88)
Processing the image gradient field using a topographic primal sketch approach.
Gambaruto, A M
2015-03-01
The spatial derivatives of the image intensity provide topographic information that may be used to identify and segment objects. The accurate computation of the derivatives is often hampered in medical images by the presence of noise and a limited resolution. This paper focuses on accurate computation of spatial derivatives and their subsequent use to process an image gradient field directly, from which an image with improved characteristics can be reconstructed. The improvements include noise reduction, contrast enhancement, thinning object contours and the preservation of edges. Processing the gradient field directly instead of the image is shown to have numerous benefits. The approach is developed such that the steps are modular, allowing the overall method to be improved and possibly tailored to different applications. As presented, the approach relies on a topographic representation and primal sketch of an image. Comparisons with existing image processing methods on a synthetic image and different medical images show improved results and accuracy in segmentation. Here, the focus is on objects with low spatial resolution, which is often the case in medical images. The methods developed show the importance of improved accuracy in derivative calculation and the potential in processing the image gradient field directly. Copyright © 2015 John Wiley & Sons, Ltd. Copyright © 2015 John Wiley & Sons, Ltd.
Liu, Dong; Wang, Shengsheng; Huang, Dezhi; Deng, Gang; Zeng, Fantao; Chen, Huiling
2016-05-01
Medical image recognition is an important task in both computer vision and computational biology. In the field of medical image classification, representing an image based on local binary patterns (LBP) descriptor has become popular. However, most existing LBP-based methods encode the binary patterns in a fixed neighborhood radius and ignore the spatial relationships among local patterns. The ignoring of the spatial relationships in the LBP will cause a poor performance in the process of capturing discriminative features for complex samples, such as medical images obtained by microscope. To address this problem, in this paper we propose a novel method to improve local binary patterns by assigning an adaptive neighborhood radius for each pixel. Based on these adaptive local binary patterns, we further propose a spatial adjacent histogram strategy to encode the micro-structures for image representation. An extensive set of evaluations are performed on four medical datasets which show that the proposed method significantly improves standard LBP and compares favorably with several other prevailing approaches. Copyright © 2016 Elsevier Ltd. All rights reserved.
Eye Detection and Tracking for Intelligent Human Computer Interaction
2006-02-01
Meer and I. Weiss, “Smoothed Differentiation Filters for Images”, Journal of Visual Communication and Image Representation, 3(1):58-72, 1992. [13...25] P. Meer and I. Weiss. “Smoothed differentiation filters for images”. Journal of Visual Communication and Image Representation, 3(1), 1992
Baum, S.; Sillem, M.; Ney, J. T.; Baum, A.; Friedrich, M.; Radosa, J.; Kramer, K. M.; Gronwald, B.; Gottschling, S.; Solomayer, E. F.; Rody, A.; Joukhadar, R.
2017-01-01
Introduction Minimally invasive operative techniques are being used increasingly in gynaecological surgery. The expansion of the laparoscopic operation spectrum is in part the result of improved imaging. This study investigates the practical advantages of using 3D cameras in routine surgical practice. Materials and Methods Two different 3-dimensional camera systems were compared with a 2-dimensional HD system; the operating surgeonʼs experiences were documented immediately postoperatively using a questionnaire. Results Significant advantages were reported for suturing and cutting of anatomical structures when using the 3D compared to 2D camera systems. There was only a slight advantage for coagulating. The use of 3D cameras significantly improved the general operative visibility and in particular the representation of spacial depth compared to 2-dimensional images. There was not a significant advantage for image width. Depiction of adhesions and retroperitoneal neural structures was significantly improved by the stereoscopic cameras, though this did not apply to blood vessels, ureter, uterus or ovaries. Conclusion 3-dimensional cameras were particularly advantageous for the depiction of fine anatomical structures due to improved spacial depth representation compared to 2D systems. 3D cameras provide the operating surgeon with a monitor image that more closely resembles actual anatomy, thus simplifying laparoscopic procedures. PMID:28190888
Orthogonal Procrustes Analysis for Dictionary Learning in Sparse Linear Representation
Grossi, Giuliano; Lin, Jianyi
2017-01-01
In the sparse representation model, the design of overcomplete dictionaries plays a key role for the effectiveness and applicability in different domains. Recent research has produced several dictionary learning approaches, being proven that dictionaries learnt by data examples significantly outperform structured ones, e.g. wavelet transforms. In this context, learning consists in adapting the dictionary atoms to a set of training signals in order to promote a sparse representation that minimizes the reconstruction error. Finding the best fitting dictionary remains a very difficult task, leaving the question still open. A well-established heuristic method for tackling this problem is an iterative alternating scheme, adopted for instance in the well-known K-SVD algorithm. Essentially, it consists in repeating two stages; the former promotes sparse coding of the training set and the latter adapts the dictionary to reduce the error. In this paper we present R-SVD, a new method that, while maintaining the alternating scheme, adopts the Orthogonal Procrustes analysis to update the dictionary atoms suitably arranged into groups. Comparative experiments on synthetic data prove the effectiveness of R-SVD with respect to well known dictionary learning algorithms such as K-SVD, ILS-DLA and the online method OSDL. Moreover, experiments on natural data such as ECG compression, EEG sparse representation, and image modeling confirm R-SVD’s robustness and wide applicability. PMID:28103283
NASA Astrophysics Data System (ADS)
Taşkin Kaya, Gülşen
2013-10-01
Recently, earthquake damage assessment using satellite images has been a very popular ongoing research direction. Especially with the availability of very high resolution (VHR) satellite images, a quite detailed damage map based on building scale has been produced, and various studies have also been conducted in the literature. As the spatial resolution of satellite images increases, distinguishability of damage patterns becomes more cruel especially in case of using only the spectral information during classification. In order to overcome this difficulty, textural information needs to be involved to the classification to improve the visual quality and reliability of damage map. There are many kinds of textural information which can be derived from VHR satellite images depending on the algorithm used. However, extraction of textural information and evaluation of them have been generally a time consuming process especially for the large areas affected from the earthquake due to the size of VHR image. Therefore, in order to provide a quick damage map, the most useful features describing damage patterns needs to be known in advance as well as the redundant features. In this study, a very high resolution satellite image after Iran, Bam earthquake was used to identify the earthquake damage. Not only the spectral information, textural information was also used during the classification. For textural information, second order Haralick features were extracted from the panchromatic image for the area of interest using gray level co-occurrence matrix with different size of windows and directions. In addition to using spatial features in classification, the most useful features representing the damage characteristic were selected with a novel feature selection method based on high dimensional model representation (HDMR) giving sensitivity of each feature during classification. The method called HDMR was recently proposed as an efficient tool to capture the input-output relationships in high-dimensional systems for many problems in science and engineering. The HDMR method is developed to improve the efficiency of the deducing high dimensional behaviors. The method is formed by a particular organization of low dimensional component functions, in which each function is the contribution of one or more input variables to the output variables.
Tomographic imaging using poissonian detector data
Aspelmeier, Timo; Ebel, Gernot; Hoeschen, Christoph
2013-10-15
An image reconstruction method for reconstructing a tomographic image (f.sub.j) of a region of investigation within an object (1), comprises the steps of providing detector data (y.sub.i) comprising Poisson random values measured at an i-th of a plurality of different positions, e.g. i=(k,l) with pixel index k on a detector device and angular index l referring to both the angular position (.alpha..sub.l) and the rotation radius (r.sub.l) of the detector device (10) relative to the object (1), providing a predetermined system matrix A.sub.ij assigning a j-th voxel of the object (1) to the i-th detector data (y.sub.i), and reconstructing the tomographic image (f.sub.j) based on the detector data (y.sub.i), said reconstructing step including a procedure of minimizing a functional F(f) depending on the detector data (y.sub.i) and the system matrix A.sub.ij and additionally including a sparse or compressive representation of the object (1) in an orthobasis T, wherein the tomographic image (f.sub.j) represents the global minimum of the functional F(f). Furthermore, an imaging method and an imaging device using the image reconstruction method are described.
Practical implementation of tetrahedral mesh reconstruction in emission tomography
Boutchko, R.; Sitek, A.; Gullberg, G. T.
2014-01-01
This paper presents a practical implementation of image reconstruction on tetrahedral meshes optimized for emission computed tomography with parallel beam geometry. Tetrahedral mesh built on a point cloud is a convenient image representation method, intrinsically three-dimensional and with a multi-level resolution property. Image intensities are defined at the mesh nodes and linearly interpolated inside each tetrahedron. For the given mesh geometry, the intensities can be computed directly from tomographic projections using iterative reconstruction algorithms with a system matrix calculated using an exact analytical formula. The mesh geometry is optimized for a specific patient using a two stage process. First, a noisy image is reconstructed on a finely-spaced uniform cloud. Then, the geometry of the representation is adaptively transformed through boundary-preserving node motion and elimination. Nodes are removed in constant intensity regions, merged along the boundaries, and moved in the direction of the mean local intensity gradient in order to provide higher node density in the boundary regions. Attenuation correction and detector geometric response are included in the system matrix. Once the mesh geometry is optimized, it is used to generate the final system matrix for ML-EM reconstruction of node intensities and for visualization of the reconstructed images. In dynamic PET or SPECT imaging, the system matrix generation procedure is performed using a quasi-static sinogram, generated by summing projection data from multiple time frames. This system matrix is then used to reconstruct the individual time frame projections. Performance of the new method is evaluated by reconstructing simulated projections of the NCAT phantom and the method is then applied to dynamic SPECT phantom and patient studies and to a dynamic microPET rat study. Tetrahedral mesh-based images are compared to the standard voxel-based reconstruction for both high and low signal-to-noise ratio projection datasets. The results demonstrate that the reconstructed images represented as tetrahedral meshes based on point clouds offer image quality comparable to that achievable using a standard voxel grid while allowing substantial reduction in the number of unknown intensities to be reconstructed and reducing the noise. PMID:23588373
Practical implementation of tetrahedral mesh reconstruction in emission tomography
NASA Astrophysics Data System (ADS)
Boutchko, R.; Sitek, A.; Gullberg, G. T.
2013-05-01
This paper presents a practical implementation of image reconstruction on tetrahedral meshes optimized for emission computed tomography with parallel beam geometry. Tetrahedral mesh built on a point cloud is a convenient image representation method, intrinsically three-dimensional and with a multi-level resolution property. Image intensities are defined at the mesh nodes and linearly interpolated inside each tetrahedron. For the given mesh geometry, the intensities can be computed directly from tomographic projections using iterative reconstruction algorithms with a system matrix calculated using an exact analytical formula. The mesh geometry is optimized for a specific patient using a two stage process. First, a noisy image is reconstructed on a finely-spaced uniform cloud. Then, the geometry of the representation is adaptively transformed through boundary-preserving node motion and elimination. Nodes are removed in constant intensity regions, merged along the boundaries, and moved in the direction of the mean local intensity gradient in order to provide higher node density in the boundary regions. Attenuation correction and detector geometric response are included in the system matrix. Once the mesh geometry is optimized, it is used to generate the final system matrix for ML-EM reconstruction of node intensities and for visualization of the reconstructed images. In dynamic PET or SPECT imaging, the system matrix generation procedure is performed using a quasi-static sinogram, generated by summing projection data from multiple time frames. This system matrix is then used to reconstruct the individual time frame projections. Performance of the new method is evaluated by reconstructing simulated projections of the NCAT phantom and the method is then applied to dynamic SPECT phantom and patient studies and to a dynamic microPET rat study. Tetrahedral mesh-based images are compared to the standard voxel-based reconstruction for both high and low signal-to-noise ratio projection datasets. The results demonstrate that the reconstructed images represented as tetrahedral meshes based on point clouds offer image quality comparable to that achievable using a standard voxel grid while allowing substantial reduction in the number of unknown intensities to be reconstructed and reducing the noise.
NASA Astrophysics Data System (ADS)
Alshehhi, Rasha; Marpu, Prashanth Reddy
2017-04-01
Extraction of road networks in urban areas from remotely sensed imagery plays an important role in many urban applications (e.g. road navigation, geometric correction of urban remote sensing images, updating geographic information systems, etc.). It is normally difficult to accurately differentiate road from its background due to the complex geometry of the buildings and the acquisition geometry of the sensor. In this paper, we present a new method for extracting roads from high-resolution imagery based on hierarchical graph-based image segmentation. The proposed method consists of: 1. Extracting features (e.g., using Gabor and morphological filtering) to enhance the contrast between road and non-road pixels, 2. Graph-based segmentation consisting of (i) Constructing a graph representation of the image based on initial segmentation and (ii) Hierarchical merging and splitting of image segments based on color and shape features, and 3. Post-processing to remove irregularities in the extracted road segments. Experiments are conducted on three challenging datasets of high-resolution images to demonstrate the proposed method and compare with other similar approaches. The results demonstrate the validity and superior performance of the proposed method for road extraction in urban areas.
Content Analysis of Science Teacher Representations in Google Images
ERIC Educational Resources Information Center
Bergman, Daniel
2017-01-01
Teacher images can impact numerous perceptions in educational settings, as well as through popular media. The portrayal of effective science teaching is especially challenging to specify, given the complex nature of science inquiry and other standards-based practices. The present study examined the litany of representations of science teachers…
A Novel Quantum Image Steganography Scheme Based on LSB
NASA Astrophysics Data System (ADS)
Zhou, Ri-Gui; Luo, Jia; Liu, XingAo; Zhu, Changming; Wei, Lai; Zhang, Xiafen
2018-06-01
Based on the NEQR representation of quantum images and least significant bit (LSB) scheme, a novel quantum image steganography scheme is proposed. The sizes of the cover image and the original information image are assumed to be 4 n × 4 n and n × n, respectively. Firstly, the bit-plane scrambling method is used to scramble the original information image. Then the scrambled information image is expanded to the same size of the cover image by using the key only known to the operator. The expanded image is scrambled to be a meaningless image with the Arnold scrambling. The embedding procedure and extracting procedure are carried out by K 1 and K 2 which are under control of the operator. For validation of the presented scheme, the peak-signal-to-noise ratio (PSNR), the capacity, the security of the images and the circuit complexity are analyzed.
Principal curve detection in complicated graph images
NASA Astrophysics Data System (ADS)
Liu, Yuncai; Huang, Thomas S.
2001-09-01
Finding principal curves in an image is an important low level processing in computer vision and pattern recognition. Principal curves are those curves in an image that represent boundaries or contours of objects of interest. In general, a principal curve should be smooth with certain length constraint and allow either smooth or sharp turning. In this paper, we present a method that can efficiently detect principal curves in complicated map images. For a given feature image, obtained from edge detection of an intensity image or thinning operation of a pictorial map image, the feature image is first converted to a graph representation. In graph image domain, the operation of principal curve detection is performed to identify useful image features. The shortest path and directional deviation schemes are used in our algorithm os principal verve detection, which is proven to be very efficient working with real graph images.
Tensor scale: An analytic approach with efficient computation and applications☆
Xu, Ziyue; Saha, Punam K.; Dasgupta, Soura
2015-01-01
Scale is a widely used notion in computer vision and image understanding that evolved in the form of scale-space theory where the key idea is to represent and analyze an image at various resolutions. Recently, we introduced a notion of local morphometric scale referred to as “tensor scale” using an ellipsoidal model that yields a unified representation of structure size, orientation and anisotropy. In the previous work, tensor scale was described using a 2-D algorithmic approach and a precise analytic definition was missing. Also, the application of tensor scale in 3-D using the previous framework is not practical due to high computational complexity. In this paper, an analytic definition of tensor scale is formulated for n-dimensional (n-D) images that captures local structure size, orientation and anisotropy. Also, an efficient computational solution in 2- and 3-D using several novel differential geometric approaches is presented and the accuracy of results is experimentally examined. Also, a matrix representation of tensor scale is derived facilitating several operations including tensor field smoothing to capture larger contextual knowledge. Finally, the applications of tensor scale in image filtering and n-linear interpolation are presented and the performance of their results is examined in comparison with respective state-of-art methods. Specifically, the performance of tensor scale based image filtering is compared with gradient and Weickert’s structure tensor based diffusive filtering algorithms. Also, the performance of tensor scale based n-linear interpolation is evaluated in comparison with standard n-linear and windowed-sinc interpolation methods. PMID:26236148
Droplet Image Super Resolution Based on Sparse Representation and Kernel Regression
NASA Astrophysics Data System (ADS)
Zou, Zhenzhen; Luo, Xinghong; Yu, Qiang
2018-02-01
Microgravity and containerless conditions, which are produced via electrostatic levitation combined with a drop tube, are important when studying the intrinsic properties of new metastable materials. Generally, temperature and image sensors can be used to measure the changes of sample temperature, morphology and volume. Then, the specific heat, surface tension, viscosity changes and sample density can be obtained. Considering that the falling speed of the material sample droplet is approximately 31.3 m/s when it reaches the bottom of a 50-meter-high drop tube, a high-speed camera with a collection rate of up to 106 frames/s is required to image the falling droplet. However, at the high-speed mode, very few pixels, approximately 48-120, will be obtained in each exposure time, which results in low image quality. Super-resolution image reconstruction is an algorithm that provides finer details than the sampling grid of a given imaging device by increasing the number of pixels per unit area in the image. In this work, we demonstrate the application of single image-resolution reconstruction in the microgravity and electrostatic levitation for the first time. Here, using the image super-resolution method based on sparse representation, a low-resolution droplet image can be reconstructed. Employed Yang's related dictionary model, high- and low-resolution image patches were combined with dictionary training, and high- and low-resolution-related dictionaries were obtained. The online double-sparse dictionary training algorithm was used in the study of related dictionaries and overcome the shortcomings of the traditional training algorithm with small image patch. During the stage of image reconstruction, the algorithm of kernel regression is added, which effectively overcomes the shortcomings of the Yang image's edge blurs.
Droplet Image Super Resolution Based on Sparse Representation and Kernel Regression
NASA Astrophysics Data System (ADS)
Zou, Zhenzhen; Luo, Xinghong; Yu, Qiang
2018-05-01
Microgravity and containerless conditions, which are produced via electrostatic levitation combined with a drop tube, are important when studying the intrinsic properties of new metastable materials. Generally, temperature and image sensors can be used to measure the changes of sample temperature, morphology and volume. Then, the specific heat, surface tension, viscosity changes and sample density can be obtained. Considering that the falling speed of the material sample droplet is approximately 31.3 m/s when it reaches the bottom of a 50-meter-high drop tube, a high-speed camera with a collection rate of up to 106 frames/s is required to image the falling droplet. However, at the high-speed mode, very few pixels, approximately 48-120, will be obtained in each exposure time, which results in low image quality. Super-resolution image reconstruction is an algorithm that provides finer details than the sampling grid of a given imaging device by increasing the number of pixels per unit area in the image. In this work, we demonstrate the application of single image-resolution reconstruction in the microgravity and electrostatic levitation for the first time. Here, using the image super-resolution method based on sparse representation, a low-resolution droplet image can be reconstructed. Employed Yang's related dictionary model, high- and low-resolution image patches were combined with dictionary training, and high- and low-resolution-related dictionaries were obtained. The online double-sparse dictionary training algorithm was used in the study of related dictionaries and overcome the shortcomings of the traditional training algorithm with small image patch. During the stage of image reconstruction, the algorithm of kernel regression is added, which effectively overcomes the shortcomings of the Yang image's edge blurs.
Lupo, Janine M; Nelson, Sarah J
2014-10-01
This review explores how the integration of advanced imaging methods with high-quality anatomical images significantly improves the characterization, target definition, assessment of response to therapy, and overall management of patients with high-grade glioma. Metrics derived from diffusion-, perfusion-, and susceptibility-weighted magnetic resonance imaging in conjunction with magnetic resonance spectroscopic imaging, allows us to characterize regions of edema, hypoxia, increased cellularity, and necrosis within heterogeneous tumor and surrounding brain tissue. Quantification of such measures may provide a more reliable initial representation of tumor delineation and response to therapy than changes in the contrast-enhancing or T2 lesion alone and have a significant effect on targeting resection, planning radiation, and assessing treatment effectiveness. In the long term, implementation of these imaging methodologies can also aid in the identification of recurrent tumor and its differentiation from treatment-related confounds and facilitate the detection of radiationinduced vascular injury in otherwise normal-appearing brain tissue.
Deep Learning in Medical Image Analysis.
Shen, Dinggang; Wu, Guorong; Suk, Heung-Il
2017-06-21
This review covers computer-assisted analysis of images in the field of medical imaging. Recent advances in machine learning, especially with regard to deep learning, are helping to identify, classify, and quantify patterns in medical images. At the core of these advances is the ability to exploit hierarchical feature representations learned solely from data, instead of features designed by hand according to domain-specific knowledge. Deep learning is rapidly becoming the state of the art, leading to enhanced performance in various medical applications. We introduce the fundamentals of deep learning methods and review their successes in image registration, detection of anatomical and cellular structures, tissue segmentation, computer-aided disease diagnosis and prognosis, and so on. We conclude by discussing research issues and suggesting future directions for further improvement.