Rapid Processing of a Global Feature in the ON Visual Pathways of Behaving Monkeys.
Huang, Jun; Yang, Yan; Zhou, Ke; Zhao, Xudong; Zhou, Quan; Zhu, Hong; Yang, Yingshan; Zhang, Chunming; Zhou, Yifeng; Zhou, Wu
2017-01-01
Visual objects are recognized by their features. Whereas, some features are based on simple components (i.e., local features, such as orientation of line segments), some features are based on the whole object (i.e., global features, such as an object having a hole in it). Over the past five decades, behavioral, physiological, anatomical, and computational studies have established a general model of vision, which starts from extracting local features in the lower visual pathways followed by a feature integration process that extracts global features in the higher visual pathways. This local-to-global model is successful in providing a unified account for a vast sets of perception experiments, but it fails to account for a set of experiments showing human visual systems' superior sensitivity to global features. Understanding the neural mechanisms underlying the "global-first" process will offer critical insights into new models of vision. The goal of the present study was to establish a non-human primate model of rapid processing of global features for elucidating the neural mechanisms underlying differential processing of global and local features. Monkeys were trained to make a saccade to a target in the black background, which was different from the distractors (white circle) in color (e.g., red circle target), local features (e.g., white square target), a global feature (e.g., white ring with a hole target) or their combinations (e.g., red square target). Contrary to the predictions of the prevailing local-to-global model, we found that (1) detecting a distinction or a change in the global feature was faster than detecting a distinction or a change in color or local features; (2) detecting a distinction in color was facilitated by a distinction in the global feature, but not in the local features; and (3) detecting the hole was interfered by the local features of the hole (e.g., white ring with a squared hole). These results suggest that monkey ON visual systems have a subsystem that is more sensitive to distinctions in the global feature than local features. They also provide the behavioral constraints for identifying the underlying neural substrates.
Specific excitatory connectivity for feature integration in mouse primary visual cortex
Molina-Luna, Patricia; Roth, Morgane M.
2017-01-01
Local excitatory connections in mouse primary visual cortex (V1) are stronger and more prevalent between neurons that share similar functional response features. However, the details of how functional rules for local connectivity shape neuronal responses in V1 remain unknown. We hypothesised that complex responses to visual stimuli may arise as a consequence of rules for selective excitatory connectivity within the local network in the superficial layers of mouse V1. In mouse V1 many neurons respond to overlapping grating stimuli (plaid stimuli) with highly selective and facilitatory responses, which are not simply predicted by responses to single gratings presented alone. This complexity is surprising, since excitatory neurons in V1 are considered to be mainly tuned to single preferred orientations. Here we examined the consequences for visual processing of two alternative connectivity schemes: in the first case, local connections are aligned with visual properties inherited from feedforward input (a ‘like-to-like’ scheme specifically connecting neurons that share similar preferred orientations); in the second case, local connections group neurons into excitatory subnetworks that combine and amplify multiple feedforward visual properties (a ‘feature binding’ scheme). By comparing predictions from large scale computational models with in vivo recordings of visual representations in mouse V1, we found that responses to plaid stimuli were best explained by assuming feature binding connectivity. Unlike under the like-to-like scheme, selective amplification within feature-binding excitatory subnetworks replicated experimentally observed facilitatory responses to plaid stimuli; explained selective plaid responses not predicted by grating selectivity; and was consistent with broad anatomical selectivity observed in mouse V1. Our results show that visual feature binding can occur through local recurrent mechanisms without requiring feedforward convergence, and that such a mechanism is consistent with visual responses and cortical anatomy in mouse V1. PMID:29240769
Coding Local and Global Binary Visual Features Extracted From Video Sequences.
Baroffio, Luca; Canclini, Antonio; Cesana, Matteo; Redondi, Alessandro; Tagliasacchi, Marco; Tubaro, Stefano
2015-11-01
Binary local features represent an effective alternative to real-valued descriptors, leading to comparable results for many visual analysis tasks while being characterized by significantly lower computational complexity and memory requirements. When dealing with large collections, a more compact representation based on global features is often preferred, which can be obtained from local features by means of, e.g., the bag-of-visual word model. Several applications, including, for example, visual sensor networks and mobile augmented reality, require visual features to be transmitted over a bandwidth-limited network, thus calling for coding techniques that aim at reducing the required bit budget while attaining a target level of efficiency. In this paper, we investigate a coding scheme tailored to both local and global binary features, which aims at exploiting both spatial and temporal redundancy by means of intra- and inter-frame coding. In this respect, the proposed coding scheme can conveniently be adopted to support the analyze-then-compress (ATC) paradigm. That is, visual features are extracted from the acquired content, encoded at remote nodes, and finally transmitted to a central controller that performs the visual analysis. This is in contrast with the traditional approach, in which visual content is acquired at a node, compressed and then sent to a central unit for further processing, according to the compress-then-analyze (CTA) paradigm. In this paper, we experimentally compare the ATC and the CTA by means of rate-efficiency curves in the context of two different visual analysis tasks: 1) homography estimation and 2) content-based retrieval. Our results show that the novel ATC paradigm based on the proposed coding primitives can be competitive with the CTA, especially in bandwidth limited scenarios.
Coding Local and Global Binary Visual Features Extracted From Video Sequences
NASA Astrophysics Data System (ADS)
Baroffio, Luca; Canclini, Antonio; Cesana, Matteo; Redondi, Alessandro; Tagliasacchi, Marco; Tubaro, Stefano
2015-11-01
Binary local features represent an effective alternative to real-valued descriptors, leading to comparable results for many visual analysis tasks, while being characterized by significantly lower computational complexity and memory requirements. When dealing with large collections, a more compact representation based on global features is often preferred, which can be obtained from local features by means of, e.g., the Bag-of-Visual-Word (BoVW) model. Several applications, including for example visual sensor networks and mobile augmented reality, require visual features to be transmitted over a bandwidth-limited network, thus calling for coding techniques that aim at reducing the required bit budget, while attaining a target level of efficiency. In this paper we investigate a coding scheme tailored to both local and global binary features, which aims at exploiting both spatial and temporal redundancy by means of intra- and inter-frame coding. In this respect, the proposed coding scheme can be conveniently adopted to support the Analyze-Then-Compress (ATC) paradigm. That is, visual features are extracted from the acquired content, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast with the traditional approach, in which visual content is acquired at a node, compressed and then sent to a central unit for further processing, according to the Compress-Then-Analyze (CTA) paradigm. In this paper we experimentally compare ATC and CTA by means of rate-efficiency curves in the context of two different visual analysis tasks: homography estimation and content-based retrieval. Our results show that the novel ATC paradigm based on the proposed coding primitives can be competitive with CTA, especially in bandwidth limited scenarios.
Combining local and global limitations of visual search.
Põder, Endel
2017-04-01
There are different opinions about the roles of local interactions and central processing capacity in visual search. This study attempts to clarify the problem using a new version of relevant set cueing. A central precue indicates two symmetrical segments (that may contain a target object) within a circular array of objects presented briefly around the fixation point. The number of objects in the relevant segments, and density of objects in the array were varied independently. Three types of search experiments were run: (a) search for a simple visual feature (color, size, and orientation); (b) conjunctions of simple features; and (c) spatial configuration of simple features (rotated Ts). For spatial configuration stimuli, the results were consistent with a fixed global processing capacity and standard crowding zones. For simple features and their conjunctions, the results were different, dependent on the features involved. While color search exhibits virtually no capacity limits or crowding, search for an orientation target was limited by both. Results for conjunctions of features can be partly explained by the results from the respective features. This study shows that visual search is limited by both local interference and global capacity, and the limitations are different for different visual features.
Local coding based matching kernel method for image classification.
Song, Yan; McLoughlin, Ian Vince; Dai, Li-Rong
2014-01-01
This paper mainly focuses on how to effectively and efficiently measure visual similarity for local feature based representation. Among existing methods, metrics based on Bag of Visual Word (BoV) techniques are efficient and conceptually simple, at the expense of effectiveness. By contrast, kernel based metrics are more effective, but at the cost of greater computational complexity and increased storage requirements. We show that a unified visual matching framework can be developed to encompass both BoV and kernel based metrics, in which local kernel plays an important role between feature pairs or between features and their reconstruction. Generally, local kernels are defined using Euclidean distance or its derivatives, based either explicitly or implicitly on an assumption of Gaussian noise. However, local features such as SIFT and HoG often follow a heavy-tailed distribution which tends to undermine the motivation behind Euclidean metrics. Motivated by recent advances in feature coding techniques, a novel efficient local coding based matching kernel (LCMK) method is proposed. This exploits the manifold structures in Hilbert space derived from local kernels. The proposed method combines advantages of both BoV and kernel based metrics, and achieves a linear computational complexity. This enables efficient and scalable visual matching to be performed on large scale image sets. To evaluate the effectiveness of the proposed LCMK method, we conduct extensive experiments with widely used benchmark datasets, including 15-Scenes, Caltech101/256, PASCAL VOC 2007 and 2011 datasets. Experimental results confirm the effectiveness of the relatively efficient LCMK method.
Onboard Robust Visual Tracking for UAVs Using a Reliable Global-Local Object Model
Fu, Changhong; Duan, Ran; Kircali, Dogan; Kayacan, Erdal
2016-01-01
In this paper, we present a novel onboard robust visual algorithm for long-term arbitrary 2D and 3D object tracking using a reliable global-local object model for unmanned aerial vehicle (UAV) applications, e.g., autonomous tracking and chasing a moving target. The first main approach in this novel algorithm is the use of a global matching and local tracking approach. In other words, the algorithm initially finds feature correspondences in a way that an improved binary descriptor is developed for global feature matching and an iterative Lucas–Kanade optical flow algorithm is employed for local feature tracking. The second main module is the use of an efficient local geometric filter (LGF), which handles outlier feature correspondences based on a new forward-backward pairwise dissimilarity measure, thereby maintaining pairwise geometric consistency. In the proposed LGF module, a hierarchical agglomerative clustering, i.e., bottom-up aggregation, is applied using an effective single-link method. The third proposed module is a heuristic local outlier factor (to the best of our knowledge, it is utilized for the first time to deal with outlier features in a visual tracking application), which further maximizes the representation of the target object in which we formulate outlier feature detection as a binary classification problem with the output features of the LGF module. Extensive UAV flight experiments show that the proposed visual tracker achieves real-time frame rates of more than thirty-five frames per second on an i7 processor with 640 × 512 image resolution and outperforms the most popular state-of-the-art trackers favorably in terms of robustness, efficiency and accuracy. PMID:27589769
An integration of minimum local feature representation methods to recognize large variation of foods
NASA Astrophysics Data System (ADS)
Razali, Mohd Norhisham bin; Manshor, Noridayu; Halin, Alfian Abdul; Mustapha, Norwati; Yaakob, Razali
2017-10-01
Local invariant features have shown to be successful in describing object appearances for image classification tasks. Such features are robust towards occlusion and clutter and are also invariant against scale and orientation changes. This makes them suitable for classification tasks with little inter-class similarity and large intra-class difference. In this paper, we propose an integrated representation of the Speeded-Up Robust Feature (SURF) and Scale Invariant Feature Transform (SIFT) descriptors, using late fusion strategy. The proposed representation is used for food recognition from a dataset of food images with complex appearance variations. The Bag of Features (BOF) approach is employed to enhance the discriminative ability of the local features. Firstly, the individual local features are extracted to construct two kinds of visual vocabularies, representing SURF and SIFT. The visual vocabularies are then concatenated and fed into a Linear Support Vector Machine (SVM) to classify the respective food categories. Experimental results demonstrate impressive overall recognition at 82.38% classification accuracy based on the challenging UEC-Food100 dataset.
Collinearity Impairs Local Element Visual Search
ERIC Educational Resources Information Center
Jingling, Li; Tseng, Chia-Huei
2013-01-01
In visual searches, stimuli following the law of good continuity attract attention to the global structure and receive attentional priority. Also, targets that have unique features are of high feature contrast and capture attention in visual search. We report on a salient global structure combined with a high orientation contrast to the…
The fate of task-irrelevant visual motion: perceptual load versus feature-based attention.
Taya, Shuichiro; Adams, Wendy J; Graf, Erich W; Lavie, Nilli
2009-11-18
We tested contrasting predictions derived from perceptual load theory and from recent feature-based selection accounts. Observers viewed moving, colored stimuli and performed low or high load tasks associated with one stimulus feature, either color or motion. The resultant motion aftereffect (MAE) was used to evaluate attentional allocation. We found that task-irrelevant visual features received less attention than co-localized task-relevant features of the same objects. Moreover, when color and motion features were co-localized yet perceived to belong to two distinct surfaces, feature-based selection was further increased at the expense of object-based co-selection. Load theory predicts that the MAE for task-irrelevant motion would be reduced with a higher load color task. However, this was not seen for co-localized features; perceptual load only modulated the MAE for task-irrelevant motion when this was spatially separated from the attended color location. Our results suggest that perceptual load effects are mediated by spatial selection and do not generalize to the feature domain. Feature-based selection operates to suppress processing of task-irrelevant, co-localized features, irrespective of perceptual load.
Miconi, Thomas; Groomes, Laura; Kreiman, Gabriel
2016-01-01
When searching for an object in a scene, how does the brain decide where to look next? Visual search theories suggest the existence of a global “priority map” that integrates bottom-up visual information with top-down, target-specific signals. We propose a mechanistic model of visual search that is consistent with recent neurophysiological evidence, can localize targets in cluttered images, and predicts single-trial behavior in a search task. This model posits that a high-level retinotopic area selective for shape features receives global, target-specific modulation and implements local normalization through divisive inhibition. The normalization step is critical to prevent highly salient bottom-up features from monopolizing attention. The resulting activity pattern constitues a priority map that tracks the correlation between local input and target features. The maximum of this priority map is selected as the locus of attention. The visual input is then spatially enhanced around the selected location, allowing object-selective visual areas to determine whether the target is present at this location. This model can localize objects both in array images and when objects are pasted in natural scenes. The model can also predict single-trial human fixations, including those in error and target-absent trials, in a search task involving complex objects. PMID:26092221
Aiello, Marilena; Merola, Sheila; Lasaponara, Stefano; Pinto, Mario; Tomaiuolo, Francesco; Doricchi, Fabrizio
2018-01-31
The possibility of allocating attentional resources to the "global" shape or to the "local" details of pictorial stimuli helps visual processing. Investigations with hierarchical Navon letters, that are large "global" letters made up of small "local" ones, consistently demonstrate a right hemisphere advantage for global processing and a left hemisphere advantage for local processing. Here we investigated how the visual and phonological features of the global and local components of Navon letters influence these hemispheric advantages. In a first study in healthy participants, we contrasted the hemispheric processing of hierarchical letters with global and local items competing for response selection, to the processing of hierarchical letters in which a letter, a false-letter conveying no phonological information or a geometrical shape presented at the unattended level did not compete for response selection. In a second study, we investigated the hemispheric processing of hierarchical stimuli in which global and local letters were both visually and phonologically congruent (e.g. large uppercase G made of smaller uppercase G), visually incongruent and phonologically congruent (e.g. large uppercase G made of small lowercase g) or visually incongruent and phonologically incongruent (e.g. large uppercase G made of small lowercase or uppercase M). In a third study, we administered the same tasks to a right brain damaged patient with a lesion involving pre-striate areas engaged by global processing. The results of the first two experiments showed that the global abilities of the left hemisphere are limited because of its strong susceptibility to interference from local letters even when these are irrelevant to the task. Phonological features played a crucial role in this interference because the interference was entirely maintained also when letters at the global and local level were presented in different uppercase vs. lowercase formats. In contrast, when local features conveyed no phonological information, the left hemisphere showed preserved global processing abilities. These findings were supported by the study of the right brain damaged patient. These results offer a new look at the hemispheric dominance in the attentional processing of the global and local levels of hierarchical stimuli. Copyright © 2017 Elsevier Ltd. All rights reserved.
Using neuronal populations to study the mechanisms underlying spatial and feature attention
Cohen, Marlene R.; Maunsell, John H.R.
2012-01-01
Summary Visual attention affects both perception and neuronal responses. Whether the same neuronal mechanisms mediate spatial attention, which improves perception of attended locations, and non-spatial forms of attention has been a subject of considerable debate. Spatial and feature attention have similar effects on individual neurons. Because visual cortex is retinotopically organized, however, spatial attention can co-modulate local neuronal populations, while feature attention generally requires more selective modulation. We compared the effects of feature and spatial attention on local and spatially separated populations by recording simultaneously from dozens of neurons in both hemispheres of V4. Feature and spatial attention affect the activity of local populations similarly, modulating both firing rates and correlations between pairs of nearby neurons. However, while spatial attention appears to act on local populations, feature attention is coordinated across hemispheres. Our results are consistent with a unified attentional mechanism that can modulate the responses of arbitrary subgroups of neurons. PMID:21689604
Implicit integration in a case of integrative visual agnosia.
Aviezer, Hillel; Landau, Ayelet N; Robertson, Lynn C; Peterson, Mary A; Soroker, Nachum; Sacher, Yaron; Bonneh, Yoram; Bentin, Shlomo
2007-05-15
We present a case (SE) with integrative visual agnosia following ischemic stroke affecting the right dorsal and the left ventral pathways of the visual system. Despite his inability to identify global hierarchical letters [Navon, D. (1977). Forest before trees: The precedence of global features in visual perception. Cognitive Psychology, 9, 353-383], and his dense object agnosia, SE showed normal global-to-local interference when responding to local letters in Navon hierarchical stimuli and significant picture-word identity priming in a semantic decision task for words. Since priming was absent if these features were scrambled, it stands to reason that these effects were not due to priming by distinctive features. The contrast between priming effects induced by coherent and scrambled stimuli is consistent with implicit but not explicit integration of features into a unified whole. We went on to show that possible/impossible object decisions were facilitated by words in a word-picture priming task, suggesting that prompts could activate perceptually integrated images in a backward fashion. We conclude that the absence of SE's ability to identify visual objects except through tedious serial construction reflects a deficit in accessing an integrated visual representation through bottom-up visual processing alone. However, top-down generated images can help activate these visual representations through semantic links.
A recurrent neural model for proto-object based contour integration and figure-ground segregation.
Hu, Brian; Niebur, Ernst
2017-12-01
Visual processing of objects makes use of both feedforward and feedback streams of information. However, the nature of feedback signals is largely unknown, as is the identity of the neuronal populations in lower visual areas that receive them. Here, we develop a recurrent neural model to address these questions in the context of contour integration and figure-ground segregation. A key feature of our model is the use of grouping neurons whose activity represents tentative objects ("proto-objects") based on the integration of local feature information. Grouping neurons receive input from an organized set of local feature neurons, and project modulatory feedback to those same neurons. Additionally, inhibition at both the local feature level and the object representation level biases the interpretation of the visual scene in agreement with principles from Gestalt psychology. Our model explains several sets of neurophysiological results (Zhou et al. Journal of Neuroscience, 20(17), 6594-6611 2000; Qiu et al. Nature Neuroscience, 10(11), 1492-1499 2007; Chen et al. Neuron, 82(3), 682-694 2014), and makes testable predictions about the influence of neuronal feedback and attentional selection on neural responses across different visual areas. Our model also provides a framework for understanding how object-based attention is able to select both objects and the features associated with them.
Facial recognition using multisensor images based on localized kernel eigen spaces.
Gundimada, Satyanadh; Asari, Vijayan K
2009-06-01
A feature selection technique along with an information fusion procedure for improving the recognition accuracy of a visual and thermal image-based facial recognition system is presented in this paper. A novel modular kernel eigenspaces approach is developed and implemented on the phase congruency feature maps extracted from the visual and thermal images individually. Smaller sub-regions from a predefined neighborhood within the phase congruency images of the training samples are merged to obtain a large set of features. These features are then projected into higher dimensional spaces using kernel methods. The proposed localized nonlinear feature selection procedure helps to overcome the bottlenecks of illumination variations, partial occlusions, expression variations and variations due to temperature changes that affect the visual and thermal face recognition techniques. AR and Equinox databases are used for experimentation and evaluation of the proposed technique. The proposed feature selection procedure has greatly improved the recognition accuracy for both the visual and thermal images when compared to conventional techniques. Also, a decision level fusion methodology is presented which along with the feature selection procedure has outperformed various other face recognition techniques in terms of recognition accuracy.
Bag-of-features based medical image retrieval via multiple assignment and visual words weighting.
Wang, Jingyan; Li, Yongping; Zhang, Ying; Wang, Chao; Xie, Honglan; Chen, Guoling; Gao, Xin
2011-11-01
Bag-of-features based approaches have become prominent for image retrieval and image classification tasks in the past decade. Such methods represent an image as a collection of local features, such as image patches and key points with scale invariant feature transform (SIFT) descriptors. To improve the bag-of-features methods, we first model the assignments of local descriptors as contribution functions, and then propose a novel multiple assignment strategy. Assuming the local features can be reconstructed by their neighboring visual words in a vocabulary, reconstruction weights can be solved by quadratic programming. The weights are then used to build contribution functions, resulting in a novel assignment method, called quadratic programming (QP) assignment. We further propose a novel visual word weighting method. The discriminative power of each visual word is analyzed by the sub-similarity function in the bin that corresponds to the visual word. Each sub-similarity function is then treated as a weak classifier. A strong classifier is learned by boosting methods that combine those weak classifiers. The weighting factors of the visual words are learned accordingly. We evaluate the proposed methods on medical image retrieval tasks. The methods are tested on three well-known data sets, i.e., the ImageCLEFmed data set, the 304 CT Set, and the basal-cell carcinoma image set. Experimental results demonstrate that the proposed QP assignment outperforms the traditional nearest neighbor assignment, the multiple assignment, and the soft assignment, whereas the proposed boosting based weighting strategy outperforms the state-of-the-art weighting methods, such as the term frequency weights and the term frequency-inverse document frequency weights.
Model-based analysis of pattern motion processing in mouse primary visual cortex
Muir, Dylan R.; Roth, Morgane M.; Helmchen, Fritjof; Kampa, Björn M.
2015-01-01
Neurons in sensory areas of neocortex exhibit responses tuned to specific features of the environment. In visual cortex, information about features such as edges or textures with particular orientations must be integrated to recognize a visual scene or object. Connectivity studies in rodent cortex have revealed that neurons make specific connections within sub-networks sharing common input tuning. In principle, this sub-network architecture enables local cortical circuits to integrate sensory information. However, whether feature integration indeed occurs locally in rodent primary sensory areas has not been examined directly. We studied local integration of sensory features in primary visual cortex (V1) of the mouse by presenting drifting grating and plaid stimuli, while recording the activity of neuronal populations with two-photon calcium imaging. Using a Bayesian model-based analysis framework, we classified single-cell responses as being selective for either individual grating components or for moving plaid patterns. Rather than relying on trial-averaged responses, our model-based framework takes into account single-trial responses and can easily be extended to consider any number of arbitrary predictive models. Our analysis method was able to successfully classify significantly more responses than traditional partial correlation (PC) analysis, and provides a rigorous statistical framework to rank any number of models and reject poorly performing models. We also found a large proportion of cells that respond strongly to only one stimulus class. In addition, a quarter of selectively responding neurons had more complex responses that could not be explained by any simple integration model. Our results show that a broad range of pattern integration processes already take place at the level of V1. This diversity of integration is consistent with processing of visual inputs by local sub-networks within V1 that are tuned to combinations of sensory features. PMID:26300738
Romei, Vincenzo; Thut, Gregor; Mok, Robert M; Schyns, Philippe G; Driver, Jon
2012-03-01
Although oscillatory activity in the alpha band was traditionally associated with lack of alertness, more recent work has linked it to specific cognitive functions, including visual attention. The emerging method of rhythmic transcranial magnetic stimulation (TMS) allows causal interventional tests for the online impact on performance of TMS administered in short bursts at a particular frequency. TMS bursts at 10 Hz have recently been shown to have an impact on spatial visual attention, but any role in featural attention remains unclear. Here we used rhythmic TMS at 10 Hz to assess the impact on attending to global or local components of a hierarchical Navon-like stimulus (D. Navon (1977) Forest before trees: The precedence of global features in visual perception. Cognit. Psychol., 9, 353), in a paradigm recently used with TMS at other frequencies (V. Romei, J. Driver, P.G. Schyns & G. Thut. (2011) Rhythmic TMS over parietal cortex links distinct brain frequencies to global versus local visual processing. Curr. Biol., 2, 334-337). In separate groups, left or right posterior parietal sites were stimulated at 10 Hz just before presentation of the hierarchical stimulus. Participants had to identify either the local or global component in separate blocks. Right parietal 10 Hz stimulation (vs. sham) significantly impaired global processing without affecting local processing, while left parietal 10 Hz stimulation vs. sham impaired local processing with a minor trend to enhance global processing. These 10 Hz outcomes differed significantly from stimulation at other frequencies (i.e. 5 or 20 Hz) over the same site in other recent work with the same paradigm. These dissociations confirm differential roles of the two hemispheres in local vs. global processing, and reveal a frequency-specific role for stimulation in the alpha band for regulating feature-based visual attention. © 2012 The Authors. European Journal of Neuroscience © 2012 Federation of European Neuroscience Societies and Blackwell Publishing Ltd.
Deficit in visual temporal integration in autism spectrum disorders.
Nakano, Tamami; Ota, Haruhisa; Kato, Nobumasa; Kitazawa, Shigeru
2010-04-07
Individuals with autism spectrum disorders (ASD) are superior in processing local features. Frith and Happe conceptualize this cognitive bias as 'weak central coherence', implying that a local enhancement derives from a weakness in integrating local elements into a coherent whole. The suggested deficit has been challenged, however, because individuals with ASD were not found to be inferior to normal controls in holistic perception. In these opposing studies, however, subjects were encouraged to ignore local features and attend to the whole. Therefore, no one has directly tested whether individuals with ASD are able to integrate local elements over time into a whole image. Here, we report a weakness of individuals with ASD in naming familiar objects moved behind a narrow slit, which was worsened by the absence of local salient features. The results indicate that individuals with ASD have a clear deficit in integrating local visual information over time into a global whole, providing direct evidence for the weak central coherence hypothesis.
A Novel Image Retrieval Based on Visual Words Integration of SIFT and SURF
Ali, Nouman; Bajwa, Khalid Bashir; Sablatnig, Robert; Chatzichristofis, Savvas A.; Iqbal, Zeshan; Rashid, Muhammad; Habib, Hafiz Adnan
2016-01-01
With the recent evolution of technology, the number of image archives has increased exponentially. In Content-Based Image Retrieval (CBIR), high-level visual information is represented in the form of low-level features. The semantic gap between the low-level features and the high-level image concepts is an open research problem. In this paper, we present a novel visual words integration of Scale Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF). The two local features representations are selected for image retrieval because SIFT is more robust to the change in scale and rotation, while SURF is robust to changes in illumination. The visual words integration of SIFT and SURF adds the robustness of both features to image retrieval. The qualitative and quantitative comparisons conducted on Corel-1000, Corel-1500, Corel-2000, Oliva and Torralba and Ground Truth image benchmarks demonstrate the effectiveness of the proposed visual words integration. PMID:27315101
Registering Ground and Satellite Imagery for Visual Localization
2012-08-01
reckoning, inertial, stereo, light detection and ranging ( LIDAR ), cellular radio, and visual. As no sensor or algorithm provides perfect localization in...by metric localization approaches to confine the region of a map that needs to be searched. Simultaneous Localization and Mapping ( SLAM ) (5, 6), using...estimate the metric location of the camera. Se et al. (7) use SIFT features for both appearance-based global localization and incremental 3D SLAM . Johns and
Objects Classification by Learning-Based Visual Saliency Model and Convolutional Neural Network.
Li, Na; Zhao, Xinbo; Yang, Yongjia; Zou, Xiaochun
2016-01-01
Humans can easily classify different kinds of objects whereas it is quite difficult for computers. As a hot and difficult problem, objects classification has been receiving extensive interests with broad prospects. Inspired by neuroscience, deep learning concept is proposed. Convolutional neural network (CNN) as one of the methods of deep learning can be used to solve classification problem. But most of deep learning methods, including CNN, all ignore the human visual information processing mechanism when a person is classifying objects. Therefore, in this paper, inspiring the completed processing that humans classify different kinds of objects, we bring forth a new classification method which combines visual attention model and CNN. Firstly, we use the visual attention model to simulate the processing of human visual selection mechanism. Secondly, we use CNN to simulate the processing of how humans select features and extract the local features of those selected areas. Finally, not only does our classification method depend on those local features, but also it adds the human semantic features to classify objects. Our classification method has apparently advantages in biology. Experimental results demonstrated that our method made the efficiency of classification improve significantly.
Rahman, Md Mahmudur; Antani, Sameer K; Demner-Fushman, Dina; Thoma, George R
2015-10-01
This article presents an approach to biomedical image retrieval by mapping image regions to local concepts where images are represented in a weighted entropy-based concept feature space. The term "concept" refers to perceptually distinguishable visual patches that are identified locally in image regions and can be mapped to a glossary of imaging terms. Further, the visual significance (e.g., visualness) of concepts is measured as the Shannon entropy of pixel values in image patches and is used to refine the feature vector. Moreover, the system can assist the user in interactively selecting a region-of-interest (ROI) and searching for similar image ROIs. Further, a spatial verification step is used as a postprocessing step to improve retrieval results based on location information. The hypothesis that such approaches would improve biomedical image retrieval is validated through experiments on two different data sets, which are collected from open access biomedical literature.
Rahman, Md. Mahmudur; Antani, Sameer K.; Demner-Fushman, Dina; Thoma, George R.
2015-01-01
Abstract. This article presents an approach to biomedical image retrieval by mapping image regions to local concepts where images are represented in a weighted entropy-based concept feature space. The term “concept” refers to perceptually distinguishable visual patches that are identified locally in image regions and can be mapped to a glossary of imaging terms. Further, the visual significance (e.g., visualness) of concepts is measured as the Shannon entropy of pixel values in image patches and is used to refine the feature vector. Moreover, the system can assist the user in interactively selecting a region-of-interest (ROI) and searching for similar image ROIs. Further, a spatial verification step is used as a postprocessing step to improve retrieval results based on location information. The hypothesis that such approaches would improve biomedical image retrieval is validated through experiments on two different data sets, which are collected from open access biomedical literature. PMID:26730398
Comparative analysis and visualization of multiple collinear genomes
2012-01-01
Background Genome browsers are a common tool used by biologists to visualize genomic features including genes, polymorphisms, and many others. However, existing genome browsers and visualization tools are not well-suited to perform meaningful comparative analysis among a large number of genomes. With the increasing quantity and availability of genomic data, there is an increased burden to provide useful visualization and analysis tools for comparison of multiple collinear genomes such as the large panels of model organisms which are the basis for much of the current genetic research. Results We have developed a novel web-based tool for visualizing and analyzing multiple collinear genomes. Our tool illustrates genome-sequence similarity through a mosaic of intervals representing local phylogeny, subspecific origin, and haplotype identity. Comparative analysis is facilitated through reordering and clustering of tracks, which can vary throughout the genome. In addition, we provide local phylogenetic trees as an alternate visualization to assess local variations. Conclusions Unlike previous genome browsers and viewers, ours allows for simultaneous and comparative analysis. Our browser provides intuitive selection and interactive navigation about features of interest. Dynamic visualizations adjust to scale and data content making analysis at variable resolutions and of multiple data sets more informative. We demonstrate our genome browser for an extensive set of genomic data sets composed of almost 200 distinct mouse laboratory strains. PMID:22536897
GABAergic neurons in ferret visual cortex participate in functionally specific networks
Wilson, Daniel E.; Smith, Gordon B.; Jacob, Amanda; Walker, Theo; Dimidschstein, Jordane; Fishell, Gord J.; Fitzpatrick, David
2017-01-01
Summary Functional circuits in the visual cortex require the coordinated activity of excitatory and inhibitory neurons. Molecular genetic approaches in the mouse have led to the ‘local nonspecific pooling principle’ of inhibitory connectivity, in which inhibitory neurons are untuned for stimulus features due to the random pooling of local inputs. However, it remains unclear whether this principle generalizes to species with a columnar organization of feature selectivity such as carnivores, primates, and humans. Here we use virally-mediated GABAergic-specific GCaMP6f expression to demonstrate that inhibitory neurons in ferret visual cortex respond robustly and selectively to oriented stimuli. We find that the tuning of inhibitory neurons is inconsistent with the local non-specific pooling of excitatory inputs, and that inhibitory neurons exhibit orientation-specific noise correlations with local and distant excitatory neurons. These findings challenge the generality of the non-specific pooling principle for inhibitory neurons, suggesting different rules for functional excitatory-inhibitory interactions in non-murine species. PMID:28279352
Neuromorphic audio-visual sensor fusion on a sound-localizing robot.
Chan, Vincent Yue-Sek; Jin, Craig T; van Schaik, André
2012-01-01
This paper presents the first robotic system featuring audio-visual (AV) sensor fusion with neuromorphic sensors. We combine a pair of silicon cochleae and a silicon retina on a robotic platform to allow the robot to learn sound localization through self motion and visual feedback, using an adaptive ITD-based sound localization algorithm. After training, the robot can localize sound sources (white or pink noise) in a reverberant environment with an RMS error of 4-5° in azimuth. We also investigate the AV source binding problem and an experiment is conducted to test the effectiveness of matching an audio event with a corresponding visual event based on their onset time. Despite the simplicity of this method and a large number of false visual events in the background, a correct match can be made 75% of the time during the experiment.
Image fusion using sparse overcomplete feature dictionaries
Brumby, Steven P.; Bettencourt, Luis; Kenyon, Garrett T.; Chartrand, Rick; Wohlberg, Brendt
2015-10-06
Approaches for deciding what individuals in a population of visual system "neurons" are looking for using sparse overcomplete feature dictionaries are provided. A sparse overcomplete feature dictionary may be learned for an image dataset and a local sparse representation of the image dataset may be built using the learned feature dictionary. A local maximum pooling operation may be applied on the local sparse representation to produce a translation-tolerant representation of the image dataset. An object may then be classified and/or clustered within the translation-tolerant representation of the image dataset using a supervised classification algorithm and/or an unsupervised clustering algorithm.
NASA Astrophysics Data System (ADS)
Rahman, Md M.; Antani, Sameer K.; Demner-Fushman, Dina; Thoma, George R.
2015-03-01
This paper presents a novel approach to biomedical image retrieval by mapping image regions to local concepts and represent images in a weighted entropy-based concept feature space. The term concept refers to perceptually distinguishable visual patches that are identified locally in image regions and can be mapped to a glossary of imaging terms. Further, the visual significance (e.g., visualness) of concepts is measured as Shannon entropy of pixel values in image patches and is used to refine the feature vector. Moreover, the system can assist user in interactively select a Region-Of-Interest (ROI) and search for similar image ROIs. Further, a spatial verification step is used as a post-processing step to improve retrieval results based on location information. The hypothesis that such approaches would improve biomedical image retrieval, is validated through experiments on a data set of 450 lung CT images extracted from journal articles from four different collections.
NASA Astrophysics Data System (ADS)
Jin, Xin; Jiang, Qian; Yao, Shaowen; Zhou, Dongming; Nie, Rencan; Lee, Shin-Jye; He, Kangjian
2018-01-01
In order to promote the performance of infrared and visual image fusion and provide better visual effects, this paper proposes a hybrid fusion method for infrared and visual image by the combination of discrete stationary wavelet transform (DSWT), discrete cosine transform (DCT) and local spatial frequency (LSF). The proposed method has three key processing steps. Firstly, DSWT is employed to decompose the important features of the source image into a series of sub-images with different levels and spatial frequencies. Secondly, DCT is used to separate the significant details of the sub-images according to the energy of different frequencies. Thirdly, LSF is applied to enhance the regional features of DCT coefficients, and it can be helpful and useful for image feature extraction. Some frequently-used image fusion methods and evaluation metrics are employed to evaluate the validity of the proposed method. The experiments indicate that the proposed method can achieve good fusion effect, and it is more efficient than other conventional image fusion methods.
NASA Technical Reports Server (NTRS)
Eckstein, M. P.; Thomas, J. P.; Palmer, J.; Shimozaki, S. S.
2000-01-01
Recently, quantitative models based on signal detection theory have been successfully applied to the prediction of human accuracy in visual search for a target that differs from distractors along a single attribute (feature search). The present paper extends these models for visual search accuracy to multidimensional search displays in which the target differs from the distractors along more than one feature dimension (conjunction, disjunction, and triple conjunction displays). The model assumes that each element in the display elicits a noisy representation for each of the relevant feature dimensions. The observer combines the representations across feature dimensions to obtain a single decision variable, and the stimulus with the maximum value determines the response. The model accurately predicts human experimental data on visual search accuracy in conjunctions and disjunctions of contrast and orientation. The model accounts for performance degradation without resorting to a limited-capacity spatially localized and temporally serial mechanism by which to bind information across feature dimensions.
Local Homing Navigation Based on the Moment Model for Landmark Distribution and Features
Lee, Changmin; Kim, DaeEun
2017-01-01
For local homing navigation, an agent is supposed to return home based on the surrounding environmental information. According to the snapshot model, the home snapshot and the current view are compared to determine the homing direction. In this paper, we propose a novel homing navigation method using the moment model. The suggested moment model also follows the snapshot theory to compare the home snapshot and the current view, but the moment model defines a moment of landmark inertia as the sum of the product of the feature of the landmark particle with the square of its distance. The method thus uses range values of landmarks in the surrounding view and the visual features. The center of the moment can be estimated as the reference point, which is the unique convergence point in the moment potential from any view. The homing vector can easily be extracted from the centers of the moment measured at the current position and the home location. The method effectively guides homing direction in real environments, as well as in the simulation environment. In this paper, we take a holistic approach to use all pixels in the panoramic image as landmarks and use the RGB color intensity for the visual features in the moment model in which a set of three moment functions is encoded to determine the homing vector. We also tested visual homing or the moment model with only visual features, but the suggested moment model with both the visual feature and the landmark distance shows superior performance. We demonstrate homing performance with various methods classified by the status of the feature, the distance and the coordinate alignment. PMID:29149043
Mihalas, Stefan; Dong, Yi; von der Heydt, Rüdiger; Niebur, Ernst
2011-01-01
Visual attention is often understood as a modulatory field acting at early stages of processing, but the mechanisms that direct and fit the field to the attended object are not known. We show that a purely spatial attention field propagating downward in the neuronal network responsible for perceptual organization will be reshaped, repositioned, and sharpened to match the object's shape and scale. Key features of the model are grouping neurons integrating local features into coherent tentative objects, excitatory feedback to the same local feature neurons that caused grouping neuron activation, and inhibition between incompatible interpretations both at the local feature level and at the object representation level. PMID:21502489
Parts-based stereoscopic image assessment by learning binocular manifold color visual properties
NASA Astrophysics Data System (ADS)
Xu, Haiyong; Yu, Mei; Luo, Ting; Zhang, Yun; Jiang, Gangyi
2016-11-01
Existing stereoscopic image quality assessment (SIQA) methods are mostly based on the luminance information, in which color information is not sufficiently considered. Actually, color is part of the important factors that affect human visual perception, and nonnegative matrix factorization (NMF) and manifold learning are in line with human visual perception. We propose an SIQA method based on learning binocular manifold color visual properties. To be more specific, in the training phase, a feature detector is created based on NMF with manifold regularization by considering color information, which not only allows parts-based manifold representation of an image, but also manifests localized color visual properties. In the quality estimation phase, visually important regions are selected by considering different human visual attention, and feature vectors are extracted by using the feature detector. Then the feature similarity index is calculated and the parts-based manifold color feature energy (PMCFE) for each view is defined based on the color feature vectors. The final quality score is obtained by considering a binocular combination based on PMCFE. The experimental results on LIVE I and LIVE Π 3-D IQA databases demonstrate that the proposed method can achieve much higher consistency with subjective evaluations than the state-of-the-art SIQA methods.
Brooks, Joseph L.; Gilaie-Dotan, Sharon; Rees, Geraint; Bentin, Shlomo; Driver, Jon
2012-01-01
Visual perception depends not only on local stimulus features but also on their relationship to the surrounding stimulus context, as evident in both local and contextual influences on figure-ground segmentation. Intermediate visual areas may play a role in such contextual influences, as we tested here by examining LG, a rare case of developmental visual agnosia. LG has no evident abnormality of brain structure and functional neuroimaging showed relatively normal V1 function, but his intermediate visual areas (V2/V3) function abnormally. We found that contextual influences on figure-ground organization were selectively disrupted in LG, while local sources of figure-ground influences were preserved. Effects of object knowledge and familiarity on figure-ground organization were also significantly diminished. Our results suggest that the mechanisms mediating contextual and familiarity influences on figure-ground organization are dissociable from those mediating local influences on figure-ground assignment. The disruption of contextual processing in intermediate visual areas may play a role in the substantial object recognition difficulties experienced by LG. PMID:22947116
Learning Rotation-Invariant Local Binary Descriptor.
Duan, Yueqi; Lu, Jiwen; Feng, Jianjiang; Zhou, Jie
2017-08-01
In this paper, we propose a rotation-invariant local binary descriptor (RI-LBD) learning method for visual recognition. Compared with hand-crafted local binary descriptors, such as local binary pattern and its variants, which require strong prior knowledge, local binary feature learning methods are more efficient and data-adaptive. Unlike existing learning-based local binary descriptors, such as compact binary face descriptor and simultaneous local binary feature learning and encoding, which are susceptible to rotations, our RI-LBD first categorizes each local patch into a rotational binary pattern (RBP), and then jointly learns the orientation for each pattern and the projection matrix to obtain RI-LBDs. As all the rotation variants of a patch belong to the same RBP, they are rotated into the same orientation and projected into the same binary descriptor. Then, we construct a codebook by a clustering method on the learned binary codes, and obtain a histogram feature for each image as the final representation. In order to exploit higher order statistical information, we extend our RI-LBD to the triple rotation-invariant co-occurrence local binary descriptor (TRICo-LBD) learning method, which learns a triple co-occurrence binary code for each local patch. Extensive experimental results on four different visual recognition tasks, including image patch matching, texture classification, face recognition, and scene classification, show that our RI-LBD and TRICo-LBD outperform most existing local descriptors.
NASA Astrophysics Data System (ADS)
Hassanat, Ahmad B. A.; Jassim, Sabah
2010-04-01
In this paper, the automatic lip reading problem is investigated, and an innovative approach to providing solutions to this problem has been proposed. This new VSR approach is dependent on the signature of the word itself, which is obtained from a hybrid feature extraction method dependent on geometric, appearance, and image transform features. The proposed VSR approach is termed "visual words". The visual words approach consists of two main parts, 1) Feature extraction/selection, and 2) Visual speech feature recognition. After localizing face and lips, several visual features for the lips where extracted. Such as the height and width of the mouth, mutual information and the quality measurement between the DWT of the current ROI and the DWT of the previous ROI, the ratio of vertical to horizontal features taken from DWT of ROI, The ratio of vertical edges to horizontal edges of ROI, the appearance of the tongue and the appearance of teeth. Each spoken word is represented by 8 signals, one of each feature. Those signals maintain the dynamic of the spoken word, which contains a good portion of information. The system is then trained on these features using the KNN and DTW. This approach has been evaluated using a large database for different people, and large experiment sets. The evaluation has proved the visual words efficiency, and shown that the VSR is a speaker dependent problem.
Relationship between visual binding, reentry and awareness.
Koivisto, Mika; Silvanto, Juha
2011-12-01
Visual feature binding has been suggested to depend on reentrant processing. We addressed the relationship between binding, reentry, and visual awareness by asking the participants to discriminate the color and orientation of a colored bar (presented either alone or simultaneously with a white distractor bar) and to report their phenomenal awareness of the target features. The success of reentry was manipulated with object substitution masking and backward masking. The results showed that late reentrant processes are necessary for successful binding but not for phenomenal awareness of the bound features. Binding errors were accompanied by phenomenal awareness of the misbound feature conjunctions, demonstrating that they were experienced as real properties of the stimuli (i.e., illusory conjunctions). Our results suggest that early preattentive binding and local recurrent processing enable features to reach phenomenal awareness, while later attention-related reentrant iterations modulate the way in which the features are bound and experienced in awareness. Copyright © 2011 Elsevier Inc. All rights reserved.
Fox, Olivia M.; Harel, Assaf; Bennett, Kevin B.
2017-01-01
The perception of a visual stimulus is dependent not only upon local features, but also on the arrangement of those features. When stimulus features are perceptually well organized (e.g., symmetric or parallel), a global configuration with a high degree of salience emerges from the interactions between these features, often referred to as emergent features. Emergent features can be demonstrated in the Configural Superiority Effect (CSE): presenting a stimulus within an organized context relative to its presentation in a disarranged one results in better performance. Prior neuroimaging work on the perception of emergent features regards the CSE as an “all or none” phenomenon, focusing on the contrast between configural and non-configural stimuli. However, it is still not clear how emergent features are processed between these two endpoints. The current study examined the extent to which behavioral and neuroimaging markers of emergent features are responsive to the degree of configurality in visual displays. Subjects were tasked with reporting the anomalous quadrant in a visual search task while being scanned. Degree of configurality was manipulated by incrementally varying the rotational angle of low-level features within the stimulus arrays. Behaviorally, we observed faster response times with increasing levels of configurality. These behavioral changes were accompanied by increases in response magnitude across multiple visual areas in occipito-temporal cortex, primarily early visual cortex and object-selective cortex. Our findings suggest that the neural correlates of emergent features can be observed even in response to stimuli that are not fully configural, and demonstrate that configural information is already present at early stages of the visual hierarchy. PMID:28167924
Feature-selective attention enhances color signals in early visual areas of the human brain.
Müller, M M; Andersen, S; Trujillo, N J; Valdés-Sosa, P; Malinowski, P; Hillyard, S A
2006-09-19
We used an electrophysiological measure of selective stimulus processing (the steady-state visual evoked potential, SSVEP) to investigate feature-specific attention to color cues. Subjects viewed a display consisting of spatially intermingled red and blue dots that continually shifted their positions at random. The red and blue dots flickered at different frequencies and thereby elicited distinguishable SSVEP signals in the visual cortex. Paying attention selectively to either the red or blue dot population produced an enhanced amplitude of its frequency-tagged SSVEP, which was localized by source modeling to early levels of the visual cortex. A control experiment showed that this selection was based on color rather than flicker frequency cues. This signal amplification of attended color items provides an empirical basis for the rapid identification of feature conjunctions during visual search, as proposed by "guided search" models.
The Forest, the Trees, and the Leaves: Differences of Processing across Development
ERIC Educational Resources Information Center
Krakowski, Claire-Sara; Poirel, Nicolas; Vidal, Julie; Roëll, Margot; Pineau, Arlette; Borst, Grégoire; Houdé, Olivier
2016-01-01
To act and think, children and adults are continually required to ignore irrelevant visual information to focus on task-relevant items. As real-world visual information is organized into structures, we designed a feature visual search task containing 3-level hierarchical stimuli (i.e., local shapes that constituted intermediate shapes that formed…
Tschechne, Stephan; Neumann, Heiko
2014-01-01
Visual structures in the environment are segmented into image regions and those combined to a representation of surfaces and prototypical objects. Such a perceptual organization is performed by complex neural mechanisms in the visual cortex of primates. Multiple mutually connected areas in the ventral cortical pathway receive visual input and extract local form features that are subsequently grouped into increasingly complex, more meaningful image elements. Such a distributed network of processing must be capable to make accessible highly articulated changes in shape boundary as well as very subtle curvature changes that contribute to the perception of an object. We propose a recurrent computational network architecture that utilizes hierarchical distributed representations of shape features to encode surface and object boundary over different scales of resolution. Our model makes use of neural mechanisms that model the processing capabilities of early and intermediate stages in visual cortex, namely areas V1–V4 and IT. We suggest that multiple specialized component representations interact by feedforward hierarchical processing that is combined with feedback signals driven by representations generated at higher stages. Based on this, global configurational as well as local information is made available to distinguish changes in the object's contour. Once the outline of a shape has been established, contextual contour configurations are used to assign border ownership directions and thus achieve segregation of figure and ground. The model, thus, proposes how separate mechanisms contribute to distributed hierarchical cortical shape representation and combine with processes of figure-ground segregation. Our model is probed with a selection of stimuli to illustrate processing results at different processing stages. We especially highlight how modulatory feedback connections contribute to the processing of visual input at various stages in the processing hierarchy. PMID:25157228
Tschechne, Stephan; Neumann, Heiko
2014-01-01
Visual structures in the environment are segmented into image regions and those combined to a representation of surfaces and prototypical objects. Such a perceptual organization is performed by complex neural mechanisms in the visual cortex of primates. Multiple mutually connected areas in the ventral cortical pathway receive visual input and extract local form features that are subsequently grouped into increasingly complex, more meaningful image elements. Such a distributed network of processing must be capable to make accessible highly articulated changes in shape boundary as well as very subtle curvature changes that contribute to the perception of an object. We propose a recurrent computational network architecture that utilizes hierarchical distributed representations of shape features to encode surface and object boundary over different scales of resolution. Our model makes use of neural mechanisms that model the processing capabilities of early and intermediate stages in visual cortex, namely areas V1-V4 and IT. We suggest that multiple specialized component representations interact by feedforward hierarchical processing that is combined with feedback signals driven by representations generated at higher stages. Based on this, global configurational as well as local information is made available to distinguish changes in the object's contour. Once the outline of a shape has been established, contextual contour configurations are used to assign border ownership directions and thus achieve segregation of figure and ground. The model, thus, proposes how separate mechanisms contribute to distributed hierarchical cortical shape representation and combine with processes of figure-ground segregation. Our model is probed with a selection of stimuli to illustrate processing results at different processing stages. We especially highlight how modulatory feedback connections contribute to the processing of visual input at various stages in the processing hierarchy.
Local and Global Auditory Processing: Behavioral and ERP Evidence
Sanders, Lisa D.; Poeppel, David
2007-01-01
Differential processing of local and global visual features is well established. Global precedence effects, differences in event-related potentials (ERPs) elicited when attention is focused on local versus global levels, and hemispheric specialization for local and global features all indicate that relative scale of detail is an important distinction in visual processing. Observing analogous differential processing of local and global auditory information would suggest that scale of detail is a general organizational principle of the brain. However, to date the research on auditory local and global processing has primarily focused on music perception or on the perceptual analysis of relatively higher and lower frequencies. The study described here suggests that temporal aspects of auditory stimuli better capture the local-global distinction. By combining short (40 ms) frequency modulated tones in series to create global auditory patterns (500 ms), we independently varied whether pitch increased or decreased over short time spans (local) and longer time spans (global). Accuracy and reaction time measures revealed better performance for global judgments and asymmetric interference that were modulated by amount of pitch change. ERPs recorded while participants listened to identical sounds and indicated the direction of pitch change at the local or global levels provided evidence for differential processing similar to that found in ERP studies employing hierarchical visual stimuli. ERP measures failed to provide evidence for lateralization of local and global auditory perception, but differences in distributions suggest preferential processing in more ventral and dorsal areas respectively. PMID:17113115
Coding visual features extracted from video sequences.
Baroffio, Luca; Cesana, Matteo; Redondi, Alessandro; Tagliasacchi, Marco; Tubaro, Stefano
2014-05-01
Visual features are successfully exploited in several applications (e.g., visual search, object recognition and tracking, etc.) due to their ability to efficiently represent image content. Several visual analysis tasks require features to be transmitted over a bandwidth-limited network, thus calling for coding techniques to reduce the required bit budget, while attaining a target level of efficiency. In this paper, we propose, for the first time, a coding architecture designed for local features (e.g., SIFT, SURF) extracted from video sequences. To achieve high coding efficiency, we exploit both spatial and temporal redundancy by means of intraframe and interframe coding modes. In addition, we propose a coding mode decision based on rate-distortion optimization. The proposed coding scheme can be conveniently adopted to implement the analyze-then-compress (ATC) paradigm in the context of visual sensor networks. That is, sets of visual features are extracted from video frames, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast to the traditional compress-then-analyze (CTA) paradigm, in which video sequences acquired at a node are compressed and then sent to a central unit for further processing. In this paper, we compare these coding paradigms using metrics that are routinely adopted to evaluate the suitability of visual features in the context of content-based retrieval, object recognition, and tracking. Experimental results demonstrate that, thanks to the significant coding gains achieved by the proposed coding scheme, ATC outperforms CTA with respect to all evaluation metrics.
NASA Astrophysics Data System (ADS)
Sanghavi, Foram; Agaian, Sos
2017-05-01
The goal of this paper is to (a) test the nuclei based Computer Aided Cancer Detection system using Human Visual based system on the histopathology images and (b) Compare the results of the proposed system with the Local Binary Pattern and modified Fibonacci -p pattern systems. The system performance is evaluated using different parameters such as accuracy, specificity, sensitivity, positive predictive value, and negative predictive value on 251 prostate histopathology images. The accuracy of 96.69% was observed for cancer detection using the proposed human visual based system compared to 87.42% and 94.70% observed for Local Binary patterns and the modified Fibonacci p patterns.
Visual Odometry Based on Structural Matching of Local Invariant Features Using Stereo Camera Sensor
Núñez, Pedro; Vázquez-Martín, Ricardo; Bandera, Antonio
2011-01-01
This paper describes a novel sensor system to estimate the motion of a stereo camera. Local invariant image features are matched between pairs of frames and linked into image trajectories at video rate, providing the so-called visual odometry, i.e., motion estimates from visual input alone. Our proposal conducts two matching sessions: the first one between sets of features associated to the images of the stereo pairs and the second one between sets of features associated to consecutive frames. With respect to previously proposed approaches, the main novelty of this proposal is that both matching algorithms are conducted by means of a fast matching algorithm which combines absolute and relative feature constraints. Finding the largest-valued set of mutually consistent matches is equivalent to finding the maximum-weighted clique on a graph. The stereo matching allows to represent the scene view as a graph which emerge from the features of the accepted clique. On the other hand, the frame-to-frame matching defines a graph whose vertices are features in 3D space. The efficiency of the approach is increased by minimizing the geometric and algebraic errors to estimate the final displacement of the stereo camera between consecutive acquired frames. The proposed approach has been tested for mobile robotics navigation purposes in real environments and using different features. Experimental results demonstrate the performance of the proposal, which could be applied in both industrial and service robot fields. PMID:22164016
Multilevel depth and image fusion for human activity detection.
Ni, Bingbing; Pei, Yong; Moulin, Pierre; Yan, Shuicheng
2013-10-01
Recognizing complex human activities usually requires the detection and modeling of individual visual features and the interactions between them. Current methods only rely on the visual features extracted from 2-D images, and therefore often lead to unreliable salient visual feature detection and inaccurate modeling of the interaction context between individual features. In this paper, we show that these problems can be addressed by combining data from a conventional camera and a depth sensor (e.g., Microsoft Kinect). We propose a novel complex activity recognition and localization framework that effectively fuses information from both grayscale and depth image channels at multiple levels of the video processing pipeline. In the individual visual feature detection level, depth-based filters are applied to the detected human/object rectangles to remove false detections. In the next level of interaction modeling, 3-D spatial and temporal contexts among human subjects or objects are extracted by integrating information from both grayscale and depth images. Depth information is also utilized to distinguish different types of indoor scenes. Finally, a latent structural model is developed to integrate the information from multiple levels of video processing for an activity detection. Extensive experiments on two activity recognition benchmarks (one with depth information) and a challenging grayscale + depth human activity database that contains complex interactions between human-human, human-object, and human-surroundings demonstrate the effectiveness of the proposed multilevel grayscale + depth fusion scheme. Higher recognition and localization accuracies are obtained relative to the previous methods.
[Several mechanisms of visual gnosis disorders in local brain lesions].
Meerson, Ia A
1981-01-01
The object of the studies were peculiarities of recognizing visual images by patients with local cerebral lesions under conditions of incomplete sets of the image features, disjunction of the latter, distortion of their spatial arrangement, and unusual spatial orientation of the image as a whole. It was found that elimination of even one essential feature sharply hampered the recognition of the image both by healthy individuals (control), and patients with extraoccipital lesions, whereas elimination of several nonessential features only slowed down the process. In distinction from this the difficulties of the recognition of incomplete images by patients with occipital lesions were directly proportional to the number of the eliminated features irrespective of the latters' significance, i.e. these patients were unable to evaluate the hierarchy of the features. The recognition process in these patients were followed the way of scanning individual features. The reaccumulation and summation. The recognition of the fragmental, spatially distorted and unusually oriented images was found to be affected selectively in patients with parietal lobe affections. The patients with occipital lesions recognized such images practically as good as the ordinary ones.
Characterizing the effects of feature salience and top-down attention in the early visual system.
Poltoratski, Sonia; Ling, Sam; McCormack, Devin; Tong, Frank
2017-07-01
The visual system employs a sophisticated balance of attentional mechanisms: salient stimuli are prioritized for visual processing, yet observers can also ignore such stimuli when their goals require directing attention elsewhere. A powerful determinant of visual salience is local feature contrast: if a local region differs from its immediate surround along one or more feature dimensions, it will appear more salient. We used high-resolution functional MRI (fMRI) at 7T to characterize the modulatory effects of bottom-up salience and top-down voluntary attention within multiple sites along the early visual pathway, including visual areas V1-V4 and the lateral geniculate nucleus (LGN). Observers viewed arrays of spatially distributed gratings, where one of the gratings immediately to the left or right of fixation differed from all other items in orientation or motion direction, making it salient. To investigate the effects of directed attention, observers were cued to attend to the grating to the left or right of fixation, which was either salient or nonsalient. Results revealed reliable additive effects of top-down attention and stimulus-driven salience throughout visual areas V1-hV4. In comparison, the LGN exhibited significant attentional enhancement but was not reliably modulated by orientation- or motion-defined salience. Our findings indicate that top-down effects of spatial attention can influence visual processing at the earliest possible site along the visual pathway, including the LGN, whereas the processing of orientation- and motion-driven salience primarily involves feature-selective interactions that take place in early cortical visual areas. NEW & NOTEWORTHY While spatial attention allows for specific, goal-driven enhancement of stimuli, salient items outside of the current focus of attention must also be prioritized. We used 7T fMRI to compare salience and spatial attentional enhancement along the early visual hierarchy. We report additive effects of attention and bottom-up salience in early visual areas, suggesting that salience enhancement is not contingent on the observer's attentional state. Copyright © 2017 the American Physiological Society.
ERIC Educational Resources Information Center
Nordfang, Maria; Dyrholm, Mads; Bundesen, Claus
2013-01-01
The attentional weight of a visual object depends on the contrast of the features of the object to its local surroundings (feature contrast) and the relevance of the features to one's goals (feature relevance). We investigated the dependency in partial report experiments with briefly presented stimuli but unspeeded responses. The task was to…
Learning Human Actions by Combining Global Dynamics and Local Appearance.
Luo, Guan; Yang, Shuang; Tian, Guodong; Yuan, Chunfeng; Hu, Weiming; Maybank, Stephen J
2014-12-01
In this paper, we address the problem of human action recognition through combining global temporal dynamics and local visual spatio-temporal appearance features. For this purpose, in the global temporal dimension, we propose to model the motion dynamics with robust linear dynamical systems (LDSs) and use the model parameters as motion descriptors. Since LDSs live in a non-Euclidean space and the descriptors are in non-vector form, we propose a shift invariant subspace angles based distance to measure the similarity between LDSs. In the local visual dimension, we construct curved spatio-temporal cuboids along the trajectories of densely sampled feature points and describe them using histograms of oriented gradients (HOG). The distance between motion sequences is computed with the Chi-Squared histogram distance in the bag-of-words framework. Finally we perform classification using the maximum margin distance learning method by combining the global dynamic distances and the local visual distances. We evaluate our approach for action recognition on five short clips data sets, namely Weizmann, KTH, UCF sports, Hollywood2 and UCF50, as well as three long continuous data sets, namely VIRAT, ADL and CRIM13. We show competitive results as compared with current state-of-the-art methods.
W-tree indexing for fast visual word generation.
Shi, Miaojing; Xu, Ruixin; Tao, Dacheng; Xu, Chao
2013-03-01
The bag-of-visual-words representation has been widely used in image retrieval and visual recognition. The most time-consuming step in obtaining this representation is the visual word generation, i.e., assigning visual words to the corresponding local features in a high-dimensional space. Recently, structures based on multibranch trees and forests have been adopted to reduce the time cost. However, these approaches cannot perform well without a large number of backtrackings. In this paper, by considering the spatial correlation of local features, we can significantly speed up the time consuming visual word generation process while maintaining accuracy. In particular, visual words associated with certain structures frequently co-occur; hence, we can build a co-occurrence table for each visual word for a large-scale data set. By associating each visual word with a probability according to the corresponding co-occurrence table, we can assign a probabilistic weight to each node of a certain index structure (e.g., a KD-tree and a K-means tree), in order to re-direct the searching path to be close to its global optimum within a small number of backtrackings. We carefully study the proposed scheme by comparing it with the fast library for approximate nearest neighbors and the random KD-trees on the Oxford data set. Thorough experimental results suggest the efficiency and effectiveness of the new scheme.
Ince, Robin A. A.; Jaworska, Katarzyna; Gross, Joachim; Panzeri, Stefano; van Rijsbergen, Nicola J.; Rousselet, Guillaume A.; Schyns, Philippe G.
2016-01-01
A key to understanding visual cognition is to determine “where”, “when”, and “how” brain responses reflect the processing of the specific visual features that modulate categorization behavior—the “what”. The N170 is the earliest Event-Related Potential (ERP) that preferentially responds to faces. Here, we demonstrate that a paradigmatic shift is necessary to interpret the N170 as the product of an information processing network that dynamically codes and transfers face features across hemispheres, rather than as a local stimulus-driven event. Reverse-correlation methods coupled with information-theoretic analyses revealed that visibility of the eyes influences face detection behavior. The N170 initially reflects coding of the behaviorally relevant eye contralateral to the sensor, followed by a causal communication of the other eye from the other hemisphere. These findings demonstrate that the deceptively simple N170 ERP hides a complex network information processing mechanism involving initial coding and subsequent cross-hemispheric transfer of visual features. PMID:27550865
Ernst, Udo A.; Schiffer, Alina; Persike, Malte; Meinhardt, Günter
2016-01-01
Processing natural scenes requires the visual system to integrate local features into global object descriptions. To achieve coherent representations, the human brain uses statistical dependencies to guide weighting of local feature conjunctions. Pairwise interactions among feature detectors in early visual areas may form the early substrate of these local feature bindings. To investigate local interaction structures in visual cortex, we combined psychophysical experiments with computational modeling and natural scene analysis. We first measured contrast thresholds for 2 × 2 grating patch arrangements (plaids), which differed in spatial frequency composition (low, high, or mixed), number of grating patch co-alignments (0, 1, or 2), and inter-patch distances (1° and 2° of visual angle). Contrast thresholds for the different configurations were compared to the prediction of probability summation (PS) among detector families tuned to the four retinal positions. For 1° distance the thresholds for all configurations were larger than predicted by PS, indicating inhibitory interactions. For 2° distance, thresholds were significantly lower compared to PS when the plaids were homogeneous in spatial frequency and orientation, but not when spatial frequencies were mixed or there was at least one misalignment. Next, we constructed a neural population model with horizontal laminar structure, which reproduced the detection thresholds after adaptation of connection weights. Consistent with prior work, contextual interactions were medium-range inhibition and long-range, orientation-specific excitation. However, inclusion of orientation-specific, inhibitory interactions between populations with different spatial frequency preferences were crucial for explaining detection thresholds. Finally, for all plaid configurations we computed their likelihood of occurrence in natural images. The likelihoods turned out to be inversely related to the detection thresholds obtained at larger inter-patch distances. However, likelihoods were almost independent of inter-patch distance, implying that natural image statistics could not explain the crowding-like results at short distances. This failure of natural image statistics to resolve the patch distance modulation of plaid visibility remains a challenge to the approach. PMID:27757076
Running VisIt Software on the Peregrine System | High-Performance Computing
kilobyte range. VisIt features a robust remote visualization capability. VisIt can be started on a local machine and used to visualize data on a remote compute cluster.The remote machine must be able to send VisIt module must be loaded as part of this process. To enable remote visualization the 'module load
Park, Kyu Hyung; Kim, Yong-Kyu; Woo, Se Joon; Kang, Se Woong; Lee, Won Ki; Choi, Kyung Seek; Kwak, Hyung Woo; Yoon, Ill Han; Huh, Kuhl; Kim, Jong Woo
2014-06-01
Iatrogenic occlusion of the ophthalmic artery and its branches is a rare but devastating complication of cosmetic facial filler injections. To investigate clinical and angiographic features of iatrogenic occlusion of the ophthalmic artery and its branches caused by cosmetic facial filler injections. Data from 44 patients with occlusion of the ophthalmic artery and its branches after cosmetic facial filler injections were obtained retrospectively from a national survey completed by members of the Korean Retina Society from 27 retinal centers. Clinical features were compared between patients grouped by angiographic findings and injected filler material. Visual prognosis and its relationship to angiographic findings and injected filler material. Ophthalmic artery occlusion was classified into 6 types according to angiographic findings. Twenty-eight patients had diffuse retinal and choroidal artery occlusions (ophthalmic artery occlusion, generalized posterior ciliary artery occlusion, and central retinal artery occlusion). Sixteen patients had localized occlusions (localized posterior ciliary artery occlusion, branch retinal artery occlusion, and posterior ischemic optic neuropathy). Patients with diffuse occlusions showed worse initial and final visual acuity and less visual gain compared with those having localized occlusions. Patients receiving autologous fat injections (n = 22) had diffuse ophthalmic artery occlusions, worse visual prognosis, and a higher incidence of combined brain infarction compared with patients having hyaluronic acid injections (n = 13). Clinical features of iatrogenic occlusion of the ophthalmic artery and its branches following cosmetic facial filler injections were diverse according to the location and extent of obstruction and the injected filler material. Autologous fat injections were associated with a worse visual prognosis and a higher incidence of combined cerebral infarction. Extreme caution and care should be taken during these injections, and physicians should be aware of a diverse spectrum of complications following cosmetic facial filler injections.
Chan, Louis K H; Hayward, William G
2009-02-01
In feature integration theory (FIT; A. Treisman & S. Sato, 1990), feature detection is driven by independent dimensional modules, and other searches are driven by a master map of locations that integrates dimensional information into salience signals. Although recent theoretical models have largely abandoned this distinction, some observed results are difficult to explain in its absence. The present study measured dimension-specific performance during detection and localization, tasks that require operation of dimensional modules and the master map, respectively. Results showed a dissociation between tasks in terms of both dimension-switching costs and cross-dimension attentional capture, reflecting a dimension-specific nature for detection tasks and a dimension-general nature for localization tasks. In a feature-discrimination task, results precluded an explanation based on response mode. These results are interpreted to support FIT's postulation that different mechanisms are involved in parallel and focal attention searches. This indicates that the FIT architecture should be adopted to explain the current results and that a variety of visual attention findings can be addressed within this framework. Copyright 2009 APA, all rights reserved.
A Novel Robot Visual Homing Method Based on SIFT Features
Zhu, Qidan; Liu, Chuanjia; Cai, Chengtao
2015-01-01
Warping is an effective visual homing method for robot local navigation. However, the performance of the warping method can be greatly influenced by the changes of the environment in a real scene, thus resulting in lower accuracy. In order to solve the above problem and to get higher homing precision, a novel robot visual homing algorithm is proposed by combining SIFT (scale-invariant feature transform) features with the warping method. The algorithm is novel in using SIFT features as landmarks instead of the pixels in the horizon region of the panoramic image. In addition, to further improve the matching accuracy of landmarks in the homing algorithm, a novel mismatching elimination algorithm, based on the distribution characteristics of landmarks in the catadioptric panoramic image, is proposed. Experiments on image databases and on a real scene confirm the effectiveness of the proposed method. PMID:26473880
Coupled binary embedding for large-scale image retrieval.
Zheng, Liang; Wang, Shengjin; Tian, Qi
2014-08-01
Visual matching is a crucial step in image retrieval based on the bag-of-words (BoW) model. In the baseline method, two keypoints are considered as a matching pair if their SIFT descriptors are quantized to the same visual word. However, the SIFT visual word has two limitations. First, it loses most of its discriminative power during quantization. Second, SIFT only describes the local texture feature. Both drawbacks impair the discriminative power of the BoW model and lead to false positive matches. To tackle this problem, this paper proposes to embed multiple binary features at indexing level. To model correlation between features, a multi-IDF scheme is introduced, through which different binary features are coupled into the inverted file. We show that matching verification methods based on binary features, such as Hamming embedding, can be effectively incorporated in our framework. As an extension, we explore the fusion of binary color feature into image retrieval. The joint integration of the SIFT visual word and binary features greatly enhances the precision of visual matching, reducing the impact of false positive matches. Our method is evaluated through extensive experiments on four benchmark datasets (Ukbench, Holidays, DupImage, and MIR Flickr 1M). We show that our method significantly improves the baseline approach. In addition, large-scale experiments indicate that the proposed method requires acceptable memory usage and query time compared with other approaches. Further, when global color feature is integrated, our method yields competitive performance with the state-of-the-arts.
De Lillo, Carlo; Spinozzi, Giovanna; Truppa, Valentina; Naylor, Donna M
2005-05-01
Results obtained with preschool children (Homo sapiens) were compared with results previously obtained from capuchin monkeys (Cebus apella) in matching-to-sample tasks featuring hierarchical visual stimuli. In Experiment 1, monkeys, in contrast with children, showed an advantage in matching the stimuli on the basis of their local features. These results were replicated in a 2nd experiment in which control trials enabled the authors to rule out that children used spurious cues to solve the matching task. In a 3rd experiment featuring conditions in which the density of the stimuli was manipulated, monkeys' accuracy in the processing of the global shape of the stimuli was negatively affected by the separation of the local elements, whereas children's performance was robust across testing conditions. Children's response latencies revealed a global precedence in the 2nd and 3rd experiments. These results show differences in the processing of hierarchical stimuli by humans and monkeys that emerge early during childhood. 2005 APA, all rights reserved
Cocchi, Luca; Sale, Martin V; L Gollo, Leonardo; Bell, Peter T; Nguyen, Vinh T; Zalesky, Andrew; Breakspear, Michael; Mattingley, Jason B
2016-01-01
Within the primate visual system, areas at lower levels of the cortical hierarchy process basic visual features, whereas those at higher levels, such as the frontal eye fields (FEF), are thought to modulate sensory processes via feedback connections. Despite these functional exchanges during perception, there is little shared activity between early and late visual regions at rest. How interactions emerge between regions encompassing distinct levels of the visual hierarchy remains unknown. Here we combined neuroimaging, non-invasive cortical stimulation and computational modelling to characterize changes in functional interactions across widespread neural networks before and after local inhibition of primary visual cortex or FEF. We found that stimulation of early visual cortex selectively increased feedforward interactions with FEF and extrastriate visual areas, whereas identical stimulation of the FEF decreased feedback interactions with early visual areas. Computational modelling suggests that these opposing effects reflect a fast-slow timescale hierarchy from sensory to association areas. DOI: http://dx.doi.org/10.7554/eLife.15252.001 PMID:27596931
Cocchi, Luca; Sale, Martin V; L Gollo, Leonardo; Bell, Peter T; Nguyen, Vinh T; Zalesky, Andrew; Breakspear, Michael; Mattingley, Jason B
2016-09-06
Within the primate visual system, areas at lower levels of the cortical hierarchy process basic visual features, whereas those at higher levels, such as the frontal eye fields (FEF), are thought to modulate sensory processes via feedback connections. Despite these functional exchanges during perception, there is little shared activity between early and late visual regions at rest. How interactions emerge between regions encompassing distinct levels of the visual hierarchy remains unknown. Here we combined neuroimaging, non-invasive cortical stimulation and computational modelling to characterize changes in functional interactions across widespread neural networks before and after local inhibition of primary visual cortex or FEF. We found that stimulation of early visual cortex selectively increased feedforward interactions with FEF and extrastriate visual areas, whereas identical stimulation of the FEF decreased feedback interactions with early visual areas. Computational modelling suggests that these opposing effects reflect a fast-slow timescale hierarchy from sensory to association areas.
The primary visual cortex in the neural circuit for visual orienting
NASA Astrophysics Data System (ADS)
Zhaoping, Li
The primary visual cortex (V1) is traditionally viewed as remote from influencing brain's motor outputs. However, V1 provides the most abundant cortical inputs directly to the sensory layers of superior colliculus (SC), a midbrain structure to command visual orienting such as shifting gaze and turning heads. I will show physiological, anatomical, and behavioral data suggesting that V1 transforms visual input into a saliency map to guide a class of visual orienting that is reflexive or involuntary. In particular, V1 receives a retinotopic map of visual features, such as orientation, color, and motion direction of local visual inputs; local interactions between V1 neurons perform a local-to-global computation to arrive at a saliency map that highlights conspicuous visual locations by higher V1 responses. The conspicuous location are usually, but not always, where visual input statistics changes. The population V1 outputs to SC, which is also retinotopic, enables SC to locate, by lateral inhibition between SC neurons, the most salient location as the saccadic target. Experimental tests of this hypothesis will be shown. Variations of the neural circuit for visual orienting across animal species, with more or less V1 involvement, will be discussed. Supported by the Gatsby Charitable Foundation.
Epicenters of dynamic connectivity in the adaptation of the ventral visual system.
Prčkovska, Vesna; Huijbers, Willem; Schultz, Aaron; Ortiz-Teran, Laura; Peña-Gomez, Cleofe; Villoslada, Pablo; Johnson, Keith; Sperling, Reisa; Sepulcre, Jorge
2017-04-01
Neuronal responses adapt to familiar and repeated sensory stimuli. Enhanced synchrony across wide brain systems has been postulated as a potential mechanism for this adaptation phenomenon. Here, we used recently developed graph theory methods to investigate hidden connectivity features of dynamic synchrony changes during a visual repetition paradigm. Particularly, we focused on strength connectivity changes occurring at local and distant brain neighborhoods. We found that connectivity reorganization in visual modal cortex-such as local suppressed connectivity in primary visual areas and distant suppressed connectivity in fusiform areas-is accompanied by enhanced local and distant connectivity in higher cognitive processing areas in multimodal and association cortex. Moreover, we found a shift of the dynamic functional connections from primary-visual-fusiform to primary-multimodal/association cortex. These findings suggest that repetition-suppression is made possible by reorganization of functional connectivity that enables communication between low- and high-order areas. Hum Brain Mapp 38:1965-1976, 2017. © 2017 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
Temporal and spatial adaptation of transient responses to local features
O'Carroll, David C.; Barnett, Paul D.; Nordström, Karin
2012-01-01
Interpreting visual motion within the natural environment is a challenging task, particularly considering that natural scenes vary enormously in brightness, contrast and spatial structure. The performance of current models for the detection of self-generated optic flow depends critically on these very parameters, but despite this, animals manage to successfully navigate within a broad range of scenes. Within global scenes local areas with more salient features are common. Recent work has highlighted the influence that local, salient features have on the encoding of optic flow, but it has been difficult to quantify how local transient responses affect responses to subsequent features and thus contribute to the global neural response. To investigate this in more detail we used experimenter-designed stimuli and recorded intracellularly from motion-sensitive neurons. We limited the stimulus to a small vertically elongated strip, to investigate local and global neural responses to pairs of local “doublet” features that were designed to interact with each other in the temporal and spatial domain. We show that the passage of a high-contrast doublet feature produces a complex transient response from local motion detectors consistent with predictions of a simple computational model. In the neuron, the passage of a high-contrast feature induces a local reduction in responses to subsequent low-contrast features. However, this neural contrast gain reduction appears to be recruited only when features stretch vertically (i.e., orthogonal to the direction of motion) across at least several aligned neighboring ommatidia. Horizontal displacement of the components of elongated features abolishes the local adaptation effect. It is thus likely that features in natural scenes with vertically aligned edges, such as tree trunks, recruit the greatest amount of response suppression. This property could emphasize the local responses to such features vs. those in nearby texture within the scene. PMID:23087617
Pictures Speak Louder than Words in ESP, Too!
ERIC Educational Resources Information Center
Erfani, Seyyed Mahdi
2012-01-01
While integrating visual features can be among the most important characteristics of English language textbooks, reviewing the current locally-produced English for Specific Purposes (ESP) ones reveals that they lack such a feature. Enjoying a rich theoretical background including Paivio's dual coding theory as well as Sert's educational semiotics,…
The devil is in the detail: brain dynamics in preparation for a global-local task.
Leaver, Echo E; Low, Kathy A; DiVacri, Assunta; Merla, Arcangelo; Fabiani, Monica; Gratton, Gabriele
2015-08-01
When analyzing visual scenes, it is sometimes important to determine the relevant "grain" size. Attention control mechanisms may help direct our processing to the intended grain size. Here we used the event-related optical signal, a method possessing high temporal and spatial resolution, to examine the involvement of brain structures within the dorsal attention network (DAN) and the visual processing network (VPN) in preparation for the appropriate level of analysis. Behavioral data indicate that the small features of a hierarchical stimulus (local condition) are more difficult to process than the large features (global condition). Consistent with this finding, cues predicting a local trial were associated with greater DAN activation. This activity was bilateral but more pronounced in the left hemisphere, where it showed a frontal-to-parietal progression over time. Furthermore, the amount of DAN activation, especially in the left hemisphere and in parietal regions, was predictive of subsequent performance. Although local cues elicited left-lateralized DAN activity, no preponderantly right activity was observed for global cues; however, the data indicated an interaction between level of analysis (local vs. global) and hemisphere in VPN. They further showed that local processing involves structures in the ventral VPN, whereas global processing involves structures in the dorsal VPN. These results indicate that in our study preparation for analyzing different size features is an asymmetric process, in which greater preparation is required to focus on small rather than large features, perhaps because of their lesser salience. This preparation involves the same DAN used for other attention control operations.
Recent results in visual servoing
NASA Astrophysics Data System (ADS)
Chaumette, François
2008-06-01
Visual servoing techniques consist in using the data provided by a vision sensor in order to control the motions of a dynamic system. Such systems are usually robot arms, mobile robots, aerial robots,… but can also be virtual robots for applications in computer animation, or even a virtual camera for applications in computer vision and augmented reality. A large variety of positioning tasks, or mobile target tracking, can be implemented by controlling from one to all the degrees of freedom of the system. Whatever the sensor configuration, which can vary from one on-board camera on the robot end-effector to several free-standing cameras, a set of visual features has to be selected at best from the image measurements available, allowing to control the degrees of freedom desired. A control law has also to be designed so that these visual features reach a desired value, defining a correct realization of the task. With a vision sensor providing 2D measurements, potential visual features are numerous, since as well 2D data (coordinates of feature points in the image, moments, …) as 3D data provided by a localization algorithm exploiting the extracted 2D measurements can be considered. It is also possible to combine 2D and 3D visual features to take the advantages of each approach while avoiding their respective drawbacks. From the selected visual features, the behavior of the system will have particular properties as for stability, robustness with respect to noise or to calibration errors, robot 3D trajectory, etc. The talk will present the main basic aspects of visual servoing, as well as technical advances obtained recently in the field inside the Lagadic group at INRIA/INRISA Rennes. Several application results will be also described.
MoleCoolQt – a molecule viewer for charge-density research
Hübschle, Christian B.; Dittrich, Birger
2011-01-01
MoleCoolQt is a molecule viewer for charge-density research. Features include the visualization of local atomic coordinate systems in multipole refinements based on the Hansen and Coppens formalism as implemented, for example, in the XD suite. Residual peaks and holes from XDfft are translated so that they appear close to the nearest atom of the asymmetric unit. Critical points from a topological analysis of the charge density can also be visualized. As in the program MolIso, color-mapped isosurfaces can be generated with a simple interface. Apart from its visualization features the program interactively helps in assigning local atomic coordinate systems and local symmetry, which can be automatically detected and altered. Dummy atoms – as sometimes required for local atomic coordinate systems – are calculated on demand; XD system files are updated after changes. When using the invariom database, potential scattering factor assignment problems can be resolved by the use of an interactive dialog. The following file formats are supported: XD, MoPro, SHELX, GAUSSIAN (com, FChk, cube), CIF and PDB. MoleCoolQt is written in C++ using the Qt4 library, has a user-friendly graphical user interface, and is available for several flavors of Linux, Windows and MacOS. PMID:22477783
Hexagonal wavelet processing of digital mammography
NASA Astrophysics Data System (ADS)
Laine, Andrew F.; Schuler, Sergio; Huda, Walter; Honeyman-Buck, Janice C.; Steinbach, Barbara G.
1993-09-01
This paper introduces a novel approach for accomplishing mammographic feature analysis through overcomplete multiresolution representations. We show that efficient representations may be identified from digital mammograms and used to enhance features of importance to mammography within a continuum of scale-space. We present a method of contrast enhancement based on an overcomplete, non-separable multiscale representation: the hexagonal wavelet transform. Mammograms are reconstructed from transform coefficients modified at one or more levels by local and global non-linear operators. Multiscale edges identified within distinct levels of transform space provide local support for enhancement. We demonstrate that features extracted from multiresolution representations can provide an adaptive mechanism for accomplishing local contrast enhancement. We suggest that multiscale detection and local enhancement of singularities may be effectively employed for the visualization of breast pathology without excessive noise amplification.
Ince, Robin A A; Jaworska, Katarzyna; Gross, Joachim; Panzeri, Stefano; van Rijsbergen, Nicola J; Rousselet, Guillaume A; Schyns, Philippe G
2016-08-22
A key to understanding visual cognition is to determine "where", "when", and "how" brain responses reflect the processing of the specific visual features that modulate categorization behavior-the "what". The N170 is the earliest Event-Related Potential (ERP) that preferentially responds to faces. Here, we demonstrate that a paradigmatic shift is necessary to interpret the N170 as the product of an information processing network that dynamically codes and transfers face features across hemispheres, rather than as a local stimulus-driven event. Reverse-correlation methods coupled with information-theoretic analyses revealed that visibility of the eyes influences face detection behavior. The N170 initially reflects coding of the behaviorally relevant eye contralateral to the sensor, followed by a causal communication of the other eye from the other hemisphere. These findings demonstrate that the deceptively simple N170 ERP hides a complex network information processing mechanism involving initial coding and subsequent cross-hemispheric transfer of visual features. © The Author 2016. Published by Oxford University Press.
Hemispheric asymmetry of liking for representational and abstract paintings.
Nadal, Marcos; Schiavi, Susanna; Cattaneo, Zaira
2017-10-13
Although the neural correlates of the appreciation of aesthetic qualities have been the target of much research in the past decade, few experiments have explored the hemispheric asymmetries in underlying processes. In this study, we used a divided visual field paradigm to test for hemispheric asymmetries in men and women's preference for abstract and representational artworks. Both male and female participants liked representational paintings more when presented in the right visual field, whereas preference for abstract paintings was unaffected by presentation hemifield. We hypothesize that this result reflects a facilitation of the sort of visual processes relevant to laypeople's liking for art-specifically, local processing of highly informative object features-when artworks are presented in the right visual field, given the left hemisphere's advantage in processing such features.
ERIC Educational Resources Information Center
Guy, Maggie W.; Reynolds, Greg D.; Zhang, Dantong
2013-01-01
Event-related potentials (ERPs) were utilized in an investigation of 21 six-month-olds' attention to and processing of global and local properties of hierarchical patterns. Overall, infants demonstrated an advantage for processing the overall configuration (i.e., global properties) of local features of hierarchical patterns; however,…
Explaining seeing? Disentangling qualia from perceptual organization.
Ibáñez, Agustin; Bekinschtein, Tristan
2010-09-01
Abstract Visual perception and integration seem to play an essential role in our conscious phenomenology. Relatively local neural processing of reentrant nature may explain several visual integration processes (feature binding or figure-ground segregation, object recognition, inference, competition), even without attention or cognitive control. Based on the above statements, should the neural signatures of visual integration (via reentrant process) be non-reportable phenomenological qualia? We argue that qualia are not required to understand this perceptual organization.
Coding of Border Ownership in Monkey Visual Cortex
Zhou, Hong; Friedman, Howard S.; von der Heydt, Rüdiger
2016-01-01
Areas V1 and V2 of the visual cortex have traditionally been conceived as stages of local feature representations. We investigated whether neural responses carry information about how local features belong to objects. Single-cell activity was recorded in areas V1, V2, and V4 of awake behaving monkeys. Displays were used in which the same local feature (contrast edge or line) could be presented as part of different figures. For example, the same light–dark edge could be the left side of a dark square or the right side of a light square. Each display was also presented with reversed contrast. We found significant modulation of responses as a function of the side of the figure in >50% of neurons of V2 and V4 and in 18% of neurons of the top layers of V1. Thus, besides the local contrast border information, neurons were found to encode the side to which the border belongs (“border ownership coding”). A majority of these neurons coded border ownership and the local polarity of luminance–chromaticity contrast. The others were insensitive to contrast polarity. Another 20% of the neurons of V2 and V4, and 48% of top layer V1, coded local contrast polarity, but not border ownership. The border ownership-related response differences emerged soon (<25 msec) after the response onset. In V2 and V4, the differences were found to be nearly independent of figure size up to the limit set by the size of our display (21°). Displays that differed only far outside the conventional receptive field could produce markedly different responses. When tested with more complex displays in which figure-ground cues were varied, some neurons produced invariant border ownership signals, others failed to signal border ownership for some of the displays, but neurons that reversed signals were rare. The influence of visual stimulation far from the receptive field center indicates mechanisms of global context integration. The short latencies and incomplete cue invariance suggest that the border-ownership effect is generated within the visual cortex rather than projected down from higher levels. PMID:10964965
An object-based visual attention model for robotic applications.
Yu, Yuanlong; Mann, George K I; Gosine, Raymond G
2010-10-01
By extending integrated competition hypothesis, this paper presents an object-based visual attention model, which selects one object of interest using low-dimensional features, resulting that visual perception starts from a fast attentional selection procedure. The proposed attention model involves seven modules: learning of object representations stored in a long-term memory (LTM), preattentive processing, top-down biasing, bottom-up competition, mediation between top-down and bottom-up ways, generation of saliency maps, and perceptual completion processing. It works in two phases: learning phase and attending phase. In the learning phase, the corresponding object representation is trained statistically when one object is attended. A dual-coding object representation consisting of local and global codings is proposed. Intensity, color, and orientation features are used to build the local coding, and a contour feature is employed to constitute the global coding. In the attending phase, the model preattentively segments the visual field into discrete proto-objects using Gestalt rules at first. If a task-specific object is given, the model recalls the corresponding representation from LTM and deduces the task-relevant feature(s) to evaluate top-down biases. The mediation between automatic bottom-up competition and conscious top-down biasing is then performed to yield a location-based saliency map. By combination of location-based saliency within each proto-object, the proto-object-based saliency is evaluated. The most salient proto-object is selected for attention, and it is finally put into the perceptual completion processing module to yield a complete object region. This model has been applied into distinct tasks of robots: detection of task-specific stationary and moving objects. Experimental results under different conditions are shown to validate this model.
Modeling Geometric-Temporal Context With Directional Pyramid Co-Occurrence for Action Recognition.
Yuan, Chunfeng; Li, Xi; Hu, Weiming; Ling, Haibin; Maybank, Stephen J
2014-02-01
In this paper, we present a new geometric-temporal representation for visual action recognition based on local spatio-temporal features. First, we propose a modified covariance descriptor under the log-Euclidean Riemannian metric to represent the spatio-temporal cuboids detected in the video sequences. Compared with previously proposed covariance descriptors, our descriptor can be measured and clustered in Euclidian space. Second, to capture the geometric-temporal contextual information, we construct a directional pyramid co-occurrence matrix (DPCM) to describe the spatio-temporal distribution of the vector-quantized local feature descriptors extracted from a video. DPCM characterizes the co-occurrence statistics of local features as well as the spatio-temporal positional relationships among the concurrent features. These statistics provide strong descriptive power for action recognition. To use DPCM for action recognition, we propose a directional pyramid co-occurrence matching kernel to measure the similarity of videos. The proposed method achieves the state-of-the-art performance and improves on the recognition performance of the bag-of-visual-words (BOVWs) models by a large margin on six public data sets. For example, on the KTH data set, it achieves 98.78% accuracy while the BOVW approach only achieves 88.06%. On both Weizmann and UCF CIL data sets, the highest possible accuracy of 100% is achieved.
a Preliminary Work on Layout Slam for Reconstruction of Indoor Corridor Environments
NASA Astrophysics Data System (ADS)
Baligh Jahromi, A.; Sohn, G.; Shahbazi, M.; Kang, J.
2017-09-01
We propose a real time indoor corridor layout estimation method based on visual Simultaneous Localization and Mapping (SLAM). The proposed method adopts the Manhattan World Assumption at indoor spaces and uses the detected single image straight line segments and their corresponding orthogonal vanishing points to improve the feature matching scheme in the adopted visual SLAM system. Using the proposed real time indoor corridor layout estimation method, the system is able to build an online sparse map of structural corner point features. The challenges presented by abrupt camera rotation in the 3D space are successfully handled through matching vanishing directions of consecutive video frames on the Gaussian sphere. Using the single image based indoor layout features for initializing the system, permitted the proposed method to perform real time layout estimation and camera localization in indoor corridor areas. For layout structural corner points matching, we adopted features which are invariant under scale, translation, and rotation. We proposed a new feature matching cost function which considers both local and global context information. The cost function consists of a unary term, which measures pixel to pixel orientation differences of the matched corners, and a binary term, which measures the amount of angle differences between directly connected layout corner features. We have performed the experiments on real scenes at York University campus buildings and the available RAWSEEDS dataset. The incoming results depict that the proposed method robustly performs along with producing very limited position and orientation errors.
The feature-weighted receptive field: an interpretable encoding model for complex feature spaces.
St-Yves, Ghislain; Naselaris, Thomas
2017-06-20
We introduce the feature-weighted receptive field (fwRF), an encoding model designed to balance expressiveness, interpretability and scalability. The fwRF is organized around the notion of a feature map-a transformation of visual stimuli into visual features that preserves the topology of visual space (but not necessarily the native resolution of the stimulus). The key assumption of the fwRF model is that activity in each voxel encodes variation in a spatially localized region across multiple feature maps. This region is fixed for all feature maps; however, the contribution of each feature map to voxel activity is weighted. Thus, the model has two separable sets of parameters: "where" parameters that characterize the location and extent of pooling over visual features, and "what" parameters that characterize tuning to visual features. The "where" parameters are analogous to classical receptive fields, while "what" parameters are analogous to classical tuning functions. By treating these as separable parameters, the fwRF model complexity is independent of the resolution of the underlying feature maps. This makes it possible to estimate models with thousands of high-resolution feature maps from relatively small amounts of data. Once a fwRF model has been estimated from data, spatial pooling and feature tuning can be read-off directly with no (or very little) additional post-processing or in-silico experimentation. We describe an optimization algorithm for estimating fwRF models from data acquired during standard visual neuroimaging experiments. We then demonstrate the model's application to two distinct sets of features: Gabor wavelets and features supplied by a deep convolutional neural network. We show that when Gabor feature maps are used, the fwRF model recovers receptive fields and spatial frequency tuning functions consistent with known organizational principles of the visual cortex. We also show that a fwRF model can be used to regress entire deep convolutional networks against brain activity. The ability to use whole networks in a single encoding model yields state-of-the-art prediction accuracy. Our results suggest a wide variety of uses for the feature-weighted receptive field model, from retinotopic mapping with natural scenes, to regressing the activities of whole deep neural networks onto measured brain activity. Copyright © 2017. Published by Elsevier Inc.
A Novel Locally Linear KNN Method With Applications to Visual Recognition.
Liu, Qingfeng; Liu, Chengjun
2017-09-01
A locally linear K Nearest Neighbor (LLK) method is presented in this paper with applications to robust visual recognition. Specifically, the concept of an ideal representation is first presented, which improves upon the traditional sparse representation in many ways. The objective function based on a host of criteria for sparsity, locality, and reconstruction is then optimized to derive a novel representation, which is an approximation to the ideal representation. The novel representation is further processed by two classifiers, namely, an LLK-based classifier and a locally linear nearest mean-based classifier, for visual recognition. The proposed classifiers are shown to connect to the Bayes decision rule for minimum error. Additional new theoretical analysis is presented, such as the nonnegative constraint, the group regularization, and the computational efficiency of the proposed LLK method. New methods such as a shifted power transformation for improving reliability, a coefficients' truncating method for enhancing generalization, and an improved marginal Fisher analysis method for feature extraction are proposed to further improve visual recognition performance. Extensive experiments are implemented to evaluate the proposed LLK method for robust visual recognition. In particular, eight representative data sets are applied for assessing the performance of the LLK method for various visual recognition applications, such as action recognition, scene recognition, object recognition, and face recognition.
NASA Astrophysics Data System (ADS)
Haigang, Sui; Zhina, Song
2016-06-01
Reliably ship detection in optical satellite images has a wide application in both military and civil fields. However, this problem is very difficult in complex backgrounds, such as waves, clouds, and small islands. Aiming at these issues, this paper explores an automatic and robust model for ship detection in large-scale optical satellite images, which relies on detecting statistical signatures of ship targets, in terms of biologically-inspired visual features. This model first selects salient candidate regions across large-scale images by using a mechanism based on biologically-inspired visual features, combined with visual attention model with local binary pattern (CVLBP). Different from traditional studies, the proposed algorithm is high-speed and helpful to focus on the suspected ship areas avoiding the separation step of land and sea. Largearea images are cut into small image chips and analyzed in two complementary ways: Sparse saliency using visual attention model and detail signatures using LBP features, thus accordant with sparseness of ship distribution on images. Then these features are employed to classify each chip as containing ship targets or not, using a support vector machine (SVM). After getting the suspicious areas, there are still some false alarms such as microwaves and small ribbon clouds, thus simple shape and texture analysis are adopted to distinguish between ships and nonships in suspicious areas. Experimental results show the proposed method is insensitive to waves, clouds, illumination and ship size.
Local and Global Processing: Observations from a Remote Culture
ERIC Educational Resources Information Center
Davidoff, Jules; Fonteneau, Elisabeth; Fagot, Joel
2008-01-01
In Experiment 1, a normal adult population drawn from a remote culture (Himba) in northern Namibia made similarity matches to [Navon, D. (1977). Forest before trees: The precedence of global features in visual perception. "Cognitive Psychology", 9, 353-383] hierarchical figures. The Himba showed a local bias stronger than that has been…
Quantification and Visualization of Variation in Anatomical Trees
DOE Office of Scientific and Technical Information (OSTI.GOV)
Amenta, Nina; Datar, Manasi; Dirksen, Asger
This paper presents two approaches to quantifying and visualizing variation in datasets of trees. The first approach localizes subtrees in which significant population differences are found through hypothesis testing and sparse classifiers on subtree features. The second approach visualizes the global metric structure of datasets through low-distortion embedding into hyperbolic planes in the style of multidimensional scaling. A case study is made on a dataset of airway trees in relation to Chronic Obstructive Pulmonary Disease.
Wang, Yilin; Kanchanawong, Pakorn
2016-12-01
Fluorescence microscopy enables direct visualization of specific biomolecules within cells. However, for conventional fluorescence microscopy, the spatial resolution is restricted by diffraction to ~ 200 nm within the image plane and > 500 nm along the optical axis. As a result, fluorescence microscopy has long been severely limited in the observation of ultrastructural features within cells. The recent development of super resolution microscopy methods has overcome this limitation. In particular, the advent of photoswitchable fluorophores enables localization-based super resolution microscopy, which provides resolving power approaching the molecular-length scale. Here, we describe the application of a three-dimensional super resolution microscopy method based on single-molecule localization microscopy and multiphase interferometry, called interferometric PhotoActivated Localization Microscopy (iPALM). This method provides nearly isotropic resolution on the order of 20 nm in all three dimensions. Protocols for visualizing the filamentous actin cytoskeleton, including specimen preparation and operation of the iPALM instrument, are described here. These protocols are also readily adaptable and instructive for the study of other ultrastructural features in cells.
Sparse Contextual Activation for Efficient Visual Re-Ranking.
Bai, Song; Bai, Xiang
2016-03-01
In this paper, we propose an extremely efficient algorithm for visual re-ranking. By considering the original pairwise distance in the contextual space, we develop a feature vector called sparse contextual activation (SCA) that encodes the local distribution of an image. Hence, re-ranking task can be simply accomplished by vector comparison under the generalized Jaccard metric, which has its theoretical meaning in the fuzzy set theory. In order to improve the time efficiency of re-ranking procedure, inverted index is successfully introduced to speed up the computation of generalized Jaccard metric. As a result, the average time cost of re-ranking for a certain query can be controlled within 1 ms. Furthermore, inspired by query expansion, we also develop an additional method called local consistency enhancement on the proposed SCA to improve the retrieval performance in an unsupervised manner. On the other hand, the retrieval performance using a single feature may not be satisfactory enough, which inspires us to fuse multiple complementary features for accurate retrieval. Based on SCA, a robust feature fusion algorithm is exploited that also preserves the characteristic of high time efficiency. We assess our proposed method in various visual re-ranking tasks. Experimental results on Princeton shape benchmark (3D object), WM-SRHEC07 (3D competition), YAEL data set B (face), MPEG-7 data set (shape), and Ukbench data set (image) manifest the effectiveness and efficiency of SCA.
Localization Using Visual Odometry and a Single Downward-Pointing Camera
NASA Technical Reports Server (NTRS)
Swank, Aaron J.
2012-01-01
Stereo imaging is a technique commonly employed for vision-based navigation. For such applications, two images are acquired from different vantage points and then compared using transformations to extract depth information. The technique is commonly used in robotics for obstacle avoidance or for Simultaneous Localization And Mapping, (SLAM). Yet, the process requires a number of image processing steps and therefore tends to be CPU-intensive, which limits the real-time data rate and use in power-limited applications. Evaluated here is a technique where a monocular camera is used for vision-based odometry. In this work, an optical flow technique with feature recognition is performed to generate odometry measurements. The visual odometry sensor measurements are intended to be used as control inputs or measurements in a sensor fusion algorithm using low-cost MEMS based inertial sensors to provide improved localization information. Presented here are visual odometry results which demonstrate the challenges associated with using ground-pointing cameras for visual odometry. The focus is for rover-based robotic applications for localization within GPS-denied environments.
Dogrusoz, U; Erson, E Z; Giral, E; Demir, E; Babur, O; Cetintas, A; Colak, R
2006-02-01
Patikaweb provides a Web interface for retrieving and analyzing biological pathways in the Patika database, which contains data integrated from various prominent public pathway databases. It features a user-friendly interface, dynamic visualization and automated layout, advanced graph-theoretic queries for extracting biologically important phenomena, local persistence capability and exporting facilities to various pathway exchange formats.
New insights into ambient and focal visual fixations using an automatic classification algorithm
Follet, Brice; Le Meur, Olivier; Baccino, Thierry
2011-01-01
Overt visual attention is the act of directing the eyes toward a given area. These eye movements are characterised by saccades and fixations. A debate currently surrounds the role of visual fixations. Do they all have the same role in the free viewing of natural scenes? Recent studies suggest that at least two types of visual fixations exist: focal and ambient. The former is believed to be used to inspect local areas accurately, whereas the latter is used to obtain the context of the scene. We investigated the use of an automated system to cluster visual fixations in two groups using four types of natural scene images. We found new evidence to support a focal–ambient dichotomy. Our data indicate that the determining factor is the saccade amplitude. The dependence on the low-level visual features and the time course of these two kinds of visual fixations were examined. Our results demonstrate that there is an interplay between both fixation populations and that focal fixations are more dependent on low-level visual features than are ambient fixations. PMID:23145248
Correlation Filter Learning Toward Peak Strength for Visual Tracking.
Sui, Yao; Wang, Guanghui; Zhang, Li
2018-04-01
This paper presents a novel visual tracking approach to correlation filter learning toward peak strength of correlation response. Previous methods leverage all features of the target and the immediate background to learn a correlation filter. Some features, however, may be distractive to tracking, like those from occlusion and local deformation, resulting in unstable tracking performance. This paper aims at solving this issue and proposes a novel algorithm to learn the correlation filter. The proposed approach, by imposing an elastic net constraint on the filter, can adaptively eliminate those distractive features in the correlation filtering. A new peak strength metric is proposed to measure the discriminative capability of the learned correlation filter. It is demonstrated that the proposed approach effectively strengthens the peak of the correlation response, leading to more discriminative performance than previous methods. Extensive experiments on a challenging visual tracking benchmark demonstrate that the proposed tracker outperforms most state-of-the-art methods.
Eye guidance during real-world scene search: The role color plays in central and peripheral vision.
Nuthmann, Antje; Malcolm, George L
2016-01-01
The visual system utilizes environmental features to direct gaze efficiently when locating objects. While previous research has isolated various features' contributions to gaze guidance, these studies generally used sparse displays and did not investigate how features facilitated search as a function of their location on the visual field. The current study investigated how features across the visual field--particularly color--facilitate gaze guidance during real-world search. A gaze-contingent window followed participants' eye movements, restricting color information to specified regions. Scene images were presented in full color, with color in the periphery and gray in central vision or gray in the periphery and color in central vision, or in grayscale. Color conditions were crossed with a search cue manipulation, with the target cued either with a word label or an exact picture. Search times increased as color information in the scene decreased. A gaze-data based decomposition of search time revealed color-mediated effects on specific subprocesses of search. Color in peripheral vision facilitated target localization, whereas color in central vision facilitated target verification. Picture cues facilitated search, with the effects of cue specificity and scene color combining additively. When available, the visual system utilizes the environment's color information to facilitate different real-world visual search behaviors based on the location within the visual field.
Tamarisk Mapping and Monitoring Using High Resolution Satellite Imagery
Jason W. San Souci; John T. Doyle
2006-01-01
QuickBird high resolution multispectral satellite imagery (60 cm GSD, 4 spectral bands) and calibrated products from DigitalGlobeâs AgroWatch program were used as inputs to Visual Learning Systemâs Feature Analyst automated feature extraction software to map localized occurrences of pervasive and aggressive Tamarisk (Tamarix ramosissima), an invasive...
A novel visual-inertial monocular SLAM
NASA Astrophysics Data System (ADS)
Yue, Xiaofeng; Zhang, Wenjuan; Xu, Li; Liu, JiangGuo
2018-02-01
With the development of sensors and computer vision research community, cameras, which are accurate, compact, wellunderstood and most importantly cheap and ubiquitous today, have gradually been at the center of robot location. Simultaneous localization and mapping (SLAM) using visual features, which is a system getting motion information from image acquisition equipment and rebuild the structure in unknown environment. We provide an analysis of bioinspired flights in insects, employing a novel technique based on SLAM. Then combining visual and inertial measurements to get high accuracy and robustness. we present a novel tightly-coupled Visual-Inertial Simultaneous Localization and Mapping system which get a new attempt to address two challenges which are the initialization problem and the calibration problem. experimental results and analysis show the proposed approach has a more accurate quantitative simulation of insect navigation, which can reach the positioning accuracy of centimeter level.
Modulatory compartments in cortex and local regulation of cholinergic tone.
Coppola, Jennifer J; Ward, Nicholas J; Jadi, Monika P; Disney, Anita A
2016-09-01
Neuromodulatory signaling is generally considered broad in its impact across cortex. However, variations in the characteristics of cortical circuits may introduce regionally-specific responses to diffuse modulatory signals. Features such as patterns of axonal innervation, tissue tortuosity and molecular diffusion, effectiveness of degradation pathways, subcellular receptor localization, and patterns of receptor expression can lead to local modification of modulatory inputs. We propose that modulatory compartments exist in cortex and can be defined by variation in structural features of local circuits. Further, we argue that these compartments are responsible for local regulation of neuromodulatory tone. For the cholinergic system, these modulatory compartments are regions of cortical tissue within which signaling conditions for acetylcholine are relatively uniform, but between which signaling can vary profoundly. In the visual system, evidence for the existence of compartments indicates that cholinergic modulation likely differs across the visual pathway. We argue that the existence of these compartments calls for thinking about cholinergic modulation in terms of finer-grained control of local cortical circuits than is implied by the traditional view of this system as a diffuse modulator. Further, an understanding of modulatory compartments provides an opportunity to better understand and perhaps correct signal modifications that lead to pathological states. Copyright © 2016 Elsevier Ltd. All rights reserved.
Perception and Processing of Faces in the Human Brain Is Tuned to Typical Feature Locations
Schwarzkopf, D. Samuel; Alvarez, Ivan; Lawson, Rebecca P.; Henriksson, Linda; Kriegeskorte, Nikolaus; Rees, Geraint
2016-01-01
Faces are salient social stimuli whose features attract a stereotypical pattern of fixations. The implications of this gaze behavior for perception and brain activity are largely unknown. Here, we characterize and quantify a retinotopic bias implied by typical gaze behavior toward faces, which leads to eyes and mouth appearing most often in the upper and lower visual field, respectively. We found that the adult human visual system is tuned to these contingencies. In two recognition experiments, recognition performance for isolated face parts was better when they were presented at typical, rather than reversed, visual field locations. The recognition cost of reversed locations was equal to ∼60% of that for whole face inversion in the same sample. Similarly, an fMRI experiment showed that patterns of activity evoked by eye and mouth stimuli in the right inferior occipital gyrus could be separated with significantly higher accuracy when these features were presented at typical, rather than reversed, visual field locations. Our findings demonstrate that human face perception is determined not only by the local position of features within a face context, but by whether features appear at the typical retinotopic location given normal gaze behavior. Such location sensitivity may reflect fine-tuning of category-specific visual processing to retinal input statistics. Our findings further suggest that retinotopic heterogeneity might play a role for face inversion effects and for the understanding of conditions affecting gaze behavior toward faces, such as autism spectrum disorders and congenital prosopagnosia. SIGNIFICANCE STATEMENT Faces attract our attention and trigger stereotypical patterns of visual fixations, concentrating on inner features, like eyes and mouth. Here we show that the visual system represents face features better when they are shown at retinal positions where they typically fall during natural vision. When facial features were shown at typical (rather than reversed) visual field locations, they were discriminated better by humans and could be decoded with higher accuracy from brain activity patterns in the right occipital face area. This suggests that brain representations of face features do not cover the visual field uniformly. It may help us understand the well-known face-inversion effect and conditions affecting gaze behavior toward faces, such as prosopagnosia and autism spectrum disorders. PMID:27605606
Visual Categorization of Natural Movies by Rats
Vinken, Kasper; Vermaercke, Ben
2014-01-01
Visual categorization of complex, natural stimuli has been studied for some time in human and nonhuman primates. Recent interest in the rodent as a model for visual perception, including higher-level functional specialization, leads to the question of how rodents would perform on a categorization task using natural stimuli. To answer this question, rats were trained in a two-alternative forced choice task to discriminate movies containing rats from movies containing other objects and from scrambled movies (ordinate-level categorization). Subsequently, transfer to novel, previously unseen stimuli was tested, followed by a series of control probes. The results show that the animals are capable of acquiring a decision rule by abstracting common features from natural movies to generalize categorization to new stimuli. Control probes demonstrate that they did not use single low-level features, such as motion energy or (local) luminance. Significant generalization was even present with stationary snapshots from untrained movies. The variability within and between training and test stimuli, the complexity of natural movies, and the control experiments and analyses all suggest that a more high-level rule based on more complex stimulus features than local luminance-based cues was used to classify the novel stimuli. In conclusion, natural stimuli can be used to probe ordinate-level categorization in rats. PMID:25100598
Localization from Visual Landmarks on a Free-Flying Robot
NASA Technical Reports Server (NTRS)
Coltin, Brian; Fusco, Jesse; Moratto, Zack; Alexandrov, Oleg; Nakamura, Robert
2016-01-01
We present the localization approach for Astrobee, a new free-flying robot designed to navigate autonomously on the International Space Station (ISS). Astrobee will accommodate a variety of payloads and enable guest scientists to run experiments in zero-g, as well as assist astronauts and ground controllers. Astrobee will replace the SPHERES robots which currently operate on the ISS, whose use of fixed ultrasonic beacons for localization limits them to work in a 2 meter cube. Astrobee localizes with monocular vision and an IMU, without any environmental modifications. Visual features detected on a pre-built map, optical flow information, and IMU readings are all integrated into an extended Kalman filter (EKF) to estimate the robot pose. We introduce several modifications to the filter to make it more robust to noise, and extensively evaluate the localization algorithm.
Localized direction selective responses in the dendrites of visual interneurons of the fly
2010-01-01
Background The various tasks of visual systems, including course control, collision avoidance and the detection of small objects, require at the neuronal level the dendritic integration and subsequent processing of many spatially distributed visual motion inputs. While much is known about the pooled output in these systems, as in the medial superior temporal cortex of monkeys or in the lobula plate of the insect visual system, the motion tuning of the elements that provide the input has yet received little attention. In order to visualize the motion tuning of these inputs we examined the dendritic activation patterns of neurons that are selective for the characteristic patterns of wide-field motion, the lobula-plate tangential cells (LPTCs) of the blowfly. These neurons are known to sample direction-selective motion information from large parts of the visual field and combine these signals into axonal and dendro-dendritic outputs. Results Fluorescence imaging of intracellular calcium concentration allowed us to take a direct look at the local dendritic activity and the resulting local preferred directions in LPTC dendrites during activation by wide-field motion in different directions. These 'calcium response fields' resembled a retinotopic dendritic map of local preferred directions in the receptive field, the layout of which is a distinguishing feature of different LPTCs. Conclusions Our study reveals how neurons acquire selectivity for distinct visual motion patterns by dendritic integration of the local inputs with different preferred directions. With their spatial layout of directional responses, the dendrites of the LPTCs we investigated thus served as matched filters for wide-field motion patterns. PMID:20384983
Visualizing Vector Fields Using Line Integral Convolution and Dye Advection
NASA Technical Reports Server (NTRS)
Shen, Han-Wei; Johnson, Christopher R.; Ma, Kwan-Liu
1996-01-01
We present local and global techniques to visualize three-dimensional vector field data. Using the Line Integral Convolution (LIC) method to image the global vector field, our new algorithm allows the user to introduce colored 'dye' into the vector field to highlight local flow features. A fast algorithm is proposed that quickly recomputes the dyed LIC images. In addition, we introduce volume rendering methods that can map the LIC texture on any contour surface and/or translucent region defined by additional scalar quantities, and can follow the advection of colored dye throughout the volume.
[Local involvement of the optic nerve by acute lymphoblastic leukemia].
Bernardczyk-Meller, Jadwiga; Stefańska, Katarzyna
2005-01-01
The leucemias quite commonly involve the eyes and adnexa. In some cases it causes visual complants. Both, the anterior chamber of the eye and the posterior portion of the globe may sites of acute or chronic leukemia and leucemic relapse. We report an unique case of a 14 years old leucemic patient who suffered visual loss and papilloedema, due to a unilateral local involvement within optic nerve, during second relapse of acute lymphocytic leuemia. In spite of typical treatment of main disease, the boy had died. The authors present typical ophthalmic features of the leucemia, too.
Can two dots form a Gestalt? Measuring emergent features with the capacity coefficient.
Hawkins, Robert X D; Houpt, Joseph W; Eidels, Ami; Townsend, James T
2016-09-01
While there is widespread agreement among vision researchers on the importance of some local aspects of visual stimuli, such as hue and intensity, there is no general consensus on a full set of basic sources of information used in perceptual tasks or how they are processed. Gestalt theories place particular value on emergent features, which are based on the higher-order relationships among elements of a stimulus rather than local properties. Thus, arbitrating between different accounts of features is an important step in arbitrating between local and Gestalt theories of perception in general. In this paper, we present the capacity coefficient from Systems Factorial Technology (SFT) as a quantitative approach for formalizing and rigorously testing predictions made by local and Gestalt theories of features. As a simple, easily controlled domain for testing this approach, we focus on the local feature of location and the emergent features of Orientation and Proximity in a pair of dots. We introduce a redundant-target change detection task to compare our capacity measure on (1) trials where the configuration of the dots changed along with their location against (2) trials where the amount of local location change was exactly the same, but there was no change in the configuration. Our results, in conjunction with our modeling tools, favor the Gestalt account of emergent features. We conclude by suggesting several candidate information-processing models that incorporate emergent features, which follow from our approach. Copyright © 2015 Elsevier Ltd. All rights reserved.
Shapley, Robert M.; Xing, Dajun
2012-01-01
Theoretical considerations have led to the concept that the cerebral cortex is operating in a balanced state in which synaptic excitation is approximately balanced by synaptic inhibition from the local cortical circuit. This paper is about the functional consequences of the balanced state in sensory cortex. One consequence is gain control: there is experimental evidence and theoretical support for the idea that local circuit inhibition acts as a local automatic gain control throughout the cortex. Second, inhibition increases cortical feature selectivity: many studies of different sensory cortical areas have reported that suppressive mechanisms contribute to feature selectivity. Synaptic inhibition from the local microcircuit should be untuned (or broadly tuned) for stimulus features because of the microarchitecture of the cortical microcircuit. Untuned inhibition probably is the source of Untuned Suppression that enhances feature selectivity. We studied inhibition’s function in our experiments, guided by a neuronal network model, on orientation selectivity in the primary visual cortex, V1, of the Macaque monkey. Our results revealed that Untuned Suppression, generated by local circuit inhibition, is crucial for the generation of highly orientation-selective cells in V1 cortex. PMID:23036513
Comparing visual representations across human fMRI and computational vision
Leeds, Daniel D.; Seibert, Darren A.; Pyles, John A.; Tarr, Michael J.
2013-01-01
Feedforward visual object perception recruits a cortical network that is assumed to be hierarchical, progressing from basic visual features to complete object representations. However, the nature of the intermediate features related to this transformation remains poorly understood. Here, we explore how well different computer vision recognition models account for neural object encoding across the human cortical visual pathway as measured using fMRI. These neural data, collected during the viewing of 60 images of real-world objects, were analyzed with a searchlight procedure as in Kriegeskorte, Goebel, and Bandettini (2006): Within each searchlight sphere, the obtained patterns of neural activity for all 60 objects were compared to model responses for each computer recognition algorithm using representational dissimilarity analysis (Kriegeskorte et al., 2008). Although each of the computer vision methods significantly accounted for some of the neural data, among the different models, the scale invariant feature transform (Lowe, 2004), encoding local visual properties gathered from “interest points,” was best able to accurately and consistently account for stimulus representations within the ventral pathway. More generally, when present, significance was observed in regions of the ventral-temporal cortex associated with intermediate-level object perception. Differences in model effectiveness and the neural location of significant matches may be attributable to the fact that each model implements a different featural basis for representing objects (e.g., more holistic or more parts-based). Overall, we conclude that well-known computer vision recognition systems may serve as viable proxies for theories of intermediate visual object representation. PMID:24273227
Character feature integration of Chinese calligraphy and font
NASA Astrophysics Data System (ADS)
Shi, Cao; Xiao, Jianguo; Jia, Wenhua; Xu, Canhui
2013-01-01
A framework is proposed in this paper to effectively generate a new hybrid character type by means of integrating local contour feature of Chinese calligraphy with structural feature of font in computer system. To explore traditional art manifestation of calligraphy, multi-directional spatial filter is applied for local contour feature extraction. Then the contour of character image is divided into sub-images. The sub-images in the identical position from various characters are estimated by Gaussian distribution. According to its probability distribution, the dilation operator and erosion operator are designed to adjust the boundary of font image. And then new Chinese character images are generated which possess both contour feature of artistical calligraphy and elaborate structural feature of font. Experimental results demonstrate the new characters are visually acceptable, and the proposed framework is an effective and efficient strategy to automatically generate the new hybrid character of calligraphy and font.
Curveslam: Utilizing Higher Level Structure In Stereo Vision-Based Navigation
2012-01-01
consider their applica- tion to SLAM . The work of [31] [32] develops a spline-based SLAM framework, but this is only for application to LIDAR -based SLAM ...Existing approaches to visual Simultaneous Localization and Mapping ( SLAM ) typically utilize points as visual feature primitives to represent landmarks...regions of interest. Further, previous SLAM techniques that propose the use of higher level structures often place constraints on the environment, such as
Ouimet, Tia; Foster, Nicholas E V; Tryfon, Ana; Hyde, Krista L
2012-04-01
Autism spectrum disorder (ASD) is a complex neurodevelopmental condition characterized by atypical social and communication skills, repetitive behaviors, and atypical visual and auditory perception. Studies in vision have reported enhanced detailed ("local") processing but diminished holistic ("global") processing of visual features in ASD. Individuals with ASD also show enhanced processing of simple visual stimuli but diminished processing of complex visual stimuli. Relative to the visual domain, auditory global-local distinctions, and the effects of stimulus complexity on auditory processing in ASD, are less clear. However, one remarkable finding is that many individuals with ASD have enhanced musical abilities, such as superior pitch processing. This review provides a critical evaluation of behavioral and brain imaging studies of auditory processing with respect to current theories in ASD. We have focused on auditory-musical processing in terms of global versus local processing and simple versus complex sound processing. This review contributes to a better understanding of auditory processing differences in ASD. A deeper comprehension of sensory perception in ASD is key to better defining ASD phenotypes and, in turn, may lead to better interventions. © 2012 New York Academy of Sciences.
Robust feature tracking for endoscopic pose estimation and structure recovery
NASA Astrophysics Data System (ADS)
Speidel, S.; Krappe, S.; Röhl, S.; Bodenstedt, S.; Müller-Stich, B.; Dillmann, R.
2013-03-01
Minimally invasive surgery is a highly complex medical discipline with several difficulties for the surgeon. To alleviate these difficulties, augmented reality can be used for intraoperative assistance. For visualization, the endoscope pose must be known which can be acquired with a SLAM (Simultaneous Localization and Mapping) approach using the endoscopic images. In this paper we focus on feature tracking for SLAM in minimally invasive surgery. Robust feature tracking and minimization of false correspondences is crucial for localizing the endoscope. As sensory input we use a stereo endoscope and evaluate different feature types in a developed SLAM framework. The accuracy of the endoscope pose estimation is validated with synthetic and ex vivo data. Furthermore we test the approach with in vivo image sequences from da Vinci interventions.
Visual EKF-SLAM from Heterogeneous Landmarks †
Esparza-Jiménez, Jorge Othón; Devy, Michel; Gordillo, José L.
2016-01-01
Many applications require the localization of a moving object, e.g., a robot, using sensory data acquired from embedded devices. Simultaneous localization and mapping from vision performs both the spatial and temporal fusion of these data on a map when a camera moves in an unknown environment. Such a SLAM process executes two interleaved functions: the front-end detects and tracks features from images, while the back-end interprets features as landmark observations and estimates both the landmarks and the robot positions with respect to a selected reference frame. This paper describes a complete visual SLAM solution, combining both point and line landmarks on a single map. The proposed method has an impact on both the back-end and the front-end. The contributions comprehend the use of heterogeneous landmark-based EKF-SLAM (the management of a map composed of both point and line landmarks); from this perspective, the comparison between landmark parametrizations and the evaluation of how the heterogeneity improves the accuracy on the camera localization, the development of a front-end active-search process for linear landmarks integrated into SLAM and the experimentation methodology. PMID:27070602
Salience from the decision perspective: You know where it is before you know it is there.
Zehetleitner, Michael; Müller, Hermann J
2010-12-31
In visual search for feature contrast ("odd-one-out") singletons, identical manipulations of salience, whether by varying target-distractor similarity or dimensional redundancy of target definition, had smaller effects on reaction times (RTs) for binary localization decisions than for yes/no detection decisions. According to formal models of binary decisions, identical differences in drift rates would yield larger RT differences for slow than for fast decisions. From this principle and the present findings, it follows that decisions on the presence of feature contrast singletons are slower than decisions on their location. This is at variance with two classes of standard models of visual search and object recognition that assume a serial cascade of first detection, then localization and identification of a target object, but also inconsistent with models assuming that as soon as a target is detected all its properties, spatial as well as non-spatial (e.g., its category), are available immediately. As an alternative, we propose a model of detection and localization tasks based on random walk processes, which can account for the present findings.
Kablammo: an interactive, web-based BLAST results visualizer.
Wintersinger, Jeff A; Wasmuth, James D
2015-04-15
Kablammo is a web-based application that produces interactive, vector-based visualizations of sequence alignments generated by BLAST. These visualizations can illustrate many features, including shared protein domains, chromosome structural modifications and genome misassembly. Kablammo can be used at http://kablammo.wasmuthlab.org. For a local installation, the source code and instructions are available under the MIT license at http://github.com/jwintersinger/kablammo. jeff@wintersinger.org. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Zachariou, Valentinos; Nikas, Christine V; Safiullah, Zaid N; Gotts, Stephen J; Ungerleider, Leslie G
2017-08-01
Human face recognition is often attributed to configural processing; namely, processing the spatial relationships among the features of a face. If configural processing depends on fine-grained spatial information, do visuospatial mechanisms within the dorsal visual pathway contribute to this process? We explored this question in human adults using functional magnetic resonance imaging and transcranial magnetic stimulation (TMS) in a same-different face detection task. Within localized, spatial-processing regions of the posterior parietal cortex, configural face differences led to significantly stronger activation compared to featural face differences, and the magnitude of this activation correlated with behavioral performance. In addition, detection of configural relative to featural face differences led to significantly stronger functional connectivity between the right FFA and the spatial processing regions of the dorsal stream, whereas detection of featural relative to configural face differences led to stronger functional connectivity between the right FFA and left FFA. Critically, TMS centered on these parietal regions impaired performance on configural but not featural face difference detections. We conclude that spatial mechanisms within the dorsal visual pathway contribute to the configural processing of facial features and, more broadly, that the dorsal stream may contribute to the veridical perception of faces. Published by Oxford University Press 2016.
A proto-architecture for innate directionally selective visual maps.
Adams, Samantha V; Harris, Chris M
2014-01-01
Self-organizing artificial neural networks are a popular tool for studying visual system development, in particular the cortical feature maps present in real systems that represent properties such as ocular dominance (OD), orientation-selectivity (OR) and direction selectivity (DS). They are also potentially useful in artificial systems, for example robotics, where the ability to extract and learn features from the environment in an unsupervised way is important. In this computational study we explore a DS map that is already latent in a simple artificial network. This latent selectivity arises purely from the cortical architecture without any explicit coding for DS and prior to any self-organising process facilitated by spontaneous activity or training. We find DS maps with local patchy regions that exhibit features similar to maps derived experimentally and from previous modeling studies. We explore the consequences of changes to the afferent and lateral connectivity to establish the key features of this proto-architecture that support DS.
Multiview Locally Linear Embedding for Effective Medical Image Retrieval
Shen, Hualei; Tao, Dacheng; Ma, Dianfu
2013-01-01
Content-based medical image retrieval continues to gain attention for its potential to assist radiological image interpretation and decision making. Many approaches have been proposed to improve the performance of medical image retrieval system, among which visual features such as SIFT, LBP, and intensity histogram play a critical role. Typically, these features are concatenated into a long vector to represent medical images, and thus traditional dimension reduction techniques such as locally linear embedding (LLE), principal component analysis (PCA), or laplacian eigenmaps (LE) can be employed to reduce the “curse of dimensionality”. Though these approaches show promising performance for medical image retrieval, the feature-concatenating method ignores the fact that different features have distinct physical meanings. In this paper, we propose a new method called multiview locally linear embedding (MLLE) for medical image retrieval. Following the patch alignment framework, MLLE preserves the geometric structure of the local patch in each feature space according to the LLE criterion. To explore complementary properties among a range of features, MLLE assigns different weights to local patches from different feature spaces. Finally, MLLE employs global coordinate alignment and alternating optimization techniques to learn a smooth low-dimensional embedding from different features. To justify the effectiveness of MLLE for medical image retrieval, we compare it with conventional spectral embedding methods. We conduct experiments on a subset of the IRMA medical image data set. Evaluation results show that MLLE outperforms state-of-the-art dimension reduction methods. PMID:24349277
Fingerprints selection for topological localization
NASA Astrophysics Data System (ADS)
Popov, Vladimir
2017-07-01
Problems of visual navigation are extensively studied in contemporary robotics. In particular, we can mention different problems of visual landmarks selection, the problem of selection of a minimal set of visual landmarks, selection of partially distinguishable guards, the problem of placement of visual landmarks. In this paper, we consider one-dimensional color panoramas. Such panoramas can be used for creating fingerprints. Fingerprints give us unique identifiers for visually distinct locations by recovering statistically significant features. Fingerprints can be used as visual landmarks for the solution of various problems of mobile robot navigation. In this paper, we consider a method for automatic generation of fingerprints. In particular, we consider the bounded Post correspondence problem and applications of the problem to consensus fingerprints and topological localization. We propose an efficient approach to solve the bounded Post correspondence problem. In particular, we use an explicit reduction from the decision version of the problem to the satisfiability problem. We present the results of computational experiments for different satisfiability algorithms. In robotic experiments, we consider the average accuracy of reaching of the target point for different lengths of routes and types of fingerprints.
Visual categorization of natural movies by rats.
Vinken, Kasper; Vermaercke, Ben; Op de Beeck, Hans P
2014-08-06
Visual categorization of complex, natural stimuli has been studied for some time in human and nonhuman primates. Recent interest in the rodent as a model for visual perception, including higher-level functional specialization, leads to the question of how rodents would perform on a categorization task using natural stimuli. To answer this question, rats were trained in a two-alternative forced choice task to discriminate movies containing rats from movies containing other objects and from scrambled movies (ordinate-level categorization). Subsequently, transfer to novel, previously unseen stimuli was tested, followed by a series of control probes. The results show that the animals are capable of acquiring a decision rule by abstracting common features from natural movies to generalize categorization to new stimuli. Control probes demonstrate that they did not use single low-level features, such as motion energy or (local) luminance. Significant generalization was even present with stationary snapshots from untrained movies. The variability within and between training and test stimuli, the complexity of natural movies, and the control experiments and analyses all suggest that a more high-level rule based on more complex stimulus features than local luminance-based cues was used to classify the novel stimuli. In conclusion, natural stimuli can be used to probe ordinate-level categorization in rats. Copyright © 2014 the authors 0270-6474/14/3410645-14$15.00/0.
A Motion-Based Feature for Event-Based Pattern Recognition
Clady, Xavier; Maro, Jean-Matthieu; Barré, Sébastien; Benosman, Ryad B.
2017-01-01
This paper introduces an event-based luminance-free feature from the output of asynchronous event-based neuromorphic retinas. The feature consists in mapping the distribution of the optical flow along the contours of the moving objects in the visual scene into a matrix. Asynchronous event-based neuromorphic retinas are composed of autonomous pixels, each of them asynchronously generating “spiking” events that encode relative changes in pixels' illumination at high temporal resolutions. The optical flow is computed at each event, and is integrated locally or globally in a speed and direction coordinate frame based grid, using speed-tuned temporal kernels. The latter ensures that the resulting feature equitably represents the distribution of the normal motion along the current moving edges, whatever their respective dynamics. The usefulness and the generality of the proposed feature are demonstrated in pattern recognition applications: local corner detection and global gesture recognition. PMID:28101001
Visual short-term memory for oriented, colored objects.
Shin, Hongsup; Ma, Wei Ji
2017-08-01
A central question in the study of visual short-term memory (VSTM) has been whether its basic units are objects or features. Most studies addressing this question have used change detection tasks in which the feature value before the change is highly discriminable from the feature value after the change. This approach assumes that memory noise is negligible, which recent work has shown not to be the case. Here, we investigate VSTM for orientation and color within a noisy-memory framework, using change localization with a variable magnitude of change. A specific consequence of the noise is that it is necessary to model the inference (decision) stage. We find that (a) orientation and color have independent pools of memory resource (consistent with classic results); (b) an irrelevant feature dimension is either encoded but ignored during decision-making, or encoded with low precision and taken into account during decision-making; and (c) total resource available in a given feature dimension is lower in the presence of task-relevant stimuli that are neutral in that feature dimension. We propose a framework in which feature resource comes both in packaged and in targeted form.
Multiscale wavelet representations for mammographic feature analysis
NASA Astrophysics Data System (ADS)
Laine, Andrew F.; Song, Shuwu
1992-12-01
This paper introduces a novel approach for accomplishing mammographic feature analysis through multiresolution representations. We show that efficient (nonredundant) representations may be identified from digital mammography and used to enhance specific mammographic features within a continuum of scale space. The multiresolution decomposition of wavelet transforms provides a natural hierarchy in which to embed an interactive paradigm for accomplishing scale space feature analysis. Choosing wavelets (or analyzing functions) that are simultaneously localized in both space and frequency, results in a powerful methodology for image analysis. Multiresolution and orientation selectivity, known biological mechanisms in primate vision, are ingrained in wavelet representations and inspire the techniques presented in this paper. Our approach includes local analysis of complete multiscale representations. Mammograms are reconstructed from wavelet coefficients, enhanced by linear, exponential and constant weight functions localized in scale space. By improving the visualization of breast pathology we can improve the changes of early detection of breast cancers (improve quality) while requiring less time to evaluate mammograms for most patients (lower costs).
Steerable dyadic wavelet transform and interval wavelets for enhancement of digital mammography
NASA Astrophysics Data System (ADS)
Laine, Andrew F.; Koren, Iztok; Yang, Wuhai; Taylor, Fred J.
1995-04-01
This paper describes two approaches for accomplishing interactive feature analysis by overcomplete multiresolution representations. We show quantitatively that transform coefficients, modified by an adaptive non-linear operator, can make more obvious unseen or barely seen features of mammography without requiring additional radiation. Our results are compared with traditional image enhancement techniques by measuring the local contrast of known mammographic features. We design a filter bank representing a steerable dyadic wavelet transform that can be used for multiresolution analysis along arbitrary orientations. Digital mammograms are enhanced by orientation analysis performed by a steerable dyadic wavelet transform. Arbitrary regions of interest (ROI) are enhanced by Deslauriers-Dubuc interpolation representations on an interval. We demonstrate that our methods can provide radiologists with an interactive capability to support localized processing of selected (suspicion) areas (lesions). Features extracted from multiscale representations can provide an adaptive mechanism for accomplishing local contrast enhancement. By improving the visualization of breast pathology can improve changes of early detection while requiring less time to evaluate mammograms for most patients.
Adaptive multiscale processing for contrast enhancement
NASA Astrophysics Data System (ADS)
Laine, Andrew F.; Song, Shuwu; Fan, Jian; Huda, Walter; Honeyman, Janice C.; Steinbach, Barbara G.
1993-07-01
This paper introduces a novel approach for accomplishing mammographic feature analysis through overcomplete multiresolution representations. We show that efficient representations may be identified from digital mammograms within a continuum of scale space and used to enhance features of importance to mammography. Choosing analyzing functions that are well localized in both space and frequency, results in a powerful methodology for image analysis. We describe methods of contrast enhancement based on two overcomplete (redundant) multiscale representations: (1) Dyadic wavelet transform (2) (phi) -transform. Mammograms are reconstructed from transform coefficients modified at one or more levels by non-linear, logarithmic and constant scale-space weight functions. Multiscale edges identified within distinct levels of transform space provide a local support for enhancement throughout each decomposition. We demonstrate that features extracted from wavelet spaces can provide an adaptive mechanism for accomplishing local contrast enhancement. We suggest that multiscale detection and local enhancement of singularities may be effectively employed for the visualization of breast pathology without excessive noise amplification.
Advanced biologically plausible algorithms for low-level image processing
NASA Astrophysics Data System (ADS)
Gusakova, Valentina I.; Podladchikova, Lubov N.; Shaposhnikov, Dmitry G.; Markin, Sergey N.; Golovan, Alexander V.; Lee, Seong-Whan
1999-08-01
At present, in computer vision, the approach based on modeling the biological vision mechanisms is extensively developed. However, up to now, real world image processing has no effective solution in frameworks of both biologically inspired and conventional approaches. Evidently, new algorithms and system architectures based on advanced biological motivation should be developed for solution of computational problems related to this visual task. Basic problems that should be solved for creation of effective artificial visual system to process real world imags are a search for new algorithms of low-level image processing that, in a great extent, determine system performance. In the present paper, the result of psychophysical experiments and several advanced biologically motivated algorithms for low-level processing are presented. These algorithms are based on local space-variant filter, context encoding visual information presented in the center of input window, and automatic detection of perceptually important image fragments. The core of latter algorithm are using local feature conjunctions such as noncolinear oriented segment and composite feature map formation. Developed algorithms were integrated into foveal active vision model, the MARR. It is supposed that proposed algorithms may significantly improve model performance while real world image processing during memorizing, search, and recognition.
A global/local affinity graph for image segmentation.
Xiaofang Wang; Yuxing Tang; Masnou, Simon; Liming Chen
2015-04-01
Construction of a reliable graph capturing perceptual grouping cues of an image is fundamental for graph-cut based image segmentation methods. In this paper, we propose a novel sparse global/local affinity graph over superpixels of an input image to capture both short- and long-range grouping cues, and thereby enabling perceptual grouping laws, including proximity, similarity, continuity, and to enter in action through a suitable graph-cut algorithm. Moreover, we also evaluate three major visual features, namely, color, texture, and shape, for their effectiveness in perceptual segmentation and propose a simple graph fusion scheme to implement some recent findings from psychophysics, which suggest combining these visual features with different emphases for perceptual grouping. In particular, an input image is first oversegmented into superpixels at different scales. We postulate a gravitation law based on empirical observations and divide superpixels adaptively into small-, medium-, and large-sized sets. Global grouping is achieved using medium-sized superpixels through a sparse representation of superpixels' features by solving a ℓ0-minimization problem, and thereby enabling continuity or propagation of local smoothness over long-range connections. Small- and large-sized superpixels are then used to achieve local smoothness through an adjacent graph in a given feature space, and thus implementing perceptual laws, for example, similarity and proximity. Finally, a bipartite graph is also introduced to enable propagation of grouping cues between superpixels of different scales. Extensive experiments are carried out on the Berkeley segmentation database in comparison with several state-of-the-art graph constructions. The results show the effectiveness of the proposed approach, which outperforms state-of-the-art graphs using four different objective criteria, namely, the probabilistic rand index, the variation of information, the global consistency error, and the boundary displacement error.
Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion.
Gebru, Israel D; Ba, Sileye; Li, Xiaofei; Horaud, Radu
2018-05-01
Speaker diarization consists of assigning speech signals to people engaged in a dialogue. An audio-visual spatiotemporal diarization model is proposed. The model is well suited for challenging scenarios that consist of several participants engaged in multi-party interaction while they move around and turn their heads towards the other participants rather than facing the cameras and the microphones. Multiple-person visual tracking is combined with multiple speech-source localization in order to tackle the speech-to-person association problem. The latter is solved within a novel audio-visual fusion method on the following grounds: binaural spectral features are first extracted from a microphone pair, then a supervised audio-visual alignment technique maps these features onto an image, and finally a semi-supervised clustering method assigns binaural spectral features to visible persons. The main advantage of this method over previous work is that it processes in a principled way speech signals uttered simultaneously by multiple persons. The diarization itself is cast into a latent-variable temporal graphical model that infers speaker identities and speech turns, based on the output of an audio-visual association process, executed at each time slice, and on the dynamics of the diarization variable itself. The proposed formulation yields an efficient exact inference procedure. A novel dataset, that contains audio-visual training data as well as a number of scenarios involving several participants engaged in formal and informal dialogue, is introduced. The proposed method is thoroughly tested and benchmarked with respect to several state-of-the art diarization algorithms.
Multiple-stage ambiguity in motion perception reveals global computation of local motion directions.
Rider, Andrew T; Nishida, Shin'ya; Johnston, Alan
2016-12-01
The motion of a 1D image feature, such as a line, seen through a small aperture, or the small receptive field of a neural motion sensor, is underconstrained, and it is not possible to derive the true motion direction from a single local measurement. This is referred to as the aperture problem. How the visual system solves the aperture problem is a fundamental question in visual motion research. In the estimation of motion vectors through integration of ambiguous local motion measurements at different positions, conventional theories assume that the object motion is a rigid translation, with motion signals sharing a common motion vector within the spatial region over which the aperture problem is solved. However, this strategy fails for global rotation. Here we show that the human visual system can estimate global rotation directly through spatial pooling of locally ambiguous measurements, without an intervening step that computes local motion vectors. We designed a novel ambiguous global flow stimulus, which is globally as well as locally ambiguous. The global ambiguity implies that the stimulus is simultaneously consistent with both a global rigid translation and an infinite number of global rigid rotations. By the standard view, the motion should always be seen as a global translation, but it appears to shift from translation to rotation as observers shift fixation. This finding indicates that the visual system can estimate local vectors using a global rotation constraint, and suggests that local motion ambiguity may not be resolved until consistencies with multiple global motion patterns are assessed.
Principal visual word discovery for automatic license plate detection.
Zhou, Wengang; Li, Houqiang; Lu, Yijuan; Tian, Qi
2012-09-01
License plates detection is widely considered a solved problem, with many systems already in operation. However, the existing algorithms or systems work well only under some controlled conditions. There are still many challenges for license plate detection in an open environment, such as various observation angles, background clutter, scale changes, multiple plates, uneven illumination, and so on. In this paper, we propose a novel scheme to automatically locate license plates by principal visual word (PVW), discovery and local feature matching. Observing that characters in different license plates are duplicates of each other, we bring in the idea of using the bag-of-words (BoW) model popularly applied in partial-duplicate image search. Unlike the classic BoW model, for each plate character, we automatically discover the PVW characterized with geometric context. Given a new image, the license plates are extracted by matching local features with PVW. Besides license plate detection, our approach can also be extended to the detection of logos and trademarks. Due to the invariance virtue of scale-invariant feature transform feature, our method can adaptively deal with various changes in the license plates, such as rotation, scaling, illumination, etc. Promising results of the proposed approach are demonstrated with an experimental study in license plate detection.
Content based image retrieval using local binary pattern operator and data mining techniques.
Vatamanu, Oana Astrid; Frandeş, Mirela; Lungeanu, Diana; Mihalaş, Gheorghe-Ioan
2015-01-01
Content based image retrieval (CBIR) concerns the retrieval of similar images from image databases, using feature vectors extracted from images. These feature vectors globally define the visual content present in an image, defined by e.g., texture, colour, shape, and spatial relations between vectors. Herein, we propose the definition of feature vectors using the Local Binary Pattern (LBP) operator. A study was performed in order to determine the optimum LBP variant for the general definition of image feature vectors. The chosen LBP variant is then subsequently used to build an ultrasound image database, and a database with images obtained from Wireless Capsule Endoscopy. The image indexing process is optimized using data clustering techniques for images belonging to the same class. Finally, the proposed indexing method is compared to the classical indexing technique, which is nowadays widely used.
Fang, Chunying; Li, Haifeng; Ma, Lin; Zhang, Mancai
2017-01-01
Pathological speech usually refers to speech distortion resulting from illness or other biological insults. The assessment of pathological speech plays an important role in assisting the experts, while automatic evaluation of speech intelligibility is difficult because it is usually nonstationary and mutational. In this paper, we carry out an independent innovation of feature extraction and reduction, and we describe a multigranularity combined feature scheme which is optimized by the hierarchical visual method. A novel method of generating feature set based on S -transform and chaotic analysis is proposed. There are BAFS (430, basic acoustics feature), local spectral characteristics MSCC (84, Mel S -transform cepstrum coefficients), and chaotic features (12). Finally, radar chart and F -score are proposed to optimize the features by the hierarchical visual fusion. The feature set could be optimized from 526 to 96 dimensions based on NKI-CCRT corpus and 104 dimensions based on SVD corpus. The experimental results denote that new features by support vector machine (SVM) have the best performance, with a recognition rate of 84.4% on NKI-CCRT corpus and 78.7% on SVD corpus. The proposed method is thus approved to be effective and reliable for pathological speech intelligibility evaluation.
JackIn Head: Immersive Visual Telepresence System with Omnidirectional Wearable Camera.
Kasahara, Shunichi; Nagai, Shohei; Rekimoto, Jun
2017-03-01
Sharing one's own immersive experience over the Internet is one of the ultimate goals of telepresence technology. In this paper, we present JackIn Head, a visual telepresence system featuring an omnidirectional wearable camera with image motion stabilization. Spherical omnidirectional video footage taken around the head of a local user is stabilized and then broadcast to others, allowing remote users to explore the immersive visual environment independently of the local user's head direction. We describe the system design of JackIn Head and report the evaluation results of real-time image stabilization and alleviation of cybersickness. Then, through an exploratory observation study, we investigate how individuals can remotely interact, communicate with, and assist each other with our system. We report our observation and analysis of inter-personal communication, demonstrating the effectiveness of our system in augmenting remote collaboration.
Reilly, Jamie; Garcia, Amanda; Binney, Richard J.
2016-01-01
Much remains to be learned about the neural architecture underlying word meaning. Fully distributed models of semantic memory predict that the sound of a barking dog will conjointly engage a network of distributed sensorimotor spokes. An alternative framework holds that modality-specific features additionally converge within transmodal hubs. Participants underwent functional MRI while covertly naming familiar objects versus newly learned novel objects from only one of their constituent semantic features (visual form, characteristic sound, or point-light motion representation). Relative to the novel object baseline, familiar concepts elicited greater activation within association regions specific to that presentation modality. Furthermore, visual form elicited activation within high-level auditory association cortex. Conversely, environmental sounds elicited activation in regions proximal to visual association cortex. Both conditions commonly engaged a putative hub region within lateral anterior temporal cortex. These results support hybrid semantic models in which local hubs and distributed spokes are dually engaged in service of semantic memory. PMID:27289210
Fan, Jianping; Gao, Yuli; Luo, Hangzai
2008-03-01
In this paper, we have developed a new scheme for achieving multilevel annotations of large-scale images automatically. To achieve more sufficient representation of various visual properties of the images, both the global visual features and the local visual features are extracted for image content representation. To tackle the problem of huge intraconcept visual diversity, multiple types of kernels are integrated to characterize the diverse visual similarity relationships between the images more precisely, and a multiple kernel learning algorithm is developed for SVM image classifier training. To address the problem of huge interconcept visual similarity, a novel multitask learning algorithm is developed to learn the correlated classifiers for the sibling image concepts under the same parent concept and enhance their discrimination and adaptation power significantly. To tackle the problem of huge intraconcept visual diversity for the image concepts at the higher levels of the concept ontology, a novel hierarchical boosting algorithm is developed to learn their ensemble classifiers hierarchically. In order to assist users on selecting more effective hypotheses for image classifier training, we have developed a novel hyperbolic framework for large-scale image visualization and interactive hypotheses assessment. Our experiments on large-scale image collections have also obtained very positive results.
A robust method for estimating motorbike count based on visual information learning
NASA Astrophysics Data System (ADS)
Huynh, Kien C.; Thai, Dung N.; Le, Sach T.; Thoai, Nam; Hamamoto, Kazuhiko
2015-03-01
Estimating the number of vehicles in traffic videos is an important and challenging task in traffic surveillance, especially with a high level of occlusions between vehicles, e.g.,in crowded urban area with people and/or motorbikes. In such the condition, the problem of separating individual vehicles from foreground silhouettes often requires complicated computation [1][2][3]. Thus, the counting problem is gradually shifted into drawing statistical inferences of target objects density from their shape [4], local features [5], etc. Those researches indicate a correlation between local features and the number of target objects. However, they are inadequate to construct an accurate model for vehicles density estimation. In this paper, we present a reliable method that is robust to illumination changes and partial affine transformations. It can achieve high accuracy in case of occlusions. Firstly, local features are extracted from images of the scene using Speed-Up Robust Features (SURF) method. For each image, a global feature vector is computed using a Bag-of-Words model which is constructed from the local features above. Finally, a mapping between the extracted global feature vectors and their labels (the number of motorbikes) is learned. That mapping provides us a strong prediction model for estimating the number of motorbikes in new images. The experimental results show that our proposed method can achieve a better accuracy in comparison to others.
Zhang, Yu; Wu, Jianxin; Cai, Jianfei
2016-05-01
In large-scale visual recognition and image retrieval tasks, feature vectors, such as Fisher vector (FV) or the vector of locally aggregated descriptors (VLAD), have achieved state-of-the-art results. However, the combination of the large numbers of examples and high-dimensional vectors necessitates dimensionality reduction, in order to reduce its storage and CPU costs to a reasonable range. In spite of the popularity of various feature compression methods, this paper shows that the feature (dimension) selection is a better choice for high-dimensional FV/VLAD than the feature (dimension) compression methods, e.g., product quantization. We show that strong correlation among the feature dimensions in the FV and the VLAD may not exist, which renders feature selection a natural choice. We also show that, many dimensions in FV/VLAD are noise. Throwing them away using feature selection is better than compressing them and useful dimensions altogether using feature compression methods. To choose features, we propose an efficient importance sorting algorithm considering both the supervised and unsupervised cases, for visual recognition and image retrieval, respectively. Combining with the 1-bit quantization, feature selection has achieved both higher accuracy and less computational cost than feature compression methods, such as product quantization, on the FV and the VLAD image representations.
Object segmentation controls image reconstruction from natural scenes
2017-01-01
The structure of the physical world projects images onto our eyes. However, those images are often poorly representative of environmental structure: well-defined boundaries within the eye may correspond to irrelevant features of the physical world, while critical features of the physical world may be nearly invisible at the retinal projection. The challenge for the visual cortex is to sort these two types of features according to their utility in ultimately reconstructing percepts and interpreting the constituents of the scene. We describe a novel paradigm that enabled us to selectively evaluate the relative role played by these two feature classes in signal reconstruction from corrupted images. Our measurements demonstrate that this process is quickly dominated by the inferred structure of the environment, and only minimally controlled by variations of raw image content. The inferential mechanism is spatially global and its impact on early visual cortex is fast. Furthermore, it retunes local visual processing for more efficient feature extraction without altering the intrinsic transduction noise. The basic properties of this process can be partially captured by a combination of small-scale circuit models and large-scale network architectures. Taken together, our results challenge compartmentalized notions of bottom-up/top-down perception and suggest instead that these two modes are best viewed as an integrated perceptual mechanism. PMID:28827801
Independent and additive repetition priming of motion direction and color in visual search.
Kristjánsson, Arni
2009-03-01
Priming of visual search for Gabor patch stimuli, varying in color and local drift direction, was investigated. The task relevance of each feature varied between the different experimental conditions compared. When the target defining dimension was color, a large effect of color repetition was seen as well as a smaller effect of the repetition of motion direction. The opposite priming pattern was seen when motion direction defined the target--the effect of motion direction repetition was this time larger than for color repetition. Finally, when neither was task relevant, and the target defining dimension was the spatial frequency of the Gabor patch, priming was seen for repetition of both color and motion direction, but the effects were smaller than in the previous two conditions. These results show that features do not necessarily have to be task relevant for priming to occur. There is little interaction between priming following repetition of color and motion, these two features show independent and additive priming effects, most likely reflecting that the two features are processed at separate processing sites in the nervous system, consistent with previous findings from neuropsychology & neurophysiology. The implications of the findings for theoretical accounts of priming in visual search are discussed.
Using Visual Odometry to Estimate Position and Attitude
NASA Technical Reports Server (NTRS)
Maimone, Mark; Cheng, Yang; Matthies, Larry; Schoppers, Marcel; Olson, Clark
2007-01-01
A computer program in the guidance system of a mobile robot generates estimates of the position and attitude of the robot, using features of the terrain on which the robot is moving, by processing digitized images acquired by a stereoscopic pair of electronic cameras mounted rigidly on the robot. Developed for use in localizing the Mars Exploration Rover (MER) vehicles on Martian terrain, the program can also be used for similar purposes on terrestrial robots moving in sufficiently visually textured environments: examples include low-flying robotic aircraft and wheeled robots moving on rocky terrain or inside buildings. In simplified terms, the program automatically detects visual features and tracks them across stereoscopic pairs of images acquired by the cameras. The 3D locations of the tracked features are then robustly processed into an estimate of overall vehicle motion. Testing has shown that by use of this software, the error in the estimate of the position of the robot can be limited to no more than 2 percent of the distance traveled, provided that the terrain is sufficiently rich in features. This software has proven extremely useful on the MER vehicles during driving on sandy and highly sloped terrains on Mars.
A Self-Paced Physical Geology Laboratory.
ERIC Educational Resources Information Center
Watson, Donald W.
1983-01-01
Describes a self-paced geology course utilizing a diversity of instructional techniques, including maps, models, samples, audio-visual materials, and a locally developed laboratory manual. Mechanical features are laboratory exercises, followed by unit quizzes; quizzes are repeated until the desired level of competence is attained. (Author/JN)
Search asymmetry and eye movements in infants and adults.
Adler, Scott A; Gallego, Pamela
2014-08-01
Search asymmetry is characterized by the detection of a feature-present target amidst feature-absent distractors being efficient and unaffected by the number of distractors, whereas detection of a feature-absent target amidst feature-present distractors is typically inefficient and affected by the number of distractors. Although studies have attempted to investigate this phenomenon with infants (e.g., Adler, Inslicht, Rovee-Collier, & Gerhardstein in Infant Behavioral Development, 21, 253-272, 1998; Colombo, Mitchell, Coldren, & Atwater in Journal of Experimental Psychology: Learning, Memory and Cognition, 19, 98-109, 1990), due to methodological limitations, their findings have been unable to definitively establish the development of visual search mechanisms in infants. The present study assessed eye movements as a means to examine an asymmetry in responding to feature-present versus feature-absent targets in 3-month-olds, relative to adults. Saccade latencies to localize a target (or a distractor, as in the homogeneous conditions) were measured as infants and adults randomly viewed feature-present (R among Ps), feature-absent (P among Rs), and homogeneous (either all Rs or all Ps) arrays at set sizes of 1, 3, 5, and 8. Results indicated that neither infants' nor adults' saccade latencies to localize the target in the feature-present arrays were affected by increasing set sizes, suggesting that localization of the target was efficient. In contrast, saccade latencies to localize the target in the feature-absent arrays increased with increasing set sizes for both infants and adults, suggesting an inefficient localization. These findings indicate that infants exhibit an asymmetry consistent with that found with adults, providing support for functional bottom-up selective attention mechanisms in early infancy.
High-Order Local Pooling and Encoding Gaussians Over a Dictionary of Gaussians.
Li, Peihua; Zeng, Hui; Wang, Qilong; Shiu, Simon C K; Zhang, Lei
2017-07-01
Local pooling (LP) in configuration (feature) space proposed by Boureau et al. explicitly restricts similar features to be aggregated, which can preserve as much discriminative information as possible. At the time it appeared, this method combined with sparse coding achieved competitive classification results with only a small dictionary. However, its performance lags far behind the state-of-the-art results as only the zero-order information is exploited. Inspired by the success of high-order statistical information in existing advanced feature coding or pooling methods, we make an attempt to address the limitation of LP. To this end, we present a novel method called high-order LP (HO-LP) to leverage the information higher than the zero-order one. Our idea is intuitively simple: we compute the first- and second-order statistics per configuration bin and model them as a Gaussian. Accordingly, we employ a collection of Gaussians as visual words to represent the universal probability distribution of features from all classes. Our problem is naturally formulated as encoding Gaussians over a dictionary of Gaussians as visual words. This problem, however, is challenging since the space of Gaussians is not a Euclidean space but forms a Riemannian manifold. We address this challenge by mapping Gaussians into the Euclidean space, which enables us to perform coding with common Euclidean operations rather than complex and often expensive Riemannian operations. Our HO-LP preserves the advantages of the original LP: pooling only similar features and using a small dictionary. Meanwhile, it achieves very promising performance on standard benchmarks, with either conventional, hand-engineered features or deep learning-based features.
Scene analysis for effective visual search in rough three-dimensional-modeling scenes
NASA Astrophysics Data System (ADS)
Wang, Qi; Hu, Xiaopeng
2016-11-01
Visual search is a fundamental technology in the computer vision community. It is difficult to find an object in complex scenes when there exist similar distracters in the background. We propose a target search method in rough three-dimensional-modeling scenes based on a vision salience theory and camera imaging model. We give the definition of salience of objects (or features) and explain the way that salience measurements of objects are calculated. Also, we present one type of search path that guides to the target through salience objects. Along the search path, when the previous objects are localized, the search region of each subsequent object decreases, which is calculated through imaging model and an optimization method. The experimental results indicate that the proposed method is capable of resolving the ambiguities resulting from distracters containing similar visual features with the target, leading to an improvement of search speed by over 50%.
Lesion classification using clinical and visual data fusion by multiple kernel learning
NASA Astrophysics Data System (ADS)
Kisilev, Pavel; Hashoul, Sharbell; Walach, Eugene; Tzadok, Asaf
2014-03-01
To overcome operator dependency and to increase diagnosis accuracy in breast ultrasound (US), a lot of effort has been devoted to developing computer-aided diagnosis (CAD) systems for breast cancer detection and classification. Unfortunately, the efficacy of such CAD systems is limited since they rely on correct automatic lesions detection and localization, and on robustness of features computed based on the detected areas. In this paper we propose a new approach to boost the performance of a Machine Learning based CAD system, by combining visual and clinical data from patient files. We compute a set of visual features from breast ultrasound images, and construct the textual descriptor of patients by extracting relevant keywords from patients' clinical data files. We then use the Multiple Kernel Learning (MKL) framework to train SVM based classifier to discriminate between benign and malignant cases. We investigate different types of data fusion methods, namely, early, late, and intermediate (MKL-based) fusion. Our database consists of 408 patient cases, each containing US images, textual description of complaints and symptoms filled by physicians, and confirmed diagnoses. We show experimentally that the proposed MKL-based approach is superior to other classification methods. Even though the clinical data is very sparse and noisy, its MKL-based fusion with visual features yields significant improvement of the classification accuracy, as compared to the image features only based classifier.
Zmigrod, Sharon; Zmigrod, Leor; Hommel, Bernhard
2015-01-01
While recent studies have investigated how processes underlying human creativity are affected by particular visual-attentional states, we tested the impact of more stable attention-related preferences. These were assessed by means of Navon's global-local task, in which participants respond to the global or local features of large letters constructed from smaller letters. Three standard measures were derived from this task: the sizes of the global precedence effect, the global interference effect (i.e., the impact of incongruent letters at the global level on local processing), and the local interference effect (i.e., the impact of incongruent letters at the local level on global processing). These measures were correlated with performance in a convergent-thinking creativity task (the Remote Associates Task), a divergent-thinking creativity task (the Alternate Uses Task), and a measure of fluid intelligence (Raven's matrices). Flexibility in divergent thinking was predicted by the local interference effect while convergent thinking was predicted by intelligence only. We conclude that a stronger attentional bias to visual information about the "bigger picture" promotes cognitive flexibility in searching for multiple solutions.
Effective real-time vehicle tracking using discriminative sparse coding on local patches
NASA Astrophysics Data System (ADS)
Chen, XiangJun; Ye, Feiyue; Ruan, Yaduan; Chen, Qimei
2016-01-01
A visual tracking framework that provides an object detector and tracker, which focuses on effective and efficient visual tracking in surveillance of real-world intelligent transport system applications, is proposed. The framework casts the tracking task as problems of object detection, feature representation, and classification, which is different from appearance model-matching approaches. Through a feature representation of discriminative sparse coding on local patches called DSCLP, which trains a dictionary on local clustered patches sampled from both positive and negative datasets, the discriminative power and robustness has been improved remarkably, which makes our method more robust to a complex realistic setting with all kinds of degraded image quality. Moreover, by catching objects through one-time background subtraction, along with offline dictionary training, computation time is dramatically reduced, which enables our framework to achieve real-time tracking performance even in a high-definition sequence with heavy traffic. Experiment results show that our work outperforms some state-of-the-art methods in terms of speed, accuracy, and robustness and exhibits increased robustness in a complex real-world scenario with degraded image quality caused by vehicle occlusion, image blur of rain or fog, and change in viewpoint or scale.
Fusion of multichannel local and global structural cues for photo aesthetics evaluation.
Luming Zhang; Yue Gao; Zimmermann, Roger; Qi Tian; Xuelong Li
2014-03-01
Photo aesthetic quality evaluation is a fundamental yet under addressed task in computer vision and image processing fields. Conventional approaches are frustrated by the following two drawbacks. First, both the local and global spatial arrangements of image regions play an important role in photo aesthetics. However, existing rules, e.g., visual balance, heuristically define which spatial distribution among the salient regions of a photo is aesthetically pleasing. Second, it is difficult to adjust visual cues from multiple channels automatically in photo aesthetics assessment. To solve these problems, we propose a new photo aesthetics evaluation framework, focusing on learning the image descriptors that characterize local and global structural aesthetics from multiple visual channels. In particular, to describe the spatial structure of the image local regions, we construct graphlets small-sized connected graphs by connecting spatially adjacent atomic regions. Since spatially adjacent graphlets distribute closely in their feature space, we project them onto a manifold and subsequently propose an embedding algorithm. The embedding algorithm encodes the photo global spatial layout into graphlets. Simultaneously, the importance of graphlets from multiple visual channels are dynamically adjusted. Finally, these post-embedding graphlets are integrated for photo aesthetics evaluation using a probabilistic model. Experimental results show that: 1) the visualized graphlets explicitly capture the aesthetically arranged atomic regions; 2) the proposed approach generalizes and improves four prominent aesthetic rules; and 3) our approach significantly outperforms state-of-the-art algorithms in photo aesthetics prediction.
Receptive-field subfields of V2 neurons in macaque monkeys are adult-like near birth.
Zhang, Bin; Tao, Xiaofeng; Shen, Guofu; Smith, Earl L; Ohzawa, Izumi; Chino, Yuzo M
2013-02-06
Infant primates can discriminate texture-defined form despite their relatively low visual acuity. The neuronal mechanisms underlying this remarkable visual capacity of infants have not been studied in nonhuman primates. Since many V2 neurons in adult monkeys can extract the local features in complex stimuli that are required for form vision, we used two-dimensional dynamic noise stimuli and local spectral reverse correlation to measure whether the spatial map of receptive-field subfields in individual V2 neurons is sufficiently mature near birth to capture local features. As in adults, most V2 neurons in 4-week-old monkeys showed a relatively high degree of homogeneity in the spatial matrix of facilitatory subfields. However, ∼25% of V2 neurons had the subfield map where the neighboring facilitatory subfields substantially differed in their preferred orientations and spatial frequencies. Over 80% of V2 neurons in both infants and adults had "tuned" suppressive profiles in their subfield maps that could alter the tuning properties of facilitatory profiles. The differences in the preferred orientations between facilitatory and suppressive profiles were relatively large but extended over a broad range. Response immaturities in infants were mild; the overall strength of facilitatory subfield responses was lower than that in adults, and the optimal correlation delay ("latency") was longer in 4-week-old infants. These results suggest that as early as 4 weeks of age, the spatial receptive-field structure of V2 neurons is as complex as in adults and the ability of V2 neurons to compare local features of neighboring stimulus elements is nearly adult like.
Global-local feature attention network with reranking strategy for image caption generation
NASA Astrophysics Data System (ADS)
Wu, Jie; Xie, Si-ya; Shi, Xin-bao; Chen, Yao-wen
2017-11-01
In this paper, a novel framework, named as global-local feature attention network with reranking strategy (GLAN-RS), is presented for image captioning task. Rather than only adopting unitary visual information in the classical models, GLAN-RS explores the attention mechanism to capture local convolutional salient image maps. Furthermore, we adopt reranking strategy to adjust the priority of the candidate captions and select the best one. The proposed model is verified using the Microsoft Common Objects in Context (MSCOCO) benchmark dataset across seven standard evaluation metrics. Experimental results show that GLAN-RS significantly outperforms the state-of-the-art approaches, such as multimodal recurrent neural network (MRNN) and Google NIC, which gets an improvement of 20% in terms of BLEU4 score and 13 points in terms of CIDER score.
Orientation selectivity sharpens motion detection in Drosophila
Fisher, Yvette E.; Silies, Marion; Clandinin, Thomas R.
2015-01-01
SUMMARY Detecting the orientation and movement of edges in a scene is critical to visually guided behaviors of many animals. What are the circuit algorithms that allow the brain to extract such behaviorally vital visual cues? Using in vivo two-photon calcium imaging in Drosophila, we describe direction selective signals in the dendrites of T4 and T5 neurons, detectors of local motion. We demonstrate that this circuit performs selective amplification of local light inputs, an observation that constrains motion detection models and confirms a core prediction of the Hassenstein-Reichardt Correlator (HRC). These neurons are also orientation selective, responding strongly to static features that are orthogonal to their preferred axis of motion, a tuning property not predicted by the HRC. This coincident extraction of orientation and direction sharpens directional tuning through surround inhibition and reveals a striking parallel between visual processing in flies and vertebrate cortex, suggesting a universal strategy for motion processing. PMID:26456048
Experience improves feature extraction in Drosophila.
Peng, Yueqing; Xi, Wang; Zhang, Wei; Zhang, Ke; Guo, Aike
2007-05-09
Previous exposure to a pattern in the visual scene can enhance subsequent recognition of that pattern in many species from honeybees to humans. However, whether previous experience with a visual feature of an object, such as color or shape, can also facilitate later recognition of that particular feature from multiple visual features is largely unknown. Visual feature extraction is the ability to select the key component from multiple visual features. Using a visual flight simulator, we designed a novel protocol for visual feature extraction to investigate the effects of previous experience on visual reinforcement learning in Drosophila. We found that, after conditioning with a visual feature of objects among combinatorial shape-color features, wild-type flies exhibited poor ability to extract the correct visual feature. However, the ability for visual feature extraction was greatly enhanced in flies trained previously with that visual feature alone. Moreover, we demonstrated that flies might possess the ability to extract the abstract category of "shape" but not a particular shape. Finally, this experience-dependent feature extraction is absent in flies with defective MBs, one of the central brain structures in Drosophila. Our results indicate that previous experience can enhance visual feature extraction in Drosophila and that MBs are required for this experience-dependent visual cognition.
GAFFE: a gaze-attentive fixation finding engine.
Rajashekar, U; van der Linde, I; Bovik, A C; Cormack, L K
2008-04-01
The ability to automatically detect visually interesting regions in images has many practical applications, especially in the design of active machine vision and automatic visual surveillance systems. Analysis of the statistics of image features at observers' gaze can provide insights into the mechanisms of fixation selection in humans. Using a foveated analysis framework, we studied the statistics of four low-level local image features: luminance, contrast, and bandpass outputs of both luminance and contrast, and discovered that image patches around human fixations had, on average, higher values of each of these features than image patches selected at random. Contrast-bandpass showed the greatest difference between human and random fixations, followed by luminance-bandpass, RMS contrast, and luminance. Using these measurements, we present a new algorithm that selects image regions as likely candidates for fixation. These regions are shown to correlate well with fixations recorded from human observers.
Organization of the Drosophila larval visual circuit
Gendre, Nanae; Neagu-Maier, G Larisa; Fetter, Richard D; Schneider-Mizell, Casey M; Truman, James W; Zlatic, Marta; Cardona, Albert
2017-01-01
Visual systems transduce, process and transmit light-dependent environmental cues. Computation of visual features depends on photoreceptor neuron types (PR) present, organization of the eye and wiring of the underlying neural circuit. Here, we describe the circuit architecture of the visual system of Drosophila larvae by mapping the synaptic wiring diagram and neurotransmitters. By contacting different targets, the two larval PR-subtypes create two converging pathways potentially underlying the computation of ambient light intensity and temporal light changes already within this first visual processing center. Locally processed visual information then signals via dedicated projection interneurons to higher brain areas including the lateral horn and mushroom body. The stratified structure of the larval optic neuropil (LON) suggests common organizational principles with the adult fly and vertebrate visual systems. The complete synaptic wiring diagram of the LON paves the way to understanding how circuits with reduced numerical complexity control wide ranges of behaviors.
Scene perception and the visual control of travel direction in navigating wood ants
Collett, Thomas S.; Lent, David D.; Graham, Paul
2014-01-01
This review reflects a few of Mike Land's many and varied contributions to visual science. In it, we show for wood ants, as Mike has done for a variety of animals, including readers of this piece, what can be learnt from a detailed analysis of an animal's visually guided eye, head or body movements. In the case of wood ants, close examination of their body movements, as they follow visually guided routes, is starting to reveal how they perceive and respond to their visual world and negotiate a path within it. We describe first some of the mechanisms that underlie the visual control of their paths, emphasizing that vision is not the ant's only sense. In the second part, we discuss how remembered local shape-dependent and global shape-independent features of a visual scene may interact in guiding the ant's path. PMID:24395962
The Precedence of Global Features in the Perception of Map Symbols
1988-06-01
be continually updated. The present study evaluated the feasibility of a serial model of visual processing. By comparing performance between a symbol...symbols, is based on a " filter - ing" procedure, consisting of a series of passive-to-active or global- to-local stages. Navon (1977, 1981a) has proposed a...packages or segments. This advances the earlier, static feature aggregation ap- proaches to comprise a "figure." According to the global precedence model
Selective weighting of action-related feature dimensions in visual working memory.
Heuer, Anna; Schubö, Anna
2017-08-01
Planning an action primes feature dimensions that are relevant for that particular action, increasing the impact of these dimensions on perceptual processing. Here, we investigated whether action planning also affects the short-term maintenance of visual information. In a combined memory and movement task, participants were to memorize items defined by size or color while preparing either a grasping or a pointing movement. Whereas size is a relevant feature dimension for grasping, color can be used to localize the goal object and guide a pointing movement. The results showed that memory for items defined by size was better during the preparation of a grasping movement than during the preparation of a pointing movement. Conversely, memory for color tended to be better when a pointing movement rather than a grasping movement was being planned. This pattern was not only observed when the memory task was embedded within the preparation period of the movement, but also when the movement to be performed was only indicated during the retention interval of the memory task. These findings reveal that a weighting of information in visual working memory according to action relevance can even be implemented at the representational level during maintenance, demonstrating that our actions continue to influence visual processing beyond the perceptual stage.
Rules infants look by: Testing the assumption of transitivity in visual salience.
Kibbe, Melissa M; Kaldy, Zsuzsa; Blaser, Erik
2018-01-01
What drives infants' attention in complex visual scenes? Early models of infant attention suggested that the degree to which different visual features were detectable determines their attentional priority. Here, we tested this by asking whether two targets - defined by different features, but each equally salient when evaluated independently - would drive attention equally when pitted head-to-head. In Experiment 1, we presented 6-month-old infants with an array of gabor patches in which a target region varied either in color or spatial frequency from the background. Using a forced-choice preferential-looking method, we measured how readily infants fixated the target as its featural difference from the background was parametrically increased. Then, in Experiment 2, we used these psychometric preference functions to choose values for color and spatial frequency targets that were equally salient (preferred), and pitted them against each other within the same display. We reasoned that, if salience is transitive, then the stimuli should be iso-salient and infants should therefore show no systematic preference for either stimulus. On the contrary, we found that infants consistently preferred the color-defined stimulus. This suggests that computing visual salience in more complex scenes needs to include factors above and beyond local salience values.
Role of early visual cortex in trans-saccadic memory of object features.
Malik, Pankhuri; Dessing, Joost C; Crawford, J Douglas
2015-08-01
Early visual cortex (EVC) participates in visual feature memory and the updating of remembered locations across saccades, but its role in the trans-saccadic integration of object features is unknown. We hypothesized that if EVC is involved in updating object features relative to gaze, feature memory should be disrupted when saccades remap an object representation into a simultaneously perturbed EVC site. To test this, we applied transcranial magnetic stimulation (TMS) over functional magnetic resonance imaging-localized EVC clusters corresponding to the bottom left/right visual quadrants (VQs). During experiments, these VQs were probed psychophysically by briefly presenting a central object (Gabor patch) while subjects fixated gaze to the right or left (and above). After a short memory interval, participants were required to detect the relative change in orientation of a re-presented test object at the same spatial location. Participants either sustained fixation during the memory interval (fixation task) or made a horizontal saccade that either maintained or reversed the VQ of the object (saccade task). Three TMS pulses (coinciding with the pre-, peri-, and postsaccade intervals) were applied to the left or right EVC. This had no effect when (a) fixation was maintained, (b) saccades kept the object in the same VQ, or (c) the EVC quadrant corresponding to the first object was stimulated. However, as predicted, TMS reduced performance when saccades (especially larger saccades) crossed the remembered object location and brought it into the VQ corresponding to the TMS site. This suppression effect was statistically significant for leftward saccades and followed a weaker trend for rightward saccades. These causal results are consistent with the idea that EVC is involved in the gaze-centered updating of object features for trans-saccadic memory and perception.
Monocular Visual Odometry Based on Trifocal Tensor Constraint
NASA Astrophysics Data System (ADS)
Chen, Y. J.; Yang, G. L.; Jiang, Y. X.; Liu, X. Y.
2018-02-01
For the problem of real-time precise localization in the urban street, a monocular visual odometry based on Extend Kalman fusion of optical-flow tracking and trifocal tensor constraint is proposed. To diminish the influence of moving object, such as pedestrian, we estimate the motion of the camera by extracting the features on the ground, which improves the robustness of the system. The observation equation based on trifocal tensor constraint is derived, which can form the Kalman filter alone with the state transition equation. An Extend Kalman filter is employed to cope with the nonlinear system. Experimental results demonstrate that, compares with Yu’s 2-step EKF method, the algorithm is more accurate which meets the needs of real-time accurate localization in cities.
Feature integration and object representations along the dorsal stream visual hierarchy
Perry, Carolyn Jeane; Fallah, Mazyar
2014-01-01
The visual system is split into two processing streams: a ventral stream that receives color and form information and a dorsal stream that receives motion information. Each stream processes that information hierarchically, with each stage building upon the previous. In the ventral stream this leads to the formation of object representations that ultimately allow for object recognition regardless of changes in the surrounding environment. In the dorsal stream, this hierarchical processing has classically been thought to lead to the computation of complex motion in three dimensions. However, there is evidence to suggest that there is integration of both dorsal and ventral stream information into motion computation processes, giving rise to intermediate object representations, which facilitate object selection and decision making mechanisms in the dorsal stream. First we review the hierarchical processing of motion along the dorsal stream and the building up of object representations along the ventral stream. Then we discuss recent work on the integration of ventral and dorsal stream features that lead to intermediate object representations in the dorsal stream. Finally we propose a framework describing how and at what stage different features are integrated into dorsal visual stream object representations. Determining the integration of features along the dorsal stream is necessary to understand not only how the dorsal stream builds up an object representation but also which computations are performed on object representations instead of local features. PMID:25140147
The role of attention in figure-ground segregation in areas V1 and V4 of the visual cortex.
Poort, Jasper; Raudies, Florian; Wannig, Aurel; Lamme, Victor A F; Neumann, Heiko; Roelfsema, Pieter R
2012-07-12
Our visual system segments images into objects and background. Figure-ground segregation relies on the detection of feature discontinuities that signal boundaries between the figures and the background and on a complementary region-filling process that groups together image regions with similar features. The neuronal mechanisms for these processes are not well understood and it is unknown how they depend on visual attention. We measured neuronal activity in V1 and V4 in a task where monkeys either made an eye movement to texture-defined figures or ignored them. V1 activity predicted the timing and the direction of the saccade if the figures were task relevant. We found that boundary detection is an early process that depends little on attention, whereas region filling occurs later and is facilitated by visual attention, which acts in an object-based manner. Our findings are explained by a model with local, bottom-up computations for boundary detection and feedback processing for region filling. Copyright © 2012 Elsevier Inc. All rights reserved.
Truppa, Valentina; Carducci, Paola; De Simone, Diego Antonio; Bisazza, Angelo; De Lillo, Carlo
2017-03-01
In the last two decades, comparative research has addressed the issue of how the global and local levels of structure of visual stimuli are processed by different species, using Navon-type hierarchical figures, i.e. smaller local elements that form larger global configurations. Determining whether or not the variety of procedures adopted to test different species with hierarchical figures are equivalent is of crucial importance to ensure comparability of results. Among non-human species, global/local processing has been extensively studied in tufted capuchin monkeys using matching-to-sample tasks with hierarchical patterns. Local dominance has emerged consistently in these New World primates. In the present study, we assessed capuchins' processing of hierarchical stimuli with a method frequently adopted in studies of global/local processing in non-primate species: the conflict-choice task. Different from the matching-to-sample procedure, this task involved processing local and global information retained in long-term memory. Capuchins were trained to discriminate between consistent hierarchical stimuli (similar global and local shape) and then tested with inconsistent hierarchical stimuli (different global and local shapes). We found that capuchins preferred the hierarchical stimuli featuring the correct local elements rather than those with the correct global configuration. This finding confirms that capuchins' local dominance, typically observed using matching-to-sample procedures, is also expressed as a local preference in the conflict-choice task. Our study adds to the growing body of comparative studies on visual grouping functions by demonstrating that the methods most frequently used in the literature on global/local processing produce analogous results irrespective of extent of the involvement of memory processes.
Bölte, Sven; Poustka, Fritz
2006-06-01
The objective of this study was to investigate the tendency for local processing style ('weak central coherence') and executive dysfunction in parents of subjects with an autism spectrum disorder (ASD) compared with parents of individuals with early onset schizophrenia (EOS) and mental retardation (MR). Sixty-two parents of subjects with ASD, 36 parents of subjects with EOS and 30 parents of subjects with MR were examined. Data on two scales indicative of local visual processing (Embedded Figures Test, Block Design) and on three executive function tests (Wisconsin Card Sorting Test, Tower of Hanoi, Trailmaking Test) were collected for all participants. Parents of subjects with ASD performed significantly faster on the Embedded Figures Test compared with both control samples. No other substantial group differences were observed. The findings indicate that an increased tendency for local processing in terms of visual disembedding could be a relatively specific core feature of the broader cognitive phenotype of autism in parents.
Robotic Attention Processing And Its Application To Visual Guidance
NASA Astrophysics Data System (ADS)
Barth, Matthew; Inoue, Hirochika
1988-03-01
This paper describes a method of real-time visual attention processing for robots performing visual guidance. This robot attention processing is based on a novel vision processor, the multi-window vision system that was developed at the University of Tokyo. The multi-window vision system is unique in that it only processes visual information inside local area windows. These local area windows are quite flexible in their ability to move anywhere on the visual screen, change their size and shape, and alter their pixel sampling rate. By using these windows for specific attention tasks, it is possible to perform high speed attention processing. The primary attention skills of detecting motion, tracking an object, and interpreting an image are all performed at high speed on the multi-window vision system. A basic robotic attention scheme using the attention skills was developed. The attention skills involved detection and tracking of salient visual features. The tracking and motion information thus obtained was utilized in producing the response to the visual stimulus. The response of the attention scheme was quick enough to be applicable to the real-time vision processing tasks of playing a video 'pong' game, and later using an automobile driving simulator. By detecting the motion of a 'ball' on a video screen and then tracking the movement, the attention scheme was able to control a 'paddle' in order to keep the ball in play. The response was faster than that of a human's, allowing the attention scheme to play the video game at higher speeds. Further, in the application to the driving simulator, the attention scheme was able to control both direction and velocity of a simulated vehicle following a lead car. These two applications show the potential of local visual processing in its use for robotic attention processing.
NASA Astrophysics Data System (ADS)
White, Robin T.; Wu, Alex; Najm, Marina; Orfino, Francesco P.; Dutta, Monica; Kjeang, Erik
2017-05-01
A four-dimensional visualization approach, featuring three dimensions in space and one dimension in time, is proposed to study local electrode degradation effects during voltage cycling in fuel cells. Non-invasive in situ micro X-ray computed tomography (XCT) with a custom fuel cell fixture is utilized to track the same cathode catalyst layer domain throughout various degradation times from beginning-of-life (BOL) to end-of-life (EOL). With this unique approach, new information regarding damage features and trends are revealed, including crack propagation and catalyst layer thinning being quantified by means of image processing and analysis methods. Degradation heterogeneities as a result of local environmental variations under land and channel are also explored, with a higher structural degradation rate under channels being observed. Density and compositional changes resulting from carbon corrosion and catalyst layer collapse and thinning are observed by changes in relative X-ray attenuation from BOL to EOL, which also indicate possible vulnerable regions where crack initiation and propagation may occur. Electrochemical diagnostics and morphological features observed by micro-XCT are correlated by additionally collecting effective catalyst surface area, double layer capacitance, and polarization curves prior to imaging at various stages of degradation.
ERIC Educational Resources Information Center
Rohs, C. Renee
2007-01-01
Numerous connections between the visual arts and sciences are evident if we choose to look for them. In February 2006, students and faculty from the Art and Geol/Geog departments at NW Missouri State University put together an exhibit at a local art gallery featuring works that were born out of science, inspired by science, or exploring the…
God: Do I Have Your Attention?
ERIC Educational Resources Information Center
Colzato, Lorenza S.; van Beest, Ilja; van den Wildenberg, Wery P. M.; Scorolli, Claudia; Dorchin, Shirley; Meiran, Nachshon; Borghi, Anna M.; Hommel, Bernhard
2010-01-01
Religion is commonly defined as a set of rules, developed as part of a culture. Here we provide evidence that practice in following these rules systematically changes the way people attend to visual stimuli, as indicated by the individual sizes of the global precedence effect (better performance to global than to local features). We show that this…
Zmigrod, Sharon; Zmigrod, Leor; Hommel, Bernhard
2015-01-01
While recent studies have investigated how processes underlying human creativity are affected by particular visual-attentional states, we tested the impact of more stable attention-related preferences. These were assessed by means of Navon’s global-local task, in which participants respond to the global or local features of large letters constructed from smaller letters. Three standard measures were derived from this task: the sizes of the global precedence effect, the global interference effect (i.e., the impact of incongruent letters at the global level on local processing), and the local interference effect (i.e., the impact of incongruent letters at the local level on global processing). These measures were correlated with performance in a convergent-thinking creativity task (the Remote Associates Task), a divergent-thinking creativity task (the Alternate Uses Task), and a measure of fluid intelligence (Raven’s matrices). Flexibility in divergent thinking was predicted by the local interference effect while convergent thinking was predicted by intelligence only. We conclude that a stronger attentional bias to visual information about the “bigger picture” promotes cognitive flexibility in searching for multiple solutions. PMID:26579030
Matsumiya, Kazumichi
2013-10-01
Current views on face perception assume that the visual system receives only visual facial signals. However, I show that the visual perception of faces is systematically biased by adaptation to a haptically explored face. Recently, face aftereffects (FAEs; the altered perception of faces after adaptation to a face) have been demonstrated not only in visual perception but also in haptic perception; therefore, I combined the two FAEs to examine whether the visual system receives face-related signals from the haptic modality. I found that adaptation to a haptically explored facial expression on a face mask produced a visual FAE for facial expression. This cross-modal FAE was not due to explicitly imaging a face, response bias, or adaptation to local features. Furthermore, FAEs transferred from vision to haptics. These results indicate that visual face processing depends on substrates adapted by haptic faces, which suggests that face processing relies on shared representation underlying cross-modal interactions.
Vatovec, Christine
2013-01-01
Theory-based research is needed to understand how maps of environmental health risk information influence risk beliefs and protective behavior. Using theoretical concepts from multiple fields of study including visual cognition, semiotics, health behavior, and learning and memory supports a comprehensive assessment of this influence. We report results from thirteen cognitive interviews that provide theory-based insights into how visual features influenced what participants saw and the meaning of what they saw as they viewed three formats of water test results for private wells (choropleth map, dot map, and a table). The unit of perception, color, proximity to hazards, geographic distribution, and visual salience had substantial influences on what participants saw and their resulting risk beliefs. These influences are explained by theoretical factors that shape what is seen, properties of features that shape cognition (pre-attentive, symbolic, visual salience), information processing (top-down and bottom-up), and the strength of concrete compared to abstract information. Personal relevance guided top-down attention to proximal and larger hazards that shaped stronger risk beliefs. Meaning was more local for small perceptual units and global for large units. Three aspects of color were important: pre-attentive “incremental risk” meaning of sequential shading, symbolic safety meaning of stoplight colors, and visual salience that drew attention. The lack of imagery, geographic information, and color diminished interest in table information. Numeracy and prior beliefs influenced comprehension for some participants. Results guided the creation of an integrated conceptual framework for application to future studies. Ethics should guide the selection of map features that support appropriate communication goals. PMID:22715919
Severtson, Dolores J; Vatovec, Christine
2012-08-01
Theory-based research is needed to understand how maps of environmental health risk information influence risk beliefs and protective behavior. Using theoretical concepts from multiple fields of study including visual cognition, semiotics, health behavior, and learning and memory supports a comprehensive assessment of this influence. The authors report results from 13 cognitive interviews that provide theory-based insights into how visual features influenced what participants saw and the meaning of what they saw as they viewed 3 formats of water test results for private wells (choropleth map, dot map, and a table). The unit of perception, color, proximity to hazards, geographic distribution, and visual salience had substantial influences on what participants saw and their resulting risk beliefs. These influences are explained by theoretical factors that shape what is seen, properties of features that shape cognition (preattentive, symbolic, visual salience), information processing (top-down and bottom-up), and the strength of concrete compared with abstract information. Personal relevance guided top-down attention to proximal and larger hazards that shaped stronger risk beliefs. Meaning was more local for small perceptual units and global for large units. Three aspects of color were important: preattentive "incremental risk" meaning of sequential shading, symbolic safety meaning of stoplight colors, and visual salience that drew attention. The lack of imagery, geographic information, and color diminished interest in table information. Numeracy and prior beliefs influenced comprehension for some participants. Results guided the creation of an integrated conceptual framework for application to future studies. Ethics should guide the selection of map features that support appropriate communication goals.
Experience-Driven Formation of Parts-Based Representations in a Model of Layered Visual Memory
Jitsev, Jenia; von der Malsburg, Christoph
2009-01-01
Growing neuropsychological and neurophysiological evidence suggests that the visual cortex uses parts-based representations to encode, store and retrieve relevant objects. In such a scheme, objects are represented as a set of spatially distributed local features, or parts, arranged in stereotypical fashion. To encode the local appearance and to represent the relations between the constituent parts, there has to be an appropriate memory structure formed by previous experience with visual objects. Here, we propose a model how a hierarchical memory structure supporting efficient storage and rapid recall of parts-based representations can be established by an experience-driven process of self-organization. The process is based on the collaboration of slow bidirectional synaptic plasticity and homeostatic unit activity regulation, both running at the top of fast activity dynamics with winner-take-all character modulated by an oscillatory rhythm. These neural mechanisms lay down the basis for cooperation and competition between the distributed units and their synaptic connections. Choosing human face recognition as a test task, we show that, under the condition of open-ended, unsupervised incremental learning, the system is able to form memory traces for individual faces in a parts-based fashion. On a lower memory layer the synaptic structure is developed to represent local facial features and their interrelations, while the identities of different persons are captured explicitly on a higher layer. An additional property of the resulting representations is the sparseness of both the activity during the recall and the synaptic patterns comprising the memory traces. PMID:19862345
A pose estimation method for unmanned ground vehicles in GPS denied environments
NASA Astrophysics Data System (ADS)
Tamjidi, Amirhossein; Ye, Cang
2012-06-01
This paper presents a pose estimation method based on the 1-Point RANSAC EKF (Extended Kalman Filter) framework. The method fuses the depth data from a LIDAR and the visual data from a monocular camera to estimate the pose of a Unmanned Ground Vehicle (UGV) in a GPS denied environment. Its estimation framework continuy updates the vehicle's 6D pose state and temporary estimates of the extracted visual features' 3D positions. In contrast to the conventional EKF-SLAM (Simultaneous Localization And Mapping) frameworks, the proposed method discards feature estimates from the extended state vector once they are no longer observed for several steps. As a result, the extended state vector always maintains a reasonable size that is suitable for online calculation. The fusion of laser and visual data is performed both in the feature initialization part of the EKF-SLAM process and in the motion prediction stage. A RANSAC pose calculation procedure is devised to produce pose estimate for the motion model. The proposed method has been successfully tested on the Ford campus's LIDAR-Vision dataset. The results are compared with the ground truth data of the dataset and the estimation error is ~1.9% of the path length.
Distinguishing among potential mechanisms of singleton suppression.
Gaspelin, Nicholas; Luck, Steven J
2018-04-01
Previous research has revealed that people can suppress salient stimuli that might otherwise capture visual attention. The present study tests between 3 possible mechanisms of visual suppression. According to first-order feature suppression models , items are suppressed on the basis of simple feature values. According to second-order feature suppression models , items are suppressed on the basis of local discontinuities within a given feature dimension. According to global-salience suppression models , items are suppressed on the basis of their dimension-independent salience levels. The current study distinguished among these models by varying the predictability of the singleton color value. If items are suppressed by virtue of salience alone, then it should not matter whether the singleton color is predictable. However, evidence from probe processing and eye movements indicated that suppression is possible only when the color values are predictable. Moreover, the ability to suppress salient items developed gradually as participants gained experience with the feature that defined the salient distractor. These results are consistent with first-order feature suppression models, and are inconsistent with the other models of suppression. In other words, people primarily suppress salient distractors on the basis of their simple features and not on the basis of salience per se. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Kim, Bumhwi; Ban, Sang-Woo; Lee, Minho
2013-10-01
Humans can efficiently perceive arbitrary visual objects based on an incremental learning mechanism with selective attention. This paper proposes a new task specific top-down attention model to locate a target object based on its form and color representation along with a bottom-up saliency based on relativity of primitive visual features and some memory modules. In the proposed model top-down bias signals corresponding to the target form and color features are generated, which draw the preferential attention to the desired object by the proposed selective attention model in concomitance with the bottom-up saliency process. The object form and color representation and memory modules have an incremental learning mechanism together with a proper object feature representation scheme. The proposed model includes a Growing Fuzzy Topology Adaptive Resonance Theory (GFTART) network which plays two important roles in object color and form biased attention; one is to incrementally learn and memorize color and form features of various objects, and the other is to generate a top-down bias signal to localize a target object by focusing on the candidate local areas. Moreover, the GFTART network can be utilized for knowledge inference which enables the perception of new unknown objects on the basis of the object form and color features stored in the memory during training. Experimental results show that the proposed model is successful in focusing on the specified target objects, in addition to the incremental representation and memorization of various objects in natural scenes. In addition, the proposed model properly infers new unknown objects based on the form and color features of previously trained objects. Copyright © 2013 Elsevier Ltd. All rights reserved.
How is visual salience computed in the brain? Insights from behaviour, neurobiology and modelling
Veale, Richard; Hafed, Ziad M.
2017-01-01
Inherent in visual scene analysis is a bottleneck associated with the need to sequentially sample locations with foveating eye movements. The concept of a ‘saliency map’ topographically encoding stimulus conspicuity over the visual scene has proven to be an efficient predictor of eye movements. Our work reviews insights into the neurobiological implementation of visual salience computation. We start by summarizing the role that different visual brain areas play in salience computation, whether at the level of feature analysis for bottom-up salience or at the level of goal-directed priority maps for output behaviour. We then delve into how a subcortical structure, the superior colliculus (SC), participates in salience computation. The SC represents a visual saliency map via a centre-surround inhibition mechanism in the superficial layers, which feeds into priority selection mechanisms in the deeper layers, thereby affecting saccadic and microsaccadic eye movements. Lateral interactions in the local SC circuit are particularly important for controlling active populations of neurons. This, in turn, might help explain long-range effects, such as those of peripheral cues on tiny microsaccades. Finally, we show how a combination of in vitro neurophysiology and large-scale computational modelling is able to clarify how salience computation is implemented in the local circuit of the SC. This article is part of the themed issue ‘Auditory and visual scene analysis’. PMID:28044023
JBrowse: a dynamic web platform for genome visualization and analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Buels, Robert; Yao, Eric; Diesh, Colin M.
JBrowse is a fast and full-featured genome browser built with JavaScript and HTML5. It is easily embedded into websites or apps but can also be served as a standalone web page. Overall improvements to speed and scalability are accompanied by specific enhancements that support complex interactive queries on large track sets. Analysis functions can readily be added using the plugin framework; most visual aspects of tracks can also be customized, along with clicks, mouseovers, menus, and popup boxes. JBrowse can also be used to browse local annotation files offline and to generate high-resolution figures for publication. JBrowse is a maturemore » web application suitable for genome visualization and analysis.« less
D-peaks: a visual tool to display ChIP-seq peaks along the genome.
Brohée, Sylvain; Bontempi, Gianluca
2012-01-01
ChIP-sequencing is a method of choice to localize the positions of protein binding sites on DNA on a whole genomic scale. The deciphering of the sequencing data produced by this novel technique is challenging and it is achieved by their rigorous interpretation using dedicated tools and adapted visualization programs. Here, we present a bioinformatics tool (D-peaks) that adds several possibilities (including, user-friendliness, high-quality, relative position with respect to the genomic features) to the well-known visualization browsers or databases already existing. D-peaks is directly available through its web interface http://rsat.ulb.ac.be/dpeaks/ as well as a command line tool.
JBrowse: a dynamic web platform for genome visualization and analysis.
Buels, Robert; Yao, Eric; Diesh, Colin M; Hayes, Richard D; Munoz-Torres, Monica; Helt, Gregg; Goodstein, David M; Elsik, Christine G; Lewis, Suzanna E; Stein, Lincoln; Holmes, Ian H
2016-04-12
JBrowse is a fast and full-featured genome browser built with JavaScript and HTML5. It is easily embedded into websites or apps but can also be served as a standalone web page. Overall improvements to speed and scalability are accompanied by specific enhancements that support complex interactive queries on large track sets. Analysis functions can readily be added using the plugin framework; most visual aspects of tracks can also be customized, along with clicks, mouseovers, menus, and popup boxes. JBrowse can also be used to browse local annotation files offline and to generate high-resolution figures for publication. JBrowse is a mature web application suitable for genome visualization and analysis.
Recurrent V1-V2 interaction in early visual boundary processing.
Neumann, H; Sepp, W
1999-11-01
A majority of cortical areas are connected via feedforward and feedback fiber projections. In feedforward pathways we mainly observe stages of feature detection and integration. The computational role of the descending pathways at different stages of processing remains mainly unknown. Based on empirical findings we suggest that the top-down feedback pathways subserve a context-dependent gain control mechanism. We propose a new computational model for recurrent contour processing in which normalized activities of orientation selective contrast cells are fed forward to the next processing stage. There, the arrangement of input activation is matched against local patterns of contour shape. The resulting activities are subsequently fed back to the previous stage to locally enhance those initial measurements that are consistent with the top-down generated responses. In all, we suggest a computational theory for recurrent processing in the visual cortex in which the significance of local measurements is evaluated on the basis of a broader visual context that is represented in terms of contour code patterns. The model serves as a framework to link physiological with perceptual data gathered in psychophysical experiments. It handles a variety of perceptual phenomena, such as the local grouping of fragmented shape outline, texture surround and density effects, and the interpolation of illusory contours.
KinImmerse: Macromolecular VR for NMR ensembles
Block, Jeremy N; Zielinski, David J; Chen, Vincent B; Davis, Ian W; Vinson, E Claire; Brady, Rachael; Richardson, Jane S; Richardson, David C
2009-01-01
Background In molecular applications, virtual reality (VR) and immersive virtual environments have generally been used and valued for the visual and interactive experience – to enhance intuition and communicate excitement – rather than as part of the actual research process. In contrast, this work develops a software infrastructure for research use and illustrates such use on a specific case. Methods The Syzygy open-source toolkit for VR software was used to write the KinImmerse program, which translates the molecular capabilities of the kinemage graphics format into software for display and manipulation in the DiVE (Duke immersive Virtual Environment) or other VR system. KinImmerse is supported by the flexible display construction and editing features in the KiNG kinemage viewer and it implements new forms of user interaction in the DiVE. Results In addition to molecular visualizations and navigation, KinImmerse provides a set of research tools for manipulation, identification, co-centering of multiple models, free-form 3D annotation, and output of results. The molecular research test case analyzes the local neighborhood around an individual atom within an ensemble of nuclear magnetic resonance (NMR) models, enabling immersive visual comparison of the local conformation with the local NMR experimental data, including target curves for residual dipolar couplings (RDCs). Conclusion The promise of KinImmerse for production-level molecular research in the DiVE is shown by the locally co-centered RDC visualization developed there, which gave new insights now being pursued in wider data analysis. PMID:19222844
Meaux, Emilie; Vuilleumier, Patrik
2016-11-01
The ability to decode facial emotions is of primary importance for human social interactions; yet, it is still debated how we analyze faces to determine their expression. Here we compared the processing of emotional face expressions through holistic integration and/or local analysis of visual features, and determined which brain systems mediate these distinct processes. Behavioral, physiological, and brain responses to happy and angry faces were assessed by presenting congruent global configurations of expressions (e.g., happy top+happy bottom), incongruent composite configurations (e.g., angry top+happy bottom), and isolated features (e.g. happy top only). Top and bottom parts were always from the same individual. Twenty-six healthy volunteers were scanned using fMRI while they classified the expression in either the top or the bottom face part but ignored information in the other non-target part. Results indicate that the recognition of happy and anger expressions is neither strictly holistic nor analytic Both routes were involved, but with a different role for analytic and holistic information depending on the emotion type, and different weights of local features between happy and anger expressions. Dissociable neural pathways were engaged depending on emotional face configurations. In particular, regions within the face processing network differed in their sensitivity to holistic expression information, which predominantly activated fusiform, inferior occipital areas and amygdala when internal features were congruent (i.e. template matching), whereas more local analysis of independent features preferentially engaged STS and prefrontal areas (IFG/OFC) in the context of full face configurations, but early visual areas and pulvinar when seen in isolated parts. Collectively, these findings suggest that facial emotion recognition recruits separate, but interactive dorsal and ventral routes within the face processing networks, whose engagement may be shaped by reciprocal interactions and modulated by task demands. Copyright © 2016 Elsevier Inc. All rights reserved.
Object attributes combine additively in visual search.
Pramod, R T; Arun, S P
2016-01-01
We perceive objects as containing a variety of attributes: local features, relations between features, internal details, and global properties. But we know little about how they combine. Here, we report a remarkably simple additive rule that governs how these diverse object attributes combine in vision. The perceived dissimilarity between two objects was accurately explained as a sum of (a) spatially tuned local contour-matching processes modulated by part decomposition; (b) differences in internal details, such as texture; (c) differences in emergent attributes, such as symmetry; and (d) differences in global properties, such as orientation or overall configuration of parts. Our results elucidate an enduring question in object vision by showing that the whole object is not a sum of its parts but a sum of its many attributes.
Texture analysis based on the Hermite transform for image classification and segmentation
NASA Astrophysics Data System (ADS)
Estudillo-Romero, Alfonso; Escalante-Ramirez, Boris; Savage-Carmona, Jesus
2012-06-01
Texture analysis has become an important task in image processing because it is used as a preprocessing stage in different research areas including medical image analysis, industrial inspection, segmentation of remote sensed imaginary, multimedia indexing and retrieval. In order to extract visual texture features a texture image analysis technique is presented based on the Hermite transform. Psychovisual evidence suggests that the Gaussian derivatives fit the receptive field profiles of mammalian visual systems. The Hermite transform describes locally basic texture features in terms of Gaussian derivatives. Multiresolution combined with several analysis orders provides detection of patterns that characterizes every texture class. The analysis of the local maximum energy direction and steering of the transformation coefficients increase the method robustness against the texture orientation. This method presents an advantage over classical filter bank design because in the latter a fixed number of orientations for the analysis has to be selected. During the training stage, a subset of the Hermite analysis filters is chosen in order to improve the inter-class separability, reduce dimensionality of the feature vectors and computational cost during the classification stage. We exhaustively evaluated the correct classification rate of real randomly selected training and testing texture subsets using several kinds of common used texture features. A comparison between different distance measurements is also presented. Results of the unsupervised real texture segmentation using this approach and comparison with previous approaches showed the benefits of our proposal.
Computer-aided diagnosis of mammographic masses using geometric verification-based image retrieval
NASA Astrophysics Data System (ADS)
Li, Qingliang; Shi, Weili; Yang, Huamin; Zhang, Huimao; Li, Guoxin; Chen, Tao; Mori, Kensaku; Jiang, Zhengang
2017-03-01
Computer-Aided Diagnosis of masses in mammograms is an important indicator of breast cancer. The use of retrieval systems in breast examination is increasing gradually. In this respect, the method of exploiting the vocabulary tree framework and the inverted file in the mammographic masse retrieval have been proved high accuracy and excellent scalability. However it just considered the features in each image as a visual word and had ignored the spatial configurations of features. It greatly affect the retrieval performance. To overcome this drawback, we introduce the geometric verification method to retrieval in mammographic masses. First of all, we obtain corresponding match features based on the vocabulary tree framework and the inverted file. After that, we grasps the main point of local similarity characteristic of deformations in the local regions by constructing the circle regions of corresponding pairs. Meanwhile we segment the circle to express the geometric relationship of local matches in the area and generate the spatial encoding strictly. Finally we judge whether the matched features are correct or not, based on verifying the all spatial encoding are whether satisfied the geometric consistency. Experiments show the promising results of our approach.
De Sá Teixeira, Nuno
2016-01-01
Visual memory for the spatial location where a moving target vanishes has been found to be systematically displaced downward in the direction of gravity. Moreover, it was recently reported that the magnitude of the downward error increases steadily with increasing retention intervals imposed after object’s offset and before observers are allowed to perform the spatial localization task, in a pattern where the remembered vanishing location drifts downward as if following a falling trajectory. This outcome was taken to reflect the dynamics of a representational model of earth’s gravity. The present study aims to establish the spatial and temporal features of this downward drift by taking into account the dynamics of the motor response. The obtained results show that the memory for the last location of the target drifts downward with time, thus replicating previous results. Moreover, the time taken for completion of the behavioural localization movements seems to add to the imposed retention intervals in determining the temporal frame during which the visual memory is updated. Overall, it is reported that the representation of spatial location drifts downward by about 3 pixels for each two-fold increase of time until response. The outcomes are discussed in relation to a predictive internal model of gravity which outputs an on-line spatial update of remembered objects’ location. PMID:26910260
Visual Semantic Based 3D Video Retrieval System Using HDFS.
Kumar, C Ranjith; Suguna, S
2016-08-01
This paper brings out a neoteric frame of reference for visual semantic based 3d video search and retrieval applications. Newfangled 3D retrieval application spotlight on shape analysis like object matching, classification and retrieval not only sticking up entirely with video retrieval. In this ambit, we delve into 3D-CBVR (Content Based Video Retrieval) concept for the first time. For this purpose, we intent to hitch on BOVW and Mapreduce in 3D framework. Instead of conventional shape based local descriptors, we tried to coalesce shape, color and texture for feature extraction. For this purpose, we have used combination of geometric & topological features for shape and 3D co-occurrence matrix for color and texture. After thriving extraction of local descriptors, TB-PCT (Threshold Based- Predictive Clustering Tree) algorithm is used to generate visual codebook and histogram is produced. Further, matching is performed using soft weighting scheme with L 2 distance function. As a final step, retrieved results are ranked according to the Index value and acknowledged to the user as a feedback .In order to handle prodigious amount of data and Efficacious retrieval, we have incorporated HDFS in our Intellection. Using 3D video dataset, we future the performance of our proposed system which can pan out that the proposed work gives meticulous result and also reduce the time intricacy.
De Sá Teixeira, Nuno
2016-01-01
Visual memory for the spatial location where a moving target vanishes has been found to be systematically displaced downward in the direction of gravity. Moreover, it was recently reported that the magnitude of the downward error increases steadily with increasing retention intervals imposed after object's offset and before observers are allowed to perform the spatial localization task, in a pattern where the remembered vanishing location drifts downward as if following a falling trajectory. This outcome was taken to reflect the dynamics of a representational model of earth's gravity. The present study aims to establish the spatial and temporal features of this downward drift by taking into account the dynamics of the motor response. The obtained results show that the memory for the last location of the target drifts downward with time, thus replicating previous results. Moreover, the time taken for completion of the behavioural localization movements seems to add to the imposed retention intervals in determining the temporal frame during which the visual memory is updated. Overall, it is reported that the representation of spatial location drifts downward by about 3 pixels for each two-fold increase of time until response. The outcomes are discussed in relation to a predictive internal model of gravity which outputs an on-line spatial update of remembered objects' location.
Psychophysics and Neuronal Bases of Sound Localization in Humans
Ahveninen, Jyrki; Kopco, Norbert; Jääskeläinen, Iiro P.
2013-01-01
Localization of sound sources is a considerable computational challenge for the human brain. Whereas the visual system can process basic spatial information in parallel, the auditory system lacks a straightforward correspondence between external spatial locations and sensory receptive fields. Consequently, the question how different acoustic features supporting spatial hearing are represented in the central nervous system is still open. Functional neuroimaging studies in humans have provided evidence for a posterior auditory “where” pathway that encompasses non-primary auditory cortex areas, including the planum temporale (PT) and posterior superior temporal gyrus (STG), which are strongly activated by horizontal sound direction changes, distance changes, and movement. However, these areas are also activated by a wide variety of other stimulus features, posing a challenge for the interpretation that the underlying areas are purely spatial. This review discusses behavioral and neuroimaging studies on sound localization, and some of the competing models of representation of auditory space in humans. PMID:23886698
A Fine-Scale Functional Logic to Convergence from Retina to Thalamus.
Liang, Liang; Fratzl, Alex; Goldey, Glenn; Ramesh, Rohan N; Sugden, Arthur U; Morgan, Josh L; Chen, Chinfei; Andermann, Mark L
2018-05-31
Numerous well-defined classes of retinal ganglion cells innervate the thalamus to guide image-forming vision, yet the rules governing their convergence and divergence remain unknown. Using two-photon calcium imaging in awake mouse thalamus, we observed a functional arrangement of retinal ganglion cell axonal boutons in which coarse-scale retinotopic ordering gives way to fine-scale organization based on shared preferences for other visual features. Specifically, at the ∼6 μm scale, clusters of boutons from different axons often showed similar preferences for either one or multiple features, including axis and direction of motion, spatial frequency, and changes in luminance. Conversely, individual axons could "de-multiplex" information channels by participating in multiple, functionally distinct bouton clusters. Finally, ultrastructural analyses demonstrated that retinal axonal boutons in a local cluster often target the same dendritic domain. These data suggest that functionally specific convergence and divergence of retinal axons may impart diverse, robust, and often novel feature selectivity to visual thalamus. Copyright © 2018 Elsevier Inc. All rights reserved.
'Where' and 'what' in visual search.
Atkinson, J; Braddick, O J
1989-01-01
A line segment target can be detected among distractors of a different orientation by a fast 'preattentive' process. One view is that this depends on detection of a 'feature gradient', which enables subjects to locate where the target is without necessarily identifying what it is. An alternative view is that a target can be identified as distinctive in a particular 'feature map' without subjects knowing where it is in that map. Experiments are reported in which briefly exposed arrays of line segments were followed by a pattern mask, and the threshold stimulus-mask interval determined for three tasks: 'what'--subjects reported whether the target was vertical or horizontal among oblique distractors; 'coarse where'--subjects reported whether the target was in the upper or lower half of the array; 'fine where'--subjects reported whether or not the target was in a set of four particular array positions. The threshold interval was significantly lower for the 'coarse where' than for the 'what' task, indicating that, even though localization in this task depends on the target's orientation difference, this localization is possible without absolute identification of target orientation. However, for the 'fine where' task, intervals as long as or longer than those for the 'what' task were required. It appears either that different localization processes work at different levels of resolution, or that a single localization process, independent of identification, can increase its resolution at the expense of processing speed. These possibilities are discussed in terms of distinct neural representations of the visual field and fixed or variable localization processes acting upon them.
Schallmo, Michael-Paul; Grant, Andrea N; Burton, Philip C; Olman, Cheryl A
2016-08-01
Although V1 responses are driven primarily by elements within a neuron's receptive field, which subtends about 1° visual angle in parafoveal regions, previous work has shown that localized fMRI responses to visual elements reflect not only local feature encoding but also long-range pattern attributes. However, separating the response to an image feature from the response to the surrounding stimulus and studying the interactions between these two responses demands both spatial precision and signal independence, which may be challenging to attain with fMRI. The present study used 7 Tesla fMRI with 1.2-mm resolution to measure the interactions between small sinusoidal grating patches (targets) at 3° eccentricity and surrounds of various sizes and orientations to test the conditions under which localized, context-dependent fMRI responses could be predicted from either psychophysical or electrophysiological data. Targets were presented at 8%, 16%, and 32% contrast while manipulating (a) spatial extent of parallel (strongly suppressive) or orthogonal (weakly suppressive) surrounds, (b) locus of attention, (c) stimulus onset asynchrony between target and surround, and (d) blocked versus event-related design. In all experiments, the V1 fMRI signal was lower when target stimuli were flanked by parallel versus orthogonal context. Attention amplified fMRI responses to all stimuli but did not show a selective effect on central target responses or a measurable effect on orientation-dependent surround suppression. Suppression of the V1 fMRI response by parallel surrounds was stronger than predicted from psychophysics but showed a better match to previous electrophysiological reports.
NASA Astrophysics Data System (ADS)
Wan, Qianwen; Panetta, Karen; Agaian, Sos
2017-05-01
Autonomous facial recognition system is widely used in real-life applications, such as homeland border security, law enforcement identification and authentication, and video-based surveillance analysis. Issues like low image quality, non-uniform illumination as well as variations in poses and facial expressions can impair the performance of recognition systems. To address the non-uniform illumination challenge, we present a novel robust autonomous facial recognition system inspired by the human visual system based, so called, logarithmical image visualization technique. In this paper, the proposed method, for the first time, utilizes the logarithmical image visualization technique coupled with the local binary pattern to perform discriminative feature extraction for facial recognition system. The Yale database, the Yale-B database and the ATT database are used for computer simulation accuracy and efficiency testing. The extensive computer simulation demonstrates the method's efficiency, accuracy, and robustness of illumination invariance for facial recognition.
Murphy-Baum, Benjamin L; Taylor, W Rowland
2015-09-30
Much of the computational power of the retina derives from the activity of amacrine cells, a large and diverse group of GABAergic and glycinergic inhibitory interneurons. Here, we identify an ON-type orientation-selective, wide-field, polyaxonal amacrine cell (PAC) in the rabbit retina and demonstrate how its orientation selectivity arises from the structure of the dendritic arbor and the pattern of excitatory and inhibitory inputs. Excitation from ON bipolar cells and inhibition arising from the OFF pathway converge to generate a quasi-linear integration of visual signals in the receptive field center. This serves to suppress responses to high spatial frequencies, thereby improving sensitivity to larger objects and enhancing orientation selectivity. Inhibition also regulates the magnitude and time course of excitatory inputs to this PAC through serial inhibitory connections onto the presynaptic terminals of ON bipolar cells. This presynaptic inhibition is driven by graded potentials within local microcircuits, similar in extent to the size of single bipolar cell receptive fields. Additional presynaptic inhibition is generated by spiking amacrine cells on a larger spatial scale covering several hundred microns. The orientation selectivity of this PAC may be a substrate for the inhibition that mediates orientation selectivity in some types of ganglion cells. Significance statement: The retina comprises numerous excitatory and inhibitory circuits that encode specific features in the visual scene, such as orientation, contrast, or motion. Here, we identify a wide-field inhibitory neuron that responds to visual stimuli of a particular orientation, a feature selectivity that is primarily due to the elongated shape of the dendritic arbor. Integration of convergent excitatory and inhibitory inputs from the ON and OFF visual pathways suppress responses to small objects and fine textures, thus enhancing selectivity for larger objects. Feedback inhibition regulates the strength and speed of excitation on both local and wide-field spatial scales. This study demonstrates how different synaptic inputs are regulated to tune a neuron to respond to specific features in the visual scene. Copyright © 2015 the authors 0270-6474/15/3513336-15$15.00/0.
Potts, Geoffrey F; Wood, Susan M; Kothmann, Delia; Martin, Laura E
2008-10-21
Attention directs limited-capacity information processing resources to a subset of available perceptual representations. The mechanisms by which attention selects task-relevant representations for preferential processing are not fully known. Triesman and Gelade's [Triesman, A., Gelade, G., 1980. A feature integration theory of attention. Cognit. Psychol. 12, 97-136.] influential attention model posits that simple features are processed preattentively, in parallel, but that attention is required to serially conjoin multiple features into an object representation. Event-related potentials have provided evidence for this model showing parallel processing of perceptual features in the posterior Selection Negativity (SN) and serial, hierarchic processing of feature conjunctions in the Frontal Selection Positivity (FSP). Most prior studies have been done on conjunctions within one sensory modality while many real-world objects have multimodal features. It is not known if the same neural systems of posterior parallel processing of simple features and frontal serial processing of feature conjunctions seen within a sensory modality also operate on conjunctions between modalities. The current study used ERPs and simultaneously presented auditory and visual stimuli in three task conditions: Attend Auditory (auditory feature determines the target, visual features are irrelevant), Attend Visual (visual features relevant, auditory irrelevant), and Attend Conjunction (target defined by the co-occurrence of an auditory and a visual feature). In the Attend Conjunction condition when the auditory but not the visual feature was a target there was an SN over auditory cortex, when the visual but not auditory stimulus was a target there was an SN over visual cortex, and when both auditory and visual stimuli were targets (i.e. conjunction target) there were SNs over both auditory and visual cortex, indicating parallel processing of the simple features within each modality. In contrast, an FSP was present when either the visual only or both auditory and visual features were targets, but not when only the auditory stimulus was a target, indicating that the conjunction target determination was evaluated serially and hierarchically with visual information taking precedence. This indicates that the detection of a target defined by audio-visual conjunction is achieved via the same mechanism as within a single perceptual modality, through separate, parallel processing of the auditory and visual features and serial processing of the feature conjunction elements, rather than by evaluation of a fused multimodal percept.
Interactive Dose Shaping - efficient strategies for CPU-based real-time treatment planning
NASA Astrophysics Data System (ADS)
Ziegenhein, P.; Kamerling, C. P.; Oelfke, U.
2014-03-01
Conventional intensity modulated radiation therapy (IMRT) treatment planning is based on the traditional concept of iterative optimization using an objective function specified by dose volume histogram constraints for pre-segmented VOIs. This indirect approach suffers from unavoidable shortcomings: i) The control of local dose features is limited to segmented VOIs. ii) Any objective function is a mathematical measure of the plan quality, i.e., is not able to define the clinically optimal treatment plan. iii) Adapting an existing plan to changed patient anatomy as detected by IGRT procedures is difficult. To overcome these shortcomings, we introduce the method of Interactive Dose Shaping (IDS) as a new paradigm for IMRT treatment planning. IDS allows for a direct and interactive manipulation of local dose features in real-time. The key element driving the IDS process is a two-step Dose Modification and Recovery (DMR) strategy: A local dose modification is initiated by the user which translates into modified fluence patterns. This also affects existing desired dose features elsewhere which is compensated by a heuristic recovery process. The IDS paradigm was implemented together with a CPU-based ultra-fast dose calculation and a 3D GUI for dose manipulation and visualization. A local dose feature can be implemented via the DMR strategy within 1-2 seconds. By imposing a series of local dose features, equal plan qualities could be achieved compared to conventional planning for prostate and head and neck cases within 1-2 minutes. The idea of Interactive Dose Shaping for treatment planning has been introduced and first applications of this concept have been realized.
Local and Global Gestalt Laws: A Neurally Based Spectral Approach.
Favali, Marta; Citti, Giovanna; Sarti, Alessandro
2017-02-01
This letter presents a mathematical model of figure-ground articulation that takes into account both local and global gestalt laws and is compatible with the functional architecture of the primary visual cortex (V1). The local gestalt law of good continuation is described by means of suitable connectivity kernels that are derived from Lie group theory and quantitatively compared with long-range connectivity in V1. Global gestalt constraints are then introduced in terms of spectral analysis of a connectivity matrix derived from these kernels. This analysis performs grouping of local features and individuates perceptual units with the highest salience. Numerical simulations are performed, and results are obtained by applying the technique to a number of stimuli.
Urban topography for flood modeling by fusion of OpenStreetMap, SRTM and local knowledge
NASA Astrophysics Data System (ADS)
Winsemius, Hessel; Donchyts, Gennadii; Eilander, Dirk; Chen, Jorik; Leskens, Anne; Coughlan, Erin; Mawanda, Shaban; Ward, Philip; Diaz Loaiza, Andres; Luo, Tianyi; Iceland, Charles
2016-04-01
Topography data is essential for understanding and modeling of urban flood hazard. Within urban areas, much of the topography is defined by highly localized man-made features such as roads, channels, ditches, culverts and buildings. This results in the requirement that urban flood models require high resolution topography, and water conveying connections within the topography are considered. In recent years, more and more topography information is collected through LIDAR surveys however there are still many cities in the world where high resolution topography data is not available. Furthermore, information on connectivity is required for flood modelling, even when LIDAR data are used. In this contribution, we demonstrate how high resolution terrain data can be synthesized using a fusion between features in OpenStreetMap (OSM) data (including roads, culverts, channels and buildings) and existing low resolution and noisy SRTM elevation data using the Google Earth Engine platform. Our method uses typical existing OSM properties to estimate heights and topology associated with the features, and uses these to correct noise and burn features on top of the existing low resolution SRTM elevation data. The method has been setup in the Google Earth Engine platform so that local stakeholders and mapping teams can on-the-fly propose, include and visualize the effect of additional features and properties of features, which are deemed important for topography and water conveyance. These features can be included in a workshop environment. We pilot our tool over Dar Es Salaam.
Avian visual behavior and the organization of the telencephalon.
Shimizu, Toru; Patton, Tadd B; Husband, Scott A
2010-01-01
Birds have excellent visual abilities that are comparable or superior to those of primates, but how the bird brain solves complex visual problems is poorly understood. More specifically, we lack knowledge about how such superb abilities are used in nature and how the brain, especially the telencephalon, is organized to process visual information. Here we review the results of several studies that examine the organization of the avian telencephalon and the relevance of visual abilities to avian social and reproductive behavior. Video playback and photographic stimuli show that birds can detect and evaluate subtle differences in local facial features of potential mates in a fashion similar to that of primates. These techniques have also revealed that birds do not attend well to global configural changes in the face, suggesting a fundamental difference between birds and primates in face perception. The telencephalon plays a major role in the visual and visuo-cognitive abilities of birds and primates, and anatomical data suggest that these animals may share similar organizational characteristics in the visual telencephalon. As is true in the primate cerebral cortex, different visual features are processed separately in the avian telencephalon where separate channels are organized in the anterior-posterior axis roughly parallel to the major laminae. Furthermore, the efferent projections from the primary visual telencephalon form an extensive column-like continuum involving the dorsolateral pallium and the lateral basal ganglia. Such a column-like organization may exist not only for vision, but for other sensory modalities and even for a continuum that links sensory and limbic areas of the avian brain. Behavioral and neural studies must be integrated in order to understand how birds have developed their amazing visual systems through 150 million years of evolution. 2010 S. Karger AG, Basel.
Avian Visual Behavior and the Organization of the Telencephalon
Shimizu, Toru; Patton, Tadd B.; Husband, Scott A.
2010-01-01
Birds have excellent visual abilities that are comparable or superior to those of primates, but how the bird brain solves complex visual problems is poorly understood. More specifically, we lack knowledge about how such superb abilities are used in nature and how the brain, especially the telencephalon, is organized to process visual information. Here we review the results of several studies that examine the organization of the avian telencephalon and the relevance of visual abilities to avian social and reproductive behavior. Video playback and photographic stimuli show that birds can detect and evaluate subtle differences in local facial features of potential mates in a fashion similar to that of primates. These techniques have also revealed that birds do not attend well to global configural changes in the face, suggesting a fundamental difference between birds and primates in face perception. The telencephalon plays a major role in the visual and visuo-cognitive abilities of birds and primates, and anatomical data suggest that these animals may share similar organizational characteristics in the visual telencephalon. As is true in the primate cerebral cortex, different visual features are processed separately in the avian telencephalon where separate channels are organized in the anterior-posterior axis roughly parallel to the major laminae. Furthermore, the efferent projections from the primary visual telencephalon form an extensive column-like continuum involving the dorsolateral pallium and the lateral basal ganglia. Such a column-like organization may exist not only for vision, but for other sensory modalities and even for a continuum that links sensory and limbic areas of the avian brain. Behavioral and neural studies must be integrated in order to understand how birds have developed their amazing visual systems through 150 million years of evolution. PMID:20733296
High resolution satellite image indexing and retrieval using SURF features and bag of visual words
NASA Astrophysics Data System (ADS)
Bouteldja, Samia; Kourgli, Assia
2017-03-01
In this paper, we evaluate the performance of SURF descriptor for high resolution satellite imagery (HRSI) retrieval through a BoVW model on a land-use/land-cover (LULC) dataset. Local feature approaches such as SIFT and SURF descriptors can deal with a large variation of scale, rotation and illumination of the images, providing, therefore, a better discriminative power and retrieval efficiency than global features, especially for HRSI which contain a great range of objects and spatial patterns. Moreover, we combine SURF and color features to improve the retrieval accuracy, and we propose to learn a category-specific dictionary for each image category which results in a more discriminative image representation and boosts the image retrieval performance.
A study of perceptual analysis in a high-level autistic subject with exceptional graphic abilities.
Mottron, L; Belleville, S
1993-11-01
We report here the case study of a patient (E.C.) with an Asperger syndrome, or autism with quasinormal intelligence, who shows an outstanding ability for three-dimensional drawing of inanimate objects (savant syndrome). An assessment of the subsystems proposed in recent models of object recognition evidenced intact perceptual analysis and identification. The initial (or primal sketch), viewer-centered (or 2-1/2-D), or object-centered (3-D) representations and the recognition and name levels were functional. In contrast, E.C.'s pattern of performance in three different types of tasks converge to suggest an anomaly in the hierarchical organization of the local and global parts of a figure: a local interference effect in incongruent hierarchical visual stimuli, a deficit in relating local parts to global form information in impossible figures, and an absence of feature-grouping in graphic recall. The results are discussed in relation to normal visual perception and to current accounts of the savant syndrome in autism.
Multimap formation in visual cortex
Jain, Rishabh; Millin, Rachel; Mel, Bartlett W.
2015-01-01
An extrastriate visual area such as V2 or V4 contains neurons selective for a multitude of complex shapes, all sharing a common topographic organization. Simultaneously developing multiple interdigitated maps—hereafter a “multimap”—is challenging in that neurons must compete to generate a diversity of response types locally, while cooperating with their dispersed same-type neighbors to achieve uniform visual field coverage for their response type at all orientations, scales, etc. Previously proposed map development schemes have relied on smooth spatial interaction functions to establish both topography and columnar organization, but by locally homogenizing cells' response properties, local smoothing mechanisms effectively rule out multimap formation. We found in computer simulations that the key requirements for multimap development are that neurons are enabled for plasticity only within highly active regions of cortex designated “learning eligibility regions” (LERs), but within an LER, each cell's learning rate is determined only by its activity level with no dependence on location. We show that a hybrid developmental rule that combines spatial and activity-dependent learning criteria in this way successfully produces multimaps when the input stream contains multiple distinct feature types, or in the degenerate case of a single feature type, produces a V1-like map with “salt-and-pepper” structure. Our results support the hypothesis that cortical maps containing a fine mixture of different response types, whether in monkey extrastriate cortex, mouse V1 or elsewhere in the cortex, rather than signaling a breakdown of map formation mechanisms at the fine scale, are a product of a generic cortical developmental scheme designed to map cells with a diversity of response properties across a shared topographic space. PMID:26641946
Recognizing Materials using Perceptually Inspired Features
Sharan, Lavanya; Liu, Ce; Rosenholtz, Ruth; Adelson, Edward H.
2013-01-01
Our world consists not only of objects and scenes but also of materials of various kinds. Being able to recognize the materials that surround us (e.g., plastic, glass, concrete) is important for humans as well as for computer vision systems. Unfortunately, materials have received little attention in the visual recognition literature, and very few computer vision systems have been designed specifically to recognize materials. In this paper, we present a system for recognizing material categories from single images. We propose a set of low and mid-level image features that are based on studies of human material recognition, and we combine these features using an SVM classifier. Our system outperforms a state-of-the-art system [Varma and Zisserman, 2009] on a challenging database of real-world material categories [Sharan et al., 2009]. When the performance of our system is compared directly to that of human observers, humans outperform our system quite easily. However, when we account for the local nature of our image features and the surface properties they measure (e.g., color, texture, local shape), our system rivals human performance. We suggest that future progress in material recognition will come from: (1) a deeper understanding of the role of non-local surface properties (e.g., extended highlights, object identity); and (2) efforts to model such non-local surface properties in images. PMID:23914070
JBrowse: A dynamic web platform for genome visualization and analysis
Buels, Robert; Yao, Eric; Diesh, Colin M.; ...
2016-04-12
Background: JBrowse is a fast and full-featured genome browser built with JavaScript and HTML5. It is easily embedded into websites or apps but can also be served as a standalone web page. Results: Overall improvements to speed and scalability are accompanied by specific enhancements that support complex interactive queries on large track sets. Analysis functions can readily be added using the plugin framework; most visual aspects of tracks can also be customized, along with clicks, mouseovers, menus, and popup boxes. JBrowse can also be used to browse local annotation files offline and to generate high-resolution figures for publication. Conclusions: JBrowsemore » is a mature web application suitable for genome visualization and analysis.« less
NASA Astrophysics Data System (ADS)
Kümmel, Stephan
Being able to visualize the dynamics of electrons in organic materials is a fascinating perspective. Simulations based on time-dependent density functional theory allow to realize this hope, as they visualize the flow of charge through molecular structures in real-space and real-time. We here present results on two fundamental processes: Photoemission from organic semiconductor molecules and charge transport through molecular structures. In the first part we demonstrate that angular resolved photoemission intensities - from both theory and experiment - can often be interpreted as a visualization of molecular orbitals. However, counter-intuitive quantum-mechanical electron dynamics such as emission perpendicular to the direction of the electrical field can substantially alter the picture, adding surprising features to the molecular orbital interpretation. In a second study we calculate the flow of charge through conjugated molecules. The calculations show in real time how breaks in the conjugation can lead to a local buildup of charge and the formation of local electrical dipoles. These can interact with neighboring molecular chains. As a consequence, collections of ''molecular electrical wires'' can show distinctly different characteristics than ''classical electrical wires''. German Science Foundation GRK 1640.
NASA Astrophysics Data System (ADS)
Pietrzyk, Mariusz W.; Donovan, Tim; Brennan, Patrick C.; Dix, Alan; Manning, David J.
2011-03-01
Aim: To optimize automated classification of radiological errors during lung nodule detection from chest radiographs (CxR) using a support vector machine (SVM) run on the spatial frequency features extracted from the local background of selected regions. Background: The majority of the unreported pulmonary nodules are visually detected but not recognized; shown by the prolonged dwell time values at false-negative regions. Similarly, overestimated nodule locations are capturing substantial amounts of foveal attention. Spatial frequency properties of selected local backgrounds are correlated with human observer responses either in terms of accuracy in indicating abnormality position or in the precision of visual sampling the medical images. Methods: Seven radiologists participated in the eye tracking experiments conducted under conditions of pulmonary nodule detection from a set of 20 postero-anterior CxR. The most dwelled locations have been identified and subjected to spatial frequency (SF) analysis. The image-based features of selected ROI were extracted with un-decimated Wavelet Packet Transform. An analysis of variance was run to select SF features and a SVM schema was implemented to classify False-Negative and False-Positive from all ROI. Results: A relative high overall accuracy was obtained for each individually developed Wavelet-SVM algorithm, with over 90% average correct ratio for errors recognition from all prolonged dwell locations. Conclusion: The preliminary results show that combined eye-tracking and image-based features can be used for automated detection of radiological error with SVM. The work is still in progress and not all analytical procedures have been completed, which might have an effect on the specificity of the algorithm.
NASA Astrophysics Data System (ADS)
Bijl, Piet; Toet, Alexander; Kooi, Frank L.
2016-10-01
Visual images of a civilian target ship on a sea background were produced using a CAD model. The total set consisted of 264 images and included 3 different color schemes, 2 ship viewing aspects, 5 sun illumination conditions, 2 sea reflection values, 2 ship positions with respect to the horizon and 3 values of atmospheric contrast reduction. In a perception experiment, the images were presented on a display in a long darkened corridor. Observers were asked to indicate the range at which they were able to detect the ship and classify the following 5 ship elements: accommodation, funnel, hull, mast, and hat above the bridge. This resulted in a total of 1584 Target Acquisition (TA) range estimates for two observers. Next, the ship contour, ship elements and corresponding TA ranges were analyzed applying several feature size and contrast measures. Most data coincide on a contrast versus angular size plot using (1) the long axis as characteristic ship/ship feature size and (2) local Weber contrast as characteristic ship/ship feature contrast. Finally, the data were compared with a variety of visual performance functions assumed to be representative for Target Acquisition: the TOD (Triangle Orientation Discrimination), MRC (Minimum Resolvable Contrast), CTF (Contrast Threshold Function), TTP (Targeting Task Performance) metric and circular disc detection data for the unaided eye (Blackwell). The results provide strong evidence for the TOD case: both position and slope of the TOD curve match the ship detection and classification data without any free parameter. In contrast, the MRC and CTF are too steep, the TTP and disc detection curves are too shallow and all these curves need an overall scaling factor in order to coincide with the ship and ship feature recognition data.
Object attributes combine additively in visual search
Pramod, R. T.; Arun, S. P.
2016-01-01
We perceive objects as containing a variety of attributes: local features, relations between features, internal details, and global properties. But we know little about how they combine. Here, we report a remarkably simple additive rule that governs how these diverse object attributes combine in vision. The perceived dissimilarity between two objects was accurately explained as a sum of (a) spatially tuned local contour-matching processes modulated by part decomposition; (b) differences in internal details, such as texture; (c) differences in emergent attributes, such as symmetry; and (d) differences in global properties, such as orientation or overall configuration of parts. Our results elucidate an enduring question in object vision by showing that the whole object is not a sum of its parts but a sum of its many attributes. PMID:26967014
Multimodality imaging of the orbit
Hande, Pradipta C; Talwar, Inder
2012-01-01
The role of imaging is well established in the evaluation of orbital diseases. Ultrasonography, Computed tomography and Magnetic resonance imaging are complementary modalities, which allow direct visualization of regional anatomy, accurate localization and help to characterize lesions to make a reliable radiological diagnosis. The purpose of this pictorial essay is to highlight the imaging features of commonly encountered pathologies which involve the orbit. PMID:23599570
Improving Large-Scale Image Retrieval Through Robust Aggregation of Local Descriptors.
Husain, Syed Sameed; Bober, Miroslaw
2017-09-01
Visual search and image retrieval underpin numerous applications, however the task is still challenging predominantly due to the variability of object appearance and ever increasing size of the databases, often exceeding billions of images. Prior art methods rely on aggregation of local scale-invariant descriptors, such as SIFT, via mechanisms including Bag of Visual Words (BoW), Vector of Locally Aggregated Descriptors (VLAD) and Fisher Vectors (FV). However, their performance is still short of what is required. This paper presents a novel method for deriving a compact and distinctive representation of image content called Robust Visual Descriptor with Whitening (RVD-W). It significantly advances the state of the art and delivers world-class performance. In our approach local descriptors are rank-assigned to multiple clusters. Residual vectors are then computed in each cluster, normalized using a direction-preserving normalization function and aggregated based on the neighborhood rank. Importantly, the residual vectors are de-correlated and whitened in each cluster before aggregation, leading to a balanced energy distribution in each dimension and significantly improved performance. We also propose a new post-PCA normalization approach which improves separability between the matching and non-matching global descriptors. This new normalization benefits not only our RVD-W descriptor but also improves existing approaches based on FV and VLAD aggregation. Furthermore, we show that the aggregation framework developed using hand-crafted SIFT features also performs exceptionally well with Convolutional Neural Network (CNN) based features. The RVD-W pipeline outperforms state-of-the-art global descriptors on both the Holidays and Oxford datasets. On the large scale datasets, Holidays1M and Oxford1M, SIFT-based RVD-W representation obtains a mAP of 45.1 and 35.1 percent, while CNN-based RVD-W achieve a mAP of 63.5 and 44.8 percent, all yielding superior performance to the state-of-the-art.
The forest, the trees, and the leaves: Differences of processing across development.
Krakowski, Claire-Sara; Poirel, Nicolas; Vidal, Julie; Roëll, Margot; Pineau, Arlette; Borst, Grégoire; Houdé, Olivier
2016-08-01
To act and think, children and adults are continually required to ignore irrelevant visual information to focus on task-relevant items. As real-world visual information is organized into structures, we designed a feature visual search task containing 3-level hierarchical stimuli (i.e., local shapes that constituted intermediate shapes that formed the global figure) that was presented to 112 participants aged 5, 6, 9, and 21 years old. This task allowed us to explore (a) which level is perceptively the most salient at each age (i.e., the fastest detected level) and (b) what kind of attentional processing occurs for each level across development (i.e., efficient processing: detection time does not increase with the number of stimuli on the display; less efficient processing: detection time increases linearly with the growing number of distractors). Results showed that the global level was the most salient at 5 years of age, whereas the global and intermediate levels were both salient for 9-year-olds and adults. Interestingly, at 6 years of age, the intermediate level was the most salient level. Second, all participants showed an efficient processing of both intermediate and global levels of hierarchical stimuli, and a less efficient processing of the local level, suggesting a local disadvantage rather than a global advantage in visual search. The cognitive cost for selecting the local target was higher for 5- and 6-year-old children compared to 9-year-old children and adults. These results are discussed with regards to the development of executive control. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Ellefsen, Kyle L; Settle, Brett; Parker, Ian; Smith, Ian F
2014-09-01
Local Ca(2+) transients such as puffs and sparks form the building blocks of cellular Ca(2+) signaling in numerous cell types. They have traditionally been studied by linescan confocal microscopy, but advances in TIRF microscopy together with improved electron-multiplied CCD (EMCCD) cameras now enable rapid (>500 frames s(-1)) imaging of subcellular Ca(2+) signals with high spatial resolution in two dimensions. This approach yields vastly more information (ca. 1 Gb min(-1)) than linescan imaging, rendering visual identification and analysis of local events imaged both laborious and subject to user bias. Here we describe a routine to rapidly automate identification and analysis of local Ca(2+) events. This features an intuitive graphical user-interfaces and runs under Matlab and the open-source Python software. The underlying algorithm features spatial and temporal noise filtering to reliably detect even small events in the presence of noisy and fluctuating baselines; localizes sites of Ca(2+) release with sub-pixel resolution; facilitates user review and editing of data; and outputs time-sequences of fluorescence ratio signals for identified event sites along with Excel-compatible tables listing amplitudes and kinetics of events. Copyright © 2014 Elsevier Ltd. All rights reserved.
Internal curvature signal and noise in low- and high-level vision
Grabowecky, Marcia; Kim, Yee Joon; Suzuki, Satoru
2011-01-01
How does internal processing contribute to visual pattern perception? By modeling visual search performance, we estimated internal signal and noise relevant to perception of curvature, a basic feature important for encoding of three-dimensional surfaces and objects. We used isolated, sparse, crowded, and face contexts to determine how internal curvature signal and noise depended on image crowding, lateral feature interactions, and level of pattern processing. Observers reported the curvature of a briefly flashed segment, which was presented alone (without lateral interaction) or among multiple straight segments (with lateral interaction). Each segment was presented with no context (engaging low-to-intermediate-level curvature processing), embedded within a face context as the mouth (engaging high-level face processing), or embedded within an inverted-scrambled-face context as a control for crowding. Using a simple, biologically plausible model of curvature perception, we estimated internal curvature signal and noise as the mean and standard deviation, respectively, of the Gaussian-distributed population activity of local curvature-tuned channels that best simulated behavioral curvature responses. Internal noise was increased by crowding but not by face context (irrespective of lateral interactions), suggesting prevention of noise accumulation in high-level pattern processing. In contrast, internal curvature signal was unaffected by crowding but modulated by lateral interactions. Lateral interactions (with straight segments) increased curvature signal when no contextual elements were added, but equivalent interactions reduced curvature signal when each segment was presented within a face. These opposing effects of lateral interactions are consistent with the phenomena of local-feature contrast in low-level processing and global-feature averaging in high-level processing. PMID:21209356
Multi-scale image segmentation method with visual saliency constraints and its application
NASA Astrophysics Data System (ADS)
Chen, Yan; Yu, Jie; Sun, Kaimin
2018-03-01
Object-based image analysis method has many advantages over pixel-based methods, so it is one of the current research hotspots. It is very important to get the image objects by multi-scale image segmentation in order to carry out object-based image analysis. The current popular image segmentation methods mainly share the bottom-up segmentation principle, which is simple to realize and the object boundaries obtained are accurate. However, the macro statistical characteristics of the image areas are difficult to be taken into account, and fragmented segmentation (or over-segmentation) results are difficult to avoid. In addition, when it comes to information extraction, target recognition and other applications, image targets are not equally important, i.e., some specific targets or target groups with particular features worth more attention than the others. To avoid the problem of over-segmentation and highlight the targets of interest, this paper proposes a multi-scale image segmentation method with visually saliency graph constraints. Visual saliency theory and the typical feature extraction method are adopted to obtain the visual saliency information, especially the macroscopic information to be analyzed. The visual saliency information is used as a distribution map of homogeneity weight, where each pixel is given a weight. This weight acts as one of the merging constraints in the multi- scale image segmentation. As a result, pixels that macroscopically belong to the same object but are locally different can be more likely assigned to one same object. In addition, due to the constraint of visual saliency model, the constraint ability over local-macroscopic characteristics can be well controlled during the segmentation process based on different objects. These controls will improve the completeness of visually saliency areas in the segmentation results while diluting the controlling effect for non- saliency background areas. Experiments show that this method works better for texture image segmentation than traditional multi-scale image segmentation methods, and can enable us to give priority control to the saliency objects of interest. This method has been used in image quality evaluation, scattered residential area extraction, sparse forest extraction and other applications to verify its validation. All applications showed good results.
Non-conscious processing of motion coherence can boost conscious access.
Kaunitz, Lisandro; Fracasso, Alessio; Lingnau, Angelika; Melcher, David
2013-01-01
Research on the scope and limits of non-conscious vision can advance our understanding of the functional and neural underpinnings of visual awareness. Here we investigated whether distributed local features can be bound, outside of awareness, into coherent patterns. We used continuous flash suppression (CFS) to create interocular suppression, and thus lack of awareness, for a moving dot stimulus that varied in terms of coherence with an overall pattern (radial flow). Our results demonstrate that for radial motion, coherence favors the detection of patterns of moving dots even under interocular suppression. Coherence caused dots to break through the masks more often: this indicates that the visual system was able to integrate low-level motion signals into a coherent pattern outside of visual awareness. In contrast, in an experiment using meaningful or scrambled biological motion we did not observe any increase in the sensitivity of detection for meaningful patterns. Overall, our results are in agreement with previous studies on face processing and with the hypothesis that certain features are spatiotemporally bound into coherent patterns even outside of attention or awareness.
Levichkina, Ekaterina; Saalmann, Yuri B; Vidyasagar, Trichur R
2017-03-01
Primate posterior parietal cortex (PPC) is known to be involved in controlling spatial attention. Neurons in one part of the PPC, the lateral intraparietal area (LIP), show enhanced responses to objects at attended locations. Although many are selective for object features, such as the orientation of a visual stimulus, it is not clear how LIP circuits integrate feature-selective information when providing attentional feedback about behaviorally relevant locations to the visual cortex. We studied the relationship between object feature and spatial attention properties of LIP cells in two macaques by measuring the cells' orientation selectivity and the degree of attentional enhancement while performing a delayed match-to-sample task. Monkeys had to match both the location and orientation of two visual gratings presented separately in time. We found a wide range in orientation selectivity and degree of attentional enhancement among LIP neurons. However, cells with significant attentional enhancement had much less orientation selectivity in their response than cells which showed no significant modulation by attention. Additionally, orientation-selective cells showed working memory activity for their preferred orientation, whereas cells showing attentional enhancement also synchronized with local neuronal activity. These results are consistent with models of selective attention incorporating two stages, where an initial feature-selective process guides a second stage of focal spatial attention. We suggest that LIP contributes to both stages, where the first stage involves orientation-selective LIP cells that support working memory of the relevant feature, and the second stage involves attention-enhanced LIP cells that synchronize to provide feedback on spatial priorities. © 2017 The Authors. Physiological Reports published by Wiley Periodicals, Inc. on behalf of The Physiological Society and the American Physiological Society.
Simmering, Vanessa R; Wood, Chelsey M
2017-08-01
Working memory is a basic cognitive process that predicts higher-level skills. A central question in theories of working memory development is the generality of the mechanisms proposed to explain improvements in performance. Prior theories have been closely tied to particular tasks and/or age groups, limiting their generalizability. The cognitive dynamics theory of visual working memory development has been proposed to overcome this limitation. From this perspective, developmental improvements arise through the coordination of cognitive processes to meet demands of different behavioral tasks. This notion is described as real-time stability, and can be probed through experiments that assess how changing task demands impact children's performance. The current studies test this account by probing visual working memory for colors and shapes in a change detection task that compares detection of changes to new features versus swaps in color-shape binding. In Experiment 1, 3- to 4-year-old children showed impairments specific to binding swaps, as predicted by decreased real-time stability early in development; 5- to 6-year-old children showed a slight advantage on binding swaps, but 7- to 8-year-old children and adults showed no difference across trial types. Experiment 2 tested the proposed explanation of young children's binding impairment through added perceptual structure, which supported the stability and precision of feature localization in memory-a process key to detecting binding swaps. This additional structure improved young children's binding swap detection, but not new-feature detection or adults' performance. These results provide further evidence for the cognitive dynamics and real-time stability explanation of visual working memory development. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
NASA Astrophysics Data System (ADS)
Qi, K.; Qingfeng, G.
2017-12-01
With the popular use of High-Resolution Satellite (HRS) images, more and more research efforts have been placed on land-use scene classification. However, it makes the task difficult with HRS images for the complex background and multiple land-cover classes or objects. This article presents a multiscale deeply described correlaton model for land-use scene classification. Specifically, the convolutional neural network is introduced to learn and characterize the local features at different scales. Then, learnt multiscale deep features are explored to generate visual words. The spatial arrangement of visual words is achieved through the introduction of adaptive vector quantized correlograms at different scales. Experiments on two publicly available land-use scene datasets demonstrate that the proposed model is compact and yet discriminative for efficient representation of land-use scene images, and achieves competitive classification results with the state-of-art methods.
The method for detecting small lesions in medical image based on sliding window
NASA Astrophysics Data System (ADS)
Han, Guilai; Jiao, Yuan
2016-10-01
At present, the research on computer-aided diagnosis includes the sample image segmentation, extracting visual features, generating the classification model by learning, and according to the model generated to classify and judge the inspected images. However, this method has a large scale of calculation and speed is slow. And because medical images are usually low contrast, when the traditional image segmentation method is applied to the medical image, there is a complete failure. As soon as possible to find the region of interest, improve detection speed, this topic attempts to introduce the current popular visual attention model into small lesions detection. However, Itti model is mainly for natural images. But the effect is not ideal when it is used to medical images which usually are gray images. Especially in the early stages of some cancers, the focus of a disease in the whole image is not the most significant region and sometimes is very difficult to be found. But these lesions are prominent in the local areas. This paper proposes a visual attention mechanism based on sliding window, and use sliding window to calculate the significance of a local area. Combined with the characteristics of the lesion, select the features of gray, entropy, corner and edge to generate a saliency map. Then the significant region is segmented and distinguished. This method reduces the difficulty of image segmentation, and improves the detection accuracy of small lesions, and it has great significance to early discovery, early diagnosis and treatment of cancers.
Golestaneh, S Alireza; Karam, Lina
2016-08-24
Perceptual image quality assessment (IQA) attempts to use computational models to estimate the image quality in accordance with subjective evaluations. Reduced-reference (RR) image quality assessment (IQA) methods make use of partial information or features extracted from the reference image for estimating the quality of distorted images. Finding a balance between the number of RR features and accuracy of the estimated image quality is essential and important in IQA. In this paper we propose a training-free low-cost RRIQA method that requires a very small number of RR features (6 RR features). The proposed RRIQA algorithm is based on the discrete wavelet transform (DWT) of locally weighted gradient magnitudes.We apply human visual system's contrast sensitivity and neighborhood gradient information to weight the gradient magnitudes in a locally adaptive manner. The RR features are computed by measuring the entropy of each DWT subband, for each scale, and pooling the subband entropies along all orientations, resulting in L RR features (one average entropy per scale) for an L-level DWT. Extensive experiments performed on seven large-scale benchmark databases demonstrate that the proposed RRIQA method delivers highly competitive performance as compared to the state-of-the-art RRIQA models as well as full reference ones for both natural and texture images. The MATLAB source code of REDLOG and the evaluation results are publicly available online at https://http://lab.engineering.asu.edu/ivulab/software/redlog/.
Nagai, Takehiro; Matsushima, Toshiki; Koida, Kowa; Tani, Yusuke; Kitazaki, Michiteru; Nakauchi, Shigeki
2015-10-01
Humans can visually recognize material categories of objects, such as glass, stone, and plastic, easily. However, little is known about the kinds of surface quality features that contribute to such material class recognition. In this paper, we examine the relationship between perceptual surface features and material category discrimination performance for pictures of materials, focusing on temporal aspects, including reaction time and effects of stimulus duration. The stimuli were pictures of objects with an identical shape but made of different materials that could be categorized into seven classes (glass, plastic, metal, stone, wood, leather, and fabric). In a pre-experiment, observers rated the pictures on nine surface features, including visual (e.g., glossiness and transparency) and non-visual features (e.g., heaviness and warmness), on a 7-point scale. In the main experiments, observers judged whether two simultaneously presented pictures were classified as the same or different material category. Reaction times and effects of stimulus duration were measured. The results showed that visual feature ratings were correlated with material discrimination performance for short reaction times or short stimulus durations, while non-visual feature ratings were correlated only with performance for long reaction times or long stimulus durations. These results suggest that the mechanisms underlying visual and non-visual feature processing may differ in terms of processing time, although the cause is unclear. Visual surface features may mainly contribute to material recognition in daily life, while non-visual features may contribute only weakly, if at all. Copyright © 2014 Elsevier Ltd. All rights reserved.
Global Enhancement but Local Suppression in Feature-based Attention.
Forschack, Norman; Andersen, Søren K; Müller, Matthias M
2017-04-01
A key property of feature-based attention is global facilitation of the attended feature throughout the visual field. Previously, we presented superimposed red and blue randomly moving dot kinematograms (RDKs) flickering at a different frequency each to elicit frequency-specific steady-state visual evoked potentials (SSVEPs) that allowed us to analyze neural dynamics in early visual cortex when participants shifted attention to one of the two colors. Results showed amplification of the attended and suppression of the unattended color as measured by SSVEP amplitudes. Here, we tested whether the suppression of the unattended color also operates globally. To this end, we presented superimposed flickering red and blue RDKs in the center of a screen and a red and blue RDK in the left and right periphery, respectively, also flickering at different frequencies. Participants shifted attention to one color of the superimposed RDKs in the center to discriminate coherent motion events in the attended from the unattended color RDK, whereas the peripheral RDKs were task irrelevant. SSVEP amplitudes elicited by the centrally presented RDKs confirmed the previous findings of amplification and suppression. For peripherally located RDKs, we found the expected SSVEP amplitude increase, relative to precue baseline when color matched the one of the centrally attended RDK. We found no reduction in SSVEP amplitude relative to precue baseline, when the peripheral color matched the unattended one of the central RDK, indicating that, while facilitation in feature-based attention operates globally, suppression seems to be linked to the location of focused attention.
Tools for visually exploring biological networks.
Suderman, Matthew; Hallett, Michael
2007-10-15
Many tools exist for visually exploring biological networks including well-known examples such as Cytoscape, VisANT, Pathway Studio and Patika. These systems play a key role in the development of integrative biology, systems biology and integrative bioinformatics. The trend in the development of these tools is to go beyond 'static' representations of cellular state, towards a more dynamic model of cellular processes through the incorporation of gene expression data, subcellular localization information and time-dependent behavior. We provide a comprehensive review of the relative advantages and disadvantages of existing systems with two goals in mind: to aid researchers in efficiently identifying the appropriate existing tools for data visualization; to describe the necessary and realistic goals for the next generation of visualization tools. In view of the first goal, we provide in the Supplementary Material a systematic comparison of more than 35 existing tools in terms of over 25 different features. Supplementary data are available at Bioinformatics online.
Managing Spatial Selections With Contextual Snapshots
Mindek, P; Gröller, M E; Bruckner, S
2014-01-01
Spatial selections are a ubiquitous concept in visualization. By localizing particular features, they can be analysed and compared in different views. However, the semantics of such selections often depend on specific parameter settings and it can be difficult to reconstruct them without additional information. In this paper, we present the concept of contextual snapshots as an effective means for managing spatial selections in visualized data. The selections are automatically associated with the context in which they have been created. Contextual snapshots can also be used as the basis for interactive integrated and linked views, which enable in-place investigation and comparison of multiple visual representations of data. Our approach is implemented as a flexible toolkit with well-defined interfaces for integration into existing systems. We demonstrate the power and generality of our techniques by applying them to several distinct scenarios such as the visualization of simulation data, the analysis of historical documents and the display of anatomical data. PMID:25821284
Storage of features, conjunctions and objects in visual working memory.
Vogel, E K; Woodman, G F; Luck, S J
2001-02-01
Working memory can be divided into separate subsystems for verbal and visual information. Although the verbal system has been well characterized, the storage capacity of visual working memory has not yet been established for simple features or for conjunctions of features. The authors demonstrate that it is possible to retain information about only 3-4 colors or orientations in visual working memory at one time. Observers are also able to retain both the color and the orientation of 3-4 objects, indicating that visual working memory stores integrated objects rather than individual features. Indeed, objects defined by a conjunction of four features can be retained in working memory just as well as single-feature objects, allowing many individual features to be retained when distributed across a small number of objects. Thus, the capacity of visual working memory must be understood in terms of integrated objects rather than individual features.
Acquiring Semantically Meaningful Models for Robotic Localization, Mapping and Target Recognition
2014-12-21
information, including suggesstions for reducing this burden, to Washington Headquarters Services , Directorate for Information Operations and Reports, 1215...Representations • Point features tracking • Recovery of relative motion, visual odometry • Loop closure • Environment models, sparse clouds of points...that co- occur with the object of interest Chair-Background Table-Background Object Level Segmentation Jaccard Index Silber .[5] 15.12 RenFox[4
Linder, Nina; Turkki, Riku; Walliander, Margarita; Mårtensson, Andreas; Diwan, Vinod; Rahtu, Esa; Pietikäinen, Matti; Lundin, Mikael; Lundin, Johan
2014-01-01
Microscopy is the gold standard for diagnosis of malaria, however, manual evaluation of blood films is highly dependent on skilled personnel in a time-consuming, error-prone and repetitive process. In this study we propose a method using computer vision detection and visualization of only the diagnostically most relevant sample regions in digitized blood smears. Giemsa-stained thin blood films with P. falciparum ring-stage trophozoites (n = 27) and uninfected controls (n = 20) were digitally scanned with an oil immersion objective (0.1 µm/pixel) to capture approximately 50,000 erythrocytes per sample. Parasite candidate regions were identified based on color and object size, followed by extraction of image features (local binary patterns, local contrast and Scale-invariant feature transform descriptors) used as input to a support vector machine classifier. The classifier was trained on digital slides from ten patients and validated on six samples. The diagnostic accuracy was tested on 31 samples (19 infected and 12 controls). From each digitized area of a blood smear, a panel with the 128 most probable parasite candidate regions was generated. Two expert microscopists were asked to visually inspect the panel on a tablet computer and to judge whether the patient was infected with P. falciparum. The method achieved a diagnostic sensitivity and specificity of 95% and 100% as well as 90% and 100% for the two readers respectively using the diagnostic tool. Parasitemia was separately calculated by the automated system and the correlation coefficient between manual and automated parasitemia counts was 0.97. We developed a decision support system for detecting malaria parasites using a computer vision algorithm combined with visualization of sample areas with the highest probability of malaria infection. The system provides a novel method for blood smear screening with a significantly reduced need for visual examination and has a potential to increase the throughput in malaria diagnostics.
Handfield, Louis-François; Chong, Yolanda T.; Simmons, Jibril; Andrews, Brenda J.; Moses, Alan M.
2013-01-01
Protein subcellular localization has been systematically characterized in budding yeast using fluorescently tagged proteins. Based on the fluorescence microscopy images, subcellular localization of many proteins can be classified automatically using supervised machine learning approaches that have been trained to recognize predefined image classes based on statistical features. Here, we present an unsupervised analysis of protein expression patterns in a set of high-resolution, high-throughput microscope images. Our analysis is based on 7 biologically interpretable features which are evaluated on automatically identified cells, and whose cell-stage dependency is captured by a continuous model for cell growth. We show that it is possible to identify most previously identified localization patterns in a cluster analysis based on these features and that similarities between the inferred expression patterns contain more information about protein function than can be explained by a previous manual categorization of subcellular localization. Furthermore, the inferred cell-stage associated to each fluorescence measurement allows us to visualize large groups of proteins entering the bud at specific stages of bud growth. These correspond to proteins localized to organelles, revealing that the organelles must be entering the bud in a stereotypical order. We also identify and organize a smaller group of proteins that show subtle differences in the way they move around the bud during growth. Our results suggest that biologically interpretable features based on explicit models of cell morphology will yield unprecedented power for pattern discovery in high-resolution, high-throughput microscopy images. PMID:23785265
Perceptual Learning Induces Persistent Attentional Capture by Nonsalient Shapes.
Qu, Zhe; Hillyard, Steven A; Ding, Yulong
2017-02-01
Visual attention can be attracted automatically by salient simple features, but whether and how nonsalient complex stimuli such as shapes may capture attention in humans remains unclear. Here, we present strong electrophysiological evidence that a nonsalient shape presented among similar shapes can provoke a robust and persistent capture of attention as a consequence of extensive training in visual search (VS) for that shape. Strikingly, this attentional capture that followed perceptual learning (PL) was evident even when the trained shape was task-irrelevant, was presented outside the focus of top-down spatial attention, and was undetected by the observer. Moreover, this attentional capture persisted for at least 3-5 months after training had been terminated. This involuntary capture of attention was indexed by electrophysiological recordings of the N2pc component of the event-related brain potential, which was localized to ventral extrastriate visual cortex, and was highly predictive of stimulus-specific improvement in VS ability following PL. These findings provide the first evidence that nonsalient shapes can capture visual attention automatically following PL and challenge the prominent view that detection of feature conjunctions requires top-down focal attention. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Lymphoma diagnosis in histopathology using a multi-stage visual learning approach
NASA Astrophysics Data System (ADS)
Codella, Noel; Moradi, Mehdi; Matasar, Matt; Sveda-Mahmood, Tanveer; Smith, John R.
2016-03-01
This work evaluates the performance of a multi-stage image enhancement, segmentation, and classification approach for lymphoma recognition in hematoxylin and eosin (H and E) stained histopathology slides of excised human lymph node tissue. In the first stage, the original histology slide undergoes various image enhancement and segmentation operations, creating an additional 5 images for every slide. These new images emphasize unique aspects of the original slide, including dominant staining, staining segmentations, non-cellular groupings, and cellular groupings. For the resulting 6 total images, a collection of visual features are extracted from 3 different spatial configurations. Visual features include the first fully connected layer (4096 dimensions) of the Caffe convolutional neural network trained from ImageNet data. In total, over 200 resultant visual descriptors are extracted for each slide. Non-linear SVMs are trained over each of the over 200 descriptors, which are then input to a forward stepwise ensemble selection that optimizes a late fusion sum of logistically normalized model outputs using local hill climbing. The approach is evaluated on a public NIH dataset containing 374 images representing 3 lymphoma conditions: chronic lymphocytic leukemia (CLL), follicular lymphoma (FL), and mantle cell lymphoma (MCL). Results demonstrate a 38.4% reduction in residual error over the current state-of-art on this dataset.
Topological visual mapping in robotics.
Romero, Anna; Cazorla, Miguel
2012-08-01
A key problem in robotics is the construction of a map from its environment. This map could be used in different tasks, like localization, recognition, obstacle avoidance, etc. Besides, the simultaneous location and mapping (SLAM) problem has had a lot of interest in the robotics community. This paper presents a new method for visual mapping, using topological instead of metric information. For that purpose, we propose prior image segmentation into regions in order to group the extracted invariant features in a graph so that each graph defines a single region of the image. Although others methods have been proposed for visual SLAM, our method is complete, in the sense that it makes all the process: it presents a new method for image matching; it defines a way to build the topological map; and it also defines a matching criterion for loop-closing. The matching process will take into account visual features and their structure using the graph transformation matching (GTM) algorithm, which allows us to process the matching and to remove out the outliers. Then, using this image comparison method, we propose an algorithm for constructing topological maps. During the experimentation phase, we will test the robustness of the method and its ability constructing topological maps. We have also introduced new hysteresis behavior in order to solve some problems found building the graph.
Image Feature Types and Their Predictions of Aesthetic Preference and Naturalness
Ibarra, Frank F.; Kardan, Omid; Hunter, MaryCarol R.; Kotabe, Hiroki P.; Meyer, Francisco A. C.; Berman, Marc G.
2017-01-01
Previous research has investigated ways to quantify visual information of a scene in terms of a visual processing hierarchy, i.e., making sense of visual environment by segmentation and integration of elementary sensory input. Guided by this research, studies have developed categories for low-level visual features (e.g., edges, colors), high-level visual features (scene-level entities that convey semantic information such as objects), and how models of those features predict aesthetic preference and naturalness. For example, in Kardan et al. (2015a), 52 participants provided aesthetic preference and naturalness ratings, which are used in the current study, for 307 images of mixed natural and urban content. Kardan et al. (2015a) then developed a model using low-level features to predict aesthetic preference and naturalness and could do so with high accuracy. What has yet to be explored is the ability of higher-level visual features (e.g., horizon line position relative to viewer, geometry of building distribution relative to visual access) to predict aesthetic preference and naturalness of scenes, and whether higher-level features mediate some of the association between the low-level features and aesthetic preference or naturalness. In this study we investigated these relationships and found that low- and high- level features explain 68.4% of the variance in aesthetic preference ratings and 88.7% of the variance in naturalness ratings. Additionally, several high-level features mediated the relationship between the low-level visual features and aaesthetic preference. In a multiple mediation analysis, the high-level feature mediators accounted for over 50% of the variance in predicting aesthetic preference. These results show that high-level visual features play a prominent role predicting aesthetic preference, but do not completely eliminate the predictive power of the low-level visual features. These strong predictors provide powerful insights for future research relating to landscape and urban design with the aim of maximizing subjective well-being, which could lead to improved health outcomes on a larger scale. PMID:28503158
Wood texture classification by fuzzy neural networks
NASA Astrophysics Data System (ADS)
Gonzaga, Adilson; de Franca, Celso A.; Frere, Annie F.
1999-03-01
The majority of scientific papers focusing on wood classification for pencil manufacturing take into account defects and visual appearance. Traditional methodologies are base don texture analysis by co-occurrence matrix, by image modeling, or by tonal measures over the plate surface. In this work, we propose to classify plates of wood without biological defects like insect holes, nodes, and cracks, by analyzing their texture. By this methodology we divide the plate image in several rectangular windows or local areas and reduce the number of gray levels. From each local area, we compute the histogram of difference sand extract texture features, given them as input to a Local Neuro-Fuzzy Network. Those features are from the histogram of differences instead of the image pixels due to their better performance and illumination independence. Among several features like media, contrast, second moment, entropy, and IDN, the last three ones have showed better results for network training. Each LNN output is taken as input to a Partial Neuro-Fuzzy Network (PNFN) classifying a pencil region on the plate. At last, the outputs from the PNFN are taken as input to a Global Fuzzy Logic doing the plate classification. Each pencil classification within the plate is done taking into account each quality index.
NASA Astrophysics Data System (ADS)
Ha, Sieu D.; Qi, Yabing; Kahn, Antoine
2010-08-01
Temperature-dependent I- V measurements determine that pentacene is effectively p-doped by tetrafluoro-tetracyanoquinodimethane (F 4-TCNQ). It has been shown by scanning tunneling microscopy (STM) that the donated hole is localized by the ionized dopant counter potential, and that the hole can be visualized [4]. Here, it is argued that the effect of the localized hole on STM images should depend on distance as 1/ ɛr, as per the Coulomb potential. By fitting line profiles of localized hole features to the Coulomb potential, it is shown that approximate values for the relative permittivity and Hubbard U of pentacene can be extracted.
Research on metallic material defect detection based on bionic sensing of human visual properties
NASA Astrophysics Data System (ADS)
Zhang, Pei Jiang; Cheng, Tao
2018-05-01
Due to the fact that human visual system can quickly lock the areas of interest in complex natural environment and focus on it, this paper proposes an eye-based visual attention mechanism by simulating human visual imaging features based on human visual attention mechanism Bionic Sensing Visual Inspection Model Method to Detect Defects of Metallic Materials in the Mechanical Field. First of all, according to the biologically visually significant low-level features, the mark of defect experience marking is used as the intermediate feature of simulated visual perception. Afterwards, SVM method was used to train the advanced features of visual defects of metal material. According to the weight of each party, the biometrics detection model of metal material defect, which simulates human visual characteristics, is obtained.
Observers' cognitive states modulate how visual inputs relate to gaze control.
Kardan, Omid; Henderson, John M; Yourganov, Grigori; Berman, Marc G
2016-09-01
Previous research has shown that eye-movements change depending on both the visual features of our environment, and the viewer's top-down knowledge. One important question that is unclear is the degree to which the visual goals of the viewer modulate how visual features of scenes guide eye-movements. Here, we propose a systematic framework to investigate this question. In our study, participants performed 3 different visual tasks on 135 scenes: search, memorization, and aesthetic judgment, while their eye-movements were tracked. Canonical correlation analyses showed that eye-movements were reliably more related to low-level visual features at fixations during the visual search task compared to the aesthetic judgment and scene memorization tasks. Different visual features also had different relevance to eye-movements between tasks. This modulation of the relationship between visual features and eye-movements by task was also demonstrated with classification analyses, where classifiers were trained to predict the viewing task based on eye movements and visual features at fixations. Feature loadings showed that the visual features at fixations could signal task differences independent of temporal and spatial properties of eye-movements. When classifying across participants, edge density and saliency at fixations were as important as eye-movements in the successful prediction of task, with entropy and hue also being significant, but with smaller effect sizes. When classifying within participants, brightness and saturation were also significant contributors. Canonical correlation and classification results, together with a test of moderation versus mediation, suggest that the cognitive state of the observer moderates the relationship between stimulus-driven visual features and eye-movements. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Wavelet processing techniques for digital mammography
NASA Astrophysics Data System (ADS)
Laine, Andrew F.; Song, Shuwu
1992-09-01
This paper introduces a novel approach for accomplishing mammographic feature analysis through multiresolution representations. We show that efficient (nonredundant) representations may be identified from digital mammography and used to enhance specific mammographic features within a continuum of scale space. The multiresolution decomposition of wavelet transforms provides a natural hierarchy in which to embed an interactive paradigm for accomplishing scale space feature analysis. Similar to traditional coarse to fine matching strategies, the radiologist may first choose to look for coarse features (e.g., dominant mass) within low frequency levels of a wavelet transform and later examine finer features (e.g., microcalcifications) at higher frequency levels. In addition, features may be extracted by applying geometric constraints within each level of the transform. Choosing wavelets (or analyzing functions) that are simultaneously localized in both space and frequency, results in a powerful methodology for image analysis. Multiresolution and orientation selectivity, known biological mechanisms in primate vision, are ingrained in wavelet representations and inspire the techniques presented in this paper. Our approach includes local analysis of complete multiscale representations. Mammograms are reconstructed from wavelet representations, enhanced by linear, exponential and constant weight functions through scale space. By improving the visualization of breast pathology we can improve the chances of early detection of breast cancers (improve quality) while requiring less time to evaluate mammograms for most patients (lower costs).
NASA Astrophysics Data System (ADS)
Miyazawa, Arata; Hong, Young-Joo; Makita, Shuichi; Kasaragod, Deepa K.; Miura, Masahiro; Yasuno, Yoshiaki
2017-02-01
Local statistics are widely utilized for quantification and image processing of OCT. For example, local mean is used to reduce speckle, local variation of polarization state (degree-of-polarization-uniformity (DOPU)) is used to visualize melanin. Conventionally, these statistics are calculated in a rectangle kernel whose size is uniform over the image. However, the fixed size and shape of the kernel result in a tradeoff between image sharpness and statistical accuracy. Superpixel is a cluster of pixels which is generated by grouping image pixels based on the spatial proximity and similarity of signal values. Superpixels have variant size and flexible shapes which preserve the tissue structure. Here we demonstrate a new superpixel method which is tailored for multifunctional Jones matrix OCT (JM-OCT). This new method forms the superpixels by clustering image pixels in a 6-dimensional (6-D) feature space (spatial two dimensions and four dimensions of optical features). All image pixels were clustered based on their spatial proximity and optical feature similarity. The optical features are scattering, OCT-A, birefringence and DOPU. The method is applied to retinal OCT. Generated superpixels preserve the tissue structures such as retinal layers, sclera, vessels, and retinal pigment epithelium. Hence, superpixel can be utilized as a local statistics kernel which would be more suitable than a uniform rectangle kernel. Superpixelized image also can be used for further image processing and analysis. Since it reduces the number of pixels to be analyzed, it reduce the computational cost of such image processing.
Visual affective classification by combining visual and text features.
Liu, Ningning; Wang, Kai; Jin, Xin; Gao, Boyang; Dellandréa, Emmanuel; Chen, Liming
2017-01-01
Affective analysis of images in social networks has drawn much attention, and the texts surrounding images are proven to provide valuable semantic meanings about image content, which can hardly be represented by low-level visual features. In this paper, we propose a novel approach for visual affective classification (VAC) task. This approach combines visual representations along with novel text features through a fusion scheme based on Dempster-Shafer (D-S) Evidence Theory. Specifically, we not only investigate different types of visual features and fusion methods for VAC, but also propose textual features to effectively capture emotional semantics from the short text associated to images based on word similarity. Experiments are conducted on three public available databases: the International Affective Picture System (IAPS), the Artistic Photos and the MirFlickr Affect set. The results demonstrate that the proposed approach combining visual and textual features provides promising results for VAC task.
Visual affective classification by combining visual and text features
Liu, Ningning; Wang, Kai; Jin, Xin; Gao, Boyang; Dellandréa, Emmanuel; Chen, Liming
2017-01-01
Affective analysis of images in social networks has drawn much attention, and the texts surrounding images are proven to provide valuable semantic meanings about image content, which can hardly be represented by low-level visual features. In this paper, we propose a novel approach for visual affective classification (VAC) task. This approach combines visual representations along with novel text features through a fusion scheme based on Dempster-Shafer (D-S) Evidence Theory. Specifically, we not only investigate different types of visual features and fusion methods for VAC, but also propose textual features to effectively capture emotional semantics from the short text associated to images based on word similarity. Experiments are conducted on three public available databases: the International Affective Picture System (IAPS), the Artistic Photos and the MirFlickr Affect set. The results demonstrate that the proposed approach combining visual and textual features provides promising results for VAC task. PMID:28850566
News video story segmentation method using fusion of audio-visual features
NASA Astrophysics Data System (ADS)
Wen, Jun; Wu, Ling-da; Zeng, Pu; Luan, Xi-dao; Xie, Yu-xiang
2007-11-01
News story segmentation is an important aspect for news video analysis. This paper presents a method for news video story segmentation. Different form prior works, which base on visual features transform, the proposed technique uses audio features as baseline and fuses visual features with it to refine the results. At first, it selects silence clips as audio features candidate points, and selects shot boundaries and anchor shots as two kinds of visual features candidate points. Then this paper selects audio feature candidates as cues and develops different fusion method, which effectively using diverse type visual candidates to refine audio candidates, to get story boundaries. Experiment results show that this method has high efficiency and adaptability to different kinds of news video.
Liu, B; Meng, X; Wu, G; Huang, Y
2012-05-17
In this article, we aimed to study whether feature precedence existed in the cognitive processing of multifeature visual information in the human brain. In our experiment, we paid attention to two important visual features as follows: color and shape. In order to avoid the presence of semantic constraints between them and the resulting impact, pure color and simple geometric shape were chosen as the color feature and shape feature of visual stimulus, respectively. We adopted an "old/new" paradigm to study the cognitive processing of color feature, shape feature and the combination of color feature and shape feature, respectively. The experiment consisted of three tasks as follows: Color task, Shape task and Color-Shape task. The results showed that the feature-based pattern would be activated in the human brain in processing multifeature visual information without semantic association between features. Furthermore, shape feature was processed earlier than color feature, and the cognitive processing of color feature was more difficult than that of shape feature. Copyright © 2012 IBRO. Published by Elsevier Ltd. All rights reserved.
Ghosh, Tonmoy; Fattah, Shaikh Anowarul; Wahid, Khan A
2018-01-01
Wireless capsule endoscopy (WCE) is the most advanced technology to visualize whole gastrointestinal (GI) tract in a non-invasive way. But the major disadvantage here, it takes long reviewing time, which is very laborious as continuous manual intervention is necessary. In order to reduce the burden of the clinician, in this paper, an automatic bleeding detection method for WCE video is proposed based on the color histogram of block statistics, namely CHOBS. A single pixel in WCE image may be distorted due to the capsule motion in the GI tract. Instead of considering individual pixel values, a block surrounding to that individual pixel is chosen for extracting local statistical features. By combining local block features of three different color planes of RGB color space, an index value is defined. A color histogram, which is extracted from those index values, provides distinguishable color texture feature. A feature reduction technique utilizing color histogram pattern and principal component analysis is proposed, which can drastically reduce the feature dimension. For bleeding zone detection, blocks are classified using extracted local features that do not incorporate any computational burden for feature extraction. From extensive experimentation on several WCE videos and 2300 images, which are collected from a publicly available database, a very satisfactory bleeding frame and zone detection performance is achieved in comparison to that obtained by some of the existing methods. In the case of bleeding frame detection, the accuracy, sensitivity, and specificity obtained from proposed method are 97.85%, 99.47%, and 99.15%, respectively, and in the case of bleeding zone detection, 95.75% of precision is achieved. The proposed method offers not only low feature dimension but also highly satisfactory bleeding detection performance, which even can effectively detect bleeding frame and zone in a continuous WCE video data.
Feature-based attentional modulations in the absence of direct visual stimulation.
Serences, John T; Boynton, Geoffrey M
2007-07-19
When faced with a crowded visual scene, observers must selectively attend to behaviorally relevant objects to avoid sensory overload. Often this selection process is guided by prior knowledge of a target-defining feature (e.g., the color red when looking for an apple), which enhances the firing rate of visual neurons that are selective for the attended feature. Here, we used functional magnetic resonance imaging and a pattern classification algorithm to predict the attentional state of human observers as they monitored a visual feature (one of two directions of motion). We find that feature-specific attention effects spread across the visual field-even to regions of the scene that do not contain a stimulus. This spread of feature-based attention to empty regions of space may facilitate the perception of behaviorally relevant stimuli by increasing sensitivity to attended features at all locations in the visual field.
VRML and Collaborative Environments: New Tools for Networked Visualization
NASA Astrophysics Data System (ADS)
Crutcher, R. M.; Plante, R. L.; Rajlich, P.
We present two new applications that engage the network as a tool for astronomical research and/or education. The first is a VRML server which allows users over the Web to interactively create three-dimensional visualizations of FITS images contained in the NCSA Astronomy Digital Image Library (ADIL). The server's Web interface allows users to select images from the ADIL, fill in processing parameters, and create renderings featuring isosurfaces, slices, contours, and annotations; the often extensive computations are carried out on an NCSA SGI supercomputer server without the user having an individual account on the system. The user can then download the 3D visualizations as VRML files, which may be rotated and manipulated locally on virtually any class of computer. The second application is the ADILBrowser, a part of the NCSA Horizon Image Data Browser Java package. ADILBrowser allows a group of participants to browse images from the ADIL within a collaborative session. The collaborative environment is provided by the NCSA Habanero package which includes text and audio chat tools and a white board. The ADILBrowser is just an example of a collaborative tool that can be built with the Horizon and Habanero packages. The classes provided by these packages can be assembled to create custom collaborative applications that visualize data either from local disk or from anywhere on the network.
Acquiring skill at medical image inspection: learning localized in early visual processes
NASA Astrophysics Data System (ADS)
Sowden, Paul T.; Davies, Ian R. L.; Roling, Penny; Watt, Simon J.
1997-04-01
Acquisition of the skill of medical image inspection could be due to changes in visual search processes, 'low-level' sensory learning, and higher level 'conceptual learning.' Here, we report two studies that investigate the extent to which learning in medical image inspection involves low- level learning. Early in the visual processing pathway cells are selective for direction of luminance contrast. We exploit this in the present studies by using transfer across direction of contrast as a 'marker' to indicate the level of processing at which learning occurs. In both studies twelve observers trained for four days at detecting features in x- ray images (experiment one equals discs in the Nijmegen phantom, experiment two equals micro-calcification clusters in digitized mammograms). Half the observers examined negative luminance contrast versions of the images and the remainder examined positive contrast versions. On the fifth day, observers swapped to inspect their respective opposite contrast images. In both experiments leaning occurred across sessions. In experiment one, learning did not transfer across direction of luminance contrast, while in experiment two there was only partial transfer. These findings are consistent with the contention that some of the leaning was localized early in the visual processing pathway. The implications of these results for current medical image inspection training schedules are discussed.
Content-based image retrieval by matching hierarchical attributed region adjacency graphs
NASA Astrophysics Data System (ADS)
Fischer, Benedikt; Thies, Christian J.; Guld, Mark O.; Lehmann, Thomas M.
2004-05-01
Content-based image retrieval requires a formal description of visual information. In medical applications, all relevant biological objects have to be represented by this description. Although color as the primary feature has proven successful in publicly available retrieval systems of general purpose, this description is not applicable to most medical images. Additionally, it has been shown that global features characterizing the whole image do not lead to acceptable results in the medical context or that they are only suitable for specific applications. For a general purpose content-based comparison of medical images, local, i.e. regional features that are collected on multiple scales must be used. A hierarchical attributed region adjacency graph (HARAG) provides such a representation and transfers image comparison to graph matching. However, building a HARAG from an image requires a restriction in size to be computationally feasible while at the same time all visually plausible information must be preserved. For this purpose, mechanisms for the reduction of the graph size are presented. Even with a reduced graph, the problem of graph matching remains NP-complete. In this paper, the Similarity Flooding approach and Hopfield-style neural networks are adapted from the graph matching community to the needs of HARAG comparison. Based on synthetic image material build from simple geometric objects, all visually similar regions were matched accordingly showing the framework's general applicability to content-based image retrieval of medical images.
Xu, Yingying; Lin, Lanfen; Hu, Hongjie; Wang, Dan; Zhu, Wenchao; Wang, Jian; Han, Xian-Hua; Chen, Yen-Wei
2018-01-01
The bag of visual words (BoVW) model is a powerful tool for feature representation that can integrate various handcrafted features like intensity, texture, and spatial information. In this paper, we propose a novel BoVW-based method that incorporates texture and spatial information for the content-based image retrieval to assist radiologists in clinical diagnosis. This paper presents a texture-specific BoVW method to represent focal liver lesions (FLLs). Pixels in the region of interest (ROI) are classified into nine texture categories using the rotation-invariant uniform local binary pattern method. The BoVW-based features are calculated for each texture category. In addition, a spatial cone matching (SCM)-based representation strategy is proposed to describe the spatial information of the visual words in the ROI. In a pilot study, eight radiologists with different clinical experience performed diagnoses for 20 cases with and without the top six retrieved results. A total of 132 multiphase computed tomography volumes including five pathological types were collected. The texture-specific BoVW was compared to other BoVW-based methods using the constructed dataset of FLLs. The results show that our proposed model outperforms the other three BoVW methods in discriminating different lesions. The SCM method, which adds spatial information to the orderless BoVW model, impacted the retrieval performance. In the pilot trial, the average diagnosis accuracy of the radiologists was improved from 66 to 80% using the retrieval system. The preliminary results indicate that the texture-specific features and the SCM-based BoVW features can effectively characterize various liver lesions. The retrieval system has the potential to improve the diagnostic accuracy and the confidence of the radiologists.
Visualizing dispersive features in 2D image via minimum gradient method
DOE Office of Scientific and Technical Information (OSTI.GOV)
He, Yu; Wang, Yan; Shen, Zhi -Xun
Here, we developed a minimum gradient based method to track ridge features in a 2D image plot, which is a typical data representation in many momentum resolved spectroscopy experiments. Through both analytic formulation and numerical simulation, we compare this new method with existing DC (distribution curve) based and higher order derivative based analyses. We find that the new method has good noise resilience and enhanced contrast especially for weak intensity features and meanwhile preserves the quantitative local maxima information from the raw image. An algorithm is proposed to extract 1D ridge dispersion from the 2D image plot, whose quantitative applicationmore » to angle-resolved photoemission spectroscopy measurements on high temperature superconductors is demonstrated.« less
Visualizing dispersive features in 2D image via minimum gradient method
He, Yu; Wang, Yan; Shen, Zhi -Xun
2017-07-24
Here, we developed a minimum gradient based method to track ridge features in a 2D image plot, which is a typical data representation in many momentum resolved spectroscopy experiments. Through both analytic formulation and numerical simulation, we compare this new method with existing DC (distribution curve) based and higher order derivative based analyses. We find that the new method has good noise resilience and enhanced contrast especially for weak intensity features and meanwhile preserves the quantitative local maxima information from the raw image. An algorithm is proposed to extract 1D ridge dispersion from the 2D image plot, whose quantitative applicationmore » to angle-resolved photoemission spectroscopy measurements on high temperature superconductors is demonstrated.« less
Using endemic road features to create self-explaining roads and reduce vehicle speeds.
Charlton, Samuel G; Mackie, Hamish W; Baas, Peter H; Hay, Karen; Menezes, Miguel; Dixon, Claire
2010-11-01
This paper describes a project undertaken to establish a self-explaining roads (SER) design programme on existing streets in an urban area. The methodology focussed on developing a process to identify functional road categories and designs based on endemic road characteristics taken from functional exemplars in the study area. The study area was divided into two sections, one to receive SER treatments designed to maximise visual differences between road categories, and a matched control area to remain untreated for purposes of comparison. The SER design for local roads included increased landscaping and community islands to limit forward visibility, and removal of road markings to create a visually distinct road environment. In comparison, roads categorised as collectors received increased delineation, addition of cycle lanes, and improved amenity for pedestrians. Speed data collected 3 months after implementation showed a significant reduction in vehicle speeds on local roads and increased homogeneity of speeds on both local and collector roads. The objective speed data, combined with residents' speed choice ratings, indicated that the project was successful in creating two discriminably different road categories. 2010 Elsevier Ltd. All rights reserved.
Hi-fidelity multi-scale local processing for visually optimized far-infrared Herschel images
NASA Astrophysics Data System (ADS)
Li Causi, G.; Schisano, E.; Liu, S. J.; Molinari, S.; Di Giorgio, A.
2016-07-01
In the context of the "Hi-Gal" multi-band full-plane mapping program for the Galactic Plane, as imaged by the Herschel far-infrared satellite, we have developed a semi-automatic tool which produces high definition, high quality color maps optimized for visual perception of extended features, like bubbles and filaments, against the high background variations. We project the map tiles of three selected bands onto a 3-channel panorama, which spans the central 130 degrees of galactic longitude times 2.8 degrees of galactic latitude, at the pixel scale of 3.2", in cartesian galactic coordinates. Then we process this image piecewise, applying a custom multi-scale local stretching algorithm, enforced by a local multi-scale color balance. Finally, we apply an edge-preserving contrast enhancement to perform an artifact-free details sharpening. Thanks to this tool, we have thus produced a stunning giga-pixel color image of the far-infrared Galactic Plane that we made publicly available with the recent release of the Hi-Gal mosaics and compact source catalog.
Flow visualization of unsteady phenomena in the hypersonic regime using high-speed video camera
NASA Astrophysics Data System (ADS)
Hashimoto, Tokitada; Saito, Tsutomu; Takayama, Kazuyoshi
2004-02-01
Flows over double cones and wedges featured with a large shock induced separation zone are representative of many parts of hypersonic vehicle geometries. To be practically important at shock interactions is phenomena that the shock wave produced from another objects carries out incidence to bow shock around a blunt body in the hypersonic flows, the two shock waves interact each other and various shock interactions occur according to the intensity of the shock wave and depending on the case of the local maximum of pressure and heat flux is locally produced on the body surface. The six types of shock interactions are classified, and particularly in the Type IV, a shear layer generated from the intersection of the two shock reached on the body surface, and locally anomalous pressure increase and aerodynamic heating occurred experimentally. In the present study, unsteady shock oscillations and periodically separation flows were visualized by means of high-speed video camera. Particularly, sequential observations with combination of schlieren methods are very effective because of flow unsteadiness.
NASA Astrophysics Data System (ADS)
Starkey, Eleanor; Barnes, Mhari; Quinn, Paul; Large, Andy
2016-04-01
Pressures associated with flooding and climate change have significantly increased over recent years. Natural Flood Risk Management (NFRM) is now seen as being a more appropriate and favourable approach in some locations. At the same time, catchment managers are also encouraged to adopt a more integrated, evidence-based and bottom-up approach. This includes engaging with local communities. Although NFRM features are being more readily installed, there is still limited evidence associated with their ability to reduce flood risk and offer multiple benefits. In particular, local communities and land owners are still uncertain about what the features entail and how they will perform, which is a huge barrier affecting widespread uptake. Traditional hydrometric monitoring techniques are well established but they still struggle to successfully monitor and capture NFRM performance spatially and temporally in a visual and more meaningful way for those directly affected on the ground. Two UK-based case studies are presented here where unique NFRM features have been carefully designed and installed in rural headwater catchments. This includes a 1km2 sub-catchment of the Haltwhistle Burn (northern England) and a 2km2 sub-catchment of Eddleston Water (southern Scotland). Both of these pilot sites are subject to prolonged flooding in winter and flash flooding in summer. This exacerbates sediment, debris and water quality issues downstream. Examples of NFRM features include ponds, woody debris and a log feature inspired by the children's game 'Kerplunk'. They have been tested and monitored over the 2015-2016 winter storms using low-cost techniques by both researchers and members of the community ('citizen scientists'). Results show that monitoring techniques such as regular consumer specification time-lapse cameras, photographs, videos and 'kite-cams' are suitable for long-term and low-cost monitoring of a variety of NFRM features. These techniques have been compared against traditional hydrometric monitoring equipment. It is clear that traditional techniques are expensive, require specialist skills and outputs are complicated to the untrained eye. These alternative methods tested are visually more meaningful, can be interpreted by all stakeholders and techniques can be easily utilised by citizen scientists, land owners or flood groups. Such techniques therefore offer a before, during and after NFRM monitoring solution which can be more realistically and readily implemented, supports engagement and subsequent uptake and maintenance of NFRM features on a local level. Although monitoring techniques presented are relatively simple, they are regarded as being essential given that many schemes are not monitored at all.
Caudate nucleus reactivity predicts perceptual learning rate for visual feature conjunctions.
Reavis, Eric A; Frank, Sebastian M; Tse, Peter U
2015-04-15
Useful information in the visual environment is often contained in specific conjunctions of visual features (e.g., color and shape). The ability to quickly and accurately process such conjunctions can be learned. However, the neural mechanisms responsible for such learning remain largely unknown. It has been suggested that some forms of visual learning might involve the dopaminergic neuromodulatory system (Roelfsema et al., 2010; Seitz and Watanabe, 2005), but this hypothesis has not yet been directly tested. Here we test the hypothesis that learning visual feature conjunctions involves the dopaminergic system, using functional neuroimaging, genetic assays, and behavioral testing techniques. We use a correlative approach to evaluate potential associations between individual differences in visual feature conjunction learning rate and individual differences in dopaminergic function as indexed by neuroimaging and genetic markers. We find a significant correlation between activity in the caudate nucleus (a component of the dopaminergic system connected to visual areas of the brain) and visual feature conjunction learning rate. Specifically, individuals who showed a larger difference in activity between positive and negative feedback on an unrelated cognitive task, indicative of a more reactive dopaminergic system, learned visual feature conjunctions more quickly than those who showed a smaller activity difference. This finding supports the hypothesis that the dopaminergic system is involved in visual learning, and suggests that visual feature conjunction learning could be closely related to associative learning. However, no significant, reliable correlations were found between feature conjunction learning and genotype or dopaminergic activity in any other regions of interest. Copyright © 2015 Elsevier Inc. All rights reserved.
Example-Based Image Colorization Using Locality Consistent Sparse Representation.
Bo Li; Fuchen Zhao; Zhuo Su; Xiangguo Liang; Yu-Kun Lai; Rosin, Paul L
2017-11-01
Image colorization aims to produce a natural looking color image from a given gray-scale image, which remains a challenging problem. In this paper, we propose a novel example-based image colorization method exploiting a new locality consistent sparse representation. Given a single reference color image, our method automatically colorizes the target gray-scale image by sparse pursuit. For efficiency and robustness, our method operates at the superpixel level. We extract low-level intensity features, mid-level texture features, and high-level semantic features for each superpixel, which are then concatenated to form its descriptor. The collection of feature vectors for all the superpixels from the reference image composes the dictionary. We formulate colorization of target superpixels as a dictionary-based sparse reconstruction problem. Inspired by the observation that superpixels with similar spatial location and/or feature representation are likely to match spatially close regions from the reference image, we further introduce a locality promoting regularization term into the energy formulation, which substantially improves the matching consistency and subsequent colorization results. Target superpixels are colorized based on the chrominance information from the dominant reference superpixels. Finally, to further improve coherence while preserving sharpness, we develop a new edge-preserving filter for chrominance channels with the guidance from the target gray-scale image. To the best of our knowledge, this is the first work on sparse pursuit image colorization from single reference images. Experimental results demonstrate that our colorization method outperforms the state-of-the-art methods, both visually and quantitatively using a user study.
NASA Astrophysics Data System (ADS)
Madokoro, H.; Yamanashi, A.; Sato, K.
2013-08-01
This paper presents an unsupervised scene classification method for actualizing semantic recognition of indoor scenes. Background and foreground features are respectively extracted using Gist and color scale-invariant feature transform (SIFT) as feature representations based on context. We used hue, saturation, and value SIFT (HSV-SIFT) because of its simple algorithm with low calculation costs. Our method creates bags of features for voting visual words created from both feature descriptors to a two-dimensional histogram. Moreover, our method generates labels as candidates of categories for time-series images while maintaining stability and plasticity together. Automatic labeling of category maps can be realized using labels created using adaptive resonance theory (ART) as teaching signals for counter propagation networks (CPNs). We evaluated our method for semantic scene classification using KTH's image database for robot localization (KTH-IDOL), which is popularly used for robot localization and navigation. The mean classification accuracies of Gist, gray SIFT, one class support vector machines (OC-SVM), position-invariant robust features (PIRF), and our method are, respectively, 39.7, 58.0, 56.0, 63.6, and 79.4%. The result of our method is 15.8% higher than that of PIRF. Moreover, we applied our method for fine classification using our original mobile robot. We obtained mean classification accuracy of 83.2% for six zones.
High-order statistics of weber local descriptors for image representation.
Han, Xian-Hua; Chen, Yen-Wei; Xu, Gang
2015-06-01
Highly discriminant visual features play a key role in different image classification applications. This study aims to realize a method for extracting highly-discriminant features from images by exploring a robust local descriptor inspired by Weber's law. The investigated local descriptor is based on the fact that human perception for distinguishing a pattern depends not only on the absolute intensity of the stimulus but also on the relative variance of the stimulus. Therefore, we firstly transform the original stimulus (the images in our study) into a differential excitation-domain according to Weber's law, and then explore a local patch, called micro-Texton, in the transformed domain as Weber local descriptor (WLD). Furthermore, we propose to employ a parametric probability process to model the Weber local descriptors, and extract the higher-order statistics to the model parameters for image representation. The proposed strategy can adaptively characterize the WLD space using generative probability model, and then learn the parameters for better fitting the training space, which would lead to more discriminant representation for images. In order to validate the efficiency of the proposed strategy, we apply three different image classification applications including texture, food images and HEp-2 cell pattern recognition, which validates that our proposed strategy has advantages over the state-of-the-art approaches.
Sneve, Markus H; Sreenivasan, Kartik K; Alnæs, Dag; Endestad, Tor; Magnussen, Svein
2015-01-01
Retention of features in visual short-term memory (VSTM) involves maintenance of sensory traces in early visual cortex. However, the mechanism through which this is accomplished is not known. Here, we formulate specific hypotheses derived from studies on feature-based attention to test the prediction that visual cortex is recruited by attentional mechanisms during VSTM of low-level features. Functional magnetic resonance imaging (fMRI) of human visual areas revealed that neural populations coding for task-irrelevant feature information are suppressed during maintenance of detailed spatial frequency memory representations. The narrow spectral extent of this suppression agrees well with known effects of feature-based attention. Additionally, analyses of effective connectivity during maintenance between retinotopic areas in visual cortex show that the observed highlighting of task-relevant parts of the feature spectrum originates in V4, a visual area strongly connected with higher-level control regions and known to convey top-down influence to earlier visual areas during attentional tasks. In line with this property of V4 during attentional operations, we demonstrate that modulations of earlier visual areas during memory maintenance have behavioral consequences, and that these modulations are a result of influences from V4. Copyright © 2014 Elsevier Ltd. All rights reserved.
Visual Prediction Error Spreads Across Object Features in Human Visual Cortex
Summerfield, Christopher; Egner, Tobias
2016-01-01
Visual cognition is thought to rely heavily on contextual expectations. Accordingly, previous studies have revealed distinct neural signatures for expected versus unexpected stimuli in visual cortex. However, it is presently unknown how the brain combines multiple concurrent stimulus expectations such as those we have for different features of a familiar object. To understand how an unexpected object feature affects the simultaneous processing of other expected feature(s), we combined human fMRI with a task that independently manipulated expectations for color and motion features of moving-dot stimuli. Behavioral data and neural signals from visual cortex were then interrogated to adjudicate between three possible ways in which prediction error (surprise) in the processing of one feature might affect the concurrent processing of another, expected feature: (1) feature processing may be independent; (2) surprise might “spread” from the unexpected to the expected feature, rendering the entire object unexpected; or (3) pairing a surprising feature with an expected feature might promote the inference that the two features are not in fact part of the same object. To formalize these rival hypotheses, we implemented them in a simple computational model of multifeature expectations. Across a range of analyses, behavior and visual neural signals consistently supported a model that assumes a mixing of prediction error signals across features: surprise in one object feature spreads to its other feature(s), thus rendering the entire object unexpected. These results reveal neurocomputational principles of multifeature expectations and indicate that objects are the unit of selection for predictive vision. SIGNIFICANCE STATEMENT We address a key question in predictive visual cognition: how does the brain combine multiple concurrent expectations for different features of a single object such as its color and motion trajectory? By combining a behavioral protocol that independently varies expectation of (and attention to) multiple object features with computational modeling and fMRI, we demonstrate that behavior and fMRI activity patterns in visual cortex are best accounted for by a model in which prediction error in one object feature spreads to other object features. These results demonstrate how predictive vision forms object-level expectations out of multiple independent features. PMID:27810936
Object-based attention underlies the rehearsal of feature binding in visual working memory.
Shen, Mowei; Huang, Xiang; Gao, Zaifeng
2015-04-01
Feature binding is a core concept in many research fields, including the study of working memory (WM). Over the past decade, it has been debated whether keeping the feature binding in visual WM consumes more visual attention than the constituent single features. Previous studies have only explored the contribution of domain-general attention or space-based attention in the binding process; no study so far has explored the role of object-based attention in retaining binding in visual WM. We hypothesized that object-based attention underlay the mechanism of rehearsing feature binding in visual WM. Therefore, during the maintenance phase of a visual WM task, we inserted a secondary mental rotation (Experiments 1-3), transparent motion (Experiment 4), or an object-based feature report task (Experiment 5) to consume the object-based attention available for binding. In line with the prediction of the object-based attention hypothesis, Experiments 1-5 revealed a more significant impairment for binding than for constituent single features. However, this selective binding impairment was not observed when inserting a space-based visual search task (Experiment 6). We conclude that object-based attention underlies the rehearsal of binding representation in visual WM. (c) 2015 APA, all rights reserved.
Plaisted, Kate; Saksida, Lisa; Alcántara, José; Weisblatt, Emma
2003-02-28
The weak central coherence hypothesis of Frith is one of the most prominent theories concerning the abnormal performance of individuals with autism on tasks that involve local and global processing. Individuals with autism often outperform matched nonautistic individuals on tasks in which success depends upon processing of local features, and underperform on tasks that require global processing. We review those studies that have been unable to identify the locus of the mechanisms that may be responsible for weak central coherence effects and those that show that local processing is enhanced in autism but not at the expense of global processing. In the light of these studies, we propose that the mechanisms which can give rise to 'weak central coherence' effects may be perceptual. More specifically, we propose that perception operates to enhance the representation of individual perceptual features but that this does not impact adversely on representations that involve integration of features. This proposal was supported in the two experiments we report on configural and feature discrimination learning in high-functioning children with autism. We also examined processes of perception directly, in an auditory filtering task which measured the width of auditory filters in individuals with autism and found that the width of auditory filters in autism were abnormally broad. We consider the implications of these findings for perceptual theories of the mechanisms underpinning weak central coherence effects.
Douglas, Danielle; Newsome, Rachel N; Man, Louisa LY
2018-01-01
A significant body of research in cognitive neuroscience is aimed at understanding how object concepts are represented in the human brain. However, it remains unknown whether and where the visual and abstract conceptual features that define an object concept are integrated. We addressed this issue by comparing the neural pattern similarities among object-evoked fMRI responses with behavior-based models that independently captured the visual and conceptual similarities among these stimuli. Our results revealed evidence for distinctive coding of visual features in lateral occipital cortex, and conceptual features in the temporal pole and parahippocampal cortex. By contrast, we found evidence for integrative coding of visual and conceptual object features in perirhinal cortex. The neuroanatomical specificity of this effect was highlighted by results from a searchlight analysis. Taken together, our findings suggest that perirhinal cortex uniquely supports the representation of fully specified object concepts through the integration of their visual and conceptual features. PMID:29393853
Coherent Image Layout using an Adaptive Visual Vocabulary
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dillard, Scott E.; Henry, Michael J.; Bohn, Shawn J.
When querying a huge image database containing millions of images, the result of the query may still contain many thousands of images that need to be presented to the user. We consider the problem of arranging such a large set of images into a visually coherent layout, one that places similar images next to each other. Image similarity is determined using a bag-of-features model, and the layout is constructed from a hierarchical clustering of the image set by mapping an in-order traversal of the hierarchy tree into a space-filling curve. This layout method provides strong locality guarantees so we aremore » able to quantitatively evaluate performance using standard image retrieval benchmarks. Performance of the bag-of-features method is best when the vocabulary is learned on the image set being clustered. Because learning a large, discriminative vocabulary is a computationally demanding task, we present a novel method for efficiently adapting a generic visual vocabulary to a particular dataset. We evaluate our clustering and vocabulary adaptation methods on a variety of image datasets and show that adapting a generic vocabulary to a particular set of images improves performance on both hierarchical clustering and image retrieval tasks.« less
Jingling, Li; Tseng, Chia-Huei; Zhaoping, Li
2013-09-10
Salient items usually capture attention and are beneficial to visual search. Jingling and Tseng (2013), nevertheless, have discovered that a salient collinear column can impair local visual search. The display used in that study had 21 rows and 27 columns of bars, all uniformly horizontal (or vertical) except for one column of bars orthogonally oriented to all other bars, making this unique column of collinear (or noncollinear) bars salient in the display. Observers discriminated an oblique target bar superimposed on one of the bars either in the salient column or in the background. Interestingly, responses were slower for a target in a salient collinear column than in the background. This opens a theoretical question of how contour integration interacts with salience computation, which is addressed here by an examination of how salience modulated the search impairment from the collinear column. We show that the collinear column needs to have a high orientation contrast with its neighbors to exert search interference. A collinear column of high contrast in color or luminance did not produce the same impairment. Our results show that orientation-defined salience interacted with collinear contour differently from other feature dimensions, which is consistent with the neuronal properties in V1.
Fazenda, Bruno; Scarre, Chris; Till, Rupert; Pasalodos, Raquel Jiménez; Guerra, Manuel Rojo; Tejedor, Cristina; Peredo, Roberto Ontañón; Watson, Aaron; Wyatt, Simon; Benito, Carlos García; Drinkall, Helen; Foulds, Frederick
2017-09-01
During the 1980 s, acoustic studies of Upper Palaeolithic imagery in French caves-using the technology then available-suggested a relationship between acoustic response and the location of visual motifs. This paper presents an investigation, using modern acoustic measurement techniques, into such relationships within the caves of La Garma, Las Chimeneas, La Pasiega, El Castillo, and Tito Bustillo in Northern Spain. It addresses methodological issues concerning acoustic measurement at enclosed archaeological sites and outlines a general framework for extraction of acoustic features that may be used to support archaeological hypotheses. The analysis explores possible associations between the position of visual motifs (which may be up to 40 000 yrs old) and localized acoustic responses. Results suggest that motifs, in general, and lines and dots, in particular, are statistically more likely to be found in places where reverberation is moderate and where the low frequency acoustic response has evidence of resonant behavior. The work presented suggests that an association of the location of Palaeolithic motifs with acoustic features is a statistically weak but tenable hypothesis, and that an appreciation of sound could have influenced behavior among Palaeolithic societies of this region.
Can we measure beauty? Computational evaluation of coral reef aesthetics
Guibert, Marine; Foerschner, Anja; Co, Tim; Calhoun, Sandi; George, Emma; Hatay, Mark; Dinsdale, Elizabeth; Sandin, Stuart A.; Smith, Jennifer E.; Vermeij, Mark J.A.; Felts, Ben; Dustan, Phillip; Salamon, Peter; Rohwer, Forest
2015-01-01
The natural beauty of coral reefs attracts millions of tourists worldwide resulting in substantial revenues for the adjoining economies. Although their visual appearance is a pivotal factor attracting humans to coral reefs current monitoring protocols exclusively target biogeochemical parameters, neglecting changes in their aesthetic appearance. Here we introduce a standardized computational approach to assess coral reef environments based on 109 visual features designed to evaluate the aesthetic appearance of art. The main feature groups include color intensity and diversity of the image, relative size, color, and distribution of discernable objects within the image, and texture. Specific coral reef aesthetic values combining all 109 features were calibrated against an established biogeochemical assessment (NCEAS) using machine learning algorithms. These values were generated for ∼2,100 random photographic images collected from 9 coral reef locations exposed to varying levels of anthropogenic influence across 2 ocean systems. Aesthetic values proved accurate predictors of the NCEAS scores (root mean square error < 5 for N ≥ 3) and significantly correlated to microbial abundance at each site. This shows that mathematical approaches designed to assess the aesthetic appearance of photographic images can be used as an inexpensive monitoring tool for coral reef ecosystems. It further suggests that human perception of aesthetics is not purely subjective but influenced by inherent reactions towards measurable visual cues. By quantifying aesthetic features of coral reef systems this method provides a cost efficient monitoring tool that targets one of the most important socioeconomic values of coral reefs directly tied to revenue for its local population. PMID:26587350
Patterns in the sky: Natural visualization of aircraft flow fields
NASA Technical Reports Server (NTRS)
Campbell, James F.; Chambers, Joseph R.
1994-01-01
The objective of the current publication is to present the collection of flight photographs to illustrate the types of flow patterns that were visualized and to present qualitative correlations with computational and wind tunnel results. Initially in section 2, the condensation process is discussed, including a review of relative humidity, vapor pressure, and factors which determine the presence of visible condensate. Next, outputs from computer code calculations are postprocessed by using water-vapor relationships to determine if computed values of relative humidity in the local flow field correlate with the qualitative features of the in-flight condensation patterns. The photographs are then presented in section 3 by flow type and subsequently in section 4 by aircraft type to demonstrate the variety of condensed flow fields that was visualized for a wide range of aircraft and flight maneuvers.
Implementing WebGL and HTML5 in Macromolecular Visualization and Modern Computer-Aided Drug Design.
Yuan, Shuguang; Chan, H C Stephen; Hu, Zhenquan
2017-06-01
Web browsers have long been recognized as potential platforms for remote macromolecule visualization. However, the difficulty in transferring large-scale data to clients and the lack of native support for hardware-accelerated applications in the local browser undermine the feasibility of such utilities. With the introduction of WebGL and HTML5 technologies in recent years, it is now possible to exploit the power of a graphics-processing unit (GPU) from a browser without any third-party plugin. Many new tools have been developed for biological molecule visualization and modern drug discovery. In contrast to traditional offline tools, real-time computing, interactive data analysis, and cross-platform analyses feature WebGL- and HTML5-based tools, facilitating biological research in a more efficient and user-friendly way. Copyright © 2017 Elsevier Ltd. All rights reserved.
EEG artifact elimination by extraction of ICA-component features using image processing algorithms.
Radüntz, T; Scouten, J; Hochmuth, O; Meffert, B
2015-03-30
Artifact rejection is a central issue when dealing with electroencephalogram recordings. Although independent component analysis (ICA) separates data in linearly independent components (IC), the classification of these components as artifact or EEG signal still requires visual inspection by experts. In this paper, we achieve automated artifact elimination using linear discriminant analysis (LDA) for classification of feature vectors extracted from ICA components via image processing algorithms. We compare the performance of this automated classifier to visual classification by experts and identify range filtering as a feature extraction method with great potential for automated IC artifact recognition (accuracy rate 88%). We obtain almost the same level of recognition performance for geometric features and local binary pattern (LBP) features. Compared to the existing automated solutions the proposed method has two main advantages: First, it does not depend on direct recording of artifact signals, which then, e.g. have to be subtracted from the contaminated EEG. Second, it is not limited to a specific number or type of artifact. In summary, the present method is an automatic, reliable, real-time capable and practical tool that reduces the time intensive manual selection of ICs for artifact removal. The results are very promising despite the relatively small channel resolution of 25 electrodes. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
Detection of potential mosquito breeding sites based on community sourced geotagged images
NASA Astrophysics Data System (ADS)
Agarwal, Ankit; Chaudhuri, Usashi; Chaudhuri, Subhasis; Seetharaman, Guna
2014-06-01
Various initiatives have been taken all over the world to involve the citizens in the collection and reporting of data to make better and informed data-driven decisions. Our work shows how the geotagged images collected through the general population can be used to combat Malaria and Dengue by identifying and visualizing localities that contain potential mosquito breeding sites. Our method first employs image quality assessment on the client side to reject the images with distortions like blur and artifacts. Each geotagged image received on the server is converted into a feature vector using the bag of visual words model. We train an SVM classifier on a histogram-based feature vector obtained after the vector quantization of SIFT features to discriminate images containing either a small stagnant water body like puddle, or open containers and tires, bushes etc. from those that contain flowing water, manicured lawns, tires attached to a vehicle etc. A geographical heat map is generated by assigning a specific location a probability value of it being a potential mosquito breeding ground of mosquito using feature level fusion or the max approach presented in the paper. The heat map thus generated can be used by concerned health authorities to take appropriate action and to promote civic awareness.
Bilinear Convolutional Neural Networks for Fine-grained Visual Recognition.
Lin, Tsung-Yu; RoyChowdhury, Aruni; Maji, Subhransu
2017-07-04
We present a simple and effective architecture for fine-grained recognition called Bilinear Convolutional Neural Networks (B-CNNs). These networks represent an image as a pooled outer product of features derived from two CNNs and capture localized feature interactions in a translationally invariant manner. B-CNNs are related to orderless texture representations built on deep features but can be trained in an end-to-end manner. Our most accurate model obtains 84.1%, 79.4%, 84.5% and 91.3% per-image accuracy on the Caltech-UCSD birds [66], NABirds [63], FGVC aircraft [42], and Stanford cars [33] dataset respectively and runs at 30 frames-per-second on a NVIDIA Titan X GPU. We then present a systematic analysis of these networks and show that (1) the bilinear features are highly redundant and can be reduced by an order of magnitude in size without significant loss in accuracy, (2) are also effective for other image classification tasks such as texture and scene recognition, and (3) can be trained from scratch on the ImageNet dataset offering consistent improvements over the baseline architecture. Finally, we present visualizations of these models on various datasets using top activations of neural units and gradient-based inversion techniques. The source code for the complete system is available at http://vis-www.cs.umass.edu/bcnn.
Probst, Yasmine; Nguyen, Duc Thanh; Tran, Minh Khoi; Li, Wanqing
2015-07-27
Dietary assessment, while traditionally based on pen-and-paper, is rapidly moving towards automatic approaches. This study describes an Australian automatic food record method and its prototype for dietary assessment via the use of a mobile phone and techniques of image processing and pattern recognition. Common visual features including scale invariant feature transformation (SIFT), local binary patterns (LBP), and colour are used for describing food images. The popular bag-of-words (BoW) model is employed for recognizing the images taken by a mobile phone for dietary assessment. Technical details are provided together with discussions on the issues and future work.
NASA Technical Reports Server (NTRS)
Hippensteele, S. A.; Russell, L. M.; Stepka, F. S.
1981-01-01
Commercially available elements of a composite consisting of a plastic sheet coated with liquid crystal, another sheet with a thin layer of a conducting material (gold or carbon), and copper bus bar strips were evaluated and found to provide a simple, convenient, accurate, and low-cost measuring device for use in heat transfer research. The particular feature of the composite is its ability to obtain local heat transfer coefficients and isotherm patterns that provide visual evaluation of the thermal performances of turbine blade cooling configurations. Examples of the use of the composite are presented.
NASA Astrophysics Data System (ADS)
Li, J.; Zhang, T.; Huang, Q.; Liu, Q.
2014-12-01
Today's climate datasets are featured with large volume, high degree of spatiotemporal complexity and evolving fast overtime. As visualizing large volume distributed climate datasets is computationally intensive, traditional desktop based visualization applications fail to handle the computational intensity. Recently, scientists have developed remote visualization techniques to address the computational issue. Remote visualization techniques usually leverage server-side parallel computing capabilities to perform visualization tasks and deliver visualization results to clients through network. In this research, we aim to build a remote parallel visualization platform for visualizing and analyzing massive climate data. Our visualization platform was built based on Paraview, which is one of the most popular open source remote visualization and analysis applications. To further enhance the scalability and stability of the platform, we have employed cloud computing techniques to support the deployment of the platform. In this platform, all climate datasets are regular grid data which are stored in NetCDF format. Three types of data access methods are supported in the platform: accessing remote datasets provided by OpenDAP servers, accessing datasets hosted on the web visualization server and accessing local datasets. Despite different data access methods, all visualization tasks are completed at the server side to reduce the workload of clients. As a proof of concept, we have implemented a set of scientific visualization methods to show the feasibility of the platform. Preliminary results indicate that the framework can address the computation limitation of desktop based visualization applications.
Local statistics of retinal optic flow for self-motion through natural sceneries.
Calow, Dirk; Lappe, Markus
2007-12-01
Image analysis in the visual system is well adapted to the statistics of natural scenes. Investigations of natural image statistics have so far mainly focused on static features. The present study is dedicated to the measurement and the analysis of the statistics of optic flow generated on the retina during locomotion through natural environments. Natural locomotion includes bouncing and swaying of the head and eye movement reflexes that stabilize gaze onto interesting objects in the scene while walking. We investigate the dependencies of the local statistics of optic flow on the depth structure of the natural environment and on the ego-motion parameters. To measure these dependencies we estimate the mutual information between correlated data sets. We analyze the results with respect to the variation of the dependencies over the visual field, since the visual motions in the optic flow vary depending on visual field position. We find that retinal flow direction and retinal speed show only minor statistical interdependencies. Retinal speed is statistically tightly connected to the depth structure of the scene. Retinal flow direction is statistically mostly driven by the relation between the direction of gaze and the direction of ego-motion. These dependencies differ at different visual field positions such that certain areas of the visual field provide more information about ego-motion and other areas provide more information about depth. The statistical properties of natural optic flow may be used to tune the performance of artificial vision systems based on human imitating behavior, and may be useful for analyzing properties of natural vision systems.
NASA Astrophysics Data System (ADS)
Tang, Chaoqing; Tian, Gui Yun; Chen, Xiaotian; Wu, Jianbo; Li, Kongjing; Meng, Hongying
2017-12-01
Active thermography provides infrared images that contain sub-surface defect information, while visible images only reveal surface information. Mapping infrared information to visible images offers more comprehensive visualization for decision-making in rail inspection. However, the common information for registration is limited due to different modalities in both local and global level. For example, rail track which has low temperature contrast reveals rich details in visible images, but turns blurry in the infrared counterparts. This paper proposes a registration algorithm called Edge-Guided Speeded-Up-Robust-Features (EG-SURF) to address this issue. Rather than sequentially integrating local and global information in matching stage which suffered from buckets effect, this algorithm adaptively integrates local and global information into a descriptor to gather more common information before matching. This adaptability consists of two facets, an adaptable weighting factor between local and global information, and an adaptable main direction accuracy. The local information is extracted using SURF while the global information is represented by shape context from edges. Meanwhile, in shape context generation process, edges are weighted according to local scale and decomposed into bins using a vector decomposition manner to provide more accurate descriptor. The proposed algorithm is qualitatively and quantitatively validated using eddy current pulsed thermography scene in the experiments. In comparison with other algorithms, better performance has been achieved.
Remote Sensing of Martian Terrain Hazards via Visually Salient Feature Detection
NASA Astrophysics Data System (ADS)
Al-Milli, S.; Shaukat, A.; Spiteri, C.; Gao, Y.
2014-04-01
The main objective of the FASTER remote sensing system is the detection of rocks on planetary surfaces by employing models that can efficiently characterise rocks in terms of semantic descriptions. The proposed technique abates some of the algorithmic limitations of existing methods with no training requirements, lower computational complexity and greater robustness towards visual tracking applications over long-distance planetary terrains. Visual saliency models inspired from biological systems help to identify important regions (such as rocks) in the visual scene. Surface rocks are therefore completely described in terms of their local or global conspicuity pop-out characteristics. These local and global pop-out cues are (but not limited to); colour, depth, orientation, curvature, size, luminance intensity, shape, topology etc. The currently applied methods follow a purely bottom-up strategy of visual attention for selection of conspicuous regions in the visual scene without any topdown control. Furthermore the choice of models used (tested and evaluated) are relatively fast among the state-of-the-art and have very low computational load. Quantitative evaluation of these state-ofthe- art models was carried out using benchmark datasets including the Surrey Space Centre Lab Testbed, Pangu generated images, RAL Space SEEKER and CNES Mars Yard datasets. The analysis indicates that models based on visually salient information in the frequency domain (SRA, SDSR, PQFT) are the best performing ones for detecting rocks in an extra-terrestrial setting. In particular the SRA model seems to be the most optimum of the lot especially that it requires the least computational time while keeping errors competitively low. The salient objects extracted using these models can then be merged with the Digital Elevation Models (DEMs) generated from the same navigation cameras in order to be fused to the navigation map thus giving a clear indication of the rock locations.
NASA Technical Reports Server (NTRS)
Lu, F. K.; Settles, G. S.; Bogdonoff, S. M.
1983-01-01
The interaction between a turbulent boundary layer and a shock wave generated by a sharp fin with leading edge sweepback was investigated. The incoming flow was at Mach 2.96 and at a unit Reynolds number of 63 x 10 to the 6th power 0.1 m. The approximate incoming boundary layer thickness was either 4 mm or 17 mm. The fins used were at 5 deg, 9 deg and 15 deg incidence and had leading edge sweepback from 0 deg to 65 deg. The tests consisted of surface kerosene lampblack streak visualization, surface pressure measurements, shock wave shape determination by shadowgraphs, and localized vapor screen visualization. The upstream influence lengths of the fin interactions were correlated using viscous and inviscid flow parameters. The parameters affecting the surface features close to the fin and way from the fin were also identified. Essentially, the surface features in the farfield were found to be conical.
Reinforcement learning in computer vision
NASA Astrophysics Data System (ADS)
Bernstein, A. V.; Burnaev, E. V.
2018-04-01
Nowadays, machine learning has become one of the basic technologies used in solving various computer vision tasks such as feature detection, image segmentation, object recognition and tracking. In many applications, various complex systems such as robots are equipped with visual sensors from which they learn state of surrounding environment by solving corresponding computer vision tasks. Solutions of these tasks are used for making decisions about possible future actions. It is not surprising that when solving computer vision tasks we should take into account special aspects of their subsequent application in model-based predictive control. Reinforcement learning is one of modern machine learning technologies in which learning is carried out through interaction with the environment. In recent years, Reinforcement learning has been used both for solving such applied tasks as processing and analysis of visual information, and for solving specific computer vision problems such as filtering, extracting image features, localizing objects in scenes, and many others. The paper describes shortly the Reinforcement learning technology and its use for solving computer vision problems.
Directional virtual backbone based data aggregation scheme for Wireless Visual Sensor Networks.
Zhang, Jing; Liu, Shi-Jian; Tsai, Pei-Wei; Zou, Fu-Min; Ji, Xiao-Rong
2018-01-01
Data gathering is a fundamental task in Wireless Visual Sensor Networks (WVSNs). Features of directional antennas and the visual data make WVSNs more complex than the conventional Wireless Sensor Network (WSN). The virtual backbone is a technique, which is capable of constructing clusters. The version associating with the aggregation operation is also referred to as the virtual backbone tree. In most of the existing literature, the main focus is on the efficiency brought by the construction of clusters that the existing methods neglect local-balance problems in general. To fill up this gap, Directional Virtual Backbone based Data Aggregation Scheme (DVBDAS) for the WVSNs is proposed in this paper. In addition, a measurement called the energy consumption density is proposed for evaluating the adequacy of results in the cluster-based construction problems. Moreover, the directional virtual backbone construction scheme is proposed by considering the local-balanced factor. Furthermore, the associated network coding mechanism is utilized to construct DVBDAS. Finally, both the theoretical analysis of the proposed DVBDAS and the simulations are given for evaluating the performance. The experimental results prove that the proposed DVBDAS achieves higher performance in terms of both the energy preservation and the network lifetime extension than the existing methods.
Visual search, visual streams, and visual architectures.
Green, M
1991-10-01
Most psychological, physiological, and computational models of early vision suggest that retinal information is divided into a parallel set of feature modules. The dominant theories of visual search assume that these modules form a "blackboard" architecture: a set of independent representations that communicate only through a central processor. A review of research shows that blackboard-based theories, such as feature-integration theory, cannot easily explain the existing data. The experimental evidence is more consistent with a "network" architecture, which stresses that: (1) feature modules are directly connected to one another, (2) features and their locations are represented together, (3) feature detection and integration are not distinct processing stages, and (4) no executive control process, such as focal attention, is needed to integrate features. Attention is not a spotlight that synthesizes objects from raw features. Instead, it is better to conceptualize attention as an aperture which masks irrelevant visual information.
Visual Typo Correction by Collocative Optimization: A Case Study on Merchandize Images.
Wei, Xiao-Yong; Yang, Zhen-Qun; Ngo, Chong-Wah; Zhang, Wei
2014-02-01
Near-duplicate retrieval (NDR) in merchandize images is of great importance to a lot of online applications on e-Commerce websites. In those applications where the requirement of response time is critical, however, the conventional techniques developed for a general purpose NDR are limited, because expensive post-processing like spatial verification or hashing is usually employed to compromise the quantization errors among the visual words used for the images. In this paper, we argue that most of the errors are introduced because of the quantization process where the visual words are considered individually, which has ignored the contextual relations among words. We propose a "spelling or phrase correction" like process for NDR, which extends the concept of collocations to visual domain for modeling the contextual relations. Binary quadratic programming is used to enforce the contextual consistency of words selected for an image, so that the errors (typos) are eliminated and the quality of the quantization process is improved. The experimental results show that the proposed method can improve the efficiency of NDR by reducing vocabulary size by 1000% times, and under the scenario of merchandize image NDR, the expensive local interest point feature used in conventional approaches can be replaced by color-moment feature, which reduces the time cost by 9202% while maintaining comparable performance to the state-of-the-art methods.
Perceptual learning in visual search: fast, enduring, but non-specific.
Sireteanu, R; Rettenbach, R
1995-07-01
Visual search has been suggested as a tool for isolating visual primitives. Elementary "features" were proposed to involve parallel search, while serial search is necessary for items without a "feature" status, or, in some cases, for conjunctions of "features". In this study, we investigated the role of practice in visual search tasks. We found that, under some circumstances, initially serial tasks can become parallel after a few hundred trials. Learning in visual search is far less specific than learning of visual discriminations and hyperacuity, suggesting that it takes place at another level in the central visual pathway, involving different neural circuits.
Botly, Leigh C P; De Rosa, Eve
2012-10-01
The visual search task established the feature integration theory of attention in humans and measures visuospatial attentional contributions to feature binding. We recently demonstrated that the neuromodulator acetylcholine (ACh), from the nucleus basalis magnocellularis (NBM), supports the attentional processes required for feature binding using a rat digging-based task. Additional research has demonstrated cholinergic contributions from the NBM to visuospatial attention in rats. Here, we combined these lines of evidence and employed visual search in rats to examine whether cortical cholinergic input supports visuospatial attention specifically for feature binding. We trained 18 male Long-Evans rats to perform visual search using touch screen-equipped operant chambers. Sessions comprised Feature Search (no feature binding required) and Conjunctive Search (feature binding required) trials using multiple stimulus set sizes. Following acquisition of visual search, 8 rats received bilateral NBM lesions using 192 IgG-saporin to selectively reduce cholinergic afferentation of the neocortex, which we hypothesized would selectively disrupt the visuospatial attentional processes needed for efficient conjunctive visual search. As expected, relative to sham-lesioned rats, ACh-NBM-lesioned rats took significantly longer to locate the target stimulus on Conjunctive Search, but not Feature Search trials, thus demonstrating that cholinergic contributions to visuospatial attention are important for feature binding in rats.
The 2D analytic signal for envelope detection and feature extraction on ultrasound images.
Wachinger, Christian; Klein, Tassilo; Navab, Nassir
2012-08-01
The fundamental property of the analytic signal is the split of identity, meaning the separation of qualitative and quantitative information in form of the local phase and the local amplitude, respectively. Especially the structural representation, independent of brightness and contrast, of the local phase is interesting for numerous image processing tasks. Recently, the extension of the analytic signal from 1D to 2D, covering also intrinsic 2D structures, was proposed. We show the advantages of this improved concept on ultrasound RF and B-mode images. Precisely, we use the 2D analytic signal for the envelope detection of RF data. This leads to advantages for the extraction of the information-bearing signal from the modulated carrier wave. We illustrate this, first, by visual assessment of the images, and second, by performing goodness-of-fit tests to a Nakagami distribution, indicating a clear improvement of statistical properties. The evaluation is performed for multiple window sizes and parameter estimation techniques. Finally, we show that the 2D analytic signal allows for an improved estimation of local features on B-mode images. Copyright © 2012 Elsevier B.V. All rights reserved.
Awareness Becomes Necessary Between Adaptive Pattern Coding of Open and Closed Curvatures
Sweeny, Timothy D.; Grabowecky, Marcia; Suzuki, Satoru
2012-01-01
Visual pattern processing becomes increasingly complex along the ventral pathway, from the low-level coding of local orientation in the primary visual cortex to the high-level coding of face identity in temporal visual areas. Previous research using pattern aftereffects as a psychophysical tool to measure activation of adaptive feature coding has suggested that awareness is relatively unimportant for the coding of orientation, but awareness is crucial for the coding of face identity. We investigated where along the ventral visual pathway awareness becomes crucial for pattern coding. Monoptic masking, which interferes with neural spiking activity in low-level processing while preserving awareness of the adaptor, eliminated open-curvature aftereffects but preserved closed-curvature aftereffects. In contrast, dichoptic masking, which spares spiking activity in low-level processing while wiping out awareness, preserved open-curvature aftereffects but eliminated closed-curvature aftereffects. This double dissociation suggests that adaptive coding of open and closed curvatures straddles the divide between weakly and strongly awareness-dependent pattern coding. PMID:21690314
Top-Down Visual Saliency via Joint CRF and Dictionary Learning.
Yang, Jimei; Yang, Ming-Hsuan
2017-03-01
Top-down visual saliency is an important module of visual attention. In this work, we propose a novel top-down saliency model that jointly learns a Conditional Random Field (CRF) and a visual dictionary. The proposed model incorporates a layered structure from top to bottom: CRF, sparse coding and image patches. With sparse coding as an intermediate layer, CRF is learned in a feature-adaptive manner; meanwhile with CRF as the output layer, the dictionary is learned under structured supervision. For efficient and effective joint learning, we develop a max-margin approach via a stochastic gradient descent algorithm. Experimental results on the Graz-02 and PASCAL VOC datasets show that our model performs favorably against state-of-the-art top-down saliency methods for target object localization. In addition, the dictionary update significantly improves the performance of our model. We demonstrate the merits of the proposed top-down saliency model by applying it to prioritizing object proposals for detection and predicting human fixations.
An evaluation of attention models for use in SLAM
NASA Astrophysics Data System (ADS)
Dodge, Samuel; Karam, Lina
2013-12-01
In this paper we study the application of visual saliency models for the simultaneous localization and mapping (SLAM) problem. We consider visual SLAM, where the location of the camera and a map of the environment can be generated using images from a single moving camera. In visual SLAM, the interest point detector is of key importance. This detector must be invariant to certain image transformations so that features can be matched across di erent frames. Recent work has used a model of human visual attention to detect interest points, however it is unclear as to what is the best attention model for this purpose. To this aim, we compare the performance of interest points from four saliency models (Itti, GBVS, RARE, and AWS) with the performance of four traditional interest point detectors (Harris, Shi-Tomasi, SIFT, and FAST). We evaluate these detectors under several di erent types of image transformation and nd that the Itti saliency model, in general, achieves the best performance in terms of keypoint repeatability.
SU-F-R-18: Updates to the Computational Environment for Radiological Research for Image Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Apte, Aditya P.; Deasy, Joseph O.
2016-06-15
Purpose: To present new tools in CERR for Texture Analysis and Visualization. Method: (1) Quantitative Image Analysis: We added the ability to compute Haralick texture features based on local neighbourhood. The Texture features depend on many parameters used in their derivation. For example: (a) directionality, (b) quantization of image, (c) patch-size for the neighborhood, (d) handling of the edge voxels within the region of interest, (e) Averaging co-occurance matrix vs texture features for different directions etc. A graphical user interface was built to set these parameters and then visualize their impact on the resulting texture maps. The entire functionality wasmore » written in Matlab. Array indexing was used to speed up the texture calculation. The computation speed is very competitive with the ITK library. Moreover, our implementation works with multiple CPUs and the computation time can be further reduced by using multiple processor threads. In order to reduce the Haralick texture maps into scalar features, we propose the use of Texture Volume Histograms. This lets users make use of the entire distribution of texture values within the region of interest rather than using just the mean and the standard deviations. (2) Qualitative/Visualization tools: The derived texture maps are stored as a new scan (derived) within CERR’s planC data structure. A display that compares various scans was built to show the raw image and the derived texture maps side-by-side. These images are positionally linked and can be navigated together. CERR’s graphics handling was updated and sped-up to be compatible with the newer Matlab versions. As a result, the users can use (a) different window levels and colormaps for different viewports, (b) click-and-drag or use mouse scroll-wheel to navigate slices. Results: The new features and updates are available via https://www.github.com/adityaapte/cerr . Conclusion: Features added to CERR increase its utility in Radiomics and Outcomes modeling.« less
NASA Astrophysics Data System (ADS)
Du, Hongbo; Al-Jubouri, Hanan; Sellahewa, Harin
2014-05-01
Content-based image retrieval is an automatic process of retrieving images according to image visual contents instead of textual annotations. It has many areas of application from automatic image annotation and archive, image classification and categorization to homeland security and law enforcement. The key issues affecting the performance of such retrieval systems include sensible image features that can effectively capture the right amount of visual contents and suitable similarity measures to find similar and relevant images ranked in a meaningful order. Many different approaches, methods and techniques have been developed as a result of very intensive research in the past two decades. Among many existing approaches, is a cluster-based approach where clustering methods are used to group local feature descriptors into homogeneous regions, and search is conducted by comparing the regions of the query image against those of the stored images. This paper serves as a review of works in this area. The paper will first summarize the existing work reported in the literature and then present the authors' own investigations in this field. The paper intends to highlight not only achievements made by recent research but also challenges and difficulties still remaining in this area.
Altered brain response for semantic knowledge in Alzheimer's disease.
Wierenga, Christina E; Stricker, Nikki H; McCauley, Ashley; Simmons, Alan; Jak, Amy J; Chang, Yu-Ling; Nation, Daniel A; Bangen, Katherine J; Salmon, David P; Bondi, Mark W
2011-02-01
Word retrieval deficits are common in Alzheimer's disease (AD) and are thought to reflect a degradation of semantic memory. Yet, the nature of semantic deterioration in AD and the underlying neural correlates of these semantic memory changes remain largely unknown. We examined the semantic memory impairment in AD by investigating the neural correlates of category knowledge (e.g., living vs. nonliving) and featural processing (global vs. local visual information). During event-related fMRI, 10 adults diagnosed with mild AD and 22 cognitively normal (CN) older adults named aloud items from three categories for which processing of specific visual features has previously been dissociated from categorical features. Results showed widespread group differences in the categorical representation of semantic knowledge in several language-related brain areas. For example, the right inferior frontal gyrus showed selective brain response for nonliving items in the CN group but living items in the AD group. Additionally, the AD group showed increased brain response for word retrieval irrespective of category in Broca's homologue in the right hemisphere and rostral cingulate cortex bilaterally, which suggests greater recruitment of frontally mediated neural compensatory mechanisms in the face of semantic alteration. Copyright © 2010 Elsevier Ltd. All rights reserved.
Visualizing Gyrokinetic Turbulence in a Tokamak
NASA Astrophysics Data System (ADS)
Stantchev, George
2005-10-01
Multi-dimensional data output from gyrokinetic microturbulence codes are often difficult to visualize, in part due to the non-trivial geometry of the underlying grids, in part due to high irregularity of the relevant scalar field structures in turbulent regions. For instance, traditional isosurface extraction methods are likely to fail for the electrostatic potential field whose level sets may exhibit various geometric pathologies. To address these issues we develop an advanced interactive 3D gyrokinetic turbulence visualization framework which we apply in the study of microtearing instabilities calculated with GS2 in the MAST and NSTX geometries. In these simulations GS2 uses field-line-following coordinates such that the computational domain maps in physical space to a long, twisting flux tube with strong cross-sectional shear. Using statistical wavelet analysis we create a sparse multiple-scale volumetric representation of the relevant scalar fields, which we visualize via a variation of the so called splatting technique. To handle the problem of highly anisotropic flux tube configurations we adapt a geometry-driven surface illumination algorithm that places local light sources for effective feature-enhanced visualization.
Can responses to basic non-numerical visual features explain neural numerosity responses?
Harvey, Ben M; Dumoulin, Serge O
2017-04-01
Humans and many animals can distinguish between stimuli that differ in numerosity, the number of objects in a set. Human and macaque parietal lobes contain neurons that respond to changes in stimulus numerosity. However, basic non-numerical visual features can affect neural responses to and perception of numerosity, and visual features often co-vary with numerosity. Therefore, it is debated whether numerosity or co-varying low-level visual features underlie neural and behavioral responses to numerosity. To test the hypothesis that non-numerical visual features underlie neural numerosity responses in a human parietal numerosity map, we analyze responses to a group of numerosity stimulus configurations that have the same numerosity progression but vary considerably in their non-numerical visual features. Using ultra-high-field (7T) fMRI, we measure responses to these stimulus configurations in an area of posterior parietal cortex whose responses are believed to reflect numerosity-selective activity. We describe an fMRI analysis method to distinguish between alternative models of neural response functions, following a population receptive field (pRF) modeling approach. For each stimulus configuration, we first quantify the relationships between numerosity and several non-numerical visual features that have been proposed to underlie performance in numerosity discrimination tasks. We then determine how well responses to these non-numerical visual features predict the observed fMRI responses, and compare this to the predictions of responses to numerosity. We demonstrate that a numerosity response model predicts observed responses more accurately than models of responses to simple non-numerical visual features. As such, neural responses in cognitive processing need not reflect simpler properties of early sensory inputs. Copyright © 2017 Elsevier Inc. All rights reserved.
Yang, Yan-Li; Deng, Hong-Xia; Xing, Gui-Yang; Xia, Xiao-Luan; Li, Hai-Fang
2015-02-01
It is not clear whether the method used in functional brain-network related research can be applied to explore the feature binding mechanism of visual perception. In this study, we investigated feature binding of color and shape in visual perception. Functional magnetic resonance imaging data were collected from 38 healthy volunteers at rest and while performing a visual perception task to construct brain networks active during resting and task states. Results showed that brain regions involved in visual information processing were obviously activated during the task. The components were partitioned using a greedy algorithm, indicating the visual network existed during the resting state. Z-values in the vision-related brain regions were calculated, confirming the dynamic balance of the brain network. Connectivity between brain regions was determined, and the result showed that occipital and lingual gyri were stable brain regions in the visual system network, the parietal lobe played a very important role in the binding process of color features and shape features, and the fusiform and inferior temporal gyri were crucial for processing color and shape information. Experimental findings indicate that understanding visual feature binding and cognitive processes will help establish computational models of vision, improve image recognition technology, and provide a new theoretical mechanism for feature binding in visual perception.
Localization from Visual Landmarks on a Free-Flying Robot
NASA Technical Reports Server (NTRS)
Coltin, Brian; Fusco, Jesse; Moratto, Zack; Alexandrov, Oleg; Nakamura, Robert
2016-01-01
We present the localization approach for Astrobee,a new free-flying robot designed to navigate autonomously on board the International Space Station (ISS). Astrobee will conduct experiments in microgravity, as well as assisst astronauts and ground controllers. Astrobee replaces the SPHERES robots which currently operate on the ISS, which were limited to operating in a small cube since their localization system relied on triangulation from ultrasonic transmitters. Astrobee localizes with only monocular vision and an IMU, enabling it to traverse the entire US segment of the station. Features detected on a previously-built map, optical flow information,and IMU readings are all integrated into an extended Kalman filter (EKF) to estimate the robot pose. We introduce several modifications to the filter to make it more robust to noise.Finally, we extensively evaluate the behavior of the filter on atwo-dimensional testing surface.
Neural Mechanisms of Cortical Motion Computation Based on a Neuromorphic Sensory System
Abdul-Kreem, Luma Issa; Neumann, Heiko
2015-01-01
The visual cortex analyzes motion information along hierarchically arranged visual areas that interact through bidirectional interconnections. This work suggests a bio-inspired visual model focusing on the interactions of the cortical areas in which a new mechanism of feedforward and feedback processing are introduced. The model uses a neuromorphic vision sensor (silicon retina) that simulates the spike-generation functionality of the biological retina. Our model takes into account two main model visual areas, namely V1 and MT, with different feature selectivities. The initial motion is estimated in model area V1 using spatiotemporal filters to locally detect the direction of motion. Here, we adapt the filtering scheme originally suggested by Adelson and Bergen to make it consistent with the spike representation of the DVS. The responses of area V1 are weighted and pooled by area MT cells which are selective to different velocities, i.e. direction and speed. Such feature selectivity is here derived from compositions of activities in the spatio-temporal domain and integrating over larger space-time regions (receptive fields). In order to account for the bidirectional coupling of cortical areas we match properties of the feature selectivity in both areas for feedback processing. For such linkage we integrate the responses over different speeds along a particular preferred direction. Normalization of activities is carried out over the spatial as well as the feature domains to balance the activities of individual neurons in model areas V1 and MT. Our model was tested using different stimuli that moved in different directions. The results reveal that the error margin between the estimated motion and synthetic ground truth is decreased in area MT comparing with the initial estimation of area V1. In addition, the modulated V1 cell activations shows an enhancement of the initial motion estimation that is steered by feedback signals from MT cells. PMID:26554589
Metric Learning for Hyperspectral Image Segmentation
NASA Technical Reports Server (NTRS)
Bue, Brian D.; Thompson, David R.; Gilmore, Martha S.; Castano, Rebecca
2011-01-01
We present a metric learning approach to improve the performance of unsupervised hyperspectral image segmentation. Unsupervised spatial segmentation can assist both user visualization and automatic recognition of surface features. Analysts can use spatially-continuous segments to decrease noise levels and/or localize feature boundaries. However, existing segmentation methods use tasks-agnostic measures of similarity. Here we learn task-specific similarity measures from training data, improving segment fidelity to classes of interest. Multiclass Linear Discriminate Analysis produces a linear transform that optimally separates a labeled set of training classes. The defines a distance metric that generalized to a new scenes, enabling graph-based segmentation that emphasizes key spectral features. We describe tests based on data from the Compact Reconnaissance Imaging Spectrometer (CRISM) in which learned metrics improve segment homogeneity with respect to mineralogical classes.
Heinen, Klaartje; Feredoes, Eva; Weiskopf, Nikolaus; Ruff, Christian C; Driver, Jon
2014-11-01
Voluntary selective attention can prioritize different features in a visual scene. The frontal eye-fields (FEF) are one potential source of such feature-specific top-down signals, but causal evidence for influences on visual cortex (as was shown for "spatial" attention) has remained elusive. Here, we show that transcranial magnetic stimulation (TMS) applied to right FEF increased the blood oxygen level-dependent (BOLD) signals in visual areas processing "target feature" but not in "distracter feature"-processing regions. TMS-induced BOLD signals increase in motion-responsive visual cortex (MT+) when motion was attended in a display with moving dots superimposed on face stimuli, but in face-responsive fusiform area (FFA) when faces were attended to. These TMS effects on BOLD signal in both regions were negatively related to performance (on the motion task), supporting the behavioral relevance of this pathway. Our findings provide new causal evidence for the human FEF in the control of nonspatial "feature"-based attention, mediated by dynamic influences on feature-specific visual cortex that vary with the currently attended property. © The Author 2013. Published by Oxford University Press.
Cross-Modal Retrieval With CNN Visual Features: A New Baseline.
Wei, Yunchao; Zhao, Yao; Lu, Canyi; Wei, Shikui; Liu, Luoqi; Zhu, Zhenfeng; Yan, Shuicheng
2017-02-01
Recently, convolutional neural network (CNN) visual features have demonstrated their powerful ability as a universal representation for various recognition tasks. In this paper, cross-modal retrieval with CNN visual features is implemented with several classic methods. Specifically, off-the-shelf CNN visual features are extracted from the CNN model, which is pretrained on ImageNet with more than one million images from 1000 object categories, as a generic image representation to tackle cross-modal retrieval. To further enhance the representational ability of CNN visual features, based on the pretrained CNN model on ImageNet, a fine-tuning step is performed by using the open source Caffe CNN library for each target data set. Besides, we propose a deep semantic matching method to address the cross-modal retrieval problem with respect to samples which are annotated with one or multiple labels. Extensive experiments on five popular publicly available data sets well demonstrate the superiority of CNN visual features for cross-modal retrieval.
Infrared and Visual Image Fusion through Fuzzy Measure and Alternating Operators
Bai, Xiangzhi
2015-01-01
The crucial problem of infrared and visual image fusion is how to effectively extract the image features, including the image regions and details and combine these features into the final fusion result to produce a clear fused image. To obtain an effective fusion result with clear image details, an algorithm for infrared and visual image fusion through the fuzzy measure and alternating operators is proposed in this paper. Firstly, the alternating operators constructed using the opening and closing based toggle operator are analyzed. Secondly, two types of the constructed alternating operators are used to extract the multi-scale features of the original infrared and visual images for fusion. Thirdly, the extracted multi-scale features are combined through the fuzzy measure-based weight strategy to form the final fusion features. Finally, the final fusion features are incorporated with the original infrared and visual images using the contrast enlargement strategy. All the experimental results indicate that the proposed algorithm is effective for infrared and visual image fusion. PMID:26184229
Infrared and Visual Image Fusion through Fuzzy Measure and Alternating Operators.
Bai, Xiangzhi
2015-07-15
The crucial problem of infrared and visual image fusion is how to effectively extract the image features, including the image regions and details and combine these features into the final fusion result to produce a clear fused image. To obtain an effective fusion result with clear image details, an algorithm for infrared and visual image fusion through the fuzzy measure and alternating operators is proposed in this paper. Firstly, the alternating operators constructed using the opening and closing based toggle operator are analyzed. Secondly, two types of the constructed alternating operators are used to extract the multi-scale features of the original infrared and visual images for fusion. Thirdly, the extracted multi-scale features are combined through the fuzzy measure-based weight strategy to form the final fusion features. Finally, the final fusion features are incorporated with the original infrared and visual images using the contrast enlargement strategy. All the experimental results indicate that the proposed algorithm is effective for infrared and visual image fusion.
Snow Tweets: Emergency Information Dissemination in a US County During 2014 Winter Storms
Bonnan-White, Jess; Shulman, Jason; Bielecke, Abigail
2014-01-01
Introduction: This paper describes how American federal, state, and local organizations created, sourced, and disseminated emergency information via social media in preparation for several winter storms in one county in the state of New Jersey (USA). Methods: Postings submitted to Twitter for three winter storm periods were collected from selected organizations, along with a purposeful sample of select private local users. Storm-related posts were analyzed for stylistic features (hashtags, retweet mentions, embedded URLs). Sharing and re-tweeting patterns were also mapped using NodeXL. Results: Results indicate emergency management entities were active in providing preparedness and response information during the selected winter weather events. A large number of posts, however, did not include unique Twitter features that maximize dissemination and discovery by users. Visual representations of interactions illustrate opportunities for developing stronger relationships among agencies. Discussion: Whereas previous research predominantly focuses on large-scale national or international disaster contexts, the current study instead provides needed analysis in a small-scale context. With practice during localized events like extreme weather, effective information dissemination in large events can be enhanced. PMID:25685629
Snow Tweets: Emergency Information Dissemination in a US County During 2014 Winter Storms.
Bonnan-White, Jess; Shulman, Jason; Bielecke, Abigail
2014-12-22
This paper describes how American federal, state, and local organizations created, sourced, and disseminated emergency information via social media in preparation for several winter storms in one county in the state of New Jersey (USA). Postings submitted to Twitter for three winter storm periods were collected from selected organizations, along with a purposeful sample of select private local users. Storm-related posts were analyzed for stylistic features (hashtags, retweet mentions, embedded URLs). Sharing and re-tweeting patterns were also mapped using NodeXL. RESULTS indicate emergency management entities were active in providing preparedness and response information during the selected winter weather events. A large number of posts, however, did not include unique Twitter features that maximize dissemination and discovery by users. Visual representations of interactions illustrate opportunities for developing stronger relationships among agencies. Whereas previous research predominantly focuses on large-scale national or international disaster contexts, the current study instead provides needed analysis in a small-scale context. With practice during localized events like extreme weather, effective information dissemination in large events can be enhanced.
NASA Astrophysics Data System (ADS)
Liu, Xiaoqi; Wang, Chengliang; Bai, Jianying; Liao, Guobin
2018-02-01
Portal hypertensive gastropathy (PHG) is common in gastrointestinal (GI) diseases, and a severe stage of PHG (S-PHG) is a source of gastrointestinal active bleeding. Generally, the diagnosis of PHG is made visually during endoscopic examination; compared with traditional endoscopy, (wireless capsule endoscopy) WCE with noninvasive and painless is chosen as a prevalent tool for visual observation of PHG. However, accurate measurement of WCE images with PHG is a difficult task due to faint contrast and confusing variations in background gastric mucosal tissue for physicians. Therefore, this paper proposes a comprehensive methodology to automatically detect S-PHG images in WCE video to help physicians accurately diagnose S-PHG. Firstly, a rough dominatecolor-tone extraction approach is proposed for better describing global color distribution information of gastric mucosa. Secondly, a hybrid two-layer texture acquisition model is designed by integrating co-occurrence matrix into local binary pattern to depict complex and unique gastric mucosal microstructure local variation. Finally, features of mucosal color and microstructure texture are merged into linear support vector machine to accomplish this automatic classification task. Experiments were implemented on an annotated data set including 1,050 SPHG and 1,370 normal images collected from 36 real patients of different nationalities, ages and genders. By comparison with three traditional texture extraction methods, our method, combined with experimental results, performs best in detection of S-PHG images in WCE video: the maximum of accuracy, sensitivity and specificity reach 0.90, 0.92 and 0.92 respectively.
Mobile visual object identification: from SIFT-BoF-RANSAC to Sketchprint
NASA Astrophysics Data System (ADS)
Voloshynovskiy, Sviatoslav; Diephuis, Maurits; Holotyak, Taras
2015-03-01
Mobile object identification based on its visual features find many applications in the interaction with physical objects and security. Discriminative and robust content representation plays a central role in object and content identification. Complex post-processing methods are used to compress descriptors and their geometrical information, aggregate them into more compact and discriminative representations and finally re-rank the results based on the similarity geometries of descriptors. Unfortunately, most of the existing descriptors are not very robust and discriminative once applied to the various contend such as real images, text or noise-like microstructures next to requiring at least 500-1'000 descriptors per image for reliable identification. At the same time, the geometric re-ranking procedures are still too complex to be applied to the numerous candidates obtained from the feature similarity based search only. This restricts that list of candidates to be less than 1'000 which obviously causes a higher probability of miss. In addition, the security and privacy of content representation has become a hot research topic in multimedia and security communities. In this paper, we introduce a new framework for non- local content representation based on SketchPrint descriptors. It extends the properties of local descriptors to a more informative and discriminative, yet geometrically invariant content representation. In particular it allows images to be compactly represented by 100 SketchPrint descriptors without being fully dependent on re-ranking methods. We consider several use cases, applying SketchPrint descriptors to natural images, text documents, packages and micro-structures and compare them with the traditional local descriptors.
Learning to rank using user clicks and visual features for image retrieval.
Yu, Jun; Tao, Dacheng; Wang, Meng; Rui, Yong
2015-04-01
The inconsistency between textual features and visual contents can cause poor image search results. To solve this problem, click features, which are more reliable than textual information in justifying the relevance between a query and clicked images, are adopted in image ranking model. However, the existing ranking model cannot integrate visual features, which are efficient in refining the click-based search results. In this paper, we propose a novel ranking model based on the learning to rank framework. Visual features and click features are simultaneously utilized to obtain the ranking model. Specifically, the proposed approach is based on large margin structured output learning and the visual consistency is integrated with the click features through a hypergraph regularizer term. In accordance with the fast alternating linearization method, we design a novel algorithm to optimize the objective function. This algorithm alternately minimizes two different approximations of the original objective function by keeping one function unchanged and linearizing the other. We conduct experiments on a large-scale dataset collected from the Microsoft Bing image search engine, and the results demonstrate that the proposed learning to rank models based on visual features and user clicks outperforms state-of-the-art algorithms.
Blind image quality assessment via probabilistic latent semantic analysis.
Yang, Xichen; Sun, Quansen; Wang, Tianshu
2016-01-01
We propose a blind image quality assessment that is highly unsupervised and training free. The new method is based on the hypothesis that the effect caused by distortion can be expressed by certain latent characteristics. Combined with probabilistic latent semantic analysis, the latent characteristics can be discovered by applying a topic model over a visual word dictionary. Four distortion-affected features are extracted to form the visual words in the dictionary: (1) the block-based local histogram; (2) the block-based local mean value; (3) the mean value of contrast within a block; (4) the variance of contrast within a block. Based on the dictionary, the latent topics in the images can be discovered. The discrepancy between the frequency of the topics in an unfamiliar image and a large number of pristine images is applied to measure the image quality. Experimental results for four open databases show that the newly proposed method correlates well with human subjective judgments of diversely distorted images.
Looking without Perceiving: Impaired Preattentive Perceptual Grouping in Autism Spectrum Disorder
Carther-Krone, Tiffany A.; Shomstein, Sarah; Marotta, Jonathan J.
2016-01-01
Before becoming aware of a visual scene, our perceptual system has organized and selected elements in our environment to which attention should be allocated. Part of this process involves grouping perceptual features into a global whole. Individuals with autism spectrum disorders (ASD) rely on a more local processing strategy, which may be driven by difficulties perceptually grouping stimuli. We tested this notion using a line discrimination task in which two horizontal lines were superimposed on a background of black and white dots organized so that, on occasion, the dots induced the Ponzo illusion if perceptually grouped together. Results showed that even though neither group was aware of the illusion, the ASD group was significantly less likely than typically developing group to make perceptual judgments influenced by the illusion, revealing difficulties in preattentive grouping of visual stimuli. This may explain why individuals with ASD rely on local processing strategies, and offers new insight into the mechanism driving perceptual grouping in the typically developing human brain. PMID:27355678
Visual search for feature and conjunction targets with an attention deficit.
Arguin, M; Joanette, Y; Cavanagh, P
1993-01-01
Abstract Brain-damaged subjects who had previously been identified as suffering from a visual attention deficit for contralesional stimulation were tested on a series of visual search tasks. The experiments examined the hypothesis that the processing of single features is preattentive but that feature integration, necessary for the correct perception of conjunctions of features, requires attention (Treisman & Gelade, 1980 Treisman & Sato, 1990). Subjects searched for a feature target (orientation or color) or for a conjunction target (orientation and color) in unilateral displays in which the number of items presented was variable. Ocular fixation was controlled so that trials on which eye movements occurred were cancelled. While brain-damaged subjects with a visual attention disorder (VAD subjects) performed similarly to normal controls in feature search tasks, they showed a marked deficit in conjunction search. Specifically, VAD subjects exhibited an important reduction of their serial search rates for a conjunction target with contralesional displays. In support of Treisman's feature integration theory, a visual attention deficit leads to a marked impairment in feature integration whereas it does not appear to affect feature encoding.
Dynamic binding of visual features by neuronal/stimulus synchrony.
Iwabuchi, A
1998-05-01
When people see a visual scene, certain parts of the visual scene are treated as belonging together and we regard them as a perceptual unit, which is called a "figure". People focus on figures, and the remaining parts of the scene are disregarded as "ground". In Gestalt psychology this process is called "figure-ground segregation". According to current perceptual psychology, a figure is formed by binding various visual features in a scene, and developments in neuroscience have revealed that there are many feature-encoding neurons, which respond to such features specifically. It is not known, however, how the brain binds different features of an object into a coherent visual object representation. Recently, the theory of binding by neuronal synchrony, which argues that feature binding is dynamically mediated by neuronal synchrony of feature-encoding neurons, has been proposed. This review article portrays the problem of figure-ground segregation and features binding, summarizes neurophysiological and psychophysical experiments and theory relevant to feature binding by neuronal/stimulus synchrony, and suggests possible directions for future research on this topic.
HOTS: A Hierarchy of Event-Based Time-Surfaces for Pattern Recognition.
Lagorce, Xavier; Orchard, Garrick; Galluppi, Francesco; Shi, Bertram E; Benosman, Ryad B
2017-07-01
This paper describes novel event-based spatio-temporal features called time-surfaces and how they can be used to create a hierarchical event-based pattern recognition architecture. Unlike existing hierarchical architectures for pattern recognition, the presented model relies on a time oriented approach to extract spatio-temporal features from the asynchronously acquired dynamics of a visual scene. These dynamics are acquired using biologically inspired frameless asynchronous event-driven vision sensors. Similarly to cortical structures, subsequent layers in our hierarchy extract increasingly abstract features using increasingly large spatio-temporal windows. The central concept is to use the rich temporal information provided by events to create contexts in the form of time-surfaces which represent the recent temporal activity within a local spatial neighborhood. We demonstrate that this concept can robustly be used at all stages of an event-based hierarchical model. First layer feature units operate on groups of pixels, while subsequent layer feature units operate on the output of lower level feature units. We report results on a previously published 36 class character recognition task and a four class canonical dynamic card pip task, achieving near 100 percent accuracy on each. We introduce a new seven class moving face recognition task, achieving 79 percent accuracy.This paper describes novel event-based spatio-temporal features called time-surfaces and how they can be used to create a hierarchical event-based pattern recognition architecture. Unlike existing hierarchical architectures for pattern recognition, the presented model relies on a time oriented approach to extract spatio-temporal features from the asynchronously acquired dynamics of a visual scene. These dynamics are acquired using biologically inspired frameless asynchronous event-driven vision sensors. Similarly to cortical structures, subsequent layers in our hierarchy extract increasingly abstract features using increasingly large spatio-temporal windows. The central concept is to use the rich temporal information provided by events to create contexts in the form of time-surfaces which represent the recent temporal activity within a local spatial neighborhood. We demonstrate that this concept can robustly be used at all stages of an event-based hierarchical model. First layer feature units operate on groups of pixels, while subsequent layer feature units operate on the output of lower level feature units. We report results on a previously published 36 class character recognition task and a four class canonical dynamic card pip task, achieving near 100 percent accuracy on each. We introduce a new seven class moving face recognition task, achieving 79 percent accuracy.
A conceptual framework of computations in mid-level vision
Kubilius, Jonas; Wagemans, Johan; Op de Beeck, Hans P.
2014-01-01
If a picture is worth a thousand words, as an English idiom goes, what should those words—or, rather, descriptors—capture? What format of image representation would be sufficiently rich if we were to reconstruct the essence of images from their descriptors? In this paper, we set out to develop a conceptual framework that would be: (i) biologically plausible in order to provide a better mechanistic understanding of our visual system; (ii) sufficiently robust to apply in practice on realistic images; and (iii) able to tap into underlying structure of our visual world. We bring forward three key ideas. First, we argue that surface-based representations are constructed based on feature inference from the input in the intermediate processing layers of the visual system. Such representations are computed in a largely pre-semantic (prior to categorization) and pre-attentive manner using multiple cues (orientation, color, polarity, variation in orientation, and so on), and explicitly retain configural relations between features. The constructed surfaces may be partially overlapping to compensate for occlusions and are ordered in depth (figure-ground organization). Second, we propose that such intermediate representations could be formed by a hierarchical computation of similarity between features in local image patches and pooling of highly-similar units, and reestimated via recurrent loops according to the task demands. Finally, we suggest to use datasets composed of realistically rendered artificial objects and surfaces in order to better understand a model's behavior and its limitations. PMID:25566044
A conceptual framework of computations in mid-level vision.
Kubilius, Jonas; Wagemans, Johan; Op de Beeck, Hans P
2014-01-01
If a picture is worth a thousand words, as an English idiom goes, what should those words-or, rather, descriptors-capture? What format of image representation would be sufficiently rich if we were to reconstruct the essence of images from their descriptors? In this paper, we set out to develop a conceptual framework that would be: (i) biologically plausible in order to provide a better mechanistic understanding of our visual system; (ii) sufficiently robust to apply in practice on realistic images; and (iii) able to tap into underlying structure of our visual world. We bring forward three key ideas. First, we argue that surface-based representations are constructed based on feature inference from the input in the intermediate processing layers of the visual system. Such representations are computed in a largely pre-semantic (prior to categorization) and pre-attentive manner using multiple cues (orientation, color, polarity, variation in orientation, and so on), and explicitly retain configural relations between features. The constructed surfaces may be partially overlapping to compensate for occlusions and are ordered in depth (figure-ground organization). Second, we propose that such intermediate representations could be formed by a hierarchical computation of similarity between features in local image patches and pooling of highly-similar units, and reestimated via recurrent loops according to the task demands. Finally, we suggest to use datasets composed of realistically rendered artificial objects and surfaces in order to better understand a model's behavior and its limitations.
Feature Masking in Computer Game Promotes Visual Imagery
ERIC Educational Resources Information Center
Smith, Glenn Gordon; Morey, Jim; Tjoe, Edwin
2007-01-01
Can learning of mental imagery skills for visualizing shapes be accelerated with feature masking? Chemistry, physics fine arts, military tactics, and laparoscopic surgery often depend on mentally visualizing shapes in their absence. Does working with "spatial feature-masks" (skeletal shapes, missing key identifying portions) encourage people to…
Computer-aided Classification of Mammographic Masses Using Visually Sensitive Image Features
Wang, Yunzhi; Aghaei, Faranak; Zarafshani, Ali; Qiu, Yuchen; Qian, Wei; Zheng, Bin
2017-01-01
Purpose To develop a new computer-aided diagnosis (CAD) scheme that computes visually sensitive image features routinely used by radiologists to develop a machine learning classifier and distinguish between the malignant and benign breast masses detected from digital mammograms. Methods An image dataset including 301 breast masses was retrospectively selected. From each segmented mass region, we computed image features that mimic five categories of visually sensitive features routinely used by radiologists in reading mammograms. We then selected five optimal features in the five feature categories and applied logistic regression models for classification. A new CAD interface was also designed to show lesion segmentation, computed feature values and classification score. Results Areas under ROC curves (AUC) were 0.786±0.026 and 0.758±0.027 when to classify mass regions depicting on two view images, respectively. By fusing classification scores computed from two regions, AUC increased to 0.806±0.025. Conclusion This study demonstrated a new approach to develop CAD scheme based on 5 visually sensitive image features. Combining with a “visual aid” interface, CAD results may be much more easily explainable to the observers and increase their confidence to consider CAD generated classification results than using other conventional CAD approaches, which involve many complicated and visually insensitive texture features. PMID:27911353
Solid object visualization of 3D ultrasound data
NASA Astrophysics Data System (ADS)
Nelson, Thomas R.; Bailey, Michael J.
2000-04-01
Visualization of volumetric medical data is challenging. Rapid-prototyping (RP) equipment producing solid object prototype models of computer generated structures is directly applicable to visualization of medical anatomic data. The purpose of this study was to develop methods for transferring 3D Ultrasound (3DUS) data to RP equipment for visualization of patient anatomy. 3DUS data were acquired using research and clinical scanning systems. Scaling information was preserved and the data were segmented using threshold and local operators to extract features of interest, converted from voxel raster coordinate format to a set of polygons representing an iso-surface and transferred to the RP machine to create a solid 3D object. Fabrication required 30 to 60 minutes depending on object size and complexity. After creation the model could be touched and viewed. A '3D visualization hardcopy device' has advantages for conveying spatial relations compared to visualization using computer display systems. The hardcopy model may be used for teaching or therapy planning. Objects may be produced at the exact dimension of the original object or scaled up (or down) to facilitate matching the viewers reference frame more optimally. RP models represent a useful means of communicating important information in a tangible fashion to patients and physicians.
Coding of navigational affordances in the human visual system
Epstein, Russell A.
2017-01-01
A central component of spatial navigation is determining where one can and cannot go in the immediate environment. We used fMRI to test the hypothesis that the human visual system solves this problem by automatically identifying the navigational affordances of the local scene. Multivoxel pattern analyses showed that a scene-selective region of dorsal occipitoparietal cortex, known as the occipital place area, represents pathways for movement in scenes in a manner that is tolerant to variability in other visual features. These effects were found in two experiments: One using tightly controlled artificial environments as stimuli, the other using a diverse set of complex, natural scenes. A reconstruction analysis demonstrated that the population codes of the occipital place area could be used to predict the affordances of novel scenes. Taken together, these results reveal a previously unknown mechanism for perceiving the affordance structure of navigable space. PMID:28416669
Animated analysis of geoscientific datasets: An interactive graphical application
NASA Astrophysics Data System (ADS)
Morse, Peter; Reading, Anya; Lueg, Christopher
2017-12-01
Geoscientists are required to analyze and draw conclusions from increasingly large volumes of data. There is a need to recognise and characterise features and changing patterns of Earth observables within such large datasets. It is also necessary to identify significant subsets of the data for more detailed analysis. We present an innovative, interactive software tool and workflow to visualise, characterise, sample and tag large geoscientific datasets from both local and cloud-based repositories. It uses an animated interface and human-computer interaction to utilise the capacity of human expert observers to identify features via enhanced visual analytics. 'Tagger' enables users to analyze datasets that are too large in volume to be drawn legibly on a reasonable number of single static plots. Users interact with the moving graphical display, tagging data ranges of interest for subsequent attention. The tool provides a rapid pre-pass process using fast GPU-based OpenGL graphics and data-handling and is coded in the Quartz Composer visual programing language (VPL) on Mac OSX. It makes use of interoperable data formats, and cloud-based (or local) data storage and compute. In a case study, Tagger was used to characterise a decade (2000-2009) of data recorded by the Cape Sorell Waverider Buoy, located approximately 10 km off the west coast of Tasmania, Australia. These data serve as a proxy for the understanding of Southern Ocean storminess, which has both local and global implications. This example shows use of the tool to identify and characterise 4 different types of storm and non-storm events during this time. Events characterised in this way are compared with conventional analysis, noting advantages and limitations of data analysis using animation and human interaction. Tagger provides a new ability to make use of humans as feature detectors in computer-based analysis of large-volume geosciences and other data.
Utilizing Visual Effects Software for Efficient and Flexible Isostatic Adjustment Modelling
NASA Astrophysics Data System (ADS)
Meldgaard, A.; Nielsen, L.; Iaffaldano, G.
2017-12-01
The isostatic adjustment signal generated by transient ice sheet loading is an important indicator of past ice sheet extent and the rheological constitution of the interior of the Earth. Finite element modelling has proved to be a very useful tool in these studies. We present a simple numerical model for 3D visco elastic Earth deformation and a new approach to the design of such models utilizing visual effects software designed for the film and game industry. The software package Houdini offers an assortment of optimized tools and libraries which greatly facilitate the creation of efficient numerical algorithms. In particular, we make use of Houdini's procedural work flow, the SIMD programming language VEX, Houdini's sparse matrix creation and inversion libraries, an inbuilt tetrahedralizer for grid creation, and the user interface, which facilitates effortless manipulation of 3D geometry. We mitigate many of the time consuming steps associated with the authoring of efficient algorithms from scratch while still keeping the flexibility that may be lost with the use of commercial dedicated finite element programs. We test the efficiency of the algorithm by comparing simulation times with off-the-shelf solutions from the Abaqus software package. The algorithm is tailored for the study of local isostatic adjustment patterns, in close vicinity to present ice sheet margins. In particular, we wish to examine possible causes for the considerable spatial differences in the uplift magnitude which are apparent from field observations in these areas. Such features, with spatial scales of tens of kilometres, are not resolvable with current global isostatic adjustment models, and may require the inclusion of local topographic features. We use the presented algorithm to study a near field area where field observations are abundant, namely, Disko Bay in West Greenland with the intention of constraining Earth parameters and ice thickness. In addition, we assess how local topographic features may influence the differential isostatic uplift in the area.
Internal attention to features in visual short-term memory guides object learning
Fan, Judith E.; Turk-Browne, Nicholas B.
2013-01-01
Attending to objects in the world affects how we perceive and remember them. What are the consequences of attending to an object in mind? In particular, how does reporting the features of a recently seen object guide visual learning? In three experiments, observers were presented with abstract shapes in a particular color, orientation, and location. After viewing each object, observers were cued to report one feature from visual short-term memory (VSTM). In a subsequent test, observers were cued to report features of the same objects from visual long-term memory (VLTM). We tested whether reporting a feature from VSTM: (1) enhances VLTM for just that feature (practice-benefit hypothesis), (2) enhances VLTM for all features (object-based hypothesis), or (3) simultaneously enhances VLTM for that feature and suppresses VLTM for unreported features (feature-competition hypothesis). The results provided support for the feature-competition hypothesis, whereby the representation of an object in VLTM was biased towards features reported from VSTM and away from unreported features (Experiment 1). This bias could not be explained by the amount of sensory exposure or response learning (Experiment 2) and was amplified by the reporting of multiple features (Experiment 3). Taken together, these results suggest that selective internal attention induces competitive dynamics among features during visual learning, flexibly tuning object representations to align with prior mnemonic goals. PMID:23954925
Internal attention to features in visual short-term memory guides object learning.
Fan, Judith E; Turk-Browne, Nicholas B
2013-11-01
Attending to objects in the world affects how we perceive and remember them. What are the consequences of attending to an object in mind? In particular, how does reporting the features of a recently seen object guide visual learning? In three experiments, observers were presented with abstract shapes in a particular color, orientation, and location. After viewing each object, observers were cued to report one feature from visual short-term memory (VSTM). In a subsequent test, observers were cued to report features of the same objects from visual long-term memory (VLTM). We tested whether reporting a feature from VSTM: (1) enhances VLTM for just that feature (practice-benefit hypothesis), (2) enhances VLTM for all features (object-based hypothesis), or (3) simultaneously enhances VLTM for that feature and suppresses VLTM for unreported features (feature-competition hypothesis). The results provided support for the feature-competition hypothesis, whereby the representation of an object in VLTM was biased towards features reported from VSTM and away from unreported features (Experiment 1). This bias could not be explained by the amount of sensory exposure or response learning (Experiment 2) and was amplified by the reporting of multiple features (Experiment 3). Taken together, these results suggest that selective internal attention induces competitive dynamics among features during visual learning, flexibly tuning object representations to align with prior mnemonic goals. Copyright © 2013 Elsevier B.V. All rights reserved.
Toward semantic-based retrieval of visual information: a model-based approach
NASA Astrophysics Data System (ADS)
Park, Youngchoon; Golshani, Forouzan; Panchanathan, Sethuraman
2002-07-01
This paper center around the problem of automated visual content classification. To enable classification based image or visual object retrieval, we propose a new image representation scheme called visual context descriptor (VCD) that is a multidimensional vector in which each element represents the frequency of a unique visual property of an image or a region. VCD utilizes the predetermined quality dimensions (i.e., types of features and quantization level) and semantic model templates mined in priori. Not only observed visual cues, but also contextually relevant visual features are proportionally incorporated in VCD. Contextual relevance of a visual cue to a semantic class is determined by using correlation analysis of ground truth samples. Such co-occurrence analysis of visual cues requires transformation of a real-valued visual feature vector (e.g., color histogram, Gabor texture, etc.,) into a discrete event (e.g., terms in text). Good-feature to track, rule of thirds, iterative k-means clustering and TSVQ are involved in transformation of feature vectors into unified symbolic representations called visual terms. Similarity-based visual cue frequency estimation is also proposed and used for ensuring the correctness of model learning and matching since sparseness of sample data causes the unstable results of frequency estimation of visual cues. The proposed method naturally allows integration of heterogeneous visual or temporal or spatial cues in a single classification or matching framework, and can be easily integrated into a semantic knowledge base such as thesaurus, and ontology. Robust semantic visual model template creation and object based image retrieval are demonstrated based on the proposed content description scheme.
Ip, Ifan Betina; Bridge, Holly; Parker, Andrew J.
2014-01-01
An important advance in the study of visual attention has been the identification of a non-spatial component of attention that enhances the response to similar features or objects across the visual field. Here we test whether this non-spatial component can co-select individual features that are perceptually bound into a coherent object. We combined human psychophysics and functional magnetic resonance imaging (fMRI) to demonstrate the ability to co-select individual features from perceptually coherent objects. Our study used binocular disparity and visual motion to define disparity structure-from-motion (dSFM) stimuli. Although the spatial attention system induced strong modulations of the fMRI response in visual regions, the non-spatial system’s ability to co-select features of the dSFM stimulus was less pronounced and variable across subjects. Our results demonstrate that feature and global feature attention effects are variable across participants, suggesting that the feature attention system may be limited in its ability to automatically select features within the attended object. Careful comparison of the task design suggests that even minor differences in the perceptual task may be critical in revealing the presence of global feature attention. PMID:24936974
Global-local visual biases correspond with visual-spatial orientation.
Basso, Michael R; Lowery, Natasha
2004-02-01
Within the past decade, numerous investigations have demonstrated reliable associations of global-local visual processing biases with right and left hemisphere function, respectively (cf. Van Kleeck, 1989). Yet the relevance of these biases to other cognitive functions is not well understood. Towards this end, the present research examined the relationship between global-local visual biases and perception of visual-spatial orientation. Twenty-six women and 23 men completed a global-local judgment task (Kimchi and Palmer, 1982) and the Judgment of Line Orientation Test (JLO; Benton, Sivan, Hamsher, Varney, and Spreen, 1994), a measure of visual-spatial orientation. As expected, men had better performance on JLO. Extending previous findings, global biases were related to better visual-spatial acuity on JLO. The findings suggest that global-local biases and visual-spatial orientation may share underlying cerebral mechanisms. Implications of these findings for other visually mediated cognitive outcomes are discussed.
Ocular Toxocariasis: Clinical Features and Long-term Visual Outcomes in Adult Patients.
Despreaux, Raphaelle; Fardeau, Christine; Touhami, Sara; Brasnu, Emmanuelle; Champion, Emmanuelle; Paris, Luc; Touitou, Valérie; Bodaghi, Bahram; Lehoang, Phuc
2016-06-01
To investigate clinical characteristics and treatment outcomes of proven ocular toxocariasis (OT) in adult patients. Retrospective, consecutive, interventional case series. setting: Institutional. Consecutive OT patients with positive serum serology and positive western blot (WB) on ocular sample. Clinical features, optical coherence tomography (OCT), and treatment outcomes. Best-corrected visual acuity (BCVA) and OCT central foveal thickness (CFT). Fourteen patients were included between 2011 and 2013. Mean age at diagnosis was 45.6 years. Mean duration between the first symptoms and diagnosis was 15.1 months. Uveitis was unilateral in all cases and all patients displayed vitreous inflammation. The main baseline findings were presence of ≥1 peripheral granulomas (57.1%), vasculitis (57.1%), vitreoretinal traction (57.1%), and chronic macular edema (ME) (71.4%). Delayed diagnosis (>8 months) seemed to be associated with higher rate of ME. All patients received albendazole. Systemic (n = 5) and/or local corticosteroids (CS) (n = 7) were administered in case of ME and/or posterior segment inflammation. Vitrectomy was performed when vitreous inflammation was severe and persistent despite CS or in case of threatening traction or visually significant epimacular membrane (28.6%). Overall, this regimen allowed significant decrease of CFT (P = .01). In the vitrectomy subgroup, mean BCVA increased (P = .01) and CFT decreased (P = .017). While some features such as granuloma are typical signs of OT, atypical features can delay the diagnosis. In doubtful situations, WB on ocular samples seems to be more specific than serum antibodies alone. ME seems to be a common complication of longstanding OT in the adult. Copyright © 2016 Elsevier Inc. All rights reserved.
Learning optimal features for visual pattern recognition
NASA Astrophysics Data System (ADS)
Labusch, Kai; Siewert, Udo; Martinetz, Thomas; Barth, Erhardt
2007-02-01
The optimal coding hypothesis proposes that the human visual system has adapted to the statistical properties of the environment by the use of relatively simple optimality criteria. We here (i) discuss how the properties of different models of image coding, i.e. sparseness, decorrelation, and statistical independence are related to each other (ii) propose to evaluate the different models by verifiable performance measures (iii) analyse the classification performance on images of handwritten digits (MNIST data base). We first employ the SPARSENET algorithm (Olshausen, 1998) to derive a local filter basis (on 13 × 13 pixels windows). We then filter the images in the database (28 × 28 pixels images of digits) and reduce the dimensionality of the resulting feature space by selecting the locally maximal filter responses. We then train a support vector machine on a training set to classify the digits and report results obtained on a separate test set. Currently, the best state-of-the-art result on the MNIST data base has an error rate of 0,4%. This result, however, has been obtained by using explicit knowledge that is specific to the data (elastic distortion model for digits). We here obtain an error rate of 0,55% which is second best but does not use explicit data specific knowledge. In particular it outperforms by far all methods that do not use data-specific knowledge.
Plaisted, Kate; Saksida, Lisa; Alcántara, José; Weisblatt, Emma
2003-01-01
The weak central coherence hypothesis of Frith is one of the most prominent theories concerning the abnormal performance of individuals with autism on tasks that involve local and global processing. Individuals with autism often outperform matched nonautistic individuals on tasks in which success depends upon processing of local features, and underperform on tasks that require global processing. We review those studies that have been unable to identify the locus of the mechanisms that may be responsible for weak central coherence effects and those that show that local processing is enhanced in autism but not at the expense of global processing. In the light of these studies, we propose that the mechanisms which can give rise to 'weak central coherence' effects may be perceptual. More specifically, we propose that perception operates to enhance the representation of individual perceptual features but that this does not impact adversely on representations that involve integration of features. This proposal was supported in the two experiments we report on configural and feature discrimination learning in high-functioning children with autism. We also examined processes of perception directly, in an auditory filtering task which measured the width of auditory filters in individuals with autism and found that the width of auditory filters in autism were abnormally broad. We consider the implications of these findings for perceptual theories of the mechanisms underpinning weak central coherence effects. PMID:12639334
Admissible Diffusion Wavelets and Their Applications in Space-Frequency Processing.
Hou, Tingbo; Qin, Hong
2013-01-01
As signal processing tools, diffusion wavelets and biorthogonal diffusion wavelets have been propelled by recent research in mathematics. They employ diffusion as a smoothing and scaling process to empower multiscale analysis. However, their applications in graphics and visualization are overshadowed by nonadmissible wavelets and their expensive computation. In this paper, our motivation is to broaden the application scope to space-frequency processing of shape geometry and scalar fields. We propose the admissible diffusion wavelets (ADW) on meshed surfaces and point clouds. The ADW are constructed in a bottom-up manner that starts from a local operator in a high frequency, and dilates by its dyadic powers to low frequencies. By relieving the orthogonality and enforcing normalization, the wavelets are locally supported and admissible, hence facilitating data analysis and geometry processing. We define the novel rapid reconstruction, which recovers the signal from multiple bands of high frequencies and a low-frequency base in full resolution. It enables operations localized in both space and frequency by manipulating wavelet coefficients through space-frequency filters. This paper aims to build a common theoretic foundation for a host of applications, including saliency visualization, multiscale feature extraction, spectral geometry processing, etc.
Anatomy of the Attraction Basins: Breaking with the Intuition.
Hernando, Leticia; Mendiburu, Alexander; Lozano, Jose A
2018-05-22
Solving combinatorial optimization problems efficiently requires the development of algorithms that consider the specific properties of the problems. In this sense, local search algorithms are designed over a neighborhood structure that partially accounts for these properties. Considering a neighborhood, the space is usually interpreted as a natural landscape, with valleys and mountains. Under this perception, it is commonly believed that, if maximizing, the solutions located in the slopes of the same mountain belong to the same attraction basin, with the peaks of the mountains being the local optima. Unfortunately, this is a widespread erroneous visualization of a combinatorial landscape. Thus, our aim is to clarify this aspect, providing a detailed analysis of, first, the existence of plateaus where the local optima are involved, and second, the properties that define the topology of the attraction basins, picturing a reliable visualization of the landscapes. Some of the features explored in this paper have never been examined before. Hence, new findings about the structure of the attraction basins are shown. The study is focused on instances of permutation-based combinatorial optimization problems considering the 2-exchange and the insert neighborhoods. As a consequence of this work, we break away from the extended belief about the anatomy of attraction basins.
Cortical theta wanes for language.
Hermes, Dora; Miller, Kai J; Vansteensel, Mariska J; Edwards, Erik; Ferrier, Cyrille H; Bleichner, Martin G; van Rijen, Peter C; Aarnoutse, Erik J; Ramsey, Nick F
2014-01-15
The role of low frequency oscillations in language areas is not yet understood. Using ECoG in six human subjects, we studied whether different language regions show prominent power changes in a specific rhythm, in similar manner as the alpha rhythm shows the most prominent power changes in visual areas. Broca's area and temporal language areas were localized in individual subjects using fMRI. In these areas, the theta rhythm showed the most pronounced power changes and theta power decreased significantly during verb generation. To better understand the role of this language-related theta decrease, we then studied the interaction between low frequencies and local neuronal activity reflected in high frequencies. Amplitude-amplitude correlations showed that theta power correlated negatively with high frequency activity, specifically across verb generation trials. Phase-amplitude coupling showed that during control trials, high frequency power was coupled to theta phase, but this coupling decreased significantly during verb generation trials. These results suggest a dynamic interaction between the neuronal mechanisms underlying the theta rhythm and local neuronal activity in language areas. As visual areas show a pronounced alpha rhythm that may reflect pulsed inhibition, language regions show a pronounced theta rhythm with highly similar features. © 2013.
Early melanoma diagnosis with mobile imaging.
Do, Thanh-Toan; Zhou, Yiren; Zheng, Haitian; Cheung, Ngai-Man; Koh, Dawn
2014-01-01
We research a mobile imaging system for early diagnosis of melanoma. Different from previous work, we focus on smartphone-captured images, and propose a detection system that runs entirely on the smartphone. Smartphone-captured images taken under loosely-controlled conditions introduce new challenges for melanoma detection, while processing performed on the smartphone is subject to computation and memory constraints. To address these challenges, we propose to localize the skin lesion by combining fast skin detection and fusion of two fast segmentation results. We propose new features to capture color variation and border irregularity which are useful for smartphone-captured images. We also propose a new feature selection criterion to select a small set of good features used in the final lightweight system. Our evaluation confirms the effectiveness of proposed algorithms and features. In addition, we present our system prototype which computes selected visual features from a user-captured skin lesion image, and analyzes them to estimate the likelihood of malignance, all on an off-the-shelf smartphone.
Probst, Yasmine; Nguyen, Duc Thanh; Tran, Minh Khoi; Li, Wanqing
2015-01-01
Dietary assessment, while traditionally based on pen-and-paper, is rapidly moving towards automatic approaches. This study describes an Australian automatic food record method and its prototype for dietary assessment via the use of a mobile phone and techniques of image processing and pattern recognition. Common visual features including scale invariant feature transformation (SIFT), local binary patterns (LBP), and colour are used for describing food images. The popular bag-of-words (BoW) model is employed for recognizing the images taken by a mobile phone for dietary assessment. Technical details are provided together with discussions on the issues and future work. PMID:26225994
Trichoscopy of Steroid-Induced Atrophy.
Pirmez, Rodrigo; Abraham, Leonardo S; Duque-Estrada, Bruna; Damasco, Patrícia; Farias, Débora Cadore; Kelly, Yanna; Doche, Isabella
2017-10-01
Intralesional corticosteroid (IL-CS) injections have been used to treat a variety of dermatological and nondermatological diseases. Although an important therapeutic tool in dermatology, a number of local side effects, including skin atrophy, have been reported following IL-CS injections. We recently noticed that a subset of patients with steroid-induced atrophy presented with ivory-colored areas under trichoscopy. We performed a retrospective analysis of trichoscopic images and medical records from patients presenting ivory-colored areas associated with atrophic scalp lesions. In this paper, we associate this feature with the presence of steroid deposits in the dermis and report additional trichoscopic features of steroid-induced atrophy on the scalp, such as prominent blood vessels and visualization of hair bulbs.
Learning to recognize objects on the fly: a neurally based dynamic field approach.
Faubel, Christian; Schöner, Gregor
2008-05-01
Autonomous robots interacting with human users need to build and continuously update scene representations. This entails the problem of rapidly learning to recognize new objects under user guidance. Based on analogies with human visual working memory, we propose a dynamical field architecture, in which localized peaks of activation represent objects over a small number of simple feature dimensions. Learning consists of laying down memory traces of such peaks. We implement the dynamical field model on a service robot and demonstrate how it learns 30 objects from a very small number of views (about 5 per object are sufficient). We also illustrate how properties of feature binding emerge from this framework.
Nagarajan, Mahesh B.; De, Titas; Lochmüller, Eva-Maria; Eckstein, Felix; Wismüller, Axel
2017-01-01
The ability of Anisotropic Minkowski Functionals (AMFs) to capture local anisotropy while evaluating topological properties of the underlying gray-level structures has been previously demonstrated. We evaluate the ability of this approach to characterize local structure properties of trabecular bone micro-architecture in ex vivo proximal femur specimens, as visualized on multi-detector CT, for purposes of biomechanical bone strength prediction. To this end, volumetric AMFs were computed locally for each voxel of volumes of interest (VOI) extracted from the femoral head of 146 specimens. The local anisotropy captured by such AMFs was quantified using a fractional anisotropy measure; the magnitude and direction of anisotropy at every pixel was stored in histograms that served as a feature vectors that characterized the VOIs. A linear multi-regression analysis algorithm was used to predict the failure load (FL) from the feature sets; the predicted FL was compared to the true FL determined through biomechanical testing. The prediction performance was measured by the root mean square error (RMSE) for each feature set. The best prediction performance was obtained from the fractional anisotropy histogram of AMF Euler Characteristic (RMSE = 1.01 ± 0.13), which was significantly better than MDCT-derived mean BMD (RMSE = 1.12 ± 0.16, p<0.05). We conclude that such anisotropic Minkowski Functionals can capture valuable information regarding regional trabecular bone quality and contribute to improved bone strength prediction, which is important for improving the clinical assessment of osteoporotic fracture risk. PMID:29170581
Ghosh, Tonmoy; Wahid, Khan A.
2018-01-01
Wireless capsule endoscopy (WCE) is the most advanced technology to visualize whole gastrointestinal (GI) tract in a non-invasive way. But the major disadvantage here, it takes long reviewing time, which is very laborious as continuous manual intervention is necessary. In order to reduce the burden of the clinician, in this paper, an automatic bleeding detection method for WCE video is proposed based on the color histogram of block statistics, namely CHOBS. A single pixel in WCE image may be distorted due to the capsule motion in the GI tract. Instead of considering individual pixel values, a block surrounding to that individual pixel is chosen for extracting local statistical features. By combining local block features of three different color planes of RGB color space, an index value is defined. A color histogram, which is extracted from those index values, provides distinguishable color texture feature. A feature reduction technique utilizing color histogram pattern and principal component analysis is proposed, which can drastically reduce the feature dimension. For bleeding zone detection, blocks are classified using extracted local features that do not incorporate any computational burden for feature extraction. From extensive experimentation on several WCE videos and 2300 images, which are collected from a publicly available database, a very satisfactory bleeding frame and zone detection performance is achieved in comparison to that obtained by some of the existing methods. In the case of bleeding frame detection, the accuracy, sensitivity, and specificity obtained from proposed method are 97.85%, 99.47%, and 99.15%, respectively, and in the case of bleeding zone detection, 95.75% of precision is achieved. The proposed method offers not only low feature dimension but also highly satisfactory bleeding detection performance, which even can effectively detect bleeding frame and zone in a continuous WCE video data. PMID:29468094
The magnifying glass - A feature space local expansion for visual analysis. [and image enhancement
NASA Technical Reports Server (NTRS)
Juday, R. D.
1981-01-01
The Magnifying Glass Transformation (MGT) technique is proposed, as a multichannel spectral operation yielding visual imagery which is enhanced in a specified spectral vicinity, guided by the statistics of training samples. An application example is that in which the discrimination among spectral neighbors within an interactive display may be increased without altering distant object appearances or overall interpretation. A direct histogram specification technique is applied to the channels within the multispectral image so that a subset of the spectral domain occupies an increased fraction of the domain. The transformation is carried out by obtaining the training information, establishing the condition of the covariance matrix, determining the influenced solid, and initializing the lookup table. Finally, the image is transformed.
Global Image Dissimilarity in Macaque Inferotemporal Cortex Predicts Human Visual Search Efficiency
Sripati, Arun P.; Olson, Carl R.
2010-01-01
Finding a target in a visual scene can be easy or difficult depending on the nature of the distractors. Research in humans has suggested that search is more difficult the more similar the target and distractors are to each other. However, it has not yielded an objective definition of similarity. We hypothesized that visual search performance depends on similarity as determined by the degree to which two images elicit overlapping patterns of neuronal activity in visual cortex. To test this idea, we recorded from neurons in monkey inferotemporal cortex (IT) and assessed visual search performance in humans using pairs of images formed from the same local features in different global arrangements. The ability of IT neurons to discriminate between two images was strongly predictive of the ability of humans to discriminate between them during visual search, accounting overall for 90% of the variance in human performance. A simple physical measure of global similarity – the degree of overlap between the coarse footprints of a pair of images – largely explains both the neuronal and the behavioral results. To explain the relation between population activity and search behavior, we propose a model in which the efficiency of global oddball search depends on contrast-enhancing lateral interactions in high-order visual cortex. PMID:20107054
Linear and Non-Linear Visual Feature Learning in Rat and Humans
Bossens, Christophe; Op de Beeck, Hans P.
2016-01-01
The visual system processes visual input in a hierarchical manner in order to extract relevant features that can be used in tasks such as invariant object recognition. Although typically investigated in primates, recent work has shown that rats can be trained in a variety of visual object and shape recognition tasks. These studies did not pinpoint the complexity of the features used by these animals. Many tasks might be solved by using a combination of relatively simple features which tend to be correlated. Alternatively, rats might extract complex features or feature combinations which are nonlinear with respect to those simple features. In the present study, we address this question by starting from a small stimulus set for which one stimulus-response mapping involves a simple linear feature to solve the task while another mapping needs a well-defined nonlinear combination of simpler features related to shape symmetry. We verified computationally that the nonlinear task cannot be trivially solved by a simple V1-model. We show how rats are able to solve the linear feature task but are unable to acquire the nonlinear feature. In contrast, humans are able to use the nonlinear feature and are even faster in uncovering this solution as compared to the linear feature. The implications for the computational capabilities of the rat visual system are discussed. PMID:28066201
Yoon, Jong H; Sheremata, Summer L; Rokem, Ariel; Silver, Michael A
2013-10-31
Cognitive and information processing deficits are core features and important sources of disability in schizophrenia. Our understanding of the neural substrates of these deficits remains incomplete, in large part because the complexity of impairments in schizophrenia makes the identification of specific deficits very challenging. Vision science presents unique opportunities in this regard: many years of basic research have led to detailed characterization of relationships between structure and function in the early visual system and have produced sophisticated methods to quantify visual perception and characterize its neural substrates. We present a selective review of research that illustrates the opportunities for discovery provided by visual studies in schizophrenia. We highlight work that has been particularly effective in applying vision science methods to identify specific neural abnormalities underlying information processing deficits in schizophrenia. In addition, we describe studies that have utilized psychophysical experimental designs that mitigate generalized deficit confounds, thereby revealing specific visual impairments in schizophrenia. These studies contribute to accumulating evidence that early visual cortex is a useful experimental system for the study of local cortical circuit abnormalities in schizophrenia. The high degree of similarity across neocortical areas of neuronal subtypes and their patterns of connectivity suggests that insights obtained from the study of early visual cortex may be applicable to other brain regions. We conclude with a discussion of future studies that combine vision science and neuroimaging methods. These studies have the potential to address pressing questions in schizophrenia, including the dissociation of local circuit deficits vs. impairments in feedback modulation by cognitive processes such as spatial attention and working memory, and the relative contributions of glutamatergic and GABAergic deficits.
Visual attention to features by associative learning.
Gozli, Davood G; Moskowitz, Joshua B; Pratt, Jay
2014-11-01
Expecting a particular stimulus can facilitate processing of that stimulus over others, but what is the fate of other stimuli that are known to co-occur with the expected stimulus? This study examined the impact of learned association on feature-based attention. The findings show that the effectiveness of an uninformative color transient in orienting attention can change by learned associations between colors and the expected target shape. In an initial acquisition phase, participants learned two distinct sequences of stimulus-response-outcome, where stimuli were defined by shape ('S' vs. 'H'), responses were localized key-presses (left vs. right), and outcomes were colors (red vs. green). Next, in a test phase, while expecting a target shape (80% probable), participants showed reliable attentional orienting to the color transient associated with the target shape, and showed no attentional orienting with the color associated with the alternative target shape. This bias seemed to be driven by learned association between shapes and colors, and not modulated by the response. In addition, the bias seemed to depend on observing target-color conjunctions, since encountering the two features disjunctively (without spatiotemporal overlap) did not replicate the findings. We conclude that associative learning - likely mediated by mechanisms underlying visual object representation - can extend the impact of goal-driven attention to features associated with a target stimulus. Copyright © 2014 Elsevier B.V. All rights reserved.
Stimulus information contaminates summation tests of independent neural representations of features
NASA Technical Reports Server (NTRS)
Shimozaki, Steven S.; Eckstein, Miguel P.; Abbey, Craig K.
2002-01-01
Many models of visual processing assume that visual information is analyzed into separable and independent neural codes, or features. A common psychophysical test of independent features is known as a summation study, which measures performance in a detection, discrimination, or visual search task as the number of proposed features increases. Improvement in human performance with increasing number of available features is typically attributed to the summation, or combination, of information across independent neural coding of the features. In many instances, however, increasing the number of available features also increases the stimulus information in the task, as assessed by an optimal observer that does not include the independent neural codes. In a visual search task with spatial frequency and orientation as the component features, a particular set of stimuli were chosen so that all searches had equivalent stimulus information, regardless of the number of features. In this case, human performance did not improve with increasing number of features, implying that the improvement observed with additional features may be due to stimulus information and not the combination across independent features.
The development of organized visual search
Woods, Adam J.; Goksun, Tilbe; Chatterjee, Anjan; Zelonis, Sarah; Mehta, Anika; Smith, Sabrina E.
2013-01-01
Visual search plays an important role in guiding behavior. Children have more difficulty performing conjunction search tasks than adults. The present research evaluates whether developmental differences in children's ability to organize serial visual search (i.e., search organization skills) contribute to performance limitations in a typical conjunction search task. We evaluated 134 children between the ages of 2 and 17 on separate tasks measuring search for targets defined by a conjunction of features or by distinct features. Our results demonstrated that children organize their visual search better as they get older. As children's skills at organizing visual search improve they become more accurate at locating targets with conjunction of features amongst distractors, but not for targets with distinct features. Developmental limitations in children's abilities to organize their visual search of the environment are an important component of poor conjunction search in young children. In addition, our findings provide preliminary evidence that, like other visuospatial tasks, exposure to reading may influence children's spatial orientation to the visual environment when performing a visual search. PMID:23584560
Feature diagnosticity and task context shape activity in human scene-selective cortex.
Lowe, Matthew X; Gallivan, Jason P; Ferber, Susanne; Cant, Jonathan S
2016-01-15
Scenes are constructed from multiple visual features, yet previous research investigating scene processing has often focused on the contributions of single features in isolation. In the real world, features rarely exist independently of one another and likely converge to inform scene identity in unique ways. Here, we utilize fMRI and pattern classification techniques to examine the interactions between task context (i.e., attend to diagnostic global scene features; texture or layout) and high-level scene attributes (content and spatial boundary) to test the novel hypothesis that scene-selective cortex represents multiple visual features, the importance of which varies according to their diagnostic relevance across scene categories and task demands. Our results show for the first time that scene representations are driven by interactions between multiple visual features and high-level scene attributes. Specifically, univariate analysis of scene-selective cortex revealed that task context and feature diagnosticity shape activity differentially across scene categories. Examination using multivariate decoding methods revealed results consistent with univariate findings, but also evidence for an interaction between high-level scene attributes and diagnostic visual features within scene categories. Critically, these findings suggest visual feature representations are not distributed uniformly across scene categories but are shaped by task context and feature diagnosticity. Thus, we propose that scene-selective cortex constructs a flexible representation of the environment by integrating multiple diagnostically relevant visual features, the nature of which varies according to the particular scene being perceived and the goals of the observer. Copyright © 2015 Elsevier Inc. All rights reserved.
Rutishauser, Ueli; Kotowicz, Andreas; Laurent, Gilles
2013-01-01
Brain activity often consists of interactions between internal—or on-going—and external—or sensory—activity streams, resulting in complex, distributed patterns of neural activity. Investigation of such interactions could benefit from closed-loop experimental protocols in which one stream can be controlled depending on the state of the other. We describe here methods to present rapid and precisely timed visual stimuli to awake animals, conditional on features of the animal’s on-going brain state; those features are the presence, power and phase of oscillations in local field potentials (LFP). The system can process up to 64 channels in real time. We quantified its performance using simulations, synthetic data and animal experiments (chronic recordings in the dorsal cortex of awake turtles). The delay from detection of an oscillation to the onset of a visual stimulus on an LCD screen was 47.5 ms and visual-stimulus onset could be locked to the phase of ongoing oscillations at any frequency ≤40 Hz. Our software’s architecture is flexible, allowing on-the-fly modifications by experimenters and the addition of new closed-loop control and analysis components through plugins. The source code of our system “StimOMatic” is available freely as open-source. PMID:23473800
Visuomotor Transformations Underlying Hunting Behavior in Zebrafish
Bianco, Isaac H.; Engert, Florian
2015-01-01
Summary Visuomotor circuits filter visual information and determine whether or not to engage downstream motor modules to produce behavioral outputs. However, the circuit mechanisms that mediate and link perception of salient stimuli to execution of an adaptive response are poorly understood. We combined a virtual hunting assay for tethered larval zebrafish with two-photon functional calcium imaging to simultaneously monitor neuronal activity in the optic tectum during naturalistic behavior. Hunting responses showed mixed selectivity for combinations of visual features, specifically stimulus size, speed, and contrast polarity. We identified a subset of tectal neurons with similar highly selective tuning, which show non-linear mixed selectivity for visual features and are likely to mediate the perceptual recognition of prey. By comparing neural dynamics in the optic tectum during response versus non-response trials, we discovered premotor population activity that specifically preceded initiation of hunting behavior and exhibited anatomical localization that correlated with motor variables. In summary, the optic tectum contains non-linear mixed selectivity neurons that are likely to mediate reliable detection of ethologically relevant sensory stimuli. Recruitment of small tectal assemblies appears to link perception to action by providing the premotor commands that release hunting responses. These findings allow us to propose a model circuit for the visuomotor transformations underlying a natural behavior. PMID:25754638
Visuomotor transformations underlying hunting behavior in zebrafish.
Bianco, Isaac H; Engert, Florian
2015-03-30
Visuomotor circuits filter visual information and determine whether or not to engage downstream motor modules to produce behavioral outputs. However, the circuit mechanisms that mediate and link perception of salient stimuli to execution of an adaptive response are poorly understood. We combined a virtual hunting assay for tethered larval zebrafish with two-photon functional calcium imaging to simultaneously monitor neuronal activity in the optic tectum during naturalistic behavior. Hunting responses showed mixed selectivity for combinations of visual features, specifically stimulus size, speed, and contrast polarity. We identified a subset of tectal neurons with similar highly selective tuning, which show non-linear mixed selectivity for visual features and are likely to mediate the perceptual recognition of prey. By comparing neural dynamics in the optic tectum during response versus non-response trials, we discovered premotor population activity that specifically preceded initiation of hunting behavior and exhibited anatomical localization that correlated with motor variables. In summary, the optic tectum contains non-linear mixed selectivity neurons that are likely to mediate reliable detection of ethologically relevant sensory stimuli. Recruitment of small tectal assemblies appears to link perception to action by providing the premotor commands that release hunting responses. These findings allow us to propose a model circuit for the visuomotor transformations underlying a natural behavior. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.
Visual short-term memory always requires general attention.
Morey, Candice C; Bieler, Malte
2013-02-01
The role of attention in visual memory remains controversial; while some evidence has suggested that memory for binding between features demands no more attention than does memory for the same features, other evidence has indicated cognitive costs or mnemonic benefits for explicitly attending to bindings. We attempted to reconcile these findings by examining how memory for binding, for features, and for features during binding is affected by a concurrent attention-demanding task. We demonstrated that performing a concurrent task impairs memory for as few as two visual objects, regardless of whether each object includes one or more features. We argue that this pattern of results reflects an essential role for domain-general attention in visual memory, regardless of the simplicity of the to-be-remembered stimuli. We then discuss the implications of these findings for theories of visual working memory.
Vision Based Localization in Urban Environments
NASA Technical Reports Server (NTRS)
McHenry, Michael; Cheng, Yang; Matthies, Larry
2005-01-01
As part of DARPA's MARS2020 program, the Jet Propulsion Laboratory developed a vision-based system for localization in urban environments that requires neither GPS nor active sensors. System hardware consists of a pair of small FireWire cameras and a standard Pentium-based computer. The inputs to the software system consist of: 1) a crude grid-based map describing the positions of buildings, 2) an initial estimate of robot location and 3) the video streams produced by each camera. At each step during the traverse the system: captures new image data, finds image features hypothesized to lie on the outside of a building, computes the range to those features, determines an estimate of the robot's motion since the previous step and combines that data with the map to update a probabilistic representation of the robot's location. This probabilistic representation allows the system to simultaneously represent multiple possible locations, For our testing, we have derived the a priori map manually using non-orthorectified overhead imagery, although this process could be automated. The software system consists of two primary components. The first is the vision system which uses binocular stereo ranging together with a set of heuristics to identify features likely to be part of building exteriors and to compute an estimate of the robot's motion since the previous step. The resulting visual features and the associated range measurements are software component, a particle-filter based localization system. This system uses the map and the then fed to the second primary most recent results from the vision system to update the estimate of the robot's location. This report summarizes the design of both the hardware and software and will include the results of applying the system to the global localization of a robot over an approximately half-kilometer traverse across JPL'S Pasadena campus.
NASA Astrophysics Data System (ADS)
Kurugol, Sila; Dy, Jennifer G.; Rajadhyaksha, Milind; Gossage, Kirk W.; Weissmann, Jesse; Brooks, Dana H.
2011-03-01
The examination of the dermis/epidermis junction (DEJ) is clinically important for skin cancer diagnosis. Reflectance confocal microscopy (RCM) is an emerging tool for detection of skin cancers in vivo. However, visual localization of the DEJ in RCM images, with high accuracy and repeatability, is challenging, especially in fair skin, due to low contrast, heterogeneous structure and high inter- and intra-subject variability. We recently proposed a semi-automated algorithm to localize the DEJ in z-stacks of RCM images of fair skin, based on feature segmentation and classification. Here we extend the algorithm to dark skin. The extended algorithm first decides the skin type and then applies the appropriate DEJ localization method. In dark skin, strong backscatter from the pigment melanin causes the basal cells above the DEJ to appear with high contrast. To locate those high contrast regions, the algorithm operates on small tiles (regions) and finds the peaks of the smoothed average intensity depth profile of each tile. However, for some tiles, due to heterogeneity, multiple peaks in the depth profile exist and the strongest peak might not be the basal layer peak. To select the correct peak, basal cells are represented with a vector of texture features. The peak with most similar features to this feature vector is selected. The results show that the algorithm detected the skin types correctly for all 17 stacks tested (8 fair, 9 dark). The DEJ detection algorithm achieved an average distance from the ground truth DEJ surface of around 4.7μm for dark skin and around 7-14μm for fair skin.
Guided SAR image despeckling with probabilistic non local weights
NASA Astrophysics Data System (ADS)
Gokul, Jithin; Nair, Madhu S.; Rajan, Jeny
2017-12-01
SAR images are generally corrupted by granular disturbances called speckle, which makes visual analysis and detail extraction a difficult task. Non Local despeckling techniques with probabilistic similarity has been a recent trend in SAR despeckling. To achieve effective speckle suppression without compromising detail preservation, we propose an improvement for the existing Generalized Guided Filter with Bayesian Non-Local Means (GGF-BNLM) method. The proposed method (Guided SAR Image Despeckling with Probabilistic Non Local Weights) replaces parametric constants based on heuristics in GGF-BNLM method with dynamically derived values based on the image statistics for weight computation. Proposed changes make GGF-BNLM method adaptive and as a result, significant improvement is achieved in terms of performance. Experimental analysis on SAR images shows excellent speckle reduction without compromising feature preservation when compared to GGF-BNLM method. Results are also compared with other state-of-the-art and classic SAR depseckling techniques to demonstrate the effectiveness of the proposed method.
Taubert, Jessica; Parr, Lisa A
2011-01-01
All primates can recognize faces and do so by analyzing the subtle variation that exists between faces. Through a series of three experiments, we attempted to clarify the nature of second-order information processing in nonhuman primates. Experiment one showed that both chimpanzees (Pan troglodytes) and rhesus monkeys (Macaca mulatta) tolerate geometric distortions along the vertical axis, suggesting that information about absolute position of features does not contribute to accurate face recognition. Chimpanzees differed from monkeys, however, in that they were more sensitive to distortions along the horizontal axis, suggesting that when building a global representation of facial identity, horizontal relations between features are more diagnostic of identity than vertical relations. Two further experiments were performed to determine whether the monkeys were simply less sensitive to horizontal relations compared to chimpanzees or were instead relying on local features. The results of these experiments confirm that monkeys can utilize a holistic strategy when discriminating between faces regardless of familiarity. In contrast, our data show that chimpanzees, like humans, use a combination of holistic and local features when the faces are unfamiliar, but primarily holistic information when the faces become familiar. We argue that our comparative approach to the study of face recognition reveals the impact that individual experience and social organization has on visual cognition.
NASA Astrophysics Data System (ADS)
Zhao, Yiqun; Wang, Zhihui
2015-12-01
The Internet of things (IOT) is a kind of intelligent networks which can be used to locate, track, identify and supervise people and objects. One of important core technologies of intelligent visual internet of things ( IVIOT) is the intelligent visual tag system. In this paper, a research is done into visual feature extraction and establishment of visual tags of the human face based on ORL face database. Firstly, we use the principal component analysis (PCA) algorithm for face feature extraction, then adopt the support vector machine (SVM) for classifying and face recognition, finally establish a visual tag for face which is already classified. We conducted a experiment focused on a group of people face images, the result show that the proposed algorithm have good performance, and can show the visual tag of objects conveniently.
Extraction of prostatic lumina and automated recognition for prostatic calculus image using PCA-SVM.
Wang, Zhuocai; Xu, Xiangmin; Ding, Xiaojun; Xiao, Hui; Huang, Yusheng; Liu, Jian; Xing, Xiaofen; Wang, Hua; Liao, D Joshua
2011-01-01
Identification of prostatic calculi is an important basis for determining the tissue origin. Computation-assistant diagnosis of prostatic calculi may have promising potential but is currently still less studied. We studied the extraction of prostatic lumina and automated recognition for calculus images. Extraction of lumina from prostate histology images was based on local entropy and Otsu threshold recognition using PCA-SVM and based on the texture features of prostatic calculus. The SVM classifier showed an average time 0.1432 second, an average training accuracy of 100%, an average test accuracy of 93.12%, a sensitivity of 87.74%, and a specificity of 94.82%. We concluded that the algorithm, based on texture features and PCA-SVM, can recognize the concentric structure and visualized features easily. Therefore, this method is effective for the automated recognition of prostatic calculi.
Humphreys, Glyn W
2016-10-01
The Treisman Bartlett lecture, reported in the Quarterly Journal of Experimental Psychology in 1988, provided a major overview of the feature integration theory of attention. This has continued to be a dominant account of human visual attention to this day. The current paper provides a summary of the work reported in the lecture and an update on critical aspects of the theory as applied to visual object perception. The paper highlights the emergence of findings that pose significant challenges to the theory and which suggest that revisions are required that allow for (a) several rather than a single form of feature integration, (b) some forms of feature integration to operate preattentively, (c) stored knowledge about single objects and interactions between objects to modulate perceptual integration, (d) the application of feature-based inhibition to object files where visual features are specified, which generates feature-based spreading suppression and scene segmentation, and (e) a role for attention in feature confirmation rather than feature integration in visual selection. A feature confirmation account of attention in object perception is outlined.
Separate Capacities for Storing Different Features in Visual Working Memory
ERIC Educational Resources Information Center
Wang, Benchi; Cao, Xiaohua; Theeuwes, Jan; Olivers, Christian N. L.; Wang, Zhiguo
2017-01-01
Recent empirical and theoretical work suggests that visual features such as color and orientation can be stored or retrieved independently in visual working memory (VWM), even in cases when they belong to the same object. Yet it remains unclear whether different feature dimensions have their own capacity limits, or whether they compete for shared…
Hardman, Kyle; Cowan, Nelson
2014-01-01
Visual working memory stores stimuli from our environment as representations that can be accessed by high-level control processes. This study addresses a longstanding debate in the literature about whether storage limits in visual working memory include a limit to the complexity of discrete items. We examined the issue with a number of change-detection experiments that used complex stimuli which possessed multiple features per stimulus item. We manipulated the number of relevant features of the stimulus objects in order to vary feature load. In all of our experiments, we found that increased feature load led to a reduction in change-detection accuracy. However, we found that feature load alone could not account for the results, but that a consideration of the number of relevant objects was also required. This study supports capacity limits for both feature and object storage in visual working memory. PMID:25089739
Shao, Feng; Lin, Weisi; Gu, Shanbo; Jiang, Gangyi; Srikanthan, Thambipillai
2013-05-01
Perceptual quality assessment is a challenging issue in 3D signal processing research. It is important to study 3D signal directly instead of studying simple extension of the 2D metrics directly to the 3D case as in some previous studies. In this paper, we propose a new perceptual full-reference quality assessment metric of stereoscopic images by considering the binocular visual characteristics. The major technical contribution of this paper is that the binocular perception and combination properties are considered in quality assessment. To be more specific, we first perform left-right consistency checks and compare matching error between the corresponding pixels in binocular disparity calculation, and classify the stereoscopic images into non-corresponding, binocular fusion, and binocular suppression regions. Also, local phase and local amplitude maps are extracted from the original and distorted stereoscopic images as features in quality assessment. Then, each region is evaluated independently by considering its binocular perception property, and all evaluation results are integrated into an overall score. Besides, a binocular just noticeable difference model is used to reflect the visual sensitivity for the binocular fusion and suppression regions. Experimental results show that compared with the relevant existing metrics, the proposed metric can achieve higher consistency with subjective assessment of stereoscopic images.
Deep Multimodal Distance Metric Learning Using Click Constraints for Image Ranking.
Yu, Jun; Yang, Xiaokang; Gao, Fei; Tao, Dacheng
2017-12-01
How do we retrieve images accurately? Also, how do we rank a group of images precisely and efficiently for specific queries? These problems are critical for researchers and engineers to generate a novel image searching engine. First, it is important to obtain an appropriate description that effectively represent the images. In this paper, multimodal features are considered for describing images. The images unique properties are reflected by visual features, which are correlated to each other. However, semantic gaps always exist between images visual features and semantics. Therefore, we utilize click feature to reduce the semantic gap. The second key issue is learning an appropriate distance metric to combine these multimodal features. This paper develops a novel deep multimodal distance metric learning (Deep-MDML) method. A structured ranking model is adopted to utilize both visual and click features in distance metric learning (DML). Specifically, images and their related ranking results are first collected to form the training set. Multimodal features, including click and visual features, are collected with these images. Next, a group of autoencoders is applied to obtain initially a distance metric in different visual spaces, and an MDML method is used to assign optimal weights for different modalities. Next, we conduct alternating optimization to train the ranking model, which is used for the ranking of new queries with click features. Compared with existing image ranking methods, the proposed method adopts a new ranking model to use multimodal features, including click features and visual features in DML. We operated experiments to analyze the proposed Deep-MDML in two benchmark data sets, and the results validate the effects of the method.
HyMoTrack: A Mobile AR Navigation System for Complex Indoor Environments.
Gerstweiler, Georg; Vonach, Emanuel; Kaufmann, Hannes
2015-12-24
Navigating in unknown big indoor environments with static 2D maps is a challenge, especially when time is a critical factor. In order to provide a mobile assistant, capable of supporting people while navigating in indoor locations, an accurate and reliable localization system is required in almost every corner of the building. We present a solution to this problem through a hybrid tracking system specifically designed for complex indoor spaces, which runs on mobile devices like smartphones or tablets. The developed algorithm only uses the available sensors built into standard mobile devices, especially the inertial sensors and the RGB camera. The combination of multiple optical tracking technologies, such as 2D natural features and features of more complex three-dimensional structures guarantees the robustness of the system. All processing is done locally and no network connection is needed. State-of-the-art indoor tracking approaches use mainly radio-frequency signals like Wi-Fi or Bluetooth for localizing a user. In contrast to these approaches, the main advantage of the developed system is the capability of delivering a continuous 3D position and orientation of the mobile device with centimeter accuracy. This makes it usable for localization and 3D augmentation purposes, e.g. navigation tasks or location-based information visualization.
HyMoTrack: A Mobile AR Navigation System for Complex Indoor Environments
Gerstweiler, Georg; Vonach, Emanuel; Kaufmann, Hannes
2015-01-01
Navigating in unknown big indoor environments with static 2D maps is a challenge, especially when time is a critical factor. In order to provide a mobile assistant, capable of supporting people while navigating in indoor locations, an accurate and reliable localization system is required in almost every corner of the building. We present a solution to this problem through a hybrid tracking system specifically designed for complex indoor spaces, which runs on mobile devices like smartphones or tablets. The developed algorithm only uses the available sensors built into standard mobile devices, especially the inertial sensors and the RGB camera. The combination of multiple optical tracking technologies, such as 2D natural features and features of more complex three-dimensional structures guarantees the robustness of the system. All processing is done locally and no network connection is needed. State-of-the-art indoor tracking approaches use mainly radio-frequency signals like Wi-Fi or Bluetooth for localizing a user. In contrast to these approaches, the main advantage of the developed system is the capability of delivering a continuous 3D position and orientation of the mobile device with centimeter accuracy. This makes it usable for localization and 3D augmentation purposes, e.g. navigation tasks or location-based information visualization. PMID:26712755
A method for real-time visual stimulus selection in the study of cortical object perception.
Leeds, Daniel D; Tarr, Michael J
2016-06-01
The properties utilized by visual object perception in the mid- and high-level ventral visual pathway are poorly understood. To better establish and explore possible models of these properties, we adopt a data-driven approach in which we repeatedly interrogate neural units using functional Magnetic Resonance Imaging (fMRI) to establish each unit's image selectivity. This approach to imaging necessitates a search through a broad space of stimulus properties using a limited number of samples. To more quickly identify the complex visual features underlying human cortical object perception, we implemented a new functional magnetic resonance imaging protocol in which visual stimuli are selected in real-time based on BOLD responses to recently shown images. Two variations of this protocol were developed, one relying on natural object stimuli and a second based on synthetic object stimuli, both embedded in feature spaces based on the complex visual properties of the objects. During fMRI scanning, we continuously controlled stimulus selection in the context of a real-time search through these image spaces in order to maximize neural responses across pre-determined 1cm(3) rain regions. Elsewhere we have reported the patterns of cortical selectivity revealed by this approach (Leeds et al., 2014). In contrast, here our objective is to present more detailed methods and explore the technical and biological factors influencing the behavior of our real-time stimulus search. We observe that: 1) Searches converged more reliably when exploring a more precisely parameterized space of synthetic objects; 2) real-time estimation of cortical responses to stimuli is reasonably consistent; 3) search behavior was acceptably robust to delays in stimulus displays and subject motion effects. Overall, our results indicate that real-time fMRI methods may provide a valuable platform for continuing study of localized neural selectivity, both for visual object representation and beyond. Copyright © 2016 Elsevier Inc. All rights reserved.
A method for real-time visual stimulus selection in the study of cortical object perception
Leeds, Daniel D.; Tarr, Michael J.
2016-01-01
The properties utilized by visual object perception in the mid- and high-level ventral visual pathway are poorly understood. To better establish and explore possible models of these properties, we adopt a data-driven approach in which we repeatedly interrogate neural units using functional Magnetic Resonance Imaging (fMRI) to establish each unit’s image selectivity. This approach to imaging necessitates a search through a broad space of stimulus properties using a limited number of samples. To more quickly identify the complex visual features underlying human cortical object perception, we implemented a new functional magnetic resonance imaging protocol in which visual stimuli are selected in real-time based on BOLD responses to recently shown images. Two variations of this protocol were developed, one relying on natural object stimuli and a second based on synthetic object stimuli, both embedded in feature spaces based on the complex visual properties of the objects. During fMRI scanning, we continuously controlled stimulus selection in the context of a real-time search through these image spaces in order to maximize neural responses across predetermined 1 cm3 brain regions. Elsewhere we have reported the patterns of cortical selectivity revealed by this approach (Leeds 2014). In contrast, here our objective is to present more detailed methods and explore the technical and biological factors influencing the behavior of our real-time stimulus search. We observe that: 1) Searches converged more reliably when exploring a more precisely parameterized space of synthetic objects; 2) Real-time estimation of cortical responses to stimuli are reasonably consistent; 3) Search behavior was acceptably robust to delays in stimulus displays and subject motion effects. Overall, our results indicate that real-time fMRI methods may provide a valuable platform for continuing study of localized neural selectivity, both for visual object representation and beyond. PMID:26973168
ERIC Educational Resources Information Center
Chen, Y.; Norton, D. J.; McBain, R.; Gold, J.; Frazier, J. A.; Coyle, J. T.
2012-01-01
An important issue for understanding visual perception in autism concerns whether individuals with this neurodevelopmental disorder possess an advantage in processing local visual information, and if so, what is the nature of this advantage. Perception of movement speed is a visual process that relies on computation of local spatiotemporal signals…
Weighted feature selection criteria for visual servoing of a telerobot
NASA Technical Reports Server (NTRS)
Feddema, John T.; Lee, C. S. G.; Mitchell, O. R.
1989-01-01
Because of the continually changing environment of a space station, visual feedback is a vital element of a telerobotic system. A real time visual servoing system would allow a telerobot to track and manipulate randomly moving objects. Methodologies for the automatic selection of image features to be used to visually control the relative position between an eye-in-hand telerobot and a known object are devised. A weighted criteria function with both image recognition and control components is used to select the combination of image features which provides the best control. Simulation and experimental results of a PUMA robot arm visually tracking a randomly moving carburetor gasket with a visual update time of 70 milliseconds are discussed.
Trace Elemental Imaging of Rare Earth Elements Discriminates Tissues at Microscale in Flat Fossils
Gueriau, Pierre; Mocuta, Cristian; Dutheil, Didier B.; Cohen, Serge X.; Thiaudière, Dominique; Charbonnier, Sylvain; Clément, Gaël; Bertrand, Loïc
2014-01-01
The interpretation of flattened fossils remains a major challenge due to compression of their complex anatomies during fossilization, making critical anatomical features invisible or hardly discernible. Key features are often hidden under greatly preserved decay prone tissues, or an unpreparable sedimentary matrix. A method offering access to such anatomical features is of paramount interest to resolve taxonomic affinities and to study fossils after a least possible invasive preparation. Unfortunately, the widely-used X-ray micro-computed tomography, for visualizing hidden or internal structures of a broad range of fossils, is generally inapplicable to flattened specimens, due to the very high differential absorbance in distinct directions. Here we show that synchrotron X-ray fluorescence spectral raster-scanning coupled to spectral decomposition or a much faster Kullback-Leibler divergence based statistical analysis provides microscale visualization of tissues. We imaged exceptionally well-preserved fossils from the Late Cretaceous without needing any prior delicate preparation. The contrasting elemental distributions greatly improved the discrimination of skeletal elements material from both the sedimentary matrix and fossilized soft tissues. Aside content in alkaline earth elements and phosphorus, a critical parameter for tissue discrimination is the distinct amounts of rare earth elements. Local quantification of rare earths may open new avenues for fossil description but also in paleoenvironmental and taphonomical studies. PMID:24489809
Trace elemental imaging of rare earth elements discriminates tissues at microscale in flat fossils.
Gueriau, Pierre; Mocuta, Cristian; Dutheil, Didier B; Cohen, Serge X; Thiaudière, Dominique; Charbonnier, Sylvain; Clément, Gaël; Bertrand, Loïc
2014-01-01
The interpretation of flattened fossils remains a major challenge due to compression of their complex anatomies during fossilization, making critical anatomical features invisible or hardly discernible. Key features are often hidden under greatly preserved decay prone tissues, or an unpreparable sedimentary matrix. A method offering access to such anatomical features is of paramount interest to resolve taxonomic affinities and to study fossils after a least possible invasive preparation. Unfortunately, the widely-used X-ray micro-computed tomography, for visualizing hidden or internal structures of a broad range of fossils, is generally inapplicable to flattened specimens, due to the very high differential absorbance in distinct directions. Here we show that synchrotron X-ray fluorescence spectral raster-scanning coupled to spectral decomposition or a much faster Kullback-Leibler divergence based statistical analysis provides microscale visualization of tissues. We imaged exceptionally well-preserved fossils from the Late Cretaceous without needing any prior delicate preparation. The contrasting elemental distributions greatly improved the discrimination of skeletal elements material from both the sedimentary matrix and fossilized soft tissues. Aside content in alkaline earth elements and phosphorus, a critical parameter for tissue discrimination is the distinct amounts of rare earth elements. Local quantification of rare earths may open new avenues for fossil description but also in paleoenvironmental and taphonomical studies.
Weakly supervised visual dictionary learning by harnessing image attributes.
Gao, Yue; Ji, Rongrong; Liu, Wei; Dai, Qionghai; Hua, Gang
2014-12-01
Bag-of-features (BoFs) representation has been extensively applied to deal with various computer vision applications. To extract discriminative and descriptive BoF, one important step is to learn a good dictionary to minimize the quantization loss between local features and codewords. While most existing visual dictionary learning approaches are engaged with unsupervised feature quantization, the latest trend has turned to supervised learning by harnessing the semantic labels of images or regions. However, such labels are typically too expensive to acquire, which restricts the scalability of supervised dictionary learning approaches. In this paper, we propose to leverage image attributes to weakly supervise the dictionary learning procedure without requiring any actual labels. As a key contribution, our approach establishes a generative hidden Markov random field (HMRF), which models the quantized codewords as the observed states and the image attributes as the hidden states, respectively. Dictionary learning is then performed by supervised grouping the observed states, where the supervised information is stemmed from the hidden states of the HMRF. In such a way, the proposed dictionary learning approach incorporates the image attributes to learn a semantic-preserving BoF representation without any genuine supervision. Experiments in large-scale image retrieval and classification tasks corroborate that our approach significantly outperforms the state-of-the-art unsupervised dictionary learning approaches.
Impact of feature saliency on visual category learning.
Hammer, Rubi
2015-01-01
People have to sort numerous objects into a large number of meaningful categories while operating in varying contexts. This requires identifying the visual features that best predict the 'essence' of objects (e.g., edibility), rather than categorizing objects based on the most salient features in a given context. To gain this capacity, visual category learning (VCL) relies on multiple cognitive processes. These may include unsupervised statistical learning, that requires observing multiple objects for learning the statistics of their features. Other learning processes enable incorporating different sources of supervisory information, alongside the visual features of the categorized objects, from which the categorical relations between few objects can be deduced. These deductions enable inferring that objects from the same category may differ from one another in some high-saliency feature dimensions, whereas lower-saliency feature dimensions can best differentiate objects from distinct categories. Here I illustrate how feature saliency affects VCL, by also discussing kinds of supervisory information enabling reflective categorization. Arguably, principles debated here are often being ignored in categorization studies.
Impact of feature saliency on visual category learning
Hammer, Rubi
2015-01-01
People have to sort numerous objects into a large number of meaningful categories while operating in varying contexts. This requires identifying the visual features that best predict the ‘essence’ of objects (e.g., edibility), rather than categorizing objects based on the most salient features in a given context. To gain this capacity, visual category learning (VCL) relies on multiple cognitive processes. These may include unsupervised statistical learning, that requires observing multiple objects for learning the statistics of their features. Other learning processes enable incorporating different sources of supervisory information, alongside the visual features of the categorized objects, from which the categorical relations between few objects can be deduced. These deductions enable inferring that objects from the same category may differ from one another in some high-saliency feature dimensions, whereas lower-saliency feature dimensions can best differentiate objects from distinct categories. Here I illustrate how feature saliency affects VCL, by also discussing kinds of supervisory information enabling reflective categorization. Arguably, principles debated here are often being ignored in categorization studies. PMID:25954220
Scoring nuclear pleomorphism using a visual BoF modulated by a graph structure
NASA Astrophysics Data System (ADS)
Moncayo-Martínez, Ricardo; Romo-Bucheli, David; Arias, Viviana; Romero, Eduardo
2017-11-01
Nuclear pleomorphism has been recognized as a key histological criterium in breast cancer grading systems (such as Bloom Richardson and Nothingham grading systems). However, the nuclear pleomorphism assessment is subjective and presents high inter-reader variability. Automatic algorithms might facilitate quantitative estimation of nuclear variations in shape and size. Nevertheless, the automatic segmentation of the nuclei is difficult and still and open research problem. This paper presents a method using a bag of multi-scale visual features, modulated by a graph structure, to grade nuclei in breast cancer microscopical fields. This strategy constructs hematoxylin-eosin image patches, each containing a nucleus that is represented by a set of visual words in the BoF. The contribution of each visual word is computed by examining the visual words in an associated graph built when projecting the multi-dimensional BoF to a bi-dimensional plane where local relationships are conserved. The methodology was evaluated using 14 breast cancer cases of the Cancer Genome Atlas database. From these cases, a set of 134 microscopical fields was extracted, and under a leave-one-out validation scheme, an average F-score of 0.68 was obtained.
A novel approach for fire recognition using hybrid features and manifold learning-based classifier
NASA Astrophysics Data System (ADS)
Zhu, Rong; Hu, Xueying; Tang, Jiajun; Hu, Sheng
2018-03-01
Although image/video based fire recognition has received growing attention, an efficient and robust fire detection strategy is rarely explored. In this paper, we propose a novel approach to automatically identify the flame or smoke regions in an image. It is composed to three stages: (1) a block processing is applied to divide an image into several nonoverlapping image blocks, and these image blocks are identified as suspicious fire regions or not by using two color models and a color histogram-based similarity matching method in the HSV color space, (2) considering that compared to other information, the flame and smoke regions have significant visual characteristics, so that two kinds of image features are extracted for fire recognition, where local features are obtained based on the Scale Invariant Feature Transform (SIFT) descriptor and the Bags of Keypoints (BOK) technique, and texture features are extracted based on the Gray Level Co-occurrence Matrices (GLCM) and the Wavelet-based Analysis (WA) methods, and (3) a manifold learning-based classifier is constructed based on two image manifolds, which is designed via an improve Globular Neighborhood Locally Linear Embedding (GNLLE) algorithm, and the extracted hybrid features are used as input feature vectors to train the classifier, which is used to make decision for fire images or non fire images. Experiments and comparative analyses with four approaches are conducted on the collected image sets. The results show that the proposed approach is superior to the other ones in detecting fire and achieving a high recognition accuracy and a low error rate.
Robust Stereo Visual Odometry Using Improved RANSAC-Based Methods for Mobile Robot Localization
Liu, Yanqing; Gu, Yuzhang; Li, Jiamao; Zhang, Xiaolin
2017-01-01
In this paper, we present a novel approach for stereo visual odometry with robust motion estimation that is faster and more accurate than standard RANSAC (Random Sample Consensus). Our method makes improvements in RANSAC in three aspects: first, the hypotheses are preferentially generated by sampling the input feature points on the order of ages and similarities of the features; second, the evaluation of hypotheses is performed based on the SPRT (Sequential Probability Ratio Test) that makes bad hypotheses discarded very fast without verifying all the data points; third, we aggregate the three best hypotheses to get the final estimation instead of only selecting the best hypothesis. The first two aspects improve the speed of RANSAC by generating good hypotheses and discarding bad hypotheses in advance, respectively. The last aspect improves the accuracy of motion estimation. Our method was evaluated in the KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) and the New Tsukuba dataset. Experimental results show that the proposed method achieves better results for both speed and accuracy than RANSAC. PMID:29027935
Phan, Thanh Vân; Seoud, Lama; Chakor, Hadi; Cheriet, Farida
2016-01-01
Age-related macular degeneration (AMD) is a disease which causes visual deficiency and irreversible blindness to the elderly. In this paper, an automatic classification method for AMD is proposed to perform robust and reproducible assessments in a telemedicine context. First, a study was carried out to highlight the most relevant features for AMD characterization based on texture, color, and visual context in fundus images. A support vector machine and a random forest were used to classify images according to the different AMD stages following the AREDS protocol and to evaluate the features' relevance. Experiments were conducted on a database of 279 fundus images coming from a telemedicine platform. The results demonstrate that local binary patterns in multiresolution are the most relevant for AMD classification, regardless of the classifier used. Depending on the classification task, our method achieves promising performances with areas under the ROC curve between 0.739 and 0.874 for screening and between 0.469 and 0.685 for grading. Moreover, the proposed automatic AMD classification system is robust with respect to image quality. PMID:27190636
NASA Astrophysics Data System (ADS)
Kirubanandham, A.; Lujan-Regalado, I.; Vallabhaneni, R.; Chawla, N.
2016-11-01
Decreasing pitch size in electronic packaging has resulted in a drastic decrease in solder volumes. The Sn grain crystallography and fraction of intermetallic compounds (IMCs) in small-scale solder joints evolve much differently at the smaller length scales. A cross-sectional study limits the morphological analysis of microstructural features to two dimensions. This study utilizes serial sectioning technique in conjunction with electron backscatter diffraction to investigate the crystallographic orientation of both Sn grains and Cu6Sn5 IMCs in Cu/Pure Sn/Cu solder joints in three dimensional (3D). Quantification of grain aspect ratio is affected by local cooling rate differences within the solder volume. Backscatter electron imaging and focused ion beam serial sectioning enabled the visualization of morphology of both nanosized Cu6Sn5 IMCs and the hollow hexagonal morphology type Cu6Sn5 IMCs in 3D. Quantification and visualization of microstructural features in 3D thus enable us to better understand the microstructure and deformation mechanics within these small scale solder joints.
NASA Astrophysics Data System (ADS)
Davis, Paul B.; Abidi, Mongi A.
1989-05-01
PET is the only imaging modality that provides doctors with early analytic and quantitative biochemical assessment and precise localization of pathology. In PET images, boundary information as well as local pixel intensity are both crucial for manual and/or automated feature tracing, extraction, and identification. Unfortunately, the present PET technology does not provide the necessary image quality from which such precise analytic and quantitative measurements can be made. PET images suffer from significantly high levels of radial noise present in the form of streaks caused by the inexactness of the models used in image reconstruction. In this paper, our objective is to model PET noise and remove it without altering dominant features in the image. The ultimate goal here is to enhance these dominant features to allow for automatic computer interpretation and classification of PET images by developing techniques that take into consideration PET signal characteristics, data collection, and data reconstruction. We have modeled the noise steaks in PET images in both rectangular and polar representations and have shown both analytically and through computer simulation that it exhibits consistent mapping patterns. A class of filters was designed and applied successfully. Visual inspection of the filtered images show clear enhancement over the original images.
sRNAdb: A small non-coding RNA database for gram-positive bacteria
2012-01-01
Background The class of small non-coding RNA molecules (sRNA) regulates gene expression by different mechanisms and enables bacteria to mount a physiological response due to adaptation to the environment or infection. Over the last decades the number of sRNAs has been increasing rapidly. Several databases like Rfam or fRNAdb were extended to include sRNAs as a class of its own. Furthermore new specialized databases like sRNAMap (gram-negative bacteria only) and sRNATarBase (target prediction) were established. To the best of the authors’ knowledge no database focusing on sRNAs from gram-positive bacteria is publicly available so far. Description In order to understand sRNA’s functional and phylogenetic relationships we have developed sRNAdb and provide tools for data analysis and visualization. The data compiled in our database is assembled from experiments as well as from bioinformatics analyses. The software enables comparison and visualization of gene loci surrounding the sRNAs of interest. To accomplish this, we use a client–server based approach. Offline versions of the database including analyses and visualization tools can easily be installed locally on the user’s computer. This feature facilitates customized local addition of unpublished sRNA candidates and related information such as promoters or terminators using tab-delimited files. Conclusion sRNAdb allows a user-friendly and comprehensive comparative analysis of sRNAs from available sequenced gram-positive prokaryotic replicons. Offline versions including analysis and visualization tools facilitate complex user specific bioinformatics analyses. PMID:22883983
A survey of simultaneous localization and mapping on unstructured lunar complex environment
NASA Astrophysics Data System (ADS)
Wang, Yiqiao; Zhang, Wei; An, Pei
2017-10-01
Simultaneous localization and mapping (SLAM) technology is the key to realizing lunar rover's intelligent perception and autonomous navigation. It embodies the autonomous ability of mobile robot, and has attracted plenty of concerns of researchers in the past thirty years. Visual sensors are meaningful to SLAM research because they can provide a wealth of information. Visual SLAM uses merely images as external information to estimate the location of the robot and construct the environment map. Nowadays, SLAM technology still has problems when applied in large-scale, unstructured and complex environment. Based on the latest technology in the field of visual SLAM, this paper investigates and summarizes the SLAM technology using in the unstructured complex environment of lunar surface. In particular, we focus on summarizing and comparing the detection and matching of features of SIFT, SURF and ORB, in the meanwhile discussing their advantages and disadvantages. We have analyzed the three main methods: SLAM Based on Extended Kalman Filter, SLAM Based on Particle Filter and SLAM Based on Graph Optimization (EKF-SLAM, PF-SLAM and Graph-based SLAM). Finally, this article summarizes and discusses the key scientific and technical difficulties in the lunar context that Visual SLAM faces. At the same time, we have explored the frontier issues such as multi-sensor fusion SLAM and multi-robot cooperative SLAM technology. We also predict and prospect the development trend of lunar rover SLAM technology, and put forward some ideas of further research.
Pigmented skin lesion detection using random forest and wavelet-based texture
NASA Astrophysics Data System (ADS)
Hu, Ping; Yang, Tie-jun
2016-10-01
The incidence of cutaneous malignant melanoma, a disease of worldwide distribution and is the deadliest form of skin cancer, has been rapidly increasing over the last few decades. Because advanced cutaneous melanoma is still incurable, early detection is an important step toward a reduction in mortality. Dermoscopy photographs are commonly used in melanoma diagnosis and can capture detailed features of a lesion. A great variability exists in the visual appearance of pigmented skin lesions. Therefore, in order to minimize the diagnostic errors that result from the difficulty and subjectivity of visual interpretation, an automatic detection approach is required. The objectives of this paper were to propose a hybrid method using random forest and Gabor wavelet transformation to accurately differentiate which part belong to lesion area and the other is not in a dermoscopy photographs and analyze segmentation accuracy. A random forest classifier consisting of a set of decision trees was used for classification. Gabor wavelets transformation are the mathematical model of visual cortical cells of mammalian brain and an image can be decomposed into multiple scales and multiple orientations by using it. The Gabor function has been recognized as a very useful tool in texture analysis, due to its optimal localization properties in both spatial and frequency domain. Texture features based on Gabor wavelets transformation are found by the Gabor filtered image. Experiment results indicate the following: (1) the proposed algorithm based on random forest outperformed the-state-of-the-art in pigmented skin lesions detection (2) and the inclusion of Gabor wavelet transformation based texture features improved segmentation accuracy significantly.
The role of convexity in perception of symmetry and in visual short-term memory.
Bertamini, Marco; Helmy, Mai Salah; Hulleman, Johan
2013-01-01
Visual perception of shape is affected by coding of local convexities and concavities. For instance, a recent study reported that deviations from symmetry carried by convexities were easier to detect than deviations carried by concavities. We removed some confounds and extended this work from a detection of reflection of a contour (i.e., bilateral symmetry), to a detection of repetition of a contour (i.e., translational symmetry). We tested whether any convexity advantage is specific to bilateral symmetry in a two-interval (Experiment 1) and a single-interval (Experiment 2) detection task. In both, we found a convexity advantage only for repetition. When we removed the need to choose which region of the contour to monitor (Experiment 3) the effect disappeared. In a second series of studies, we again used shapes with multiple convex or concave features. Participants performed a change detection task in which only one of the features could change. We did not find any evidence that convexities are special in visual short-term memory, when the to-be-remembered features only changed shape (Experiment 4), when they changed shape and changed from concave to convex and vice versa (Experiment 5), or when these conditions were mixed (Experiment 6). We did find a small advantage for coding convexity as well as concavity over an isolated (and thus ambiguous) contour. The latter is consistent with the known effect of closure on processing of shape. We conclude that convexity plays a role in many perceptual tasks but that it does not have a basic encoding advantage over concavity.
Weighted score-level feature fusion based on Dempster-Shafer evidence theory for action recognition
NASA Astrophysics Data System (ADS)
Zhang, Guoliang; Jia, Songmin; Li, Xiuzhi; Zhang, Xiangyin
2018-01-01
The majority of human action recognition methods use multifeature fusion strategy to improve the classification performance, where the contribution of different features for specific action has not been paid enough attention. We present an extendible and universal weighted score-level feature fusion method using the Dempster-Shafer (DS) evidence theory based on the pipeline of bag-of-visual-words. First, the partially distinctive samples in the training set are selected to construct the validation set. Then, local spatiotemporal features and pose features are extracted from these samples to obtain evidence information. The DS evidence theory and the proposed rule of survival of the fittest are employed to achieve evidence combination and calculate optimal weight vectors of every feature type belonging to each action class. Finally, the recognition results are deduced via the weighted summation strategy. The performance of the established recognition framework is evaluated on Penn Action dataset and a subset of the joint-annotated human metabolome database (sub-JHMDB). The experiment results demonstrate that the proposed feature fusion method can adequately exploit the complementarity among multiple features and improve upon most of the state-of-the-art algorithms on Penn Action and sub-JHMDB datasets.
Welikala, R A; Fraz, M M; Dehmeshki, J; Hoppe, A; Tah, V; Mann, S; Williamson, T H; Barman, S A
2015-07-01
Proliferative diabetic retinopathy (PDR) is a condition that carries a high risk of severe visual impairment. The hallmark of PDR is the growth of abnormal new vessels. In this paper, an automated method for the detection of new vessels from retinal images is presented. This method is based on a dual classification approach. Two vessel segmentation approaches are applied to create two separate binary vessel map which each hold vital information. Local morphology features are measured from each binary vessel map to produce two separate 4-D feature vectors. Independent classification is performed for each feature vector using a support vector machine (SVM) classifier. The system then combines these individual outcomes to produce a final decision. This is followed by the creation of additional features to generate 21-D feature vectors, which feed into a genetic algorithm based feature selection approach with the objective of finding feature subsets that improve the performance of the classification. Sensitivity and specificity results using a dataset of 60 images are 0.9138 and 0.9600, respectively, on a per patch basis and 1.000 and 0.975, respectively, on a per image basis. Copyright © 2015 Elsevier Ltd. All rights reserved.
Spatial and Feature-Based Attention in a Layered Cortical Microcircuit Model
Wagatsuma, Nobuhiko; Potjans, Tobias C.; Diesmann, Markus; Sakai, Ko; Fukai, Tomoki
2013-01-01
Directing attention to the spatial location or the distinguishing feature of a visual object modulates neuronal responses in the visual cortex and the stimulus discriminability of subjects. However, the spatial and feature-based modes of attention differently influence visual processing by changing the tuning properties of neurons. Intriguingly, neurons' tuning curves are modulated similarly across different visual areas under both these modes of attention. Here, we explored the mechanism underlying the effects of these two modes of visual attention on the orientation selectivity of visual cortical neurons. To do this, we developed a layered microcircuit model. This model describes multiple orientation-specific microcircuits sharing their receptive fields and consisting of layers 2/3, 4, 5, and 6. These microcircuits represent a functional grouping of cortical neurons and mutually interact via lateral inhibition and excitatory connections between groups with similar selectivity. The individual microcircuits receive bottom-up visual stimuli and top-down attention in different layers. A crucial assumption of the model is that feature-based attention activates orientation-specific microcircuits for the relevant feature selectively, whereas spatial attention activates all microcircuits homogeneously, irrespective of their orientation selectivity. Consequently, our model simultaneously accounts for the multiplicative scaling of neuronal responses in spatial attention and the additive modulations of orientation tuning curves in feature-based attention, which have been observed widely in various visual cortical areas. Simulations of the model predict contrasting differences between excitatory and inhibitory neurons in the two modes of attentional modulations. Furthermore, the model replicates the modulation of the psychophysical discriminability of visual stimuli in the presence of external noise. Our layered model with a biologically suggested laminar structure describes the basic circuit mechanism underlying the attention-mode specific modulations of neuronal responses and visual perception. PMID:24324628
Binding of motion and colour is early and automatic.
Blaser, Erik; Papathomas, Thomas; Vidnyánszky, Zoltán
2005-04-01
At what stages of the human visual hierarchy different features are bound together, and whether this binding requires attention, is still highly debated. We used a colour-contingent motion after-effect (CCMAE) to study the binding of colour and motion signals. The logic of our approach was as follows: if CCMAEs can be evoked by targeted adaptation of early motion processing stages, without allowing for feedback from higher motion integration stages, then this would support our hypothesis that colour and motion are bound automatically on the basis of spatiotemporally local information. Our results show for the first time that CCMAE's can be evoked by adaptation to a locally paired opposite-motion dot display, a stimulus that, importantly, is known to trigger direction-specific responses in the primary visual cortex yet results in strong inhibition of the directional responses in area MT of macaques as well as in area MT+ in humans and, indeed, is perceived only as motionless flicker. The magnitude of the CCMAE in the locally paired condition was not significantly different from control conditions where the different directions were spatiotemporally separated (i.e. not locally paired) and therefore perceived as two moving fields. These findings provide evidence that adaptation at an early, local motion stage, and only adaptation at this stage, underlies this CCMAE, which in turn implies that spatiotemporally coincident colour and motion signals are bound automatically, most probably as early as cortical area V1, even when the association between colour and motion is perceptually inaccessible.
Li, Yiqing; Wang, Yu; Zi, Yanyang; Zhang, Mingquan
2015-10-21
The various multi-sensor signal features from a diesel engine constitute a complex high-dimensional dataset. The non-linear dimensionality reduction method, t-distributed stochastic neighbor embedding (t-SNE), provides an effective way to implement data visualization for complex high-dimensional data. However, irrelevant features can deteriorate the performance of data visualization, and thus, should be eliminated a priori. This paper proposes a feature subset score based t-SNE (FSS-t-SNE) data visualization method to deal with the high-dimensional data that are collected from multi-sensor signals. In this method, the optimal feature subset is constructed by a feature subset score criterion. Then the high-dimensional data are visualized in 2-dimension space. According to the UCI dataset test, FSS-t-SNE can effectively improve the classification accuracy. An experiment was performed with a large power marine diesel engine to validate the proposed method for diesel engine malfunction classification. Multi-sensor signals were collected by a cylinder vibration sensor and a cylinder pressure sensor. Compared with other conventional data visualization methods, the proposed method shows good visualization performance and high classification accuracy in multi-malfunction classification of a diesel engine.
Li, Yiqing; Wang, Yu; Zi, Yanyang; Zhang, Mingquan
2015-01-01
The various multi-sensor signal features from a diesel engine constitute a complex high-dimensional dataset. The non-linear dimensionality reduction method, t-distributed stochastic neighbor embedding (t-SNE), provides an effective way to implement data visualization for complex high-dimensional data. However, irrelevant features can deteriorate the performance of data visualization, and thus, should be eliminated a priori. This paper proposes a feature subset score based t-SNE (FSS-t-SNE) data visualization method to deal with the high-dimensional data that are collected from multi-sensor signals. In this method, the optimal feature subset is constructed by a feature subset score criterion. Then the high-dimensional data are visualized in 2-dimension space. According to the UCI dataset test, FSS-t-SNE can effectively improve the classification accuracy. An experiment was performed with a large power marine diesel engine to validate the proposed method for diesel engine malfunction classification. Multi-sensor signals were collected by a cylinder vibration sensor and a cylinder pressure sensor. Compared with other conventional data visualization methods, the proposed method shows good visualization performance and high classification accuracy in multi-malfunction classification of a diesel engine. PMID:26506347
Visual feature integration with an attention deficit.
Arguin, M; Cavanagh, P; Joanette, Y
1994-01-01
Treisman's feature integration theory proposes that the perception of illusory conjunctions of correctly encoded visual features is due to the failure of an attentional process. This hypothesis was examined by studying brain-damaged subjects who had previously been shown to have difficulty in attending to contralesional stimulation. These subjects exhibited a massive feature integration deficit for contralesional stimulation relative to ipsilesional displays. In contrast, both normal age-matched controls and brain-damaged subjects who did not exhibit any evidence of an attention deficit showed comparable feature integration performance with left- and right-hemifield stimulation. These observations indicate the crucial function of attention for visual feature integration in normal perception.
Features in visual search combine linearly
Pramod, R. T.; Arun, S. P.
2014-01-01
Single features such as line orientation and length are known to guide visual search, but relatively little is known about how multiple features combine in search. To address this question, we investigated how search for targets differing in multiple features (intensity, length, orientation) from the distracters is related to searches for targets differing in each of the individual features. We tested race models (based on reaction times) and co-activation models (based on reciprocal of reaction times) for their ability to predict multiple feature searches. Multiple feature searches were best accounted for by a co-activation model in which feature information combined linearly (r = 0.95). This result agrees with the classic finding that these features are separable i.e., subjective dissimilarity ratings sum linearly. We then replicated the classical finding that the length and width of a rectangle are integral features—in other words, they combine nonlinearly in visual search. However, to our surprise, upon including aspect ratio as an additional feature, length and width combined linearly and this model outperformed all other models. Thus, length and width of a rectangle became separable when considered together with aspect ratio. This finding predicts that searches involving shapes with identical aspect ratio should be more difficult than searches where shapes differ in aspect ratio. We confirmed this prediction on a variety of shapes. We conclude that features in visual search co-activate linearly and demonstrate for the first time that aspect ratio is a novel feature that guides visual search. PMID:24715328
Left cytoarchitectonic BA 44 processes syntactic gender violations in determiner phrases.
Heim, Stefan; van Ermingen, Muna; Huber, Walter; Amunts, Katrin
2010-10-01
Recent neuroimaging studies make contradictory predictions about the involvement of left Brodmann's area (BA) 44 in processing local syntactic violations in determiner phrases (DPs). Some studies suggest a role for BA 44 in detecting local syntactic violations, whereas others attribute this function to the left premotor cortex. Therefore, the present event-related functional magnetic resonance imaging (fMRI) study investigated whether left-cytoarchitectonic BA 44 was activated when German DPs involving syntactic gender violations were compared with correct DPs (correct: 'der Baum'-the[masculine] tree[masculine]; violated: 'das Baum'--the[neuter] tree[masculine]). Grammaticality judgements were made for both visual and auditory DPs to be able to generalize the results across modalities. Grammaticality judgements involved, among others, left BA 44 and left BA 6 in the premotor cortex for visual and auditory stimuli. Most importantly, activation in left BA 44 was consistently higher for violated than for correct DPs. This finding was behaviourally corroborated by longer reaction times for violated versus correct DPs. Additional brain regions, showing the same effect, included left premotor cortex, supplementary motor area, right middle and superior frontal cortex, and left cerebellum. Based on earlier findings from the literature, the results indicate the involvement of left BA 44 in processing local syntactic violations when these include morphological features, whereas left premotor cortex seems crucial for the detection of local word category violations. © 2010 Wiley-Liss, Inc.
Lighting design for globally illuminated volume rendering.
Zhang, Yubo; Ma, Kwan-Liu
2013-12-01
With the evolution of graphics hardware, high quality global illumination becomes available for real-time volume rendering. Compared to local illumination, global illumination can produce realistic shading effects which are closer to real world scenes, and has proven useful for enhancing volume data visualization to enable better depth and shape perception. However, setting up optimal lighting could be a nontrivial task for average users. There were lighting design works for volume visualization but they did not consider global light transportation. In this paper, we present a lighting design method for volume visualization employing global illumination. The resulting system takes into account view and transfer-function dependent content of the volume data to automatically generate an optimized three-point lighting environment. Our method fully exploits the back light which is not used by previous volume visualization systems. By also including global shadow and multiple scattering, our lighting system can effectively enhance the depth and shape perception of volumetric features of interest. In addition, we propose an automatic tone mapping operator which recovers visual details from overexposed areas while maintaining sufficient contrast in the dark areas. We show that our method is effective for visualizing volume datasets with complex structures. The structural information is more clearly and correctly presented under the automatically generated light sources.
Feature and Region Selection for Visual Learning.
Zhao, Ji; Wang, Liantao; Cabral, Ricardo; De la Torre, Fernando
2016-03-01
Visual learning problems, such as object classification and action recognition, are typically approached using extensions of the popular bag-of-words (BoWs) model. Despite its great success, it is unclear what visual features the BoW model is learning. Which regions in the image or video are used to discriminate among classes? Which are the most discriminative visual words? Answering these questions is fundamental for understanding existing BoW models and inspiring better models for visual recognition. To answer these questions, this paper presents a method for feature selection and region selection in the visual BoW model. This allows for an intermediate visualization of the features and regions that are important for visual learning. The main idea is to assign latent weights to the features or regions, and jointly optimize these latent variables with the parameters of a classifier (e.g., support vector machine). There are four main benefits of our approach: 1) our approach accommodates non-linear additive kernels, such as the popular χ(2) and intersection kernel; 2) our approach is able to handle both regions in images and spatio-temporal regions in videos in a unified way; 3) the feature selection problem is convex, and both problems can be solved using a scalable reduced gradient method; and 4) we point out strong connections with multiple kernel learning and multiple instance learning approaches. Experimental results in the PASCAL VOC 2007, MSR Action Dataset II and YouTube illustrate the benefits of our approach.
Aghamohammadi, Amirhossein; Ang, Mei Choo; A Sundararajan, Elankovan; Weng, Ng Kok; Mogharrebi, Marzieh; Banihashem, Seyed Yashar
2018-01-01
Visual tracking in aerial videos is a challenging task in computer vision and remote sensing technologies due to appearance variation difficulties. Appearance variations are caused by camera and target motion, low resolution noisy images, scale changes, and pose variations. Various approaches have been proposed to deal with appearance variation difficulties in aerial videos, and amongst these methods, the spatiotemporal saliency detection approach reported promising results in the context of moving target detection. However, it is not accurate for moving target detection when visual tracking is performed under appearance variations. In this study, a visual tracking method is proposed based on spatiotemporal saliency and discriminative online learning methods to deal with appearance variations difficulties. Temporal saliency is used to represent moving target regions, and it was extracted based on the frame difference with Sauvola local adaptive thresholding algorithms. The spatial saliency is used to represent the target appearance details in candidate moving regions. SLIC superpixel segmentation, color, and moment features can be used to compute feature uniqueness and spatial compactness of saliency measurements to detect spatial saliency. It is a time consuming process, which prompted the development of a parallel algorithm to optimize and distribute the saliency detection processes that are loaded into the multi-processors. Spatiotemporal saliency is then obtained by combining the temporal and spatial saliencies to represent moving targets. Finally, a discriminative online learning algorithm was applied to generate a sample model based on spatiotemporal saliency. This sample model is then incrementally updated to detect the target in appearance variation conditions. Experiments conducted on the VIVID dataset demonstrated that the proposed visual tracking method is effective and is computationally efficient compared to state-of-the-art methods.
2018-01-01
Visual tracking in aerial videos is a challenging task in computer vision and remote sensing technologies due to appearance variation difficulties. Appearance variations are caused by camera and target motion, low resolution noisy images, scale changes, and pose variations. Various approaches have been proposed to deal with appearance variation difficulties in aerial videos, and amongst these methods, the spatiotemporal saliency detection approach reported promising results in the context of moving target detection. However, it is not accurate for moving target detection when visual tracking is performed under appearance variations. In this study, a visual tracking method is proposed based on spatiotemporal saliency and discriminative online learning methods to deal with appearance variations difficulties. Temporal saliency is used to represent moving target regions, and it was extracted based on the frame difference with Sauvola local adaptive thresholding algorithms. The spatial saliency is used to represent the target appearance details in candidate moving regions. SLIC superpixel segmentation, color, and moment features can be used to compute feature uniqueness and spatial compactness of saliency measurements to detect spatial saliency. It is a time consuming process, which prompted the development of a parallel algorithm to optimize and distribute the saliency detection processes that are loaded into the multi-processors. Spatiotemporal saliency is then obtained by combining the temporal and spatial saliencies to represent moving targets. Finally, a discriminative online learning algorithm was applied to generate a sample model based on spatiotemporal saliency. This sample model is then incrementally updated to detect the target in appearance variation conditions. Experiments conducted on the VIVID dataset demonstrated that the proposed visual tracking method is effective and is computationally efficient compared to state-of-the-art methods. PMID:29438421
A Quantitative Visual Mapping and Visualization Approach for Deep Ocean Floor Research
NASA Astrophysics Data System (ADS)
Hansteen, T. H.; Kwasnitschka, T.
2013-12-01
Geological fieldwork on the sea floor is still impaired by our inability to resolve features on a sub-meter scale resolution in a quantifiable reference frame and over an area large enough to reveal the context of local observations. In order to overcome these issues, we have developed an integrated workflow of visual mapping techniques leading to georeferenced data sets which we examine using state-of-the-art visualization technology to recreate an effective working style of field geology. We demonstrate a microbathymetrical workflow, which is based on photogrammetric reconstruction of ROV imagery referenced to the acoustic vehicle track. The advantage over established acoustical systems lies in the true three-dimensionality of the data as opposed to the perspective projection from above produced by downward looking mapping methods. A full color texture mosaic derived from the imagery allows studies at resolutions beyond the resolved geometry (usually one order of magnitude below the image resolution) while color gives additional clues, which can only be partly resolved in acoustic backscatter. The creation of a three-dimensional model changes the working style from the temporal domain of a video recording back to the spatial domain of a map. We examine these datasets using a custom developed immersive virtual visualization environment. The ARENA (Artificial Research Environment for Networked Analysis) features a (lower) hemispherical screen at a diameter of six meters, accommodating up to four scientists at once thus providing the ability to browse data interactively among a group of researchers. This environment facilitates (1) the development of spatial understanding analogue to on-land outcrop studies, (2) quantitative observations of seafloor morphology and physical parameters of its deposits, (3) more effective formulation and communication of working hypotheses.
NASA Astrophysics Data System (ADS)
Qin, Xinqiang; Hu, Gang; Hu, Kai
2018-01-01
The decomposition of multiple source images using bidimensional empirical mode decomposition (BEMD) often produces mismatched bidimensional intrinsic mode functions, either by their number or their frequency, making image fusion difficult. A solution to this problem is proposed using a fixed number of iterations and a union operation in the sifting process. By combining the local regional features of the images, an image fusion method has been developed. First, the source images are decomposed using the proposed BEMD to produce the first intrinsic mode function (IMF) and residue component. Second, for the IMF component, a selection and weighted average strategy based on local area energy is used to obtain a high-frequency fusion component. Third, for the residue component, a selection and weighted average strategy based on local average gray difference is used to obtain a low-frequency fusion component. Finally, the fused image is obtained by applying the inverse BEMD transform. Experimental results show that the proposed algorithm provides superior performance over methods based on wavelet transform, line and column-based EMD, and complex empirical mode decomposition, both in terms of visual quality and objective evaluation criteria.
Accessing eSDO Solar Image Processing and Visualization through AstroGrid
NASA Astrophysics Data System (ADS)
Auden, E.; Dalla, S.
2008-08-01
The eSDO project is funded by the UK's Science and Technology Facilities Council (STFC) to integrate Solar Dynamics Observatory (SDO) data, algorithms, and visualization tools with the UK's Virtual Observatory project, AstroGrid. In preparation for the SDO launch in January 2009, the eSDO team has developed nine algorithms covering coronal behaviour, feature recognition, and global / local helioseismology. Each of these algorithms has been deployed as an AstroGrid Common Execution Architecture (CEA) application so that they can be included in complex VO workflows. In addition, the PLASTIC-enabled eSDO "Streaming Tool" online movie application allows users to search multi-instrument solar archives through AstroGrid web services and visualise the image data through galleries, an interactive movie viewing applet, and QuickTime movies generated on-the-fly.
Cross-Modal Facilitation in Speech Prosody
ERIC Educational Resources Information Center
Foxton, Jessica M.; Riviere, Louis-David; Barone, Pascal
2010-01-01
Speech prosody has traditionally been considered solely in terms of its auditory features, yet correlated visual features exist, such as head and eyebrow movements. This study investigated the extent to which visual prosodic features are able to affect the perception of the auditory features. Participants were presented with videos of a speaker…
ACUTE RETINAL ARTERIAL OCCLUSIVE DISORDERS
Hayreh, Sohan Singh
2011-01-01
The initial section deals with basic sciences; among the various topics briefly discussed are the anatomical features of ophthalmic, central retinal and cilioretinal arteries which may play a role in acute retinal arterial ischemic disorders. Crucial information required in the management of central retinal artery occlusion (CRAO) is the length of time the retina can survive following that. An experimental study shows that CRAO for 97 minutes produces no detectable permanent retinal damage but there is a progressive ischemic damage thereafter, and by 4 hours the retina has suffered irreversible damage. In the clinical section, I discuss at length various controversies on acute retinal arterial ischemic disorders. Classification of acute retinal arterial ischemic disorders These are of 4 types: CRAO, branch retinal artery occlusion (BRAO), cotton wools spots and amaurosis fugax. Both CRAO and BRAO further comprise multiple clinical entities. Contrary to the universal belief, pathogenetically, clinically and for management, CRAO is not one clinical entity but 4 distinct clinical entities – non-arteritic CRAO, non-arteritic CRAO with cilioretinal artery sparing, arteritic CRAO associated with giant cell arteritis (GCA) and transient non-arteritic CRAO. Similarly, BRAO comprises permanent BRAO, transient BRAO and cilioretinal artery occlusion (CLRAO), and the latter further consists of 3 distinct clinical entities - non-arteritic CLRAO alone, non-arteritic CLRAO associated with central retinal vein occlusion and arteritic CLRAO associated with GCA. Understanding these classifications is essential to comprehend fully various aspects of these disorders. Central retinal artery occlusion The pathogeneses, clinical features and management of the various types of CRAO are discussed in detail. Contrary to the prevalent belief, spontaneous improvement in both visual acuity and visual fields does occur, mainly during the first 7 days. The incidence of spontaneous visual acuity improvement during the first 7 days differs significantly (p<0.001) among the 4 types of CRAO; among them, in eyes with initial visual acuity of counting finger or worse, visual acuity improved, remained stable or deteriorated in nonarteritic CRAO in 22%, 66% and 12% respectively; in nonarteritic CRAO with cilioretinal artery sparing in 67%, 33% and none respectively; and in transient nonarteritic CRAO in 82%, 18% and none respectively. Arteritic CRAO shows no change. Recent studies have shown that administration of local intra-arterial thrombolytic agent not only has no beneficial effect but also can be harmful. Prevalent multiple misconceptions on CRAO are discussed. Branch retinal artery occlusion Pathogeneses, clinical features and management of various types of BRAO are discussed at length. The natural history of visual acuity outcome shows a final visual acuity of 20/40 or better in 89% of permanent BRAO cases, 100% of transient BRAO and 100% of nonarteritic CLRAO alone. Cotton wools spots These are common, non-specific acute focal retinal ischemic lesions, seen in many retinopathies. Their pathogenesis and clinical features are discussed in detail. Amaurosis fugax Its pathogenesis, clinical features and management are described. PMID:21620994
Relevance feedback-based building recognition
NASA Astrophysics Data System (ADS)
Li, Jing; Allinson, Nigel M.
2010-07-01
Building recognition is a nontrivial task in computer vision research which can be utilized in robot localization, mobile navigation, etc. However, existing building recognition systems usually encounter the following two problems: 1) extracted low level features cannot reveal the true semantic concepts; and 2) they usually involve high dimensional data which require heavy computational costs and memory. Relevance feedback (RF), widely applied in multimedia information retrieval, is able to bridge the gap between the low level visual features and high level concepts; while dimensionality reduction methods can mitigate the high-dimensional problem. In this paper, we propose a building recognition scheme which integrates the RF and subspace learning algorithms. Experimental results undertaken on our own building database show that the newly proposed scheme appreciably enhances the recognition accuracy.
Attentive Tracking Disrupts Feature Binding in Visual Working Memory
Fougnie, Daryl; Marois, René
2009-01-01
One of the most influential theories in visual cognition proposes that attention is necessary to bind different visual features into coherent object percepts (Treisman & Gelade, 1980). While considerable evidence supports a role for attention in perceptual feature binding, whether attention plays a similar function in visual working memory (VWM) remains controversial. To test the attentional requirements of VWM feature binding, here we gave participants an attention-demanding multiple object tracking task during the retention interval of a VWM task. Results show that the tracking task disrupted memory for color-shape conjunctions above and beyond any impairment to working memory for object features, and that this impairment was larger when the VWM stimuli were presented at different spatial locations. These results demonstrate that the role of visuospatial attention in feature binding is not unique to perception, but extends to the working memory of these perceptual representations as well. PMID:19609460
Generic decoding of seen and imagined objects using hierarchical visual features.
Horikawa, Tomoyasu; Kamitani, Yukiyasu
2017-05-22
Object recognition is a key function in both human and machine vision. While brain decoding of seen and imagined objects has been achieved, the prediction is limited to training examples. We present a decoding approach for arbitrary objects using the machine vision principle that an object category is represented by a set of features rendered invariant through hierarchical processing. We show that visual features, including those derived from a deep convolutional neural network, can be predicted from fMRI patterns, and that greater accuracy is achieved for low-/high-level features with lower-/higher-level visual areas, respectively. Predicted features are used to identify seen/imagined object categories (extending beyond decoder training) from a set of computed features for numerous object images. Furthermore, decoding of imagined objects reveals progressive recruitment of higher-to-lower visual representations. Our results demonstrate a homology between human and machine vision and its utility for brain-based information retrieval.
Working memory resources are shared across sensory modalities.
Salmela, V R; Moisala, M; Alho, K
2014-10-01
A common assumption in the working memory literature is that the visual and auditory modalities have separate and independent memory stores. Recent evidence on visual working memory has suggested that resources are shared between representations, and that the precision of representations sets the limit for memory performance. We tested whether memory resources are also shared across sensory modalities. Memory precision for two visual (spatial frequency and orientation) and two auditory (pitch and tone duration) features was measured separately for each feature and for all possible feature combinations. Thus, only the memory load was varied, from one to four features, while keeping the stimuli similar. In Experiment 1, two gratings and two tones-both containing two varying features-were presented simultaneously. In Experiment 2, two gratings and two tones-each containing only one varying feature-were presented sequentially. The memory precision (delayed discrimination threshold) for a single feature was close to the perceptual threshold. However, as the number of features to be remembered was increased, the discrimination thresholds increased more than twofold. Importantly, the decrease in memory precision did not depend on the modality of the other feature(s), or on whether the features were in the same or in separate objects. Hence, simultaneously storing one visual and one auditory feature had an effect on memory precision equal to those of simultaneously storing two visual or two auditory features. The results show that working memory is limited by the precision of the stored representations, and that working memory can be described as a resource pool that is shared across modalities.
Visual scan-path analysis with feature space transient fixation moments
NASA Astrophysics Data System (ADS)
Dempere-Marco, Laura; Hu, Xiao-Peng; Yang, Guang-Zhong
2003-05-01
The study of eye movements provides useful insight into the cognitive processes underlying visual search tasks. The analysis of the dynamics of eye movements has often been approached from a purely spatial perspective. In many cases, however, it may not be possible to define meaningful or consistent dynamics without considering the features underlying the scan paths. In this paper, the definition of the feature space has been attempted through the concept of visual similarity and non-linear low dimensional embedding, which defines a mapping from the image space into a low dimensional feature manifold that preserves the intrinsic similarity of image patterns. This has enabled the definition of perceptually meaningful features without the use of domain specific knowledge. Based on this, this paper introduces a new concept called Feature Space Transient Fixation Moments (TFM). The approach presented tackles the problem of feature space representation of visual search through the use of TFM. We demonstrate the practical values of this concept for characterizing the dynamics of eye movements in goal directed visual search tasks. We also illustrate how this model can be used to elucidate the fundamental steps involved in skilled search tasks through the evolution of transient fixation moments.
2016-01-01
This study was performed to quantitatively analyze medical knowledge of, and experience with, decision-making in preoperative virtual planning of mandibular reconstruction. Three shape descriptors were designed to evaluate local differences between reconstructed mandibles and patients’ original mandibles. We targeted an asymmetrical, wide range of cutting areas including the mandibular sidepiece, and defined a unique three-dimensional coordinate system for each mandibular image. The generalized algorithms for computing the shape descriptors were integrated into interactive planning software, where the user can refine the preoperative plan using the spatial map of the local shape distance as a visual guide. A retrospective study was conducted with two oral surgeons and two dental technicians using the developed software. The obtained 120 reconstruction plans show that the participants preferred a moderate shape distance rather than optimization to the smallest. We observed that a visually plausible shape could be obtained when considering specific anatomical features (e.g., mental foramen. mandibular midline). The proposed descriptors can be used to multilaterally evaluate reconstruction plans and systematically learn surgical procedures. PMID:27583465
Discriminating Power of Localized Three-Dimensional Facial Morphology
Hammond, Peter; Hutton, Tim J.; Allanson, Judith E.; Buxton, Bernard; Campbell, Linda E.; Clayton-Smith, Jill; Donnai, Dian; Karmiloff-Smith, Annette; Metcalfe, Kay; Murphy, Kieran C.; Patton, Michael; Pober, Barbara; Prescott, Katrina; Scambler, Pete; Shaw, Adam; Smith, Ann C. M.; Stevens, Angela F.; Temple, I. Karen; Hennekam, Raoul; Tassabehji, May
2005-01-01
Many genetic syndromes involve a facial gestalt that suggests a preliminary diagnosis to an experienced clinical geneticist even before a clinical examination and genotyping are undertaken. Previously, using visualization and pattern recognition, we showed that dense surface models (DSMs) of full face shape characterize facial dysmorphology in Noonan and in 22q11 deletion syndromes. In this much larger study of 696 individuals, we extend the use of DSMs of the full face to establish accurate discrimination between controls and individuals with Williams, Smith-Magenis, 22q11 deletion, or Noonan syndromes and between individuals with different syndromes in these groups. However, the full power of the DSM approach is demonstrated by the comparable discriminating abilities of localized facial features, such as periorbital, perinasal, and perioral patches, and the correlation of DSM-based predictions and molecular findings. This study demonstrates the potential of face shape models to assist clinical training through visualization, to support clinical diagnosis of affected individuals through pattern recognition, and to enable the objective comparison of individuals sharing other phenotypic or genotypic properties. PMID:16380911
SeqDepot: streamlined database of biological sequences and precomputed features.
Ulrich, Luke E; Zhulin, Igor B
2014-01-15
Assembling and/or producing integrated knowledge of sequence features continues to be an onerous and redundant task despite a large number of existing resources. We have developed SeqDepot-a novel database that focuses solely on two primary goals: (i) assimilating known primary sequences with predicted feature data and (ii) providing the most simple and straightforward means to procure and readily use this information. Access to >28.5 million sequences and 300 million features is provided through a well-documented and flexible RESTful interface that supports fetching specific data subsets, bulk queries, visualization and searching by MD5 digests or external database identifiers. We have also developed an HTML5/JavaScript web application exemplifying how to interact with SeqDepot and Perl/Python scripts for use with local processing pipelines. Freely available on the web at http://seqdepot.net/. RESTaccess via http://seqdepot.net/api/v1. Database files and scripts maybe downloaded from http://seqdepot.net/download.
NASA Astrophysics Data System (ADS)
Maskey, Manil; Ramachandran, Rahul; Kuo, Kwo-Sen
2015-04-01
The Collaborative WorkBench (CWB) has been successfully developed to support collaborative science algorithm development. It incorporates many features that enable and enhance science collaboration, including the support for both asynchronous and synchronous modes of interactions in collaborations. With the former, members in a team can share a full range of research artifacts, e.g. data, code, visualizations, and even virtual machine images. With the latter, they can engage in dynamic interactions such as notification, instant messaging, file exchange, and, most notably, collaborative programming. CWB also implements behind-the-scene provenance capture as well as version control to relieve scientists of these chores. Furthermore, it has achieved a seamless integration between researchers' local compute environments and those of the Cloud. CWB has also been successfully extended to support instrument verification and validation. Adopted by almost every researcher, the current practice of downloading data to local compute resources for analysis results in much duplication and inefficiency. CWB leverages Cloud infrastructure to provide a central location for data used by an entire science team, thereby eliminating much of this duplication and waste. Furthermore, use of CWB in concert with this same Cloud infrastructure enables co-located analysis with data where opportunities of data-parallelism can be better exploited, thereby further improving efficiency. With its collaboration-enabling features apposite to steps throughout the scientific process, we expect CWB to fundamentally transform research collaboration and realize maximum science productivity.
Colonnier, Fabien; Manecy, Augustin; Juston, Raphaël; Mallot, Hanspeter; Leitel, Robert; Floreano, Dario; Viollet, Stéphane
2015-02-25
In this study, a miniature artificial compound eye (15 mm in diameter) called the curved artificial compound eye (CurvACE) was endowed for the first time with hyperacuity, using similar micro-movements to those occurring in the fly's compound eye. A periodic micro-scanning movement of only a few degrees enables the vibrating compound eye to locate contrasting objects with a 40-fold greater resolution than that imposed by the interommatidial angle. In this study, we developed a new algorithm merging the output of 35 local processing units consisting of adjacent pairs of artificial ommatidia. The local measurements performed by each pair are processed in parallel with very few computational resources, which makes it possible to reach a high refresh rate of 500 Hz. An aerial robotic platform with two degrees of freedom equipped with the active CurvACE placed over naturally textured panels was able to assess its linear position accurately with respect to the environment thanks to its efficient gaze stabilization system. The algorithm was found to perform robustly at different light conditions as well as distance variations relative to the ground and featured small closed-loop positioning errors of the robot in the range of 45 mm. In addition, three tasks of interest were performed without having to change the algorithm: short-range odometry, visual stabilization, and tracking contrasting objects (hands) moving over a textured background.
Visual Saliency Detection Based on Multiscale Deep CNN Features.
Guanbin Li; Yizhou Yu
2016-11-01
Visual saliency is a fundamental problem in both cognitive and computational sciences, including computer vision. In this paper, we discover that a high-quality visual saliency model can be learned from multiscale features extracted using deep convolutional neural networks (CNNs), which have had many successes in visual recognition tasks. For learning such saliency models, we introduce a neural network architecture, which has fully connected layers on top of CNNs responsible for feature extraction at three different scales. The penultimate layer of our neural network has been confirmed to be a discriminative high-level feature vector for saliency detection, which we call deep contrast feature. To generate a more robust feature, we integrate handcrafted low-level features with our deep contrast feature. To promote further research and evaluation of visual saliency models, we also construct a new large database of 4447 challenging images and their pixelwise saliency annotations. Experimental results demonstrate that our proposed method is capable of achieving the state-of-the-art performance on all public benchmarks, improving the F-measure by 6.12% and 10%, respectively, on the DUT-OMRON data set and our new data set (HKU-IS), and lowering the mean absolute error by 9% and 35.3%, respectively, on these two data sets.
Huynh, Duong L; Tripathy, Srimant P; Bedell, Harold E; Ögmen, Haluk
2015-01-01
Human memory is content addressable-i.e., contents of the memory can be accessed using partial information about the bound features of a stored item. In this study, we used a cross-feature cuing technique to examine how the human visual system encodes, binds, and retains information about multiple stimulus features within a set of moving objects. We sought to characterize the roles of three different features (position, color, and direction of motion, the latter two of which are processed preferentially within the ventral and dorsal visual streams, respectively) in the construction and maintenance of object representations. We investigated the extent to which these features are bound together across the following processing stages: during stimulus encoding, sensory (iconic) memory, and visual short-term memory. Whereas all features examined here can serve as cues for addressing content, their effectiveness shows asymmetries and varies according to cue-report pairings and the stage of information processing and storage. Position-based indexing theories predict that position should be more effective as a cue compared to other features. While we found a privileged role for position as a cue at the stimulus-encoding stage, position was not the privileged cue at the sensory and visual short-term memory stages. Instead, the pattern that emerged from our findings is one that mirrors the parallel processing streams in the visual system. This stream-specific binding and cuing effectiveness manifests itself in all three stages of information processing examined here. Finally, we find that the Leaky Flask model proposed in our previous study is applicable to all three features.
Hardman, Kyle O; Cowan, Nelson
2015-03-01
Visual working memory stores stimuli from our environment as representations that can be accessed by high-level control processes. This study addresses a longstanding debate in the literature about whether storage limits in visual working memory include a limit to the complexity of discrete items. We examined the issue with a number of change-detection experiments that used complex stimuli that possessed multiple features per stimulus item. We manipulated the number of relevant features of the stimulus objects in order to vary feature load. In all of our experiments, we found that increased feature load led to a reduction in change-detection accuracy. However, we found that feature load alone could not account for the results but that a consideration of the number of relevant objects was also required. This study supports capacity limits for both feature and object storage in visual working memory. PsycINFO Database Record (c) 2015 APA, all rights reserved.
Extraction of Prostatic Lumina and Automated Recognition for Prostatic Calculus Image Using PCA-SVM
Wang, Zhuocai; Xu, Xiangmin; Ding, Xiaojun; Xiao, Hui; Huang, Yusheng; Liu, Jian; Xing, Xiaofen; Wang, Hua; Liao, D. Joshua
2011-01-01
Identification of prostatic calculi is an important basis for determining the tissue origin. Computation-assistant diagnosis of prostatic calculi may have promising potential but is currently still less studied. We studied the extraction of prostatic lumina and automated recognition for calculus images. Extraction of lumina from prostate histology images was based on local entropy and Otsu threshold recognition using PCA-SVM and based on the texture features of prostatic calculus. The SVM classifier showed an average time 0.1432 second, an average training accuracy of 100%, an average test accuracy of 93.12%, a sensitivity of 87.74%, and a specificity of 94.82%. We concluded that the algorithm, based on texture features and PCA-SVM, can recognize the concentric structure and visualized features easily. Therefore, this method is effective for the automated recognition of prostatic calculi. PMID:21461364
Doctor, Daniel H.; Doctor, Katarina Z.
2012-01-01
In this study the influence of geologic features related to sinkhole susceptibility was analyzed and the results were mapped for the region of Jefferson County, West Virginia. A model of sinkhole density was constructed using Geographically Weighted Regression (GWR) that estimated the relations among discrete geologic or hydrologic features and sinkhole density at each sinkhole location. Nine conditioning factors on sinkhole occurrence were considered as independent variables: distance to faults, fold axes, fracture traces oriented along bedrock strike, fracture traces oriented across bedrock strike, ponds, streams, springs, quarries, and interpolated depth to groundwater. GWR model parameter estimates for each variable were evaluated for significance, and the results were mapped. The results provide visual insight into the influence of these variables on localized sinkhole density, and can be used to provide an objective means of weighting conditioning factors in models of sinkhole susceptibility or hazard risk.
The Role of Attention in the Maintenance of Feature Bindings in Visual Short-term Memory
ERIC Educational Resources Information Center
Johnson, Jeffrey S.; Hollingworth, Andrew; Luck, Steven J.
2008-01-01
This study examined the role of attention in maintaining feature bindings in visual short-term memory. In a change-detection paradigm, participants attempted to detect changes in the colors and orientations of multiple objects; the changes consisted of new feature values in a feature-memory condition and changes in how existing feature values were…
Briand, K A; Klein, R M
1987-05-01
In the present study we investigated whether the visually allocated "beam" studied by Posner and others is the same visual attentional resource that performs the role of feature integration in Treisman's model. Subjects were cued to attend to a certain spatial location by a visual cue, and performance at expected and unexpected stimulus locations was compared. Subjects searched for a target letter (R) with distractor letters that either could give rise to illusory conjunctions (PQ) or could not (PB). Results from three separate experiments showed that orienting attention in response to central cues (endogenous orienting) showed similar effects for both conjunction and feature search. However, when attention was oriented with peripheral visual cues (exogenous orienting), conjunction search showed larger effects of attention than did feature search. It is suggested that the attentional systems that are oriented in response to central and peripheral cues may not be the same and that only the latter performs a role in feature integration. Possibilities for future research are discussed.
Slope histogram distribution-based parametrisation of Martian geomorphic features
NASA Astrophysics Data System (ADS)
Balint, Zita; Székely, Balázs; Kovács, Gábor
2014-05-01
The application of geomorphometric methods on the large Martian digital topographic datasets paves the way to analyse the Martian areomorphic processes in more detail. One of the numerous methods is the analysis is to analyse local slope distributions. To this implementation a visualization program code was developed that allows to calculate the local slope histograms and to compare them based on Kolmogorov distance criterion. As input data we used the digital elevation models (DTMs) derived from HRSC high-resolution stereo camera image from various Martian regions. The Kolmogorov-criterion based discrimination produces classes of slope histograms that displayed using coloration obtaining an image map. In this image map the distribution can be visualized by their different colours representing the various classes. Our goal is to create a local slope histogram based classification for large Martian areas in order to obtain information about general morphological characteristics of the region. This is a contribution of the TMIS.ascrea project, financed by the Austrian Research Promotion Agency (FFG). The present research is partly realized in the frames of TÁMOP 4.2.4.A/2-11-1-2012-0001 high priority "National Excellence Program - Elaborating and Operating an Inland Student and Researcher Personal Support System convergence program" project's scholarship support, using Hungarian state and European Union funds and cofinances from the European Social Fund.
Figure-ground modulation in awake primate thalamus.
Jones, Helen E; Andolina, Ian M; Shipp, Stewart D; Adams, Daniel L; Cudeiro, Javier; Salt, Thomas E; Sillito, Adam M
2015-06-02
Figure-ground discrimination refers to the perception of an object, the figure, against a nondescript background. Neural mechanisms of figure-ground detection have been associated with feedback interactions between higher centers and primary visual cortex and have been held to index the effect of global analysis on local feature encoding. Here, in recordings from visual thalamus of alert primates, we demonstrate a robust enhancement of neuronal firing when the figure, as opposed to the ground, component of a motion-defined figure-ground stimulus is located over the receptive field. In this paradigm, visual stimulation of the receptive field and its near environs is identical across both conditions, suggesting the response enhancement reflects higher integrative mechanisms. It thus appears that cortical activity generating the higher-order percept of the figure is simultaneously reentered into the lowest level that is anatomically possible (the thalamus), so that the signature of the evolving representation of the figure is imprinted on the input driving it in an iterative process.
Figure-ground modulation in awake primate thalamus
Jones, Helen E.; Andolina, Ian M.; Shipp, Stewart D.; Adams, Daniel L.; Cudeiro, Javier; Salt, Thomas E.; Sillito, Adam M.
2015-01-01
Figure-ground discrimination refers to the perception of an object, the figure, against a nondescript background. Neural mechanisms of figure-ground detection have been associated with feedback interactions between higher centers and primary visual cortex and have been held to index the effect of global analysis on local feature encoding. Here, in recordings from visual thalamus of alert primates, we demonstrate a robust enhancement of neuronal firing when the figure, as opposed to the ground, component of a motion-defined figure-ground stimulus is located over the receptive field. In this paradigm, visual stimulation of the receptive field and its near environs is identical across both conditions, suggesting the response enhancement reflects higher integrative mechanisms. It thus appears that cortical activity generating the higher-order percept of the figure is simultaneously reentered into the lowest level that is anatomically possible (the thalamus), so that the signature of the evolving representation of the figure is imprinted on the input driving it in an iterative process. PMID:25901330
Web3DMol: interactive protein structure visualization based on WebGL.
Shi, Maoxiang; Gao, Juntao; Zhang, Michael Q
2017-07-03
A growing number of web-based databases and tools for protein research are being developed. There is now a widespread need for visualization tools to present the three-dimensional (3D) structure of proteins in web browsers. Here, we introduce our 3D modeling program-Web3DMol-a web application focusing on protein structure visualization in modern web browsers. Users submit a PDB identification code or select a PDB archive from their local disk, and Web3DMol will display and allow interactive manipulation of the 3D structure. Featured functions, such as sequence plot, fragment segmentation, measure tool and meta-information display, are offered for users to gain a better understanding of protein structure. Easy-to-use APIs are available for developers to reuse and extend Web3DMol. Web3DMol can be freely accessed at http://web3dmol.duapp.com/, and the source code is distributed under the MIT license. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Global Sensory Qualities and Aesthetic Experience in Music.
Brattico, Pauli; Brattico, Elvira; Vuust, Peter
2017-01-01
A well-known tradition in the study of visual aesthetics holds that the experience of visual beauty is grounded in global computational or statistical properties of the stimulus, for example, scale-invariant Fourier spectrum or self-similarity. Some approaches rely on neural mechanisms, such as efficient computation, processing fluency, or the responsiveness of the cells in the primary visual cortex. These proposals are united by the fact that the contributing factors are hypothesized to be global (i.e., they concern the percept as a whole), formal or non-conceptual (i.e., they concern form instead of content), computational and/or statistical, and based on relatively low-level sensory properties. Here we consider that the study of aesthetic responses to music could benefit from the same approach. Thus, along with local features such as pitch, tuning, consonance/dissonance, harmony, timbre, or beat, also global sonic properties could be viewed as contributing toward creating an aesthetic musical experience. Several such properties are discussed and their neural implementation is reviewed in the light of recent advances in neuroaesthetics.
Klink, P Christiaan; Dagnino, Bruno; Gariel-Mathis, Marie-Alice; Roelfsema, Pieter R
2017-07-05
The visual cortex is hierarchically organized, with low-level areas coding for simple features and higher areas for complex ones. Feedforward and feedback connections propagate information between areas in opposite directions, but their functional roles are only partially understood. We used electrical microstimulation to perturb the propagation of neuronal activity between areas V1 and V4 in monkeys performing a texture-segregation task. In both areas, microstimulation locally caused a brief phase of excitation, followed by inhibition. Both these effects propagated faithfully in the feedforward direction from V1 to V4. Stimulation of V4, however, caused little V1 excitation, but it did yield a delayed suppression during the late phase of visually driven activity. This suppression was pronounced for the V1 figure representation and weaker for background representations. Our results reveal functional differences between feedforward and feedback processing in texture segregation and suggest a specific modulating role for feedback connections in perceptual organization. Copyright © 2017 Elsevier Inc. All rights reserved.
A Predictive Model of Anesthesia Depth Based on SVM in the Primary Visual Cortex
Shi, Li; Li, Xiaoyuan; Wan, Hong
2013-01-01
In this paper, a novel model for predicting anesthesia depth is put forward based on local field potentials (LFPs) in the primary visual cortex (V1 area) of rats. The model is constructed using a Support Vector Machine (SVM) to realize anesthesia depth online prediction and classification. The raw LFP signal was first decomposed into some special scaling components. Among these components, those containing higher frequency information were well suited for more precise analysis of the performance of the anesthetic depth by wavelet transform. Secondly, the characteristics of anesthetized states were extracted by complexity analysis. In addition, two frequency domain parameters were selected. The above extracted features were used as the input vector of the predicting model. Finally, we collected the anesthesia samples from the LFP recordings under the visual stimulus experiments of Long Evans rats. Our results indicate that the predictive model is accurate and computationally fast, and that it is also well suited for online predicting. PMID:24044024
Price, D; Tyler, L K; Neto Henriques, R; Campbell, K L; Williams, N; Treder, M S; Taylor, J R; Henson, R N A
2017-06-09
Slowing is a common feature of ageing, yet a direct relationship between neural slowing and brain atrophy is yet to be established in healthy humans. We combine magnetoencephalographic (MEG) measures of neural processing speed with magnetic resonance imaging (MRI) measures of white and grey matter in a large population-derived cohort to investigate the relationship between age-related structural differences and visual evoked field (VEF) and auditory evoked field (AEF) delay across two different tasks. Here we use a novel technique to show that VEFs exhibit a constant delay, whereas AEFs exhibit delay that accumulates over time. White-matter (WM) microstructure in the optic radiation partially mediates visual delay, suggesting increased transmission time, whereas grey matter (GM) in auditory cortex partially mediates auditory delay, suggesting less efficient local processing. Our results demonstrate that age has dissociable effects on neural processing speed, and that these effects relate to different types of brain atrophy.
Price, D.; Tyler, L. K.; Neto Henriques, R.; Campbell, K. L.; Williams, N.; Treder, M.S.; Taylor, J. R.; Brayne, Carol; Bullmore, Edward T.; Calder, Andrew C.; Cusack, Rhodri; Dalgleish, Tim; Duncan, John; Matthews, Fiona E.; Marslen-Wilson, William D.; Rowe, James B.; Shafto, Meredith A.; Cheung, Teresa; Davis, Simon; Geerligs, Linda; Kievit, Rogier; McCarrey, Anna; Mustafa, Abdur; Samu, David; Tsvetanov, Kamen A.; van Belle, Janna; Bates, Lauren; Emery, Tina; Erzinglioglu, Sharon; Gadie, Andrew; Gerbase, Sofia; Georgieva, Stanimira; Hanley, Claire; Parkin, Beth; Troy, David; Auer, Tibor; Correia, Marta; Gao, Lu; Green, Emma; Allen, Jodie; Amery, Gillian; Amunts, Liana; Barcroft, Anne; Castle, Amanda; Dias, Cheryl; Dowrick, Jonathan; Fair, Melissa; Fisher, Hayley; Goulding, Anna; Grewal, Adarsh; Hale, Geoff; Hilton, Andrew; Johnson, Frances; Johnston, Patricia; Kavanagh-Williamson, Thea; Kwasniewska, Magdalena; McMinn, Alison; Norman, Kim; Penrose, Jessica; Roby, Fiona; Rowland, Diane; Sargeant, John; Squire, Maggie; Stevens, Beth; Stoddart, Aldabra; Stone, Cheryl; Thompson, Tracy; Yazlik, Ozlem; Barnes, Dan; Dixon, Marie; Hillman, Jaya; Mitchell, Joanne; Villis, Laura; Henson, R. N. A.
2017-01-01
Slowing is a common feature of ageing, yet a direct relationship between neural slowing and brain atrophy is yet to be established in healthy humans. We combine magnetoencephalographic (MEG) measures of neural processing speed with magnetic resonance imaging (MRI) measures of white and grey matter in a large population-derived cohort to investigate the relationship between age-related structural differences and visual evoked field (VEF) and auditory evoked field (AEF) delay across two different tasks. Here we use a novel technique to show that VEFs exhibit a constant delay, whereas AEFs exhibit delay that accumulates over time. White-matter (WM) microstructure in the optic radiation partially mediates visual delay, suggesting increased transmission time, whereas grey matter (GM) in auditory cortex partially mediates auditory delay, suggesting less efficient local processing. Our results demonstrate that age has dissociable effects on neural processing speed, and that these effects relate to different types of brain atrophy. PMID:28598417
Yang, Wei; You, Kaiming; Li, Wei; Kim, Young-il
2017-01-01
This paper presents a vehicle autonomous localization method in local area of coal mine tunnel based on vision sensors and ultrasonic sensors. Barcode tags are deployed in pairs on both sides of the tunnel walls at certain intervals as artificial landmarks. The barcode coding is designed based on UPC-A code. The global coordinates of the upper left inner corner point of the feature frame of each barcode tag deployed in the tunnel are uniquely represented by the barcode. Two on-board vision sensors are used to recognize each pair of barcode tags on both sides of the tunnel walls. The distance between the upper left inner corner point of the feature frame of each barcode tag and the vehicle center point can be determined by using a visual distance projection model. The on-board ultrasonic sensors are used to measure the distance from the vehicle center point to the left side of the tunnel walls. Once the spatial geometric relationship between the barcode tags and the vehicle center point is established, the 3D coordinates of the vehicle center point in the tunnel’s global coordinate system can be calculated. Experiments on a straight corridor and an underground tunnel have shown that the proposed vehicle autonomous localization method is not only able to quickly recognize the barcode tags affixed to the tunnel walls, but also has relatively small average localization errors in the vehicle center point’s plane and vertical coordinates to meet autonomous unmanned vehicle positioning requirements in local area of coal mine tunnel. PMID:28141829
Xu, Zirui; Yang, Wei; You, Kaiming; Li, Wei; Kim, Young-Il
2017-01-01
This paper presents a vehicle autonomous localization method in local area of coal mine tunnel based on vision sensors and ultrasonic sensors. Barcode tags are deployed in pairs on both sides of the tunnel walls at certain intervals as artificial landmarks. The barcode coding is designed based on UPC-A code. The global coordinates of the upper left inner corner point of the feature frame of each barcode tag deployed in the tunnel are uniquely represented by the barcode. Two on-board vision sensors are used to recognize each pair of barcode tags on both sides of the tunnel walls. The distance between the upper left inner corner point of the feature frame of each barcode tag and the vehicle center point can be determined by using a visual distance projection model. The on-board ultrasonic sensors are used to measure the distance from the vehicle center point to the left side of the tunnel walls. Once the spatial geometric relationship between the barcode tags and the vehicle center point is established, the 3D coordinates of the vehicle center point in the tunnel's global coordinate system can be calculated. Experiments on a straight corridor and an underground tunnel have shown that the proposed vehicle autonomous localization method is not only able to quickly recognize the barcode tags affixed to the tunnel walls, but also has relatively small average localization errors in the vehicle center point's plane and vertical coordinates to meet autonomous unmanned vehicle positioning requirements in local area of coal mine tunnel.
Unconscious analyses of visual scenes based on feature conjunctions.
Tachibana, Ryosuke; Noguchi, Yasuki
2015-06-01
To efficiently process a cluttered scene, the visual system analyzes statistical properties or regularities of visual elements embedded in the scene. It is controversial, however, whether those scene analyses could also work for stimuli unconsciously perceived. Here we show that our brain performs the unconscious scene analyses not only using a single featural cue (e.g., orientation) but also based on conjunctions of multiple visual features (e.g., combinations of color and orientation information). Subjects foveally viewed a stimulus array (duration: 50 ms) where 4 types of bars (red-horizontal, red-vertical, green-horizontal, and green-vertical) were intermixed. Although a conscious perception of those bars was inhibited by a subsequent mask stimulus, the brain correctly analyzed the information about color, orientation, and color-orientation conjunctions of those invisible bars. The information of those features was then used for the unconscious configuration analysis (statistical processing) of the central bars, which induced a perceptual bias and illusory feature binding in visible stimuli at peripheral locations. While statistical analyses and feature binding are normally 2 key functions of the visual system to construct coherent percepts of visual scenes, our results show that a high-level analysis combining those 2 functions is correctly performed by unconscious computations in the brain. (c) 2015 APA, all rights reserved).
MacLean, Mary H; Giesbrecht, Barry
2015-07-01
Task-relevant and physically salient features influence visual selective attention. In the present study, we investigated the influence of task-irrelevant and physically nonsalient reward-associated features on visual selective attention. Two hypotheses were tested: One predicts that the effects of target-defining task-relevant and task-irrelevant features interact to modulate visual selection; the other predicts that visual selection is determined by the independent combination of relevant and irrelevant feature effects. These alternatives were tested using a visual search task that contained multiple targets, placing a high demand on the need for selectivity, and that was data-limited and required unspeeded responses, emphasizing early perceptual selection processes. One week prior to the visual search task, participants completed a training task in which they learned to associate particular colors with a specific reward value. In the search task, the reward-associated colors were presented surrounding targets and distractors, but were neither physically salient nor task-relevant. In two experiments, the irrelevant reward-associated features influenced performance, but only when they were presented in a task-relevant location. The costs induced by the irrelevant reward-associated features were greater when they oriented attention to a target than to a distractor. In a third experiment, we examined the effects of selection history in the absence of reward history and found that the interaction between task relevance and selection history differed, relative to when the features had previously been associated with reward. The results indicate that under conditions that demand highly efficient perceptual selection, physically nonsalient task-irrelevant and task-relevant factors interact to influence visual selective attention.
Classification of visual and linguistic tasks using eye-movement features.
Coco, Moreno I; Keller, Frank
2014-03-07
The role of the task has received special attention in visual-cognition research because it can provide causal explanations of goal-directed eye-movement responses. The dependency between visual attention and task suggests that eye movements can be used to classify the task being performed. A recent study by Greene, Liu, and Wolfe (2012), however, fails to achieve accurate classification of visual tasks based on eye-movement features. In the present study, we hypothesize that tasks can be successfully classified when they differ with respect to the involvement of other cognitive domains, such as language processing. We extract the eye-movement features used by Greene et al. as well as additional features from the data of three different tasks: visual search, object naming, and scene description. First, we demonstrated that eye-movement responses make it possible to characterize the goals of these tasks. Then, we trained three different types of classifiers and predicted the task participants performed with an accuracy well above chance (a maximum of 88% for visual search). An analysis of the relative importance of features for classification accuracy reveals that just one feature, i.e., initiation time, is sufficient for above-chance performance (a maximum of 79% accuracy in object naming). Crucially, this feature is independent of task duration, which differs systematically across the three tasks we investigated. Overall, the best task classification performance was obtained with a set of seven features that included both spatial information (e.g., entropy of attention allocation) and temporal components (e.g., total fixation on objects) of the eye-movement record. This result confirms the task-dependent allocation of visual attention and extends previous work by showing that task classification is possible when tasks differ in the cognitive processes involved (purely visual tasks such as search vs. communicative tasks such as scene description).
[Central nervous system dysgerminoma: a clinicopathological study of 3 cases].
Bellil, Selma; Braham, Emna; Limaiem, Faten; Bellil, Khadija; Chelly, Ines; Mekni, Amina; Haouet, Slim; Zitouna, Moncef; Jemel, Hafedh; Khaldi, Moncef; Kchir, Nidhameddine
2009-03-01
Intracranial germ cell tumors are rarely seen and typically localize in the pineal or suprasellar region. The largest category of germ cell tumors is dysgerminoma. to describe clinicopathological features and immunohistochemical profile of dysgerminomas. We report three cases of central nervous system dysgerminomas. There were two young women and a man who were 6, 11 and 23-year-old. They presented with symptoms of insipidus diabetes (n=3) with association to visual field defects in the third case. Radiological findings showed a supra seller lesion in two cases. Double localization in the pineal and suprasellar regions was seen in the third case. Histologic examination and immunohistochemical study of surgical specimen were consistent with primary central nervous system dysgerminoma.
NASA Astrophysics Data System (ADS)
Liang, Lijiao; Zhen, Shujun; Huang, Chengzhi
2017-02-01
A highly selective method was presented for colorimetric determination of melamine using uracil 5‧-triphosphate sodium modified gold nanoparticles (UTP-Au NPs) in this paper. Specific hydrogen-bonding interaction between uracil base (U) and melamine resulted in the aggregation of AuNPs, displaying variations of localized surface plasmon resonance (LSPR) features such as color change from red to blue and enhanced localized surface plasmon resonance light scattering (LSPR-LS) signals. Accordingly, the concentration of melamine could be quantified based on naked eye or a spectrometric method. This method was simple, inexpensive, environmental friendly and highly selective, which has been successfully used for the detection of melamine in pretreated liquid milk products with high recoveries.
Observation of airplane flow fields by natural condensation effects
NASA Technical Reports Server (NTRS)
Campbell, James F.; Chambers, Joseph R.; Rumsey, Christopher L.
1988-01-01
In-flight condensation patterns can illustrate a variety of airplane flow fields, such as attached and separated flows, vortex flows, and expansion and shock waves. These patterns are a unique source of flow visualization that has not been utilized previously. Condensation patterns at full-scale Reynolds number can provide useful information for researchers experimenting in subscale tunnels. It is also shown that computed values of relative humidity in the local flow field provide an inexpensive way to analyze the qualitative features of the condensation pattern, although a more complete theoretical modeling is necessary to obtain details of the condensation process. Furthermore, the analysis revealed that relative humidity is more sensitive to changes in local static temperature than to changes in pressure.
Tunneling interferometry and measurement of the thickness of ultrathin metallic Pb(111) films
NASA Astrophysics Data System (ADS)
Ustavshchikov, S. S.; Putilov, A. V.; Aladyshkin, A. Yu.
2017-10-01
Spectra of the differential tunneling conductivity for ultrathin lead films grown on Si(111) 7 × 7 single crystals with a thickness of 9 to 50 ML have been studied by low-temperature scanning tunneling microscopy and spectroscopy. The presence of local maxima of the tunneling conductivity is characteristic of such systems. The energies of maxima of the differential conductivity are determined by the spectrum of quantum-confined states of electrons in a metallic layer and, consequently, the local thickness of the layer. It has been shown that features of the microstructure of substrates, such as steps of monatomic height, structural defects, and inclusions of other materials covered with a lead layer, can be visualized by bias-modulation scanning tunneling spectroscopy.
Wen, Haiguang; Shi, Junxing; Chen, Wei; Liu, Zhongming
2018-02-28
The brain represents visual objects with topographic cortical patterns. To address how distributed visual representations enable object categorization, we established predictive encoding models based on a deep residual network, and trained them to predict cortical responses to natural movies. Using this predictive model, we mapped human cortical representations to 64,000 visual objects from 80 categories with high throughput and accuracy. Such representations covered both the ventral and dorsal pathways, reflected multiple levels of object features, and preserved semantic relationships between categories. In the entire visual cortex, object representations were organized into three clusters of categories: biological objects, non-biological objects, and background scenes. In a finer scale specific to each cluster, object representations revealed sub-clusters for further categorization. Such hierarchical clustering of category representations was mostly contributed by cortical representations of object features from middle to high levels. In summary, this study demonstrates a useful computational strategy to characterize the cortical organization and representations of visual features for rapid categorization.
FaceWarehouse: a 3D facial expression database for visual computing.
Cao, Chen; Weng, Yanlin; Zhou, Shun; Tong, Yiying; Zhou, Kun
2014-03-01
We present FaceWarehouse, a database of 3D facial expressions for visual computing applications. We use Kinect, an off-the-shelf RGBD camera, to capture 150 individuals aged 7-80 from various ethnic backgrounds. For each person, we captured the RGBD data of her different expressions, including the neutral expression and 19 other expressions such as mouth-opening, smile, kiss, etc. For every RGBD raw data record, a set of facial feature points on the color image such as eye corners, mouth contour, and the nose tip are automatically localized, and manually adjusted if better accuracy is required. We then deform a template facial mesh to fit the depth data as closely as possible while matching the feature points on the color image to their corresponding points on the mesh. Starting from these fitted face meshes, we construct a set of individual-specific expression blendshapes for each person. These meshes with consistent topology are assembled as a rank-3 tensor to build a bilinear face model with two attributes: identity and expression. Compared with previous 3D facial databases, for every person in our database, there is a much richer matching collection of expressions, enabling depiction of most human facial actions. We demonstrate the potential of FaceWarehouse for visual computing with four applications: facial image manipulation, face component transfer, real-time performance-based facial image animation, and facial animation retargeting from video to image.
Jamieson, Andrew R; Giger, Maryellen L; Drukker, Karen; Li, Hui; Yuan, Yading; Bhooshan, Neha
2010-01-01
In this preliminary study, recently developed unsupervised nonlinear dimension reduction (DR) and data representation techniques were applied to computer-extracted breast lesion feature spaces across three separate imaging modalities: Ultrasound (U.S.) with 1126 cases, dynamic contrast enhanced magnetic resonance imaging with 356 cases, and full-field digital mammography with 245 cases. Two methods for nonlinear DR were explored: Laplacian eigenmaps [M. Belkin and P. Niyogi, "Laplacian eigenmaps for dimensionality reduction and data representation," Neural Comput. 15, 1373-1396 (2003)] and t-distributed stochastic neighbor embedding (t-SNE) [L. van der Maaten and G. Hinton, "Visualizing data using t-SNE," J. Mach. Learn. Res. 9, 2579-2605 (2008)]. These methods attempt to map originally high dimensional feature spaces to more human interpretable lower dimensional spaces while preserving both local and global information. The properties of these methods as applied to breast computer-aided diagnosis (CADx) were evaluated in the context of malignancy classification performance as well as in the visual inspection of the sparseness within the two-dimensional and three-dimensional mappings. Classification performance was estimated by using the reduced dimension mapped feature output as input into both linear and nonlinear classifiers: Markov chain Monte Carlo based Bayesian artificial neural network (MCMC-BANN) and linear discriminant analysis. The new techniques were compared to previously developed breast CADx methodologies, including automatic relevance determination and linear stepwise (LSW) feature selection, as well as a linear DR method based on principal component analysis. Using ROC analysis and 0.632+bootstrap validation, 95% empirical confidence intervals were computed for the each classifier's AUC performance. In the large U.S. data set, sample high performance results include, AUC0.632+ = 0.88 with 95% empirical bootstrap interval [0.787;0.895] for 13 ARD selected features and AUC0.632+ = 0.87 with interval [0.817;0.906] for four LSW selected features compared to 4D t-SNE mapping (from the original 81D feature space) giving AUC0.632+ = 0.90 with interval [0.847;0.919], all using the MCMC-BANN. Preliminary results appear to indicate capability for the new methods to match or exceed classification performance of current advanced breast lesion CADx algorithms. While not appropriate as a complete replacement of feature selection in CADx problems, DR techniques offer a complementary approach, which can aid elucidation of additional properties associated with the data. Specifically, the new techniques were shown to possess the added benefit of delivering sparse lower dimensional representations for visual interpretation, revealing intricate data structure of the feature space.
A spherical model for orientation and spatial-frequency tuning in a cortical hypercolumn.
Bressloff, Paul C; Cowan, Jack D
2003-01-01
A theory is presented of the way in which the hypercolumns in primary visual cortex (V1) are organized to detect important features of visual images, namely local orientation and spatial-frequency. Given the existence in V1 of dual maps for these features, both organized around orientation pinwheels, we constructed a model of a hypercolumn in which orientation and spatial-frequency preferences are represented by the two angular coordinates of a sphere. The two poles of this sphere are taken to correspond, respectively, to high and low spatial-frequency preferences. In Part I of the paper, we use mean-field methods to derive exact solutions for localized activity states on the sphere. We show how cortical amplification through recurrent interactions generates a sharply tuned, contrast-invariant population response to both local orientation and local spatial frequency, even in the case of a weakly biased input from the lateral geniculate nucleus (LGN). A major prediction of our model is that this response is non-separable with respect to the local orientation and spatial frequency of a stimulus. That is, orientation tuning is weaker around the pinwheels, and there is a shift in spatial-frequency tuning towards that of the closest pinwheel at non-optimal orientations. In Part II of the paper, we demonstrate that a simple feed-forward model of spatial-frequency preference, unlike that for orientation preference, does not generate a faithful representation when amplified by recurrent interactions in V1. We then introduce the idea that cortico-geniculate feedback modulates LGN activity to generate a faithful representation, thus providing a new functional interpretation of the role of this feedback pathway. Using linear filter theory, we show that if the feedback from a cortical cell is taken to be approximately equal to the reciprocal of the corresponding feed-forward receptive field (in the two-dimensional Fourier domain), then the mismatch between the feed-forward and cortical frequency representations is eliminated. We therefore predict that cortico-geniculate feedback connections innervate the LGN in a pattern determined by the orientation and spatial-frequency biases of feed-forward receptive fields. Finally, we show how recurrent cortical interactions can generate cross-orientation suppression. PMID:14561324
Perception Of "Features" And "Objects": Applications To The Design Of Instrument Panel Displays
NASA Astrophysics Data System (ADS)
Poynter, Douglas; Czarnomski, Alan J.
1988-10-01
An experiment was conducted to determine whether socalled feature displays allow for faster and more accurate processing compared to object displays. Previous psychological studies indicate that features can be processed in parallel across the visual field, whereas objects must be processed one at a time with the aid of attentional focus. Numbers and letters are examples of objects; line orientation and color are examples of features. In this experiment, subjects were asked to search displays composed of up to 16 elements for the presence of specific elements. The ability to detect, localize, and identify targets was influenced by display format. Digital errors increased with the number of elements, the number of targets, and the distance of the target from the fixation point. Line orientation errors increased only with the number of targets. Several other display types were evaluated, and each produced a pattern of errors similar to either digital or line orientation format. Results of the study were discussed in terms of Feature Integration Theory, which distinguishes between elements that are processed with parallel versus serial mechanisms.
The Objective Identification and Quantification of Interstitial Lung Abnormalities in Smokers.
Ash, Samuel Y; Harmouche, Rola; Ross, James C; Diaz, Alejandro A; Hunninghake, Gary M; Putman, Rachel K; Onieva, Jorge; Martinez, Fernando J; Choi, Augustine M; Lynch, David A; Hatabu, Hiroto; Rosas, Ivan O; Estepar, Raul San Jose; Washko, George R
2017-08-01
Previous investigation suggests that visually detected interstitial changes in the lung parenchyma of smokers are highly clinically relevant and predict outcomes, including death. Visual subjective analysis to detect these changes is time-consuming, insensitive to subtle changes, and requires training to enhance reproducibility. Objective detection of such changes could provide a method of disease identification without these limitations. The goal of this study was to develop and test a fully automated image processing tool to objectively identify radiographic features associated with interstitial abnormalities in the computed tomography scans of a large cohort of smokers. An automated tool that uses local histogram analysis combined with distance from the pleural surface was used to detect radiographic features consistent with interstitial lung abnormalities in computed tomography scans from 2257 individuals from the Genetic Epidemiology of COPD study, a longitudinal observational study of smokers. The sensitivity and specificity of this tool was determined based on its ability to detect the visually identified presence of these abnormalities. The tool had a sensitivity of 87.8% and a specificity of 57.5% for the detection of interstitial lung abnormalities, with a c-statistic of 0.82, and was 100% sensitive and 56.7% specific for the detection of the visual subtype of interstitial abnormalities called fibrotic parenchymal abnormalities, with a c-statistic of 0.89. In smokers, a fully automated image processing tool is able to identify those individuals who have interstitial lung abnormalities with moderate sensitivity and specificity. Copyright © 2017 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
Zhou, Zhenyu; Liu, Wei; Cui, Jiali; Wang, Xunheng; Arias, Diana; Wen, Ying; Bansal, Ravi; Hao, Xuejun; Wang, Zhishun; Peterson, Bradley S; Xu, Dongrong
2011-02-01
Signal variation in diffusion-weighted images (DWIs) is influenced both by thermal noise and by spatially and temporally varying artifacts, such as rigid-body motion and cardiac pulsation. Motion artifacts are particularly prevalent when scanning difficult patient populations, such as human infants. Although some motion during data acquisition can be corrected using image coregistration procedures, frequently individual DWIs are corrupted beyond repair by sudden, large amplitude motion either within or outside of the imaging plane. We propose a novel approach to identify and reject outlier images automatically using local binary patterns (LBP) and 2D partial least square (2D-PLS) to estimate diffusion tensors robustly. This method uses an enhanced LBP algorithm to extract texture features from a local texture feature of the image matrix from the DWI data. Because the images have been transformed to local texture matrices, we are able to extract discriminating information that identifies outliers in the data set by extending a traditional one-dimensional PLS algorithm to a two-dimension operator. The class-membership matrix in this 2D-PLS algorithm is adapted to process samples that are image matrix, and the membership matrix thus represents varying degrees of importance of local information within the images. We also derive the analytic form of the generalized inverse of the class-membership matrix. We show that this method can effectively extract local features from brain images obtained from a large sample of human infants to identify images that are outliers in their textural features, permitting their exclusion from further processing when estimating tensors using the DWIs. This technique is shown to be superior in performance when compared with visual inspection and other common methods to address motion-related artifacts in DWI data. This technique is applicable to correct motion artifact in other magnetic resonance imaging (MRI) techniques (e.g., the bootstrapping estimation) that use univariate or multivariate regression methods to fit MRI data to a pre-specified model. Copyright © 2011 Elsevier Inc. All rights reserved.
Acharya, U Rajendra; Bhat, Shreya; Koh, Joel E W; Bhandary, Sulatha V; Adeli, Hojjat
2017-09-01
Glaucoma is an optic neuropathy defined by characteristic damage to the optic nerve and accompanying visual field deficits. Early diagnosis and treatment are critical to prevent irreversible vision loss and ultimate blindness. Current techniques for computer-aided analysis of the optic nerve and retinal nerve fiber layer (RNFL) are expensive and require keen interpretation by trained specialists. Hence, an automated system is highly desirable for a cost-effective and accurate screening for the diagnosis of glaucoma. This paper presents a new methodology and a computerized diagnostic system. Adaptive histogram equalization is used to convert color images to grayscale images followed by convolution of these images with Leung-Malik (LM), Schmid (S), and maximum response (MR4 and MR8) filter banks. The basic microstructures in typical images are called textons. The convolution process produces textons. Local configuration pattern (LCP) features are extracted from these textons. The significant features are selected using a sequential floating forward search (SFFS) method and ranked using the statistical t-test. Finally, various classifiers are used for classification of images into normal and glaucomatous classes. A high classification accuracy of 95.8% is achieved using six features obtained from the LM filter bank and the k-nearest neighbor (kNN) classifier. A glaucoma integrative index (GRI) is also formulated to obtain a reliable and effective system. Copyright © 2017 Elsevier Ltd. All rights reserved.
Visual imagery of famous faces: effects of memory and attention revealed by fMRI.
Ishai, Alumit; Haxby, James V; Ungerleider, Leslie G
2002-12-01
Complex pictorial information can be represented and retrieved from memory as mental visual images. Functional brain imaging studies have shown that visual perception and visual imagery share common neural substrates. The type of memory (short- or long-term) that mediates the generation of mental images, however, has not been addressed previously. The purpose of this study was to investigate the neural correlates underlying imagery generated from short- and long-term memory (STM and LTM). We used famous faces to localize the visual response during perception and to compare the responses during visual imagery generated from STM (subjects memorized specific pictures of celebrities before the imagery task) and imagery from LTM (subjects imagined famous faces without seeing specific pictures during the experimental session). We found that visual perception of famous faces activated the inferior occipital gyri, lateral fusiform gyri, the superior temporal sulcus, and the amygdala. Small subsets of these face-selective regions were activated during imagery. Additionally, visual imagery of famous faces activated a network of regions composed of bilateral calcarine, hippocampus, precuneus, intraparietal sulcus (IPS), and the inferior frontal gyrus (IFG). In all these regions, imagery generated from STM evoked more activation than imagery from LTM. Regardless of memory type, focusing attention on features of the imagined faces (e.g., eyes, lips, or nose) resulted in increased activation in the right IPS and right IFG. Our results suggest differential effects of memory and attention during the generation and maintenance of mental images of faces.
Spatial and temporal coherence in perceptual binding
Blake, Randolph; Yang, Yuede
1997-01-01
Component visual features of objects are registered by distributed patterns of activity among neurons comprising multiple pathways and visual areas. How these distributed patterns of activity give rise to unified representations of objects remains unresolved, although one recent, controversial view posits temporal coherence of neural activity as a binding agent. Motivated by the possible role of temporal coherence in feature binding, we devised a novel psychophysical task that requires the detection of temporal coherence among features comprising complex visual images. Results show that human observers can more easily detect synchronized patterns of temporal contrast modulation within hybrid visual images composed of two components when those components are drawn from the same original picture. Evidently, time-varying changes within spatially coherent features produce more salient neural signals. PMID:9192701
Raudies, Florian; Hasselmo, Michael E.
2015-01-01
Firing fields of grid cells in medial entorhinal cortex show compression or expansion after manipulations of the location of environmental barriers. This compression or expansion could be selective for individual grid cell modules with particular properties of spatial scaling. We present a model for differences in the response of modules to barrier location that arise from different mechanisms for the influence of visual features on the computation of location that drives grid cell firing patterns. These differences could arise from differences in the position of visual features within the visual field. When location was computed from the movement of visual features on the ground plane (optic flow) in the ventral visual field, this resulted in grid cell spatial firing that was not sensitive to barrier location in modules modeled with small spacing between grid cell firing fields. In contrast, when location was computed from static visual features on walls of barriers, i.e. in the more dorsal visual field, this resulted in grid cell spatial firing that compressed or expanded based on the barrier locations in modules modeled with large spacing between grid cell firing fields. This indicates that different grid cell modules might have differential properties for computing location based on visual cues, or the spatial radius of sensitivity to visual cues might differ between modules. PMID:26584432
Olivers, Christian N L; Meijer, Frank; Theeuwes, Jan
2006-10-01
In 7 experiments, the authors explored whether visual attention (the ability to select relevant visual information) and visual working memory (the ability to retain relevant visual information) share the same content representations. The presence of singleton distractors interfered more strongly with a visual search task when it was accompanied by an additional memory task. Singleton distractors interfered even more when they were identical or related to the object held in memory, but only when it was difficult to verbalize the memory content. Furthermore, this content-specific interaction occurred for features that were relevant to the memory task but not for irrelevant features of the same object or for once-remembered objects that could be forgotten. Finally, memory-related distractors attracted more eye movements but did not result in longer fixations. The results demonstrate memory-driven attentional capture on the basis of content-specific representations. Copyright 2006 APA.
Hierarchical acquisition of visual specificity in spatial contextual cueing.
Lie, Kin-Pou
2015-01-01
Spatial contextual cueing refers to visual search performance's being improved when invariant associations between target locations and distractor spatial configurations are learned incidentally. Using the instance theory of automatization and the reverse hierarchy theory of visual perceptual learning, this study explores the acquisition of visual specificity in spatial contextual cueing. Two experiments in which detailed visual features were irrelevant for distinguishing between spatial contexts found that spatial contextual cueing was visually generic in difficult trials when the trials were not preceded by easy trials (Experiment 1) but that spatial contextual cueing progressed to visual specificity when difficult trials were preceded by easy trials (Experiment 2). These findings support reverse hierarchy theory, which predicts that even when detailed visual features are irrelevant for distinguishing between spatial contexts, spatial contextual cueing can progress to visual specificity if the stimuli remain constant, the task is difficult, and difficult trials are preceded by easy trials. However, these findings are inconsistent with instance theory, which predicts that when detailed visual features are irrelevant for distinguishing between spatial contexts, spatial contextual cueing will not progress to visual specificity. This study concludes that the acquisition of visual specificity in spatial contextual cueing is more plausibly hierarchical, rather than instance-based.
Atoms of recognition in human and computer vision.
Ullman, Shimon; Assif, Liav; Fetaya, Ethan; Harari, Daniel
2016-03-08
Discovering the visual features and representations used by the brain to recognize objects is a central problem in the study of vision. Recently, neural network models of visual object recognition, including biological and deep network models, have shown remarkable progress and have begun to rival human performance in some challenging tasks. These models are trained on image examples and learn to extract features and representations and to use them for categorization. It remains unclear, however, whether the representations and learning processes discovered by current models are similar to those used by the human visual system. Here we show, by introducing and using minimal recognizable images, that the human visual system uses features and processes that are not used by current models and that are critical for recognition. We found by psychophysical studies that at the level of minimal recognizable images a minute change in the image can have a drastic effect on recognition, thus identifying features that are critical for the task. Simulations then showed that current models cannot explain this sensitivity to precise feature configurations and, more generally, do not learn to recognize minimal images at a human level. The role of the features shown here is revealed uniquely at the minimal level, where the contribution of each feature is essential. A full understanding of the learning and use of such features will extend our understanding of visual recognition and its cortical mechanisms and will enhance the capacity of computational models to learn from visual experience and to deal with recognition and detailed image interpretation.
Human Occipital and Parietal GABA Selectively Influence Visual Perception of Orientation and Size.
Song, Chen; Sandberg, Kristian; Andersen, Lau Møller; Blicher, Jakob Udby; Rees, Geraint
2017-09-13
GABA is the primary inhibitory neurotransmitter in human brain. The level of GABA varies substantially across individuals, and this variability is associated with interindividual differences in visual perception. However, it remains unclear whether the association between GABA level and visual perception reflects a general influence of visual inhibition or whether the GABA levels of different cortical regions selectively influence perception of different visual features. To address this, we studied how the GABA levels of parietal and occipital cortices related to interindividual differences in size, orientation, and brightness perception. We used visual contextual illusion as a perceptual assay since the illusion dissociates perceptual content from stimulus content and the magnitude of the illusion reflects the effect of visual inhibition. Across individuals, we observed selective correlations between the level of GABA and the magnitude of contextual illusion. Specifically, parietal GABA level correlated with size illusion magnitude but not with orientation or brightness illusion magnitude; in contrast, occipital GABA level correlated with orientation illusion magnitude but not with size or brightness illusion magnitude. Our findings reveal a region- and feature-dependent influence of GABA level on human visual perception. Parietal and occipital cortices contain, respectively, topographic maps of size and orientation preference in which neural responses to stimulus sizes and stimulus orientations are modulated by intraregional lateral connections. We propose that these lateral connections may underlie the selective influence of GABA on visual perception. SIGNIFICANCE STATEMENT GABA, the primary inhibitory neurotransmitter in human visual system, varies substantially across individuals. This interindividual variability in GABA level is linked to interindividual differences in many aspects of visual perception. However, the widespread influence of GABA raises the question of whether interindividual variability in GABA reflects an overall variability in visual inhibition and has a general influence on visual perception or whether the GABA levels of different cortical regions have selective influence on perception of different visual features. Here we report a region- and feature-dependent influence of GABA level on human visual perception. Our findings suggest that GABA level of a cortical region selectively influences perception of visual features that are topographically mapped in this region through intraregional lateral connections. Copyright © 2017 Song, Sandberg et al.
NASA Astrophysics Data System (ADS)
Jeong, Samuel; Ito, Yoshikazu; Edwards, Gary; Fujita, Jun-ichi
2018-06-01
The visualization of localized electronic charges on nanocatalysts is expected to yield fundamental information about catalytic reaction mechanisms. We have developed a high-sensitivity detection technique for the visualization of localized charges on a catalyst and their corresponding electric field distribution, using a low-energy beam of 1 to 5 keV electrons and a high-sensitivity scanning transmission electron microscope (STEM) detector. The highest sensitivity for visualizing a localized electric field was ∼0.08 V/µm at a distance of ∼17 µm from a localized charge at 1 keV of the primary electron energy, and a weak local electric field produced by 200 electrons accumulated on the carbon nanotube (CNT) apex can be visualized. We also observed that Au nanoparticles distributed on a CNT forest tended to accumulate a certain amount of charges, about 150 electrons, at a ‑2 V bias.
Cicmil, Nela; Bridge, Holly; Parker, Andrew J.; Woolrich, Mark W.; Krug, Kristine
2014-01-01
Magnetoencephalography (MEG) allows the physiological recording of human brain activity at high temporal resolution. However, spatial localization of the source of the MEG signal is an ill-posed problem as the signal alone cannot constrain a unique solution and additional prior assumptions must be enforced. An adequate source reconstruction method for investigating the human visual system should place the sources of early visual activity in known locations in the occipital cortex. We localized sources of retinotopic MEG signals from the human brain with contrasting reconstruction approaches (minimum norm, multiple sparse priors, and beamformer) and compared these to the visual retinotopic map obtained with fMRI in the same individuals. When reconstructing brain responses to visual stimuli that differed by angular position, we found reliable localization to the appropriate retinotopic visual field quadrant by a minimum norm approach and by beamforming. Retinotopic map eccentricity in accordance with the fMRI map could not consistently be localized using an annular stimulus with any reconstruction method, but confining eccentricity stimuli to one visual field quadrant resulted in significant improvement with the minimum norm. These results inform the application of source analysis approaches for future MEG studies of the visual system, and indicate some current limits on localization accuracy of MEG signals. PMID:24904268
Roldan, Stephanie M
2017-01-01
One of the fundamental goals of object recognition research is to understand how a cognitive representation produced from the output of filtered and transformed sensory information facilitates efficient viewer behavior. Given that mental imagery strongly resembles perceptual processes in both cortical regions and subjective visual qualities, it is reasonable to question whether mental imagery facilitates cognition in a manner similar to that of perceptual viewing: via the detection and recognition of distinguishing features. Categorizing the feature content of mental imagery holds potential as a reverse pathway by which to identify the components of a visual stimulus which are most critical for the creation and retrieval of a visual representation. This review will examine the likelihood that the information represented in visual mental imagery reflects distinctive object features thought to facilitate efficient object categorization and recognition during perceptual viewing. If it is the case that these representational features resemble their sensory counterparts in both spatial and semantic qualities, they may well be accessible through mental imagery as evaluated through current investigative techniques. In this review, methods applied to mental imagery research and their findings are reviewed and evaluated for their efficiency in accessing internal representations, and implications for identifying diagnostic features are discussed. An argument is made for the benefits of combining mental imagery assessment methods with diagnostic feature research to advance the understanding of visual perceptive processes, with suggestions for avenues of future investigation.
Roldan, Stephanie M.
2017-01-01
One of the fundamental goals of object recognition research is to understand how a cognitive representation produced from the output of filtered and transformed sensory information facilitates efficient viewer behavior. Given that mental imagery strongly resembles perceptual processes in both cortical regions and subjective visual qualities, it is reasonable to question whether mental imagery facilitates cognition in a manner similar to that of perceptual viewing: via the detection and recognition of distinguishing features. Categorizing the feature content of mental imagery holds potential as a reverse pathway by which to identify the components of a visual stimulus which are most critical for the creation and retrieval of a visual representation. This review will examine the likelihood that the information represented in visual mental imagery reflects distinctive object features thought to facilitate efficient object categorization and recognition during perceptual viewing. If it is the case that these representational features resemble their sensory counterparts in both spatial and semantic qualities, they may well be accessible through mental imagery as evaluated through current investigative techniques. In this review, methods applied to mental imagery research and their findings are reviewed and evaluated for their efficiency in accessing internal representations, and implications for identifying diagnostic features are discussed. An argument is made for the benefits of combining mental imagery assessment methods with diagnostic feature research to advance the understanding of visual perceptive processes, with suggestions for avenues of future investigation. PMID:28588538
Chow, Cristelle; Fortier, Marielle Valerie; Das, Lena; Menon, Anuradha P; Vasanwala, Rashida; Lam, Joyce C M; Ng, Zhi Min; Ling, Simon Robert; Chan, Derrick W S; Choong, Chew Thye; Liew, Wendy K M; Thomas, Terrence
2015-05-01
Anatomical localization of the rapid-onset obesity with hypothalamic dysfunction, hypoventilation, and autonomic dysregulation (ROHHAD) syndrome has proved elusive. Most patients had neuroimaging after cardiorespiratory collapse, revealing a range of ischemic lesions. A 15-year-old obese boy with an acute febrile encephalopathy had hypoventilation, autonomic dysfunction, visual hallucinations, hyperekplexia, and disordered body temperature, and saltwater regulation. These features describe the ROHHAD syndrome. Cerebrospinal fluid analysis showed pleocytosis, elevated neopterins, and oligoclonal bands, and serology for systemic and antineuronal antibodies was negative. He improved after receiving intravenous steroids, immunoglobulins, and long-term mycophenolate. Screening for neural crest tumors was negative. Magnetic resonance imaging of the brain early in his illness showed focal inflammation in the periaqueductal gray matter and hypothalamus. This unique localization explains almost all symptoms of this rare autoimmune encephalitis. Copyright © 2015 Elsevier Inc. All rights reserved.
A novel visual saliency analysis model based on dynamic multiple feature combination strategy
NASA Astrophysics Data System (ADS)
Lv, Jing; Ye, Qi; Lv, Wen; Zhang, Libao
2017-06-01
The human visual system can quickly focus on a small number of salient objects. This process was known as visual saliency analysis and these salient objects are called focus of attention (FOA). The visual saliency analysis mechanism can be used to extract the salient regions and analyze saliency of object in an image, which is time-saving and can avoid unnecessary costs of computing resources. In this paper, a novel visual saliency analysis model based on dynamic multiple feature combination strategy is introduced. In the proposed model, we first generate multi-scale feature maps of intensity, color and orientation features using Gaussian pyramids and the center-surround difference. Then, we evaluate the contribution of all feature maps to the saliency map according to the area of salient regions and their average intensity, and attach different weights to different features according to their importance. Finally, we choose the largest salient region generated by the region growing method to perform the evaluation. Experimental results show that the proposed model cannot only achieve higher accuracy in saliency map computation compared with other traditional saliency analysis models, but also extract salient regions with arbitrary shapes, which is of great value for the image analysis and understanding.
Task-relevant perceptual features can define categories in visual memory too.
Antonelli, Karla B; Williams, Carrick C
2017-11-01
Although Konkle, Brady, Alvarez, and Oliva (2010, Journal of Experimental Psychology: General, 139(3), 558) claim that visual long-term memory (VLTM) is organized on underlying conceptual, not perceptual, information, visual memory results from visual search tasks are not well explained by this theory. We hypothesized that when viewing an object, any task-relevant visual information is critical to the organizational structure of VLTM. In two experiments, we examined the organization of VLTM by measuring the amount of retroactive interference created by objects possessing different combinations of task-relevant features. Based on task instructions, only the conceptual category was task relevant or both the conceptual category and a perceptual object feature were task relevant. Findings indicated that when made task relevant, perceptual object feature information, along with conceptual category information, could affect memory organization for objects in VLTM. However, when perceptual object feature information was task irrelevant, it did not contribute to memory organization; instead, memory defaulted to being organized around conceptual category information. These findings support the theory that a task-defined organizational structure is created in VLTM based on the relevance of particular object features and information.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morozov, Dmitriy; Weber, Gunther H.
2014-03-31
Topological techniques provide robust tools for data analysis. They are used, for example, for feature extraction, for data de-noising, and for comparison of data sets. This chapter concerns contour trees, a topological descriptor that records the connectivity of the isosurfaces of scalar functions. These trees are fundamental to analysis and visualization of physical phenomena modeled by real-valued measurements. We study the parallel analysis of contour trees. After describing a particular representation of a contour tree, called local{global representation, we illustrate how di erent problems that rely on contour trees can be solved in parallel with minimal communication.
Coarse-graining time series data: Recurrence plot of recurrence plots and its application for music
NASA Astrophysics Data System (ADS)
Fukino, Miwa; Hirata, Yoshito; Aihara, Kazuyuki
2016-02-01
We propose a nonlinear time series method for characterizing two layers of regularity simultaneously. The key of the method is using the recurrence plots hierarchically, which allows us to preserve the underlying regularities behind the original time series. We demonstrate the proposed method with musical data. The proposed method enables us to visualize both the local and the global musical regularities or two different features at the same time. Furthermore, the determinism scores imply that the proposed method may be useful for analyzing emotional response to the music.
Coarse-graining time series data: Recurrence plot of recurrence plots and its application for music.
Fukino, Miwa; Hirata, Yoshito; Aihara, Kazuyuki
2016-02-01
We propose a nonlinear time series method for characterizing two layers of regularity simultaneously. The key of the method is using the recurrence plots hierarchically, which allows us to preserve the underlying regularities behind the original time series. We demonstrate the proposed method with musical data. The proposed method enables us to visualize both the local and the global musical regularities or two different features at the same time. Furthermore, the determinism scores imply that the proposed method may be useful for analyzing emotional response to the music.
Edge directed image interpolation with Bamberger pyramids
NASA Astrophysics Data System (ADS)
Rosiles, Jose Gerardo
2005-08-01
Image interpolation is a standard feature in digital image editing software, digital camera systems and printers. Classical methods for resizing produce blurred images with unacceptable quality. Bamberger Pyramids and filter banks have been successfully used for texture and image analysis. They provide excellent multiresolution and directional selectivity. In this paper we present an edge-directed image interpolation algorithm which takes advantage of the simultaneous spatial-directional edge localization at the subband level. The proposed algorithm outperform classical schemes like bilinear and bicubic schemes from the visual and numerical point of views.
Interactive Visualization of Assessment Data: The Software Package Mondrian
ERIC Educational Resources Information Center
Unlu, Ali; Sargin, Anatol
2009-01-01
Mondrian is state-of-the-art statistical data visualization software featuring modern interactive visualization techniques for a wide range of data types. This article reviews the capabilities, functionality, and interactive properties of this software package. Key features of Mondrian are illustrated with data from the Programme for International…
NASA Astrophysics Data System (ADS)
Iramina, Keiji; Ge, Sheng; Hyodo, Akira; Hayami, Takehito; Ueno, Shoogo
2009-04-01
In this study, we applied a transcranial magnetic stimulation (TMS) to investigate the temporal aspect for the functional processing of visual attention. Although it has been known that right posterior parietal cortex (PPC) in the brain has a role in certain visual search tasks, there is little knowledge about the temporal aspect of this area. Three visual search tasks that have different difficulties of task execution individually were carried out. These three visual search tasks are the "easy feature task," the "hard feature task," and the "conjunction task." To investigate the temporal aspect of the PPC involved in the visual search, we applied various stimulus onset asynchronies (SOAs) and measured the reaction time of the visual search. The magnetic stimulation was applied on the right PPC or the left PPC by the figure-eight coil. The results show that the reaction times of the hard feature task are longer than those of the easy feature task. When SOA=150 ms, compared with no-TMS condition, there was a significant increase in target-present reaction time when TMS pulses were applied. We considered that the right PPC was involved in the visual search at about SOA=150 ms after visual stimulus presentation. The magnetic stimulation to the right PPC disturbed the processing of the visual search. However, the magnetic stimulation to the left PPC gives no effect on the processing of the visual search.
Decontaminate feature for tracking: adaptive tracking via evolutionary feature subset
NASA Astrophysics Data System (ADS)
Liu, Qiaoyuan; Wang, Yuru; Yin, Minghao; Ren, Jinchang; Li, Ruizhi
2017-11-01
Although various visual tracking algorithms have been proposed in the last 2-3 decades, it remains a challenging problem for effective tracking with fast motion, deformation, occlusion, etc. Under complex tracking conditions, most tracking models are not discriminative and adaptive enough. When the combined feature vectors are inputted to the visual models, this may lead to redundancy causing low efficiency and ambiguity causing poor performance. An effective tracking algorithm is proposed to decontaminate features for each video sequence adaptively, where the visual modeling is treated as an optimization problem from the perspective of evolution. Every feature vector is compared to a biological individual and then decontaminated via classical evolutionary algorithms. With the optimized subsets of features, the "curse of dimensionality" has been avoided while the accuracy of the visual model has been improved. The proposed algorithm has been tested on several publicly available datasets with various tracking challenges and benchmarked with a number of state-of-the-art approaches. The comprehensive experiments have demonstrated the efficacy of the proposed methodology.
Perceptual congruency of audio-visual speech affects ventriloquism with bilateral visual stimuli.
Kanaya, Shoko; Yokosawa, Kazuhiko
2011-02-01
Many studies on multisensory processes have focused on performance in simplified experimental situations, with a single stimulus in each sensory modality. However, these results cannot necessarily be applied to explain our perceptual behavior in natural scenes where various signals exist within one sensory modality. We investigated the role of audio-visual syllable congruency on participants' auditory localization bias or the ventriloquism effect using spoken utterances and two videos of a talking face. Salience of facial movements was also manipulated. Results indicated that more salient visual utterances attracted participants' auditory localization. Congruent pairing of audio-visual utterances elicited greater localization bias than incongruent pairing, while previous studies have reported little dependency on the reality of stimuli in ventriloquism. Moreover, audio-visual illusory congruency, owing to the McGurk effect, caused substantial visual interference on auditory localization. Multisensory performance appears more flexible and adaptive in this complex environment than in previous studies.
A Unifying Motif for Spatial and Directional Surround Suppression.
Liu, Liu D; Miller, Kenneth D; Pack, Christopher C
2018-01-24
In the visual system, the response to a stimulus in a neuron's receptive field can be modulated by stimulus context, and the strength of these contextual influences vary with stimulus intensity. Recent work has shown how a theoretical model, the stabilized supralinear network (SSN), can account for such modulatory influences, using a small set of computational mechanisms. Although the predictions of the SSN have been confirmed in primary visual cortex (V1), its computational principles apply with equal validity to any cortical structure. We have therefore tested the generality of the SSN by examining modulatory influences in the middle temporal area (MT) of the macaque visual cortex, using electrophysiological recordings and pharmacological manipulations. We developed a novel stimulus that can be adjusted parametrically to be larger or smaller in the space of all possible motion directions. We found, as predicted by the SSN, that MT neurons integrate across motion directions for low-contrast stimuli, but that they exhibit suppression by the same stimuli when they are high in contrast. These results are analogous to those found in visual cortex when stimulus size is varied in the space domain. We further tested the mechanisms of inhibition using pharmacological manipulations of inhibitory efficacy. As predicted by the SSN, local manipulation of inhibitory strength altered firing rates, but did not change the strength of surround suppression. These results are consistent with the idea that the SSN can account for modulatory influences along different stimulus dimensions and in different cortical areas. SIGNIFICANCE STATEMENT Visual neurons are selective for specific stimulus features in a region of visual space known as the receptive field, but can be modulated by stimuli outside of the receptive field. The SSN model has been proposed to account for these and other modulatory influences, and tested in V1. As this model is not specific to any particular stimulus feature or brain region, we wondered whether similar modulatory influences might be observed for other stimulus dimensions and other regions. We tested for specific patterns of modulatory influences in the domain of motion direction, using electrophysiological recordings from MT. Our data confirm the predictions of the SSN in MT, suggesting that the SSN computations might be a generic feature of sensory cortex. Copyright © 2018 the authors 0270-6474/18/380989-11$15.00/0.
Dynamic Integration of Task-Relevant Visual Features in Posterior Parietal Cortex
Freedman, David J.
2014-01-01
Summary The primate visual system consists of multiple hierarchically organized cortical areas, each specialized for processing distinct aspects of the visual scene. For example, color and form are encoded in ventral pathway areas such as V4 and inferior temporal cortex, while motion is preferentially processed in dorsal pathway areas such as the middle temporal area. Such representations often need to be integrated perceptually to solve tasks which depend on multiple features. We tested the hypothesis that the lateral intraparietal area (LIP) integrates disparate task-relevant visual features by recording from LIP neurons in monkeys trained to identify target stimuli composed of conjunctions of color and motion features. We show that LIP neurons exhibit integrative representations of both color and motion features when they are task relevant, and task-dependent shifts of both direction and color tuning. This suggests that LIP plays a role in flexibly integrating task-relevant sensory signals. PMID:25199703
Combined Feature Based and Shape Based Visual Tracker for Robot Navigation
NASA Technical Reports Server (NTRS)
Deans, J.; Kunz, C.; Sargent, R.; Park, E.; Pedersen, L.
2005-01-01
We have developed a combined feature based and shape based visual tracking system designed to enable a planetary rover to visually track and servo to specific points chosen by a user with centimeter precision. The feature based tracker uses invariant feature detection and matching across a stereo pair, as well as matching pairs before and after robot movement in order to compute an incremental 6-DOF motion at each tracker update. This tracking method is subject to drift over time, which can be compensated by the shape based method. The shape based tracking method consists of 3D model registration, which recovers 6-DOF motion given sufficient shape and proper initialization. By integrating complementary algorithms, the combined tracker leverages the efficiency and robustness of feature based methods with the precision and accuracy of model registration. In this paper, we present the algorithms and their integration into a combined visual tracking system.
Visual Working Memory Is Independent of the Cortical Spacing Between Memoranda.
Harrison, William J; Bays, Paul M
2018-03-21
The sensory recruitment hypothesis states that visual short-term memory is maintained in the same visual cortical areas that initially encode a stimulus' features. Although it is well established that the distance between features in visual cortex determines their visibility, a limitation known as crowding, it is unknown whether short-term memory is similarly constrained by the cortical spacing of memory items. Here, we investigated whether the cortical spacing between sequentially presented memoranda affects the fidelity of memory in humans (of both sexes). In a first experiment, we varied cortical spacing by taking advantage of the log-scaling of visual cortex with eccentricity, presenting memoranda in peripheral vision sequentially along either the radial or tangential visual axis with respect to the fovea. In a second experiment, we presented memoranda sequentially either within or beyond the critical spacing of visual crowding, a distance within which visual features cannot be perceptually distinguished due to their nearby cortical representations. In both experiments and across multiple measures, we found strong evidence that the ability to maintain visual features in memory is unaffected by cortical spacing. These results indicate that the neural architecture underpinning working memory has properties inconsistent with the known behavior of sensory neurons in visual cortex. Instead, the dissociation between perceptual and memory representations supports a role of higher cortical areas such as posterior parietal or prefrontal regions or may involve an as yet unspecified mechanism in visual cortex in which stimulus features are bound to their temporal order. SIGNIFICANCE STATEMENT Although much is known about the resolution with which we can remember visual objects, the cortical representation of items held in short-term memory remains contentious. A popular hypothesis suggests that memory of visual features is maintained via the recruitment of the same neural architecture in sensory cortex that encodes stimuli. We investigated this claim by manipulating the spacing in visual cortex between sequentially presented memoranda such that some items shared cortical representations more than others while preventing perceptual interference between stimuli. We found clear evidence that short-term memory is independent of the intracortical spacing of memoranda, revealing a dissociation between perceptual and memory representations. Our data indicate that working memory relies on different neural mechanisms from sensory perception. Copyright © 2018 Harrison and Bays.
Glaucoma detection based on local binary patterns in fundus photographs
NASA Astrophysics Data System (ADS)
Alsheh Ali, Maya; Hurtut, Thomas; Faucon, Timothée.; Cheriet, Farida
2014-03-01
Glaucoma, a group of diseases that lead to optic neuropathy, is one of the most common reasons for blindness worldwide. Glaucoma rarely causes symptoms until the later stages of the disease. Early detection of glaucoma is very important to prevent visual loss since optic nerve damages cannot be reversed. To detect glaucoma, purely data-driven techniques have advantages, especially when the disease characteristics are complex and when precise image-based measurements are difficult to obtain. In this paper, we present our preliminary study for glaucoma detection using an automatic method based on local texture features extracted from fundus photographs. It implements the completed modeling of Local Binary Patterns to capture representative texture features from the whole image. A local region is represented by three operators: its central pixel (LBPC) and its local differences as two complementary components, the sign (which is the classical LBP) and the magnitude (LBPM). An image texture is finally described by both the distribution of LBP and the joint-distribution of LBPM and LBPC. Our images are then classified using a nearest-neighbor method with a leave-one-out validation strategy. On a sample set of 41 fundus images (13 glaucomatous, 28 non-glaucomatous), our method achieves 95:1% success rate with a specificity of 92:3% and a sensitivity of 96:4%. This study proposes a reproducible glaucoma detection process that could be used in a low-priced medical screening, thus avoiding the inter-experts variability issue.
Toward Optimal Manifold Hashing via Discrete Locally Linear Embedding.
Rongrong Ji; Hong Liu; Liujuan Cao; Di Liu; Yongjian Wu; Feiyue Huang
2017-11-01
Binary code learning, also known as hashing, has received increasing attention in large-scale visual search. By transforming high-dimensional features to binary codes, the original Euclidean distance is approximated via Hamming distance. More recently, it is advocated that it is the manifold distance, rather than the Euclidean distance, that should be preserved in the Hamming space. However, it retains as an open problem to directly preserve the manifold structure by hashing. In particular, it first needs to build the local linear embedding in the original feature space, and then quantize such embedding to binary codes. Such a two-step coding is problematic and less optimized. Besides, the off-line learning is extremely time and memory consuming, which needs to calculate the similarity matrix of the original data. In this paper, we propose a novel hashing algorithm, termed discrete locality linear embedding hashing (DLLH), which well addresses the above challenges. The DLLH directly reconstructs the manifold structure in the Hamming space, which learns optimal hash codes to maintain the local linear relationship of data points. To learn discrete locally linear embeddingcodes, we further propose a discrete optimization algorithm with an iterative parameters updating scheme. Moreover, an anchor-based acceleration scheme, termed Anchor-DLLH, is further introduced, which approximates the large similarity matrix by the product of two low-rank matrices. Experimental results on three widely used benchmark data sets, i.e., CIFAR10, NUS-WIDE, and YouTube Face, have shown superior performance of the proposed DLLH over the state-of-the-art approaches.
Visual feature-tolerance in the reading network.
Rauschecker, Andreas M; Bowen, Reno F; Perry, Lee M; Kevan, Alison M; Dougherty, Robert F; Wandell, Brian A
2011-09-08
A century of neurology and neuroscience shows that seeing words depends on ventral occipital-temporal (VOT) circuitry. Typically, reading is learned using high-contrast line-contour words. We explored whether a specific VOT region, the visual word form area (VWFA), learns to see only these words or recognizes words independent of the specific shape-defining visual features. Word forms were created using atypical features (motion-dots, luminance-dots) whose statistical properties control word-visibility. We measured fMRI responses as word form visibility varied, and we used TMS to interfere with neural processing in specific cortical circuits, while subjects performed a lexical decision task. For all features, VWFA responses increased with word-visibility and correlated with performance. TMS applied to motion-specialized area hMT+ disrupted reading performance for motion-dots, but not line-contours or luminance-dots. A quantitative model describes feature-convergence in the VWFA and relates VWFA responses to behavioral performance. These findings suggest how visual feature-tolerance in the reading network arises through signal convergence from feature-specialized cortical areas. Copyright © 2011 Elsevier Inc. All rights reserved.
Fagot, J; Kruschke, J K; Dépy, D; Vauclair, J
1998-10-01
We examined attention shifting in baboons and humans during the learning of visual categories. Within a conditional matching-to-sample task, participants of the two species sequentially learned two two-feature categories which shared a common feature. Results showed that humans encoded both features of the initially learned category, but predominantly only the distinctive feature of the subsequently learned category. Although baboons initially encoded both features of the first category, they ultimately retained only the distinctive features of each category. Empirical data from the two species were analyzed with the 1996 ADIT connectionist model of Kruschke. ADIT fits the baboon data when the attentional shift rate is zero, and the human data when the attentional shift rate is not zero. These empirical and modeling results suggest species differences in learned attention to visual features.
Asymmetries in visual search for conjunctive targets.
Cohen, A
1993-08-01
Asymmetry is demonstrated between conjunctive targets in visual search with no detectable asymmetries between the individual features that compose these targets. Experiment 1 demonstrated this phenomenon for targets composed of color and shape. Experiment 2 and 4 demonstrate this asymmetry for targets composed of size and orientation and for targets composed of contrast level and orientation, respectively. Experiment 3 demonstrates that search rate of individual features cannot predict search rate for conjunctive targets. These results demonstrate the need for 2 levels of representations: one of features and one of conjunction of features. A model related to the modified feature integration theory is proposed to account for these results. The proposed model and other models of visual search are discussed.
Tornow, Ralf P.; Odstrcilik, Jan; Mayer, Markus A.; Gazarek, Jiri; Jan, Jiri; Kubena, Tomas; Cernosek, Pavel
2013-01-01
The retinal ganglion axons are an important part of the visual system, which can be directly observed by fundus camera. The layer they form together inside the retina is the retinal nerve fiber layer (RNFL). This paper describes results of a texture RNFL analysis in color fundus photographs and compares these results with quantitative measurement of RNFL thickness obtained from optical coherence tomography on normal subjects. It is shown that local mean value, standard deviation, and Shannon entropy extracted from the green and blue channel of fundus images are correlated with corresponding RNFL thickness. The linear correlation coefficients achieved values 0.694, 0.547, and 0.512 for respective features measured on 439 retinal positions in the peripapillary area from 23 eyes of 15 different normal subjects. PMID:24454526
Kolar, Radim; Tornow, Ralf P; Laemmer, Robert; Odstrcilik, Jan; Mayer, Markus A; Gazarek, Jiri; Jan, Jiri; Kubena, Tomas; Cernosek, Pavel
2013-01-01
The retinal ganglion axons are an important part of the visual system, which can be directly observed by fundus camera. The layer they form together inside the retina is the retinal nerve fiber layer (RNFL). This paper describes results of a texture RNFL analysis in color fundus photographs and compares these results with quantitative measurement of RNFL thickness obtained from optical coherence tomography on normal subjects. It is shown that local mean value, standard deviation, and Shannon entropy extracted from the green and blue channel of fundus images are correlated with corresponding RNFL thickness. The linear correlation coefficients achieved values 0.694, 0.547, and 0.512 for respective features measured on 439 retinal positions in the peripapillary area from 23 eyes of 15 different normal subjects.
God: Do I have your attention?
Colzato, Lorenza S; van Beest, Ilja; van den Wildenberg, Wery P M; Scorolli, Claudia; Dorchin, Shirley; Meiran, Nachshon; Borghi, Anna M; Hommel, Bernhard
2010-10-01
Religion is commonly defined as a set of rules, developed as part of a culture. Here we provide evidence that practice in following these rules systematically changes the way people attend to visual stimuli, as indicated by the individual sizes of the global precedence effect (better performance to global than to local features). We show that this effect is significantly reduced in Calvinism, a religion emphasizing individual responsibility, and increased in Catholicism and Judaism, religions emphasizing social solidarity. We also show that this effect is long-lasting (still affecting baptized atheists) and that its size systematically varies as a function of the amount and strictness of religious practices. These findings suggest that religious practice induces particular cognitive-control styles that induce chronic, directional biases in the control of visual attention. Copyright 2010 Elsevier B.V. All rights reserved.
Liu, Jinping; Tang, Zhaohui; Xu, Pengfei; Liu, Wenzhong; Zhang, Jin; Zhu, Jianyong
2016-06-29
The topic of online product quality inspection (OPQI) with smart visual sensors is attracting increasing interest in both the academic and industrial communities on account of the natural connection between the visual appearance of products with their underlying qualities. Visual images captured from granulated products (GPs), e.g., cereal products, fabric textiles, are comprised of a large number of independent particles or stochastically stacking locally homogeneous fragments, whose analysis and understanding remains challenging. A method of image statistical modeling-based OPQI for GP quality grading and monitoring by a Weibull distribution(WD) model with a semi-supervised learning classifier is presented. WD-model parameters (WD-MPs) of GP images' spatial structures, obtained with omnidirectional Gaussian derivative filtering (OGDF), which were demonstrated theoretically to obey a specific WD model of integral form, were extracted as the visual features. Then, a co-training-style semi-supervised classifier algorithm, named COSC-Boosting, was exploited for semi-supervised GP quality grading, by integrating two independent classifiers with complementary nature in the face of scarce labeled samples. Effectiveness of the proposed OPQI method was verified and compared in the field of automated rice quality grading with commonly-used methods and showed superior performance, which lays a foundation for the quality control of GP on assembly lines.
A new approach to modeling the influence of image features on fixation selection in scenes
Nuthmann, Antje; Einhäuser, Wolfgang
2015-01-01
Which image characteristics predict where people fixate when memorizing natural images? To answer this question, we introduce a new analysis approach that combines a novel scene-patch analysis with generalized linear mixed models (GLMMs). Our method allows for (1) directly describing the relationship between continuous feature value and fixation probability, and (2) assessing each feature's unique contribution to fixation selection. To demonstrate this method, we estimated the relative contribution of various image features to fixation selection: luminance and luminance contrast (low-level features); edge density (a mid-level feature); visual clutter and image segmentation to approximate local object density in the scene (higher-level features). An additional predictor captured the central bias of fixation. The GLMM results revealed that edge density, clutter, and the number of homogenous segments in a patch can independently predict whether image patches are fixated or not. Importantly, neither luminance nor contrast had an independent effect above and beyond what could be accounted for by the other predictors. Since the parcellation of the scene and the selection of features can be tailored to the specific research question, our approach allows for assessing the interplay of various factors relevant for fixation selection in scenes in a powerful and flexible manner. PMID:25752239
Regions of mid-level human visual cortex sensitive to the global coherence of local image patches.
Mannion, Damien J; Kersten, Daniel J; Olman, Cheryl A
2014-08-01
The global structural arrangement and spatial layout of the visual environment must be derived from the integration of local signals represented in the lower tiers of the visual system. This interaction between the spatially local and global properties of visual stimulation underlies many of our visual capacities, and how this is achieved in the brain is a central question for visual and cognitive neuroscience. Here, we examine the sensitivity of regions of the posterior human brain to the global coordination of spatially displaced naturalistic image patches. We presented observers with image patches in two circular apertures to the left and right of central fixation, with the patches drawn from either the same (coherent condition) or different (noncoherent condition) extended image. Using fMRI at 7T (n = 5), we find that global coherence affected signal amplitude in regions of dorsal mid-level cortex. Furthermore, we find that extensive regions of mid-level visual cortex contained information in their local activity pattern that could discriminate coherent and noncoherent stimuli. These findings indicate that the global coordination of local naturalistic image information has important consequences for the processing in human mid-level visual cortex.
Red to Green or Fast to Slow? Infants' Visual Working Memory for "Just Salient Differences"
ERIC Educational Resources Information Center
Kaldy, Zsuzsa; Blaser, Erik
2013-01-01
In this study, 6-month-old infants' visual working memory for a static feature (color) and a dynamic feature (rotational motion) was compared. Comparing infants' use of different features can only be done properly if experimental manipulations to those features are equally salient (Kaldy & Blaser, 2009; Kaldy, Blaser, & Leslie,…
NASA Astrophysics Data System (ADS)
Erdélyi, Miklós; Sinkó, József; Gajdos, Tamás.; Novák, Tibor
2017-02-01
Optical super-resolution techniques such as single molecule localization have become one of the most dynamically developed areas in optical microscopy. These techniques routinely provide images of fixed cells or tissues with sub-diffraction spatial resolution, and can even be applied for live cell imaging under appropriate circumstances. Localization techniques are based on the precise fitting of the point spread functions (PSF) to the measured images of stochastically excited, identical fluorescent molecules. These techniques require controlling the rate between the on, off and the bleached states, keeping the number of active fluorescent molecules at an optimum value, so their diffraction limited images can be detected separately both spatially and temporally. Because of the numerous (and sometimes unknown) parameters, the imaging system can only be handled stochastically. For example, the rotation of the dye molecules obscures the polarization dependent PSF shape, and only an averaged distribution - typically estimated by a Gaussian function - is observed. TestSTORM software was developed to generate image stacks for traditional localization microscopes, where localization meant the precise determination of the spatial position of the molecules. However, additional optical properties (polarization, spectra, etc.) of the emitted photons can be used for further monitoring the chemical and physical properties (viscosity, pH, etc.) of the local environment. The image stack generating program was upgraded by several new features, such as: multicolour, polarization dependent PSF, built-in 3D visualization, structured background. These features make the program an ideal tool for optimizing the imaging and sample preparation conditions.
Hierarchical extreme learning machine based reinforcement learning for goal localization
NASA Astrophysics Data System (ADS)
AlDahoul, Nouar; Zaw Htike, Zaw; Akmeliawati, Rini
2017-03-01
The objective of goal localization is to find the location of goals in noisy environments. Simple actions are performed to move the agent towards the goal. The goal detector should be capable of minimizing the error between the predicted locations and the true ones. Few regions need to be processed by the agent to reduce the computational effort and increase the speed of convergence. In this paper, reinforcement learning (RL) method was utilized to find optimal series of actions to localize the goal region. The visual data, a set of images, is high dimensional unstructured data and needs to be represented efficiently to get a robust detector. Different deep Reinforcement models have already been used to localize a goal but most of them take long time to learn the model. This long learning time results from the weights fine tuning stage that is applied iteratively to find an accurate model. Hierarchical Extreme Learning Machine (H-ELM) was used as a fast deep model that doesn’t fine tune the weights. In other words, hidden weights are generated randomly and output weights are calculated analytically. H-ELM algorithm was used in this work to find good features for effective representation. This paper proposes a combination of Hierarchical Extreme learning machine and Reinforcement learning to find an optimal policy directly from visual input. This combination outperforms other methods in terms of accuracy and learning speed. The simulations and results were analysed by using MATLAB.
Caharel, Stéphanie; Leleu, Arnaud; Bernard, Christian; Viggiano, Maria-Pia; Lalonde, Robert; Rebaï, Mohamed
2013-11-01
The properties of the face-sensitive N170 component of the event-related brain potential (ERP) were explored through an orientation discrimination task using natural faces, objects, and Arcimboldo paintings presented upright or inverted. Because Arcimboldo paintings are composed of non-face objects but have a global face configuration, they provide great control to disentangle high-level face-like or object-like visual processes at the level of the N170, and may help to examine the implication of each hemisphere in the global/holistic processing of face formats. For upright position, N170 amplitudes in the right occipito-temporal region did not differ between natural faces and Arcimboldo paintings but were larger for both of these categories than for objects, supporting the view that as early as the N170 time-window, the right hemisphere is involved in holistic perceptual processing of face-like configurations irrespective of their features. Conversely, in the left hemisphere, N170 amplitudes differed between Arcimboldo portraits and natural faces, suggesting that this hemisphere processes local facial features. For upside-down orientation in both hemispheres, N170 amplitudes did not differ between Arcimboldo paintings and objects, but were reduced for both categories compared to natural faces, indicating that the disruption of holistic processing with inversion leads to an object-like processing of Arcimboldo paintings due to the lack of local facial features. Overall, these results provide evidence that global/holistic perceptual processing of faces and face-like formats involves the right hemisphere as early as the N170 time-window, and that the local processing of face features is rather implemented in the left hemisphere. © 2013.
Duncan, Justin; Gosselin, Frédéric; Cobarro, Charlène; Dugas, Gabrielle; Blais, Caroline; Fiset, Daniel
2017-12-01
Horizontal information was recently suggested to be crucial for face identification. In the present paper, we expand on this finding and investigate the role of orientations for all the basic facial expressions and neutrality. To this end, we developed orientation bubbles to quantify utilization of the orientation spectrum by the visual system in a facial expression categorization task. We first validated the procedure in Experiment 1 with a simple plaid-detection task. In Experiment 2, we used orientation bubbles to reveal the diagnostic-i.e., task relevant-orientations for the basic facial expressions and neutrality. Overall, we found that horizontal information was highly diagnostic for expressions-surprise excepted. We also found that utilization of horizontal information strongly predicted performance level in this task. Despite the recent surge of research on horizontals, the link with local features remains unexplored. We were thus also interested in investigating this link. In Experiment 3, location bubbles were used to reveal the diagnostic features for the basic facial expressions. Crucially, Experiments 2 and 3 were run in parallel on the same participants, in an interleaved fashion. This way, we were able to correlate individual orientation and local diagnostic profiles. Our results indicate that individual differences in horizontal tuning are best predicted by utilization of the eyes.
Impaired recognition of facial emotions from low-spatial frequencies in Asperger syndrome.
Kätsyri, Jari; Saalasti, Satu; Tiippana, Kaisa; von Wendt, Lennart; Sams, Mikko
2008-01-01
The theory of 'weak central coherence' [Happe, F., & Frith, U. (2006). The weak coherence account: Detail-focused cognitive style in autism spectrum disorders. Journal of Autism and Developmental Disorders, 36(1), 5-25] implies that persons with autism spectrum disorders (ASDs) have a perceptual bias for local but not for global stimulus features. The recognition of emotional facial expressions representing various different levels of detail has not been studied previously in ASDs. We analyzed the recognition of four basic emotional facial expressions (anger, disgust, fear and happiness) from low-spatial frequencies (overall global shapes without local features) in adults with an ASD. A group of 20 participants with Asperger syndrome (AS) was compared to a group of non-autistic age- and sex-matched controls. Emotion recognition was tested from static and dynamic facial expressions whose spatial frequency contents had been manipulated by low-pass filtering at two levels. The two groups recognized emotions similarly from non-filtered faces and from dynamic vs. static facial expressions. In contrast, the participants with AS were less accurate than controls in recognizing facial emotions from very low-spatial frequencies. The results suggest intact recognition of basic facial emotions and dynamic facial information, but impaired visual processing of global features in ASDs.
Topic Transition in Educational Videos Using Visually Salient Words
ERIC Educational Resources Information Center
Gandhi, Ankit; Biswas, Arijit; Deshmukh, Om
2015-01-01
In this paper, we propose a visual saliency algorithm for automatically finding the topic transition points in an educational video. First, we propose a method for assigning a saliency score to each word extracted from an educational video. We design several mid-level features that are indicative of visual saliency. The optimal feature combination…
van Gemert, Jan C; Veenman, Cor J; Smeulders, Arnold W M; Geusebroek, Jan-Mark
2010-07-01
This paper studies automatic image classification by modeling soft assignment in the popular codebook model. The codebook model describes an image as a bag of discrete visual words selected from a vocabulary, where the frequency distributions of visual words in an image allow classification. One inherent component of the codebook model is the assignment of discrete visual words to continuous image features. Despite the clear mismatch of this hard assignment with the nature of continuous features, the approach has been successfully applied for some years. In this paper, we investigate four types of soft assignment of visual words to image features. We demonstrate that explicitly modeling visual word assignment ambiguity improves classification performance compared to the hard assignment of the traditional codebook model. The traditional codebook model is compared against our method for five well-known data sets: 15 natural scenes, Caltech-101, Caltech-256, and Pascal VOC 2007/2008. We demonstrate that large codebook vocabulary sizes completely deteriorate the performance of the traditional model, whereas the proposed model performs consistently. Moreover, we show that our method profits in high-dimensional feature spaces and reaps higher benefits when increasing the number of image categories.
Fuzzy Classification of High Resolution Remote Sensing Scenes Using Visual Attention Features.
Li, Linyi; Xu, Tingbao; Chen, Yun
2017-01-01
In recent years the spatial resolutions of remote sensing images have been improved greatly. However, a higher spatial resolution image does not always lead to a better result of automatic scene classification. Visual attention is an important characteristic of the human visual system, which can effectively help to classify remote sensing scenes. In this study, a novel visual attention feature extraction algorithm was proposed, which extracted visual attention features through a multiscale process. And a fuzzy classification method using visual attention features (FC-VAF) was developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated by using remote sensing scenes from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. FC-VAF achieved more accurate classification results than the others according to the quantitative accuracy evaluation indices. We also discussed the role and impacts of different decomposition levels and different wavelets on the classification accuracy. FC-VAF improves the accuracy of high resolution scene classification and therefore advances the research of digital image analysis and the applications of high resolution remote sensing images.
Fuzzy Classification of High Resolution Remote Sensing Scenes Using Visual Attention Features
Xu, Tingbao; Chen, Yun
2017-01-01
In recent years the spatial resolutions of remote sensing images have been improved greatly. However, a higher spatial resolution image does not always lead to a better result of automatic scene classification. Visual attention is an important characteristic of the human visual system, which can effectively help to classify remote sensing scenes. In this study, a novel visual attention feature extraction algorithm was proposed, which extracted visual attention features through a multiscale process. And a fuzzy classification method using visual attention features (FC-VAF) was developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated by using remote sensing scenes from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. FC-VAF achieved more accurate classification results than the others according to the quantitative accuracy evaluation indices. We also discussed the role and impacts of different decomposition levels and different wavelets on the classification accuracy. FC-VAF improves the accuracy of high resolution scene classification and therefore advances the research of digital image analysis and the applications of high resolution remote sensing images. PMID:28761440
Threat as a feature in visual semantic object memory.
Calley, Clifford S; Motes, Michael A; Chiang, H-Sheng; Buhl, Virginia; Spence, Jeffrey S; Abdi, Hervé; Anand, Raksha; Maguire, Mandy; Estevez, Leonardo; Briggs, Richard; Freeman, Thomas; Kraut, Michael A; Hart, John
2013-08-01
Threatening stimuli have been found to modulate visual processes related to perception and attention. The present functional magnetic resonance imaging (fMRI) study investigated whether threat modulates visual object recognition of man-made and naturally occurring categories of stimuli. Compared with nonthreatening pictures, threatening pictures of real items elicited larger fMRI BOLD signal changes in medial visual cortices extending inferiorly into the temporo-occipital (TO) "what" pathways. This region elicited greater signal changes for threatening items compared to nonthreatening from both the natural-occurring and man-made stimulus supraordinate categories, demonstrating a featural component to these visual processing areas. Two additional loci of signal changes within more lateral inferior TO areas (bilateral BA18 and 19 as well as the right ventral temporal lobe) were detected for a category-feature interaction, with stronger responses to man-made (category) threatening (feature) stimuli than to natural threats. The findings are discussed in terms of visual recognition of processing efficiently or rapidly groups of items that confer an advantage for survival. Copyright © 2012 Wiley Periodicals, Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Berres, Anne Sabine
This slide presentation describes basic topological concepts, including topological spaces, homeomorphisms, homotopy, betti numbers. Scalar field topology explores finding topological features and scalar field visualization, and vector field topology explores finding topological features and vector field visualization.
Breast Histopathological Image Retrieval Based on Latent Dirichlet Allocation.
Ma, Yibing; Jiang, Zhiguo; Zhang, Haopeng; Xie, Fengying; Zheng, Yushan; Shi, Huaqiang; Zhao, Yu
2017-07-01
In the field of pathology, whole slide image (WSI) has become the major carrier of visual and diagnostic information. Content-based image retrieval among WSIs can aid the diagnosis of an unknown pathological image by finding its similar regions in WSIs with diagnostic information. However, the huge size and complex content of WSI pose several challenges for retrieval. In this paper, we propose an unsupervised, accurate, and fast retrieval method for a breast histopathological image. Specifically, the method presents a local statistical feature of nuclei for morphology and distribution of nuclei, and employs the Gabor feature to describe the texture information. The latent Dirichlet allocation model is utilized for high-level semantic mining. Locality-sensitive hashing is used to speed up the search. Experiments on a WSI database with more than 8000 images from 15 types of breast histopathology demonstrate that our method achieves about 0.9 retrieval precision as well as promising efficiency. Based on the proposed framework, we are developing a search engine for an online digital slide browsing and retrieval platform, which can be applied in computer-aided diagnosis, pathology education, and WSI archiving and management.
Insect Detection of Small Targets Moving in Visual Clutter
Barnett, Paul D; O'Carroll, David C
2006-01-01
Detection of targets that move within visual clutter is a common task for animals searching for prey or conspecifics, a task made even more difficult when a moving pursuer needs to analyze targets against the motion of background texture (clutter). Despite the limited optical acuity of the compound eye of insects, this challenging task seems to have been solved by their tiny visual system. Here we describe neurons found in the male hoverfly,Eristalis tenax, that respond selectively to small moving targets. Although many of these target neurons are inhibited by the motion of a background pattern, others respond to target motion within the receptive field under a surprisingly large range of background motion stimuli. Some neurons respond whether or not there is a speed differential between target and background. Analysis of responses to very small targets (smaller than the size of the visual field of single photoreceptors) or those targets with reduced contrast shows that these neurons have extraordinarily high contrast sensitivity. Our data suggest that rejection of background motion may result from extreme selectivity for small targets contrasting against local patches of the background, combined with this high sensitivity, such that background patterns rarely contain features that satisfactorily drive the neuron. PMID:16448249
Anisotropic-Scale Junction Detection and Matching for Indoor Images.
Xue, Nan; Xia, Gui-Song; Bai, Xiang; Zhang, Liangpei; Shen, Weiming
Junctions play an important role in characterizing local geometrical structures of images, and the detection of which is a longstanding but challenging task. Existing junction detectors usually focus on identifying the location and orientations of junction branches while ignoring their scales, which, however, contain rich geometries of images. This paper presents a novel approach for junction detection and characterization, which especially exploits the locally anisotropic geometries of a junction and estimates its scales by relying on an a-contrario model. The output junctions are with anisotropic scales, saying that a scale parameter is associated with each branch of a junction and are thus named as anisotropic-scale junctions (ASJs). We then apply the new detected ASJs for matching indoor images, where there are dramatic changes of viewpoints and the detected local visual features, e.g., key-points, are usually insufficient and lack distinctive ability. We propose to use the anisotropic geometries of our junctions to improve the matching precision of indoor images. The matching results on sets of indoor images demonstrate that our approach achieves the state-of-the-art performance on indoor image matching.Junctions play an important role in characterizing local geometrical structures of images, and the detection of which is a longstanding but challenging task. Existing junction detectors usually focus on identifying the location and orientations of junction branches while ignoring their scales, which, however, contain rich geometries of images. This paper presents a novel approach for junction detection and characterization, which especially exploits the locally anisotropic geometries of a junction and estimates its scales by relying on an a-contrario model. The output junctions are with anisotropic scales, saying that a scale parameter is associated with each branch of a junction and are thus named as anisotropic-scale junctions (ASJs). We then apply the new detected ASJs for matching indoor images, where there are dramatic changes of viewpoints and the detected local visual features, e.g., key-points, are usually insufficient and lack distinctive ability. We propose to use the anisotropic geometries of our junctions to improve the matching precision of indoor images. The matching results on sets of indoor images demonstrate that our approach achieves the state-of-the-art performance on indoor image matching.
Exploration of complex visual feature spaces for object perception
Leeds, Daniel D.; Pyles, John A.; Tarr, Michael J.
2014-01-01
The mid- and high-level visual properties supporting object perception in the ventral visual pathway are poorly understood. In the absence of well-specified theory, many groups have adopted a data-driven approach in which they progressively interrogate neural units to establish each unit's selectivity. Such methods are challenging in that they require search through a wide space of feature models and stimuli using a limited number of samples. To more rapidly identify higher-level features underlying human cortical object perception, we implemented a novel functional magnetic resonance imaging method in which visual stimuli are selected in real-time based on BOLD responses to recently shown stimuli. This work was inspired by earlier primate physiology work, in which neural selectivity for mid-level features in IT was characterized using a simple parametric approach (Hung et al., 2012). To extend such work to human neuroimaging, we used natural and synthetic object stimuli embedded in feature spaces constructed on the basis of the complex visual properties of the objects themselves. During fMRI scanning, we employed a real-time search method to control continuous stimulus selection within each image space. This search was designed to maximize neural responses across a pre-determined 1 cm3 brain region within ventral cortex. To assess the value of this method for understanding object encoding, we examined both the behavior of the method itself and the complex visual properties the method identified as reliably activating selected brain regions. We observed: (1) Regions selective for both holistic and component object features and for a variety of surface properties; (2) Object stimulus pairs near one another in feature space that produce responses at the opposite extremes of the measured activity range. Together, these results suggest that real-time fMRI methods may yield more widely informative measures of selectivity within the broad classes of visual features associated with cortical object representation. PMID:25309408
Zhaoping, Li; Zhe, Li
2012-01-01
From a computational theory of V1, we formulate an optimization problem to investigate neural properties in the primary visual cortex (V1) from human reaction times (RTs) in visual search. The theory is the V1 saliency hypothesis that the bottom-up saliency of any visual location is represented by the highest V1 response to it relative to the background responses. The neural properties probed are those associated with the less known V1 neurons tuned simultaneously or conjunctively in two feature dimensions. The visual search is to find a target bar unique in color (C), orientation (O), motion direction (M), or redundantly in combinations of these features (e.g., CO, MO, or CM) among uniform background bars. A feature singleton target is salient because its evoked V1 response largely escapes the iso-feature suppression on responses to the background bars. The responses of the conjunctively tuned cells are manifested in the shortening of the RT for a redundant feature target (e.g., a CO target) from that predicted by a race between the RTs for the two corresponding single feature targets (e.g., C and O targets). Our investigation enables the following testable predictions. Contextual suppression on the response of a CO-tuned or MO-tuned conjunctive cell is weaker when the contextual inputs differ from the direct inputs in both feature dimensions, rather than just one. Additionally, CO-tuned cells and MO-tuned cells are often more active than the single feature tuned cells in response to the redundant feature targets, and this occurs more frequently for the MO-tuned cells such that the MO-tuned cells are no less likely than either the M-tuned or O-tuned neurons to be the most responsive neuron to dictate saliency for an MO target. PMID:22719829
Zhaoping, Li; Zhe, Li
2012-01-01
From a computational theory of V1, we formulate an optimization problem to investigate neural properties in the primary visual cortex (V1) from human reaction times (RTs) in visual search. The theory is the V1 saliency hypothesis that the bottom-up saliency of any visual location is represented by the highest V1 response to it relative to the background responses. The neural properties probed are those associated with the less known V1 neurons tuned simultaneously or conjunctively in two feature dimensions. The visual search is to find a target bar unique in color (C), orientation (O), motion direction (M), or redundantly in combinations of these features (e.g., CO, MO, or CM) among uniform background bars. A feature singleton target is salient because its evoked V1 response largely escapes the iso-feature suppression on responses to the background bars. The responses of the conjunctively tuned cells are manifested in the shortening of the RT for a redundant feature target (e.g., a CO target) from that predicted by a race between the RTs for the two corresponding single feature targets (e.g., C and O targets). Our investigation enables the following testable predictions. Contextual suppression on the response of a CO-tuned or MO-tuned conjunctive cell is weaker when the contextual inputs differ from the direct inputs in both feature dimensions, rather than just one. Additionally, CO-tuned cells and MO-tuned cells are often more active than the single feature tuned cells in response to the redundant feature targets, and this occurs more frequently for the MO-tuned cells such that the MO-tuned cells are no less likely than either the M-tuned or O-tuned neurons to be the most responsive neuron to dictate saliency for an MO target.
Visual arts training is linked to flexible attention to local and global levels of visual stimuli.
Chamberlain, Rebecca; Wagemans, Johan
2015-10-01
Observational drawing skill has been shown to be associated with the ability to focus on local visual details. It is unclear whether superior performance in local processing is indicative of the ability to attend to, and flexibly switch between, local and global levels of visual stimuli. It is also unknown whether these attentional enhancements remain specific to observational drawing skill or are a product of a wide range of artistic activities. The current study aimed to address these questions by testing if flexible visual processing predicts artistic group membership and observational drawing skill in a sample of first-year bachelor's degree art students (n=23) and non-art students (n=23). A pattern of local and global visual processing enhancements was found in relation to artistic group membership and drawing skill, with local processing ability found to be specifically related to individual differences in drawing skill. Enhanced global processing and more fluent switching between local and global levels of hierarchical stimuli predicted both drawing skill and artistic group membership, suggesting that these are beneficial attentional mechanisms for art-making in a range of domains. These findings support a top-down attentional model of artistic expertise and shed light on the domain specific and domain-general attentional enhancements induced by proficiency in the visual arts. Copyright © 2015 Elsevier B.V. All rights reserved.
Infants' Visual Localization of Visual and Auditory Targets.
ERIC Educational Resources Information Center
Bechtold, A. Gordon; And Others
This study is an investigation of 2-month-old infants' abilities to visually localize visual and auditory peripheral stimuli. Each subject (N=40) was presented with 50 trials; 25 of these visual and 25 auditory. The infant was placed in a semi-upright infant seat positioned 122 cm from the center speaker of an arc formed by five loudspeakers. At…
Deep Filter Banks for Texture Recognition, Description, and Segmentation.
Cimpoi, Mircea; Maji, Subhransu; Kokkinos, Iasonas; Vedaldi, Andrea
Visual textures have played a key role in image understanding because they convey important semantics of images, and because texture representations that pool local image descriptors in an orderless manner have had a tremendous impact in diverse applications. In this paper we make several contributions to texture understanding. First, instead of focusing on texture instance and material category recognition, we propose a human-interpretable vocabulary of texture attributes to describe common texture patterns, complemented by a new describable texture dataset for benchmarking. Second, we look at the problem of recognizing materials and texture attributes in realistic imaging conditions, including when textures appear in clutter, developing corresponding benchmarks on top of the recently proposed OpenSurfaces dataset. Third, we revisit classic texture represenations, including bag-of-visual-words and the Fisher vectors, in the context of deep learning and show that these have excellent efficiency and generalization properties if the convolutional layers of a deep model are used as filter banks. We obtain in this manner state-of-the-art performance in numerous datasets well beyond textures, an efficient method to apply deep features to image regions, as well as benefit in transferring features from one domain to another.
A Small World of Neuronal Synchrony
Yu, Shan; Huang, Debin; Singer, Wolf
2008-01-01
A small-world network has been suggested to be an efficient solution for achieving both modular and global processing—a property highly desirable for brain computations. Here, we investigated functional networks of cortical neurons using correlation analysis to identify functional connectivity. To reconstruct the interaction network, we applied the Ising model based on the principle of maximum entropy. This allowed us to assess the interactions by measuring pairwise correlations and to assess the strength of coupling from the degree of synchrony. Visual responses were recorded in visual cortex of anesthetized cats, simultaneously from up to 24 neurons. First, pairwise correlations captured most of the patterns in the population's activity and, therefore, provided a reliable basis for the reconstruction of the interaction networks. Second, and most importantly, the resulting networks had small-world properties; the average path lengths were as short as in simulated random networks, but the clustering coefficients were larger. Neurons differed considerably with respect to the number and strength of interactions, suggesting the existence of “hubs” in the network. Notably, there was no evidence for scale-free properties. These results suggest that cortical networks are optimized for the coexistence of local and global computations: feature detection and feature integration or binding. PMID:18400792
a Mapping Method of Slam Based on Look up Table
NASA Astrophysics Data System (ADS)
Wang, Z.; Li, J.; Wang, A.; Wang, J.
2017-09-01
In the last years several V-SLAM(Visual Simultaneous Localization and Mapping) approaches have appeared showing impressive reconstructions of the world. However these maps are built with far more than the required information. This limitation comes from the whole process of each key-frame. In this paper we present for the first time a mapping method based on the LOOK UP TABLE(LUT) for visual SLAM that can improve the mapping effectively. As this method relies on extracting features in each cell divided from image, it can get the pose of camera that is more representative of the whole key-frame. The tracking direction of key-frames is obtained by counting the number of parallax directions of feature points. LUT stored all mapping needs the number of cell corresponding to the tracking direction which can reduce the redundant information in the key-frame, and is more efficient to mapping. The result shows that a better map with less noise is build using less than one-third of the time. We believe that the capacity of LUT efficiently building maps makes it a good choice for the community to investigate in the scene reconstruction problems.