Dimensionality of visual complexity in computer graphics scenes
NASA Astrophysics Data System (ADS)
Ramanarayanan, Ganesh; Bala, Kavita; Ferwerda, James A.; Walter, Bruce
2008-02-01
How do human observers perceive visual complexity in images? This problem is especially relevant for computer graphics, where a better understanding of visual complexity can aid in the development of more advanced rendering algorithms. In this paper, we describe a study of the dimensionality of visual complexity in computer graphics scenes. We conducted an experiment where subjects judged the relative complexity of 21 high-resolution scenes, rendered with photorealistic methods. Scenes were gathered from web archives and varied in theme, number and layout of objects, material properties, and lighting. We analyzed the pooled subject responses using multidimensional scaling. This analysis embedded the stimulus images in a two-dimensional space, with axes that roughly corresponded to "numerosity" and "material / lighting complexity". In a follow-up analysis, we derived a one-dimensional complexity ordering of the stimulus images. We compared this ordering with several computable complexity metrics, such as scene polygon count and JPEG compression size, and found that they were not strongly correlated with it. Understanding the differences between these measures can lead to the design of more efficient rendering algorithms in computer graphics.
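The comparison between a perceptual complexity ordering and a computable metric can be illustrated with a rank correlation. The sketch below is not the authors' analysis: it assumes Spearman's rank correlation as the measure, and the five scenes, their perceived ranks, and their polygon counts are hypothetical values chosen only for illustration.

```python
def ranks(values):
    """Return 1-based ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data for five scenes (not taken from the study).
perceived_rank = [1, 2, 3, 4, 5]                        # 1 = judged least complex
polygon_count = [120_000, 8_000, 450_000, 15_000, 60_000]

rho = spearman(perceived_rank, polygon_count)
print(f"Spearman rho = {rho:.2f}")
```

A value of rho near zero, as with these made-up numbers, would mean that polygon count says little about the perceived ordering.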
Metabolic Mapping of the Brain's Response to Visual Stimulation: Studies in Humans.
ERIC Educational Resources Information Center
Phelps, Michael E.; Kuhl, David E.
1981-01-01
Studies demonstrate increasing glucose metabolic rates in human primary (PVC) and association (AVC) visual cortex as the complexity of visual scenes increases. AVC increased more rapidly with scene complexity than PVC and increased local metabolic activities above those of control subjects with eyes closed; this indicates the wide range and metabolic reserve of visual…
A novel scene management technology for complex virtual battlefield environment
NASA Astrophysics Data System (ADS)
Sheng, Changchong; Jiang, Libing; Tang, Bo; Tang, Xiaoan
2018-04-01
Efficient scene management of virtual environments is an important research topic in real-time computer visualization and has a decisive influence on rendering efficiency. However, traditional scene management methods are not suitable for complex virtual battlefield environments. This paper combines the advantages of traditional scene graph technology and spatial data structure methods. Using the idea of separating management from rendering, a loose object-oriented scene graph structure is established to manage the entity model data in the scene, and a performance-based quad-tree structure is created for traversal and rendering. In addition, a collaborative update relationship between the two tree structures is designed to achieve efficient scene management. Compared with previous scene management methods, this method is more efficient and meets the needs of real-time visualization.
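The abstract does not specify how the quad-tree is built, so the following is only a minimal sketch of a plain point quadtree, the general kind of spatial structure that can serve traversal and rendering. It is not the paper's performance-based variant; the class name `QuadTree`, the `capacity` and `min_half` parameters, and the entity ids are assumptions made for illustration.

```python
class QuadTree:
    """Point quadtree over a square region centred at (cx, cy)."""

    def __init__(self, cx, cy, half, capacity=4, min_half=1.0):
        self.cx, self.cy, self.half = cx, cy, half
        self.capacity = capacity      # max items in a leaf before it splits
        self.min_half = min_half      # leaves this small never split further
        self.items = []               # (x, y, entity_id) tuples held by a leaf
        self.children = None          # four sub-quadrants once split

    def _contains(self, x, y):
        # Half-open bounds so each point belongs to exactly one quadrant.
        return (self.cx - self.half <= x < self.cx + self.half and
                self.cy - self.half <= y < self.cy + self.half)

    def insert(self, x, y, entity_id):
        """Insert an entity; return False if it lies outside this node."""
        if not self._contains(x, y):
            return False
        if self.children is None:
            if len(self.items) < self.capacity or self.half <= self.min_half:
                self.items.append((x, y, entity_id))
                return True
            self._split()
        return any(child.insert(x, y, entity_id) for child in self.children)

    def _split(self):
        # Create four children and push the stored items down into them.
        h = self.half / 2
        self.children = [
            QuadTree(self.cx + dx * h, self.cy + dy * h, h,
                     self.capacity, self.min_half)
            for dx in (-1, 1) for dy in (-1, 1)
        ]
        old, self.items = self.items, []
        for x, y, eid in old:
            any(child.insert(x, y, eid) for child in self.children)

    def query(self, xmin, ymin, xmax, ymax):
        """Return ids of entities inside the rectangle (bounds inclusive)."""
        # Prune whole subtrees that cannot overlap the query rectangle.
        if (xmax < self.cx - self.half or xmin >= self.cx + self.half or
                ymax < self.cy - self.half or ymin >= self.cy + self.half):
            return []
        found = [eid for x, y, eid in self.items
                 if xmin <= x <= xmax and ymin <= y <= ymax]
        if self.children is not None:
            for child in self.children:
                found.extend(child.query(xmin, ymin, xmax, ymax))
        return found


# Hypothetical battlefield entities on a 1024 x 1024 terrain.
world = QuadTree(0.0, 0.0, 512.0, capacity=2)
for x, y, name in [(-100, -100, "tank_1"), (100, 100, "tank_2"),
                   (110, 120, "radar_1"), (-300, 400, "jeep_1"),
                   (10, 10, "hq")]:
    world.insert(x, y, name)

visible = sorted(world.query(0, 0, 200, 200))
print(visible)
```

In a renderer, the query rectangle would stand in for the region covered by the view frustum, so that only entities in overlapping quadrants are considered for drawing.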
Deconstructing Visual Scenes in Cortex: Gradients of Object and Spatial Layout Information
Kravitz, Dwight J.; Baker, Chris I.
2013-01-01
Real-world visual scenes are complex, cluttered, and heterogeneous stimuli engaging scene- and object-selective cortical regions including parahippocampal place area (PPA), retrosplenial complex (RSC), and lateral occipital complex (LOC). To understand the unique contribution of each region to distributed scene representations, we generated predictions based on a neuroanatomical framework adapted from monkey and tested them using minimal scenes in which we independently manipulated both spatial layout (open, closed, and gradient) and object content (furniture, e.g., bed, dresser). Commensurate with its strong connectivity with posterior parietal cortex, RSC evidenced strong spatial layout information but no object information, and its response was not even modulated by object presence. In contrast, LOC, which lies within the ventral visual pathway, contained strong object information but no background information. Finally, PPA, which is connected with both the dorsal and the ventral visual pathway, showed information about both objects and spatial backgrounds and was sensitive to the presence or absence of either. These results suggest that 1) LOC, PPA, and RSC have distinct representations, emphasizing different aspects of scenes, 2) the specific representations in each region are predictable from their patterns of connectivity, and 3) PPA combines both spatial layout and object information as predicted by connectivity. PMID:22473894
NASA Astrophysics Data System (ADS)
Graham, James; Ternovskiy, Igor V.
2013-06-01
We applied a two-stage unsupervised hierarchical learning system to model complex dynamic surveillance and cyber space monitoring systems using a non-commercial version of the NeoAxis visualization software. The hierarchical scene learning and recognition approach is based on hierarchical expectation maximization, and was linked to a 3D graphics engine for validation of learning and classification results and for understanding the human-autonomous system relationship. Scene recognition is performed by taking synthetically generated data and feeding it to a dynamic logic algorithm. The algorithm performs hierarchical recognition of the scene by first examining the features of the objects to determine which objects are present, and then determining the scene based on the objects present. This paper presents a framework within which low-level data linked to higher-level visualization can provide support to a human operator and be evaluated in a detailed and systematic way.
Traffic Signs in Complex Visual Environments
DOT National Transportation Integrated Search
1982-11-01
The effect of sign luminance on detection and recognition of traffic control devices is mediated through contrast with the immediate surround. Additionally, complex visual scenes are known to degrade visual performance with targets well above visual...
Figure-Ground Organization in Visual Cortex for Natural Scenes
2016-01-01
Figure-ground organization and border-ownership assignment are essential for understanding natural scenes. It has been shown that many neurons in the macaque visual cortex signal border-ownership in displays of simple geometric shapes such as squares, but how well these neurons resolve border-ownership in natural scenes is not known. We studied area V2 neurons in behaving macaques with static images of complex natural scenes. We found that about half of the neurons were border-ownership selective for contours in natural scenes, and this selectivity originated from the image context. The border-ownership signals emerged within 70 ms after stimulus onset, only ∼30 ms after response onset. A substantial fraction of neurons were highly consistent across scenes. Thus, the cortical mechanisms of figure-ground organization are fast and efficient even in images of complex natural scenes. Understanding how the brain performs this task so fast remains a challenge. PMID:28058269
ERIC Educational Resources Information Center
Wilkinson, Krista M.; Light, Janice
2011-01-01
Purpose: Many individuals with complex communication needs may benefit from visual aided augmentative and alternative communication systems. In visual scene displays (VSDs), language concepts are embedded into a photograph of a naturalistic event. Humans play a central role in communication development and might be important elements in VSDs.…
ERIC Educational Resources Information Center
Freeth, M.; Chapman, P.; Ropar, D.; Mitchell, P.
2010-01-01
Visual fixation patterns whilst viewing complex photographic scenes containing one person were studied in 24 high-functioning adolescents with Autism Spectrum Disorders (ASD) and 24 matched typically developing adolescents. Over two different scene presentation durations both groups spent a large, strikingly similar proportion of their viewing…
Cultural differences in the lateral occipital complex while viewing incongruent scenes
Yang, Yung-Jui; Goh, Joshua; Hong, Ying-Yi; Park, Denise C.
2010-01-01
Converging behavioral and neuroimaging evidence indicates that culture influences the processing of complex visual scenes. Whereas Westerners focus on central objects and tend to ignore context, East Asians process scenes more holistically, attending to the context in which objects are embedded. We investigated cultural differences in contextual processing by manipulating the congruence of visual scenes presented in an fMR-adaptation paradigm. We hypothesized that East Asians would show greater adaptation to incongruent scenes, consistent with their tendency to process contextual relationships more extensively than Westerners. Sixteen Americans and 16 native Chinese were scanned while viewing sets of pictures consisting of a focal object superimposed upon a background scene. In half of the pictures objects were paired with congruent backgrounds, and in the other half objects were paired with incongruent backgrounds. We found that within both the right and left lateral occipital complexes, Chinese participants showed significantly greater adaptation to incongruent scenes than to congruent scenes relative to American participants. These results suggest that Chinese were more sensitive to contextual incongruity than were Americans and that they reacted to incongruent object/background pairings by focusing greater attention on the object. PMID:20083532
Idiosyncratic characteristics of saccadic eye movements when viewing different visual environments.
Andrews, T J; Coppola, D M
1999-08-01
Eye position was recorded in different viewing conditions to assess whether the temporal and spatial characteristics of saccadic eye movements in different individuals are idiosyncratic. Our aim was to determine the degree to which oculomotor control is based on endogenous factors. A total of 15 naive subjects viewed five visual environments: (1) The absence of visual stimulation (i.e. a dark room); (2) a repetitive visual environment (i.e. simple textured patterns); (3) a complex natural scene; (4) a visual search task; and (5) reading text. Although differences in visual environment had significant effects on eye movements, idiosyncrasies were also apparent. For example, the mean fixation duration and size of an individual's saccadic eye movements when passively viewing a complex natural scene covaried significantly with those same parameters in the absence of visual stimulation and in a repetitive visual environment. In contrast, an individual's spatio-temporal characteristics of eye movements during active tasks such as reading text or visual search covaried together, but did not correlate with the pattern of eye movements detected when viewing a natural scene, simple patterns or in the dark. These idiosyncratic patterns of eye movements in normal viewing reveal an endogenous influence on oculomotor control. The independent covariance of eye movements during different visual tasks shows that saccadic eye movements during active tasks like reading or visual search differ from those engaged during the passive inspection of visual scenes.
Teng, Santani
2017-01-01
In natural environments, visual and auditory stimulation elicit responses across a large set of brain regions in a fraction of a second, yielding representations of the multimodal scene and its properties. The rapid and complex neural dynamics underlying visual and auditory information processing pose major challenges to human cognitive neuroscience. Brain signals measured non-invasively are inherently noisy, the format of neural representations is unknown, and transformations between representations are complex and often nonlinear. Further, no single non-invasive brain measurement technique provides a spatio-temporally integrated view. In this opinion piece, we argue that progress can be made by a concerted effort based on three pillars of recent methodological development: (i) sensitive analysis techniques such as decoding and cross-classification, (ii) complex computational modelling using models such as deep neural networks, and (iii) integration across imaging methods (magnetoencephalography/electroencephalography, functional magnetic resonance imaging) and models, e.g. using representational similarity analysis. We showcase two recent efforts that have been undertaken in this spirit and provide novel results about visual and auditory scene analysis. Finally, we discuss the limits of this perspective and sketch a concrete roadmap for future research. This article is part of the themed issue ‘Auditory and visual scene analysis’. PMID:28044019
Cichy, Radoslaw Martin; Teng, Santani
2017-02-19
In natural environments, visual and auditory stimulation elicit responses across a large set of brain regions in a fraction of a second, yielding representations of the multimodal scene and its properties. The rapid and complex neural dynamics underlying visual and auditory information processing pose major challenges to human cognitive neuroscience. Brain signals measured non-invasively are inherently noisy, the format of neural representations is unknown, and transformations between representations are complex and often nonlinear. Further, no single non-invasive brain measurement technique provides a spatio-temporally integrated view. In this opinion piece, we argue that progress can be made by a concerted effort based on three pillars of recent methodological development: (i) sensitive analysis techniques such as decoding and cross-classification, (ii) complex computational modelling using models such as deep neural networks, and (iii) integration across imaging methods (magnetoencephalography/electroencephalography, functional magnetic resonance imaging) and models, e.g. using representational similarity analysis. We showcase two recent efforts that have been undertaken in this spirit and provide novel results about visual and auditory scene analysis. Finally, we discuss the limits of this perspective and sketch a concrete roadmap for future research. This article is part of the themed issue 'Auditory and visual scene analysis'. © 2017 The Authors.
Rolls, Edmund T.; Webb, Tristan J.
2014-01-01
Searching for and recognizing objects in complex natural scenes is implemented by multiple saccades until the eyes reach within the reduced receptive field sizes of inferior temporal cortex (IT) neurons. We analyze and model how the dorsal and ventral visual streams both contribute to this. Saliency detection in the dorsal visual system including area LIP is modeled by graph-based visual saliency, and allows the eyes to fixate potential objects within several degrees. Visual information at the fixated location subtending approximately 9° corresponding to the receptive fields of IT neurons is then passed through a four layer hierarchical model of the ventral cortical visual system, VisNet. We show that VisNet can be trained using a synaptic modification rule with a short-term memory trace of recent neuronal activity to capture both the required view and translation invariances to allow in the model approximately 90% correct object recognition for 4 objects shown in any view across a range of 135° anywhere in a scene. The model was able to generalize correctly within the four trained views and the 25 trained translations. This approach analyses the principles by which complementary computations in the dorsal and ventral visual cortical streams enable objects to be located and recognized in complex natural scenes. PMID:25161619
Where's Wally: the influence of visual salience on referring expression generation.
Clarke, Alasdair D F; Elsner, Micha; Rohde, Hannah
2013-01-01
Referring expression generation (REG) presents the converse problem to visual search: given a scene and a specified target, how does one generate a description which would allow somebody else to quickly and accurately locate the target? Previous work in psycholinguistics and natural language processing has failed to find an important and integrated role for vision in this task. That previous work, which relies largely on simple scenes, tends to treat vision as a pre-process for extracting feature categories that are relevant to disambiguation. However, the visual search literature suggests that some descriptions are better than others at enabling listeners to search efficiently within complex stimuli. This paper presents a study testing whether participants are sensitive to visual features that allow them to compose such "good" descriptions. Our results show that visual properties (salience, clutter, area, and distance) influence REG for targets embedded in images from the Where's Wally? books. Referring expressions for large targets are shorter than those for smaller targets, and expressions about targets in highly cluttered scenes use more words. We also find that participants are more likely to mention non-target landmarks that are large, salient, and in close proximity to the target. These findings identify a key role for visual salience in language production decisions and highlight the importance of scene complexity for REG.
Viewing the dynamics and control of visual attention through the lens of electrophysiology
Woodman, Geoffrey F.
2013-01-01
How we find what we are looking for in complex visual scenes is a seemingly simple ability that has taken half a century to unravel. The first study to use the term visual search showed that as the number of objects in a complex scene increases, observers’ reaction times increase proportionally (Green and Anderson, 1956). This observation suggests that our ability to process the objects in the scenes is limited in capacity. However, if it is known that the target will have a certain feature attribute, for example, that it will be red, then only an increase in the number of red items increases reaction time. This observation suggests that we can control which visual inputs receive the benefit of our limited capacity to recognize the objects, such as those defined by the color red, as the items we seek. The nature of the mechanisms that underlie these basic phenomena in the literature on visual search has been more difficult to definitively determine. In this paper, I discuss how electrophysiological methods have provided us with the necessary tools to understand the nature of the mechanisms that give rise to the effects observed in the first visual search paper. I begin by describing how recordings of event-related potentials from humans and nonhuman primates have shown us how attention is deployed to possible target items in complex visual scenes. Then, I will discuss how event-related potential experiments have allowed us to directly measure the memory representations that are used to guide these deployments of attention to items with target-defining features. PMID:23357579
Neural representations of contextual guidance in visual search of real-world scenes.
Preston, Tim J; Guo, Fei; Das, Koel; Giesbrecht, Barry; Eckstein, Miguel P
2013-05-01
Exploiting scene context and object-object co-occurrence is critical in guiding eye movements and facilitating visual search, yet the mediating neural mechanisms are unknown. We used functional magnetic resonance imaging while observers searched for target objects in scenes and used multivariate pattern analyses (MVPA) to show that the lateral occipital complex (LOC) can predict the coarse spatial location of observers' expectations about the likely location of 213 different targets absent from the scenes. In addition, we found weaker but significant representations of context location in an area related to the orienting of attention (intraparietal sulcus, IPS) as well as a region related to scene processing (retrosplenial cortex, RSC). Importantly, the degree of agreement among 100 independent raters about the likely location to contain a target object in a scene correlated with LOC's ability to predict the contextual location while weaker but significant effects were found in IPS, RSC, the human motion area, and early visual areas (V1, V3v). When contextual information was made irrelevant to observers' behavioral task, the MVPA analysis of LOC and the other areas' activity ceased to predict the location of context. Thus, our findings suggest that the likely locations of targets in scenes are represented in various visual areas with LOC playing a key role in contextual guidance during visual search of objects in real scenes.
Integrating mechanisms of visual guidance in naturalistic language production.
Coco, Moreno I; Keller, Frank
2015-05-01
Situated language production requires the integration of visual attention and linguistic processing. Previous work has not conclusively disentangled the role of perceptual scene information and structural sentence information in guiding visual attention. In this paper, we present an eye-tracking study that demonstrates that three types of guidance, perceptual, conceptual, and structural, interact to control visual attention. In a cued language production experiment, we manipulate perceptual (scene clutter) and conceptual guidance (cue animacy) and measure structural guidance (syntactic complexity of the utterance). Analysis of the time course of language production, before and during speech, reveals that all three forms of guidance affect the complexity of visual responses, quantified in terms of the entropy of attentional landscapes and the turbulence of scan patterns, especially during speech. We find that perceptual and conceptual guidance mediate the distribution of attention in the scene, whereas structural guidance closely relates to scan pattern complexity. Furthermore, the eye-voice span of the cued object and its perceptual competitor are similar; its latency is mediated by both perceptual and structural guidance. These results rule out a strict interpretation of structural guidance as the single dominant form of visual guidance in situated language production. Rather, the phase of the task and the associated demands of cross-modal cognitive processing determine the mechanisms that guide attention.
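The idea of quantifying how spread out attention is by an entropy can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it assumes fixations are simply binned on a coarse grid and that Shannon entropy is computed over the bin proportions, and the function name, the grid size, and the fixation coordinates are hypothetical.

```python
import math
from collections import Counter


def fixation_entropy(fixations, width, height, grid=4):
    """Shannon entropy (in bits) of fixations binned on a grid x grid lattice.

    Low entropy means attention is concentrated in few cells; high entropy
    means it is spread across the scene.
    """
    counts = Counter()
    for x, y in fixations:
        # Clamp so points on the right or bottom edge fall in the last cell.
        col = min(int(x / width * grid), grid - 1)
        row = min(int(y / height * grid), grid - 1)
        counts[(row, col)] += 1
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical fixation coordinates on an 800 x 600 image.
focused = [(100, 100), (110, 90), (120, 105), (95, 115)]    # one grid cell
spread = [(100, 100), (700, 100), (100, 500), (700, 500)]   # four grid cells

print(fixation_entropy(focused, 800, 600))
print(fixation_entropy(spread, 800, 600))
```

With equal numbers of fixations in four distinct cells the entropy is 2 bits, while fixations confined to one cell give 0 bits.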
Experiencing simultanagnosia through windowed viewing of complex social scenes.
Dalrymple, Kirsten A; Birmingham, Elina; Bischof, Walter F; Barton, Jason J S; Kingstone, Alan
2011-01-07
Simultanagnosia is a disorder of visual attention, defined as an inability to see more than one object at once. It has been conceived as being due to a constriction of the visual "window" of attention, a metaphor that we examine in the present article. A simultanagnosic patient (SL) and two non-simultanagnosic control patients (KC and ES) described social scenes while their eye movements were monitored. These data were compared to a group of healthy subjects who described the same scenes under the same conditions as the patients, or through an aperture that restricted their vision to a small portion of the scene. Experiment 1 demonstrated that SL showed unusually low proportions of fixations to the eyes in social scenes, which contrasted with all other participants who demonstrated the standard preferential bias toward eyes. Experiments 2 and 3 revealed that when healthy participants viewed scenes through a window that was contingent on where they looked (Experiment 2) or where they moved a computer mouse (Experiment 3), their behavior closely mirrored that of patient SL. These findings suggest that a constricted window of visual processing has important consequences for how simultanagnosic patients explore their world. Our paradigm's capacity to mimic simultanagnosic behaviors while viewing complex scenes implies that it may be a valid way of modeling simultanagnosia in healthy individuals, providing a useful tool for future research. More broadly, our results support the thesis that people fixate the eyes in social scenes because they are informative to the meaning of the scene. Copyright © 2010 Elsevier B.V. All rights reserved.
Coding of navigational affordances in the human visual system
Epstein, Russell A.
2017-01-01
A central component of spatial navigation is determining where one can and cannot go in the immediate environment. We used fMRI to test the hypothesis that the human visual system solves this problem by automatically identifying the navigational affordances of the local scene. Multivoxel pattern analyses showed that a scene-selective region of dorsal occipitoparietal cortex, known as the occipital place area, represents pathways for movement in scenes in a manner that is tolerant to variability in other visual features. These effects were found in two experiments: one using tightly controlled artificial environments as stimuli, the other using a diverse set of complex, natural scenes. A reconstruction analysis demonstrated that the population codes of the occipital place area could be used to predict the affordances of novel scenes. Taken together, these results reveal a previously unknown mechanism for perceiving the affordance structure of navigable space. PMID:28416669
Scene-Aware Adaptive Updating for Visual Tracking via Correlation Filters
Zhang, Sirou; Qiao, Xiaoya
2017-01-01
In recent years, visual object tracking has been widely used in military guidance, human-computer interaction, road traffic, scene monitoring and many other fields. Tracking algorithms based on correlation filters have shown good performance in terms of accuracy and tracking speed. However, their performance is not satisfactory in scenes with scale variation, deformation, and occlusion. In this paper, we propose a scene-aware adaptive updating mechanism for visual tracking via a kernel correlation filter (KCF). First, a low-complexity scale estimation method is presented, in which the corresponding weight in five scales is employed to determine the final target scale. Then, the adaptive updating mechanism is presented based on scene classification. We classify the video scenes into four categories by video content analysis. According to the target scene, we exploit the adaptive updating mechanism to update the kernel correlation filter to improve the robustness of the tracker, especially in scenes with scale variation, deformation, and occlusion. We evaluate our tracker on the CVPR2013 benchmark. Compared with the KCF tracker, the proposed algorithm improves results by 33.3%, 15%, 6%, 21.9% and 19.8% on scenes with scale variation, partial or long-time large-area occlusion, deformation, fast motion and out-of-view, respectively. PMID:29140311
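A scene-dependent model update can be sketched as below. Correlation-filter trackers such as KCF commonly refresh their model by linear interpolation between the old model and the estimate from the current frame; the sketch assumes the adaptive mechanism amounts to choosing the interpolation rate per scene category. The abstract does not name the four categories or give any rates, so the category names and values in `SCENE_LEARNING_RATE` are hypothetical.

```python
def update_model(model, new_estimate, learning_rate):
    """Linear-interpolation update: model <- (1 - lr) * model + lr * new."""
    return [(1.0 - learning_rate) * m + learning_rate * n
            for m, n in zip(model, new_estimate)]


# Hypothetical scene categories and rates (not taken from the paper).
# A rate of 0.0 freezes the model, e.g. so that an occluder is not learned.
SCENE_LEARNING_RATE = {
    "normal": 0.02,
    "scale_variation": 0.03,
    "deformation": 0.05,
    "occlusion": 0.0,
}


def adaptive_update(model, new_estimate, scene):
    """Update the model with a rate selected by the classified scene type."""
    return update_model(model, new_estimate, SCENE_LEARNING_RATE[scene])


# Toy two-coefficient "filter" standing in for the real filter coefficients.
model = [1.0, 2.0]
frame_estimate = [3.0, 4.0]

print(adaptive_update(model, frame_estimate, "occlusion"))
print(adaptive_update(model, frame_estimate, "deformation"))
```

The design choice illustrated here is that faster appearance change calls for a larger rate, while suspected occlusion calls for little or no updating.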
Miskovic, Vladimir; Martinovic, Jasna; Wieser, Matthias J; Petro, Nathan M; Bradley, Margaret M; Keil, Andreas
2015-03-01
Emotionally arousing scenes readily capture visual attention, prompting amplified neural activity in sensory regions of the brain. The physical stimulus features and related information channels in the human visual system that contribute to this modulation, however, are not known. Here, we manipulated low-level physical parameters of complex scenes varying in hedonic valence and emotional arousal in order to target the relative contributions of luminance based versus chromatic visual channels to emotional perception. Stimulus-evoked brain electrical activity was measured during picture viewing and used to quantify neural responses sensitive to lower-tier visual cortical involvement (steady-state visual evoked potentials) as well as the late positive potential, reflecting a more distributed cortical event. Results showed that the enhancement for emotional content was stimulus-selective when examining the steady-state segments of the evoked visual potentials. Response amplification was present only for low spatial frequency, grayscale stimuli, and not for high spatial frequency, red/green stimuli. In contrast, the late positive potential was modulated by emotion regardless of the scene's physical properties. Our findings are discussed in relation to neurophysiologically plausible constraints operating at distinct stages of the cortical processing stream. Copyright © 2015 Elsevier B.V. All rights reserved.
The role of memory for visual search in scenes
Võ, Melissa Le-Hoa; Wolfe, Jeremy M.
2014-01-01
Many daily activities involve looking for something. The ease with which these searches are performed often allows one to forget that searching represents complex interactions between visual attention and memory. While a clear understanding exists of how search efficiency will be influenced by visual features of targets and their surrounding distractors or by the number of items in the display, the role of memory in search is less well understood. Contextual cueing studies have shown that implicit memory for repeated item configurations can facilitate search in artificial displays. When searching more naturalistic environments, other forms of memory come into play. For instance, semantic memory provides useful information about which objects are typically found where within a scene, and episodic scene memory provides information about where a particular object was seen the last time a particular scene was viewed. In this paper, we will review work on these topics, with special emphasis on the role of memory in guiding search in organized, real-world scenes. PMID:25684693
The role of memory for visual search in scenes.
Le-Hoa Võ, Melissa; Wolfe, Jeremy M
2015-03-01
Many daily activities involve looking for something. The ease with which these searches are performed often allows one to forget that searching represents complex interactions between visual attention and memory. Although a clear understanding exists of how search efficiency will be influenced by visual features of targets and their surrounding distractors or by the number of items in the display, the role of memory in search is less well understood. Contextual cueing studies have shown that implicit memory for repeated item configurations can facilitate search in artificial displays. When searching more naturalistic environments, other forms of memory come into play. For instance, semantic memory provides useful information about which objects are typically found where within a scene, and episodic scene memory provides information about where a particular object was seen the last time a particular scene was viewed. In this paper, we will review work on these topics, with special emphasis on the role of memory in guiding search in organized, real-world scenes. © 2015 New York Academy of Sciences.
DspaceOgre 3D Graphics Visualization Tool
NASA Technical Reports Server (NTRS)
Jain, Abhinandan; Myin, Steven; Pomerantz, Marc I.
2011-01-01
This general-purpose 3D graphics visualization C++ tool is designed for visualization of simulation and analysis data for articulated mechanisms. Examples of such systems are vehicles, robotic arms, biomechanics models, and biomolecular structures. DspaceOgre builds upon the open-source Ogre3D graphics visualization library. It provides additional classes to support the management of complex scenes involving multiple viewpoints and different scene groups, and can be used as a remote graphics server. This software provides improved support for adding programs at the graphics processing unit (GPU) level for improved performance. It also improves upon the messaging interface it exposes for use as a visualization server.
Flies and humans share a motion estimation strategy that exploits natural scene statistics
Clark, Damon A.; Fitzgerald, James E.; Ales, Justin M.; Gohl, Daryl M.; Silies, Marion A.; Norcia, Anthony M.; Clandinin, Thomas R.
2014-01-01
Sighted animals extract motion information from visual scenes by processing spatiotemporal patterns of light falling on the retina. The dominant models for motion estimation exploit intensity correlations only between pairs of points in space and time. Moving natural scenes, however, contain more complex correlations. Here we show that fly and human visual systems encode the combined direction and contrast polarity of moving edges using triple correlations that enhance motion estimation in natural environments. Both species extract triple correlations with neural substrates tuned for light or dark edges, and sensitivity to specific triple correlations is retained even as light and dark edge motion signals are combined. Thus, both species separately process light and dark image contrasts to capture motion signatures that can improve estimation accuracy. This striking convergence argues that statistical structures in natural scenes have profoundly affected visual processing, driving a common computational strategy over 500 million years of evolution. PMID:24390225
Scene analysis for effective visual search in rough three-dimensional-modeling scenes
NASA Astrophysics Data System (ADS)
Wang, Qi; Hu, Xiaopeng
2016-11-01
Visual search is a fundamental technology in the computer vision community. It is difficult to find an object in complex scenes when there exist similar distracters in the background. We propose a target search method in rough three-dimensional-modeling scenes based on a vision salience theory and camera imaging model. We give the definition of salience of objects (or features) and explain the way that salience measurements of objects are calculated. Also, we present one type of search path that guides the search to the target through salient objects. Along the search path, when the previous objects are localized, the search region of each subsequent object decreases; the reduced region is calculated through the imaging model and an optimization method. The experimental results indicate that the proposed method is capable of resolving the ambiguities resulting from distracters whose visual features are similar to those of the target, leading to an improvement of search speed by over 50%.
Neural correlates of contextual cueing are modulated by explicit learning.
Westerberg, Carmen E; Miller, Brennan B; Reber, Paul J; Cohen, Neal J; Paller, Ken A
2011-10-01
Contextual cueing refers to the facilitated ability to locate a particular visual element in a scene due to prior exposure to the same scene. This facilitation is thought to reflect implicit learning, as it typically occurs without the observer's knowledge that scenes repeat. Unlike most other implicit learning effects, contextual cueing can be impaired following damage to the medial temporal lobe. Here we investigated neural correlates of contextual cueing and explicit scene memory in two participant groups. Only one group was explicitly instructed about scene repetition. Participants viewed a sequence of complex scenes that depicted a landscape with five abstract geometric objects. Superimposed on each object was a letter T or L rotated left or right by 90°. Participants responded according to the target letter (T) orientation. Responses were highly accurate for all scenes. Response speeds were faster for repeated versus novel scenes. The magnitude of this contextual cueing did not differ between the two groups. Also, in both groups repeated scenes yielded reduced hemodynamic activation compared with novel scenes in several regions involved in visual perception and attention, and reductions in some of these areas were correlated with response-time facilitation. In the group given instructions about scene repetition, recognition memory for scenes was superior and was accompanied by medial temporal and more anterior activation. Thus, strategic factors can promote explicit memorization of visual scene information, which appears to engage additional neural processing beyond what is required for implicit learning of object configurations and target locations in a scene. Copyright © 2011 Elsevier Ltd. All rights reserved.
Neural correlates of contextual cueing are modulated by explicit learning
Westerberg, Carmen E.; Miller, Brennan B.; Reber, Paul J.; Cohen, Neal J.; Paller, Ken A.
2011-01-01
Contextual cueing refers to the facilitated ability to locate a particular visual element in a scene due to prior exposure to the same scene. This facilitation is thought to reflect implicit learning, as it typically occurs without the observer’s knowledge that scenes repeat. Unlike most other implicit learning effects, contextual cueing can be impaired following damage to the medial temporal lobe. Here we investigated neural correlates of contextual cueing and explicit scene memory in two participant groups. Only one group was explicitly instructed about scene repetition. Participants viewed a sequence of complex scenes that depicted a landscape with five abstract geometric objects. Superimposed on each object was a letter T or L rotated left or right by 90°. Participants responded according to the target letter (T) orientation. Responses were highly accurate for all scenes. Response speeds were faster for repeated versus novel scenes. The magnitude of this contextual cueing did not differ between the two groups. Also, in both groups repeated scenes yielded reduced hemodynamic activation compared with novel scenes in several regions involved in visual perception and attention, and reductions in some of these areas were correlated with response-time facilitation. In the group given instructions about scene repetition, recognition memory for scenes was superior and was accompanied by medial temporal and more anterior activation. Thus, strategic factors can promote explicit memorization of visual scene information, which appears to engage additional neural processing beyond what is required for implicit learning of object configurations and target locations in a scene. PMID:21889947
Bekhtereva, Valeria; Müller, Matthias M
2017-10-01
Is color a critical feature in emotional content extraction and involuntary attentional orienting toward affective stimuli? Here we used briefly presented emotional distractors to investigate the extent to which color information can influence the time course of attentional bias in early visual cortex. While participants performed a demanding visual foreground task, complex unpleasant and neutral background images were displayed in color or grayscale format for a short period of 133 ms and were immediately masked. Such a short presentation poses a challenge for visual processing. In the visual detection task, participants attended to flickering squares that elicited the steady-state visual evoked potential (SSVEP), allowing us to analyze the temporal dynamics of the competition for processing resources in early visual cortex. Concurrently we measured the visual event-related potentials (ERPs) evoked by the unpleasant and neutral background scenes. The results showed (a) that the distraction effect was greater with color than with grayscale images and (b) that it lasted longer with colored unpleasant distractor images. Furthermore, classical and mass-univariate ERP analyses indicated that, when presented in color, emotional scenes elicited more pronounced early negativities (N1-EPN) relative to neutral scenes, than when the scenes were presented in grayscale. Consistent with neural data, unpleasant scenes were rated as being more emotionally negative and received slightly higher arousal values when they were shown in color than when they were presented in grayscale. Taken together, these findings provide evidence for the modulatory role of picture color on a cascade of coordinated perceptual processes: by facilitating the higher-level extraction of emotional content, color influences the duration of the attentional bias to briefly presented affective scenes in lower-tier visual areas.
Kahn, Itamar; Wig, Gagan S.; Schacter, Daniel L.
2012-01-01
Asymmetrical specialization of cognitive processes across the cerebral hemispheres is a hallmark of healthy brain development and an important evolutionary trait underlying higher cognition in humans. While previous research, including studies of priming, divided visual field presentation, and split-brain patients, demonstrates a general pattern of right/left asymmetry of form-specific versus form-abstract visual processing, little is known about brain organization underlying this dissociation. Here, using repetition priming of complex visual scenes and high-resolution functional magnetic resonance imaging (MRI), we demonstrate asymmetrical form specificity of visual processing between the right and left hemispheres within a region known to be critical for processing of visual spatial scenes (parahippocampal place area [PPA]). Next, we use resting-state functional connectivity MRI analyses to demonstrate that this functional asymmetry is associated with differential intrinsic activity correlations of the right versus left PPA with regions critically involved in perceptual versus conceptual processing, respectively. Our results demonstrate that the PPA comprises lateralized subregions across the cerebral hemispheres that are engaged in functionally dissociable yet complementary components of visual scene analysis. Furthermore, this functional asymmetry is associated with differential intrinsic functional connectivity of the PPA with distinct brain areas known to mediate dissociable cognitive processes. PMID:21968568
Wilkinson, Krista M; Light, Janice
2011-12-01
Many individuals with complex communication needs may benefit from visual aided augmentative and alternative communication systems. In visual scene displays (VSDs), language concepts are embedded into a photograph of a naturalistic event. Humans play a central role in communication development and might be important elements in VSDs. However, many VSDs omit human figures. In this study, the authors sought to describe the distribution of visual attention to humans in naturalistic scenes as compared with other elements. Nineteen college students observed 8 photographs in which a human figure appeared near 1 or more items that might be expected to compete for visual attention (such as a Christmas tree or a table loaded with food). Eye-tracking technology allowed precise recording of participants' gaze. The fixation duration over a 7-s viewing period and latency to view elements in the photograph were measured. Participants fixated on the human figures more rapidly and for longer than expected based on the size of these figures, regardless of the other elements in the scene. Human figures attract attention in a photograph even when presented alongside other attractive distracters. Results suggest that humans may be a powerful means to attract visual attention to key elements in VSDs.
Stevens, W Dale; Kahn, Itamar; Wig, Gagan S; Schacter, Daniel L
2012-08-01
Asymmetrical specialization of cognitive processes across the cerebral hemispheres is a hallmark of healthy brain development and an important evolutionary trait underlying higher cognition in humans. While previous research, including studies of priming, divided visual field presentation, and split-brain patients, demonstrates a general pattern of right/left asymmetry of form-specific versus form-abstract visual processing, little is known about brain organization underlying this dissociation. Here, using repetition priming of complex visual scenes and high-resolution functional magnetic resonance imaging (MRI), we demonstrate asymmetrical form specificity of visual processing between the right and left hemispheres within a region known to be critical for processing of visual spatial scenes (parahippocampal place area [PPA]). Next, we use resting-state functional connectivity MRI analyses to demonstrate that this functional asymmetry is associated with differential intrinsic activity correlations of the right versus left PPA with regions critically involved in perceptual versus conceptual processing, respectively. Our results demonstrate that the PPA comprises lateralized subregions across the cerebral hemispheres that are engaged in functionally dissociable yet complementary components of visual scene analysis. Furthermore, this functional asymmetry is associated with differential intrinsic functional connectivity of the PPA with distinct brain areas known to mediate dissociable cognitive processes.
NASA Astrophysics Data System (ADS)
Qi, K.; Qingfeng, G.
2017-12-01
With the popular use of High-Resolution Satellite (HRS) images, more and more research efforts have been placed on land-use scene classification. However, the complex backgrounds and multiple land-cover classes or objects in HRS images make the task difficult. This article presents a multiscale deeply described correlation model for land-use scene classification. Specifically, the convolutional neural network is introduced to learn and characterize the local features at different scales. Then, learnt multiscale deep features are explored to generate visual words. The spatial arrangement of visual words is achieved through the introduction of adaptive vector quantized correlograms at different scales. Experiments on two publicly available land-use scene datasets demonstrate that the proposed model is compact and yet discriminative for efficient representation of land-use scene images, and achieves competitive classification results with the state-of-the-art methods.
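The step of turning local features into visual words can be sketched as follows. This is a simplified illustration, not the authors' model: it assumes a small, already-learned codebook and two-dimensional toy descriptors, and it builds a plain bag-of-visual-words histogram rather than the adaptive vector quantized correlograms named in the abstract, so it ignores the spatial arrangement of the words. All names and values are hypothetical.

```python
import math


def nearest_codeword(descriptor, codebook):
    """Return the index of the codeword closest to the descriptor (Euclidean)."""
    best_index, best_distance = 0, math.inf
    for index, word in enumerate(codebook):
        distance = math.dist(descriptor, word)
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index


def visual_word_histogram(descriptors, codebook):
    """Quantize each local descriptor to a visual word and return the
    normalized word-frequency histogram for the image."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [count / total for count in counts]


# Hypothetical codebook of three visual words and four local descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.1), (0.2, 0.1), (0.1, 0.9)]
histogram = visual_word_histogram(descriptors, codebook)
```

A correlogram would additionally record how often pairs of visual words co-occur at given spatial distances, which is the part of the representation this sketch leaves out.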
Neuroscience-Enabled Complex Visual Scene Understanding
2012-04-12
some cases, it is hard to precisely say where or what we are looking at since a complex task governs eye fixations, for example in driving. While in...another objects ( say a door) can be resolved using the prior information about the scene. This knowledge can be provided from gist models, such as one...separation and combination of class-dependent features for handwriting recognition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 21, no. 10, pp. 1089
Visual flight control in naturalistic and artificial environments.
Baird, Emily; Dacke, Marie
2012-12-01
Although the visual flight control strategies of flying insects have evolved to cope with the complexity of the natural world, studies investigating this behaviour have typically been performed indoors using simplified two-dimensional artificial visual stimuli. How well do the results from these studies reflect the natural behaviour of flying insects considering the radical differences in contrast, spatial composition, colour and dimensionality between these visual environments? Here, we aim to answer this question by investigating the effect of three- and two-dimensional naturalistic and artificial scenes on bumblebee flight control in an outdoor setting and compare the results with those of similar experiments performed in an indoor setting. In particular, we focus on investigating the effect of axial (front-to-back) visual motion cues on ground speed and centring behaviour. Our results suggest that, in general, ground speed control and centring behaviour in bumblebees are not affected by whether the visual scene is two- or three-dimensional, naturalistic or artificial, or whether the experiment is conducted indoors or outdoors. The only effect that we observe between naturalistic and artificial scenes on flight control is that when the visual scene is three-dimensional and the visual information on the floor is minimised, bumblebees fly further from the midline of the tunnel. The findings presented here have implications not only for understanding the mechanisms of visual flight control in bumblebees, but also for the results of past and future investigations into visually guided flight control in other insects.
Rapid discrimination of visual scene content in the human brain.
Anokhin, Andrey P; Golosheykin, Simon; Sirevaag, Erik; Kristjansson, Sean; Rohrbaugh, John W; Heath, Andrew C
2006-06-06
The rapid evaluation of complex visual environments is critical for an organism's adaptation and survival. Previous studies have shown that emotionally significant visual scenes, both pleasant and unpleasant, elicit a larger late positive wave in the event-related brain potential (ERP) than emotionally neutral pictures. The purpose of the present study was to examine whether neuroelectric responses elicited by complex pictures discriminate between specific, biologically relevant contents of the visual scene and to determine how early in the picture processing this discrimination occurs. Subjects (n = 264) viewed 55 color slides differing in both scene content and emotional significance. No categorical judgments or responses were required. Consistent with previous studies, we found that emotionally arousing pictures, regardless of their content, produce a larger late positive wave than neutral pictures. However, when pictures were further categorized by content, anterior ERP components in a time window between 200 and 600 ms following stimulus onset showed a high selectivity for pictures with erotic content compared to other pictures regardless of their emotional valence (pleasant, neutral, and unpleasant) or emotional arousal. The divergence of ERPs elicited by erotic and non-erotic contents started at 185 ms post-stimulus in the fronto-central midline region, with a later onset in parietal regions. This rapid, selective, and content-specific processing of erotic materials and its dissociation from other pictures (including emotionally positive pictures) suggests the existence of a specialized neural network for prioritized processing of a distinct category of biologically relevant stimuli with high adaptive and evolutionary significance.
Rapid discrimination of visual scene content in the human brain
Anokhin, Andrey P.; Golosheykin, Simon; Sirevaag, Erik; Kristjansson, Sean; Rohrbaugh, John W.; Heath, Andrew C.
2007-01-01
The rapid evaluation of complex visual environments is critical for an organism's adaptation and survival. Previous studies have shown that emotionally significant visual scenes, both pleasant and unpleasant, elicit a larger late positive wave in the event-related brain potential (ERP) than emotionally neutral pictures. The purpose of the present study was to examine whether neuroelectric responses elicited by complex pictures discriminate between specific, biologically relevant contents of the visual scene and to determine how early in the picture processing this discrimination occurs. Subjects (n=264) viewed 55 color slides differing in both scene content and emotional significance. No categorical judgments or responses were required. Consistent with previous studies, we found that emotionally arousing pictures, regardless of their content, produce a larger late positive wave than neutral pictures. However, when pictures were further categorized by content, anterior ERP components in a time window between 200−600 ms following stimulus onset showed a high selectivity for pictures with erotic content compared to other pictures regardless of their emotional valence (pleasant, neutral, and unpleasant) or emotional arousal. The divergence of ERPs elicited by erotic and non-erotic contents started at 185 ms post-stimulus in the fronto-central midline regions, with a later onset in parietal regions. This rapid, selective, and content-specific processing of erotic materials and its dissociation from other pictures (including emotionally positive pictures) suggests the existence of a specialized neural network for prioritized processing of a distinct category of biologically relevant stimuli with high adaptive and evolutionary significance. PMID:16712815
Liao, Pin-Chao; Sun, Xinlu; Liu, Mei; Shih, Yu-Nien
2018-01-11
Navigated safety inspection based on task-specific checklists can increase the hazard detection rate, although scene complexity can, in theory, interfere with it. Visual clutter, a proxy of scene complexity, can theoretically impair visual search performance, but its impact on safety inspection performance remains to be explored for the optimization of navigated inspection. This research aims to explore whether the relationship between working memory and hazard detection rate is moderated by visual clutter. Based on a perceptive model of hazard detection, we: (a) developed a mathematical influence model for construction hazard detection; (b) designed an experiment to observe the hazard detection rate with adjusted working memory under different levels of visual clutter, while using an eye-tracking device to observe participants' visual search processes; (c) utilized logistic regression to analyze the developed model under various levels of visual clutter. The effect of a strengthened working memory on the detection rate through increased search efficiency is more apparent in high visual clutter. This study confirms the role of visual clutter in construction-navigated inspections, thus serving as a foundation for the optimization of inspection planning.
Azzopardi, George; Petkov, Nicolai
2014-01-01
The remarkable abilities of the primate visual system have inspired the construction of computational models of some visual neurons. We propose a trainable hierarchical object recognition model, which we call S-COSFIRE (S stands for Shape and COSFIRE stands for Combination Of Shifted FIlter REsponses) and use it to localize and recognize objects of interest embedded in complex scenes. It is inspired by the visual processing in the ventral stream (V1/V2 → V4 → TEO). Recognition and localization of objects embedded in complex scenes is important for many computer vision applications. Most existing methods require prior segmentation of the objects from the background, which in turn requires recognition. An S-COSFIRE filter is automatically configured to be selective for an arrangement of contour-based features that belong to a prototype shape specified by an example. The configuration comprises selecting relevant vertex detectors and determining certain blur and shift parameters. The response is computed as the weighted geometric mean of the blurred and shifted responses of the selected vertex detectors. S-COSFIRE filters share similar properties with some neurons in inferotemporal cortex, which provided inspiration for this work. We demonstrate the effectiveness of S-COSFIRE filters in two applications: letter and keyword spotting in handwritten manuscripts and object spotting in complex scenes for the computer vision system of a domestic robot. S-COSFIRE filters are effective at recognizing and localizing (deformable) objects in images of complex scenes without requiring prior segmentation. They are versatile trainable shape detectors, conceptually simple and easy to implement. The presented hierarchical shape representation contributes to a better understanding of the brain and to more robust computer vision algorithms. PMID:25126068
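The final combination step named in this abstract, a weighted geometric mean of detector responses, can be sketched as below. This is a minimal illustration under assumptions: the function name, the example responses, and the weights are hypothetical, and the blurring and shifting that precede this step in the model are not shown.

```python
import math


def weighted_geometric_mean(responses, weights):
    """Combine non-negative detector responses r_i with non-negative weights w_i
    as (prod_i r_i ** w_i) ** (1 / sum_i w_i)."""
    if len(responses) != len(weights):
        raise ValueError("responses and weights must have the same length")
    if any(r < 0 for r in responses) or any(w < 0 for w in weights):
        raise ValueError("responses and weights must be non-negative")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    # A zero response with positive weight forces the whole product to zero.
    if any(r == 0 and w > 0 for r, w in zip(responses, weights)):
        return 0.0
    # Work in log space to avoid underflow when many small responses are multiplied.
    log_sum = sum(w * math.log(r) for r, w in zip(responses, weights) if w > 0)
    return math.exp(log_sum / total_weight)


# Hypothetical responses of three vertex detectors at one image location.
all_present = weighted_geometric_mean([0.8, 0.6, 0.9], [1.0, 0.5, 1.0])
one_missing = weighted_geometric_mean([0.8, 0.0, 0.9], [1.0, 0.5, 1.0])
```

One consequence of using a geometric rather than arithmetic mean is that the combined response is zero whenever any contributing detector with positive weight is silent, so the filter responds only when all parts of the configured arrangement are present.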
Neural codes of seeing architectural styles
Choo, Heeyoung; Nasar, Jack L.; Nikrahei, Bardia; Walther, Dirk B.
2017-01-01
Images of iconic buildings, such as the CN Tower, instantly transport us to specific places, such as Toronto. Despite the substantial impact of architectural design on people’s visual experience of built environments, we know little about its neural representation in the human brain. In the present study, we have found patterns of neural activity associated with specific architectural styles in several high-level visual brain regions, but not in primary visual cortex (V1). This finding suggests that the neural correlates of the visual perception of architectural styles stem from style-specific complex visual structure beyond the simple features computed in V1. Surprisingly, the network of brain regions representing architectural styles included the fusiform face area (FFA) in addition to several scene-selective regions. Hierarchical clustering of error patterns further revealed that the FFA participated to a much larger extent in the neural encoding of architectural styles than entry-level scene categories. We conclude that the FFA is involved in fine-grained neural encoding of scenes at a subordinate-level, in our case, architectural styles of buildings. This study for the first time shows how the human visual system encodes visual aspects of architecture, one of the predominant and longest-lasting artefacts of human culture. PMID:28071765
Neural codes of seeing architectural styles.
Choo, Heeyoung; Nasar, Jack L; Nikrahei, Bardia; Walther, Dirk B
2017-01-10
Images of iconic buildings, such as the CN Tower, instantly transport us to specific places, such as Toronto. Despite the substantial impact of architectural design on people's visual experience of built environments, we know little about its neural representation in the human brain. In the present study, we have found patterns of neural activity associated with specific architectural styles in several high-level visual brain regions, but not in primary visual cortex (V1). This finding suggests that the neural correlates of the visual perception of architectural styles stem from style-specific complex visual structure beyond the simple features computed in V1. Surprisingly, the network of brain regions representing architectural styles included the fusiform face area (FFA) in addition to several scene-selective regions. Hierarchical clustering of error patterns further revealed that the FFA participated to a much larger extent in the neural encoding of architectural styles than entry-level scene categories. We conclude that the FFA is involved in fine-grained neural encoding of scenes at a subordinate-level, in our case, architectural styles of buildings. This study for the first time shows how the human visual system encodes visual aspects of architecture, one of the predominant and longest-lasting artefacts of human culture.
Miskovic, Vladimir; Martinovic, Jasna; Wieser, Matthias M.; Petro, Nathan M.; Bradley, Margaret M.; Keil, Andreas
2015-01-01
Emotionally arousing scenes readily capture visual attention, prompting amplified neural activity in sensory regions of the brain. The physical stimulus features and related information channels in the human visual system that contribute to this modulation, however, are not known. Here, we manipulated low-level physical parameters of complex scenes varying in hedonic valence and emotional arousal in order to target the relative contributions of luminance-based versus chromatic visual channels to emotional perception. Stimulus-evoked brain electrical activity was measured during picture viewing and used to quantify neural responses sensitive to lower-tier visual cortical involvement (steady-state visual evoked potentials) as well as the late positive potential, reflecting a more distributed cortical event. Results showed that the enhancement for emotional content was stimulus-selective when examining the steady-state segments of the evoked visual potentials. Response amplification was present only for low spatial frequency, grayscale stimuli, and not for high spatial frequency, red/green stimuli. In contrast, the late positive potential was modulated by emotion regardless of the scene’s physical properties. Our findings are discussed in relation to neurophysiologically plausible constraints operating at distinct stages of the cortical processing stream. PMID:25640949
How high is visual short-term memory capacity for object layout?
Sanocki, Thomas; Sellers, Eric; Mittelstadt, Jeff; Sulman, Noah
2010-05-01
Previous research measuring visual short-term memory (VSTM) suggests that the capacity for representing the layout of objects is fairly high. In four experiments, we further explored the capacity of VSTM for layout of objects, using the change detection method. In Experiment 1, participants retained most of the elements in displays of 4 to 8 elements. In Experiments 2 and 3, with up to 20 elements, participants retained many of them, reaching a capacity of 13.4 stimulus elements. In Experiment 4, participants retained much of a complex naturalistic scene. In most cases, increasing display size caused only modest reductions in performance, consistent with the idea of configural, variable-resolution grouping. The results indicate that participants can retain a substantial amount of scene layout information (objects and locations) in short-term memory. We propose that this is a case of remote visual understanding, where observers' ability to integrate information from a scene is paramount.
Visual supports for shared reading with young children: the effect of static overlay design.
Wood Jackson, Carla; Wahlquist, Jordan; Marquis, Cassandra
2011-06-01
This study examined the effects of two types of static overlay design (visual scene display and grid display) on 39 children's use of a speech-generating device during shared storybook reading with an adult. This pilot project included two groups: preschool children with typical communication skills (n = 26) and with complex communication needs (n = 13). All participants engaged in shared reading with two books using each visual layout on a speech-generating device (SGD). The children averaged a greater number of activations when presented with a grid display during introductory exploration and free play. There was a large effect of the static overlay design on the number of silent hits, evidencing more silent hits with visual scene displays. On average, the children demonstrated relatively few spontaneous activations of the speech-generating device while the adult was reading, regardless of overlay design. When responding to questions, children with communication needs appeared to perform better when using visual scene displays, but the effect of display condition on the accuracy of responses to wh-questions was not statistically significant. In response to an open-ended question, children with communication disorders demonstrated more frequent activations of the SGD using a grid display than a visual scene display. Suggestions for future research as well as potential implications for designing AAC systems for shared reading with young children are discussed.
Automatic acquisition of motion trajectories: tracking hockey players
NASA Astrophysics Data System (ADS)
Okuma, Kenji; Little, James J.; Lowe, David
2003-12-01
Computer systems that have the capability of analyzing complex and dynamic scenes play an essential role in video annotation. Scenes can be complex in such a way that there are many cluttered objects with different colors, shapes and sizes, and can be dynamic with multiple interacting moving objects and a constantly changing background. In reality, there are many scenes that are complex, dynamic, and challenging enough for computers to describe. These scenes include games of sports, air traffic, car traffic, street intersections, and cloud transformations. Our research is about the challenge of inventing a descriptive computer system that analyzes scenes of hockey games where multiple moving players interact with each other on a constantly moving background due to camera motions. Ultimately, such a computer system should be able to acquire reliable data by extracting the players' motion as their trajectories, querying them by analyzing the descriptive information of data, and predicting the motions of some hockey players based on the result of the query. Among these three major aspects of the system, we primarily focus on visual information of the scenes, that is, how to automatically acquire motion trajectories of hockey players from video. More accurately, we automatically analyze the hockey scenes by estimating parameters (i.e., pan, tilt, and zoom) of the broadcast cameras, tracking hockey players in those scenes, and constructing a visual description of the data by displaying trajectories of those players. Many technical problems in vision such as fast and unpredictable players' motions and rapid camera motions make our challenge worth tackling. To the best of our knowledge, there have not been any automatic video annotation systems for hockey developed in the past. Although there are many obstacles to overcome, our efforts and accomplishments would hopefully establish the infrastructure of the automatic hockey annotation system and become a milestone for research in automatic video annotation in this domain.
Acute stress influences the discrimination of complex scenes and complex faces in young healthy men.
Paul, M; Lech, R K; Scheil, J; Dierolf, A M; Suchan, B; Wolf, O T
2016-04-01
The stress-induced release of glucocorticoids has been demonstrated to influence hippocampal functions via the modulation of specific receptors. At the behavioral level stress is known to influence hippocampus-dependent long-term memory. In recent years, studies have consistently associated the hippocampus with the non-mnemonic perception of scenes, while adjacent regions in the medial temporal lobe were associated with the perception of objects and faces. So far it is not known whether and how stress influences non-mnemonic perceptual processes. In a behavioral study, fifty male participants were subjected either to the stressful socially evaluated cold-pressor test or to a non-stressful control procedure, before they completed a visual discrimination task, comprising scenes and faces. The complexity of the face and scene stimuli was manipulated in easy and difficult conditions. A significant three-way interaction between stress, stimulus type and complexity was found. Stressed participants tended to commit more errors in the complex scenes condition. For complex faces a descriptive tendency in the opposite direction (fewer errors under stress) was observed. As a result the difference between the number of errors for scenes and errors for faces was significantly larger in the stress group. These results indicate that, beyond the effects of stress on long-term memory, stress influences the discrimination of spatial information, especially when the perception is characterized by a high complexity. Copyright © 2016 Elsevier Ltd. All rights reserved.
The Effects of Similarity on High-Level Visual Working Memory Processing.
Yang, Li; Mo, Lei
2017-01-01
Similarity has been observed to have opposite effects on visual working memory (VWM) for complex images. How can these discrepant results be reconciled? To answer this question, we used a change-detection paradigm to test visual working memory performance for multiple real-world objects. We found that working memory for moderate similarity items was worse than that for either high or low similarity items. This pattern was unaffected by manipulations of stimulus type (faces vs. scenes), encoding duration (limited vs. self-paced), and presentation format (simultaneous vs. sequential). We also found that the similarity effects differed in strength in different categories (scenes vs. faces). These results suggest that complex real-world objects are represented using a centre-surround inhibition organization. These results support the category-specific cortical resource theory and further suggest that centre-surround inhibition organization may differ by category.
Top-down control of visual perception: attention in natural vision.
Rolls, Edmund T
2008-01-01
Top-down perceptual influences can bias (or pre-empt) perception. In natural scenes, the receptive fields of neurons in the inferior temporal visual cortex (IT) shrink to become close to the size of objects. This facilitates the read-out of information from the ventral visual system, because the information is primarily about the object at the fovea. Top-down attentional influences are much less evident in natural scenes than when objects are shown against blank backgrounds, though are still present. It is suggested that the reduced receptive-field size in natural scenes, and the effects of top-down attention contribute to change blindness. The receptive fields of IT neurons in complex scenes, though including the fovea, are frequently asymmetric around the fovea, and it is proposed that this is the solution the IT uses to represent multiple objects and their relative spatial positions in a scene. Networks that implement probabilistic decision-making are described, and it is suggested that, when in perceptual systems they take decisions (or 'test hypotheses'), they influence lower-level networks to bias visual perception. Finally, it is shown that similar processes extend to systems involved in the processing of emotion-provoking sensory stimuli, in that word-level cognitive states provide top-down biasing that reaches as far down as the orbitofrontal cortex, where, at the first stage of affective representations, olfactory, taste, flavour, and touch processing is biased (or pre-empted) in humans.
Scan patterns when viewing natural scenes: Emotion, complexity, and repetition
Bradley, Margaret M.; Houbova, Petra; Miccoli, Laura; Costa, Vincent D.; Lang, Peter J.
2011-01-01
Eye movements were monitored during picture viewing and effects of hedonic content, perceptual composition, and repetition on scanning assessed. In Experiment 1, emotional and neutral pictures that were figure-ground compositions or more complex scenes were presented for a 6 s free viewing period. Viewing emotional pictures or complex scenes prompted more fixations and broader scanning of the visual array, compared to neutral pictures or simple figure-ground compositions. Effects of emotion and composition were independent, supporting the hypothesis that these oculomotor indices reflect enhanced information seeking. Experiment 2 tested an orienting hypothesis by repeatedly presenting the same pictures. Although repetition altered specific scan patterns, emotional, compared to neutral, picture viewing continued to prompt oculomotor differences, suggesting that motivationally relevant cues enhance information seeking in appetitive and defensive contexts. PMID:21649664
Differential Visual Processing of Animal Images, with and without Conscious Awareness
Zhu, Weina; Drewes, Jan; Peatfield, Nicholas A.; Melcher, David
2016-01-01
The human visual system can quickly and efficiently extract categorical information from a complex natural scene. The rapid detection of animals in a scene is one compelling example of this phenomenon, and it suggests the automatic processing of at least some types of categories with little or no attentional requirements (Li et al., 2002, 2005). The aim of this study is to investigate whether the remarkable capability to categorize complex natural scenes exists in the absence of awareness, based on recent reports that “invisible” stimuli, which do not reach conscious awareness, can still be processed by the human visual system (Pasley et al., 2004; Williams et al., 2004; Fang and He, 2005; Jiang et al., 2006, 2007; Kaunitz et al., 2011a). In two experiments, we recorded event-related potentials (ERPs) in response to animal and non-animal/vehicle stimuli in both aware and unaware conditions in a continuous flash suppression (CFS) paradigm. Our results indicate that even in the “unseen” condition, the brain responds differently to animal and non-animal/vehicle images, consistent with rapid activation of animal-selective feature detectors prior to, or outside of, suppression by the CFS mask. PMID:27790106
Differential Visual Processing of Animal Images, with and without Conscious Awareness.
Zhu, Weina; Drewes, Jan; Peatfield, Nicholas A; Melcher, David
2016-01-01
The human visual system can quickly and efficiently extract categorical information from a complex natural scene. The rapid detection of animals in a scene is one compelling example of this phenomenon, and it suggests the automatic processing of at least some types of categories with little or no attentional requirements (Li et al., 2002, 2005). The aim of this study is to investigate whether the remarkable capability to categorize complex natural scenes exists in the absence of awareness, based on recent reports that "invisible" stimuli, which do not reach conscious awareness, can still be processed by the human visual system (Pasley et al., 2004; Williams et al., 2004; Fang and He, 2005; Jiang et al., 2006, 2007; Kaunitz et al., 2011a). In two experiments, we recorded event-related potentials (ERPs) in response to animal and non-animal/vehicle stimuli in both aware and unaware conditions in a continuous flash suppression (CFS) paradigm. Our results indicate that even in the "unseen" condition, the brain responds differently to animal and non-animal/vehicle images, consistent with rapid activation of animal-selective feature detectors prior to, or outside of, suppression by the CFS mask.
Wagner, Dylan D; Kelley, William M; Heatherton, Todd F
2011-12-01
People are able to rapidly infer complex personality traits and mental states even from the most minimal person information. Research has shown that when observers view a natural scene containing people, they spend a disproportionate amount of their time looking at the social features (e.g., faces, bodies). Does this preference for social features merely reflect the biological salience of these features or are observers spontaneously attempting to make sense of complex social dynamics? Using functional neuroimaging, we investigated neural responses to social and nonsocial visual scenes in a large sample of participants (n = 48) who varied on an individual difference measure assessing empathy and mentalizing (i.e., empathizing). Compared with other scene categories, viewing natural social scenes activated regions associated with social cognition (e.g., dorsomedial prefrontal cortex and temporal poles). Moreover, activity in these regions during social scene viewing was strongly correlated with individual differences in empathizing. These findings offer neural evidence that observers spontaneously engage in social cognition when viewing complex social material but that the degree to which people do so is mediated by individual differences in trait empathizing.
Assessing Multiple Object Tracking in Young Children Using a Game
ERIC Educational Resources Information Center
Ryokai, Kimiko; Farzin, Faraz; Kaltman, Eric; Niemeyer, Greg
2013-01-01
Visual tracking of multiple objects in a complex scene is a critical survival skill. When we attempt to safely cross a busy street, follow a ball's position during a sporting event, or monitor children in a busy playground, we rely on our brain's capacity to selectively attend to and track the position of specific objects in a dynamic scene. This…
Psychophysical Criteria for Visual Simulation Systems.
1980-05-01
definitive data were found to establish detection thresholds; therefore, this is one area where a psychophysical study was recommended. Differential size...The specific functional relationships needing quantification were the following: 1. The effect of Horizontal Aniseikonia on Target Detection and...Transition Technique 6. The Effects of Scene Complexity and Separation on the Detection of Scene Misalignment 7. Absolute Brightness Levels in
Lescroart, Mark D.; Stansbury, Dustin E.; Gallant, Jack L.
2015-01-01
Perception of natural visual scenes activates several functional areas in the human brain, including the Parahippocampal Place Area (PPA), Retrosplenial Complex (RSC), and the Occipital Place Area (OPA). It is currently unclear what specific scene-related features are represented in these areas. Previous studies have suggested that PPA, RSC, and/or OPA might represent at least three qualitatively different classes of features: (1) 2D features related to Fourier power; (2) 3D spatial features such as the distance to objects in a scene; or (3) abstract features such as the categories of objects in a scene. To determine which of these hypotheses best describes the visual representation in scene-selective areas, we applied voxel-wise modeling (VM) to BOLD fMRI responses elicited by a set of 1386 images of natural scenes. VM provides an efficient method for testing competing hypotheses by comparing predictions of brain activity based on encoding models that instantiate each hypothesis. Here we evaluated three different encoding models that instantiate each of the three hypotheses listed above. We used linear regression to fit each encoding model to the fMRI data recorded from each voxel, and we evaluated each fit model by estimating the amount of variance it predicted in a withheld portion of the data set. We found that voxel-wise models based on Fourier power or the subjective distance to objects in each scene predicted much of the variance predicted by a model based on object categories. Furthermore, the response variance explained by these three models is largely shared, and the individual models explain little unique variance in responses. Based on an evaluation of previous studies and the data we present here, we conclude that there is currently no good basis to favor any one of the three alternative hypotheses about visual representation in scene-selective areas. We offer suggestions for further studies that may help resolve this issue. PMID:26594164
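As a rough illustration of the model-comparison logic described in the abstract above (fit each encoding model by linear regression, then score it by the variance it predicts in withheld data), the following is a minimal sketch on synthetic data. It is not the authors' pipeline: the function name `heldout_r2`, the feature matrices, the weights, and the train/test split are all assumptions made for this example.

```python
import numpy as np

def heldout_r2(features, response, n_train):
    """Fit ordinary least squares on the first n_train rows; return R^2 on the rest."""
    weights, *_ = np.linalg.lstsq(features[:n_train], response[:n_train], rcond=None)
    predicted = features[n_train:] @ weights
    actual = response[n_train:]
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

rng = np.random.default_rng(0)
n_images, n_train = 200, 150

# Two hypothetical feature spaces, one row per image (synthetic stand-ins only).
model_a = rng.standard_normal((n_images, 5))
model_b = rng.standard_normal((n_images, 5))

# Synthetic "voxel" response driven by model A plus a little noise.
true_weights = np.array([1.0, -2.0, 0.5, 1.5, -1.0])
voxel = model_a @ true_weights + 0.1 * rng.standard_normal(n_images)

r2_a = heldout_r2(model_a, voxel, n_train)
r2_b = heldout_r2(model_b, voxel, n_train)

# Variance partitioning: what model A adds beyond model B in a joint model.
r2_joint = heldout_r2(np.hstack([model_a, model_b]), voxel, n_train)
unique_a = r2_joint - r2_b
```

In this synthetic case the two feature spaces are independent, so nearly all explained variance is unique to model A. The abstract reports the opposite situation for real scene features, where the explained variance is largely shared and the unique contributions are small.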
Moving through a multiplex holographic scene
NASA Astrophysics Data System (ADS)
Mrongovius, Martina
2013-02-01
This paper explores how movement can be used as a compositional element in installations of multiplex holograms. My holographic images are created from montages of hand-held video and photo-sequences. These spatially dynamic compositions are visually complex but anchored to landmarks and hints of the capturing process - such as the appearance of the photographer's shadow - to establish a sense of connection to the holographic scene. Moving around in front of the hologram, the viewer animates the holographic scene. A perception of motion then results from the viewer's bodily awareness of physical motion and the visual reading of dynamics within the scene or movement of perspective through a virtual suggestion of space. By linking and transforming the physical motion of the viewer with the visual animation, the viewer's bodily awareness - including proprioception, balance and orientation - play into the holographic composition. How multiplex holography can be a tool for exploring coupled, cross-referenced and transformed perceptions of movement is demonstrated with a number of holographic image installations. Through this process I expanded my creative composition practice to consider how dynamic and spatial scenes can be conveyed through the fragmented view of a multiplex hologram. This body of work was developed through an installation art practice and was the basis of my recently completed doctoral thesis: 'The Emergent Holographic Scene — compositions of movement and affect using multiplex holographic images'.
A review of visual perception mechanisms that regulate rapid adaptive camouflage in cuttlefish.
Chiao, Chuan-Chin; Chubb, Charles; Hanlon, Roger T
2015-09-01
We review recent research on the visual mechanisms of rapid adaptive camouflage in cuttlefish. These neurophysiologically complex marine invertebrates can camouflage themselves against almost any background, yet their ability to quickly (0.5-2 s) alter their body patterns on different visual backgrounds poses a vexing challenge: how to pick the correct body pattern amongst their repertoire. The ability of cuttlefish to change appropriately requires a visual system that can rapidly assess complex visual scenes and produce the motor responses (the neurally controlled body patterns) that achieve camouflage. Using specifically designed visual backgrounds and assessing the corresponding body patterns quantitatively, we and others have uncovered several aspects of scene variation that are important in regulating cuttlefish patterning responses. These include spatial scale of background pattern, background intensity, background contrast, object edge properties, object contrast polarity, object depth, and the presence of 3D objects. Moreover, arm postures and skin papillae are also regulated visually for additional aspects of concealment. By integrating these visual cues, cuttlefish are able to rapidly select appropriate body patterns for concealment throughout diverse natural environments. This sensorimotor approach of studying cuttlefish camouflage thus provides unique insights into the mechanisms of visual perception in an invertebrate image-forming eye.
Thiessen, Amber; Beukelman, David; Hux, Karen; Longenecker, Maria
2016-04-01
The purpose of the study was to compare the visual attention patterns of adults with aphasia and adults without neurological conditions when viewing visual scenes with 2 types of engagement. Eye-tracking technology was used to measure the visual attention patterns of 10 adults with aphasia and 10 adults without neurological conditions. Participants viewed camera-engaged (i.e., human figure facing camera) and task-engaged (i.e., human figure looking at and touching an object) visual scenes. Participants with aphasia responded to engagement cues by focusing on objects of interest more for task-engaged scenes than camera-engaged scenes; however, the differences in their responses to these scenes were not as pronounced as those observed in adults without neurological conditions. In addition, people with aphasia spent more time looking at background areas of interest and less time looking at person areas of interest for camera-engaged scenes than did control participants. Results indicate people with aphasia visually attend to scenes differently than adults without neurological conditions. As a consequence, augmentative and alternative communication (AAC) facilitators may have different visual attention behaviors than the people with aphasia for whom they are constructing or selecting visual scenes. Further examination of the visual attention of people with aphasia may help optimize visual scene selection.
Teacher Vision: Expert and Novice Teachers' Perception of Problematic Classroom Management Scenes
ERIC Educational Resources Information Center
Wolff, Charlotte E.; Jarodzka, Halszka; van den Bogert, Niek; Boshuizen, Henny P. A.
2016-01-01
Visual expertise has been explored in numerous professions, but research on teachers' vision remains limited. Teachers' visual expertise is an important professional skill, particularly the ability to simultaneously perceive and interpret classroom situations for effective classroom management. This skill is complex and relies on an awareness of…
Brayfield, Brad P.
2016-01-01
The navigation of bees and ants from hive to food and back has captivated people for more than a century. Recently, the Navigation by Scene Familiarity Hypothesis (NSFH) has been proposed as a parsimonious approach that is congruent with the limited neural elements of these insects’ brains. In the NSFH approach, an agent completes an initial training excursion, storing images along the way. To retrace the path, the agent scans the area and compares the current scenes to those previously experienced. By turning and moving to minimize the pixel-by-pixel differences between encountered and stored scenes, the agent is guided along the path without having memorized the sequence. An important premise of the NSFH is that the visual information of the environment is adequate to guide navigation without aliasing. Here we demonstrate that an image landscape of an indoor setting possesses ample navigational information. We produced a visual landscape of our laboratory and part of the adjoining corridor consisting of 2816 panoramic snapshots arranged in a grid at 12.7-cm centers. We show that pixel-by-pixel comparisons of these images yield robust translational and rotational visual information. We also produced a simple algorithm that tracks previously experienced routes within our lab based on an insect-inspired scene familiarity approach and demonstrate that adequate visual information exists for an agent to retrace complex training routes, including those where the path’s end is not visible from its origin. We used this landscape to systematically test the interplay of sensor morphology, angles of inspection, and similarity threshold with the recapitulation performance of the agent. Finally, we compared the relative information content and chance of aliasing within our visually rich laboratory landscape to scenes acquired from indoor corridors with more repetitive scenery. PMID:27119720
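The core step of the scene-familiarity approach summarized above (scan candidate headings and pick the one whose view differs least, pixel by pixel, from any stored view) can be sketched as follows. This is an illustrative toy, not the authors' algorithm: the function names, the use of mean absolute difference, the random image, and the one-column-per-heading rotation of a panorama are all assumptions made for this example.

```python
import numpy as np

def unfamiliarity(view, stored_views):
    """Smallest mean absolute pixel difference between a view and any stored view."""
    return min(float(np.mean(np.abs(view - s))) for s in stored_views)

def best_heading(panorama, stored_views, n_headings):
    """Rotate a panoramic view in equal column steps; return the most familiar step."""
    step = panorama.shape[1] // n_headings
    scores = [
        unfamiliarity(np.roll(panorama, -k * step, axis=1), stored_views)
        for k in range(n_headings)
    ]
    return int(np.argmin(scores)), scores

rng = np.random.default_rng(0)

# Hypothetical stored snapshot from a training route (float pixels avoid
# unsigned-integer wraparound when subtracting images).
training_view = rng.random((8, 36))

# Same location, but the agent is rotated by 9 columns relative to training.
current_view = np.roll(training_view, 9, axis=1)

heading, scores = best_heading(current_view, [training_view], n_headings=36)
```

Here the agent recovers the training orientation by undoing the 9-column rotation, where the pixel difference drops to zero. With real images the minimum is rarely exactly zero, which is presumably where a similarity threshold like the one the abstract mentions would come into play.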
Scan patterns when viewing natural scenes: emotion, complexity, and repetition.
Bradley, Margaret M; Houbova, Petra; Miccoli, Laura; Costa, Vincent D; Lang, Peter J
2011-11-01
Eye movements were monitored during picture viewing, and effects of hedonic content, perceptual composition, and repetition on scanning assessed. In Experiment 1, emotional and neutral pictures that were figure-ground compositions or more complex scenes were presented for a 6-s free viewing period. Viewing emotional pictures or complex scenes prompted more fixations and broader scanning of the visual array, compared to neutral pictures or simple figure-ground compositions. Effects of emotion and composition were independent, supporting the hypothesis that these oculomotor indices reflect enhanced information seeking. Experiment 2 tested an orienting hypothesis by repeatedly presenting the same pictures. Although repetition altered specific scan patterns, emotional, compared to neutral, picture viewing continued to prompt oculomotor differences, suggesting that motivationally relevant cues enhance information seeking in appetitive and defensive contexts. Copyright © 2011 Society for Psychophysiological Research.
Two Distinct Scene-Processing Networks Connecting Vision and Memory.
Baldassano, Christopher; Esteva, Andre; Fei-Fei, Li; Beck, Diane M
2016-01-01
A number of regions in the human brain are known to be involved in processing natural scenes, but the field has lacked a unifying framework for understanding how these different regions are organized and interact. We provide evidence from functional connectivity and meta-analyses for a new organizational principle, in which scene processing relies upon two distinct networks that split the classically defined parahippocampal place area (PPA). The first network of strongly connected regions consists of the occipital place area/transverse occipital sulcus and posterior PPA, which contain retinotopic maps and are not strongly coupled to the hippocampus at rest. The second network consists of the caudal inferior parietal lobule, retrosplenial complex, and anterior PPA, which connect to the hippocampus (especially anterior hippocampus), and are implicated in both visual and nonvisual tasks, including episodic memory and navigation. We propose that these two distinct networks capture the primary functional division among scene-processing regions, between those that process visual features from the current view of a scene and those that connect information from a current scene view with a much broader temporal and spatial context. This new framework for understanding the neural substrates of scene-processing bridges results from many lines of research, and makes specific functional predictions.
How visual attention is modified by disparities and textures changes?
NASA Astrophysics Data System (ADS)
Khaustova, Dar'ya; Fournier, Jérome; Wyckens, Emmanuel; Le Meur, Olivier
2013-03-01
The 3D image/video quality of experience is a multidimensional concept that depends on 2D image quality, depth quantity and visual comfort. The relationship between these parameters is not yet clearly defined. From this perspective, we aim to understand how texture complexity, depth quantity and visual comfort influence the way people observe 3D content in comparison with 2D. Six scenes with different structural parameters were generated using Blender software. For these six scenes, the following parameters were modified: texture complexity and the amount of depth, varied by changing the camera baseline and the convergence distance at the shooting side. Our study was conducted using an eye-tracker and a 3DTV display. During the eye-tracking experiment, each observer freely examined images with different depth levels and texture complexities. To avoid memory bias, we ensured that each observer had only seen scene content once. Collected fixation data were used to build saliency maps and to analyze differences between 2D and 3D conditions. Our results show that the introduction of disparity shortened saccade length; however, fixation durations remained unaffected. An analysis of the saliency maps did not reveal any differences between 2D and 3D conditions for the viewing duration of 20 s. When the whole period was divided into smaller intervals, we found that for the first 4 s the introduced disparity was conducive to the section of saliency regions. However, this contribution is quite minimal if the correlation between saliency maps is analyzed. Nevertheless, we did not find that discomfort (comfort) had any influence on visual attention. We believe that existing metrics and methods are depth insensitive and do not reveal such differences. Based on the analysis of heat maps and paired t-tests of inter-observer visual congruency values we deduced that the selected areas of interest depend on texture complexities.
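One common way to turn fixation data into saliency maps and compare two viewing conditions by correlation, as the abstract above describes in outline, is sketched below. The abstract does not specify the authors' procedure, so everything here is an assumption for illustration: the Gaussian spread `sigma`, the map size, the fixation coordinates, and the use of Pearson correlation.

```python
import numpy as np

def saliency_map(fixations, shape, sigma):
    """Sum one isotropic Gaussian per (row, col) fixation, then normalize to sum to 1."""
    rows = np.arange(shape[0])[:, None]
    cols = np.arange(shape[1])[None, :]
    sal = np.zeros(shape)
    for r, c in fixations:
        sal += np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
    return sal / sal.sum()

def map_correlation(a, b):
    """Pearson correlation between two saliency maps of the same shape."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

# Hypothetical fixation lists for the same scene viewed in 2D and in 3D.
fix_2d = [(10, 10), (20, 30), (25, 12)]
fix_3d = [(11, 10), (20, 29), (5, 35)]

map_2d = saliency_map(fix_2d, (40, 40), sigma=3.0)
map_3d = saliency_map(fix_3d, (40, 40), sigma=3.0)
cc = map_correlation(map_2d, map_3d)
```

A correlation near 1 would indicate that observers looked at the same regions in both conditions; lower values indicate diverging attention. Restricting the fixation lists to a time window (for example the first 4 s) would give the kind of interval-wise comparison the abstract mentions.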
Cultural differences in attention: Eye movement evidence from a comparative visual search task.
Alotaibi, Albandri; Underwood, Geoffrey; Smith, Alastair D
2017-10-01
Individual differences in visual attention have been linked to thinking style: analytic thinking (common in individualistic cultures) is thought to promote attention to detail and focus on the most important part of a scene, whereas holistic thinking (common in collectivist cultures) promotes attention to the global structure of a scene and the relationship between its parts. However, this theory is primarily based on relatively simple judgement tasks. We compared groups from Great Britain (an individualist culture) and Saudi Arabia (a collectivist culture) on a more complex comparative visual search task, using simple natural scenes. A higher overall number of fixations for Saudi participants, along with longer search times, indicated less efficient search behaviour than British participants. Furthermore, intra-group comparisons of scan-path for Saudi participants revealed less similarity than within the British group. Together, these findings suggest that there is a positive relationship between an analytic cognitive style and controlled attention. Copyright © 2017 Elsevier Inc. All rights reserved.
Out of Mind, Out of Sight: Unexpected Scene Elements Frequently Go Unnoticed Until Primed.
Slavich, George M; Zimbardo, Philip G
2013-12-01
The human visual system employs a sophisticated set of strategies for scanning the environment and directing attention to stimuli that can be expected given the context and a person's past experience. Although these strategies enable us to navigate a very complex physical and social environment, they can also cause highly salient, but unexpected stimuli to go completely unnoticed. To examine the generality of this phenomenon, we conducted eight studies that included 15 different experimental conditions and 1,577 participants in all. These studies revealed that a large majority of participants do not report having seen a woman in the center of an urban scene who was photographed in midair as she was committing suicide. Despite seeing the scene repeatedly, 46% of all participants failed to report seeing a central figure and only 4.8% reported seeing a falling person. Frequency of noticing the suicidal woman was highest for participants who read a narrative priming story that increased the extent to which she was schematically congruent with the scene. In contrast to this robust effect of inattentional blindness, a majority of participants reported seeing other peripheral objects in the visual scene that were equally difficult to detect, yet more consistent with the scene. Follow-up qualitative analyses revealed that participants reported seeing many elements that were not actually present, but which could have been expected given the overall context of the scene. Together, these findings demonstrate the robustness of inattentional blindness and highlight the specificity with which different visual primes may increase noticing behavior.
Optic Flow Dominates Visual Scene Polarity in Causing Adaptive Modification of Locomotor Trajectory
NASA Technical Reports Server (NTRS)
Nomura, Y.; Mulavara, A. P.; Richards, J. T.; Brady, R.; Bloomberg, Jacob J.
2005-01-01
Locomotion and posture are influenced and controlled by vestibular, visual and somatosensory information. Optic flow and scene polarity are two characteristics of a visual scene that have been identified as being critical in how they affect perceived body orientation and self-motion. The goal of this study was to determine the role of optic flow and visual scene polarity on adaptive modification in locomotor trajectory. Two computer-generated virtual reality scenes were shown to subjects during 20 minutes of treadmill walking. One scene was a highly polarized scene while the other was composed of objects displayed in a non-polarized fashion. Both virtual scenes depicted constant rate self-motion equivalent to walking counterclockwise around the perimeter of a room. Subjects performed Stepping Tests blindfolded before and after scene exposure to assess adaptive changes in locomotor trajectory. Subjects showed a significant difference in heading direction between pre- and post-adaptation stepping tests when exposed to either scene during treadmill walking. However, there was no significant difference in the subjects' heading direction between the two visual scene polarity conditions. Therefore, it was inferred from these data that optic flow has a greater role than visual polarity in influencing adaptive locomotor function.
NASA Technical Reports Server (NTRS)
Krauzlis, Rich; Stone, Leland; Null, Cynthia H. (Technical Monitor)
1998-01-01
When viewing objects, primates use a combination of saccadic and pursuit eye movements to stabilize the retinal image of the object of regard within the high-acuity region near the fovea. Although these movements involve widespread regions of the nervous system, they mix seamlessly in normal behavior. Saccades are discrete movements that quickly direct the eyes toward a visual target, thereby translating the image of the target from an eccentric retinal location to the fovea. In contrast, pursuit is a continuous movement that slowly rotates the eyes to compensate for the motion of the visual target, minimizing the blur that can compromise visual acuity. While other mammalian species can generate smooth optokinetic eye movements - which track the motion of the entire visual surround - only primates can smoothly pursue a single small element within a complex visual scene, regardless of the motion elsewhere on the retina. This ability likely reflects the greater ability of primates to segment the visual scene, to identify individual visual objects, and to select a target of interest.
Visually-guided attention enhances target identification in a complex auditory scene.
Best, Virginia; Ozmeral, Erol J; Shinn-Cunningham, Barbara G
2007-06-01
In auditory scenes containing many similar sound sources, sorting of acoustic information into streams becomes difficult, which can lead to disruptions in the identification of behaviorally relevant targets. This study investigated the benefit of providing simple visual cues for when and/or where a target would occur in a complex acoustic mixture. Importantly, the visual cues provided no information about the target content. In separate experiments, human subjects either identified learned birdsongs in the presence of a chorus of unlearned songs or recalled strings of spoken digits in the presence of speech maskers. A visual cue indicating which loudspeaker (from an array of five) would contain the target improved accuracy for both kinds of stimuli. A cue indicating which time segment (out of a possible five) would contain the target also improved accuracy, but much more for birdsong than for speech. These results suggest that in real world situations, information about where a target of interest is located can enhance its identification, while information about when to listen can also be helpful when targets are unfamiliar or extremely similar to their competitors.
Visually-guided Attention Enhances Target Identification in a Complex Auditory Scene
Ozmeral, Erol J.; Shinn-Cunningham, Barbara G.
2007-01-01
In auditory scenes containing many similar sound sources, sorting of acoustic information into streams becomes difficult, which can lead to disruptions in the identification of behaviorally relevant targets. This study investigated the benefit of providing simple visual cues for when and/or where a target would occur in a complex acoustic mixture. Importantly, the visual cues provided no information about the target content. In separate experiments, human subjects either identified learned birdsongs in the presence of a chorus of unlearned songs or recalled strings of spoken digits in the presence of speech maskers. A visual cue indicating which loudspeaker (from an array of five) would contain the target improved accuracy for both kinds of stimuli. A cue indicating which time segment (out of a possible five) would contain the target also improved accuracy, but much more for birdsong than for speech. These results suggest that in real world situations, information about where a target of interest is located can enhance its identification, while information about when to listen can also be helpful when targets are unfamiliar or extremely similar to their competitors. PMID:17453308
Guidance of visual search by memory and knowledge.
Hollingworth, Andrew
2012-01-01
To behave intelligently in the world, humans must be able to find objects efficiently within the complex environments they inhabit. A growing proportion of the literature on visual search is devoted to understanding this type of natural search. In the present chapter, I review the literature on visual search through natural scenes, focusing on the role of memory and knowledge in guiding attention to task-relevant objects.
Feature diagnosticity and task context shape activity in human scene-selective cortex.
Lowe, Matthew X; Gallivan, Jason P; Ferber, Susanne; Cant, Jonathan S
2016-01-15
Scenes are constructed from multiple visual features, yet previous research investigating scene processing has often focused on the contributions of single features in isolation. In the real world, features rarely exist independently of one another and likely converge to inform scene identity in unique ways. Here, we utilize fMRI and pattern classification techniques to examine the interactions between task context (i.e., attend to diagnostic global scene features; texture or layout) and high-level scene attributes (content and spatial boundary) to test the novel hypothesis that scene-selective cortex represents multiple visual features, the importance of which varies according to their diagnostic relevance across scene categories and task demands. Our results show for the first time that scene representations are driven by interactions between multiple visual features and high-level scene attributes. Specifically, univariate analysis of scene-selective cortex revealed that task context and feature diagnosticity shape activity differentially across scene categories. Examination using multivariate decoding methods revealed results consistent with univariate findings, but also evidence for an interaction between high-level scene attributes and diagnostic visual features within scene categories. Critically, these findings suggest visual feature representations are not distributed uniformly across scene categories but are shaped by task context and feature diagnosticity. Thus, we propose that scene-selective cortex constructs a flexible representation of the environment by integrating multiple diagnostically relevant visual features, the nature of which varies according to the particular scene being perceived and the goals of the observer. Copyright © 2015 Elsevier Inc. All rights reserved.
Combined Influence of Visual Scene and Body Tilt on Arm Pointing Movements: Gravity Matters!
Scotto Di Cesare, Cécile; Sarlegna, Fabrice R.; Bourdin, Christophe; Mestre, Daniel R.; Bringoux, Lionel
2014-01-01
Performing accurate actions such as goal-directed arm movements requires taking into account visual and body orientation cues to localize the target in space and produce appropriate reaching motor commands. We experimentally tilted the body and/or the visual scene to investigate how visual and body orientation cues are combined for the control of unseen arm movements. Subjects were asked to point toward a visual target using an upward movement during slow body and/or visual scene tilts. When the scene was tilted, final pointing errors varied as a function of the direction of the scene tilt (forward or backward). Actual forward body tilt resulted in systematic target undershoots, suggesting that the brain may have overcompensated for the biomechanical movement facilitation arising from body tilt. Combined body and visual scene tilts also affected final pointing errors according to the orientation of the visual scene. The data were further analysed using either a body-centered or a gravity-centered reference frame to encode visual scene orientation with simple additive models (i.e., ‘combined’ tilts equal to the sum of ‘single’ tilts). We found that the body-centered model could account only for some of the data regarding kinematic parameters and final errors. In contrast, the gravity-centered modeling in which the body and visual scene orientations were referred to vertical could explain all of these data. Therefore, our findings suggest that the brain uses gravity, thanks to its invariant properties, as a reference for the combination of visual and non-visual cues. PMID:24925371
Guidance of visual attention by semantic information in real-world scenes
Wu, Chia-Chien; Wick, Farahnaz Ahmed; Pomplun, Marc
2014-01-01
Recent research on attentional guidance in real-world scenes has focused on object recognition within the context of a scene. This approach has been valuable for determining some factors that drive the allocation of visual attention and determine visual selection. This article provides a review of experimental work on how different components of context, especially semantic information, affect attentional deployment. We review work from the areas of object recognition, scene perception, and visual search, highlighting recent studies examining semantic structure in real-world scenes. A better understanding of how humans parse scene representations will not only improve current models of visual attention but also advance next-generation computer vision systems and human-computer interfaces. PMID:24567724
Collerton, Daniel; Perry, Elaine; McKeith, Ian
2005-12-01
As many as two million people in the United Kingdom repeatedly see people, animals, and objects that have no objective reality. Hallucinations on the border of sleep, dementing illnesses, delirium, eye disease, and schizophrenia account for 90% of these. The remainder have rarer disorders. We review existing models of recurrent complex visual hallucinations (RCVH) in the awake person, including cortical irritation, cortical hyperexcitability and cortical release, top-down activation, misperception, dream intrusion, and interactive models. We provide evidence that these can neither fully account for the phenomenology of RCVH, nor for variations in the frequency of RCVH in different disorders. We propose a novel Perception and Attention Deficit (PAD) model for RCVH. A combination of impaired attentional binding and poor sensory activation of a correct proto-object, in conjunction with a relatively intact scene representation, biases perception to allow the intrusion of a hallucinatory proto-object into a scene perception. Incorporation of this image into a context-specific hallucinatory scene representation accounts for repetitive hallucinations. We suggest that these impairments are underpinned by disturbances in a lateral frontal cortex-ventral visual stream system. We show how the frequency of RCVH in different diseases is related to the coexistence of attentional and visual perceptual impairments; how attentional and perceptual processes can account for their phenomenology; and that diseases and other states with high rates of RCVH have cholinergic dysfunction in both frontal cortex and the ventral visual stream. Several tests of the model are indicated, together with a number of treatment options that it generates.
Statistical regularities in art: Relations with visual coding and perception.
Graham, Daniel J; Redies, Christoph
2010-07-21
Since at least 1935, vision researchers have used art stimuli to test human response to complex scenes. This is sensible given the "inherent interestingness" of art and its relation to the natural visual world. The use of art stimuli has remained popular, especially in eye tracking studies. Moreover, stimuli in common use by vision scientists are inspired by the work of famous artists (e.g., Mondrians). Artworks are also popular in vision science as illustrations of a host of visual phenomena, such as depth cues and surface properties. However, until recently, there has been scant consideration of the spatial, luminance, and color statistics of artwork, and even less study of ways that regularities in such statistics could affect visual processing. Furthermore, the relationship between regularities in art images and those in natural scenes has received little or no attention. In the past few years, there has been a concerted effort to study statistical regularities in art as they relate to neural coding and visual perception, and art stimuli have begun to be studied in rigorous ways, as natural scenes have been. In this minireview, we summarize quantitative studies of links between regular statistics in artwork and processing in the visual stream. The results of these studies suggest that art is especially germane to understanding human visual coding and perception, and it therefore warrants wider study. Copyright 2010 Elsevier Ltd. All rights reserved.
The functional consequences of social distraction: Attention and memory for complex scenes.
Doherty, Brianna Ruth; Patai, Eva Zita; Duta, Mihaela; Nobre, Anna Christina; Scerif, Gaia
2017-01-01
Cognitive scientists have long proposed that social stimuli attract visual attention even when task irrelevant, but the consequences of this privileged status for memory are unknown. To address this, we combined computational approaches, eye-tracking methodology, and individual-differences measures. Participants searched for targets in scenes containing social or non-social distractors equated for low-level visual salience. Subsequent memory precision for target locations was tested. Individual differences in autistic traits and social anxiety were also measured. Eye-tracking revealed significantly more attentional capture to social compared to non-social distractors. Critically, memory precision for target locations was poorer for social scenes. This effect was moderated by social anxiety, with anxious individuals remembering target locations better under conditions of social distraction. These findings shed further light onto the privileged attentional status of social stimuli and its functional consequences on memory across individuals. Copyright © 2016. Published by Elsevier B.V.
The Effect of Visual Information on the Manual Approach and Landing
NASA Technical Reports Server (NTRS)
Wewerinke, P. H.
1982-01-01
The effect of visual information in combination with basic display information on approach performance was investigated. A pre-experimental model analysis was performed in terms of the optimal control model. The resulting aircraft approach performance predictions were compared with the results of a moving base simulator program. The results illustrate that the model provides a meaningful description of the visual (scene) perception process involved in the complex (multi-variable, time varying) manual approach task with a useful predictive capability. The theoretical framework was shown to allow a straightforward investigation of the complex interaction of a variety of task variables.
Remembering faces and scenes: The mixed-category advantage in visual working memory.
Jiang, Yuhong V; Remington, Roger W; Asaad, Anthony; Lee, Hyejin J; Mikkalson, Taylor C
2016-09-01
We examined the mixed-category memory advantage for faces and scenes to determine how domain-specific cortical resources constrain visual working memory. Consistent with previous findings, visual working memory for a display of 2 faces and 2 scenes was better than that for a display of 4 faces or 4 scenes. This pattern was unaffected by manipulations of encoding duration. However, the mixed-category advantage was carried solely by faces: Memory for scenes was not better when scenes were encoded with faces rather than with other scenes. The asymmetry between faces and scenes was found when items were presented simultaneously or sequentially, centrally, or peripherally, and when scenes were drawn from a narrow category. A further experiment showed a mixed-category advantage in memory for faces and bodies, but not in memory for scenes and objects. The results suggest that unique category-specific interactions contribute significantly to the mixed-category advantage in visual working memory. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Groen, Iris I A; Silson, Edward H; Baker, Chris I
2017-02-19
Visual scene analysis in humans has been characterized by the presence of regions in extrastriate cortex that are selectively responsive to scenes compared with objects or faces. While these regions have often been interpreted as representing high-level properties of scenes (e.g. category), they also exhibit substantial sensitivity to low-level (e.g. spatial frequency) and mid-level (e.g. spatial layout) properties, and it is unclear how these disparate findings can be united in a single framework. In this opinion piece, we suggest that this problem can be resolved by questioning the utility of the classical low- to high-level framework of visual perception for scene processing, and discuss why low- and mid-level properties may be particularly diagnostic for the behavioural goals specific to scene perception as compared to object recognition. In particular, we highlight the contributions of low-level vision to scene representation by reviewing (i) retinotopic biases and receptive field properties of scene-selective regions and (ii) the temporal dynamics of scene perception that demonstrate overlap of low- and mid-level feature representations with those of scene category. We discuss the relevance of these findings for scene perception and suggest a more expansive framework for visual scene analysis. This article is part of the themed issue 'Auditory and visual scene analysis'. © 2017 The Author(s). PMID:28044013
Scene and Position Specificity in Visual Memory for Objects
ERIC Educational Resources Information Center
Hollingworth, Andrew
2006-01-01
This study investigated whether and how visual representations of individual objects are bound in memory to scene context. Participants viewed a series of naturalistic scenes, and memory for the visual form of a target object in each scene was examined in a 2-alternative forced-choice test, with the distractor object either a different object…
Modulation of visually evoked movement responses in moving virtual environments.
Reed-Jones, Rebecca J; Vallis, Lori Ann
2009-01-01
Virtual-reality technology is being increasingly used to understand how humans perceive and act in the moving world around them. What is currently not clear is how virtual reality technology is perceived by human participants and what virtual scenes are effective in evoking movement responses to visual stimuli. We investigated the effect of virtual-scene context on human responses to a virtual visual perturbation. We hypothesised that exposure to a natural scene that matched the visual expectancies of the natural world would create a perceptual set towards presence, and thus visual guidance of body movement in a subsequently presented virtual scene. Results supported this hypothesis; responses to a virtual visual perturbation presented in an ambiguous virtual scene were increased when participants first viewed a scene that consisted of natural landmarks which provided 'real-world' visual motion cues. Further research in this area will provide a basis of knowledge for the effective use of this technology in the study of human movement responses.
The Relationship Between Online Visual Representation of a Scene and Long-Term Scene Memory
ERIC Educational Resources Information Center
Hollingworth, Andrew
2005-01-01
In 3 experiments the author investigated the relationship between the online visual representation of natural scenes and long-term visual memory. In a change detection task, a target object either changed or remained the same from an initial image of a natural scene to a test image. Two types of changes were possible: rotation in depth, or…
An Improved Text Localization Method for Natural Scene Images
NASA Astrophysics Data System (ADS)
Jiang, Mengdi; Cheng, Jianghua; Chen, Minghui; Ku, Xishu
2018-01-01
To extract text information effectively from natural scene images with complex backgrounds, multi-orientation perspective, and multiple languages, we present a new method based on an improved Stroke Width Transform (SWT). First, the Maximally Stable Extremal Region (MSER) method is used to detect text candidate regions. Second, the SWT algorithm is applied within the candidate regions, which improves edge detection compared with the traditional SWT method. Finally, Frequency-tuned (FT) visual saliency is introduced to remove non-text candidate regions. The experimental results show that the method is robust to complex backgrounds with multi-orientation perspective, various characters, and various font sizes.
Effects of aging on neural connectivity underlying selective memory for emotional scenes
Waring, Jill D.; Addis, Donna Rose; Kensinger, Elizabeth A.
2012-01-01
Older adults show age-related reductions in memory for neutral items within complex visual scenes, but just like young adults, older adults exhibit a memory advantage for emotional items within scenes compared with the background scene information. The present study examined young and older adults’ encoding-stage effective connectivity for selective memory of emotional items versus memory for both the emotional item and its background. In a functional magnetic resonance imaging (fMRI) study, participants viewed scenes containing either positive or negative items within neutral backgrounds. Outside the scanner, participants completed a memory test for items and backgrounds. Irrespective of scene content being emotionally positive or negative, older adults had stronger positive connections among frontal regions and from frontal regions to medial temporal lobe structures than did young adults, especially when items and backgrounds were subsequently remembered. These results suggest there are differences between young and older adults’ connectivity accompanying the encoding of emotional scenes. Older adults may require more frontal connectivity to encode all elements of a scene rather than just encoding the emotional item. PMID:22542836
Wilkinson, Krista M.; Light, Janice; Drager, Kathryn
2013-01-01
Aided augmentative and alternative communication (AAC) interventions have been demonstrated to facilitate a variety of communication outcomes in persons with intellectual disabilities. Most aided AAC systems rely on a visual modality. When the medium for communication is visual, it seems likely that the effectiveness of intervention depends in part on the effectiveness and efficiency with which the information presented in the display can be perceived, identified, and extracted by communicators and their partners. Understanding of visual-cognitive processing – that is, how a user attends, perceives, and makes sense of the visual information on the display – therefore seems critical to designing effective aided AAC interventions. In this Forum Note, we discuss characteristics of one particular type of aided AAC display, that is, Visual Scene Displays (VSDs) as they may relate to user visual and cognitive processing. We consider three specific ways in which bodies of knowledge drawn from the visual cognitive sciences may be relevant to the composition of VSDs, with the understanding that direct research with children with complex communication needs is necessary to verify or refute our speculations. PMID:22946989
A category adjustment approach to memory for spatial location in natural scenes.
Holden, Mark P; Curby, Kim M; Newcombe, Nora S; Shipley, Thomas F
2010-05-01
Memories for spatial locations often show systematic errors toward the central value of the surrounding region. This bias has been explained using a Bayesian model in which fine-grained and categorical information are combined (Huttenlocher, Hedges, & Duncan, 1991). However, experiments testing this model have largely used locations contained in simple geometric shapes. Use of this paradigm raises 2 issues. First, do results generalize to the complex natural world? Second, what types of information might be used to segment complex spaces into constituent categories? Experiment 1 addressed the 1st question by showing a bias toward prototypical values in memory for spatial locations in complex natural scenes. Experiment 2 addressed the 2nd question by manipulating the availability of basic visual cues (using color negatives) or of semantic information about the scene (using inverted images). Error patterns suggest that both perceptual and conceptual information are involved in segmentation. The possible neurological foundations of location memory of this kind are discussed. PsycINFO Database Record (c) 2010 APA, all rights reserved.
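The Bayesian combination of fine-grained and categorical information described in the abstract above can be illustrated with a short sketch. It assumes, for illustration only, that both the fine-grained memory trace and the category are Gaussian, so the estimate is a precision-weighted average that is pulled toward the category prototype as the memory trace becomes noisier. The function name, parameter names, and numeric values are hypothetical and are not taken from the study.

```python
def category_adjusted_estimate(memory_trace, sigma_memory, prototype, sigma_category):
    """Blend a noisy fine-grained memory with a category prototype.

    Assumption (not from the abstract): both sources are Gaussian, so the
    weight on the fine-grained trace is var_category / (var_category + var_memory).
    """
    var_memory = sigma_memory ** 2
    var_category = sigma_category ** 2
    # A noisier memory trace (larger var_memory) gets less weight.
    weight = var_category / (var_category + var_memory)
    return weight * memory_trace + (1.0 - weight) * prototype


if __name__ == "__main__":
    # Hypothetical 1-D example: remembered location at 10, prototype at 0.
    for sigma_memory in (0.5, 2.0, 6.0):
        estimate = category_adjusted_estimate(10.0, sigma_memory, 0.0, 2.0)
        print(f"sigma_memory={sigma_memory}: estimate={estimate:.2f}")
```

Under these assumptions, the reported bias toward the central value of a region corresponds to the estimate moving from the remembered location toward the prototype as memory uncertainty grows.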
Scene perception in posterior cortical atrophy: categorization, description and fixation patterns
Shakespeare, Timothy J.; Yong, Keir X. X.; Frost, Chris; Kim, Lois G.; Warrington, Elizabeth K.; Crutch, Sebastian J.
2013-01-01
Partial or complete Balint's syndrome is a core feature of the clinico-radiological syndrome of posterior cortical atrophy (PCA), in which individuals experience a progressive deterioration of cortical vision. Although multi-object arrays are frequently used to detect simultanagnosia in the clinical assessment and diagnosis of PCA, to date there have been no group studies of scene perception in patients with the syndrome. The current study involved three linked experiments conducted in PCA patients and healthy controls. Experiment 1 evaluated the accuracy and latency of complex scene perception relative to individual faces and objects (color and grayscale) using a categorization paradigm. PCA patients were both less accurate (faces < scenes < objects) and slower (scenes < objects < faces) than controls on all categories, with performance strongly associated with their level of basic visual processing impairment; patients also showed a small advantage for color over grayscale stimuli. Experiment 2 involved free description of real world scenes. PCA patients generated fewer features and more misperceptions than controls, though perceptual errors were always consistent with the patient's global understanding of the scene (whether correct or not). Experiment 3 used eye tracking measures to compare patient and control eye movements over initial and subsequent fixations of scenes. Patients' fixation patterns were significantly different to those of young and age-matched controls, with comparable group differences for both initial and subsequent fixations. Overall, these findings describe the variability in everyday scene perception exhibited by individuals with PCA, and indicate the importance of exposure duration in the perception of complex scenes. PMID:24106469
Collet, Anne-Claire; Fize, Denis; VanRullen, Rufin
2015-01-01
Rapid visual categorization is a crucial ability for survival of many animal species, including monkeys and humans. In real conditions, objects (either animate or inanimate) are never isolated but embedded in a complex background made of multiple elements. It has been shown in humans and monkeys that the contextual background can either enhance or impair object categorization, depending on context/object congruency (for example, an animal in a natural vs. man-made environment). Moreover, a scene is not only a collection of objects; it also has global physical features (i.e., phase and amplitude of Fourier spatial frequencies) which help define its gist. In our experiment, we aimed to explore and compare the contribution of the amplitude spectrum of scenes in the context-object congruency effect in monkeys and humans. We designed a rapid visual categorization task, Animal versus Non-Animal, using as contexts both real scene photographs and noisy backgrounds built from the amplitude spectrum of real scenes but with randomized phase spectrum. We showed that even if the contextual congruency effect was comparable in both species when the context was a real scene, it differed when the foreground object was surrounded by a noisy background: in monkeys we found a similar congruency effect in both conditions, but in humans the congruency effect was absent (or even reversed) when the context was a noisy background. PMID:26207915
Ryals, Anthony J.; Wang, Jane X.; Polnaszek, Kelly L.; Voss, Joel L.
2015-01-01
Although hippocampus unequivocally supports explicit/declarative memory, fewer findings have demonstrated its role in implicit expressions of memory. We tested for hippocampal contributions to an implicit expression of configural/relational memory for complex scenes using eye-movement tracking during functional magnetic resonance imaging (fMRI) scanning. Participants studied scenes and were later tested using scenes that resembled study scenes in their overall feature configuration but comprised different elements. These configurally similar scenes were used to limit explicit memory, and were intermixed with new scenes that did not resemble studied scenes. Scene configuration memory was expressed through eye movements reflecting exploration overlap (EO), which is the viewing of the same scene locations at both study and test. EO reliably discriminated similar study-test scene pairs from study-new scene pairs, was reliably greater for similarity-based recognition hits than for misses, and correlated with hippocampal fMRI activity. In contrast, subjects could not reliably discriminate similar from new scenes by overt judgments, although ratings of familiarity were slightly higher for similar than new scenes. Hippocampal fMRI correlates of this weak explicit memory were distinct from EO-related activity. These findings collectively suggest that EO was an implicit expression of scene configuration memory associated with hippocampal activity. Visual exploration can therefore reflect implicit hippocampal-related memory processing that can be observed in eye-movement behavior during naturalistic scene viewing. PMID:25620526
Long-Term Memories Bias Sensitivity and Target Selection in Complex Scenes
Patai, Eva Zita; Doallo, Sonia; Nobre, Anna Christina
2014-01-01
In everyday situations we often rely on our memories to find what we are looking for in our cluttered environment. Recently, we developed a new experimental paradigm to investigate how long-term memory (LTM) can guide attention, and showed how the pre-exposure to a complex scene in which a target location had been learned facilitated the detection of the transient appearance of the target at the remembered location (Summerfield, Lepsien, Gitelman, Mesulam, & Nobre, 2006; Summerfield, Rao, Garside, & Nobre, 2011). The present study extends these findings by investigating whether and how LTM can enhance perceptual sensitivity to identify targets occurring within their complex scene context. Behavioral measures showed superior perceptual sensitivity (d′) for targets located in remembered spatial contexts. We used the N2pc event-related potential to test whether LTM modulated the process of selecting the target from its scene context. Surprisingly, in contrast to effects of visual spatial cues or implicit contextual cueing, LTM for target locations significantly attenuated the N2pc potential. We propose that the mechanism by which these explicitly available LTMs facilitate perceptual identification of targets may differ from mechanisms triggered by other types of top-down sources of information. PMID:23016670
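Perceptual sensitivity (d′), the behavioral measure reported in this abstract, is conventionally the difference between the z-transformed hit rate and false-alarm rate. The abstract does not say how the authors computed it, so this is a minimal sketch under stated assumptions: the function name `d_prime` is hypothetical, and the log-linear correction (adding 0.5 to each cell) is one common convention for avoiding infinite z-scores, not necessarily the one used in the study.

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Signal-detection sensitivity d' = z(hit rate) - z(false-alarm rate)."""
    # Assumed log-linear correction: keeps both rates strictly between
    # 0 and 1 so the inverse normal CDF stays finite.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)

    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(fa_rate)
```

Computing `d_prime` separately for targets in remembered and in novel spatial contexts would give the kind of comparison the abstract describes; a larger value indicates better discrimination independent of response bias.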
The Nature and Timing of Tele-Pseudoscopic Experiences
Hill, Harold; Allison, Robert S
2016-01-01
Interchanging the left and right eye views of a scene (pseudoscopic viewing) has been reported to produce vivid stereoscopic effects under certain conditions. In two separate field studies, we examined the experiences of 124 observers (76 in Study 1 and 48 in Study 2) while pseudoscopically viewing a distant natural outdoor scene. We found large individual differences in both the nature and the timing of their pseudoscopic experiences. While some observers failed to notice anything unusual about the pseudoscopic scene, most experienced multiple pseudoscopic phenomena, including apparent scene depth reversals, apparent object shape reversals, apparent size and flatness changes, apparent reversals of border ownership, and even complex illusory foreground surfaces. When multiple effects were experienced, patterns of co-occurrence suggested possible causal relationships between apparent scene depth reversals and several other pseudoscopic phenomena. The latency for experiencing pseudoscopic phenomena was found to correlate significantly with observer visual acuity, but not stereoacuity, in both studies. PMID:27482368
Does scene context always facilitate retrieval of visual object representations?
Nakashima, Ryoichi; Yokosawa, Kazuhiko
2011-04-01
An object-to-scene binding hypothesis maintains that visual object representations are stored as part of a larger scene representation or scene context, and that scene context facilitates retrieval of object representations (see, e.g., Hollingworth, Journal of Experimental Psychology: Learning, Memory and Cognition, 32, 58-69, 2006). Support for this hypothesis comes from data using an intentional memory task. In the present study, we examined whether scene context always facilitates retrieval of visual object representations. In two experiments, we investigated whether the scene context facilitates retrieval of object representations, using a new paradigm in which a memory task is appended to a repeated-flicker change detection task. Results indicated that in normal scene viewing, in which many simultaneous objects appear, scene context facilitation of the retrieval of object representations (henceforth termed object-to-scene binding) occurred only when the observer was required to retain much information for a task (i.e., an intentional memory task).
Eye movements and attention in reading, scene perception, and visual search.
Rayner, Keith
2009-08-01
Eye movements are now widely used to investigate cognitive processes during reading, scene perception, and visual search. In this article, research on the following topics is reviewed with respect to reading: (a) the perceptual span (or span of effective vision), (b) preview benefit, (c) eye movement control, and (d) models of eye movements. Related issues with respect to eye movements during scene perception and visual search are also reviewed. It is argued that research on eye movements during reading has been somewhat advanced over research on eye movements in scene perception and visual search and that some of the paradigms developed to study reading should be more widely adopted in the study of scene perception and visual search. Research dealing with "real-world" tasks and research utilizing the visual-world paradigm are also briefly discussed.
Social relevance drives viewing behavior independent of low-level salience in rhesus macaques
Solyst, James A.; Buffalo, Elizabeth A.
2014-01-01
Quantifying attention to social stimuli during the viewing of complex social scenes with eye tracking has proven to be a sensitive method in the diagnosis of autism spectrum disorders years before average clinical diagnosis. Rhesus macaques provide an ideal model for understanding the mechanisms underlying social viewing behavior, but to date no comparable behavioral task has been developed for use in monkeys. Using a novel scene-viewing task, we monitored the gaze of three rhesus macaques while they freely viewed well-controlled composed social scenes and analyzed the time spent viewing objects and monkeys. In each of six behavioral sessions, monkeys viewed a set of 90 images (540 unique scenes) with each image presented twice. In two-thirds of the repeated scenes, either a monkey or an object was replaced with a novel item (manipulated scenes). When viewing a repeated scene, monkeys made longer fixations and shorter saccades, shifting from a rapid orienting to global scene contents to a more local analysis of fewer items. In addition to this repetition effect, in manipulated scenes, monkeys demonstrated robust memory by spending more time viewing the replaced items. By analyzing attention to specific scene content, we found that monkeys strongly preferred to view conspecifics and that this was not related to their salience in terms of low-level image features. A model-free analysis of viewing statistics found that monkeys that were viewed earlier and longer had direct gaze and redder sex skin around their face and rump, two important visual social cues. These data provide a quantification of viewing strategy, memory and social preferences in rhesus macaques viewing complex social scenes, and they provide an important baseline against which to compare the effects of therapeutics aimed at enhancing social cognition. PMID:25414633
Changing scenes: memory for naturalistic events following change blindness.
Mäntylä, Timo; Sundström, Anna
2004-11-01
Research on scene perception indicates that viewers often fail to detect large changes to scene regions when these changes occur during a visual disruption such as a saccade or a movie cut. In two experiments, we examined whether this relative inability to detect changes would produce systematic biases in event memory. In Experiment 1, participants decided whether two successively presented images were the same or different, followed by a memory task, in which they recalled the content of the viewed scene. In Experiment 2, participants viewed a short video, in which an actor carried out a series of daily activities, and central scenes' attributes were changed during a movie cut. A high degree of change blindness was observed in both experiments, and these effects were related to scene complexity (Experiment 1) and level of retrieval support (Experiment 2). Most important, participants reported the changed, rather than the initial, event attributes following a failure in change detection. These findings suggest that attentional limitations during encoding contribute to biases in episodic memory.
Willems, Roel M; Clevis, Krien; Hagoort, Peter
2011-09-01
We investigated how visual and linguistic information interact in the perception of emotion. We borrowed a phenomenon from film theory which states that presentation of an as such neutral visual scene intensifies the percept of fear or suspense induced by a different channel of information, such as language. Our main aim was to investigate how neutral visual scenes can enhance responses to fearful language content in parts of the brain involved in the perception of emotion. Healthy participants' brain activity was measured (using functional magnetic resonance imaging) while they read fearful and less fearful sentences presented with or without a neutral visual scene. The main idea is that the visual scenes intensify the fearful content of the language by subtly implying and concretizing what is described in the sentence. Activation levels in the right anterior temporal pole were selectively increased when a neutral visual scene was paired with a fearful sentence, compared to reading the sentence alone, as well as to reading of non-fearful sentences presented with the same neutral scene. We conclude that the right anterior temporal pole serves a binding function of emotional information across domains such as visual and linguistic information.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arthur Bleeker, PNNL
2015-03-11
SVF is a full-featured OpenGL 3d framework that allows for rapid creation of complex visualizations. The SVF framework handles much of the lifecycle and complex tasks required for a 3d visualization. Unlike a game framework, SVF was designed to use fewer resources, work well in a windowed environment, and only render when necessary. The scene also takes advantage of multiple threads to free up the UI thread as much as possible. Shapes (actors) in the scene are created by adding or removing functionality (through support objects) during runtime. This allows a highly flexible and dynamic means of creating highly complex actors without the code complexity (it also helps overcome the lack of multiple inheritance in Java). All classes are highly customizable and there are abstract classes which are intended to be subclassed to allow a developer to create more complex and highly performant actors. There are multiple demos included in the framework to help the developer get started; they show off nearly all of the functionality. Some simple shapes (actors) are already created for you such as text, bordered text, radial text, text area, complex paths, NURBS paths, cube, disk, grid, plane, geometric shapes, and volumetric area. It also comes with various camera types for viewing that can be dragged, zoomed, and rotated. Picking or selecting items in the scene can be accomplished in various ways depending on your needs (raycasting or color picking). The framework currently has functionality for tooltips, animation, actor pools, color gradients, 2d physics, text, 1d/2d/3d textures, children, blending, clipping planes, view frustum culling, custom shaders, and custom actor states.
Lu, Kun-Han; Hung, Shao-Chin; Wen, Haiguang; Marussich, Lauren; Liu, Zhongming
2016-01-01
Complex, sustained, dynamic, and naturalistic visual stimulation can evoke distributed brain activities that are highly reproducible within and across individuals. However, the precise origins of such reproducible responses remain incompletely understood. Here, we employed concurrent functional magnetic resonance imaging (fMRI) and eye tracking to investigate the experimental and behavioral factors that influence fMRI activity and its intra- and inter-subject reproducibility during repeated movie stimuli. We found that widely distributed and highly reproducible fMRI responses were attributed primarily to the high-level natural content in the movie. In the absence of such natural content, low-level visual features alone in a spatiotemporally scrambled control stimulus evoked significantly reduced degree and extent of reproducible responses, which were mostly confined to the primary visual cortex (V1). We also found that the varying gaze behavior affected the cortical response at the peripheral part of V1 and in the oculomotor network, with minor effects on the response reproducibility over the extrastriate visual areas. Lastly, scene transitions in the movie stimulus due to film editing partly caused the reproducible fMRI responses at widespread cortical areas, especially along the ventral visual pathway. Therefore, the naturalistic nature of a movie stimulus is necessary for driving highly reliable visual activations. In a movie-stimulation paradigm, scene transitions and individuals’ gaze behavior should be taken as potential confounding factors in order to properly interpret cortical activity that supports natural vision. PMID:27564573
Sherman, Aleksandra; Grabowecky, Marcia; Suzuki, Satoru
2015-08-01
What shapes art appreciation? Much research has focused on the importance of visual features themselves (e.g., symmetry, natural scene statistics) and of the viewer's experience and expertise with specific artworks. However, even after taking these factors into account, there are considerable individual differences in art preferences. Our new result suggests that art preference is also influenced by the compatibility between visual properties and the characteristics of the viewer's visual system. Specifically, we have demonstrated, using 120 artworks from diverse periods, cultures, genres, and styles, that art appreciation is increased when the level of visual complexity within an artwork is compatible with the viewer's visual working memory capacity. The result highlights the importance of the interaction between visual features and the beholder's general visual capacity in shaping art appreciation. (c) 2015 APA, all rights reserved.
Parahippocampal and retrosplenial contributions to human spatial navigation
Epstein, Russell A.
2010-01-01
Spatial navigation is a core cognitive ability in humans and animals. Neuroimaging studies have identified two functionally-defined brain regions that activate during navigational tasks and also during passive viewing of navigationally-relevant stimuli such as environmental scenes: the parahippocampal place area (PPA) and the retrosplenial complex (RSC). Recent findings indicate that the PPA and RSC play distinct and complementary roles in spatial navigation, with the PPA more concerned with representation of the local visual scene and RSC more concerned with situating the scene within the broader spatial environment. These findings are a first step towards understanding the separate components of the cortical network that mediates spatial navigation in humans. PMID:18760955
Neurotoxic lesions of ventrolateral prefrontal cortex impair object-in-place scene memory
Wilson, Charles R E; Gaffan, David; Mitchell, Anna S; Baxter, Mark G
2007-01-01
Disconnection of the frontal lobe from the inferotemporal cortex produces deficits in a number of cognitive tasks that require the application of memory-dependent rules to visual stimuli. The specific regions of frontal cortex that interact with the temporal lobe in performance of these tasks remain undefined. One capacity that is impaired by frontal–temporal disconnection is rapid learning of new object-in-place scene problems, in which visual discriminations between two small typographic characters are learned in the context of different visually complex scenes. In the present study, we examined whether neurotoxic lesions of ventrolateral prefrontal cortex in one hemisphere, combined with ablation of inferior temporal cortex in the contralateral hemisphere, would impair learning of new object-in-place scene problems. Male macaque monkeys learned 10 or 20 new object-in-place problems in each daily test session. Unilateral neurotoxic lesions of ventrolateral prefrontal cortex produced by multiple injections of a mixture of ibotenate and N-methyl-d-aspartate did not affect performance. However, when disconnection from inferotemporal cortex was completed by ablating this region contralateral to the neurotoxic prefrontal lesion, new learning was substantially impaired. Sham disconnection (injecting saline instead of neurotoxin contralateral to the inferotemporal lesion) did not affect performance. These findings support two conclusions: first, that the ventrolateral prefrontal cortex is a critical area within the frontal lobe for scene memory; and second, the effects of ablations of prefrontal cortex can be confidently attributed to the loss of cell bodies within the prefrontal cortex rather than to interruption of fibres of passage through the lesioned area. PMID:17445247
How do visual and postural cues combine for self-tilt perception during slow pitch rotations?
Scotto Di Cesare, C; Buloup, F; Mestre, D R; Bringoux, L
2014-11-01
Self-orientation perception relies on the integration of multiple sensory inputs which convey spatially-related visual and postural cues. In the present study, an experimental set-up was used to tilt the body and/or the visual scene to investigate how these postural and visual cues are integrated for self-tilt perception (the subjective sensation of being tilted). Participants were required to repeatedly rate a confidence level for self-tilt perception during slow (0.05°·s⁻¹) body and/or visual scene pitch tilts up to 19° relative to vertical. Concurrently, subjects also had to perform arm reaching movements toward a body-fixed target at certain specific angles of tilt. While performance of a concurrent motor task did not influence the main perceptual task, self-tilt detection did vary according to the visuo-postural stimuli. Slow forward or backward tilts of the visual scene alone did not induce a marked sensation of self-tilt contrary to actual body tilt. However, combined body and visual scene tilt influenced self-tilt perception more strongly, although this effect was dependent on the direction of visual scene tilt: only a forward visual scene tilt combined with a forward body tilt facilitated self-tilt detection. In such a case, visual scene tilt did not seem to induce vection but rather may have produced a deviation of the perceived orientation of the longitudinal body axis in the forward direction, which may have lowered the self-tilt detection threshold during actual forward body tilt. Copyright © 2014 Elsevier B.V. All rights reserved.
NASA Technical Reports Server (NTRS)
Foyle, David C.; Kaiser, Mary K.; Johnson, Walter W.
1992-01-01
This paper reviews some of the sources of visual information that are available in the out-the-window scene and describes how these visual cues are important for routine pilotage and training, as well as the development of simulator visual systems and enhanced or synthetic vision systems for aircraft cockpits. It is shown how these visual cues may change or disappear under environmental or sensor conditions, and how the visual scene can be augmented by advanced displays to capitalize on the pilot's excellent ability to extract visual information from the visual scene.
Fuzzy Classification of High Resolution Remote Sensing Scenes Using Visual Attention Features.
Li, Linyi; Xu, Tingbao; Chen, Yun
2017-01-01
In recent years the spatial resolutions of remote sensing images have been improved greatly. However, a higher spatial resolution image does not always lead to a better result of automatic scene classification. Visual attention is an important characteristic of the human visual system, which can effectively help to classify remote sensing scenes. In this study, a novel visual attention feature extraction algorithm was proposed, which extracted visual attention features through a multiscale process. And a fuzzy classification method using visual attention features (FC-VAF) was developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated by using remote sensing scenes from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. FC-VAF achieved more accurate classification results than the others according to the quantitative accuracy evaluation indices. We also discussed the role and impacts of different decomposition levels and different wavelets on the classification accuracy. FC-VAF improves the accuracy of high resolution scene classification and therefore advances the research of digital image analysis and the applications of high resolution remote sensing images. PMID:28761440
Campagne, Aurélie; Fradcourt, Benoit; Pichat, Cédric; Baciu, Monica; Kauffmann, Louise; Peyrin, Carole
2016-01-01
Visual processing of emotional stimuli critically depends on the type of cognitive appraisal involved. The present fMRI pilot study aimed to investigate the cerebral correlates involved in the visual processing of emotional scenes in two tasks, one emotional, based on the appraisal of personal emotional experience, and the other motivational, based on the appraisal of the tendency to action. Given that the use of spatial frequency information is relatively flexible during the visual processing of emotional stimuli depending on the task's demands, we also explored the effect of the type of spatial frequency in visual stimuli in each task by using emotional scenes filtered in low spatial frequency (LSF) and high spatial frequencies (HSF). Activation was observed in the visual areas of the fusiform gyrus for all emotional scenes in both tasks, and in the amygdala for unpleasant scenes only. The motivational task induced additional activation in frontal motor-related areas (e.g. premotor cortex, SMA) and parietal regions (e.g. superior and inferior parietal lobules). Parietal regions were recruited particularly during the motivational appraisal of approach in response to pleasant scenes. These frontal and parietal activations, respectively, suggest that motor and navigation processes play a specific role in the identification of the tendency to action in the motivational task. Furthermore, activity observed in the motivational task, in response to both pleasant and unpleasant scenes, was significantly greater for HSF than for LSF scenes, suggesting that the tendency to action is driven mainly by the detailed information contained in scenes. Results for the emotional task suggest that spatial frequencies play only a small role in the evaluation of unpleasant and pleasant emotions. 
Our preliminary study revealed a partial distinction between visual processing of emotional scenes during identification of the tendency to action, and during identification of personal emotional experiences. It also illustrates flexible use of the spatial frequencies contained in scenes depending on their emotional valence and on task demands.
Invariant visual object recognition: a model, with lighting invariance.
Rolls, Edmund T; Stringer, Simon M
2006-01-01
How are invariant representations of objects formed in the visual cortex? We describe a neurophysiological and computational approach which focusses on a feature hierarchy model in which invariant representations can be built by self-organizing learning based on the statistics of the visual input. The model can use temporal continuity in an associative synaptic learning rule with a short term memory trace, and/or it can use spatial continuity in Continuous Transformation learning. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size, and in this paper we show also lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects such as looming, rotation, and object-based movement. The model has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in for example spatial and object search tasks. The model has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene.
Invariant Visual Object and Face Recognition: Neural and Computational Bases, and a Model, VisNet
Rolls, Edmund T.
2012-01-01
Neurophysiological evidence for invariant representations of objects and faces in the primate inferior temporal visual cortex is described. Then a computational approach to how invariant representations are formed in the brain is described that builds on the neurophysiology. A feature hierarchy model in which invariant representations can be built by self-organizing learning based on the temporal and spatial statistics of the visual input produced by objects as they transform in the world is described. VisNet can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or it can use spatial continuity in continuous spatial transformation learning which does not require a temporal trace. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size, and also lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects such as looming, rotation, and object-based movement. The model has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in, for example, spatial and object search tasks. The approach has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene. The approach has also been extended to provide, with an additional layer, for the development of representations of spatial scenes of the type found in the hippocampus. PMID:22723777
Rules infants look by: Testing the assumption of transitivity in visual salience.
Kibbe, Melissa M; Kaldy, Zsuzsa; Blaser, Erik
2018-01-01
What drives infants' attention in complex visual scenes? Early models of infant attention suggested that the degree to which different visual features were detectable determines their attentional priority. Here, we tested this by asking whether two targets - defined by different features, but each equally salient when evaluated independently - would drive attention equally when pitted head-to-head. In Experiment 1, we presented 6-month-old infants with an array of Gabor patches in which a target region varied either in color or spatial frequency from the background. Using a forced-choice preferential-looking method, we measured how readily infants fixated the target as its featural difference from the background was parametrically increased. Then, in Experiment 2, we used these psychometric preference functions to choose values for color and spatial frequency targets that were equally salient (preferred), and pitted them against each other within the same display. We reasoned that, if salience is transitive, then the stimuli should be iso-salient and infants should therefore show no systematic preference for either stimulus. On the contrary, we found that infants consistently preferred the color-defined stimulus. This suggests that computing visual salience in more complex scenes needs to include factors above and beyond local salience values.
Steady-state visual evoked potentials as a research tool in social affective neuroscience
Wieser, Matthias J.; Miskovic, Vladimir; Keil, Andreas
2017-01-01
Like many other primates, humans place a high premium on social information transmission and processing. One important aspect of this information concerns the emotional state of other individuals, conveyed by distinct visual cues such as facial expressions, overt actions, or by cues extracted from the situational context. A rich body of theoretical and empirical work has demonstrated that these socio-emotional cues are processed by the human visual system in a prioritized fashion, in the service of optimizing social behavior. Furthermore, socio-emotional perception is highly dependent on situational contexts and previous experience. Here, we review current issues in this area of research and discuss the utility of the steady-state visual evoked potential (ssVEP) technique for addressing key empirical questions. Methodological advantages and caveats are discussed with particular regard to quantifying time-varying competition among multiple perceptual objects, trial-by-trial analysis of visual cortical activation, functional connectivity, and the control of low-level stimulus features. Studies on facial expression and emotional scene processing are summarized, with an emphasis on viewing faces and other social cues in emotional contexts, or when competing with each other. Further, because the ssVEP technique can be readily accommodated to studying the viewing of complex scenes with multiple elements, it enables researchers to advance theoretical models of socio-emotional perception, based on complex, quasi-naturalistic viewing situations. PMID:27699794
Residual attention guidance in blindsight monkeys watching complex natural scenes.
Yoshida, Masatoshi; Itti, Laurent; Berg, David J; Ikeda, Takuro; Kato, Rikako; Takaura, Kana; White, Brian J; Munoz, Douglas P; Isa, Tadashi
2012-08-07
Patients with damage to primary visual cortex (V1) demonstrate residual performance on laboratory visual tasks despite denial of conscious seeing (blindsight) [1]. After a period of recovery, which suggests a role for plasticity [2], visual sensitivity higher than chance is observed in humans and monkeys for simple luminance-defined stimuli, grating stimuli, moving gratings, and other stimuli [3-7]. Some residual cognitive processes including bottom-up attention and spatial memory have also been demonstrated [8-10]. To date, little is known about blindsight with natural stimuli and spontaneous visual behavior. In particular, is orienting attention toward salient stimuli during free viewing still possible? We used a computational saliency map model to analyze spontaneous eye movements of monkeys with blindsight from unilateral ablation of V1. Despite general deficits in gaze allocation, monkeys were significantly attracted to salient stimuli. The contribution of orientation features to salience was nearly abolished, whereas contributions of motion, intensity, and color features were preserved. Control experiments employing laboratory stimuli confirmed the free-viewing finding that lesioned monkeys retained color sensitivity. Our results show that attention guidance over complex natural scenes is preserved in the absence of V1, thereby directly challenging theories and models that crucially depend on V1 to compute the low-level visual features that guide attention. Copyright © 2012 Elsevier Ltd. All rights reserved.
Clevis, Krien; Hagoort, Peter
2011-01-01
We investigated how visual and linguistic information interact in the perception of emotion. We borrowed a phenomenon from film theory which states that presentation of an as such neutral visual scene intensifies the percept of fear or suspense induced by a different channel of information, such as language. Our main aim was to investigate how neutral visual scenes can enhance responses to fearful language content in parts of the brain involved in the perception of emotion. Healthy participants’ brain activity was measured (using functional magnetic resonance imaging) while they read fearful and less fearful sentences presented with or without a neutral visual scene. The main idea is that the visual scenes intensify the fearful content of the language by subtly implying and concretizing what is described in the sentence. Activation levels in the right anterior temporal pole were selectively increased when a neutral visual scene was paired with a fearful sentence, compared to reading the sentence alone, as well as to reading of non-fearful sentences presented with the same neutral scene. We conclude that the right anterior temporal pole serves a binding function of emotional information across domains such as visual and linguistic information. PMID:20530540
Yoo, Seung-Woo; Lee, Inah
2017-01-01
How visual scene memory is processed differentially by the upstream structures of the hippocampus is largely unknown. We sought to dissociate functionally the lateral and medial subdivisions of the entorhinal cortex (LEC and MEC, respectively) in visual scene-dependent tasks by temporarily inactivating the LEC and MEC in the same rat. When the rat made spatial choices in a T-maze using visual scenes displayed on LCD screens, the inactivation of the MEC but not the LEC produced severe deficits in performance. However, when the task required the animal to push a jar or to dig in the sand in the jar using the same scene stimuli, the LEC but not the MEC became important. Our findings suggest that the entorhinal cortex is critical for scene-dependent mnemonic behavior, and the response modality may interact with a sensory modality to determine the involvement of the LEC and MEC in scene-based memory tasks. DOI: http://dx.doi.org/10.7554/eLife.21543.001 PMID:28169828
Anticipation in Real-World Scenes: The Role of Visual Context and Visual Memory.
Coco, Moreno I; Keller, Frank; Malcolm, George L
2016-11-01
The human sentence processor is able to make rapid predictions about upcoming linguistic input. For example, upon hearing the verb eat, anticipatory eye-movements are launched toward edible objects in a visual scene (Altmann & Kamide, 1999). However, the cognitive mechanisms that underlie anticipation remain to be elucidated in ecologically valid contexts. Previous research has, in fact, mainly used clip-art scenes and object arrays, raising the possibility that anticipatory eye-movements are limited to displays containing a small number of objects in a visually impoverished context. In Experiment 1, we confirm that anticipation effects occur in real-world scenes and investigate the mechanisms that underlie such anticipation. In particular, we demonstrate that real-world scenes provide contextual information that anticipation can draw on: When the target object is not present in the scene, participants infer and fixate regions that are contextually appropriate (e.g., a table upon hearing eat). Experiment 2 investigates whether such contextual inference requires the co-presence of the scene, or whether memory representations can be utilized instead. The same real-world scenes as in Experiment 1 are presented to participants, but the scene disappears before the sentence is heard. We find that anticipation occurs even when the screen is blank, including when contextual inference is required. We conclude that anticipatory language processing is able to draw upon global scene representations (such as scene type) to make contextual inferences. These findings are compatible with theories assuming contextual guidance, but posit a challenge for theories assuming object-based visual indices. Copyright © 2015 Cognitive Science Society, Inc.
Camouflage and visual perception
Troscianko, Tom; Benton, Christopher P.; Lovell, P. George; Tolhurst, David J.; Pizlo, Zygmunt
2008-01-01
How does an animal conceal itself from visual detection by other animals? This review paper seeks to identify general principles that may apply in this broad area. It considers mechanisms of visual encoding, of grouping and object encoding, and of search. In most cases, the evidence base comes from studies of humans or species whose vision approximates to that of humans. The effort is hampered by a relatively sparse literature on visual function in natural environments and with complex foraging tasks. However, some general constraints emerge as being potentially powerful principles in understanding concealment—a ‘constraint’ here means a set of simplifying assumptions. Strategies that disrupt the unambiguous encoding of discontinuities of intensity (edges), and of other key visual attributes, such as motion, are key here. Similar strategies may also defeat grouping and object-encoding mechanisms. Finally, the paper considers how we may understand the processes of search for complex targets in complex scenes. The aim is to provide a number of pointers towards issues, which may be of assistance in understanding camouflage and concealment, particularly with reference to how visual systems can detect the shape of complex, concealed objects. PMID:18990671
Comparing object recognition from binary and bipolar edge images for visual prostheses.
Jung, Jae-Hyun; Pu, Tian; Peli, Eli
2016-11-01
Visual prostheses require an effective representation method due to the limited display condition which has only 2 or 3 levels of grayscale in low resolution. Edges derived from abrupt luminance changes in images carry essential information for object recognition. Typical binary (black and white) edge images have been used to represent features to convey essential information. However, in scenes with a complex cluttered background, the recognition rate of the binary edge images by human observers is limited and additional information is required. The polarity of edges and cusps (black or white features on a gray background) carries important additional information; the polarity may provide shape from shading information missing in the binary edge image. This depth information may be restored by using bipolar edges. We compared object recognition rates from 16 binary edge images and bipolar edge images by 26 subjects to determine the possible impact of bipolar filtering in visual prostheses with 3 or more levels of grayscale. Recognition rates were higher with bipolar edge images and the improvement was significant in scenes with complex backgrounds. The results also suggest that erroneous shape from shading interpretation of bipolar edges resulting from pigment rather than boundaries of shape may confound the recognition.
The roles of scene priming and location priming in object-scene consistency effects
Heise, Nils; Ansorge, Ulrich
2014-01-01
Presenting consistent objects in scenes facilitates object recognition as compared to inconsistent objects. Yet the mechanisms by which scenes influence object recognition are still not understood. According to one theory, consistent scenes facilitate visual search for objects at expected places. Here, we investigated two predictions following from this theory: If visual search is responsible for consistency effects, consistency effects could be weaker (1) with better-primed than less-primed object locations, and (2) with less-primed than better-primed scenes. In Experiments 1 and 2, locations of objects were varied within a scene to a different degree (one, two, or four possible locations). In addition, object-scene consistency was studied as a function of progressive numbers of repetitions of the backgrounds. Because repeating locations and backgrounds could facilitate visual search for objects, these repetitions might alter the object-scene consistency effect by lowering of location uncertainty. Although we find evidence for a significant consistency effect, we find no clear support for impacts of scene priming or location priming on the size of the consistency effect. Additionally, we find evidence that the consistency effect is dependent on the eccentricity of the target objects. These results point to only small influences of priming to object-scene consistency effects but all-in-all the findings can be reconciled with a visual-search explanation of the consistency effect. PMID:24910628
[Visual representation of natural scenes in flicker changes].
Nakashima, Ryoichi; Yokosawa, Kazuhiko
2010-08-01
Coherence theory in scene perception (Rensink, 2002) assumes the retention of volatile object representations on which attention is not focused. On the other hand, visual memory theory in scene perception (Hollingworth & Henderson, 2002) assumes that robust object representations are retained. In this study, we hypothesized that the difference between these two theories is derived from the difference of the experimental tasks that they are based on. In order to verify this hypothesis, we examined the properties of visual representation by using a change detection and memory task in a flicker paradigm. We measured the representations when participants were instructed to search for a change in a scene, and compared them with the intentional memory representations. The visual representations were retained in visual long-term memory even in the flicker paradigm, and were as robust as the intentional memory representations. However, the results indicate that the representations are unavailable for explicitly localizing a scene change, but are available for answering the recognition test. This suggests that coherence theory and visual memory theory are compatible.
Meyerhoff, Hauke S; Huff, Markus
2016-04-01
Human long-term memory for visual objects and scenes is tremendous. Here, we test how auditory information contributes to long-term memory performance for realistic scenes. In a total of six experiments, we manipulated the presentation modality (auditory, visual, audio-visual) as well as semantic congruency and temporal synchrony between auditory and visual information of brief filmic clips. Our results show that audio-visual clips generally elicit more accurate memory performance than unimodal clips. This advantage even increases with congruent visual and auditory information. However, violations of audio-visual synchrony hardly have any influence on memory performance. Memory performance remained intact even with a sequential presentation of auditory and visual information, but finally declined when the matching tracks of one scene were presented separately with intervening tracks during learning. With respect to memory performance, our results therefore show that audio-visual integration is sensitive to semantic congruency but remarkably robust against asymmetries between different modalities.
Bag of Visual Words Model with Deep Spatial Features for Geographical Scene Classification
Wu, Lin
2017-01-01
With the growing use of geotagged images, more and more research effort has been devoted to geographical scene classification. In geographical scene classification, valid spatial feature selection can significantly boost the final performance. Bag of visual words (BoVW) can do well in selecting features in geographical scene classification; nevertheless, it works effectively only if the provided feature extractor is well-matched. In this paper, we use convolutional neural networks (CNNs) to optimize the proposed feature extractor, so that it can learn more suitable visual vocabularies from the geotagged images. Our approach achieves better performance than BoVW as a tool for geographical scene classification on three datasets that contain a variety of scene categories. PMID:28706534
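As a rough illustration of the bag-of-visual-words step mentioned in this abstract, the sketch below quantizes local descriptors against a fixed codebook and returns a normalized word histogram. It is not the authors' method: the function name `bovw_histogram`, the squared Euclidean distance, and the toy two-dimensional descriptors are all assumptions made for illustration, and the CNN-based feature extractor described in the abstract is not modeled here.

```python
def bovw_histogram(descriptors, codebook):
    """Return a normalized histogram of nearest-codeword assignments.

    descriptors: iterable of equal-length numeric tuples (local features).
    codebook: list of numeric tuples (the "visual words"); hypothetical input.
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        # Assign the descriptor to the closest visual word (squared Euclidean distance).
        nearest = min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(d, codebook[k])),
        )
        counts[nearest] += 1
    total = sum(counts)
    # An image with no descriptors gets an all-zero histogram.
    return [c / total for c in counts] if total else [0.0] * len(counts)


if __name__ == "__main__":
    words = [(0.0, 0.0), (10.0, 10.0)]
    feats = [(1.0, 0.0), (9.0, 9.0), (11.0, 10.0), (0.0, 1.0)]
    print(bovw_histogram(feats, words))
```

In a pipeline of this kind the histogram, rather than the raw descriptors, would typically be passed to a classifier; how the codebook is learned is left open here.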
Doerschner, K.; Boyaci, H.; Maloney, L. T.
2007-01-01
We investigated limits on the human visual system’s ability to discount directional variation in complex light fields when estimating Lambertian surface color. Directional variation in the light field was represented in the frequency domain using spherical harmonics. The bidirectional reflectance distribution function of a Lambertian surface acts as a low-pass filter on directional variation in the light field. Consequently, the visual system needs to discount only the low-pass component of the incident light corresponding to the first nine terms of a spherical harmonics expansion (Basri & Jacobs, 2001; Ramamoorthi & Hanrahan, 2001) to accurately estimate surface color. We test experimentally whether the visual system discounts directional variation in the light field up to this physical limit. Our results are consistent with the claim that the visual system can compensate for all of the complexity in the light field that affects the appearance of Lambertian surfaces. PMID:18053846
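The figure of nine terms can be checked with a short count, assuming the usual indexing in which order $l$ of a spherical harmonics expansion contributes $2l+1$ basis functions and the low-pass component is truncated at order $l = 2$ (the symbol $l$ is introduced here for illustration and does not appear in the abstract):

```latex
\sum_{l=0}^{2} (2l + 1) = 1 + 3 + 5 = 9
```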
A systematic comparison between visual cues for boundary detection.
Mély, David A; Kim, Junkyung; McGill, Mason; Guo, Yuliang; Serre, Thomas
2016-03-01
The detection of object boundaries is a critical first step for many visual processing tasks. Multiple cues (we consider luminance, color, motion and binocular disparity) available in the early visual system may signal object boundaries but little is known about their relative diagnosticity and how to optimally combine them for boundary detection. This study thus aims at understanding how early visual processes inform boundary detection in natural scenes. We collected color binocular video sequences of natural scenes to construct a video database. Each scene was annotated with two full sets of ground-truth contours (one set limited to object boundaries and another set which included all edges). We implemented an integrated computational model of early vision that spans all considered cues, and then assessed their diagnosticity by training machine learning classifiers on individual channels. Color and luminance were found to be most diagnostic while stereo and motion were least. Combining all cues yielded a significant improvement in accuracy beyond that of any cue in isolation. Furthermore, the accuracy of individual cues was found to be a poor predictor of their unique contribution for the combination. This result suggested a complex interaction between cues, which we further quantified using regularization techniques. Our systematic assessment of the accuracy of early vision models for boundary detection together with the resulting annotated video dataset should provide a useful benchmark towards the development of higher-level models of visual processing. Copyright © 2016 Elsevier Ltd. All rights reserved.
Fixational Eye Movements in the Earliest Stage of Metazoan Evolution
Bielecki, Jan; Høeg, Jens T.; Garm, Anders
2013-01-01
All known photoreceptor cells adapt to constant light stimuli, fading the retinal image when exposed to an immobile visual scene. Counter strategies are therefore necessary to prevent blindness, and in mammals this is accomplished by fixational eye movements. Cubomedusae occupy a key position for understanding the evolution of complex visual systems and their eyes are assumedly subject to the same adaptive problems as the vertebrate eye, but lack motor control of their visual system. The morphology of the visual system of cubomedusae ensures a constant orientation of the eyes and a clear division of the visual field, but thereby also a constant retinal image when exposed to stationary visual scenes. Here we show that bell contractions used for swimming in the medusae refresh the retinal image in the upper lens eye of Tripedalia cystophora. This strongly suggests that strategies comparable to fixational eye movements have evolved at the earliest metazoan stage to compensate for the intrinsic property of the photoreceptors. Since the timing and amplitude of the rhopalial movements concur with the spatial and temporal resolution of the eye it circumvents the need for post processing in the central nervous system to remove image blur. PMID:23776673
The singular nature of auditory and visual scene analysis in autism
Lin, I.-Fan; Shirama, Aya; Kato, Nobumasa
2017-01-01
Individuals with autism spectrum disorder often have difficulty acquiring relevant auditory and visual information in daily environments, despite not being diagnosed as hearing impaired or having low vision. Recent psychophysical and neurophysiological studies have shown that autistic individuals have highly specific individual differences at various levels of information processing, including feature extraction, automatic grouping and top-down modulation in auditory and visual scene analysis. Comparison of the characteristics of scene analysis between auditory and visual modalities reveals some essential commonalities, which could provide clues about the underlying neural mechanisms. Further progress in this line of research may suggest effective methods for diagnosing and supporting autistic individuals. This article is part of the themed issue ‘Auditory and visual scene analysis'. PMID:28044025
Funnell, Elaine; Wilding, John
2011-02-01
We report a longitudinal study of an exceptional child (S.R.) whose early-acquired visual agnosia, following encephalitis at 8 weeks of age, did not prevent her from learning to construct an increasing vocabulary of visual object forms (drawn from different categories), albeit slowly. S.R. had problems perceiving subtle differences in shape; she was unable to segment local letters within global displays; and she would bring complex scenes close to her eyes: a symptom suggestive of an attempt to reduce visual crowding. Investigations revealed a robust ability to use the gestalt grouping factors of proximity and collinearity to detect fragmented forms in noisy backgrounds, compared with a very weak ability to segment fragmented forms on the basis of contrasts of shape. When contrasts in spatial grouping and shape were pitted against each other, shape made little contribution, consistent with problems in perceiving complex scenes, but when shape contrast was varied, and spatial grouping was held constant, S.R. showed the same hierarchy of difficulty as the controls, although her responses were slowed. This is the first report of a child's visual-perceptual development following very early neurological impairments to the visual cortex. Her ability to learn to perceive visual shape following damage at a rudimentary stage of perceptual development contrasts starkly with the loss of such ability in childhood cases of acquired visual agnosia that follow damage to the established perceptual system. Clearly, there is a critical period during which neurological damage to the highly active, early developing visual-perceptual system does not prevent but only impairs further learning.
The Identification and Modeling of Visual Cue Usage in Manual Control Task Experiments
NASA Technical Reports Server (NTRS)
Sweet, Barbara Townsend; Trejo, Leonard J. (Technical Monitor)
1999-01-01
Many fields of endeavor require humans to conduct manual control tasks while viewing a perspective scene. Manual control refers to tasks in which continuous, or nearly continuous, control adjustments are required. Examples include flying an aircraft, driving a car, and riding a bicycle. Perspective scenes can arise through natural viewing of the world, simulation of a scene (as in flight simulators), or through imaging devices (such as the cameras on an unmanned aerospace vehicle). Designers frequently have some degree of control over the content and characteristics of a perspective scene; airport designers can choose runway markings, vehicle designers can influence the size and shape of windows, as well as the location of the pilot, and simulator database designers can choose scene complexity and content. Little theoretical framework exists to help designers determine the answers to questions related to perspective scene content. An empirical approach is most commonly used to determine optimum perspective scene configurations. The goal of the research effort described in this dissertation has been to provide a tool for modeling the characteristics of human operators conducting manual control tasks with perspective-scene viewing. This is done for the purpose of providing an algorithmic, as opposed to empirical, method for analyzing the effects of changing perspective scene content for closed-loop manual control tasks.
NASA Astrophysics Data System (ADS)
Keane, Tommy P.; Cahill, Nathan D.; Tarduno, John A.; Jacobs, Robert A.; Pelz, Jeff B.
2014-02-01
Mobile eye-tracking provides a fairly unique opportunity to record and elucidate cognition in action. In our research, we are searching for patterns in, and distinctions between, the visual-search performance of experts and novices in the geo-sciences. Traveling to regions formed by various geological processes as part of an introductory field studies course in geology, we record the prima facie gaze patterns of experts and novices when they are asked to determine the modes of geological activity that have formed the scene-view presented to them. Recording eye video and scene video in natural settings generates complex imagery that requires advanced applications of computer vision research to generate registrations and mappings between the views of separate observers. By developing such mappings, we could then place many observers into a single mathematical space where we can spatio-temporally analyze inter- and intra-subject fixations, saccades, and head motions. While working towards perfecting these mappings, we developed an updated experiment setup that allowed us to statistically analyze intra-subject eye-movement events without the need for a common domain. Through such analyses we are finding statistical differences between novices and experts in these visual-search tasks. In the course of this research we have developed a unified, open-source, software framework for processing, visualization, and interaction of mobile eye-tracking and high-resolution panoramic imagery.
Behavioral and Neural Representations of Spatial Directions across Words, Schemas, and Images.
Weisberg, Steven M; Marchette, Steven A; Chatterjee, Anjan
2018-05-23
Modern spatial navigation requires fluency with multiple representational formats, including visual scenes, signs, and words. These formats convey different information. Visual scenes are rich and specific but contain extraneous details. Arrows, as an example of signs, are schematic representations in which the extraneous details are eliminated, but analog spatial properties are preserved. Words eliminate all spatial information and convey spatial directions in a purely abstract form. How does the human brain compute spatial directions within and across these formats? To investigate this question, we conducted two experiments on men and women: a behavioral study that was preregistered and a neuroimaging study using multivoxel pattern analysis of fMRI data to uncover similarities and differences among representational formats. Participants in the behavioral study viewed spatial directions presented as images, schemas, or words (e.g., "left"), and responded to each trial, indicating whether the spatial direction was the same or different as the one viewed previously. They responded more quickly to schemas and words than images, despite the visual complexity of stimuli being matched. Participants in the fMRI study performed the same task but responded only to occasional catch trials. Spatial directions in images were decodable in the intraparietal sulcus bilaterally but were not in schemas and words. Spatial directions were also decodable between all three formats. These results suggest that intraparietal sulcus plays a role in calculating spatial directions in visual scenes, but this neural circuitry may be bypassed when the spatial directions are presented as schemas or words. SIGNIFICANCE STATEMENT Human navigators encounter spatial directions in various formats: words ("turn left"), schematic signs (an arrow showing a left turn), and visual scenes (a road turning left). The brain must transform these spatial directions into a plan for action. 
Here, we investigate similarities and differences between neural representations of these formats. We found that bilateral intraparietal sulci represent spatial directions in visual scenes and across the three formats. We also found that participants respond quickest to schemas, then words, then images, suggesting that spatial directions in abstract formats are easier to interpret than concrete formats. These results support a model of spatial direction interpretation in which spatial directions are either computed for real world action or computed for efficient visual comparison. Copyright © 2018 the authors.
Intrinsic dimensionality predicts the saliency of natural dynamic scenes.
Vig, Eleonora; Dorr, Michael; Martinetz, Thomas; Barth, Erhardt
2012-06-01
Since visual attention-based computer vision applications have gained popularity, ever more complex, biologically inspired models seem to be needed to predict salient locations (or interest points) in naturalistic scenes. In this paper, we explore how far one can go in predicting eye movements by using only basic signal processing, such as image representations derived from efficient coding principles, and machine learning. To this end, we gradually increase the complexity of a model from simple single-scale saliency maps computed on grayscale videos to spatiotemporal multiscale and multispectral representations. Using a large collection of eye movements on high-resolution videos, supervised learning techniques fine-tune the free parameters whose addition is inevitable with increasing complexity. The proposed model, although very simple, demonstrates significant improvement in predicting salient locations in naturalistic videos over four selected baseline models and two distinct data labeling scenarios.
Correlated Topic Vector for Scene Classification.
Wei, Pengxu; Qin, Fei; Wan, Fang; Zhu, Yi; Jiao, Jianbin; Ye, Qixiang
2017-07-01
Scene images usually involve semantic correlations, particularly when considering large-scale image data sets. This paper proposes a novel generative image representation, correlated topic vector, to model such semantic correlations. Derived from the correlated topic model, correlated topic vector is intended to exploit the correlations among topics, which are seldom considered in conventional feature encodings, e.g., Fisher vector, but do exist in scene images. It is expected that the involvement of correlations can increase the discriminative capability of the learned generative model and consequently improve the recognition accuracy. Incorporated with the Fisher kernel method, correlated topic vector inherits the advantages of Fisher vector. The contributions to the topics of visual words have been further employed by incorporating the Fisher kernel framework to indicate the differences among scenes. Combined with the deep convolutional neural network (CNN) features and Gibbs sampling solution, correlated topic vector shows great potential when processing large-scale and complex scene image data sets. Experiments on two scene image data sets demonstrate that correlated topic vector significantly improves upon the deep CNN features, and outperforms existing Fisher kernel-based features.
Campagne, Aurélie; Fradcourt, Benoit; Pichat, Cédric; Baciu, Monica; Kauffmann, Louise; Peyrin, Carole
2016-01-01
Visual processing of emotional stimuli critically depends on the type of cognitive appraisal involved. The present fMRI pilot study aimed to investigate the cerebral correlates involved in the visual processing of emotional scenes in two tasks, one emotional, based on the appraisal of personal emotional experience, and the other motivational, based on the appraisal of the tendency to action. Given that the use of spatial frequency information is relatively flexible during the visual processing of emotional stimuli depending on the task’s demands, we also explored the effect of the type of spatial frequency in visual stimuli in each task by using emotional scenes filtered in low spatial frequency (LSF) and high spatial frequencies (HSF). Activation was observed in the visual areas of the fusiform gyrus for all emotional scenes in both tasks, and in the amygdala for unpleasant scenes only. The motivational task induced additional activation in frontal motor-related areas (e.g. premotor cortex, SMA) and parietal regions (e.g. superior and inferior parietal lobules). Parietal regions were recruited particularly during the motivational appraisal of approach in response to pleasant scenes. These frontal and parietal activations, respectively, suggest that motor and navigation processes play a specific role in the identification of the tendency to action in the motivational task. Furthermore, activity observed in the motivational task, in response to both pleasant and unpleasant scenes, was significantly greater for HSF than for LSF scenes, suggesting that the tendency to action is driven mainly by the detailed information contained in scenes. Results for the emotional task suggest that spatial frequencies play only a small role in the evaluation of unpleasant and pleasant emotions. 
Our preliminary study revealed a partial distinction between visual processing of emotional scenes during identification of the tendency to action, and during identification of personal emotional experiences. It also illustrates flexible use of the spatial frequencies contained in scenes depending on their emotional valence and on task demands. PMID:26757433
L. Linsen; B.J. Karis; E.G. McPherson; B. Hamann
2005-01-01
In computer graphics, models describing the fractal branching structure of trees typically exploit the modularity of tree structures. The models are based on local production rules, which are applied iteratively and simultaneously to create a complex branching system. The objective is to generate three-dimensional scenes of often many realistic-looking and non-...
Visual wetness perception based on image color statistics.
Sawayama, Masataka; Adelson, Edward H; Nishida, Shin'ya
2017-05-01
Color vision provides humans and animals with the abilities to discriminate colors based on the wavelength composition of light and to determine the location and identity of objects of interest in cluttered scenes (e.g., ripe fruit among foliage). However, we argue that color vision can inform us about much more than color alone. Since a trichromatic image carries more information about the optical properties of a scene than a monochromatic image does, color can help us recognize complex material qualities. Here we show that human vision uses color statistics of an image for the perception of an ecologically important surface condition (i.e., wetness). Psychophysical experiments showed that overall enhancement of chromatic saturation, combined with a luminance tone change that increases the darkness and glossiness of the image, tended to make dry scenes look wetter. Theoretical analysis along with image analysis of real objects indicated that our image transformation, which we call the wetness enhancing transformation, is consistent with actual optical changes produced by surface wetting. Furthermore, we found that the wetness enhancing transformation operator was more effective for the images with many colors (large hue entropy) than for those with few colors (small hue entropy). The hue entropy may be used to separate surface wetness from other surface states having similar optical properties. While surface wetness and surface color might seem to be independent, there are higher order color statistics that can influence wetness judgments, in accord with the ecological statistics. The present findings indicate that the visual system uses color image statistics in an elegant way to help estimate the complex physical status of a scene.
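The abstract above describes the wetness enhancing transformation only qualitatively (more chromatic saturation, a darker luminance tone curve) and names hue entropy as the statistic that predicts its effectiveness. A minimal Python sketch of that idea follows. The HSV colour space, the gain `sat_gain`, the exponent `gamma`, and the 12-bin hue histogram are all illustrative assumptions, not the operator or parameters used in the study.

```python
import colorsys
import math

def enhance_wetness(pixels, sat_gain=1.5, gamma=2.0):
    """Apply an assumed 'wetter look' transform to (r, g, b) floats in [0, 1].

    sat_gain and gamma are hypothetical values chosen for illustration.
    """
    out = []
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        s = min(1.0, s * sat_gain)   # boost chromatic saturation
        v = v ** gamma               # darken mid-tones, leave highlights near 1
        out.append(colorsys.hsv_to_rgb(h, s, v))
    return out

def hue_entropy(pixels, bins=12):
    """Shannon entropy (bits) of the hue histogram; the bin count is an assumption."""
    counts = [0] * bins
    for r, g, b in pixels:
        h, _, _ = colorsys.rgb_to_hsv(r, g, b)
        counts[min(bins - 1, int(h * bins))] += 1
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)
```

Under these assumptions, an image whose pixels spread over many hue bins has a larger `hue_entropy`, which is the condition under which the abstract reports the transformation to be more effective.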
Visual search in scenes involves selective and non-selective pathways
Wolfe, Jeremy M; Vo, Melissa L-H; Evans, Karla K; Greene, Michelle R
2010-01-01
How do we find objects in scenes? For decades, visual search models have been built on experiments in which observers search for targets, presented among distractor items, isolated and randomly arranged on blank backgrounds. Are these models relevant to search in continuous scenes? This paper argues that the mechanisms that govern artificial, laboratory search tasks do play a role in visual search in scenes. However, scene-based information is used to guide search in ways that had no place in earlier models. Search in scenes may be best explained by a dual-path model: A “selective” path in which candidate objects must be individually selected for recognition and a “non-selective” path in which information can be extracted from global / statistical information. PMID:21227734
On the Encoding of Panoramic Visual Scenes in Navigating Wood Ants.
Buehlmann, Cornelia; Woodgate, Joseph L; Collett, Thomas S
2016-08-08
A natural visual panorama is a complex stimulus formed of many component shapes. It gives an animal a sense of place and supplies guiding signals for controlling the animal's direction of travel [1]. Insects with their economical neural processing [2] are good subjects for analyzing the encoding and memory of such scenes [3-5]. Honeybees [6] and ants [7, 8] foraging from their nest can follow habitual routes guided only by visual cues within a natural panorama. Here, we analyze the headings that ants adopt when a familiar panorama composed of two or three shapes is manipulated by removing a shape or by replacing training shapes with unfamiliar ones. We show that (1) ants recognize a component shape not only through its particular visual features, but also by its spatial relation to other shapes in the scene, and that (2) each segmented shape [9] contributes its own directional signal to generating the ant's chosen heading. We found earlier that ants trained to a feeder placed to one side of a single shape [10] and tested with shapes of different widths learn the retinal position of the training shape's center of mass (CoM) [11, 12] when heading toward the feeder. They then guide themselves by placing the shape's CoM in the remembered retinal position [10]. This use of CoM in a one-shape panorama combined with the results here suggests that the ants' memory of a multi-shape panorama comprises the retinal positions of the horizontal CoMs of each major component shape within the scene, bolstered by local descriptors of that shape. Copyright © 2016 Elsevier Ltd. All rights reserved.
The priming function of in-car audio instruction.
Keyes, Helen; Whitmore, Antony; Naneva, Stanislava; McDermott, Daragh
2018-05-01
Studies to date have focused on the priming power of visual road signs, but not the priming potential of audio road scene instruction. Here, the relative priming power of visual, audio, and multisensory road scene instructions was assessed. In a lab-based study, participants responded to target road scene turns following visual, audio, or multisensory road turn primes which were congruent or incongruent to the primes in direction, or control primes. All types of instruction (visual, audio, and multisensory) were successful in priming responses to a road scene. Responses to multisensory-primed targets (both audio and visual) were faster than responses to either audio or visual primes alone. Incongruent audio primes did not affect performance negatively in the manner of incongruent visual or multisensory primes. Results suggest that audio instructions have the potential to prime drivers to respond quickly and safely to their road environment. Peak performance will be observed if audio and visual road instruction primes can be timed to co-occur.
Eye Movements and Visual Memory for Scenes
2005-01-01
Scene memory research has demonstrated that the memory representation of a semantically inconsistent object in a scene is more detailed and/or complete... memory during scene viewing, then changes to semantically inconsistent objects (which should be represented more completely) should be detected more... semantic description. Due to the surprise nature of the visual memory test, any learning that occurred during the search portion of the experiment was
Visual Acuity Using Head-fixed Displays During Passive Self and Surround Motion
NASA Technical Reports Server (NTRS)
Wood, Scott J.; Black, F. Owen; Stallings, Valerie; Peters, Brian
2007-01-01
The ability to read head-fixed displays on various motion platforms requires the suppression of vestibulo-ocular reflexes. This study examined dynamic visual acuity while viewing a head-fixed display during different self and surround rotation conditions. Twelve healthy subjects were asked to report the orientation of Landolt C optotypes presented on a micro-display fixed to a rotating chair at 50 cm distance. Acuity thresholds were determined by the lowest size at which the subjects correctly identified 3 of 5 optotype orientations at peak velocity. Visual acuity was compared across four different conditions, each tested at 0.05 and 0.4 Hz (peak amplitude of 57 deg/s). The four conditions included: subject rotated in semi-darkness (i.e., limited to background illumination of the display), subject stationary while visual scene rotated, subject rotated around a stationary visual background, and both subject and visual scene rotated together. Visual acuity performance was greatest when the subject rotated around a stationary visual background; i.e., when both vestibular and visual inputs provided concordant information about the motion. Visual acuity performance was most reduced when the subject and visual scene rotated together; i.e., when the visual scene provided discordant information about the motion. Ranges of 4-5 logMAR step sizes across the conditions indicated the acuity task was sufficient to discriminate visual performance levels. The background visual scene can influence the ability to read head-fixed displays during passive motion disturbances. Dynamic visual acuity using head-fixed displays can provide an operationally relevant screening tool for visual performance during exposure to novel acceleration environments.
Drager, Kathryn; Light, Janice; Caron, Jessica Gosnell
2017-01-01
Purpose: Augmentative and alternative communication (AAC) promotes communicative participation and language development for young children with complex communication needs. However, the motor, linguistic, and cognitive demands of many AAC technologies restrict young children's operational use of and influence over these technologies. The purpose of the current study is to better understand young children's participation in programming vocabulary “just in time” on an AAC application with minimized demands. Method: A descriptive study was implemented to highlight the participation of 10 typically developing toddlers (M age: 16 months, range: 10–22 months) in just-in-time vocabulary programming in an AAC app with visual scene displays. Results: All 10 toddlers participated in some capacity in adding new visual scene displays and vocabulary to the app just in time. Differences in participation across steps were observed, suggesting variation in the developmental demands of controls involved in vocabulary programming. Conclusions: Results from the current study provide clinical insights toward involving young children in AAC programming just in time and steps that may allow for more independent participation or require more scaffolding. Technology designed to minimize motor, cognitive, and linguistic demands may allow children to participate in programming devices at a younger age. PMID:28586825
Holyfield, Christine; Drager, Kathryn; Light, Janice; Caron, Jessica Gosnell
2017-08-15
Augmentative and alternative communication (AAC) promotes communicative participation and language development for young children with complex communication needs. However, the motor, linguistic, and cognitive demands of many AAC technologies restrict young children's operational use of and influence over these technologies. The purpose of the current study is to better understand young children's participation in programming vocabulary "just in time" on an AAC application with minimized demands. A descriptive study was implemented to highlight the participation of 10 typically developing toddlers (M age: 16 months, range: 10-22 months) in just-in-time vocabulary programming in an AAC app with visual scene displays. All 10 toddlers participated in some capacity in adding new visual scene displays and vocabulary to the app just in time. Differences in participation across steps were observed, suggesting variation in the developmental demands of controls involved in vocabulary programming. Results from the current study provide clinical insights toward involving young children in AAC programming just in time and steps that may allow for more independent participation or require more scaffolding. Technology designed to minimize motor, cognitive, and linguistic demands may allow children to participate in programming devices at a younger age.
Unconscious analyses of visual scenes based on feature conjunctions.
Tachibana, Ryosuke; Noguchi, Yasuki
2015-06-01
To efficiently process a cluttered scene, the visual system analyzes statistical properties or regularities of visual elements embedded in the scene. It is controversial, however, whether those scene analyses could also work for stimuli unconsciously perceived. Here we show that our brain performs the unconscious scene analyses not only using a single featural cue (e.g., orientation) but also based on conjunctions of multiple visual features (e.g., combinations of color and orientation information). Subjects foveally viewed a stimulus array (duration: 50 ms) where 4 types of bars (red-horizontal, red-vertical, green-horizontal, and green-vertical) were intermixed. Although a conscious perception of those bars was inhibited by a subsequent mask stimulus, the brain correctly analyzed the information about color, orientation, and color-orientation conjunctions of those invisible bars. The information of those features was then used for the unconscious configuration analysis (statistical processing) of the central bars, which induced a perceptual bias and illusory feature binding in visible stimuli at peripheral locations. While statistical analyses and feature binding are normally 2 key functions of the visual system to construct coherent percepts of visual scenes, our results show that a high-level analysis combining those 2 functions is correctly performed by unconscious computations in the brain. (c) 2015 APA, all rights reserved.
How affective information from faces and scenes interacts in the brain
Vandenbulcke, Mathieu; Sinke, Charlotte B. A.; Goebel, Rainer; de Gelder, Beatrice
2014-01-01
Facial expression perception can be influenced by the natural visual context in which the face is perceived. We performed an fMRI experiment presenting participants with fearful or neutral faces against threatening or neutral background scenes. Triangles and scrambled scenes served as control stimuli. The results showed that the valence of the background influences face selective activity in the right anterior parahippocampal place area (PPA) and subgenual anterior cingulate cortex (sgACC) with higher activation for neutral backgrounds compared to threatening backgrounds (controlled for isolated background effects) and that this effect correlated with trait empathy in the sgACC. In addition, the left fusiform gyrus (FG) responds to the affective congruence between face and background scene. The results show that valence of the background modulates face processing and support the hypothesis that empathic processing in sgACC is inhibited when affective information is present in the background. In addition, the findings reveal a pattern of complex scene perception showing a gradient of functional specialization along the posterior–anterior axis: from sensitivity to the affective content of scenes (extrastriate body area: EBA and posterior PPA), over scene emotion–face emotion interaction (left FG) via category–scene interaction (anterior PPA) to scene–category–personality interaction (sgACC). PMID:23956081
Processing reafferent and exafferent visual information for action and perception.
Reichenbach, Alexandra; Diedrichsen, Jörn
2015-01-01
A recent study suggests that reafferent hand-related visual information utilizes a privileged, attention-independent processing channel for motor control. This process was termed visuomotor binding to reflect its proposed function: linking visual reafferences to the corresponding motor control centers. Here, we ask whether the advantage of processing reafferent over exafferent visual information is a specific feature of the motor processing stream or whether the improved processing also benefits the perceptual processing stream. Human participants performed a bimanual reaching task in a cluttered visual display, and one of the visual hand cursors could be displaced laterally during the movement. We measured the rapid feedback responses of the motor system as well as matched perceptual judgments of which cursor was displaced. Perceptual judgments were either made by watching the visual scene without moving or made simultaneously to the reaching tasks, such that the perceptual processing stream could also profit from the specialized processing of reafferent information in the latter case. Our results demonstrate that perceptual judgments in the heavily cluttered visual environment were improved when performed based on reafferent information. Even in this case, however, the filtering capability of the perceptual processing stream suffered more from the increasing complexity of the visual scene than the motor processing stream. These findings suggest partly shared and partly segregated processing of reafferent information for vision for motor control versus vision for perception.
Cant, Jonathan S; Xu, Yaoda
2017-02-01
Our visual system can extract summary statistics from large collections of objects without forming detailed representations of the individual objects in the ensemble. In a region in ventral visual cortex encompassing the collateral sulcus and the parahippocampal gyrus and overlapping extensively with the scene-selective parahippocampal place area (PPA), we have previously reported fMRI adaptation to object ensembles when ensemble statistics repeated, even when local image features differed across images (e.g., two different images of the same strawberry pile). We additionally showed that this ensemble representation is similar to (but still distinct from) how visual texture patterns are processed in this region and is not explained by appealing to differences in the color of the elements that make up the ensemble. To further explore the nature of ensemble representation in this brain region, here we used PPA as our ROI and investigated in detail how the shape and surface properties (i.e., both texture and color) of the individual objects constituting an ensemble affect the ensemble representation in anterior-medial ventral visual cortex. We photographed object ensembles of stone beads that varied in shape and surface properties. A given ensemble always contained beads of the same shape and surface properties (e.g., an ensemble of star-shaped rose quartz beads). A change to the shape and/or surface properties of all the beads in an ensemble resulted in a significant release from adaptation in PPA compared with conditions in which no ensemble feature changed. In contrast, in the object-sensitive lateral occipital area (LO), we only observed a significant release from adaptation when the shape of the ensemble elements varied, and found no significant results in additional scene-sensitive regions, namely, the retrosplenial complex and occipital place area. 
Together, these results demonstrate that the shape and surface properties of the individual objects comprising an ensemble both contribute significantly to object ensemble representation in anterior-medial ventral visual cortex and further demonstrate a functional dissociation between object- (LO) and scene-selective (PPA) visual cortical regions and within the broader scene-processing network itself.
Visual flow scene effects on the somatogravic illusion in non-pilots.
Eriksson, Lars; von Hofsten, Claes; Tribukait, Arne; Eiken, Ola; Andersson, Peter; Hedström, Johan
2008-09-01
The somatogravic illusion (SGI) is easily broken when the pilot looks out the aircraft window during daylight flight, but it has proven difficult to break or even reduce the SGI in non-pilots in simulators using synthetic visual scenes. Could visual-flow scenes that accommodate compensatory head movement reduce the SGI in naive subjects? We investigated the effects of visual cues on the SGI induced by a human centrifuge. The subject was equipped with a head-tracked, head-mounted display (HMD) and was seated in a fixed gondola facing the center of rotation. The angular velocity of the centrifuge increased from near zero until a 0.57-G centripetal acceleration was attained, resulting in a tilt of the gravitoinertial force vector, corresponding to a pitch-up of 30 degrees. The subject indicated perceived horizontal continuously by means of a manual adjustable-plate system. We performed two experiments with within-subjects designs. In Experiment 1, the subjects (N = 13) viewed a darkened HMD and a presentation of simple visual flow beneath a horizon. In Experiment 2, the subjects (N = 12) viewed a darkened HMD, a scene including symbology superimposed on simple visual flow and horizon, and this scene without visual flow (static). In Experiment 1, visual flow reduced the SGI from 12.4 +/- 1.4 degrees (mean +/- SE) to 8.7 +/- 1.5 degrees. In Experiment 2, the SGI was smaller in the visual flow condition (9.3 +/- 1.8 degrees) than with the static scene (13.3 +/- 1.7 degrees) and without HMD presentation (14.5 +/- 2.3 degrees), respectively. It is possible to reduce the SGI in non-pilots by means of a synthetic horizon and simple visual flow conveyed by a head-tracked HMD. This may reflect the power of a more intuitive display for reducing the SGI.
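The figures quoted in the abstract above (0.57-G centripetal acceleration, a pitch-up of about 30 degrees) follow from simple vector geometry: the gravitoinertial force is the resultant of 1 G of gravity and the horizontal centripetal term. A small Python check, offered as a sketch with hypothetical function names:

```python
import math

def gif_tilt_deg(horizontal_g: float) -> float:
    """Tilt of the gravitoinertial vector from vertical, in degrees,
    for a horizontal acceleration expressed in units of G."""
    return math.degrees(math.atan2(horizontal_g, 1.0))

def gif_magnitude_g(horizontal_g: float) -> float:
    """Magnitude of the resultant gravitoinertial vector, in G."""
    return math.hypot(horizontal_g, 1.0)

tilt = gif_tilt_deg(0.57)          # roughly 29.7 degrees, i.e. about 30
magnitude = gif_magnitude_g(0.57)  # roughly 1.15 G
```

The reported illusion magnitudes (about 9 to 14 degrees) are therefore well below the full 30-degree tilt of the force vector.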
Helo, Andrea; van Ommen, Sandrien; Pannasch, Sebastian; Danteny-Dordoigne, Lucile; Rämä, Pia
2017-11-01
Conceptual representations of everyday scenes are built in interaction with the visual environment, and these representations guide our visual attention. Perceptual features and object-scene semantic consistency have been found to attract our attention during scene exploration. The present study examined how visual attention in 24-month-old toddlers is attracted by semantic violations and how perceptual features (i.e., saliency, centre distance, clutter and object size) and linguistic properties (i.e., object label frequency and label length) affect gaze distribution. We compared eye movements of 24-month-old toddlers and adults while exploring everyday scenes which either contained an inconsistent (e.g., soap on a breakfast table) or consistent (e.g., soap in a bathroom) object. Perceptual features such as saliency, centre distance and clutter of the scene affected looking times in the toddler group during the whole viewing time, whereas looking times in adults were affected only by centre distance during the early viewing time. Adults looked longer at inconsistent than at consistent objects regardless of whether the objects had high or low saliency. In contrast, toddlers showed a semantic consistency effect only when objects were highly salient. Additionally, toddlers with lower vocabulary skills looked longer at inconsistent objects, while toddlers with higher vocabulary skills looked equally long at both consistent and inconsistent objects. Our results indicate that 24-month-old children use scene context to guide visual attention when exploring the visual environment. However, perceptual features have a stronger influence on eye movement guidance in toddlers than in adults. Our results also indicate that language skills influence cognitive but not perceptual guidance of eye movements during scene perception in toddlers. Copyright © 2017 Elsevier Inc. All rights reserved.
Comparing object recognition from binary and bipolar edge images for visual prostheses
Jung, Jae-Hyun; Pu, Tian; Peli, Eli
2017-01-01
Visual prostheses require an effective representation method due to the limited display condition which has only 2 or 3 levels of grayscale in low resolution. Edges derived from abrupt luminance changes in images carry essential information for object recognition. Typical binary (black and white) edge images have been used to represent features to convey essential information. However, in scenes with a complex cluttered background, the recognition rate of the binary edge images by human observers is limited and additional information is required. The polarity of edges and cusps (black or white features on a gray background) carries important additional information; the polarity may provide shape from shading information missing in the binary edge image. This depth information may be restored by using bipolar edges. We compared object recognition rates from 16 binary edge images and bipolar edge images by 26 subjects to determine the possible impact of bipolar filtering in visual prostheses with 3 or more levels of grayscale. Recognition rates were higher with bipolar edge images and the improvement was significant in scenes with complex backgrounds. The results also suggest that erroneous shape from shading interpretation of bipolar edges resulting from pigment rather than boundaries of shape may confound the recognition. PMID:28458481
The Role of Visual Experience on the Representation and Updating of Novel Haptic Scenes
ERIC Educational Resources Information Center
Pasqualotto, Achille; Newell, Fiona N.
2007-01-01
We investigated the role of visual experience on the spatial representation and updating of haptic scenes by comparing recognition performance across sighted, congenitally and late blind participants. We first established that spatial updating occurs in sighted individuals to haptic scenes of novel objects. All participants were required to…
Miconi, Thomas; Groomes, Laura; Kreiman, Gabriel
2016-01-01
When searching for an object in a scene, how does the brain decide where to look next? Visual search theories suggest the existence of a global “priority map” that integrates bottom-up visual information with top-down, target-specific signals. We propose a mechanistic model of visual search that is consistent with recent neurophysiological evidence, can localize targets in cluttered images, and predicts single-trial behavior in a search task. This model posits that a high-level retinotopic area selective for shape features receives global, target-specific modulation and implements local normalization through divisive inhibition. The normalization step is critical to prevent highly salient bottom-up features from monopolizing attention. The resulting activity pattern constitutes a priority map that tracks the correlation between local input and target features. The maximum of this priority map is selected as the locus of attention. The visual input is then spatially enhanced around the selected location, allowing object-selective visual areas to determine whether the target is present at this location. This model can localize objects both in array images and when objects are pasted in natural scenes. The model can also predict single-trial human fixations, including those in error and target-absent trials, in a search task involving complex objects. PMID:26092221
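The priority-map computation described in the abstract above (target-specific modulation, divisive normalization, selection of the maximum) can be illustrated with a toy Python sketch. The feature vectors, the constant `sigma`, and the choice to normalize by the summed local input are assumptions made for illustration; the published model operates on high-level shape features and is not reproduced here.

```python
def priority_map(features, target, sigma=0.1, normalize=True):
    """Return one priority value per location.

    features: list of per-location feature vectors (non-negative floats).
    target:   top-down target feature vector of the same length.
    sigma:    hypothetical constant that keeps the division well defined.
    """
    priorities = []
    for f in features:
        # Top-down modulation: weight local features by the target template.
        drive = sum(fi * ti for fi, ti in zip(f, target))
        if normalize:
            # Divisive inhibition by the total local input.
            drive /= sigma + sum(f)
        priorities.append(drive)
    return priorities

def attend(features, target, **kwargs):
    """Index of the location with maximal priority (the locus of attention)."""
    p = priority_map(features, target, **kwargs)
    return max(range(len(p)), key=p.__getitem__)

# Location 0 weakly matches the target; location 1 is salient but mismatched.
features = [[1.0, 0.1], [3.0, 9.0]]
target = [1.0, 0.0]
with_norm = attend(features, target)                      # picks location 0
without_norm = attend(features, target, normalize=False)  # picks location 1
```

In this toy case the salient distractor wins without normalization and loses with it, which is the role the abstract assigns to the normalization step.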
Modeling global scene factors in attention
NASA Astrophysics Data System (ADS)
Torralba, Antonio
2003-07-01
Models of visual attention have focused predominantly on bottom-up approaches that ignored structured contextual and scene information. I propose a model of contextual cueing for attention guidance based on the global scene configuration. It is shown that the statistics of low-level features across the whole image can be used to prime the presence or absence of objects in the scene and to predict their location, scale, and appearance before exploring the image. In this scheme, visual context information can become available early in the visual processing chain, which allows modulation of the saliency of image regions and provides an efficient shortcut for object detection and recognition. © 2003 Optical Society of America
Age-related macular degeneration changes the processing of visual scenes in the brain.
Ramanoël, Stephen; Chokron, Sylvie; Hera, Ruxandra; Kauffmann, Louise; Chiquet, Christophe; Krainik, Alexandre; Peyrin, Carole
2018-01-01
In age-related macular degeneration (AMD), the processing of fine details in a visual scene, based on a high spatial frequency processing, is impaired, while the processing of global shapes, based on a low spatial frequency processing, is relatively well preserved. The present fMRI study aimed to investigate the residual abilities and functional brain changes of spatial frequency processing in visual scenes in AMD patients. AMD patients and normally sighted elderly participants performed a categorization task using large black and white photographs of scenes (indoors vs. outdoors) filtered in low and high spatial frequencies, and nonfiltered. The study also explored the effect of luminance contrast on the processing of high spatial frequencies. The contrast across scenes was either unmodified or equalized using a root-mean-square contrast normalization in order to increase contrast in high-pass filtered scenes. Performance was lower for high-pass filtered scenes than for low-pass and nonfiltered scenes, for both AMD patients and controls. The deficit for processing high spatial frequencies was more pronounced in AMD patients than in controls and was associated with lower activity for patients than controls not only in the occipital areas dedicated to central and peripheral visual fields but also in a distant cerebral region specialized for scene perception, the parahippocampal place area. Increasing the contrast improved the processing of high spatial frequency content and spurred activation of the occipital cortex for AMD patients. These findings may lead to new perspectives for rehabilitation procedures for AMD patients.
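Root-mean-square contrast normalization, mentioned in the abstract above, is commonly defined as rescaling pixel intensities so that their standard deviation matches a chosen value. A minimal Python sketch follows; the target mean and target RMS values are illustrative assumptions, since the abstract does not state the values used in the study.

```python
import statistics

def equalize_rms_contrast(pixels, target_mean=0.5, target_rms=0.1):
    """Rescale grayscale intensities in [0, 1] to a target mean and RMS contrast.

    RMS contrast is taken here as the population standard deviation of the
    intensities. target_mean and target_rms are hypothetical defaults.
    """
    mean = statistics.fmean(pixels)
    rms = statistics.pstdev(pixels)
    if rms == 0:
        # A uniform image has no contrast to rescale.
        return [target_mean] * len(pixels)
    gain = target_rms / rms
    # Clip to the displayable range; heavy clipping lowers the achieved RMS.
    return [min(1.0, max(0.0, target_mean + (p - mean) * gain)) for p in pixels]
```

High-pass filtered scenes typically have low contrast, so the gain is greater than 1 for them, which matches the abstract's description of normalization increasing contrast in those stimuli.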
How emotion leads to selective memory: neuroimaging evidence.
Waring, Jill D; Kensinger, Elizabeth A
2011-06-01
Often memory for emotionally arousing items is enhanced relative to neutral items within complex visual scenes, but this enhancement can come at the expense of memory for peripheral background information. This 'trade-off' effect has been elicited by a range of stimulus valence and arousal levels, yet the magnitude of the effect has been shown to vary with these factors. Using fMRI, this study investigated the neural mechanisms underlying this selective memory for emotional scenes. Further, we examined how these processes are affected by stimulus dimensions of arousal and valence. The trade-off effect in memory occurred for low to high arousal positive and negative scenes. There was a core emotional memory network associated with the trade-off among all the emotional scene types, however, there were additional regions that were uniquely associated with the trade-off for each individual scene type. These results suggest that there is a common network of regions associated with the emotional memory trade-off effect, but that valence and arousal also independently affect the neural activity underlying the effect. Copyright © 2011 Elsevier Ltd. All rights reserved.
How emotion leads to selective memory: Neuroimaging evidence
Waring, Jill D.; Kensinger, Elizabeth A.
2011-01-01
Often memory for emotionally arousing items is enhanced relative to neutral items within complex visual scenes, but this enhancement can come at the expense of memory for peripheral background information. This ‘trade-off’ effect has been elicited by a range of stimulus valence and arousal levels, yet the magnitude of the effect has been shown to vary with these factors. Using fMRI, this study investigated the neural mechanisms underlying this selective memory for emotional scenes. Further, we examined how these processes are affected by stimulus dimensions of arousal and valence. The trade-off effect in memory occurred for low to high arousal positive and negative scenes. There was a core emotional memory network associated with the trade-off among all the emotional scene types, however there were additional regions that were uniquely associated with the trade-off for each individual scene type. These results suggest that there is a common network of regions associated with the emotional memory tradeoff effect, but that valence and arousal also independently affect the neural activity underlying the effect. PMID:21414333
A Model of Manual Control with Perspective Scene Viewing
NASA Technical Reports Server (NTRS)
Sweet, Barbara Townsend
2013-01-01
A model of manual control during perspective scene viewing is presented, which combines the Crossover Model with a simplified model of perspective-scene viewing and visual-cue selection. The model is developed for a particular example task: an idealized constant-altitude task in which the operator controls longitudinal position in the presence of both longitudinal and pitch disturbances. An experiment is performed to develop and validate the model. The model corresponds closely with the experimental measurements, and identified model parameters are highly consistent with the visual cues available in the perspective scene. The modeling results indicate that operators used one visual cue for position control, and another visual cue for velocity control (lead generation). Additionally, operators responded more quickly to rotation (pitch) than translation (longitudinal).
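For orientation, the Crossover Model referred to above is usually stated as an open-loop operator-plus-vehicle response that behaves like an integrator with a time delay near the crossover frequency, L(jw) = w_c * exp(-j*w*tau) / (j*w). The Python sketch below evaluates that textbook form only. The numerical values of `omega_c` and `tau` are illustrative assumptions, and the perspective-scene and cue-selection parts of the report's model are not represented.

```python
import cmath
import math

def open_loop(omega, omega_c, tau):
    """Crossover Model open-loop response at frequency omega (rad/s).

    omega_c: crossover frequency in rad/s; tau: effective time delay in s.
    """
    return omega_c * cmath.exp(-1j * omega * tau) / (1j * omega)

def phase_margin_deg(omega_c, tau):
    """Phase margin in degrees, evaluated at the crossover frequency.

    Valid while omega_c * tau < pi / 2, so that the phase does not wrap.
    """
    return 180.0 + math.degrees(cmath.phase(open_loop(omega_c, omega_c, tau)))

# Hypothetical operator: 3 rad/s crossover, 0.2 s effective delay.
gain_at_crossover = abs(open_loop(3.0, 3.0, 0.2))  # unity by construction
margin = phase_margin_deg(3.0, 0.2)                # 90 degrees minus the delay lag
```

In this form a shorter effective delay leaves more phase margin, which is one way to read the abstract's finding that operators responded more quickly to pitch than to longitudinal translation.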
Using articulated scene models for dynamic 3d scene analysis in vista spaces
NASA Astrophysics Data System (ADS)
Beuter, Niklas; Swadzba, Agnes; Kummert, Franz; Wachsmuth, Sven
2010-09-01
In this paper we describe an efficient but detailed new approach to analyze complex dynamic scenes directly in 3D. The arising information is important for mobile robots to solve tasks in the area of household robotics. In our work a mobile robot builds an articulated scene model by observing the environment in the visual field or rather in the so-called vista space. The articulated scene model consists of essential knowledge about the static background, about autonomously moving entities like humans or robots and finally, in contrast to existing approaches, information about articulated parts. These parts describe movable objects like chairs, doors or other tangible entities, which could be moved by an agent. The combination of the static scene, the self-moving entities and the movable objects in one articulated scene model enhances the calculation of each single part. The reconstruction process for parts of the static scene benefits from removal of the dynamic parts and in turn, the moving parts can be extracted more easily through the knowledge about the background. In our experiments we show that the system simultaneously delivers an accurate static background model, moving persons and movable objects. This information of the articulated scene model enables a mobile robot to detect and keep track of interaction partners, to navigate safely through the environment and finally, to strengthen the interaction with the user through the knowledge about the 3D articulated objects and 3D scene analysis.
Hayes, Scott M; Nadel, Lynn; Ryan, Lee
2007-01-01
Previous research has investigated intentional retrieval of contextual information and contextual influences on object identification and word recognition, yet few studies have investigated context effects in episodic memory for objects. To address this issue, unique objects embedded in a visually rich scene or on a white background were presented to participants. At test, objects were presented either in the original scene or on a white background. A series of behavioral studies with young adults demonstrated a context shift decrement (CSD): decreased recognition performance when context is changed between encoding and retrieval. The CSD was not attenuated by encoding or retrieval manipulations, suggesting that binding of object and context may be automatic. A final experiment explored the neural correlates of the CSD, using functional Magnetic Resonance Imaging. Parahippocampal cortex (PHC) activation (right greater than left) during incidental encoding was associated with subsequent memory of objects in the context shift condition. Greater activity in right PHC was also observed during successful recognition of objects previously presented in a scene. Finally, a subset of regions activated during scene encoding, such as bilateral PHC, was reactivated when the object was presented on a white background at retrieval. Although participants were not required to intentionally retrieve contextual information, the results suggest that PHC may reinstate visual context to mediate successful episodic memory retrieval. The CSD is attributed to automatic and obligatory binding of object and context. The results suggest that PHC is important not only for processing of scene information, but also plays a role in successful episodic memory encoding and retrieval. These findings are consistent with the view that spatial information is stored in the hippocampal complex, one of the central tenets of Multiple Trace Theory. (c) 2007 Wiley-Liss, Inc.
Knepper, Daniel H.
2010-01-01
As part of the Central Colorado Mineral Resource Assessment Project, the digital image data for four Landsat Thematic Mapper scenes covering central Colorado between Wyoming and New Mexico were acquired and band ratios were calculated after masking pixels dominated by vegetation, snow, and terrain shadows. Ratio values were visually enhanced by contrast stretching, revealing only those areas with strong responses (high ratio values). A color-ratio composite mosaic was prepared for the four scenes so that the distribution of potentially hydrothermally altered rocks could be visually evaluated. To provide a more useful input to a Geographic Information System-based mineral resource assessment, the information contained in the color-ratio composite raster image mosaic was converted to vector-based polygons after thresholding to isolate the strongest ratio responses and spatial filtering to reduce vector complexity and isolate the largest occurrences of potentially hydrothermally altered rocks.
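The masking, ratioing, and thresholding steps described in this abstract can be illustrated with a minimal sketch. It is not the report's processing chain: the function names, the use of NumPy arrays for the bands, and the percentile-based cutoff are all assumptions made for illustration, and the abstract does not say which bands or thresholds were used.

```python
import numpy as np


def masked_band_ratio(numerator, denominator, mask):
    """Ratio of two bands, with NaN where a pixel is masked or undefined.

    `mask` is True for pixels to exclude (for example vegetation, snow,
    or terrain shadow); how that mask is built is outside this sketch.
    """
    num = np.asarray(numerator, dtype=float)
    den = np.asarray(denominator, dtype=float)
    invalid = np.asarray(mask, dtype=bool) | (den == 0)
    ratio = np.full(num.shape, np.nan)
    # Divide only where the pixel is valid; other entries keep their NaN.
    np.divide(num, den, out=ratio, where=~invalid)
    return ratio


def strongest_responses(ratio, percentile=95.0):
    """Boolean map of pixels at or above a percentile of the valid ratios.

    The percentile cutoff is a hypothetical stand-in for the thresholding
    step; NaN (masked) pixels are never selected.
    """
    cutoff = np.nanpercentile(ratio, percentile)
    return np.nan_to_num(ratio, nan=-np.inf) >= cutoff
```

The resulting boolean map is the kind of raster that could then be spatially filtered and converted to polygons in a GIS package.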
Compressed digital holography: from micro towards macro
NASA Astrophysics Data System (ADS)
Schretter, Colas; Bettens, Stijn; Blinder, David; Pesquet-Popescu, Béatrice; Cagnazzo, Marco; Dufaux, Frédéric; Schelkens, Peter
2016-09-01
signal processing methods from software-driven computer engineering and applied mathematics. The compressed sensing theory in particular established a practical framework for reconstructing the scene content using few linear combinations of complex measurements and a sparse prior for regularizing the solution. Compressed sensing found direct applications in digital holography for microscopy. Indeed, the wave propagation phenomenon in free space mixes in a natural way the spatial distribution of point sources from the 3-dimensional scene. As the 3-dimensional scene is mapped to a 2-dimensional hologram, the hologram samples form a compressed representation of the scene as well. This overview paper discusses contributions in the field of compressed digital holography at the micro scale. Then, an outreach on future extensions towards the real-size macro scale is discussed. Thanks to advances in sensor technologies, increasing computing power and the recent improvements in sparse digital signal processing, holographic modalities are on the verge of practical high-quality visualization at a macroscopic scale where much higher resolution holograms must be acquired and processed on the computer.
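The idea of reconstructing a scene from few linear measurements under a sparse prior can be sketched with a generic solver. The example below uses the iterative shrinkage-thresholding algorithm (ISTA) on a real-valued problem; this is a textbook technique swapped in for illustration, not a method attributed to the overview, and holographic data would be complex-valued with a propagation operator in place of the plain matrix `A` assumed here.

```python
import numpy as np


def soft_threshold(x, t):
    """Shrink each entry of x toward zero by t (proximal map of t*||x||_1)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def ista(A, y, lam, n_iter=200):
    """Approximately minimise 0.5*||A x - y||^2 + lam*||x||_1 with ISTA.

    A: (m, n) measurement matrix, y: (m,) measurements,
    lam: weight of the sparsity prior (an assumed tuning parameter).
    """
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    # Step size 1/L, where L is the Lipschitz constant of the gradient.
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        gradient = A.T @ (A @ x - y)
        x = soft_threshold(x - step * gradient, lam * step)
    return x
```

With fewer measurements than unknowns (m < n), the sparsity weight `lam` is what makes the problem well posed; larger values give sparser but more biased estimates.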
The lawful imprecision of human surface tilt estimation in natural scenes
2018-01-01
Estimating local surface orientation (slant and tilt) is fundamental to recovering the three-dimensional structure of the environment. It is unknown how well humans perform this task in natural scenes. Here, with a database of natural stereo-images having groundtruth surface orientation at each pixel, we find dramatic differences in human tilt estimation with natural and artificial stimuli. Estimates are precise and unbiased with artificial stimuli and imprecise and strongly biased with natural stimuli. An image-computable Bayes optimal model grounded in natural scene statistics predicts human bias, precision, and trial-by-trial errors without fitting parameters to the human data. The similarities between human and model performance suggest that the complex human performance patterns with natural stimuli are lawful, and that human visual systems have internalized local image and scene statistics to optimally infer the three-dimensional structure of the environment. These results generalize our understanding of vision from the lab to the real world. PMID:29384477
The lawful imprecision of human surface tilt estimation in natural scenes.
Kim, Seha; Burge, Johannes
2018-01-31
Estimating local surface orientation (slant and tilt) is fundamental to recovering the three-dimensional structure of the environment. It is unknown how well humans perform this task in natural scenes. Here, with a database of natural stereo-images having groundtruth surface orientation at each pixel, we find dramatic differences in human tilt estimation with natural and artificial stimuli. Estimates are precise and unbiased with artificial stimuli and imprecise and strongly biased with natural stimuli. An image-computable Bayes optimal model grounded in natural scene statistics predicts human bias, precision, and trial-by-trial errors without fitting parameters to the human data. The similarities between human and model performance suggest that the complex human performance patterns with natural stimuli are lawful, and that human visual systems have internalized local image and scene statistics to optimally infer the three-dimensional structure of the environment. These results generalize our understanding of vision from the lab to the real world. © 2018, Kim et al.
NASA Technical Reports Server (NTRS)
Brady, Rachel A.; Batson, Crystal D.; Peters, Brian T.; Mulavara, Ajitkumar P.; Bloomberg, Jacob J.
2010-01-01
We designed a gait training study that presented combinations of visual flow and support surface manipulations to investigate the response of healthy adults to novel discordant sensorimotor conditions. We aimed to determine whether a relationship existed between subjects' visual dependence and their scores on a collective measure of anxiety, cognition, and postural stability in a new discordant environment presented at the conclusion of training (Transfer Test). A treadmill was mounted to a motion base platform positioned 2 m behind a large visual screen. Training consisted of three walking sessions, each within a week of the previous visit, that presented four 5-minute exposures to various combinations of support surface and visual scene manipulations, all lateral sinusoids. The conditions were scene translation only, support surface translation only, simultaneous scene and support surface translations in-phase, and simultaneous scene and support surface translations 180° out-of-phase. During the Transfer Test, the trained participants received a 2-minute novel exposure. A visual sinusoidal roll perturbation, with twice the original flow rate, was superimposed on a sinusoidal support surface roll perturbation that was 90° out of phase with the scene. A high correlation existed between normalized torso translation, measured in the scene-only condition at the first visit, and a combined measure of normalized heart rate, stride frequency, and reaction time at the transfer test. Results suggest that visually dependent participants experience decreased postural stability, increased anxiety, and increased reaction times compared to their less visually dependent counterparts when negotiating novel discordant conditions.
Mathematical modelling of animate and intentional motion.
Rittscher, Jens; Blake, Andrew; Hoogs, Anthony; Stein, Gees
2003-01-01
Our aim is to enable a machine to observe and interpret the behaviour of others. Mathematical models are employed to describe certain biological motions. The main challenge is to design models that are both tractable and meaningful. In the first part we will describe how computer vision techniques, in particular visual tracking, can be applied to recognize a small vocabulary of human actions in a constrained scenario. Mainly the problems of viewpoint and scale invariance need to be overcome to formalize a general framework. Hence the second part of the article is devoted to the question whether a particular human action should be captured in a single complex model or whether it is more promising to make extensive use of semantic knowledge and a collection of low-level models that encode certain motion primitives. Scene context plays a crucial role if we intend to give a higher-level interpretation rather than a low-level physical description of the observed motion. A semantic knowledge base is used to establish the scene context. This approach consists of three main components: visual analysis, the mapping from vision to language and the search of the semantic database. A small number of robust visual detectors is used to generate a higher-level description of the scene. The approach together with a number of results is presented in the third part of this article. PMID:12689374
Ferguson, Heather J; Breheny, Richard
2011-05-01
The time-course of representing others' perspectives is inconclusive across the currently available models of ToM processing. We report two visual-world studies investigating how knowledge about a character's basic preferences (e.g. Tom's favourite colour is pink) and higher-order desires (his wish to keep this preference secret) compete to influence online expectations about subsequent behaviour. Participants' eye movements around a visual scene were tracked while they listened to auditory narratives. While clear differences in anticipatory visual biases emerged between conditions in Experiment 1, post-hoc analyses testing the strength of the relevant biases suggested a discrepancy in the time-course of predicting appropriate referents within the different contexts. Specifically, predictions to the target emerged very early when there was no conflict between the character's basic preferences and higher-order desires, but appeared to be relatively delayed when comprehenders were provided with conflicting information about that character's desire to keep a secret. However, a second experiment demonstrated that this apparent 'cognitive cost' in inferring behaviour based on higher-order desires was in fact driven by low-level features between the context sentence and visual scene. Taken together, these results suggest that healthy adults are able to make complex higher-order ToM inferences without the need to call on costly cognitive processes. Results are discussed relative to previous accounts of ToM and language processing. Copyright © 2011 Elsevier B.V. All rights reserved.
Salient contour extraction from complex natural scene in night vision image
NASA Astrophysics Data System (ADS)
Han, Jing; Yue, Jiang; Zhang, Yi; Bai, Lian-fa
2014-03-01
The theory of center-surround interaction in non-classical receptive field can be applied in night vision information processing. In this work, an optimized compound receptive field modulation method is proposed to extract salient contour from complex natural scene in low-light-level (LLL) and infrared images. The kernel idea is that multi-feature analysis can recognize the inhomogeneity in modulatory coverage more accurately and that center and surround with the grouping structure satisfying Gestalt rule deserves high connection-probability. Computationally, a multi-feature contrast weighted inhibition model is presented to suppress background and lower mutual inhibition among contour elements; a fuzzy connection facilitation model is proposed to achieve the enhancement of contour response, the connection of discontinuous contour and the further elimination of randomly distributed noise and texture; a multi-scale iterative attention method is designed to accomplish dynamic modulation process and extract contours of targets in multi-size. This work provides a series of biologically motivated computational visual models with high-performance for contour detection from cluttered scene in night vision images.
Visual search for arbitrary objects in real scenes
Alvarez, George A.; Rosenholtz, Ruth; Kuzmova, Yoana I.; Sherman, Ashley M.
2011-01-01
How efficient is visual search in real scenes? In searches for targets among arrays of randomly placed distractors, efficiency is often indexed by the slope of the reaction time (RT) × Set Size function. However, it may be impossible to define set size for real scenes. As an approximation, we hand-labeled 100 indoor scenes and used the number of labeled regions as a surrogate for set size. In Experiment 1, observers searched for named objects (a chair, bowl, etc.). With set size defined as the number of labeled regions, search was very efficient (~5 ms/item). When we controlled for a possible guessing strategy in Experiment 2, slopes increased somewhat (~15 ms/item), but they were much shallower than search for a random object among other distinctive objects outside of a scene setting (Exp. 3: ~40 ms/item). In Experiments 4–6, observers searched repeatedly through the same scene for different objects. Increased familiarity with scenes had modest effects on RTs, while repetition of target items had large effects (>500 ms). We propose that visual search in scenes is efficient because scene-specific forms of attentional guidance can eliminate most regions from the “functional set size” of items that could possibly be the target. PMID:21671156
Visual search for arbitrary objects in real scenes.
Wolfe, Jeremy M; Alvarez, George A; Rosenholtz, Ruth; Kuzmova, Yoana I; Sherman, Ashley M
2011-08-01
How efficient is visual search in real scenes? In searches for targets among arrays of randomly placed distractors, efficiency is often indexed by the slope of the reaction time (RT) × Set Size function. However, it may be impossible to define set size for real scenes. As an approximation, we hand-labeled 100 indoor scenes and used the number of labeled regions as a surrogate for set size. In Experiment 1, observers searched for named objects (a chair, bowl, etc.). With set size defined as the number of labeled regions, search was very efficient (~5 ms/item). When we controlled for a possible guessing strategy in Experiment 2, slopes increased somewhat (~15 ms/item), but they were much shallower than search for a random object among other distinctive objects outside of a scene setting (Exp. 3: ~40 ms/item). In Experiments 4-6, observers searched repeatedly through the same scene for different objects. Increased familiarity with scenes had modest effects on RTs, while repetition of target items had large effects (>500 ms). We propose that visual search in scenes is efficient because scene-specific forms of attentional guidance can eliminate most regions from the "functional set size" of items that could possibly be the target.
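The efficiency index mentioned in this abstract, the slope of the RT × Set Size function, can be estimated with an ordinary least-squares line. The sketch below is illustrative only: the function name is hypothetical, and the data in the usage note are made up to show the units, not taken from the study.

```python
import numpy as np


def search_slope(set_sizes, reaction_times_ms):
    """Fit RT = intercept + slope * set_size by least squares.

    Returns (slope in ms/item, intercept in ms). A shallower slope is
    conventionally read as more efficient search.
    """
    # np.polyfit returns coefficients from the highest degree down.
    slope, intercept = np.polyfit(set_sizes, reaction_times_ms, deg=1)
    return float(slope), float(intercept)
```

For instance, hypothetical mean RTs of 650, 700, 750, and 800 ms at set sizes of 10, 20, 30, and 40 labeled regions would give a slope of about 5 ms/item and an intercept of about 600 ms.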
ERIC Educational Resources Information Center
Thiessen, Amber; Beukelman, David; Hux, Karen; Longenecker, Maria
2016-01-01
Purpose: The purpose of the study was to compare the visual attention patterns of adults with aphasia and adults without neurological conditions when viewing visual scenes with 2 types of engagement. Method: Eye-tracking technology was used to measure the visual attention patterns of 10 adults with aphasia and 10 adults without neurological…
Brockmole, James R; Henderson, John M
2006-07-01
When confronted with a previously encountered scene, what information is used to guide search to a known target? We contrasted the role of a scene's basic-level category membership with its specific arrangement of visual properties. Observers were repeatedly shown photographs of scenes that contained consistently but arbitrarily located targets, allowing target positions to be associated with scene content. Learned scenes were then unexpectedly mirror reversed, spatially translating visual features as well as the target across the display while preserving the scene's identity and concept. Mirror reversals produced a cost as the eyes initially moved toward the position in the display in which the target had previously appeared. The cost was not complete, however; when initial search failed, the eyes were quickly directed to the target's new position. These results suggest that in real-world scenes, shifts of attention are initially based on scene identity, and subsequent shifts are guided by more detailed information regarding scene and object layout.
Greene, Michelle R; Baldassano, Christopher; Fei-Fei, Li; Beck, Diane M; Baker, Chris I
2018-01-01
Inherent correlations between visual and semantic features in real-world scenes make it difficult to determine how different scene properties contribute to neural representations. Here, we assessed the contributions of multiple properties to scene representation by partitioning the variance explained in human behavioral and brain measurements by three feature models whose inter-correlations were minimized a priori through stimulus preselection. Behavioral assessments of scene similarity reflected unique contributions from a functional feature model indicating potential actions in scenes as well as high-level visual features from a deep neural network (DNN). In contrast, similarity of cortical responses in scene-selective areas was uniquely explained by mid- and high-level DNN features only, while an object label model did not contribute uniquely to either domain. The striking dissociation between functional and DNN features in their contribution to behavioral and brain representations of scenes indicates that scene-selective cortex represents only a subset of behaviorally relevant scene information. PMID:29513219
Groen, Iris Ia; Greene, Michelle R; Baldassano, Christopher; Fei-Fei, Li; Beck, Diane M; Baker, Chris I
2018-03-07
Inherent correlations between visual and semantic features in real-world scenes make it difficult to determine how different scene properties contribute to neural representations. Here, we assessed the contributions of multiple properties to scene representation by partitioning the variance explained in human behavioral and brain measurements by three feature models whose inter-correlations were minimized a priori through stimulus preselection. Behavioral assessments of scene similarity reflected unique contributions from a functional feature model indicating potential actions in scenes as well as high-level visual features from a deep neural network (DNN). In contrast, similarity of cortical responses in scene-selective areas was uniquely explained by mid- and high-level DNN features only, while an object label model did not contribute uniquely to either domain. The striking dissociation between functional and DNN features in their contribution to behavioral and brain representations of scenes indicates that scene-selective cortex represents only a subset of behaviorally relevant scene information.
Delcasso, Sébastien; Huh, Namjung; Byeon, Jung Seop; Lee, Jihyun; Jung, Min Whan; Lee, Inah
2014-11-19
The hippocampus is important for contextual behavior, and the striatum plays key roles in decision making. When studying the functional relationships with the hippocampus, prior studies have focused mostly on the dorsolateral striatum (DLS), emphasizing the antagonistic relationships between the hippocampus and DLS in spatial versus response learning. By contrast, the functional relationships between the dorsomedial striatum (DMS) and hippocampus are relatively unknown. The current study reports that lesions to both the hippocampus and DMS profoundly impaired performance of rats in a visual scene-based memory task in which the animals were required to make a choice response by using visual scenes displayed in the background. Analysis of simultaneous recordings of local field potentials revealed that the gamma oscillatory power was higher in the DMS, but not in CA1, when the rat performed the task using familiar scenes than novel ones. In addition, the CA1-DMS networks increased coherence at γ, but not at θ, rhythm as the rat mastered the task. At the single-unit level, the neuronal populations in CA1 and DMS showed differential firing patterns when responses were made using familiar visual scenes than novel ones. Such learning-dependent firing patterns were observed earlier in the DMS than in CA1 before the rat made choice responses. The present findings suggest that both the hippocampus and DMS process memory representations for visual scenes in parallel with different time courses and that flexible choice action using background visual scenes requires coordinated operations of the hippocampus and DMS at γ frequencies. Copyright © 2014 the authors 0270-6474/14/3415534-14$15.00/0.
A Parallel Rendering Algorithm for MIMD Architectures
NASA Technical Reports Server (NTRS)
Crockett, Thomas W.; Orloff, Tobias
1991-01-01
Applications such as animation and scientific visualization demand high performance rendering of complex three dimensional scenes. To deliver the necessary rendering rates, highly parallel hardware architectures are required. The challenge is then to design algorithms and software which effectively use the hardware parallelism. A rendering algorithm targeted to distributed memory MIMD architectures is described. For maximum performance, the algorithm exploits both object-level and pixel-level parallelism. The behavior of the algorithm is examined both analytically and experimentally. Its performance for large numbers of processors is found to be limited primarily by communication overheads. An experimental implementation for the Intel iPSC/860 shows increasing performance from 1 to 128 processors across a wide range of scene complexities. It is shown that minimal modifications to the algorithm will adapt it for use on shared memory architectures as well.
Faces in Context: Does Face Perception Depend on the Orientation of the Visual Scene?
Taubert, Jessica; van Golde, Celine; Verstraten, Frans A J
2016-10-01
The mechanisms held responsible for familiar face recognition are thought to be orientation dependent; inverted faces are more difficult to recognize than their upright counterparts. Although this effect of inversion has been investigated extensively, researchers have typically sliced faces from photographs and presented them in isolation. As such, it is not known whether the perceived orientation of a face is inherited from the visual scene in which it appears. Here, we address this question by measuring performance in a simultaneous same-different task while manipulating both the orientation of the faces and the scene. We found that the face inversion effect survived scene inversion. Nonetheless, an improvement in performance when the scene was upside down suggests that sensitivity to identity increased when the faces were more easily segmented from the scene. Thus, while these data identify congruency with the visual environment as a contributing factor in recognition performance, they imply different mechanisms operate on upright and inverted faces. © The Author(s) 2016.
Developmental changes in attention to faces and bodies in static and dynamic scenes.
Stoesz, Brenda M; Jakobson, Lorna S
2014-01-01
Typically developing individuals show a strong visual preference for faces and face-like stimuli; however, this may come at the expense of attending to bodies or to other aspects of a scene. The primary goal of the present study was to provide additional insight into the development of attentional mechanisms that underlie perception of real people in naturalistic scenes. We examined the looking behaviors of typical children, adolescents, and young adults as they viewed static and dynamic scenes depicting one or more people. Overall, participants showed a bias to attend to faces more than to other parts of the scenes. Adding motion cues led to a reduction in the number, but an increase in the average duration, of face fixations in single-character scenes. When multiple characters appeared in a scene, motion-related effects were attenuated and participants shifted their gaze from faces to bodies, or made off-screen glances. Children showed the largest effects related to the introduction of motion cues or additional characters, suggesting that they find dynamic faces difficult to process, and are especially prone to look away from faces when viewing complex social scenes, a strategy that could reduce the cognitive and the affective load imposed by having to divide one's attention between multiple faces. Our findings provide new insights into the typical development of social attention during natural scene viewing, and lay the foundation for future work examining gaze behaviors in typical and atypical development.
Adeli, Hossein; Vitu, Françoise; Zelinsky, Gregory J
2017-02-08
Modern computational models of attention predict fixations using saliency maps and target maps, which prioritize locations for fixation based on feature contrast and target goals, respectively. But whereas many such models are biologically plausible, none have looked to the oculomotor system for design constraints or parameter specification. Conversely, although most models of saccade programming are tightly coupled to underlying neurophysiology, none have been tested using real-world stimuli and tasks. We combined the strengths of these two approaches in MASC, a model of attention in the superior colliculus (SC) that captures known neurophysiological constraints on saccade programming. We show that MASC predicted the fixation locations of humans freely viewing naturalistic scenes and performing exemplar and categorical search tasks, a breadth achieved by no other existing model. Moreover, it did this as well or better than its more specialized state-of-the-art competitors. MASC's predictive success stems from its inclusion of high-level but core principles of SC organization: an over-representation of foveal information, size-invariant population codes, cascaded population averaging over distorted visual and motor maps, and competition between motor point images for saccade programming, all of which cause further modulation of priority (attention) after projection of saliency and target maps to the SC. Only by incorporating these organizing brain principles into our models can we fully understand the transformation of complex visual information into the saccade programs underlying movements of overt attention. With MASC, a theoretical footing now exists to generate and test computationally explicit predictions of behavioral and neural responses in visually complex real-world contexts. 
SIGNIFICANCE STATEMENT The superior colliculus (SC) performs a visual-to-motor transformation vital to overt attention, but existing SC models cannot predict saccades to visually complex real-world stimuli. We introduce a brain-inspired SC model that outperforms state-of-the-art image-based competitors in predicting the sequences of fixations made by humans performing a range of everyday tasks (scene viewing and exemplar and categorical search), making clear the value of looking to the brain for model design. This work is significant in that it will drive new research by making computationally explicit predictions of SC neural population activity in response to naturalistic stimuli and tasks. It will also serve as a blueprint for the construction of other brain-inspired models, helping to usher in the next generation of truly intelligent autonomous systems. Copyright © 2017 the authors 0270-6474/17/371453-15$15.00/0.
Optic flow-based collision-free strategies: From insects to robots.
Serres, Julien R; Ruffier, Franck
2017-09-01
Flying insects are able to fly smartly in an unpredictable environment. It has been found that flying insects have smart neurons inside their tiny brains that are sensitive to visual motion, also called optic flow. Consequently, flying insects rely mainly on visual motion during their flight maneuvers such as takeoff or landing, terrain following, tunnel crossing, lateral and frontal obstacle avoidance, and adjusting flight speed in a cluttered environment. Optic flow can be defined as the vector field of the apparent motion of objects, surfaces, and edges in a visual scene generated by the relative motion between an observer (an eye or a camera) and the scene. Translational optic flow is particularly interesting for short-range navigation because it depends on the ratio between (i) the relative linear speed of the visual scene with respect to the observer and (ii) the distance of the observer from obstacles in the surrounding environment without any direct measurement of either speed or distance. In flying insects, roll stabilization reflex and yaw saccades attenuate any rotation at the eye level in roll and yaw respectively (i.e. to cancel any rotational optic flow) in order to ensure pure translational optic flow between two successive saccades. Our survey focuses on feedback-loops which use the translational optic flow that insects employ for collision-free navigation. Optic flow is likely, over the next decade, to be one of the most important visual cues that can explain flying insects' behaviors for short-range navigation maneuvers in complex tunnels. Conversely, the biorobotic approach can therefore help to develop innovative flight control systems for flying robots with the aim of mimicking flying insects' abilities and better understanding their flight. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
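The speed-to-distance ratio described in this abstract can be made concrete with a small sketch. It assumes the commonly used form in which the translational optic flow magnitude is (v / d)·sin(θ), where θ is the angle between the direction of travel and the viewing direction; the function name, parameter names, and the default of viewing at 90° to the heading are illustrative choices, not taken from the survey.

```python
import math


def translational_optic_flow(speed_m_s, distance_m, bearing_rad=math.pi / 2):
    """Angular speed (rad/s) of a point seen during pure translation.

    speed_m_s: relative linear speed between observer and scene.
    distance_m: distance from the observer to the viewed point.
    bearing_rad: angle between heading and viewing direction
                 (assumed default: looking sideways, 90 degrees).
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    return speed_m_s / distance_m * math.sin(bearing_rad)
```

Flying at 1 m/s past a wall 2 m away produces the same lateral flow (0.5 rad/s) as flying at 2 m/s past a wall 4 m away, which is why holding the flow constant regulates the ratio without measuring speed or distance separately.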
The role of temporo-parietal junction (TPJ) in global Gestalt perception.
Huberle, Elisabeth; Karnath, Hans-Otto
2012-07-01
Grouping processes enable the coherent perception of our environment. A number of brain areas have been suggested to be involved in the integration of elements into objects, including early and higher visual areas along the ventral visual pathway as well as motion-processing areas of the dorsal visual pathway. However, integration is not only required for the cortical representation of individual objects, but is also essential for the perception of more complex visual scenes consisting of several different objects and/or shapes. The present fMRI experiments aimed to address such integration processes. We investigated the neural correlates underlying the global Gestalt perception of hierarchically organized stimuli that allowed parametric degrading of the object at the global level. The comparison of intact versus disturbed perception of the global Gestalt revealed a network of cortical areas including the temporo-parietal junction (TPJ), anterior cingulate cortex and the precuneus. The TPJ location corresponds well with the areas known to be typically lesioned in stroke patients with simultanagnosia following bilateral brain damage. These patients typically show a deficit in identifying the global Gestalt of a visual scene. Further, we found the closest relation between behavioral performance and fMRI activation for the TPJ. Our data thus argue for a significant role of the TPJ in human global Gestalt perception.
Reduced modulation of scanpaths in response to task demands in posterior cortical atrophy.
Shakespeare, Timothy J; Pertzov, Yoni; Yong, Keir X X; Nicholas, Jennifer; Crutch, Sebastian J
2015-02-01
A difficulty in perceiving visual scenes is one of the most striking impairments experienced by patients with the clinico-radiological syndrome posterior cortical atrophy (PCA). However, whilst a number of studies have investigated perception of relatively simple experimental stimuli in these individuals, little is known about multiple object and complex scene perception and the role of eye movements in posterior cortical atrophy. We embrace the distinction between high-level (top-down) and low-level (bottom-up) influences upon scanning eye movements when looking at scenes. This distinction was inspired by Yarbus (1967), who demonstrated how the location of our fixations is affected by task instructions and not only the stimulus's low-level properties. We therefore examined how scanning patterns are influenced by task instructions and low-level visual properties in 7 patients with posterior cortical atrophy, 8 patients with typical Alzheimer's disease, and 19 healthy age-matched controls. Each participant viewed 10 scenes under four task conditions (encoding, recognition, search and description) whilst eye movements were recorded. The results reveal significant differences between groups in the impact of test instructions upon scanpaths. Across tasks without a search component, posterior cortical atrophy patients were significantly less consistent than typical Alzheimer's disease patients and controls in where they were looking. By contrast, when comparing search and non-search tasks, it was controls who exhibited the lowest between-task similarity ratings, suggesting they were better able than posterior cortical atrophy or typical Alzheimer's disease patients to respond appropriately to high-level needs by looking at task-relevant regions of a scene. Posterior cortical atrophy patients had a significant tendency to fixate upon more low-level salient parts of the scenes than controls irrespective of the viewing task.
The study provides a detailed characterisation of scene perception abilities in posterior cortical atrophy and offers insights into the mechanisms by which high-level cognitive schemes interact with low-level perception. Copyright © 2015 Elsevier Ltd. All rights reserved.
The genesis of errors in drawing.
Chamberlain, Rebecca; Wagemans, Johan
2016-06-01
The difficulty adults find in drawing objects or scenes from real life is puzzling, assuming that there are few gross individual differences in the phenomenology of visual scenes and in fine motor control in the neurologically healthy population. A review of research concerning the perceptual, motoric and memorial correlates of drawing ability was conducted in order to understand why most adults err when trying to produce faithful representations of objects and scenes. The findings reveal that accurate perception of the subject and of the drawing is at the heart of drawing proficiency, although not to the extent that drawing skill elicits fundamental changes in visual perception. Instead, the decisive role of representational decisions reveals the importance of appropriate segmentation of the visual scene and of the influence of pictorial schemas. This leads to the conclusion that domain-specific, flexible, top-down control of visual attention plays a critical role in development of skill in visual art and may also be a window into creative thinking. Copyright © 2016 Elsevier Ltd. All rights reserved.
On a common circle: natural scenes and Gestalt rules.
Sigman, M; Cecchi, G A; Gilbert, C D; Magnasco, M O
2001-02-13
To understand how the human visual system analyzes images, it is essential to know the structure of the visual environment. In particular, natural images display consistent statistical properties that distinguish them from random luminance distributions. We have studied the geometric regularities of oriented elements (edges or line segments) present in an ensemble of visual scenes, asking how much information the presence of a segment in a particular location of the visual scene carries about the presence of a second segment at different relative positions and orientations. We observed strong long-range correlations in the distribution of oriented segments that extend over the whole visual field. We further show that a very simple geometric rule, cocircularity, predicts the arrangement of segments in natural scenes, and that different geometrical arrangements show relevant differences in their scaling properties. Our results show similarities to geometric features of previous physiological and psychophysical studies. We discuss the implications of these findings for theories of early vision.
Short temporal asynchrony disrupts visual object recognition
Singer, Jedediah M.; Kreiman, Gabriel
2014-01-01
Humans can recognize objects and scenes in a small fraction of a second. The cascade of signals underlying rapid recognition might be disrupted by temporally jittering different parts of complex objects. Here we investigated the time course over which shape information can be integrated to allow for recognition of complex objects. We presented fragments of object images in an asynchronous fashion and behaviorally evaluated categorization performance. We observed that visual recognition was significantly disrupted by asynchronies of approximately 30 ms, suggesting that spatiotemporal integration begins to break down with even small deviations from simultaneity. However, moderate temporal asynchrony did not completely obliterate recognition; in fact, integration of visual shape information persisted even with an asynchrony of 100 ms. We describe the data with a concise model based on the dynamic reduction of uncertainty about what image was presented. These results emphasize the importance of timing in visual processing and provide strong constraints for the development of dynamical models of visual shape recognition. PMID:24819738
Anticipation in Real-world Scenes: The Role of Visual Context and Visual Memory
ERIC Educational Resources Information Center
Coco, Moreno I.; Keller, Frank; Malcolm, George L.
2016-01-01
The human sentence processor is able to make rapid predictions about upcoming linguistic input. For example, upon hearing the verb eat, anticipatory eye-movements are launched toward edible objects in a visual scene (Altmann & Kamide, 1999). However, the cognitive mechanisms that underlie anticipation remain to be elucidated in ecologically…
View Combination: A Generalization Mechanism for Visual Recognition
ERIC Educational Resources Information Center
Friedman, Alinda; Waller, David; Thrash, Tyler; Greenauer, Nathan; Hodgson, Eric
2011-01-01
We examined whether view combination mechanisms shown to underlie object and scene recognition can integrate visual information across views that have little or no three-dimensional information at either the object or scene level. In three experiments, people learned four "views" of a two dimensional visual array derived from a three-dimensional…
What you see is what you expect: rapid scene understanding benefits from prior experience.
Greene, Michelle R; Botros, Abraham P; Beck, Diane M; Fei-Fei, Li
2015-05-01
Although we are able to rapidly understand novel scene images, little is known about the mechanisms that support this ability. Theories of optimal coding assert that prior visual experience can be used to ease the computational burden of visual processing. A consequence of this idea is that more probable visual inputs should be facilitated relative to more unlikely stimuli. In three experiments, we compared the perceptions of highly improbable real-world scenes (e.g., an underwater press conference) with common images matched for visual and semantic features. Although the two groups of images could not be distinguished by their low-level visual features, we found profound deficits related to the improbable images: Observers wrote poorer descriptions of these images (Exp. 1), had difficulties classifying the images as unusual (Exp. 2), and even had lower sensitivity to detect these images in noise than to detect their more probable counterparts (Exp. 3). Taken together, these results place a limit on our abilities for rapid scene perception and suggest that perception is facilitated by prior visual experience.
High-power graphic computers for visual simulation: a real-time--rendering revolution
NASA Technical Reports Server (NTRS)
Kaiser, M. K.
1996-01-01
Advances in high-end graphics computers in the past decade have made it possible to render visual scenes of incredible complexity and realism in real time. These new capabilities make it possible to manipulate and investigate the interactions of observers with their visual world in ways once only dreamed of. This paper reviews how these developments have affected two preexisting domains of behavioral research (flight simulation and motion perception) and have created a new domain (virtual environment research) which provides tools and challenges for the perceptual psychologist. Finally, the current limitations of these technologies are considered, with an eye toward how perceptual psychologists might shape future developments.
Iconic memory for the gist of natural scenes.
Clarke, Jason; Mack, Arien
2014-11-01
Does iconic memory contain the gist of multiple scenes? Three experiments were conducted. In the first, four scenes from different basic-level categories were briefly presented in one of two conditions: a cue or a no-cue condition. The cue condition was designed to provide an index of the contents of iconic memory of the display. Subjects were more sensitive to scene gist in the cue condition than in the no-cue condition. In the second, the scenes came from the same basic-level category. We found no difference in sensitivity between the two conditions. In the third, six scenes from different basic level categories were presented in the visual periphery. Subjects were more sensitive to scene gist in the cue condition. These results suggest that scene gist is contained in iconic memory even in the visual periphery; however, iconic representations are not sufficiently detailed to distinguish between scenes coming from the same category. Copyright © 2014 Elsevier Inc. All rights reserved.
Robust selectivity to two-object images in human visual cortex
Agam, Yigal; Liu, Hesheng; Papanastassiou, Alexander; Buia, Calin; Golby, Alexandra J.; Madsen, Joseph R.; Kreiman, Gabriel
2010-01-01
SUMMARY We can recognize objects in a fraction of a second in spite of the presence of other objects [1–3]. The responses in macaque areas V4 and inferior temporal cortex [4–15] to a neuron’s preferred stimuli are typically suppressed by the addition of a second object within the receptive field (see however [16, 17]). How can this suppression be reconciled with rapid visual recognition in complex scenes? One option is that certain “special categories” are unaffected by other objects [18] but this leaves the problem unsolved for other categories. Another possibility is that serial attentional shifts help ameliorate the problem of distractor objects [19–21]. Yet, psychophysical studies [1–3], scalp recordings [1] and neurophysiological recordings [14, 16, 22–24], suggest that the initial sweep of visual processing contains a significant amount of information. We recorded intracranial field potentials in human visual cortex during presentation of flashes of two-object images. Visual selectivity from temporal cortex during the initial ~200 ms was largely robust to the presence of other objects. We could train linear decoders on the responses to isolated objects and decode information in two-object images. These observations are compatible with parallel, hierarchical and feed-forward theories of rapid visual recognition [25] and may provide a neural substrate to begin to unravel rapid recognition in natural scenes. PMID:20417105
Institute for Brain and Neural Systems
2009-10-06
to deal with computational complexity when analyzing large amounts of information in visual scenes. It seems natural that in addition to exploring...algorithms using methods from statistical pattern recognition and machine learning. Over the last fifteen years, significant advances had been made in...recognition, robustness to noise and ability to cope with significant variations in lighting conditions. Identifying an occluded target adds another layer of
The neural bases of spatial frequency processing during scene perception
Kauffmann, Louise; Ramanoël, Stephen; Peyrin, Carole
2014-01-01
Theories on visual perception agree that scenes are processed in terms of spatial frequencies. Low spatial frequencies (LSF) carry coarse information whereas high spatial frequencies (HSF) carry fine details of the scene. However, how and where spatial frequencies are processed within the brain remain unresolved questions. The present review addresses these issues and aims to identify the cerebral regions differentially involved in low and high spatial frequency processing, and to clarify their attributes during scene perception. Results from a number of behavioral and neuroimaging studies suggest that spatial frequency processing is lateralized in both hemispheres, with the right and left hemispheres predominantly involved in the categorization of LSF and HSF scenes, respectively. There is also evidence that spatial frequency processing is retinotopically mapped in the visual cortex. HSF scenes (as opposed to LSF) activate occipital areas in relation to foveal representations, while categorization of LSF scenes (as opposed to HSF) activates occipital areas in relation to more peripheral representations. Concomitantly, a number of studies have demonstrated that LSF information may reach high-order areas rapidly, allowing an initial coarse parsing of the visual scene, which could then be sent back through feedback into the occipito-temporal cortex to guide finer HSF-based analysis. Finally, the review addresses spatial frequency processing within scene-selective regions of the occipito-temporal cortex. PMID:24847226
The visual light field in real scenes
Xia, Ling; Pont, Sylvia C.; Heynderickx, Ingrid
2014-01-01
Human observers' ability to infer the light field in empty space is known as the “visual light field.” While most relevant studies were performed using images on computer screens, we investigate the visual light field in a real scene by using a novel experimental setup. A “probe” and a scene were mixed optically using a semitransparent mirror. Twenty participants were asked to judge whether the probe fitted the scene with regard to the illumination intensity, direction, and diffuseness. Both smooth and rough probes were used to test whether observers use the additional cues for the illumination direction and diffuseness provided by the 3D texture over the rough probe. The results confirmed that observers are sensitive to the intensity, direction, and diffuseness of the illumination also in real scenes. For some lighting combinations on scene and probe, the awareness of a mismatch between the probe and scene was found to depend on which lighting condition was on the scene and which on the probe, which we called the “swap effect.” For these cases, the observers judged the fit to be better if the average luminance of the visible parts of the probe was closer to the average luminance of the visible parts of the scene objects. The use of a rough instead of smooth probe was found to significantly improve observers' abilities to detect mismatches in lighting diffuseness and directions. PMID:25926970
Corney, David; Haynes, John-Dylan; Rees, Geraint; Lotto, R. Beau
2009-01-01
Background: The perception of brightness depends on spatial context: the same stimulus can appear light or dark depending on what surrounds it. A less well-known but equally important contextual phenomenon is that the colour of a stimulus can also alter its brightness. Specifically, stimuli that are more saturated (i.e. purer in colour) appear brighter than stimuli that are less saturated at the same luminance. Similarly, stimuli that are red or blue appear brighter than equiluminant yellow and green stimuli. This non-linear relationship between stimulus intensity and brightness, called the Helmholtz-Kohlrausch (HK) effect, was first described in the nineteenth century but has never been explained. Here, we take advantage of the relative simplicity of this ‘illusion’ to explain it and contextual effects more generally, by using a simple Bayesian ideal observer model of the human visual ecology. We also use fMRI brain scans to identify the neural correlates of brightness without changing the spatial context of the stimulus, which has complicated the interpretation of related fMRI studies. Results: Rather than modelling human vision directly, we use a Bayesian ideal observer to model human visual ecology. We show that the HK effect is a result of encoding the non-linear statistical relationship between retinal images and natural scenes that would have been experienced by the human visual system in the past. We further show that the complexity of this relationship is due to the response functions of the cone photoreceptors, which themselves are thought to represent an efficient solution to encoding the statistics of images. Finally, we show that the locus of the response to the relationship between images and scenes lies in the primary visual cortex (V1), if not earlier in the visual system, since the brightness of colours (as opposed to their luminance) accords with activity in V1 as measured with fMRI.
Conclusions: The data suggest that perceptions of brightness represent a robust visual response to the likely sources of stimuli, as determined, in this instance, by the known statistical relationship between scenes and their retinal responses. While the responses of the early visual system (receptors in this case) may represent specifically the statistics of images, post-receptor responses are more likely to represent the statistical relationship between images and scenes. A corollary of this suggestion is that the visual cortex is adapted to relate the retinal image to behaviour given the statistics of its past interactions with the sources of retinal images: the visual cortex is adapted to the signals it receives from the eyes, and not directly to the world beyond. PMID:19333398
Slow changing postural cues cancel visual field dependence on self-tilt detection.
Scotto Di Cesare, C; Macaluso, T; Mestre, D R; Bringoux, L
2015-01-01
Interindividual differences influence the multisensory integration process involved in spatial perception. Here, we assessed the effect of visual field dependence on self-tilt detection relative to upright, as a function of static vs. slow changing visual or postural cues. To that aim, we manipulated slow rotations (i.e., 0.05° s⁻¹) of the body and/or the visual scene in pitch. Participants had to indicate whether they felt being tilted forward at successive angles. Results show that thresholds for self-tilt detection substantially differed between visual field dependent/independent subjects, when only the visual scene was rotated. This difference was no longer present when the body was actually rotated, whatever the visual scene condition (i.e., absent, static or rotated relative to the observer). These results suggest that the cancellation of visual field dependence by dynamic postural cues may rely on a multisensory reweighting process, where slow changing vestibular/somatosensory inputs may prevail over visual inputs. Copyright © 2014 Elsevier B.V. All rights reserved.
Perception of Graphical Virtual Environments by Blind Users via Sensory Substitution
Maidenbaum, Shachar; Buchs, Galit; Abboud, Sami; Lavi-Rotbain, Ori; Amedi, Amir
2016-01-01
Graphical virtual environments are currently far from accessible to blind users as their content is mostly visual. This is especially unfortunate as these environments hold great potential for this population for purposes such as safe orientation, education, and entertainment. Previous tools have increased accessibility but there is still a long way to go. Visual-to-audio Sensory-Substitution-Devices (SSDs) can increase accessibility generically by sonifying on-screen content regardless of the specific environment and offer increased accessibility without the use of expensive dedicated peripherals like electrode/vibrator arrays. Using SSDs virtually utilizes similar skills as when using them in the real world, enabling both training on the device and training on environments virtually before real-world visits. This could enable more complex, standardized and autonomous SSD training and new insights into multisensory interaction and the visually-deprived brain. However, whether congenitally blind users, who have never experienced virtual environments, will be able to use this information for successful perception and interaction within them is currently unclear. We tested this using the EyeMusic SSD, which conveys whole-scene visual information, to perform virtual tasks otherwise impossible without vision. Congenitally blind users had to navigate virtual environments and find doors, differentiate between them based on their features (Experiment 1, task 1) and surroundings (Experiment 1, task 2) and walk through them; these tasks were accomplished with a 95% and 97% success rate, respectively. We further explored the reactions of congenitally blind users during their first interaction with a more complex virtual environment than in the previous tasks: walking down a virtual street, recognizing different features of houses and trees, navigating to cross-walks, etc. Users reacted enthusiastically and reported feeling immersed within the environment.
They highlighted the potential usefulness of such environments for understanding what visual scenes are supposed to look like and their potential for complex training and suggested many future environments they wished to experience. PMID:26882473
Beyond the cockpit: The visual world as a flight instrument
NASA Technical Reports Server (NTRS)
Johnson, W. W.; Kaiser, M. K.; Foyle, D. C.
1992-01-01
The use of cockpit instruments to guide flight control is not always an option (e.g., low level rotorcraft flight). Under such circumstances the pilot must use out-the-window information for control and navigation. Thus it is important to determine the basis of visually guided flight for several reasons: (1) to guide the design and construction of the visual displays used in training simulators; (2) to allow modeling of visibility restrictions brought about by weather, cockpit constraints, or distortions introduced by sensor systems; and (3) to aid in the development of displays that augment the cockpit window scene and are compatible with the pilot's visual extraction of information from the visual scene. The authors are actively pursuing these questions. We have on-going studies using both low-cost, lower fidelity flight simulators, and state-of-the-art helicopter simulation research facilities. Research results will be presented on: (1) the important visual scene information used in altitude and speed control; (2) the utility of monocular, stereo, and hyperstereo cues for the control of flight; (3) perceptual effects due to the differences between normal unaided daylight vision, and that made available by various night vision devices (e.g., light intensifying goggles and infra-red sensor displays); and (4) the utility of advanced contact displays in which instrument information is made part of the visual scene, as on a 'scene linked' head-up display (e.g., displaying altimeter information on a virtual billboard located on the ground).
The Neural Dynamics of Attentional Selection in Natural Scenes.
Kaiser, Daniel; Oosterhof, Nikolaas N; Peelen, Marius V
2016-10-12
The human visual system can only represent a small subset of the many objects present in cluttered scenes at any given time, such that objects compete for representation. Despite these processing limitations, the detection of object categories in cluttered natural scenes is remarkably rapid. How does the brain efficiently select goal-relevant objects from cluttered scenes? In the present study, we used multivariate decoding of magneto-encephalography (MEG) data to track the neural representation of within-scene objects as a function of top-down attentional set. Participants detected categorical targets (cars or people) in natural scenes. The presence of these categories within a scene was decoded from MEG sensor patterns by training linear classifiers on differentiating cars and people in isolation and testing these classifiers on scenes containing one of the two categories. The presence of a specific category in a scene could be reliably decoded from MEG response patterns as early as 160 ms, despite substantial scene clutter and variation in the visual appearance of each category. Strikingly, we find that these early categorical representations fully depend on the match between visual input and top-down attentional set: only objects that matched the current attentional set were processed to the category level within the first 200 ms after scene onset. A sensor-space searchlight analysis revealed that this early attention bias was localized to lateral occipitotemporal cortex, reflecting top-down modulation of visual processing. These results show that attention quickly resolves competition between objects in cluttered natural scenes, allowing for the rapid neural representation of goal-relevant objects. Efficient attentional selection is crucial in many everyday situations. For example, when driving a car, we need to quickly detect obstacles, such as pedestrians crossing the street, while ignoring irrelevant objects. 
How can humans efficiently perform such tasks, given the multitude of objects contained in real-world scenes? Here we used multivariate decoding of magnetoencephalography data to characterize the neural underpinnings of attentional selection in natural scenes with high temporal precision. We show that brain activity quickly tracks the presence of objects in scenes, but crucially only for those objects that were immediately relevant for the participant. These results provide evidence for fast and efficient attentional selection that mediates the rapid detection of goal-relevant objects in real-world environments. Copyright © 2016 the authors.
Reduced change blindness suggests enhanced attention to detail in individuals with autism.
Smith, Hayley; Milne, Elizabeth
2009-03-01
The phenomenon of change blindness illustrates that a limited number of items within the visual scene are attended to at any one time. It has been suggested that individuals with autism focus attention on less contextually relevant aspects of the visual scene, show superior perceptual discrimination and notice details which are often ignored by typical observers. In this study we investigated change blindness in autism by asking participants to detect continuity errors deliberately introduced into a short film. Whether the continuity errors involved central/marginal or social/non-social aspects of the visual scene was varied. Thirty adolescent participants, 15 with autistic spectrum disorder (ASD) and 15 typically developing (TD) controls participated. The participants with ASD detected significantly more errors than the TD participants. Both groups identified more errors involving central rather than marginal aspects of the scene, although this effect was larger in the TD participants. There was no difference in the number of social or non-social errors detected by either group of participants. In line with previous data suggesting an abnormally broad attentional spotlight and enhanced perceptual function in individuals with ASD, the results of this study suggest enhanced awareness of the visual scene in ASD. The results of this study could reflect superior top-down control of visual search in autism, enhanced perceptual function, or inefficient filtering of visual information in ASD.
Scan Patterns Predict Sentence Production in the Cross-Modal Processing of Visual Scenes
ERIC Educational Resources Information Center
Coco, Moreno I.; Keller, Frank
2012-01-01
Most everyday tasks involve multiple modalities, which raises the question of how the processing of these modalities is coordinated by the cognitive system. In this paper, we focus on the coordination of visual attention and linguistic processing during speaking. Previous research has shown that objects in a visual scene are fixated before they…
ERIC Educational Resources Information Center
Rice, Katherine; Moriuchi, Jennifer M.; Jones, Warren; Klin, Ami
2012-01-01
Objective: To examine patterns of variability in social visual engagement and their relationship to standardized measures of social disability in a heterogeneous sample of school-aged children with autism spectrum disorders (ASD). Method: Eye-tracking measures of visual fixation during free-viewing of dynamic social scenes were obtained for 109…
Shapiro, Arthur G; Hamburger, Kai
2007-01-01
A central tenet of Gestalt psychology is that the visual scene can be separated into figure and ground. The two illusions we present demonstrate that Gestalt processes can group spatial contrast information that cuts across the figure/ground separation. This finding suggests that visual processes that organise the visual scene do not necessarily require structural segmentation as their primary input.
Subramanian, Ramanathan; Shankar, Divya; Sebe, Nicu; Melcher, David
2014-03-26
A basic question in vision research regards where people look in complex scenes and how this influences their performance in various tasks. Previous studies with static images have demonstrated a close link between where people look and what they remember. Here, we examined the pattern of eye movements when participants watched neutral and emotional clips from Hollywood-style movies. Participants answered multiple-choice memory questions concerning visual and auditory scene details immediately upon viewing 1-min-long neutral or emotional movie clips. Fixations were more narrowly focused for emotional clips, and immediate memory for object details was worse compared to matched neutral scenes, implying preferential attention to emotional events. Although we found the expected correlation between where people looked and what they remembered for neutral clips, this relationship broke down for emotional clips. When participants were subsequently presented with key frames (static images) extracted from the movie clips such that presentation duration of the target objects (TOs) corresponding to the multiple-choice questions was matched and the earlier questions were repeated, more fixations were observed on the TOs, and memory performance also improved significantly, confirming that emotion modulates the relationship between gaze position and memory performance. Finally, in a long-term memory test, old/new recognition performance was significantly better for emotional scenes as compared to neutral scenes. Overall, these results are consistent with the hypothesis that emotional content draws eye fixations and strengthens memory for the scene gist while weakening encoding of peripheral scene details.
Redies, Christoph; Groß, Franziska
2013-01-01
Frames provide a visual link between artworks and their surround. We asked how image properties change as an observer zooms out from viewing a painting alone, to viewing the painting with its frame and, finally, the framed painting in its museum environment (museum scene). To address this question, we determined three higher-order image properties that are based on histograms of oriented luminance gradients. First, complexity was measured as the sum of the strengths of all gradients in the image. Second, we determined the self-similarity of histograms of the oriented gradients at different levels of spatial analysis. Third, we analyzed how much gradient strength varied across orientations (anisotropy). Results were obtained for three art museums that exhibited paintings from three major periods of Western art. In all three museums, the mean complexity of the frames was higher than that of the paintings or the museum scenes. Frames thus provide a barrier of complexity between the paintings and their exterior. By contrast, self-similarity and anisotropy values of images of framed paintings were intermediate between the images of the paintings and the museum scenes, i.e., the frames provided a transition between the paintings and their surround. We also observed differences between the three museums that may reflect modified frame usage in different art periods. For example, frames in the museum for 20th century art tended to be smaller and less complex than in the other two museums, which exhibit paintings from earlier art periods (13th–18th century and 19th century, respectively). Finally, we found that the three properties did not depend on the type of reproduction of the paintings (photographs in museums, scans from books or images from the Google Art Project). To the best of our knowledge, this study is the first to investigate the relation between frames and paintings by measuring physically defined, higher-order image properties. PMID:24265625
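The first and third properties described in this abstract can be illustrated with a minimal sketch. It assumes a single-scale gradient computed with NumPy and uses the standard deviation of the normalized orientation histogram as the anisotropy measure; the function name `gradient_properties`, the bin count, and that choice of spread measure are illustrative assumptions, not the authors' implementation (which works on histograms of oriented gradients at several spatial levels).

```python
import numpy as np

def gradient_properties(image, n_bins=16):
    """Return (complexity, anisotropy) for a 2-D grayscale array.

    complexity: sum of gradient strengths over the whole image.
    anisotropy: spread of gradient strength across orientation bins
                (std of the normalized histogram; an assumed measure).
    """
    gy, gx = np.gradient(image.astype(float))   # derivatives along rows, columns
    strength = np.hypot(gx, gy)                 # gradient magnitude per pixel
    # Fold orientation into [0, pi) so opposite directions share a bin.
    orientation = np.mod(np.arctan2(gy, gx), np.pi)
    complexity = float(strength.sum())
    hist, _ = np.histogram(orientation, bins=n_bins,
                           range=(0.0, np.pi), weights=strength)
    total = hist.sum()
    anisotropy = float(np.std(hist / total)) if total > 0 else 0.0
    return complexity, anisotropy
```

A flat image yields zero for both values, while a pure horizontal ramp concentrates all gradient strength in one orientation bin and therefore scores high on this anisotropy measure.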
Pasqualotto, Achille; Esenkaya, Tayfun
2016-01-01
Visual-to-auditory sensory substitution is used to convey visual information through audition, and it was initially created to compensate for blindness; it consists of software converting the visual images captured by a video-camera into the equivalent auditory images, or "soundscapes". Here, it was used by blindfolded sighted participants to learn the spatial position of simple shapes depicted in images arranged on the floor. Very few studies have used sensory substitution to investigate spatial representation, while it has been widely used to investigate object recognition. Additionally, with sensory substitution we could study the performance of participants actively exploring the environment through audition, rather than passively localizing sound sources. Blindfolded participants egocentrically learnt the position of six images by using sensory substitution and then a judgment of relative direction task (JRD) was used to determine how this scene was represented. This task consists of imagining being in a given location, oriented in a given direction, and pointing towards the required image. Before performing the JRD task, participants explored a map that provided allocentric information about the scene. Although spatial exploration was egocentric, surprisingly we found that performance in the JRD task was better for allocentric perspectives. This suggests that the egocentric representation of the scene was updated. This result is in line with previous studies using visual and somatosensory scenes, thus supporting the notion that different sensory modalities produce equivalent spatial representation(s). Moreover, our results have practical implications to improve training methods with sensory substitution devices (SSD).
Henderson, John M; Chanceaux, Myriam; Smith, Tim J
2009-01-23
We investigated the relationship between visual clutter and visual search in real-world scenes. Specifically, we investigated whether visual clutter, indexed by feature congestion, sub-band entropy, and edge density, correlates with search performance as assessed both by traditional behavioral measures (response time and error rate) and by eye movements. Our results demonstrate that clutter is related to search performance. These results hold for both traditional search measures and for eye movements. The results suggest that clutter may serve as an image-based proxy for search set size in real-world scenes.
Generating descriptive visual words and visual phrases for large-scale image applications.
Zhang, Shiliang; Tian, Qi; Hua, Gang; Huang, Qingming; Gao, Wen
2011-09-01
Bag-of-visual Words (BoWs) representation has been applied for various problems in the fields of multimedia and computer vision. The basic idea is to represent images as visual documents composed of repeatable and distinctive visual elements, which are comparable to the text words. Notwithstanding its great success and wide adoption, visual vocabulary created from single-image local descriptors is often shown to be not as effective as desired. In this paper, descriptive visual words (DVWs) and descriptive visual phrases (DVPs) are proposed as the visual correspondences to text words and phrases, where visual phrases refer to the frequently co-occurring visual word pairs. Since images are the carriers of visual objects and scenes, a descriptive visual element set can be composed by the visual words and their combinations which are effective in representing certain visual objects or scenes. Based on this idea, a general framework is proposed for generating DVWs and DVPs for image applications. In a large-scale image database containing 1506 object and scene categories, the visual words and visual word pairs descriptive to certain objects or scenes are identified and collected as the DVWs and DVPs. Experiments show that the DVWs and DVPs are informative and descriptive and, thus, are more comparable with the text words than the classic visual words. We apply the identified DVWs and DVPs in several applications including large-scale near-duplicated image retrieval, image search re-ranking, and object recognition. The combination of DVW and DVP performs better than the state of the art in large-scale near-duplicated image retrieval in terms of accuracy, efficiency and memory consumption. The proposed image search re-ranking algorithm, DWPRank, outperforms the state-of-the-art algorithm by 12.4% in mean average precision and is about 11 times faster.
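The notion of a visual phrase as a frequently co-occurring pair of visual words can be sketched as follows. The sketch assumes each image has already been quantized into a list of integer visual-word IDs and counts co-occurrence at the level of whole images; the function name `frequent_word_pairs` and the `min_count` threshold are hypothetical, and the paper's own selection criteria for DVWs and DVPs are not reproduced here.

```python
from collections import Counter
from itertools import combinations

def frequent_word_pairs(images, min_count=2):
    """Return {(word_a, word_b): n_images} for pairs seen in >= min_count images.

    images: iterable of lists of visual-word IDs, one list per image.
    """
    counts = Counter()
    for words in images:
        # Each unordered pair is counted at most once per image.
        for pair in combinations(sorted(set(words)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}
```

For example, `frequent_word_pairs([[1, 2, 3], [2, 3], [1, 3, 3]])` keeps the pairs `(1, 3)` and `(2, 3)`, each of which appears in two images, and drops `(1, 2)`.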
Development of Moire machine vision
NASA Technical Reports Server (NTRS)
Harding, Kevin G.
1987-01-01
Three dimensional perception is essential to the development of versatile robotics systems in order to handle complex manufacturing tasks in future factories and in providing high accuracy measurements needed in flexible manufacturing and quality control. A program is described which will develop the potential of Moire techniques to provide this capability in vision systems and automated measurements, and demonstrate artificial intelligence (AI) techniques to take advantage of the strengths of Moire sensing. Moire techniques provide a means of optically manipulating the complex visual data in a three dimensional scene into a form which can be easily and quickly analyzed by computers. This type of optical data manipulation provides high productivity through integrated automation, producing a high quality product while reducing computer and mechanical manipulation requirements and thereby the cost and time of production. This nondestructive evaluation is developed to be able to make full field range measurement and three dimensional scene analysis.
Development of Moire machine vision
NASA Astrophysics Data System (ADS)
Harding, Kevin G.
1987-10-01
Three dimensional perception is essential to the development of versatile robotics systems in order to handle complex manufacturing tasks in future factories and in providing high accuracy measurements needed in flexible manufacturing and quality control. A program is described which will develop the potential of Moire techniques to provide this capability in vision systems and automated measurements, and demonstrate artificial intelligence (AI) techniques to take advantage of the strengths of Moire sensing. Moire techniques provide a means of optically manipulating the complex visual data in a three dimensional scene into a form which can be easily and quickly analyzed by computers. This type of optical data manipulation provides high productivity through integrated automation, producing a high quality product while reducing computer and mechanical manipulation requirements and thereby the cost and time of production. This nondestructive evaluation is developed to be able to make full field range measurement and three dimensional scene analysis.
Category search speeds up face-selective fMRI responses in a non-hierarchical cortical face network.
Jiang, Fang; Badler, Jeremy B; Righi, Giulia; Rossion, Bruno
2015-05-01
The human brain is extremely efficient at detecting faces in complex visual scenes, but the spatio-temporal dynamics of this remarkable ability, and how it is influenced by category-search, remain largely unknown. In the present study, human subjects were shown gradually-emerging images of faces or cars in visual scenes, while neural activity was recorded using functional magnetic resonance imaging (fMRI). Category search was manipulated by the instruction to indicate the presence of either a face or a car, in different blocks, as soon as an exemplar of the target category was detected in the visual scene. The category selectivity of most face-selective areas was enhanced when participants were instructed to report the presence of faces in gradually decreasing noise stimuli. Conversely, the same regions showed much less selectivity when participants were instructed instead to detect cars. When "face" was the target category, the fusiform face area (FFA) showed consistently earlier differentiation of face versus car stimuli than did the "occipital face area" (OFA). When "car" was the target category, only the FFA showed differentiation of face versus car stimuli. These observations provide further challenges for hierarchical models of cortical face processing and show that during gradual revealing of information, selective category-search may decrease the required amount of information, enhancing and speeding up category-selective responses in the human brain.
Stainer, Matthew J.; Scott-Brown, Kenneth C.; Tatler, Benjamin W.
2013-01-01
Where people look when viewing a scene has been a much explored avenue of vision research (e.g., see Tatler, 2009). Current understanding of eye guidance suggests that a combination of high and low-level factors influence fixation selection (e.g., Torralba et al., 2006), but that there are also strong biases toward the center of an image (Tatler, 2007). However, situations where we view multiplexed scenes are becoming increasingly common, and it is unclear how visual inspection might be arranged when content lacks normal semantic or spatial structure. Here we use the central bias to examine how gaze behavior is organized in scenes that are presented in their normal format, or disrupted by scrambling the quadrants and separating them by space. In Experiment 1, scrambling scenes had the strongest influence on gaze allocation. Observers were highly biased by the quadrant center, although physical space did not enhance this bias. However, the center of the display still contributed to fixation selection above chance, and was most influential early in scene viewing. When the top left quadrant was held constant across all conditions in Experiment 2, fixation behavior was significantly influenced by the overall arrangement of the display, with fixations being biased toward the quadrant center when the other three quadrants were scrambled (despite the visual information in this quadrant being identical in all conditions). When scenes are scrambled into four quadrants and semantic contiguity is disrupted, observers no longer appear to view the content as a single scene (despite it consisting of the same visual information overall), but rather anchor visual inspection around the four separate “sub-scenes.” Moreover, the frame of reference that observers use when viewing the multiplex seems to change across viewing time: from an early bias toward the display center to a later bias toward quadrant centers. PMID:24069008
ERIC Educational Resources Information Center
Henderson, John M.; Nuthmann, Antje; Luke, Steven G.
2013-01-01
Recent research on eye movements during scene viewing has primarily focused on where the eyes fixate. But eye fixations also differ in their durations. Here we investigated whether fixation durations in scene viewing are under the direct and immediate control of the current visual input. Subjects freely viewed photographs of scenes in preparation…
Initial Scene Representations Facilitate Eye Movement Guidance in Visual Search
ERIC Educational Resources Information Center
Castelhano, Monica S.; Henderson, John M.
2007-01-01
What role does the initial glimpse of a scene play in subsequent eye movement guidance? In 4 experiments, a brief scene preview was followed by object search through the scene via a small moving window that was tied to fixation position. Experiment 1 demonstrated that the scene preview resulted in more efficient eye movements compared with a…
Language-Mediated Eye Movements in the Absence of a Visual World: The "Blank Screen Paradigm"
ERIC Educational Resources Information Center
Altmann, Gerry T. M.
2004-01-01
The "visual world paradigm" typically involves presenting participants with a visual scene and recording eye movements as they either hear an instruction to manipulate objects in the scene or as they listen to a description of what may happen to those objects. In this study, participants heard each target sentence only after the corresponding…
Dynamic binding of visual features by neuronal/stimulus synchrony.
Iwabuchi, A
1998-05-01
When people see a visual scene, certain parts of the visual scene are treated as belonging together and we regard them as a perceptual unit, which is called a "figure". People focus on figures, and the remaining parts of the scene are disregarded as "ground". In Gestalt psychology this process is called "figure-ground segregation". According to current perceptual psychology, a figure is formed by binding various visual features in a scene, and developments in neuroscience have revealed that there are many feature-encoding neurons, which respond to such features specifically. It is not known, however, how the brain binds different features of an object into a coherent visual object representation. Recently, the theory of binding by neuronal synchrony, which argues that feature binding is dynamically mediated by neuronal synchrony of feature-encoding neurons, has been proposed. This review article portrays the problem of figure-ground segregation and feature binding, summarizes neurophysiological and psychophysical experiments and theory relevant to feature binding by neuronal/stimulus synchrony, and suggests possible directions for future research on this topic.
Acoustical Awareness for Intelligent Robotic Action
2007-12-01
sound is desired or needed for some other purposes, but is interfering with the intended application, it is called noise. The Soundscape refers...to that which can be heard. Although often used interchangeably with the term Auditory Scene, the soundscape is a narrower definition, referring...difficult is the underlying complexity of the acoustical domain. The soundscape is always changing with time, more so than even the visual domain tends
Direct versus indirect processing changes the influence of color in natural scene categorization.
Otsuka, Sachio; Kawaguchi, Jun
2009-10-01
We used a negative priming (NP) paradigm to examine whether participants would categorize color and grayscale images of natural scenes that were presented peripherally and were ignored. We focused on (1) attentional resources allocated to natural scenes and (2) direct versus indirect processing of them. We set up low and high attention-load conditions, based on the set size of the searched stimuli in the prime display (one and five). Participants were required to detect and categorize the target objects in natural scenes in a central visual search task, ignoring peripheral natural images in both the prime and probe displays. The results showed that, irrespective of attention load, NP was observed for color scenes but not for grayscale scenes. We did not observe any effect of color information in central visual search, where participants responded directly to natural scenes. These results indicate that, in a situation in which participants indirectly process natural scenes, color information is critical to object categorization, but when the scenes are processed directly, color information does not contribute to categorization.
Matching optical flow to motor speed in virtual reality while running on a treadmill
Caramenti, Martina; Lafortuna, Claudio L.; Mugellini, Elena; Abou Khaled, Omar; Bresciani, Jean-Pierre; Dubois, Amandine
2018-01-01
We investigated how visual and kinaesthetic/efferent information is integrated for speed perception in running. Twelve moderately trained to trained subjects ran on a treadmill at three different speeds (8, 10, 12 km/h) in front of a moving virtual scene. They were asked to match the visual speed of the scene to their running speed, i.e., the treadmill's speed. For each trial, participants indicated whether the scene was moving slower or faster than they were running. Visual speed was adjusted according to their response using a staircase until the Point of Subjective Equality (PSE) was reached, i.e., until visual and running speed were perceived as equivalent. For all three running speeds, participants systematically underestimated the visual speed relative to their actual running speed. Indeed, the speed of the visual scene had to exceed the actual running speed in order to be perceived as equivalent to the treadmill speed. The underestimation of visual speed was speed-dependent, and percentage of underestimation relative to running speed ranged from 15% at 8 km/h to 31% at 12 km/h. We suggest that this fact should be taken into consideration to improve the design of attractive treadmill-mediated virtual environments enhancing engagement into physical activity for healthier lifestyles and disease prevention and care. PMID:29641564
Matching optical flow to motor speed in virtual reality while running on a treadmill.
Caramenti, Martina; Lafortuna, Claudio L; Mugellini, Elena; Abou Khaled, Omar; Bresciani, Jean-Pierre; Dubois, Amandine
2018-01-01
We investigated how visual and kinaesthetic/efferent information is integrated for speed perception in running. Twelve moderately trained to trained subjects ran on a treadmill at three different speeds (8, 10, 12 km/h) in front of a moving virtual scene. They were asked to match the visual speed of the scene to their running speed, i.e., the treadmill's speed. For each trial, participants indicated whether the scene was moving slower or faster than they were running. Visual speed was adjusted according to their response using a staircase until the Point of Subjective Equality (PSE) was reached, i.e., until visual and running speed were perceived as equivalent. For all three running speeds, participants systematically underestimated the visual speed relative to their actual running speed. Indeed, the speed of the visual scene had to exceed the actual running speed in order to be perceived as equivalent to the treadmill speed. The underestimation of visual speed was speed-dependent, and percentage of underestimation relative to running speed ranged from 15% at 8 km/h to 31% at 12 km/h. We suggest that this fact should be taken into consideration to improve the design of attractive treadmill-mediated virtual environments enhancing engagement into physical activity for healthier lifestyles and disease prevention and care.
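The staircase adjustment described in this abstract can be sketched in a few lines. The abstract does not state which staircase rule was used, so the sketch assumes a simple 1-up/1-down rule with a fixed step and estimates the PSE as the mean of the visual speeds at which the response reversed; the function name `staircase_pse`, its parameters, and the stopping rule are all illustrative assumptions.

```python
def staircase_pse(respond, start, step, n_reversals=6, max_trials=1000):
    """Estimate the Point of Subjective Equality with a 1-up/1-down staircase.

    respond(v) must return True if the scene moving at visual speed v is
    judged faster than the running speed, and False otherwise.
    """
    value = start
    last = None          # previous response, None before the first trial
    reversals = []       # visual speeds at which the response flipped
    for _ in range(max_trials):
        faster = respond(value)
        if last is not None and faster != last:
            reversals.append(value)
            if len(reversals) == n_reversals:
                break
        last = faster
        # Slow the scene down after "faster", speed it up after "slower".
        value += -step if faster else step
    if not reversals:
        raise ValueError("no reversal observed; PSE cannot be estimated")
    return sum(reversals) / len(reversals)
```

With a hypothetical noise-free observer who reports "faster" only above 12 km/h, `staircase_pse(lambda v: v > 12.0, start=8.0, step=1.0)` oscillates between 12 and 13 km/h and returns 12.5.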
The Influence of Scene Context on Parafoveal Processing of Objects.
Castelhano, Monica S; Pereira, Effie J
2017-04-21
Many studies in reading have shown the enhancing effect of context on the processing of a word before it is directly fixated (parafoveal processing of words; Balota et al., 1985; Balota & Rayner, 1983; Ehrlich & Rayner, 1981). Here, we examined whether scene context influences the parafoveal processing of objects and enhances the extraction of object information. Using a modified boundary paradigm (Rayner, 1975), the Dot-Boundary paradigm, participants fixated on a suddenly-onsetting cue before the preview object would onset 4° away. The preview object could be identical to the target, visually similar, visually dissimilar, or a control (black rectangle). The preview changed to the target object once a saccade toward the object was made. Critically, the objects were presented on either a consistent or an inconsistent scene background. Results revealed that there was a greater processing benefit for consistent than inconsistent scene backgrounds and that identical and visually similar previews produced greater processing benefits than other previews. In the second experiment, we added an additional context condition in which the target location was inconsistent, but the scene semantics remained consistent. We found that changing the location of the target object disrupted the processing benefit derived from the consistent context. Most importantly, across both experiments, the effect of preview was not enhanced by scene context. Thus, preview information and scene context appear to independently boost the parafoveal processing of objects without any interaction from object-scene congruency.
Hillstrom, Anne P; Segabinazi, Joice D; Godwin, Hayward J; Liversedge, Simon P; Benson, Valerie
2017-02-19
We explored the influence of early scene analysis and visible object characteristics on eye movements when searching for objects in photographs of scenes. On each trial, participants were shown sequentially either a scene preview or a uniform grey screen (250 ms), a visual mask, the name of the target and the scene, now including the target at a likely location. During the participant's first saccade during search, the target location was changed to: (i) a different likely location, (ii) an unlikely but possible location or (iii) a very implausible location. The results showed that the first saccade landed more often on the likely location in which the target re-appeared than on unlikely or implausible locations, and overall the first saccade landed nearer the first target location with a preview than without. Hence, rapid scene analysis influenced initial eye movement planning, but availability of the target rapidly modified that plan. After the target moved, it was found more quickly when it appeared in a likely location than when it appeared in an unlikely or implausible location. The findings show that both scene gist and object properties are extracted rapidly, and are used in conjunction to guide saccadic eye movements during visual search. This article is part of the themed issue 'Auditory and visual scene analysis'.
The effects of alcohol intoxication on attention and memory for visual scenes.
Harvey, Alistair J; Kneller, Wendy; Campbell, Alison C
2013-01-01
This study tests the claim that alcohol intoxication narrows the focus of visual attention on to the more salient features of a visual scene. A group of alcohol intoxicated and sober participants had their eye movements recorded as they encoded a photographic image featuring a central event of either high or low salience. All participants then recalled the details of the image the following day when sober. We sought to determine whether the alcohol group would pay less attention to the peripheral features of the encoded scene than their sober counterparts, whether this effect of attentional narrowing was stronger for the high-salience event than for the low-salience event, and whether it would lead to a corresponding deficit in peripheral recall. Alcohol was found to narrow the focus of foveal attention to the central features of both images but did not facilitate recall from this region. It also reduced the overall amount of information accurately recalled from each scene. These findings demonstrate that the concept of alcohol myopia originally posited to explain the social consequences of intoxication (Steele & Josephs, 1990) may be extended to explain the relative neglect of peripheral information during the processing of visual scenes.
Fu, Kun; Jin, Junqi; Cui, Runpeng; Sha, Fei; Zhang, Changshui
2017-12-01
Recent progress on automatic generation of image captions has shown that it is possible to describe the most salient information conveyed by images with accurate and meaningful sentences. In this paper, we propose an image captioning system that exploits the parallel structures between images and sentences. In our model, the process of generating the next word, given the previously generated ones, is aligned with the visual perception experience where the attention shifts among the visual regions; such transitions impose a thread of ordering in visual perception. This alignment characterizes the flow of latent meaning, which encodes what is semantically shared by both the visual scene and the text description. Our system also makes another novel modeling contribution by introducing scene-specific contexts that capture higher-level semantic information encoded in an image. The contexts adapt language models for word generation to specific scene types. We benchmark our system and contrast it with published results on several popular datasets, using both automatic evaluation metrics and human evaluation. We show that either region-based attention or scene-specific contexts improves over systems without those components. Furthermore, combining these two modeling ingredients attains state-of-the-art performance.
Fixation and saliency during search of natural scenes: the case of visual agnosia.
Foulsham, Tom; Barton, Jason J S; Kingstone, Alan; Dewhurst, Richard; Underwood, Geoffrey
2009-07-01
Models of eye movement control in natural scenes often distinguish between stimulus-driven processes (which guide the eyes to visually salient regions) and those based on task and object knowledge (which depend on expectations or identification of objects and scene gist). In the present investigation, the eye movements of a patient with visual agnosia were recorded while she searched for objects within photographs of natural scenes and compared to those made by students and age-matched controls. Agnosia is assumed to disrupt the top-down knowledge available in this task, and so may increase the reliance on bottom-up cues. The patient's deficit in object recognition was seen in poor search performance and inefficient scanning. The low-level saliency of target objects had an effect on responses in visual agnosia, and the most salient region in the scene was more likely to be fixated by the patient than by controls. An analysis of model-predicted saliency at fixation locations indicated a closer match between fixations and low-level saliency in agnosia than in controls. These findings are discussed in relation to saliency-map models and the balance between high and low-level factors in eye guidance.
Kukona, Anuenue; Tabor, Whitney
2011-01-01
The visual world paradigm presents listeners with a challenging problem: they must integrate two disparate signals, the spoken language and the visual context, in support of action (e.g., complex movements of the eyes across a scene). We present Impulse Processing, a dynamical systems approach to incremental eye movements in the visual world that suggests a framework for integrating language, vision, and action generally. Our approach assumes that impulses driven by the language and the visual context impinge minutely on a dynamical landscape of attractors corresponding to the potential eye-movement behaviors of the system. We test three unique predictions of our approach in an empirical study in the visual world paradigm, and describe an implementation in an artificial neural network. We discuss the Impulse Processing framework in relation to other models of the visual world paradigm. PMID:21609355
Perceptual congruency of audio-visual speech affects ventriloquism with bilateral visual stimuli.
Kanaya, Shoko; Yokosawa, Kazuhiko
2011-02-01
Many studies on multisensory processes have focused on performance in simplified experimental situations, with a single stimulus in each sensory modality. However, these results cannot necessarily be applied to explain our perceptual behavior in natural scenes where various signals exist within one sensory modality. We investigated the role of audio-visual syllable congruency on participants' auditory localization bias or the ventriloquism effect using spoken utterances and two videos of a talking face. Salience of facial movements was also manipulated. Results indicated that more salient visual utterances attracted participants' auditory localization. Congruent pairing of audio-visual utterances elicited greater localization bias than incongruent pairing, while previous studies have reported little dependency on the reality of stimuli in ventriloquism. Moreover, audio-visual illusory congruency, owing to the McGurk effect, caused substantial visual interference on auditory localization. Multisensory performance appears more flexible and adaptive in this complex environment than in previous studies.
Learning what to expect (in visual perception)
Seriès, Peggy; Seitz, Aaron R.
2013-01-01
Expectations are known to greatly affect our experience of the world. A growing theory in computational neuroscience is that perception can be successfully described using Bayesian inference models and that the brain is “Bayes-optimal” under some constraints. In this context, expectations are particularly interesting, because they can be viewed as prior beliefs in the statistical inference process. A number of questions remain unsolved, however, for example: How fast do priors change over time? Are there limits in the complexity of the priors that can be learned? How do an individual’s priors compare to the true scene statistics? Can we unlearn priors that are thought to correspond to natural scene statistics? Where and what are the neural substrate of priors? Focusing on the perception of visual motion, we here review recent studies from our laboratories and others addressing these issues. We discuss how these data on motion perception fit within the broader literature on perceptual Bayesian priors, perceptual expectations, and statistical and perceptual learning and review the possible neural basis of priors. PMID:24187536
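As a rough illustration of how an expectation can act as a prior in Bayesian inference, the sketch below combines a Gaussian prior with a Gaussian likelihood. It is a textbook conjugate update, not a model taken from the review. The function name `gaussian_posterior` and the example numbers (a prior centred on zero speed) are assumptions chosen for illustration.

```python
def gaussian_posterior(prior_mean, prior_var, obs, obs_var):
    """Combine a Gaussian prior with one Gaussian observation.

    Returns the posterior mean and variance. Precisions (inverse
    variances) add, and the posterior mean is the precision-weighted
    average of the prior mean and the observation.
    """
    prior_precision = 1.0 / prior_var
    obs_precision = 1.0 / obs_var
    post_var = 1.0 / (prior_precision + obs_precision)
    post_mean = post_var * (prior_precision * prior_mean + obs_precision * obs)
    return post_mean, post_var


# Hypothetical example: a prior favouring slow speeds (mean 0) pulls a
# noisy speed measurement of 2.0 toward zero.
mean, var = gaussian_posterior(prior_mean=0.0, prior_var=1.0, obs=2.0, obs_var=1.0)
print(mean, var)
```

In this simplified picture, a narrower prior (smaller `prior_var`) pulls the estimate more strongly toward the expected value, which is one way to think about how learned expectations bias perception.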
NASA Technical Reports Server (NTRS)
Bergeron, H. P.; Haynie, A. T.; Mcdede, J. B.
1980-01-01
A general aviation single pilot instrument flight rule simulation capability was developed. Problems experienced by single pilots flying in IFR conditions were investigated. The simulation required a three dimensional spatial navaid environment of a flight navigational area. A computer simulation of all the navigational aids plus 12 selected airports located in the Washington/Norfolk area was developed. All programmed locations in the list were referenced to a Cartesian coordinate system with the origin located at a specified airport's reference point. All navigational aids with their associated frequencies, call letters, locations, and orientations plus runways and true headings are included in the data base. The simulation included a TV displayed out-the-window visual scene of country and suburban terrain and a scaled model runway complex. Any of the programmed runways, with all its associated navaids, can be referenced to a runway on the airport in this visual scene. This allows a simulation of a full mission scenario including breakout and landing.
A three-layer model of natural image statistics.
Gutmann, Michael U; Hyvärinen, Aapo
2013-11-01
An important property of visual systems is to be simultaneously both selective to specific patterns found in the sensory input and invariant to possible variations. Selectivity and invariance (tolerance) are opposing requirements. It has been suggested that they could be joined by iterating a sequence of elementary selectivity and tolerance computations. It is, however, unknown what should be selected or tolerated at each level of the hierarchy. We approach this issue by learning the computations from natural images. We propose and estimate a probabilistic model of natural images that consists of three processing layers. Two natural image data sets are considered: image patches, and complete visual scenes downsampled to the size of small patches. For both data sets, we find that in the first two layers, simple and complex cell-like computations are performed. In the third layer, we mainly find selectivity to longer contours; for patch data, we further find some selectivity to texture, while for the downsampled complete scenes, some selectivity to curvature is observed. Copyright © 2013 Elsevier Ltd. All rights reserved.
Cortical systems mediating visual attention to both objects and spatial locations
Shomstein, Sarah; Behrmann, Marlene
2006-01-01
Natural visual scenes consist of many objects occupying a variety of spatial locations. Given that the plethora of information cannot be processed simultaneously, the multiplicity of inputs compete for representation. Using event-related functional MRI, we show that attention, the mechanism by which a subset of the input is selected, is mediated by the posterior parietal cortex (PPC). Of particular interest is that PPC activity is differentially sensitive to the object-based properties of the input, with enhanced activation for those locations bound by an attended object. Of great interest too is the ensuing modulation of activation in early cortical regions, reflected as differences in the temporal profile of the blood oxygenation level-dependent (BOLD) response for within-object versus between-object locations. These findings indicate that object-based selection results from an object-sensitive reorienting signal issued by the PPC. The dynamic circuit between the PPC and earlier sensory regions then enables observers to attend preferentially to objects of interest in complex scenes. PMID:16840559
Learning to Link Visual Contours
Li, Wu; Piëch, Valentin; Gilbert, Charles D.
2008-01-01
SUMMARY In complex visual scenes, linking related contour elements is important for object recognition. This process, thought to be stimulus driven and hard wired, has substrates in primary visual cortex (V1). Here, however, we find contour integration in V1 to depend strongly on perceptual learning and top-down influences that are specific to contour detection. In naive monkeys the information about contours embedded in complex backgrounds is absent in V1 neuronal responses, and is independent of the locus of spatial attention. Training animals to find embedded contours induces strong contour-related responses specific to the trained retinotopic region. These responses are most robust when animals perform the contour detection task, but disappear under anesthesia. Our findings suggest that top-down influences dynamically adapt neural circuits according to specific perceptual tasks. This may serve as a general neuronal mechanism of perceptual learning, and reflect top-down mediated changes in cortical states. PMID:18255036
He, Mengyang; Qi, Changzhu; Lu, Yang; Song, Amanda; Hayat, Saba Z; Xu, Xia
2018-05-21
Extensive studies have shown that a sports expert is superior to a sports novice in visually perceptual-cognitive processes of sports scene information; however, the attentional and neural basis of this superiority has not been thoroughly explored. The present study examined whether a sports expert has attentional superiority on scene information relevant to his/her sport skill, and explored what factor drives this superiority. To address this problem, EEGs were recorded as participants passively viewed sport scenes (tennis vs. non-tennis) and negative emotional faces in the context of a visual attention task, where the pictures of sport scenes or of negative emotional faces randomly followed the pictures with overlapping sport scenes and negative emotional faces. ERP results showed that for experts, the evoked potential of attentional competition elicited by the overlap of tennis scene was significantly larger than that evoked by the overlap of non-tennis scene, while this effect was absent for novices. The LORETA showed that the experts' left medial frontal gyrus (MFG) cortex was significantly more active as compared to the right MFG when processing the overlap of tennis scene, but the lateralization effect was not significant in novices. Those results indicate that experts have attentional superiority on skill-related scene information, despite intruding the scene through negative emotional faces that are prone to cause negativity bias toward their visual field as a strong distractor. This superiority is actuated by the activation of left MFG cortex and probably due to self-reference. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
Progress in high-level exploratory vision
NASA Astrophysics Data System (ADS)
Brand, Matthew
1993-08-01
We have been exploring the hypothesis that vision is an explanatory process, in which causal and functional reasoning about potential motion plays an intimate role in mediating the activity of low-level visual processes. In particular, we have explored two of the consequences of this view for the construction of purposeful vision systems: Causal and design knowledge can be used to (1) drive focus of attention, and (2) choose between ambiguous image interpretations. An important result of visual understanding is an explanation of the scene's causal structure: How action is originated, constrained, and prevented, and what will happen in the immediate future. In everyday visual experience, most action takes the form of motion, and most causal analysis takes the form of dynamical analysis. This is even true of static scenes, where much of a scene's interest lies in how possible motions are arrested. This paper describes our progress in developing domain theories and visual processes for the understanding of various kinds of structured scenes, including structures built out of children's constructive toys and simple mechanical devices.
High-dynamic-range scene compression in humans
NASA Astrophysics Data System (ADS)
McCann, John J.
2006-02-01
Single pixel dynamic-range compression alters a particular input value to a unique output value - a look-up table. It is used in chemical and most digital photographic systems having S-shaped transforms to render high-range scenes onto low-range media. Post-receptor neural processing is spatial, as shown by the physiological experiments of Dowling, Barlow, Kuffler, and Hubel & Wiesel. Human vision does not render a particular receptor-quanta catch as a unique response. Instead, because of spatial processing, the response to a particular quanta catch can be any color. Visual response is scene dependent. Stockham proposed an approach to model human range compression using low-spatial frequency filters. Campbell, Ginsberg, Wilson, Watson, Daly and many others have developed spatial-frequency channel models. This paper describes experiments measuring the properties of desirable spatial-frequency filters for a variety of scenes. Given the radiances of each pixel in the scene and the observed appearances of objects in the image, one can calculate the visual mask for that individual image. Here, visual mask is the spatial pattern of changes made by the visual system in processing the input image. It is the spatial signature of human vision. Low-dynamic range images with many white areas need no spatial filtering. High-dynamic-range images with many blacks, or deep shadows, require strong spatial filtering. Sun on the right and shade on the left requires directional filters. These experiments show that variable scene-dependent filters are necessary to mimic human vision. Although spatial-frequency filters can model human dependent appearances, the problem still remains that an analysis of the scene is still needed to calculate the scene-dependent strengths of each of the filters for each frequency.
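To make the "single pixel" idea concrete, the sketch below builds an S-shaped look-up table and applies it pixel by pixel. The logistic curve, the `gain` parameter, and the function names `s_curve_lut` and `apply_lut` are illustrative assumptions, not the transforms used in the paper. The point is that each input code maps to exactly one output code regardless of its neighbours, which is the property the abstract contrasts with scene-dependent spatial processing.

```python
import math


def s_curve_lut(levels=256, gain=8.0):
    """Build a look-up table from a logistic (S-shaped) tone curve.

    The curve is rescaled so that input 0 maps to 0 and the top input
    code maps to the top output code.
    """
    lo = 1.0 / (1.0 + math.exp(gain * 0.5))   # raw curve value at x = 0
    hi = 1.0 / (1.0 + math.exp(-gain * 0.5))  # raw curve value at x = 1
    lut = []
    for code in range(levels):
        x = code / (levels - 1)
        y = 1.0 / (1.0 + math.exp(-gain * (x - 0.5)))
        y = (y - lo) / (hi - lo)
        lut.append(round(y * (levels - 1)))
    return lut


def apply_lut(image, lut):
    """Map every pixel through the table; neighbouring pixels are ignored."""
    return [[lut[p] for p in row] for row in image]


lut = s_curve_lut()
image = [[0, 64, 128], [192, 255, 64]]  # toy 2x3 image of 8-bit codes
print(apply_lut(image, lut))
```

Because the table is fixed, the two pixels with code 64 in the toy image always receive the same output, whatever surrounds them.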
Bilateral Theta-Burst TMS to Influence Global Gestalt Perception
Ritzinger, Bernd; Huberle, Elisabeth; Karnath, Hans-Otto
2012-01-01
While early and higher visual areas along the ventral visual pathway in the inferotemporal cortex are critical for the recognition of individual objects, the neural representation of human perception of complex global visual scenes remains under debate. Stroke patients with a selective deficit in the perception of a complex global Gestalt with intact recognition of individual objects – a deficit termed simultanagnosia – greatly helped to study this question. Interestingly, simultanagnosia typically results from bilateral lesions of the temporo-parietal junction (TPJ). The present study aimed to verify the relevance of this area for human global Gestalt perception. We applied continuous theta-burst TMS either unilaterally (left or right) or bilateral simultaneously over TPJ. Healthy subjects were presented with hierarchically organized visual stimuli that allowed parametrical degrading of the object at the global level. Identification of the global Gestalt was significantly modulated only for the bilateral TPJ stimulation condition. Our results strengthen the view that global Gestalt perception in the human brain involves TPJ and is co-dependent on both hemispheres. PMID:23110106
Rice, Katherine; Moriuchi, Jennifer M; Jones, Warren; Klin, Ami
2012-03-01
To examine patterns of variability in social visual engagement and their relationship to standardized measures of social disability in a heterogeneous sample of school-aged children with autism spectrum disorders (ASD). Eye-tracking measures of visual fixation during free-viewing of dynamic social scenes were obtained for 109 children with ASD (mean age, 10.2 ± 3.2 years), 37 of whom were matched with 26 typically-developing (TD) children (mean age, 9.5 ± 2.2 years) on gender, age, and IQ. The smaller subset allowed between-group comparisons, whereas the larger group was used for within-group examinations of ASD heterogeneity. Between-group comparisons revealed significantly attenuated orientation to socially salient aspects of the scenes, with the largest effect size (Cohen's d = 1.5) obtained for reduced fixation on faces. Within-group analyses revealed a robust association between higher fixation on the inanimate environment and greater social disability. However, the associations between fixation on the eyes and mouth and social adaptation varied greatly, even reversing, when comparing different cognitive profile subgroups. Although patterns of social visual engagement with naturalistic social stimuli are profoundly altered in children with ASD, the social adaptivity of these behaviors varies for different groups of children. This variation likely represents different patterns of adaptation and maladaptation that should be traced longitudinally to the first years of life, before complex interactions between early predispositions and compensatory learning take place. We propose that variability in these early mechanisms of socialization may serve as proximal behavioral manifestations of genetic vulnerabilities. Copyright © 2012 American Academy of Child and Adolescent Psychiatry. Published by Elsevier Inc. All rights reserved.
Modulation of Temporal Precision in Thalamic Population Responses to Natural Visual Stimuli
Desbordes, Gaëlle; Jin, Jianzhong; Alonso, Jose-Manuel; Stanley, Garrett B.
2010-01-01
Natural visual stimuli have highly structured spatial and temporal properties which influence the way visual information is encoded in the visual pathway. In response to natural scene stimuli, neurons in the lateral geniculate nucleus (LGN) are temporally precise – on a time scale of 10–25 ms – both within single cells and across cells within a population. This time scale, established by non stimulus-driven elements of neuronal firing, is significantly shorter than that of natural scenes, yet is critical for the neural representation of the spatial and temporal structure of the scene. Here, a generalized linear model (GLM) that combines stimulus-driven elements with spike-history dependence associated with intrinsic cellular dynamics is shown to predict the fine timing precision of LGN responses to natural scene stimuli, the corresponding correlation structure across nearby neurons in the population, and the continuous modulation of spike timing precision and latency across neurons. A single model captured the experimentally observed neural response, across different levels of contrasts and different classes of visual stimuli, through interactions between the stimulus correlation structure and the nonlinearity in spike generation and spike history dependence. Given the sensitivity of the thalamocortical synapse to closely timed spikes and the importance of fine timing precision for the faithful representation of natural scenes, the modulation of thalamic population timing over these time scales is likely important for cortical representations of the dynamic natural visual environment. PMID:21151356
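The structure of a generalized linear model with spike-history dependence can be sketched as follows. This is a minimal toy simulation, not the fitted model from the study: the filter values, the exponential nonlinearity, the bin width `dt`, and the function name `simulate_glm` are all assumptions made for illustration.

```python
import math
import random


def simulate_glm(stimulus, stim_filter, hist_filter, bias, dt=0.001, seed=0):
    """Simulate spikes from a toy GLM with a spike-history term.

    In each time bin the drive is the bias, plus the stimulus filtered by
    stim_filter, plus the cell's own past spikes filtered by hist_filter.
    The drive passes through an exponential to give a rate in spikes/s,
    and a spike is drawn with probability 1 - exp(-rate * dt).
    """
    rng = random.Random(seed)
    spikes = []
    for t in range(len(stimulus)):
        drive = bias
        for k, w in enumerate(stim_filter):
            if t - k >= 0:
                drive += w * stimulus[t - k]
        for k, h in enumerate(hist_filter):
            if t - 1 - k >= 0:
                drive += h * spikes[t - 1 - k]
        rate = math.exp(drive)
        p_spike = 1.0 - math.exp(-rate * dt)
        spikes.append(1 if rng.random() < p_spike else 0)
    return spikes


# Hypothetical settings: constant stimulus, strong one-bin refractoriness.
stimulus = [1.0] * 1000
spikes = simulate_glm(stimulus, stim_filter=[0.5], hist_filter=[-1000.0],
                      bias=math.log(300.0))
print(sum(spikes), "spikes in", len(spikes), "bins")
```

A strongly negative history weight suppresses firing immediately after each spike, which is one simple way intrinsic dynamics can shape spike timing independently of the stimulus.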
Language-guided visual processing affects reasoning: the role of referential and spatial anchoring.
Dumitru, Magda L; Joergensen, Gitte H; Cruickshank, Alice G; Altmann, Gerry T M
2013-06-01
Language is more than a source of information for accessing higher-order conceptual knowledge. Indeed, language may determine how people perceive and interpret visual stimuli. Visual processing in linguistic contexts, for instance, mirrors language processing and happens incrementally, rather than through variously-oriented fixations over a particular scene. The consequences of this atypical visual processing are yet to be determined. Here, we investigated the integration of visual and linguistic input during a reasoning task. Participants listened to sentences containing conjunctions or disjunctions (Nancy examined an ant and/or a cloud) and looked at visual scenes containing two pictures that either matched or mismatched the nouns. Degree of match between nouns and pictures (referential anchoring) and between their expected and actual spatial positions (spatial anchoring) affected fixations as well as judgments. We conclude that language induces incremental processing of visual scenes, which in turn becomes susceptible to reasoning errors during the language-meaning verification process. Copyright © 2013 Elsevier Inc. All rights reserved.
Songnian, Zhao; Qi, Zou; Chang, Liu; Xuemin, Liu; Shousi, Sun; Jun, Qiu
2014-04-23
How it is possible to "faithfully" represent a three-dimensional stereoscopic scene using Cartesian coordinates on a plane, and how three-dimensional perceptions differ between an actual scene and an image of the same scene are questions that have not yet been explored in depth. They seem like commonplace phenomena, but in fact, they are important and difficult issues for visual information processing, neural computation, physics, psychology, cognitive psychology, and neuroscience. The results of this study show that the use of plenoptic (or all-optical) functions and their dual plane parameterizations can not only explain the nature of information processing from the retina to the primary visual cortex and, in particular, the characteristics of the visual pathway's optical system and its affine transformation, but they can also clarify the reason why the vanishing point and line exist in a visual image. In addition, they can better explain the reasons why a three-dimensional Cartesian coordinate system can be introduced into the two-dimensional plane to express a real three-dimensional scene. 1. We introduce two different mathematical expressions of the plenoptic functions, Pw and Pv that can describe the objective world. We also analyze the differences between these two functions when describing visual depth perception, that is, the difference between how these two functions obtain the depth information of an external scene. 2. The main results include a basic method for introducing a three-dimensional Cartesian coordinate system into a two-dimensional plane to express the depth of a scene, its constraints, and algorithmic implementation. In particular, we include a method to separate the plenoptic function and proceed with the corresponding transformation in the retina and visual cortex. 3. We propose that size constancy, the vanishing point, and vanishing line form the basis of visual perception of the outside world, and that the introduction of a three-dimensional Cartesian coordinate system into a two dimensional plane reveals a corresponding mapping between a retinal image and the vanishing point and line.
2014-01-01
Background How it is possible to “faithfully” represent a three-dimensional stereoscopic scene using Cartesian coordinates on a plane, and how three-dimensional perceptions differ between an actual scene and an image of the same scene are questions that have not yet been explored in depth. They seem like commonplace phenomena, but in fact, they are important and difficult issues for visual information processing, neural computation, physics, psychology, cognitive psychology, and neuroscience. Results The results of this study show that the use of plenoptic (or all-optical) functions and their dual plane parameterizations can not only explain the nature of information processing from the retina to the primary visual cortex and, in particular, the characteristics of the visual pathway’s optical system and its affine transformation, but they can also clarify the reason why the vanishing point and line exist in a visual image. In addition, they can better explain the reasons why a three-dimensional Cartesian coordinate system can be introduced into the two-dimensional plane to express a real three-dimensional scene. Conclusions 1. We introduce two different mathematical expressions of the plenoptic functions, Pw and Pv that can describe the objective world. We also analyze the differences between these two functions when describing visual depth perception, that is, the difference between how these two functions obtain the depth information of an external scene. 2. The main results include a basic method for introducing a three-dimensional Cartesian coordinate system into a two-dimensional plane to express the depth of a scene, its constraints, and algorithmic implementation. In particular, we include a method to separate the plenoptic function and proceed with the corresponding transformation in the retina and visual cortex. 3. We propose that size constancy, the vanishing point, and vanishing line form the basis of visual perception of the outside world, and that the introduction of a three-dimensional Cartesian coordinate system into a two dimensional plane reveals a corresponding mapping between a retinal image and the vanishing point and line. PMID:24755246
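Why parallel lines in a scene meet at a vanishing point in the image can be illustrated with the standard pinhole projection. The sketch below uses that textbook camera model, not the plenoptic-function formulation of the paper. The focal length `f` and the helper names `project` and `vanishing_point` are assumptions for illustration.

```python
def project(point, f=1.0):
    """Pinhole projection of a 3D point (x, y, z), z > 0, onto the image plane."""
    x, y, z = point
    return (f * x / z, f * y / z)


def vanishing_point(direction, f=1.0):
    """Image point approached by every 3D line with the given direction.

    Points p0 + t * d project to f * (p0 + t * d)_xy / (p0 + t * d)_z, which
    tends to f * d_xy / d_z as t grows, independent of p0. Lines parallel to
    the image plane (d_z == 0) have no finite vanishing point.
    """
    dx, dy, dz = direction
    if dz == 0:
        return None
    return (f * dx / dz, f * dy / dz)


# Two parallel lines with different starting points share one vanishing point.
d = (1.0, 0.0, 2.0)
for p0 in [(0.0, -1.0, 1.0), (3.0, 2.0, 1.0)]:
    t = 1e6
    far_point = tuple(p + t * c for p, c in zip(p0, d))
    print(project(far_point), "approaches", vanishing_point(d))
```

The starting point `p0` drops out of the limit, which is why all lines sharing a direction converge on the same image location.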
Schomaker, Judith; Walper, Daniel; Wittmann, Bianca C; Einhäuser, Wolfgang
2017-04-01
In addition to low-level stimulus characteristics and current goals, our previous experience with stimuli can also guide attentional deployment. It remains unclear, however, if such effects act independently or whether they interact in guiding attention. In the current study, we presented natural scenes including every-day objects that differed in affective-motivational impact. In the first free-viewing experiment, we presented visually-matched triads of scenes in which one critical object was replaced that varied mainly in terms of motivational value, but also in terms of valence and arousal, as confirmed by ratings by a large set of observers. Treating motivation as a categorical factor, we found that it affected gaze. A linear-effect model showed that arousal, valence, and motivation predicted fixations above and beyond visual characteristics, like object size, eccentricity, or visual salience. In a second experiment, we experimentally investigated whether the effects of emotion and motivation could be modulated by visual salience. In a medium-salience condition, we presented the same unmodified scenes as in the first experiment. In a high-salience condition, we retained the saturation of the critical object in the scene, and decreased the saturation of the background, and in a low-salience condition, we desaturated the critical object while retaining the original saturation of the background. We found that highly salient objects guided gaze, but still found additional additive effects of arousal, valence and motivation, confirming that higher-level factors can also guide attention, as measured by fixations towards objects in natural scenes. Copyright © 2017 Elsevier Ltd. All rights reserved.
Pasqualotto, Achille; Esenkaya, Tayfun
2016-01-01
Visual-to-auditory sensory substitution is used to convey visual information through audition, and it was initially created to compensate for blindness; it consists of software converting the visual images captured by a video-camera into the equivalent auditory images, or “soundscapes”. Here, it was used by blindfolded sighted participants to learn the spatial position of simple shapes depicted in images arranged on the floor. Very few studies have used sensory substitution to investigate spatial representation, while it has been widely used to investigate object recognition. Additionally, with sensory substitution we could study the performance of participants actively exploring the environment through audition, rather than passively localizing sound sources. Blindfolded participants egocentrically learnt the position of six images by using sensory substitution and then a judgment of relative direction task (JRD) was used to determine how this scene was represented. This task consists of imagining being in a given location, oriented in a given direction, and pointing towards the required image. Before performing the JRD task, participants explored a map that provided allocentric information about the scene. Although spatial exploration was egocentric, surprisingly we found that performance in the JRD task was better for allocentric perspectives. This suggests that the egocentric representation of the scene was updated. This result is in line with previous studies using visual and somatosensory scenes, thus supporting the notion that different sensory modalities produce equivalent spatial representation(s). Moreover, our results have practical implications to improve training methods with sensory substitution devices (SSD). PMID:27148000
Interrupted Visual Searches Reveal Volatile Search Memory
ERIC Educational Resources Information Center
Shen, Y. Jeremy; Jiang, Yuhong V.
2006-01-01
This study investigated memory from interrupted visual searches. Participants conducted a change detection search task on polygons overlaid on scenes. Search was interrupted by various disruptions, including unfilled delay, passive viewing of other scenes, and additional search on new displays. Results showed that performance was unaffected by…
Chromatic information and feature detection in fast visual analysis
Del Viva, Maria M.; Punzi, Giovanni; Shevell, Steven K.; ...
2016-08-01
The visual system is able to recognize a scene based on a sketch made of very simple features. This ability is likely crucial for survival, when fast image recognition is necessary, and it is believed that a primal sketch is extracted very early in the visual processing. Such highly simplified representations can be sufficient for accurate object discrimination, but an open question is the role played by color in this process. Rich color information is available in natural scenes, yet artist's sketches are usually monochromatic; and, black-and-white movies provide compelling representations of real world scenes. Also, the contrast sensitivity of color is low at fine spatial scales. We approach the question from the perspective of optimal information processing by a system endowed with limited computational resources. We show that when such limitations are taken into account, the intrinsic statistical properties of natural scenes imply that the most effective strategy is to ignore fine-scale color features and devote most of the bandwidth to gray-scale information. We find confirmation of these information-based predictions from psychophysics measurements of fast-viewing discrimination of natural scenes. As a result, we conclude that the lack of colored features in our visual representation, and our overall low sensitivity to high-frequency color components, are a consequence of an adaptation process, optimizing the size and power consumption of our brain for the visual world we live in.
Chromatic information and feature detection in fast visual analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Del Viva, Maria M.; Punzi, Giovanni; Shevell, Steven K.
The visual system is able to recognize a scene based on a sketch made of very simple features. This ability is likely crucial for survival, when fast image recognition is necessary, and it is believed that a primal sketch is extracted very early in the visual processing. Such highly simplified representations can be sufficient for accurate object discrimination, but an open question is the role played by color in this process. Rich color information is available in natural scenes, yet artist's sketches are usually monochromatic; and, black-and-white movies provide compelling representations of real world scenes. Also, the contrast sensitivity of color is low at fine spatial scales. We approach the question from the perspective of optimal information processing by a system endowed with limited computational resources. We show that when such limitations are taken into account, the intrinsic statistical properties of natural scenes imply that the most effective strategy is to ignore fine-scale color features and devote most of the bandwidth to gray-scale information. We find confirmation of these information-based predictions from psychophysics measurements of fast-viewing discrimination of natural scenes. As a result, we conclude that the lack of colored features in our visual representation, and our overall low sensitivity to high-frequency color components, are a consequence of an adaptation process, optimizing the size and power consumption of our brain for the visual world we live in.
Synchronization of spontaneous eyeblinks while viewing video stories
Nakano, Tamami; Yamamoto, Yoshiharu; Kitajo, Keiichi; Takahashi, Toshimitsu; Kitazawa, Shigeru
2009-01-01
Blinks are generally suppressed during a task that requires visual attention and tend to occur immediately before or after the task when the timing of its onset and offset are explicitly given. During the viewing of video stories, blinks are expected to occur at explicit breaks such as scene changes. However, given that the scene length is unpredictable, there should also be appropriate timing for blinking within a scene to prevent temporal loss of critical visual information. Here, we show that spontaneous blinks were highly synchronized between and within subjects when they viewed the same short video stories, but were not explicitly tied to the scene breaks. Synchronized blinks occurred during scenes that required less attention such as at the conclusion of an action, during the absence of the main character, during a long shot and during repeated presentations of a similar scene. In contrast, blink synchronization was not observed when subjects viewed a background video or when they listened to a story read aloud. The results suggest that humans share a mechanism for controlling the timing of blinks that searches for an implicit timing that is appropriate to minimize the chance of losing critical information while viewing a stream of visual events. PMID:19640888
Common and Innovative Visuals: A sparsity modeling framework for video.
Abdolhosseini Moghadam, Abdolreza; Kumar, Mrityunjay; Radha, Hayder
2014-05-02
Efficient video representation models are critical for many video analysis and processing tasks. In this paper, we present a framework based on the concept of finding the sparsest solution to model video frames. To model the spatio-temporal information, frames from one scene are decomposed into two components: (i) a common frame, which describes the visual information common to all the frames in the scene/segment, and (ii) a set of innovative frames, which depicts the dynamic behaviour of the scene. The proposed approach exploits and builds on recent results in the field of compressed sensing to jointly estimate the common frame and the innovative frames for each video segment. We refer to the proposed modeling framework by CIV (Common and Innovative Visuals). We show how the proposed model can be utilized to find scene change boundaries and extend CIV to videos from multiple scenes. Furthermore, the proposed model is robust to noise and can be used for various video processing applications without relying on motion estimation and detection or image segmentation. Results for object tracking, video editing (object removal, inpainting) and scene change detection are presented to demonstrate the efficiency and the performance of the proposed model.
Marin, Manuela M.; Leder, Helmut
2013-01-01
Subjective complexity has been found to be related to hedonic measures of preference, pleasantness and beauty, but there is no consensus about the nature of this relationship in the visual and musical domains. Moreover, the affective content of stimuli has been largely neglected so far in the study of complexity but is crucial in many everyday contexts and in aesthetic experiences. We thus propose a cross-domain approach that acknowledges the multidimensional nature of complexity and that uses a wide range of objective complexity measures combined with subjective ratings. In four experiments, we employed pictures of affective environmental scenes, representational paintings, and Romantic solo and chamber music excerpts. Stimuli were pre-selected to vary in emotional content (pleasantness and arousal) and complexity (low versus high number of elements). For each set of stimuli, in a between-subjects design, ratings of familiarity, complexity, pleasantness and arousal were obtained for a presentation time of 25 s from 152 participants. In line with Berlyne’s collative-motivation model, statistical analyses controlling for familiarity revealed a positive relationship between subjective complexity and arousal, and the highest correlations were observed for musical stimuli. Evidence for a mediating role of arousal in the complexity-pleasantness relationship was demonstrated in all experiments, but was only significant for females with regard to music. The direction and strength of the linear relationship between complexity and pleasantness depended on the stimulus type and gender. For environmental scenes, the root mean square contrast measures and measures of compressed file size correlated best with subjective complexity, whereas only edge detection based on phase congruency yielded equivalent results for representational paintings. 
Measures of compressed file size and event density also showed positive correlations with complexity and arousal in music, which is relevant for the discussion on which aspects of complexity are domain-specific and which are domain-general. PMID:23977295
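The two objective measures named in the abstract above for environmental scenes, root mean square contrast and compressed file size, can be illustrated with a minimal sketch. It assumes a grayscale image given as a flat list of intensities in [0, 1] and uses zlib as a stand-in compressor; the abstract does not state which image format or codec the authors used, and the function names and test images here are hypothetical.

```python
import math
import random
import zlib


def rms_contrast(pixels):
    """RMS contrast: standard deviation of pixel intensities in [0, 1]."""
    n = len(pixels)
    mean = sum(pixels) / n
    return math.sqrt(sum((p - mean) ** 2 for p in pixels) / n)


def compressed_size(pixels):
    """Number of bytes zlib needs for the 8-bit version of the image."""
    raw = bytes(int(round(p * 255)) for p in pixels)
    return len(zlib.compress(raw, 9))


# Three hypothetical 32 x 32 test images, flattened row by row.
flat = [0.5] * 1024                                          # uniform gray
checker = [float((i + i // 32) % 2) for i in range(1024)]    # checkerboard
rng = random.Random(0)
noise = [rng.random() for _ in range(1024)]                  # unstructured

for name, img in (("flat", flat), ("checker", checker), ("noise", noise)):
    print(name, round(rms_contrast(img), 3), compressed_size(img))
```

The checkerboard has high contrast but compresses well, which hints at why the two measures need not agree with each other or with subjective ratings.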
Urakawa, Tomokazu; Ogata, Katsuya; Kimura, Takahiro; Kume, Yuko; Tobimatsu, Shozo
2015-01-01
Disambiguation of a noisy visual scene with prior knowledge is an indispensable task of the visual system. To adequately adapt to a dynamically changing visual environment full of noisy visual scenes, the implementation of knowledge-mediated disambiguation in the brain is imperative and essential for proceeding as fast as possible under the limited capacity of visual image processing. However, the temporal profile of the disambiguation process has not yet been fully elucidated in the brain. The present study attempted to determine how quickly knowledge-mediated disambiguation began to proceed along visual areas after the onset of a two-tone ambiguous image using magnetoencephalography with high temporal resolution. Using the predictive coding framework, we focused on activity reduction for the two-tone ambiguous image as an index of the implementation of disambiguation. Source analysis revealed that a significant activity reduction was observed in the lateral occipital area at approximately 120 ms after the onset of the ambiguous image, but not in preceding activity (about 115 ms) in the cuneus when participants perceptually disambiguated the ambiguous image with prior knowledge. These results suggested that knowledge-mediated disambiguation may be implemented as early as approximately 120 ms following an ambiguous visual scene, at least in the lateral occipital area, and provided an insight into the temporal profile of the disambiguation process of a noisy visual scene with prior knowledge. © 2014 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
ERIC Educational Resources Information Center
Marill, Thomas; And Others
The aim of the CYCLOPS Project research is the development of techniques for allowing computers to perform visual scene analysis, pre-processing of visual imagery, and perceptual learning. Work on scene analysis and learning has previously been described. The present report deals with research on pre-processing and with further work on scene…
Effects of Spatio-Temporal Aliasing on Out-the-Window Visual Systems
NASA Technical Reports Server (NTRS)
Sweet, Barbara T.; Stone, Leland S.; Liston, Dorion B.; Hebert, Tim M.
2014-01-01
Designers of out-the-window visual systems face a challenge when attempting to simulate the outside world as viewed from a cockpit. Many methodologies have been developed and adopted to aid in the depiction of particular scene features, or levels of static image detail. However, because aircraft move, it is necessary to also consider the quality of the motion in the simulated visual scene. When motion is introduced in the simulated visual scene, perceptual artifacts can become apparent. A particular artifact related to image motion, spatio-temporal aliasing, will be addressed. The causes of spatio-temporal aliasing will be discussed, and current knowledge regarding the impact of these artifacts on both motion perception and simulator task performance will be reviewed. Methods of reducing the impact of this artifact are also addressed.
Voxel Datacubes for 3D Visualization in Blender
NASA Astrophysics Data System (ADS)
Gárate, Matías
2017-05-01
The growth of computational astrophysics and the complexity of multi-dimensional data sets demonstrate the need for new versatile visualization tools for both the analysis and presentation of the data. In this work, we show how to use the open-source software Blender as a three-dimensional (3D) visualization tool to study and visualize numerical simulation results, focusing on astrophysical hydrodynamic experiments. With a datacube as input, the software can generate a volume rendering of the 3D data, show the evolution of a simulation in time, and do a fly-around camera animation to highlight the points of interest. We explain the process to import simulation outputs into Blender using the voxel data format, and how to set up a visualization scene in the software interface. This method allows scientists to perform a complementary visual analysis of their data and display their results in an appealing way, both for outreach and science presentations.
New insights into ambient and focal visual fixations using an automatic classification algorithm
Follet, Brice; Le Meur, Olivier; Baccino, Thierry
2011-01-01
Overt visual attention is the act of directing the eyes toward a given area. These eye movements are characterised by saccades and fixations. A debate currently surrounds the role of visual fixations. Do they all have the same role in the free viewing of natural scenes? Recent studies suggest that at least two types of visual fixations exist: focal and ambient. The former is believed to be used to inspect local areas accurately, whereas the latter is used to obtain the context of the scene. We investigated the use of an automated system to cluster visual fixations in two groups using four types of natural scene images. We found new evidence to support a focal–ambient dichotomy. Our data indicate that the determining factor is the saccade amplitude. The dependence on the low-level visual features and the time course of these two kinds of visual fixations were examined. Our results demonstrate that there is an interplay between both fixation populations and that focal fixations are more dependent on low-level visual features than are ambient fixations. PMID:23145248
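The abstract above reports that fixations were clustered into two groups and that saccade amplitude was the determining factor. A minimal sketch of that idea is a two-cluster, one-dimensional k-means on saccade amplitudes. This is an assumption for illustration only: the abstract does not say which clustering algorithm the authors used, and the amplitude values below are made up.

```python
def two_means(values, iterations=50):
    """Split 1-D values into two clusters; return (low mean, high mean, threshold)."""
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        threshold = (lo + hi) / 2
        small = [v for v in values if v <= threshold]
        large = [v for v in values if v > threshold]
        if not small or not large:
            break  # degenerate input, e.g. all values identical
        new_lo = sum(small) / len(small)
        new_hi = sum(large) / len(large)
        if (new_lo, new_hi) == (lo, hi):
            break  # cluster means stopped moving
        lo, hi = new_lo, new_hi
    return lo, hi, (lo + hi) / 2


# Hypothetical saccade amplitudes in degrees of visual angle.
amplitudes = [0.8, 1.2, 1.5, 2.0, 6.5, 7.0, 8.2, 9.1]
low_mean, high_mean, cut = two_means(amplitudes)
groups = ["small" if a <= cut else "large" for a in amplitudes]
print(low_mean, high_mean, cut, groups)
```

The resulting threshold separates fixations associated with small-amplitude saccades from those associated with large-amplitude ones; how those groups map onto the focal and ambient labels is left to the original paper.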
Effects of chromatic image statistics on illumination induced color differences.
Lucassen, Marcel P; Gevers, Theo; Gijsenij, Arjan; Dekker, Niels
2013-09-01
We measure the color fidelity of visual scenes that are rendered under different (simulated) illuminants and shown on a calibrated LCD display. Observers make triad illuminant comparisons involving the renderings from two chromatic test illuminants and one achromatic reference illuminant shown simultaneously. Four chromatic test illuminants are used: two along the daylight locus (yellow and blue), and two perpendicular to it (red and green). The observers select the rendering having the best color fidelity, thereby indirectly judging which of the two test illuminants induces the smallest color differences compared to the reference. Both multicolor test scenes and natural scenes are studied. The multicolor scenes are synthesized and represent ellipsoidal distributions in CIELAB chromaticity space having the same mean chromaticity but different chromatic orientations. We show that, for those distributions, color fidelity is best when the vector of the illuminant change (pointing from neutral to chromatic) is parallel to the major axis of the scene's chromatic distribution. For our selection of natural scenes, which generally have much broader chromatic distributions, we measure a higher color fidelity for the yellow and blue illuminants than for red and green. Scrambled versions of the natural images are also studied to exclude possible semantic effects. We quantitatively predict the average observer response (i.e., the illuminant probability) with four types of models, differing in the extent to which they incorporate information processing by the visual system. Results show different levels of performance for the models, and different levels for the multicolor scenes and the natural scenes. Overall, models based on the scene averaged color difference have the best performance. We discuss how color constancy algorithms may be improved by exploiting knowledge of the chromatic distribution of the visual scene.
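The best-performing models in the abstract above are based on a scene-averaged color difference. A minimal sketch of such a quantity, assuming the CIE76 definition of color difference (Euclidean distance in CIELAB) and pixel-wise pairing of a reference and a test rendering, is given below. The abstract does not specify the difference formula the authors used, and the function names and sample values are hypothetical.

```python
import math


def delta_e_ab(lab1, lab2):
    """CIE76 color difference: Euclidean distance between two (L*, a*, b*) triples."""
    return math.dist(lab1, lab2)


def scene_averaged_difference(reference, test):
    """Mean color difference over corresponding pixels of two renderings."""
    if len(reference) != len(test):
        raise ValueError("renderings must have the same number of pixels")
    total = sum(delta_e_ab(r, t) for r, t in zip(reference, test))
    return total / len(reference)


# Two hypothetical two-pixel renderings in CIELAB coordinates.
reference = [(50.0, 0.0, 0.0), (60.0, 10.0, -10.0)]
test = [(50.0, 3.0, 4.0), (60.0, 10.0, -10.0)]
print(scene_averaged_difference(reference, test))
```

Under this reading, the test illuminant with the smaller scene-averaged difference from the reference rendering would be predicted to be chosen more often by observers.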
Space flight visual simulation.
Xu, L
1985-01-01
In this paper, based on the scenes of stars seen by astronauts in their orbital flights, we have studied the mathematical model which must be constructed for CGI system to realize the space flight visual simulation. Considering such factors as the revolution and rotation of the Earth, exact date, time and site of orbital injection of the spacecraft, as well as its orbital flight and attitude motion, etc., we first defined all the instantaneous lines of sight and visual fields of astronauts in space. Then, through a series of coordinate transforms, the pictures of the scenes of stars changing with time-space were photographed one by one mathematically. In the procedure, we have designed a method of three-times "mathematical cutting." Finally, we obtained each instantaneous picture of the scenes of stars observed by astronauts through the window of the cockpit. Also, the dynamic conditions shaded by the Earth in the varying pictures of scenes of stars could be displayed.
Dima, Diana C; Perry, Gavin; Singh, Krish D
2018-06-11
In navigating our environment, we rapidly process and extract meaning from visual cues. However, the relationship between visual features and categorical representations in natural scene perception is still not well understood. Here, we used natural scene stimuli from different categories and filtered at different spatial frequencies to address this question in a passive viewing paradigm. Using representational similarity analysis (RSA) and cross-decoding of magnetoencephalography (MEG) data, we show that categorical representations emerge in human visual cortex at ∼180 ms and are linked to spatial frequency processing. Furthermore, dorsal and ventral stream areas reveal temporally and spatially overlapping representations of low and high-level layer activations extracted from a feedforward neural network. Our results suggest that neural patterns from extrastriate visual cortex switch from low-level to categorical representations within 200 ms, highlighting the rapid cascade of processing stages essential in human visual perception. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
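Representational similarity analysis, mentioned in the abstract above, compares conditions through a representational dissimilarity matrix (RDM). The sketch below builds an RDM from response patterns using correlation distance (1 minus Pearson correlation), which is a common choice but an assumption here, since the abstract does not state the dissimilarity measure. The patterns are made-up numbers, not MEG data.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rdm(patterns):
    """Square matrix of correlation distances between all pairs of patterns."""
    return [[1.0 - pearson(p, q) for q in patterns] for p in patterns]


# Hypothetical sensor patterns for three stimulus conditions.
patterns = [
    [1.0, 2.0, 3.0, 4.0],   # condition A
    [2.0, 4.0, 6.0, 8.0],   # condition B: same shape as A, scaled
    [4.0, 3.0, 2.0, 1.0],   # condition C: reversed
]
matrix = rdm(patterns)
for row in matrix:
    print([round(d, 3) for d in row])
```

Conditions A and B come out as near-identical (distance close to 0) and A and C as maximally dissimilar (distance close to 2); in an actual analysis one such matrix would be computed per time point and compared with model-derived matrices.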
Situated sentence processing: the coordinated interplay account and a neurobehavioral model.
Crocker, Matthew W; Knoeferle, Pia; Mayberry, Marshall R
2010-03-01
Empirical evidence demonstrating that sentence meaning is rapidly reconciled with the visual environment has been broadly construed as supporting the seamless interaction of visual and linguistic representations during situated comprehension. Based on recent behavioral and neuroscientific findings, however, we argue for the more deeply rooted coordination of the mechanisms underlying visual and linguistic processing, and for jointly considering the behavioral and neural correlates of scene-sentence reconciliation during situated comprehension. The Coordinated Interplay Account (CIA; Knoeferle, P., & Crocker, M. W. (2007). The influence of recent scene events on spoken comprehension: Evidence from eye movements. Journal of Memory and Language, 57(4), 519-543) asserts that incremental linguistic interpretation actively directs attention in the visual environment, thereby increasing the salience of attended scene information for comprehension. We review behavioral and neuroscientific findings in support of the CIA's three processing stages: (i) incremental sentence interpretation, (ii) language-mediated visual attention, and (iii) the on-line influence of non-linguistic visual context. We then describe a recently developed connectionist model which both embodies the central CIA proposals and has been successfully applied in modeling a range of behavioral findings from the visual world paradigm (Mayberry, M. R., Crocker, M. W., & Knoeferle, P. (2009). Learning to attend: A connectionist model of situated language comprehension. Cognitive Science). Results from a new simulation suggest the model also correlates with event-related brain potentials elicited by the immediate use of visual context for linguistic disambiguation (Knoeferle, P., Habets, B., Crocker, M. W., & Münte, T. F. (2008). Visual scenes trigger immediate syntactic reanalysis: Evidence from ERPs during situated spoken comprehension. Cerebral Cortex, 18(4), 789-795). 
Finally, we argue that the mechanisms underlying interpretation, visual attention, and scene apprehension are not only in close temporal synchronization, but have co-adapted to optimize real-time visual grounding of situated spoken language, thus facilitating the association of linguistic, visual and motor representations that emerge during the course of our embodied linguistic experience in the world. Copyright 2009 Elsevier Inc. All rights reserved.
Banno, Hayaki; Saiki, Jun
2015-03-01
Recent studies have sought to determine which levels of categories are processed first in visual scene categorization and have shown that the natural and man-made superordinate-level categories are understood faster than are basic-level categories. The current study examined the robustness of the superordinate-level advantage in a visual scene categorization task. A go/no-go categorization task was evaluated with response time distribution analysis using an ex-Gaussian template. A visual scene was categorized as either superordinate or basic level, and two basic-level categories forming a superordinate category were judged as either similar or dissimilar to each other. First, outdoor/indoor groups and natural/man-made were used as superordinate categories to investigate whether the advantage could be generalized beyond the natural/man-made boundary. Second, a set of images forming a superordinate category was manipulated. We predicted that decreasing image set similarity within the superordinate-level category would work against the speed advantage. We found that basic-level categorization was faster than outdoor/indoor categorization when the outdoor category comprised dissimilar basic-level categories. Our results indicate that the superordinate-level advantage in visual scene categorization is labile across different categories and category structures. © 2015 SAGE Publications.
Hollingworth, Andrew; Henderson, John M
2004-07-01
In a change detection paradigm, the global orientation of a natural scene was incrementally changed in 1 degree intervals. In Experiments 1 and 2, participants demonstrated sustained change blindness to incremental rotation, often coming to consider a significantly different scene viewpoint as an unchanged continuation of the original view. Experiment 3 showed that participants who failed to detect the incremental rotation nevertheless reliably detected a single-step rotation back to the initial view. Together, these results demonstrate an important dissociation between explicit change detection and visual memory. Following a change, visual memory is updated to reflect the changed state of the environment, even if the change was not detected.
A scheme for racquet sports video analysis with the combination of audio-visual information
NASA Astrophysics Data System (ADS)
Xing, Liyuan; Ye, Qixiang; Zhang, Weigang; Huang, Qingming; Yu, Hua
2005-07-01
As a very important category in sports video, racquet sports video, e.g. table tennis, tennis and badminton, has been paid little attention in the past years. Considering the characteristics of this kind of sports video, we propose a new scheme for structure indexing and highlight generating based on the combination of audio and visual information. Firstly, a supervised classification method is employed to detect important audio symbols including impact (ball hit), audience cheers, commentator speech, etc. Meanwhile an unsupervised algorithm is proposed to group video shots into various clusters. Then, by taking advantage of temporal relationship between audio and visual signals, we can specify the scene clusters with semantic labels including rally scenes and break scenes. Thirdly, a refinement procedure is developed to reduce false rally scenes by further audio analysis. Finally, an exciting model is proposed to rank the detected rally scenes from which many exciting video clips such as game (match) points can be correctly retrieved. Experiments on two types of representative racquet sports video, table tennis video and tennis video, demonstrate encouraging results.
The perception of naturalness correlates with low-level visual features of environmental scenes.
Berman, Marc G; Hout, Michael C; Kardan, Omid; Hunter, MaryCarol R; Yourganov, Grigori; Henderson, John M; Hanayik, Taylor; Karimi, Hossein; Jonides, John
2014-01-01
Previous research has shown that interacting with natural environments vs. more urban or built environments can have salubrious psychological effects, such as improvements in attention and memory. Even viewing pictures of nature vs. pictures of built environments can produce similar effects. A major question is: What is it about natural environments that produces these benefits? Problematically, there are many differing qualities between natural and urban environments, making it difficult to narrow down the dimensions of nature that may lead to these benefits. In this study, we set out to uncover visual features that related to individuals' perceptions of naturalness in images. We quantified naturalness in two ways: first, implicitly using a multidimensional scaling analysis and second, explicitly with direct naturalness ratings. Features that seemed most related to perceptions of naturalness were related to the density of contrast changes in the scene, the density of straight lines in the scene, the average color saturation in the scene and the average hue diversity in the scene. We then trained a machine-learning algorithm to predict whether a scene was perceived as being natural or not based on these low-level visual features and we could do so with 81% accuracy. As such we were able to reliably predict subjective perceptions of naturalness with objective low-level visual features. Our results can be used in future studies to determine if these features, which are related to naturalness, may also lead to the benefits attained from interacting with nature.
Efficient summary statistical representation when change localization fails.
Haberman, Jason; Whitney, David
2011-10-01
People are sensitive to the summary statistics of the visual world (e.g., average orientation/speed/facial expression). We readily derive this information from complex scenes, often without explicit awareness. Given the fundamental and ubiquitous nature of summary statistical representation, we tested whether this kind of information is subject to the attentional constraints imposed by change blindness. We show that information regarding the summary statistics of a scene is available despite limited conscious access. In a novel experiment, we found that while observers can suffer from change blindness (i.e., not localize where change occurred between two views of the same scene), observers could nevertheless accurately report changes in the summary statistics (or "gist") about the very same scene. In the experiment, observers saw two successively presented sets of 16 faces that varied in expression. Four of the faces in the first set changed from one emotional extreme (e.g., happy) to another (e.g., sad) in the second set. Observers performed poorly when asked to locate any of the faces that changed (change blindness). However, when asked about the ensemble (which set was happier, on average), observer performance remained high. Observers were sensitive to the average expression even when they failed to localize any specific object change. That is, even when observers could not locate the very faces driving the change in average expression between the two sets, they nonetheless derived a precise ensemble representation. Thus, the visual system may be optimized to process summary statistics in an efficient manner, allowing it to operate despite minimal conscious access to the information presented.
A bottom-up model of spatial attention predicts human error patterns in rapid scene recognition.
Einhäuser, Wolfgang; Mundhenk, T Nathan; Baldi, Pierre; Koch, Christof; Itti, Laurent
2007-07-20
Humans demonstrate a peculiar ability to detect complex targets in rapidly presented natural scenes. Recent studies suggest that (nearly) no focal attention is required for overall performance in such tasks. Little is known, however, of how detection performance varies from trial to trial and which stages in the processing hierarchy limit performance: bottom-up visual processing (attentional selection and/or recognition) or top-down factors (e.g., decision-making, memory, or alertness fluctuations)? To investigate the relative contribution of these factors, eight human observers performed an animal detection task in natural scenes presented at 20 Hz. Trial-by-trial performance was highly consistent across observers, far exceeding the prediction of independent errors. This consistency demonstrates that performance is not primarily limited by idiosyncratic factors but by visual processing. Two statistical stimulus properties, contrast variation in the target image and the information-theoretical measure of "surprise" in adjacent images, predict performance on a trial-by-trial basis. These measures are tightly related to spatial attention, demonstrating that spatial attention and rapid target detection share common mechanisms. To isolate the causal contribution of the surprise measure, eight additional observers performed the animal detection task in sequences that were reordered versions of those all subjects had correctly recognized in the first experiment. Reordering increased surprise before and/or after the target while keeping the target and distractors themselves unchanged. Surprise enhancement impaired target detection in all observers. Consequently, and contrary to several previously published findings, our results demonstrate that attentional limitations, rather than target recognition alone, affect the detection of targets in rapidly presented visual sequences.
Eye Movements Reveal the Dynamic Simulation of Speed in Language
ERIC Educational Resources Information Center
Speed, Laura J.; Vigliocco, Gabriella
2014-01-01
This study investigates how speed of motion is processed in language. In three eye-tracking experiments, participants were presented with visual scenes and spoken sentences describing fast or slow events (e.g., "The lion ambled/dashed to the balloon"). Results showed that looking time to relevant objects in the visual scene was affected…
Honeybees can discriminate between Monet and Picasso paintings.
Wu, Wen; Moreno, Antonio M; Tangen, Jason M; Reinhard, Judith
2013-01-01
Honeybees (Apis mellifera) have remarkable visual learning and discrimination abilities that extend beyond learning simple colours, shapes or patterns. They can discriminate landscape scenes, types of flowers, and even human faces. This suggests that in spite of their small brain, honeybees have a highly developed capacity for processing complex visual information, comparable in many respects to vertebrates. Here, we investigated whether this capacity extends to complex images that humans distinguish on the basis of artistic style: Impressionist paintings by Monet and Cubist paintings by Picasso. We show that honeybees learned to simultaneously discriminate between five different Monet and Picasso paintings, and that they do not rely on luminance, colour, or spatial frequency information for discrimination. When presented with novel paintings of the same style, the bees even demonstrated some ability to generalize. This suggests that honeybees are able to discriminate Monet paintings from Picasso ones by extracting and learning the characteristic visual information inherent in each painting style. Our study further suggests that discrimination of artistic styles is not a higher cognitive function that is unique to humans, but simply due to the capacity of animals, from insects to humans, to extract and categorize the visual characteristics of complex images.
NASA Technical Reports Server (NTRS)
Johnson, Walter W.; Kaiser, Mary K.
2003-01-01
Perspective synthetic displays that supplement, or supplant, the optical windows traditionally used for guidance and control of aircraft are accompanied by potentially significant human factors problems related to the optical geometric conformality of the display. Such geometric conformality is broken when optical features are not in the location they would be if directly viewed through a window. This often occurs when the scene is relayed or generated from a location different from the pilot's eyepoint. However, assuming no large visual/vestibular effects, a pilot can often learn to use such a display very effectively. Important problems may arise, however, when display accuracy or consistency is compromised, and this can usually be related to geometrical discrepancies between how the synthetic visual scene behaves and how the visual scene through a window behaves. In addition to these issues, this paper examines the potentially critical problem of the disorientation that can arise when both a synthetic display and a real window are present in a flight deck, and no consistent visual interpretation is available.
Visual search for changes in scenes creates long-term, incidental memory traces.
Utochkin, Igor S; Wolfe, Jeremy M
2018-05-01
Humans are very good at remembering large numbers of scenes over substantial periods of time. But how good are they at remembering changes to scenes? In this study, we tested scene memory and change detection two weeks after initial scene learning. In Experiments 1-3, scenes were learned incidentally during visual search for change. In Experiment 4, observers explicitly memorized scenes. At test, after two weeks observers were asked to discriminate old from new scenes, to recall a change that they had detected in the study phase, or to detect a newly introduced change in the memorization experiment. Next, they performed a change detection task, usually looking for the same change as in the study period. Scene recognition memory was found to be similar in all experiments, regardless of the study task. In Experiment 1, more difficult change detection produced better scene memory. Experiments 2 and 3 supported a "depth-of-processing" account for the effects of initial search and change detection on incidental memory for scenes. Of most interest, change detection was faster during the test phase than during the study phase, even when the observer had no explicit memory of having found that change previously. This result was replicated in two of our three change detection experiments. We conclude that scenes can be encoded incidentally as well as explicitly and that changes in those scenes can leave measurable traces even if they are not explicitly recalled.
Tung, Nicole D; Barr, Jason; Sheppard, Dion J; Elliot, Douglas A; Tottey, Leah S; Walsh, Kevan A J
2015-05-01
The delivery of forensic science evidence in a clear and understandable manner is an important aspect of a forensic scientist's role during expert witness delivery in a courtroom trial. This article describes an Integrated Evidence Platform (IEP) system based on spherical photography which allows the audience to view the crime scene via a virtual tour and view the forensic scientist's evidence and results in context. Equipment and software programmes used in the creation of the IEP include a Nikon DSLR camera, a Seitz Roundshot VR Drive, PTGui Pro, and Tourweaver Professional Edition. The IEP enables a clear visualization of the crime scene, with embedded information such as photographs of items of interest, complex forensic evidence, the results of laboratory analyses, and scientific opinion evidence presented in context. The IEP has resulted in significant improvements to the pretrial disclosure of forensic results, enhanced the delivery of evidence in court, and improved the jury's understanding of the spatial relationship between results. © 2015 American Academy of Forensic Sciences.
Is that disgust I see? Political ideology and biased visual attention.
Oosterhoff, Benjamin; Shook, Natalie J; Ford, Cameron
2018-01-15
Considerable evidence suggests that political liberals and conservatives vary in the way they process and respond to valenced (i.e., negative versus positive) information, with conservatives generally displaying greater negativity biases than liberals. Less is known about whether liberals and conservatives differentially prioritize certain forms of negative information over others. Across two studies using eye-tracking methodology, we examined differences in visual attention to negative scenes and facial expressions based on self-reported political ideology. In Study 1, scenes rated high in fear, disgust, sadness, and neutrality were presented simultaneously. Greater endorsement of socially conservative political attitudes was associated with less attentional engagement (i.e., lower dwell time) of disgust scenes and more attentional engagement toward neutral scenes. Socially conservative political attitudes were not significantly associated with visual attention to fear or sad scenes. In Study 2, images depicting facial expressions of fear, disgust, sadness, and neutrality were presented simultaneously. Greater endorsement of socially conservative political attitudes was associated with greater attentional engagement with facial expressions depicting disgust and less attentional engagement toward neutral faces. Visual attention to fearful or sad faces was not related to social conservatism. Endorsement of economically conservative political attitudes was not consistently associated with biases in visual attention across both studies. These findings support disease-avoidance models and suggest that social conservatism may be rooted within a greater sensitivity to disgust-related information. Copyright © 2017 Elsevier B.V. All rights reserved.
Ho-Phuoc, Tien; Guyader, Nathalie; Landragin, Frédéric; Guérin-Dugué, Anne
2012-02-03
Since Treisman's theory, it has been generally accepted that color is an elementary feature that guides eye movements when looking at natural scenes. Hence, most computational models of visual attention predict eye movements using color as an important visual feature. In this paper, using experimental data, we show that color does not affect where observers look when viewing natural scene images. Neither colors nor abnormal colors modify observers' fixation locations when compared to the same scenes in grayscale. In the same way, we did not find any significant difference between the scanpaths under grayscale, color, or abnormal color viewing conditions. However, we observed a decrease in fixation duration for color and abnormal color, and this was particularly true at the beginning of scene exploration. Finally, we found that abnormal color modifies saccade amplitude distribution.
Semantic guidance of eye movements in real-world scenes
Hwang, Alex D.; Wang, Hsueh-Cheng; Pomplun, Marc
2011-01-01
The perception of objects in our visual world is influenced by not only their low-level visual features such as shape and color, but also their high-level features such as meaning and semantic relations among them. While it has been shown that low-level features in real-world scenes guide eye movements during scene inspection and search, the influence of semantic similarity among scene objects on eye movements in such situations has not been investigated. Here we study guidance of eye movements by semantic similarity among objects during real-world scene inspection and search. By selecting scenes from the LabelMe object-annotated image database and applying Latent Semantic Analysis (LSA) to the object labels, we generated semantic saliency maps of real-world scenes based on the semantic similarity of scene objects to the currently fixated object or the search target. An ROC analysis of these maps as predictors of subjects’ gaze transitions between objects during scene inspection revealed a preference for transitions to objects that were semantically similar to the currently inspected one. Furthermore, during the course of a scene search, subjects’ eye movements were progressively guided toward objects that were semantically similar to the search target. These findings demonstrate substantial semantic guidance of eye movements in real-world scenes and show its importance for understanding real-world attentional control. PMID:21426914
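The semantic saliency step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each object label already has a vector in some LSA space, and the three-dimensional `toy_vectors` values are made up for the example. The function names `cosine` and `semantic_saliency` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def semantic_saliency(label_vectors, fixated):
    """Score every other object label by its similarity to the fixated label."""
    ref = label_vectors[fixated]
    return {label: cosine(ref, vec)
            for label, vec in label_vectors.items() if label != fixated}

# Toy "LSA" vectors: invented values for illustration, not real LSA output.
toy_vectors = {
    "fork":  [0.9, 0.1, 0.0],
    "plate": [0.8, 0.2, 0.1],
    "car":   [0.0, 0.1, 0.9],
}
scores = semantic_saliency(toy_vectors, "fork")
# Objects ordered from most to least semantically similar to the fixated one.
ranked = sorted(scores, key=scores.get, reverse=True)
```

In a full pipeline of this kind, each score would presumably be painted onto the image region of the corresponding object to form the map; that step is omitted here.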
Semantic guidance of eye movements in real-world scenes.
Hwang, Alex D; Wang, Hsueh-Cheng; Pomplun, Marc
2011-05-25
The perception of objects in our visual world is influenced by not only their low-level visual features such as shape and color, but also their high-level features such as meaning and semantic relations among them. While it has been shown that low-level features in real-world scenes guide eye movements during scene inspection and search, the influence of semantic similarity among scene objects on eye movements in such situations has not been investigated. Here we study guidance of eye movements by semantic similarity among objects during real-world scene inspection and search. By selecting scenes from the LabelMe object-annotated image database and applying latent semantic analysis (LSA) to the object labels, we generated semantic saliency maps of real-world scenes based on the semantic similarity of scene objects to the currently fixated object or the search target. An ROC analysis of these maps as predictors of subjects' gaze transitions between objects during scene inspection revealed a preference for transitions to objects that were semantically similar to the currently inspected one. Furthermore, during the course of a scene search, subjects' eye movements were progressively guided toward objects that were semantically similar to the search target. These findings demonstrate substantial semantic guidance of eye movements in real-world scenes and show its importance for understanding real-world attentional control. Copyright © 2011 Elsevier Ltd. All rights reserved.
Computational mechanisms underlying cortical responses to the affordance properties of visual scenes
Epstein, Russell A.
2018-01-01
Biologically inspired deep convolutional neural networks (CNNs), trained for computer vision tasks, have been found to predict cortical responses with remarkable accuracy. However, the internal operations of these models remain poorly understood, and the factors that account for their success are unknown. Here we develop a set of techniques for using CNNs to gain insights into the computational mechanisms underlying cortical responses. We focused on responses in the occipital place area (OPA), a scene-selective region of dorsal occipitoparietal cortex. In a previous study, we showed that fMRI activation patterns in the OPA contain information about the navigational affordances of scenes; that is, information about where one can and cannot move within the immediate environment. We hypothesized that this affordance information could be extracted using a set of purely feedforward computations. To test this idea, we examined a deep CNN with a feedforward architecture that had been previously trained for scene classification. We found that responses in the CNN to scene images were highly predictive of fMRI responses in the OPA. Moreover the CNN accounted for the portion of OPA variance relating to the navigational affordances of scenes. The CNN could thus serve as an image-computable candidate model of affordance-related responses in the OPA. We then ran a series of in silico experiments on this model to gain insights into its internal operations. These analyses showed that the computation of affordance-related features relied heavily on visual information at high-spatial frequencies and cardinal orientations, both of which have previously been identified as low-level stimulus preferences of scene-selective visual cortex. These computations also exhibited a strong preference for information in the lower visual field, which is consistent with known retinotopic biases in the OPA. 
Visualizations of feature selectivity within the CNN suggested that affordance-based responses encoded features that define the layout of the spatial environment, such as boundary-defining junctions and large extended surfaces. Together, these results map the sensory functions of the OPA onto a fully quantitative model that provides insights into its visual computations. More broadly, they advance integrative techniques for understanding visual cortex across multiple levels of analysis: from the identification of cortical sensory functions to the modeling of their underlying algorithms. PMID:29684011
Photogrammetry and remote sensing for visualization of spatial data in a virtual reality environment
NASA Astrophysics Data System (ADS)
Bhagawati, Dwipen
2001-07-01
Researchers in many disciplines have started using Virtual Reality (VR) as a tool to gain new insights into problems in their respective disciplines. Recent advances in computer graphics, software and hardware technologies have created many opportunities for VR systems, advanced scientific and engineering applications being among them. In Geometronics, photogrammetry and remote sensing are generally used for management of spatial data inventory. VR technology can also be suitably used for this purpose. This research demonstrates the usefulness of VR technology for inventory management by taking roadside features as a case study. Management of roadside feature inventory involves positioning and visualization of the features. This research has developed a methodology to demonstrate how photogrammetric principles can be used to position the features using video-logging images and GPS camera positioning, and how image analysis can help produce appropriate texture for building the VR scene, which can then be visualized in a Cave Augmented Virtual Environment (CAVE). VR modeling was implemented in two stages to demonstrate the different approaches for modeling the VR scene. A simulated highway scene was implemented with the brute force approach, while modeling software was used to model the real-world scene using feature positions produced in this research. The first approach demonstrates an implementation of the scene in C++ code that includes a multi-level wand menu enabling the user to interact with the scene. The interactions include editing the features inside the CAVE display, navigating inside the scene, and performing limited geographic analysis. The second approach demonstrates creation of a VR scene for a real roadway environment using feature positions determined in this research. The scene looks realistic, with textures from the real site mapped onto the geometry of the scene. 
Remote sensing and digital image processing techniques were used for texturing the roadway features in this scene.
Zhao, Hui-Jie; Jiang, Cheng; Jia, Guo-Rui
2014-01-01
Adjacency effects may introduce errors in the quantitative applications of hyperspectral remote sensing, the most significant component of which is the earth-atmosphere coupling radiance. However, the surrounding relief and shadow induce strong changes in hyperspectral images acquired over rugged terrain, so that the images do not accurately describe the spectral characteristics. Furthermore, the radiative coupling process between the earth and the atmosphere is more complex over rugged scenes. In order to meet the requirements of real-time processing in data simulation, an equivalent reflectance of the background was developed by taking into account the topography and the geometry between surroundings and targets, based on the radiative transfer process. The contributions of the coupling to the signal at sensor level were then evaluated. This approach was integrated into the sensor-level radiance simulation model and then validated by simulating a set of actual radiance data. The results show that the visual effect of simulated images is consistent with that of observed images. It was also shown that the spectral similarity is improved over rugged scenes. In addition, the model precision is maintained at the same level over flat scenes.
Wu, Chia-Chien; Wang, Hsueh-Cheng; Pomplun, Marc
2014-12-01
A previous study (Vision Research 51 (2011) 1192-1205) found evidence for semantic guidance of visual attention during the inspection of real-world scenes, i.e., an influence of semantic relationships among scene objects on overt shifts of attention. In particular, the results revealed an observer bias toward gaze transitions between semantically similar objects. However, this effect is not necessarily indicative of semantic processing of individual objects but may be mediated by knowledge of the scene gist, which does not require object recognition, or by known spatial dependency among objects. To examine the mechanisms underlying semantic guidance, in the present study, participants were asked to view a series of displays with the scene gist excluded and spatial dependency varied. Our results show that spatial dependency among objects seems to be sufficient to induce semantic guidance. Scene gist, on the other hand, does not seem to affect how observers use semantic information to guide attention while viewing natural scenes. Extracting semantic information mainly based on spatial dependency may be an efficient strategy of the visual system that only adds little cognitive load to the viewing task. Copyright © 2014 Elsevier Ltd. All rights reserved.
Real-time visual simulation of APT system based on RTW and Vega
NASA Astrophysics Data System (ADS)
Xiong, Shuai; Fu, Chengyu; Tang, Tao
2012-10-01
The Matlab/Simulink simulation model of an APT (acquisition, pointing and tracking) system is analyzed and established. The model's C code, which can be used for real-time simulation, is then generated by RTW (Real-Time Workshop). Practical experiments show that running the C code gives the same simulation result as running the Simulink model directly in the Matlab environment. MultiGen-Vega is a real-time 3D scene simulation software system. With it and OpenGL, the APT scene simulation platform is developed and used to render and display the virtual scenes of the APT system. To add some necessary graphics effects to the virtual scenes in real time, GLSL (OpenGL Shading Language) shaders are used on a programmable GPU. By calling the C code, the scene simulation platform can adjust the system parameters on-line and obtain the APT system's real-time simulation data to drive the scenes. Practical application shows that this visual simulation platform has high efficiency, low cost and good simulation quality.
Hippocampal gamma-band Synchrony and pupillary responses index memory during visual search.
Montefusco-Siegmund, Rodrigo; Leonard, Timothy K; Hoffman, Kari L
2017-04-01
Memory for scenes is supported by the hippocampus, among other interconnected structures, but the neural mechanisms related to this process are not well understood. To assess the role of the hippocampus in memory-guided scene search, we recorded local field potentials and multiunit activity from the hippocampus of macaques as they performed goal-directed search tasks using natural scenes. We additionally measured pupil size during scene presentation, which in humans is modulated by recognition memory. We found that both pupil dilation and search efficiency accompanied scene repetition, thereby indicating memory for scenes. Neural correlates included a brief increase in hippocampal multiunit activity and a sustained synchronization of unit activity to gamma band oscillations (50-70 Hz). The repetition effects on hippocampal gamma synchronization occurred when pupils were most dilated, suggesting an interaction between aroused, attentive processing and hippocampal correlates of recognition memory. These results suggest that the hippocampus may support memory-guided visual search through enhanced local gamma synchrony. © 2016 Wiley Periodicals, Inc.
The effects of visual scenes on roll and pitch thresholds in pilots versus nonpilots.
Otakeno, Shinji; Matthews, Roger S J; Folio, Les; Previc, Fred H; Lessard, Charles S
2002-02-01
Previous studies have indicated that, compared with nonpilots, pilots rely more on vision than "seat-of-the-pants" sensations when presented with visual-vestibular conflict. The objective of this study was to evaluate whether pilots and nonpilots differ in their thresholds for tilt perception while viewing visual scenes depicting simulated flight. This study was conducted in the Advanced Spatial Disorientation Demonstrator (ASDD) at Brooks AFB, TX. There were 14 subjects (7 pilots and 7 nonpilots) who recorded tilt detection thresholds in pitch and roll while exposed to sub-threshold movement in each axis. During each test run, subjects were presented with computer-generated visual scenes depicting accelerating forward flight by day or night, and a blank (control) condition. The only significant effect detected by an analysis of variance (ANOVA) was that all subjects were more sensitive to tilt in roll than in pitch [F (2,24) = 18.96, p < 0.001]. Overall, pilots had marginally higher tilt detection thresholds compared with nonpilots (p = 0.055), but the type of visual scene had no significant effect on thresholds. In this study, pilots did not demonstrate greater visual dominance over vestibular and proprioceptive cues than nonpilots, but appeared to have higher pitch and roll thresholds overall. The finding of significantly lower detection thresholds in the roll axis vs. the pitch axis was an incidental finding for both subject groups.
Eye guidance during real-world scene search: The role color plays in central and peripheral vision.
Nuthmann, Antje; Malcolm, George L
2016-01-01
The visual system utilizes environmental features to direct gaze efficiently when locating objects. While previous research has isolated various features' contributions to gaze guidance, these studies generally used sparse displays and did not investigate how features facilitated search as a function of their location on the visual field. The current study investigated how features across the visual field--particularly color--facilitate gaze guidance during real-world search. A gaze-contingent window followed participants' eye movements, restricting color information to specified regions. Scene images were presented in full color, with color in the periphery and gray in central vision or gray in the periphery and color in central vision, or in grayscale. Color conditions were crossed with a search cue manipulation, with the target cued either with a word label or an exact picture. Search times increased as color information in the scene decreased. A gaze-data based decomposition of search time revealed color-mediated effects on specific subprocesses of search. Color in peripheral vision facilitated target localization, whereas color in central vision facilitated target verification. Picture cues facilitated search, with the effects of cue specificity and scene color combining additively. When available, the visual system utilizes the environment's color information to facilitate different real-world visual search behaviors based on the location within the visual field.
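The gaze-contingent color manipulation described in this abstract can be illustrated with a small sketch. It is only an assumption-laden toy: the image is a nested list of RGB tuples, the window is a hard-edged circle around the gaze position, and grayscale conversion uses the Rec. 601 luma weights. The abstract does not state the window shape, size, or conversion used in the study, and the names `to_gray` and `gaze_contingent` are hypothetical.

```python
def to_gray(pixel):
    """Convert one (r, g, b) pixel to gray using Rec. 601 luma weights (an assumption)."""
    r, g, b = pixel
    y = round(0.299 * r + 0.587 * g + 0.114 * b)
    return (y, y, y)

def gaze_contingent(image, gaze_xy, radius, color_in_center=True):
    """Return a copy of image with color kept only inside (or only outside) the gaze window.

    image: list of rows, each row a list of (r, g, b) tuples.
    gaze_xy: (x, y) gaze position in pixel coordinates.
    radius: window radius in pixels (hard circular edge, for simplicity).
    """
    gx, gy = gaze_xy
    out = []
    for y, row in enumerate(image):
        new_row = []
        for x, pixel in enumerate(row):
            inside = (x - gx) ** 2 + (y - gy) ** 2 <= radius ** 2
            keep_color = inside if color_in_center else not inside
            new_row.append(pixel if keep_color else to_gray(pixel))
        out.append(new_row)
    return out

# 3x3 all-red test image with gaze on the center pixel.
red = (255, 0, 0)
image = [[red] * 3 for _ in range(3)]
central_color = gaze_contingent(image, (1, 1), radius=0, color_in_center=True)
peripheral_color = gaze_contingent(image, (1, 1), radius=0, color_in_center=False)
```

Calling the function once per display refresh with the latest gaze sample would give the moving-window behavior; the full-color and full-grayscale conditions correspond to skipping the function or applying `to_gray` everywhere.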
ERIC Educational Resources Information Center
Fletcher-Watson, S.; Collis, J. M.; Findlay, J. M.; Leekam, S. R.
2009-01-01
Change blindness describes the surprising difficulty of detecting large changes in visual scenes when changes occur during a visual disruption. In order to study the developmental course of this phenomenon, a modified version of the flicker paradigm, based on Rensink, O'Regan & Clark (1997), was given to three groups of children aged 6-12 years…
Reduced Change Blindness Suggests Enhanced Attention to Detail in Individuals with Autism
ERIC Educational Resources Information Center
Smith, Hayley; Milne, Elizabeth
2009-01-01
Background: The phenomenon of change blindness illustrates that a limited number of items within the visual scene are attended to at any one time. It has been suggested that individuals with autism focus attention on less contextually relevant aspects of the visual scene, show superior perceptual discrimination and notice details which are often…
ERIC Educational Resources Information Center
Brady, Timothy F.; Tenenbaum, Joshua B.
2013-01-01
When remembering a real-world scene, people encode both detailed information about specific objects and higher order information like the overall gist of the scene. However, formal models of change detection, like those used to estimate visual working memory capacity, assume observers encode only a simple memory representation that includes no…
Measuring familiarity for natural environments through visual images
William E. Hammitt
1979-01-01
An on-site visual preference methodology involving a pre-and-post rating of bog landscape photographs is discussed. Photographs were rated for familiarity as well as preference. Preference was shown to be closely related to familiarity, assuming visitors had the opportunity to view the scenes during the on-site hiking engagement. Scenes rated high on preference were...
Vos, Leia; Whitman, Douglas
2014-01-01
A considerable literature suggests that the right hemisphere is dominant in vigilance for novel and survival-related stimuli, such as predators, across a wide range of species. In contrast to vigilance for change, change blindness is a failure to detect obvious changes in a visual scene when they are obscured by a disruption in scene presentation. We studied lateralised change detection using a series of scenes with salient changes in either the left or right visual fields. In Study 1 left visual field changes were detected more rapidly than right visual field changes, confirming a right hemisphere advantage for change detection. Increasing stimulus difficulty resulted in greater right visual field detections and left hemisphere detection was more likely when change occurred in the right visual field on a prior trial. In Study 2 an intervening distractor task disrupted the influence of prior trials. Again, faster detection speeds were observed for the left visual field changes with a shift to a right visual field advantage with increasing time-to-detection. This suggests that a right hemisphere role for vigilance, or catching attention, and a left hemisphere role for target evaluation, or maintaining attention, is present at the earliest stage of change detection.
Sato, Naoyuki; Yamaguchi, Yoko
2009-06-01
The human cognitive map is known to be hierarchically organized consisting of a set of perceptually clustered landmarks. Patient studies have demonstrated that these cognitive maps are maintained by the hippocampus, while the neural dynamics are still poorly understood. The authors have shown that the neural dynamic "theta phase precession" observed in the rodent hippocampus may be capable of forming hierarchical cognitive maps in humans. In the model, a visual input sequence consisting of object and scene features in the central and peripheral visual fields, respectively, results in the formation of a hierarchical cognitive map for object-place associations. Surprisingly, it is possible for such a complex memory structure to be formed in a few seconds. In this paper, we evaluate the memory retrieval of object-place associations in the hierarchical network formed by theta phase precession. The results show that multiple object-place associations can be retrieved with the initial cue of a scene input. Importantly, according to the wide-to-narrow unidirectional connections among scene units, the spatial area for object-place retrieval can be controlled by the spatial area of the initial cue input. These results indicate that the hierarchical cognitive maps have computational advantages on a spatial-area selective retrieval of multiple object-place associations. Theta phase precession dynamics is suggested as a fundamental neural mechanism of the human cognitive map.
Katz, Matthew L.; Viney, Tim J.; Nikolic, Konstantin
2016-01-01
Sensory stimuli are encoded by diverse kinds of neurons but the identities of the recorded neurons that are studied are often unknown. We explored in detail the firing patterns of eight previously defined genetically-identified retinal ganglion cell (RGC) types from a single transgenic mouse line. We first introduce a new technique of deriving receptive field vectors (RFVs) which utilises a modified form of mutual information (“Quadratic Mutual Information”). We analysed the firing patterns of RGCs during presentation of short duration (~10 second) complex visual scenes (natural movies). We probed the high dimensional space formed by the visual input for a much smaller dimensional subspace of RFVs that give the most information about the response of each cell. The new technique is very efficient and fast and the derivation of novel types of RFVs formed by the natural scene visual input was possible even with limited numbers of spikes per cell. This approach enabled us to estimate the 'visual memory' of each cell type and the corresponding receptive field area by calculating Mutual Information as a function of the number of frames and radius. Finally, we made predictions of biologically relevant functions based on the RFVs of each cell type. RGC class analysis was complemented with results for the cells’ response to simple visual input in the form of black and white spot stimulation, and their classification on several key physiological metrics. Thus RFVs lead to predictions of biological roles based on limited data and facilitate analysis of sensory-evoked spiking data from defined cell types. PMID:26845435
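As a point of reference for the information-theoretic analysis mentioned in this abstract, the sketch below computes ordinary Shannon mutual information from paired discrete observations using a plug-in (frequency count) estimate. This is deliberately not the modified "Quadratic Mutual Information" the authors use, whose definition the abstract does not give; the function name `mutual_information` and the toy stimulus/response data are assumptions for illustration.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Plug-in estimate of Shannon mutual information (in bits) from (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)                # counts of each (x, y) combination
    px = Counter(x for x, _ in pairs)     # marginal counts of x
    py = Counter(y for _, y in pairs)     # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p(x,y) / (p(x) p(y)) written in terms of raw counts
        mi += p_xy * math.log2(c * n / (px[x] * py[y]))
    return mi

# Toy data: a binary "stimulus feature" paired with a binary "spike / no spike" response.
perfectly_coupled = [(0, 0), (1, 1)] * 50
independent = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25

mi_coupled = mutual_information(perfectly_coupled)
mi_independent = mutual_information(independent)
```

Plug-in estimates of this kind are biased upward when samples are scarce, which is one reason alternative information measures are attractive for recordings with limited numbers of spikes.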
The effect of non-visual working memory load on top-down modulation of visual processing
Rissman, Jesse; Gazzaley, Adam; D'Esposito, Mark
2009-01-01
While a core function of the working memory (WM) system is the active maintenance of behaviorally relevant sensory representations, it is also critical that distracting stimuli are appropriately ignored. We used functional magnetic resonance imaging to examine the role of domain-general WM resources in the top-down attentional modulation of task-relevant and irrelevant visual representations. In our dual-task paradigm, each trial began with the auditory presentation of six random (high load) or sequentially-ordered (low load) digits. Next, two relevant visual stimuli (e.g., faces), presented amongst two temporally interspersed visual distractors (e.g., scenes), were to be encoded and maintained across a 7-sec delay interval, after which memory for the relevant images and digits was probed. When taxed by high load digit maintenance, participants exhibited impaired performance on the visual WM task and a selective failure to attenuate the neural processing of task-irrelevant scene stimuli. The over-processing of distractor scenes under high load was indexed by elevated encoding activity in a scene-selective region-of-interest relative to low load and passive viewing control conditions, as well as by improved long-term recognition memory for these items. In contrast, the load manipulation did not affect participants' ability to upregulate activity in this region when scenes were task-relevant. These results highlight the critical role of domain-general WM resources in the goal-directed regulation of distractor processing. Moreover, the consequences of increased WM load in young adults closely resemble the effects of cognitive aging on distractor filtering [Gazzaley et al., (2005) Nature Neuroscience 8, 1298-1300], suggesting the possibility of a common underlying mechanism. PMID:19397858
Memory-guided attention during active viewing of edited dynamic scenes.
Valuch, Christian; König, Peter; Ansorge, Ulrich
2017-01-01
Films, TV shows, and other edited dynamic scenes contain many cuts, which are abrupt transitions from one video shot to the next. Cuts occur within or between scenes, and often join together visually and semantically related shots. Here, we tested to which degree memory for the visual features of the precut shot facilitates shifting attention to the postcut shot. We manipulated visual similarity across cuts, and measured how this affected covert attention (Experiment 1) and overt attention (Experiments 2 and 3). In Experiments 1 and 2, participants actively viewed a target movie that randomly switched locations with a second, distractor movie at the time of the cuts. In Experiments 1 and 2, participants were able to deploy attention more rapidly and accurately to the target movie's continuation when visual similarity was high than when it was low. Experiment 3 tested whether this could be explained by stimulus-driven (bottom-up) priming by feature similarity, using one clip at screen center that was followed by two alternative continuations to the left and right. Here, even the highest similarity across cuts did not capture attention. We conclude that following cuts of high visual similarity, memory-guided attention facilitates the deployment of attention, but this effect is (top-down) dependent on the viewer's active matching of scene content across cuts.
Tachistoscopic exposure and masking of real three-dimensional scenes
Pothier, Stephen; Philbeck, John; Chichka, David; Gajewski, Daniel A.
2010-01-01
Although there are many well-known forms of visual cues specifying absolute and relative distance, little is known about how visual space perception develops at small temporal scales. How much time does the visual system require to extract the information in the various absolute and relative distance cues? In this article, we describe a system that may be used to address this issue by presenting brief exposures of real, three-dimensional scenes, followed by a masking stimulus. The system is composed of an electronic shutter (a liquid crystal smart window) for exposing the stimulus scene, and a liquid crystal projector coupled with an electromechanical shutter for presenting the masking stimulus. This system can be used in both full- and reduced-cue viewing conditions, under monocular and binocular viewing, and at distances limited only by the testing space. We describe a configuration that may be used for studying the microgenesis of visual space perception in the context of visually directed walking. PMID:19182129
Fast cat-eye effect target recognition based on saliency extraction
NASA Astrophysics Data System (ADS)
Li, Li; Ren, Jianlin; Wang, Xingbin
2015-09-01
Background complexity is a main cause of false detection in cat-eye target recognition. Human vision has a selective attention property that helps it find salient targets in complex, unknown scenes quickly and precisely. In this paper, we propose a novel cat-eye effect target recognition method named Multi-channel Saliency Processing before Fusion (MSPF). This method combines traditional cat-eye target recognition with the selective characteristics of visual attention. Furthermore, parallel processing enables it to achieve fast recognition. Experimental results show that the proposed method performs better in accuracy, robustness and speed compared to other methods.
Cortical feedback signals generalise across different spatial frequencies of feedforward inputs.
Revina, Yulia; Petro, Lucy S; Muckli, Lars
2017-09-22
Visual processing in cortex relies on feedback projections contextualising feedforward information flow. Primary visual cortex (V1) has small receptive fields and processes feedforward information at a fine-grained spatial scale, whereas higher visual areas have larger, spatially invariant receptive fields. Therefore, feedback could provide coarse information about the global scene structure or alternatively recover fine-grained structure by targeting small receptive fields in V1. We tested if feedback signals generalise across different spatial frequencies of feedforward inputs, or if they are tuned to the spatial scale of the visual scene. Using a partial occlusion paradigm, functional magnetic resonance imaging (fMRI) and multivoxel pattern analysis (MVPA) we investigated whether feedback to V1 contains coarse or fine-grained information by manipulating the spatial frequency of the scene surround outside an occluded image portion. We show that feedback transmits both coarse and fine-grained information as it carries information about both low (LSF) and high spatial frequencies (HSF). Further, feedback signals containing LSF information are similar to feedback signals containing HSF information, even without a large overlap in spatial frequency bands of the HSF and LSF scenes. Lastly, we found that feedback carries similar information about the spatial frequency band across different scenes. We conclude that cortical feedback signals contain information which generalises across different spatial frequencies of feedforward inputs. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Attention in the real world: toward understanding its neural basis
Peelen, Marius V.; Kastner, Sabine
2016-01-01
The efficient selection of behaviorally relevant objects from cluttered environments supports our everyday goals. Attentional selection has typically been studied in search tasks involving artificial and simplified displays. Although these studies have revealed important basic principles of attention, they do not explain how the brain efficiently selects familiar objects in complex and meaningful real-world scenes. Findings from recent neuroimaging studies indicate that real-world search is mediated by ‘what’ and ‘where’ attentional templates that are implemented in high-level visual cortex. These templates represent target-diagnostic properties and likely target locations, respectively, and are shaped by object familiarity, scene context, and memory. We propose a framework for real-world search that incorporates these recent findings and specifies directions for future study. PMID:24630872
To search or to like: Mapping fixations to differentiate two forms of incidental scene memory.
Choe, Kyoung Whan; Kardan, Omid; Kotabe, Hiroki P; Henderson, John M; Berman, Marc G
2017-10-01
We employed eye-tracking to investigate how performing different tasks on scenes (e.g., intentionally memorizing them, searching for an object, evaluating aesthetic preference) can affect eye movements during encoding and subsequent scene memory. We found that scene memorability decreased after visual search (one incidental encoding task) compared to intentional memorization, and that preference evaluation (another incidental encoding task) produced better memory, similar to the incidental memory boost previously observed for words and faces. By analyzing fixation maps, we found that although fixation map similarity could explain how eye movements during visual search impairs incidental scene memory, it could not explain the incidental memory boost from aesthetic preference evaluation, implying that implicit mechanisms were at play. We conclude that not all incidental encoding tasks should be taken to be similar, as different mechanisms (e.g., explicit or implicit) lead to memory enhancements or decrements for different incidental encoding tasks.
The hippocampus and visual perception
Lee, Andy C. H.; Yeung, Lok-Kin; Barense, Morgan D.
2012-01-01
In this review, we will discuss the idea that the hippocampus may be involved in both memory and perception, contrary to theories that posit functional and neuroanatomical segregation of these processes. This suggestion is based on a number of recent neuropsychological and functional neuroimaging studies that have demonstrated that the hippocampus is involved in the visual discrimination of complex spatial scene stimuli. We argue that these findings cannot be explained by long-term memory or working memory processing or, in the case of patient findings, dysfunction beyond the medial temporal lobe (MTL). Instead, these studies point toward a role for the hippocampus in higher-order spatial perception. We suggest that the hippocampus processes complex conjunctions of spatial features, and that it may be more appropriate to consider the representations for which this structure is critical, rather than the cognitive processes that it mediates. PMID:22529794
Visual Stimuli Induce Waves of Electrical Activity in Turtle Cortex
NASA Astrophysics Data System (ADS)
Prechtl, J. C.; Cohen, L. B.; Pesaran, B.; Mitra, P. P.; Kleinfeld, D.
1997-07-01
The computations involved in the processing of a visual scene invariably involve the interactions among neurons throughout all of visual cortex. One hypothesis is that the timing of neuronal activity, as well as the amplitude of activity, provides a means to encode features of objects. The experimental data from studies on cat [Gray, C. M., Konig, P., Engel, A. K. & Singer, W. (1989) Nature (London) 338, 334-337] support a view in which only synchronous (no phase lags) activity carries information about the visual scene. In contrast, theoretical studies suggest, on the one hand, the utility of multiple phases within a population of neurons as a means to encode independent visual features and, on the other hand, the likely existence of timing differences solely on the basis of network dynamics. Here we use widefield imaging in conjunction with voltage-sensitive dyes to record electrical activity from the virtually intact, unanesthetized turtle brain. Our data consist of single-trial measurements. We analyze our data in the frequency domain to isolate coherent events that lie in different frequency bands. Low frequency oscillations (<5 Hz) are seen in both ongoing activity and activity induced by visual stimuli. These oscillations propagate parallel to the afferent input. Higher frequency activity, with spectral peaks near 10 and 20 Hz, is seen solely in response to stimulation. This activity consists of plane waves and spiral-like waves, as well as more complex patterns. The plane waves have an average phase gradient of ≈π/2 radians/mm and propagate orthogonally to the low frequency waves. Our results show that large-scale differences in neuronal timing are present and persistent during visual processing.
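As a rough worked example, the reported phase gradient can be converted into a spatial wavelength and a phase speed. The abstract gives only the gradient and the spectral peaks, so the speeds below are derived values, not reported results, and they assume that the average gradient applies at each spectral peak.

```latex
k \approx \frac{\pi}{2}\ \text{rad/mm}, \qquad
\lambda = \frac{2\pi}{k} \approx 4\ \text{mm}, \qquad
v = \frac{2\pi f}{k} = f\lambda
\;\Rightarrow\;
v \approx 40\ \text{mm/s at } f = 10\ \text{Hz}, \quad
v \approx 80\ \text{mm/s at } f = 20\ \text{Hz}.
```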
Remote Sensing of Martian Terrain Hazards via Visually Salient Feature Detection
NASA Astrophysics Data System (ADS)
Al-Milli, S.; Shaukat, A.; Spiteri, C.; Gao, Y.
2014-04-01
The main objective of the FASTER remote sensing system is the detection of rocks on planetary surfaces by employing models that can efficiently characterise rocks in terms of semantic descriptions. The proposed technique mitigates some of the algorithmic limitations of existing methods: it has no training requirements, lower computational complexity, and greater robustness for visual tracking applications over long-distance planetary terrains. Visual saliency models inspired by biological systems help to identify important regions (such as rocks) in the visual scene. Surface rocks are therefore completely described in terms of their local or global conspicuity pop-out characteristics. These local and global pop-out cues include, but are not limited to, colour, depth, orientation, curvature, size, luminance intensity, shape, and topology. The currently applied methods follow a purely bottom-up strategy of visual attention for selection of conspicuous regions in the visual scene, without any top-down control. Furthermore, the models chosen (tested and evaluated) are relatively fast among the state of the art and have very low computational load. Quantitative evaluation of these state-of-the-art models was carried out using benchmark datasets including the Surrey Space Centre Lab Testbed, Pangu generated images, RAL Space SEEKER and CNES Mars Yard datasets. The analysis indicates that models based on visually salient information in the frequency domain (SRA, SDSR, PQFT) are the best performing ones for detecting rocks in an extra-terrestrial setting. In particular, the SRA model appears to be the best of these, especially because it requires the least computational time while keeping errors competitively low. The salient objects extracted using these models can then be merged with the Digital Elevation Models (DEMs) generated from the same navigation cameras and fused into the navigation map, giving a clear indication of the rock locations.
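A frequency-domain, bottom-up saliency map of the kind discussed above can be sketched as follows. This is a minimal illustration only: it assumes that SRA refers to a spectral-residual style model, which the abstract does not spell out, and the function names, the box filter, and the window sizes are all hypothetical choices rather than the FASTER implementation.

```python
import numpy as np

def box_blur(img, k=3):
    """Mean filter with a k x k window, using edge padding."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def spectral_residual_saliency(gray):
    """Return a saliency map scaled to [0, 1] for a 2-D grayscale array."""
    spectrum = np.fft.fft2(gray.astype(float))
    log_amp = np.log(np.abs(spectrum) + 1e-8)   # log amplitude spectrum
    phase = np.angle(spectrum)                  # phase spectrum, kept as-is
    # Residual: what is left after removing the locally averaged spectrum.
    residual = log_amp - box_blur(log_amp, 3)
    # Back to the image domain using the residual amplitude and original phase.
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    sal = box_blur(sal, 5)                      # smooth the raw map
    sal -= sal.min()
    peak = sal.max()
    return sal / peak if peak > 0 else sal
```

Thresholding the returned map would give candidate regions; no training step is involved, which matches the training-free, low-cost character described in the abstract.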
Visual stimuli induce waves of electrical activity in turtle cortex
Prechtl, J. C.; Cohen, L. B.; Pesaran, B.; Mitra, P. P.; Kleinfeld, D.
1997-01-01
The computations involved in the processing of a visual scene invariably involve the interactions among neurons throughout all of visual cortex. One hypothesis is that the timing of neuronal activity, as well as the amplitude of activity, provides a means to encode features of objects. The experimental data from studies on cat [Gray, C. M., Konig, P., Engel, A. K. & Singer, W. (1989) Nature (London) 338, 334–337] support a view in which only synchronous (no phase lags) activity carries information about the visual scene. In contrast, theoretical studies suggest, on the one hand, the utility of multiple phases within a population of neurons as a means to encode independent visual features and, on the other hand, the likely existence of timing differences solely on the basis of network dynamics. Here we use widefield imaging in conjunction with voltage-sensitive dyes to record electrical activity from the virtually intact, unanesthetized turtle brain. Our data consist of single-trial measurements. We analyze our data in the frequency domain to isolate coherent events that lie in different frequency bands. Low frequency oscillations (<5 Hz) are seen in both ongoing activity and activity induced by visual stimuli. These oscillations propagate parallel to the afferent input. Higher frequency activity, with spectral peaks near 10 and 20 Hz, is seen solely in response to stimulation. This activity consists of plane waves and spiral-like waves, as well as more complex patterns. The plane waves have an average phase gradient of ≈π/2 radians/mm and propagate orthogonally to the low frequency waves. Our results show that large-scale differences in neuronal timing are present and persistent during visual processing. PMID:9207142
Perceptual load in different regions of the visual scene and its relevance for driving.
Marciano, Hadas; Yeshurun, Yaffa
2015-06-01
The aim of this study was to better understand the role played by perceptual load, at both central and peripheral regions of the visual scene, in driving safety. Attention is a crucial factor in driving safety, and previous laboratory studies suggest that perceptual load is an important factor determining the efficiency of attentional selectivity. Yet, the effects of perceptual load on driving were never studied systematically. Using a driving simulator, we orthogonally manipulated the load levels at the road (central load) and its sides (peripheral load), while occasionally introducing critical events at one of these regions. Perceptual load affected driving performance at both regions of the visual scene. Critically, the effect was different for central versus peripheral load: Whereas load levels on the road mainly affected driving speed, load levels on its sides mainly affected the ability to detect critical events initiating from the roadsides. Moreover, higher levels of peripheral load impaired performance but mainly with low levels of central load, replicating findings with simple letter stimuli. Perceptual load has a considerable effect on driving, but the nature of this effect depends on the region of the visual scene at which the load is introduced. Given the observed importance of perceptual load, authors of future studies of driving safety should take it into account. Specifically, these findings suggest that our understanding of factors that may be relevant for driving safety would benefit from studying these factors under different levels of load at different regions of the visual scene. © 2014, Human Factors and Ergonomics Society.
Does object view influence the scene consistency effect?
Sastyin, Gergo; Niimi, Ryosuke; Yokosawa, Kazuhiko
2015-04-01
Traditional research on the scene consistency effect only used clearly recognizable object stimuli to show mutually interactive context effects for both the object and background components on scene perception (Davenport & Potter in Psychological Science, 15, 559-564, 2004). However, in real environments, objects are viewed from multiple viewpoints, including an accidental, hard-to-recognize one. When the observers named target objects in scenes (Experiments 1a and 1b, object recognition task), we replicated the scene consistency effect (i.e., there was higher accuracy for the objects with consistent backgrounds). However, there was a significant interaction effect between consistency and object viewpoint, which indicated that the scene consistency effect was more important for identifying objects in the accidental view condition than in the canonical view condition. Therefore, the object recognition system may rely more on the scene context when the object is difficult to recognize. In Experiment 2, the observers identified the background (background recognition task) while the scene consistency and object views were manipulated. The results showed that object viewpoint had no effect, while the scene consistency effect was observed. More specifically, the canonical and accidental views both equally provided contextual information for scene perception. These findings suggested that the mechanism for conscious recognition of objects could be dissociated from the mechanism for visual analysis of object images that were part of a scene. The "context" that the object images provided may have been derived from its view-invariant, relatively low-level visual features (e.g., color), rather than its semantic information.
Learning-dependent plasticity with and without training in the human brain.
Zhang, Jiaxiang; Kourtzi, Zoe
2010-07-27
Long-term experience through development and evolution and shorter-term training in adulthood have both been suggested to contribute to the optimization of visual functions that mediate our ability to interpret complex scenes. However, the brain plasticity mechanisms that mediate the detection of objects in cluttered scenes remain largely unknown. Here, we combine behavioral and functional MRI (fMRI) measurements to investigate the human-brain mechanisms that mediate our ability to learn statistical regularities and detect targets in clutter. We show two different routes to visual learning in clutter with discrete brain plasticity signatures. Specifically, opportunistic learning of regularities typical in natural contours (i.e., collinearity) can occur simply through frequent exposure, generalize across untrained stimulus features, and shape processing in occipitotemporal regions implicated in the representation of global forms. In contrast, learning to integrate discontinuities (i.e., elements orthogonal to contour paths) requires task-specific training (bootstrap-based learning), is stimulus-dependent, and enhances processing in intraparietal regions implicated in attention-gated learning. We propose that long-term experience with statistical regularities may facilitate opportunistic learning of collinear contours, whereas learning to integrate discontinuities entails bootstrap-based training for the detection of contours in clutter. These findings provide insights in understanding how long-term experience and short-term training interact to shape the optimization of visual recognition processes.
ERIC Educational Resources Information Center
Altmann, Gerry T. M.; Kamide, Yuki
2009-01-01
Two experiments explored the mapping between language and mental representations of visual scenes. In both experiments, participants viewed, for example, a scene depicting a woman, a wine glass and bottle on the floor, an empty table, and various other objects. In Experiment 1, participants concurrently heard either "The woman will put the glass…
ERIC Educational Resources Information Center
Amit, Elinor; Mehoudar, Eyal; Trope, Yaacov; Yovel, Galit
2012-01-01
It is well established that scenes and objects elicit a highly selective response in specific brain regions in the ventral visual cortex. An inherent difference between these categories that has not been explored yet is their perceived distance from the observer (i.e. scenes are distal whereas objects are proximal). The current study aimed to test…
ERIC Educational Resources Information Center
Baxter, Mark G.; Browning, Philip G. F.; Mitchell, Anna S.
2008-01-01
Surgical disconnection of the frontal cortex and inferotemporal cortex severely impairs many aspects of visual learning and memory, including learning of new object-in-place scene memory problems, a monkey model of episodic memory. As part of a study of specialization within prefrontal cortex in visual learning and memory, we tested monkeys with…
Change Blindness Phenomena for Virtual Reality Display Systems.
Steinicke, Frank; Bruder, Gerd; Hinrichs, Klaus; Willemsen, Pete
2011-09-01
In visual perception, change blindness describes the phenomenon that persons viewing a visual scene may apparently fail to detect significant changes in that scene. These phenomena have been observed in both computer-generated imagery and real-world scenes. Several studies have demonstrated that change blindness effects occur primarily during visual disruptions such as blinks or saccadic eye movements. However, until now the influence of stereoscopic vision on change blindness has not been studied thoroughly in the context of visual perception research. In this paper, we introduce change blindness techniques for stereoscopic virtual reality (VR) systems, providing the ability to substantially modify a virtual scene in a manner that is difficult for observers to perceive. We evaluate techniques for semi-immersive VR systems, i.e., a passive and active stereoscopic projection system, as well as an immersive VR system, i.e., a head-mounted display, and compare the results to those of monoscopic viewing conditions. For stereoscopic viewing conditions, we found that change blindness phenomena occur with the same magnitude as in monoscopic viewing conditions. Furthermore, we have evaluated the potential of the presented techniques for allowing abrupt, and yet significant, changes of a stereoscopically displayed virtual reality environment.
Hu, Jian; Xu, Xiang-yang; Song, En-min; Tan, Hong-bao; Wang, Yi-ning
2009-09-01
To establish a new visual educational system of virtual reality for clinical dentistry based on world wide web (WWW) webpage in order to provide more three-dimensional multimedia resources to dental students and an online three-dimensional consulting system for patients. Based on computer graphics and three-dimensional webpage technologies, the software of 3Dsmax and Webmax were adopted in the system development. In the Windows environment, the architecture of whole system was established step by step, including three-dimensional model construction, three-dimensional scene setup, transplanting three-dimensional scene into webpage, reediting the virtual scene, realization of interactions within the webpage, initial test, and necessary adjustment. Five cases of three-dimensional interactive webpage for clinical dentistry were completed. The three-dimensional interactive webpage could be accessible through web browser on personal computer, and users could interact with the webpage through rotating, panning and zooming the virtual scene. It is technically feasible to implement the visual educational system of virtual reality for clinical dentistry based on WWW webpage. Information related to clinical dentistry can be transmitted properly, visually and interactively through three-dimensional webpage.
A fuzzy measure approach to motion frame analysis for scene detection. M.S. Thesis - Houston Univ.
NASA Technical Reports Server (NTRS)
Leigh, Albert B.; Pal, Sankar K.
1992-01-01
This paper addresses a solution to the problem of scene estimation of motion video data in the fuzzy set theoretic framework. Using fuzzy image feature extractors, a new algorithm is developed to compute the change of information in each of two successive frames to classify scenes. This classification process of raw input visual data can be used to establish structure for correlation. The algorithm attempts to fulfill the need for nonlinear, frame-accurate access to video data for applications such as video editing and visual document archival/retrieval systems in multimedia environments.
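The idea of comparing a fuzzy information measure across successive frames can be sketched as below. The thesis's actual feature extractors are not given in the abstract, so this example substitutes a simple stand-in: a linear gray-level membership function and a Shannon-style fuzzy entropy. The function names and the threshold are hypothetical.

```python
import numpy as np

def fuzzy_entropy(frame):
    """Mean fuzzy entropy of an 8-bit grayscale frame.

    Membership is assumed to be the gray level scaled to [0, 1].
    """
    mu = np.clip(frame.astype(float) / 255.0, 1e-12, 1 - 1e-12)
    h = -(mu * np.log2(mu) + (1 - mu) * np.log2(1 - mu))
    return float(h.mean())

def detect_scene_changes(frames, threshold=0.2):
    """Return indices i where the entropy jump from frame i-1 to i exceeds threshold."""
    ent = [fuzzy_entropy(f) for f in frames]
    return [i for i in range(1, len(ent)) if abs(ent[i] - ent[i - 1]) > threshold]
```

Each returned index marks a candidate scene boundary, which is the kind of frame-accurate access point the abstract mentions for editing and retrieval.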
Frontal–Occipital Connectivity During Visual Search
Pantazatos, Spiro P.; Yanagihara, Ted K.; Zhang, Xian; Meitzler, Thomas
2012-01-01
Although expectation- and attention-related interactions between ventral and medial prefrontal cortex and stimulus category-selective visual regions have been identified during visual detection and discrimination, it is not known if similar neural mechanisms apply to other tasks such as visual search. The current work tested the hypothesis that high-level frontal regions, previously implicated in expectation and visual imagery of object categories, interact with visual regions associated with object recognition during visual search. Using functional magnetic resonance imaging, subjects searched for a specific object that varied in size and location within a complex natural scene. A model-free, spatial-independent component analysis isolated multiple task-related components, one of which included visual cortex, as well as a cluster within ventromedial prefrontal cortex (vmPFC), consistent with the engagement of both top-down and bottom-up processes. Analyses of psychophysiological interactions showed increased functional connectivity between vmPFC and object-sensitive lateral occipital cortex (LOC), and results from dynamic causal modeling and Bayesian Model Selection suggested bidirectional connections between vmPFC and LOC that were positively modulated by the task. Using image-guided diffusion-tensor imaging, functionally seeded, probabilistic white-matter tracts between vmPFC and LOC, which presumably underlie this effective interconnectivity, were also observed. These connectivity findings extend previous models of visual search processes to include specific frontal–occipital neuronal interactions during a natural and complex search task. PMID:22708993
Scene and human face recognition in the central vision of patients with glaucoma
Aptel, Florent; Attye, Arnaud; Guyader, Nathalie; Boucart, Muriel; Chiquet, Christophe; Peyrin, Carole
2018-01-01
Primary open-angle glaucoma (POAG) initially affects mainly peripheral vision. Current behavioral studies support the idea that visual defects of patients with POAG extend into parts of the central visual field classified as normal by static automated perimetry analysis. This is particularly true for visual tasks involving processes of a higher level than mere detection. The purpose of this study was to assess visual abilities of POAG patients in central vision. Patients were assigned to two groups following a visual field examination (Humphrey 24–2 SITA-Standard test). Patients with both peripheral and central defects and patients with peripheral but no central defect, as well as age-matched controls, participated in the experiment. All participants had to perform two visual tasks where low-contrast stimuli were presented in the central 6° of the visual field. A categorization task of scene images and human face images assessed high-level visual recognition abilities. In contrast, a detection task using the same stimuli assessed low-level visual function. The difference in performance between detection and categorization revealed the cost of high-level visual processing. Compared to controls, patients with a central visual defect showed a deficit in both detection and categorization of all low-contrast images. This is consistent with the abnormal retinal sensitivity as assessed by perimetry. However, the deficit was greater for categorization than detection. Patients without a central defect showed similar performances to the controls concerning the detection and categorization of faces. However, while the detection of scene images was well-maintained, these patients showed a deficit in their categorization. This suggests that the simple loss of peripheral vision could be detrimental to scene recognition, even when the information is displayed in central vision.
This study revealed subtle defects in the central visual field of POAG patients that cannot be predicted by static automated perimetry assessment using Humphrey 24–2 SITA-Standard test. PMID:29481572
Ball, Felix; Elzemann, Anne; Busch, Niko A
2014-09-01
The change blindness paradigm, in which participants often fail to notice substantial changes in a scene, is a popular tool for studying scene perception, visual memory, and the link between awareness and attention. Some of the most striking and popular examples of change blindness have been demonstrated with digital photographs of natural scenes; in most studies, however, much simpler displays, such as abstract stimuli or "free-floating" objects, are typically used. Although simple displays have undeniable advantages, natural scenes remain a very useful and attractive stimulus for change blindness research. To assist researchers interested in using natural-scene stimuli in change blindness experiments, we provide here a step-by-step tutorial on how to produce changes in natural-scene images with a freely available image-processing tool (GIMP). We explain how changes in a scene can be made by deleting objects or relocating them within the scene or by changing the color of an object, in just a few simple steps. We also explain how the physical properties of such changes can be analyzed using GIMP and MATLAB (a high-level scientific programming tool). Finally, we present an experiment confirming that scenes manipulated according to our guidelines are effective in inducing change blindness and demonstrating the relationship between change blindness and the physical properties of the change and inter-individual differences in performance measures. We expect that this tutorial will be useful for researchers interested in studying the mechanisms of change blindness, attention, or visual memory using natural scenes.
2006-07-01
parameters such as motion (e.g., Meitzler, Kistner et al., 1998), multiple observers (Rotman, 1989), scene obscurants (Rotman, Gordan, & Kowalczyk...1989), clutter (Tidhar et al., 1994), and multiple targets (Rotman, Gordan, & Kowalczyk, 1989) and selective visual attention. As such, it is...resolvable cycles, N, of a bar pattern (i.e., a square wave) on a target (Johnson, 1958), or complexity (e.g., Tidhar et al., 1994). Such metrics
Runway Texture and Grid Pattern Effects on Rate-of-Descent Perception
NASA Technical Reports Server (NTRS)
Schroeder, J. A.; Dearing, M. G.; Sweet, B. T.; Kaiser, M. K.; Rutkowski, Mike (Technical Monitor)
2001-01-01
To date, perceptual errors occur in determining descent rate from a computer-generated image in flight simulation. Pilots tend to touch down twice as hard in simulation as in flight, and more training time is needed in simulation before reaching steady-state performance. Barnes suggested that recognition of range may be the culprit, and he cited that problems such as collimated objects, binocular vision, and poor resolution lead to poor estimation of the velocity vector. Brown's study essentially ruled out that the lack of binocular vision is the problem. Dorfel added specificity to the problem by showing that pilots underestimated range in simulated scenes by 50% when 800 ft from the runway threshold. Palmer and Petitt showed that pilots are able to distinguish between a 1.7 ft/sec and 2.9 ft/sec sink rate when passively observing sink rates in a night scene. Platform motion also plays a role, as previous research has shown that the addition of substantial platform motion improves pilot estimates of vertical velocity and results in simulated touchdown rates more closely resembling flight. This experiment examined how some specific variations in the visual scene properties affect a pilot's perception of sink rate. It extended another experiment that focused on the visual and motion cues necessary for helicopter autorotations. In that experiment, pilots performed steep approaches to a runway. The visual content of the runway and its surroundings varied in two ways: texture and rectangular grid spacing. Four textures, including a no-texture case, were evaluated. Three grid spacings, including a no-grid case, were evaluated. The results showed that pilots better controlled their vertical descent rates when good texture cues were present. No significant differences were found for the grid manipulation. Using those visual scenes, a simple psychophysics experiment was performed.
The purpose was to determine if the variations in the visual scenes allowed pilots to better perceive vertical velocity. To determine that answer, pilots passively viewed a particular visual scene in which the vehicle was descending at two different rates. Pilots had to select which of the two rates they thought was the faster rate. The difference between the two rates changed using a staircase method, depending on whether or not the pilot was correct, until a minimum threshold between the two descent rates was reached. This process was repeated for all of the visual scenes to decide whether or not the visual scenes did allow pilots to perceive vertical velocity better among them. All of the data have yet to be analyzed; however, neither the grid nor the texture effects have revealed any statistically significant trends. On further examination of the staircase method employed, a possibility exists that the lack of an evident trend may be due to the exit criterion used during the study. As such, the experiment will be repeated with an improved exit criterion in February. Results of this study will be presented in the submitted paper.
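The staircase procedure described above can be illustrated with a short sketch. The abstract does not give the step rule or the exit criterion (and notes that the exit criterion may have been a problem), so everything here is an assumption: a simple 1-up/1-down rule, a fixed step size, an exit after a fixed number of reversals, and a hypothetical `respond` callback standing in for the pilot.

```python
def run_staircase(respond, start_diff=2.0, step=0.2, min_diff=0.05,
                  max_reversals=8, max_trials=200):
    """Estimate a discrimination threshold for the difference between two rates.

    respond(diff) must return True when the observer correctly picks the
    faster of two descent rates that differ by `diff`.
    """
    diff = start_diff
    last_direction = 0          # -1: getting harder, +1: getting easier
    reversals = []
    for _ in range(max_trials):
        correct = respond(diff)
        direction = -1 if correct else +1
        if last_direction != 0 and direction != last_direction:
            reversals.append(diff)              # direction flipped here
            if len(reversals) >= max_reversals:
                break                           # assumed exit criterion
        last_direction = direction
        diff = max(min_diff, diff + direction * step)
    # Threshold estimate: mean difference at the reversal points.
    return sum(reversals) / len(reversals) if reversals else diff


def ideal_observer(true_threshold):
    """Hypothetical observer: correct whenever diff is at or above threshold."""
    return lambda diff: diff >= true_threshold
```

For example, `run_staircase(ideal_observer(0.9))` oscillates between differences just above and just below 0.9 and returns an estimate close to that value. A stricter or looser exit criterion changes how many reversals contribute to the estimate, which is the design choice the abstract flags for revision.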
Transient cardio-respiratory responses to visually induced tilt illusions
NASA Technical Reports Server (NTRS)
Wood, S. J.; Ramsdell, C. D.; Mullen, T. J.; Oman, C. M.; Harm, D. L.; Paloski, W. H.
2000-01-01
Although the orthostatic cardio-respiratory response is primarily mediated by the baroreflex, studies have shown that vestibular cues also contribute in both humans and animals. We have demonstrated a visually mediated response to illusory tilt in some human subjects. Blood pressure, heart and respiration rate, and lung volume were monitored in 16 supine human subjects during two types of visual stimulation, and compared with responses to real passive whole body tilt from supine to head 80 degrees upright. Visual tilt stimuli consisted of either a static scene from an overhead mirror or constant velocity scene motion along different body axes generated by an ultra-wide dome projection system. Visual vertical cues were initially aligned with the longitudinal body axis. Subjective tilt and self-motion were reported verbally. Although significant changes in cardio-respiratory parameters to illusory tilts could not be demonstrated for the entire group, several subjects showed significant transient decreases in mean blood pressure resembling their initial response to passive head-up tilt. Changes in pulse pressure and a slight elevation in heart rate were noted. These transient responses are consistent with the hypothesis that visual-vestibular input contributes to the initial cardiovascular adjustment to a change in posture in humans. On average the static scene elicited perceived tilt without rotation. Dome scene pitch and yaw elicited perceived tilt and rotation, and dome roll motion elicited perceived rotation without tilt. A significant correlation between the magnitude of physiological and subjective reports could not be demonstrated.
Visual encoding and fixation target selection in free viewing: presaccadic brain potentials
Nikolaev, Andrey R.; Jurica, Peter; Nakatani, Chie; Plomp, Gijs; van Leeuwen, Cees
2013-01-01
In scrutinizing a scene, the eyes alternate between fixations and saccades. During a fixation, two component processes can be distinguished: visual encoding and selection of the next fixation target. We aimed to distinguish the neural correlates of these processes in the electrical brain activity prior to a saccade onset. Participants viewed color photographs of natural scenes, in preparation for a change detection task. Then, for each participant and each scene we computed an image heat map, with temperature representing the duration and density of fixations. The temperature difference between the start and end points of saccades was taken as a measure of the expected task-relevance of the information concentrated in specific regions of a scene. Visual encoding was evaluated according to whether subsequent change was correctly detected. Saccades with larger temperature difference were more likely to be followed by correct detection than ones with smaller temperature differences. The amplitude of presaccadic activity over anterior brain areas was larger for correct detection than for detection failure. This difference was observed for short “scrutinizing” but not for long “explorative” saccades, suggesting that presaccadic activity reflects top-down saccade guidance. Thus, successful encoding requires local scanning of scene regions which are expected to be task-relevant. Next, we evaluated fixation target selection. Saccades “moving up” in temperature were preceded by presaccadic activity of higher amplitude than those “moving down”. This finding suggests that presaccadic activity reflects attention deployed to the following fixation location. Our findings illustrate how presaccadic activity can elucidate concurrent brain processes related to the immediate goal of planning the next saccade and the larger-scale goal of constructing a robust representation of the visual scene. PMID:23818877
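As a rough illustration of the heat-map measure described in this abstract, the sketch below builds a duration-weighted fixation map and reads off the temperature change along one saccade. The Gaussian kernel, its width, the pixel grid, and the sign convention (end point minus start point, so that "moving up" is positive) are assumptions made for illustration; the abstract does not give these details. Function names are hypothetical.

```python
import numpy as np


def fixation_heat_map(fixations, shape, sigma=2.0):
    """Sum of duration-weighted Gaussians centred on each fixation (x, y, duration)."""
    height, width = shape
    ys, xs = np.mgrid[0:height, 0:width]
    heat = np.zeros(shape, dtype=float)
    for x, y, duration in fixations:
        heat += duration * np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
    return heat


def saccade_temperature_difference(heat, start, end):
    """Temperature at the saccade end point minus temperature at its start point."""
    (x0, y0), (x1, y1) = start, end
    return float(heat[y1, x1] - heat[y0, x0])


# Hypothetical data: two long fixations near (5, 5), one short fixation at (15, 15).
heat = fixation_heat_map([(5, 5, 0.3), (5, 6, 0.4), (15, 15, 0.1)], shape=(20, 20))
moving_up = saccade_temperature_difference(heat, start=(15, 15), end=(5, 5))
moving_down = saccade_temperature_difference(heat, start=(5, 5), end=(15, 15))
```

Here `moving_up` is positive because the saccade lands in the densely fixated region, and `moving_down` is its negative.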
Driving with indirect viewing sensors: understanding the visual perception issues
NASA Astrophysics Data System (ADS)
O'Kane, Barbara L.
1996-05-01
Visual perception is one of the most important elements of driving in that it enables the driver to understand and react appropriately to the situation along the path of the vehicle. The visual perception of the driver is enabled to the greatest extent while driving during the day. Noticeable decrements in visual acuity, range of vision, depth of field and color perception occur at night and under certain weather conditions. Indirect viewing sensors, utilizing various technologies and spectral bands, may assist the driver's normal mode of driving. Critical applications in the military as well as other official activities may require driving at night without headlights. In these latter cases, it is critical that the device, being the only source of scene information, provide the required scene cues needed for driving on and, oftentimes, off road. One can speculate about the scene information that a driver needs, such as road edges, terrain orientation, people and object detection in or near the path of the vehicle, and so on. But the perceptual qualities of the scene that give rise to these perceptions are little known and thus not quantified for evaluation of indirect viewing devices. This paper discusses driving with headlights and compares the scene content with that provided by a thermal system in the 8-12 micrometer spectral band, which may be used for driving at some time. The benefits and advantages of each are discussed as well as their limitations in providing information useful for the driver who must make rapid and critical decisions based upon the scene content available. General recommendations are made for potential avenues of development to overcome some of these limitations.
ERIC Educational Resources Information Center
Rieger, Jochem W.; Kochy, Nick; Schalk, Franziska; Gruschow, Marcus; Heinze, Hans-Jochen
2008-01-01
The visual system rapidly extracts information about objects from the cluttered natural environment. In 5 experiments, the authors quantified the influence of orientation and semantics on the classification speed of objects in natural scenes, particularly with regard to object-context interactions. Natural scene photographs were presented in an…
Neural Correlates of Contextual Cueing Are Modulated by Explicit Learning
ERIC Educational Resources Information Center
Westerberg, Carmen E.; Miller, Brennan B.; Reber, Paul J.; Cohen, Neal J.; Paller, Ken A.
2011-01-01
Contextual cueing refers to the facilitated ability to locate a particular visual element in a scene due to prior exposure to the same scene. This facilitation is thought to reflect implicit learning, as it typically occurs without the observer's knowledge that scenes repeat. Unlike most other implicit learning effects, contextual cueing can be…
Micro-Valences: Perceiving Affective Valence in Everyday Objects
Lebrecht, Sophie; Bar, Moshe; Barrett, Lisa Feldman; Tarr, Michael J.
2012-01-01
Perceiving the affective valence of objects influences how we think about and react to the world around us. Conversely, the speed and quality with which we visually recognize objects in a visual scene can vary dramatically depending on that scene’s affective content. Although typical visual scenes contain mostly “everyday” objects, the affect perception in visual objects has been studied using somewhat atypical stimuli with strong affective valences (e.g., guns or roses). Here we explore whether affective valence must be strong or overt to exert an effect on our visual perception. We conclude that everyday objects carry subtle affective valences – “micro-valences” – which are intrinsic to their perceptual representation. PMID:22529828
Cohn, Neil; Taylor-Weiner, Amaro; Grossman, Suzanne
2012-01-01
Research on visual attention has shown that Americans tend to focus more on focal objects of a scene while Asians attend to the surrounding environment. The panels of comic books – the narrative frames in sequential images – highlight aspects of a scene comparably to how attention becomes focused on parts of a spatial array. Thus, we compared panels from American and Japanese comics to explore cross-cultural cognition beyond behavioral experimentation by looking at the expressive mediums produced by individuals from these cultures. This study compared the panels of two genres of American comics (Independent and Mainstream comics) with mainstream Japanese “manga” to examine how different cultures and genres direct attention through the framing of figures and scenes in comic panels. Both genres of American comics focused on whole scenes as much as individual characters, while Japanese manga individuated characters and parts of scenes. We argue that this framing of space in American and Japanese comic books simulates a viewer’s integration of a visual scene, and is consistent with the research showing cross-cultural differences in the direction of attention. PMID:23015794
Eye tracking to evaluate evidence recognition in crime scene investigations.
Watalingam, Renuka Devi; Richetelli, Nicole; Pelz, Jeff B; Speir, Jacqueline A
2017-11-01
Crime scene analysts are the core of criminal investigations; decisions made at the scene greatly affect the speed of analysis and the quality of conclusions, thereby directly impacting the successful resolution of a case. If an examiner fails to recognize the pertinence of an item on scene, the analyst's theory regarding the crime will be limited. Conversely, unselective evidence collection will most likely include irrelevant material, thus increasing a forensic laboratory's backlog and potentially sending the investigation into an unproductive and costly direction. Therefore, it is critical that analysts recognize and properly evaluate forensic evidence that can assess the relative support of differing hypotheses related to event reconstruction. With this in mind, the aim of this study was to determine if quantitative eye tracking data and qualitative reconstruction accuracy could be used to distinguish investigator expertise. In order to assess this, 32 participants were successfully recruited and categorized as experts or trained novices based on their practical experiences and educational backgrounds. Each volunteer then processed a mock crime scene while wearing a mobile eye tracker, wherein visual fixations, durations, search patterns, and reconstruction accuracy were evaluated. The eye tracking data (dwell time and task percentage on areas of interest or AOIs) were compared using Earth Mover's Distance (EMD) and the Needleman-Wunsch (N-W) algorithm, revealing significant group differences for both search duration (EMD), as well as search sequence (N-W). More specifically, experts exhibited greater dissimilarity in search duration, but greater similarity in search sequences than their novice counterparts. 
In addition to the quantitative visual assessment of examiner variability, each participant's reconstruction skill was assessed using a 22-point binary scoring system, in which significant group differences were detected as a function of total reconstruction accuracy. This result, coupled with the fact that the study failed to detect a significant difference between the groups when evaluating the total time needed to complete the investigation, indicates that experts are more efficient and effective. Finally, the results presented here provide a basis for continued research in the use of eye trackers to assess expertise in complex and distributed environments, including suggestions for future work, and cautions regarding the degree to which visual attention can infer cognitive understanding. Copyright © 2017 Elsevier B.V. All rights reserved.
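The search-sequence comparison named in this abstract relies on Needleman-Wunsch global alignment. The sketch below is a generic textbook version applied to sequences of area-of-interest (AOI) labels. The scoring values (match +1, mismatch -1, gap -1) and the single-letter AOI coding are assumptions for illustration and are not taken from the study.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of two sequences (e.g. AOI visit orders)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against an empty sequence costs one gap per element.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diagonal,
                              score[i - 1][j] + gap,   # gap in b
                              score[i][j - 1] + gap)   # gap in a
    return score[-1][-1]


# Hypothetical AOI visit orders for two examiners (one letter per AOI).
same_order = needleman_wunsch_score("ABCD", "ABCD")
skipped_one = needleman_wunsch_score("ABCD", "ABD")
```

A higher score means two examiners visited the areas of interest in a more similar order; identical orders give the maximum score for that sequence length.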
A habituation based approach for detection of visual changes in surveillance camera
NASA Astrophysics Data System (ADS)
Sha'abani, M. N. A. H.; Adan, N. F.; Sabani, M. S. M.; Abdullah, F.; Nadira, J. H. S.; Yasin, M. S. M.
2017-09-01
This paper investigates a habituation-based approach to detecting visual changes using video surveillance systems in a passive environment. Various techniques have been introduced for dynamic environments, such as motion detection, object classification and behaviour analysis. However, in a passive environment, most of the scenes recorded by the surveillance system are normal. Therefore, running a complex analysis all the time in a passive environment is computationally expensive, especially at high video resolution. Thus, a mechanism of attention is required, where the system only responds to an abnormal event. This paper proposes a novelty detection mechanism for detecting visual changes and a habituation-based approach to measure the level of novelty. The objective of the paper is to investigate the feasibility of the habituation-based approach in detecting visual changes. Experimental results show that the approach is able to accurately detect the presence of novelty as deviations from the learned knowledge.
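The idea of responding only to novel input and habituating to repeated input can be sketched as follows. This is not the authors' model, which the abstract does not specify; it is an assumed stand-in that keeps an exponentially updated background frame and scores novelty as the mean absolute deviation from it. The class name, learning rate, and threshold are hypothetical.

```python
import numpy as np


class HabituatingDetector:
    """Assumed sketch: novelty is deviation from a slowly adapting background."""

    def __init__(self, first_frame, learning_rate=0.2, threshold=0.1):
        self.background = np.asarray(first_frame, dtype=float).copy()
        self.learning_rate = learning_rate
        self.threshold = threshold

    def update(self, frame):
        frame = np.asarray(frame, dtype=float)
        novelty = float(np.mean(np.abs(frame - self.background)))
        # Habituation: the learned background drifts toward what is being seen,
        # so a change that persists produces a weaker and weaker response.
        self.background += self.learning_rate * (frame - self.background)
        return novelty, novelty > self.threshold


quiet = np.zeros((8, 8))
changed = quiet.copy()
changed[2:6, 2:6] = 1.0  # hypothetical object appearing in the scene

detector = HabituatingDetector(quiet)
quiet_scores = [detector.update(quiet)[0] for _ in range(5)]
changed_results = [detector.update(changed) for _ in range(6)]
```

In this toy run the unchanged scene never triggers a response, the first changed frame does, and the response fades as the same change is shown repeatedly.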
Bayesian learning of visual chunks by human observers
Orbán, Gergő; Fiser, József; Aslin, Richard N.; Lengyel, Máté
2008-01-01
Efficient and versatile processing of any hierarchically structured information requires a learning mechanism that combines lower-level features into higher-level chunks. We investigated this chunking mechanism in humans with a visual pattern-learning paradigm. We developed an ideal learner based on Bayesian model comparison that extracts and stores only those chunks of information that are minimally sufficient to encode a set of visual scenes. Our ideal Bayesian chunk learner not only reproduced the results of a large set of previous empirical findings in the domain of human pattern learning but also made a key prediction that we confirmed experimentally. In accordance with Bayesian learning but contrary to associative learning, human performance was well above chance when pair-wise statistics in the exemplars contained no relevant information. Thus, humans extract chunks from complex visual patterns by generating accurate yet economical representations and not by encoding the full correlational structure of the input. PMID:18268353
Klaver, Peter; Talsma, Durk
2013-11-01
We recorded ERPs to investigate whether the visual memory load can bias visual selective attention. Participants memorized one or four letters and then responded to memory-matching letters presented in a relevant color while ignoring distractor letters or letters in an irrelevant color. Stimuli in the relevant color elicited larger frontal selection positivities (FSP) and occipital selection negativities (OSN) compared to irrelevant color stimuli. Only distractors elicited a larger FSP in the high than in the low memory load task. Memory load prolonged the OSN for all letters. Response mapping complexity was also modulated but did not affect the FSP and OSN. Together, the FSP data suggest that high memory load increased distractability. The OSN data suggest that memory load sustained attention to letters in a relevant color until working memory processing was completed, independently of whether the letters were in working memory or not. Copyright © 2013 Society for Psychophysiological Research.
Victor, Jonathan D; Mechler, Ferenc; Ohiorhenuan, Ifije; Schmid, Anita M; Purpura, Keith P
2009-12-01
A full understanding of the computations performed in primary visual cortex is an important yet elusive goal. Receptive field models consisting of cascades of linear filters and static nonlinearities may be adequate to account for responses to simple stimuli such as gratings and random checkerboards, but their predictions of responses to complex stimuli such as natural scenes are only approximately correct. It is unclear whether these discrepancies are limited to quantitative inaccuracies that reflect well-recognized mechanisms such as response normalization, gain controls, and cross-orientation suppression or, alternatively, imply additional qualitative features of the underlying computations. To address this question, we examined responses of V1 and V2 neurons in the monkey and area 17 neurons in the cat to two-dimensional Hermite functions (TDHs). TDHs are intermediate in complexity between traditional analytic stimuli and natural scenes and have mathematical properties that facilitate their use to test candidate models. By exploiting these properties, along with the laminar organization of V1, we identify qualitative aspects of neural computations beyond those anticipated from the above-cited model framework. Specifically, we find that V1 neurons receive signals from orientation-selective mechanisms that are highly nonlinear: they are sensitive to phase correlations, not just spatial frequency content. That is, the behavior of V1 neurons departs from that of linear-nonlinear cascades with standard modulatory mechanisms in a qualitative manner: even relatively simple stimuli evoke responses that imply complex spatial nonlinearities. The presence of these findings in the input layers suggests that these nonlinearities act in a feedback fashion.
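For readers unfamiliar with two-dimensional Hermite functions, the sketch below generates one in the standard Cartesian-separable form: a product of two orthonormal one-dimensional Hermite functions, each a physicists' Hermite polynomial times a Gaussian. The abstract does not give the exact parameterization used in the experiments, so the unit spatial scale, the grid, and the function names are assumptions.

```python
import math

import numpy as np
from numpy.polynomial.hermite import hermval  # physicists' Hermite polynomials


def hermite_function(n, x):
    """Orthonormal 1D Hermite function of order n evaluated at x."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0  # select H_n
    norm = math.sqrt(2.0 ** n * math.factorial(n) * math.sqrt(math.pi))
    return hermval(x, coeffs) * np.exp(-x ** 2 / 2.0) / norm


def cartesian_tdh(m, n, x, y):
    """Separable 2D Hermite function: order m along x, order n along y."""
    return hermite_function(m, x) * hermite_function(n, y)


# Sample one pattern on a grid wide enough for the Gaussian envelope to vanish.
axis = np.linspace(-8.0, 8.0, 401)
xx, yy = np.meshgrid(axis, axis)
pattern = cartesian_tdh(1, 2, xx, yy)
```

Functions of different orders are mutually orthogonal, which is one of the mathematical properties that makes such stimuli convenient for testing candidate receptive-field models.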
Kotabe, Hiroki P; Kardan, Omid; Berman, Marc G
2017-08-01
Natural environments have powerful aesthetic appeal linked to their capacity for psychological restoration. In contrast, disorderly environments are aesthetically aversive, and have various detrimental psychological effects. But in our research, we have repeatedly found that natural environments are perceptually disorderly. What could explain this paradox? We present 3 competing hypotheses: the aesthetic preference for naturalness is more powerful than the aesthetic aversion to disorder (the nature-trumps-disorder hypothesis); disorder is trivial to aesthetic preference in natural contexts (the harmless-disorder hypothesis); and disorder is aesthetically preferred in natural contexts (the beneficial-disorder hypothesis). Utilizing novel methods of perceptual study and diverse stimuli, we rule in the nature-trumps-disorder hypothesis and rule out the harmless-disorder and beneficial-disorder hypotheses. In examining perceptual mechanisms, we find evidence that high-level scene semantics are both necessary and sufficient for the nature-trumps-disorder effect. Necessity is evidenced by the effect disappearing in experiments utilizing only low-level visual stimuli (i.e., where scene semantics have been removed) and experiments utilizing a rapid-scene-presentation procedure that obscures scene semantics. Sufficiency is evidenced by the effect reappearing in experiments utilizing noun stimuli which remove low-level visual features. Furthermore, we present evidence that the interaction of scene semantics with low-level visual features amplifies the nature-trumps-disorder effect: the effect is weaker both when statistically adjusting for quantified low-level visual features and when using noun stimuli which remove low-level visual features. These results have implications for psychological theories bearing on the joint influence of low- and high-level perceptual inputs on affect and cognition, as well as for aesthetic design.
(PsycINFO Database Record (c) 2017 APA, all rights reserved).
Aging and feature search: the effect of search area.
Burton-Danner, K; Owsley, C; Jackson, G R
2001-01-01
The preattentive system involves the rapid parallel processing of visual information in the visual scene so that attention can be directed to meaningful objects and locations in the environment. This study used the feature search methodology to examine whether there are aging-related deficits in parallel-processing capabilities when older adults are required to visually search a large area of the visual field. Like young subjects, older subjects displayed flat, near-zero slopes for the Reaction Time x Set Size function when searching over a broad area (30 degrees radius) of the visual field, implying parallel processing of the visual display. These same older subjects exhibited impairment in another task, also dependent on parallel processing, performed over the same broad field area; this task, called the useful field of view test, has more complex task demands. Results imply that aging-related breakdowns of parallel processing over a large visual field area are not likely to emerge when required responses are simple, there is only one task to perform, and there is no limitation on visual inspection time.
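The slope of the Reaction Time x Set Size function mentioned in this abstract is usually obtained from a straight-line fit. The sketch below does this with a least-squares fit; the reaction times are invented illustrative numbers, not data from the study, and `search_slope` is a hypothetical helper name.

```python
import numpy as np


def search_slope(set_sizes, reaction_times_ms):
    """Least-squares slope (ms per item) and intercept (ms) of RT against set size."""
    slope, intercept = np.polyfit(set_sizes, reaction_times_ms, deg=1)
    return float(slope), float(intercept)


set_sizes = [4, 8, 16, 32]

# Invented example of a flat function, as expected for parallel feature search.
flat_slope, _ = search_slope(set_sizes, [520, 522, 519, 521])

# Invented example of a steep function, as expected for item-by-item search.
steep_slope, steep_intercept = search_slope(set_sizes, [400 + 25 * n for n in set_sizes])
```

A slope near zero means adding items to the display costs almost no extra time, which is the pattern the abstract reports for both younger and older subjects.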
Age-related changes in visual exploratory behavior in a natural scene setting
Hamel, Johanna; De Beukelaer, Sophie; Kraft, Antje; Ohl, Sven; Audebert, Heinrich J.; Brandt, Stephan A.
2013-01-01
Diverse cognitive functions decline with increasing age, including the ability to process central and peripheral visual information in a laboratory testing situation (useful visual field of view). To investigate whether and how this influences activities of daily life, we studied age-related changes in visual exploratory behavior in a natural scene setting: a driving simulator paradigm of variable complexity was tested in subjects of varying ages with simultaneous eye- and head-movement recordings via a head-mounted camera. Detection and reaction times were also measured by visual fixation and manual reaction. We considered video computer game experience as a possible influence on performance. Data of 73 participants of varying ages were analyzed, driving two different courses. We analyzed the influence of route difficulty level, age, and eccentricity of test stimuli on oculomotor and driving behavior parameters. No significant age effects were found regarding saccadic parameters. In the older subjects head-movements increasingly contributed to gaze amplitude. More demanding courses and more peripheral stimuli locations induced longer reaction times in all age groups. Deterioration of the functionally useful visual field of view with increasing age was not suggested in our study group. However, video game-experienced subjects revealed larger saccade amplitudes and a broader distribution of fixations on the screen. They reacted faster to peripheral objects suggesting the notion of a general detection task rather than perceiving driving as a central task. As the video game-experienced population consisted of younger subjects, our study indicates that effects due to video game experience can easily be misinterpreted as age effects if not accounted for. We therefore view it as essential to consider video game experience in all testing methods using virtual media. PMID:23801970
Stochastic correlative firing for figure-ground segregation.
Chen, Zhe
2005-03-01
Segregation of sensory inputs into separate objects is a central aspect of perception and arises in all sensory modalities. The figure-ground segregation problem requires identifying an object of interest in a complex scene, in many cases given binaural auditory or binocular visual observations. The computations required for visual and auditory figure-ground segregation share many common features and can be cast within a unified framework. Sensory perception can be viewed as a problem of optimizing information transmission. Here we suggest a stochastic correlative firing mechanism and an associative learning rule for figure-ground segregation in several classic sensory perception tasks, including the cocktail party problem in binaural hearing, binocular fusion of stereo images, and Gestalt grouping in motion perception.
NASA Astrophysics Data System (ADS)
Kutulakos, Kyros N.; O'Toole, Matthew
2015-03-01
Conventional cameras record all light falling on their sensor regardless of the path that light followed to get there. In this paper we give an overview of a new family of computational cameras that offers many more degrees of freedom. These cameras record just a fraction of the light coming from a controllable source, based on the actual 3D light path followed. Photos and live video captured this way offer an unconventional view of everyday scenes in which the effects of scattering, refraction and other phenomena can be selectively blocked or enhanced, visual structures that are too subtle to notice with the naked eye can become apparent, and object appearance can depend on depth. We give an overview of the basic theory behind these cameras and their DMD-based implementation, and discuss three applications: (1) live indirect-only imaging of complex everyday scenes, (2) reconstructing the 3D shape of scenes whose geometry or material properties make them hard or impossible to scan with conventional methods, and (3) acquiring time-of-flight images that are free of multi-path interference.
Pilot Task Profiles, Human Factors, And Image Realism
NASA Astrophysics Data System (ADS)
McCormick, Dennis
1982-06-01
Computer Image Generation (CIG) visual systems provide real time scenes for state-of-the-art flight training simulators. The visual system requires a greater understanding of training tasks, human factors, and the concept of image realism to produce an effective and efficient training scene than is required by other types of visual systems. Image realism must be defined in terms of pilot visual information requirements. Human factors analysis of training and perception is necessary to determine the pilot's information requirements. System analysis then determines how the CIG and display device can best provide essential information to the pilot. This analysis procedure ensures optimum training effectiveness and system performance.
Comparison on driving fatigue related hemodynamics activated by auditory and visual stimulus
NASA Astrophysics Data System (ADS)
Deng, Zishan; Gao, Yuan; Li, Ting
2018-02-01
As one of the main causes of traffic accidents, driving fatigue deserves researchers' attention, and detecting and monitoring it during long-term driving requires a new technique. Since functional near-infrared spectroscopy (fNIRS) can detect cerebral hemodynamic responses, it is a promising candidate for fatigue level detection. Here, we performed three different kinds of experiments on a driver and recorded his cerebral hemodynamic responses during long hours of driving using our fNIRS-based device. Each experiment lasted for 7 hours, and one of three specific tests, detecting the driver's response to sounds, traffic lights and direction signs respectively, was done every hour. The results showed that visual stimuli caused fatigue more readily than auditory stimuli and that, in the first few hours, visual stimuli from traffic light scenes caused fatigue more readily than visual stimuli from direction sign scenes. We also found that fatigue-related hemodynamics caused by auditory stimuli increased fastest, followed by traffic light scenes, with direction sign scenes slowest. Our study successfully compared auditory, visual color, and visual character stimuli in their tendency to cause driving fatigue, which is meaningful for driving safety management.
Bradley, Margaret M.; Lang, Peter J.
2013-01-01
During rapid serial visual presentation (RSVP), the perceptual system is confronted with a rapidly changing array of sensory information demanding resolution. At rapid rates of presentation, previous studies have found an early (e.g., 150–280 ms) negativity over occipital sensors that is enhanced when emotional, as compared with neutral, pictures are viewed, suggesting facilitated perception. In the present study, we explored how picture composition and the presence of people in the image affect perceptual processing of pictures of natural scenes. Using RSVP, pictures that differed in perceptual composition (figure–ground or scenes), content (presence of people or not), and emotional content (emotionally arousing or neutral) were presented in a continuous stream for 330 ms each with no intertrial interval. In both subject and picture analyses, all three variables affected the amplitude of occipital negativity, with the greatest enhancement for figure–ground compositions (as compared with scenes), irrespective of content and emotional arousal, supporting an interpretation that ease of perceptual processing is associated with enhanced occipital negativity. Viewing emotional pictures prompted enhanced negativity only for pictures that depicted people, suggesting that specific features of emotionally arousing images are associated with facilitated perceptual processing, rather than all emotional content. PMID:23780520
Chen, Juan; Sperandio, Irene; Goodale, Melvyn Alan
2015-01-01
Objects rarely appear in isolation in natural scenes. Although many studies have investigated how nearby objects influence perception in cluttered scenes (i.e., crowding), none has studied how nearby objects influence visually guided action. In Experiment 1, we found that participants could scale their grasp to the size of a crowded target even when they could not perceive its size, demonstrating for the first time that neurologically intact participants can use visual information that is not available to conscious report to scale their grasp to real objects in real scenes. In Experiments 2 and 3, we found that changing the eccentricity of the display and the orientation of the flankers had no effect on grasping but strongly affected perception. The differential effects of eccentricity and flanker orientation on perception and grasping show that the known differences in retinotopy between the ventral and dorsal streams are reflected in the way in which people deal with targets in cluttered scenes. © The Author(s) 2014.
Bar, Moshe; Aminoff, Elissa; Schacter, Daniel L.
2009-01-01
The parahippocampal cortex (PHC) has been implicated both in episodic memory and in place/scene processing. We proposed that this region should instead be seen as intrinsically mediating contextual associations, and not place/scene processing or episodic memory exclusively. Given that place/scene processing and episodic memory both rely on associations, this modified framework provides a platform for reconciling what seemed like different roles assigned to the same region. Comparing scenes with scenes, we show here that the PHC responds significantly more strongly to scenes with rich contextual associations compared with scenes of equal visual qualities but fewer associations. This result provides the strongest support to the view that the PHC mediates contextual associations in general, rather than places or scenes proper, and necessitates a revision of current views such as that the PHC contains a dedicated place/scenes “module.” PMID:18716212
Using 3D Visualization to Communicate Scientific Results to Non-scientists
NASA Astrophysics Data System (ADS)
Whipple, S.; Mellors, R. J.; Sale, J.; Kilb, D.
2002-12-01
If "a picture is worth a thousand words" then an animation is worth millions. 3D animations and visualizations are useful for geoscientists but are perhaps even more valuable for rapidly illustrating standard geoscience ideas and concepts (such as faults, seismicity patterns, and topography) to non-specialists. This is useful not only for purely educational needs but also in rapidly briefing decision makers where time may be critical. As a demonstration of this we juxtapose large geophysical datasets (e.g., Southern California seismicity and topography) with other large societal datasets (such as highways and urban areas), which allows an instant understanding of the correlations. We intend to work out a methodology to aid other datasets such as hospitals and bridges, for example, in an ongoing fashion. The 3D scenes we create from the separate datasets can be "flown" through and individual snapshots that emphasize the concepts of interest are quickly rendered and converted to formats accessible to all. Viewing the snapshots and scenes greatly aids non-specialists' comprehension of the problems and tasks at hand. For example, seismicity clusters (such as aftershocks) and faults near urban areas are clearly visible. A simple "fly-by" through our Southern California scene demonstrates simple concepts such as the topographic features due to plate motion along faults, and the demarcation of the North American/Pacific Plate boundary by the complex fault system (e.g., Elsinore, San Jacinto and San Andreas faults) in Southern California.
Keefe, Bruce D; Wincenciak, Joanna; Jellema, Tjeerd; Ward, James W; Barraclough, Nick E
2016-07-01
When observing another individual's actions, we can both recognize their actions and infer their beliefs concerning the physical and social environment. The extent to which visual adaptation influences action recognition and conceptually later stages of processing involved in deriving the belief state of the actor remains unknown. To explore this we used virtual reality (life-size photorealistic actors presented in stereoscopic three dimensions) to see how visual adaptation influences the perception of individuals in naturally unfolding social scenes at increasingly higher levels of action understanding. We presented scenes in which one actor picked up boxes (of varying number and weight), after which a second actor picked up a single box. Adaptation to the first actor's behavior systematically changed perception of the second actor. Aftereffects increased with the duration of the first actor's behavior, declined exponentially over time, and were independent of view direction. Inferences about the second actor's expectation of box weight were also distorted by adaptation to the first actor. Distortions in action recognition and actor expectations did not, however, extend across different actions, indicating that adaptation is not acting at an action-independent abstract level but rather at an action-dependent level. We conclude that although adaptation influences more complex inferences about belief states of individuals, this is likely to be a result of adaptation at an earlier action recognition stage rather than adaptation operating at a higher, more abstract level in mentalizing or simulation systems.
Research on three-dimensional visualization based on virtual reality and Internet
NASA Astrophysics Data System (ADS)
Wang, Zongmin; Yang, Haibo; Zhao, Hongling; Li, Jiren; Zhu, Qiang; Zhang, Xiaohong; Sun, Kai
2007-06-01
To disclose and display water information, a three-dimensional visualization system based on virtual reality (VR) and the Internet is studied, both to demonstrate a "digital water conservancy" application and to support routine reservoir management. To explore and mine in-depth information, a high-resolution DEM of reliable quality is first modeled, and then topographical analysis, visibility analysis and reservoir volume computation are studied. In addition, parameters including slope, water level and NDVI are selected to classify landslide-prone zones within the water-level-fluctuating zone of the reservoir area. To establish the virtual reservoir scene, two methods are used to provide immersion, interaction and imagination (3I). The first virtual scene contains more detailed textures to increase realism and runs on a graphical workstation with the virtual reality engine OpenSceneGraph (OSG). The second virtual scene is for Internet users and has fewer details to ensure fluent speed.
Visualization of fluid dynamics at NASA Ames
NASA Technical Reports Server (NTRS)
Watson, Val
1989-01-01
The hardware and software currently used for visualization of fluid dynamics at NASA Ames are described. The software includes programs to create scenes (for example, particle traces representing the flow over an aircraft), programs to interactively view the scenes, and programs to control the creation of video tapes and 16mm movies. The hardware includes high performance graphics workstations, a high speed network, digital video equipment, and film recorders.
Ward, Jamie; Hovard, Peter; Jones, Alicia; Rothen, Nicolas
2013-01-01
Memory has been shown to be enhanced in grapheme-color synaesthesia, and this enhancement extends to certain visual stimuli (that don't induce synaesthesia) as well as stimuli comprised of graphemes (which do). Previous studies have used a variety of testing procedures to assess memory in synaesthesia (e.g., free recall, recognition, associative learning) making it hard to know the extent to which memory benefits are attributable to the stimulus properties themselves, the testing method, participant strategies, or some combination of these factors. In the first experiment, we use the same testing procedure (recognition memory) for a variety of stimuli (written words, non-words, scenes, and fractals) and also check which memorization strategies were used. We demonstrate that grapheme-color synaesthetes show enhanced memory across all these stimuli, but this is not found for a non-visual type of synaesthesia (lexical-gustatory). In the second experiment, the memory advantage for scenes is explored further by manipulating the properties of the old and new images (changing color, orientation, or object presence). Again, grapheme-color synaesthetes show a memory advantage for scenes across all manipulations. Although recognition memory is generally enhanced in this study, the largest effects were found for abstract visual images (fractals) and scenes for which color can be used to discriminate old/new status. PMID:24187542
The Importance of Information Localization in Scene Gist Recognition
ERIC Educational Resources Information Center
Loschky, Lester C.; Sethi, Amit; Simons, Daniel J.; Pydimarri, Tejaswi N.; Ochs, Daniel; Corbeille, Jeremy L.
2007-01-01
People can recognize the meaning or gist of a scene from a single glance, and a few recent studies have begun to examine the sorts of information that contribute to scene gist recognition. The authors of the present study used visual masking coupled with image manipulations (randomizing phase while maintaining the Fourier amplitude spectrum;…
Robust multiperson tracking from a mobile platform.
Ess, Andreas; Leibe, Bastian; Schindler, Konrad; van Gool, Luc
2009-10-01
In this paper, we address the problem of multiperson tracking in busy pedestrian zones using a stereo rig mounted on a mobile platform. The complexity of the problem calls for an integrated solution that extracts as much visual information as possible and combines it through cognitive feedback cycles. We propose such an approach, which jointly estimates camera position, stereo depth, object detection, and tracking. The interplay between those components is represented by a graphical model. Since the model has to incorporate object-object interactions and temporal links to past frames, direct inference is intractable. We, therefore, propose a two-stage procedure: for each frame, we first solve a simplified version of the model (disregarding interactions and temporal continuity) to estimate the scene geometry and an overcomplete set of object detections. Conditioned on these results, we then address object interactions, tracking, and prediction in a second step. The approach is experimentally evaluated on several long and difficult video sequences from busy inner-city locations. Our results show that the proposed integration makes it possible to deliver robust tracking performance in scenes of realistic complexity.
Irsik, Vanessa C; Vanden Bosch der Nederlanden, Christina M; Snyder, Joel S
2016-11-01
Attention and other processing constraints limit the perception of objects in complex scenes, which has been studied extensively in the visual sense. We used a change deafness paradigm to examine how attention to particular objects helps and hurts the ability to notice changes within complex auditory scenes. In a counterbalanced design, we examined how cueing attention to particular objects affected performance in an auditory change-detection task through the use of valid or invalid cues and trials without cues (Experiment 1). We further examined how successful encoding predicted change-detection performance using an object-encoding task and we addressed whether performing the object-encoding task along with the change-detection task affected performance overall (Experiment 2). Participants had more error for invalid compared to valid and uncued trials, but this effect was reduced in Experiment 2 compared to Experiment 1. When the object-encoding task was present, listeners who completed the uncued condition first had less overall error than those who completed the cued condition first. All participants showed less change deafness when they successfully encoded change-relevant compared to irrelevant objects during valid and uncued trials. However, only participants who completed the uncued condition first also showed this effect during invalid cue trials, suggesting a broader scope of attention. These findings provide converging evidence that attention to change-relevant objects is crucial for successful detection of acoustic changes and that encouraging broad attention to multiple objects is the best way to reduce change deafness. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Research on Visualization of Ground Laser Radar Data Based on OSG
NASA Astrophysics Data System (ADS)
Huang, H.; Hu, C.; Zhang, F.; Xue, H.
2018-04-01
Three-dimensional (3D) laser scanning is an advanced technology that integrates optical, mechanical, electrical, and computer technologies. It can scan the whole shape and form of spatial objects in 3D with high precision. With this technology, one can directly collect the point cloud data of a ground object and build its structure for rendering. A capable 3D rendering engine is used to optimize and display the 3D model in order to meet the higher requirements of real-time realistic rendering and scene complexity. OpenSceneGraph (OSG) is an open source 3D graphics engine. Compared with the current mainstream 3D rendering engines, OSG is practical, economical, and easy to extend. Therefore, OSG is widely used in the fields of virtual simulation, virtual reality, and science and engineering visualization. In this paper, a dynamic and interactive ground LiDAR data visualization platform is constructed based on OSG and the cross-platform C++ application development framework Qt. For point cloud data in .txt format and triangulation network data files in .obj format, display functions for 3D laser point clouds and triangulation networks are implemented. Experiments show that the platform has strong practical value, as it is easy to operate and provides good interaction.
Piecewise-Planar StereoScan: Sequential Structure and Motion using Plane Primitives.
Raposo, Carolina; Antunes, Michel; Barreto, Joao P
2017-08-09
The article describes a pipeline that receives as input a sequence of stereo images, and outputs the camera motion and a Piecewise-Planar Reconstruction (PPR) of the scene. The pipeline, named Piecewise-Planar StereoScan (PPSS), works as follows: the planes in the scene are detected for each stereo view using semi-dense depth estimation; the relative pose is computed by a new closed-form minimal algorithm that only uses point correspondences whenever plane detections do not fully constrain the motion; the camera motion and the PPR are jointly refined by alternating between discrete optimization and continuous bundle adjustment; and, finally, the detected 3D planes are segmented in images using a new framework that handles low texture and visibility issues. PPSS is extensively validated in indoor and outdoor datasets, and benchmarked against two popular point-based SfM pipelines. The experiments confirm that plane-based visual odometry is resilient to situations of small image overlap, poor texture, specularity, and perceptual aliasing where the fast LIBVISO2 pipeline fails. The comparison against VisualSfM+CMVS/PMVS shows that, for a similar computational complexity, PPSS is more accurate and provides much more compelling and visually pleasant 3D models. These results strongly suggest that plane primitives are an advantageous alternative to point correspondences for applications of SfM and 3D reconstruction in man-made environments.
Do reference surfaces influence exocentric pointing?
Doumen, M J A; Kappers, A M L; Koenderink, J J
2008-06-01
All elements of the visual field are known to influence the perception of the egocentric distances of objects. Not only the ground surface of a scene, but also the surface at the back or other objects in the scene can affect an observer's egocentric distance estimation of an object. We tested whether this is also true for exocentric direction estimations. We used an exocentric pointing task to test whether the presence of poster-boards in the visual scene would influence the perception of the exocentric direction between two test-objects. In this task the observer has to direct a pointer, with a remote control, to a target. We placed the poster-boards at various positions in the visual field to test whether these boards would affect the settings of the observer. We found that they only affected the settings when they directly served as a reference for orienting the pointer to the target.
The Faces in Infant-Perspective Scenes Change over the First Year of Life
Jayaraman, Swapnaa; Fausey, Caitlin M.; Smith, Linda B.
2015-01-01
Mature face perception has its origins in the face experiences of infants. However, little is known about the basic statistics of faces in early visual environments. We used head cameras to capture and analyze over 72,000 infant-perspective scenes from 22 infants aged 1-11 months as they engaged in daily activities. The frequency of faces in these scenes declined markedly with age: for the youngest infants, faces were present 15 minutes in every waking hour but only 5 minutes for the oldest infants. In general, the available faces were well characterized by three properties: (1) they belonged to relatively few individuals; (2) they were close and visually large; and (3) they presented views showing both eyes. These three properties most strongly characterized the face corpora of our youngest infants and constitute environmental constraints on the early development of the visual system. PMID:26016988
NASA Astrophysics Data System (ADS)
den Hollander, Richard J. M.; Bouma, Henri; van Rest, Jeroen H. C.; ten Hove, Johan-Martijn; ter Haar, Frank B.; Burghouts, Gertjan J.
2017-10-01
Video analytics is essential for managing large quantities of raw data that are produced by video surveillance systems (VSS) for the prevention, repression and investigation of crime and terrorism. Analytics is highly sensitive to changes in the scene and to changes in the optical chain, so a VSS with analytics needs careful configuration and prompt maintenance to avoid false alarms. However, there is a trend from static VSS consisting of fixed CCTV cameras towards more dynamic VSS deployments over public/private multi-organization networks, consisting of a wider variety of visual sensors, including pan-tilt-zoom (PTZ) cameras, body-worn cameras and cameras on moving platforms. This trend will lead to more dynamic scenes and more frequent changes in the optical chain, creating structural problems for analytics. If these problems are not adequately addressed, analytics will not be able to continue to meet end users' developing needs. In this paper, we present a three-part solution for managing the performance of complex analytics deployments. The first part is a register containing metadata describing relevant properties of the optical chain, such as intrinsic and extrinsic calibration, and parameters of the scene such as lighting conditions or measures for scene complexity (e.g. number of people). A second part frequently assesses these parameters in the deployed VSS, stores changes in the register, and signals relevant changes in the setup to the VSS administrator. A third part uses the information in the register to dynamically configure analytics tasks based on VSS operator input. In order to support the feasibility of this solution, we give an overview of related state-of-the-art technologies for autocalibration (self-calibration), scene recognition and lighting estimation in relation to person detection. The presented solution allows for rapid and robust deployment of Video Content Analysis (VCA) tasks in large scale ad-hoc networks.
Face, Body, and Center of Gravity Mediate Person Detection in Natural Scenes
ERIC Educational Resources Information Center
Bindemann, Markus; Scheepers, Christoph; Ferguson, Heather J.; Burton, A. Mike
2010-01-01
Person detection is an important prerequisite of social interaction, but is not well understood. Following suggestions that people in the visual field can capture a viewer's attention, this study examines the role of the face and the body for person detection in natural scenes. We observed that viewers tend first to look at the center of a scene,…
The Effect of Scene Variation on the Redundant Use of Color in Definite Reference
ERIC Educational Resources Information Center
Koolen, Ruud; Goudbeek, Martijn; Krahmer, Emiel
2013-01-01
This study investigates to what extent the amount of variation in a visual scene causes speakers to mention the attribute color in their definite target descriptions, focusing on scenes in which this attribute is not needed for identification of the target. The results of our three experiments show that speakers are more likely to redundantly…
Correction techniques for depth errors with stereo three-dimensional graphic displays
NASA Technical Reports Server (NTRS)
Parrish, Russell V.; Holden, Anthony; Williams, Steven P.
1992-01-01
Three-dimensional (3-D), 'real-world' pictorial displays that incorporate 'true' depth cues via stereopsis techniques have proved effective for displaying complex information in a natural way to enhance situational awareness and to improve pilot/vehicle performance. In such displays, the display designer must map the depths in the real world to the depths available with the stereo display system. However, empirical data have shown that the human subject does not perceive the information at exactly the depth at which it is mathematically placed. Head movements can also seriously distort the depth information that is embedded in stereo 3-D displays because the transformations used in mapping the visual scene to the depth-viewing volume (DVV) depend intrinsically on the viewer location. The goal of this research was to provide two correction techniques; the first technique corrects the original visual scene to the DVV mapping based on human perception errors, and the second (which is based on head-positioning sensor input data) corrects for errors induced by head movements. Empirical data are presented to validate both correction techniques. A combination of the two correction techniques effectively eliminates the distortions of depth information embedded in stereo 3-D displays.
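The dependence of stereo depth on viewer location mentioned in the abstract above can be illustrated with textbook similar-triangle geometry. The sketch below is not the authors' correction technique; the function names, the on-axis viewer, and the numeric values (65 mm eye separation, 0.5 m design viewing distance) are assumptions chosen only for illustration.

```python
def screen_parallax(eye_sep, view_dist, point_dist):
    """Horizontal on-screen parallax that places a point at point_dist
    from a viewer sitting view_dist from the screen (all in metres)."""
    return eye_sep * (point_dist - view_dist) / point_dist


def perceived_distance(eye_sep, view_dist, parallax):
    """Distance at which the two lines of sight through the left and
    right image points intersect, for a viewer at view_dist."""
    return eye_sep * view_dist / (eye_sep - parallax)


# Hypothetical setup: the image pair is computed for a viewer at 0.5 m.
EYE_SEP = 0.065
p = screen_parallax(EYE_SEP, 0.5, 1.0)      # point intended at 1.0 m

at_design = perceived_distance(EYE_SEP, 0.5, p)    # viewer where expected
moved_back = perceived_distance(EYE_SEP, 0.75, p)  # viewer leans back 0.25 m

print(round(at_design, 3), round(moved_back, 3))
```

With the parallax held fixed, moving the viewer back from 0.5 m to 0.75 m stretches the geometrically predicted distance from about 1.0 m to about 1.5 m, which is the kind of head-movement distortion a head-position sensor could be used to compensate.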
Goodhew, Stephanie C; Edwards, Mark
2016-12-01
When the human brain is confronted with complex and dynamic visual scenes, two pivotal processes are at play: visual attention (the process of selecting certain aspects of the scene for privileged processing) and object individuation (determining what information belongs to a continuing object over time versus what represents two or more distinct objects). Here we examined whether these processes are independent or whether they interact. Object-substitution masking (OSM) has been used as a tool to examine such questions; however, there is controversy surrounding whether OSM reflects object individuation versus substitution processes. The object-individuation account is agnostic regarding the role of attention, whereas object-substitution theory stipulates a pivotal role for attention. There have been attempts to investigate the role of attention in OSM, but they have been subject to alternative explanations. Here, therefore, we manipulated the size of the attended region, a pure and uncontaminated attentional manipulation, and examined the impact on OSM. Across three experiments, there was no interaction. This refutes the object-substitution theory of OSM. This, in turn, tells us that object individuation is invariant to the distribution of attention. Copyright © 2016 Elsevier B.V. All rights reserved.
Radiologists remember mountains better than radiographs, or do they?
Evans, Karla K; Marom, Edith M; Godoy, Myrna C B; Palacio, Diana; Sagebiel, Tara; Cuellar, Sonia Betancourt; McEntee, Mark; Tian, Charles; Brennan, Patrick C; Haygood, Tamara Miner
2016-01-01
Expertise with encoding material has been shown to aid long-term memory for that material. It is not clear how relevant this expertise is for image memorability (e.g., radiologists' memory for radiographs), and how robust it is over time. In two studies, we tested scene memory using a standard long-term memory paradigm. One compared the performance of radiologists to naïve observers on two image sets, chest radiographs and everyday scenes, and the other compared radiologists' memory on immediate as opposed to delayed recognition tests using musculoskeletal radiographs and forest scenes. Radiologists' memory was better than novices' for images of expertise but no different for everyday scenes. With the heterogeneity of image sets equated, radiologists' expertise with radiographs afforded them better memory for the musculoskeletal radiographs than forest scenes. Enhanced memory for images of expertise disappeared over time, resulting in chance level performance for both image sets after weeks of delay. Expertise with the material is important for visual memorability but not to the same extent as idiosyncratic detail and variability of the image set. Similar memory decline with time for images of expertise as for everyday scenes further suggests that extended familiarity with an image is not a robust factor for visual memorability.
Le, Thang M; Borghi, John A; Kujawa, Autumn J; Klein, Daniel N; Leung, Hoi-Chung
2017-01-01
The present study examined the impacts of major depressive disorder (MDD) on visual and prefrontal cortical activity as well as their connectivity during visual working memory updating and related them to the core clinical features of the disorder. Impairment in working memory updating is typically associated with the retention of irrelevant negative information which can lead to persistent depressive mood and abnormal affect. However, performance deficits have been observed in MDD on tasks involving little or no demand on emotion processing, suggesting dysfunctions may also occur at the more basic level of information processing. Yet, it is unclear how various regions in the visual working memory circuit contribute to behavioral changes in MDD. We acquired functional magnetic resonance imaging data from 18 unmedicated participants with MDD and 21 age-matched healthy controls (CTL) while they performed a visual delayed recognition task with neutral faces and scenes as task stimuli. Selective working memory updating was manipulated by inserting a cue in the delay period to indicate which one or both of the two memorized stimuli (a face and a scene) would remain relevant for the recognition test. Our results revealed several key findings. Relative to the CTL group, the MDD group showed weaker postcue activations in visual association areas during selective maintenance of face and scene working memory. Across the MDD subjects, greater rumination and depressive symptoms were associated with more persistent activation and connectivity related to no-longer-relevant task information. Classification of postcue spatial activation patterns of the scene-related areas was also less consistent in the MDD subjects compared to the healthy controls. Such abnormalities appeared to result from a lack of updating effects in postcue functional connectivity between prefrontal and scene-related areas in the MDD group. 
In sum, disrupted working memory updating in MDD was revealed by alterations in activity patterns of the visual association areas, their connectivity with the prefrontal cortex, and their relationship with core clinical characteristics. These results highlight the role of information updating deficits in the cognitive control and symptomatology of depression.
Charrier, A; Tardif, C; Gepner, B
2017-02-01
Face and gaze avoidance are among the most characteristic and salient symptoms of autism spectrum disorders (ASD). Studies using eye tracking highlighted early and lifelong ASD-specific abnormalities in attention to face such as decreased attention to internal facial features. These specificities could be partly explained by disorders in the perception and integration of rapid and complex information such as that conveyed by facial movements and more broadly by biological and physical environment. Therefore, we wish to test whether slowing down facial dynamics may improve the way children with ASD attend to a face. We used an eye tracking method to examine gaze patterns of children with ASD aged 3 to 8 (n=23) and TD controls (n=29) while viewing the face of a speaker telling a story. The story was divided into 6 sequences that were randomly displayed at 3 different speeds, i.e. a real-time speed (RT), a slow speed (S70=70% of RT speed), and a very slow speed (S50=50% of RT speed). S70 and S50 were displayed thanks to software called Logiral™, aimed at slowing down visual and auditory stimuli simultaneously and without tone distortion. The visual scene was divided into four regions of interest (ROI): eyes region; mouth region; whole face region; outside the face region. The total time, number and mean duration of visual fixations on the whole visual scene and the four ROI were measured between and within the two groups. Compared to TD children, children with ASD spent significantly less time attending to the visual scenes and, when they looked at the scene, they spent less time scanning the speaker's face in general and her mouth in particular, and more time looking outside the facial area. Within the ASD group, mean duration of fixation increased on the whole scene and particularly on the mouth area, in S50 compared to RT. 
Children with mild autism spent more time looking at the face than the two other groups of ASD children, and spent more time attending to the face and mouth, with longer mean duration of visual fixation on mouth and eyes, at slow speeds (S50 and/or S70) than at RT. Slowing down facial dynamics enhances looking time on face, and particularly on mouth and/or eyes, in a group of 23 children with ASD and particularly in a small subgroup with mild autism. Given the crucial role of reading the eyes for emotional processing and that of lip-reading for language processing, our present result and other converging ones could pave the way for novel socio-emotional and verbal rehabilitation methods for the autistic population. Further studies should investigate whether increased attention to face and particularly eyes and mouth is correlated to emotional/social and/or verbal/language improvements. Copyright © 2016 L'Encéphale, Paris. Published by Elsevier Masson SAS. All rights reserved.
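The three fixation measures named in the abstract above (total time, number, and mean duration of fixations per region of interest) can be computed from a list of labelled fixations. This is a minimal sketch, not the authors' analysis pipeline: the function name, the `(roi, duration_ms)` record layout, and the sample values are hypothetical, and each fixation is assumed to carry a single ROI label.

```python
from collections import defaultdict


def fixation_metrics(fixations):
    """Summarise fixations per region of interest.

    fixations: iterable of (roi_name, duration_ms) pairs.
    Returns {roi: {"total_ms", "count", "mean_ms"}}.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for roi, duration_ms in fixations:
        totals[roi] += duration_ms
        counts[roi] += 1
    return {
        roi: {
            "total_ms": totals[roi],
            "count": counts[roi],
            "mean_ms": totals[roi] / counts[roi],
        }
        for roi in totals
    }


# Made-up fixations for one viewing sequence (illustrative only).
sample = [("eyes", 200), ("mouth", 300), ("eyes", 400), ("outside", 100)]
metrics = fixation_metrics(sample)
print(metrics["eyes"])
```

Running the same summary separately for each playback speed would give the per-condition values that a between-speed comparison needs.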
Evans, Karla K; Horowitz, Todd S; Howe, Piers; Pedersini, Roccardo; Reijnen, Ester; Pinto, Yair; Kuzmova, Yoana; Wolfe, Jeremy M
2011-09-01
A typical visual scene we encounter in everyday life is complex and filled with a huge amount of perceptual information. The term 'visual attention' describes a set of mechanisms that limit some processing to a subset of incoming stimuli. Attentional mechanisms shape what we see and what we can act upon. They allow for concurrent selection of some (preferably, relevant) information and inhibition of other information. This selection permits the reduction of complexity and informational overload. Selection can be determined both by the 'bottom-up' saliency of information from the environment and by the 'top-down' state and goals of the perceiver. Attentional effects can take the form of modulating or enhancing the selected information. A central role for selective attention is to enable the 'binding' of selected information into unified and coherent representations of objects in the outside world. In the overview on visual attention presented here we review the mechanisms and consequences of selection and inhibition over space and time. We examine theoretical, behavioral and neurophysiologic work done on visual attention. We also discuss the relations between attention and other cognitive processes such as automaticity and awareness. WIREs Cogni Sci 2011 2 503-514 DOI: 10.1002/wcs.127 Copyright © 2011 John Wiley & Sons, Ltd.
NASA Technical Reports Server (NTRS)
Sweet, Barbara T.; Kaiser, Mary K.
2013-01-01
Although current technology simulator visual systems can achieve extremely realistic levels, they do not completely replicate the experience of a pilot sitting in the cockpit, looking at the outside world. Some differences in experience are due to visual artifacts, or perceptual features that would not be present in a naturally viewed scene. Others are due to features that are missing from the simulated scene. In this paper, these differences will be defined and discussed. The significance of these differences will be examined as a function of several particular operational tasks. A framework to facilitate the choice of visual system characteristics based on operational task requirements will be proposed.
Ma, Liyan; Qiu, Bo; Cui, Mingyue; Ding, Jianwei
2017-01-01
Depth image-based rendering (DIBR), which is used to render virtual views with a color image and the corresponding depth map, is one of the key techniques in the 2D to 3D conversion process. Due to the absence of knowledge about the 3D structure of a scene and its corresponding texture, DIBR in the 2D to 3D conversion process inevitably leads to holes in the resulting 3D image as a result of newly-exposed areas. In this paper, we propose a structure-aided depth map preprocessing framework in the transformed domain, which is inspired by the recently proposed domain transform for its low complexity and high efficiency. First, our framework integrates hybrid constraints including scene structure, edge consistency and visual saliency information in the transformed domain to improve the performance of depth map preprocessing in an implicit way. Then, adaptive smooth localization is incorporated and realized in the proposed framework to further reduce over-smoothness and enhance optimization in the non-hole regions. Unlike other similar methods, the proposed method can simultaneously achieve the effects of hole filling, edge correction and local smoothing for typical depth maps in a unified framework. Thanks to these advantages, it can yield visually satisfactory results with less computational complexity for high quality 2D to 3D conversion. Numerical experimental results demonstrate the excellent performance of the proposed method. PMID:28407027
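Why DIBR leaves holes in newly exposed areas can be seen on a single scanline. The sketch below is a generic forward warp, not the preprocessing framework proposed in the paper; the function name, the depth convention (values in [0, 1] with 1 meaning nearest), the shift direction, and the toy data are all assumptions made for illustration.

```python
def warp_scanline(colors, depths, max_disparity):
    """Forward-warp one image row to a virtual view.

    depths are assumed to lie in [0, 1] with 1 = nearest, so nearer
    pixels shift further. Returns the warped row, where None marks a
    hole: a target pixel that no source pixel mapped to.
    """
    width = len(colors)
    out = [None] * width
    out_depth = [-1.0] * width  # depth of the pixel currently stored
    for x, (color, z) in enumerate(zip(colors, depths)):
        tx = x + int(round(max_disparity * z))
        # Keep the nearest contributor when several pixels collide.
        if 0 <= tx < width and z > out_depth[tx]:
            out[tx] = color
            out_depth[tx] = z
    return out


# Toy row: a near object ("c", "d") in front of a far background.
row = ["a", "b", "c", "d", "e", "f"]
depth = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
warped = warp_scanline(row, depth, 2)
holes = [i for i, value in enumerate(warped) if value is None]
print(warped, holes)
```

The foreground pixels move two positions and cover the background behind them, while the positions they vacate receive no source pixel at all. Smoothing the depth map before warping, as depth map preprocessing methods do, reduces the sharp disparity jump that opens such gaps.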
Scene perception and the visual control of travel direction in navigating wood ants
Collett, Thomas S.; Lent, David D.; Graham, Paul
2014-01-01
This review reflects a few of Mike Land's many and varied contributions to visual science. In it, we show for wood ants, as Mike has done for a variety of animals, including readers of this piece, what can be learnt from a detailed analysis of an animal's visually guided eye, head or body movements. In the case of wood ants, close examination of their body movements, as they follow visually guided routes, is starting to reveal how they perceive and respond to their visual world and negotiate a path within it. We describe first some of the mechanisms that underlie the visual control of their paths, emphasizing that vision is not the ant's only sense. In the second part, we discuss how remembered local shape-dependent and global shape-independent features of a visual scene may interact in guiding the ant's path. PMID:24395962
Keshner, E A; Dhaher, Y
2008-07-01
Multiplanar environmental motion could generate head instability, particularly if the visual surround moves in planes orthogonal to a physical disturbance. We combined sagittal plane surface translations with visual field disturbances in 12 healthy (29-31 years) and 3 visually sensitive (27-57 years) adults. Center of pressure (COP), peak head angles, and RMS values of head motion were calculated and a three-dimensional model of joint motion was developed to examine gross head motion in three planes. We found that subjects standing quietly in front of a visual scene translating in the sagittal plane produced significantly greater (p<0.003) head motion in yaw than when on a translating platform. However, when the platform was translated in the dark or with a visual scene rotating in roll, head motion orthogonal to the plane of platform motion significantly increased (p<0.02). Visually sensitive subjects having no history of vestibular disorder produced large, delayed compensatory head motion. Orthogonal head motions were significantly greater in visually sensitive than in healthy subjects in the dark (p<0.05) and with a stationary scene (p<0.01). We concluded that motion of the visual field could modify compensatory response kinematics of a freely moving head in planes orthogonal to the direction of a physical perturbation. These results suggest that the mechanisms controlling head orientation in space are distinct from those that control trunk orientation in space. These behaviors would have been missed if only COP data were considered. Data suggest that rehabilitation training can be enhanced by combining visual and mechanical perturbation paradigms.
Scientific Visualization and Computational Science: Natural Partners
NASA Technical Reports Server (NTRS)
Uselton, Samuel P.; Lasinski, T. A. (Technical Monitor)
1995-01-01
Scientific visualization is developing rapidly, stimulated by computational science, which is gaining acceptance as a third alternative to theory and experiment. Computational science is based on numerical simulations of mathematical models derived from theory. But each individual simulation is like a hypothetical experiment; initial conditions are specified, and the result is a record of the observed conditions. Experiments can be simulated for situations that cannot really be created or controlled. Results impossible to measure can be computed. Even for observable values, computed samples are typically much denser. Numerical simulations also extend scientific exploration where the mathematics is analytically intractable. Numerical simulations are used to study phenomena from subatomic to intergalactic scales and from abstract mathematical structures to pragmatic engineering of everyday objects. But computational science methods would be almost useless without visualization. The obvious reason is that the huge amounts of data produced require the high bandwidth of the human visual system, and interactivity adds to the power. Visualization systems also provide a single context for all the activities involved from debugging the simulations, to exploring the data, to communicating the results. Most of the presentations today have their roots in image processing, where the fundamental task is: Given an image, extract information about the scene. Visualization has developed from computer graphics, and the inverse task: Given a scene description, make an image. Visualization extends the graphics paradigm by expanding the possible input. The goal is still to produce images; the difficulty is that the input is not a scene description displayable by standard graphics methods. Visualization techniques must either transform the data into a scene description or extend graphics techniques to display this odd input.
Computational science is a fertile field for visualization research because the results vary so widely and include things that have no known appearance. The amount of data creates additional challenges for both hardware and software systems. Evaluations of visualization should ultimately reflect the insight gained into the scientific phenomena. So making good visualizations requires consideration of characteristics of the user and the purpose of the visualization. Knowledge about human perception and graphic design is also relevant. It is this breadth of knowledge that stimulates proposals for multidisciplinary visualization teams and intelligent visualization assistant software. Visualization is an immature field, but computational science is stimulating research on a broad front.
Visualization of spatial-temporal data based on 3D virtual scene
NASA Astrophysics Data System (ADS)
Wang, Xianghong; Liu, Jiping; Wang, Yong; Bi, Junfang
2009-10-01
This paper aims to achieve three-dimensional dynamic visualization of spatial-temporal data within a three-dimensional virtual scene, combining three-dimensional visualization technology with GIS so that people's ability to understand time and space is improved through dynamic symbols and interactive expression. Using particle systems, three-dimensional simulation, virtual reality and other visual means, we can simulate the situations produced as the spatial location and attribute information of geographical entities change over time, explore and analyze their movement and transformation rules by changing the mode of interaction, and also replay history and forecast the future. The main research objects are the spatial-temporal data of vehicle tracks and typhoon paths; three-dimensional dynamic simulation of these tracks supports timely monitoring of trends and replay of historical tracks. Visualization techniques for spatial-temporal data in a three-dimensional virtual scene thus provide an excellent instrument for understanding spatial-temporal information: they not only show changes and developments in a situation more clearly, but can also be used to predict and deduce future developments and changes.
NASA Astrophysics Data System (ADS)
Assadi, Amir H.
2001-11-01
Perceptual geometry is an emerging field of interdisciplinary research whose objectives focus on study of geometry from the perspective of visual perception, and in turn, apply such geometric findings to the ecological study of vision. Perceptual geometry attempts to answer fundamental questions in perception of form and representation of space through synthesis of cognitive and biological theories of visual perception with geometric theories of the physical world. Perception of form and space are among fundamental problems in vision science. In recent cognitive and computational models of human perception, natural scenes are used systematically as preferred visual stimuli. Among key problems in perception of form and space, we have examined perception of geometry of natural surfaces and curves, e.g. as in the observer's environment. Besides a systematic mathematical foundation for a remarkably general framework, the advantages of the Gestalt theory of natural surfaces include a concrete computational approach to simulate or recreate images whose geometric invariants and quantities might be perceived and estimated by an observer. The latter is at the very foundation of understanding the nature of perception of space and form, and the (computer graphics) problem of rendering scenes to visually invoke virtual presence.
A high-quality high-fidelity visualization of the September 11 attack on the World Trade Center.
Rosen, Paul; Popescu, Voicu; Hoffmann, Christoph; Irfanoglu, Ayhan
2008-01-01
In this application paper, we describe the efforts of a multidisciplinary team towards producing a visualization of the September 11 Attack on the North Tower of New York's World Trade Center. The visualization was designed to meet two requirements. First, the visualization had to depict the impact with high fidelity, by closely following the laws of physics. Second, the visualization had to be eloquent to a nonexpert user. This was achieved by first designing and computing a finite-element analysis (FEA) simulation of the impact between the aircraft and the top 20 stories of the building, and then by visualizing the FEA results with a state-of-the-art commercial animation system. The visualization was enabled by an automatic translator that converts the simulation data into an animation system 3D scene. We built upon a previously developed translator. The translator was substantially extended to enable and control visualization of fire and of disintegrating elements, to better scale with the number of nodes and number of states, to handle beam elements with complex profiles, and to handle smoothed particle hydrodynamics liquid representation. The resulting translator is a powerful automatic and scalable tool for high-quality visualization of FEA results.
Bag of Lines (BoL) for Improved Aerial Scene Representation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sridharan, Harini; Cheriyadat, Anil M.
2014-09-22
Feature representation is a key step in automated visual content interpretation. In this letter, we present a robust feature representation technique, referred to as bag of lines (BoL), for high-resolution aerial scenes. The proposed technique involves extracting and compactly representing low-level line primitives from the scene. The compact scene representation is generated by counting the different types of lines representing various linear structures in the scene. Through extensive experiments, we show that the proposed scene representation is invariant to scale changes and scene conditions and can discriminate urban scene categories accurately. We compare the BoL representation with the popular scale invariant feature transform (SIFT) and Gabor wavelets for their classification and clustering performance on an aerial scene database consisting of images acquired by sensors with different spatial resolutions. The proposed BoL representation outperforms the SIFT- and Gabor-based representations.
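The counting step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how line "types" are defined, so the binning by orientation and length below, the function name `bag_of_lines`, and the bin edges are all assumptions made for illustration; this is not the authors' implementation, and it does not attempt to reproduce the reported scale invariance.

```python
import math
from collections import Counter

def bag_of_lines(segments, n_orient_bins=4, length_edges=(10.0, 50.0)):
    """Count line segments by (orientation bin, length bin) and normalize.

    segments: iterable of (x1, y1, x2, y2) tuples (hypothetical input format).
    The orientation/length binning is an assumed definition of "line type".
    """
    counts = Counter()
    for x1, y1, x2, y2 in segments:
        length = math.hypot(x2 - x1, y2 - y1)
        # Fold direction into [0, pi) so a line and its reverse share a bin.
        angle = math.atan2(y2 - y1, x2 - x1) % math.pi
        orient_bin = min(int(angle / math.pi * n_orient_bins), n_orient_bins - 1)
        # Length bin = number of edges the segment length reaches or exceeds.
        length_bin = sum(length >= edge for edge in length_edges)
        counts[(orient_bin, length_bin)] += 1

    total = sum(counts.values())
    n_length_bins = len(length_edges) + 1
    # Flatten to a fixed-length histogram so scenes can be compared directly.
    return [
        counts[(o, l)] / total if total else 0.0
        for o in range(n_orient_bins)
        for l in range(n_length_bins)
    ]

# Example: two horizontal segments (short, long) and one vertical segment.
hist = bag_of_lines([(0, 0, 5, 0), (0, 0, 0, 20), (0, 0, 100, 0)])
```

A fixed-length normalized histogram is used here because it can be passed unchanged to a standard classifier or clustering routine, which is the kind of comparison the abstract reports.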
The artist's advantage: Better integration of object information across eye movements
Perdreau, Florian; Cavanagh, Patrick
2013-01-01
Over their careers, figurative artists spend thousands of hours analyzing objects and scene layout. We examined what impact this extensive training has on the ability to encode complex scenes, comparing participants with a wide range of training and drawing skills on a possible versus impossible objects task. We used a gaze-contingent display to control the amount of information the participants could sample on each fixation either from central or peripheral visual field. Test objects were displayed and participants reported, as quickly as possible, whether the object was structurally possible or not. Our results show that when viewing the image through a small central window, performance improved with the years of training, and to a lesser extent with the level of skill. This suggests that the extensive training itself confers an advantage for integrating object structure into more robust object descriptions. PMID:24349697
Adaptation of facial synthesis to parameter analysis in MPEG-4 visual communication
NASA Astrophysics Data System (ADS)
Yu, Lu; Zhang, Jingyu; Liu, Yunhai
2000-12-01
In MPEG-4, Facial Definition Parameters (FDPs) and Facial Animation Parameters (FAPs) are defined to animate a facial object. Most previous facial animation reconstruction systems focused on synthesizing animation from manually or automatically generated FAPs, but not from FAPs extracted from natural video scenes. In this paper, an analysis-synthesis MPEG-4 visual communication system is established, in which facial animation is reconstructed from FAPs extracted from natural video scenes.
Guidance for Development of a Flight Simulator Specification
2007-05-01
the simulated line of sight to the moon is less than one degree, and that the moon appears to move smoothly across the visual scene. The phase of the...Agencies have adopted the definition used by Optics Companies (this definition has also been adopted in this revision of the Air Force Guide...simulators that require tracking the target as it slues across the displayed scene, such as with air-to-ground or air-to-air combat tasks. Visual systems
Olivier, Agnès; Faugloire, Elise; Lejeune, Laure; Biau, Sophie; Isableu, Brice
2017-01-01
Maintaining equilibrium while riding a horse is a challenging task that involves complex sensorimotor processes. We evaluated the relative contribution of visual information (static or dynamic) to horseback riders' postural stability (measured from the variability of segment position in space) and the coordination modes they adopted to regulate balance according to their level of expertise. Riders' perceptual typologies and their possible relation to postural stability were also assessed. Our main assumption was that the contribution of visual information to postural control would be reduced among expert riders in favor of vestibular and somesthetic reliance. Twelve Professional riders and 13 Club riders rode an equestrian simulator at a gallop under four visual conditions: (1) with the projection of a simulated scene reproducing what a rider sees in the real context of a ride in an outdoor arena, (2) under stroboscopic illumination, preventing access to dynamic visual cues, (3) in normal lighting but without the projected scene (i.e., without the visual consequences of displacement) and (4) with no visual cues. The variability of the position of the head, upper trunk and lower trunk was measured along the anteroposterior (AP), mediolateral (ML), and vertical (V) axes. We computed discrete relative phase to assess the coordination between pairs of segments in the anteroposterior axis. Visual field dependence-independence was evaluated using the Rod and Frame Test (RFT). The results showed that the Professional riders exhibited greater overall postural stability than the Club riders, revealed mainly in the AP axis. In particular, head variability was lower in the Professional riders than in the Club riders in visually altered conditions, suggesting a greater ability to use vestibular and somesthetic information according to task constraints with expertise. 
In accordance with this result, RFT perceptual scores revealed that the Professional riders were less dependent on the visual field than were the Club riders. Finally, the Professional riders exhibited specific coordination modes that, unlike the Club riders, departed from pure in-phase and anti-phase patterns and depended on visual conditions. The present findings provide evidence of major differences in the sensorimotor processes contributing to postural control with expertise in horseback riding. PMID:28194100
Effect of fixation positions on perception of lightness
NASA Astrophysics Data System (ADS)
Toscani, Matteo; Valsecchi, Matteo; Gegenfurtner, Karl R.
2015-03-01
Visual acuity, luminance sensitivity, contrast sensitivity, and color sensitivity are maximal in the fovea and decrease with retinal eccentricity. Therefore every scene is perceived by integrating the small, high resolution samples collected by moving the eyes around. Moreover, when viewing ambiguous figures the fixated position influences the dominance of the possible percepts. Therefore fixations could serve as a selection mechanism whose function is not confined to finely resolve the selected detail of the scene. Here this hypothesis is tested in the lightness perception domain. In a first series of experiments we demonstrated that when observers matched the color of natural objects they based their lightness judgments on objects' brightest parts. During this task the observers tended to fixate points with above average luminance, suggesting a relationship between perception and fixations that we causally proved using a gaze contingent display in a subsequent experiment. Simulations with rendered physical lighting show that higher values in an object's luminance distribution are particularly informative about reflectance. In a second series of experiments we considered a high level strategy that the visual system uses to segment the visual scene in a layered representation. We demonstrated that eye movement sampling mediates between the layer segregation and its effects on lightness perception. Together these studies show that eye fixations are partially responsible for the selection of information from a scene that allows the visual system to estimate the reflectance of a surface.
Kauffmann, Louise; Chauvin, Alan; Pichat, Cédric; Peyrin, Carole
2015-10-01
According to current models of visual perception scenes are processed in terms of spatial frequencies following a predominantly coarse-to-fine processing sequence. Low spatial frequencies (LSF) reach high-order areas rapidly in order to activate plausible interpretations of the visual input. This triggers top-down facilitation that guides subsequent processing of high spatial frequencies (HSF) in lower-level areas such as the inferotemporal and occipital cortices. However, dynamic interactions underlying top-down influences on the occipital cortex have never been systematically investigated. The present fMRI study aimed to further explore the neural bases and effective connectivity underlying coarse-to-fine processing of scenes, particularly the role of the occipital cortex. We used sequences of six filtered scenes as stimuli depicting coarse-to-fine or fine-to-coarse processing of scenes. Participants performed a categorization task on these stimuli (indoor vs. outdoor). Firstly, we showed that coarse-to-fine (compared to fine-to-coarse) sequences elicited stronger activation in the inferior frontal gyrus (in the orbitofrontal cortex), the inferotemporal cortex (in the fusiform and parahippocampal gyri), and the occipital cortex (in the cuneus). Dynamic causal modeling (DCM) was then used to infer effective connectivity between these regions. DCM results revealed that coarse-to-fine processing resulted in increased connectivity from the occipital cortex to the inferior frontal gyrus and from the inferior frontal gyrus to the inferotemporal cortex. Critically, we also observed an increase in connectivity strength from the inferior frontal gyrus to the occipital cortex, suggesting that top-down influences from frontal areas may guide processing of incoming signals. 
The present results support current models of visual perception and refine them by emphasizing the role of the occipital cortex as a cortical site for feedback projections in the neural network underlying coarse-to-fine processing of scenes. Copyright © 2015 Elsevier Inc. All rights reserved.
Water surface modeling from a single viewpoint video.
Li, Chuan; Pickup, David; Saunders, Thomas; Cosker, Darren; Marshall, David; Hall, Peter; Willis, Philip
2013-07-01
We introduce a video-based approach for producing water surface models. Recent advances in this field output high-quality results but require dedicated capturing devices and only work in limited conditions. In contrast, our method achieves a good tradeoff between the visual quality and the production cost: It automatically produces a visually plausible animation using a single viewpoint video as the input. Our approach is based on two discoveries: first, shape from shading (SFS) is adequate to capture the appearance and dynamic behavior of the example water; second, a shallow water model can be used to estimate a velocity field that produces complex surface dynamics. We provide a qualitative evaluation of our method and demonstrate its good performance across a wide range of scenes.
Scene Context Dependency of Pattern Constancy of Time Series Imagery
NASA Technical Reports Server (NTRS)
Woodell, Glenn A.; Jobson, Daniel J.; Rahman, Zia-ur
2008-01-01
A fundamental element of future generic pattern recognition technology is the ability to extract similar patterns for the same scene despite wide-ranging extraneous variables, including lighting, turbidity, sensor exposure variations, and signal noise. In the process of demonstrating pattern constancy of this kind for retinex/visual servo (RVS) image enhancement processing, we found that the pattern constancy performance depended somewhat on scene content. Most notably, the scene topography and, in particular, the scale and extent of the topography in an image, affect the pattern constancy the most. This paper will explore these effects in more depth and present experimental data from several time series tests. These results further quantify the impact of topography on pattern constancy. Despite this residual inconstancy, the results of overall pattern constancy testing support the idea that RVS image processing can be a universal front-end for generic visual pattern recognition. While the effects on pattern constancy were significant, the RVS processing still does achieve a high degree of pattern constancy over a wide spectrum of scene content diversity, and wide-ranging extraneous variations in lighting, turbidity, and sensor exposure.
Constructing, Perceiving, and Maintaining Scenes: Hippocampal Activity and Connectivity
Zeidman, Peter; Mullally, Sinéad L.; Maguire, Eleanor A.
2015-01-01
In recent years, evidence has accumulated to suggest the hippocampus plays a role beyond memory. A strong hippocampal response to scenes has been noted, and patients with bilateral hippocampal damage cannot vividly recall scenes from their past or construct scenes in their imagination. There is debate about whether the hippocampus is involved in the online processing of scenes independent of memory. Here, we investigated the hippocampal response to visually perceiving scenes, constructing scenes in the imagination, and maintaining scenes in working memory. We found extensive hippocampal activation for perceiving scenes, and a circumscribed area of anterior medial hippocampus common to perception and construction. There was significantly less hippocampal activity for maintaining scenes in working memory. We also explored the functional connectivity of the anterior medial hippocampus and found significantly stronger connectivity with a distributed set of brain areas during scene construction compared with scene perception. These results increase our knowledge of the hippocampus by identifying a subregion commonly engaged by scenes, whether perceived or constructed, by separating scene construction from working memory, and by revealing the functional network underlying scene construction, offering new insights into why patients with hippocampal lesions cannot construct scenes. PMID:25405941
How is visual salience computed in the brain? Insights from behaviour, neurobiology and modelling
Veale, Richard; Hafed, Ziad M.
2017-01-01
Inherent in visual scene analysis is a bottleneck associated with the need to sequentially sample locations with foveating eye movements. The concept of a ‘saliency map’ topographically encoding stimulus conspicuity over the visual scene has proven to be an efficient predictor of eye movements. Our work reviews insights into the neurobiological implementation of visual salience computation. We start by summarizing the role that different visual brain areas play in salience computation, whether at the level of feature analysis for bottom-up salience or at the level of goal-directed priority maps for output behaviour. We then delve into how a subcortical structure, the superior colliculus (SC), participates in salience computation. The SC represents a visual saliency map via a centre-surround inhibition mechanism in the superficial layers, which feeds into priority selection mechanisms in the deeper layers, thereby affecting saccadic and microsaccadic eye movements. Lateral interactions in the local SC circuit are particularly important for controlling active populations of neurons. This, in turn, might help explain long-range effects, such as those of peripheral cues on tiny microsaccades. Finally, we show how a combination of in vitro neurophysiology and large-scale computational modelling is able to clarify how salience computation is implemented in the local circuit of the SC. This article is part of the themed issue ‘Auditory and visual scene analysis’. PMID:28044023
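As a toy illustration of the centre-surround idea mentioned in this abstract, the sketch below scores each position of a one-dimensional luminance profile by how much it exceeds the mean of its neighbours. The function name `centre_surround_salience`, the one-dimensional input, and the rectified difference are assumptions chosen for brevity; this is not the superior colliculus circuit model the review describes.

```python
def centre_surround_salience(luminance, radius=2):
    """Rectified centre-minus-surround response for a 1D luminance profile.

    luminance: list of numbers (hypothetical stimulus).
    radius: how many neighbours on each side form the surround (assumed).
    """
    salience = []
    n = len(luminance)
    for i, centre in enumerate(luminance):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        # Surround is every neighbour inside the window except the centre.
        surround = [luminance[j] for j in range(lo, hi) if j != i]
        surround_mean = sum(surround) / len(surround) if surround else centre
        # Surround activity inhibits the centre; negative responses are clipped.
        salience.append(max(0.0, centre - surround_mean))
    return salience

# A single bright element on a uniform background stands out;
# a uniform profile produces no salience anywhere.
profile = [1, 1, 1, 5, 1, 1, 1]
response = centre_surround_salience(profile)
```

In this sketch the neighbours of the bright element are suppressed to zero because their surround mean is raised by it, a simple analogue of lateral inhibition.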
ERIC Educational Resources Information Center
Kidd, Evan; Stewart, Andrew J.; Serratrice, Ludovica
2011-01-01
In this paper we report on a visual world eye-tracking experiment that investigated the differing abilities of adults and children to use referential scene information during reanalysis to overcome lexical biases during sentence processing. The results showed that adults incorporated aspects of the referential scene into their parse as soon as it…
Mishra, Jyoti; Zanto, Theodore; Nilakantan, Aneesha; Gazzaley, Adam
2013-01-01
Intrasensory interference during visual working memory (WM) maintenance by object stimuli (such as faces and scenes) has been shown to negatively impact WM performance, with greater detrimental impacts of interference observed in aging. Here we assessed age-related impacts by intrasensory WM interference from lower-level stimulus features such as visual and auditory motion stimuli. We consistently found that interference in the form of ignored distractions and secondary task interruptions presented during a WM maintenance period degraded memory accuracy in both the visual and auditory domain. However, in contrast to prior studies assessing WM for visual object stimuli, feature-based interference effects were not observed to be significantly greater in older adults. Analyses of neural oscillations in the alpha frequency band further revealed preserved mechanisms of interference processing in terms of post-stimulus alpha suppression, which was observed maximally for secondary task interruptions in visual and auditory modalities in both younger and older adults. These results suggest that age-related sensitivity of WM to interference may be limited to complex object stimuli, at least at low WM loads. PMID:23791629
Use of Linear Perspective Scene Cues in a Simulated Height Regulation Task
NASA Technical Reports Server (NTRS)
Levison, W. H.; Warren, R.
1984-01-01
As part of a long-term effort to quantify the effects of visual scene cuing and non-visual motion cuing in flight simulators, an experimental study of the pilot's use of linear perspective cues in a simulated height-regulation task was conducted. Six test subjects performed a fixed-base tracking task with a visual display consisting of a simulated horizon and a perspective view of a straight, infinitely-long roadway of constant width. Experimental parameters were (1) the central angle formed by the roadway perspective and (2) the display gain. The subject controlled only the pitch/height axis; airspeed, bank angle, and lateral track were fixed in the simulation. The average RMS height error score for the least effective display configuration was about 25% greater than the score for the most effective configuration. Overall, larger and more highly significant effects were observed for the pitch and control scores. Model analysis was performed with the optimal control pilot model to characterize the pilot's use of visual scene cues, with the goal of obtaining a consistent set of independent model parameters to account for display effects.
Surface-illuminant ambiguity and color constancy: effects of scene complexity and depth cues.
Kraft, James M; Maloney, Shannon I; Brainard, David H
2002-01-01
Two experiments were conducted to study how scene complexity and cues to depth affect human color constancy. Specifically, two levels of scene complexity were compared. The low-complexity scene contained two walls with the same surface reflectance and a test patch which provided no information about the illuminant. In addition to the surfaces visible in the low-complexity scene, the high-complexity scene contained two rectangular solid objects and 24 paper samples with diverse surface reflectances. Observers viewed illuminated objects in an experimental chamber and adjusted the test patch until it appeared achromatic. Achromatic settings made under two different illuminants were used to compute an index that quantified the degree of constancy. Two experiments were conducted: one in which observers viewed the stimuli directly, and one in which they viewed the scenes through an optical system that reduced cues to depth. In each experiment, constancy was assessed for two conditions. In the valid-cue condition, many cues provided valid information about the illuminant change. In the invalid-cue condition, some image cues provided invalid information. Four broad conclusions are drawn from the data: (a) constancy is generally better in the valid-cue condition than in the invalid-cue condition; (b) for the stimulus configuration used, increasing image complexity has little effect in the valid-cue condition but leads to increased constancy in the invalid-cue condition; (c) for the stimulus configuration used, reducing cues to depth has little effect for either constancy condition; and (d) there is moderate individual variation in the degree of constancy exhibited, particularly in the degree to which the complexity manipulation affects performance.
Observers' cognitive states modulate how visual inputs relate to gaze control.
Kardan, Omid; Henderson, John M; Yourganov, Grigori; Berman, Marc G
2016-09-01
Previous research has shown that eye-movements change depending on both the visual features of our environment, and the viewer's top-down knowledge. One important question that is unclear is the degree to which the visual goals of the viewer modulate how visual features of scenes guide eye-movements. Here, we propose a systematic framework to investigate this question. In our study, participants performed 3 different visual tasks on 135 scenes: search, memorization, and aesthetic judgment, while their eye-movements were tracked. Canonical correlation analyses showed that eye-movements were reliably more related to low-level visual features at fixations during the visual search task compared to the aesthetic judgment and scene memorization tasks. Different visual features also had different relevance to eye-movements between tasks. This modulation of the relationship between visual features and eye-movements by task was also demonstrated with classification analyses, where classifiers were trained to predict the viewing task based on eye movements and visual features at fixations. Feature loadings showed that the visual features at fixations could signal task differences independent of temporal and spatial properties of eye-movements. When classifying across participants, edge density and saliency at fixations were as important as eye-movements in the successful prediction of task, with entropy and hue also being significant, but with smaller effect sizes. When classifying within participants, brightness and saturation were also significant contributors. Canonical correlation and classification results, together with a test of moderation versus mediation, suggest that the cognitive state of the observer moderates the relationship between stimulus-driven visual features and eye-movements. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Gestalt-like constraints produce veridical (Euclidean) percepts of 3D indoor scenes
Kwon, TaeKyu; Li, Yunfeng; Sawada, Tadamasa; Pizlo, Zygmunt
2015-01-01
This study, which was strongly influenced by Gestalt ideas, extends our prior work on the role of a priori constraints in the veridical perception of 3D shapes to the perception of 3D scenes. Our experiments tested how human subjects perceive the layout of a naturally-illuminated indoor scene that contains common symmetrical 3D objects standing on a horizontal floor. In one task, the subject was asked to draw a top view of a scene that was viewed either monocularly or binocularly. The top views the subjects reconstructed were configured accurately except for their overall size. These size errors varied from trial to trial, and were shown most likely to result from the presence of a response bias. There was little, if any, evidence of systematic distortions of the subjects’ perceived visual space, the kind of distortions that have been reported in numerous experiments run under very unnatural conditions. This shown, we proceeded to use Foley’s (Vision Research 12 (1972) 323–332) isosceles right triangle experiment to test the intrinsic geometry of visual space directly. This was done with natural viewing, with the impoverished viewing conditions Foley had used, as well as with a number of intermediate viewing conditions. Our subjects produced very accurate triangles when the viewing conditions were natural, but their performance deteriorated systematically as the viewing conditions were progressively impoverished. Their perception of visual space became more compressed as their natural visual environment was degraded. Once this was shown, we developed a computational model that emulated the most salient features of our psychophysical results. We concluded that human observers see 3D scenes veridically when they view natural 3D objects within natural 3D environments. PMID:26525845
Causal Inference for Spatial Constancy across Saccades
Atsma, Jeroen; Maij, Femke; Koppen, Mathieu; Irwin, David E.; Medendorp, W. Pieter
2016-01-01
Our ability to interact with the environment hinges on creating a stable visual world despite the continuous changes in retinal input. To achieve visual stability, the brain must distinguish the retinal image shifts caused by eye movements and shifts due to movements of the visual scene. This process appears not to be flawless: during saccades, we often fail to detect whether visual objects remain stable or move, which is called saccadic suppression of displacement (SSD). How does the brain evaluate the memorized information of the presaccadic scene and the actual visual feedback of the postsaccadic visual scene in the computations for visual stability? Using a SSD task, we test how participants localize the presaccadic position of the fixation target, the saccade target or a peripheral non-foveated target that was displaced parallel or orthogonal during a horizontal saccade, and subsequently viewed for three different durations. Results showed different localization errors of the three targets, depending on the viewing time of the postsaccadic stimulus and its spatial separation from the presaccadic location. We modeled the data through a Bayesian causal inference mechanism, in which at the trial level an optimal mixing of two possible strategies, integration vs. separation of the presaccadic memory and the postsaccadic sensory signals, is applied. Fits of this model generally outperformed other plausible decision strategies for producing SSD. Our findings suggest that humans exploit a Bayesian inference process with two causal structures to mediate visual stability. PMID:26967730
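The mixing of integration and separation strategies described in this abstract can be sketched with the textbook Gaussian form of Bayesian causal inference. Everything specific below is an assumption made for illustration: the function name `causal_inference_estimate`, the one-dimensional Gaussian noise and prior, and the use of model averaging weighted by the posterior probability of a common cause. The abstract does not give the authors' actual model equations or parameters.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def causal_inference_estimate(x_mem, x_vis, var_mem, var_vis,
                              var_prior, p_common, mu_prior=0.0):
    """Return (posterior probability of a common cause, position estimate).

    x_mem: noisy presaccadic memory signal; x_vis: noisy postsaccadic visual
    signal (hypothetical names). The estimate targets the presaccadic position.
    """
    # Likelihood of both signals if one location caused them (integration).
    denom = var_mem * var_vis + var_mem * var_prior + var_vis * var_prior
    quad = ((x_mem - x_vis) ** 2 * var_prior
            + (x_mem - mu_prior) ** 2 * var_vis
            + (x_vis - mu_prior) ** 2 * var_mem)
    like_common = math.exp(-0.5 * quad / denom) / (2 * math.pi * math.sqrt(denom))

    # Likelihood if two independent locations caused them (separation).
    like_separate = (normal_pdf(x_mem, mu_prior, var_mem + var_prior)
                     * normal_pdf(x_vis, mu_prior, var_vis + var_prior))

    post_common = (like_common * p_common
                   / (like_common * p_common + like_separate * (1 - p_common)))

    # Reliability-weighted estimates under each causal structure.
    w_mem, w_vis, w_prior = 1 / var_mem, 1 / var_vis, 1 / var_prior
    integrated = ((w_mem * x_mem + w_vis * x_vis + w_prior * mu_prior)
                  / (w_mem + w_vis + w_prior))
    separate = (w_mem * x_mem + w_prior * mu_prior) / (w_mem + w_prior)

    # Model averaging: weight each estimate by its posterior probability.
    estimate = post_common * integrated + (1 - post_common) * separate
    return post_common, estimate

# Small displacement: signals are mostly integrated.
p_near, est_near = causal_inference_estimate(1.0, 1.2, 1.0, 1.0, 100.0, 0.5)
# Large displacement: signals are mostly kept separate.
p_far, est_far = causal_inference_estimate(1.0, 9.0, 1.0, 1.0, 100.0, 0.5)
```

With these illustrative numbers, a small offset between the two signals yields a high common-cause probability and an estimate pulled toward the postsaccadic signal, whereas a large offset drives the probability toward zero and leaves the estimate close to the memory signal.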
Herbort, Maike C.; Iseev, Jenny; Stolz, Christopher; Roeser, Benedict; Großkopf, Nora; Wüstenberg, Torsten; Hellweg, Rainer; Walter, Henrik; Dziobek, Isabel; Schott, Björn H.
2016-01-01
We present the ToMenovela, a stimulus set that has been developed to provide a set of normatively rated socio-emotional stimuli showing varying numbers of characters in emotionally laden interactions for experimental investigations of (i) cognitive and (ii) affective Theory of Mind (ToM), (iii) emotional reactivity, and (iv) complex emotion judgment with respect to Ekman’s basic emotions (happiness, anger, disgust, fear, sadness, surprise, Ekman and Friesen, 1975). Stimuli were generated with a focus on ecological validity and consist of 190 scenes depicting daily-life situations. Two or more of eight main characters with distinct biographies and personalities are depicted in each scene picture. To obtain an initial evaluation of the stimulus set and to pave the way for future studies in clinical populations, normative data on each stimulus of the set were obtained from a sample of 61 neurologically and psychiatrically healthy participants (31 female, 30 male; mean age 26.74 ± 5.84), including a visual analog scale rating of Ekman’s basic emotions (happiness, anger, disgust, fear, sadness, surprise) and free-text descriptions of the content of each scene. The ToMenovela is being developed to provide standardized material of social scenes that is available to researchers in the study of social cognition. It should facilitate experimental control while keeping ecological validity high. PMID:27994562
Ganesh, Attigodu Chandrashekara; Berthommier, Frédéric; Schwartz, Jean-Luc
2016-01-01
We introduce "Audio-Visual Speech Scene Analysis" (AVSSA) as an extension of the two-stage Auditory Scene Analysis model towards audiovisual scenes made of mixtures of speakers. AVSSA assumes that a coherence index between the auditory and the visual input is computed prior to audiovisual fusion, making it possible to determine whether the sensory inputs should be bound together. Previous experiments on the modulation of the McGurk effect by audiovisual coherent vs. incoherent contexts presented before the McGurk target have provided experimental evidence supporting AVSSA. Indeed, incoherent contexts appear to decrease the McGurk effect, suggesting that they produce lower audiovisual coherence and hence less audiovisual fusion. The present experiments extend the AVSSA paradigm by creating contexts made of competing audiovisual sources and measuring their effect on McGurk targets. The competing audiovisual sources have respectively a high and a low audiovisual coherence (that is, large vs. small audiovisual comodulations in time). The first experiment involves contexts made of two auditory sources and one video source associated with either the first or the second audio source. It appears that the McGurk effect is smaller after the context made of the visual source associated with the auditory source with less audiovisual coherence. In the second experiment with the same stimuli, the participants are asked to attend to either one or the other source. The data show that the modulation of fusion depends on the attentional focus. Altogether, these two experiments shed light on audiovisual binding, the AVSSA process and the role of attention.
Barrès, Victor; Lee, Jinyong
2014-01-01
How does the language system coordinate with our visual system to yield flexible integration of linguistic, perceptual, and world-knowledge information when we communicate about the world we perceive? Schema theory is a computational framework that allows the simulation of perceptuo-motor coordination programs on the basis of known brain operating principles such as cooperative computation and distributed processing. We present first its application to a model of language production, SemRep/TCG, which combines a semantic representation of visual scenes (SemRep) with Template Construction Grammar (TCG) as a means to generate verbal descriptions of a scene from its associated SemRep graph. SemRep/TCG combines the neurocomputational framework of schema theory with the representational format of construction grammar in a model linking eye-tracking data to visual scene descriptions. We then offer a conceptual extension of TCG to include language comprehension and address data on the role of both world knowledge and grammatical semantics in the comprehension performances of agrammatic aphasic patients. This extension introduces a distinction between heavy and light semantics. The TCG model of language comprehension offers a computational framework to quantitatively analyze the distributed dynamics of language processes, focusing on the interactions between grammatical, world knowledge, and visual information. In particular, it reveals interesting implications for the understanding of the various patterns of comprehension performances of agrammatic aphasics measured using sentence-picture matching tasks. This new step in the life cycle of the model serves as a basis for exploring the specific challenges that neurolinguistic computational modeling poses to the neuroinformatics community.
The what, where and how of auditory-object perception.
Bizley, Jennifer K; Cohen, Yale E
2013-10-01
The fundamental perceptual unit in hearing is the 'auditory object'. Similar to visual objects, auditory objects are the computational result of the auditory system's capacity to detect, extract, segregate and group spectrotemporal regularities in the acoustic environment; the multitude of acoustic stimuli around us together form the auditory scene. However, unlike the visual scene, resolving the component objects within the auditory scene crucially depends on their temporal structure. Neural correlates of auditory objects are found throughout the auditory system. However, neural responses do not become correlated with a listener's perceptual reports until the level of the cortex. The roles of different neural structures and the contribution of different cognitive states to the perception of auditory objects are not yet fully understood.
The what, where and how of auditory-object perception
Bizley, Jennifer K.; Cohen, Yale E.
2014-01-01
The fundamental perceptual unit in hearing is the ‘auditory object’. Similar to visual objects, auditory objects are the computational result of the auditory system's capacity to detect, extract, segregate and group spectrotemporal regularities in the acoustic environment; the multitude of acoustic stimuli around us together form the auditory scene. However, unlike the visual scene, resolving the component objects within the auditory scene crucially depends on their temporal structure. Neural correlates of auditory objects are found throughout the auditory system. However, neural responses do not become correlated with a listener's perceptual reports until the level of the cortex. The roles of different neural structures and the contribution of different cognitive states to the perception of auditory objects are not yet fully understood. PMID:24052177
Kobayashi, Yasutaka; Muramatsu, Tomoko; Sato, Mamiko; Hayashi, Hiromi; Miura, Toyoaki
2015-01-01
A 68-year-old man was admitted to our hospital for rehabilitation of topographical disorientation. Brain magnetic resonance imaging revealed infarction in the right medial side of the occipital lobe. On neuropsychological testing, he scored low for the visual information-processing task; however, his overall cognitive function was retained. He could identify parts of the picture while describing the context picture of the Visual Perception Test for Agnosia but could not explain the contents of the entire picture, representing so-called simultanagnosia. Further, he could morphologically perceive both familiar and new scenes, but could not identify them, representing so-called scene agnosia. We report this case because simultanagnosia associated with a right occipital lobe lesion is rare.
Cichy, Radoslaw Martin; Khosla, Aditya; Pantazis, Dimitrios; Oliva, Aude
2017-01-01
Human scene recognition is a rapid multistep process evolving over time from single scene image to spatial layout processing. We used multivariate pattern analyses on magnetoencephalography (MEG) data to unravel the time course of this cortical process. Following an early signal for lower-level visual analysis of single scenes at ~100 ms, we found a marker of real-world scene size, i.e. spatial layout processing, at ~250 ms indexing neural representations robust to changes in unrelated scene properties and viewing conditions. For a quantitative model of how scene size representations may arise in the brain, we compared MEG data to a deep neural network model trained on scene classification. Representations of scene size emerged intrinsically in the model, and resolved emerging neural scene size representation. Together our data provide a first description of an electrophysiological signal for layout processing in humans, and suggest that deep neural networks are a promising framework to investigate how spatial layout representations emerge in the human brain. PMID:27039703
Moors, Pieter; Boelens, David; van Overwalle, Jaana; Wagemans, Johan
2016-07-01
A recent study showed that scenes with an object-background relationship that is semantically incongruent break interocular suppression faster than scenes with a semantically congruent relationship. These results implied that semantic relations between the objects and the background of a scene could be extracted in the absence of visual awareness of the stimulus. In the current study, we assessed the replicability of this finding and tried to rule out an alternative explanation dependent on low-level differences between the stimuli. Furthermore, we used a Bayesian analysis to quantify the evidence in favor of the presence or absence of a scene-congruency effect. Across three experiments, we found no convincing evidence for a scene-congruency effect or a modulation of scene congruency by scene inversion. These findings question the generalizability of previous observations and cast doubt on whether genuine semantic processing of object-background relationships in scenes can manifest during interocular suppression. © The Author(s) 2016.
Feature-based attentional modulations in the absence of direct visual stimulation.
Serences, John T; Boynton, Geoffrey M
2007-07-19
When faced with a crowded visual scene, observers must selectively attend to behaviorally relevant objects to avoid sensory overload. Often this selection process is guided by prior knowledge of a target-defining feature (e.g., the color red when looking for an apple), which enhances the firing rate of visual neurons that are selective for the attended feature. Here, we used functional magnetic resonance imaging and a pattern classification algorithm to predict the attentional state of human observers as they monitored a visual feature (one of two directions of motion). We find that feature-specific attention effects spread across the visual field, even to regions of the scene that do not contain a stimulus. This spread of feature-based attention to empty regions of space may facilitate the perception of behaviorally relevant stimuli by increasing sensitivity to attended features at all locations in the visual field.
Top-down visual search in Wimmelbild
NASA Astrophysics Data System (ADS)
Bergbauer, Julia; Tari, Sibel
2013-03-01
Wimmelbild, which means "teeming figure picture", is a popular genre of visual puzzles. Abundant masses of small figures are brought together in complex arrangements to make one scene in a Wimmelbild. It is a picture hunt game. We discuss what type of computations/processes could possibly underlie the discovery of figures that are hidden due to a distractive influence of the context. One thing is certain: the processes are unlikely to be purely bottom-up. One possibility is to re-arrange parts and see what happens. As this idea is linked to creativity, there are abundant examples of unconventional part re-organization in modern art. A second possibility is to define what to look for, that is, to formulate the search as a top-down process. We address top-down visual search in Wimmelbild with the help of diffuse distance and curvature coding fields.
A Neurobehavioral Model of Flexible Spatial Language Behaviors
Lipinski, John; Schneegans, Sebastian; Sandamirskaya, Yulia; Spencer, John P.; Schöner, Gregor
2012-01-01
We propose a neural dynamic model that specifies how low-level visual processes can be integrated with higher level cognition to achieve flexible spatial language behaviors. This model uses real-world visual input that is linked to relational spatial descriptions through a neural mechanism for reference frame transformations. We demonstrate that the system can extract spatial relations from visual scenes, select items based on relational spatial descriptions, and perform reference object selection in a single unified architecture. We further show that the performance of the system is consistent with behavioral data in humans by simulating results from 2 independent empirical studies, 1 spatial term rating task and 1 study of reference object selection behavior. The architecture we present thereby achieves a high degree of task flexibility under realistic stimulus conditions. At the same time, it also provides a detailed neural grounding for complex behavioral and cognitive processes. PMID:21517224
NASA Technical Reports Server (NTRS)
Chase, W. D.
1975-01-01
The calligraphic chromatic projector described was developed to improve the perceived realism of visual scene simulation ('out-the-window visuals'). The optical arrangement of the projector is illustrated and discussed. The device permits drawing 2000 vectors in as many as 500 colors, all above critical flicker frequencies, and use of high scene resolution and brightness at an acceptable level to the pilot, with the maximum system capabilities of 1000 lines and 1000 fL. The device for generating the colors is discussed, along with an experiment conducted to demonstrate potential improvements in performance and pilot opinion. Current research work and future research plans are noted.
The Orbital Maneuvering Vehicle Training Facility visual system concept
NASA Technical Reports Server (NTRS)
Williams, Keith
1989-01-01
The purpose of the Orbital Maneuvering Vehicle (OMV) Training Facility (OTF) is to provide effective training for OMV pilots. A critical part of the training environment is the Visual System, which will simulate the video scenes produced by the OMV Closed-Circuit Television (CCTV) system. The simulation will include camera models, dynamic target models, moving appendages, and scene degradation due to the compression/decompression of video signal. Video system malfunctions will also be provided to ensure that the pilot is ready to meet all challenges the real-world might provide. One possible visual system configuration for the training facility that will meet existing requirements is described.
Radiologists remember mountains better than radiographs, or do they?
Evans, Karla K.; Marom, Edith M.; Godoy, Myrna C. B.; Palacio, Diana; Sagebiel, Tara; Cuellar, Sonia Betancourt; McEntee, Mark; Tian, Charles; Brennan, Patrick C.; Haygood, Tamara Miner
2015-01-01
Expertise with encoding material has been shown to aid long-term memory for that material. It is not clear how relevant this expertise is for image memorability (e.g., radiologists’ memory for radiographs), and how robust it is over time. In two studies, we tested scene memory using a standard long-term memory paradigm. One compared the performance of radiologists to naïve observers on two image sets, chest radiographs and everyday scenes, and the other compared radiologists’ memory with immediate as opposed to delayed recognition tests using musculoskeletal radiographs and forest scenes. Radiologists’ memory was better than novices’ for images of expertise but no different for everyday scenes. With the heterogeneity of image sets equated, radiologists’ expertise with radiographs afforded them better memory for the musculoskeletal radiographs than forest scenes. Enhanced memory for images of expertise disappeared over time, resulting in chance level performance for both image sets after weeks of delay. Expertise with the material is important for visual memorability but not to the same extent as idiosyncratic detail and variability of the image set. Similar memory decline with time for images of expertise as for everyday scenes further suggests that extended familiarity with an image is not a robust factor for visual memorability. PMID:26870748
Basic level scene understanding: categories, attributes and structures
Xiao, Jianxiong; Hays, James; Russell, Bryan C.; Patterson, Genevieve; Ehinger, Krista A.; Torralba, Antonio; Oliva, Aude
2013-01-01
A longstanding goal of computer vision is to build a system that can automatically understand a 3D scene from a single image. This requires extracting semantic concepts and 3D information from 2D images which can depict an enormous variety of environments that comprise our visual world. This paper summarizes our recent efforts toward these goals. First, we describe the richly annotated SUN database which is a collection of annotated images spanning 908 different scene categories with object, attribute, and geometric labels for many scenes. This database allows us to systematically study the space of scenes and to establish a benchmark for scene and object recognition. We augment the categorical SUN database with 102 scene attributes for every image and explore attribute recognition. Finally, we present an integrated system to extract the 3D structure of the scene and objects depicted in an image. PMID:24009590
Foulsham, Tom; Alan, Rana; Kingstone, Alan
2011-10-01
Previous research has demonstrated that search and memory for items within natural scenes can be disrupted by "scrambling" the images. In the present study, we asked how disrupting the structure of a scene through scrambling might affect the control of eye fixations in either a search task (Experiment 1) or a memory task (Experiment 2). We found that the search decrement in scrambled scenes was associated with poorer guidance of the eyes to the target. Across both tasks, scrambling led to shorter fixations and longer saccades, and more distributed, less selective overt attention, perhaps corresponding to an ambient mode of processing. These results confirm that scene structure has widespread effects on the guidance of eye movements in scenes. Furthermore, the results demonstrate the trade-off between scene structure and visual saliency, with saliency having more of an effect on eye guidance in scrambled scenes.
The Characteristics and Limits of Rapid Visual Categorization
Fabre-Thorpe, Michèle
2011-01-01
Visual categorization appears both effortless and virtually instantaneous. The study by Thorpe et al. (1996) was the first to estimate the processing time necessary to perform fast visual categorization of animals in briefly flashed (20 ms) natural photographs. They observed a large differential EEG activity between target and distracter correct trials that developed from 150 ms after stimulus onset, a value that was later shown to be even shorter in monkeys! With such strong processing time constraints, it was difficult to escape the conclusion that rapid visual categorization was relying on massively parallel, essentially feed-forward processing of visual information. Since 1996, we have conducted a large number of studies to determine the characteristics and limits of fast visual categorization. The present chapter will review some of the main results obtained. I will argue that rapid object categorizations in natural scenes can be done without focused attention and are most likely based on coarse and unconscious visual representations activated with the first available (magnocellular) visual information. Fast visual processing proved efficient for the categorization of large superordinate object or scene categories, but shows its limits when more detailed basic representations are required. The representations for basic objects (dogs, cars) or scenes (mountain or sea landscapes) need additional processing time to be activated. This finding is at odds with the widely accepted idea that such basic representations are at the entry level of the system. Interestingly, focused attention is still not required to perform these time consuming basic categorizations. Finally we will show that object and context processing can interact very early in an ascending wave of visual information processing. We will discuss how such data could result from our experience with a highly structured and predictable surrounding world that shaped neuronal visual selectivity. PMID:22007180
Keshner, E.A.; Dhaher, Y.
2008-01-01
Multiplanar environmental motion could generate head instability, particularly if the visual surround moves in planes orthogonal to a physical disturbance. We combined sagittal plane surface translations with visual field disturbances in 12 healthy (29–31 years) and 3 visually sensitive (27–57 years) adults. Center of pressure (COP), peak head angles, and RMS values of head motion were calculated and a 3-dimensional model of joint motion was developed to examine gross head motion in 3 planes. We found that subjects standing quietly in front of a visual scene translating in the sagittal plane produced significantly greater (p<0.003) head motion in yaw than when on a translating platform. However, when the platform was translated in the dark or with a visual scene rotating in roll, head motion orthogonal to the plane of platform motion significantly increased (p<0.02). Visually sensitive subjects having no history of vestibular disorder produced large, delayed compensatory head motion. Orthogonal head motions were significantly greater in visually sensitive than in healthy subjects in the dark (p<0.05) and with a stationary scene (p<0.01). We concluded that motion of the visual field can modify compensatory response kinematics of a freely moving head in planes orthogonal to the direction of a physical perturbation. These results suggest that the mechanisms controlling head orientation in space are distinct from those that control trunk orientation in space. These behaviors would have been missed if only COP data were considered. Data suggest that rehabilitation training can be enhanced by combining visual and mechanical perturbation paradigms. PMID:18162402
The Deployment of Visual Attention
2006-03-01
targets: Evidence for memory-based control of attention. Psychonomic Bulletin & Review, 11(1), 71-76. Torralba, A. (2003). Modeling global scene...S., Fencsik, D. E., Tran, L., & Wolfe, J. M. (in press). How do we track invisible objects? Psychonomic Bulletin & Review. *Horowitz, T. S. (in press
Do Visual Illusions Probe the Visual Brain?: Illusions in Action without a Dorsal Visual Stream
ERIC Educational Resources Information Center
Coello, Yann; Danckert, James; Blangero, Annabelle; Rossetti, Yves
2007-01-01
Visual illusions have been shown to affect perceptual judgements more so than motor behaviour, which was interpreted as evidence for a functional division of labour within the visual system. The dominant perception-action theory argues that perception involves a holistic processing of visual objects or scenes, performed within the ventral,…
Impact of age-related macular degeneration on object searches in realistic panoramic scenes.
Thibaut, Miguel; Tran, Thi-Ha-Chau; Szaffarczyk, Sebastien; Boucart, Muriel
2018-05-01
This study investigated whether realistic immersive conditions with dynamic indoor scenes presented on a large, hemispheric panoramic screen covering 180° of the visual field improved the visual search abilities of participants with age-related macular degeneration (AMD). Twenty-one participants with AMD, 16 age-matched controls and 16 young observers were included. Realistic indoor scenes were presented on a panoramic five metre diameter screen. Twelve different objects were used as targets. The participants were asked to search for a target object, shown on paper before each trial, within a room composed of various objects. A joystick was used for navigation within the scene views. A target object was present in 24 trials and absent in 24 trials. The percentage of correct detection of the target, the percentage of false alarms (that is, the detection of the target when it was absent), the number of scene views explored and the search time were measured. The search time was slower for participants with AMD than for the age-matched controls, who in turn were slower than the young participants. The participants with AMD were able to accomplish the task with a performance of 75 per cent correct detections. This was slightly lower than older controls (79.2 per cent) while young controls were at ceiling (91.7 per cent). Errors were mainly due to false alarms resulting from confusion between the target object and another object present in the scene in the target-absent trials. The outcomes of the present study indicate that, under realistic conditions, although slower than age-matched, normally sighted controls, participants with AMD were able to accomplish visual searches of objects with high accuracy. © 2017 Optometry Australia.
Interactive MPEG-4 low-bit-rate speech/audio transmission over the Internet
NASA Astrophysics Data System (ADS)
Liu, Fang; Kim, JongWon; Kuo, C.-C. Jay
1999-11-01
The recently developed MPEG-4 technology enables the coding and transmission of natural and synthetic audio-visual data in the form of objects. In an effort to extend the object-based functionality of MPEG-4 to real-time Internet applications, architectural prototypes of a multiplex layer and a transport layer tailored for transmission of MPEG-4 data over IP are under debate among the Internet Engineering Task Force (IETF) and the MPEG-4 Systems Ad Hoc group. In this paper, we present an architecture for an interactive MPEG-4 speech/audio transmission system over the Internet. It utilizes a framework of Real Time Streaming Protocol (RTSP) over Real-time Transport Protocol (RTP) to provide controlled, on-demand delivery of real-time speech/audio data. Based on a client-server model, a couple of low bit-rate bit streams (real-time speech/audio, pre-encoded speech/audio) are multiplexed and transmitted via a single RTP channel to the receiver. The MPEG-4 Scene Description (SD) and Object Descriptor (OD) bit streams are securely sent through the RTSP control channel. Upon reception, an initial MPEG-4 audio-visual scene is constructed after de-multiplexing, decoding of bit streams, and scene composition. A receiver is allowed to manipulate the initial audio-visual scene presentation locally, or interactively arrange scene changes by sending requests to the server. A server may also choose to update the client with new streams and a list of contents for user selection.
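As a rough illustration of carrying more than one stream over a single RTP channel, the sketch below packs and parses the 12-byte fixed RTP header defined in RFC 3550. The one-byte stream tag in front of the payload is a hypothetical multiplexing convention invented for this example, not the scheme used in the paper, and the payload type 96 is simply a value from the dynamic range.

```python
import struct

RTP_VERSION = 2
RTP_HEADER_FORMAT = "!BBHII"  # network byte order, 12 bytes in total


def build_rtp_packet(payload, seq, timestamp, ssrc, payload_type=96,
                     marker=False):
    """Prepend a fixed RTP header (no padding, extension, or CSRC list)."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(RTP_HEADER_FORMAT, byte0, byte1, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def parse_rtp_packet(packet):
    """Split a packet built by build_rtp_packet into header fields and payload."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack(
        RTP_HEADER_FORMAT, packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[12:],
    }


def tag_stream(stream_id, data):
    """Hypothetical multiplexing: one leading byte names the source stream."""
    return bytes([stream_id]) + data


def untag_stream(payload):
    """Inverse of tag_stream: return (stream_id, data)."""
    return payload[0], payload[1:]
```

A sender could call `build_rtp_packet(tag_stream(0, frame), seq, ts, ssrc)` for the live stream and use tag 1 for the pre-encoded one; the receiver would then route on the first payload byte after `parse_rtp_packet`.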
Ground-plane influences on size estimation in early visual processing.
Champion, Rebecca A; Warren, Paul A
2010-07-21
Ground-planes have an important influence on the perception of 3D space (Gibson, 1950) and it has been shown that the assumption that a ground-plane is present in the scene plays a role in the perception of object distance (Bruno & Cutting, 1988). Here, we investigate whether this influence is exerted at an early stage of processing, to affect the rapid estimation of 3D size. Participants performed a visual search task in which they searched for a target object that was larger or smaller than distracter objects. Objects were presented against a background that contained either a frontoparallel or slanted 3D surface, defined by texture gradient cues. We measured the effect on search performance of target location within the scene (near vs. far) and how this was influenced by scene orientation (which, e.g., might be consistent with a ground or ceiling plane, etc.). In addition, we investigated how scene orientation interacted with texture gradient information (indicating surface slant), to determine how these separate cues to scene layout were combined. We found that the difference in target detection performance between targets at the front and rear of the simulated scene was maximal when the scene was consistent with a ground-plane - consistent with the use of an elevation cue to object distance. In addition, we found a significant increase in the size of this effect when texture gradient information (indicating surface slant) was present, but no interaction between texture gradient and scene orientation information. We conclude that scene orientation plays an important role in the estimation of 3D size at an early stage of processing, and suggest that elevation information is linearly combined with texture gradient information for the rapid estimation of 3D size. Copyright 2010 Elsevier Ltd. All rights reserved.
A new method for text detection and recognition in indoor scene for assisting blind people
NASA Astrophysics Data System (ADS)
Jabnoun, Hanen; Benzarti, Faouzi; Amiri, Hamid
2017-03-01
Developing assistive systems for handicapped persons has become a challenging task in research projects. Recently, a variety of tools have been designed to help visually impaired or blind people, acting as visual substitution systems. The majority of these tools are based on the conversion of input information into auditory or tactile sensory information. Furthermore, object recognition and text retrieval are exploited in visual substitution systems. Text detection and recognition provide a description of the surrounding environment, so that the blind person can readily recognize the scene. In this work, we aim to introduce a method for detecting and recognizing text in indoor scenes. The process consists of detecting the regions of interest that should contain the text using connected components. Then, text detection is performed by employing image correlation. This component of an assistive system for blind people should be simple, so that users are able to obtain the most informative feedback within the shortest time.
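The region-of-interest step mentioned in this abstract relies on connected components. A minimal sketch of that general technique follows, assuming a binarized image given as a list of rows of 0/1 values and 4-connectivity; the function name and the bounding-box output are illustrative choices, not details taken from the paper.

```python
from collections import deque


def connected_component_boxes(binary):
    """Return a bounding box (top, left, bottom, right) per foreground blob.

    binary is a list of equal-length rows of 0/1 values; pixels are joined
    when they touch horizontally or vertically (4-connectivity).
    """
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes
```

Candidate text regions could then be kept or discarded by box size or aspect ratio before any correlation-based matching is attempted.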
Vestibular nuclei and cerebellum put visual gravitational motion in context.
Miller, William L; Maffei, Vincenzo; Bosco, Gianfranco; Iosa, Marco; Zago, Myrka; Macaluso, Emiliano; Lacquaniti, Francesco
2008-04-01
Animal survival in the forest, and human success on the sports field, often depend on the ability to seize a target on the fly. All bodies fall at the same rate in the gravitational field, but the corresponding retinal motion varies with apparent viewing distance. How then does the brain predict time-to-collision under gravity? A perspective context from natural or pictorial settings might afford accurate predictions of gravity's effects via the recovery of an environmental reference from the scene structure. We report that embedding motion in a pictorial scene facilitates interception of gravitational acceleration over unnatural acceleration, whereas a blank scene eliminates such bias. Functional magnetic resonance imaging (fMRI) revealed blood-oxygen-level-dependent correlates of these visual context effects on gravitational motion processing in the vestibular nuclei and posterior cerebellar vermis. Our results suggest an early stage of integration of high-level visual analysis with gravity-related motion information, which may represent the substrate for perceptual constancy of ubiquitous gravitational motion.
Visibility Equalizer Cutaway Visualization of Mesoscopic Biological Models.
Le Muzic, M; Mindek, P; Sorger, J; Autin, L; Goodsell, D; Viola, I
2016-06-01
In scientific illustrations and visualization, cutaway views are often employed as an effective technique for occlusion management in densely packed scenes. We propose a novel method for authoring cutaway illustrations of mesoscopic biological models. In contrast to the existing cutaway algorithms, we take advantage of the specific nature of the biological models. These models consist of thousands of instances with a comparably smaller number of different types. Our method constitutes a two-stage process. In the first step, clipping objects are placed in the scene, creating a cutaway visualization of the model. During this process, a hierarchical list of stacked bars informs the user about the instance visibility distribution of each individual molecular type in the scene. In the second step, the visibility of each molecular type is fine-tuned through these bars, which at this point act as interactive visibility equalizers. An evaluation of our technique with domain experts confirmed that our equalizer-based approach for visibility specification was valuable and effective for both scientific and educational purposes.
Visibility Equalizer Cutaway Visualization of Mesoscopic Biological Models
Le Muzic, M.; Mindek, P.; Sorger, J.; Autin, L.; Goodsell, D.; Viola, I.
2017-01-01
In scientific illustrations and visualization, cutaway views are often employed as an effective technique for occlusion management in densely packed scenes. We propose a novel method for authoring cutaway illustrations of mesoscopic biological models. In contrast to the existing cutaway algorithms, we take advantage of the specific nature of the biological models. These models consist of thousands of instances with a comparatively smaller number of different types. Our method constitutes a two-stage process. In the first step, clipping objects are placed in the scene, creating a cutaway visualization of the model. During this process, a hierarchical list of stacked bars informs the user about the instance visibility distribution of each individual molecular type in the scene. In the second step, the visibility of each molecular type is fine-tuned through these bars, which at this point act as interactive visibility equalizers. An evaluation of our technique with domain experts confirmed that our equalizer-based approach for visibility specification was valuable and effective for both scientific and educational purposes. PMID:28344374
Adhikarla, Vamsi Kiran; Sodnik, Jaka; Szolgay, Peter; Jakus, Grega
2015-04-14
This paper reports on the design and evaluation of direct 3D gesture interaction with a full horizontal parallax light field display. A light field display defines a visual scene using directional light beams emitted from multiple light sources as if they are emitted from scene points. Each scene point is rendered individually, resulting in more realistic and accurate 3D visualization compared to other 3D displaying technologies. We propose an interaction setup combining the visualization of objects within the Field Of View (FOV) of a light field display and their selection through freehand gestures tracked by the Leap Motion Controller. The accuracy and usefulness of the proposed interaction setup were also evaluated in a user study with test subjects. The results of the study revealed high user preference for freehand interaction with the light field display as well as relatively low cognitive demand of this technique. Further, our results also revealed some limitations of the proposed setup and adjustments to be addressed in future work.
Active sensing in the categorization of visual patterns
Yang, Scott Cheng-Hsin; Lengyel, Máté; Wolpert, Daniel M
2016-01-01
Interpreting visual scenes typically requires us to accumulate information from multiple locations in a scene. Using a novel gaze-contingent paradigm in a visual categorization task, we show that participants' scan paths follow an active sensing strategy that incorporates information already acquired about the scene and knowledge of the statistical structure of patterns. Intriguingly, categorization performance was markedly improved when locations were revealed to participants by an optimal Bayesian active sensor algorithm. By using a combination of a Bayesian ideal observer and the active sensor algorithm, we estimate that a major portion of this apparent suboptimality of fixation locations arises from prior biases, perceptual noise and inaccuracies in eye movements, and the central process of selecting fixation locations is around 70% efficient in our task. Our results suggest that participants select eye movements with the goal of maximizing information about abstract categories that require the integration of information from multiple locations. DOI: http://dx.doi.org/10.7554/eLife.12215.001 PMID:26880546
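The idea of an active sensor that picks the most informative next location can be illustrated with a toy sketch. It assumes two categories and binary observations with known per-location likelihoods; the function names and numbers are hypothetical, and the study's actual algorithm operates on richer pattern statistics than this.

```python
import math


def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def posterior(prior, likelihoods):
    """Bayes update of category probabilities given per-category likelihoods."""
    joint = [p * l for p, l in zip(prior, likelihoods)]
    z = sum(joint)
    return [j / z for j in joint]


def expected_information_gain(prior, p_on):
    """Expected entropy reduction from observing one location.

    p_on[c] is P(observation = 1 | category c) at that location.
    """
    gain = entropy(prior)
    for obs in (1, 0):
        lik = [p if obs == 1 else 1.0 - p for p in p_on]
        p_obs = sum(pr * l for pr, l in zip(prior, lik))
        if p_obs > 0:
            gain -= p_obs * entropy(posterior(prior, lik))
    return gain


def best_location(prior, p_on_by_location):
    """Index of the location with the highest expected information gain."""
    gains = [expected_information_gain(prior, p_on) for p_on in p_on_by_location]
    return max(range(len(gains)), key=gains.__getitem__)


# Three candidate locations: uninformative, highly diagnostic, weakly diagnostic.
prior = [0.5, 0.5]
candidates = [[0.5, 0.5], [0.9, 0.1], [0.6, 0.4]]
choice = best_location(prior, candidates)
```

Comparing human fixation choices against a selector of this kind is one way to express an efficiency figure, although the abstract's 70% estimate comes from the authors' own ideal-observer analysis, not from this sketch.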
Colour agnosia impairs the recognition of natural but not of non-natural scenes.
Nijboer, Tanja C W; Van Der Smagt, Maarten J; Van Zandvoort, Martine J E; De Haan, Edward H F
2007-03-01
Scene recognition can be enhanced by appropriate colour information, yet the level of visual processing at which colour exerts its effects is still unclear. It has been suggested that colour supports low-level sensory processing, while others have claimed that colour information aids semantic categorization and recognition of objects and scenes. We investigated the effect of colour on scene recognition in a case of colour agnosia, M.A.H. In a scene identification task, participants had to name images of natural or non-natural scenes in six different formats. Irrespective of scene format, M.A.H. was much slower on the natural than on the non-natural scenes. As expected, neither M.A.H. nor control participants showed any difference in performance for the non-natural scenes. However, for the natural scenes, appropriate colour facilitated scene recognition in control participants (i.e., shorter reaction times), whereas M.A.H.'s performance did not differ across formats. Our data thus support the hypothesis that the effect of colour occurs at the level of learned associations.
Knoblauch, Andreas; Palm, Günther
2002-09-01
To investigate scene segmentation in the visual system we present a model of two reciprocally connected visual areas using spiking neurons. Area P corresponds to the orientation-selective subsystem of the primary visual cortex, while the central visual area C is modeled as associative memory representing stimulus objects according to Hebbian learning. Without feedback from area C, a single stimulus results in relatively slow and irregular activity, synchronized only for neighboring patches (slow state), while in the complete model activity is faster with an enlarged synchronization range (fast state). When presenting a superposition of several stimulus objects, scene segmentation happens on a time scale of hundreds of milliseconds by alternating epochs of the slow and fast states, where neurons representing the same object are simultaneously in the fast state. Correlation analysis reveals synchronization on different time scales as found in experiments (designated as tower, castle, and hill peaks). On the fast time scale (tower peaks, gamma frequency range), recordings from two sites coding either different or the same object lead to correlograms that are either flat or exhibit oscillatory modulations with a central peak. This is in agreement with experimental findings, whereas standard phase-coding models would predict shifted peaks in the case of different objects.
Lykins, Amy D; Meana, Marta; Kambe, Gretchen
2006-10-01
As a first step in the investigation of the role of visual attention in the processing of erotic stimuli, eye-tracking methodology was employed to measure eye movements during erotic scene presentation. Because eye-tracking is a novel methodology in sexuality research, we attempted to determine whether the eye-tracker could detect differences (should they exist) in visual attention to erotic and non-erotic scenes. A total of 20 men and 20 women were presented with a series of erotic and non-erotic images, and their eye movements were tracked during image presentation. Comparisons between erotic and non-erotic image groups showed significant differences on two of three dependent measures of visual attention (number of fixations and total time) in both men and women. As hypothesized, there was a significant Stimulus x Scene Region interaction, indicating that participants visually attended to the body more in the erotic stimuli than in the non-erotic stimuli, as evidenced by a greater number of fixations and longer total time devoted to that region. These findings provide support for the application of eye-tracking methodology as a measure of visual attentional capture in sexuality research. Future applications of this methodology to expand our knowledge of the role of cognition in sexuality are suggested.
Eye movements, visual search and scene memory, in an immersive virtual environment.
Kit, Dmitry; Katz, Leor; Sullivan, Brian; Snyder, Kat; Ballard, Dana; Hayhoe, Mary
2014-01-01
Visual memory has been demonstrated to play a role in both visual search and attentional prioritization in natural scenes. However, it has been studied predominantly in experimental paradigms using multiple two-dimensional images. Natural experience, however, entails prolonged immersion in a limited number of three-dimensional environments. The goal of the present experiment was to recreate circumstances comparable to natural visual experience in order to evaluate the role of scene memory in guiding eye movements in a natural environment. Subjects performed a continuous visual-search task within an immersive virtual-reality environment over three days. We found that, similar to two-dimensional contexts, viewers rapidly learn the location of objects in the environment over time, and use spatial memory to guide search. Incidental fixations did not provide obvious benefit to subsequent search, suggesting that semantic contextual cues may often be just as efficient, or that many incidentally fixated items are not held in memory in the absence of a specific task. On the third day of the experience in the environment, previous search items changed in color. These items were fixated upon with increased probability relative to control objects, suggesting that memory-guided prioritization (or Surprise) may be a robust mechanism for attracting gaze to novel features of natural environments, in addition to task factors and simple spatial saliency.
The Benefit of Positive Visualization on the U.S. Army
2014-06-13
calm, guided imagery allows individuals to envision what it would be like to be in an ideally peaceful, serene, and comforting scene. Typically, guided imagery is conducted by a qualified mental health specialist hence the term
How many pixels make a memory? Picture memory for small pictures.
Wolfe, Jeremy M; Kuzmova, Yoana I
2011-06-01
Torralba (Visual Neuroscience, 26, 123-131, 2009) showed that, if the resolution of images of scenes were reduced to the information present in very small "thumbnail images," those scenes could still be recognized. The objects in those degraded scenes could be identified, even though it would be impossible to identify them if they were removed from the scene context. Can tiny and/or degraded scenes be remembered, or are they like brief presentations, identified but not remembered? We report that memory for tiny and degraded scenes parallels the recognizability of those scenes. You can remember a scene to approximately the degree to which you can classify it. Interestingly, there is a striking asymmetry in memory when scenes are not the same size on their initial appearance and subsequent test. Memory for a large, full-resolution stimulus can be tested with a small, degraded stimulus. However, memory for a small stimulus is not retrieved when it is tested with a large stimulus.
Integration of nonthematic details in pictures and passages.
Viera, C L; Homa, D L
1991-01-01
Nonthematic details in naturalistic scenes were manipulated to produce four stimulus versions: color photos, black-white copies, and elaborated and unelaborated line drawings (Experiment 1); analogous verbal descriptions of each visual version were produced for Experiment 2. In Experiment 1, two or three different versions of a scene were presented in the mixed condition; the same version of the scene was repeated either two or three times in the same condition, and a 1-presentation control condition was also included. In Experiment 2, the same presentation conditions were used across different groups of subjects who either viewed the pictures or heard the descriptions. An old/new recognition test was given in which the nonstudied versions of the studied items were used as foils. Higher false recognition performances for the mixed condition were found for the visual materials in both experiments, and in the second experiment the verbal materials produced equivalently high levels of false recognition for both same and mixed conditions. Additionally, in Experiment 2 the patterns of performances across material conditions were differentially affected by the manipulation of detail in the four stimulus versions. These differences across materials suggest that the integration of semantically consistent details across temporally separable presentations is facilitated when the stimuli do not provide visual/physical attributes to enhance discrimination of different presentations. Further, the evidence derived from the visual scenes in both experiments indicates that the semantic schema abstracted from a picture is not the sole mediator of recognition performance.
Video content parsing based on combined audio and visual information
NASA Astrophysics Data System (ADS)
Zhang, Tong; Kuo, C.-C. Jay
1999-08-01
While previous research on audiovisual data segmentation and indexing primarily focuses on the pictorial part, significant clues contained in the accompanying audio flow are often ignored. A fully functional system for video content parsing can be achieved more successfully through a proper combination of audio and visual information. By investigating the data structure of different video types, we present tools for both audio and visual content analysis and a scheme for video segmentation and annotation in this research. In the proposed system, video data are segmented into audio scenes and visual shots by detecting abrupt changes in audio and visual features, respectively. Then, the audio scene is categorized and indexed as one of the basic audio types while a visual shot is presented by keyframes and associated image features. An index table is then generated automatically for each video clip based on the integration of outputs from audio and visual analysis. It is shown that the proposed system provides satisfying video indexing results.
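Segmenting a video into shots by detecting abrupt changes in a visual feature can be sketched as follows. The sketch uses gray-level histogram differences with a fixed threshold, a common shot-detection heuristic that is assumed here for illustration; the abstract does not say which visual features or thresholds the authors used. Frames are represented as flat lists of 8-bit intensities, and the list of frames is assumed to be non-empty.

```python
def histogram(frame, bins=8, max_value=256):
    """Normalized gray-level histogram of a frame (flat list of ints in [0, max_value))."""
    counts = [0] * bins
    for px in frame:
        counts[px * bins // max_value] += 1
    n = len(frame)
    return [c / n for c in counts]


def shot_boundaries(frames, threshold=0.5):
    """Indices i at which frame i starts a new shot.

    A boundary is declared when the L1 distance between the histograms of
    consecutive frames exceeds threshold (the distance ranges from 0 to 2).
    """
    cuts = []
    prev = histogram(frames[0])
    for i in range(1, len(frames)):
        cur = histogram(frames[i])
        diff = sum(abs(a - b) for a, b in zip(prev, cur))
        if diff > threshold:
            cuts.append(i)
        prev = cur
    return cuts


# Toy clip: two dark frames, two bright frames, one dark frame.
dark, bright = [10] * 16, [240] * 16
cuts = shot_boundaries([dark, dark, bright, bright, dark])
```

The same change-detection loop could be run over an audio feature (for example short-time energy) to find audio scene boundaries, with the two boundary lists merged into an index afterwards.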
Flow visualization of CFD using graphics workstations
NASA Technical Reports Server (NTRS)
Lasinski, Thomas; Buning, Pieter; Choi, Diana; Rogers, Stuart; Bancroft, Gordon
1987-01-01
High performance graphics workstations are used to visualize the fluid flow dynamics obtained from supercomputer solutions of computational fluid dynamic programs. The visualizations can be done independently on the workstation or while the workstation is connected to the supercomputer in a distributed computing mode. In the distributed mode, the supercomputer interactively performs the computationally intensive graphics rendering tasks while the workstation performs the viewing tasks. A major advantage of the workstations is that the viewers can interactively change their viewing position while watching the dynamics of the flow fields. An overview of the computer hardware and software required to create these displays is presented. For complex scenes the workstation cannot create the displays fast enough for good motion analysis. For these cases, the animation sequences are recorded on video tape or 16 mm film a frame at a time and played back at the desired speed. The additional software and hardware required to create these video tapes or 16 mm movies are also described. Photographs illustrating current visualization techniques are discussed. Examples of the use of the workstations for flow visualization through animation are available on video tape.
Do advertisements at the roadside distract the driver?
NASA Astrophysics Data System (ADS)
Kettwich, Carmen; Klinger, Karsten; Lemmer, Uli
2008-04-01
Nowadays drivers have to cope with an increasingly complex visual environment. More and more cars are on the road. Not only are there distractions within the vehicle, such as the radio and navigation system, but the environment outside the car has also become more and more complex. Hoardings, advertising pillars, shop fronts and video screens are just a few examples. For this reason the potential risk of driver distraction is rising. But in which way do the advertisements at the roadside influence the driver's attention? The investigation described here is devoted to this topic. Various kinds of advertisements played an important role, such as illuminated and non-illuminated posters as well as illuminated animated ads. Several test runs in an urban environment were performed. The gaze direction of the driver's eye was measured with an eye tracking system. The latter consists of three cameras which logged the eye movements during the test run and a small-sized scene camera recording the traffic scene. Sixteen subjects (six female and ten male) between 21 and 65 years of age took part in this experiment. Thus the driver's fixation duration on the different advertisements could be determined.
Infants’ Looking to Surprising Events: When Eye-Tracking Reveals More than Looking Time
Yeung, H. Henny; Denison, Stephanie; Johnson, Scott P.
2016-01-01
Research on infants’ reasoning abilities often relies on looking times, which are longer to surprising and unexpected visual scenes compared to unsurprising and expected ones. Few researchers have examined more precise visual scanning patterns in these scenes, and so, here, we recorded 8- to 11-month-olds’ gaze with an eye tracker as we presented a sampling event whose outcome was either surprising, neutral, or unsurprising: A red (or yellow) ball was drawn from one of three visible containers populated 0%, 50%, or 100% with identically colored balls. When measuring looking time to the whole scene, infants were insensitive to the likelihood of the sampling event, replicating failures in similar paradigms. Nevertheless, a new analysis of visual scanning showed that infants did spend more time fixating specific areas-of-interest as a function of the event likelihood. The drawn ball and its associated container attracted more looking than the other containers in the 0% condition, but this pattern was weaker in the 50% condition, and even less strong in the 100% condition. Results suggest that measuring where infants look may be more sensitive than simply how much looking there is to the whole scene. The advantages of eye tracking measures over traditional looking measures are discussed. PMID:27926920
Visuo-Haptic Mixed Reality with Unobstructed Tool-Hand Integration.
Cosco, Francesco; Garre, Carlos; Bruno, Fabio; Muzzupappa, Maurizio; Otaduy, Miguel A
2013-01-01
Visuo-haptic mixed reality consists of adding to a real scene the ability to see and touch virtual objects. It requires the use of see-through display technology for visually mixing real and virtual objects, and haptic devices for adding haptic interaction with the virtual objects. Unfortunately, the use of commodity haptic devices poses obstruction and misalignment issues that complicate the correct integration of a virtual tool and the user's real hand in the mixed reality scene. In this work, we propose a novel mixed reality paradigm where it is possible to touch and see virtual objects in combination with a real scene, using commodity haptic devices, and with a visually consistent integration of the user's hand and the virtual tool. We discuss the visual obstruction and misalignment issues introduced by commodity haptic devices, and then propose a solution that relies on four simple technical steps: color-based segmentation of the hand, tracking-based segmentation of the haptic device, background repainting using image-based models, and misalignment-free compositing of the user's hand. We have developed a successful proof-of-concept implementation, where a user can touch virtual objects and interact with them in the context of a real scene, and we have evaluated the impact on user performance of obstruction and misalignment correction.
Rapid natural scene categorization in the near absence of attention
Li, Fei Fei; VanRullen, Rufin; Koch, Christof; Perona, Pietro
2002-01-01
What can we see when we do not pay attention? It is well known that we can be “blind” even to major aspects of natural scenes when we attend elsewhere. The only tasks that do not need attention appear to be carried out in the early stages of the visual system. Contrary to this common belief, we report that subjects can rapidly detect animals or vehicles in briefly presented novel natural scenes while simultaneously performing another attentionally demanding task. By comparison, they are unable to discriminate large T's from L's, or bisected two-color disks from their mirror images under the same conditions. We conclude that some visual tasks associated with “high-level” cortical areas may proceed in the near absence of attention. PMID:12077298
Effects of Perceptual and Contextual Enrichment on Visual Confrontation Naming in Adult Aging
Rogalski, Yvonne; Peelle, Jonathan E.; Reilly, Jamie
2013-01-01
Purpose: The purpose of this study was to determine the effects of enriching line drawings with color/texture and environmental context as a facilitator of naming speed and accuracy in older adults. Method: Twenty young and 23 older adults named high-frequency picture stimuli from the Boston Naming Test (Kaplan, Goodglass, & Weintraub, 2001) under three conditions: (a) black-and-white items, (b) colorized-texturized items, and (c) scene-primed colored items (e.g., “hammock” preceded 1,000 ms by a backyard scene). Results: With respect to speeded naming latencies, mixed-model analyses of variance revealed that young adults did not benefit from colorization-texturization but did show scene-priming effects. In contrast, older adults failed to show facilitation effects from either colorized-texturized or scene-primed items. Moreover, older adults were consistently slower to initiate naming than were their younger counterparts across all conditions. Conclusions: Perceptual and contextual enrichment of sparse line drawings does not appear to facilitate visual confrontation naming in older adults, whereas younger adults do tend to show benefits of scene priming. We interpret these findings as generally supportive of a processing speed account of age-related object picture-naming difficulty. PMID:21498581
ERIC Educational Resources Information Center
Dilek, Gulcin
2010-01-01
This study aims to explore the visual thinking skills of some sixth grade (12-13 year-old) primary pupils who created visual interpretations during history courses. Pupils drew pictures describing historical scenes or events based on visual sources. They constructed these illustrations by using visual and written primary and secondary sources in…
Processing of Unattended Emotional Visual Scenes
ERIC Educational Resources Information Center
Calvo, Manuel G.; Nummenmaa, Lauri
2007-01-01
Prime pictures of emotional scenes appeared in parafoveal vision, followed by probe pictures either congruent or incongruent in affective valence. Participants responded whether the probe was pleasant or unpleasant (or whether it portrayed people or animals). Shorter latencies for congruent than for incongruent prime-probe pairs revealed affective…
Virtual reality and 3D animation in forensic visualization.
Ma, Minhua; Zheng, Huiru; Lallie, Harjinder
2010-09-01
Computer-generated three-dimensional (3D) animation is an ideal medium to accurately visualize crime or accident scenes to the viewers and in the courtrooms. Based upon factual data, forensic animations can reproduce the scene and demonstrate the activity at various points in time. The use of computer animation techniques to reconstruct crime scenes is beginning to replace the traditional illustrations, photographs, and verbal descriptions, and is becoming popular in today's forensics. This article integrates work in the areas of 3D graphics, computer vision, motion tracking, natural language processing, and forensic computing, to investigate the state-of-the-art in forensic visualization. It identifies and reviews areas where new applications of 3D digital technologies and artificial intelligence could be used to enhance particular phases of forensic visualization to create 3D models and animations automatically and quickly. Having discussed the relationships between major crime types and level-of-detail in corresponding forensic animations, we recognized that high level-of-detail animation involving human characters, which is appropriate for many major crime types but has had limited use in courtrooms, could be useful for crime investigation. © 2010 American Academy of Forensic Sciences.
Verhoef, Bram-Ernst; Bohon, Kaitlin S.
2015-01-01
Binocular disparity is a powerful depth cue for object perception. The computations for object vision culminate in inferior temporal cortex (IT), but the functional organization for disparity in IT is unknown. Here we addressed this question by measuring fMRI responses in alert monkeys to stimuli that appeared in front of (near), behind (far), or at the fixation plane. We discovered three regions that showed preferential responses for near and far stimuli, relative to zero-disparity stimuli at the fixation plane. These “near/far” disparity-biased regions were located within dorsal IT, as predicted by microelectrode studies, and on the posterior inferotemporal gyrus. In a second analysis, we instead compared responses to near stimuli with responses to far stimuli and discovered a separate network of “near” disparity-biased regions that extended along the crest of the superior temporal sulcus. We also measured in the same animals fMRI responses to faces, scenes, color, and checkerboard annuli at different visual field eccentricities. Disparity-biased regions defined in either analysis did not show a color bias, suggesting that disparity and color contribute to different computations within IT. Scene-biased regions responded preferentially to near and far stimuli (compared with stimuli without disparity) and had a peripheral visual field bias, whereas face patches had a marked near bias and a central visual field bias. These results support the idea that IT is organized by a coarse eccentricity map, and show that disparity likely contributes to computations associated with both central (face processing) and peripheral (scene processing) visual field biases, but likely does not contribute much to computations within IT that are implicated in processing color. PMID:25926470
Scene Recognition for Indoor Localization Using a Multi-Sensor Fusion Approach.
Liu, Mengyun; Chen, Ruizhi; Li, Deren; Chen, Yujin; Guo, Guangyi; Cao, Zhipeng; Pan, Yuanjin
2017-12-08
After decades of research, there is still no solution for indoor localization like the GNSS (Global Navigation Satellite System) solution for outdoor environments. The major reasons for this phenomenon are the complex spatial topology and RF transmission environment. To deal with these problems, an indoor scene constrained method for localization is proposed in this paper, which is inspired by the visual cognition ability of the human brain and the progress in the computer vision field regarding high-level image understanding. Furthermore, a multi-sensor fusion method is implemented on a commercial smartphone including cameras, WiFi and inertial sensors. Compared to former research, the camera on a smartphone is used to "see" which scene the user is in. With this information, a particle filter algorithm constrained by scene information is adopted to determine the final location. For indoor scene recognition, we take advantage of deep learning that has been proven to be highly effective in the computer vision community. For the particle filter, both WiFi and magnetic field signals are used to update the weights of particles. Similar to other fingerprinting localization methods, there are two stages in the proposed system, offline training and online localization. In the offline stage, an indoor scene model is trained by Caffe (one of the most popular open source frameworks for deep learning) and a fingerprint database is constructed by user trajectories in different scenes. To reduce the volume requirement of training data for deep learning, a fine-tuning method is adopted for model training. In the online stage, a camera in a smartphone is used to recognize the initial scene. Then a particle filter algorithm is used to fuse the sensor data and determine the final location. To prove the effectiveness of the proposed method, an Android client and a web server are implemented. The Android client is used to collect data and locate a user. The web server is developed for indoor scene model training and communication with an Android client. To evaluate the performance, comparison experiments are conducted and the results demonstrate that a positioning accuracy of 1.32 m at 95% is achievable with the proposed solution. Both positioning accuracy and robustness are enhanced compared to approaches without scene constraint including commercial products such as IndoorAtlas.
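The weight-update step of a scene-constrained particle filter can be sketched roughly as below. Everything specific here is an assumption for illustration: the hypothetical `scene_of` and `fingerprint` callbacks, the Gaussian measurement model, the noise parameters, and the hard rule that particles outside the recognized scene get zero weight. The abstract only states that scene information constrains the filter and that WiFi and magnetic field signals update the particle weights.

```python
import math


def gaussian_likelihood(measured, expected, sigma):
    """Unnormalized Gaussian likelihood of a measurement."""
    return math.exp(-0.5 * ((measured - expected) / sigma) ** 2)


def update_weights(particles, weights, scene_of, recognized_scene,
                   fingerprint, wifi_rssi, mag_field,
                   sigma_wifi=4.0, sigma_mag=2.0):
    """Reweight particles using the scene constraint plus WiFi and magnetic data.

    scene_of(x, y) -> scene label at a position (hypothetical map lookup).
    fingerprint(x, y) -> (expected RSSI, expected magnetic magnitude).
    """
    new_weights = []
    for (x, y), w in zip(particles, weights):
        if scene_of(x, y) != recognized_scene:
            new_weights.append(0.0)  # assumed hard scene constraint
            continue
        exp_rssi, exp_mag = fingerprint(x, y)
        w *= gaussian_likelihood(wifi_rssi, exp_rssi, sigma_wifi)
        w *= gaussian_likelihood(mag_field, exp_mag, sigma_mag)
        new_weights.append(w)
    total = sum(new_weights)
    if total == 0.0:
        # Degenerate case: fall back to uniform weights.
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in new_weights]


def estimate(particles, weights):
    """Weighted mean position."""
    x = sum(p[0] * w for p, w in zip(particles, weights))
    y = sum(p[1] * w for p, w in zip(particles, weights))
    return x, y
```

A full filter would also include a motion update from the inertial sensors and a resampling step; those are omitted to keep the sketch focused on the fusion of the three information sources.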
Scene Recognition for Indoor Localization Using a Multi-Sensor Fusion Approach
Chen, Ruizhi; Li, Deren; Chen, Yujin; Guo, Guangyi; Cao, Zhipeng
2017-01-01
After decades of research, there is still no solution for indoor localization like the GNSS (Global Navigation Satellite System) solution for outdoor environments. The major reasons for this phenomenon are the complex spatial topology and RF transmission environment. To deal with these problems, an indoor scene constrained method for localization is proposed in this paper, which is inspired by the visual cognition ability of the human brain and the progress in the computer vision field regarding high-level image understanding. Furthermore, a multi-sensor fusion method is implemented on a commercial smartphone including cameras, WiFi and inertial sensors. Compared to former research, the camera on a smartphone is used to “see” which scene the user is in. With this information, a particle filter algorithm constrained by scene information is adopted to determine the final location. For indoor scene recognition, we take advantage of deep learning that has been proven to be highly effective in the computer vision community. For the particle filter, both WiFi and magnetic field signals are used to update the weights of particles. Similar to other fingerprinting localization methods, there are two stages in the proposed system, offline training and online localization. In the offline stage, an indoor scene model is trained by Caffe (one of the most popular open source frameworks for deep learning) and a fingerprint database is constructed by user trajectories in different scenes. To reduce the volume requirement of training data for deep learning, a fine-tuning method is adopted for model training. In the online stage, a camera in a smartphone is used to recognize the initial scene. Then a particle filter algorithm is used to fuse the sensor data and determine the final location. To prove the effectiveness of the proposed method, an Android client and a web server are implemented. The Android client is used to collect data and locate a user. The web server is developed for indoor scene model training and communication with an Android client. To evaluate the performance, comparison experiments are conducted and the results demonstrate that a positioning accuracy of 1.32 m at 95% is achievable with the proposed solution. Both positioning accuracy and robustness are enhanced compared to approaches without scene constraint including commercial products such as IndoorAtlas. PMID:29292761
Foggy perception slows us down.
Pretto, Paolo; Bresciani, Jean-Pierre; Rainer, Gregor; Bülthoff, Heinrich H
2012-10-30
Visual speed is believed to be underestimated at low contrast, which has been proposed as an explanation of excessive driving speed in fog. Combining psychophysics measurements and driving simulation, we confirm that speed is underestimated when contrast is reduced uniformly for all objects of the visual scene independently of their distance from the viewer. However, we show that when contrast is reduced more for distant objects, as is the case in real fog, visual speed is actually overestimated, prompting drivers to decelerate. Using an artificial anti-fog (that is, fog characterized by better visibility for distant than for close objects), we demonstrate for the first time that perceived speed depends on the spatial distribution of contrast over the visual scene rather than the global level of contrast per se. Our results cast new light on how reduced visibility conditions affect perceived speed, providing important insight into the human visual system. DOI: http://dx.doi.org/10.7554/eLife.00031.001
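The three contrast conditions contrasted in this abstract can be expressed as simple distance profiles. The sketch below assumes a Koschmieder-style exponential attenuation for fog, a constant factor for uniform reduction, and a mirrored exponential for the "anti-fog" condition. The extinction coefficient, the uniform factor, and the anti-fog formula are illustrative assumptions, not the stimulus parameters used in the study.

```python
import math


def contrast_profile(distances_m, mode, extinction_per_m=0.02, uniform_factor=0.5):
    """Relative contrast (0..1) of objects at the given distances.

    mode = "uniform":  same reduction at every distance.
    mode = "fog":      contrast falls off exponentially with distance.
    mode = "anti-fog": hypothetical mirror image, near objects are the
                       most attenuated and the farthest object is clearest.
    """
    max_d = max(distances_m)
    profile = []
    for d in distances_m:
        if mode == "uniform":
            c = uniform_factor
        elif mode == "fog":
            c = math.exp(-extinction_per_m * d)
        elif mode == "anti-fog":
            c = math.exp(-extinction_per_m * (max_d - d))
        else:
            raise ValueError(f"unknown mode: {mode}")
        profile.append(c)
    return profile


distances = [5.0, 50.0, 200.0]
uniform = contrast_profile(distances, "uniform")
fog = contrast_profile(distances, "fog")
anti_fog = contrast_profile(distances, "anti-fog")
```

The abstract's claim is about the shape of such a profile: two scenes can have a similar overall contrast level while differing in whether near or far objects carry it, and it is that spatial distribution that reportedly shifts perceived speed.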
Yue, Shigang; Rind, F Claire
2006-05-01
The lobula giant movement detector (LGMD) is an identified neuron in the locust brain that responds most strongly to the images of an approaching object such as a predator. Its computational model can cope with unpredictable environments without using specific object recognition algorithms. In this paper, an LGMD-based neural network is proposed with a new feature enhancement mechanism to enhance the expanded edges of colliding objects via grouped excitation for collision detection with complex backgrounds. The isolated excitation caused by background detail will be filtered out by the new mechanism. Offline tests demonstrated the advantages of the presented LGMD-based neural network in complex backgrounds. Real time robotics experiments using the LGMD-based neural network as the only sensory system showed that the system worked reliably in a wide range of conditions; in particular, the robot was able to navigate in arenas with structured surrounds and complex backgrounds.
Martin Cichy, Radoslaw; Khosla, Aditya; Pantazis, Dimitrios; Oliva, Aude
2017-06-01
Human scene recognition is a rapid multistep process evolving over time from single scene image to spatial layout processing. We used multivariate pattern analyses on magnetoencephalography (MEG) data to unravel the time course of this cortical process. Following an early signal for lower-level visual analysis of single scenes at ~100ms, we found a marker of real-world scene size, i.e. spatial layout processing, at ~250ms indexing neural representations robust to changes in unrelated scene properties and viewing conditions. For a quantitative model of how scene size representations may arise in the brain, we compared MEG data to a deep neural network model trained on scene classification. Representations of scene size emerged intrinsically in the model, and resolved emerging neural scene size representation. Together our data provide a first description of an electrophysiological signal for layout processing in humans, and suggest that deep neural networks are a promising framework to investigate how spatial layout representations emerge in the human brain. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Seek and you shall remember: Scene semantics interact with visual search to build better memories
Draschkow, Dejan; Wolfe, Jeremy M.; Võ, Melissa L.-H.
2014-01-01
Memorizing critical objects and their locations is an essential part of everyday life. In the present study, incidental encoding of objects in naturalistic scenes during search was compared to explicit memorization of those scenes. To investigate if prior knowledge of scene structure influences these two types of encoding differently, we used meaningless arrays of objects as well as objects in real-world, semantically meaningful images. Surprisingly, when participants were asked to recall scenes, their memory performance was markedly better for searched objects than for objects they had explicitly tried to memorize, even though participants in the search condition were not explicitly asked to memorize objects. This finding held true even when objects were observed for an equal amount of time in both conditions. Critically, the recall benefit for searched over memorized objects in scenes was eliminated when objects were presented on uniform, non-scene backgrounds rather than in a full scene context. Thus, scene semantics not only help us search for objects in naturalistic scenes, but appear to produce a representation that supports our memory for those objects beyond intentional memorization. PMID:25015385
Brébion, Gildas; Stephan-Otto, Christian; Usall, Judith; Huerta-Ramos, Elena; Perez del Olmo, Mireia; Cuevas-Esteban, Jorge; Haro, Josep Maria; Ochoa, Susana
2015-09-01
A number of cognitive underpinnings of auditory hallucinations have been established in schizophrenia patients, but few have, as yet, been uncovered for visual hallucinations. In previous research, we unexpectedly observed that auditory hallucinations were associated with poor recognition of color, but not black-and-white (b/w), pictures. In this study, we attempted to replicate and explain this finding. Potential associations with visual hallucinations were explored. B/w and color pictures were presented to 50 schizophrenia patients and 45 healthy individuals under 2 conditions of visual context presentation corresponding to 2 levels of visual encoding complexity. Then, participants had to recognize the target pictures among distractors. Auditory-verbal hallucinations were inversely associated with the recognition of the color pictures presented under the most effortful encoding condition. This association was fully mediated by working-memory span. Visual hallucinations were associated with improved recognition of the color pictures presented under the less effortful condition. Patients suffering from visual hallucinations were not impaired, relative to the healthy participants, in the recognition of these pictures. Decreased working-memory span in patients with auditory-verbal hallucinations might impede the effortful encoding of stimuli. Visual hallucinations might be associated with facilitation in the visual encoding of natural scenes, or with enhanced color perception abilities. (c) 2015 APA, all rights reserved.
Visual memory for moving scenes.
DeLucia, Patricia R; Maldia, Maria M
2006-02-01
In the present study, memory for picture boundaries was measured with scenes that simulated self-motion along the depth axis. The results indicated that boundary extension (a distortion in memory for picture boundaries) occurred with moving scenes in the same manner as that reported previously for static scenes. Furthermore, motion affected memory for the boundaries but this effect of motion was not consistent with representational momentum of the self (memory being further forward in a motion trajectory than actually shown). We also found that memory for the final position of the depicted self in a moving scene was influenced by properties of the optical expansion pattern. The results are consistent with a conceptual framework in which the mechanisms that underlie boundary extension and representational momentum (a) process different information and (b) both contribute to the integration of successive views of a scene while the scene is changing.
Multilevel depth and image fusion for human activity detection.
Ni, Bingbing; Pei, Yong; Moulin, Pierre; Yan, Shuicheng
2013-10-01
Recognizing complex human activities usually requires the detection and modeling of individual visual features and the interactions between them. Current methods only rely on the visual features extracted from 2-D images, and therefore often lead to unreliable salient visual feature detection and inaccurate modeling of the interaction context between individual features. In this paper, we show that these problems can be addressed by combining data from a conventional camera and a depth sensor (e.g., Microsoft Kinect). We propose a novel complex activity recognition and localization framework that effectively fuses information from both grayscale and depth image channels at multiple levels of the video processing pipeline. In the individual visual feature detection level, depth-based filters are applied to the detected human/object rectangles to remove false detections. In the next level of interaction modeling, 3-D spatial and temporal contexts among human subjects or objects are extracted by integrating information from both grayscale and depth images. Depth information is also utilized to distinguish different types of indoor scenes. Finally, a latent structural model is developed to integrate the information from multiple levels of video processing for an activity detection. Extensive experiments on two activity recognition benchmarks (one with depth information) and a challenging grayscale + depth human activity database that contains complex interactions between human-human, human-object, and human-surroundings demonstrate the effectiveness of the proposed multilevel grayscale + depth fusion scheme. Higher recognition and localization accuracies are obtained relative to the previous methods.
Psychophysiological responses and restorative values of wilderness environments
Chun-Yen Chang; Ping-Kun Chen; William E. Hammitt; Lisa Machnik
2007-01-01
Scenes of natural areas were used as stimuli to analyze the psychological and physiological responses of subjects while viewing wildland scenes. Attention Restoration Theory (Kaplan 1995) and theorized components of restorative environments were used as an orientation for selection of the visual stimuli. Conducted in Taiwan, the studies recorded the psychophysiological...
Auditory Memory Distortion for Spoken Prose
ERIC Educational Resources Information Center
Hutchison, Joanna L.; Hubbard, Timothy L.; Ferrandino, Blaise; Brigante, Ryan; Wright, Jamie M.; Rypma, Bart
2012-01-01
Observers often remember a scene as containing information that was not presented but that would have likely been located just beyond the observed boundaries of the scene. This effect is called "boundary extension" (BE; e.g., Intraub & Richardson, 1989). Previous studies have observed BE in memory for visual and haptic stimuli, and…
Semantic Categorization Precedes Affective Evaluation of Visual Scenes
ERIC Educational Resources Information Center
Nummenmaa, Lauri; Hyona, Jukka; Calvo, Manuel G.
2010-01-01
We compared the primacy of affective versus semantic categorization by using forced-choice saccadic and manual response tasks. Participants viewed paired emotional and neutral scenes involving humans or animals flashed rapidly in extrafoveal vision. Participants were instructed to categorize the targets by saccading toward the location occupied by…
Thalamic nuclei convey diverse contextual information to layer 1 of visual cortex
Imhof, Fabia; Martini, Francisco J.; Hofer, Sonja B.
2017-01-01
Sensory perception depends on the context within which a stimulus occurs. Prevailing models emphasize cortical feedback as the source of contextual modulation. However, higher-order thalamic nuclei, such as the pulvinar, interconnect with many cortical and subcortical areas, suggesting a role for the thalamus in providing sensory and behavioral context – yet the nature of the signals conveyed to cortex by higher-order thalamus remains poorly understood. Here we use axonal calcium imaging to measure information provided to visual cortex by the pulvinar equivalent in mice, the lateral posterior nucleus (LP), as well as the dorsolateral geniculate nucleus (dLGN). We found that dLGN conveys retinotopically precise visual signals, while LP provides distributed information from the visual scene. Both LP and dLGN projections carry locomotion signals. However, while dLGN inputs often respond to positive combinations of running and visual flow speed, LP signals discrepancies between self-generated and external visual motion. This higher-order thalamic nucleus therefore conveys diverse contextual signals that inform visual cortex about visual scene changes not predicted by the animal’s own actions. PMID:26691828
MacKay, Donald G; James, Lori E
2009-10-01
Two experiments compared the visual cognition performance of amnesic H.M. and memory-normal controls matched for age, background, intelligence, and education. In Experiment 1 H.M. exhibited deficits relative to the controls in detecting "erroneous objects" in complex visual scenes--for example, a bird flying inside a fishbowl. In Experiment 2 H.M. exhibited deficits relative to the controls in standard Hidden-Figure tasks when detecting unfamiliar targets but not when detecting familiar targets--for example, circles, squares, and right-angle triangles. H.M.'s visual cognition deficits were not due to his well-known problems in explicit learning and recall, inability to comprehend or remember the instructions, general slowness, motoric difficulties, low motivation, low IQ relative to the controls, or working-memory limitations. Parallels between H.M.'s selective deficits in visual cognition, language, and memory are discussed. These parallels contradict the standard "systems theory" account of H.M.'s condition but comport with the hypothesis that H.M. has difficulty representing unfamiliar but not familiar information in visual cognition, language, and memory. Implications of our results are discussed for binding theory and the ongoing debate over what counts as "memory" versus "not-memory."
Procedural 3d Modelling for Traditional Settlements. The Case Study of Central Zagori
NASA Astrophysics Data System (ADS)
Kitsakis, D.; Tsiliakou, E.; Labropoulos, T.; Dimopoulou, E.
2017-02-01
Over the last decades 3D modelling has been a fast growing field in Geographic Information Science, extensively applied in various domains including reconstruction and visualization of cultural heritage, especially monuments and traditional settlements. Technological advances in computer graphics allow for modelling of complex 3D objects achieving high precision and accuracy. Procedural modelling is an effective tool and a relatively novel method, based on the algorithmic modelling concept. It is utilized for the generation of accurate 3D models and composite facade textures from sets of rules which are called Computer Generated Architecture grammars (CGA grammars), defining the objects' detailed geometry, rather than altering or editing the model manually. In this paper, procedural modelling tools have been exploited to generate the 3D model of a traditional settlement in the region of Central Zagori in Greece. The detailed geometries of 3D models derived from the application of shape grammars on selected footprints, and the process resulted in a final 3D model, optimally describing the built environment of Central Zagori, in three Levels of Detail (LoD). The final 3D scene was exported and published as a 3D web-scene which can be viewed with the 3D CityEngine viewer, giving a walkthrough of the whole model, the same as in virtual reality or game environments. This research work addresses issues regarding textures' precision, LoD for 3D objects and interactive visualization within one 3D scene, as well as the effectiveness of large scale modelling, along with the benefits and drawbacks that derive from procedural modelling techniques in the field of cultural heritage and more specifically on 3D modelling of traditional settlements.
Could nursery rhymes cause violent behaviour? A comparison with television viewing.
Davies, P; Lee, L; Fox, A; Fox, E
2004-12-01
To assess the rates of violence in nursery rhymes compared to pre-watershed television viewing. Data regarding television viewing habits, and the amount of violence on British television, were obtained from Ofcom. A compilation of nursery rhymes was examined for episodes of violence by three of the researchers. Each nursery rhyme was analysed by number and type of episode. They were then recited to the fourth researcher whose reactions were scrutinised. There were 1045 violent scenes on pre-watershed television over two weeks, of which 61% showed the act and the result; 51% of programmes contained violence. The 25 nursery rhymes had 20 episodes of violence, with 41% of rhymes being violent in some way; 30% mentioned the act and the result, with 50% only the act. Episodes of law breaking and animal abuse were also identified. Television has 4.8 violent scenes per hour and nursery rhymes have 52.2 violent scenes per hour. Analysis of the reactions of the fourth researcher was inconclusive. Although we do not advocate exposure for anyone to violent scenes or stimuli, childhood violence is not a new phenomenon. Whether visual violence and imagined violence have the same effect is likely to depend on the age of the child and the effectiveness of the storyteller. Re-interpretation of the ancient problem of childhood and youth violence through modern eyes is difficult, and laying the blame solely on television viewing is simplistic and may divert attention from vastly more complex societal problems.
Kuniecki, Michał; Wołoszyn, Kinga; Domagalik, Aleksandra; Pilarczyk, Joanna
2018-05-01
Processing of emotional visual information engages cognitive functions and induces arousal. We aimed to examine the modulatory role of emotional valence on brain activations linked to the processing of visual information and those linked to arousal. Participants were scanned and their pupil size was measured while viewing negative and neutral images. The visual noise was added to the images in various proportions to parametrically manipulate the amount of visual information. Pupil size was used as an index of physiological arousal. We show that arousal induced by the negative images, as compared to the neutral ones, is primarily related to greater amygdala activity while increasing visibility of negative content to enhanced activity in the lateral occipital complex (LOC). We argue that more intense visual processing of negative scenes can occur irrespective of the level of arousal. It may suggest that higher areas of the visual stream are fine-tuned to process emotionally relevant objects. Both arousal and processing of emotional visual information modulated activity within the ventromedial prefrontal cortex (vmPFC). Overlapping activations within the vmPFC may reflect the integration of these aspects of emotional processing. Additionally, we show that emotionally-evoked pupil dilations are related to activations in the amygdala, vmPFC, and LOC.
Richard, Christian M; Wright, Richard D; Ee, Cheryl; Prime, Steven L; Shimizu, Yujiro; Vavrik, John
2002-01-01
The effect of a concurrent auditory task on visual search was investigated using an image-flicker technique. Participants were undergraduate university students with normal or corrected-to-normal vision who searched for changes in images of driving scenes that involved either driving-related (e.g., traffic light) or driving-unrelated (e.g., mailbox) scene elements. The results indicated that response times were significantly slower if the search was accompanied by a concurrent auditory task. In addition, slower overall responses to scenes involving driving-unrelated changes suggest that the underlying process affected by the concurrent auditory task is strategic in nature. These results were interpreted in terms of their implications for using a cellular telephone while driving. Actual or potential applications of this research include the development of safer in-vehicle communication devices.
Temporal and spatial adaptation of transient responses to local features
O'Carroll, David C.; Barnett, Paul D.; Nordström, Karin
2012-01-01
Interpreting visual motion within the natural environment is a challenging task, particularly considering that natural scenes vary enormously in brightness, contrast and spatial structure. The performance of current models for the detection of self-generated optic flow depends critically on these very parameters, but despite this, animals manage to successfully navigate within a broad range of scenes. Within global scenes local areas with more salient features are common. Recent work has highlighted the influence that local, salient features have on the encoding of optic flow, but it has been difficult to quantify how local transient responses affect responses to subsequent features and thus contribute to the global neural response. To investigate this in more detail we used experimenter-designed stimuli and recorded intracellularly from motion-sensitive neurons. We limited the stimulus to a small vertically elongated strip, to investigate local and global neural responses to pairs of local “doublet” features that were designed to interact with each other in the temporal and spatial domain. We show that the passage of a high-contrast doublet feature produces a complex transient response from local motion detectors consistent with predictions of a simple computational model. In the neuron, the passage of a high-contrast feature induces a local reduction in responses to subsequent low-contrast features. However, this neural contrast gain reduction appears to be recruited only when features stretch vertically (i.e., orthogonal to the direction of motion) across at least several aligned neighboring ommatidia. Horizontal displacement of the components of elongated features abolishes the local adaptation effect. It is thus likely that features in natural scenes with vertically aligned edges, such as tree trunks, recruit the greatest amount of response suppression. This property could emphasize the local responses to such features vs. those in nearby texture within the scene. PMID:23087617
Vertical gaze angle: absolute height-in-scene information for the programming of prehension.
Gardner, P L; Mon-Williams, M
2001-02-01
One possible source of information regarding the distance of a fixated target is provided by the height of the object within the visual scene. It is accepted that this cue can provide ordinal information, but generally it has been assumed that the nervous system cannot extract "absolute" information from height-in-scene. In order to use height-in-scene, the nervous system would need to be sensitive to ocular position with respect to the head and to head orientation with respect to the shoulders (i.e. vertical gaze angle or VGA). We used a perturbation technique to establish whether the nervous system uses vertical gaze angle as a distance cue. Vertical gaze angle was perturbed using ophthalmic prisms with the base oriented either up or down. In experiment 1, participants were required to carry out an open-loop pointing task whilst wearing: (1) no prisms; (2) a base-up prism; or (3) a base-down prism. In experiment 2, the participants reached to grasp an object under closed-loop viewing conditions whilst wearing: (1) no prisms; (2) a base-up prism; or (3) a base-down prism. Experiment 1 and 2 provided clear evidence that the human nervous system uses vertical gaze angle as a distance cue. It was found that the weighting attached to VGA decreased with increasing target distance. The weighting attached to VGA was also affected by the discrepancy between the height of the target, as specified by all other distance cues, and the height indicated by the initial estimate of the position of the supporting surface. We conclude by considering the use of height-in-scene information in the perception of surface slant and highlight some of the complexities that must be involved in the computation of environmental layout.
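The geometry behind vertical gaze angle as a distance cue can be made concrete with a small sketch. It assumes a flat, horizontal supporting surface and a known eye height above it, in which case the horizontal distance to a fixated point is d = h / tan(θ), with θ the gaze angle below horizontal. The function name and the example values are illustrative and do not come from the study.

```python
import math

def distance_from_gaze_angle(eye_height_m, gaze_angle_deg):
    """Horizontal distance to a point on a flat surface, fixated at
    gaze_angle_deg below the horizontal from an eye eye_height_m above it."""
    if not 0.0 < gaze_angle_deg < 90.0:
        raise ValueError("gaze angle must lie strictly between 0 and 90 degrees")
    return eye_height_m / math.tan(math.radians(gaze_angle_deg))

# Hypothetical example: eye 0.5 m above a table top, gaze 45 degrees below horizontal.
baseline = distance_from_gaze_angle(0.5, 45.0)

# A perturbation that steepens the gaze angle (as a prism might) signals a nearer target.
perturbed = distance_from_gaze_angle(0.5, 47.0)
```

Under these assumptions a fixed angular perturbation changes the implied distance more for far targets than for near ones, which is one possible reason a cue of this kind would be weighted less at larger distances.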
Schettino, Antonio; Keil, Andreas; Porcu, Emanuele; Müller, Matthias M
2016-06-01
The rapid extraction of affective cues from the visual environment is crucial for flexible behavior. Previous studies have reported emotion-dependent amplitude modulations of two event-related potential (ERP) components - the N1 and EPN - reflecting sensory gain control mechanisms in extrastriate visual areas. However, it is unclear whether both components are selective electrophysiological markers of attentional orienting toward emotional material or are also influenced by physical features of the visual stimuli. To address this question, electrical brain activity was recorded from seventeen male participants while viewing original and bright versions of neutral and erotic pictures. Bright neutral scenes were rated as more pleasant compared to their original counterpart, whereas erotic scenes were judged more positively when presented in their original version. Classical and mass univariate ERP analysis showed larger N1 amplitude for original relative to bright erotic pictures, with no differences for original and bright neutral scenes. Conversely, the EPN was only modulated by picture content and not by brightness, substantiating the idea that this component is a unique electrophysiological marker of attention allocation toward emotional material. Complementary topographic analysis revealed the early selective expression of a centro-parietal positivity following the presentation of original erotic scenes only, reflecting the recruitment of neural networks associated with sustained attention and facilitated memory encoding for motivationally relevant material. Overall, these results indicate that neural networks subtending the extraction of emotional information are differentially recruited depending on low-level perceptual features, which ultimately influence affective evaluations. Copyright © 2016 Elsevier Inc. All rights reserved.
Eye Movements, Visual Search and Scene Memory, in an Immersive Virtual Environment
Sullivan, Brian; Snyder, Kat; Ballard, Dana; Hayhoe, Mary
2014-01-01
Visual memory has been demonstrated to play a role in both visual search and attentional prioritization in natural scenes. However, it has been studied predominantly in experimental paradigms using multiple two-dimensional images. Natural experience, however, entails prolonged immersion in a limited number of three-dimensional environments. The goal of the present experiment was to recreate circumstances comparable to natural visual experience in order to evaluate the role of scene memory in guiding eye movements in a natural environment. Subjects performed a continuous visual-search task within an immersive virtual-reality environment over three days. We found that, similar to two-dimensional contexts, viewers rapidly learn the location of objects in the environment over time, and use spatial memory to guide search. Incidental fixations did not provide obvious benefit to subsequent search, suggesting that semantic contextual cues may often be just as efficient, or that many incidentally fixated items are not held in memory in the absence of a specific task. On the third day of the experience in the environment, previous search items changed in color. These items were fixated upon with increased probability relative to control objects, suggesting that memory-guided prioritization (or Surprise) may be a robust mechanism for attracting gaze to novel features of natural environments, in addition to task factors and simple spatial saliency. PMID:24759905
Feature binding, attention and object perception.
Treisman, A
1998-01-01
The seemingly effortless ability to perceive meaningful objects in an integrated scene actually depends on complex visual processes. The 'binding problem' concerns the way in which we select and integrate the separate features of objects in the correct combinations. Experiments suggest that attention plays a central role in solving this problem. Some neurological patients show a dramatic breakdown in the ability to see several objects; their deficits suggest a role for the parietal cortex in the binding process. However, indirect measures of priming and interference suggest that more information may be implicitly available than we can consciously access. PMID:9770223
Computational model of lightness perception in high dynamic range imaging
NASA Astrophysics Data System (ADS)
Krawczyk, Grzegorz; Myszkowski, Karol; Seidel, Hans-Peter
2006-02-01
An anchoring theory of lightness perception by Gilchrist et al. [1999] explains many characteristics of the human visual system such as lightness constancy and its spectacular failures which are important in the perception of images. The principal concept of this theory is the perception of complex scenes in terms of groups of consistent areas (frameworks). Such areas, following the gestalt theorists, are defined by the regions of common illumination. The key aspect of the image perception is the estimation of lightness within each framework through the anchoring to the luminance perceived as white, followed by the computation of the global lightness. In this paper we provide a computational model for automatic decomposition of HDR images into frameworks. We derive a tone mapping operator which predicts lightness perception of real-world scenes and aims at its accurate reproduction on low dynamic range displays. Furthermore, such a decomposition into frameworks opens new grounds for local image analysis in view of human perception.
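The anchoring step described in this abstract can be sketched in a few lines. The sketch assumes the frameworks are already known (the paper's automatic decomposition is not reproduced) and uses the simplest anchoring rule, in which the highest luminance in each framework is taken as white. The function name `framework_lightness` and the log-ratio lightness scale are illustrative choices, not the authors' operator.

```python
import math

def framework_lightness(luminances, framework_ids):
    """Log10 lightness of each pixel relative to its framework's anchor.

    luminances    : positive luminance values, one per pixel
    framework_ids : framework label for each pixel (assumed given)

    Assumption: the anchor of a framework is its highest luminance,
    which is treated as white (lightness 0 on this log scale).
    """
    anchors = {}
    for lum, fid in zip(luminances, framework_ids):
        anchors[fid] = max(anchors.get(fid, 0.0), lum)
    return [math.log10(lum / anchors[fid])
            for lum, fid in zip(luminances, framework_ids)]

# Two hypothetical frameworks: a brightly lit region and a shadowed one.
lightness = framework_lightness([10.0, 100.0, 1000.0, 1.0, 2.0], [0, 0, 0, 1, 1])
```

In this toy example the brightest pixel of each framework maps to the same lightness even though the two anchors differ by a factor of 500 in luminance, which is the kind of local normalization that frameworks are meant to capture.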
A Demonstration of ‘Broken’ Visual Space
Gilson, Stuart
2012-01-01
It has long been assumed that there is a distorted mapping between real and ‘perceived’ space, based on demonstrations of systematic errors in judgements of slant, curvature, direction and separation. Here, we have applied a direct test to the notion of a coherent visual space. In an immersive virtual environment, participants judged the relative distance of two squares displayed in separate intervals. On some trials, the virtual scene expanded by a factor of four between intervals although, in line with recent results, participants did not report any noticeable change in the scene. We found that there was no consistent depth ordering of objects that can explain the distance matches participants made in this environment (e.g. A>B>D yet also A
Recapitulation of Emotional Source Context during Memory Retrieval
Bowen, Holly J.; Kensinger, Elizabeth A.
2016-01-01
Recapitulation involves the reactivation of cognitive and neural encoding processes at retrieval. In the current study, we investigated the effects of emotional valence on recapitulation processes. Participants encoded neutral words presented on a background face or scene that was negative, positive or neutral. During retrieval, studied and novel neutral words were presented alone (i.e., without the scene or face) and participants were asked to make a remember, know or new judgment. Both the encoding and retrieval tasks were completed in the fMRI scanner. Conjunction analyses were used to reveal the overlap between encoding and retrieval processing. These results revealed that, compared to positive or neutral contexts, words that were recollected and previously encoded in a negative context showed greater encoding-to-retrieval overlap, including in the ventral visual stream and amygdala. Interestingly, the visual stream recapitulation was not enhanced within regions that specifically process faces or scenes but rather extended broadly throughout visual cortices. These findings elucidate how memories for negative events can feel more vivid or detailed than positive or neutral memories. PMID:27923474
NASA Astrophysics Data System (ADS)
Le, Minh Tuan; Nguyen, Congdu; Yoon, Dae-Il; Jung, Eun Ku; Jia, Jie; Kim, Hae-Kwang
2007-12-01
In this paper, we propose a method of 3D graphics to video encoding and streaming that is embedded into a remote interactive 3D visualization system for rapidly representing a 3D scene on mobile devices without having to download it from the server. In particular, a 3D graphics to video framework is presented that increases the visual quality of regions of interest (ROI) of the video by allocating more bits to the ROI during H.264 video encoding. The ROI are identified by projecting 3D objects to a 2D plane during rasterization. The system allows users to navigate the 3D scene and interact with objects of interest to query their descriptions. We developed an adaptive media streaming server that can provide an adaptive video stream in terms of object-based quality to the client according to the user's preferences and the variation of network bandwidth. Results show that with ROI mode selection, the PSNR of the test samples changes only slightly while the visual quality of objects clearly increases.
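One common way to give regions of interest more bits in an H.264 encoder is to lower the quantization parameter (QP) for macroblocks inside the ROI. The sketch below illustrates that idea only; the abstract does not say how the authors' bit allocation or mode selection works, and the function name, default QP and offset are assumptions.

```python
def roi_qp_map(roi_mask, base_qp=30, roi_offset=6):
    """Per-macroblock QP map: macroblocks inside the ROI get a lower QP.

    roi_mask   : 2D list of 0/1 flags, one per macroblock (1 = inside the ROI),
                 e.g. obtained by projecting 3D objects onto the image plane
    base_qp    : assumed QP for background macroblocks
    roi_offset : assumed QP reduction for ROI macroblocks

    A lower QP means finer quantization and therefore more bits.
    H.264 QP values for 8-bit video lie in the range 0 to 51.
    """
    qp_min, qp_max = 0, 51

    def clamp(qp):
        return max(qp_min, min(qp_max, qp))

    return [[clamp(base_qp - roi_offset) if in_roi else clamp(base_qp)
             for in_roi in row]
            for row in roi_mask]

# Hypothetical 2x2 macroblock grid with the ROI on one diagonal.
qp_map = roi_qp_map([[0, 1], [1, 0]])
```

A real encoder would receive such a map through its own rate-control interface, and the values would be adapted to the available bandwidth.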
Coherence of structural visual cues and pictorial gravity paves the way for interceptive actions.
Zago, Myrka; La Scaleia, Barbara; Miller, William L; Lacquaniti, Francesco
2011-09-20
Dealing with upside-down objects is difficult and takes time. Among the cues that are critical for defining object orientation, the visible influence of gravity on the object's motion has received limited attention. Here, we manipulated the alignment of visible gravity and structural visual cues between each other and relative to the orientation of the observer and physical gravity. Participants pressed a button triggering a hitter to intercept a target accelerated by a virtual gravity. A factorial design assessed the effects of scene orientation (normal or inverted) and target gravity (normal or inverted). We found that interception was significantly more successful when scene direction was concordant with target gravity direction, irrespective of whether both were upright or inverted. This was so independent of the hitter type and when performance feedback to the participants was either available (Experiment 1) or unavailable (Experiment 2). These results show that the combined influence of visible gravity and structural visual cues can outweigh both physical gravity and viewer-centered cues, leading to rely instead on the congruence of the apparent physical forces acting on people and objects in the scene.
Pooresmaeili, Arezoo; Arrighi, Roberto; Biagi, Laura; Morrone, Maria Concetta
2016-01-01
In natural scenes, objects rarely occur in isolation but appear within a spatiotemporal context. Here, we show that the perceived size of a stimulus is significantly affected by the context of the scene: brief previous presentation of larger or smaller adapting stimuli at the same region of space changes the perceived size of a test stimulus, with larger adapting stimuli causing the test to appear smaller than veridical and vice versa. In a human fMRI study, we measured the blood oxygen level-dependent activation (BOLD) responses of the primary visual cortex (V1) to the contours of large-diameter stimuli and found that activation closely matched the perceptual rather than the retinal stimulus size: the activated area of V1 increased or decreased, depending on the size of the preceding stimulus. A model based on local inhibitory V1 mechanisms simulated the inward or outward shifts of the stimulus contours and hence the perceptual effects. Our findings suggest that area V1 is actively involved in reshaping our perception to match the short-term statistics of the visual scene. PMID:24089504
Adhikarla, Vamsi Kiran; Sodnik, Jaka; Szolgay, Peter; Jakus, Grega
2015-01-01
This paper reports on the design and evaluation of direct 3D gesture interaction with a full horizontal parallax light field display. A light field display defines a visual scene using directional light beams emitted from multiple light sources as if they were emitted from scene points. Each scene point is rendered individually, resulting in more realistic and accurate 3D visualization compared to other 3D display technologies. We propose an interaction setup combining the visualization of objects within the Field Of View (FOV) of a light field display and their selection through freehand gestures tracked by the Leap Motion Controller. The accuracy and usefulness of the proposed interaction setup were also evaluated in a user study with test subjects. The results of the study revealed high user preference for freehand interaction with the light field display as well as relatively low cognitive demand of this technique. Further, our results also revealed some limitations and adjustments of the proposed setup to be addressed in future work. PMID:25875189
Markman, Adam; Shen, Xin; Hua, Hong; Javidi, Bahram
2016-01-15
An augmented reality (AR) smartglass display combines real-world scenes with digital information enabling the rapid growth of AR-based applications. We present an augmented reality-based approach for three-dimensional (3D) optical visualization and object recognition using axially distributed sensing (ADS). For object recognition, the 3D scene is reconstructed, and feature extraction is performed by calculating the histogram of oriented gradients (HOG) of a sliding window. A support vector machine (SVM) is then used for classification. Once an object has been identified, the 3D reconstructed scene with the detected object is optically displayed in the smartglasses allowing the user to see the object, remove partial occlusions of the object, and provide critical information about the object such as 3D coordinates, which are not possible with conventional AR devices. To the best of our knowledge, this is the first report on combining axially distributed sensing with 3D object visualization and recognition for applications to augmented reality. The proposed approach can have benefits for many applications, including medical, military, transportation, and manufacturing.
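The feature-extraction step named in the abstract above (histogram of oriented gradients over a sliding window) can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the function name `orientation_histogram`, the choice of 9 bins, central-difference gradients, and hard binning. Block normalization, bin interpolation, the sliding window itself, and the SVM classifier are all omitted.

```python
import math


def orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    `cell` is a 2D list of grayscale values. Orientations are folded into
    [0, 180) degrees and split into `n_bins` equal bins. Border pixels are
    skipped because central differences need both neighbours.
    """
    rows, cols = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]  # horizontal central difference
            gy = cell[r + 1][c] - cell[r - 1][c]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned orientation
            index = min(int(angle // bin_width), n_bins - 1)  # guard the upper edge
            hist[index] += magnitude
    return hist
```

For a cell containing a single vertical intensity edge, the gradient is purely horizontal, so all of the weight falls in the first bin. In a full pipeline, histograms like this would be concatenated over the cells of each window and passed to a classifier.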
AgRISTARS. Supporting research: Algorithms for scene modelling
NASA Technical Reports Server (NTRS)
Rassbach, M. E. (Principal Investigator)
1982-01-01
The requirements for a comprehensive analysis of LANDSAT or other visual data scenes are defined. The development of a general model of a scene and a computer algorithm for finding the particular model for a given scene is discussed. The modelling system includes a boundary analysis subsystem, which detects all the boundaries and lines in the image and builds a boundary graph; a continuous variation analysis subsystem, which finds gradual variations not well approximated by a boundary structure; and a miscellaneous features analysis, which includes texture, line parallelism, etc. The noise reduction capabilities of this method and its use in image rectification and registration are discussed.
On the "Give" and "Take" between Event Apprehension and Utterance Formulation
ERIC Educational Resources Information Center
Gleitman, Lila R.; January, David; Nappa, Rebecca; Trueswell, John C.
2007-01-01
Two experiments are reported that examine how manipulations of visual attention affect speakers' linguistic choices regarding word order, verb use and syntactic structure when describing simple pictured scenes. Experiment 1 presented participants with scenes designed to elicit the use of a perspective predicate ("The man chases the dog/The dog…
From Seeing to Saying: Perceiving, Planning, Producing
ERIC Educational Resources Information Center
Kuchinsky, Stefanie Ellen
2009-01-01
Given the amount of visual information in a scene, how do speakers determine what to talk about first? One hypothesis is that speakers start talking about what has attentional priority, while another is that speakers first extract the scene gist, using the obtained relational information to generate a rudimentary sentence plan before retrieving…
Handheld real-time volumetric 3-D gamma-ray imaging
NASA Astrophysics Data System (ADS)
Haefner, Andrew; Barnowski, Ross; Luke, Paul; Amman, Mark; Vetter, Kai
2017-06-01
This paper presents the concept of real-time fusion of gamma-ray imaging and visual scene data for a hand-held mobile Compton imaging system in 3-D. The ability to obtain and integrate both gamma-ray and scene data from a mobile platform enables improved capabilities in the localization and mapping of radioactive materials. This not only enhances the ability to localize these materials, but it also provides important contextual information of the scene which once acquired can be reviewed and further analyzed subsequently. To demonstrate these concepts, the high-efficiency multimode imager (HEMI) is used in a hand-portable implementation in combination with a Microsoft Kinect sensor. This sensor, in conjunction with open-source software, provides the ability to create a 3-D model of the scene and to track the position and orientation of HEMI in real-time. By combining the gamma-ray data and visual data, accurate 3-D maps of gamma-ray sources are produced in real-time. This approach is extended to map the location of radioactive materials within objects with unknown geometry.
NASA Astrophysics Data System (ADS)
Luo, Xiongbiao; McLeod, A. Jonathan; Jayarathne, Uditha L.; Pautler, Stephen E.; Schlacta, Christopher M.; Peters, Terry M.
2016-03-01
Three-dimensional (3-D) scene reconstruction from stereoscopic binocular laparoscopic videos is an effective way to expand the limited surgical field and augment the structure visualization of the organ being operated on in minimally invasive surgery. However, currently available reconstruction approaches are limited by image noise, occlusions, and textureless and blurred structures. In particular, an endoscope inside the body has only a limited light source, resulting in illumination non-uniformities in the visualized field. These limitations unavoidably deteriorate the stereo image quality and hence lead to low-resolution and inaccurate disparity maps, resulting in blurred edge structures in 3-D scene reconstruction. This paper proposes an improved stereo correspondence framework that integrates cost-volume filtering with joint upsampling for robust disparity estimation. Joint bilateral upsampling, joint geodesic upsampling, and tree filtering upsampling were compared to enhance the disparity accuracy. The experimental results demonstrate that joint upsampling provides an effective way to boost the disparity estimation and hence to improve the surgical endoscopic scene 3-D reconstruction. Moreover, the bilateral upsampling generally outperforms the other two upsampling methods in disparity estimation.
Criterion-free measurement of motion transparency perception at different speeds
Rocchi, Francesca; Ledgeway, Timothy; Webb, Ben S.
2018-01-01
Transparency perception often occurs when objects within the visual scene partially occlude each other or move at the same time, at different velocities across the same spatial region. Although transparent motion perception has been extensively studied, we still do not understand how the distribution of velocities within a visual scene contributes to transparent perception. Here we use a novel psychophysical procedure to characterize the distribution of velocities in a scene that gives rise to transparent motion perception. To prevent participants from adopting a subjective decision criterion when discriminating transparent motion, we used an "odd-one-out," three-alternative forced-choice procedure. Two intervals contained the standard: a random-dot kinematogram with dot speeds or directions sampled from a uniform distribution. The other interval contained the comparison: speeds or directions sampled from a distribution with the same range as the standard, but with a notch of different widths removed. Our results suggest that transparent motion perception is driven primarily by relatively slow speeds, and does not emerge when only very fast speeds are present within a visual scene. Transparent perception of moving surfaces is modulated by stimulus-based characteristics, such as the separation between the means of the overlapping distributions or the range of speeds presented within an image. Our work illustrates the utility of using objective, forced-choice methods to reveal the mechanisms underlying motion transparency perception. PMID:29614154
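The comparison stimulus described in the abstract above draws values from the same range as the uniform standard, with a notch removed. A minimal sketch of one way to generate such samples follows. The abstract does not say how the stimuli were produced, so the function name `sample_notched_uniform`, its parameters, and the use of rejection sampling are assumptions for illustration only.

```python
import random


def sample_notched_uniform(low, high, notch_center, notch_width, n, rng=None):
    """Draw n values uniformly from [low, high], excluding an open notch.

    The notch is the interval (notch_center - notch_width / 2,
    notch_center + notch_width / 2). Rejection sampling keeps the density
    flat on the remaining parts of the range.
    """
    rng = rng or random.Random()
    half = notch_width / 2.0
    notch_lo, notch_hi = notch_center - half, notch_center + half
    if notch_lo <= low and notch_hi >= high:
        raise ValueError("notch removes the whole range")
    samples = []
    while len(samples) < n:
        x = rng.uniform(low, high)
        if not (notch_lo < x < notch_hi):  # keep only values outside the notch
            samples.append(x)
    return samples
```

Calling the function with `notch_width=0` reproduces the uniform standard, so the same routine could serve both interval types in a simulated trial.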
The role of iconic memory in change-detection tasks.
Becker, M W; Pashler, H; Anstis, S M
2000-01-01
In three experiments, subjects attempted to detect the change of a single item in a visually presented array of items. Subjects' ability to detect a change was greatly reduced if a blank interstimulus interval (ISI) was inserted between the original array and an array in which one item had changed ('change blindness'). However, change detection improved when the location of the change was cued during the blank ISI. This suggests that people represent more information of a scene than change blindness might suggest. We test two possible hypotheses why, in the absence of a cue, this representation fails to produce good change detection. The first claims that the intervening events employed to create change blindness result in multiple neural transients which co-occur with the to-be-detected change. Poor detection rates occur because a serial search of all the transient locations is required to detect the change, during which time the representation of the original scene fades. The second claims that the occurrence of the second frame overwrites the representation of the first frame, unless that information is insulated against overwriting by attention. The results support the second hypothesis. We conclude that people may have a fairly rich visual representation of a scene while the scene is present, but fail to detect changes because they lack the ability to simultaneously represent two complete visual representations.
Mishra, Ajay; Aloimonos, Yiannis
2009-01-01
The human visual system observes and understands a scene/image by making a series of fixations. Every fixation point lies inside a particular region of arbitrary shape and size in the scene which can either be an object or just a part of it. We define as a basic segmentation problem the task of segmenting that region containing the fixation point. Segmenting the region containing the fixation is equivalent to finding the enclosing contour (a connected set of boundary edge fragments in the edge map of the scene) around the fixation. This enclosing contour should be a depth boundary. We present here a novel algorithm that finds this bounding contour and achieves the segmentation of one object, given the fixation. The proposed segmentation framework combines monocular cues (color/intensity/texture) with stereo and/or motion, in a cue-independent manner. The semantic robots of the immediate future will be able to use this algorithm to automatically find objects in any environment. The capability of automatically segmenting objects in their visual field can bring the visual processing to the next level. Our approach is different from current approaches. While existing work attempts to segment the whole scene at once into many areas, we segment only one image region, specifically the one containing the fixation point. Experiments with real imagery collected by our active robot and from known databases demonstrate the promise of the approach.
Figure ground discrimination in age-related macular degeneration.
Tran, Thi Ha Chau; Guyader, Nathalie; Guerin, Anne; Despretz, Pascal; Boucart, Muriel
2011-03-01
To investigate impairment in discriminating a figure from its background and to study its relation to visual acuity and lesion size in patients with neovascular age-related macular degeneration (AMD). Seventeen patients with neovascular AMD and visual acuity <20/50 were included. Seventeen age-matched healthy subjects participated as controls. Complete ophthalmologic examination was performed on all participants. The stimuli were photographs of scenes containing animals (targets) or other objects (distractors), displayed on a computer monitor screen. Performance was compared in four background conditions: the target in the natural scene; the target isolated on a white background; the target separated by a white space from a structured scene; the target separated by a white space from a nonstructured, shapeless background. Target discriminability (d') was recorded. Performance was lower for patients than for controls. For the patients, it was easier to detect the target when it was separated from its background (under isolated, structured, and nonstructured conditions) than it was when located in a scene. Performance improved in patients with increasing exposure time but remained lower than in controls. Correlations were found between visual acuity, lesion size, and sensitivity for patients. Figure/ground segregation is impaired in patients with AMD. A white space surrounding an object is sufficient to improve the object's detection and to facilitate figure/ground segregation. These results may have practical applications to the rehabilitation of the environment in patients with AMD.
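The abstract above reports target discriminability as d' but does not say how it was computed. As a hedged illustration, the standard signal-detection definition is d' = z(hit rate) - z(false-alarm rate), where z is the inverse of the standard normal cumulative distribution. In the sketch below, the function name `d_prime` and the log-linear correction (adding 0.5 to each count) are assumptions, not details taken from the study.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (0.5 added to each count) keeps both rates
    strictly between 0 and 1, so the inverse normal CDF stays finite.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)
```

A d' of zero means targets and distractors are not discriminated at all (equal hit and false-alarm rates); larger values indicate better discrimination independent of response bias.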
A distributed code for color in natural scenes derived from center-surround filtered cone signals
Kellner, Christian J.; Wachtler, Thomas
2013-01-01
In the retina of trichromatic primates, chromatic information is encoded in an opponent fashion and transmitted to the lateral geniculate nucleus (LGN) and visual cortex via parallel pathways. Chromatic selectivities of neurons in the LGN form two separate clusters, corresponding to two classes of cone opponency. In the visual cortex, however, the chromatic selectivities are more distributed, which is in accordance with a population code for color. Previous studies of cone signals in natural scenes typically found opponent codes with chromatic selectivities corresponding to two directions in color space. Here we investigated how the non-linear spatio-chromatic filtering in the retina influences the encoding of color signals. Cone signals were derived from hyper-spectral images of natural scenes and preprocessed by center-surround filtering and rectification, resulting in parallel ON and OFF channels. Independent Component Analysis (ICA) on these signals yielded a highly sparse code with basis functions that showed spatio-chromatic selectivities. In contrast to previous analyses of linear transformations of cone signals, chromatic selectivities were not restricted to two main chromatic axes, but were more continuously distributed in color space, similar to the population code of color in the early visual cortex. Our results indicate that spatio-chromatic processing in the retina leads to a more distributed and more efficient code for natural scenes. PMID:24098289
Conscious visual memory with minimal attention.
Pinto, Yair; Vandenbroucke, Annelinde R; Otten, Marte; Sligte, Ilja G; Seth, Anil K; Lamme, Victor A F
2017-02-01
Is conscious visual perception limited to the locations that a person attends? The remarkable phenomenon of change blindness, which shows that people miss nearly all unattended changes in a visual scene, suggests the answer is yes. However, change blindness is found after visual interference (a mask or a new scene), so that subjects have to rely on working memory (WM), which has limited capacity, to detect the change. Before such interference, however, a much larger capacity store, called fragile memory (FM), which is easily overwritten by newly presented visual information, is present. Whether these different stores depend equally on spatial attention is central to the debate on the role of attention in conscious vision. In 2 experiments, we found that minimizing spatial attention almost entirely erases visual WM, as expected. Critically, FM remains largely intact. Moreover, minimally attended FM responses yield accurate metacognition, suggesting that conscious memory persists with limited spatial attention. Together, our findings help resolve the fundamental issue of how attention affects perception: Both visual consciousness and memory can be supported by only minimal attention. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
NASA Astrophysics Data System (ADS)
Meitzler, Thomas J.
The field of computer vision interacts with fields such as psychology, vision research, machine vision, psychophysics, mathematics, physics, and computer science. The focus of this thesis is new algorithms and methods for the computation of the probability of detection (Pd) of a target in a cluttered scene. The scene can be either a natural visual scene such as one sees with the naked eye (visual), or a scene displayed on a monitor with the help of infrared sensors. The relative clutter and the temperature difference between the target and background (ΔT) are defined and then used to calculate a relative signal-to-clutter ratio (SCR) from which the Pd is calculated for a target in a cluttered scene. It is shown how this definition can include many previous definitions of clutter and ΔT. Next, fuzzy and neural-fuzzy techniques are used to calculate the Pd and it is shown how these methods can give results that have a good correlation with experiment. The experimental design for actually measuring the Pd of a target by observers is described. Finally, wavelets are applied to the calculation of clutter and it is shown how this new definition of clutter based on wavelets can be used to compute the Pd of a target.
Learning object-to-class kernels for scene classification.
Zhang, Lei; Zhen, Xiantong; Shao, Ling
2014-08-01
High-level image representations have drawn increasing attention in visual recognition, e.g., scene classification, since the invention of the object bank. The object bank represents an image as a response map of a large number of pretrained object detectors and has achieved superior performance for visual recognition. In this paper, based on the object bank representation, we propose the object-to-class (O2C) distances to model scene images. In particular, four variants of O2C distances are presented, and with the O2C distances, we can represent the images using the object bank by lower-dimensional but more discriminative spaces, called distance spaces, which are spanned by the O2C distances. Due to the explicit computation of O2C distances based on the object bank, the obtained representations can possess more semantic meaning. To combine the discriminant ability of the O2C distances to all scene classes, we further propose to kernelize the distance representation for the final classification. We have conducted extensive experiments on four benchmark data sets, UIUC-Sports, Scene-15, MIT Indoor, and Caltech-101, which demonstrate that the proposed approaches can significantly improve the original object bank approach and achieve state-of-the-art performance.
Rendering visual events as sounds: Spatial attention capture by auditory augmented reality.
Stone, Scott A; Tata, Matthew S
2017-01-01
Many salient visual events tend to coincide with auditory events, such as seeing and hearing a car pass by. Information from the visual and auditory senses can be used to create a stable percept of the stimulus. Having access to related coincident visual and auditory information can help for spatial tasks such as localization. However not all visual information has analogous auditory percepts, such as viewing a computer monitor. Here, we describe a system capable of detecting and augmenting visual salient events into localizable auditory events. The system uses a neuromorphic camera (DAVIS 240B) to detect logarithmic changes of brightness intensity in the scene, which can be interpreted as salient visual events. Participants were blindfolded and asked to use the device to detect new objects in the scene, as well as determine direction of motion for a moving visual object. Results suggest the system is robust enough to allow for the simple detection of new salient stimuli, as well as accurately encoding direction of visual motion. Future successes are probable as neuromorphic devices are likely to become faster and smaller in the future, making this system much more feasible. PMID:28792518
Dual processing of visual rotation for bipedal stance control.
Day, Brian L; Muller, Timothy; Offord, Joanna; Di Giulio, Irene
2016-10-01
When standing, the gain of the body-movement response to a sinusoidally moving visual scene has been shown to get smaller with faster stimuli, possibly through changes in the apportioning of visual flow to self-motion or environment motion. We investigated whether visual-flow speed similarly influences the postural response to a discrete, unidirectional rotation of the visual scene in the frontal plane. Contrary to expectation, the evoked postural response consisted of two sequential components with opposite relationships to visual motion speed. With faster visual rotation the early component became smaller, not through a change in gain but by changes in its temporal structure, while the later component grew larger. We propose that the early component arises from the balance control system minimising apparent self-motion, while the later component stems from the postural system realigning the body with gravity. The source of visual motion is inherently ambiguous such that movement of objects in the environment can evoke self-motion illusions and postural adjustments. Theoretically, the brain can mitigate this problem by combining visual signals with other types of information. A Bayesian model that achieves this was previously proposed and predicts a decreasing gain of postural response with increasing visual motion speed. Here we test this prediction for discrete, unidirectional, full-field visual rotations in the frontal plane of standing subjects. The speed (0.75-48 deg s⁻¹) and direction of visual rotation were pseudo-randomly varied and mediolateral responses were measured from displacements of the trunk and horizontal ground reaction forces. The behaviour evoked by this visual rotation was more complex than has hitherto been reported, consisting broadly of two consecutive components with respective latencies of ∼190 ms and >0.7 s. Both components were sensitive to visual rotation speed, but with diametrically opposite relationships. Thus, the early component decreased with faster visual rotation, while the later component increased. Furthermore, the decrease in size of the early component was not achieved by a simple attenuation of gain, but by a change in its temporal structure. We conclude that the two components represent expressions of different motor functions, both pertinent to the control of bipedal stance. We propose that the early response stems from the balance control system attempting to minimise unintended body motion, while the later response arises from the postural control system attempting to align the body with gravity. © 2016 The Authors. The Journal of Physiology published by John Wiley & Sons Ltd on behalf of The Physiological Society.
Contextual Cueing: Implicit Learning and Memory of Visual Context Guides Spatial Attention.
ERIC Educational Resources Information Center
Chun, Marvin M.; Jiang, Yuhong
1998-01-01
Six experiments involving a total of 112 college students demonstrate that a robust memory for visual context exists to guide spatial attention. Results show how implicit learning and memory of visual context can guide spatial attention toward task-relevant aspects of a scene. (SLD)
Learning Visual Design through Hypermedia: Pathways to Visual Literacy.
ERIC Educational Resources Information Center
Lockee, Barbara; Hergert, Tom
The interactive multimedia application described here attempts to provide learners and teachers with a common frame of reference for communicating about visual media. The system is based on a list of concepts related to composition, and illustrates those concepts with photographs, paintings, graphic designs, and motion picture scenes. The ability…
Conducting a wildland visual resources inventory
James F. Palmer
1979-01-01
This paper describes a procedure for systematically inventorying the visual resources of wildland environments. Visual attributes are recorded photographically using two separate sampling methods: one based on professional judgment and the other on random selection. The location and description of each inventoried scene are recorded on U.S. Geological Survey...
3D Data Mapping and Real-Time Experiment Control and Visualization in Brain Slices.
Navarro, Marco A; Hibbard, Jaime V K; Miller, Michael E; Nivin, Tyler W; Milescu, Lorin S
2015-10-20
Here, we propose two basic concepts that can streamline electrophysiology and imaging experiments in brain slices and enhance data collection and analysis. The first idea is to interface the experiment with a software environment that provides a 3D scene viewer in which the experimental rig, the brain slice, and the recorded data are represented to scale. Within the 3D scene viewer, the user can visualize a live image of the sample and 3D renderings of the recording electrodes with real-time position feedback. Furthermore, the user can control the instruments and visualize their status in real time. The second idea is to integrate multiple types of experimental data into a spatial and temporal map of the brain slice. These data may include low-magnification maps of the entire brain slice, for spatial context, or any other type of high-resolution structural and functional image, together with time-resolved electrical and optical signals. The entire data collection can be visualized within the 3D scene viewer. These concepts can be applied to any other type of experiment in which high-resolution data are recorded within a larger sample at different spatial and temporal coordinates. Copyright © 2015 Biophysical Society. Published by Elsevier Inc. All rights reserved.
3D visualization of numeric planetary data using JMARS
NASA Astrophysics Data System (ADS)
Dickenshied, S.; Christensen, P. R.; Anwar, S.; Carter, S.; Hagee, W.; Noss, D.
2013-12-01
JMARS (Java Mission-planning and Analysis for Remote Sensing) is a free geospatial application developed by the Mars Space Flight Facility at Arizona State University. Originally written as a mission planning tool for the THEMIS instrument on board the Mars Odyssey spacecraft, it was released as an analysis tool to the general public in 2003. Since then it has expanded to be used for mission planning and scientific data analysis by additional NASA missions to Mars, the Moon, and Vesta, and it has come to be used by scientists, researchers and students of all ages from more than 40 countries around the world. The public version of JMARS now also includes remote sensing data for Mercury, Venus, Earth, the Moon, Mars, and a number of the moons of Jupiter and Saturn. Additional datasets for asteroids and other smaller bodies are being added as they become available and time permits. In addition to visualizing multiple datasets in context with one another, significant effort has been put into on-the-fly projection of georegistered data over surface topography. This functionality allows a user to easily create and modify 3D visualizations of any regional scene where elevation data is available in JMARS. This can be accomplished through the use of global topographic maps or regional numeric data such as HiRISE or HRSC DTMs. Users can also upload their own regional or global topographic dataset and use it as an elevation source for 3D rendering of their scene. The 3D Layer in JMARS allows the user to exaggerate the z-scale of any elevation source to emphasize the vertical variance throughout a scene. In addition, the user can rotate, tilt, and zoom the scene to any desired angle and then illuminate it with an artificial light source. This scene can be easily overlain with additional JMARS datasets such as maps, images, shapefiles, contour lines, or scale bars, and the scene can be easily saved as a graphic image for use in presentations or publications.
A Theoretical and Experimental Analysis of the Outside World Perception Process
NASA Technical Reports Server (NTRS)
Wewerinke, P. H.
1978-01-01
The outside scene is often an important source of information for manual control tasks. Important examples of these are car driving and aircraft control. This paper deals with modelling this visual scene perception process on the basis of linear perspective geometry and the relative motion cues. Model predictions utilizing psychophysical threshold data from base-line experiments and literature of a variety of visual approach tasks are compared with experimental data. Both the performance and workload results illustrate that the model provides a meaningful description of the outside world perception process, with a useful predictive capability.
NASA Technical Reports Server (NTRS)
Mcruer, D. T.; Klein, R. H.
1975-01-01
As part of a comprehensive program exploring driver/vehicle system response in lateral steering tasks, driver/vehicle system describing functions and other dynamic data have been gathered in several milieu. These include a simple fixed base simulator with an elementary roadway delineation only display; a fixed base statically operating automobile with a terrain model based, wide angle projection system display; and a full scale moving base automobile operating on the road. Dynamic data with the two fixed base simulators compared favorably, implying that the impoverished visual scene, lack of engine noise, and simplified steering wheel feel characteristics in the simple simulator did not induce significant driver dynamic behavior variations. The fixed base vs. moving base comparisons showed substantially greater crossover frequencies and phase margins on the road course.
Contardi, Sara; Rubboli, Guido; Giulioni, Marco; Michelucci, Roberto; Pizza, Fabio; Gardella, Elena; Pinardi, Federica; Bartolomei, Ilaria; Tassinari, Carlo Alberto
2007-09-01
Charles Bonnet syndrome (CBS) is a disorder characterized by the occurrence of complex visual hallucinations in patients with acquired impairment of vision and without psychiatric disorders. In spite of the high incidence of visual field defects following antero-mesial temporal lobectomy for refractory temporal lobe epilepsy, reports of CBS in patients who underwent this surgical procedure are surprisingly rare. We describe a patient operated on for drug-resistant epilepsy. As a result of left antero-mesial temporal resection, she presented right homonymous hemianopia. A few days after surgery, she started complaining of visual hallucinations, such as static or moving "Lilliputian" human figures, or countryside scenes, restricted to the hemianopic field. The patient was fully aware of their fictitious nature. These disturbances disappeared progressively over a few weeks. The incidence of CBS associated with visual field defects following epilepsy surgery might be underestimated. Patients with post-surgical CBS should be reassured that it is not an epileptic phenomenon, and that it has a benign, self-limiting, course which does not usually require treatment.
Mahr, Angela; Wentura, Dirk
2014-02-01
Findings from three experiments support the conclusion that auditory primes facilitate the processing of related targets. In Experiments 1 and 2, we employed a crossmodal Stroop color identification task with auditory color words (as primes) and visual color patches (as targets). Responses were faster for congruent priming, in comparison to neutral or incongruent priming. This effect also emerged for different levels of time compression of the auditory primes (to 30 % and 10 % of the original length; i.e., 120 and 40 ms) and turned out to be even more pronounced under high-perceptual-load conditions (Exps. 1 and 2). In Experiment 3, target-present or -absent decisions for brief target displays had to be made, thereby ruling out response-priming processes as a cause of the congruency effects. Nevertheless, target detection (d') was increased by congruent primes (30 % compression) in comparison to incongruent or neutral primes. Our results suggest semantic object-based auditory-visual interactions, which rapidly increase the denoted target object's salience. This would apply, in particular, to complex visual scenes.
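The sensitivity measure d' mentioned above can be illustrated with a minimal sketch. It assumes the standard equal-variance signal detection definition, d' = z(hit rate) - z(false-alarm rate); the abstract does not say how the authors computed it, and the function name and the rates below are hypothetical, not data from the study.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Equal-variance signal detection sensitivity: z(H) - z(FA).

    Both rates must lie strictly between 0 and 1; rates of exactly
    0 or 1 need a correction before they can be converted to z-scores.
    """
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical rates for two priming conditions (illustration only).
congruent = d_prime(hit_rate=0.80, false_alarm_rate=0.20)
incongruent = d_prime(hit_rate=0.70, false_alarm_rate=0.30)
print(f"congruent d' = {congruent:.2f}, incongruent d' = {incongruent:.2f}")
```

Because d' combines hits and false alarms, a higher value reflects better discrimination of target-present from target-absent displays rather than a mere shift in response bias.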
Investigating the variability of memory distortion for an analogue trauma.
Strange, Deryn; Takarangi, Melanie K T
2015-01-01
In this paper, we examine whether source monitoring (SM) errors might be one mechanism that accounts for traumatic memory distortion. Participants watched a traumatic film with some critical (crux) and non-critical (non-crux) scenes removed. Twenty-four hours later, they completed a memory test. To increase the likelihood participants would notice the film's gaps, we inserted visual static for the length of each missing scene. We then added manipulations designed to affect people's SM behaviour. To encourage systematic SM, before watching the film, we warned half the participants that we had removed some scenes. To encourage heuristic SM, some participants also saw labels describing the missing scenes. Adding static highlighting the missing scenes did not affect false recognition of those missing scenes. However, a warning decreased, while labels increased, participants' false recognition rates. We conclude that manipulations designed to affect SM behaviour also affect the degree of memory distortion in our paradigm.
The forensic holodeck: an immersive display for forensic crime scene reconstructions.
Ebert, Lars C; Nguyen, Tuan T; Breitbeck, Robert; Braun, Marcel; Thali, Michael J; Ross, Steffen
2014-12-01
In forensic investigations, crime scene reconstructions are created based on a variety of three-dimensional image modalities. Although the data gathered are three-dimensional, their presentation on computer screens and paper is two-dimensional, which incurs a loss of information. By applying immersive virtual reality (VR) techniques, we propose a system that allows a crime scene to be viewed as if the investigator were present at the scene. We used a low-cost VR headset originally developed for computer gaming in our system. The headset offers a large viewing volume and tracks the user's head orientation in real-time, and an optical tracker is used for positional information. In addition, we created a crime scene reconstruction to demonstrate the system. In this article, we present a low-cost system that allows immersive, three-dimensional and interactive visualization of forensic incident scene reconstructions.
Stages as models of scene geometry.
Nedović, Vladimir; Smeulders, Arnold W M; Redert, André; Geusebroek, Jan-Mark
2010-09-01
Reconstruction of 3D scene geometry is an important element for scene understanding, autonomous vehicle and robot navigation, image retrieval, and 3D television. We propose accounting for the inherent structure of the visual world when trying to solve the scene reconstruction problem. Consequently, we identify geometric scene categorization as the first step toward robust and efficient depth estimation from single images. We introduce 15 typical 3D scene geometries called stages, each with a unique depth profile, which roughly correspond to a large majority of broadcast video frames. Stage information serves as a first approximation of global depth, narrowing down the search space in depth estimation and object localization. We propose different sets of low-level features for depth estimation, and perform stage classification on two diverse data sets of television broadcasts. Classification results demonstrate that stages can often be efficiently learned from low-dimensional image representations.
Brady, Timothy F; Oliva, Aude
2008-07-01
Recent work has shown that observers can parse streams of syllables, tones, or visual shapes and learn statistical regularities in them without conscious intent (e.g., learn that A is always followed by B). Here, we demonstrate that these statistical-learning mechanisms can operate at an abstract, conceptual level. In Experiments 1 and 2, observers incidentally learned which semantic categories of natural scenes covaried (e.g., kitchen scenes were always followed by forest scenes). In Experiments 3 and 4, category learning with images of scenes transferred to words that represented the categories. In each experiment, the category of the scenes was irrelevant to the task. Together, these results suggest that statistical-learning mechanisms can operate at a categorical level, enabling generalization of learned regularities using existing conceptual knowledge. Such mechanisms may guide learning in domains as disparate as the acquisition of causal knowledge and the development of cognitive maps from environmental exploration.
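The kind of regularity described above ("A is always followed by B") can be expressed as a transitional probability between adjacent items. The following is a rough sketch of how such probabilities could be estimated from a stream of scene categories; the function name and the example stream are hypothetical, and this is not the authors' analysis.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Estimate P(next | current) from adjacent pairs in a sequence."""
    pair_counts = Counter(zip(sequence, sequence[1:]))
    # Count each item only where it has a successor (all but the last).
    first_counts = Counter(sequence[:-1])
    probs = defaultdict(dict)
    for (current, nxt), n in pair_counts.items():
        probs[current][nxt] = n / first_counts[current]
    return dict(probs)


# Hypothetical stream in which "kitchen" is always followed by "forest".
stream = ["kitchen", "forest", "beach", "kitchen", "forest",
          "mountain", "kitchen", "forest"]
table = transition_probabilities(stream)
print(table["kitchen"])
print(table["forest"])
```

In this toy stream the transition from "kitchen" to "forest" has probability 1, while "forest" is followed by different categories with lower probability, which is the contrast a statistical learner would need to pick up.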
A neural model of motion processing and visual navigation by cortical area MST.
Grossberg, S; Mingolla, E; Pack, C
1999-12-01
Cells in the dorsal medial superior temporal cortex (MSTd) process optic flow generated by self-motion during visually guided navigation. A neural model shows how interactions between well-known neural mechanisms (log polar cortical magnification, Gaussian motion-sensitive receptive fields, spatial pooling of motion-sensitive signals and subtractive extraretinal eye movement signals) lead to emergent properties that quantitatively simulate neurophysiological data about MSTd cell properties and psychophysical data about human navigation. Model cells match MSTd neuron responses to optic flow stimuli placed in different parts of the visual field, including position invariance, tuning curves, preferred spiral directions, direction reversals, average response curves and preferred locations for stimulus motion centers. The model shows how the preferred motion direction of the most active MSTd cells can explain human judgments of self-motion direction (heading), without using complex heading templates. The model explains when extraretinal eye movement signals are needed for accurate heading perception, and when retinal input is sufficient, and how heading judgments depend on scene layouts and rotation rates.
Neuronal integration in visual cortex elevates face category tuning to conscious face perception
Fahrenfort, Johannes J.; Snijders, Tineke M.; Heinen, Klaartje; van Gaal, Simon; Scholte, H. Steven; Lamme, Victor A. F.
2012-01-01
The human brain has the extraordinary capability to transform cluttered sensory input into distinct object representations. For example, it is able to rapidly and seemingly without effort detect object categories in complex natural scenes. Surprisingly, category tuning is not sufficient to achieve conscious recognition of objects. What neural process beyond category extraction might elevate neural representations to the level where objects are consciously perceived? Here we show that visible and invisible faces produce similar category-selective responses in the ventral visual cortex. The pattern of neural activity evoked by visible faces could be used to decode the presence of invisible faces and vice versa. However, only visible faces caused extensive response enhancements and changes in neural oscillatory synchronization, as well as increased functional connectivity between higher and lower visual areas. We conclude that conscious face perception is more tightly linked to neural processes of sustained information integration and binding than to processes accommodating face category tuning. PMID:23236162