Willems, Roel M; Clevis, Krien; Hagoort, Peter
2011-09-01
We investigated how visual and linguistic information interact in the perception of emotion. We borrowed a phenomenon from film theory which states that the presentation of an otherwise neutral visual scene intensifies the percept of fear or suspense induced by a different channel of information, such as language. Our main aim was to investigate how neutral visual scenes can enhance responses to fearful language content in parts of the brain involved in the perception of emotion. Healthy participants' brain activity was measured (using functional magnetic resonance imaging) while they read fearful and less fearful sentences presented with or without a neutral visual scene. The main idea is that the visual scenes intensify the fearful content of the language by subtly implying and concretizing what is described in the sentence. Activation levels in the right anterior temporal pole were selectively increased when a neutral visual scene was paired with a fearful sentence, compared to reading the sentence alone, as well as to reading non-fearful sentences presented with the same neutral scene. We conclude that the right anterior temporal pole serves a binding function for emotional information across domains such as visual and linguistic information.
Guidance of visual attention by semantic information in real-world scenes
Wu, Chia-Chien; Wick, Farahnaz Ahmed; Pomplun, Marc
2014-01-01
Recent research on attentional guidance in real-world scenes has focused on object recognition within the context of a scene. This approach has been valuable for determining some factors that drive the allocation of visual attention and determine visual selection. This article provides a review of experimental work on how different components of context, especially semantic information, affect attentional deployment. We review work from the areas of object recognition, scene perception, and visual search, highlighting recent studies examining semantic structure in real-world scenes. A better understanding of how humans parse scene representations will not only improve current models of visual attention but also advance next-generation computer vision systems and human-computer interfaces. PMID:24567724
Meyerhoff, Hauke S; Huff, Markus
2016-04-01
Human long-term memory for visual objects and scenes is tremendous. Here, we test how auditory information contributes to long-term memory performance for realistic scenes. In a total of six experiments, we manipulated the presentation modality (auditory, visual, audio-visual) as well as semantic congruency and temporal synchrony between auditory and visual information of brief filmic clips. Our results show that audio-visual clips generally elicit more accurate memory performance than unimodal clips. This advantage even increases with congruent visual and auditory information. However, violations of audio-visual synchrony hardly have any influence on memory performance. Memory performance remained intact even with a sequential presentation of auditory and visual information, but finally declined when the matching tracks of one scene were presented separately with intervening tracks during learning. With respect to memory performance, our results therefore show that audio-visual integration is sensitive to semantic congruency but remarkably robust against asynchronies between different modalities.
NASA Technical Reports Server (NTRS)
Foyle, David C.; Kaiser, Mary K.; Johnson, Walter W.
1992-01-01
This paper reviews some of the sources of visual information that are available in the out-the-window scene and describes how these visual cues are important for routine pilotage and training, as well as the development of simulator visual systems and enhanced or synthetic vision systems for aircraft cockpits. It is shown how these visual cues may change or disappear under environmental or sensor conditions, and how the visual scene can be augmented by advanced displays to capitalize on the pilot's excellent ability to extract visual information from the visual scene.
Visual search in scenes involves selective and non-selective pathways
Wolfe, Jeremy M; Vo, Melissa L-H; Evans, Karla K; Greene, Michelle R
2010-01-01
How do we find objects in scenes? For decades, visual search models have been built on experiments in which observers search for targets, presented among distractor items, isolated and randomly arranged on blank backgrounds. Are these models relevant to search in continuous scenes? This paper argues that the mechanisms that govern artificial, laboratory search tasks do play a role in visual search in scenes. However, scene-based information is used to guide search in ways that had no place in earlier models. Search in scenes may be best explained by a dual-path model: A “selective” path in which candidate objects must be individually selected for recognition and a “non-selective” path in which information can be extracted from global / statistical information. PMID:21227734
Deconstructing Visual Scenes in Cortex: Gradients of Object and Spatial Layout Information
Kravitz, Dwight J.; Baker, Chris I.
2013-01-01
Real-world visual scenes are complex, cluttered, and heterogeneous stimuli engaging scene- and object-selective cortical regions including parahippocampal place area (PPA), retrosplenial complex (RSC), and lateral occipital complex (LOC). To understand the unique contribution of each region to distributed scene representations, we generated predictions based on a neuroanatomical framework adapted from monkey and tested them using minimal scenes in which we independently manipulated both spatial layout (open, closed, and gradient) and object content (furniture, e.g., bed, dresser). Commensurate with its strong connectivity with posterior parietal cortex, RSC evidenced strong spatial layout information but no object information, and its response was not even modulated by object presence. In contrast, LOC, which lies within the ventral visual pathway, contained strong object information but no background information. Finally, PPA, which is connected with both the dorsal and the ventral visual pathway, showed information about both objects and spatial backgrounds and was sensitive to the presence or absence of either. These results suggest that 1) LOC, PPA, and RSC have distinct representations, emphasizing different aspects of scenes, 2) the specific representations in each region are predictable from their patterns of connectivity, and 3) PPA combines both spatial layout and object information as predicted by connectivity. PMID:22473894
The singular nature of auditory and visual scene analysis in autism
Lin, I.-Fan; Shirama, Aya; Kato, Nobumasa
2017-01-01
Individuals with autism spectrum disorder often have difficulty acquiring relevant auditory and visual information in daily environments, despite not being diagnosed as hearing impaired or having low vision. Recent psychophysical and neurophysiological studies have shown that autistic individuals have highly specific individual differences at various levels of information processing, including feature extraction, automatic grouping and top-down modulation in auditory and visual scene analysis. Comparison of the characteristics of scene analysis between auditory and visual modalities reveals some essential commonalities, which could provide clues about the underlying neural mechanisms. Further progress in this line of research may suggest effective methods for diagnosing and supporting autistic individuals. This article is part of the themed issue 'Auditory and visual scene analysis'. PMID:28044025
Unconscious analyses of visual scenes based on feature conjunctions.
Tachibana, Ryosuke; Noguchi, Yasuki
2015-06-01
To efficiently process a cluttered scene, the visual system analyzes statistical properties or regularities of visual elements embedded in the scene. It is controversial, however, whether those scene analyses could also work for stimuli unconsciously perceived. Here we show that our brain performs the unconscious scene analyses not only using a single featural cue (e.g., orientation) but also based on conjunctions of multiple visual features (e.g., combinations of color and orientation information). Subjects foveally viewed a stimulus array (duration: 50 ms) where 4 types of bars (red-horizontal, red-vertical, green-horizontal, and green-vertical) were intermixed. Although a conscious perception of those bars was inhibited by a subsequent mask stimulus, the brain correctly analyzed the information about color, orientation, and color-orientation conjunctions of those invisible bars. The information of those features was then used for the unconscious configuration analysis (statistical processing) of the central bars, which induced a perceptual bias and illusory feature binding in visible stimuli at peripheral locations. While statistical analyses and feature binding are normally 2 key functions of the visual system to construct coherent percepts of visual scenes, our results show that a high-level analysis combining those 2 functions is correctly performed by unconscious computations in the brain. (c) 2015 APA, all rights reserved.
View Combination: A Generalization Mechanism for Visual Recognition
ERIC Educational Resources Information Center
Friedman, Alinda; Waller, David; Thrash, Tyler; Greenauer, Nathan; Hodgson, Eric
2011-01-01
We examined whether view combination mechanisms shown to underlie object and scene recognition can integrate visual information across views that have little or no three-dimensional information at either the object or scene level. In three experiments, people learned four "views" of a two dimensional visual array derived from a three-dimensional…
Modeling global scene factors in attention
NASA Astrophysics Data System (ADS)
Torralba, Antonio
2003-07-01
Models of visual attention have focused predominantly on bottom-up approaches that ignored structured contextual and scene information. I propose a model of contextual cueing for attention guidance based on the global scene configuration. It is shown that the statistics of low-level features across the whole image can be used to prime the presence or absence of objects in the scene and to predict their location, scale, and appearance before exploring the image. In this scheme, visual context information can become available early in the visual processing chain, which allows modulation of the saliency of image regions and provides an efficient shortcut for object detection and recognition. 2003 Optical Society of America
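A minimal sketch of the global-feature prior described above, under stated assumptions: this is an illustrative approximation rather than Torralba's published implementation, and the Gabor filter bank, grid size, placeholder images, and the regression target (normalized target elevation) are all assumptions introduced for the example.

```python
import numpy as np
from numpy.fft import fft2, ifft2

def gabor_kernel(size, wavelength, theta):
    # odd-symmetric Gabor kernel (assumed parameterization, illustration only)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2 * (size / 6.0) ** 2))
    return envelope * np.sin(2 * np.pi * xr / wavelength)

def gist_features(image, grid=4, thetas=(0, np.pi/4, np.pi/2, 3*np.pi/4),
                  wavelengths=(4, 8, 16)):
    # pool rectified filter energy over a coarse grid -> one global feature vector
    h, w = image.shape
    feats = []
    for wl in wavelengths:
        for th in thetas:
            k = gabor_kernel(31, wl, th)
            resp = np.abs(np.real(ifft2(fft2(image) * fft2(k, s=image.shape))))
            cells = resp[:h - h % grid, :w - w % grid].reshape(
                grid, h // grid, grid, w // grid).mean(axis=(1, 3))
            feats.append(cells.ravel())
    return np.concatenate(feats)

# Toy "training": regress the expected vertical position of a target object
# (e.g., where a class of objects tends to appear) from global scene features.
rng = np.random.default_rng(0)
scenes = [rng.random((128, 128)) for _ in range(20)]   # stand-ins for real images
target_height = rng.random(20)                         # normalized elevations (synthetic)
X = np.stack([gist_features(s) for s in scenes])
w_hat, *_ = np.linalg.lstsq(np.c_[X, np.ones(len(X))], target_height, rcond=None)
prior_height = np.append(gist_features(scenes[0]), 1.0) @ w_hat  # location prior for one scene
```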
Optic Flow Dominates Visual Scene Polarity in Causing Adaptive Modification of Locomotor Trajectory
NASA Technical Reports Server (NTRS)
Nomura, Y.; Mulavara, A. P.; Richards, J. T.; Brady, R.; Bloomberg, Jacob J.
2005-01-01
Locomotion and posture are influenced and controlled by vestibular, visual and somatosensory information. Optic flow and scene polarity are two characteristics of a visual scene that have been identified as being critical in how they affect perceived body orientation and self-motion. The goal of this study was to determine the role of optic flow and visual scene polarity on adaptive modification in locomotor trajectory. Two computer-generated virtual reality scenes were shown to subjects during 20 minutes of treadmill walking. One scene was a highly polarized scene while the other was composed of objects displayed in a non-polarized fashion. Both virtual scenes depicted constant rate self-motion equivalent to walking counterclockwise around the perimeter of a room. Subjects performed Stepping Tests blindfolded before and after scene exposure to assess adaptive changes in locomotor trajectory. Subjects showed a significant difference in heading direction, between pre and post adaptation stepping tests, when exposed to either scene during treadmill walking. However, there was no significant difference in the subjects' heading direction between the two visual scene polarity conditions. Therefore, it was inferred from these data that optic flow has a greater role than visual polarity in influencing adaptive locomotor function.
Beyond the cockpit: The visual world as a flight instrument
NASA Technical Reports Server (NTRS)
Johnson, W. W.; Kaiser, M. K.; Foyle, D. C.
1992-01-01
The use of cockpit instruments to guide flight control is not always an option (e.g., low level rotorcraft flight). Under such circumstances the pilot must use out-the-window information for control and navigation. Thus it is important to determine the basis of visually guided flight for several reasons: (1) to guide the design and construction of the visual displays used in training simulators; (2) to allow modeling of visibility restrictions brought about by weather, cockpit constraints, or distortions introduced by sensor systems; and (3) to aid in the development of displays that augment the cockpit window scene and are compatible with the pilot's visual extraction of information from the visual scene. The authors are actively pursuing these questions. We have on-going studies using both low-cost, lower fidelity flight simulators, and state-of-the-art helicopter simulation research facilities. Research results will be presented on: (1) the important visual scene information used in altitude and speed control; (2) the utility of monocular, stereo, and hyperstereo cues for the control of flight; (3) perceptual effects due to the differences between normal unaided daylight vision, and that made available by various night vision devices (e.g., light intensifying goggles and infra-red sensor displays); and (4) the utility of advanced contact displays in which instrument information is made part of the visual scene, as on a 'scene linked' head-up display (e.g., displaying altimeter information on a virtual billboard located on the ground).
Cortical feedback signals generalise across different spatial frequencies of feedforward inputs.
Revina, Yulia; Petro, Lucy S; Muckli, Lars
2017-09-22
Visual processing in cortex relies on feedback projections contextualising feedforward information flow. Primary visual cortex (V1) has small receptive fields and processes feedforward information at a fine-grained spatial scale, whereas higher visual areas have larger, spatially invariant receptive fields. Therefore, feedback could provide coarse information about the global scene structure or alternatively recover fine-grained structure by targeting small receptive fields in V1. We tested if feedback signals generalise across different spatial frequencies of feedforward inputs, or if they are tuned to the spatial scale of the visual scene. Using a partial occlusion paradigm, functional magnetic resonance imaging (fMRI) and multivoxel pattern analysis (MVPA) we investigated whether feedback to V1 contains coarse or fine-grained information by manipulating the spatial frequency of the scene surround outside an occluded image portion. We show that feedback transmits both coarse and fine-grained information as it carries information about both low (LSF) and high spatial frequencies (HSF). Further, feedback signals containing LSF information are similar to feedback signals containing HSF information, even without a large overlap in spatial frequency bands of the HSF and LSF scenes. Lastly, we found that feedback carries similar information about the spatial frequency band across different scenes. We conclude that cortical feedback signals contain information which generalises across different spatial frequencies of feedforward inputs. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Chromatic information and feature detection in fast visual analysis
Del Viva, Maria M.; Punzi, Giovanni; Shevell, Steven K.; ...
2016-08-01
The visual system is able to recognize a scene based on a sketch made of very simple features. This ability is likely crucial for survival, when fast image recognition is necessary, and it is believed that a primal sketch is extracted very early in the visual processing. Such highly simplified representations can be sufficient for accurate object discrimination, but an open question is the role played by color in this process. Rich color information is available in natural scenes, yet artist's sketches are usually monochromatic, and black-and-white movies provide compelling representations of real world scenes. Also, the contrast sensitivity of color is low at fine spatial scales. We approach the question from the perspective of optimal information processing by a system endowed with limited computational resources. We show that when such limitations are taken into account, the intrinsic statistical properties of natural scenes imply that the most effective strategy is to ignore fine-scale color features and devote most of the bandwidth to gray-scale information. We find confirmation of these information-based predictions from psychophysics measurements of fast-viewing discrimination of natural scenes. As a result, we conclude that the lack of colored features in our visual representation, and our overall low sensitivity to high-frequency color components, are a consequence of an adaptation process, optimizing the size and power consumption of our brain for the visual world we live in.
Feature diagnosticity and task context shape activity in human scene-selective cortex.
Lowe, Matthew X; Gallivan, Jason P; Ferber, Susanne; Cant, Jonathan S
2016-01-15
Scenes are constructed from multiple visual features, yet previous research investigating scene processing has often focused on the contributions of single features in isolation. In the real world, features rarely exist independently of one another and likely converge to inform scene identity in unique ways. Here, we utilize fMRI and pattern classification techniques to examine the interactions between task context (i.e., attend to diagnostic global scene features; texture or layout) and high-level scene attributes (content and spatial boundary) to test the novel hypothesis that scene-selective cortex represents multiple visual features, the importance of which varies according to their diagnostic relevance across scene categories and task demands. Our results show for the first time that scene representations are driven by interactions between multiple visual features and high-level scene attributes. Specifically, univariate analysis of scene-selective cortex revealed that task context and feature diagnosticity shape activity differentially across scene categories. Examination using multivariate decoding methods revealed results consistent with univariate findings, but also evidence for an interaction between high-level scene attributes and diagnostic visual features within scene categories. Critically, these findings suggest visual feature representations are not distributed uniformly across scene categories but are shaped by task context and feature diagnosticity. Thus, we propose that scene-selective cortex constructs a flexible representation of the environment by integrating multiple diagnostically relevant visual features, the nature of which varies according to the particular scene being perceived and the goals of the observer. Copyright © 2015 Elsevier Inc. All rights reserved.
Does scene context always facilitate retrieval of visual object representations?
Nakashima, Ryoichi; Yokosawa, Kazuhiko
2011-04-01
An object-to-scene binding hypothesis maintains that visual object representations are stored as part of a larger scene representation or scene context, and that scene context facilitates retrieval of object representations (see, e.g., Hollingworth, Journal of Experimental Psychology: Learning, Memory and Cognition, 32, 58-69, 2006). Support for this hypothesis comes from data using an intentional memory task. In the present study, we examined whether scene context always facilitates retrieval of visual object representations. In two experiments, we investigated whether the scene context facilitates retrieval of object representations, using a new paradigm in which a memory task is appended to a repeated-flicker change detection task. Results indicated that in normal scene viewing, in which many simultaneous objects appear, scene context facilitation of the retrieval of object representations-henceforth termed object-to-scene binding-occurred only when the observer was required to retain much information for a task (i.e., an intentional memory task).
Campagne, Aurélie; Fradcourt, Benoit; Pichat, Cédric; Baciu, Monica; Kauffmann, Louise; Peyrin, Carole
2016-01-01
Visual processing of emotional stimuli critically depends on the type of cognitive appraisal involved. The present fMRI pilot study aimed to investigate the cerebral correlates involved in the visual processing of emotional scenes in two tasks, one emotional, based on the appraisal of personal emotional experience, and the other motivational, based on the appraisal of the tendency to action. Given that the use of spatial frequency information is relatively flexible during the visual processing of emotional stimuli depending on the task's demands, we also explored the effect of the type of spatial frequency in visual stimuli in each task by using emotional scenes filtered in low spatial frequency (LSF) and high spatial frequencies (HSF). Activation was observed in the visual areas of the fusiform gyrus for all emotional scenes in both tasks, and in the amygdala for unpleasant scenes only. The motivational task induced additional activation in frontal motor-related areas (e.g. premotor cortex, SMA) and parietal regions (e.g. superior and inferior parietal lobules). Parietal regions were recruited particularly during the motivational appraisal of approach in response to pleasant scenes. These frontal and parietal activations, respectively, suggest that motor and navigation processes play a specific role in the identification of the tendency to action in the motivational task. Furthermore, activity observed in the motivational task, in response to both pleasant and unpleasant scenes, was significantly greater for HSF than for LSF scenes, suggesting that the tendency to action is driven mainly by the detailed information contained in scenes. Results for the emotional task suggest that spatial frequencies play only a small role in the evaluation of unpleasant and pleasant emotions. Our preliminary study revealed a partial distinction between visual processing of emotional scenes during identification of the tendency to action, and during identification of personal emotional experiences. It also illustrates flexible use of the spatial frequencies contained in scenes depending on their emotional valence and on task demands.
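As a concrete illustration of the LSF/HSF manipulation mentioned above, a scene can be split into a low-pass and a residual high-pass version with a Gaussian filter. This is a generic sketch, not the study's stimulus pipeline: the cutoff value and the synthetic image below are placeholders, whereas the study defined its cutoffs in cycles per degree.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def split_spatial_frequencies(img, sigma=8.0):
    # sigma is an arbitrary placeholder cutoff for illustration
    lsf = gaussian_filter(img, sigma)   # keeps coarse structure (low spatial frequencies)
    hsf = img - lsf                     # keeps fine detail, edges and texture (high SF)
    return lsf, hsf

scene = np.random.default_rng(6).random((256, 256))   # stand-in for a grayscale scene
lsf_scene, hsf_scene = split_spatial_frequencies(scene)
```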
Direct versus indirect processing changes the influence of color in natural scene categorization.
Otsuka, Sachio; Kawaguchi, Jun
2009-10-01
Using a negative priming (NP) paradigm, we examined whether participants would categorize color and grayscale images of natural scenes that were presented peripherally and ignored. We focused on (1) attentional resources allocated to natural scenes and (2) direct versus indirect processing of them. We set up low and high attention-load conditions, based on the set size of the searched stimuli in the prime display (one and five). Participants were required to detect and categorize the target objects in natural scenes in a central visual search task, ignoring peripheral natural images in both the prime and probe displays. The results showed that, irrespective of attention load, NP was observed for color scenes but not for grayscale scenes. We did not observe any effect of color information in central visual search, where participants responded directly to natural scenes. These results indicate that, in a situation in which participants indirectly process natural scenes, color information is critical to object categorization, but when the scenes are processed directly, color information does not contribute to categorization.
Brockmole, James R; Henderson, John M
2006-07-01
When confronted with a previously encountered scene, what information is used to guide search to a known target? We contrasted the role of a scene's basic-level category membership with its specific arrangement of visual properties. Observers were repeatedly shown photographs of scenes that contained consistently but arbitrarily located targets, allowing target positions to be associated with scene content. Learned scenes were then unexpectedly mirror reversed, spatially translating visual features as well as the target across the display while preserving the scene's identity and concept. Mirror reversals produced a cost as the eyes initially moved toward the position in the display in which the target had previously appeared. The cost was not complete, however; when initial search failed, the eyes were quickly directed to the target's new position. These results suggest that in real-world scenes, shifts of attention are initially based on scene identity, and subsequent shifts are guided by more detailed information regarding scene and object layout.
Visual Acuity Using Head-fixed Displays During Passive Self and Surround Motion
NASA Technical Reports Server (NTRS)
Wood, Scott J.; Black, F. Owen; Stallings, Valerie; Peters, Brian
2007-01-01
The ability to read head-fixed displays on various motion platforms requires the suppression of vestibulo-ocular reflexes. This study examined dynamic visual acuity while viewing a head-fixed display during different self and surround rotation conditions. Twelve healthy subjects were asked to report the orientation of Landolt C optotypes presented on a micro-display fixed to a rotating chair at 50 cm distance. Acuity thresholds were determined by the lowest size at which the subjects correctly identified 3 of 5 optotype orientations at peak velocity. Visual acuity was compared across four different conditions, each tested at 0.05 and 0.4 Hz (peak amplitude of 57 deg/s). The four conditions included: subject rotated in semi-darkness (i.e., limited to background illumination of the display), subject stationary while visual scene rotated, subject rotated around a stationary visual background, and both subject and visual scene rotated together. Visual acuity performance was greatest when the subject rotated around a stationary visual background; i.e., when both vestibular and visual inputs provided concordant information about the motion. Visual acuity performance was most reduced when the subject and visual scene rotated together; i.e., when the visual scene provided discordant information about the motion. Ranges of 4-5 logMAR step sizes across the conditions indicated the acuity task was sufficient to discriminate visual performance levels. The background visual scene can influence the ability to read head-fixed displays during passive motion disturbances. Dynamic visual acuity using head-fixed displays can provide an operationally relevant screening tool for visual performance during exposure to novel acceleration environments.
ERIC Educational Resources Information Center
Brady, Timothy F.; Tenenbaum, Joshua B.
2013-01-01
When remembering a real-world scene, people encode both detailed information about specific objects and higher order information like the overall gist of the scene. However, formal models of change detection, like those used to estimate visual working memory capacity, assume observers encode only a simple memory representation that includes no…
Brayfield, Brad P.
2016-01-01
The navigation of bees and ants from hive to food and back has captivated people for more than a century. Recently, the Navigation by Scene Familiarity Hypothesis (NSFH) has been proposed as a parsimonious approach that is congruent with the limited neural elements of these insects’ brains. In the NSFH approach, an agent completes an initial training excursion, storing images along the way. To retrace the path, the agent scans the area and compares the current scenes to those previously experienced. By turning and moving to minimize the pixel-by-pixel differences between encountered and stored scenes, the agent is guided along the path without having memorized the sequence. An important premise of the NSFH is that the visual information of the environment is adequate to guide navigation without aliasing. Here we demonstrate that an image landscape of an indoor setting possesses ample navigational information. We produced a visual landscape of our laboratory and part of the adjoining corridor consisting of 2816 panoramic snapshots arranged in a grid at 12.7-cm centers. We show that pixel-by-pixel comparisons of these images yield robust translational and rotational visual information. We also produced a simple algorithm that tracks previously experienced routes within our lab based on an insect-inspired scene familiarity approach and demonstrate that adequate visual information exists for an agent to retrace complex training routes, including those where the path’s end is not visible from its origin. We used this landscape to systematically test the interplay of sensor morphology, angles of inspection, and similarity threshold with the recapitulation performance of the agent. Finally, we compared the relative information content and chance of aliasing within our visually rich laboratory landscape to scenes acquired from indoor corridors with more repetitive scenery. PMID:27119720
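A minimal sketch of one scene-familiarity navigation step under simplifying assumptions: panoramic views are small grayscale arrays, rotation is modeled as a circular shift of columns, and familiarity is the lowest pixel-by-pixel difference to any stored view. The memory bank and current view below are toy stand-ins, not the authors' image grid or exact algorithm.

```python
import numpy as np

def familiarity(view, memory_bank):
    # lowest pixel-wise sum-of-squared-differences to any stored snapshot
    diffs = ((memory_bank - view) ** 2).sum(axis=(1, 2))
    return diffs.min()

def choose_heading(panorama, memory_bank, n_headings=36):
    # scan candidate headings by rotating the panorama; pick the most familiar one
    width = panorama.shape[1]
    best_shift, best_score = 0, np.inf
    for i in range(n_headings):
        shift = int(i * width / n_headings)
        score = familiarity(np.roll(panorama, shift, axis=1), memory_bank)
        if score < best_score:
            best_shift, best_score = shift, score
    return 360.0 * best_shift / width  # heading in degrees relative to the current view

# Hypothetical usage with toy data (real snapshots would come from the image grid):
rng = np.random.default_rng(1)
memory_bank = rng.random((50, 10, 90))          # 50 stored views, 10 x 90 pixels each
current = np.roll(memory_bank[7], 12, axis=1)   # current view is a rotated stored view
print(choose_heading(current, memory_bank))
```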
Shapiro, Arthur G; Hamburger, Kai
2007-01-01
A central tenet of Gestalt psychology is that the visual scene can be separated into figure and ground. The two illusions we present demonstrate that Gestalt processes can group spatial contrast information that cuts across the figure/ground separation. This finding suggests that visual processes that organise the visual scene do not necessarily require structural segmentation as their primary input.
The role of memory for visual search in scenes
Võ, Melissa Le-Hoa; Wolfe, Jeremy M.
2014-01-01
Many daily activities involve looking for something. The ease with which these searches are performed often allows one to forget that searching represents complex interactions between visual attention and memory. While a clear understanding exists of how search efficiency will be influenced by visual features of targets and their surrounding distractors or by the number of items in the display, the role of memory in search is less well understood. Contextual cueing studies have shown that implicit memory for repeated item configurations can facilitate search in artificial displays. When searching more naturalistic environments, other forms of memory come into play. For instance, semantic memory provides useful information about which objects are typically found where within a scene, and episodic scene memory provides information about where a particular object was seen the last time a particular scene was viewed. In this paper, we will review work on these topics, with special emphasis on the role of memory in guiding search in organized, real-world scenes. PMID:25684693
The role of memory for visual search in scenes.
Le-Hoa Võ, Melissa; Wolfe, Jeremy M
2015-03-01
Many daily activities involve looking for something. The ease with which these searches are performed often allows one to forget that searching represents complex interactions between visual attention and memory. Although a clear understanding exists of how search efficiency will be influenced by visual features of targets and their surrounding distractors or by the number of items in the display, the role of memory in search is less well understood. Contextual cueing studies have shown that implicit memory for repeated item configurations can facilitate search in artificial displays. When searching more naturalistic environments, other forms of memory come into play. For instance, semantic memory provides useful information about which objects are typically found where within a scene, and episodic scene memory provides information about where a particular object was seen the last time a particular scene was viewed. In this paper, we will review work on these topics, with special emphasis on the role of memory in guiding search in organized, real-world scenes. © 2015 New York Academy of Sciences.
He, Mengyang; Qi, Changzhu; Lu, Yang; Song, Amanda; Hayat, Saba Z; Xu, Xia
2018-05-21
Extensive studies have shown that sports experts are superior to novices in the visual perceptual-cognitive processing of sport scene information; however, the attentional and neural basis of this superiority has not been thoroughly explored. The present study examined whether sport experts show attentional superiority for scene information relevant to their sport skill, and explored what factor drives this superiority. To address this problem, EEGs were recorded while participants passively viewed sport scenes (tennis vs. non-tennis) and negative emotional faces in the context of a visual attention task, in which pictures of sport scenes or of negative emotional faces randomly followed pictures of overlapping sport scenes and negative emotional faces. ERP results showed that for experts, the evoked potential of attentional competition elicited by the tennis-scene overlap was significantly larger than that evoked by the non-tennis overlap, whereas this effect was absent for novices. LORETA showed that the experts' left medial frontal gyrus (MFG) was significantly more active than the right MFG when processing the tennis-scene overlap, but this lateralization effect was not significant in novices. These results indicate that experts have attentional superiority for skill-related scene information, even when the scene is overlapped with negative emotional faces, strong distractors that tend to bias attention toward themselves. This superiority is driven by activation of the left MFG and is probably due to self-reference. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
Pilot Task Profiles, Human Factors, And Image Realism
NASA Astrophysics Data System (ADS)
McCormick, Dennis
1982-06-01
Computer Image Generation (CIG) visual systems provide real time scenes for state-of-the-art flight training simulators. The visual system requires a greater understanding of training tasks, human factors, and the concept of image realism to produce an effective and efficient training scene than is required by other types of visual systems. Image realism must be defined in terms of pilot visual information requirements. Human factors analysis of training and perception is necessary to determine the pilot's information requirements. System analysis then determines how the CIG and display device can best provide essential information to the pilot. This analysis procedure ensures optimum training effectiveness and system performance.
Computational mechanisms underlying cortical responses to the affordance properties of visual scenes
Epstein, Russell A.
2018-01-01
Biologically inspired deep convolutional neural networks (CNNs), trained for computer vision tasks, have been found to predict cortical responses with remarkable accuracy. However, the internal operations of these models remain poorly understood, and the factors that account for their success are unknown. Here we develop a set of techniques for using CNNs to gain insights into the computational mechanisms underlying cortical responses. We focused on responses in the occipital place area (OPA), a scene-selective region of dorsal occipitoparietal cortex. In a previous study, we showed that fMRI activation patterns in the OPA contain information about the navigational affordances of scenes; that is, information about where one can and cannot move within the immediate environment. We hypothesized that this affordance information could be extracted using a set of purely feedforward computations. To test this idea, we examined a deep CNN with a feedforward architecture that had been previously trained for scene classification. We found that responses in the CNN to scene images were highly predictive of fMRI responses in the OPA. Moreover, the CNN accounted for the portion of OPA variance relating to the navigational affordances of scenes. The CNN could thus serve as an image-computable candidate model of affordance-related responses in the OPA. We then ran a series of in silico experiments on this model to gain insights into its internal operations. These analyses showed that the computation of affordance-related features relied heavily on visual information at high-spatial frequencies and cardinal orientations, both of which have previously been identified as low-level stimulus preferences of scene-selective visual cortex. These computations also exhibited a strong preference for information in the lower visual field, which is consistent with known retinotopic biases in the OPA. Visualizations of feature selectivity within the CNN suggested that affordance-based responses encoded features that define the layout of the spatial environment, such as boundary-defining junctions and large extended surfaces. Together, these results map the sensory functions of the OPA onto a fully quantitative model that provides insights into its visual computations. More broadly, they advance integrative techniques for understanding visual cortex across multiple levels of analysis: from the identification of cortical sensory functions to the modeling of their underlying algorithms. PMID:29684011
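The encoding analysis described above can be caricatured as a regularized linear mapping from CNN-layer activations to voxel responses. The sketch below is an assumption-laden illustration: the feature matrix and voxel responses are synthetic placeholders, whereas the study used activations from a pretrained scene-classification CNN and measured fMRI data.

```python
import numpy as np

def fit_ridge(X, Y, lam=1.0):
    # closed-form ridge regression: one weight vector per voxel (columns of Y)
    n_feat = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_feat), X.T @ Y)

rng = np.random.default_rng(3)
cnn_feats = rng.random((100, 512))    # 100 scene images x 512 CNN-layer units (placeholder)
voxels = cnn_feats @ rng.random((512, 30)) + 0.1 * rng.random((100, 30))  # 30 synthetic voxels

W = fit_ridge(cnn_feats[:80], voxels[:80])        # fit on the first 80 images
pred = cnn_feats[80:] @ W                         # predict held-out responses
# voxel-wise prediction accuracy: correlation between predicted and observed responses
r = [np.corrcoef(pred[:, v], voxels[80:, v])[0, 1] for v in range(voxels.shape[1])]
print(np.mean(r))
```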
The neural bases of spatial frequency processing during scene perception
Kauffmann, Louise; Ramanoël, Stephen; Peyrin, Carole
2014-01-01
Theories on visual perception agree that scenes are processed in terms of spatial frequencies. Low spatial frequencies (LSF) carry coarse information whereas high spatial frequencies (HSF) carry fine details of the scene. However, how and where spatial frequencies are processed within the brain remain unresolved questions. The present review addresses these issues and aims to identify the cerebral regions differentially involved in low and high spatial frequency processing, and to clarify their attributes during scene perception. Results from a number of behavioral and neuroimaging studies suggest that spatial frequency processing is lateralized in both hemispheres, with the right and left hemispheres predominantly involved in the categorization of LSF and HSF scenes, respectively. There is also evidence that spatial frequency processing is retinotopically mapped in the visual cortex. HSF scenes (as opposed to LSF) activate occipital areas in relation to foveal representations, while categorization of LSF scenes (as opposed to HSF) activates occipital areas in relation to more peripheral representations. Concomitantly, a number of studies have demonstrated that LSF information may reach high-order areas rapidly, allowing an initial coarse parsing of the visual scene, which could then be sent back through feedback into the occipito-temporal cortex to guide finer HSF-based analysis. Finally, the review addresses spatial frequency processing within scene-selective regions of the occipito-temporal cortex. PMID:24847226
The Importance of Information Localization in Scene Gist Recognition
ERIC Educational Resources Information Center
Loschky, Lester C.; Sethi, Amit; Simons, Daniel J.; Pydimarri, Tejaswi N.; Ochs, Daniel; Corbeille, Jeremy L.
2007-01-01
People can recognize the meaning or gist of a scene from a single glance, and a few recent studies have begun to examine the sorts of information that contribute to scene gist recognition. The authors of the present study used visual masking coupled with image manipulations (randomizing phase while maintaining the Fourier amplitude spectrum;…
The effects of alcohol intoxication on attention and memory for visual scenes.
Harvey, Alistair J; Kneller, Wendy; Campbell, Alison C
2013-01-01
This study tests the claim that alcohol intoxication narrows the focus of visual attention on to the more salient features of a visual scene. A group of alcohol intoxicated and sober participants had their eye movements recorded as they encoded a photographic image featuring a central event of either high or low salience. All participants then recalled the details of the image the following day when sober. We sought to determine whether the alcohol group would pay less attention to the peripheral features of the encoded scene than their sober counterparts, whether this effect of attentional narrowing was stronger for the high-salience event than for the low-salience event, and whether it would lead to a corresponding deficit in peripheral recall. Alcohol was found to narrow the focus of foveal attention to the central features of both images but did not facilitate recall from this region. It also reduced the overall amount of information accurately recalled from each scene. These findings demonstrate that the concept of alcohol myopia originally posited to explain the social consequences of intoxication (Steele & Josephs, 1990) may be extended to explain the relative neglect of peripheral information during the processing of visual scenes.
Fu, Kun; Jin, Junqi; Cui, Runpeng; Sha, Fei; Zhang, Changshui
2017-12-01
Recent progress on automatic generation of image captions has shown that it is possible to describe the most salient information conveyed by images with accurate and meaningful sentences. In this paper, we propose an image captioning system that exploits the parallel structures between images and sentences. In our model, the process of generating the next word, given the previously generated ones, is aligned with the visual perception experience where the attention shifts among the visual regions; such transitions impose a thread of ordering in visual perception. This alignment characterizes the flow of latent meaning, which encodes what is semantically shared by both the visual scene and the text description. Our system also makes another novel modeling contribution by introducing scene-specific contexts that capture higher-level semantic information encoded in an image. The contexts adapt language models for word generation to specific scene types. We benchmark our system and contrast to published results on several popular datasets, using both automatic evaluation metrics and human evaluation. We show that either region-based attention or scene-specific contexts improves systems without those components. Furthermore, combining these two modeling ingredients attains the state-of-the-art performance.
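A toy, untrained sketch of the region-attention step described above: attention weights over region features are computed from the decoder state, and the attended summary is combined with a scene-specific context vector before predicting the next word. Dimensions, weights, and the scene-context vector are placeholder assumptions; the actual system learns these components end-to-end.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(5)
regions = rng.random((14, 256))      # 14 image regions x 256-d features (placeholder)
state = rng.random(128)              # current decoder hidden state (placeholder)
scene_ctx = rng.random(64)           # scene-specific context vector, e.g. "kitchen" (placeholder)
W_att = rng.random((128 + 256, 1))   # attention scoring weights (toy, untrained)

# score each region against the current decoder state, then normalize
scores = np.array([float(np.concatenate([state, r]) @ W_att[:, 0]) for r in regions])
alpha = softmax(scores)                                 # where attention "shifts" this step
attended = alpha @ regions                              # weighted summary of region features
decoder_input = np.concatenate([attended, scene_ctx])   # fed to the next-word predictor
```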
Is that disgust I see? Political ideology and biased visual attention.
Oosterhoff, Benjamin; Shook, Natalie J; Ford, Cameron
2018-01-15
Considerable evidence suggests that political liberals and conservatives vary in the way they process and respond to valenced (i.e., negative versus positive) information, with conservatives generally displaying greater negativity biases than liberals. Less is known about whether liberals and conservatives differentially prioritize certain forms of negative information over others. Across two studies using eye-tracking methodology, we examined differences in visual attention to negative scenes and facial expressions based on self-reported political ideology. In Study 1, scenes rated high in fear, disgust, sadness, and neutrality were presented simultaneously. Greater endorsement of socially conservative political attitudes was associated with less attentional engagement (i.e., lower dwell time) of disgust scenes and more attentional engagement toward neutral scenes. Socially conservative political attitudes were not significantly associated with visual attention to fear or sad scenes. In Study 2, images depicting facial expressions of fear, disgust, sadness, and neutrality were presented simultaneously. Greater endorsement of socially conservative political attitudes was associated with greater attentional engagement with facial expressions depicting disgust and less attentional engagement toward neutral faces. Visual attention to fearful or sad faces was not related to social conservatism. Endorsement of economically conservative political attitudes was not consistently associated with biases in visual attention across both studies. These findings support disease-avoidance models and suggest that social conservatism may be rooted within a greater sensitivity to disgust-related information. Copyright © 2017 Elsevier B.V. All rights reserved.
Eye guidance during real-world scene search: The role color plays in central and peripheral vision.
Nuthmann, Antje; Malcolm, George L
2016-01-01
The visual system utilizes environmental features to direct gaze efficiently when locating objects. While previous research has isolated various features' contributions to gaze guidance, these studies generally used sparse displays and did not investigate how features facilitated search as a function of their location on the visual field. The current study investigated how features across the visual field, particularly color, facilitate gaze guidance during real-world search. A gaze-contingent window followed participants' eye movements, restricting color information to specified regions. Scene images were presented in full color; with color in the periphery and gray in central vision; with gray in the periphery and color in central vision; or entirely in grayscale. Color conditions were crossed with a search cue manipulation, with the target cued either with a word label or an exact picture. Search times increased as color information in the scene decreased. A gaze-data-based decomposition of search time revealed color-mediated effects on specific subprocesses of search. Color in peripheral vision facilitated target localization, whereas color in central vision facilitated target verification. Picture cues facilitated search, with the effects of cue specificity and scene color combining additively. When available, the visual system utilizes the environment's color information to facilitate different real-world visual search behaviors based on the location within the visual field.
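An illustrative construction of one display condition (color preserved in central vision, grayscale in the periphery) around a given gaze position. This is a sketch under simplifying assumptions: the luminance conversion, window radius, and random image are placeholders, and the real-time gaze-contingent updating used in the study is not shown.

```python
import numpy as np

def central_color_window(rgb, gaze_xy, radius_px):
    # rgb: (H, W, 3) float image; gaze_xy: (x, y) in pixels
    h, w, _ = rgb.shape
    gray = rgb.mean(axis=2, keepdims=True).repeat(3, axis=2)  # simple luminance proxy
    yy, xx = np.mgrid[0:h, 0:w]
    central = ((xx - gaze_xy[0]) ** 2 + (yy - gaze_xy[1]) ** 2) <= radius_px ** 2
    out = gray.copy()
    out[central] = rgb[central]   # keep original color only inside the central window
    return out

scene = np.random.default_rng(4).random((480, 640, 3))   # stand-in for a scene photograph
display = central_color_window(scene, gaze_xy=(320, 240), radius_px=80)
```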
Pasqualotto, Achille; Esenkaya, Tayfun
2016-01-01
Visual-to-auditory sensory substitution is used to convey visual information through audition, and it was initially created to compensate for blindness; it consists of software converting the visual images captured by a video-camera into the equivalent auditory images, or "soundscapes". Here, it was used by blindfolded sighted participants to learn the spatial position of simple shapes depicted in images arranged on the floor. Very few studies have used sensory substitution to investigate spatial representation, while it has been widely used to investigate object recognition. Additionally, with sensory substitution we could study the performance of participants actively exploring the environment through audition, rather than passively localizing sound sources. Blindfolded participants egocentrically learnt the position of six images by using sensory substitution and then a judgment of relative direction task (JRD) was used to determine how this scene was represented. This task consists of imagining being in a given location, oriented in a given direction, and pointing towards the required image. Before performing the JRD task, participants explored a map that provided allocentric information about the scene. Although spatial exploration was egocentric, surprisingly we found that performance in the JRD task was better for allocentric perspectives. This suggests that the egocentric representation of the scene was updated. This result is in line with previous studies using visual and somatosensory scenes, thus supporting the notion that different sensory modalities produce equivalent spatial representation(s). Moreover, our results have practical implications to improve training methods with sensory substitution devices (SSD).
Songnian, Zhao; Qi, Zou; Chang, Liu; Xuemin, Liu; Shousi, Sun; Jun, Qiu
2014-04-23
How it is possible to "faithfully" represent a three-dimensional stereoscopic scene using Cartesian coordinates on a plane, and how three-dimensional perceptions differ between an actual scene and an image of the same scene are questions that have not yet been explored in depth. They seem like commonplace phenomena, but in fact, they are important and difficult issues for visual information processing, neural computation, physics, psychology, cognitive psychology, and neuroscience. The results of this study show that the use of plenoptic (or all-optical) functions and their dual plane parameterizations can not only explain the nature of information processing from the retina to the primary visual cortex and, in particular, the characteristics of the visual pathway's optical system and its affine transformation, but they can also clarify the reason why the vanishing point and line exist in a visual image. In addition, they can better explain the reasons why a three-dimensional Cartesian coordinate system can be introduced into the two-dimensional plane to express a real three-dimensional scene. 1. We introduce two different mathematical expressions of the plenoptic functions, Pw and Pv, that can describe the objective world. We also analyze the differences between these two functions when describing visual depth perception, that is, the difference between how these two functions obtain the depth information of an external scene. 2. The main results include a basic method for introducing a three-dimensional Cartesian coordinate system into a two-dimensional plane to express the depth of a scene, its constraints, and algorithmic implementation. In particular, we include a method to separate the plenoptic function and proceed with the corresponding transformation in the retina and visual cortex. 3. We propose that size constancy, the vanishing point, and vanishing line form the basis of visual perception of the outside world, and that the introduction of a three-dimensional Cartesian coordinate system into a two-dimensional plane reveals a corresponding mapping between a retinal image and the vanishing point and line.
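A toy pinhole-projection example of the vanishing-point behavior discussed above: image projections of points receding along parallel 3-D lines converge toward a single image point. The camera model, focal length, and coordinates below are illustrative assumptions, not the paper's plenoptic-function formalism.

```python
import numpy as np

def project(points, f=1.0):
    # perspective projection onto the plane z = f (camera at the origin, looking along +z)
    pts = np.asarray(points, dtype=float)
    return f * pts[:, :2] / pts[:, 2:3]

# two parallel 3-D lines with direction (0, 0, 1) but different offsets
t = np.linspace(1, 200, 6)[:, None]
line_a = np.hstack([np.full_like(t, 1.0), np.full_like(t, -0.5), t])
line_b = np.hstack([np.full_like(t, -2.0), np.full_like(t, 0.5), t])
# as depth grows, both projections approach the same vanishing point (0, 0)
print(project(line_a)[-1], project(line_b)[-1])
```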
Synchronization of spontaneous eyeblinks while viewing video stories
Nakano, Tamami; Yamamoto, Yoshiharu; Kitajo, Keiichi; Takahashi, Toshimitsu; Kitazawa, Shigeru
2009-01-01
Blinks are generally suppressed during a task that requires visual attention and tend to occur immediately before or after the task when the timing of its onset and offset is explicitly given. During the viewing of video stories, blinks are expected to occur at explicit breaks such as scene changes. However, given that the scene length is unpredictable, there should also be appropriate timing for blinking within a scene to prevent temporal loss of critical visual information. Here, we show that spontaneous blinks were highly synchronized between and within subjects when they viewed the same short video stories, but were not explicitly tied to the scene breaks. Synchronized blinks occurred during scenes that required less attention, such as at the conclusion of an action, during the absence of the main character, during a long shot and during repeated presentations of a similar scene. In contrast, blink synchronization was not observed when subjects viewed a background video or when they listened to a story read aloud. The results suggest that humans share a mechanism for controlling the timing of blinks that searches for an implicit timing that is appropriate to minimize the chance of losing critical information while viewing a stream of visual events. PMID:19640888
Common and Innovative Visuals: A sparsity modeling framework for video.
Abdolhosseini Moghadam, Abdolreza; Kumar, Mrityunjay; Radha, Hayder
2014-05-02
Efficient video representation models are critical for many video analysis and processing tasks. In this paper, we present a framework based on the concept of finding the sparsest solution to model video frames. To model the spatio-temporal information, frames from one scene are decomposed into two components: (i) a common frame, which describes the visual information common to all the frames in the scene/segment, and (ii) a set of innovative frames, which depicts the dynamic behaviour of the scene. The proposed approach exploits and builds on recent results in the field of compressed sensing to jointly estimate the common frame and the innovative frames for each video segment. We refer to the proposed modeling framework as CIV (Common and Innovative Visuals). We show how the proposed model can be utilized to find scene change boundaries and extend CIV to videos from multiple scenes. Furthermore, the proposed model is robust to noise and can be used for various video processing applications without relying on motion estimation and detection or image segmentation. Results for object tracking, video editing (object removal, inpainting) and scene change detection are presented to demonstrate the efficiency and the performance of the proposed model.
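As a rough illustration of the common/innovation decomposition described above (a median-based common frame with soft-thresholded residuals, which is a simplification and not the authors' joint compressed-sensing estimator):

import numpy as np

def civ_decompose(frames, thresh=10.0):
    """Split frames of one scene into a common frame plus sparse innovations.

    frames : array of shape (n_frames, height, width), grayscale intensities.
    thresh : soft-threshold controlling how sparse the innovations become.
    """
    frames = np.asarray(frames, dtype=float)
    # Common visual information: a robust per-pixel estimate across the segment.
    common = np.median(frames, axis=0)
    # Innovations: per-frame deviations from the common frame, made sparse
    # by soft-thresholding small (noise-level) differences to zero.
    residual = frames - common
    innovations = np.sign(residual) * np.maximum(np.abs(residual) - thresh, 0.0)
    return common, innovations

def innovation_energy(innovations):
    # A jump in innovation energy between consecutive segments can flag a scene change.
    return (innovations ** 2).mean(axis=(1, 2))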
Anticipation in Real-World Scenes: The Role of Visual Context and Visual Memory.
Coco, Moreno I; Keller, Frank; Malcolm, George L
2016-11-01
The human sentence processor is able to make rapid predictions about upcoming linguistic input. For example, upon hearing the verb eat, anticipatory eye-movements are launched toward edible objects in a visual scene (Altmann & Kamide, 1999). However, the cognitive mechanisms that underlie anticipation remain to be elucidated in ecologically valid contexts. Previous research has, in fact, mainly used clip-art scenes and object arrays, raising the possibility that anticipatory eye-movements are limited to displays containing a small number of objects in a visually impoverished context. In Experiment 1, we confirm that anticipation effects occur in real-world scenes and investigate the mechanisms that underlie such anticipation. In particular, we demonstrate that real-world scenes provide contextual information that anticipation can draw on: When the target object is not present in the scene, participants infer and fixate regions that are contextually appropriate (e.g., a table upon hearing eat). Experiment 2 investigates whether such contextual inference requires the co-presence of the scene, or whether memory representations can be utilized instead. The same real-world scenes as in Experiment 1 are presented to participants, but the scene disappears before the sentence is heard. We find that anticipation occurs even when the screen is blank, including when contextual inference is required. We conclude that anticipatory language processing is able to draw upon global scene representations (such as scene type) to make contextual inferences. These findings are compatible with theories assuming contextual guidance, but pose a challenge for theories assuming object-based visual indices. Copyright © 2015 Cognitive Science Society, Inc.
The Influence of Scene Context on Parafoveal Processing of Objects.
Castelhano, Monica S; Pereira, Effie J
2017-04-21
Many studies in reading have shown the enhancing effect of context on the processing of a word before it is directly fixated (parafoveal processing of words; Balota et al., 1985; Balota & Rayner, 1983; Ehrlich & Rayner, 1981). Here, we examined whether scene context influences the parafoveal processing of objects and enhances the extraction of object information. Using a modified boundary paradigm (Rayner, 1975), the Dot-Boundary paradigm, participants fixated on a suddenly-onsetting cue before the preview object would onset 4° away. The preview object could be identical to the target, visually similar, visually dissimilar, or a control (black rectangle). The preview changed to the target object once a saccade toward the object was made. Critically, the objects were presented on either a consistent or an inconsistent scene background. Results revealed that there was a greater processing benefit for consistent than inconsistent scene backgrounds and that identical and visually similar previews produced greater processing benefits than other previews. In the second experiment, we added an additional context condition in which the target location was inconsistent, but the scene semantics remained consistent. We found that changing the location of the target object disrupted the processing benefit derived from the consistent context. Most importantly, across both experiments, the effect of preview was not enhanced by scene context. Thus, preview information and scene context appear to independently boost the parafoveal processing of objects without any interaction from object-scene congruency.
A scheme for racquet sports video analysis with the combination of audio-visual information
NASA Astrophysics Data System (ADS)
Xing, Liyuan; Ye, Qixiang; Zhang, Weigang; Huang, Qingming; Yu, Hua
2005-07-01
Although racquet sports video, e.g. table tennis, tennis and badminton, is an important category of sports video, it has received little attention in past years. Considering the characteristics of this kind of sports video, we propose a new scheme for structure indexing and highlight generation based on the combination of audio and visual information. First, a supervised classification method is employed to detect important audio symbols including impacts (ball hits), audience cheers, commentator speech, etc., while an unsupervised algorithm is proposed to group video shots into clusters. Second, by taking advantage of the temporal relationship between audio and visual signals, the scene clusters are assigned semantic labels such as rally scenes and break scenes. Third, a refinement procedure reduces false rally scenes through further audio analysis. Finally, an excitement model is proposed to rank the detected rally scenes so that exciting video clips such as game (match) points can be correctly retrieved. Experiments on two representative types of racquet sports video, table tennis and tennis, demonstrate encouraging results.
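A schematic of the audio-visual labeling step described above might look as follows; the event symbols, threshold, and data layout are placeholders, and the supervised audio classifier and shot clustering are assumed to have run beforehand:

def label_scene_clusters(shot_clusters, audio_events, min_hits=3):
    """Assign 'rally' or 'break' labels to visual shot clusters.

    shot_clusters : dict mapping cluster_id -> list of (start, end) shot times in seconds.
    audio_events  : list of (time, symbol) pairs, e.g. symbol in {'impact', 'cheer', 'speech'}.
    A cluster whose shots contain repeated ball-impact sounds is labeled a rally
    cluster; all other clusters are treated as break clusters.
    """
    labels = {}
    for cluster_id, shots in shot_clusters.items():
        impacts = sum(1 for t, symbol in audio_events
                      if symbol == 'impact' and any(start <= t <= end for start, end in shots))
        labels[cluster_id] = 'rally' if impacts >= min_hits else 'break'
    return labels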
Neural representations of contextual guidance in visual search of real-world scenes.
Preston, Tim J; Guo, Fei; Das, Koel; Giesbrecht, Barry; Eckstein, Miguel P
2013-05-01
Exploiting scene context and object-object co-occurrence is critical in guiding eye movements and facilitating visual search, yet the mediating neural mechanisms are unknown. We used functional magnetic resonance imaging while observers searched for target objects in scenes and used multivariate pattern analyses (MVPA) to show that the lateral occipital complex (LOC) can predict the coarse spatial location of observers' expectations about the likely location of 213 different targets absent from the scenes. In addition, we found weaker but significant representations of context location in an area related to the orienting of attention (intraparietal sulcus, IPS) as well as a region related to scene processing (retrosplenial cortex, RSC). Importantly, the degree of agreement among 100 independent raters about the likely location to contain a target object in a scene correlated with LOC's ability to predict the contextual location while weaker but significant effects were found in IPS, RSC, the human motion area, and early visual areas (V1, V3v). When contextual information was made irrelevant to observers' behavioral task, the MVPA analysis of LOC and the other areas' activity ceased to predict the location of context. Thus, our findings suggest that the likely locations of targets in scenes are represented in various visual areas with LOC playing a key role in contextual guidance during visual search of objects in real scenes.
Active sensing in the categorization of visual patterns
Yang, Scott Cheng-Hsin; Lengyel, Máté; Wolpert, Daniel M
2016-01-01
Interpreting visual scenes typically requires us to accumulate information from multiple locations in a scene. Using a novel gaze-contingent paradigm in a visual categorization task, we show that participants' scan paths follow an active sensing strategy that incorporates information already acquired about the scene and knowledge of the statistical structure of patterns. Intriguingly, categorization performance was markedly improved when locations were revealed to participants by an optimal Bayesian active sensor algorithm. By using a combination of a Bayesian ideal observer and the active sensor algorithm, we estimate that a major portion of this apparent suboptimality of fixation locations arises from prior biases, perceptual noise and inaccuracies in eye movements, and the central process of selecting fixation locations is around 70% efficient in our task. Our results suggest that participants select eye movements with the goal of maximizing information about abstract categories that require the integration of information from multiple locations. DOI: http://dx.doi.org/10.7554/eLife.12215.001 PMID:26880546
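The heart of a Bayesian active sensor, choosing the next fixation to maximize expected information about the category, can be sketched for a toy case with two categories and binary pixel observations; the generative probabilities below are illustrative assumptions, not the patterns used in the study:

import numpy as np

def binary_entropy(p):
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def best_next_fixation(p_category_a, pixel_white_given_category):
    """Choose the fixation location with the largest expected information gain.

    p_category_a               : current belief that the hidden pattern is category A.
    pixel_white_given_category : list of (p_white_if_A, p_white_if_B) for each
                                 candidate location (binary pixel values assumed).
    Returns the index of the most informative location to reveal next.
    """
    gains = []
    for p_white_if_a, p_white_if_b in pixel_white_given_category:
        p_white = p_category_a * p_white_if_a + (1 - p_category_a) * p_white_if_b
        # Posterior belief in category A for each possible observation.
        post_if_white = p_category_a * p_white_if_a / max(p_white, 1e-12)
        post_if_black = p_category_a * (1 - p_white_if_a) / max(1 - p_white, 1e-12)
        expected_entropy = (p_white * binary_entropy(post_if_white)
                            + (1 - p_white) * binary_entropy(post_if_black))
        gains.append(binary_entropy(p_category_a) - expected_entropy)
    return int(np.argmax(gains))

# Example: with a 60% belief in category A, the location whose pixel statistics
# differ most between the categories is the most informative to fixate.
# best_next_fixation(0.6, [(0.5, 0.5), (0.9, 0.2), (0.7, 0.6)])  ->  1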
Pasqualotto, Achille; Esenkaya, Tayfun
2016-01-01
Visual-to-auditory sensory substitution is used to convey visual information through audition, and it was initially created to compensate for blindness; it consists of software converting the visual images captured by a video-camera into the equivalent auditory images, or “soundscapes”. Here, it was used by blindfolded sighted participants to learn the spatial position of simple shapes depicted in images arranged on the floor. Very few studies have used sensory substitution to investigate spatial representation, while it has been widely used to investigate object recognition. Additionally, with sensory substitution we could study the performance of participants actively exploring the environment through audition, rather than passively localizing sound sources. Blindfolded participants egocentrically learnt the position of six images by using sensory substitution and then a judgment of relative direction task (JRD) was used to determine how this scene was represented. This task consists of imagining being in a given location, oriented in a given direction, and pointing towards the required image. Before performing the JRD task, participants explored a map that provided allocentric information about the scene. Although spatial exploration was egocentric, surprisingly we found that performance in the JRD task was better for allocentric perspectives. This suggests that the egocentric representation of the scene was updated. This result is in line with previous studies using visual and somatosensory scenes, thus supporting the notion that different sensory modalities produce equivalent spatial representation(s). Moreover, our results have practical implications for improving training methods with sensory substitution devices (SSDs). PMID:27148000
Greene, Michelle R; Baldassano, Christopher; Fei-Fei, Li; Beck, Diane M; Baker, Chris I
2018-01-01
Inherent correlations between visual and semantic features in real-world scenes make it difficult to determine how different scene properties contribute to neural representations. Here, we assessed the contributions of multiple properties to scene representation by partitioning the variance explained in human behavioral and brain measurements by three feature models whose inter-correlations were minimized a priori through stimulus preselection. Behavioral assessments of scene similarity reflected unique contributions from a functional feature model indicating potential actions in scenes as well as high-level visual features from a deep neural network (DNN). In contrast, similarity of cortical responses in scene-selective areas was uniquely explained by mid- and high-level DNN features only, while an object label model did not contribute uniquely to either domain. The striking dissociation between functional and DNN features in their contribution to behavioral and brain representations of scenes indicates that scene-selective cortex represents only a subset of behaviorally relevant scene information. PMID:29513219
Groen, Iris Ia; Greene, Michelle R; Baldassano, Christopher; Fei-Fei, Li; Beck, Diane M; Baker, Chris I
2018-03-07
Inherent correlations between visual and semantic features in real-world scenes make it difficult to determine how different scene properties contribute to neural representations. Here, we assessed the contributions of multiple properties to scene representation by partitioning the variance explained in human behavioral and brain measurements by three feature models whose inter-correlations were minimized a priori through stimulus preselection. Behavioral assessments of scene similarity reflected unique contributions from a functional feature model indicating potential actions in scenes as well as high-level visual features from a deep neural network (DNN). In contrast, similarity of cortical responses in scene-selective areas was uniquely explained by mid- and high-level DNN features only, while an object label model did not contribute uniquely to either domain. The striking dissociation between functional and DNN features in their contribution to behavioral and brain representations of scenes indicates that scene-selective cortex represents only a subset of behaviorally relevant scene information.
Wu, Chia-Chien; Wang, Hsueh-Cheng; Pomplun, Marc
2014-12-01
A previous study (Vision Research 51 (2011) 1192-1205) found evidence for semantic guidance of visual attention during the inspection of real-world scenes, i.e., an influence of semantic relationships among scene objects on overt shifts of attention. In particular, the results revealed an observer bias toward gaze transitions between semantically similar objects. However, this effect is not necessarily indicative of semantic processing of individual objects but may be mediated by knowledge of the scene gist, which does not require object recognition, or by known spatial dependency among objects. To examine the mechanisms underlying semantic guidance, in the present study, participants were asked to view a series of displays with the scene gist excluded and spatial dependency varied. Our results show that spatial dependency among objects seems to be sufficient to induce semantic guidance. Scene gist, on the other hand, does not seem to affect how observers use semantic information to guide attention while viewing natural scenes. Extracting semantic information mainly based on spatial dependency may be an efficient strategy of the visual system that only adds little cognitive load to the viewing task. Copyright © 2014 Elsevier Ltd. All rights reserved.
Stainer, Matthew J.; Scott-Brown, Kenneth C.; Tatler, Benjamin W.
2013-01-01
Where people look when viewing a scene has been a much explored avenue of vision research (e.g., see Tatler, 2009). Current understanding of eye guidance suggests that a combination of high and low-level factors influence fixation selection (e.g., Torralba et al., 2006), but that there are also strong biases toward the center of an image (Tatler, 2007). However, situations where we view multiplexed scenes are becoming increasingly common, and it is unclear how visual inspection might be arranged when content lacks normal semantic or spatial structure. Here we use the central bias to examine how gaze behavior is organized in scenes that are presented in their normal format, or disrupted by scrambling the quadrants and separating them by space. In Experiment 1, scrambling scenes had the strongest influence on gaze allocation. Observers were highly biased by the quadrant center, although physical space did not enhance this bias. However, the center of the display still contributed to fixation selection above chance, and was most influential early in scene viewing. When the top left quadrant was held constant across all conditions in Experiment 2, fixation behavior was significantly influenced by the overall arrangement of the display, with fixations being biased toward the quadrant center when the other three quadrants were scrambled (despite the visual information in this quadrant being identical in all conditions). When scenes are scrambled into four quadrants and semantic contiguity is disrupted, observers no longer appear to view the content as a single scene (despite it consisting of the same visual information overall), but rather anchor visual inspection around the four separate “sub-scenes.” Moreover, the frame of reference that observers use when viewing the multiplex seems to change across viewing time: from an early bias toward the display center to a later bias toward quadrant centers. PMID:24069008
On a common circle: natural scenes and Gestalt rules.
Sigman, M; Cecchi, G A; Gilbert, C D; Magnasco, M O
2001-02-13
To understand how the human visual system analyzes images, it is essential to know the structure of the visual environment. In particular, natural images display consistent statistical properties that distinguish them from random luminance distributions. We have studied the geometric regularities of oriented elements (edges or line segments) present in an ensemble of visual scenes, asking how much information the presence of a segment in a particular location of the visual scene carries about the presence of a second segment at different relative positions and orientations. We observed strong long-range correlations in the distribution of oriented segments that extend over the whole visual field. We further show that a very simple geometric rule, cocircularity, predicts the arrangement of segments in natural scenes, and that different geometrical arrangements show relevant differences in their scaling properties. Our results show similarities to geometric features of previous physiological and psychophysical studies. We discuss the implications of these findings for theories of early vision.
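The kind of pairwise statistic discussed above, how often oriented segments co-occur at given relative distances and orientation differences, can be tabulated as follows; edge extraction itself is assumed to have been done elsewhere, and the binning is arbitrary:

import numpy as np

def edge_cooccurrence(positions, orientations, r_bins, theta_bins):
    """Histogram relative distance and relative orientation over all edge pairs.

    positions    : (n, 2) array of segment centers (x, y).
    orientations : (n,) array of segment orientations in radians, in [0, pi).
    r_bins, theta_bins : bin edges for pair distance and orientation difference.
    """
    n = len(positions)
    hist = np.zeros((len(r_bins) - 1, len(theta_bins) - 1))
    for i in range(n):
        for j in range(i + 1, n):
            r = np.linalg.norm(positions[i] - positions[j])
            dtheta = abs(orientations[i] - orientations[j]) % np.pi
            dtheta = min(dtheta, np.pi - dtheta)   # orientation difference is axial
            ri = np.searchsorted(r_bins, r) - 1
            ti = np.searchsorted(theta_bins, dtheta) - 1
            if 0 <= ri < hist.shape[0] and 0 <= ti < hist.shape[1]:
                hist[ri, ti] += 1
    return hist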
Driving with indirect viewing sensors: understanding the visual perception issues
NASA Astrophysics Data System (ADS)
O'Kane, Barbara L.
1996-05-01
Visual perception is one of the most important elements of driving in that it enables the driver to understand and react appropriately to the situation along the path of the vehicle. The visual perception of the driver is enabled to the greatest extent while driving during the day. Noticeable decrements in visual acuity, range of vision, depth of field and color perception occur at night and under certain weather conditions. Indirect viewing sensors, utilizing various technologies and spectral bands, may assist the driver's normal mode of driving. Critical applications in the military as well as other official activities may require driving at night without headlights. In these latter cases, it is critical that the device, being the only source of scene information, provide the required scene cues needed for driving on, and often-times, off road. One can speculate about the scene information that a driver needs, such as road edges, terrain orientation, people and object detection in or near the path of the vehicle, and so on. But the perceptual qualities of the scene that give rise to these perceptions are little known and thus not quantified for evaluation of indirect viewing devices. This paper discusses driving with headlights and compares the scene content with that provided by a thermal system in the 8-12 μm spectral band, which may be used for driving at some time. The benefits and advantages of each are discussed as well as their limitations in providing information useful for the driver who must make rapid and critical decisions based upon the scene content available. General recommendations are made for potential avenues of development to overcome some of these limitations.
Ground-plane influences on size estimation in early visual processing.
Champion, Rebecca A; Warren, Paul A
2010-07-21
Ground-planes have an important influence on the perception of 3D space (Gibson, 1950) and it has been shown that the assumption that a ground-plane is present in the scene plays a role in the perception of object distance (Bruno & Cutting, 1988). Here, we investigate whether this influence is exerted at an early stage of processing, to affect the rapid estimation of 3D size. Participants performed a visual search task in which they searched for a target object that was larger or smaller than distracter objects. Objects were presented against a background that contained either a frontoparallel or slanted 3D surface, defined by texture gradient cues. We measured the effect on search performance of target location within the scene (near vs. far) and how this was influenced by scene orientation (which, e.g., might be consistent with a ground or ceiling plane, etc.). In addition, we investigated how scene orientation interacted with texture gradient information (indicating surface slant), to determine how these separate cues to scene layout were combined. We found that the difference in target detection performance between targets at the front and rear of the simulated scene was maximal when the scene was consistent with a ground-plane - consistent with the use of an elevation cue to object distance. In addition, we found a significant increase in the size of this effect when texture gradient information (indicating surface slant) was present, but no interaction between texture gradient and scene orientation information. We conclude that scene orientation plays an important role in the estimation of 3D size at an early stage of processing, and suggest that elevation information is linearly combined with texture gradient information for the rapid estimation of 3D size. Copyright 2010 Elsevier Ltd. All rights reserved.
Research on three-dimensional visualization based on virtual reality and Internet
NASA Astrophysics Data System (ADS)
Wang, Zongmin; Yang, Haibo; Zhao, Hongling; Li, Jiren; Zhu, Qiang; Zhang, Xiaohong; Sun, Kai
2007-06-01
To disclose and display water information, a three-dimensional visualization system based on Virtual Reality (VR) and the Internet is studied, both to demonstrate a "digital water conservancy" application and to support routine reservoir management. After a high-resolution DEM of reliable quality has been modeled, topographical analysis, visibility analysis and reservoir volume computation are studied to explore and mine in-depth information. In addition, parameters including slope, water level and NDVI are selected to classify landslide-prone zones within the water-level-fluctuation zone of the reservoir area. To establish the virtual reservoir scene, two kinds of methods are used to deliver immersion, interaction and imagination (3I). The first virtual scene contains more detailed textures to increase realism and runs on a graphical workstation with the virtual reality engine Open Scene Graph (OSG). The second virtual scene is intended for Internet users and uses fewer details to ensure fluent rendering speed.
Visualization of spatial-temporal data based on 3D virtual scene
NASA Astrophysics Data System (ADS)
Wang, Xianghong; Liu, Jiping; Wang, Yong; Bi, Junfang
2009-10-01
The main purpose of this paper is to realize three-dimensional dynamic visualization of spatial-temporal data in a three-dimensional virtual scene, using three-dimensional visualization technology combined with GIS, so that people's abilities to cognize time and space are enhanced through the design of dynamic symbols and interactive expression. Using particle systems, three-dimensional simulation, virtual reality and other visual means, we can simulate the situations produced as the spatial location and attribute information of geographical entities change over time, explore and analyze their movement and transformation rules through interaction, and replay history as well as forecast the future. The main research objects in this paper are vehicle tracks and typhoon paths: through three-dimensional dynamic simulation of these tracks, their trends can be monitored in a timely manner and their historical tracks replayed. Visualization techniques for spatial-temporal data in a three-dimensional virtual scene thus provide an excellent cognitive instrument for spatial-temporal information, one that not only shows changes and developments in a situation more clearly but can also be used to predict and deduce future developments and changes.
ERIC Educational Resources Information Center
Amit, Elinor; Mehoudar, Eyal; Trope, Yaacov; Yovel, Galit
2012-01-01
It is well established that scenes and objects elicit a highly selective response in specific brain regions in the ventral visual cortex. An inherent difference between these categories that has not been explored yet is their perceived distance from the observer (i.e. scenes are distal whereas objects are proximal). The current study aimed to test…
Thalamic nuclei convey diverse contextual information to layer 1 of visual cortex
Imhof, Fabia; Martini, Francisco J.; Hofer, Sonja B.
2017-01-01
Sensory perception depends on the context within which a stimulus occurs. Prevailing models emphasize cortical feedback as the source of contextual modulation. However, higher-order thalamic nuclei, such as the pulvinar, interconnect with many cortical and subcortical areas, suggesting a role for the thalamus in providing sensory and behavioral context – yet the nature of the signals conveyed to cortex by higher-order thalamus remains poorly understood. Here we use axonal calcium imaging to measure information provided to visual cortex by the pulvinar equivalent in mice, the lateral posterior nucleus (LP), as well as the dorsal lateral geniculate nucleus (dLGN). We found that dLGN conveys retinotopically precise visual signals, while LP provides distributed information from the visual scene. Both LP and dLGN projections carry locomotion signals. However, while dLGN inputs often respond to positive combinations of running and visual flow speed, LP signals discrepancies between self-generated and external visual motion. This higher-order thalamic nucleus therefore conveys diverse contextual signals that inform visual cortex about visual scene changes not predicted by the animal’s own actions. PMID:26691828
ERIC Educational Resources Information Center
Rieger, Jochem W.; Kochy, Nick; Schalk, Franziska; Gruschow, Marcus; Heinze, Hans-Jochen
2008-01-01
The visual system rapidly extracts information about objects from the cluttered natural environment. In 5 experiments, the authors quantified the influence of orientation and semantics on the classification speed of objects in natural scenes, particularly with regard to object-context interactions. Natural scene photographs were presented in an…
Wystrach, Antoine; Dewar, Alex; Philippides, Andrew; Graham, Paul
2016-02-01
The visual systems of animals have to provide information to guide behaviour and the informational requirements of an animal's behavioural repertoire are often reflected in its sensory system. For insects, this is often evident in the optical array of the compound eye. One behaviour that insects share with many animals is the use of learnt visual information for navigation. As ants are expert visual navigators it may be that their vision is optimised for navigation. Here we take a computational approach in asking how the details of the optical array influence the informational content of scenes used in simple view matching strategies for orientation. We find that robust orientation is best achieved with low-resolution visual information and a large field of view, similar to the optical properties seen for many ant species. A lower resolution allows for a trade-off between specificity and generalisation for stored views. Additionally, our simulations show that orientation performance increases if different portions of the visual field are considered as discrete visual sensors, each giving an independent directional estimate. This suggests that ants might benefit by processing information from their two eyes independently.
Le, Thang M; Borghi, John A; Kujawa, Autumn J; Klein, Daniel N; Leung, Hoi-Chung
2017-01-01
The present study examined the impacts of major depressive disorder (MDD) on visual and prefrontal cortical activity as well as their connectivity during visual working memory updating and related them to the core clinical features of the disorder. Impairment in working memory updating is typically associated with the retention of irrelevant negative information which can lead to persistent depressive mood and abnormal affect. However, performance deficits have been observed in MDD on tasks involving little or no demand on emotion processing, suggesting dysfunctions may also occur at the more basic level of information processing. Yet, it is unclear how various regions in the visual working memory circuit contribute to behavioral changes in MDD. We acquired functional magnetic resonance imaging data from 18 unmedicated participants with MDD and 21 age-matched healthy controls (CTL) while they performed a visual delayed recognition task with neutral faces and scenes as task stimuli. Selective working memory updating was manipulated by inserting a cue in the delay period to indicate which one or both of the two memorized stimuli (a face and a scene) would remain relevant for the recognition test. Our results revealed several key findings. Relative to the CTL group, the MDD group showed weaker postcue activations in visual association areas during selective maintenance of face and scene working memory. Across the MDD subjects, greater rumination and depressive symptoms were associated with more persistent activation and connectivity related to no-longer-relevant task information. Classification of postcue spatial activation patterns of the scene-related areas was also less consistent in the MDD subjects compared to the healthy controls. Such abnormalities appeared to result from a lack of updating effects in postcue functional connectivity between prefrontal and scene-related areas in the MDD group. In sum, disrupted working memory updating in MDD was revealed by alterations in activity patterns of the visual association areas, their connectivity with the prefrontal cortex, and their relationship with core clinical characteristics. These results highlight the role of information updating deficits in the cognitive control and symptomatology of depression.
Top-down control of visual perception: attention in natural vision.
Rolls, Edmund T
2008-01-01
Top-down perceptual influences can bias (or pre-empt) perception. In natural scenes, the receptive fields of neurons in the inferior temporal visual cortex (IT) shrink to become close to the size of objects. This facilitates the read-out of information from the ventral visual system, because the information is primarily about the object at the fovea. Top-down attentional influences are much less evident in natural scenes than when objects are shown against blank backgrounds, though are still present. It is suggested that the reduced receptive-field size in natural scenes, and the effects of top-down attention contribute to change blindness. The receptive fields of IT neurons in complex scenes, though including the fovea, are frequently asymmetric around the fovea, and it is proposed that this is the solution the IT uses to represent multiple objects and their relative spatial positions in a scene. Networks that implement probabilistic decision-making are described, and it is suggested that, when in perceptual systems they take decisions (or 'test hypotheses'), they influence lower-level networks to bias visual perception. Finally, it is shown that similar processes extend to systems involved in the processing of emotion-provoking sensory stimuli, in that word-level cognitive states provide top-down biasing that reaches as far down as the orbitofrontal cortex, where, at the first stage of affective representations, olfactory, taste, flavour, and touch processing is biased (or pre-empted) in humans.
A new method for text detection and recognition in indoor scene for assisting blind people
NASA Astrophysics Data System (ADS)
Jabnoun, Hanen; Benzarti, Faouzi; Amiri, Hamid
2017-03-01
Developing assistive systems for handicapped persons has become a challenging task in research projects. Recently, a variety of tools have been designed to help visually impaired or blind people as visual substitution systems. The majority of these tools are based on the conversion of input information into auditory or tactile sensory information. Furthermore, object recognition and text retrieval are exploited in visual substitution systems. Text detection and recognition provide a description of the surrounding environment, so that a blind person can readily recognize the scene. In this work, we introduce a method for detecting and recognizing text in indoor scenes. The process consists of detecting the regions of interest that should contain text using connected components. Text detection is then performed by employing image correlation. This component of an assistive system for blind persons should be simple, so that users are able to obtain the most informative feedback within the shortest time.
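A minimal sketch of the connected-component stage described above, using OpenCV, is given below; the thresholding choice and area limits are assumptions for illustration, and the correlation-based recognition step is not shown:

import cv2

def candidate_text_regions(gray, min_area=50, max_area=5000):
    """Find connected components that could contain text in an indoor scene image.

    gray : 8-bit grayscale image. Thresholds and area limits are illustrative.
    Returns a list of bounding boxes (x, y, w, h) for the candidate regions.
    """
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary, connectivity=8)
    boxes = []
    for i in range(1, n):                      # label 0 is the background
        x, y, w, h, area = stats[i]
        if min_area <= area <= max_area:
            boxes.append((int(x), int(y), int(w), int(h)))
    return boxes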
How high is visual short-term memory capacity for object layout?
Sanocki, Thomas; Sellers, Eric; Mittelstadt, Jeff; Sulman, Noah
2010-05-01
Previous research measuring visual short-term memory (VSTM) suggests that the capacity for representing the layout of objects is fairly high. In four experiments, we further explored the capacity of VSTM for layout of objects, using the change detection method. In Experiment 1, participants retained most of the elements in displays of 4 to 8 elements. In Experiments 2 and 3, with up to 20 elements, participants retained many of them, reaching a capacity of 13.4 stimulus elements. In Experiment 4, participants retained much of a complex naturalistic scene. In most cases, increasing display size caused only modest reductions in performance, consistent with the idea of configural, variable-resolution grouping. The results indicate that participants can retain a substantial amount of scene layout information (objects and locations) in short-term memory. We propose that this is a case of remote visual understanding, where observers' ability to integrate information from a scene is paramount.
Miskovic, Vladimir; Martinovic, Jasna; Wieser, Matthias J; Petro, Nathan M; Bradley, Margaret M; Keil, Andreas
2015-03-01
Emotionally arousing scenes readily capture visual attention, prompting amplified neural activity in sensory regions of the brain. The physical stimulus features and related information channels in the human visual system that contribute to this modulation, however, are not known. Here, we manipulated low-level physical parameters of complex scenes varying in hedonic valence and emotional arousal in order to target the relative contributions of luminance based versus chromatic visual channels to emotional perception. Stimulus-evoked brain electrical activity was measured during picture viewing and used to quantify neural responses sensitive to lower-tier visual cortical involvement (steady-state visual evoked potentials) as well as the late positive potential, reflecting a more distributed cortical event. Results showed that the enhancement for emotional content was stimulus-selective when examining the steady-state segments of the evoked visual potentials. Response amplification was present only for low spatial frequency, grayscale stimuli, and not for high spatial frequency, red/green stimuli. In contrast, the late positive potential was modulated by emotion regardless of the scene's physical properties. Our findings are discussed in relation to neurophysiologically plausible constraints operating at distinct stages of the cortical processing stream. Copyright © 2015 Elsevier B.V. All rights reserved.
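Steady-state visual evoked potential amplitude of the kind analyzed above is commonly quantified as the spectral amplitude of the EEG at the flicker frequency; a minimal sketch of that measurement (single channel, Hann window; parameter values are placeholders) follows:

import numpy as np

def ssvep_amplitude(eeg, srate, flicker_hz):
    """Amplitude of the EEG at the stimulation (flicker) frequency.

    eeg        : 1-D array, one channel's steady-state segment.
    srate      : sampling rate in Hz.
    flicker_hz : flicker frequency of the attended stimulus.
    """
    n = len(eeg)
    spectrum = np.abs(np.fft.rfft(eeg * np.hanning(n))) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / srate)
    idx = np.argmin(np.abs(freqs - flicker_hz))   # closest FFT bin to the flicker rate
    return 2 * spectrum[idx]                      # factor 2: one-sided spectrum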
Neural correlates of contextual cueing are modulated by explicit learning.
Westerberg, Carmen E; Miller, Brennan B; Reber, Paul J; Cohen, Neal J; Paller, Ken A
2011-10-01
Contextual cueing refers to the facilitated ability to locate a particular visual element in a scene due to prior exposure to the same scene. This facilitation is thought to reflect implicit learning, as it typically occurs without the observer's knowledge that scenes repeat. Unlike most other implicit learning effects, contextual cueing can be impaired following damage to the medial temporal lobe. Here we investigated neural correlates of contextual cueing and explicit scene memory in two participant groups. Only one group was explicitly instructed about scene repetition. Participants viewed a sequence of complex scenes that depicted a landscape with five abstract geometric objects. Superimposed on each object was a letter T or L rotated left or right by 90°. Participants responded according to the target letter (T) orientation. Responses were highly accurate for all scenes. Response speeds were faster for repeated versus novel scenes. The magnitude of this contextual cueing did not differ between the two groups. Also, in both groups repeated scenes yielded reduced hemodynamic activation compared with novel scenes in several regions involved in visual perception and attention, and reductions in some of these areas were correlated with response-time facilitation. In the group given instructions about scene repetition, recognition memory for scenes was superior and was accompanied by medial temporal and more anterior activation. Thus, strategic factors can promote explicit memorization of visual scene information, which appears to engage additional neural processing beyond what is required for implicit learning of object configurations and target locations in a scene. Copyright © 2011 Elsevier Ltd. All rights reserved.
Neural correlates of contextual cueing are modulated by explicit learning
Westerberg, Carmen E.; Miller, Brennan B.; Reber, Paul J.; Cohen, Neal J.; Paller, Ken A.
2011-01-01
Contextual cueing refers to the facilitated ability to locate a particular visual element in a scene due to prior exposure to the same scene. This facilitation is thought to reflect implicit learning, as it typically occurs without the observer’s knowledge that scenes repeat. Unlike most other implicit learning effects, contextual cueing can be impaired following damage to the medial temporal lobe. Here we investigated neural correlates of contextual cueing and explicit scene memory in two participant groups. Only one group was explicitly instructed about scene repetition. Participants viewed a sequence of complex scenes that depicted a landscape with five abstract geometric objects. Superimposed on each object was a letter T or L rotated left or right by 90°. Participants responded according to the target letter (T) orientation. Responses were highly accurate for all scenes. Response speeds were faster for repeated versus novel scenes. The magnitude of this contextual cueing did not differ between the two groups. Also, in both groups repeated scenes yielded reduced hemodynamic activation compared with novel scenes in several regions involved in visual perception and attention, and reductions in some of these areas were correlated with response-time facilitation. In the group given instructions about scene repetition, recognition memory for scenes was superior and was accompanied by medial temporal and more anterior activation. Thus, strategic factors can promote explicit memorization of visual scene information, which appears to engage additional neural processing beyond what is required for implicit learning of object configurations and target locations in a scene. PMID:21889947
From Seeing to Saying: Perceiving, Planning, Producing
ERIC Educational Resources Information Center
Kuchinsky, Stefanie Ellen
2009-01-01
Given the amount of visual information in a scene, how do speakers determine what to talk about first? One hypothesis is that speakers start talking about what has attentional priority, while another is that speakers first extract the scene gist, using the obtained relational information to generate a rudimentary sentence plan before retrieving…
Does object view influence the scene consistency effect?
Sastyin, Gergo; Niimi, Ryosuke; Yokosawa, Kazuhiko
2015-04-01
Traditional research on the scene consistency effect only used clearly recognizable object stimuli to show mutually interactive context effects for both the object and background components on scene perception (Davenport & Potter in Psychological Science, 15, 559-564, 2004). However, in real environments, objects are viewed from multiple viewpoints, including an accidental, hard-to-recognize one. When the observers named target objects in scenes (Experiments 1a and 1b, object recognition task), we replicated the scene consistency effect (i.e., there was higher accuracy for the objects with consistent backgrounds). However, there was a significant interaction effect between consistency and object viewpoint, which indicated that the scene consistency effect was more important for identifying objects in the accidental view condition than in the canonical view condition. Therefore, the object recognition system may rely more on the scene context when the object is difficult to recognize. In Experiment 2, the observers identified the background (background recognition task) while the scene consistency and object views were manipulated. The results showed that object viewpoint had no effect, while the scene consistency effect was observed. More specifically, the canonical and accidental views both equally provided contextual information for scene perception. These findings suggested that the mechanism for conscious recognition of objects could be dissociated from the mechanism for visual analysis of object images that were part of a scene. The "context" that the object images provided may have been derived from its view-invariant, relatively low-level visual features (e.g., color), rather than its semantic information.
Wilkinson, Krista M; Light, Janice; Drager, Kathryn
2012-09-01
Aided augmentative and alternative communication (AAC) interventions have been demonstrated to facilitate a variety of communication outcomes in persons with intellectual disabilities. Most aided AAC systems rely on a visual modality. When the medium for communication is visual, it seems likely that the effectiveness of intervention depends in part on the effectiveness and efficiency with which the information presented in the display can be perceived, identified, and extracted by communicators and their partners. Understanding of visual-cognitive processing - that is, how a user attends, perceives, and makes sense of the visual information on the display - therefore seems critical to designing effective aided AAC interventions. In this Forum Note, we discuss characteristics of one particular type of aided AAC display, that is, Visual Scene Displays (VSDs) as they may relate to user visual and cognitive processing. We consider three specific ways in which bodies of knowledge drawn from the visual cognitive sciences may be relevant to the composition of VSDs, with the understanding that direct research with children with complex communication needs is necessary to verify or refute our speculations.
Language-guided visual processing affects reasoning: the role of referential and spatial anchoring.
Dumitru, Magda L; Joergensen, Gitte H; Cruickshank, Alice G; Altmann, Gerry T M
2013-06-01
Language is more than a source of information for accessing higher-order conceptual knowledge. Indeed, language may determine how people perceive and interpret visual stimuli. Visual processing in linguistic contexts, for instance, mirrors language processing and happens incrementally, rather than through variously-oriented fixations over a particular scene. The consequences of this atypical visual processing are yet to be determined. Here, we investigated the integration of visual and linguistic input during a reasoning task. Participants listened to sentences containing conjunctions or disjunctions (Nancy examined an ant and/or a cloud) and looked at visual scenes containing two pictures that either matched or mismatched the nouns. Degree of match between nouns and pictures (referential anchoring) and between their expected and actual spatial positions (spatial anchoring) affected fixations as well as judgments. We conclude that language induces incremental processing of visual scenes, which in turn becomes susceptible to reasoning errors during the language-meaning verification process. Copyright © 2013 Elsevier Inc. All rights reserved.
Bekhtereva, Valeria; Müller, Matthias M
2017-10-01
Is color a critical feature in emotional content extraction and involuntary attentional orienting toward affective stimuli? Here we used briefly presented emotional distractors to investigate the extent to which color information can influence the time course of attentional bias in early visual cortex. While participants performed a demanding visual foreground task, complex unpleasant and neutral background images were displayed in color or grayscale format for a short period of 133 ms and were immediately masked. Such a short presentation poses a challenge for visual processing. In the visual detection task, participants attended to flickering squares that elicited the steady-state visual evoked potential (SSVEP), allowing us to analyze the temporal dynamics of the competition for processing resources in early visual cortex. Concurrently we measured the visual event-related potentials (ERPs) evoked by the unpleasant and neutral background scenes. The results showed (a) that the distraction effect was greater with color than with grayscale images and (b) that it lasted longer with colored unpleasant distractor images. Furthermore, classical and mass-univariate ERP analyses indicated that, when presented in color, emotional scenes elicited more pronounced early negativities (N1-EPN) relative to neutral scenes, than when the scenes were presented in grayscale. Consistent with neural data, unpleasant scenes were rated as being more emotionally negative and received slightly higher arousal values when they were shown in color than when they were presented in grayscale. Taken together, these findings provide evidence for the modulatory role of picture color on a cascade of coordinated perceptual processes: by facilitating the higher-level extraction of emotional content, color influences the duration of the attentional bias to briefly presented affective scenes in lower-tier visual areas.
Flies and humans share a motion estimation strategy that exploits natural scene statistics
Clark, Damon A.; Fitzgerald, James E.; Ales, Justin M.; Gohl, Daryl M.; Silies, Marion A.; Norcia, Anthony M.; Clandinin, Thomas R.
2014-01-01
Sighted animals extract motion information from visual scenes by processing spatiotemporal patterns of light falling on the retina. The dominant models for motion estimation exploit intensity correlations only between pairs of points in space and time. Moving natural scenes, however, contain more complex correlations. Here we show that fly and human visual systems encode the combined direction and contrast polarity of moving edges using triple correlations that enhance motion estimation in natural environments. Both species extract triple correlations with neural substrates tuned for light or dark edges, and sensitivity to specific triple correlations is retained even as light and dark edge motion signals are combined. Thus, both species separately process light and dark image contrasts to capture motion signatures that can improve estimation accuracy. This striking convergence argues that statistical structures in natural scenes have profoundly affected visual processing, driving a common computational strategy over 500 million years of evolution. PMID:24390225
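To make the distinction between pairwise and triple correlations concrete, the toy correlators below operate on a space-time contrast array; these are generic two-point and three-point correlators for illustration, not the specific triple correlators identified in the study:

import numpy as np

def pairwise_motion_signal(stim, dx=1, dt=1):
    """Hassenstein-Reichardt-style two-point correlator.

    stim : array of shape (time, space) containing contrast values.
    Correlates each point with its neighbour dx pixels away, dt frames later,
    and subtracts the mirror-symmetric term to give a signed, direction-selective output.
    """
    rightward = stim[:-dt, :-dx] * stim[dt:, dx:]
    leftward = stim[:-dt, dx:] * stim[dt:, :-dx]
    return rightward.mean() - leftward.mean()

def triple_motion_signal(stim, dx=1, dt=1):
    """A three-point (triple) correlator, which is also sensitive to contrast polarity."""
    rightward = stim[:-dt, :-dx] * stim[:-dt, dx:] * stim[dt:, dx:]
    leftward = stim[:-dt, dx:] * stim[:-dt, :-dx] * stim[dt:, :-dx]
    return rightward.mean() - leftward.mean()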
A fuzzy measure approach to motion frame analysis for scene detection. M.S. Thesis - Houston Univ.
NASA Technical Reports Server (NTRS)
Leigh, Albert B.; Pal, Sankar K.
1992-01-01
This paper addresses a solution to the problem of scene estimation of motion video data in the fuzzy set theoretic framework. Using fuzzy image feature extractors, a new algorithm is developed to compute the change of information in each of two successive frames to classify scenes. This classification process of raw input visual data can be used to establish structure for correlation. The algorithm attempts to fulfill the need for nonlinear, frame-accurate access to video data for applications such as video editing and visual document archival/retrieval systems in multimedia environments.
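One minimal way to realize the "change of information between successive frames" idea described above is to compare fuzzy brightness memberships of consecutive frames; the membership function and threshold here are illustrative assumptions, not the fuzzy feature extractors of the paper:

import numpy as np

def brightness_membership(frame, mid=128.0, spread=64.0):
    """Map pixel intensities to a fuzzy 'bright' membership value in [0, 1]."""
    return 1.0 / (1.0 + np.exp(-(frame.astype(float) - mid) / spread))

def scene_change(frame_a, frame_b, threshold=0.15):
    """Flag a scene boundary when the mean absolute change in membership is large."""
    delta = np.abs(brightness_membership(frame_a) - brightness_membership(frame_b))
    return delta.mean() > threshold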
Reduced change blindness suggests enhanced attention to detail in individuals with autism.
Smith, Hayley; Milne, Elizabeth
2009-03-01
The phenomenon of change blindness illustrates that a limited number of items within the visual scene are attended to at any one time. It has been suggested that individuals with autism focus attention on less contextually relevant aspects of the visual scene, show superior perceptual discrimination and notice details which are often ignored by typical observers. In this study we investigated change blindness in autism by asking participants to detect continuity errors deliberately introduced into a short film. Whether the continuity errors involved central/marginal or social/non-social aspects of the visual scene was varied. Thirty adolescent participants, 15 with autistic spectrum disorder (ASD) and 15 typically developing (TD) controls participated. The participants with ASD detected significantly more errors than the TD participants. Both groups identified more errors involving central rather than marginal aspects of the scene, although this effect was larger in the TD participants. There was no difference in the number of social or non-social errors detected by either group of participants. In line with previous data suggesting an abnormally broad attentional spotlight and enhanced perceptual function in individuals with ASD, the results of this study suggest enhanced awareness of the visual scene in ASD. The results of this study could reflect superior top-down control of visual search in autism, enhanced perceptual function, or inefficient filtering of visual information in ASD.
Matching optical flow to motor speed in virtual reality while running on a treadmill
Lafortuna, Claudio L.; Mugellini, Elena; Abou Khaled, Omar
2018-01-01
We investigated how visual and kinaesthetic/efferent information is integrated for speed perception in running. Twelve moderately trained to trained subjects ran on a treadmill at three different speeds (8, 10, 12 km/h) in front of a moving virtual scene. They were asked to match the visual speed of the scene to their running speed–i.e., treadmill’s speed. For each trial, participants indicated whether the scene was moving slower or faster than they were running. Visual speed was adjusted according to their response using a staircase until the Point of Subjective Equality (PSE) was reached, i.e., until visual and running speed were perceived as equivalent. For all three running speeds, participants systematically underestimated the visual speed relative to their actual running speed. Indeed, the speed of the visual scene had to exceed the actual running speed in order to be perceived as equivalent to the treadmill speed. The underestimation of visual speed was speed-dependent, and percentage of underestimation relative to running speed ranged from 15% at 8km/h to 31% at 12km/h. We suggest that this fact should be taken into consideration to improve the design of attractive treadmill-mediated virtual environments enhancing engagement into physical activity for healthier lifestyles and disease prevention and care. PMID:29641564
Matching optical flow to motor speed in virtual reality while running on a treadmill.
Caramenti, Martina; Lafortuna, Claudio L; Mugellini, Elena; Abou Khaled, Omar; Bresciani, Jean-Pierre; Dubois, Amandine
2018-01-01
We investigated how visual and kinaesthetic/efferent information is integrated for speed perception in running. Twelve moderately trained to trained subjects ran on a treadmill at three different speeds (8, 10, 12 km/h) in front of a moving virtual scene. They were asked to match the visual speed of the scene to their running speed-i.e., treadmill's speed. For each trial, participants indicated whether the scene was moving slower or faster than they were running. Visual speed was adjusted according to their response using a staircase until the Point of Subjective Equality (PSE) was reached, i.e., until visual and running speed were perceived as equivalent. For all three running speeds, participants systematically underestimated the visual speed relative to their actual running speed. Indeed, the speed of the visual scene had to exceed the actual running speed in order to be perceived as equivalent to the treadmill speed. The underestimation of visual speed was speed-dependent, and percentage of underestimation relative to running speed ranged from 15% at 8km/h to 31% at 12km/h. We suggest that this fact should be taken into consideration to improve the design of attractive treadmill-mediated virtual environments enhancing engagement into physical activity for healthier lifestyles and disease prevention and care.
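The PSE search described above relies on an adaptive staircase; a minimal one-up/one-down version (step size, trial count, and the response interface are illustrative assumptions, not the exact procedure used in the study) could look like this:

def staircase_pse(running_speed, respond, n_trials=40, start_gain=1.0, step=0.05):
    """Adjust visual-scene speed until it feels equal to the running speed.

    respond(visual_speed) should return 'slower' if the scene seems slower than
    the run and 'faster' otherwise (e.g., a participant's keypress).
    Returns the visual speed at the end of the staircase as the PSE estimate.
    """
    gain = start_gain
    for _ in range(n_trials):
        visual_speed = gain * running_speed
        if respond(visual_speed) == 'slower':
            gain += step        # scene judged too slow: speed it up
        else:
            gain -= step        # scene judged too fast: slow it down
    return gain * running_speed

A one-up/one-down rule of this kind converges on the speed judged slower and faster equally often, which is the Point of Subjective Equality.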
Rolls, Edmund T.; Webb, Tristan J.
2014-01-01
Searching for and recognizing objects in complex natural scenes is implemented by multiple saccades until the eyes reach within the reduced receptive field sizes of inferior temporal cortex (IT) neurons. We analyze and model how the dorsal and ventral visual streams both contribute to this. Saliency detection in the dorsal visual system including area LIP is modeled by graph-based visual saliency, and allows the eyes to fixate potential objects within several degrees. Visual information at the fixated location subtending approximately 9° corresponding to the receptive fields of IT neurons is then passed through a four layer hierarchical model of the ventral cortical visual system, VisNet. We show that VisNet can be trained using a synaptic modification rule with a short-term memory trace of recent neuronal activity to capture both the required view and translation invariances to allow in the model approximately 90% correct object recognition for 4 objects shown in any view across a range of 135° anywhere in a scene. The model was able to generalize correctly within the four trained views and the 25 trained translations. This approach analyses the principles by which complementary computations in the dorsal and ventral visual cortical streams enable objects to be located and recognized in complex natural scenes. PMID:25161619
The Characteristics and Limits of Rapid Visual Categorization
Fabre-Thorpe, Michèle
2011-01-01
Visual categorization appears both effortless and virtually instantaneous. The study by Thorpe et al. (1996) was the first to estimate the processing time necessary to perform fast visual categorization of animals in briefly flashed (20 ms) natural photographs. They observed a large differential EEG activity between target and distracter correct trials that developed from 150 ms after stimulus onset, a value that was later shown to be even shorter in monkeys! With such strong processing time constraints, it was difficult to escape the conclusion that rapid visual categorization was relying on massively parallel, essentially feed-forward processing of visual information. Since 1996, we have conducted a large number of studies to determine the characteristics and limits of fast visual categorization. The present chapter will review some of the main results obtained. I will argue that rapid object categorizations in natural scenes can be done without focused attention and are most likely based on coarse and unconscious visual representations activated with the first available (magnocellular) visual information. Fast visual processing proved efficient for the categorization of large superordinate object or scene categories, but shows its limits when more detailed basic representations are required. The representations for basic objects (dogs, cars) or scenes (mountain or sea landscapes) need additional processing time to be activated. This finding is at odds with the widely accepted idea that such basic representations are at the entry level of the system. Interestingly, focused attention is still not required to perform these time consuming basic categorizations. Finally we will show that object and context processing can interact very early in an ascending wave of visual information processing. We will discuss how such data could result from our experience with a highly structured and predictable surrounding world that shaped neuronal visual selectivity. PMID:22007180
Global ensemble texture representations are critical to rapid scene perception.
Brady, Timothy F; Shafer-Skelton, Anna; Alvarez, George A
2017-06-01
Traditionally, recognizing the objects within a scene has been treated as a prerequisite to recognizing the scene itself. However, research now suggests that the ability to rapidly recognize visual scenes could be supported by global properties of the scene itself rather than the objects within the scene. Here, we argue for a particular instantiation of this view: that scenes are recognized by treating them as a global texture and processing the pattern of orientations and spatial frequencies across different areas of the scene without recognizing any objects. To test this model, we asked whether there is a link between how proficient individuals are at rapid scene perception and how proficiently they represent simple spatial patterns of orientation information (global ensemble texture). We find a significant and selective correlation between these tasks, suggesting a link between scene perception and spatial ensemble tasks but not nonspatial summary statistics. In a second and third experiment, we additionally show that global ensemble texture information is not only associated with scene recognition, but that preserving only global ensemble texture information from scenes is sufficient to support rapid scene perception; however, preserving the same information is not sufficient for object recognition. Thus, global ensemble texture alone is sufficient to allow activation of scene representations but not object representations. Together, these results provide evidence for a view of scene recognition based on global ensemble texture rather than a view based purely on objects or on nonspatially localized global properties. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
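One plausible, simplified way to compute a "global ensemble texture" descriptor of the kind discussed here is to pool orientation energy over a coarse spatial grid, discarding object identity while keeping the spatial layout of orientations; the gradient-based filter and 4x4 grid below are illustrative choices, not the stimuli or analysis of the study.

import numpy as np
from scipy.ndimage import gaussian_filter

def ensemble_texture(image, n_orientations=4, grid=(4, 4)):
    # Orientation energy pooled over a spatial grid: keeps the spatial pattern of
    # orientations/spatial frequencies while discarding object-level information.
    gy, gx = np.gradient(image.astype(float))
    magnitude = np.hypot(gx, gy)
    angle = np.mod(np.arctan2(gy, gx), np.pi)          # orientation in [0, pi)
    bins = np.minimum((angle / np.pi * n_orientations).astype(int),
                      n_orientations - 1)
    h, w = image.shape
    descriptor = np.zeros(grid + (n_orientations,))
    for i in range(grid[0]):
        for j in range(grid[1]):
            cell = (slice(i * h // grid[0], (i + 1) * h // grid[0]),
                    slice(j * w // grid[1], (j + 1) * w // grid[1]))
            for o in range(n_orientations):
                descriptor[i, j, o] = magnitude[cell][bins[cell] == o].sum()
    return descriptor.ravel()

scene = gaussian_filter(np.random.rand(128, 128), 2)   # stand-in scene image
print(ensemble_texture(scene).shape)                    # (4 * 4 * 4,) = (64,)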
Processing reafferent and exafferent visual information for action and perception.
Reichenbach, Alexandra; Diedrichsen, Jörn
2015-01-01
A recent study suggests that reafferent hand-related visual information utilizes a privileged, attention-independent processing channel for motor control. This process was termed visuomotor binding to reflect its proposed function: linking visual reafferences to the corresponding motor control centers. Here, we ask whether the advantage of processing reafferent over exafferent visual information is a specific feature of the motor processing stream or whether the improved processing also benefits the perceptual processing stream. Human participants performed a bimanual reaching task in a cluttered visual display, and one of the visual hand cursors could be displaced laterally during the movement. We measured the rapid feedback responses of the motor system as well as matched perceptual judgments of which cursor was displaced. Perceptual judgments were either made by watching the visual scene without moving or made simultaneously to the reaching tasks, such that the perceptual processing stream could also profit from the specialized processing of reafferent information in the latter case. Our results demonstrate that perceptual judgments in the heavily cluttered visual environment were improved when performed based on reafferent information. Even in this case, however, the filtering capability of the perceptual processing stream suffered more from the increasing complexity of the visual scene than the motor processing stream. These findings suggest partly shared and partly segregated processing of reafferent information for vision for motor control versus vision for perception.
Integrating mechanisms of visual guidance in naturalistic language production.
Coco, Moreno I; Keller, Frank
2015-05-01
Situated language production requires the integration of visual attention and linguistic processing. Previous work has not conclusively disentangled the role of perceptual scene information and structural sentence information in guiding visual attention. In this paper, we present an eye-tracking study that demonstrates that three types of guidance, perceptual, conceptual, and structural, interact to control visual attention. In a cued language production experiment, we manipulate perceptual (scene clutter) and conceptual guidance (cue animacy) and measure structural guidance (syntactic complexity of the utterance). Analysis of the time course of language production, before and during speech, reveals that all three forms of guidance affect the complexity of visual responses, quantified in terms of the entropy of attentional landscapes and the turbulence of scan patterns, especially during speech. We find that perceptual and conceptual guidance mediate the distribution of attention in the scene, whereas structural guidance closely relates to scan pattern complexity. Furthermore, the eye-voice spans of the cued object and its perceptual competitor are similar, with latency mediated by both perceptual and structural guidance. These results rule out a strict interpretation of structural guidance as the single dominant form of visual guidance in situated language production. Rather, the phase of the task and the associated demands of cross-modal cognitive processing determine the mechanisms that guide attention.
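The entropy of an attentional landscape, used above to quantify the complexity of visual responses, can be computed from a smoothed fixation density map; the grid size and smoothing width in this sketch are assumptions rather than the study's settings.

import numpy as np
from scipy.ndimage import gaussian_filter

def attentional_entropy(fixations, shape=(600, 800), sigma=25):
    # Shannon entropy of a fixation density map: higher entropy means attention
    # was spread more evenly over the scene.
    density = np.zeros(shape)
    for x, y in fixations:                  # fixations as (x, y) pixel coordinates
        density[int(y), int(x)] += 1
    density = gaussian_filter(density, sigma)
    p = density / density.sum()
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

fixations = [(100, 120), (410, 305), (420, 310), (600, 150)]  # toy scan pattern
print(attentional_entropy(fixations))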
ERIC Educational Resources Information Center
Kidd, Evan; Stewart, Andrew J.; Serratrice, Ludovica
2011-01-01
In this paper we report on a visual world eye-tracking experiment that investigated the differing abilities of adults and children to use referential scene information during reanalysis to overcome lexical biases during sentence processing. The results showed that adults incorporated aspects of the referential scene into their parse as soon as it…
Rendering visual events as sounds: Spatial attention capture by auditory augmented reality.
Stone, Scott A; Tata, Matthew S
2017-01-01
Many salient visual events tend to coincide with auditory events, such as seeing and hearing a car pass by. Information from the visual and auditory senses can be used to create a stable percept of the stimulus. Having access to related coincident visual and auditory information can help for spatial tasks such as localization. However, not all visual information has analogous auditory percepts, such as viewing a computer monitor. Here, we describe a system capable of detecting and augmenting visual salient events into localizable auditory events. The system uses a neuromorphic camera (DAVIS 240B) to detect logarithmic changes of brightness intensity in the scene, which can be interpreted as salient visual events. Participants were blindfolded and asked to use the device to detect new objects in the scene, as well as determine direction of motion for a moving visual object. Results suggest the system is robust enough to allow for the simple detection of new salient stimuli, as well as accurate encoding of the direction of visual motion. Future successes are probable as neuromorphic devices are likely to become faster and smaller, making this system much more feasible.
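The core detection step, thresholding logarithmic brightness changes and treating them as localizable events, can be approximated at frame rate with the sketch below; the threshold is an assumed value, and a real DAVIS 240B emits asynchronous events natively rather than via frame differencing.

import numpy as np

def salient_events(prev_frame, frame, threshold=0.15):
    # Pixels whose log-brightness changed by more than `threshold`, coarsely
    # mimicking the events of a neuromorphic camera.
    eps = 1e-6
    delta = np.log(frame + eps) - np.log(prev_frame + eps)
    ys, xs = np.nonzero(np.abs(delta) > threshold)
    if len(xs) == 0:
        return None
    # Centroid of the event cloud; could be mapped to azimuth for spatialized audio.
    return xs.mean(), ys.mean()

prev = np.random.rand(240, 180)
curr = prev.copy()
curr[100:120, 60:80] += 0.5             # a "new object" appearing in the scene
print(salient_events(prev, curr))        # approximate centroid of the change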
Rendering visual events as sounds: Spatial attention capture by auditory augmented reality
Tata, Matthew S.
2017-01-01
Many salient visual events tend to coincide with auditory events, such as seeing and hearing a car pass by. Information from the visual and auditory senses can be used to create a stable percept of the stimulus. Having access to related coincident visual and auditory information can help for spatial tasks such as localization. However, not all visual information has analogous auditory percepts, such as viewing a computer monitor. Here, we describe a system capable of detecting and augmenting visual salient events into localizable auditory events. The system uses a neuromorphic camera (DAVIS 240B) to detect logarithmic changes of brightness intensity in the scene, which can be interpreted as salient visual events. Participants were blindfolded and asked to use the device to detect new objects in the scene, as well as determine direction of motion for a moving visual object. Results suggest the system is robust enough to allow for the simple detection of new salient stimuli, as well as accurate encoding of the direction of visual motion. Future successes are probable as neuromorphic devices are likely to become faster and smaller, making this system much more feasible. PMID:28792518
Wilkinson, Krista M.; Light, Janice; Drager, Kathryn
2013-01-01
Aided augmentative and alternative communication (AAC) interventions have been demonstrated to facilitate a variety of communication outcomes in persons with intellectual disabilities. Most aided AAC systems rely on a visual modality. When the medium for communication is visual, it seems likely that the effectiveness of intervention depends in part on the effectiveness and efficiency with which the information presented in the display can be perceived, identified, and extracted by communicators and their partners. Understanding of visual-cognitive processing – that is, how a user attends, perceives, and makes sense of the visual information on the display – therefore seems critical to designing effective aided AAC interventions. In this Forum Note, we discuss characteristics of one particular type of aided AAC display, that is, Visual Scene Displays (VSDs) as they may relate to user visual and cognitive processing. We consider three specific ways in which bodies of knowledge drawn from the visual cognitive sciences may be relevant to the composition of VSDs, with the understanding that direct research with children with complex communication needs is necessary to verify or refute our speculations. PMID:22946989
Thiessen, Amber; Beukelman, David; Hux, Karen; Longenecker, Maria
2016-04-01
The purpose of the study was to compare the visual attention patterns of adults with aphasia and adults without neurological conditions when viewing visual scenes with 2 types of engagement. Eye-tracking technology was used to measure the visual attention patterns of 10 adults with aphasia and 10 adults without neurological conditions. Participants viewed camera-engaged (i.e., human figure facing camera) and task-engaged (i.e., human figure looking at and touching an object) visual scenes. Participants with aphasia responded to engagement cues by focusing on objects of interest more for task-engaged scenes than camera-engaged scenes; however, the difference in their responses to these scenes was not as pronounced as that observed in adults without neurological conditions. In addition, people with aphasia spent more time looking at background areas of interest and less time looking at person areas of interest for camera-engaged scenes than did control participants. Results indicate that people with aphasia visually attend to scenes differently than adults without neurological conditions. As a consequence, augmentative and alternative communication (AAC) facilitators may have different visual attention behaviors than the people with aphasia for whom they are constructing or selecting visual scenes. Further examination of the visual attention of people with aphasia may help optimize visual scene selection.
Video content parsing based on combined audio and visual information
NASA Astrophysics Data System (ADS)
Zhang, Tong; Kuo, C.-C. Jay
1999-08-01
While previous research on audiovisual data segmentation and indexing primarily focuses on the pictorial part, significant clues contained in the accompanying audio flow are often ignored. A fully functional system for video content parsing can be achieved more successfully through a proper combination of audio and visual information. By investigating the data structure of different video types, we present tools for both audio and visual content analysis and a scheme for video segmentation and annotation in this research. In the proposed system, video data are segmented into audio scenes and visual shots by detecting abrupt changes in audio and visual features, respectively. Then, the audio scene is categorized and indexed as one of the basic audio types, while a visual shot is represented by keyframes and associated image features. An index table is then generated automatically for each video clip based on the integration of outputs from audio and visual analysis. It is shown that the proposed system provides satisfactory video indexing results.
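Visual shot segmentation by detecting abrupt changes in visual features, as described above, is commonly implemented with frame-to-frame histogram differences; the following sketch shows one minimal version, with bin count and threshold as assumed parameters rather than those of the proposed system.

import numpy as np

def shot_boundaries(frames, bins=32, threshold=0.4):
    # Return indices where the grayscale-histogram difference between consecutive
    # frames exceeds `threshold`, marking candidate shot cuts.
    cuts = []
    prev_hist = None
    for i, frame in enumerate(frames):
        hist = np.histogram(frame, bins=bins, range=(0.0, 1.0))[0].astype(float)
        hist = hist / hist.sum()
        if prev_hist is not None and 0.5 * np.abs(hist - prev_hist).sum() > threshold:
            cuts.append(i)
        prev_hist = hist
    return cuts

frames = [np.full((120, 160), 0.2)] * 5 + [np.full((120, 160), 0.8)] * 5
print(shot_boundaries(frames))   # [5]: the abrupt change between the two "shots"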
Generating descriptive visual words and visual phrases for large-scale image applications.
Zhang, Shiliang; Tian, Qi; Hua, Gang; Huang, Qingming; Gao, Wen
2011-09-01
The Bag-of-visual Words (BoW) representation has been applied to various problems in the fields of multimedia and computer vision. The basic idea is to represent images as visual documents composed of repeatable and distinctive visual elements, which are comparable to text words. Notwithstanding its great success and wide adoption, a visual vocabulary created from single-image local descriptors is often shown to be not as effective as desired. In this paper, descriptive visual words (DVWs) and descriptive visual phrases (DVPs) are proposed as the visual correspondences to text words and phrases, where visual phrases refer to frequently co-occurring visual word pairs. Since images are the carriers of visual objects and scenes, a descriptive visual element set can be composed of the visual words and their combinations that are effective in representing certain visual objects or scenes. Based on this idea, a general framework is proposed for generating DVWs and DVPs for image applications. In a large-scale image database containing 1506 object and scene categories, the visual words and visual word pairs descriptive of certain objects or scenes are identified and collected as the DVWs and DVPs. Experiments show that the DVWs and DVPs are informative and descriptive and, thus, are more comparable with text words than the classic visual words. We apply the identified DVWs and DVPs in several applications including large-scale near-duplicated image retrieval, image search re-ranking, and object recognition. The combination of DVW and DVP performs better than the state of the art in large-scale near-duplicated image retrieval in terms of accuracy, efficiency and memory consumption. The proposed image search re-ranking algorithm, DWPRank, outperforms the state-of-the-art algorithm by 12.4% in mean average precision and is about 11 times faster.
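Under the standard bag-of-visual-words pipeline assumed above, candidate visual phrases can be approximated as frequently co-occurring visual-word pairs; the sketch below builds a small vocabulary with k-means over stand-in local descriptors and counts word pairs per image. The vocabulary size and random descriptors are illustrative, and the paper's actual DVW/DVP selection is more elaborate.

import itertools
from collections import Counter
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Stand-in local descriptors (e.g., SIFT-like vectors) for 20 images, 30 each.
images = [rng.normal(size=(30, 64)) for _ in range(20)]

kmeans = KMeans(n_clusters=50, n_init=5, random_state=0)
kmeans.fit(np.vstack(images))                    # build the visual vocabulary

pair_counts = Counter()
for desc in images:
    words = set(kmeans.predict(desc))            # visual words present in this image
    pair_counts.update(itertools.combinations(sorted(words), 2))

# The most frequently co-occurring word pairs are candidate "visual phrases".
print(pair_counts.most_common(5))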
Visual flight control in naturalistic and artificial environments.
Baird, Emily; Dacke, Marie
2012-12-01
Although the visual flight control strategies of flying insects have evolved to cope with the complexity of the natural world, studies investigating this behaviour have typically been performed indoors using simplified two-dimensional artificial visual stimuli. How well do the results from these studies reflect the natural behaviour of flying insects, considering the radical differences in contrast, spatial composition, colour and dimensionality between these visual environments? Here, we aim to answer this question by investigating the effect of three- and two-dimensional naturalistic and artificial scenes on bumblebee flight control in an outdoor setting and comparing the results with those of similar experiments performed in an indoor setting. In particular, we focus on investigating the effect of axial (front-to-back) visual motion cues on ground speed and centring behaviour. Our results suggest that, in general, ground speed control and centring behaviour in bumblebees are not affected by whether the visual scene is two- or three-dimensional, naturalistic or artificial, or whether the experiment is conducted indoors or outdoors. The only effect that we observe between naturalistic and artificial scenes on flight control is that when the visual scene is three-dimensional and the visual information on the floor is minimised, bumblebees fly further from the midline of the tunnel. The findings presented here have implications not only for understanding the mechanisms of visual flight control in bumblebees, but also for the results of past and future investigations into visually guided flight control in other insects.
Colour agnosia impairs the recognition of natural but not of non-natural scenes.
Nijboer, Tanja C W; Van Der Smagt, Maarten J; Van Zandvoort, Martine J E; De Haan, Edward H F
2007-03-01
Scene recognition can be enhanced by appropriate colour information, yet the level of visual processing at which colour exerts its effects is still unclear. It has been suggested that colour supports low-level sensory processing, while others have claimed that colour information aids semantic categorization and recognition of objects and scenes. We investigated the effect of colour on scene recognition in a case of colour agnosia, M.A.H. In a scene identification task, participants had to name images of natural or non-natural scenes in six different formats. Irrespective of scene format, M.A.H. was much slower on the natural than on the non-natural scenes. As expected, neither M.A.H. nor control participants showed any difference in performance for the non-natural scenes. However, for the natural scenes, appropriate colour facilitated scene recognition in control participants (i.e., shorter reaction times), whereas M.A.H.'s performance did not differ across formats. Our data thus support the hypothesis that the effect of colour occurs at the level of learned associations.
Modulation of Temporal Precision in Thalamic Population Responses to Natural Visual Stimuli
Desbordes, Gaëlle; Jin, Jianzhong; Alonso, Jose-Manuel; Stanley, Garrett B.
2010-01-01
Natural visual stimuli have highly structured spatial and temporal properties which influence the way visual information is encoded in the visual pathway. In response to natural scene stimuli, neurons in the lateral geniculate nucleus (LGN) are temporally precise – on a time scale of 10–25 ms – both within single cells and across cells within a population. This time scale, established by non stimulus-driven elements of neuronal firing, is significantly shorter than that of natural scenes, yet is critical for the neural representation of the spatial and temporal structure of the scene. Here, a generalized linear model (GLM) that combines stimulus-driven elements with spike-history dependence associated with intrinsic cellular dynamics is shown to predict the fine timing precision of LGN responses to natural scene stimuli, the corresponding correlation structure across nearby neurons in the population, and the continuous modulation of spike timing precision and latency across neurons. A single model captured the experimentally observed neural response, across different levels of contrasts and different classes of visual stimuli, through interactions between the stimulus correlation structure and the nonlinearity in spike generation and spike history dependence. Given the sensitivity of the thalamocortical synapse to closely timed spikes and the importance of fine timing precision for the faithful representation of natural scenes, the modulation of thalamic population timing over these time scales is likely important for cortical representations of the dynamic natural visual environment. PMID:21151356
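The model class described above, a generalized linear model combining a stimulus filter with a spike-history filter and an exponential nonlinearity, can be sketched as a generative simulation; the filter shapes, bin size and stand-in stimulus below are assumptions, not the fitted parameters of the study.

import numpy as np

rng = np.random.default_rng(1)
T, dt = 2000, 0.001                        # 2 s of 1-ms bins
stimulus = rng.normal(size=T)              # stand-in luminance signal at the RF
k = np.exp(-np.arange(30) / 10.0)          # stimulus filter (30 ms, most recent first)
h = -2.0 * np.exp(-np.arange(20) / 5.0)    # suppressive spike-history filter (20 ms)

spikes = np.zeros(T)
for t in range(T):
    drive = stimulus[max(0, t - 30):t][::-1] @ k[:min(t, 30)]
    history = spikes[max(0, t - 20):t][::-1] @ h[:min(t, 20)]
    rate = np.exp(-1.0 + drive + history)          # conditional intensity (spikes/s)
    spikes[t] = rng.poisson(rate * dt) > 0          # draw a spike for this bin

print(spikes.sum(), "spikes in 2 s")   # spike-history term sharpens timing precision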
Effect of fixation positions on perception of lightness
NASA Astrophysics Data System (ADS)
Toscani, Matteo; Valsecchi, Matteo; Gegenfurtner, Karl R.
2015-03-01
Visual acuity, luminance sensitivity, contrast sensitivity, and color sensitivity are maximal in the fovea and decrease with retinal eccentricity. Therefore every scene is perceived by integrating the small, high resolution samples collected by moving the eyes around. Moreover, when viewing ambiguous figures the fixated position influences the dominance of the possible percepts. Therefore fixations could serve as a selection mechanism whose function is not confined to finely resolve the selected detail of the scene. Here this hypothesis is tested in the lightness perception domain. In a first series of experiments we demonstrated that when observers matched the color of natural objects they based their lightness judgments on objects' brightest parts. During this task the observers tended to fixate points with above average luminance, suggesting a relationship between perception and fixations that we causally proved using a gaze contingent display in a subsequent experiment. Simulations with rendered physical lighting show that higher values in an object's luminance distribution are particularly informative about reflectance. In a second series of experiments we considered a high level strategy that the visual system uses to segment the visual scene in a layered representation. We demonstrated that eye movement sampling mediates between the layer segregation and its effects on lightness perception. Together these studies show that eye fixations are partially responsible for the selection of information from a scene that allows the visual system to estimate the reflectance of a surface.
Beyond scene gist: Objects guide search more than scene background.
Koehler, Kathryn; Eckstein, Miguel P
2017-06-01
Although the facilitation of visual search by contextual information is well established, there is little understanding of the independent contributions of different types of contextual cues in scenes. Here we manipulated 3 types of contextual information: object co-occurrence, multiple object configurations, and background category. We isolated the benefits of each contextual cue to target detectability, its impact on decision bias, confidence, and the guidance of eye movements. We find that object-based information guides eye movements and facilitates perceptual judgments more than scene background. The degree of guidance and facilitation of each contextual cue can be related to its inherent informativeness about the target spatial location as measured by human explicit judgments about likely target locations. Our results improve the understanding of the contributions of distinct contextual scene components to search and suggest that the brain's utilization of cues to guide eye movements is linked to the cue's informativeness about the target's location. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Tachistoscopic exposure and masking of real three-dimensional scenes
Pothier, Stephen; Philbeck, John; Chichka, David; Gajewski, Daniel A.
2010-01-01
Although there are many well-known forms of visual cues specifying absolute and relative distance, little is known about how visual space perception develops at small temporal scales. How much time does the visual system require to extract the information in the various absolute and relative distance cues? In this article, we describe a system that may be used to address this issue by presenting brief exposures of real, three-dimensional scenes, followed by a masking stimulus. The system is composed of an electronic shutter (a liquid crystal smart window) for exposing the stimulus scene, and a liquid crystal projector coupled with an electromechanical shutter for presenting the masking stimulus. This system can be used in both full- and reduced-cue viewing conditions, under monocular and binocular viewing, and at distances limited only by the testing space. We describe a configuration that may be used for studying the microgenesis of visual space perception in the context of visually directed walking. PMID:19182129
How Visual and Semantic Information Influence Learning in Familiar Contexts
ERIC Educational Resources Information Center
Goujon, Annabelle; Brockmole, James R.; Ehinger, Krista A.
2012-01-01
Previous research using the contextual cuing paradigm has revealed both quantitative and qualitative differences in learning depending on whether repeated contexts are defined by letter arrays or real-world scenes. To clarify the relative contributions of visual features and semantic information likely to account for such differences, the typical…
32 CFR 813.5 - Shipping or transmitting visual information documentation images.
Code of Federal Regulations, 2010 CFR
2010-07-01
... documentation images. 813.5 Section 813.5 National Defense Department of Defense (Continued) DEPARTMENT OF THE... visual information documentation images. (a) COMCAM images. Send COMCAM images to the DoD Joint Combat... the approval procedures that on-scene and theater commanders set. (b) Other non-COMCAM images. After...
Auditory Memory Distortion for Spoken Prose
ERIC Educational Resources Information Center
Hutchison, Joanna L.; Hubbard, Timothy L.; Ferrandino, Blaise; Brigante, Ryan; Wright, Jamie M.; Rypma, Bart
2012-01-01
Observers often remember a scene as containing information that was not presented but that would have likely been located just beyond the observed boundaries of the scene. This effect is called "boundary extension" (BE; e.g., Intraub & Richardson, 1989). Previous studies have observed BE in memory for visual and haptic stimuli, and…
A color fusion method of infrared and low-light-level images based on visual perception
NASA Astrophysics Data System (ADS)
Han, Jing; Yan, Minmin; Zhang, Yi; Bai, Lianfa
2014-11-01
Color fusion images can be obtained through the fusion of infrared and low-light-level images and contain the information of both. The fusion images can help observers to understand the multichannel images comprehensively. However, simple fusion may lose target information because targets are inconspicuous in long-distance infrared and low-light-level images; and if target extraction is adopted blindly, the perception of scene information will be seriously affected. To solve this problem, a new fusion method based on visual perception is proposed in this paper. The extraction of visual targets ("what" information) and a parallel processing mechanism are incorporated into traditional color fusion methods. The infrared and low-light-level color fusion images are produced based on efficient learning of typical targets. Experimental results show the effectiveness of the proposed method. The fusion images achieved by our algorithm not only improve the detection rate of targets, but also retain rich natural information about the scenes.
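For orientation, a naive baseline for the simple fusion step (before any target extraction or perception-based processing) maps co-registered infrared and low-light frames into a false-color image; the channel mapping below is an assumed convention, not the method proposed in the paper.

import numpy as np

def simple_color_fusion(lowlight, infrared):
    # Co-registered low-light and infrared frames, both scaled to [0, 1], mapped
    # into an RGB false-color image. This is the kind of "simple fusion" the
    # perception-based method is meant to improve on.
    return np.stack([
        infrared,                                   # R: thermal/IR intensity
        lowlight,                                   # G: visible low-light intensity
        np.clip(lowlight - infrared, 0.0, 1.0),     # B: visible-only structure
    ], axis=-1)

ll = np.random.rand(240, 320)
ir = np.random.rand(240, 320)
print(simple_color_fusion(ll, ir).shape)   # (240, 320, 3)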
Experiencing simultanagnosia through windowed viewing of complex social scenes.
Dalrymple, Kirsten A; Birmingham, Elina; Bischof, Walter F; Barton, Jason J S; Kingstone, Alan
2011-01-07
Simultanagnosia is a disorder of visual attention, defined as an inability to see more than one object at once. It has been conceived as being due to a constriction of the visual "window" of attention, a metaphor that we examine in the present article. A simultanagnosic patient (SL) and two non-simultanagnosic control patients (KC and ES) described social scenes while their eye movements were monitored. These data were compared to a group of healthy subjects who described the same scenes under the same conditions as the patients, or through an aperture that restricted their vision to a small portion of the scene. Experiment 1 demonstrated that SL showed unusually low proportions of fixations to the eyes in social scenes, which contrasted with all other participants who demonstrated the standard preferential bias toward eyes. Experiments 2 and 3 revealed that when healthy participants viewed scenes through a window that was contingent on where they looked (Experiment 2) or where they moved a computer mouse (Experiment 3), their behavior closely mirrored that of patient SL. These findings suggest that a constricted window of visual processing has important consequences for how simultanagnosic patients explore their world. Our paradigm's capacity to mimic simultanagnosic behaviors while viewing complex scenes implies that it may be a valid way of modeling simultanagnosia in healthy individuals, providing a useful tool for future research. More broadly, our results support the thesis that people fixate the eyes in social scenes because they are informative to the meaning of the scene. Copyright © 2010 Elsevier B.V. All rights reserved.
The forensic holodeck: an immersive display for forensic crime scene reconstructions.
Ebert, Lars C; Nguyen, Tuan T; Breitbeck, Robert; Braun, Marcel; Thali, Michael J; Ross, Steffen
2014-12-01
In forensic investigations, crime scene reconstructions are created based on a variety of three-dimensional image modalities. Although the data gathered are three-dimensional, their presentation on computer screens and paper is two-dimensional, which incurs a loss of information. By applying immersive virtual reality (VR) techniques, we propose a system that allows a crime scene to be viewed as if the investigator were present at the scene. We used a low-cost VR headset originally developed for computer gaming in our system. The headset offers a large viewing volume and tracks the user's head orientation in real-time, and an optical tracker is used for positional information. In addition, we created a crime scene reconstruction to demonstrate the system. In this article, we present a low-cost system that allows immersive, three-dimensional and interactive visualization of forensic incident scene reconstructions.
Barrès, Victor; Lee, Jinyong
2014-01-01
How does the language system coordinate with our visual system to yield flexible integration of linguistic, perceptual, and world-knowledge information when we communicate about the world we perceive? Schema theory is a computational framework that allows the simulation of perceptuo-motor coordination programs on the basis of known brain operating principles such as cooperative computation and distributed processing. We present first its application to a model of language production, SemRep/TCG, which combines a semantic representation of visual scenes (SemRep) with Template Construction Grammar (TCG) as a means to generate verbal descriptions of a scene from its associated SemRep graph. SemRep/TCG combines the neurocomputational framework of schema theory with the representational format of construction grammar in a model linking eye-tracking data to visual scene descriptions. We then offer a conceptual extension of TCG to include language comprehension and address data on the role of both world knowledge and grammatical semantics in the comprehension performances of agrammatic aphasic patients. This extension introduces a distinction between heavy and light semantics. The TCG model of language comprehension offers a computational framework to quantitatively analyze the distributed dynamics of language processes, focusing on the interactions between grammatical, world knowledge, and visual information. In particular, it reveals interesting implications for the understanding of the various patterns of comprehension performances of agrammatic aphasics measured using sentence-picture matching tasks. This new step in the life cycle of the model serves as a basis for exploring the specific challenges that neurolinguistic computational modeling poses to the neuroinformatics community.
Creating a meaningful visual perception in blind volunteers by optic nerve stimulation
NASA Astrophysics Data System (ADS)
Brelén, M. E.; Duret, F.; Gérard, B.; Delbeke, J.; Veraart, C.
2005-03-01
A blind volunteer, suffering from retinitis pigmentosa, has been chronically implanted with an optic nerve visual prosthesis. Vision rehabilitation with this volunteer has concentrated on the development of a stimulation strategy according to which video camera images are converted into stimulation pulses. The aim is to convey as much information as possible about the visual scene within the limits of the device's capabilities. Pattern recognition tasks were used to assess the effectiveness of the stimulation strategy. The results demonstrate how even a relatively basic algorithm can efficiently convey useful information regarding the visual scene. By increasing the number of phosphenes used in the algorithm, better performance is observed but a longer training period is required. After a learning period, the volunteer achieved a pattern recognition score of 85% at 54 s on average per pattern. After nine evaluation sessions, when using a stimulation strategy exploiting all available phosphenes, no saturation effect has yet been observed.
Effects of chromatic image statistics on illumination induced color differences.
Lucassen, Marcel P; Gevers, Theo; Gijsenij, Arjan; Dekker, Niels
2013-09-01
We measure the color fidelity of visual scenes that are rendered under different (simulated) illuminants and shown on a calibrated LCD display. Observers make triad illuminant comparisons involving the renderings from two chromatic test illuminants and one achromatic reference illuminant shown simultaneously. Four chromatic test illuminants are used: two along the daylight locus (yellow and blue), and two perpendicular to it (red and green). The observers select the rendering having the best color fidelity, thereby indirectly judging which of the two test illuminants induces the smallest color differences compared to the reference. Both multicolor test scenes and natural scenes are studied. The multicolor scenes are synthesized and represent ellipsoidal distributions in CIELAB chromaticity space having the same mean chromaticity but different chromatic orientations. We show that, for those distributions, color fidelity is best when the vector of the illuminant change (pointing from neutral to chromatic) is parallel to the major axis of the scene's chromatic distribution. For our selection of natural scenes, which generally have much broader chromatic distributions, we measure a higher color fidelity for the yellow and blue illuminants than for red and green. Scrambled versions of the natural images are also studied to exclude possible semantic effects. We quantitatively predict the average observer response (i.e., the illuminant probability) with four types of models, differing in the extent to which they incorporate information processing by the visual system. Results show different levels of performance for the models, and different levels for the multicolor scenes and the natural scenes. Overall, models based on the scene averaged color difference have the best performance. We discuss how color constancy algorithms may be improved by exploiting knowledge of the chromatic distribution of the visual scene.
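The best-performing model class mentioned above, prediction from the scene-averaged color difference, reduces in its simplest form to averaging CIELAB color differences between corresponding pixels of two renderings; the sketch below assumes images already converted to CIELAB and uses the basic Delta E*ab formula rather than a more advanced metric, with toy data standing in for the rendered scenes.

import numpy as np

def mean_delta_e(lab_reference, lab_test):
    # Scene-averaged CIELAB color difference (Delta E*ab) between two renderings
    # of the same scene, given as (..., 3) arrays of L*, a*, b* values.
    return np.sqrt(((lab_reference - lab_test) ** 2).sum(axis=-1)).mean()

# The illuminant predicted to give better color fidelity is the one producing the
# smaller scene-averaged Delta E relative to the achromatic reference rendering.
lab_ref = np.random.rand(100, 100, 3) * [100, 80, 80]   # toy CIELAB image
lab_yellow = lab_ref + [0, 0, 6]     # toy shift along b* (yellow-blue axis)
lab_red = lab_ref + [0, 9, 0]        # toy shift along a* (red-green axis)
print(mean_delta_e(lab_ref, lab_yellow) < mean_delta_e(lab_ref, lab_red))  # True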
Teng, Santani
2017-01-01
In natural environments, visual and auditory stimulation elicit responses across a large set of brain regions in a fraction of a second, yielding representations of the multimodal scene and its properties. The rapid and complex neural dynamics underlying visual and auditory information processing pose major challenges to human cognitive neuroscience. Brain signals measured non-invasively are inherently noisy, the format of neural representations is unknown, and transformations between representations are complex and often nonlinear. Further, no single non-invasive brain measurement technique provides a spatio-temporally integrated view. In this opinion piece, we argue that progress can be made by a concerted effort based on three pillars of recent methodological development: (i) sensitive analysis techniques such as decoding and cross-classification, (ii) complex computational modelling using models such as deep neural networks, and (iii) integration across imaging methods (magnetoencephalography/electroencephalography, functional magnetic resonance imaging) and models, e.g. using representational similarity analysis. We showcase two recent efforts that have been undertaken in this spirit and provide novel results about visual and auditory scene analysis. Finally, we discuss the limits of this perspective and sketch a concrete roadmap for future research. This article is part of the themed issue ‘Auditory and visual scene analysis’. PMID:28044019
Cichy, Radoslaw Martin; Teng, Santani
2017-02-19
In natural environments, visual and auditory stimulation elicit responses across a large set of brain regions in a fraction of a second, yielding representations of the multimodal scene and its properties. The rapid and complex neural dynamics underlying visual and auditory information processing pose major challenges to human cognitive neuroscience. Brain signals measured non-invasively are inherently noisy, the format of neural representations is unknown, and transformations between representations are complex and often nonlinear. Further, no single non-invasive brain measurement technique provides a spatio-temporally integrated view. In this opinion piece, we argue that progress can be made by a concerted effort based on three pillars of recent methodological development: (i) sensitive analysis techniques such as decoding and cross-classification, (ii) complex computational modelling using models such as deep neural networks, and (iii) integration across imaging methods (magnetoencephalography/electroencephalography, functional magnetic resonance imaging) and models, e.g. using representational similarity analysis. We showcase two recent efforts that have been undertaken in this spirit and provide novel results about visual and auditory scene analysis. Finally, we discuss the limits of this perspective and sketch a concrete roadmap for future research. This article is part of the themed issue 'Auditory and visual scene analysis'. © 2017 The Authors.
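Representational similarity analysis, cited above as one way to integrate across imaging methods and models, can be sketched in a few lines: build a representational dissimilarity matrix (RDM) from each measurement or model and correlate their condensed forms. The toy data and the choice of Spearman correlation below are assumptions.

import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
n_conditions = 20
meg_patterns = rng.normal(size=(n_conditions, 300))    # e.g., MEG sensor patterns
fmri_patterns = rng.normal(size=(n_conditions, 500))   # e.g., fMRI voxel patterns

# Representational dissimilarity matrices, stored as condensed vectors.
rdm_meg = pdist(meg_patterns, metric="correlation")
rdm_fmri = pdist(fmri_patterns, metric="correlation")

# Second-order comparison: how similar are the two representational geometries?
rho, p = spearmanr(rdm_meg, rdm_fmri)
print(f"RSA correlation: rho={rho:.2f}, p={p:.2f}")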
Hu, Jian; Xu, Xiang-yang; Song, En-min; Tan, Hong-bao; Wang, Yi-ning
2009-09-01
The aim was to establish a new visual educational system of virtual reality for clinical dentistry based on World Wide Web (WWW) webpages, in order to provide more three-dimensional multimedia resources to dental students and an online three-dimensional consulting system for patients. Based on computer graphics and three-dimensional webpage technologies, the software packages 3Dsmax and Webmax were adopted for system development. In the Windows environment, the architecture of the whole system was established step by step, including three-dimensional model construction, three-dimensional scene setup, transplanting the three-dimensional scene into the webpage, re-editing the virtual scene, realization of interactions within the webpage, initial testing, and necessary adjustment. Five three-dimensional interactive webpages for clinical dentistry were completed. The webpages could be accessed through a web browser on a personal computer, and users could interact with them by rotating, panning and zooming the virtual scene. It is technically feasible to implement a visual educational system of virtual reality for clinical dentistry based on WWW webpages. Information related to clinical dentistry can be transmitted properly, visually and interactively through three-dimensional webpages.
Markman, Adam; Shen, Xin; Hua, Hong; Javidi, Bahram
2016-01-15
An augmented reality (AR) smartglass display combines real-world scenes with digital information enabling the rapid growth of AR-based applications. We present an augmented reality-based approach for three-dimensional (3D) optical visualization and object recognition using axially distributed sensing (ADS). For object recognition, the 3D scene is reconstructed, and feature extraction is performed by calculating the histogram of oriented gradients (HOG) of a sliding window. A support vector machine (SVM) is then used for classification. Once an object has been identified, the 3D reconstructed scene with the detected object is optically displayed in the smartglasses allowing the user to see the object, remove partial occlusions of the object, and provide critical information about the object such as 3D coordinates, which are not possible with conventional AR devices. To the best of our knowledge, this is the first report on combining axially distributed sensing with 3D object visualization and recognition for applications to augmented reality. The proposed approach can have benefits for many applications, including medical, military, transportation, and manufacturing.
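The recognition pipeline described above, HOG features computed over a sliding window followed by an SVM classifier, can be sketched with standard libraries; the window size, step, and toy training patches below are placeholders, and the actual system operates on 3D reconstructions obtained from axially distributed sensing rather than on raw 2D images.

import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

rng = np.random.default_rng(3)

def train_detector(positives, negatives):
    # HOG descriptors for each training patch, then a linear SVM classifier.
    X = [hog(img, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
         for img in positives + negatives]
    y = [1] * len(positives) + [0] * len(negatives)
    return LinearSVC(C=1.0).fit(X, y)

def sliding_window_detect(image, clf, window=64, step=16):
    # Slide a window over the image and keep locations the SVM scores as positive.
    hits = []
    for r in range(0, image.shape[0] - window + 1, step):
        for c in range(0, image.shape[1] - window + 1, step):
            feat = hog(image[r:r + window, c:c + window],
                       pixels_per_cell=(8, 8), cells_per_block=(2, 2))
            if clf.decision_function([feat])[0] > 0:
                hits.append((r, c))
    return hits

pos = [rng.random((64, 64)) + 0.5 for _ in range(10)]   # toy "object" patches
neg = [rng.random((64, 64)) for _ in range(10)]         # toy background patches
clf = train_detector(pos, neg)
print(len(sliding_window_detect(rng.random((128, 128)), clf)))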
Gist in time: scene semantics and structure enhance recall of searched objects
Wolfe, Jeremy M.; Võ, Melissa L.-H.
2016-01-01
Previous work has shown that recall of objects that are incidentally encountered as targets in visual search is better than recall of objects that have been intentionally memorized (Draschkow, Wolfe & Võ, 2014). However, this counter-intuitive result is not seen when these tasks are performed with non-scene stimuli. The goal of the current paper is to determine what features of search in a scene contribute to higher recall rates when compared to a memorization task. In each of four experiments, we compare the free recall rate for target objects following a search to the rate following a memorization task. Across the experiments, the stimuli include progressively more scene-related information. Experiment 1 provides the spatial relations between objects. Experiment 2 adds relative size and depth of objects. Experiments 3 and 4 include scene layout and semantic information. We find that search leads to better recall than explicit memorization in cases where scene layout and semantic information are present, as long as the participant has ample time (2500ms) to integrate this information with knowledge about the target object (Exp. 4). These results suggest that the integration of scene and target information not only leads to more efficient search, but can also contribute to stronger memory representations than intentional memorization. PMID:27270227
Gist in time: Scene semantics and structure enhance recall of searched objects.
Josephs, Emilie L; Draschkow, Dejan; Wolfe, Jeremy M; Võ, Melissa L-H
2016-09-01
Previous work has shown that recall of objects that are incidentally encountered as targets in visual search is better than recall of objects that have been intentionally memorized (Draschkow, Wolfe, & Võ, 2014). However, this counter-intuitive result is not seen when these tasks are performed with non-scene stimuli. The goal of the current paper is to determine what features of search in a scene contribute to higher recall rates when compared to a memorization task. In each of four experiments, we compare the free recall rate for target objects following a search to the rate following a memorization task. Across the experiments, the stimuli include progressively more scene-related information. Experiment 1 provides the spatial relations between objects. Experiment 2 adds relative size and depth of objects. Experiments 3 and 4 include scene layout and semantic information. We find that search leads to better recall than explicit memorization in cases where scene layout and semantic information are present, as long as the participant has ample time (2500ms) to integrate this information with knowledge about the target object (Exp. 4). These results suggest that the integration of scene and target information not only leads to more efficient search, but can also contribute to stronger memory representations than intentional memorization. Copyright © 2016 Elsevier B.V. All rights reserved.
Combined influence of visual scene and body tilt on arm pointing movements: gravity matters!
Scotto Di Cesare, Cécile; Sarlegna, Fabrice R; Bourdin, Christophe; Mestre, Daniel R; Bringoux, Lionel
2014-01-01
Performing accurate actions such as goal-directed arm movements requires taking into account visual and body orientation cues to localize the target in space and produce appropriate reaching motor commands. We experimentally tilted the body and/or the visual scene to investigate how visual and body orientation cues are combined for the control of unseen arm movements. Subjects were asked to point toward a visual target using an upward movement during slow body and/or visual scene tilts. When the scene was tilted, final pointing errors varied as a function of the direction of the scene tilt (forward or backward). Actual forward body tilt resulted in systematic target undershoots, suggesting that the brain may have overcompensated for the biomechanical movement facilitation arising from body tilt. Combined body and visual scene tilts also affected final pointing errors according to the orientation of the visual scene. The data were further analysed using either a body-centered or a gravity-centered reference frame to encode visual scene orientation with simple additive models (i.e., 'combined' tilts equal to the sum of 'single' tilts). We found that the body-centered model could account only for some of the data regarding kinematic parameters and final errors. In contrast, the gravity-centered modeling in which the body and visual scene orientations were referred to vertical could explain all of these data. Therefore, our findings suggest that the brain uses gravity, thanks to its invariant properties, as a reference for the combination of visual and non-visual cues.
Combined Influence of Visual Scene and Body Tilt on Arm Pointing Movements: Gravity Matters!
Scotto Di Cesare, Cécile; Sarlegna, Fabrice R.; Bourdin, Christophe; Mestre, Daniel R.; Bringoux, Lionel
2014-01-01
Performing accurate actions such as goal-directed arm movements requires taking into account visual and body orientation cues to localize the target in space and produce appropriate reaching motor commands. We experimentally tilted the body and/or the visual scene to investigate how visual and body orientation cues are combined for the control of unseen arm movements. Subjects were asked to point toward a visual target using an upward movement during slow body and/or visual scene tilts. When the scene was tilted, final pointing errors varied as a function of the direction of the scene tilt (forward or backward). Actual forward body tilt resulted in systematic target undershoots, suggesting that the brain may have overcompensated for the biomechanical movement facilitation arising from body tilt. Combined body and visual scene tilts also affected final pointing errors according to the orientation of the visual scene. The data were further analysed using either a body-centered or a gravity-centered reference frame to encode visual scene orientation with simple additive models (i.e., ‘combined’ tilts equal to the sum of ‘single’ tilts). We found that the body-centered model could account only for some of the data regarding kinematic parameters and final errors. In contrast, the gravity-centered modeling in which the body and visual scene orientations were referred to vertical could explain all of these data. Therefore, our findings suggest that the brain uses gravity, thanks to its invariant properties, as a reference for the combination of visual and non-visual cues. PMID:24925371
Visually-guided attention enhances target identification in a complex auditory scene.
Best, Virginia; Ozmeral, Erol J; Shinn-Cunningham, Barbara G
2007-06-01
In auditory scenes containing many similar sound sources, sorting of acoustic information into streams becomes difficult, which can lead to disruptions in the identification of behaviorally relevant targets. This study investigated the benefit of providing simple visual cues for when and/or where a target would occur in a complex acoustic mixture. Importantly, the visual cues provided no information about the target content. In separate experiments, human subjects either identified learned birdsongs in the presence of a chorus of unlearned songs or recalled strings of spoken digits in the presence of speech maskers. A visual cue indicating which loudspeaker (from an array of five) would contain the target improved accuracy for both kinds of stimuli. A cue indicating which time segment (out of a possible five) would contain the target also improved accuracy, but much more for birdsong than for speech. These results suggest that in real world situations, information about where a target of interest is located can enhance its identification, while information about when to listen can also be helpful when targets are unfamiliar or extremely similar to their competitors.
Visually-guided Attention Enhances Target Identification in a Complex Auditory Scene
Ozmeral, Erol J.; Shinn-Cunningham, Barbara G.
2007-01-01
In auditory scenes containing many similar sound sources, sorting of acoustic information into streams becomes difficult, which can lead to disruptions in the identification of behaviorally relevant targets. This study investigated the benefit of providing simple visual cues for when and/or where a target would occur in a complex acoustic mixture. Importantly, the visual cues provided no information about the target content. In separate experiments, human subjects either identified learned birdsongs in the presence of a chorus of unlearned songs or recalled strings of spoken digits in the presence of speech maskers. A visual cue indicating which loudspeaker (from an array of five) would contain the target improved accuracy for both kinds of stimuli. A cue indicating which time segment (out of a possible five) would contain the target also improved accuracy, but much more for birdsong than for speech. These results suggest that in real world situations, information about where a target of interest is located can enhance its identification, while information about when to listen can also be helpful when targets are unfamiliar or extremely similar to their competitors. PMID:17453308
Local statistics of retinal optic flow for self-motion through natural sceneries.
Calow, Dirk; Lappe, Markus
2007-12-01
Image analysis in the visual system is well adapted to the statistics of natural scenes. Investigations of natural image statistics have so far mainly focused on static features. The present study is dedicated to the measurement and the analysis of the statistics of optic flow generated on the retina during locomotion through natural environments. Natural locomotion includes bouncing and swaying of the head and eye movement reflexes that stabilize gaze onto interesting objects in the scene while walking. We investigate the dependencies of the local statistics of optic flow on the depth structure of the natural environment and on the ego-motion parameters. To measure these dependencies we estimate the mutual information between correlated data sets. We analyze the results with respect to the variation of the dependencies over the visual field, since the visual motions in the optic flow vary depending on visual field position. We find that retinal flow direction and retinal speed show only minor statistical interdependencies. Retinal speed is statistically tightly connected to the depth structure of the scene. Retinal flow direction is statistically mostly driven by the relation between the direction of gaze and the direction of ego-motion. These dependencies differ at different visual field positions such that certain areas of the visual field provide more information about ego-motion and other areas provide more information about depth. The statistical properties of natural optic flow may be used to tune the performance of artificial vision systems based on human imitating behavior, and may be useful for analyzing properties of natural vision systems.
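Estimating the mutual information between correlated data sets, as done above for retinal speed versus scene depth, can be illustrated with a simple histogram (plug-in) estimator; the bin count and toy data below are assumptions, and real analyses typically require bias correction.

import numpy as np

def mutual_information(x, y, bins=16):
    # Histogram (plug-in) estimate of I(X;Y) in bits between two samples.
    joint = np.histogram2d(x, y, bins=bins)[0]
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nonzero = pxy > 0
    return (pxy[nonzero] * np.log2(pxy[nonzero] / (px @ py)[nonzero])).sum()

rng = np.random.default_rng(4)
inverse_depth = rng.random(5000)                               # toy depth structure
retinal_speed = 3.0 * inverse_depth + 0.3 * rng.random(5000)   # speed tied to depth
print(mutual_information(retinal_speed, inverse_depth))        # clearly above 0 bits
print(mutual_information(rng.random(5000), inverse_depth))     # near 0 bits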
Miskovic, Vladimir; Martinovic, Jasna; Wieser, Matthias M.; Petro, Nathan M.; Bradley, Margaret M.; Keil, Andreas
2015-01-01
Emotionally arousing scenes readily capture visual attention, prompting amplified neural activity in sensory regions of the brain. The physical stimulus features and related information channels in the human visual system that contribute to this modulation, however, are not known. Here, we manipulated low-level physical parameters of complex scenes varying in hedonic valence and emotional arousal in order to target the relative contributions of luminance based versus chromatic visual channels to emotional perception. Stimulus-evoked brain electrical activity was measured during picture viewing and used to quantify neural responses sensitive to lower-tier visual cortical involvement (steady-state visual evoked potentials) as well as the late positive potential, reflecting a more distributed cortical event. Results showed that the enhancement for emotional content was stimulus-selective when examining the steady-state segments of the evoked visual potentials. Response amplification was present only for low spatial frequency, grayscale stimuli, and not for high spatial frequency, red/green stimuli. In contrast, the late positive potential was modulated by emotion regardless of the scene’s physical properties. Our findings are discussed in relation to neurophysiologically plausible constraints operating at distinct stages of the cortical processing stream. PMID:25640949
Accuracy and Tuning of Flow Parsing for Visual Perception of Object Motion During Self-Motion
Niehorster, Diederick C.
2017-01-01
How do we perceive object motion during self-motion using visual information alone? Previous studies have reported that the visual system can use optic flow to identify and globally subtract the retinal motion component resulting from self-motion to recover scene-relative object motion, a process called flow parsing. In this article, we developed a retinal motion nulling method to directly measure and quantify the magnitude of flow parsing (i.e., flow parsing gain) in various scenarios to examine the accuracy and tuning of flow parsing for the visual perception of object motion during self-motion. We found that flow parsing gains were below unity for all displays in all experiments; and that increasing self-motion and object motion speed did not alter flow parsing gain. We conclude that visual information alone is not sufficient for the accurate perception of scene-relative motion during self-motion. Although flow parsing performs global subtraction, its accuracy also depends on local motion information in the retinal vicinity of the moving object. Furthermore, the flow parsing gain was constant across common self-motion or object motion speeds. These results can be used to inform and validate computational models of flow parsing. PMID:28567272
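Under a simple reading of the retinal motion nulling method described above, the flow parsing gain is the fraction of the self-motion-induced flow component that the visual system subtracts from the object's retinal motion; the toy computation below illustrates this definition only and is an interpretive sketch, not the article's analysis.

def flow_parsing_gain(nulling_retinal_speed, self_motion_component):
    # If an object must physically move at `nulling_retinal_speed` (in the direction
    # of the self-motion-induced flow) to appear stationary in the scene, the gain is
    # the proportion of that flow component that was parsed out. A gain of 1.0 would
    # mean complete subtraction; values below 1.0 mean incomplete flow parsing.
    return nulling_retinal_speed / self_motion_component

# Toy example: the local flow due to self-motion is 4 deg/s, but the object only
# needs to move at 2.8 deg/s to be perceived as scene-stationary.
print(flow_parsing_gain(2.8, 4.0))   # 0.7, i.e., 70% of the flow was subtracted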
Chen, Juan; Sperandio, Irene; Goodale, Melvyn Alan
2015-01-01
Objects rarely appear in isolation in natural scenes. Although many studies have investigated how nearby objects influence perception in cluttered scenes (i.e., crowding), none has studied how nearby objects influence visually guided action. In Experiment 1, we found that participants could scale their grasp to the size of a crowded target even when they could not perceive its size, demonstrating for the first time that neurologically intact participants can use visual information that is not available to conscious report to scale their grasp to real objects in real scenes. In Experiments 2 and 3, we found that changing the eccentricity of the display and the orientation of the flankers had no effect on grasping but strongly affected perception. The differential effects of eccentricity and flanker orientation on perception and grasping show that the known differences in retinotopy between the ventral and dorsal streams are reflected in the way in which people deal with targets in cluttered scenes. © The Author(s) 2014.
Fu, Qiufang; Liu, Yong-Jin; Dienes, Zoltan; Wu, Jianhui; Chen, Wenfeng; Fu, Xiaolan
2016-07-01
A fundamental question in vision research is whether visual recognition is determined by edge-based information (e.g., edge, line, and conjunction) or surface-based information (e.g., color, brightness, and texture). To investigate this question, we manipulated the stimulus onset asynchrony (SOA) between the scene and the mask in a backward masking task of natural scene categorization. The behavioral results showed that correct classification was higher for line-drawings than for color photographs when the SOA was 13ms, but lower when the SOA was longer. The ERP results revealed that most latencies of early components were shorter for the line-drawings than for the color photographs, and the latencies gradually increased with the SOA for the color photographs but not for the line-drawings. The results provide new evidence that edge-based information is the primary determinant of natural scene categorization, receiving priority processing; by contrast, surface information takes longer to facilitate natural scene categorization. Copyright © 2016 Elsevier Inc. All rights reserved.
Hayes, Scott M; Nadel, Lynn; Ryan, Lee
2007-01-01
Previous research has investigated intentional retrieval of contextual information and contextual influences on object identification and word recognition, yet few studies have investigated context effects in episodic memory for objects. To address this issue, unique objects embedded in a visually rich scene or on a white background were presented to participants. At test, objects were presented either in the original scene or on a white background. A series of behavioral studies with young adults demonstrated a context shift decrement (CSD): decreased recognition performance when context is changed between encoding and retrieval. The CSD was not attenuated by encoding or retrieval manipulations, suggesting that binding of object and context may be automatic. A final experiment explored the neural correlates of the CSD, using functional Magnetic Resonance Imaging. Parahippocampal cortex (PHC) activation (right greater than left) during incidental encoding was associated with subsequent memory of objects in the context shift condition. Greater activity in right PHC was also observed during successful recognition of objects previously presented in a scene. Finally, a subset of regions activated during scene encoding, such as bilateral PHC, was reactivated when the object was presented on a white background at retrieval. Although participants were not required to intentionally retrieve contextual information, the results suggest that PHC may reinstate visual context to mediate successful episodic memory retrieval. The CSD is attributed to automatic and obligatory binding of object and context. The results suggest that PHC is important not only for processing of scene information, but also plays a role in successful episodic memory encoding and retrieval. These findings are consistent with the view that spatial information is stored in the hippocampal complex, one of the central tenets of Multiple Trace Theory. (c) 2007 Wiley-Liss, Inc.
Kobayashi, Yasutaka; Muramatsu, Tomoko; Sato, Mamiko; Hayashi, Hiromi; Miura, Toyoaki
2015-01-01
A 68-year-old man was admitted to our hospital for rehabilitation of topographical disorientation. Brain magnetic resonance imaging revealed infarction in the right medial side of the occipital lobe. On neuropsychological testing, he scored low for the visual information-processing task; however, his overall cognitive function was retained. He could identify parts of the picture while describing the context picture of the Visual Perception Test for Agnosia but could not explain the contents of the entire picture, representing so-called simultanagnosia. Further, he could morphologically perceive both familiar and new scenes, but could not identify them, representing so-called scene agnosia. We report this case because simultanagnosia associated with a right occipital lobe lesion is rare.
Spatial Correlations in Natural Scenes Modulate Response Reliability in Mouse Visual Cortex
Rikhye, Rajeev V.
2015-01-01
Intrinsic neuronal variability significantly limits information encoding in the primary visual cortex (V1). Certain stimuli can suppress this intertrial variability to increase the reliability of neuronal responses. In particular, responses to natural scenes, which have broadband spatiotemporal statistics, are more reliable than responses to stimuli such as gratings. However, very little is known about which stimulus statistics modulate reliable coding and how this occurs at the neural ensemble level. Here, we sought to elucidate the role that spatial correlations in natural scenes play in reliable coding. We developed a novel noise-masking method to systematically alter spatial correlations in natural movies, without altering their edge structure. Using high-speed two-photon calcium imaging in vivo, we found that responses in mouse V1 were much less reliable at both the single neuron and population level when spatial correlations were removed from the image. This change in reliability was due to a reorganization of between-neuron correlations. Strongly correlated neurons formed ensembles that reliably and accurately encoded visual stimuli, whereas reducing spatial correlations reduced the activation of these ensembles, leading to an unreliable code. Together with an ensemble-specific normalization model, these results suggest that the coordinated activation of specific subsets of neurons underlies the reliable coding of natural scenes. SIGNIFICANCE STATEMENT The natural environment is rich with information. To process this information with high fidelity, V1 neurons have to be robust to noise and, consequentially, must generate responses that are reliable from trial to trial. While several studies have hinted that both stimulus attributes and population coding may reduce noise, the details remain unclear. Specifically, what features of natural scenes are important and how do they modulate reliability? This study is the first to investigate the role of spatial correlations, which are a fundamental attribute of natural scenes, in shaping stimulus coding by V1 neurons. Our results provide new insights into how stimulus spatial correlations reorganize the correlated activation of specific ensembles of neurons to ensure accurate information processing in V1. PMID:26511254
Remembering faces and scenes: The mixed-category advantage in visual working memory.
Jiang, Yuhong V; Remington, Roger W; Asaad, Anthony; Lee, Hyejin J; Mikkalson, Taylor C
2016-09-01
We examined the mixed-category memory advantage for faces and scenes to determine how domain-specific cortical resources constrain visual working memory. Consistent with previous findings, visual working memory for a display of 2 faces and 2 scenes was better than that for a display of 4 faces or 4 scenes. This pattern was unaffected by manipulations of encoding duration. However, the mixed-category advantage was carried solely by faces: Memory for scenes was not better when scenes were encoded with faces rather than with other scenes. The asymmetry between faces and scenes was found when items were presented simultaneously or sequentially, centrally, or peripherally, and when scenes were drawn from a narrow category. A further experiment showed a mixed-category advantage in memory for faces and bodies, but not in memory for scenes and objects. The results suggest that unique category-specific interactions contribute significantly to the mixed-category advantage in visual working memory. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Groen, Iris I A; Silson, Edward H; Baker, Chris I
2017-02-19
Visual scene analysis in humans has been characterized by the presence of regions in extrastriate cortex that are selectively responsive to scenes compared with objects or faces. While these regions have often been interpreted as representing high-level properties of scenes (e.g. category), they also exhibit substantial sensitivity to low-level (e.g. spatial frequency) and mid-level (e.g. spatial layout) properties, and it is unclear how these disparate findings can be united in a single framework. In this opinion piece, we suggest that this problem can be resolved by questioning the utility of the classical low- to high-level framework of visual perception for scene processing, and discuss why low- and mid-level properties may be particularly diagnostic for the behavioural goals specific to scene perception as compared to object recognition. In particular, we highlight the contributions of low-level vision to scene representation by reviewing (i) retinotopic biases and receptive field properties of scene-selective regions and (ii) the temporal dynamics of scene perception that demonstrate overlap of low- and mid-level feature representations with those of scene category. We discuss the relevance of these findings for scene perception and suggest a more expansive framework for visual scene analysis.This article is part of the themed issue 'Auditory and visual scene analysis'. © 2017 The Author(s).
A neural correlate of working memory in the monkey primary visual cortex.
Supèr, H; Spekreijse, H; Lamme, V A
2001-07-06
The brain frequently needs to store information for short periods. In vision, this means that the perceptual correlate of a stimulus has to be maintained temporally once the stimulus has been removed from the visual scene. However, it is not known how the visual system transfers sensory information into a memory component. Here, we identify a neural correlate of working memory in the monkey primary visual cortex (V1). We propose that this component may link sensory activity with memory activity.
Situated sentence processing: the coordinated interplay account and a neurobehavioral model.
Crocker, Matthew W; Knoeferle, Pia; Mayberry, Marshall R
2010-03-01
Empirical evidence demonstrating that sentence meaning is rapidly reconciled with the visual environment has been broadly construed as supporting the seamless interaction of visual and linguistic representations during situated comprehension. Based on recent behavioral and neuroscientific findings, however, we argue for the more deeply rooted coordination of the mechanisms underlying visual and linguistic processing, and for jointly considering the behavioral and neural correlates of scene-sentence reconciliation during situated comprehension. The Coordinated Interplay Account (CIA; Knoeferle, P., & Crocker, M. W. (2007). The influence of recent scene events on spoken comprehension: Evidence from eye movements. Journal of Memory and Language, 57(4), 519-543) asserts that incremental linguistic interpretation actively directs attention in the visual environment, thereby increasing the salience of attended scene information for comprehension. We review behavioral and neuroscientific findings in support of the CIA's three processing stages: (i) incremental sentence interpretation, (ii) language-mediated visual attention, and (iii) the on-line influence of non-linguistic visual context. We then describe a recently developed connectionist model which both embodies the central CIA proposals and has been successfully applied in modeling a range of behavioral findings from the visual world paradigm (Mayberry, M. R., Crocker, M. W., & Knoeferle, P. (2009). Learning to attend: A connectionist model of situated language comprehension. Cognitive Science). Results from a new simulation suggest the model also correlates with event-related brain potentials elicited by the immediate use of visual context for linguistic disambiguation (Knoeferle, P., Habets, B., Crocker, M. W., & Münte, T. F. (2008). Visual scenes trigger immediate syntactic reanalysis: Evidence from ERPs during situated spoken comprehension. Cerebral Cortex, 18(4), 789-795). Finally, we argue that the mechanisms underlying interpretation, visual attention, and scene apprehension are not only in close temporal synchronization, but have co-adapted to optimize real-time visual grounding of situated spoken language, thus facilitating the association of linguistic, visual and motor representations that emerge during the course of our embodied linguistic experience in the world. Copyright 2009 Elsevier Inc. All rights reserved.
The role of iconic memory in change-detection tasks.
Becker, M W; Pashler, H; Anstis, S M
2000-01-01
In three experiments, subjects attempted to detect the change of a single item in a visually presented array of items. Subjects' ability to detect a change was greatly reduced if a blank interstimulus interval (ISI) was inserted between the original array and an array in which one item had changed ('change blindness'). However, change detection improved when the location of the change was cued during the blank ISI. This suggests that people represent more information of a scene than change blindness might suggest. We test two possible hypotheses why, in the absence of a cue, this representation fails to produce good change detection. The first claims that the intervening events employed to create change blindness result in multiple neural transients which co-occur with the to-be-detected change. Poor detection rates occur because a serial search of all the transient locations is required to detect the change, during which time the representation of the original scene fades. The second claims that the occurrence of the second frame overwrites the representation of the first frame, unless that information is insulated against overwriting by attention. The results support the second hypothesis. We conclude that people may have a fairly rich visual representation of a scene while the scene is present, but fail to detect changes because they lack the ability to simultaneously represent two complete visual representations.
Scene and Position Specificity in Visual Memory for Objects
ERIC Educational Resources Information Center
Hollingworth, Andrew
2006-01-01
This study investigated whether and how visual representations of individual objects are bound in memory to scene context. Participants viewed a series of naturalistic scenes, and memory for the visual form of a target object in each scene was examined in a 2-alternative forced-choice test, with the distractor object either a different object…
Pilot/vehicle model analysis of visually guided flight
NASA Technical Reports Server (NTRS)
Zacharias, Greg L.
1991-01-01
Information is given in graphical and outline form on a pilot/vehicle model description, control of altitude with simple terrain clues, simulated flight with visual scene delays, model-based in-cockpit display design, and some thoughts on the role of pilot/vehicle modeling.
Buchs, Galit; Maidenbaum, Shachar; Levy-Tzedek, Shelly; Amedi, Amir
2015-01-01
Purpose: To visually perceive our surroundings we constantly move our eyes and focus on particular details, and then integrate them into a combined whole. Current visual rehabilitation methods, both invasive, like bionic eyes, and non-invasive, like Sensory Substitution Devices (SSDs), down-sample visual stimuli into low-resolution images. Zooming-in to sub-parts of the scene could potentially improve detail perception. Can congenitally blind individuals integrate a ‘visual’ scene when offered this information via different sensory modalities, such as audition? Can they integrate visual information, perceived in parts, into larger percepts despite never having had any visual experience? Methods: We explored these questions using a zooming-in functionality embedded in the EyeMusic visual-to-auditory SSD. Eight blind participants were tasked with identifying cartoon faces by integrating their individual components recognized via the EyeMusic’s zooming mechanism. Results: After specialized training of just 6–10 hours, blind participants successfully and actively integrated facial features into cartooned identities in 79±18% of the trials, a highly significant result (chance level 10%; rank-sum P < 1.55E-04). Conclusions: These findings show that even users who lacked any previous visual experience whatsoever can indeed integrate this visual information with increased resolution. This potentially has important practical visual rehabilitation implications for both invasive and non-invasive methods. PMID:26518671
Modulation of visually evoked movement responses in moving virtual environments.
Reed-Jones, Rebecca J; Vallis, Lori Ann
2009-01-01
Virtual-reality technology is being increasingly used to understand how humans perceive and act in the moving world around them. What is currently not clear is how virtual reality technology is perceived by human participants and what virtual scenes are effective in evoking movement responses to visual stimuli. We investigated the effect of virtual-scene context on human responses to a virtual visual perturbation. We hypothesised that exposure to a natural scene that matched the visual expectancies of the natural world would create a perceptual set towards presence, and thus visual guidance of body movement in a subsequently presented virtual scene. Results supported this hypothesis; responses to a virtual visual perturbation presented in an ambiguous virtual scene were increased when participants first viewed a scene that consisted of natural landmarks which provided 'real-world' visual motion cues. Further research in this area will provide a basis of knowledge for the effective use of this technology in the study of human movement responses.
Manipulating the content of dynamic natural scenes to characterize response in human MT/MST.
Durant, Szonya; Wall, Matthew B; Zanker, Johannes M
2011-09-09
Optic flow is one of the most important sources of information for enabling human navigation through the world. A striking finding from single-cell studies in monkeys is the rapid saturation of response of MT/MST areas with the density of optic flow type motion information. These results are reflected psychophysically in human perception in the saturation of motion aftereffects. We began by comparing responses to natural optic flow scenes in human visual brain areas to responses to the same scenes with inverted contrast (photo negative). This changes scene familiarity while preserving local motion signals. This manipulation had no effect; however, the response was only correlated with the density of local motion (calculated by a motion correlation model) in V1, not in MT/MST. To further investigate this, we manipulated the visible proportion of natural dynamic scenes and found that areas MT and MST did not increase in response over a 16-fold increase in the amount of information presented, i.e., response had saturated. This makes sense in light of the sparseness of motion information in natural scenes, suggesting that the human brain is well adapted to exploit a small amount of dynamic signal and extract information important for survival.
Bio-inspired display of polarization information using selected visual cues
NASA Astrophysics Data System (ADS)
Yemelyanov, Konstantin M.; Lin, Shih-Schon; Luis, William Q.; Pugh, Edward N., Jr.; Engheta, Nader
2003-12-01
For imaging systems the polarization of electromagnetic waves carries much potentially useful information about such features of the world as the surface shape, material contents, local curvature of objects, as well as about the relative locations of the source, object and imaging system. The imaging system of the human eye however, is "polarization-blind", and cannot utilize the polarization of light without the aid of an artificial, polarization-sensitive instrument. Therefore, polarization information captured by a man-made polarimetric imaging system must be displayed to a human observer in the form of visual cues that are naturally processed by the human visual system, while essentially preserving the other important non-polarization information (such as spectral and intensity information) in an image. In other words, some forms of sensory substitution are needed for representing polarization "signals" without affecting other visual information such as color and brightness. We are investigating several bio-inspired representational methodologies for mapping polarization information into visual cues readily perceived by the human visual system, and determining which mappings are most suitable for specific applications such as object detection, navigation, sensing, scene classifications, and surface deformation. The visual cues and strategies we are exploring are the use of coherently moving dots superimposed on image to represent various range of polarization signals, overlaying textures with spatial and/or temporal signatures to segregate regions of image with differing polarization, modulating luminance and/or color contrast of scenes in terms of certain aspects of polarization values, and fusing polarization images into intensity-only images. In this talk, we will present samples of our findings in this area.
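As a rough illustration of the mapping strategies above, the following Python sketch (an assumption of ours, not the authors' implementation) modulates hue by the angle of linear polarization and saturation by its degree, while preserving ordinary intensity as brightness; the function name, the HSV mapping, and the input conventions are all illustrative.

    import numpy as np
    import colorsys

    def overlay_polarization(intensity, dolp, aolp):
        # intensity: HxW array in [0, 1]; dolp: degree of linear polarization in [0, 1];
        # aolp: angle of linear polarization in radians. Hue encodes the angle,
        # saturation encodes the degree, and value preserves the intensity image.
        h, w = intensity.shape
        out = np.zeros((h, w, 3))
        for i in range(h):
            for j in range(w):
                hue = (aolp[i, j] % np.pi) / np.pi
                sat = float(np.clip(dolp[i, j], 0.0, 1.0))
                val = float(np.clip(intensity[i, j], 0.0, 1.0))
                out[i, j] = colorsys.hsv_to_rgb(hue, sat, val)
        return out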
The Relationship Between Online Visual Representation of a Scene and Long-Term Scene Memory
ERIC Educational Resources Information Center
Hollingworth, Andrew
2005-01-01
In 3 experiments the author investigated the relationship between the online visual representation of natural scenes and long-term visual memory. In a change detection task, a target object either changed or remained the same from an initial image of a natural scene to a test image. Two types of changes were possible: rotation in depth, or…
Basic level scene understanding: categories, attributes and structures
Xiao, Jianxiong; Hays, James; Russell, Bryan C.; Patterson, Genevieve; Ehinger, Krista A.; Torralba, Antonio; Oliva, Aude
2013-01-01
A longstanding goal of computer vision is to build a system that can automatically understand a 3D scene from a single image. This requires extracting semantic concepts and 3D information from 2D images which can depict an enormous variety of environments that comprise our visual world. This paper summarizes our recent efforts toward these goals. First, we describe the richly annotated SUN database which is a collection of annotated images spanning 908 different scene categories with object, attribute, and geometric labels for many scenes. This database allows us to systematically study the space of scenes and to establish a benchmark for scene and object recognition. We augment the categorical SUN database with 102 scene attributes for every image and explore attribute recognition. Finally, we present an integrated system to extract the 3D structure of the scene and objects depicted in an image. PMID:24009590
Visual Attention Model Based on Statistical Properties of Neuron Responses
Duan, Haibin; Wang, Xiaohua
2015-01-01
Visual attention is a mechanism of the visual system that can select relevant objects from a specific scene. Interactions among neurons in multiple cortical areas are considered to be involved in attentional allocation. However, the characteristics of the encoded features and neuron responses in those attention related cortices are indefinite. Therefore, further investigations carried out in this study aim at demonstrating that unusual regions arousing more attention generally cause particular neuron responses. We suppose that visual saliency is obtained on the basis of neuron responses to contexts in natural scenes. A bottom-up visual attention model is proposed based on the self-information of neuron responses to test and verify the hypothesis. Four different color spaces are adopted and a novel entropy-based combination scheme is designed to make full use of color information. Valuable regions are highlighted while redundant backgrounds are suppressed in the saliency maps obtained by the proposed model. Comparative results reveal that the proposed model outperforms several state-of-the-art models. This study provides insights into the neuron responses based saliency detection and may underlie the neural mechanism of early visual cortices for bottom-up visual attention. PMID:25747859
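As a rough sketch of the self-information idea behind such bottom-up models, the Python code below scores each location by -log p of its response value within a channel and then combines channels with entropy-derived weights; the histogram-based densities and the specific weighting rule are assumed simplifications, not the authors' model.

    import numpy as np

    def self_information_saliency(channel, n_bins=64):
        # Saliency of each location as the self-information -log p(v) of its
        # response value within the image: rarer responses are more salient.
        hist, edges = np.histogram(channel, bins=n_bins)
        p = hist / max(hist.sum(), 1)
        idx = np.digitize(channel, edges[1:-1])
        return -np.log(p[idx] + 1e-12)

    def entropy_weighted_combination(maps):
        # Combine per-color-space saliency maps; channels whose saliency values
        # are more concentrated (lower entropy) get larger weights. This rule is
        # an assumed stand-in for the paper's entropy-based combination scheme.
        weights = []
        for s in maps:
            p, _ = np.histogram(s, bins=64)
            p = p / max(p.sum(), 1)
            h = -np.sum(p * np.log(p + 1e-12))
            weights.append(1.0 / (h + 1e-6))
        weights = np.asarray(weights) / np.sum(weights)
        return sum(w * s for w, s in zip(weights, maps))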
Global Transsaccadic Change Blindness During Scene Perception
2003-09-01
Two Distinct Scene-Processing Networks Connecting Vision and Memory.
Baldassano, Christopher; Esteva, Andre; Fei-Fei, Li; Beck, Diane M
2016-01-01
A number of regions in the human brain are known to be involved in processing natural scenes, but the field has lacked a unifying framework for understanding how these different regions are organized and interact. We provide evidence from functional connectivity and meta-analyses for a new organizational principle, in which scene processing relies upon two distinct networks that split the classically defined parahippocampal place area (PPA). The first network of strongly connected regions consists of the occipital place area/transverse occipital sulcus and posterior PPA, which contain retinotopic maps and are not strongly coupled to the hippocampus at rest. The second network consists of the caudal inferior parietal lobule, retrosplenial complex, and anterior PPA, which connect to the hippocampus (especially anterior hippocampus), and are implicated in both visual and nonvisual tasks, including episodic memory and navigation. We propose that these two distinct networks capture the primary functional division among scene-processing regions, between those that process visual features from the current view of a scene and those that connect information from a current scene view with a much broader temporal and spatial context. This new framework for understanding the neural substrates of scene-processing bridges results from many lines of research, and makes specific functional predictions.
Büttner, Oliver B; Wieber, Frank; Schulz, Anna Maria; Bayer, Ute C; Florack, Arnd; Gollwitzer, Peter M
2014-10-01
Mindset theory suggests that a deliberative mindset entails openness to information in one's environment, whereas an implemental mindset entails filtering of information. We hypothesized that this open- versus closed-mindedness influences individuals' breadth of visual attention. In Studies 1 and 2, we induced an implemental or deliberative mindset, and measured breadth of attention using participants' length estimates of x-winged Müller-Lyer figures. Both studies demonstrate a narrower breadth of attention in the implemental mindset than in the deliberative mindset. In Study 3, we manipulated participants' mindsets and measured the breadth of attention by tracking eye movements during scene perception. Implemental mindset participants focused on foreground objects, whereas deliberative mindset participants attended more evenly to the entire scene. Our findings imply that deliberative versus implemental mindsets already operate at the level of visual attention. © 2014 by the Society for Personality and Social Psychology, Inc.
Using articulated scene models for dynamic 3d scene analysis in vista spaces
NASA Astrophysics Data System (ADS)
Beuter, Niklas; Swadzba, Agnes; Kummert, Franz; Wachsmuth, Sven
2010-09-01
In this paper we describe an efficient but detailed new approach to analyze complex dynamic scenes directly in 3D. The resulting information is important for mobile robots to solve tasks in the area of household robotics. In our work a mobile robot builds an articulated scene model by observing the environment in the visual field or rather in the so-called vista space. The articulated scene model consists of essential knowledge about the static background, about autonomously moving entities like humans or robots and finally, in contrast to existing approaches, information about articulated parts. These parts describe movable objects like chairs, doors or other tangible entities, which could be moved by an agent. The combination of the static scene, the self-moving entities and the movable objects in one articulated scene model enhances the calculation of each single part. The reconstruction process for parts of the static scene benefits from removal of the dynamic parts and, in turn, the moving parts can be extracted more easily through the knowledge about the background. In our experiments we show that the system simultaneously delivers an accurate static background model, moving persons, and movable objects. This information of the articulated scene model enables a mobile robot to detect and keep track of interaction partners, to navigate safely through the environment and finally, to strengthen the interaction with the user through the knowledge about the 3D articulated objects and 3D scene analysis.
Visual encoding and fixation target selection in free viewing: presaccadic brain potentials
Nikolaev, Andrey R.; Jurica, Peter; Nakatani, Chie; Plomp, Gijs; van Leeuwen, Cees
2013-01-01
In scrutinizing a scene, the eyes alternate between fixations and saccades. During a fixation, two component processes can be distinguished: visual encoding and selection of the next fixation target. We aimed to distinguish the neural correlates of these processes in the electrical brain activity prior to a saccade onset. Participants viewed color photographs of natural scenes, in preparation for a change detection task. Then, for each participant and each scene we computed an image heat map, with temperature representing the duration and density of fixations. The temperature difference between the start and end points of saccades was taken as a measure of the expected task-relevance of the information concentrated in specific regions of a scene. Visual encoding was evaluated according to whether subsequent change was correctly detected. Saccades with larger temperature difference were more likely to be followed by correct detection than ones with smaller temperature differences. The amplitude of presaccadic activity over anterior brain areas was larger for correct detection than for detection failure. This difference was observed for short “scrutinizing” but not for long “explorative” saccades, suggesting that presaccadic activity reflects top-down saccade guidance. Thus, successful encoding requires local scanning of scene regions which are expected to be task-relevant. Next, we evaluated fixation target selection. Saccades “moving up” in temperature were preceded by presaccadic activity of higher amplitude than those “moving down”. This finding suggests that presaccadic activity reflects attention deployed to the following fixation location. Our findings illustrate how presaccadic activity can elucidate concurrent brain processes related to the immediate goal of planning the next saccade and the larger-scale goal of constructing a robust representation of the visual scene. PMID:23818877
A Theoretical and Experimental Analysis of the Outside World Perception Process
NASA Technical Reports Server (NTRS)
Wewerinke, P. H.
1978-01-01
The outside scene is often an important source of information for manual control tasks. Important examples of these are car driving and aircraft control. This paper deals with modelling this visual scene perception process on the basis of linear perspective geometry and relative motion cues. Model predictions for a variety of visual approach tasks, derived from psychophysical threshold data obtained in baseline experiments and from the literature, are compared with experimental data. Both the performance and workload results illustrate that the model provides a meaningful description of the outside world perception process, with a useful predictive capability.
Behavioral and Neural Representations of Spatial Directions across Words, Schemas, and Images.
Weisberg, Steven M; Marchette, Steven A; Chatterjee, Anjan
2018-05-23
Modern spatial navigation requires fluency with multiple representational formats, including visual scenes, signs, and words. These formats convey different information. Visual scenes are rich and specific but contain extraneous details. Arrows, as an example of signs, are schematic representations in which the extraneous details are eliminated, but analog spatial properties are preserved. Words eliminate all spatial information and convey spatial directions in a purely abstract form. How does the human brain compute spatial directions within and across these formats? To investigate this question, we conducted two experiments on men and women: a behavioral study that was preregistered and a neuroimaging study using multivoxel pattern analysis of fMRI data to uncover similarities and differences among representational formats. Participants in the behavioral study viewed spatial directions presented as images, schemas, or words (e.g., "left"), and responded to each trial, indicating whether the spatial direction was the same or different as the one viewed previously. They responded more quickly to schemas and words than images, despite the visual complexity of stimuli being matched. Participants in the fMRI study performed the same task but responded only to occasional catch trials. Spatial directions in images were decodable in the intraparietal sulcus bilaterally but were not in schemas and words. Spatial directions were also decodable between all three formats. These results suggest that intraparietal sulcus plays a role in calculating spatial directions in visual scenes, but this neural circuitry may be bypassed when the spatial directions are presented as schemas or words. SIGNIFICANCE STATEMENT Human navigators encounter spatial directions in various formats: words ("turn left"), schematic signs (an arrow showing a left turn), and visual scenes (a road turning left). The brain must transform these spatial directions into a plan for action. Here, we investigate similarities and differences between neural representations of these formats. We found that bilateral intraparietal sulci represent spatial directions in visual scenes and across the three formats. We also found that participants respond quickest to schemas, then words, then images, suggesting that spatial directions in abstract formats are easier to interpret than concrete formats. These results support a model of spatial direction interpretation in which spatial directions are either computed for real world action or computed for efficient visual comparison. Copyright © 2018 the authors 0270-6474/18/384996-12$15.00/0.
Scene and human face recognition in the central vision of patients with glaucoma
Aptel, Florent; Attye, Arnaud; Guyader, Nathalie; Boucart, Muriel; Chiquet, Christophe; Peyrin, Carole
2018-01-01
Primary open-angle glaucoma (POAG) firstly mainly affects peripheral vision. Current behavioral studies support the idea that visual defects of patients with POAG extend into parts of the central visual field classified as normal by static automated perimetry analysis. This is particularly true for visual tasks involving processes of a higher level than mere detection. The purpose of this study was to assess visual abilities of POAG patients in central vision. Patients were assigned to two groups following a visual field examination (Humphrey 24–2 SITA-Standard test). Patients with both peripheral and central defects and patients with peripheral but no central defect, as well as age-matched controls, participated in the experiment. All participants had to perform two visual tasks where low-contrast stimuli were presented in the central 6° of the visual field. A categorization task of scene images and human face images assessed high-level visual recognition abilities. In contrast, a detection task using the same stimuli assessed low-level visual function. The difference in performance between detection and categorization revealed the cost of high-level visual processing. Compared to controls, patients with a central visual defect showed a deficit in both detection and categorization of all low-contrast images. This is consistent with the abnormal retinal sensitivity as assessed by perimetry. However, the deficit was greater for categorization than detection. Patients without a central defect showed similar performances to the controls concerning the detection and categorization of faces. However, while the detection of scene images was well-maintained, these patients showed a deficit in their categorization. This suggests that the simple loss of peripheral vision could be detrimental to scene recognition, even when the information is displayed in central vision. This study revealed subtle defects in the central visual field of POAG patients that cannot be predicted by static automated perimetry assessment using Humphrey 24–2 SITA-Standard test. PMID:29481572
Eye movements and attention in reading, scene perception, and visual search.
Rayner, Keith
2009-08-01
Eye movements are now widely used to investigate cognitive processes during reading, scene perception, and visual search. In this article, research on the following topics is reviewed with respect to reading: (a) the perceptual span (or span of effective vision), (b) preview benefit, (c) eye movement control, and (d) models of eye movements. Related issues with respect to eye movements during scene perception and visual search are also reviewed. It is argued that research on eye movements during reading has been somewhat advanced over research on eye movements in scene perception and visual search and that some of the paradigms developed to study reading should be more widely adopted in the study of scene perception and visual search. Research dealing with "real-world" tasks and research utilizing the visual-world paradigm are also briefly discussed.
Vestibular nuclei and cerebellum put visual gravitational motion in context.
Miller, William L; Maffei, Vincenzo; Bosco, Gianfranco; Iosa, Marco; Zago, Myrka; Macaluso, Emiliano; Lacquaniti, Francesco
2008-04-01
Animal survival in the forest, and human success on the sports field, often depend on the ability to seize a target on the fly. All bodies fall at the same rate in the gravitational field, but the corresponding retinal motion varies with apparent viewing distance. How then does the brain predict time-to-collision under gravity? A perspective context from natural or pictorial settings might afford accurate predictions of gravity's effects via the recovery of an environmental reference from the scene structure. We report that embedding motion in a pictorial scene facilitates interception of gravitational acceleration over unnatural acceleration, whereas a blank scene eliminates such bias. Functional magnetic resonance imaging (fMRI) revealed blood-oxygen-level-dependent correlates of these visual context effects on gravitational motion processing in the vestibular nuclei and posterior cerebellar vermis. Our results suggest an early stage of integration of high-level visual analysis with gravity-related motion information, which may represent the substrate for perceptual constancy of ubiquitous gravitational motion.
Visibility Equalizer Cutaway Visualization of Mesoscopic Biological Models.
Le Muzic, M; Mindek, P; Sorger, J; Autin, L; Goodsell, D; Viola, I
2016-06-01
In scientific illustrations and visualization, cutaway views are often employed as an effective technique for occlusion management in densely packed scenes. We propose a novel method for authoring cutaway illustrations of mesoscopic biological models. In contrast to the existing cutaway algorithms, we take advantage of the specific nature of the biological models. These models consist of thousands of instances with a comparably smaller number of different types. Our method constitutes a two-stage process. In the first step, clipping objects are placed in the scene, creating a cutaway visualization of the model. During this process, a hierarchical list of stacked bars informs the user about the instance visibility distribution of each individual molecular type in the scene. In the second step, the visibility of each molecular type is fine-tuned through these bars, which at this point act as interactive visibility equalizers. An evaluation of our technique with domain experts confirmed that our equalizer-based approach for visibility specification was valuable and effective for both scientific and educational purposes.
Comparing object recognition from binary and bipolar edge images for visual prostheses.
Jung, Jae-Hyun; Pu, Tian; Peli, Eli
2016-11-01
Visual prostheses require an effective representation method due to the limited display condition which has only 2 or 3 levels of grayscale in low resolution. Edges derived from abrupt luminance changes in images carry essential information for object recognition. Typical binary (black and white) edge images have been used to represent features to convey essential information. However, in scenes with a complex cluttered background, the recognition rate of the binary edge images by human observers is limited and additional information is required. The polarity of edges and cusps (black or white features on a gray background) carries important additional information; the polarity may provide shape from shading information missing in the binary edge image. This depth information may be restored by using bipolar edges. We compared object recognition rates from 16 binary edge images and bipolar edge images by 26 subjects to determine the possible impact of bipolar filtering in visual prostheses with 3 or more levels of grayscale. Recognition rates were higher with bipolar edge images and the improvement was significant in scenes with complex backgrounds. The results also suggest that erroneous shape from shading interpretation of bipolar edges resulting from pigment rather than boundaries of shape may confound the recognition.
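For intuition, here is a minimal Python sketch contrasting a binary edge rendering with a bipolar one (opposite response polarities rendered dark and bright on a mid-gray background), using a Laplacian-of-Gaussian response as an assumed stand-in for the paper's edge filtering; sigma and threshold are illustrative values.

    import numpy as np
    from scipy.ndimage import gaussian_laplace

    def edge_images(gray, sigma=2.0, threshold=0.02):
        # gray: 2D array scaled to [0, 1]. Returns (binary, bipolar) renderings.
        log = gaussian_laplace(gray.astype(float), sigma=sigma)
        strong = np.abs(log) > threshold           # locations with a strong edge response
        binary = np.where(strong, 1.0, 0.0)        # white edges on a black background
        bipolar = np.full(gray.shape, 0.5)         # mid-gray background
        bipolar[strong & (log > 0)] = 0.0          # one response polarity rendered dark
        bipolar[strong & (log < 0)] = 1.0          # the other polarity rendered bright
        return binary, bipolar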
Robust sampling of decision information during perceptual choice
Vandormael, Hildward; Herce Castañón, Santiago; Balaguer, Jan; Li, Vickie; Summerfield, Christopher
2017-01-01
Humans move their eyes to gather information about the visual world. However, saccadic sampling has largely been explored in paradigms that involve searching for a lone target in a cluttered array or natural scene. Here, we investigated the policy that humans use to overtly sample information in a perceptual decision task that required information from across multiple spatial locations to be combined. Participants viewed a spatial array of numbers and judged whether the average was greater or smaller than a reference value. Participants preferentially sampled items that were less diagnostic of the correct answer (“inlying” elements; that is, elements closer to the reference value). This preference to sample inlying items was linked to decisions, enhancing the tendency to give more weight to inlying elements in the final choice (“robust averaging”). These findings contrast with a large body of evidence indicating that gaze is directed preferentially to deviant information during natural scene viewing and visual search, and suggest that humans may sample information “robustly” with their eyes during perceptual decision-making. PMID:28223519
How many pixels make a memory? Picture memory for small pictures.
Wolfe, Jeremy M; Kuzmova, Yoana I
2011-06-01
Torralba (Visual Neuroscience, 26, 123-131, 2009) showed that, if the resolution of images of scenes were reduced to the information present in very small "thumbnail images," those scenes could still be recognized. The objects in those degraded scenes could be identified, even though it would be impossible to identify them if they were removed from the scene context. Can tiny and/or degraded scenes be remembered, or are they like brief presentations, identified but not remembered? We report that memory for tiny and degraded scenes parallels the recognizability of those scenes. You can remember a scene to approximately the degree to which you can classify it. Interestingly, there is a striking asymmetry in memory when scenes are not the same size on their initial appearance and subsequent test. Memory for a large, full-resolution stimulus can be tested with a small, degraded stimulus. However, memory for a small stimulus is not retrieved when it is tested with a large stimulus.
Statistics of high-level scene context.
Greene, Michelle R
2013-01-01
Context is critical for recognizing environments and for searching for objects within them: contextual associations have been shown to modulate reaction time and object recognition accuracy, as well as influence the distribution of eye movements and patterns of brain activations. However, we have not yet systematically quantified the relationships between objects and their scene environments. Here I seek to fill this gap by providing descriptive statistics of object-scene relationships. A total of 48,167 objects were hand-labeled in 3499 scenes using the LabelMe tool (Russell et al., 2008). From these data, I computed a variety of descriptive statistics at three different levels of analysis: the ensemble statistics that describe the density and spatial distribution of unnamed "things" in the scene; the bag of words level where scenes are described by the list of objects contained within them; and the structural level where the spatial distribution and relationships between the objects are measured. The utility of each level of description for scene categorization was assessed through the use of linear classifiers, and the plausibility of each level for modeling human scene categorization is discussed. Of the three levels, ensemble statistics were found to be the most informative (per feature), and also best explained human patterns of categorization errors. Although a bag of words classifier had similar performance to human observers, it had a markedly different pattern of errors. However, certain objects are more useful than others, and ceiling classification performance could be achieved using only the 64 most informative objects. As object location tends not to vary as a function of category, structural information provided little additional information. Additionally, these data provide valuable information on natural scene redundancy that can be exploited for machine vision, and can help the visual cognition community to design experiments guided by statistics rather than intuition.
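To make the bag-of-words level concrete, the sketch below trains a linear classifier on bag-of-objects counts; the toy object lists, category labels, and the scikit-learn dependency are hypothetical stand-ins for the LabelMe annotations and the classifiers actually used.

    from sklearn.feature_extraction import DictVectorizer
    from sklearn.svm import LinearSVC

    # Hypothetical toy data: each scene is just the list of object names it
    # contains, paired with its category label.
    scenes = [(["car", "road", "tree", "sign"], "street"),
              (["bed", "lamp", "window"], "bedroom"),
              (["car", "building", "road"], "street"),
              (["bed", "pillow", "curtain", "lamp"], "bedroom")]

    def bag_of_objects(names):
        counts = {}
        for n in names:
            counts[n] = counts.get(n, 0) + 1
        return counts

    vec = DictVectorizer(sparse=False)
    X = vec.fit_transform([bag_of_objects(objs) for objs, _ in scenes])
    y = [label for _, label in scenes]
    clf = LinearSVC().fit(X, y)  # linear classifier over bag-of-objects features
    print(clf.predict(vec.transform([bag_of_objects(["road", "car"])])))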
An automated approach for tone mapping operator parameter adjustment in security applications
NASA Astrophysics Data System (ADS)
Krasula, Lukáš; Narwaria, Manish; Le Callet, Patrick
2014-05-01
High Dynamic Range (HDR) imaging has been gaining popularity in recent years. Different from the traditional low dynamic range (LDR), HDR content tends to be visually more appealing and realistic as it can represent the dynamic range of the visual stimuli present in the real world. As a result, more scene details can be faithfully reproduced and the visual quality tends to improve. HDR can also be directly exploited for new applications such as video surveillance and other security tasks. Since more scene details are available in HDR, it can help in identifying and tracking visual information which otherwise might be difficult with typical LDR content due to factors such as lack or excess of illumination, extreme contrast in the scene, etc. On the other hand, with HDR, there might be issues related to increased privacy intrusion. To display HDR content on a regular screen, tone-mapping operators (TMOs) are used. In this paper, we present a universal method for tuning TMO parameters in order to maintain as many details as possible, which is desirable in security applications. The method's performance is verified on several TMOs by comparing the outcomes from tone-mapping with default and optimized parameters. The results suggest that the proposed approach preserves more information, which could be of advantage for security surveillance but, on the other hand, makes us consider a possible increase in privacy intrusion.
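The sketch below illustrates the general flavor of automated TMO parameter tuning under stated assumptions: a toy gamma-curve operator, a grid search over its single parameter, and histogram entropy as a proxy for retained detail. It is not the method evaluated in the paper.

    import numpy as np

    def tone_map(hdr, gamma):
        # Toy global operator: normalize linear HDR luminance, apply a gamma curve.
        x = hdr / (hdr.max() + 1e-12)
        return x ** gamma

    def detail_score(ldr, n_bins=256):
        # Proxy for retained detail: entropy of the tone-mapped intensity histogram.
        p, _ = np.histogram(ldr, bins=n_bins, range=(0.0, 1.0))
        p = p / max(p.sum(), 1)
        return -np.sum(p * np.log2(p + 1e-12))

    def tune_gamma(hdr, candidates=np.linspace(0.2, 1.0, 17)):
        # Grid search: keep the parameter value whose output scores highest.
        return max(candidates, key=lambda g: detail_score(tone_map(hdr, g)))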
Visual memory for moving scenes.
DeLucia, Patricia R; Maldia, Maria M
2006-02-01
In the present study, memory for picture boundaries was measured with scenes that simulated self-motion along the depth axis. The results indicated that boundary extension (a distortion in memory for picture boundaries) occurred with moving scenes in the same manner as that reported previously for static scenes. Furthermore, motion affected memory for the boundaries but this effect of motion was not consistent with representational momentum of the self (memory being further forward in a motion trajectory than actually shown). We also found that memory for the final position of the depicted self in a moving scene was influenced by properties of the optical expansion pattern. The results are consistent with a conceptual framework in which the mechanisms that underlie boundary extension and representational momentum (a) process different information and (b) both contribute to the integration of successive views of a scene while the scene is changing.
Representation of Gravity-Aligned Scene Structure in Ventral Pathway Visual Cortex.
Vaziri, Siavash; Connor, Charles E
2016-03-21
The ventral visual pathway in humans and non-human primates is known to represent object information, including shape and identity [1]. Here, we show the ventral pathway also represents scene structure aligned with the gravitational reference frame in which objects move and interact. We analyzed shape tuning of recently described macaque monkey ventral pathway neurons that prefer scene-like stimuli to objects [2]. Individual neurons did not respond to a single shape class, but to a variety of scene elements that are typically aligned with gravity: large planes in the orientation range of ground surfaces under natural viewing conditions, planes in the orientation range of ceilings, and extended convex and concave edges in the orientation range of wall/floor/ceiling junctions. For a given neuron, these elements tended to share a common alignment in eye-centered coordinates. Thus, each neuron integrated information about multiple gravity-aligned structures as they would be seen from a specific eye and head orientation. This eclectic coding strategy provides only ambiguous information about individual structures but explicit information about the environmental reference frame and the orientation of gravity in egocentric coordinates. In the ventral pathway, this could support perceiving and/or predicting physical events involving objects subject to gravity, recognizing object attributes like animacy based on movement not caused by gravity, and/or stabilizing perception of the world against changes in head orientation [3-5]. Our results, like the recent discovery of object weight representation [6], imply that the ventral pathway is involved not just in recognition, but also in physical understanding of objects and scenes. Copyright © 2016 Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Torralba, Antonio; Oliva, Aude; Castelhano, Monica S.; Henderson, John M.
2006-01-01
Many experiments have shown that the human visual system makes extensive use of contextual information for facilitating object search in natural scenes. However, the question of how to formally model contextual influences is still open. On the basis of a Bayesian framework, the authors present an original approach of attentional guidance by global…
Bottom-up Attention Orienting in Young Children with Autism
ERIC Educational Resources Information Center
Amso, Dima; Haas, Sara; Tenenbaum, Elena; Markant, Julie; Sheinkopf, Stephen J.
2014-01-01
We examined the impact of simultaneous bottom-up visual influences and meaningful social stimuli on attention orienting in young children with autism spectrum disorders (ASDs). Relative to typically-developing age and sex matched participants, children with ASDs were more influenced by bottom-up visual scene information regardless of whether…
A lighting metric for quantitative evaluation of accent lighting systems
NASA Astrophysics Data System (ADS)
Acholo, Cyril O.; Connor, Kenneth A.; Radke, Richard J.
2014-09-01
Accent lighting is critical for artwork and sculpture lighting in museums, and for subject lighting on stage and in film and television. The research problem of designing effective lighting in such settings has been revived recently with the rise of light-emitting-diode-based solid state lighting. In this work, we propose an easy-to-apply quantitative measure of the scene's visual quality as perceived by human viewers. We consider a well-accent-lit scene as one which maximizes the information about the scene (in an information-theoretic sense) available to the user. We propose a metric based on the entropy of the distribution of colors, which are extracted from an image of the scene from the viewer's perspective. We demonstrate that optimizing the metric as a function of illumination configuration (i.e., position, orientation, and spectral composition) results in natural, pleasing accent lighting. We use a photorealistic simulation tool to validate the functionality of our proposed approach, showing its successful application to two- and three-dimensional scenes.
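A minimal sketch of the entropy-of-colors idea follows: Shannon entropy over a coarsely quantized RGB histogram of a photograph taken from the viewer's position. The quantization level and the assumption of RGB values in [0, 1] are illustrative choices rather than details from the paper.

    import numpy as np

    def accent_lighting_metric(rgb, bins_per_channel=8):
        # rgb: HxWx3 array with values in [0, 1], an image of the lit scene seen
        # from the viewer's position. Returns the Shannon entropy (in bits) of
        # the quantized color distribution; higher = more visible color variety.
        q = np.clip((rgb * bins_per_channel).astype(int), 0, bins_per_channel - 1)
        codes = (q[..., 0] * bins_per_channel + q[..., 1]) * bins_per_channel + q[..., 2]
        counts = np.bincount(codes.ravel(), minlength=bins_per_channel ** 3)
        p = counts / counts.sum()
        p = p[p > 0]
        return float(-np.sum(p * np.log2(p)))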
Viewing social scenes: a visual scan-path study comparing fragile X syndrome and Williams syndrome.
Williams, Tracey A; Porter, Melanie A; Langdon, Robyn
2013-08-01
Fragile X syndrome (FXS) and Williams syndrome (WS) are both genetic disorders which present with similar cognitive-behavioral problems, but distinct social phenotypes. Despite these social differences both syndromes display poor social relations which may result from abnormal social processing. This study aimed to manipulate the location of socially salient information within scenes to investigate the visual attentional mechanisms of: capture, disengagement, and/or general engagement. Findings revealed that individuals with FXS avoid social information presented centrally, at least initially. The WS findings, on the other hand, provided some evidence that difficulties with attentional disengagement, rather than attentional capture, may play a role in the WS social phenotype. These findings are discussed in relation to the distinct social phenotypes of these two disorders.
How do visual and postural cues combine for self-tilt perception during slow pitch rotations?
Scotto Di Cesare, C; Buloup, F; Mestre, D R; Bringoux, L
2014-11-01
Self-orientation perception relies on the integration of multiple sensory inputs which convey spatially-related visual and postural cues. In the present study, an experimental set-up was used to tilt the body and/or the visual scene to investigate how these postural and visual cues are integrated for self-tilt perception (the subjective sensation of being tilted). Participants were required to repeatedly rate a confidence level for self-tilt perception during slow (0.05°·s⁻¹) body and/or visual scene pitch tilts up to 19° relative to vertical. Concurrently, subjects also had to perform arm reaching movements toward a body-fixed target at certain specific angles of tilt. While performance of a concurrent motor task did not influence the main perceptual task, self-tilt detection did vary according to the visuo-postural stimuli. Slow forward or backward tilts of the visual scene alone did not induce a marked sensation of self-tilt contrary to actual body tilt. However, combined body and visual scene tilt influenced self-tilt perception more strongly, although this effect was dependent on the direction of visual scene tilt: only a forward visual scene tilt combined with a forward body tilt facilitated self-tilt detection. In such a case, visual scene tilt did not seem to induce vection but rather may have produced a deviation of the perceived orientation of the longitudinal body axis in the forward direction, which may have lowered the self-tilt detection threshold during actual forward body tilt. Copyright © 2014 Elsevier B.V. All rights reserved.
Streepey, Jefferson W; Kenyon, Robert V; Keshner, Emily A
2007-01-01
We previously reported responses to induced postural instability in young healthy individuals viewing visual motion with a narrow (25 degrees in both directions) and wide (90 degrees and 55 degrees in the horizontal and vertical directions) field of view (FOV) as they stood on different sized blocks. Visual motion was achieved using an immersive virtual environment that moved realistically with head motion (natural motion) and translated sinusoidally at 0.1 Hz in the fore-aft direction (augmented motion). We observed that a subset of the subjects (steppers) could not maintain continuous stance on the smallest block when the virtual environment was in motion. We completed a posteriori analyses on the postural responses of the steppers and non-steppers that may inform us about the mechanisms underlying these differences in stability. We found that when viewing augmented motion with a wide FOV, there was a greater effect on the head and whole body center of mass and ankle angle root mean square (RMS) values of the steppers than of the non-steppers. FFT analyses revealed greater power at the frequency of the visual stimulus in the steppers compared to the non-steppers. Whole body COM time lags relative to the augmented visual scene revealed that the time-delay between the scene and the COM was significantly increased in the steppers. The increased responsiveness to visual information suggests a greater visual field-dependency of the steppers and suggests that the thresholds for shifting from a reliance on visual information to somatosensory information can differ even within a healthy population.
Image Feature Types and Their Predictions of Aesthetic Preference and Naturalness
Ibarra, Frank F.; Kardan, Omid; Hunter, MaryCarol R.; Kotabe, Hiroki P.; Meyer, Francisco A. C.; Berman, Marc G.
2017-01-01
Previous research has investigated ways to quantify visual information of a scene in terms of a visual processing hierarchy, i.e., making sense of the visual environment by segmentation and integration of elementary sensory input. Guided by this research, studies have developed categories for low-level visual features (e.g., edges, colors), high-level visual features (scene-level entities that convey semantic information, such as objects), and models of how those features predict aesthetic preference and naturalness. For example, in Kardan et al. (2015a), 52 participants provided aesthetic preference and naturalness ratings, which are used in the current study, for 307 images of mixed natural and urban content. Kardan et al. (2015a) then developed a model using low-level features to predict aesthetic preference and naturalness and could do so with high accuracy. What has yet to be explored is the ability of higher-level visual features (e.g., horizon line position relative to viewer, geometry of building distribution relative to visual access) to predict aesthetic preference and naturalness of scenes, and whether higher-level features mediate some of the association between the low-level features and aesthetic preference or naturalness. In this study we investigated these relationships and found that low- and high-level features explain 68.4% of the variance in aesthetic preference ratings and 88.7% of the variance in naturalness ratings. Additionally, several high-level features mediated the relationship between the low-level visual features and aesthetic preference. In a multiple mediation analysis, the high-level feature mediators accounted for over 50% of the variance in predicting aesthetic preference. These results show that high-level visual features play a prominent role in predicting aesthetic preference, but do not completely eliminate the predictive power of the low-level visual features. These strong predictors provide powerful insights for future research relating to landscape and urban design with the aim of maximizing subjective well-being, which could lead to improved health outcomes on a larger scale. PMID:28503158
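The mediation logic described above can be pictured with a single low-level feature, a single high-level mediator, and the preference ratings. The sketch below, assuming ordinary least-squares fits and these variable roles, estimates the total, direct, and indirect (a×b) effects; the study's actual multiple-mediation analysis involves many features simultaneously.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def simple_mediation(low_level, high_level, preference):
    """Single-mediator sketch: does a high-level feature mediate the effect of a
    low-level feature on aesthetic preference? Inputs are 1-D numpy arrays;
    returns the total, direct, and indirect (a*b) effects."""
    X = low_level.reshape(-1, 1)
    c = LinearRegression().fit(X, preference).coef_[0]      # total effect of low-level feature
    a = LinearRegression().fit(X, high_level).coef_[0]      # low-level -> mediator path
    XM = np.column_stack([low_level, high_level])
    fit = LinearRegression().fit(XM, preference)
    c_prime, b = fit.coef_                                  # direct effect, mediator -> preference path
    return {"total": c, "direct": c_prime, "indirect": a * b}
```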
Fuzzy Classification of High Resolution Remote Sensing Scenes Using Visual Attention Features.
Li, Linyi; Xu, Tingbao; Chen, Yun
2017-01-01
In recent years the spatial resolutions of remote sensing images have been improved greatly. However, a higher spatial resolution image does not always lead to a better result of automatic scene classification. Visual attention is an important characteristic of the human visual system, which can effectively help to classify remote sensing scenes. In this study, a novel visual attention feature extraction algorithm was proposed, which extracted visual attention features through a multiscale process. A fuzzy classification method using visual attention features (FC-VAF) was then developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated on remote sensing scenes from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. FC-VAF achieved more accurate classification results than the comparison methods according to the quantitative accuracy evaluation indices. We also discussed the role and impacts of different decomposition levels and different wavelets on the classification accuracy. FC-VAF improves the accuracy of high resolution scene classification and therefore advances the research of digital image analysis and the applications of high resolution remote sensing images.
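As a rough illustration of the two stages named above, the sketch below computes multiscale center-surround responses as a stand-in for visual attention features and then assigns fuzzy class memberships from distances to class prototypes. The scales, the fuzzifier m, and the prototype-based membership rule are assumptions; the published FC-VAF algorithm uses wavelet decompositions and its own feature definition.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def attention_features(gray, scales=(1, 2, 4, 8)):
    """Crude multiscale visual-attention features: mean absolute
    center-surround difference of a grayscale scene at each scale."""
    gray = gray.astype(float)
    feats = []
    for s in scales:
        center = gaussian_filter(gray, s)
        surround = gaussian_filter(gray, 4 * s)
        feats.append(np.abs(center - surround).mean())
    return np.array(feats)

def fuzzy_memberships(x, prototypes, m=2.0):
    """Fuzzy-c-means style memberships of a feature vector to class prototypes."""
    d = np.array([np.linalg.norm(x - p) + 1e-12 for p in prototypes])
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum()
```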
Fuzzy Classification of High Resolution Remote Sensing Scenes Using Visual Attention Features
Xu, Tingbao; Chen, Yun
2017-01-01
In recent years the spatial resolutions of remote sensing images have been improved greatly. However, a higher spatial resolution image does not always lead to a better result of automatic scene classification. Visual attention is an important characteristic of the human visual system, which can effectively help to classify remote sensing scenes. In this study, a novel visual attention feature extraction algorithm was proposed, which extracted visual attention features through a multiscale process. And a fuzzy classification method using visual attention features (FC-VAF) was developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated by using remote sensing scenes from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. FC-VAF achieved more accurate classification results than the others according to the quantitative accuracy evaluation indices. We also discussed the role and impacts of different decomposition levels and different wavelets on the classification accuracy. FC-VAF improves the accuracy of high resolution scene classification and therefore advances the research of digital image analysis and the applications of high resolution remote sensing images. PMID:28761440
Knepper, Daniel H.
2010-01-01
As part of the Central Colorado Mineral Resource Assessment Project, the digital image data for four Landsat Thematic Mapper scenes covering central Colorado between Wyoming and New Mexico were acquired and band ratios were calculated after masking pixels dominated by vegetation, snow, and terrain shadows. Ratio values were visually enhanced by contrast stretching, revealing only those areas with strong responses (high ratio values). A color-ratio composite mosaic was prepared for the four scenes so that the distribution of potentially hydrothermally altered rocks could be visually evaluated. To provide a more useful input to a Geographic Information System-based mineral resource assessment, the information contained in the color-ratio composite raster image mosaic was converted to vector-based polygons after thresholding to isolate the strongest ratio responses and spatial filtering to reduce vector complexity and isolate the largest occurrences of potentially hydrothermally altered rocks.
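A minimal version of the band-ratio workflow described above can be sketched as follows, assuming calibrated Thematic Mapper bands are available as floating-point arrays. The specific ratios (5/7 for clay/hydroxyl alteration, 3/1 for iron oxide), the NDVI vegetation mask, and the percentile threshold are illustrative choices, not the report's exact parameters.

```python
import numpy as np

def altered_rock_mask(bands, ndvi_thresh=0.3, percentile=98):
    """Band-ratio highlighting of potentially hydrothermally altered rocks.

    `bands` is a dict of same-shaped float arrays keyed by TM band number.
    Returns a boolean mask of pixels with the strongest ratio responses
    outside vegetation-dominated areas."""
    ndvi = (bands[4] - bands[3]) / (bands[4] + bands[3] + 1e-9)
    valid = ndvi < ndvi_thresh                           # mask vegetation-dominated pixels
    hydroxyl = np.where(valid, bands[5] / (bands[7] + 1e-9), 0.0)
    iron = np.where(valid, bands[3] / (bands[1] + 1e-9), 0.0)
    # Keep only the strongest responses, analogous to thresholding a stretched ratio image.
    strong = (hydroxyl > np.percentile(hydroxyl[valid], percentile)) | \
             (iron > np.percentile(iron[valid], percentile))
    return strong & valid
```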
Smart unattended sensor networks with scene understanding capabilities
NASA Astrophysics Data System (ADS)
Kuvich, Gary
2006-05-01
Unattended sensor systems are new technologies that are supposed to provide enhanced situation awareness to military and law enforcement agencies. A network of such sensors cannot be very effective in field conditions if it can only transmit visual information to human operators or alert them to motion. In real field conditions, events may happen in many nodes of a network simultaneously. But the number of control personnel is always limited, and the attention of human operators may be drawn to particular network nodes while a more dangerous threat goes unnoticed in other nodes at the same time. Sensor networks would be more effective if equipped with a system that is similar to human vision in its ability to understand visual information. For this, human vision uses a rough but wide peripheral system that tracks motions and regions of interest, a narrow but precise foveal vision that analyzes and recognizes objects in the center of the selected region of interest, and visual intelligence that provides scene and object contexts and resolves ambiguity and uncertainty in the visual information. Biologically-inspired Network-Symbolic models convert image information into an 'understandable' Network-Symbolic format, which is similar to relational knowledge models. The equivalent of the interaction between peripheral and foveal systems in the network-symbolic system is achieved via interaction between Visual and Object Buffers and the top-level knowledge system.
Intrinsic and contextual features in object recognition.
Schlangen, Derrick; Barenholtz, Elan
2015-01-28
The context in which an object is found can facilitate its recognition. Yet, it is not known how effective this contextual information is relative to the object's intrinsic visual features, such as color and shape. To address this, we performed four experiments using rendered scenes with novel objects. In each experiment, participants first performed a visual search task, searching for a uniquely shaped target object whose color and location within the scene were experimentally manipulated. We then tested participants' tendency to use their knowledge of the location and color information in an identification task when the objects' images were degraded by blurring, thus eliminating the shape information. In Experiment 1, we found that, in the absence of any diagnostic intrinsic features, participants identified objects based purely on their locations within the scene. In Experiment 2, we found that participants combined an intrinsic feature, color, with contextual location in order to uniquely specify an object. In Experiment 3, we found that when an object's color and location information were in conflict, participants identified the object using both sources of information equally. Finally, in Experiment 4, we found that participants used whichever source of information (either color or location) was more statistically reliable in order to identify the target object. Overall, these experiments show that the context in which objects are found can play as important a role as intrinsic features in identifying the objects. © 2015 ARVO.
Yoo, Seung-Woo; Lee, Inah
2017-01-01
How visual scene memory is processed differentially by the upstream structures of the hippocampus is largely unknown. We sought to dissociate functionally the lateral and medial subdivisions of the entorhinal cortex (LEC and MEC, respectively) in visual scene-dependent tasks by temporarily inactivating the LEC and MEC in the same rat. When the rat made spatial choices in a T-maze using visual scenes displayed on LCD screens, the inactivation of the MEC but not the LEC produced severe deficits in performance. However, when the task required the animal to push a jar or to dig in the sand in the jar using the same scene stimuli, the LEC but not the MEC became important. Our findings suggest that the entorhinal cortex is critical for scene-dependent mnemonic behavior, and the response modality may interact with a sensory modality to determine the involvement of the LEC and MEC in scene-based memory tasks. DOI: http://dx.doi.org/10.7554/eLife.21543.001 PMID:28169828
Seymour, K J; Williams, M A; Rich, A N
2016-05-01
Many theories of visual object perception assume the visual system initially extracts borders between objects and their background and then "fills in" color to the resulting object surfaces. We investigated the transformation of chromatic signals across the human ventral visual stream, with particular interest in distinguishing representations of object surface color from representations of chromatic signals reflecting the retinal input. We used fMRI to measure brain activity while participants viewed figure-ground stimuli that differed either in the position or in the color contrast polarity of the foreground object (the figure). Multivariate pattern analysis revealed that classifiers were able to decode information about which color was presented at a particular retinal location from early visual areas, whereas regions further along the ventral stream exhibited biases for representing color as part of an object's surface, irrespective of its position on the retina. Additional analyses showed that although activity in V2 contained strong chromatic contrast information to support the early parsing of objects within a visual scene, activity in this area also signaled information about object surface color. These findings are consistent with the view that mechanisms underlying scene segmentation and the binding of color to object surfaces converge in V2. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
How affective information from faces and scenes interacts in the brain
Vandenbulcke, Mathieu; Sinke, Charlotte B. A.; Goebel, Rainer; de Gelder, Beatrice
2014-01-01
Facial expression perception can be influenced by the natural visual context in which the face is perceived. We performed an fMRI experiment presenting participants with fearful or neutral faces against threatening or neutral background scenes. Triangles and scrambled scenes served as control stimuli. The results showed that the valence of the background influences face selective activity in the right anterior parahippocampal place area (PPA) and subgenual anterior cingulate cortex (sgACC) with higher activation for neutral backgrounds compared to threatening backgrounds (controlled for isolated background effects) and that this effect correlated with trait empathy in the sgACC. In addition, the left fusiform gyrus (FG) responds to the affective congruence between face and background scene. The results show that valence of the background modulates face processing and support the hypothesis that empathic processing in sgACC is inhibited when affective information is present in the background. In addition, the findings reveal a pattern of complex scene perception showing a gradient of functional specialization along the posterior–anterior axis: from sensitivity to the affective content of scenes (extrastriate body area: EBA and posterior PPA), over scene emotion–face emotion interaction (left FG) via category–scene interaction (anterior PPA) to scene–category–personality interaction (sgACC). PMID:23956081
Canessa, Andrea; Gibaldi, Agostino; Chessa, Manuela; Fato, Marco; Solari, Fabio; Sabatini, Silvio P.
2017-01-01
Binocular stereopsis is the ability of a visual system, belonging to a live being or a machine, to interpret the different visual information deriving from two eyes/cameras for depth perception. From this perspective, the ground-truth information about three-dimensional visual space, which is hardly available, is an ideal tool both for evaluating human performance and for benchmarking machine vision algorithms. In the present work, we implemented a rendering methodology in which the camera pose mimics realistic eye pose for a fixating observer, thus including convergent eye geometry and cyclotorsion. The virtual environment we developed relies on highly accurate 3D virtual models, and its full controllability allows us to obtain the stereoscopic pairs together with the ground-truth depth and camera pose information. We thus created a stereoscopic dataset: GENUA PESTO—GENoa hUman Active fixation database: PEripersonal space STereoscopic images and grOund truth disparity. The dataset aims to provide a unified framework useful for a number of problems relevant to human and computer vision, from scene exploration and eye movement studies to 3D scene reconstruction. PMID:28350382
Causal Inference for Spatial Constancy across Saccades
Atsma, Jeroen; Maij, Femke; Koppen, Mathieu; Irwin, David E.; Medendorp, W. Pieter
2016-01-01
Our ability to interact with the environment hinges on creating a stable visual world despite the continuous changes in retinal input. To achieve visual stability, the brain must distinguish the retinal image shifts caused by eye movements and shifts due to movements of the visual scene. This process appears not to be flawless: during saccades, we often fail to detect whether visual objects remain stable or move, which is called saccadic suppression of displacement (SSD). How does the brain evaluate the memorized information of the presaccadic scene and the actual visual feedback of the postsaccadic visual scene in the computations for visual stability? Using a SSD task, we test how participants localize the presaccadic position of the fixation target, the saccade target or a peripheral non-foveated target that was displaced parallel or orthogonal during a horizontal saccade, and subsequently viewed for three different durations. Results showed different localization errors of the three targets, depending on the viewing time of the postsaccadic stimulus and its spatial separation from the presaccadic location. We modeled the data through a Bayesian causal inference mechanism, in which at the trial level an optimal mixing of two possible strategies, integration vs. separation of the presaccadic memory and the postsaccadic sensory signals, is applied. Fits of this model generally outperformed other plausible decision strategies for producing SSD. Our findings suggest that humans exploit a Bayesian inference process with two causal structures to mediate visual stability. PMID:26967730
The Effect of Visual Information on the Manual Approach and Landing
NASA Technical Reports Server (NTRS)
Wewerinke, P. H.
1982-01-01
The effect of visual information, in combination with basic display information, on approach performance was investigated. A pre-experimental model analysis was performed in terms of the optimal control model. The resulting aircraft approach performance predictions were compared with the results of a moving base simulator program. The results illustrate that the model provides a meaningful description of the visual (scene) perception process involved in the complex (multi-variable, time-varying) manual approach task, with a useful predictive capability. The theoretical framework was shown to allow a straightforward investigation of the complex interaction of a variety of task variables.
How Ants Use Vision When Homing Backward.
Schwarz, Sebastian; Mangan, Michael; Zeil, Jochen; Webb, Barbara; Wystrach, Antoine
2017-02-06
Ants can navigate over long distances between their nest and food sites using visual cues [1, 2]. Recent studies show that this capacity is undiminished when walking backward while dragging a heavy food item [3-5]. This challenges the idea that ants use egocentric visual memories of the scene for guidance [1, 2, 6]. Can ants use their visual memories of the terrestrial cues when going backward? Our results suggest that ants do not adjust their direction of travel based on the perceived scene while going backward. Instead, they maintain a straight direction using their celestial compass. This direction can be dictated by their path integrator [5] but can also be set using terrestrial visual cues after a forward peek. If the food item is too heavy to enable body rotations, ants moving backward drop their food on occasion, rotate and walk a few steps forward, return to the food, and drag it backward in a now-corrected direction defined by terrestrial cues. Furthermore, we show that ants can maintain their direction of travel independently of their body orientation. It thus appears that egocentric retinal alignment is required for visual scene recognition, but ants can translate this acquired directional information into a holonomic frame of reference, which enables them to decouple their travel direction from their body orientation and hence navigate backward. This reveals substantial flexibility and communication between different types of navigational information: from terrestrial to celestial cues and from egocentric to holonomic directional memories. VIDEO ABSTRACT. Copyright © 2017 The Author(s). Published by Elsevier Ltd.. All rights reserved.
The roles of scene priming and location priming in object-scene consistency effects
Heise, Nils; Ansorge, Ulrich
2014-01-01
Presenting consistent objects in scenes facilitates object recognition as compared to inconsistent objects. Yet the mechanisms by which scenes influence object recognition are still not understood. According to one theory, consistent scenes facilitate visual search for objects at expected places. Here, we investigated two predictions following from this theory: If visual search is responsible for consistency effects, consistency effects could be weaker (1) with better-primed than less-primed object locations, and (2) with less-primed than better-primed scenes. In Experiments 1 and 2, locations of objects were varied within a scene to a different degree (one, two, or four possible locations). In addition, object-scene consistency was studied as a function of progressive numbers of repetitions of the backgrounds. Because repeating locations and backgrounds could facilitate visual search for objects, these repetitions might alter the object-scene consistency effect by lowering location uncertainty. Although we find evidence for a significant consistency effect, we find no clear support for impacts of scene priming or location priming on the size of the consistency effect. Additionally, we find evidence that the consistency effect is dependent on the eccentricity of the target objects. These results point to only small influences of priming on object-scene consistency effects, but all in all the findings can be reconciled with a visual-search explanation of the consistency effect. PMID:24910628
Stages as models of scene geometry.
Nedović, Vladimir; Smeulders, Arnold W M; Redert, André; Geusebroek, Jan-Mark
2010-09-01
Reconstruction of 3D scene geometry is an important element for scene understanding, autonomous vehicle and robot navigation, image retrieval, and 3D television. We propose accounting for the inherent structure of the visual world when trying to solve the scene reconstruction problem. Consequently, we identify geometric scene categorization as the first step toward robust and efficient depth estimation from single images. We introduce 15 typical 3D scene geometries called stages, each with a unique depth profile, which roughly correspond to a large majority of broadcast video frames. Stage information serves as a first approximation of global depth, narrowing down the search space in depth estimation and object localization. We propose different sets of low-level features for depth estimation, and perform stage classification on two diverse data sets of television broadcasts. Classification results demonstrate that stages can often be efficiently learned from low-dimensional image representations.
Temporal and peripheral extraction of contextual cues from scenes during visual search.
Koehler, Kathryn; Eckstein, Miguel P
2017-02-01
Scene context is known to facilitate object recognition and guide visual search, but little work has focused on isolating image-based cues and evaluating their contributions to eye movement guidance and search performance. Here, we explore three types of contextual cues (a co-occurring object, the configuration of other objects, and the superordinate category of background elements) and assess their joint contributions to search performance in the framework of cue-combination and the temporal unfolding of their extraction. We also assess whether observers' ability to extract each contextual cue in the visual periphery is a bottleneck that determines the utilization and contribution of each cue to search guidance and decision accuracy. We find that during the first four fixations of a visual search task observers first utilize the configuration of objects for coarse eye movement guidance and later use co-occurring object information for finer guidance. In the absence of contextual cues, observers were suboptimally biased to report the target object as being absent. The presence of the co-occurring object was the only contextual cue that had a significant effect in reducing decision bias. The early influence of object-based cues on eye movements is corroborated by a clear demonstration of observers' ability to extract object cues up to 16° into the visual periphery. The joint contributions of the cues to decision search accuracy approximates that expected from the combination of statistically independent cues and optimal cue combination. Finally, the lack of utilization and contribution of the background-based contextual cue to search guidance cannot be explained by the availability of the contextual cue in the visual periphery; instead it is related to background cues providing the least inherent information about the precise location of the target in the scene.
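The benchmark against which the observed joint contribution was compared, the combination of statistically independent cues, has a simple form in signal detection terms: the combined sensitivity is the root sum of squares of the individual sensitivities. The sketch below states that rule; treating each contextual cue's contribution as an independent d' is the assumption being illustrated.

```python
def combined_sensitivity(d_primes):
    """Sensitivity expected when statistically independent cues are combined
    optimally: d'_combined = sqrt(sum_i d'_i ** 2)."""
    return sum(d ** 2 for d in d_primes) ** 0.5

# Example: three cues with d' of 1.0, 0.8, and 0.5 combine to about 1.37.
print(combined_sensitivity([1.0, 0.8, 0.5]))
```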
[Visual representation of natural scenes in flicker changes].
Nakashima, Ryoichi; Yokosawa, Kazuhiko
2010-08-01
Coherence theory in scene perception (Rensink, 2002) assumes the retention of volatile object representations on which attention is not focused. On the other hand, visual memory theory in scene perception (Hollingworth & Henderson, 2002) assumes that robust object representations are retained. In this study, we hypothesized that the difference between these two theories is derived from the difference of the experimental tasks that they are based on. In order to verify this hypothesis, we examined the properties of visual representation by using a change detection and memory task in a flicker paradigm. We measured the representations when participants were instructed to search for a change in a scene, and compared them with the intentional memory representations. The visual representations were retained in visual long-term memory even in the flicker paradigm, and were as robust as the intentional memory representations. However, the results indicate that the representations are unavailable for explicitly localizing a scene change, but are available for answering the recognition test. This suggests that coherence theory and visual memory theory are compatible.
Automatic acquisition of motion trajectories: tracking hockey players
NASA Astrophysics Data System (ADS)
Okuma, Kenji; Little, James J.; Lowe, David
2003-12-01
Computer systems that have the capability of analyzing complex and dynamic scenes play an essential role in video annotation. Scenes can be complex in such a way that there are many cluttered objects with different colors, shapes and sizes, and can be dynamic with multiple interacting moving objects and a constantly changing background. In reality, there are many scenes that are complex, dynamic, and challenging enough for computers to describe. These scenes include games of sports, air traffic, car traffic, street intersections, and cloud transformations. Our research is about the challenge of inventing a descriptive computer system that analyzes scenes of hockey games, where multiple moving players interact with each other on a constantly moving background due to camera motions. Ultimately, such a computer system should be able to acquire reliable data by extracting the players' motion as trajectories, query those data by analyzing their descriptive information, and predict the motions of some hockey players based on the result of the query. Among these three major aspects of the system, we primarily focus on visual information of the scenes, that is, how to automatically acquire motion trajectories of hockey players from video. More precisely, we automatically analyze the hockey scenes by estimating parameters (i.e., pan, tilt, and zoom) of the broadcast cameras, tracking hockey players in those scenes, and constructing a visual description of the data by displaying trajectories of those players. Many technical problems in vision, such as fast and unpredictable player motions and rapid camera motions, make our challenge worth tackling. To the best of our knowledge, no automatic video annotation systems for hockey have been developed in the past. Although there are many obstacles to overcome, our efforts and accomplishments will hopefully establish the infrastructure of an automatic hockey annotation system and become a milestone for research in automatic video annotation in this domain.
Bag of Visual Words Model with Deep Spatial Features for Geographical Scene Classification
Wu, Lin
2017-01-01
With the popular use of geotagging images, more and more research efforts have been placed on geographical scene classification. In geographical scene classification, valid spatial feature selection can significantly boost the final performance. Bag of visual words (BoVW) can do well in selecting features for geographical scene classification; nevertheless, it works effectively only if the provided feature extractor is well-matched. In this paper, we use convolutional neural networks (CNNs) to optimize the proposed feature extractor, so that it can learn more suitable visual vocabularies from the geotagged images. Our approach achieves better performance than standard BoVW as a tool for geographical scene classification on three datasets which contain a variety of scene categories. PMID:28706534
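The encoding side of a BoVW pipeline like the one described can be sketched as below, assuming local descriptors (for example, CNN activations pooled over image patches) have already been extracted. The vocabulary size and the k-means clustering step are standard choices rather than this paper's specific configuration.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(local_descriptors, k=256):
    """Cluster local descriptors from many images into k visual words.

    `local_descriptors` is a list of (n_i, d) arrays, one per training image."""
    return KMeans(n_clusters=k, n_init=4, random_state=0).fit(np.vstack(local_descriptors))

def bovw_histogram(descriptors, vocabulary):
    """Encode one image as a normalized histogram over visual words."""
    words = vocabulary.predict(descriptors)
    hist = np.bincount(words, minlength=vocabulary.n_clusters).astype(float)
    return hist / (hist.sum() + 1e-12)

# Usage sketch: the histograms then feed any standard classifier for scene categories.
```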
ERIC Educational Resources Information Center
Giltrow, David
1978-01-01
One practical method which development film makers can adopt to increase comprehension of important scenes is to eliminate extraneous background information by putting it out of focus, or by shooting against plain backgrounds. (Author/STS)
Comparing object recognition from binary and bipolar edge images for visual prostheses
Jung, Jae-Hyun; Pu, Tian; Peli, Eli
2017-01-01
Visual prostheses require an effective representation method due to their limited display conditions, which offer only 2 or 3 levels of grayscale at low resolution. Edges derived from abrupt luminance changes in images carry essential information for object recognition. Typical binary (black and white) edge images have been used to represent features and convey this essential information. However, in scenes with a complex cluttered background, the recognition rate of binary edge images by human observers is limited and additional information is required. The polarity of edges and cusps (black or white features on a gray background) carries important additional information; the polarity may provide shape-from-shading information missing in the binary edge image. This depth information may be restored by using bipolar edges. We compared object recognition rates from 16 binary edge images and bipolar edge images by 26 subjects to determine the possible impact of bipolar filtering in visual prostheses with 3 or more levels of grayscale. Recognition rates were higher with bipolar edge images, and the improvement was significant in scenes with complex backgrounds. The results also suggest that erroneous shape-from-shading interpretation of bipolar edges resulting from pigment rather than boundaries of shape may confound recognition. PMID:28458481
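The difference between binary and bipolar renderings can be made concrete with a simple filter-and-threshold sketch: a bipolar image keeps the sign of the edge response as black or white on a mid-gray background, whereas a binary image collapses both polarities to a single level. The Laplacian-of-Gaussian filter, sigma, and threshold below are assumptions for illustration, not the filtering used in the study.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def bipolar_edge_image(gray, sigma=2.0, thresh=2.0):
    """Three-level (bipolar) edge rendering: the two edge polarities become
    black and white features on a gray background, preserving the sign
    information that a binary edge image discards."""
    response = gaussian_laplace(gray.astype(float), sigma)
    out = np.full(gray.shape, 128, dtype=np.uint8)   # gray background
    out[response > thresh] = 0                        # one polarity -> black
    out[response < -thresh] = 255                     # other polarity -> white
    return out
```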
Oh, Hwamee; Leung, Hoi-Chung
2010-02-01
In this fMRI study, we investigated prefrontal cortex (PFC) and visual association regions during selective information processing. We recorded behavioral responses and neural activity during a delayed recognition task with a cue presented during the delay period. A specific cue ("Face" or "Scene") was used to indicate which one of the two initially viewed pictures of a face and a scene would be tested at the end of a trial, whereas a nonspecific cue ("Both") was used as control. As expected, the specific cues facilitated behavioral performance (faster response times) compared to the nonspecific cue. A postexperiment memory test showed that the items cued to remember were better recognized than those not cued. The fMRI results showed largely overlapped activations across the three cue conditions in dorsolateral and ventrolateral PFC, dorsomedial PFC, posterior parietal cortex, ventral occipito-temporal cortex, dorsal striatum, and pulvinar nucleus. Among those regions, dorsomedial PFC and inferior occipital gyrus remained active during the entire postcue delay period. Differential activity was mainly found in the association cortices. In particular, the parahippocampal area and posterior superior parietal lobe showed significantly enhanced activity during the postcue period of the scene condition relative to the Face and Both conditions. No regions showed differentially greater responses to the face cue. Our findings suggest that a better representation of visual information in working memory may depend on enhancing the more specialized visual association areas or their interaction with PFC.
Efficient summary statistical representation when change localization fails.
Haberman, Jason; Whitney, David
2011-10-01
People are sensitive to the summary statistics of the visual world (e.g., average orientation/speed/facial expression). We readily derive this information from complex scenes, often without explicit awareness. Given the fundamental and ubiquitous nature of summary statistical representation, we tested whether this kind of information is subject to the attentional constraints imposed by change blindness. We show that information regarding the summary statistics of a scene is available despite limited conscious access. In a novel experiment, we found that while observers can suffer from change blindness (i.e., not localize where change occurred between two views of the same scene), observers could nevertheless accurately report changes in the summary statistics (or "gist") about the very same scene. In the experiment, observers saw two successively presented sets of 16 faces that varied in expression. Four of the faces in the first set changed from one emotional extreme (e.g., happy) to another (e.g., sad) in the second set. Observers performed poorly when asked to locate any of the faces that changed (change blindness). However, when asked about the ensemble (which set was happier, on average), observer performance remained high. Observers were sensitive to the average expression even when they failed to localize any specific object change. That is, even when observers could not locate the very faces driving the change in average expression between the two sets, they nonetheless derived a precise ensemble representation. Thus, the visual system may be optimized to process summary statistics in an efficient manner, allowing it to operate despite minimal conscious access to the information presented.
Allocentric information is used for memory-guided reaching in depth: A virtual reality study.
Klinghammer, Mathias; Schütz, Immo; Blohm, Gunnar; Fiehler, Katja
2016-12-01
Previous research has demonstrated that humans use allocentric information when reaching to remembered visual targets, but most of the studies are limited to 2D space. Here, we study allocentric coding of memorized reach targets in 3D virtual reality. In particular, we investigated the use of allocentric information for memory-guided reaching in depth and the role of binocular and monocular (object size) depth cues for coding object locations in 3D space. To this end, we presented a scene with objects on a table which were located at different distances from the observer and served as reach targets or allocentric cues. After free visual exploration of this scene and a short delay the scene reappeared, but with one object missing (=reach target). In addition, the remaining objects were shifted horizontally or in depth. When objects were shifted in depth, we also independently manipulated object size by either magnifying or reducing their size. After the scene vanished, participants reached to the remembered target location on the blank table. Reaching endpoints deviated systematically in the direction of object shifts, similar to our previous results from 2D presentations. This deviation was stronger for object shifts in depth than in the horizontal plane and independent of observer-target-distance. Reaching endpoints systematically varied with changes in object size. Our results suggest that allocentric information is used for coding targets for memory-guided reaching in depth. Thereby, retinal disparity and vergence as well as object size provide important binocular and monocular depth cues. Copyright © 2016 Elsevier Ltd. All rights reserved.
Neuroscience-Enabled Complex Visual Scene Understanding
2012-04-12
In some cases, it is hard to say precisely where or what we are looking at, since a complex task governs eye fixations, for example in driving. ... Ambiguity about objects (say, a door) can be resolved using prior information about the scene. This knowledge can be provided by gist models. ...
PROCRU: A model for analyzing crew procedures in approach to landing
NASA Technical Reports Server (NTRS)
Baron, S.; Muralidharan, R.; Lancraft, R.; Zacharias, G.
1980-01-01
A model for analyzing crew procedures in approach to landing is developed. The model employs the information processing structure used in the optimal control model and in recent models for monitoring and failure detection. Mechanisms are added to this basic structure to model crew decision making in this multitask environment. Decisions are based on probability assessments and potential mission impact (or gain). Submodels for procedural activities are included. The model distinguishes among external visual, instrument visual, and auditory sources of information. The external visual scene perception models incorporate limitations in obtaining information. The auditory information channel contains a buffer to allow for storage in memory until that information can be processed.
Handheld real-time volumetric 3-D gamma-ray imaging
NASA Astrophysics Data System (ADS)
Haefner, Andrew; Barnowski, Ross; Luke, Paul; Amman, Mark; Vetter, Kai
2017-06-01
This paper presents the concept of real-time fusion of gamma-ray imaging and visual scene data for a hand-held mobile Compton imaging system in 3-D. The ability to obtain and integrate both gamma-ray and scene data from a mobile platform enables improved capabilities in the localization and mapping of radioactive materials. This not only enhances the ability to localize these materials, but it also provides important contextual information of the scene which once acquired can be reviewed and further analyzed subsequently. To demonstrate these concepts, the high-efficiency multimode imager (HEMI) is used in a hand-portable implementation in combination with a Microsoft Kinect sensor. This sensor, in conjunction with open-source software, provides the ability to create a 3-D model of the scene and to track the position and orientation of HEMI in real-time. By combining the gamma-ray data and visual data, accurate 3-D maps of gamma-ray sources are produced in real-time. This approach is extended to map the location of radioactive materials within objects with unknown geometry.
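The fusion step described above, placing gamma-ray image information into the 3-D scene model, ultimately reduces to transforming points from the imager's frame into the scene frame using the pose reported by the tracking sensor. The sketch below shows that rigid transform under assumed names; the actual HEMI/Kinect pipeline involves Compton image reconstruction and real-time scene tracking, which are not reproduced here.

```python
import numpy as np

def imager_to_scene(points_imager, rotation, translation):
    """Rigidly transform Nx3 points from the imager's coordinate frame into the
    scene frame, given the pose (3x3 rotation matrix, 3-vector translation)
    estimated by the scene-tracking sensor."""
    R = np.asarray(rotation, dtype=float)
    t = np.asarray(translation, dtype=float)
    return np.asarray(points_imager, dtype=float) @ R.T + t
```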
Conscious visual memory with minimal attention.
Pinto, Yair; Vandenbroucke, Annelinde R; Otten, Marte; Sligte, Ilja G; Seth, Anil K; Lamme, Victor A F
2017-02-01
Is conscious visual perception limited to the locations that a person attends? The remarkable phenomenon of change blindness, which shows that people miss nearly all unattended changes in a visual scene, suggests the answer is yes. However, change blindness is found after visual interference (a mask or a new scene), so that subjects have to rely on working memory (WM), which has limited capacity, to detect the change. Before such interference, however, a much larger capacity store, called fragile memory (FM), which is easily overwritten by newly presented visual information, is present. Whether these different stores depend equally on spatial attention is central to the debate on the role of attention in conscious vision. In 2 experiments, we found that minimizing spatial attention almost entirely erases visual WM, as expected. Critically, FM remains largely intact. Moreover, minimally attended FM responses yield accurate metacognition, suggesting that conscious memory persists with limited spatial attention. Together, our findings help resolve the fundamental issue of how attention affects perception: Both visual consciousness and memory can be supported by only minimal attention. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
ERIC Educational Resources Information Center
Oh, Hwamee; Leung, Hoi-Chung
2010-01-01
In this fMRI study, we investigated prefrontal cortex (PFC) and visual association regions during selective information processing. We recorded behavioral responses and neural activity during a delayed recognition task with a cue presented during the delay period. A specific cue ("Face" or "Scene") was used to indicate which one of the two…
Scan patterns when viewing natural scenes: Emotion, complexity, and repetition
Bradley, Margaret M.; Houbova, Petra; Miccoli, Laura; Costa, Vincent D.; Lang, Peter J.
2011-01-01
Eye movements were monitored during picture viewing and effects of hedonic content, perceptual composition, and repetition on scanning assessed. In Experiment 1, emotional and neutral pictures that were figure-ground compositions or more complex scenes were presented for a 6 s free viewing period. Viewing emotional pictures or complex scenes prompted more fixations and broader scanning of the visual array, compared to neutral pictures or simple figure-ground compositions. Effects of emotion and composition were independent, supporting the hypothesis that these oculomotor indices reflect enhanced information seeking. Experiment 2 tested an orienting hypothesis by repeatedly presenting the same pictures. Although repetition altered specific scan patterns, emotional, compared to neutral, picture viewing continued to prompt oculomotor differences, suggesting that motivationally relevant cues enhance information seeking in appetitive and defensive contexts. PMID:21649664
Predictive and postdictive mechanisms jointly contribute to visual awareness.
Soga, Ryosuke; Akaishi, Rei; Sakai, Katsuyuki
2009-09-01
One of the fundamental issues in visual awareness is how we are able to perceive the scene in front of our eyes on time despite the delay in processing visual information. The prediction theory postulates that our visual system predicts the future to compensate for such delays. On the other hand, the postdiction theory postulates that our visual awareness is inevitably a delayed product. In the present study we used flash-lag paradigms in motion and color domains and examined how the perception of visual information at the time of flash is influenced by prior and subsequent visual events. We found that both types of event additively influence the perception of the present visual image, suggesting that our visual awareness results from joint contribution of predictive and postdictive mechanisms.
Statistics of high-level scene context
Greene, Michelle R.
2013-01-01
Context is critical for recognizing environments and for searching for objects within them: contextual associations have been shown to modulate reaction time and object recognition accuracy, as well as influence the distribution of eye movements and patterns of brain activations. However, we have not yet systematically quantified the relationships between objects and their scene environments. Here I seek to fill this gap by providing descriptive statistics of object-scene relationships. A total of 48,167 objects were hand-labeled in 3499 scenes using the LabelMe tool (Russell et al., 2008). From these data, I computed a variety of descriptive statistics at three different levels of analysis: the ensemble statistics that describe the density and spatial distribution of unnamed "things" in the scene; the bag of words level where scenes are described by the list of objects contained within them; and the structural level where the spatial distribution and relationships between the objects are measured. The utility of each level of description for scene categorization was assessed through the use of linear classifiers, and the plausibility of each level for modeling human scene categorization is discussed. Of the three levels, ensemble statistics were found to be the most informative (per feature), and also best explained human patterns of categorization errors. Although a bag of words classifier had similar performance to human observers, it had a markedly different pattern of errors. However, certain objects are more useful than others, and ceiling classification performance could be achieved using only the 64 most informative objects. As object location tends not to vary as a function of category, structural information provided little additional information. Additionally, these data provide valuable information on natural scene redundancy that can be exploited for machine vision, and can help the visual cognition community to design experiments guided by statistics rather than intuition. PMID:24194723
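Of the three levels of description, the "bag of words" level is the most straightforward to reproduce: a scene becomes a count vector over object labels, which can then be fed to a linear classifier. The sketch below assumes the labels come from a fixed vocabulary such as the most informative objects; it is a generic encoding, not the study's exact pipeline.

```python
import numpy as np
from sklearn.svm import LinearSVC

def bag_of_objects(scene_object_lists, vocabulary):
    """'Bag of words' scene descriptor: counts of labeled objects per scene.

    `scene_object_lists` is a list of lists of object-name strings, one per
    scene; `vocabulary` is the fixed list of object names to count."""
    index = {name: i for i, name in enumerate(vocabulary)}
    X = np.zeros((len(scene_object_lists), len(vocabulary)))
    for row, objects in enumerate(scene_object_lists):
        for name in objects:
            if name in index:
                X[row, index[name]] += 1
    return X

# Usage sketch (hypothetical variable names):
# clf = LinearSVC().fit(bag_of_objects(train_lists, vocab), train_categories)
```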
The priming function of in-car audio instruction.
Keyes, Helen; Whitmore, Antony; Naneva, Stanislava; McDermott, Daragh
2018-05-01
Studies to date have focused on the priming power of visual road signs, but not the priming potential of audio road scene instruction. Here, the relative priming power of visual, audio, and multisensory road scene instructions was assessed. In a lab-based study, participants responded to target road scene turns following visual, audio, or multisensory road turn primes which were congruent or incongruent to the primes in direction, or control primes. All types of instruction (visual, audio, and multisensory) were successful in priming responses to a road scene. Responses to multisensory-primed targets (both audio and visual) were faster than responses to either audio or visual primes alone. Incongruent audio primes did not affect performance negatively in the manner of incongruent visual or multisensory primes. Results suggest that audio instructions have the potential to prime drivers to respond quickly and safely to their road environment. Peak performance will be observed if audio and visual road instruction primes can be timed to co-occur.
Eye Movements and Visual Memory for Scenes
2005-01-01
Scene memory research has demonstrated that the memory representation of a semantically inconsistent object in a scene is more detailed and/or complete... memory during scene viewing, then changes to semantically inconsistent objects (which should be represented more completely) should be detected more... semantic description. Due to the surprise nature of the visual memory test, any learning that occurred during the search portion of the experiment was
Effects of aging on neural connectivity underlying selective memory for emotional scenes
Waring, Jill D.; Addis, Donna Rose; Kensinger, Elizabeth A.
2012-01-01
Older adults show age-related reductions in memory for neutral items within complex visual scenes, but just like young adults, older adults exhibit a memory advantage for emotional items within scenes compared with the background scene information. The present study examined young and older adults’ encoding-stage effective connectivity for selective memory of emotional items versus memory for both the emotional item and its background. In a functional magnetic resonance imaging (fMRI) study, participants viewed scenes containing either positive or negative items within neutral backgrounds. Outside the scanner, participants completed a memory test for items and backgrounds. Irrespective of scene content being emotionally positive or negative, older adults had stronger positive connections among frontal regions and from frontal regions to medial temporal lobe structures than did young adults, especially when items and backgrounds were subsequently remembered. These results suggest there are differences between young and older adults’ connectivity accompanying the encoding of emotional scenes. Older adults may require more frontal connectivity to encode all elements of a scene rather than just encoding the emotional item. PMID:22542836
Effects of aging on neural connectivity underlying selective memory for emotional scenes.
Waring, Jill D; Addis, Donna Rose; Kensinger, Elizabeth A
2013-02-01
Older adults show age-related reductions in memory for neutral items within complex visual scenes, but just like young adults, older adults exhibit a memory advantage for emotional items within scenes compared with the background scene information. The present study examined young and older adults' encoding-stage effective connectivity for selective memory of emotional items versus memory for both the emotional item and its background. In a functional magnetic resonance imaging (fMRI) study, participants viewed scenes containing either positive or negative items within neutral backgrounds. Outside the scanner, participants completed a memory test for items and backgrounds. Irrespective of scene content being emotionally positive or negative, older adults had stronger positive connections among frontal regions and from frontal regions to medial temporal lobe structures than did young adults, especially when items and backgrounds were subsequently remembered. These results suggest there are differences between young and older adults' connectivity accompanying the encoding of emotional scenes. Older adults may require more frontal connectivity to encode all elements of a scene rather than just encoding the emotional item. Published by Elsevier Inc.
Estimation of the Horizon in Photographed Outdoor Scenes by Human and Machine
Herdtweck, Christian; Wallraven, Christian
2013-01-01
We present three experiments on horizon estimation. In Experiment 1 we verify the human ability to estimate the horizon in static images from visual input alone. Estimates are given without time constraints, with emphasis on precision. The resulting estimates are used as a baseline to evaluate horizon estimates from early visual processes. In Experiment 2, stimuli are presented only briefly and then masked to purge visual short-term memory, forcing estimates to rely on early processes only. The high agreement between estimates and the lack of a training effect show that enough information about viewpoint is extracted in the first few hundred milliseconds to make accurate horizon estimation possible. In Experiment 3 we investigate several strategies to estimate the horizon computationally and compare human with machine "behavior" for different image manipulations and image scene types. PMID:24349073
Zhang, Xuetao; Huang, Jie; Yigit-Elliott, Serap; Rosenholtz, Ruth
2015-03-16
Observers can quickly search among shaded cubes for one lit from a unique direction. However, replace the cubes with similar 2-D patterns that do not appear to have a 3-D shape, and search difficulty increases. These results have challenged models of visual search and attention. We demonstrate that cube search displays differ from those with "equivalent" 2-D search items in terms of the informativeness of fairly low-level image statistics. This informativeness predicts peripheral discriminability of target-present from target-absent patches, which in turn predicts visual search performance, across a wide range of conditions. Comparing model performance on a number of classic search tasks, cube search does not appear unexpectedly easy. Easy cube search, per se, does not provide evidence for preattentive computation of 3-D scene properties. However, search asymmetries derived from rotating and/or flipping the cube search displays cannot be explained by the information in our current set of image statistics. This may merely suggest a need to modify the model's set of 2-D image statistics. Alternatively, it may be difficult cube search that provides evidence for preattentive computation of 3-D scene properties. By attributing 2-D luminance variations to a shaded 3-D shape, 3-D scene understanding may slow search for 2-D features of the target. © 2015 ARVO.
Zhang, Xuetao; Huang, Jie; Yigit-Elliott, Serap; Rosenholtz, Ruth
2015-01-01
Observers can quickly search among shaded cubes for one lit from a unique direction. However, replace the cubes with similar 2-D patterns that do not appear to have a 3-D shape, and search difficulty increases. These results have challenged models of visual search and attention. We demonstrate that cube search displays differ from those with “equivalent” 2-D search items in terms of the informativeness of fairly low-level image statistics. This informativeness predicts peripheral discriminability of target-present from target-absent patches, which in turn predicts visual search performance, across a wide range of conditions. Comparing model performance on a number of classic search tasks, cube search does not appear unexpectedly easy. Easy cube search, per se, does not provide evidence for preattentive computation of 3-D scene properties. However, search asymmetries derived from rotating and/or flipping the cube search displays cannot be explained by the information in our current set of image statistics. This may merely suggest a need to modify the model's set of 2-D image statistics. Alternatively, it may be difficult cube search that provides evidence for preattentive computation of 3-D scene properties. By attributing 2-D luminance variations to a shaded 3-D shape, 3-D scene understanding may slow search for 2-D features of the target. PMID:25780063
Scientific Visualization and Computational Science: Natural Partners
NASA Technical Reports Server (NTRS)
Uselton, Samuel P.; Lasinski, T. A. (Technical Monitor)
1995-01-01
Scientific visualization is developing rapidly, stimulated by computational science, which is gaining acceptance as a third alternative to theory and experiment. Computational science is based on numerical simulations of mathematical models derived from theory. But each individual simulation is like a hypothetical experiment; initial conditions are specified, and the result is a record of the observed conditions. Experiments can be simulated for situations that can not really be created or controlled. Results impossible to measure can be computed.. Even for observable values, computed samples are typically much denser. Numerical simulations also extend scientific exploration where the mathematics is analytically intractable. Numerical simulations are used to study phenomena from subatomic to intergalactic scales and from abstract mathematical structures to pragmatic engineering of everyday objects. But computational science methods would be almost useless without visualization. The obvious reason is that the huge amounts of data produced require the high bandwidth of the human visual system, and interactivity adds to the power. Visualization systems also provide a single context for all the activities involved from debugging the simulations, to exploring the data, to communicating the results. Most of the presentations today have their roots in image processing, where the fundamental task is: Given an image, extract information about the scene. Visualization has developed from computer graphics, and the inverse task: Given a scene description, make an image. Visualization extends the graphics paradigm by expanding the possible input. The goal is still to produce images; the difficulty is that the input is not a scene description displayable by standard graphics methods. Visualization techniques must either transform the data into a scene description or extend graphics techniques to display this odd input. Computational science is a fertile field for visualization research because the results vary so widely and include things that have no known appearance. The amount of data creates additional challenges for both hardware and software systems. Evaluations of visualization should ultimately reflect the insight gained into the scientific phenomena. So making good visualizations requires consideration of characteristics of the user and the purpose of the visualization. Knowledge about human perception and graphic design is also relevant. It is this breadth of knowledge that stimulates proposals for multidisciplinary visualization teams and intelligent visualization assistant software. Visualization is an immature field, but computational science is stimulating research on a broad front.
A category adjustment approach to memory for spatial location in natural scenes.
Holden, Mark P; Curby, Kim M; Newcombe, Nora S; Shipley, Thomas F
2010-05-01
Memories for spatial locations often show systematic errors toward the central value of the surrounding region. This bias has been explained using a Bayesian model in which fine-grained and categorical information are combined (Huttenlocher, Hedges, & Duncan, 1991). However, experiments testing this model have largely used locations contained in simple geometric shapes. Use of this paradigm raises 2 issues. First, do results generalize to the complex natural world? Second, what types of information might be used to segment complex spaces into constituent categories? Experiment 1 addressed the 1st question by showing a bias toward prototypical values in memory for spatial locations in complex natural scenes. Experiment 2 addressed the 2nd question by manipulating the availability of basic visual cues (using color negatives) or of semantic information about the scene (using inverted images). Error patterns suggest that both perceptual and conceptual information are involved in segmentation. The possible neurological foundations of location memory of this kind are discussed. PsycINFO Database Record (c) 2010 APA, all rights reserved.
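The category adjustment model referenced here combines a fine-grained memory trace with a categorical prototype, each weighted by its reliability. The sketch below illustrates that weighting scheme in Python; the variances, prototype location, and coordinate values are illustrative assumptions, not values from the study.

```python
import numpy as np

def category_adjusted_estimate(fine_grained, prototype, var_memory, var_category):
    """Weighted combination of a fine-grained memory trace and a categorical
    prototype (in the spirit of Huttenlocher et al., 1991). The weight on the
    prototype grows as the fine-grained trace becomes noisier."""
    w = var_memory / (var_memory + var_category)   # weight given to the prototype
    return (1.0 - w) * fine_grained + w * prototype

# Illustrative example: a location remembered at x = 0.62 (scene coordinates),
# with the regional prototype (category centre) at x = 0.50.
rng = np.random.default_rng(0)
true_location = 0.62
prototype = 0.50
noisy_trace = true_location + rng.normal(0.0, 0.05)   # inexact fine-grained memory
estimate = category_adjusted_estimate(noisy_trace, prototype,
                                      var_memory=0.05**2, var_category=0.15**2)
print(f"reported location = {estimate:.3f} (biased toward the prototype at {prototype})")
```

The reported location falls between the noisy trace and the prototype, reproducing the systematic bias toward central values described in the abstract.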
ERIC Educational Resources Information Center
Huang, Tsung-Ren; Grossberg, Stephen
2010-01-01
How do humans use target-predictive contextual information to facilitate visual search? How are consistently paired scenic objects and positions learned and used to more efficiently guide search in familiar scenes? For example, humans can learn that a certain combination of objects may define a context for a kitchen and trigger a more efficient…
Postural and Spatial Orientation Driven by Virtual Reality
Keshner, Emily A.; Kenyon, Robert V.
2009-01-01
Orientation in space is a perceptual variable intimately related to postural orientation that relies on visual and vestibular signals to correctly identify our position relative to vertical. We have combined a virtual environment with motion of a posture platform to produce visual-vestibular conditions that allow us to explore how motion of the visual environment may affect perception of vertical and, consequently, affect postural stabilizing responses. In order to involve a higher level perceptual process, we needed to create a visual environment that was immersive. We did this by developing visual scenes that possess contextual information using color, texture, and 3-dimensional structures. Update latency of the visual scene was close to physiological latencies of the vestibulo-ocular reflex. Using this system we found that even when healthy young adults stand and walk on a stable support surface, they are unable to ignore wide field of view visual motion and they adapt their postural orientation to the parameters of the visual motion. Balance training within our environment elicited measurable rehabilitation outcomes. Thus we believe that virtual environments can serve as a clinical tool for evaluation and training of movement in situations that closely reflect conditions found in the physical world. PMID:19592796
A rain pixel recovery algorithm for videos with highly dynamic scenes.
Jie Chen; Lap-Pui Chau
2014-03-01
Rain removal is a very useful and important technique in applications such as security surveillance and movie editing. Several rain removal algorithms have been proposed in recent years, exploiting photometric, chromatic, and probabilistic properties of rain to detect and remove its effects. Current methods generally work well with light rain and relatively static scenes; when dealing with heavier rainfall in dynamic scenes, they give very poor visual results. The proposed algorithm is based on motion segmentation of the dynamic scene. After applying photometric and chromatic constraints for rain detection, rain removal filters are applied to pixels such that their dynamic properties as well as motion occlusion cues are considered; both spatial and temporal information are then adaptively exploited during rain pixel recovery. Results show that the proposed algorithm performs much better than existing algorithms for rainy scenes with large motion.
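The abstract outlines a pipeline of rain detection under photometric and chromatic constraints followed by spatio-temporal recovery. The following is a heavily simplified sketch of that flavor of pipeline, not the authors' algorithm: rain candidates are pixels that brighten briefly and roughly equally in all colour channels, and detected pixels are recovered from neighbouring frames. Threshold values are illustrative assumptions.

```python
import numpy as np

def remove_rain(frames, bright_thresh=12.0, chroma_tol=6.0):
    """Toy rain-pixel recovery for a video given as a (T, H, W, 3) array.
    Detection: a pixel brightens transiently (photometric constraint) by roughly
    the same amount in R, G and B (chromatic constraint). Recovery: replace the
    detected pixel with an estimate from the neighbouring frames."""
    frames = frames.astype(np.float32)
    out = frames.copy()
    for t in range(1, frames.shape[0] - 1):
        background = 0.5 * (frames[t - 1] + frames[t + 1])        # temporal neighbours
        delta = frames[t] - background                            # per-channel brightening
        photometric = delta.mean(axis=-1) > bright_thresh         # transient brightening
        chromatic = np.ptp(delta, axis=-1) < chroma_tol           # similar across channels
        rain = photometric & chromatic
        out[t][rain] = background[rain]                           # spatio-temporal recovery
    return out

# Minimal usage with synthetic data: 10 frames of 64x64 RGB noise plus a fake streak.
video = np.random.default_rng(1).uniform(80, 120, size=(10, 64, 64, 3))
video[5, 10:20, 30, :] += 40.0                                    # a bright "rain" streak
clean = remove_rain(video)
```

A full method of the kind described would additionally segment moving objects so that motion is not mistaken for rain; the sketch omits that step.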
A distributed code for color in natural scenes derived from center-surround filtered cone signals
Kellner, Christian J.; Wachtler, Thomas
2013-01-01
In the retina of trichromatic primates, chromatic information is encoded in an opponent fashion and transmitted to the lateral geniculate nucleus (LGN) and visual cortex via parallel pathways. Chromatic selectivities of neurons in the LGN form two separate clusters, corresponding to two classes of cone opponency. In the visual cortex, however, the chromatic selectivities are more distributed, which is in accordance with a population code for color. Previous studies of cone signals in natural scenes typically found opponent codes with chromatic selectivities corresponding to two directions in color space. Here we investigated how the non-linear spatio-chromatic filtering in the retina influences the encoding of color signals. Cone signals were derived from hyper-spectral images of natural scenes and preprocessed by center-surround filtering and rectification, resulting in parallel ON and OFF channels. Independent Component Analysis (ICA) on these signals yielded a highly sparse code with basis functions that showed spatio-chromatic selectivities. In contrast to previous analyses of linear transformations of cone signals, chromatic selectivities were not restricted to two main chromatic axes, but were more continuously distributed in color space, similar to the population code of color in the early visual cortex. Our results indicate that spatio-chromatic processing in the retina leads to a more distributed and more efficient code for natural scenes. PMID:24098289
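As a rough illustration of the analysis pipeline described (center-surround filtering, rectification into parallel ON and OFF channels, then ICA), here is a compact sketch using a difference-of-Gaussians filter and scikit-learn's FastICA. The random stand-in images, filter widths, patch size, and component count are assumptions for illustration; a real analysis would start from cone responses derived from hyperspectral natural scenes.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

def center_surround(img, sigma_c=1.0, sigma_s=3.0):
    """Difference-of-Gaussians approximation of retinal center-surround filtering."""
    return gaussian_filter(img, sigma_c) - gaussian_filter(img, sigma_s)

# Stand-in "cone" images, one per chromatic channel.
images = rng.normal(size=(3, 128, 128))
filtered = np.stack([center_surround(ch) for ch in images])

# Half-wave rectification into parallel ON and OFF channels.
on_channels = np.maximum(filtered, 0.0)
off_channels = np.maximum(-filtered, 0.0)
signals = np.concatenate([on_channels, off_channels])        # (6, 128, 128)

def sample_patches(stack, size=8, n=2000):
    """Draw random spatial patches across all channels and flatten them."""
    c, h, w = stack.shape
    ys = rng.integers(0, h - size, n)
    xs = rng.integers(0, w - size, n)
    return np.stack([stack[:, y:y + size, x:x + size].ravel() for y, x in zip(ys, xs)])

patches = sample_patches(signals)                             # (n, 6 * 8 * 8)
ica = FastICA(n_components=20, random_state=0, max_iter=500)
ica.fit(patches)
basis_functions = ica.mixing_.T   # each row: one spatio-chromatic basis function
```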
Visual flow scene effects on the somatogravic illusion in non-pilots.
Eriksson, Lars; von Hofsten, Claes; Tribukait, Arne; Eiken, Ola; Andersson, Peter; Hedström, Johan
2008-09-01
The somatogravic illusion (SGI) is easily broken when the pilot looks out the aircraft window during daylight flight, but it has proven difficult to break or even reduce the SGI in non-pilots in simulators using synthetic visual scenes. Could visual-flow scenes that accommodate compensatory head movement reduce the SGI in naive subjects? We investigated the effects of visual cues on the SGI induced by a human centrifuge. The subject was equipped with a head-tracked, head-mounted display (HMD) and was seated in a fixed gondola facing the center of rotation. The angular velocity of the centrifuge increased from near zero until a 0.57-G centripetal acceleration was attained, resulting in a tilt of the gravitoinertial force vector, corresponding to a pitch-up of 30 degrees. The subject indicated perceived horizontal continuously by means of a manual adjustable-plate system. We performed two experiments with within-subjects designs. In Experiment 1, the subjects (N = 13) viewed a darkened HMD and a presentation of simple visual flow beneath a horizon. In Experiment 2, the subjects (N = 12) viewed a darkened HMD, a scene including symbology superimposed on simple visual flow and horizon, and this scene without visual flow (static). In Experiment 1, visual flow reduced the SGI from 12.4 +/- 1.4 degrees (mean +/- SE) to 8.7 +/- 1.5 degrees. In Experiment 2, the SGI was smaller in the visual flow condition (9.3 +/- 1.8 degrees) than with the static scene (13.3 +/- 1.7 degrees) and without HMD presentation (14.5 +/- 2.3 degrees), respectively. It is possible to reduce the SGI in non-pilots by means of a synthetic horizon and simple visual flow conveyed by a head-tracked HMD. This may reflect the power of a more intuitive display for reducing the SGI.
Electro-optical design for efficient visual communication
NASA Astrophysics Data System (ADS)
Huck, Friedrich O.; Fales, Carl L.; Jobson, Daniel J.; Rahman, Zia-ur
1994-06-01
Visual communication can be regarded as efficient only if the amount of information that it conveys from the scene to the observer approaches the maximum possible and the associated cost approaches the minimum possible. To deal with this problem, Fales and Huck have integrated the critical limiting factors that constrain image gathering into classical concepts of communication theory. This paper uses this approach to assess the electro-optical design of the image gathering device. Design variables include the f-number and apodization of the objective lens, the aperture size and sampling geometry of the photodetection mechanism, and lateral inhibition and nonlinear radiance-to-signal conversion akin to the retinal processing in the human eye. It is an agreeable consequence of this approach that the image gathering device that is designed along the guidelines developed from communication theory behaves very much like the human eye. The performance approaches the maximum possible in terms of the information content of the acquired data, and thereby, the fidelity, sharpness and clarity with which fine detail can be restored, the efficiency with which the visual information can be transmitted in the form of decorrelated data, and the robustness of these two attributes to the temporal and spatial variations in scene illumination.
Swallow, Khena M; Jiang, Yuhong V
2010-04-01
Recent work on event perception suggests that perceptual processing increases when events change. An important question is how such changes influence the way other information is processed, particularly during dual-task performance. In this study, participants monitored a long series of distractor items for an occasional target as they simultaneously encoded unrelated background scenes. The appearance of an occasional target could have two opposite effects on the secondary task: It could draw attention away from the second task, or, as a change in the ongoing event, it could improve secondary task performance. Results were consistent with the second possibility. Memory for scenes presented simultaneously with the targets was better than memory for scenes that preceded or followed the targets. This effect was observed when the primary detection task involved visual feature oddball detection, auditory oddball detection, and visual color-shape conjunction detection. It was eliminated when the detection task was omitted, and when it required an arbitrary response mapping. The appearance of occasional, task-relevant events appears to trigger a temporal orienting response that facilitates processing of concurrently attended information (Attentional Boost Effect). Copyright 2009 Elsevier B.V. All rights reserved.
Swallow, Khena M.; Jiang, Yuhong V.
2009-01-01
Recent work on event perception suggests that perceptual processing increases when events change. An important question is how such changes influence the way other information is processed, particularly during dual-task performance. In this study, participants monitored a long series of distractor items for an occasional target as they simultaneously encoded unrelated background scenes. The appearance of an occasional target could have two opposite effects on the secondary task: It could draw attention away from the second task, or, as a change in the ongoing event, it could improve secondary task performance. Results were consistent with the second possibility. Memory for scenes presented simultaneously with the targets was better than memory for scenes that preceded or followed the targets. This effect was observed when the primary detection task involved visual feature oddball detection, auditory oddball detection, and visual color-shape conjunction detection. It was eliminated when the detection task was omitted, and when it required an arbitrary response mapping. The appearance of occasional, task-relevant events appears to trigger a temporal orienting response that facilitates processing of concurrently attended information (Attentional Boost Effect). PMID:20080232
Azizi, Elham; Abel, Larry A; Stainer, Matthew J
2017-02-01
Action game playing has been associated with several improvements in visual attention tasks. However, it is not clear how such changes might influence the way we overtly select information from our visual world (i.e. eye movements). We examined whether action-video-game training changed eye movement behaviour in a series of visual search tasks including conjunctive search (relatively abstracted from natural behaviour), game-related search, and more naturalistic scene search. Forty nongamers were trained in either an action first-person shooter game or a card game (control) for 10 hours. As a further control, we recorded eye movements of 20 experienced action gamers on the same tasks. The results did not show any change in duration of fixations or saccade amplitude either from before to after the training or between all nongamers (pretraining) and experienced action gamers. However, we observed a change in search strategy, reflected by a reduction in the vertical distribution of fixations for the game-related search task in the action-game-trained group. This might suggest learning the likely distribution of targets. In other words, game training only taught participants to search game images for targets important to the game, with no indication of transfer to the more natural scene search. Taken together, these results suggest no modification in overt allocation of attention. Either the skills that can be trained with action gaming are not powerful enough to influence information selection through eye movements, or action-game-learned skills are not used when deciding where to move the eyes.
Helo, Andrea; van Ommen, Sandrien; Pannasch, Sebastian; Danteny-Dordoigne, Lucile; Rämä, Pia
2017-11-01
Conceptual representations of everyday scenes are built in interaction with the visual environment, and these representations guide our visual attention. Perceptual features and object-scene semantic consistency have been found to attract our attention during scene exploration. The present study examined how visual attention in 24-month-old toddlers is attracted by semantic violations and how perceptual features (i.e., saliency, centre distance, clutter and object size) and linguistic properties (i.e., object label frequency and label length) affect gaze distribution. We compared eye movements of 24-month-old toddlers and adults while they explored everyday scenes which contained either an inconsistent (e.g., soap on a breakfast table) or a consistent (e.g., soap in a bathroom) object. Perceptual features such as saliency, centre distance and clutter of the scene affected looking times in the toddler group during the whole viewing time, whereas looking times in adults were affected only by centre distance during the early viewing time. Adults looked longer at inconsistent than consistent objects whether the objects had high or low saliency. In contrast, toddlers showed a semantic consistency effect only when objects were highly salient. Additionally, toddlers with lower vocabulary skills looked longer at inconsistent objects, while toddlers with higher vocabulary skills looked equally long at consistent and inconsistent objects. Our results indicate that 24-month-old children use scene context to guide visual attention when exploring the visual environment. However, perceptual features have a stronger influence on eye movement guidance in toddlers than in adults. Our results also indicate that language skills influence cognitive but not perceptual guidance of eye movements during scene perception in toddlers. Copyright © 2017 Elsevier Inc. All rights reserved.
Conjoint representation of texture ensemble and location in the parahippocampal place area.
Park, Jeongho; Park, Soojin
2017-04-01
Texture provides crucial information about the category or identity of a scene. Nonetheless, not much is known about how the texture information in a scene is represented in the brain. Previous studies have shown that the parahippocampal place area (PPA), a scene-selective part of visual cortex, responds to simple patches of texture ensemble. However, in natural scenes textures exist in spatial context within a scene. Here we tested two hypotheses that make different predictions on how textures within a scene context are represented in the PPA. The Texture-Only hypothesis suggests that the PPA represents texture ensemble (i.e., the kind of texture) as is, irrespective of its location in the scene. On the other hand, the Texture and Location hypothesis suggests that the PPA represents texture and its location within a scene (e.g., ceiling or wall) conjointly. We tested these two hypotheses across two experiments, using different but complementary methods. In experiment 1, by using multivoxel pattern analysis (MVPA) and representational similarity analysis, we found that the representational similarity of the PPA activation patterns was significantly explained by the Texture-Only hypothesis but not by the Texture and Location hypothesis. In experiment 2, using a repetition suppression paradigm, we found no repetition suppression for scenes that had the same texture ensemble but differed in location (supporting the Texture and Location hypothesis). On the basis of these results, we propose a framework that reconciles contrasting results from MVPA and repetition suppression and draw conclusions about how texture is represented in the PPA. NEW & NOTEWORTHY This study investigates how the parahippocampal place area (PPA) represents texture information within a scene context. We claim that texture is represented in the PPA at multiple levels: the texture ensemble information at the across-voxel level and the conjoint information of texture and its location at the within-voxel level. The study proposes a working hypothesis that reconciles contrasting results from multivoxel pattern analysis and repetition suppression, suggesting that the methods are complementary to each other but not necessarily interchangeable. Copyright © 2017 the American Physiological Society.
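Experiment 1's analysis combines multivoxel pattern analysis with representational similarity analysis: the observed pattern-dissimilarity structure is compared against the dissimilarity structure predicted by each hypothesis. A generic sketch of that comparison, with simulated voxel patterns and hypothetical model matrices rather than the study's data, might look like this:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Simulated activation patterns: 8 conditions (4 textures x 2 locations), 200 voxels.
n_conditions, n_voxels = 8, 200
patterns = rng.normal(size=(n_conditions, n_voxels))

# Neural representational dissimilarity matrix (condensed form), correlation distance.
neural_rdm = pdist(patterns, metric="correlation")

# Hypothetical model RDMs: texture-only predicts similarity whenever the texture
# matches; texture-and-location predicts similarity only when both labels match.
texture = np.repeat(np.arange(4), 2)          # texture label per condition
location = np.tile(np.arange(2), 4)           # location label per condition
texture_only_rdm = pdist(texture[:, None], metric="hamming")
texture_and_location_rdm = pdist(np.stack([texture, location], axis=1), metric="hamming")

# Compare each model with the neural RDM (rank correlation over matrix cells).
for name, model in [("texture-only", texture_only_rdm),
                    ("texture-and-location", texture_and_location_rdm)]:
    rho, _ = spearmanr(neural_rdm, model)
    print(f"{name}: Spearman rho = {rho:.3f}")
```

With real data, the model yielding the higher correlation is the one better supported by the across-voxel pattern structure.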
Information theoretic analysis of edge detection in visual communication
NASA Astrophysics Data System (ADS)
Jiang, Bo; Rahman, Zia-ur
2010-08-01
Generally, the designs of digital image processing algorithms and image gathering devices remain separate. Consequently, the performance of digital image processing algorithms is evaluated without taking into account the artifacts introduced by the image gathering process. However, experiments show that the image gathering process profoundly impacts the performance of digital image processing and the quality of the resulting images. Huck et al. proposed a definitive theoretic analysis of visual communication channels, in which the different parts, such as image gathering, processing, and display, are assessed in an integrated manner using Shannon's information theory. In this paper, we perform an end-to-end information-theory-based system analysis to assess edge detection methods. We evaluate the performance of the different algorithms as a function of the characteristics of the scene and the parameters, such as sampling and additive noise, that define the image gathering system. An edge detection algorithm is regarded as high-performing only if the information rate from the scene to the edge map approaches the maximum possible. This goal can be achieved only by jointly optimizing all processes. People generally use subjective judgment to compare different edge detection methods; there is no common tool for evaluating the performance of the different algorithms and guiding the selection of the best algorithm for a given system or scene. Our information-theoretic assessment becomes this new tool, which allows us to compare the different edge detection operators in a common environment.
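One concrete way to score a detector along these lines is to estimate the mutual information between a reference edge map for the scene and the detector's output on a degraded (blurred, noisy) acquisition of that scene. The sketch below does this with a Sobel detector and a plug-in mutual-information estimate; it illustrates the kind of end-to-end figure of merit described, not the paper's exact formulation, and the synthetic scene, blur, noise level, and thresholds are assumptions.

```python
import numpy as np
from scipy.ndimage import sobel, gaussian_filter
from sklearn.metrics import mutual_info_score

rng = np.random.default_rng(0)

def edge_map(image, threshold=0.5):
    """Binary edge map from Sobel gradient magnitude."""
    gx, gy = sobel(image, axis=0), sobel(image, axis=1)
    return (np.hypot(gx, gy) > threshold).astype(int)

# A synthetic "scene": smooth blobs with sharp boundaries.
scene = (gaussian_filter(rng.normal(size=(128, 128)), 6) > 0).astype(float)
reference_edges = edge_map(scene)

# Simulated image gathering: optical blur plus additive sensor noise.
acquired = gaussian_filter(scene, 1.5) + rng.normal(0, 0.05, scene.shape)
detected_edges = edge_map(acquired)

# Information conveyed from the reference edges to the detector output (bits/pixel).
mi_nats = mutual_info_score(reference_edges.ravel(), detected_edges.ravel())
print(f"mutual information: {mi_nats / np.log(2):.3f} bits per pixel")
```

Varying the blur, noise, and sampling parameters while recomputing this quantity gives the kind of joint scene-plus-system assessment the abstract advocates.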
Scan patterns when viewing natural scenes: emotion, complexity, and repetition.
Bradley, Margaret M; Houbova, Petra; Miccoli, Laura; Costa, Vincent D; Lang, Peter J
2011-11-01
Eye movements were monitored during picture viewing, and effects of hedonic content, perceptual composition, and repetition on scanning assessed. In Experiment 1, emotional and neutral pictures that were figure-ground compositions or more complex scenes were presented for a 6-s free viewing period. Viewing emotional pictures or complex scenes prompted more fixations and broader scanning of the visual array, compared to neutral pictures or simple figure-ground compositions. Effects of emotion and composition were independent, supporting the hypothesis that these oculomotor indices reflect enhanced information seeking. Experiment 2 tested an orienting hypothesis by repeatedly presenting the same pictures. Although repetition altered specific scan patterns, emotional, compared to neutral, picture viewing continued to prompt oculomotor differences, suggesting that motivationally relevant cues enhance information seeking in appetitive and defensive contexts. Copyright © 2011 Society for Psychophysiological Research.
The Role of Visual Experience on the Representation and Updating of Novel Haptic Scenes
ERIC Educational Resources Information Center
Pasqualotto, Achille; Newell, Fiona N.
2007-01-01
We investigated the role of visual experience on the spatial representation and updating of haptic scenes by comparing recognition performance across sighted, congenitally and late blind participants. We first established that spatial updating occurs in sighted individuals to haptic scenes of novel objects. All participants were required to…
Schettino, Antonio; Keil, Andreas; Porcu, Emanuele; Müller, Matthias M
2016-06-01
The rapid extraction of affective cues from the visual environment is crucial for flexible behavior. Previous studies have reported emotion-dependent amplitude modulations of two event-related potential (ERP) components - the N1 and EPN - reflecting sensory gain control mechanisms in extrastriate visual areas. However, it is unclear whether both components are selective electrophysiological markers of attentional orienting toward emotional material or are also influenced by physical features of the visual stimuli. To address this question, electrical brain activity was recorded from seventeen male participants while viewing original and bright versions of neutral and erotic pictures. Bright neutral scenes were rated as more pleasant compared to their original counterpart, whereas erotic scenes were judged more positively when presented in their original version. Classical and mass univariate ERP analysis showed larger N1 amplitude for original relative to bright erotic pictures, with no differences for original and bright neutral scenes. Conversely, the EPN was only modulated by picture content and not by brightness, substantiating the idea that this component is a unique electrophysiological marker of attention allocation toward emotional material. Complementary topographic analysis revealed the early selective expression of a centro-parietal positivity following the presentation of original erotic scenes only, reflecting the recruitment of neural networks associated with sustained attention and facilitated memory encoding for motivationally relevant material. Overall, these results indicate that neural networks subtending the extraction of emotional information are differentially recruited depending on low-level perceptual features, which ultimately influence affective evaluations. Copyright © 2016 Elsevier Inc. All rights reserved.
Deployment of spatial attention towards locations in memory representations. An EEG study.
Leszczyński, Marcin; Wykowska, Agnieszka; Perez-Osorio, Jairo; Müller, Hermann J
2013-01-01
Recalling information from visual short-term memory (VSTM) involves the same neural mechanisms as attending to an actually perceived scene. In particular, retrieval from VSTM has been associated with orienting of visual attention towards a location within a spatially-organized memory representation. However, an open question concerns whether spatial attention is also recruited during VSTM retrieval even when performing the task does not require access to spatial coordinates of items in the memorized scene. The present study combined a visual search task with a modified, delayed central probe protocol, together with EEG analysis, to answer this question. We found a temporal contralateral negativity (TCN) elicited by a centrally presented go-signal which was spatially uninformative and featurally unrelated to the search target and informed participants only about a response key that they had to press to indicate a prepared target-present vs. -absent decision. This lateralization during VSTM retrieval (TCN) provides strong evidence of a shift of attention towards the target location in the memory representation, which occurred despite the fact that the present task required no spatial (or featural) information from the search to be encoded, maintained, and retrieved to produce the correct response and that the go-signal did not itself specify any information relating to the location and defining feature of the target.
Robust selectivity to two-object images in human visual cortex
Agam, Yigal; Liu, Hesheng; Papanastassiou, Alexander; Buia, Calin; Golby, Alexandra J.; Madsen, Joseph R.; Kreiman, Gabriel
2010-01-01
We can recognize objects in a fraction of a second in spite of the presence of other objects [1–3]. The responses in macaque areas V4 and inferior temporal cortex [4–15] to a neuron’s preferred stimuli are typically suppressed by the addition of a second object within the receptive field (see however [16, 17]). How can this suppression be reconciled with rapid visual recognition in complex scenes? One option is that certain “special categories” are unaffected by other objects [18] but this leaves the problem unsolved for other categories. Another possibility is that serial attentional shifts help ameliorate the problem of distractor objects [19–21]. Yet, psychophysical studies [1–3], scalp recordings [1] and neurophysiological recordings [14, 16, 22–24], suggest that the initial sweep of visual processing contains a significant amount of information. We recorded intracranial field potentials in human visual cortex during presentation of flashes of two-object images. Visual selectivity from temporal cortex during the initial ~200 ms was largely robust to the presence of other objects. We could train linear decoders on the responses to isolated objects and decode information in two-object images. These observations are compatible with parallel, hierarchical and feed-forward theories of rapid visual recognition [25] and may provide a neural substrate to begin to unravel rapid recognition in natural scenes. PMID:20417105
Cognitive processing in the primary visual cortex: from perception to memory.
Supèr, Hans
2002-01-01
The primary visual cortex is the first cortical area of the visual system that receives information from the external visual world. Based on the receptive field characteristics of the neurons in this area, it has been assumed that the primary visual cortex is a pure sensory area extracting basic elements of the visual scene. This information is then further processed upstream in the higher-order visual areas, providing us with perception and storage of the visual environment. However, recent findings show that neural correlates of such higher-order processes are also observed in the primary visual cortex. These correlates are expressed in the modulated late response of a neuron to a stimulus and most likely depend on recurrent interactions between several areas of the visual system. This favors the concept of a distributed nature of visual processing in perceptual organization.
A novel scene management technology for complex virtual battlefield environment
NASA Astrophysics Data System (ADS)
Sheng, Changchong; Jiang, Libing; Tang, Bo; Tang, Xiaoan
2018-04-01
The efficient scene management of a virtual environment is an important research topic in real-time computer visualization and has a decisive influence on rendering efficiency. However, traditional scene management methods are not suitable for complex virtual battlefield environments. This paper combines the advantages of traditional scene graph technology and spatial data structure methods: following the idea of separating management from rendering, a loose object-oriented scene graph structure is established to manage the entity model data in the scene, and a performance-oriented quad-tree structure is created for traversal and rendering. In addition, a collaborative update relationship between the two structural trees is designed to achieve efficient scene management. Compared with previous scene management methods, this method is more efficient and meets the needs of real-time visualization.
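The split described, a scene graph for managing entity data plus a quad-tree for traversal and rendering, can be illustrated with a minimal quad-tree that supports insertion and rectangular view-region queries. This is a generic sketch of the spatial-indexing half, not the paper's implementation; entity coordinates and the capacity parameter are illustrative.

```python
class QuadTree:
    """Minimal point quad-tree over a square region with corner (x, y) and side `size`."""
    def __init__(self, x, y, size, capacity=4):
        self.x, self.y, self.size, self.capacity = x, y, size, capacity
        self.entities = []        # (ex, ey, payload) tuples stored at this node
        self.children = None      # four sub-quadrants once split

    def insert(self, ex, ey, payload):
        if self.children is None:
            self.entities.append((ex, ey, payload))
            if len(self.entities) > self.capacity and self.size > 1e-6:
                self._split()
            return
        self._child_for(ex, ey).insert(ex, ey, payload)

    def _split(self):
        h = self.size / 2.0
        self.children = [QuadTree(self.x, self.y, h, self.capacity),
                         QuadTree(self.x + h, self.y, h, self.capacity),
                         QuadTree(self.x, self.y + h, h, self.capacity),
                         QuadTree(self.x + h, self.y + h, h, self.capacity)]
        for ex, ey, payload in self.entities:
            self._child_for(ex, ey).insert(ex, ey, payload)
        self.entities = []

    def _child_for(self, ex, ey):
        h = self.size / 2.0
        return self.children[(ex >= self.x + h) + 2 * (ey >= self.y + h)]

    def query(self, qx, qy, qw, qh, found=None):
        """Collect payloads inside the axis-aligned view rectangle (qx, qy, qw, qh)."""
        if found is None:
            found = []
        if qx > self.x + self.size or qx + qw < self.x or \
           qy > self.y + self.size or qy + qh < self.y:
            return found                      # view rectangle misses this quadrant
        for ex, ey, payload in self.entities:
            if qx <= ex <= qx + qw and qy <= ey <= qy + qh:
                found.append(payload)
        if self.children:
            for child in self.children:
                child.query(qx, qy, qw, qh, found)
        return found

# Usage: index a few battlefield entities, then cull against a camera view region.
tree = QuadTree(0, 0, 1000)
positions = [(100, 120), (130, 110), (800, 900), (120, 115), (115, 118), (119, 121)]
for i, (ex, ey) in enumerate(positions):
    tree.insert(ex, ey, f"entity_{i}")
print(tree.query(90, 100, 60, 40))            # entities inside the view rectangle
```

In the scheme the abstract describes, each payload would be a handle back into the object-oriented scene graph, so that rendering traverses the quad-tree while the scene graph remains the authority on entity state.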
ViCoMo: visual context modeling for scene understanding in video surveillance
NASA Astrophysics Data System (ADS)
Creusen, Ivo M.; Javanbakhti, Solmaz; Loomans, Marijn J. H.; Hazelhoff, Lykele B.; Roubtsova, Nadejda; Zinger, Svitlana; de With, Peter H. N.
2013-10-01
The use of contextual information can significantly aid scene understanding of surveillance video. Just detecting people and tracking them does not provide sufficient information to detect situations that require operator attention. We propose a proof-of-concept system that uses several sources of contextual information to improve scene understanding in surveillance video. The focus is on two scenarios that represent common video surveillance situations, parking lot surveillance and crowd monitoring. In the first scenario, a pan-tilt-zoom (PTZ) camera tracking system is developed for parking lot surveillance. Context is provided by the traffic sign recognition system to localize regular and handicapped parking spot signs as well as license plates. The PTZ algorithm has the ability to selectively detect and track persons based on scene context. In the second scenario, a group analysis algorithm is introduced to detect groups of people. Contextual information is provided by traffic sign recognition and region labeling algorithms and exploited for behavior understanding. In both scenarios, decision engines are used to interpret and classify the output of the subsystems and if necessary raise operator alerts. We show that using context information enables the automated analysis of complicated scenarios that were previously not possible using conventional moving object classification techniques.
The medium and the message: a revisionist view of image quality
NASA Astrophysics Data System (ADS)
Ferwerda, James A.
2010-02-01
In his book "Understanding Media" social theorist Marshall McLuhan declared: "The medium is the message." The thesis of this paper is that with respect to image quality, imaging system developers have taken McLuhan's dictum too much to heart. Efforts focus on improving the technical specifications of the media (e.g. dynamic range, color gamut, resolution, temporal response) with little regard for the visual messages the media will be used to communicate. We present a series of psychophysical studies that investigate the visual system's ability to "see through" the limitations of imaging media to perceive the messages (object and scene properties) the images represent. The purpose of these studies is to understand the relationships between the signal characteristics of an image and the fidelity of the visual information the image conveys. The results of these studies provide a new perspective on image quality that shows that images that may be very different in "quality", can be visually equivalent as realistic representations of objects and scenes.
Luminance cues constrain chromatic blur discrimination in natural scene stimuli.
Sharman, Rebecca J; McGraw, Paul V; Peirce, Jonathan W
2013-03-22
Introducing blur into the color components of a natural scene has very little effect on its percept, whereas blur introduced into the luminance component is very noticeable. Here we quantify the dominance of luminance information in blur detection and examine a number of potential causes. We show that the interaction between chromatic and luminance information is not explained by reduced acuity or spatial resolution limitations for chromatic cues, the effective contrast of the luminance cue, or chromatic and achromatic statistical regularities in the images. Regardless of the quality of chromatic information, the visual system gives primacy to luminance signals when determining edge location. In natural viewing, luminance information appears to be specialized for detecting object boundaries while chromatic information may be used to determine surface properties.
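The manipulation described, blurring the chromatic components of an image while leaving luminance intact (or vice versa), can be reproduced with a simple luminance/chromaticity decomposition. The sketch below uses an elementary decomposition (luminance as the channel mean) purely for illustration; the study itself used calibrated natural scene stimuli.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blur_channels(rgb, sigma, blur_luminance=False):
    """Blur either the luminance or the chromatic components of an RGB image.
    Simple decomposition: luminance = mean(R, G, B); chromatic components are
    the per-channel deviations from that luminance."""
    rgb = rgb.astype(np.float32)
    luminance = rgb.mean(axis=-1, keepdims=True)
    chroma = rgb - luminance
    if blur_luminance:
        luminance = gaussian_filter(luminance, sigma=(sigma, sigma, 0))
    else:
        chroma = gaussian_filter(chroma, sigma=(sigma, sigma, 0))
    return np.clip(luminance + chroma, 0, 255)

# Usage on a stand-in image: chromatic blur is hard to notice, luminance blur is obvious.
rng = np.random.default_rng(0)
image = rng.uniform(0, 255, size=(256, 256, 3))
chroma_blurred = blur_channels(image, sigma=4.0, blur_luminance=False)
luma_blurred = blur_channels(image, sigma=4.0, blur_luminance=True)
```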
Understanding Recovery from Object Substitution Masking
ERIC Educational Resources Information Center
Goodhew, Stephanie C.; Dux, Paul E.; Lipp, Ottmar V.; Visser, Troy A. W.
2012-01-01
When we look at a scene, we are conscious of only a small fraction of the available visual information at any given point in time. This raises profound questions regarding how information is selected, when awareness occurs, and the nature of the mechanisms underlying these processes. One tool that may be used to probe these issues is…
Bayesian modeling of cue interaction: bistability in stereoscopic slant perception.
van Ee, Raymond; Adams, Wendy J; Mamassian, Pascal
2003-07-01
Our two eyes receive different views of a visual scene, and the resulting binocular disparities enable us to reconstruct its three-dimensional layout. However, the visual environment is also rich in monocular depth cues. We examined the resulting percept when observers view a scene in which there are large conflicts between the surface slant signaled by binocular disparities and the slant signaled by monocular perspective. For a range of disparity-perspective cue conflicts, many observers experience bistability: They are able to perceive two distinct slants and to flip between the two percepts in a controlled way. We present a Bayesian model that describes the quantitative aspects of perceived slant on the basis of the likelihoods of both perspective and disparity slant information combined with prior assumptions about the shape and orientation of objects in the scene. Our Bayesian approach can be regarded as an overarching framework that allows researchers to study all cue integration aspects, including perceptual decisions, in a unified manner.
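A generic numerical illustration of this kind of Bayesian cue-conflict model: if each cue's likelihood includes a small probability that the cue is misleading (a mixture with a broad component), a large disparity-perspective conflict yields a bimodal posterior over slant, i.e., two candidate percepts. This is a sketch of the modelling idea only; the likelihood widths, outlier probability, prior, and cue values are assumptions, not the authors' formulation.

```python
import numpy as np

def gaussian(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

slant = np.linspace(-60, 60, 1201)                    # candidate slants (degrees)

def robust_likelihood(cue_value, sigma, outlier_prob=0.1):
    """Mixture likelihood: the cue is usually reliable, occasionally uninformative."""
    broad = np.full_like(slant, 1.0 / (slant[-1] - slant[0]))
    return (1 - outlier_prob) * gaussian(slant, cue_value, sigma) + outlier_prob * broad

# Large cue conflict: disparity signals +40 deg, perspective signals -20 deg.
prior = gaussian(slant, 0.0, 30.0)                    # mild preference for frontoparallel
posterior = prior * robust_likelihood(40.0, 5.0) * robust_likelihood(-20.0, 5.0)
posterior /= posterior.sum()                          # normalize over the slant grid

# Local maxima of the posterior correspond to the two competing slant percepts.
peaks = np.where((posterior[1:-1] > posterior[:-2]) &
                 (posterior[1:-1] > posterior[2:]))[0] + 1
print("candidate percepts (deg):", np.round(slant[peaks], 1))
```

With small conflicts the posterior is unimodal and the percept is a weighted compromise; with large conflicts two modes emerge, which is one way to capture the controlled flipping between percepts that the abstract reports.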
Differential Visual Processing of Animal Images, with and without Conscious Awareness
Zhu, Weina; Drewes, Jan; Peatfield, Nicholas A.; Melcher, David
2016-01-01
The human visual system can quickly and efficiently extract categorical information from a complex natural scene. The rapid detection of animals in a scene is one compelling example of this phenomenon, and it suggests the automatic processing of at least some types of categories with little or no attentional requirements (Li et al., 2002, 2005). The aim of this study is to investigate whether the remarkable capability to categorize complex natural scenes exist in the absence of awareness, based on recent reports that “invisible” stimuli, which do not reach conscious awareness, can still be processed by the human visual system (Pasley et al., 2004; Williams et al., 2004; Fang and He, 2005; Jiang et al., 2006, 2007; Kaunitz et al., 2011a). In two experiments, we recorded event-related potentials (ERPs) in response to animal and non-animal/vehicle stimuli in both aware and unaware conditions in a continuous flash suppression (CFS) paradigm. Our results indicate that even in the “unseen” condition, the brain responds differently to animal and non-animal/vehicle images, consistent with rapid activation of animal-selective feature detectors prior to, or outside of, suppression by the CFS mask. PMID:27790106
Differential Visual Processing of Animal Images, with and without Conscious Awareness.
Zhu, Weina; Drewes, Jan; Peatfield, Nicholas A; Melcher, David
2016-01-01
The human visual system can quickly and efficiently extract categorical information from a complex natural scene. The rapid detection of animals in a scene is one compelling example of this phenomenon, and it suggests the automatic processing of at least some types of categories with little or no attentional requirements (Li et al., 2002, 2005). The aim of this study is to investigate whether the remarkable capability to categorize complex natural scenes exist in the absence of awareness, based on recent reports that "invisible" stimuli, which do not reach conscious awareness, can still be processed by the human visual system (Pasley et al., 2004; Williams et al., 2004; Fang and He, 2005; Jiang et al., 2006, 2007; Kaunitz et al., 2011a). In two experiments, we recorded event-related potentials (ERPs) in response to animal and non-animal/vehicle stimuli in both aware and unaware conditions in a continuous flash suppression (CFS) paradigm. Our results indicate that even in the "unseen" condition, the brain responds differently to animal and non-animal/vehicle images, consistent with rapid activation of animal-selective feature detectors prior to, or outside of, suppression by the CFS mask.
Age-related macular degeneration changes the processing of visual scenes in the brain.
Ramanoël, Stephen; Chokron, Sylvie; Hera, Ruxandra; Kauffmann, Louise; Chiquet, Christophe; Krainik, Alexandre; Peyrin, Carole
2018-01-01
In age-related macular degeneration (AMD), the processing of fine details in a visual scene, based on a high spatial frequency processing, is impaired, while the processing of global shapes, based on a low spatial frequency processing, is relatively well preserved. The present fMRI study aimed to investigate the residual abilities and functional brain changes of spatial frequency processing in visual scenes in AMD patients. AMD patients and normally sighted elderly participants performed a categorization task using large black and white photographs of scenes (indoors vs. outdoors) filtered in low and high spatial frequencies, and nonfiltered. The study also explored the effect of luminance contrast on the processing of high spatial frequencies. The contrast across scenes was either unmodified or equalized using a root-mean-square contrast normalization in order to increase contrast in high-pass filtered scenes. Performance was lower for high-pass filtered scenes than for low-pass and nonfiltered scenes, for both AMD patients and controls. The deficit for processing high spatial frequencies was more pronounced in AMD patients than in controls and was associated with lower activity for patients than controls not only in the occipital areas dedicated to central and peripheral visual fields but also in a distant cerebral region specialized for scene perception, the parahippocampal place area. Increasing the contrast improved the processing of high spatial frequency content and spurred activation of the occipital cortex for AMD patients. These findings may lead to new perspectives for rehabilitation procedures for AMD patients.
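The stimulus manipulation described, low-pass and high-pass spatial-frequency filtering with optional root-mean-square contrast equalization, is straightforward to reproduce. A minimal sketch follows; the Gaussian cutoff, target contrast, and grayscale stand-in image are illustrative assumptions rather than the study's parameters.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def filter_scene(image, sigma, keep="low"):
    """Low-pass (keep='low') or high-pass (keep='high') filter a grayscale scene."""
    low = gaussian_filter(image.astype(np.float32), sigma)
    return low if keep == "low" else image - low

def rms_normalize(image, target_rms):
    """Equalize root-mean-square contrast around the mean luminance."""
    centered = image - image.mean()
    rms = np.sqrt(np.mean(centered ** 2))
    return image.mean() + centered * (target_rms / max(rms, 1e-8))

rng = np.random.default_rng(0)
scene = gaussian_filter(rng.uniform(0, 1, (256, 256)), 2)       # stand-in photograph

lsf_scene = filter_scene(scene, sigma=8, keep="low")            # coarse blobs / global shape
hsf_scene = filter_scene(scene, sigma=8, keep="high")           # fine detail / edges
hsf_equalized = rms_normalize(hsf_scene, target_rms=np.std(scene))  # boost HSF contrast
```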
Takeuchi, Tatsuto; Yoshimoto, Sanae; Shimada, Yasuhiro; Kochiyama, Takanori; Kondo, Hirohito M
2017-02-19
Recent studies have shown that interindividual variability can be a rich source of information regarding the mechanism of human visual perception. In this study, we examined the mechanisms underlying interindividual variability in the perception of visual motion, one of the fundamental components of visual scene analysis, by measuring neurotransmitter concentrations using magnetic resonance spectroscopy. First, by psychophysically examining two types of motion phenomena, motion assimilation and motion contrast, we found that, following the presentation of the same stimulus, some participants perceived motion assimilation, while others perceived motion contrast. Furthermore, we found that the concentration of the excitatory neurotransmitter glutamate-glutamine (Glx) in the dorsolateral prefrontal cortex (Brodmann area 46) was positively correlated with a participant's tendency toward motion assimilation over motion contrast; however, this effect was not observed in the visual areas. The concentration of the inhibitory neurotransmitter γ-aminobutyric acid had only a weak effect compared with that of Glx. We conclude that an excitatory process in a suprasensory area is important for an individual's tendency to resolve antagonistically perceived visual motion phenomena. This article is part of the themed issue 'Auditory and visual scene analysis'. © 2017 The Author(s).
A Model of Manual Control with Perspective Scene Viewing
NASA Technical Reports Server (NTRS)
Sweet, Barbara Townsend
2013-01-01
A model of manual control during perspective scene viewing is presented, which combines the Crossover Model with a simplified model of perspective-scene viewing and visual-cue selection. The model is developed for a particular example task: an idealized constant-altitude task in which the operator controls longitudinal position in the presence of both longitudinal and pitch disturbances. An experiment is performed to develop and validate the model. The model corresponds closely with the experimental measurements, and identified model parameters are highly consistent with the visual cues available in the perspective scene. The modeling results indicate that operators used one visual cue for position control, and another visual cue for velocity control (lead generation). Additionally, operators responded more quickly to rotation (pitch) than translation (longitudinal).
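For readers unfamiliar with the Crossover Model mentioned here: it approximates the combined operator-plus-plant open-loop dynamics near the crossover frequency as an integrator with a time delay, Yp*Yc = omega_c * exp(-tau*s) / s. A small numerical sketch of that frequency response follows; the gain and delay values are illustrative, not the parameters identified in this study.

```python
import numpy as np

def crossover_open_loop(freq_hz, omega_c=3.0, tau=0.25):
    """McRuer Crossover Model open-loop response: Yp*Yc = omega_c * exp(-tau*s) / s."""
    s = 1j * 2 * np.pi * np.asarray(freq_hz)
    return omega_c * np.exp(-tau * s) / s

freqs = np.logspace(-1, 1, 200)                      # 0.1 to 10 Hz
response = crossover_open_loop(freqs)
gain_db = 20 * np.log10(np.abs(response))
phase_deg = np.degrees(np.unwrap(np.angle(response)))

# Crossover frequency: where the open-loop gain passes through 0 dB.
idx = np.argmin(np.abs(gain_db))
print(f"crossover at ~{freqs[idx]:.2f} Hz, "
      f"phase margin ~{180 + phase_deg[idx]:.1f} deg")
```

In the study's framework, the perspective-scene and visual-cue-selection sub-models determine which scene quantities feed the position and velocity (lead) terms of the operator described by these dynamics.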
Simulating Navigation with Virtual 3D Geovisualizations - A Focus on Memory-Related Factors
NASA Astrophysics Data System (ADS)
Lokka, I.; Çöltekin, A.
2016-06-01
The use of virtual environments (VEs) for navigation-related studies, such as spatial cognition and path retrieval, has been widely adopted in cognitive psychology and related fields. What motivates the use of VEs for such studies is that, as opposed to the real world, we can control for confounding variables in simulated VEs. When simulating a geographic environment as a virtual world with the intention to train navigational memory in humans, an effective and efficient visual design is important to facilitate recall. However, it is not yet clear how much information should be included in such visual designs intended to facilitate remembering: there can be too little or too much of it. Besides the amount of information or level of detail, the types of visual features ('elements' in a visual scene) that should be included in the representations to create memorable scenes and paths must be defined. We analyzed the literature in cognitive psychology, geovisualization and information visualization, and identified the key factors for studying and evaluating geovisualization designs for their ability to support and strengthen human navigational memory. The key factors we identified are: i) the individual abilities and age of the users, ii) the level of realism (LOR) included in the representations, and iii) the context in which the navigation is performed, thus specific tasks within a case scenario. Here we present a concise literature review and our conceptual development for follow-up experiments.
Speakers of different languages process the visual world differently.
Chabal, Sarah; Marian, Viorica
2015-06-01
Language and vision are highly interactive. Here we show that people activate language when they perceive the visual world, and that this language information impacts how speakers of different languages focus their attention. For example, when searching for an item (e.g., clock) in the same visual display, English and Spanish speakers look at different objects. Whereas English speakers searching for the clock also look at a cloud, Spanish speakers searching for the clock also look at a gift, because the Spanish names for gift (regalo) and clock (reloj) overlap phonologically. These different looking patterns emerge despite an absence of direct language input, showing that linguistic information is automatically activated by visual scene processing. We conclude that the varying linguistic information available to speakers of different languages affects visual perception, leading to differences in how the visual world is processed. (c) 2015 APA, all rights reserved.
Fragile visual short-term memory is an object-based and location-specific store.
Pinto, Yaïr; Sligte, Ilja G; Shapiro, Kimron L; Lamme, Victor A F
2013-08-01
Fragile visual short-term memory (FM) is a recently discovered form of visual short-term memory. Evidence suggests that it provides rich and high-capacity storage, like iconic memory, yet it exists, without interference, almost as long as visual working memory. In the present study, we sought to unveil the functional underpinnings of this memory storage. We found that FM is only completely erased when the new visual scene appears at the same location and consists of the same objects as the to-be-recalled information. This result has two important implications: First, it shows that FM is an object- and location-specific store, and second, it suggests that FM might be used in everyday life when the presentation of visual information is appropriately designed.
Katz, Matthew L.; Viney, Tim J.; Nikolic, Konstantin
2016-01-01
Sensory stimuli are encoded by diverse kinds of neurons but the identities of the recorded neurons that are studied are often unknown. We explored in detail the firing patterns of eight previously defined genetically-identified retinal ganglion cell (RGC) types from a single transgenic mouse line. We first introduce a new technique of deriving receptive field vectors (RFVs) which utilises a modified form of mutual information (“Quadratic Mutual Information”). We analysed the firing patterns of RGCs during presentation of short duration (~10 second) complex visual scenes (natural movies). We probed the high dimensional space formed by the visual input for a much smaller dimensional subspace of RFVs that give the most information about the response of each cell. The new technique is very efficient and fast and the derivation of novel types of RFVs formed by the natural scene visual input was possible even with limited numbers of spikes per cell. This approach enabled us to estimate the 'visual memory' of each cell type and the corresponding receptive field area by calculating Mutual Information as a function of the number of frames and radius. Finally, we made predictions of biologically relevant functions based on the RFVs of each cell type. RGC class analysis was complemented with results for the cells’ response to simple visual input in the form of black and white spot stimulation, and their classification on several key physiological metrics. Thus RFVs lead to predictions of biological roles based on limited data and facilitate analysis of sensory-evoked spiking data from defined cell types. PMID:26845435
Category search speeds up face-selective fMRI responses in a non-hierarchical cortical face network.
Jiang, Fang; Badler, Jeremy B; Righi, Giulia; Rossion, Bruno
2015-05-01
The human brain is extremely efficient at detecting faces in complex visual scenes, but the spatio-temporal dynamics of this remarkable ability, and how it is influenced by category-search, remain largely unknown. In the present study, human subjects were shown gradually-emerging images of faces or cars in visual scenes, while neural activity was recorded using functional magnetic resonance imaging (fMRI). Category search was manipulated by the instruction to indicate the presence of either a face or a car, in different blocks, as soon as an exemplar of the target category was detected in the visual scene. The category selectivity of most face-selective areas was enhanced when participants were instructed to report the presence of faces in gradually decreasing noise stimuli. Conversely, the same regions showed much less selectivity when participants were instructed instead to detect cars. When "face" was the target category, the fusiform face area (FFA) showed consistently earlier differentiation of face versus car stimuli than did the "occipital face area" (OFA). When "car" was the target category, only the FFA showed differentiation of face versus car stimuli. These observations provide further challenges for hierarchical models of cortical face processing and show that during gradual revealing of information, selective category-search may decrease the required amount of information, enhancing and speeding up category-selective responses in the human brain. Copyright © 2015 Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Brady, Rachel A.; Batson, Crystal D.; Peters, Brian T.; Mulavara, Ajitkumar P.; Bloomberg, Jacob J.
2010-01-01
We designed a gait training study that presented combinations of visual flow and support surface manipulations to investigate the response of healthy adults to novel discordant sensorimotor conditions. We aimed to determine whether a relationship existed between subjects' visual dependence and their scores on a collective measure of anxiety, cognition, and postural stability in a new discordant environment presented at the conclusion of training (Transfer Test). A treadmill was mounted to a motion base platform positioned 2 m behind a large visual screen. Training consisted of three walking sessions, each within a week of the previous visit, that presented four 5-minute exposures to various combinations of support surface and visual scene manipulations, all lateral sinusoids. The conditions were scene translation only, support surface translation only, simultaneous scene and support surface translations in phase, and simultaneous scene and support surface translations 180 degrees out of phase. During the Transfer Test, the trained participants received a 2-minute novel exposure: a visual sinusoidal roll perturbation, with twice the original flow rate, was superimposed on a sinusoidal support surface roll perturbation that was 90 degrees out of phase with the scene. A high correlation existed between normalized torso translation, measured in the scene-only condition at the first visit, and a combined measure of normalized heart rate, stride frequency, and reaction time at the Transfer Test. Results suggest that visually dependent participants experience decreased postural stability, increased anxiety, and increased reaction times compared to their less visually dependent counterparts when negotiating novel discordant conditions.
A systematic comparison between visual cues for boundary detection.
Mély, David A; Kim, Junkyung; McGill, Mason; Guo, Yuliang; Serre, Thomas
2016-03-01
The detection of object boundaries is a critical first step for many visual processing tasks. Multiple cues (we consider luminance, color, motion and binocular disparity) available in the early visual system may signal object boundaries but little is known about their relative diagnosticity and how to optimally combine them for boundary detection. This study thus aims at understanding how early visual processes inform boundary detection in natural scenes. We collected color binocular video sequences of natural scenes to construct a video database. Each scene was annotated with two full sets of ground-truth contours (one set limited to object boundaries and another set which included all edges). We implemented an integrated computational model of early vision that spans all considered cues, and then assessed their diagnosticity by training machine learning classifiers on individual channels. Color and luminance were found to be most diagnostic while stereo and motion were least. Combining all cues yielded a significant improvement in accuracy beyond that of any cue in isolation. Furthermore, the accuracy of individual cues was found to be a poor predictor of their unique contribution for the combination. This result suggested a complex interaction between cues, which we further quantified using regularization techniques. Our systematic assessment of the accuracy of early vision models for boundary detection together with the resulting annotated video dataset should provide a useful benchmark towards the development of higher-level models of visual processing. Copyright © 2016 Elsevier Ltd. All rights reserved.
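The evaluation strategy described, training a classifier on each cue channel separately and then on all channels together to compare their diagnosticity, can be sketched generically with scikit-learn. The synthetic cue responses below merely stand in for the model's channel outputs; real data would come from the annotated video database.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_pixels = 5000
is_boundary = rng.random(n_pixels) < 0.2                     # ground-truth contour labels

def cue_response(snr):
    """Synthetic cue channel: informative about boundaries, corrupted by noise."""
    return is_boundary.astype(float) * snr + rng.normal(size=n_pixels)

cues = {
    "luminance": cue_response(1.5),
    "color": cue_response(1.2),
    "motion": cue_response(0.5),
    "disparity": cue_response(0.4),
}

# Diagnosticity of each cue in isolation, then of the combination.
for name, channel in cues.items():
    acc = cross_val_score(LogisticRegression(), channel[:, None], is_boundary, cv=5).mean()
    print(f"{name:10s}: {acc:.3f}")

combined = np.stack(list(cues.values()), axis=1)
acc = cross_val_score(LogisticRegression(), combined, is_boundary, cv=5).mean()
print(f"combined  : {acc:.3f}")
```

Comparing the single-cue scores with the combined score gives the kind of "unique contribution" question the abstract raises; the regularization analyses it mentions go further than this sketch.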
Scene-Aware Adaptive Updating for Visual Tracking via Correlation Filters
Zhang, Sirou; Qiao, Xiaoya
2017-01-01
In recent years, visual object tracking has been widely used in military guidance, human-computer interaction, road traffic, scene monitoring and many other fields. Tracking algorithms based on correlation filters have shown good performance in terms of accuracy and tracking speed. However, their performance is not satisfactory in scenes with scale variation, deformation, and occlusion. In this paper, we propose a scene-aware adaptive updating mechanism for visual tracking via a kernel correlation filter (KCF). First, a low-complexity scale estimation method is presented, in which weighted responses at five candidate scales are used to determine the final target scale. Then, an adaptive updating mechanism is presented based on scene classification. We classify the video scenes into four categories by video content analysis. According to the scene category, we adapt the updating of the kernel correlation filter to improve the robustness of the tracker, especially in scenes with scale variation, deformation, and occlusion. We evaluate our tracker on the CVPR2013 benchmark. The experimental results obtained with the proposed algorithm are improved by 33.3%, 15%, 6%, 21.9% and 19.8% compared to those of the KCF tracker on scenes with scale variation, partial or long-time large-area occlusion, deformation, fast motion, and out-of-view targets. PMID:29140311
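The two ingredients named in the abstract, a weighted multi-scale response and a scene-dependent model update, can be sketched independently of the full KCF machinery. Below, `responses` stands in for the correlation-filter response maps at five candidate scales, and the scale factors, weights, scene categories, and learning rates are illustrative assumptions, not the paper's values.

```python
import numpy as np

SCALE_FACTORS = np.array([0.95, 0.975, 1.0, 1.025, 1.05])
SCALE_WEIGHTS = np.array([0.96, 0.98, 1.0, 0.98, 0.96])   # mild preference for no change

# Hypothetical per-scene-category learning rates for the linear model update.
LEARNING_RATE = {"static": 0.02, "deformation": 0.06, "occlusion": 0.0, "fast_motion": 0.10}

def estimate_scale(responses):
    """Pick the target scale from weighted peak responses at each candidate scale."""
    peak_per_scale = np.array([r.max() for r in responses])
    return SCALE_FACTORS[np.argmax(peak_per_scale * SCALE_WEIGHTS)]

def update_model(model, new_model, scene_category):
    """Scene-aware linear interpolation update of the correlation-filter model."""
    eta = LEARNING_RATE[scene_category]
    return (1.0 - eta) * model + eta * new_model

# Usage with stand-in response maps and filter coefficients.
rng = np.random.default_rng(0)
responses = [rng.random((40, 40)) for _ in SCALE_FACTORS]
scale = estimate_scale(responses)
model = update_model(model=rng.normal(size=(40, 40)),
                     new_model=rng.normal(size=(40, 40)),
                     scene_category="deformation")
```

Freezing the update under occlusion (eta = 0) and speeding it up under fast motion captures the intuition behind scene-aware updating: the filter should only learn from frames in which the target is actually visible and changing.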
Visual search for arbitrary objects in real scenes
Alvarez, George A.; Rosenholtz, Ruth; Kuzmova, Yoana I.; Sherman, Ashley M.
2011-01-01
How efficient is visual search in real scenes? In searches for targets among arrays of randomly placed distractors, efficiency is often indexed by the slope of the reaction time (RT) × Set Size function. However, it may be impossible to define set size for real scenes. As an approximation, we hand-labeled 100 indoor scenes and used the number of labeled regions as a surrogate for set size. In Experiment 1, observers searched for named objects (a chair, bowl, etc.). With set size defined as the number of labeled regions, search was very efficient (~5 ms/item). When we controlled for a possible guessing strategy in Experiment 2, slopes increased somewhat (~15 ms/item), but they were much shallower than search for a random object among other distinctive objects outside of a scene setting (Exp. 3: ~40 ms/item). In Experiments 4–6, observers searched repeatedly through the same scene for different objects. Increased familiarity with scenes had modest effects on RTs, while repetition of target items had large effects (>500 ms). We propose that visual search in scenes is efficient because scene-specific forms of attentional guidance can eliminate most regions from the “functional set size” of items that could possibly be the target. PMID:21671156
Visual search for arbitrary objects in real scenes.
Wolfe, Jeremy M; Alvarez, George A; Rosenholtz, Ruth; Kuzmova, Yoana I; Sherman, Ashley M
2011-08-01
How efficient is visual search in real scenes? In searches for targets among arrays of randomly placed distractors, efficiency is often indexed by the slope of the reaction time (RT) × Set Size function. However, it may be impossible to define set size for real scenes. As an approximation, we hand-labeled 100 indoor scenes and used the number of labeled regions as a surrogate for set size. In Experiment 1, observers searched for named objects (a chair, bowl, etc.). With set size defined as the number of labeled regions, search was very efficient (~5 ms/item). When we controlled for a possible guessing strategy in Experiment 2, slopes increased somewhat (~15 ms/item), but they were much shallower than search for a random object among other distinctive objects outside of a scene setting (Exp. 3: ~40 ms/item). In Experiments 4-6, observers searched repeatedly through the same scene for different objects. Increased familiarity with scenes had modest effects on RTs, while repetition of target items had large effects (>500 ms). We propose that visual search in scenes is efficient because scene-specific forms of attentional guidance can eliminate most regions from the "functional set size" of items that could possibly be the target.
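The efficiency index used throughout these experiments, the slope of the reaction time × set size function, is simply the slope of a linear fit of RT against the (surrogate) set size. A quick sketch with made-up values chosen to land near the ~5 ms/item figure quoted above:

```python
import numpy as np
from scipy.stats import linregress

# Surrogate set sizes (number of hand-labeled regions per scene) and mean correct RTs (ms).
# Values are illustrative, not data from the experiments.
set_size = np.array([20, 35, 50, 65, 80, 95, 110])
rt_ms = np.array([620, 690, 760, 850, 930, 990, 1080])

fit = linregress(set_size, rt_ms)
print(f"search slope: {fit.slope:.1f} ms/item, intercept: {fit.intercept:.0f} ms")
```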
ERIC Educational Resources Information Center
Thiessen, Amber; Beukelman, David; Hux, Karen; Longenecker, Maria
2016-01-01
Purpose: The purpose of the study was to compare the visual attention patterns of adults with aphasia and adults without neurological conditions when viewing visual scenes with 2 types of engagement. Method: Eye-tracking technology was used to measure the visual attention patterns of 10 adults with aphasia and 10 adults without neurological…
ERIC Educational Resources Information Center
Wilkinson, Krista M.; Light, Janice
2011-01-01
Purpose: Many individuals with complex communication needs may benefit from visual aided augmentative and alternative communication systems. In visual scene displays (VSDs), language concepts are embedded into a photograph of a naturalistic event. Humans play a central role in communication development and might be important elements in VSDs.…
Delcasso, Sébastien; Huh, Namjung; Byeon, Jung Seop; Lee, Jihyun; Jung, Min Whan; Lee, Inah
2014-11-19
The hippocampus is important for contextual behavior, and the striatum plays key roles in decision making. When studying functional relationships with the hippocampus, prior studies have focused mostly on the dorsolateral striatum (DLS), emphasizing the antagonistic relationships between the hippocampus and DLS in spatial versus response learning. By contrast, the functional relationships between the dorsomedial striatum (DMS) and hippocampus are relatively unknown. The current study reports that lesions to both the hippocampus and DMS profoundly impaired performance of rats in a visual scene-based memory task in which the animals were required to make a choice response by using visual scenes displayed in the background. Analysis of simultaneous recordings of local field potentials revealed that gamma oscillatory power was higher in the DMS, but not in CA1, when the rat performed the task with familiar scenes than with novel ones. In addition, the CA1-DMS networks increased coherence at γ, but not at θ, rhythm as the rat mastered the task. At the single-unit level, the neuronal populations in CA1 and DMS showed differential firing patterns when responses were made using familiar visual scenes compared with novel ones. Such learning-dependent firing patterns were observed earlier in the DMS than in CA1 before the rat made choice responses. The present findings suggest that both the hippocampus and DMS process memory representations for visual scenes in parallel with different time courses and that flexible choice action using background visual scenes requires coordinated operations of the hippocampus and DMS at γ frequencies. Copyright © 2014 the authors 0270-6474/14/3415534-14$15.00/0.
Faces in Context: Does Face Perception Depend on the Orientation of the Visual Scene?
Taubert, Jessica; van Golde, Celine; Verstraten, Frans A J
2016-10-01
The mechanisms held responsible for familiar face recognition are thought to be orientation dependent; inverted faces are more difficult to recognize than their upright counterparts. Although this effect of inversion has been investigated extensively, researchers have typically sliced faces from photographs and presented them in isolation. As such, it is not known whether the perceived orientation of a face is inherited from the visual scene in which it appears. Here, we address this question by measuring performance in a simultaneous same-different task while manipulating both the orientation of the faces and the scene. We found that the face inversion effect survived scene inversion. Nonetheless, an improvement in performance when the scene was upside down suggests that sensitivity to identity increased when the faces were more easily segmented from the scene. Thus, while these data identify congruency with the visual environment as a contributing factor in recognition performance, they imply different mechanisms operate on upright and inverted faces. © The Author(s) 2016.
Hybrid Reality Lab Capabilities - Video 2
NASA Technical Reports Server (NTRS)
Delgado, Francisco J.; Noyes, Matthew
2016-01-01
Our Hybrid Reality and Advanced Operations Lab is developing incredibly realistic and immersive systems that could be used to provide training, support engineering analysis, and augment data collection for various human performance metrics at NASA. To get a better understanding of what Hybrid Reality is, let's go through the two most commonly known types of immersive realities: Virtual Reality and Augmented Reality. Virtual Reality creates immersive scenes that are completely made up of digital information. This technology has been used to train astronauts at NASA, during teleoperation of remote assets (arms, rovers, robots, etc.), and in other activities. One challenge with Virtual Reality is that if you are using it for real-time applications (like landing an airplane), the information used to create the virtual scenes can be old (i.e., visualized long after physical objects moved in the scene) and not accurate enough to land the airplane safely. This is where Augmented Reality comes in. Augmented Reality takes real-time environment information (from a camera or a see-through window) and places digitally created information into the scene so that it matches the video/glass information. Augmented Reality thus enhances real environment information collected with a live sensor or viewport (e.g., camera, window, etc.) with the information-rich visualization provided by Virtual Reality. Hybrid Reality takes Augmented Reality even further, creating a higher level of immersion where interactivity can take place. Hybrid Reality takes Virtual Reality objects and a trackable, physical representation of those objects, places them in the same coordinate system, and allows people to interact with both representations (virtual and physical) simultaneously. After a short period of adjustment, individuals begin to interact with all the objects in the scene as if they were real-life objects. The ability to physically touch and interact with digitally created objects that share the shape, size, and location of their physical counterparts in the virtual reality environment can be a game changer for training, planning, engineering analysis, science, entertainment, and more. Our project is developing such capabilities for various types of environments. The video accompanying this abstract is a representation of an ISS Hybrid Reality experience. In the video you can see various Hybrid Reality elements that provide immersion beyond standard Virtual Reality or Augmented Reality.
Average Orientation Is More Accessible through Object Boundaries than Surface Features
ERIC Educational Resources Information Center
Choo, Heeyoung; Levinthal, Brian R.; Franconeri, Steven L.
2012-01-01
In a glance, the visual system can provide a summary of some kinds of information about objects in a scene. We explore how summary information about "orientation" is extracted and find that some representations of orientation are privileged over others. Participants judged the average orientation of either a set of 6 bars or 6 circular…
Illumination discrimination in real and simulated scenes
Radonjić, Ana; Pearce, Bradley; Aston, Stacey; Krieger, Avery; Dubin, Hilary; Cottaris, Nicolas P.; Brainard, David H.; Hurlbert, Anya C.
2016-01-01
Characterizing humans' ability to discriminate changes in illumination provides information about the visual system's representation of the distal stimulus. We have previously shown that humans are able to discriminate illumination changes and that sensitivity to such changes depends on their chromatic direction. Probing illumination discrimination further would be facilitated by the use of computer-graphics simulations, which would, in practice, enable a wider range of stimulus manipulations. There is no a priori guarantee, however, that results obtained with simulated scenes generalize to real illuminated scenes. To investigate this question, we measured illumination discrimination in real and simulated scenes that were well-matched in mean chromaticity and scene geometry. Illumination discrimination thresholds were essentially identical for the two stimulus types. As in our previous work, these thresholds varied with illumination change direction. We exploited the flexibility offered by the use of graphics simulations to investigate whether the differences across direction are preserved when the surfaces in the scene are varied. We show that varying the scene's surface ensemble in a manner that also changes mean scene chromaticity modulates the relative sensitivity to illumination changes along different chromatic directions. Thus, any characterization of sensitivity to changes in illumination must be defined relative to the set of surfaces in the scene. PMID:28558392
Encodings of implied motion for animate and inanimate object categories in the two visual pathways.
Lu, Zhengang; Li, Xueting; Meng, Ming
2016-01-15
Previous research has proposed two separate pathways for visual processing: the dorsal pathway for "where" information vs. the ventral pathway for "what" information. Interestingly, the middle temporal cortex (MT) in the dorsal pathway is involved in representing implied motion from still pictures, suggesting an interaction between motion and object related processing. However, the relationship between how the brain encodes implied motion and how the brain encodes object/scene categories is unclear. To address this question, fMRI was used to measure activity along the two pathways corresponding to different animate and inanimate categories of still pictures with different levels of implied motion speed. In the visual areas of both pathways, activity induced by pictures of humans and animals was hardly modulated by the implied motion speed. By contrast, activity in these areas correlated with the implied motion speed for pictures of inanimate objects and scenes. The interaction between implied motion speed and stimuli category was significant, suggesting different encoding mechanisms of implied motion for animate-inanimate distinction. Further multivariate pattern analysis of activity in the dorsal pathway revealed significant effects of stimulus category that are comparable to the ventral pathway. Moreover, still pictures of inanimate objects/scenes with higher implied motion speed evoked activation patterns that were difficult to differentiate from those evoked by pictures of humans and animals, indicating a functional role of implied motion in the representation of object categories. These results provide novel evidence to support integrated encoding of motion and object categories, suggesting a rethink of the relationship between the two visual pathways. Copyright © 2015 Elsevier Inc. All rights reserved.
Idiosyncratic characteristics of saccadic eye movements when viewing different visual environments.
Andrews, T J; Coppola, D M
1999-08-01
Eye position was recorded in different viewing conditions to assess whether the temporal and spatial characteristics of saccadic eye movements in different individuals are idiosyncratic. Our aim was to determine the degree to which oculomotor control is based on endogenous factors. A total of 15 naive subjects viewed five visual environments: (1) The absence of visual stimulation (i.e. a dark room); (2) a repetitive visual environment (i.e. simple textured patterns); (3) a complex natural scene; (4) a visual search task; and (5) reading text. Although differences in visual environment had significant effects on eye movements, idiosyncrasies were also apparent. For example, the mean fixation duration and size of an individual's saccadic eye movements when passively viewing a complex natural scene covaried significantly with those same parameters in the absence of visual stimulation and in a repetitive visual environment. In contrast, an individual's spatio-temporal characteristics of eye movements during active tasks such as reading text or visual search covaried together, but did not correlate with the pattern of eye movements detected when viewing a natural scene, simple patterns or in the dark. These idiosyncratic patterns of eye movements in normal viewing reveal an endogenous influence on oculomotor control. The independent covariance of eye movements during different visual tasks shows that saccadic eye movements during active tasks like reading or visual search differ from those engaged during the passive inspection of visual scenes.
The use of visual cues in gravity judgements on parabolic motion.
Jörges, Björn; Hagenfeld, Lena; López-Moliner, Joan
2018-06-21
Evidence suggests that humans rely on an earth gravity prior for sensory-motor tasks like catching or reaching. Even under earth-discrepant conditions, this prior biases perception and action towards assuming a gravitational downwards acceleration of 9.81 m/s². This can be particularly detrimental in interactions with virtual environments employing earth-discrepant gravity conditions for their visual presentation. The present study thus investigates how well humans discriminate visually presented gravities and which cues they use to extract gravity from the visual scene. To this end, we employed a Two-Interval Forced-Choice Design. In Experiment 1, participants had to judge which of two presented parabolas had the higher underlying gravity. We used two initial vertical velocities, two horizontal velocities and a constant target size. Experiment 2 added a manipulation of the reliability of the target size. Experiment 1 shows that participants have generally high discrimination thresholds for visually presented gravities, with Weber fractions of 13% to beyond 30%. We identified the rate of change of the elevation angle (ẏ) and the visual angle (θ) as major cues. Experiment 2 suggests furthermore that size variability has a small influence on discrimination thresholds, while at the same time larger size variability increases reliance on ẏ and decreases reliance on θ. All in all, even though we use all available information, humans display low precision when extracting the governing gravity from a visual scene, which might further impact our capabilities of adapting to earth-discrepant gravity conditions with visual information alone. Copyright © 2018. Published by Elsevier Ltd.
Steady-state visual evoked potentials as a research tool in social affective neuroscience
Wieser, Matthias J.; Miskovic, Vladimir; Keil, Andreas
2017-01-01
Like many other primates, humans place a high premium on social information transmission and processing. One important aspect of this information concerns the emotional state of other individuals, conveyed by distinct visual cues such as facial expressions, overt actions, or by cues extracted from the situational context. A rich body of theoretical and empirical work has demonstrated that these socio-emotional cues are processed by the human visual system in a prioritized fashion, in the service of optimizing social behavior. Furthermore, socio-emotional perception is highly dependent on situational contexts and previous experience. Here, we review current issues in this area of research and discuss the utility of the steady-state visual evoked potential (ssVEP) technique for addressing key empirical questions. Methodological advantages and caveats are discussed with particular regard to quantifying time-varying competition among multiple perceptual objects, trial-by-trial analysis of visual cortical activation, functional connectivity, and the control of low-level stimulus features. Studies on facial expression and emotional scene processing are summarized, with an emphasis on viewing faces and other social cues in emotional contexts, or when competing with each other. Further, because the ssVEP technique can be readily accommodated to studying the viewing of complex scenes with multiple elements, it enables researchers to advance theoretical models of socio-emotional perception, based on complex, quasi-naturalistic viewing situations. PMID:27699794
Dillon, Moira R.; Spelke, Elizabeth S.
2015-01-01
Research on animals, infants, children, and adults provides evidence that distinct cognitive systems underlie navigation and object recognition. Here we examine whether and how these systems interact when children interpret 2D edge-based perspectival line drawings of scenes and objects. Such drawings serve as symbols early in development, and they preserve scene and object geometry from canonical points of view. Young children show limits when using geometry both in non-symbolic tasks and in symbolic map tasks that present 3D contexts from unusual, unfamiliar points of view. When presented with the familiar viewpoints in perspectival line drawings, however, do children engage more integrated geometric representations? In three experiments, children successfully interpreted line drawings with respect to their depicted scene or object. Nevertheless, children recruited distinct processes when navigating based on the information in these drawings, and these processes depended on the context in which the drawings were presented. These results suggest that children are flexible but limited in using geometric information to form integrated representations of scenes and objects, even when interpreting spatial symbols that are highly familiar and faithful renditions of the visual world. PMID:25441089
Modality-independent coding of spatial layout in the human brain
Wolbers, Thomas; Klatzky, Roberta L.; Loomis, Jack M.; Wutte, Magdalena G.; Giudice, Nicholas A.
2011-01-01
Summary In many non-human species, neural computations of navigational information such as position and orientation are not tied to a specific sensory modality [1, 2]. Rather, spatial signals are integrated from multiple input sources, likely leading to abstract representations of space. In contrast, the potential for abstract spatial representations in humans is not known, as most neuroscientific experiments on human navigation have focused exclusively on visual cues. Here, we tested the modality independence hypothesis with two fMRI experiments that characterized computations in regions implicated in processing spatial layout [3]. According to the hypothesis, such regions should be recruited for spatial computation of 3-D geometric configuration, independent of a specific sensory modality. In support of this view, sighted participants showed strong activation of the parahippocampal place area (PPA) and the retrosplenial cortex (RSC) for visual and haptic exploration of information-matched scenes but not objects. Functional connectivity analyses suggested that these effects were not related to visual recoding, which was further supported by a similar preference for haptic scenes found with blind participants. Taken together, these findings establish the PPA/RSC network as critical in modality-independent spatial computations and provide important evidence for a theory of high-level abstract spatial information processing in the human brain. PMID:21620708
Portelli, Geoffrey; Barrett, John M; Hilgen, Gerrit; Masquelier, Timothée; Maccione, Alessandro; Di Marco, Stefano; Berdondini, Luca; Kornprobst, Pierre; Sernagor, Evelyne
2016-01-01
How a population of retinal ganglion cells (RGCs) encodes the visual scene remains an open question. Going beyond individual RGC coding strategies, results in salamander suggest that the relative latencies of a RGC pair encode spatial information. Thus, a population code based on this concerted spiking could be a powerful mechanism to transmit visual information rapidly and efficiently. Here, we tested this hypothesis in mouse by recording simultaneous light-evoked responses from hundreds of RGCs, at pan-retinal level, using a new generation of large-scale, high-density multielectrode array consisting of 4096 electrodes. Interestingly, we did not find any RGCs exhibiting a clear latency tuning to the stimuli, suggesting that in mouse, individual RGC pairs may not provide sufficient information. We show that a significant amount of information is encoded synergistically in the concerted spiking of large RGC populations. Thus, the RGC population response described with relative activities, or ranks, provides more relevant information than classical independent spike count- or latency- based codes. In particular, we report for the first time that when considering the relative activities across the whole population, the wave of first stimulus-evoked spikes is an accurate indicator of stimulus content. We show that this coding strategy coexists with classical neural codes, and that it is more efficient and faster. Overall, these novel observations suggest that already at the level of the retina, concerted spiking provides a reliable and fast strategy to rapidly transmit new visual scenes.
Four types of ensemble coding in data visualizations.
Szafir, Danielle Albers; Haroz, Steve; Gleicher, Michael; Franconeri, Steven
2016-01-01
Ensemble coding supports rapid extraction of visual statistics about distributed visual information. Researchers typically study this ability with the goal of drawing conclusions about how such coding extracts information from natural scenes. Here we argue that a second domain can serve as another strong inspiration for understanding ensemble coding: graphs, maps, and other visual presentations of data. Data visualizations allow observers to leverage their ability to perform visual ensemble statistics on distributions of spatial or featural visual information to estimate actual statistics on data. We survey the types of visual statistical tasks that occur within data visualizations across everyday examples, such as scatterplots, and more specialized images, such as weather maps or depictions of patterns in text. We divide these tasks into four categories: identification of sets of values, summarization across those values, segmentation of collections, and estimation of structure. We point to unanswered questions for each category and give examples of such cross-pollination in the current literature. Increased collaboration between the data visualization and perceptual psychology research communities can inspire new solutions to challenges in visualization while simultaneously exposing unsolved problems in perception research.
The genesis of errors in drawing.
Chamberlain, Rebecca; Wagemans, Johan
2016-06-01
The difficulty adults find in drawing objects or scenes from real life is puzzling, assuming that there are few gross individual differences in the phenomenology of visual scenes and in fine motor control in the neurologically healthy population. A review of research concerning the perceptual, motoric and memorial correlates of drawing ability was conducted in order to understand why most adults err when trying to produce faithful representations of objects and scenes. The findings reveal that accurate perception of the subject and of the drawing is at the heart of drawing proficiency, although not to the extent that drawing skill elicits fundamental changes in visual perception. Instead, the decisive role of representational decisions reveals the importance of appropriate segmentation of the visual scene and of the influence of pictorial schemas. This leads to the conclusion that domain-specific, flexible, top-down control of visual attention plays a critical role in development of skill in visual art and may also be a window into creative thinking. Copyright © 2016 Elsevier Ltd. All rights reserved.
An infrared/video fusion system for military robotics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Davis, A.W.; Roberts, R.S.
1997-08-05
Sensory information is critical to the telerobotic operation of mobile robots. In particular, visual sensors are a key component of the sensor package on a robot engaged in urban military operations. Visual sensors provide the robot operator with a wealth of information including robot navigation and threat assessment. However, simple countermeasures such as darkness, smoke, or blinding by a laser can easily neutralize visual sensors. In order to provide a robust visual sensing system, an infrared sensor is required to augment the primary visual sensor. An infrared sensor can acquire useful imagery in conditions that incapacitate a visual sensor. A simple approach to incorporating an infrared sensor into the visual sensing system is to display two images to the operator: side-by-side visual and infrared images. However, dual images might overwhelm the operator with information and result in degraded robot performance. A better solution is to combine the visual and infrared images into a single image that maximizes scene information. Fusing visual and infrared images into a single image demands balancing the mixture of visual and infrared information. Humans are accustomed to viewing and interpreting visual images. They are not accustomed to viewing or interpreting infrared images. Hence, the infrared image must be used to enhance the visual image, not obfuscate it.
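One simple way to realize the single fused image argued for above is a per-pixel weighted blend in which the infrared channel contributes mainly where the visible channel is dark (darkness, smoke); this is an illustrative sketch under that assumption, not the system documented in the report.

```python
import numpy as np

def fuse_visible_ir(visible, infrared, alpha=0.3):
    """Blend a grayscale visible image with an infrared image (both float arrays in [0, 1]).

    The IR weight grows where the visible image is dark, so infrared content enhances
    the visible image rather than obscuring it.
    """
    visible = visible.astype(np.float32)
    infrared = infrared.astype(np.float32)
    ir_weight = alpha + (1.0 - alpha) * (1.0 - visible)   # more IR where visible is dark
    fused = (1.0 - ir_weight) * visible + ir_weight * infrared
    return np.clip(fused, 0.0, 1.0)

# Toy usage with random images standing in for co-registered sensor frames
vis = np.random.rand(240, 320)
ir = np.random.rand(240, 320)
fused = fuse_visible_ir(vis, ir)
```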
Olivier, Agnès; Faugloire, Elise; Lejeune, Laure; Biau, Sophie; Isableu, Brice
2017-01-01
Maintaining equilibrium while riding a horse is a challenging task that involves complex sensorimotor processes. We evaluated the relative contribution of visual information (static or dynamic) to horseback riders' postural stability (measured from the variability of segment position in space) and the coordination modes they adopted to regulate balance according to their level of expertise. Riders' perceptual typologies and their possible relation to postural stability were also assessed. Our main assumption was that the contribution of visual information to postural control would be reduced among expert riders in favor of vestibular and somesthetic reliance. Twelve Professional riders and 13 Club riders rode an equestrian simulator at a gallop under four visual conditions: (1) with the projection of a simulated scene reproducing what a rider sees in the real context of a ride in an outdoor arena, (2) under stroboscopic illumination, preventing access to dynamic visual cues, (3) in normal lighting but without the projected scene (i.e., without the visual consequences of displacement) and (4) with no visual cues. The variability of the position of the head, upper trunk and lower trunk was measured along the anteroposterior (AP), mediolateral (ML), and vertical (V) axes. We computed discrete relative phase to assess the coordination between pairs of segments in the anteroposterior axis. Visual field dependence-independence was evaluated using the Rod and Frame Test (RFT). The results showed that the Professional riders exhibited greater overall postural stability than the Club riders, revealed mainly in the AP axis. In particular, head variability was lower in the Professional riders than in the Club riders in visually altered conditions, suggesting a greater ability to use vestibular and somesthetic information according to task constraints with expertise. In accordance with this result, RFT perceptual scores revealed that the Professional riders were less dependent on the visual field than were the Club riders. Finally, the Professional riders exhibited specific coordination modes that, unlike the Club riders, departed from pure in-phase and anti-phase patterns and depended on visual conditions. The present findings provide evidence of major differences in the sensorimotor processes contributing to postural control with expertise in horseback riding. PMID:28194100
Anticipation in Real-world Scenes: The Role of Visual Context and Visual Memory
ERIC Educational Resources Information Center
Coco, Moreno I.; Keller, Frank; Malcolm, George L.
2016-01-01
The human sentence processor is able to make rapid predictions about upcoming linguistic input. For example, upon hearing the verb eat, anticipatory eye-movements are launched toward edible objects in a visual scene (Altmann & Kamide, 1999). However, the cognitive mechanisms that underlie anticipation remain to be elucidated in ecologically…
Metabolic Mapping of the Brain's Response to Visual Stimulation: Studies in Humans.
ERIC Educational Resources Information Center
Phelps, Michael E.; Kuhl, David E.
1981-01-01
Studies demonstrate increasing glucose metabolic rates in human primary (PVC) and association (AVC) visual cortex as complexity of visual scenes increase. AVC increased more rapidly with scene complexity than PVC and increased local metabolic activities above control subject with eyes closed; indicates wide range and metabolic reserve of visual…
Coding of navigational affordances in the human visual system
Epstein, Russell A.
2017-01-01
A central component of spatial navigation is determining where one can and cannot go in the immediate environment. We used fMRI to test the hypothesis that the human visual system solves this problem by automatically identifying the navigational affordances of the local scene. Multivoxel pattern analyses showed that a scene-selective region of dorsal occipitoparietal cortex, known as the occipital place area, represents pathways for movement in scenes in a manner that is tolerant to variability in other visual features. These effects were found in two experiments: One using tightly controlled artificial environments as stimuli, the other using a diverse set of complex, natural scenes. A reconstruction analysis demonstrated that the population codes of the occipital place area could be used to predict the affordances of novel scenes. Taken together, these results reveal a previously unknown mechanism for perceiving the affordance structure of navigable space. PMID:28416669
NASA Astrophysics Data System (ADS)
Graham, James; Ternovskiy, Igor V.
2013-06-01
We applied a two-stage unsupervised hierarchical learning system to model complex dynamic surveillance and cyberspace monitoring systems using a non-commercial version of the NeoAxis visualization software. The hierarchical scene learning and recognition approach is based on hierarchical expectation maximization, and was linked to a 3D graphics engine for validation of learning and classification results and understanding of the human–autonomous system relationship. Scene recognition is performed by taking synthetically generated data and feeding it to a dynamic logic algorithm. The algorithm performs hierarchical recognition of the scene by first examining the features of the objects to determine which objects are present, and then determining the scene based on the objects present. This paper presents a framework within which low-level data linked to higher-level visualization can provide support to a human operator and be evaluated in a detailed and systematic way.
What you see is what you expect: rapid scene understanding benefits from prior experience.
Greene, Michelle R; Botros, Abraham P; Beck, Diane M; Fei-Fei, Li
2015-05-01
Although we are able to rapidly understand novel scene images, little is known about the mechanisms that support this ability. Theories of optimal coding assert that prior visual experience can be used to ease the computational burden of visual processing. A consequence of this idea is that more probable visual inputs should be facilitated relative to more unlikely stimuli. In three experiments, we compared the perceptions of highly improbable real-world scenes (e.g., an underwater press conference) with common images matched for visual and semantic features. Although the two groups of images could not be distinguished by their low-level visual features, we found profound deficits related to the improbable images: Observers wrote poorer descriptions of these images (Exp. 1), had difficulties classifying the images as unusual (Exp. 2), and even had lower sensitivity to detect these images in noise than to detect their more probable counterparts (Exp. 3). Taken together, these results place a limit on our abilities for rapid scene perception and suggest that perception is facilitated by prior visual experience.
Eyes Matched to the Prize: The State of Matched Filters in Insect Visual Circuits.
Kohn, Jessica R; Heath, Sarah L; Behnia, Rudy
2018-01-01
Confronted with an ever-changing visual landscape, animals must be able to detect relevant stimuli and translate this information into behavioral output. A visual scene contains an abundance of information: to interpret the entirety of it would be uneconomical. To optimally perform this task, neural mechanisms exist to enhance the detection of important features of the sensory environment while simultaneously filtering out irrelevant information. This can be accomplished by using a circuit design that implements specific "matched filters" that are tuned to relevant stimuli. Following this rule, the well-characterized visual systems of insects have evolved to streamline feature extraction on both a structural and functional level. Here, we review examples of specialized visual microcircuits for vital behaviors across insect species, including feature detection, escape, and estimation of self-motion. Additionally, we discuss how these microcircuits are modulated to weigh relevant input with respect to different internal and behavioral states.
Modes of Visual Recognition and Perceptually Relevant Sketch-based Coding for Images
NASA Technical Reports Server (NTRS)
Jobson, Daniel J.
1991-01-01
A review of visual recognition studies is used to define two levels of information requirements. These two levels are related to two primary subdivisions of the spatial frequency domain of images and reflect two distinct different physical properties of arbitrary scenes. In particular, pathologies in recognition due to cerebral dysfunction point to a more complete split into two major types of processing: high spatial frequency edge based recognition vs. low spatial frequency lightness (and color) based recognition. The former is more central and general while the latter is more specific and is necessary for certain special tasks. The two modes of recognition can also be distinguished on the basis of physical scene properties: the highly localized edges associated with reflectance and sharp topographic transitions vs. smooth topographic undulation. The extreme case of heavily abstracted images is pursued to gain an understanding of the minimal information required to support both modes of recognition. Here the intention is to define the semantic core of transmission. This central core of processing can then be fleshed out with additional image information and coding and rendering techniques.
Speakers of Different Languages Process the Visual World Differently
Chabal, Sarah; Marian, Viorica
2015-01-01
Language and vision are highly interactive. Here we show that people activate language when they perceive the visual world, and that this language information impacts how speakers of different languages focus their attention. For example, when searching for an item (e.g., clock) in the same visual display, English and Spanish speakers look at different objects. Whereas English speakers searching for the clock also look at a cloud, Spanish speakers searching for the clock also look at a gift, because the Spanish names for gift (regalo) and clock (reloj) overlap phonologically. These different looking patterns emerge despite an absence of direct linguistic input, showing that language is automatically activated by visual scene processing. We conclude that the varying linguistic information available to speakers of different languages affects visual perception, leading to differences in how the visual world is processed. PMID:26030171
Iconic memory for the gist of natural scenes.
Clarke, Jason; Mack, Arien
2014-11-01
Does iconic memory contain the gist of multiple scenes? Three experiments were conducted. In the first, four scenes from different basic-level categories were briefly presented in one of two conditions: a cue or a no-cue condition. The cue condition was designed to provide an index of the contents of iconic memory of the display. Subjects were more sensitive to scene gist in the cue condition than in the no-cue condition. In the second, the scenes came from the same basic-level category. We found no difference in sensitivity between the two conditions. In the third, six scenes from different basic level categories were presented in the visual periphery. Subjects were more sensitive to scene gist in the cue condition. These results suggest that scene gist is contained in iconic memory even in the visual periphery; however, iconic representations are not sufficiently detailed to distinguish between scenes coming from the same category. Copyright © 2014 Elsevier Inc. All rights reserved.
Lee, Sungkyoung; Cappella, Joseph N.
2014-01-01
Findings from previous studies on smoking cues and argument strength in antismoking messages have shown that the presence of smoking cues undermines the persuasiveness of antismoking public service announcements (PSAs) with weak arguments. This study conceptualized smoking cues (i.e., scenes showing smoking-related objects and behaviors) as stimuli motivationally relevant to the former smoker population and examined how smoking cues influence former smokers’ processing of antismoking PSAs. Specifically, by defining smoking cues and the strength of antismoking arguments in terms of resource allocation, this study examined former smokers’ recognition accuracy, memory strength, and memory judgment of visual (i.e., scenes excluding smoking cues) and audio information from antismoking PSAs. In line with previous findings, the results of the study showed that the presence of smoking cues undermined former smokers’ encoding of antismoking arguments, which includes the visual and audio information that compose the main content of antismoking messages. PMID:25477766
Dimensionality of visual complexity in computer graphics scenes
NASA Astrophysics Data System (ADS)
Ramanarayanan, Ganesh; Bala, Kavita; Ferwerda, James A.; Walter, Bruce
2008-02-01
How do human observers perceive visual complexity in images? This problem is especially relevant for computer graphics, where a better understanding of visual complexity can aid in the development of more advanced rendering algorithms. In this paper, we describe a study of the dimensionality of visual complexity in computer graphics scenes. We conducted an experiment in which subjects judged the relative complexity of 21 high-resolution scenes rendered with photorealistic methods. Scenes were gathered from web archives and varied in theme, number and layout of objects, material properties, and lighting. We analyzed the pooled subject responses using multidimensional scaling. This analysis embedded the stimulus images in a two-dimensional space, with axes that roughly corresponded to "numerosity" and "material/lighting complexity". In a follow-up analysis, we derived a one-dimensional complexity ordering of the stimulus images. We compared this ordering with several computable complexity metrics, such as scene polygon count and JPEG compression size, and found that they were not strongly correlated with the perceptual ordering. Understanding the differences between these measures can lead to the design of more efficient rendering algorithms in computer graphics.
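The comparison between the perceptual complexity ordering and computable metrics such as JPEG compression size can be reproduced in outline as follows. This is a hedged sketch assuming Pillow, NumPy, and SciPy; the file names and rank values are placeholders, not the study's stimuli or data.

```python
import io
import numpy as np
from PIL import Image
from scipy.stats import spearmanr

def jpeg_size_metric(path, quality=75):
    """Bytes needed to JPEG-encode the image: a crude computable proxy for visual complexity."""
    buf = io.BytesIO()
    Image.open(path).convert("RGB").save(buf, format="JPEG", quality=quality)
    return buf.tell()

# Placeholder stimulus files and a hypothetical one-dimensional complexity ordering (ranks)
scene_files = ["scene01.png", "scene02.png", "scene03.png"]
perceived_rank = [2, 1, 3]

metric = [jpeg_size_metric(f) for f in scene_files]
rho, p = spearmanr(perceived_rank, metric)
print(f"rank correlation with JPEG size: rho = {rho:.2f} (p = {p:.3f})")
```

A low rank correlation would echo the paper's conclusion that such image-level metrics do not capture perceived complexity well.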
Intensity dependent spread theory
NASA Technical Reports Server (NTRS)
Holben, Richard
1990-01-01
The Intensity Dependent Spread (IDS) procedure is an image-processing technique based on a model of the processing which occurs in the human visual system. IDS processing is relevant to many aspects of machine vision and image processing. For quantum limited images, it produces an ideal trade-off between spatial resolution and noise averaging, performs edge enhancement thus requiring only mean-crossing detection for the subsequent extraction of scene edges, and yields edge responses whose amplitudes are independent of scene illumination, depending only upon the ratio of the reflectance on the two sides of the edge. These properties suggest that the IDS process may provide significant bandwidth reduction while losing only minimal scene information when used as a preprocessor at or near the image plane.
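The abstract does not spell out the IDS operator itself, but its illumination-invariance property (edge responses that depend only on the reflectance ratio across an edge) can be illustrated with a different, simpler operator that shares it: a difference-of-Gaussians applied to log intensity. The sketch below is that substitute illustration, not the IDS algorithm.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def log_center_surround(image, sigma_center=1.0, sigma_surround=4.0):
    """Center-surround (DoG) filtering of log intensity.

    Because log(a*I) - log(a*J) = log(I) - log(J), the response at an edge depends only
    on the ratio of intensities on its two sides, not on overall illumination -- the same
    invariance attributed to IDS above. Zero crossings of the output mark edge locations.
    """
    log_im = np.log(image.astype(np.float64) + 1e-6)   # small offset avoids log(0)
    return gaussian_filter(log_im, sigma_center) - gaussian_filter(log_im, sigma_surround)
```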
The visual light field in real scenes
Xia, Ling; Pont, Sylvia C.; Heynderickx, Ingrid
2014-01-01
Human observers' ability to infer the light field in empty space is known as the “visual light field.” While most relevant studies were performed using images on computer screens, we investigate the visual light field in a real scene by using a novel experimental setup. A “probe” and a scene were mixed optically using a semitransparent mirror. Twenty participants were asked to judge whether the probe fitted the scene with regard to the illumination intensity, direction, and diffuseness. Both smooth and rough probes were used to test whether observers use the additional cues for the illumination direction and diffuseness provided by the 3D texture over the rough probe. The results confirmed that observers are sensitive to the intensity, direction, and diffuseness of the illumination also in real scenes. For some lighting combinations on scene and probe, the awareness of a mismatch between the probe and scene was found to depend on which lighting condition was on the scene and which on the probe, which we called the “swap effect.” For these cases, the observers judged the fit to be better if the average luminance of the visible parts of the probe was closer to the average luminance of the visible parts of the scene objects. The use of a rough instead of smooth probe was found to significantly improve observers' abilities to detect mismatches in lighting diffuseness and directions. PMID:25926970
Deep Neural Network for Structural Prediction and Lane Detection in Traffic Scene.
Li, Jun; Mei, Xue; Prokhorov, Danil; Tao, Dacheng
2017-03-01
Hierarchical neural networks have been shown to be effective in learning representative image features and recognizing object classes. However, most existing networks combine the low/middle level cues for classification without accounting for any spatial structures. For applications such as understanding a scene, how the visual cues are spatially distributed in an image becomes essential for successful analysis. This paper extends the framework of deep neural networks by accounting for the structural cues in the visual signals. In particular, two kinds of neural networks have been proposed. First, we develop a multitask deep convolutional network, which simultaneously detects the presence of the target and the geometric attributes (location and orientation) of the target with respect to the region of interest. Second, a recurrent neuron layer is adopted for structured visual detection. The recurrent neurons can deal with the spatial distribution of visible cues belonging to an object whose shape or structure is difficult to explicitly define. Both the networks are demonstrated by the practical task of detecting lane boundaries in traffic scenes. The multitask convolutional neural network provides auxiliary geometric information to help the subsequent modeling of the given lane structures. The recurrent neural network automatically detects lane boundaries, including those areas containing no marks, without any explicit prior knowledge or secondary modeling.
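A minimal PyTorch-style sketch of the multitask idea described above: a shared convolutional backbone with one head classifying target presence and one regressing the target's location and orientation. Layer sizes, the loss weighting, and all names are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class MultiTaskLaneNet(nn.Module):
    """Shared conv features feeding a presence (classification) head and a geometry (regression) head."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.presence_head = nn.Linear(32 * 4 * 4, 2)   # lane mark present / absent in the region
        self.geometry_head = nn.Linear(32 * 4 * 4, 3)   # e.g., x offset, y offset, orientation

    def forward(self, x):
        feats = self.backbone(x).flatten(1)
        return self.presence_head(feats), self.geometry_head(feats)

# Joint training step on a dummy batch (the 0.5 loss weighting is a placeholder)
model = MultiTaskLaneNet()
images = torch.randn(8, 3, 64, 64)
presence_logits, geometry = model(images)
loss = nn.CrossEntropyLoss()(presence_logits, torch.randint(0, 2, (8,))) \
       + 0.5 * nn.MSELoss()(geometry, torch.zeros(8, 3))
loss.backward()
```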
Exploring eye movements in patients with glaucoma when viewing a driving scene.
Crabb, David P; Smith, Nicholas D; Rauscher, Franziska G; Chisholm, Catharine M; Barbur, John L; Edgar, David F; Garway-Heath, David F
2010-03-16
Glaucoma is a progressive eye disease and a leading cause of visual disability. Automated assessment of the visual field determines the different stages in the disease process: it would be desirable to link these measurements taken in the clinic with patient's actual function, or establish if patients compensate for their restricted field of view when performing everyday tasks. Hence, this study investigated eye movements in glaucomatous patients when viewing driving scenes in a hazard perception test (HPT). The HPT is a component of the UK driving licence test consisting of a series of short film clips of various traffic scenes viewed from the driver's perspective each containing hazardous situations that require the camera car to change direction or slow down. Data from nine glaucomatous patients with binocular visual field defects and ten age-matched control subjects were considered (all experienced drivers). Each subject viewed 26 different films with eye movements simultaneously monitored by an eye tracker. Computer software was purpose written to pre-process the data, co-register it to the film clips and to quantify eye movements and point-of-regard (using a dynamic bivariate contour ellipse analysis). On average, and across all HPT films, patients exhibited different eye movement characteristics to controls making, for example, significantly more saccades (P<0.001; 95% confidence interval for mean increase: 9.2 to 22.4%). Whilst the average region of 'point-of-regard' of the patients did not differ significantly from the controls, there were revealing cases where patients failed to see a hazard in relation to their binocular visual field defect. Characteristics of eye movement patterns in patients with bilateral glaucoma can differ significantly from age-matched controls when viewing a traffic scene. Further studies of eye movements made by glaucomatous patients could provide useful information about the definition of the visual field component required for fitness to drive.
Exploring Eye Movements in Patients with Glaucoma When Viewing a Driving Scene
Crabb, David P.; Smith, Nicholas D.; Rauscher, Franziska G.; Chisholm, Catharine M.; Barbur, John L.; Edgar, David F.; Garway-Heath, David F.
2010-01-01
Background Glaucoma is a progressive eye disease and a leading cause of visual disability. Automated assessment of the visual field determines the different stages in the disease process: it would be desirable to link these measurements taken in the clinic with patient's actual function, or establish if patients compensate for their restricted field of view when performing everyday tasks. Hence, this study investigated eye movements in glaucomatous patients when viewing driving scenes in a hazard perception test (HPT). Methodology/Principal Findings The HPT is a component of the UK driving licence test consisting of a series of short film clips of various traffic scenes viewed from the driver's perspective each containing hazardous situations that require the camera car to change direction or slow down. Data from nine glaucomatous patients with binocular visual field defects and ten age-matched control subjects were considered (all experienced drivers). Each subject viewed 26 different films with eye movements simultaneously monitored by an eye tracker. Computer software was purpose written to pre-process the data, co-register it to the film clips and to quantify eye movements and point-of-regard (using a dynamic bivariate contour ellipse analysis). On average, and across all HPT films, patients exhibited different eye movement characteristics to controls making, for example, significantly more saccades (P<0.001; 95% confidence interval for mean increase: 9.2 to 22.4%). Whilst the average region of ‘point-of-regard’ of the patients did not differ significantly from the controls, there were revealing cases where patients failed to see a hazard in relation to their binocular visual field defect. Conclusions/Significance Characteristics of eye movement patterns in patients with bilateral glaucoma can differ significantly from age-matched controls when viewing a traffic scene. Further studies of eye movements made by glaucomatous patients could provide useful information about the definition of the visual field component required for fitness to drive. PMID:20300522
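The "bivariate contour ellipse analysis" mentioned in both entries above summarizes the spread of point-of-regard as the area of an ellipse covering a chosen proportion of gaze samples. Below is a sketch of the static computation under a bivariate-normal assumption; the 68% coverage level is an assumption, not stated in the abstract.

```python
import numpy as np

def bcea(gaze_x, gaze_y, coverage=0.68):
    """Bivariate contour ellipse area for gaze samples, assuming bivariate-normal scatter."""
    sx, sy = np.std(gaze_x, ddof=1), np.std(gaze_y, ddof=1)
    rho = np.corrcoef(gaze_x, gaze_y)[0, 1]
    chi2 = -2.0 * np.log(1.0 - coverage)      # chi-square quantile with 2 degrees of freedom
    return np.pi * chi2 * sx * sy * np.sqrt(1.0 - rho ** 2)

# Example with synthetic gaze positions (degrees of visual angle)
rng = np.random.default_rng(0)
x, y = rng.normal(0, 1.2, 500), rng.normal(0, 0.8, 500)
print(f"BCEA ~ {bcea(x, y):.2f} deg^2")
```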
Slow changing postural cues cancel visual field dependence on self-tilt detection.
Scotto Di Cesare, C; Macaluso, T; Mestre, D R; Bringoux, L
2015-01-01
Interindividual differences influence the multisensory integration process involved in spatial perception. Here, we assessed the effect of visual field dependence on self-tilt detection relative to upright, as a function of static vs. slow changing visual or postural cues. To that aim, we manipulated slow rotations (i.e., 0.05° s⁻¹) of the body and/or the visual scene in pitch. Participants had to indicate whether they felt being tilted forward at successive angles. Results show that thresholds for self-tilt detection substantially differed between visual field dependent/independent subjects, when only the visual scene was rotated. This difference was no longer present when the body was actually rotated, whatever the visual scene condition (i.e., absent, static or rotated relative to the observer). These results suggest that the cancellation of visual field dependence by dynamic postural cues may rely on a multisensory reweighting process, where slow changing vestibular/somatosensory inputs may prevail over visual inputs. Copyright © 2014 Elsevier B.V. All rights reserved.
The Neural Dynamics of Attentional Selection in Natural Scenes.
Kaiser, Daniel; Oosterhof, Nikolaas N; Peelen, Marius V
2016-10-12
The human visual system can only represent a small subset of the many objects present in cluttered scenes at any given time, such that objects compete for representation. Despite these processing limitations, the detection of object categories in cluttered natural scenes is remarkably rapid. How does the brain efficiently select goal-relevant objects from cluttered scenes? In the present study, we used multivariate decoding of magneto-encephalography (MEG) data to track the neural representation of within-scene objects as a function of top-down attentional set. Participants detected categorical targets (cars or people) in natural scenes. The presence of these categories within a scene was decoded from MEG sensor patterns by training linear classifiers on differentiating cars and people in isolation and testing these classifiers on scenes containing one of the two categories. The presence of a specific category in a scene could be reliably decoded from MEG response patterns as early as 160 ms, despite substantial scene clutter and variation in the visual appearance of each category. Strikingly, we find that these early categorical representations fully depend on the match between visual input and top-down attentional set: only objects that matched the current attentional set were processed to the category level within the first 200 ms after scene onset. A sensor-space searchlight analysis revealed that this early attention bias was localized to lateral occipitotemporal cortex, reflecting top-down modulation of visual processing. These results show that attention quickly resolves competition between objects in cluttered natural scenes, allowing for the rapid neural representation of goal-relevant objects. Efficient attentional selection is crucial in many everyday situations. For example, when driving a car, we need to quickly detect obstacles, such as pedestrians crossing the street, while ignoring irrelevant objects. How can humans efficiently perform such tasks, given the multitude of objects contained in real-world scenes? Here we used multivariate decoding of magnetoencephalography data to characterize the neural underpinnings of attentional selection in natural scenes with high temporal precision. We show that brain activity quickly tracks the presence of objects in scenes, but crucially only for those objects that were immediately relevant for the participant. These results provide evidence for fast and efficient attentional selection that mediates the rapid detection of goal-relevant objects in real-world environments. Copyright © 2016 the authors 0270-6474/16/3610522-07$15.00/0.
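The cross-decoding logic described above (train linear classifiers on isolated cars vs. people, then test them on cluttered scenes containing one of the two categories, separately at each time point) can be sketched as follows. The array shapes, random data, and scikit-learn pipeline are illustrative assumptions, not the study's analysis code.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Hypothetical MEG data: trials x sensors x time points
X_isolated = np.random.randn(200, 306, 120)   # responses to isolated cars / people
y_isolated = np.repeat([0, 1], 100)           # 0 = car, 1 = person
X_scenes   = np.random.randn(200, 306, 120)   # responses to scenes containing one category
y_scenes   = np.repeat([0, 1], 100)

accuracy = np.zeros(X_isolated.shape[2])
for t in range(X_isolated.shape[2]):          # train and test at each time point separately
    clf = make_pipeline(StandardScaler(), LinearSVC(dual=False))
    clf.fit(X_isolated[:, :, t], y_isolated)
    accuracy[t] = clf.score(X_scenes[:, :, t], y_scenes)
# The time at which accuracy rises above chance indexes when category information emerges.
```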
Fradcourt, B; Peyrin, C; Baciu, M; Campagne, A
2013-10-01
Previous studies of the visual processing of emotional stimuli have revealed a preference for specific types of visual spatial frequency (high spatial frequency, HSF; low spatial frequency, LSF) according to task demands. The majority of studies used faces and focused on the appraisal of the emotional state of others. The present behavioral study investigates the relative role of spatial frequencies in the processing of emotional natural scenes during two explicit cognitive appraisal tasks: one emotional, based on the self-emotional experience, and one motivational, based on the tendency to action. Our results suggest that HSF information was the most relevant for rapidly identifying the self-emotional experience (unpleasant, pleasant, and neutral), while LSF information was required to rapidly identify the tendency to action (avoidance, approach, and no action). The tendency to action based on LSF analysis showed a priority for unpleasant stimuli, whereas the identification of emotional experience based on HSF analysis showed a priority for pleasant stimuli. The present study confirms the interest of considering both the emotional and motivational characteristics of visual stimuli. Copyright © 2013 Elsevier Inc. All rights reserved.
Correction techniques for depth errors with stereo three-dimensional graphic displays
NASA Technical Reports Server (NTRS)
Parrish, Russell V.; Holden, Anthony; Williams, Steven P.
1992-01-01
Three-dimensional (3-D), 'real-world' pictorial displays that incorporate 'true' depth cues via stereopsis techniques have proved effective for displaying complex information in a natural way to enhance situational awareness and to improve pilot/vehicle performance. In such displays, the display designer must map the depths in the real world to the depths available with the stereo display system. However, empirical data have shown that the human subject does not perceive the information at exactly the depth at which it is mathematically placed. Head movements can also seriously distort the depth information that is embedded in stereo 3-D displays because the transformations used in mapping the visual scene to the depth-viewing volume (DVV) depend intrinsically on the viewer location. The goal of this research was to provide two correction techniques; the first technique corrects the original visual scene to the DVV mapping based on human perception errors, and the second (which is based on head-positioning sensor input data) corrects for errors induced by head movements. Empirical data are presented to validate both correction techniques. A combination of the two correction techniques effectively eliminates the distortions of depth information embedded in stereo 3-D displays.
Scan Patterns Predict Sentence Production in the Cross-Modal Processing of Visual Scenes
ERIC Educational Resources Information Center
Coco, Moreno I.; Keller, Frank
2012-01-01
Most everyday tasks involve multiple modalities, which raises the question of how the processing of these modalities is coordinated by the cognitive system. In this paper, we focus on the coordination of visual attention and linguistic processing during speaking. Previous research has shown that objects in a visual scene are fixated before they…
ERIC Educational Resources Information Center
Rice, Katherine; Moriuchi, Jennifer M.; Jones, Warren; Klin, Ami
2012-01-01
Objective: To examine patterns of variability in social visual engagement and their relationship to standardized measures of social disability in a heterogeneous sample of school-aged children with autism spectrum disorders (ASD). Method: Eye-tracking measures of visual fixation during free-viewing of dynamic social scenes were obtained for 109…
Jolij, Jacob; Scholte, H Steven; van Gaal, Simon; Hodgson, Timothy L; Lamme, Victor A F
2011-12-01
Humans largely guide their behavior by their visual representation of the world. Recent studies have shown that visual information can trigger behavior within 150 msec, suggesting that visually guided responses to external events, in fact, precede conscious awareness of those events. However, is such a view correct? By using a texture discrimination task, we show that the brain relies on long-latency visual processing in order to guide perceptual decisions. Decreasing stimulus saliency leads to selective changes in long-latency visually evoked potential components reflecting scene segmentation. These latency changes are accompanied by almost equal changes in simple RTs and points of subjective simultaneity. Furthermore, we find a strong correlation between individual RTs and the latencies of scene segmentation related components in the visually evoked potentials, showing that the processes underlying these late brain potentials are critical in triggering a response. However, using the same texture stimuli in an antisaccade task, we found that reflexive, but erroneous, prosaccades, but not antisaccades, can be triggered by earlier visual processes. In other words: The brain can act quickly, but decides late. Differences between our study and earlier findings suggesting that action precedes conscious awareness can be explained by assuming that task demands determine whether a fast and unconscious, or a slower and conscious, representation is used to initiate a visually guided response.
Figure-Ground Organization in Visual Cortex for Natural Scenes
2016-01-01
Abstract Figure-ground organization and border-ownership assignment are essential for understanding natural scenes. It has been shown that many neurons in the macaque visual cortex signal border-ownership in displays of simple geometric shapes such as squares, but how well these neurons resolve border-ownership in natural scenes is not known. We studied area V2 neurons in behaving macaques with static images of complex natural scenes. We found that about half of the neurons were border-ownership selective for contours in natural scenes, and this selectivity originated from the image context. The border-ownership signals emerged within 70 ms after stimulus onset, only ∼30 ms after response onset. A substantial fraction of neurons were highly consistent across scenes. Thus, the cortical mechanisms of figure-ground organization are fast and efficient even in images of complex natural scenes. Understanding how the brain performs this task so fast remains a challenge. PMID:28058269
Visual wetness perception based on image color statistics.
Sawayama, Masataka; Adelson, Edward H; Nishida, Shin'ya
2017-05-01
Color vision provides humans and animals with the abilities to discriminate colors based on the wavelength composition of light and to determine the location and identity of objects of interest in cluttered scenes (e.g., ripe fruit among foliage). However, we argue that color vision can inform us about much more than color alone. Since a trichromatic image carries more information about the optical properties of a scene than a monochromatic image does, color can help us recognize complex material qualities. Here we show that human vision uses color statistics of an image for the perception of an ecologically important surface condition (i.e., wetness). Psychophysical experiments showed that overall enhancement of chromatic saturation, combined with a luminance tone change that increases the darkness and glossiness of the image, tended to make dry scenes look wetter. Theoretical analysis along with image analysis of real objects indicated that our image transformation, which we call the wetness enhancing transformation, is consistent with actual optical changes produced by surface wetting. Furthermore, we found that the wetness enhancing transformation operator was more effective for the images with many colors (large hue entropy) than for those with few colors (small hue entropy). The hue entropy may be used to separate surface wetness from other surface states having similar optical properties. While surface wetness and surface color might seem to be independent, there are higher order color statistics that can influence wetness judgments, in accord with the ecological statistics. The present findings indicate that the visual system uses color image statistics in an elegant way to help estimate the complex physical status of a scene.
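A rough sketch of the kind of manipulation the abstract describes, assuming an RGB image with values in [0, 1]: saturation is boosted and the tone curve darkened (a stand-in for the authors' wetness enhancing transformation, with invented parameter values), and hue entropy is computed as the scene statistic reported to modulate the effect.

```python
import numpy as np
from matplotlib.colors import rgb_to_hsv, hsv_to_rgb

def wetness_transform(img_rgb, sat_gain=1.6, tone_gamma=1.8):
    """Boost chromatic saturation and darken the tone curve, a rough stand-in
    for the wetness enhancing transformation (parameter values invented)."""
    hsv = rgb_to_hsv(np.clip(img_rgb, 0.0, 1.0))
    hsv[..., 1] = np.clip(hsv[..., 1] * sat_gain, 0.0, 1.0)   # more saturated
    hsv[..., 2] = hsv[..., 2] ** tone_gamma                   # darker, glossier tones
    return hsv_to_rgb(hsv)

def hue_entropy(img_rgb, bins=32):
    """Shannon entropy of the hue histogram, the scene statistic reported to
    modulate how well the transformation works."""
    hue = rgb_to_hsv(np.clip(img_rgb, 0.0, 1.0))[..., 0].ravel()
    counts, _ = np.histogram(hue, bins=bins, range=(0.0, 1.0))
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))
```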
Henderson, John M; Chanceaux, Myriam; Smith, Tim J
2009-01-23
We investigated the relationship between visual clutter and visual search in real-world scenes. Specifically, we investigated whether visual clutter, indexed by feature congestion, sub-band entropy, and edge density, correlates with search performance as assessed both by traditional behavioral measures (response time and error rate) and by eye movements. Our results demonstrate that clutter is related to search performance. These results hold for both traditional search measures and for eye movements. The results suggest that clutter may serve as an image-based proxy for search set size in real-world scenes.
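Of the three clutter indices named above, edge density is the simplest to illustrate. The sketch below (assuming scikit-image is available) computes it as the fraction of Canny edge pixels, which can then be correlated with response times or fixation counts; the sigma value is arbitrary and the authors' exact formulation may differ.

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import canny

def edge_density(img_rgb, sigma=2.0):
    """Edge-density clutter index: fraction of pixels marked as edges
    (illustrative parameters, not necessarily those used in the study)."""
    edges = canny(rgb2gray(img_rgb), sigma=sigma)
    return edges.mean()

# Example: correlate clutter with mean search time across scenes
# r = np.corrcoef([edge_density(s) for s in scenes], mean_rts)[0, 1]
```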
Blind subjects construct conscious mental images of visual scenes encoded in musical form.
Cronly-Dillon, J; Persaud, K C; Blore, R
2000-01-01
Blind (previously sighted) subjects are able to analyse, describe and graphically represent a number of high-contrast visual images translated into musical form de novo. We presented musical transforms of a random assortment of photographic images of objects and urban scenes to such subjects, a few of which depicted architectural and other landmarks that may be useful in navigating a route to a particular destination. Our blind subjects were able to use the sound representation to construct a conscious mental image that was revealed by their ability to depict a visual target by drawing it. We noted the similarity between the way the visual system integrates information from successive fixations to form a representation that is stable across eye movements and the way a succession of image frames (encoded in sound) which depict different portions of the image are integrated to form a seamless mental image. Finally, we discuss the profound resemblance between the way a professional musician carries out a structural analysis of a musical composition in order to relate its structure to the perception of musical form and the strategies used by our blind subjects in isolating structural features that collectively reveal the identity of visual form. PMID:11413637
ERIC Educational Resources Information Center
Henderson, John M.; Nuthmann, Antje; Luke, Steven G.
2013-01-01
Recent research on eye movements during scene viewing has primarily focused on where the eyes fixate. But eye fixations also differ in their durations. Here we investigated whether fixation durations in scene viewing are under the direct and immediate control of the current visual input. Subjects freely viewed photographs of scenes in preparation…
Initial Scene Representations Facilitate Eye Movement Guidance in Visual Search
ERIC Educational Resources Information Center
Castelhano, Monica S.; Henderson, John M.
2007-01-01
What role does the initial glimpse of a scene play in subsequent eye movement guidance? In 4 experiments, a brief scene preview was followed by object search through the scene via a small moving window that was tied to fixation position. Experiment 1 demonstrated that the scene preview resulted in more efficient eye movements compared with a…
Standard deviation of luminance distribution affects lightness and pupillary response.
Kanari, Kei; Kaneko, Hirohiko
2014-12-01
We examined whether the standard deviation (SD) of luminance distribution serves as information of illumination. We measured the lightness of a patch presented in the center of a scrambled-dot pattern while manipulating the SD of the luminance distribution. Results showed that lightness decreased as the SD of the surround stimulus increased. We also measured pupil diameter while viewing a similar stimulus. The pupil diameter decreased as the SD of luminance distribution of the stimuli increased. We confirmed that these results were not obtained because of the increase of the highest luminance in the stimulus. Furthermore, results of field measurements revealed a correlation between the SD of luminance distribution and illuminance in natural scenes. These results indicated that the visual system refers to the SD of the luminance distribution in the visual stimulus to estimate the scene illumination.
Language-Mediated Eye Movements in the Absence of a Visual World: The "Blank Screen Paradigm"
ERIC Educational Resources Information Center
Altmann, Gerry T. M.
2004-01-01
The "visual world paradigm" typically involves presenting participants with a visual scene and recording eye movements as they either hear an instruction to manipulate objects in the scene or as they listen to a description of what may happen to those objects. In this study, participants heard each target sentence only after the corresponding…
Dynamic binding of visual features by neuronal/stimulus synchrony.
Iwabuchi, A
1998-05-01
When people see a visual scene, certain parts of it are treated as belonging together and are regarded as a perceptual unit, which is called a "figure". People focus on figures, and the remaining parts of the scene are disregarded as "ground". In Gestalt psychology this process is called "figure-ground segregation". According to current perceptual psychology, a figure is formed by binding various visual features in a scene, and developments in neuroscience have revealed that there are many feature-encoding neurons, which respond to such features specifically. It is not known, however, how the brain binds different features of an object into a coherent visual object representation. Recently, the theory of binding by neuronal synchrony, which argues that feature binding is dynamically mediated by neuronal synchrony of feature-encoding neurons, has been proposed. This review article portrays the problem of figure-ground segregation and feature binding, summarizes neurophysiological and psychophysical experiments and theory relevant to feature binding by neuronal/stimulus synchrony, and suggests possible directions for future research on this topic.
The Quantum Binding Problem in the Context of Associative Memory
Wichert, Andreas
2016-01-01
We present a method to solve the binding problem by using a quantum algorithm for the retrieval of associations from associative memory during visual scene analysis. The problem is solved by mapping the information representing different objects into superposition by using entanglement and Grover’s amplification algorithm. PMID:27603782
Graffiti Crews' Potential Pedagogical Role
ERIC Educational Resources Information Center
Avramidis, Konstantinos; Drakopoulou, Konstantina
2012-01-01
Since the '60s, New York-style graffiti has gradually become an integral part of the urban visual landscape all over the world. The so-called graffiti scene evolved into an alternative space where writers educate one another. Through their association with other writers, especially through their membership in informal organized groups known as…
Miconi, Thomas; Groomes, Laura; Kreiman, Gabriel
2016-01-01
When searching for an object in a scene, how does the brain decide where to look next? Visual search theories suggest the existence of a global “priority map” that integrates bottom-up visual information with top-down, target-specific signals. We propose a mechanistic model of visual search that is consistent with recent neurophysiological evidence, can localize targets in cluttered images, and predicts single-trial behavior in a search task. This model posits that a high-level retinotopic area selective for shape features receives global, target-specific modulation and implements local normalization through divisive inhibition. The normalization step is critical to prevent highly salient bottom-up features from monopolizing attention. The resulting activity pattern constitutes a priority map that tracks the correlation between local input and target features. The maximum of this priority map is selected as the locus of attention. The visual input is then spatially enhanced around the selected location, allowing object-selective visual areas to determine whether the target is present at this location. This model can localize objects both in array images and when objects are pasted in natural scenes. The model can also predict single-trial human fixations, including those in error and target-absent trials, in a search task involving complex objects. PMID:26092221
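The sketch below illustrates the two ingredients the abstract emphasizes, target-specific gain modulation and divisive normalization, on arbitrary feature maps. It is a schematic reading of the model, not the published implementation, and the box-filter pooling is an assumption.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def priority_map(feature_maps, target_weights, pool_size=7, eps=1e-6):
    """Toy priority map: bottom-up feature maps (n_features, H, W) are scaled
    by top-down, target-specific weights and divided by locally pooled activity,
    so salient but target-irrelevant regions cannot monopolize attention."""
    modulated = feature_maps * target_weights[:, None, None]     # top-down gain
    drive = modulated.sum(axis=0)                                # target-matched drive
    pooled = uniform_filter(feature_maps.sum(axis=0), size=pool_size)
    priority = drive / (eps + pooled)                            # divisive normalization
    y, x = np.unravel_index(np.argmax(priority), priority.shape)
    return priority, (y, x)                                      # map and attended locus
```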
Kahn, Itamar; Wig, Gagan S.; Schacter, Daniel L.
2012-01-01
Asymmetrical specialization of cognitive processes across the cerebral hemispheres is a hallmark of healthy brain development and an important evolutionary trait underlying higher cognition in humans. While previous research, including studies of priming, divided visual field presentation, and split-brain patients, demonstrates a general pattern of right/left asymmetry of form-specific versus form-abstract visual processing, little is known about brain organization underlying this dissociation. Here, using repetition priming of complex visual scenes and high-resolution functional magnetic resonance imaging (MRI), we demonstrate asymmetrical form specificity of visual processing between the right and left hemispheres within a region known to be critical for processing of visual spatial scenes (parahippocampal place area [PPA]). Next, we use resting-state functional connectivity MRI analyses to demonstrate that this functional asymmetry is associated with differential intrinsic activity correlations of the right versus left PPA with regions critically involved in perceptual versus conceptual processing, respectively. Our results demonstrate that the PPA comprises lateralized subregions across the cerebral hemispheres that are engaged in functionally dissociable yet complementary components of visual scene analysis. Furthermore, this functional asymmetry is associated with differential intrinsic functional connectivity of the PPA with distinct brain areas known to mediate dissociable cognitive processes. PMID:21968568
Wilkinson, Krista M; Light, Janice
2011-12-01
Many individuals with complex communication needs may benefit from visual aided augmentative and alternative communication systems. In visual scene displays (VSDs), language concepts are embedded into a photograph of a naturalistic event. Humans play a central role in communication development and might be important elements in VSDs. However, many VSDs omit human figures. In this study, the authors sought to describe the distribution of visual attention to humans in naturalistic scenes as compared with other elements. Nineteen college students observed 8 photographs in which a human figure appeared near 1 or more items that might be expected to compete for visual attention (such as a Christmas tree or a table loaded with food). Eye-tracking technology allowed precise recording of participants' gaze. The fixation duration over a 7-s viewing period and latency to view elements in the photograph were measured. Participants fixated on the human figures more rapidly and for longer than expected based on the size of these figures, regardless of the other elements in the scene. Human figures attract attention in a photograph even when presented alongside other attractive distracters. Results suggest that humans may be a powerful means to attract visual attention to key elements in VSDs.
Stevens, W Dale; Kahn, Itamar; Wig, Gagan S; Schacter, Daniel L
2012-08-01
Asymmetrical specialization of cognitive processes across the cerebral hemispheres is a hallmark of healthy brain development and an important evolutionary trait underlying higher cognition in humans. While previous research, including studies of priming, divided visual field presentation, and split-brain patients, demonstrates a general pattern of right/left asymmetry of form-specific versus form-abstract visual processing, little is known about brain organization underlying this dissociation. Here, using repetition priming of complex visual scenes and high-resolution functional magnetic resonance imaging (MRI), we demonstrate asymmetrical form specificity of visual processing between the right and left hemispheres within a region known to be critical for processing of visual spatial scenes (parahippocampal place area [PPA]). Next, we use resting-state functional connectivity MRI analyses to demonstrate that this functional asymmetry is associated with differential intrinsic activity correlations of the right versus left PPA with regions critically involved in perceptual versus conceptual processing, respectively. Our results demonstrate that the PPA comprises lateralized subregions across the cerebral hemispheres that are engaged in functionally dissociable yet complementary components of visual scene analysis. Furthermore, this functional asymmetry is associated with differential intrinsic functional connectivity of the PPA with distinct brain areas known to mediate dissociable cognitive processes.
Gagnier, Kristin Michod; Dickinson, Christopher A.; Intraub, Helene
2015-01-01
Observers frequently remember seeing more of a scene than was shown (boundary extension). Does this reflect a lack of eye fixations to the boundary region? Single-object photographs were presented for 14–15 s each. Main objects were either whole or slightly cropped by one boundary, creating a salient marker of boundary placement. All participants expected a memory test, but only half were informed that boundary memory would be tested. Participants in both conditions made multiple fixations to the boundary region and the cropped region during study. Demonstrating the importance of these regions, test-informed participants fixated them sooner, longer, and more frequently. Boundary ratings (Experiment 1) and border adjustment tasks (Experiments 2–4) revealed boundary extension in both conditions. The error was reduced, but not eliminated, in the test-informed condition. Surprisingly, test knowledge and multiple fixations to the salient cropped region, during study and at test, were insufficient to overcome boundary extension on the cropped side. Results are discussed within a traditional visual-centric framework versus a multisource model of scene perception. PMID:23547787
Michod Gagnier, Kristin; Dickinson, Christopher A; Intraub, Helene
2013-01-01
Observers frequently remember seeing more of a scene than was shown (boundary extension). Does this reflect a lack of eye fixations to the boundary region? Single-object photographs were presented for 14-15 s each. Main objects were either whole or slightly cropped by one boundary, creating a salient marker of boundary placement. All participants expected a memory test, but only half were informed that boundary memory would be tested. Participants in both conditions made multiple fixations to the boundary region and the cropped region during study. Demonstrating the importance of these regions, test-informed participants fixated them sooner, longer, and more frequently. Boundary ratings (Experiment 1) and border adjustment tasks (Experiments 2-4) revealed boundary extension in both conditions. The error was reduced, but not eliminated, in the test-informed condition. Surprisingly, test knowledge and multiple fixations to the salient cropped region, during study and at test, were insufficient to overcome boundary extension on the cropped side. Results are discussed within a traditional visual-centric framework versus a multisource model of scene perception.
Selecting and perceiving multiple visual objects
Xu, Yaoda; Chun, Marvin M.
2010-01-01
To explain how multiple visual objects are attended and perceived, we propose that our visual system first selects a fixed number of about four objects from a crowded scene based on their spatial information (object individuation) and then encodes their details (object identification). We describe the involvement of the inferior intra-parietal sulcus (IPS) in object individuation and the superior IPS and higher visual areas in object identification. Our neural object-file theory synthesizes and extends existing ideas in visual cognition and is supported by behavioral and neuroimaging results. It provides a better understanding of the role of the different parietal areas in encoding visual objects and can explain various forms of capacity-limited processing in visual cognition such as working memory. PMID:19269882
Hillstrom, Anne P; Segabinazi, Joice D; Godwin, Hayward J; Liversedge, Simon P; Benson, Valerie
2017-02-19
We explored the influence of early scene analysis and visible object characteristics on eye movements when searching for objects in photographs of scenes. On each trial, participants were shown sequentially either a scene preview or a uniform grey screen (250 ms), a visual mask, the name of the target and the scene, now including the target at a likely location. During the participant's first saccade during search, the target location was changed to: (i) a different likely location, (ii) an unlikely but possible location or (iii) a very implausible location. The results showed that the first saccade landed more often on the likely location in which the target re-appeared than on unlikely or implausible locations, and overall the first saccade landed nearer the first target location with a preview than without. Hence, rapid scene analysis influenced initial eye movement planning, but availability of the target rapidly modified that plan. After the target moved, it was found more quickly when it appeared in a likely location than when it appeared in an unlikely or implausible location. The findings show that both scene gist and object properties are extracted rapidly, and are used in conjunction to guide saccadic eye movements during visual search.This article is part of the themed issue 'Auditory and visual scene analysis'. © 2017 The Author(s).
Fixation and saliency during search of natural scenes: the case of visual agnosia.
Foulsham, Tom; Barton, Jason J S; Kingstone, Alan; Dewhurst, Richard; Underwood, Geoffrey
2009-07-01
Models of eye movement control in natural scenes often distinguish between stimulus-driven processes (which guide the eyes to visually salient regions) and those based on task and object knowledge (which depend on expectations or identification of objects and scene gist). In the present investigation, the eye movements of a patient with visual agnosia were recorded while she searched for objects within photographs of natural scenes and compared to those made by students and age-matched controls. Agnosia is assumed to disrupt the top-down knowledge available in this task, and so may increase the reliance on bottom-up cues. The patient's deficit in object recognition was seen in poor search performance and inefficient scanning. The low-level saliency of target objects had an effect on responses in visual agnosia, and the most salient region in the scene was more likely to be fixated by the patient than by controls. An analysis of model-predicted saliency at fixation locations indicated a closer match between fixations and low-level saliency in agnosia than in controls. These findings are discussed in relation to saliency-map models and the balance between high and low-level factors in eye guidance.
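One common way to quantify the match between fixations and low-level saliency reported above is to compare model-predicted saliency at fixated pixels with saliency at randomly chosen pixels. The sketch below is a generic permutation version of that analysis, not the specific procedure used in the paper.

```python
import numpy as np

def saliency_at_fixations(saliency_map, fixations_xy, n_shuffle=1000, seed=0):
    """Mean model-predicted saliency at fixated pixels, with a permutation
    baseline from randomly sampled pixel locations (generic analysis sketch)."""
    rng = np.random.default_rng(seed)
    h, w = saliency_map.shape
    xy = np.asarray(fixations_xy, dtype=int)          # columns: x, y
    observed = saliency_map[xy[:, 1], xy[:, 0]].mean()
    chance = np.array([
        saliency_map[rng.integers(0, h, len(xy)), rng.integers(0, w, len(xy))].mean()
        for _ in range(n_shuffle)])
    p_value = (chance >= observed).mean()
    return observed, p_value
```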
Vertical gaze angle: absolute height-in-scene information for the programming of prehension.
Gardner, P L; Mon-Williams, M
2001-02-01
One possible source of information regarding the distance of a fixated target is provided by the height of the object within the visual scene. It is accepted that this cue can provide ordinal information, but generally it has been assumed that the nervous system cannot extract "absolute" information from height-in-scene. In order to use height-in-scene, the nervous system would need to be sensitive to ocular position with respect to the head and to head orientation with respect to the shoulders (i.e. vertical gaze angle or VGA). We used a perturbation technique to establish whether the nervous system uses vertical gaze angle as a distance cue. Vertical gaze angle was perturbed using ophthalmic prisms with the base oriented either up or down. In experiment 1, participants were required to carry out an open-loop pointing task whilst wearing: (1) no prisms; (2) a base-up prism; or (3) a base-down prism. In experiment 2, the participants reached to grasp an object under closed-loop viewing conditions whilst wearing: (1) no prisms; (2) a base-up prism; or (3) a base-down prism. Experiment 1 and 2 provided clear evidence that the human nervous system uses vertical gaze angle as a distance cue. It was found that the weighting attached to VGA decreased with increasing target distance. The weighting attached to VGA was also affected by the discrepancy between the height of the target, as specified by all other distance cues, and the height indicated by the initial estimate of the position of the supporting surface. We conclude by considering the use of height-in-scene information in the perception of surface slant and highlight some of the complexities that must be involved in the computation of environmental layout.
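For a target lying on the supporting surface, the geometry behind the height-in-scene (vertical gaze angle) cue is simple: fixating the target from eye height h at a gaze angle theta below the horizontal implies a distance of roughly h / tan(theta). The sketch below also shows how a prism perturbation shifts the implied distance, using the standard conversion of about 0.57 degrees per prism diopter; the function and parameter names are illustrative, and the sign convention for base-up versus base-down is left to the caller.

```python
import numpy as np

def distance_from_vga(eye_height_m, vga_deg, prism_diopters=0.0):
    """Distance to a fixated point on the supporting surface implied by
    vertical gaze angle (VGA): d = h / tan(theta). A prism shifts the
    effective gaze angle by ~0.573 deg per prism diopter."""
    theta = np.deg2rad(vga_deg + 0.573 * prism_diopters)
    return eye_height_m / np.tan(theta)

# Example: eyes 1.5 m above the surface, gaze 15 deg below horizontal -> ~5.6 m.
```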
Progress in high-level exploratory vision
NASA Astrophysics Data System (ADS)
Brand, Matthew
1993-08-01
We have been exploring the hypothesis that vision is an explanatory process, in which causal and functional reasoning about potential motion plays an intimate role in mediating the activity of low-level visual processes. In particular, we have explored two of the consequences of this view for the construction of purposeful vision systems: Causal and design knowledge can be used to (1) drive focus of attention, and (2) choose between ambiguous image interpretations. An important result of visual understanding is an explanation of the scene's causal structure: How action is originated, constrained, and prevented, and what will happen in the immediate future. In everyday visual experience, most action takes the form of motion, and most causal analysis takes the form of dynamical analysis. This is even true of static scenes, where much of a scene's interest lies in how possible motions are arrested. This paper describes our progress in developing domain theories and visual processes for the understanding of various kinds of structured scenes, including structures built out of children's constructive toys and simple mechanical devices.
Bayesian learning of visual chunks by human observers
Orbán, Gergő; Fiser, József; Aslin, Richard N.; Lengyel, Máté
2008-01-01
Efficient and versatile processing of any hierarchically structured information requires a learning mechanism that combines lower-level features into higher-level chunks. We investigated this chunking mechanism in humans with a visual pattern-learning paradigm. We developed an ideal learner based on Bayesian model comparison that extracts and stores only those chunks of information that are minimally sufficient to encode a set of visual scenes. Our ideal Bayesian chunk learner not only reproduced the results of a large set of previous empirical findings in the domain of human pattern learning but also made a key prediction that we confirmed experimentally. In accordance with Bayesian learning but contrary to associative learning, human performance was well above chance when pair-wise statistics in the exemplars contained no relevant information. Thus, humans extract chunks from complex visual patterns by generating accurate yet economical representations and not by encoding the full correlational structure of the input. PMID:18268353
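The flavor of the ideal-learner comparison can be conveyed with a toy example: given binary presence vectors for two shapes across scenes, compare the marginal likelihood of a model that treats them as one chunk against a model that treats them as independent elements, using Beta-Bernoulli marginals. This is a drastic simplification of the paper's Bayesian model comparison, introduced only to illustrate the logic.

```python
import numpy as np
from scipy.special import betaln

def bernoulli_marginal(k, n, a=1.0, b=1.0):
    """Marginal likelihood of k presences in n scenes under a Beta(a, b) prior."""
    return np.exp(betaln(a + k, b + n - k) - betaln(a, b))

def posterior_prob_chunk(presence_a, presence_b):
    """Toy Bayesian model comparison: probability that shapes A and B form one
    chunk (always co-occurring) rather than two independent elements."""
    a = np.asarray(presence_a, dtype=int)
    b = np.asarray(presence_b, dtype=int)
    n = a.size
    ml_indep = bernoulli_marginal(a.sum(), n) * bernoulli_marginal(b.sum(), n)
    # A strict chunk predicts identical presence patterns for its parts.
    ml_chunk = bernoulli_marginal(a.sum(), n) if np.array_equal(a, b) else 0.0
    return ml_chunk / (ml_chunk + ml_indep)   # equal prior on the two models
```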
High-dynamic-range scene compression in humans
NASA Astrophysics Data System (ADS)
McCann, John J.
2006-02-01
Single pixel dynamic-range compression alters a particular input value to a unique output value - a look-up table. It is used in chemical and most digital photographic systems having S-shaped transforms to render high-range scenes onto low-range media. Post-receptor neural processing is spatial, as shown by the physiological experiments of Dowling, Barlow, Kuffler, and Hubel & Wiesel. Human vision does not render a particular receptor-quanta catch as a unique response. Instead, because of spatial processing, the response to a particular quanta catch can be any color. Visual response is scene dependent. Stockham proposed an approach to model human range compression using low-spatial-frequency filters. Campbell, Ginsberg, Wilson, Watson, Daly and many others have developed spatial-frequency channel models. This paper describes experiments measuring the properties of desirable spatial-frequency filters for a variety of scenes. Given the radiances of each pixel in the scene and the observed appearances of objects in the image, one can calculate the visual mask for that individual image. Here, the visual mask is the spatial pattern of changes made by the visual system in processing the input image. It is the spatial signature of human vision. Low-dynamic-range images with many white areas need no spatial filtering. High-dynamic-range images with many blacks, or deep shadows, require strong spatial filtering. Sun on the right and shade on the left requires directional filters. These experiments show that variable, scene-dependent filters are necessary to mimic human vision. Although spatial-frequency filters can model these scene-dependent appearances, the problem remains that an analysis of the scene is still needed to calculate the scene-dependent strengths of each of the filters for each frequency.
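A minimal spatial-filter tone mapper in the spirit of this account might subtract a weighted low-frequency copy of log luminance, with the weight tied to a crude scene statistic (here, the fraction of very dark pixels) to echo the conclusion that filter strength must be scene dependent. The weighting rule and constants below are invented for illustration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def scene_dependent_compression(luminance, sigma=30.0):
    """Subtract a scene-weighted low-pass copy of log luminance.
    The dark-fraction heuristic and constants are illustrative only."""
    log_l = np.log10(np.clip(luminance, 1e-4, None))
    low = gaussian_filter(log_l, sigma=sigma)
    dark_fraction = np.mean(luminance < 0.05 * luminance.max())
    strength = 0.3 + 0.6 * dark_fraction     # stronger filtering for shadowy scenes
    return 10.0 ** (log_l - strength * low)
```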
Wagner, Dylan D; Kelley, William M; Heatherton, Todd F
2011-12-01
People are able to rapidly infer complex personality traits and mental states even from the most minimal person information. Research has shown that when observers view a natural scene containing people, they spend a disproportionate amount of their time looking at the social features (e.g., faces, bodies). Does this preference for social features merely reflect the biological salience of these features or are observers spontaneously attempting to make sense of complex social dynamics? Using functional neuroimaging, we investigated neural responses to social and nonsocial visual scenes in a large sample of participants (n = 48) who varied on an individual difference measure assessing empathy and mentalizing (i.e., empathizing). Compared with other scene categories, viewing natural social scenes activated regions associated with social cognition (e.g., dorsomedial prefrontal cortex and temporal poles). Moreover, activity in these regions during social scene viewing was strongly correlated with individual differences in empathizing. These findings offer neural evidence that observers spontaneously engage in social cognition when viewing complex social material but that the degree to which people do so is mediated by individual differences in trait empathizing.
Fowlkes, Charless C.; Banks, Martin S.
2010-01-01
The shape of the contour separating two regions strongly influences judgments of which region is “figure” and which is “ground.” Convexity and other figure–ground cues are generally assumed to indicate only which region is nearer, but nothing about how much the regions are separated in depth. To determine the depth information conveyed by convexity, we examined natural scenes and found that depth steps across surfaces with convex silhouettes are likely to be larger than steps across surfaces with concave silhouettes. In a psychophysical experiment, we found that humans exploit this correlation. For a given binocular disparity, observers perceived more depth when the near surface's silhouette was convex rather than concave. We estimated the depth distributions observers used in making those judgments: they were similar to the natural-scene distributions. Our findings show that convexity should be reclassified as a metric depth cue. They also suggest that the dichotomy between metric and nonmetric depth cues is false and that the depth information provided by many cues should be evaluated with respect to natural-scene statistics. Finally, the findings provide an explanation for why figure–ground cues modulate the responses of disparity-sensitive cells in visual cortex. PMID:20505093
Schomaker, Judith; Walper, Daniel; Wittmann, Bianca C; Einhäuser, Wolfgang
2017-04-01
In addition to low-level stimulus characteristics and current goals, our previous experience with stimuli can also guide attentional deployment. It remains unclear, however, if such effects act independently or whether they interact in guiding attention. In the current study, we presented natural scenes including every-day objects that differed in affective-motivational impact. In the first free-viewing experiment, we presented visually-matched triads of scenes in which one critical object was replaced that varied mainly in terms of motivational value, but also in terms of valence and arousal, as confirmed by ratings by a large set of observers. Treating motivation as a categorical factor, we found that it affected gaze. A linear-effect model showed that arousal, valence, and motivation predicted fixations above and beyond visual characteristics, like object size, eccentricity, or visual salience. In a second experiment, we experimentally investigated whether the effects of emotion and motivation could be modulated by visual salience. In a medium-salience condition, we presented the same unmodified scenes as in the first experiment. In a high-salience condition, we retained the saturation of the critical object in the scene, and decreased the saturation of the background, and in a low-salience condition, we desaturated the critical object while retaining the original saturation of the background. We found that highly salient objects guided gaze, but still found additional additive effects of arousal, valence and motivation, confirming that higher-level factors can also guide attention, as measured by fixations towards objects in natural scenes. Copyright © 2017 Elsevier Ltd. All rights reserved.
Interrupted Visual Searches Reveal Volatile Search Memory
ERIC Educational Resources Information Center
Shen, Y. Jeremy; Jiang, Yuhong V.
2006-01-01
This study investigated memory from interrupted visual searches. Participants conducted a change detection search task on polygons overlaid on scenes. Search was interrupted by various disruptions, including unfilled delay, passive viewing of other scenes, and additional search on new displays. Results showed that performance was unaffected by…
How emotion leads to selective memory: neuroimaging evidence.
Waring, Jill D; Kensinger, Elizabeth A
2011-06-01
Often memory for emotionally arousing items is enhanced relative to neutral items within complex visual scenes, but this enhancement can come at the expense of memory for peripheral background information. This 'trade-off' effect has been elicited by a range of stimulus valence and arousal levels, yet the magnitude of the effect has been shown to vary with these factors. Using fMRI, this study investigated the neural mechanisms underlying this selective memory for emotional scenes. Further, we examined how these processes are affected by stimulus dimensions of arousal and valence. The trade-off effect in memory occurred for low to high arousal positive and negative scenes. There was a core emotional memory network associated with the trade-off among all the emotional scene types, however, there were additional regions that were uniquely associated with the trade-off for each individual scene type. These results suggest that there is a common network of regions associated with the emotional memory trade-off effect, but that valence and arousal also independently affect the neural activity underlying the effect. Copyright © 2011 Elsevier Ltd. All rights reserved.
How emotion leads to selective memory: Neuroimaging evidence
Waring, Jill D.; Kensinger, Elizabeth A.
2011-01-01
Often memory for emotionally arousing items is enhanced relative to neutral items within complex visual scenes, but this enhancement can come at the expense of memory for peripheral background information. This ‘trade-off’ effect has been elicited by a range of stimulus valence and arousal levels, yet the magnitude of the effect has been shown to vary with these factors. Using fMRI, this study investigated the neural mechanisms underlying this selective memory for emotional scenes. Further, we examined how these processes are affected by stimulus dimensions of arousal and valence. The trade-off effect in memory occurred for low to high arousal positive and negative scenes. There was a core emotional memory network associated with the trade-off among all the emotional scene types, however there were additional regions that were uniquely associated with the trade-off for each individual scene type. These results suggest that there is a common network of regions associated with the emotional memory tradeoff effect, but that valence and arousal also independently affect the neural activity underlying the effect. PMID:21414333
Urakawa, Tomokazu; Ogata, Katsuya; Kimura, Takahiro; Kume, Yuko; Tobimatsu, Shozo
2015-01-01
Disambiguation of a noisy visual scene with prior knowledge is an indispensable task of the visual system. To adequately adapt to a dynamically changing visual environment full of noisy visual scenes, the implementation of knowledge-mediated disambiguation in the brain is imperative and essential for proceeding as fast as possible under the limited capacity of visual image processing. However, the temporal profile of the disambiguation process has not yet been fully elucidated in the brain. The present study attempted to determine how quickly knowledge-mediated disambiguation began to proceed along visual areas after the onset of a two-tone ambiguous image using magnetoencephalography with high temporal resolution. Using the predictive coding framework, we focused on activity reduction for the two-tone ambiguous image as an index of the implementation of disambiguation. Source analysis revealed that a significant activity reduction was observed in the lateral occipital area at approximately 120 ms after the onset of the ambiguous image, but not in preceding activity (about 115 ms) in the cuneus when participants perceptually disambiguated the ambiguous image with prior knowledge. These results suggested that knowledge-mediated disambiguation may be implemented as early as approximately 120 ms following an ambiguous visual scene, at least in the lateral occipital area, and provided an insight into the temporal profile of the disambiguation process of a noisy visual scene with prior knowledge. © 2014 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
Eye movements during information processing tasks: individual differences and cultural effects.
Rayner, Keith; Li, Xingshan; Williams, Carrick C; Cave, Kyle R; Well, Arnold D
2007-09-01
The eye movements of native English speakers, native Chinese speakers, and bilingual Chinese/English speakers who were either born in China (and moved to the US at an early age) or in the US were recorded during six tasks: (1) reading, (2) face processing, (3) scene perception, (4) visual search, (5) counting Chinese characters in a passage of text, and (6) visual search for Chinese characters. Across the different groups, there was a strong tendency for consistency in eye movement behavior; if fixation durations of a given viewer were long on one task, they tended to be long on other tasks (and the same tended to be true for saccade size). Some tasks, notably reading, did not conform to this pattern. Furthermore, experience with a given writing system had a large impact on fixation durations and saccade lengths. With respect to cultural differences, there was little evidence that Chinese participants spent more time looking at the background information (and, conversely less time looking at the foreground information) than the American participants. Also, Chinese participants' fixations were more numerous and of shorter duration than those of their American counterparts while viewing faces and scenes, and counting Chinese characters in text.
Lescroart, Mark D.; Stansbury, Dustin E.; Gallant, Jack L.
2015-01-01
Perception of natural visual scenes activates several functional areas in the human brain, including the Parahippocampal Place Area (PPA), Retrosplenial Complex (RSC), and the Occipital Place Area (OPA). It is currently unclear what specific scene-related features are represented in these areas. Previous studies have suggested that PPA, RSC, and/or OPA might represent at least three qualitatively different classes of features: (1) 2D features related to Fourier power; (2) 3D spatial features such as the distance to objects in a scene; or (3) abstract features such as the categories of objects in a scene. To determine which of these hypotheses best describes the visual representation in scene-selective areas, we applied voxel-wise modeling (VM) to BOLD fMRI responses elicited by a set of 1386 images of natural scenes. VM provides an efficient method for testing competing hypotheses by comparing predictions of brain activity based on encoding models that instantiate each hypothesis. Here we evaluated three different encoding models that instantiate each of the three hypotheses listed above. We used linear regression to fit each encoding model to the fMRI data recorded from each voxel, and we evaluated each fit model by estimating the amount of variance it predicted in a withheld portion of the data set. We found that voxel-wise models based on Fourier power or the subjective distance to objects in each scene predicted much of the variance predicted by a model based on object categories. Furthermore, the response variance explained by these three models is largely shared, and the individual models explain little unique variance in responses. Based on an evaluation of previous studies and the data we present here, we conclude that there is currently no good basis to favor any one of the three alternative hypotheses about visual representation in scene-selective areas. We offer suggestions for further studies that may help resolve this issue. PMID:26594164
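The core of the voxel-wise modelling (VM) procedure described above, fitting a linear encoding model per voxel and scoring it by variance explained on withheld data, can be sketched as follows. The split, intercept handling, and lack of regularization are simplifying assumptions rather than the authors' exact pipeline.

```python
import numpy as np

def heldout_variance_explained(features, bold, train_frac=0.8):
    """Fit a linear encoding model per voxel on a training split and report
    variance explained on the withheld split.
    features: (n_images, n_features); bold: (n_images, n_voxels)."""
    n_train = int(train_frac * features.shape[0])
    X_tr = np.column_stack([features[:n_train], np.ones(n_train)])                 # add intercept
    X_te = np.column_stack([features[n_train:], np.ones(features.shape[0] - n_train)])
    Y_tr, Y_te = bold[:n_train], bold[n_train:]
    W, *_ = np.linalg.lstsq(X_tr, Y_tr, rcond=None)                                 # (n_feat+1, n_vox)
    resid = Y_te - X_te @ W
    ss_res = (resid ** 2).sum(axis=0)
    ss_tot = ((Y_te - Y_te.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot    # per-voxel variance explained on withheld data
```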
Li, Rui; Zhang, Xiaodong; Li, Hanzhe; Zhang, Liming; Lu, Zhufeng; Chen, Jiangcheng
2018-08-01
Brain control technology can restore communication between the brain and a prosthesis, and choosing a Brain-Computer Interface (BCI) paradigm to evoke electroencephalogram (EEG) signals is an essential step for developing this technology. In this paper, the Scene Graph paradigm used for controlling prostheses was proposed; this paradigm is based on Steady-State Visual Evoked Potentials (SSVEPs) evoked by a Scene Graph depicting the subject's intention. A mathematical model was built to predict SSVEPs evoked by the proposed paradigm, and a sinusoidal stimulation method was used to present the Scene Graph stimulus to elicit SSVEPs from subjects. Then, a 2-degree-of-freedom (2-DOF) brain-controlled prosthesis system was constructed to validate the performance of the Scene Graph-SSVEP (SG-SSVEP)-based BCI. SG-SSVEPs were classified via the Canonical Correlation Analysis (CCA) approach. To assess the efficiency of the proposed BCI system, its performance was compared with that of a traditional SSVEP-BCI system. Experimental results from six subjects suggested that the proposed system effectively enhanced the SSVEP responses, decreased the degradation of SSVEP strength and reduced visual fatigue in comparison with the traditional SSVEP-BCI system. The average signal-to-noise ratio (SNR) of SG-SSVEP was 6.31 ± 2.64 dB, versus 3.38 ± 0.78 dB for the traditional SSVEP. In addition, the proposed system achieved good performance in prosthesis control. The average accuracy was 94.58% ± 7.05%, and the corresponding information transfer rate (ITR) was high at 19.55 ± 3.07 bit/min. The experimental results revealed that the SG-SSVEP-based BCI system achieved good performance and improved stability relative to the conventional approach. Copyright © 2018 Elsevier B.V. All rights reserved.
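Standard CCA detection of the attended SSVEP frequency, the classification approach named in the abstract, and the Wolpaw information transfer rate used to report bit/min can be sketched as below; channel selection, window length, and the number of harmonics are assumptions, not the authors' settings.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def ssvep_classify(eeg, stim_freqs, fs, n_harmonics=2):
    """CCA-based SSVEP detection. eeg: (n_samples, n_channels) single trial;
    returns the stimulation frequency with the largest canonical correlation."""
    t = np.arange(eeg.shape[0]) / fs
    scores = []
    for f in stim_freqs:
        ref = np.column_stack([g(2 * np.pi * f * h * t)
                               for h in range(1, n_harmonics + 1)
                               for g in (np.sin, np.cos)])
        cca = CCA(n_components=1).fit(eeg, ref)
        u, v = cca.transform(eeg, ref)
        scores.append(abs(np.corrcoef(u[:, 0], v[:, 0])[0, 1]))
    return stim_freqs[int(np.argmax(scores))], scores

def itr_bits_per_min(n_classes, accuracy, trial_seconds):
    """Wolpaw information transfer rate (bit/min)."""
    p = np.clip(accuracy, 1e-6, 1 - 1e-6)
    bits = (np.log2(n_classes) + p * np.log2(p)
            + (1 - p) * np.log2((1 - p) / (n_classes - 1)))
    return bits * 60.0 / trial_seconds
```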
ERIC Educational Resources Information Center
Marill, Thomas; And Others
The aim of the CYCLOPS Project research is the development of techniques for allowing computers to perform visual scene analysis, pre-processing of visual imagery, and perceptual learning. Work on scene analysis and learning has previously been described. The present report deals with research on pre-processing and with further work on scene…
Effects of Spatio-Temporal Aliasing on Out-the-Window Visual Systems
NASA Technical Reports Server (NTRS)
Sweet, Barbara T.; Stone, Leland S.; Liston, Dorion B.; Hebert, Tim M.
2014-01-01
Designers of out-the-window visual systems face a challenge when attempting to simulate the outside world as viewed from a cockpit. Many methodologies have been developed and adopted to aid in the depiction of particular scene features, or levels of static image detail. However, because aircraft move, it is necessary to also consider the quality of the motion in the simulated visual scene. When motion is introduced in the simulated visual scene, perceptual artifacts can become apparent. A particular artifact related to image motion, spatio-temporal aliasing, will be addressed. The causes of spatio-temporal aliasing will be discussed, and current knowledge regarding the impact of these artifacts on both motion perception and simulator task performance will be reviewed. Methods of reducing the impact of this artifact are also addressed.
Kuniecki, Michał; Wołoszyn, Kinga; Domagalik, Aleksandra; Pilarczyk, Joanna
2018-05-01
Processing of emotional visual information engages cognitive functions and induces arousal. We aimed to examine the modulatory role of emotional valence on brain activations linked to the processing of visual information and those linked to arousal. Participants were scanned and their pupil size was measured while viewing negative and neutral images. Visual noise was added to the images in various proportions to parametrically manipulate the amount of visual information. Pupil size was used as an index of physiological arousal. We show that arousal induced by the negative images, as compared to the neutral ones, is primarily related to greater amygdala activity, whereas increasing visibility of negative content is related to enhanced activity in the lateral occipital complex (LOC). We argue that more intense visual processing of negative scenes can occur irrespective of the level of arousal. This may suggest that higher areas of the visual stream are fine-tuned to process emotionally relevant objects. Both arousal and processing of emotional visual information modulated activity within the ventromedial prefrontal cortex (vmPFC). Overlapping activations within the vmPFC may reflect the integration of these aspects of emotional processing. Additionally, we show that emotionally evoked pupil dilations are related to activations in the amygdala, vmPFC, and LOC.
Gestalt Effects in Visual Working Memory.
Kałamała, Patrycja; Sadowska, Aleksandra; Ordziniak, Wawrzyniec; Chuderski, Adam
2017-01-01
Four experiments investigated whether conforming to Gestalt principles, well known to drive visual perception, also facilitates the active maintenance of information in visual working memory (VWM). We used the change detection task, which required the memorization of visual patterns composed of several shapes. We observed no effects of symmetry of visual patterns on VWM performance. However, there was a moderate positive effect when a particular shape that was probed matched the shape of the whole pattern (the whole-part similarity effect). The data support models assuming that VWM encodes not only particular objects of the perceptual scene but also the spatial relations between them (the ensemble representation). The ensemble representation may prime objects similar to its shape and thereby boost access to them. In contrast, the null effect of symmetry reflects the fact that this very feature of an ensemble does not yield any useful additional information for VWM.
New insights into ambient and focal visual fixations using an automatic classification algorithm
Follet, Brice; Le Meur, Olivier; Baccino, Thierry
2011-01-01
Overt visual attention is the act of directing the eyes toward a given area. These eye movements are characterised by saccades and fixations. A debate currently surrounds the role of visual fixations. Do they all have the same role in the free viewing of natural scenes? Recent studies suggest that at least two types of visual fixations exist: focal and ambient. The former is believed to be used to inspect local areas accurately, whereas the latter is used to obtain the context of the scene. We investigated the use of an automated system to cluster visual fixations in two groups using four types of natural scene images. We found new evidence to support a focal–ambient dichotomy. Our data indicate that the determining factor is the saccade amplitude. The dependence on the low-level visual features and the time course of these two kinds of visual fixations were examined. Our results demonstrate that there is an interplay between both fixation populations and that focal fixations are more dependent on low-level visual features than are ambient fixations. PMID:23145248
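The automatic classification described above reportedly hinges on saccade amplitude. A simple stand-in (not the authors' algorithm, which may use additional features such as fixation duration) is to cluster incoming saccade amplitudes into two groups and label the small-amplitude cluster as focal.

```python
import numpy as np
from sklearn.cluster import KMeans

def classify_fixations(saccade_amplitudes_deg):
    """Cluster fixations into 'focal' vs 'ambient' from the amplitude of the
    saccade that preceded each fixation (1-D k-means with two clusters)."""
    amps = np.asarray(saccade_amplitudes_deg, dtype=float).reshape(-1, 1)
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(amps)
    focal = int(np.argmin(km.cluster_centers_.ravel()))
    return np.where(km.labels_ == focal, "focal", "ambient")
```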
Where's Wally: the influence of visual salience on referring expression generation.
Clarke, Alasdair D F; Elsner, Micha; Rohde, Hannah
2013-01-01
Referring expression generation (REG) presents the converse problem to visual search: given a scene and a specified target, how does one generate a description which would allow somebody else to quickly and accurately locate the target? Previous work in psycholinguistics and natural language processing has failed to find an important and integrated role for vision in this task. That previous work, which relies largely on simple scenes, tends to treat vision as a pre-process for extracting feature categories that are relevant to disambiguation. However, the visual search literature suggests that some descriptions are better than others at enabling listeners to search efficiently within complex stimuli. This paper presents a study testing whether participants are sensitive to visual features that allow them to compose such "good" descriptions. Our results show that visual properties (salience, clutter, area, and distance) influence REG for targets embedded in images from the Where's Wally? books. Referring expressions for large targets are shorter than those for smaller targets, and expressions about targets in highly cluttered scenes use more words. We also find that participants are more likely to mention non-target landmarks that are large, salient, and in close proximity to the target. These findings identify a key role for visual salience in language production decisions and highlight the importance of scene complexity for REG.
Multilevel analysis of sports video sequences
NASA Astrophysics Data System (ADS)
Han, Jungong; Farin, Dirk; de With, Peter H. N.
2006-01-01
We propose a fully automatic and flexible framework for analysis and summarization of tennis broadcast video sequences, using visual features and specific game-context knowledge. Our framework can analyze a tennis video sequence at three levels, which provides a broad range of different analysis results. The proposed framework includes novel pixel-level and object-level tennis video processing algorithms, such as moving-player detection that takes both the color and the court (playing-field) information into account, and a player-position tracking algorithm based on a 3-D camera model. Additionally, we employ scene-level models for detecting events, like service, base-line rally and net-approach, based on a number of real-world visual features. The system can summarize three forms of information: (1) all court-view playing frames in a game, (2) the moving trajectory and real speed of each player, as well as the relative position between the player and the court, (3) the semantic event segments in a game. The proposed framework is flexible in choosing the level of analysis that is desired. It is effective because the framework makes use of several visual cues obtained from the real-world domain to model important events like service, thereby increasing the accuracy of the scene-level analysis. The paper presents attractive experimental results highlighting the system efficiency and analysis capabilities.
NASA Technical Reports Server (NTRS)
Senger, Steven O.
1998-01-01
Volumetric data sets have become common in medicine and many sciences through technologies such as computed x-ray tomography (CT), magnetic resonance (MR), positron emission tomography (PET), confocal microscopy and 3D ultrasound. When presented with 2D images, humans immediately and unconsciously begin a visual analysis of the scene. The viewer surveys the scene, identifying significant landmarks and building an internal mental model of the presented information. The identification of features is strongly influenced by the viewer's expectations based upon their expert knowledge of what the image should contain. While not a conscious activity, the viewer makes a series of choices about how to interpret the scene. These choices occur in parallel with viewing the scene and effectively change the way the viewer sees the image. It is this interaction of viewing and choice which is the basis of many familiar visual illusions. This is especially important in the interpretation of medical images, where it is the expert knowledge of the radiologist which interprets the image. For 3D data sets this interaction of view and choice is frustrated because choices must precede the visualization of the data set. It is not possible to visualize the data set without making some initial choices which determine how the volume of data is presented to the eye. These choices include viewpoint orientation, region identification, and color and opacity assignments. Further compounding the problem is the fact that these visualization choices are defined in terms of computer graphics as opposed to the language of the expert's knowledge. The long-term goal of this project is to develop an environment where the user can interact with volumetric data sets using tools which promote the utilization of expert knowledge by incorporating visualization and choice into a tight computational loop. The tools will support activities involving the segmentation of structures, construction of surface meshes and local filtering of the data set. To conform to this environment, tools should have several key attributes. First, they should rely only on computations over a local neighborhood of the probe position. Second, they should operate iteratively over time, converging towards a limit behavior. Third, they should adapt to user input, modifying their operational parameters over time.
Spatial Frequency Priming of Scene Perception in Adolescents with and without ASD
ERIC Educational Resources Information Center
Vanmarcke, Steven; Noens, Ilse; Steyaert, Jean; Wagemans, Johan
2017-01-01
While most typically developing (TD) participants have a coarse-to-fine processing style, people with autism spectrum disorder (ASD) seem to be less globally and more locally biased when processing visual information. The stimulus-specific spatial frequency content might be directly relevant to determine this temporal hierarchy of visual…
East-West Cultural Differences in Context-Sensitivity are Evident in Early Childhood
ERIC Educational Resources Information Center
Imada, Toshie; Carlson, Stephanie M.; Itakura, Shoji
2013-01-01
Accumulating evidence suggests that North Americans tend to focus on central objects whereas East Asians tend to pay more attention to contextual information in a visual scene. Although it is generally believed that such culturally divergent attention tendencies develop through socialization, existing evidence largely depends on adult samples.…
Space flight visual simulation.
Xu, L
1985-01-01
In this paper, based on the scenes of stars seen by astronauts in their orbital flights, we have studied the mathematical model which must be constructed for a CGI system to realize space flight visual simulation. Considering such factors as the revolution and rotation of the Earth, the exact date, time and site of orbital injection of the spacecraft, as well as its orbital flight and attitude motion, etc., we first defined all the instantaneous lines of sight and visual fields of astronauts in space. Then, through a series of coordinate transforms, the pictures of the star scenes changing with time and space were generated mathematically, one by one. In this procedure, we designed a method of performing "mathematical cutting" three times. Finally, we obtained each instantaneous picture of the star scene observed by astronauts through the window of the cockpit. The dynamic occlusion of stars by the Earth could also be displayed in the varying scene pictures.
Scene analysis for effective visual search in rough three-dimensional-modeling scenes
NASA Astrophysics Data System (ADS)
Wang, Qi; Hu, Xiaopeng
2016-11-01
Visual search is a fundamental technology in the computer vision community. It is difficult to find an object in complex scenes when similar distracters exist in the background. We propose a target search method for rough three-dimensional-modeling scenes based on visual salience theory and a camera imaging model. We define the salience of objects (or features) and explain how salience measurements of objects are calculated. We also present one type of search path that guides the search to the target through salient objects. Along the search path, once the preceding objects are localized, the search region of each subsequent object decreases; this region is calculated through the imaging model and an optimization method. The experimental results indicate that the proposed method is capable of resolving the ambiguities resulting from distracters containing visual features similar to the target, leading to an improvement of search speed by over 50%.
Sasson, Noah J; Pinkham, Amy E; Weittenhiller, Lauren P; Faso, Daniel J; Simpson, Claire
2016-05-01
Although Schizophrenia (SCZ) and Autism Spectrum Disorder (ASD) share impairments in emotion recognition, the mechanisms underlying these impairments may differ. The current study used the novel "Emotions in Context" task to examine how the interpretation and visual inspection of facial affect is modulated by congruent and incongruent emotional contexts in SCZ and ASD. Both adults with SCZ (n= 44) and those with ASD (n= 21) exhibited reduced affect recognition relative to typically-developing (TD) controls (n= 39) when faces were integrated within broader emotional scenes but not when they were presented in isolation, underscoring the importance of using stimuli that better approximate real-world contexts. Additionally, viewing faces within congruent emotional scenes improved accuracy and visual attention to the face for controls more so than the clinical groups, suggesting that individuals with SCZ and ASD may not benefit from the presence of complementary emotional information as readily as controls. Despite these similarities, important distinctions between SCZ and ASD were found. In every condition, IQ was related to emotion-recognition accuracy for the SCZ group but not for the ASD or TD groups. Further, only the ASD group failed to increase their visual attention to faces in incongruent emotional scenes, suggesting a lower reliance on facial information within ambiguous emotional contexts relative to congruent ones. Collectively, these findings highlight both shared and distinct social cognitive processes in SCZ and ASD that may contribute to their characteristic social disabilities. © The Author 2015. Published by Oxford University Press on behalf of the Maryland Psychiatric Research Center. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Patai, Eva Zita; Buckley, Alice; Nobre, Anna Christina
2013-01-01
A popular model of visual perception states that coarse information (carried by low spatial frequencies) along the dorsal stream is rapidly transmitted to prefrontal and medial temporal areas, activating contextual information from memory, which can in turn constrain detailed input carried by high spatial frequencies arriving at a slower rate along the ventral visual stream, thus facilitating the processing of ambiguous visual stimuli. We were interested in testing whether this model contributes to memory-guided orienting of attention. In particular, we asked whether global, low-spatial frequency (LSF) inputs play a dominant role in triggering contextual memories in order to facilitate the processing of the upcoming target stimulus. We explored this question over four experiments. The first experiment replicated the LSF advantage reported in perceptual discrimination tasks by showing that participants were faster and more accurate at matching a low spatial frequency version of a scene, compared to a high spatial frequency version, to its original counterpart in a forced-choice task. The subsequent three experiments tested the relative contributions of low versus high spatial frequencies during memory-guided covert spatial attention orienting tasks. Replicating the effects of memory-guided attention, pre-exposure to scenes associated with specific spatial memories for target locations (memory cues) led to higher perceptual discrimination and faster response times to identify targets embedded in the scenes. However, either high or low spatial frequency cues were equally effective; LSF signals did not selectively or preferentially contribute to the memory-driven attention benefits to performance. Our results challenge a generalized model that LSFs activate contextual memories, which in turn bias attention and facilitate perception. PMID:23776509
Towards Understanding the Role of Colour Information in Scene Perception using Night Vision Device
2009-06-01
possessing a visual system much simplified from that of living birds, reptiles, and teleost (bony) fish, which are generally tetrachromatic (Bowmaker... Levkowitz and Herman (1992) speculated that the results might be limited to "blob" detection. A possible mediating factor may have been the size and... sharpness of the "blobs" used in their task. Mullen (1985) showed that the visual system is much more sensitive to the high spatial
de Gruijter, Madeleine; de Poot, Christianne J; Elffers, Henk
2016-01-01
Currently, a series of promising new tools are under development that will enable crime scene investigators (CSIs) to analyze traces in situ during the crime scene investigation or enable them to detect blood and provide information on the age of blood. An experiment is conducted with thirty CSIs investigating a violent robbery at a mock crime scene to study the influence of such technologies on the perception and interpretation of traces during the first phase of the investigation. Results show that in their search for traces, CSIs are not directed by the availability of technologies, which is a reassuring finding. Qualitative findings suggest that CSIs are generally more focused on analyzing perpetrator traces than on reconstructing the event. A focus on perpetrator traces might become a risk when other crime-related traces are overlooked, and when analyzed traces are in fact not crime-related and in consequence lead to the identification of innocent suspects. © 2015 American Academy of Forensic Sciences.
Optimization of Visual Information Presentation for Visual Prosthesis.
Guo, Fei; Yang, Yuan; Gao, Yong
2018-01-01
Visual prosthesis applying electrical stimulation to restore visual function for the blind has promising prospects. However, due to the low resolution, limited visual field, and the low dynamic range of the visual perception, huge loss of information occurred when presenting daily scenes. The ability of object recognition in real-life scenarios is severely restricted for prosthetic users. To overcome the limitations, optimizing the visual information in the simulated prosthetic vision has been the focus of research. This paper proposes two image processing strategies based on a salient object detection technique. The two processing strategies enable the prosthetic implants to focus on the object of interest and suppress the background clutter. Psychophysical experiments show that techniques such as foreground zooming with background clutter removal and foreground edge detection with background reduction have positive impacts on the task of object recognition in simulated prosthetic vision. By using edge detection and zooming technique, the two processing strategies significantly improve the recognition accuracy of objects. We can conclude that the visual prosthesis using our proposed strategy can assist the blind to improve their ability to recognize objects. The results will provide effective solutions for the further development of visual prosthesis.
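The two strategies lend themselves to a compact sketch. The following Python/OpenCV fragment is a minimal illustration only, assuming the salient-object detector has already returned a bounding box (`bbox` and the file name are hypothetical values); it is not the authors' implementation.

```python
import cv2
import numpy as np

def zoom_foreground(image, bbox, grid=(32, 32)):
    """Foreground zooming with background removal: crop the detected object,
    discard the surrounding clutter, and downsample to the phosphene grid."""
    x, y, w, h = bbox
    crop = image[y:y + h, x:x + w]
    gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
    return cv2.resize(gray, grid, interpolation=cv2.INTER_AREA)

def edge_foreground(image, bbox, grid=(32, 32)):
    """Foreground edge detection with background reduction: keep object contours,
    strongly attenuate everything outside the object's bounding box."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 80, 160)
    mask = np.zeros_like(edges)
    x, y, w, h = bbox
    mask[y:y + h, x:x + w] = 1
    reduced = edges * mask + (gray // 4) * (1 - mask)   # dim background, keep object edges
    return cv2.resize(reduced, grid, interpolation=cv2.INTER_AREA)

# Usage: `bbox` stands in for the output of whatever salient-object detector is used.
frame = cv2.imread("scene.jpg")
bbox = (120, 80, 200, 150)                # hypothetical (x, y, width, height)
phosphenes_a = zoom_foreground(frame, bbox)
phosphenes_b = edge_foreground(frame, bbox)
```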
Dima, Diana C; Perry, Gavin; Singh, Krish D
2018-06-11
In navigating our environment, we rapidly process and extract meaning from visual cues. However, the relationship between visual features and categorical representations in natural scene perception is still not well understood. Here, we used natural scene stimuli from different categories and filtered at different spatial frequencies to address this question in a passive viewing paradigm. Using representational similarity analysis (RSA) and cross-decoding of magnetoencephalography (MEG) data, we show that categorical representations emerge in human visual cortex at ∼180 ms and are linked to spatial frequency processing. Furthermore, dorsal and ventral stream areas reveal temporally and spatially overlapping representations of low and high-level layer activations extracted from a feedforward neural network. Our results suggest that neural patterns from extrastriate visual cortex switch from low-level to categorical representations within 200 ms, highlighting the rapid cascade of processing stages essential in human visual perception. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
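For readers unfamiliar with representational similarity analysis, the core computation can be sketched in a few lines (Python; the array sizes and random data are placeholders, not the study's MEG pipeline): build a representational dissimilarity matrix (RDM) from the neural patterns, another from model features, and correlate the two.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(patterns):
    """Representational dissimilarity matrix (condensed form):
    1 - Pearson correlation between response patterns for every pair of stimuli."""
    return pdist(patterns, metric="correlation")

# Hypothetical data: 40 scene stimuli x 200 sensor/source features at one time point,
# plus a model feature space (e.g., one network layer's activations) for the same stimuli.
rng = np.random.default_rng(0)
neural_patterns = rng.standard_normal((40, 200))
model_features = rng.standard_normal((40, 512))

neural_rdm = rdm(neural_patterns)
model_rdm = rdm(model_features)

# Second-order similarity: how well the model's representational geometry
# matches the neural geometry at this time point.
rho, p = spearmanr(neural_rdm, model_rdm)
print(f"model-brain RSA correlation: rho={rho:.3f}, p={p:.3g}")
```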
Freas, Cody A.; Wystrach, Antione; Narendra, Ajay; Cheng, Ken
2018-01-01
Solitary foraging ants commonly use visual cues from their environment for navigation. Foragers are known to store visual scenes from the surrounding panorama for later guidance to known resources and to return successfully back to the nest. Several ant species travel not only on the ground, but also climb trees to locate resources. The navigational information that guides animals back home during their descent, while their body is perpendicular to the ground, is largely unknown. Here, we investigate in a nocturnal ant, Myrmecia midas, whether foragers travelling down a tree use visual information to return home. These ants establish nests at the base of a tree on which they forage and in addition, they also forage on nearby trees. We collected foragers and placed them on the trunk of the nest tree or a foraging tree in multiple compass directions. Regardless of the displacement location, upon release ants immediately moved to the side of the trunk facing the nest during their descent. When ants were released on non-foraging trees near the nest, displaced foragers again travelled around the tree to the side facing the nest. All the displaced foragers reached the correct side of the tree well before reaching the ground. However, when the terrestrial cues around the tree were blocked, foragers were unable to orient correctly, suggesting that the surrounding panorama is critical to successful orientation on the tree. Through analysis of panoramic pictures, we show that views acquired at the base of the foraging tree nest can provide reliable nest-ward orientation up to 1.75 m above the ground. We discuss how animals descending from trees compare their current scene to a memorised scene and report on the similarities in visually guided behaviour while navigating on the ground and descending from trees. PMID:29422880
Brady, Timothy F; Konkle, Talia; Oliva, Aude; Alvarez, George A
2009-01-01
A large body of literature has shown that observers often fail to notice significant changes in visual scenes, even when these changes happen right in front of their eyes. For instance, people often fail to notice if their conversation partner is switched to another person, or if large background objects suddenly disappear.1,2 These 'change blindness' studies have led to the inference that the amount of information we remember about each item in a visual scene may be quite low.1 However, in recent work we have demonstrated that long-term memory is capable of storing a massive number of visual objects with significant detail about each item.3 In the present paper we attempt to reconcile these findings by demonstrating that observers do not experience 'change blindness' with the real world objects used in our previous experiment if they are given sufficient time to encode each item. The results reported here suggest that one of the major causes of change blindness for real-world objects is a lack of encoding time or attention to each object (see also refs. 4 and 5).
The influence of behavioral relevance on the processing of global scene properties: An ERP study.
Hansen, Natalie E; Noesen, Birken T; Nador, Jeffrey D; Harel, Assaf
2018-05-02
Recent work studying the temporal dynamics of visual scene processing (Harel et al., 2016) has found that global scene properties (GSPs) modulate the amplitude of early Event-Related Potentials (ERPs). It is still not clear, however, to what extent the processing of these GSPs is influenced by their behavioral relevance, determined by the goals of the observer. To address this question, we investigated how behavioral relevance, operationalized by the task context, impacts the electrophysiological responses to GSPs. In a set of two experiments, we recorded ERPs while participants viewed images of real-world scenes varying along two GSPs, naturalness (manmade/natural) and spatial expanse (open/closed). In Experiment 1, very little attention to scene content was required, as participants viewed the scenes while performing an orthogonal fixation-cross task. In Experiment 2, participants saw the same scenes but now had to actively categorize them, based either on their naturalness or their spatial expanse. We found that task context had very little impact on the early ERP responses to the naturalness and spatial expanse of the scenes: P1, N1, and P2 could distinguish between open and closed scenes and between manmade and natural scenes across both experiments. Further, the specific effects of naturalness and spatial expanse on the ERP components were largely unaffected by their relevance for the task. A task effect was found at the N1 and P2 level, but this effect was manifest across all scene dimensions, indicating a general effect rather than an interaction between task context and GSPs. Together, these findings suggest that the extraction of global scene information reflected in the early ERP components is rapid and influenced very little by top-down, observer-based goals. Copyright © 2018 Elsevier Ltd. All rights reserved.
Banno, Hayaki; Saiki, Jun
2015-03-01
Recent studies have sought to determine which levels of categories are processed first in visual scene categorization and have shown that the natural and man-made superordinate-level categories are understood faster than are basic-level categories. The current study examined the robustness of the superordinate-level advantage in a visual scene categorization task. A go/no-go categorization task was evaluated with response time distribution analysis using an ex-Gaussian template. A visual scene was categorized at either the superordinate or the basic level, and the two basic-level categories forming a superordinate category were judged as either similar or dissimilar to each other. First, outdoor/indoor and natural/man-made were used as superordinate categories to investigate whether the advantage could be generalized beyond the natural/man-made boundary. Second, the set of images forming a superordinate category was manipulated. We predicted that decreasing image set similarity within the superordinate-level category would work against the speed advantage. We found that basic-level categorization was faster than outdoor/indoor categorization when the outdoor category comprised dissimilar basic-level categories. Our results indicate that the superordinate-level advantage in visual scene categorization is labile across different categories and category structures. © 2015 SAGE Publications.
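For reference, the ex-Gaussian template mentioned above is conventionally defined as the convolution of a Gaussian with an exponential; a standard statement of its density (not taken from the article itself) is:

```latex
% Ex-Gaussian response-time density: Gaussian (mu, sigma) convolved with an
% exponential of mean tau; Phi is the standard normal CDF.
f(t;\mu,\sigma,\tau)
  = \frac{1}{\tau}
    \exp\!\left(\frac{\mu - t}{\tau} + \frac{\sigma^{2}}{2\tau^{2}}\right)
    \Phi\!\left(\frac{t - \mu}{\sigma} - \frac{\sigma}{\tau}\right),
\qquad
\mathbb{E}[T] = \mu + \tau, \quad \operatorname{Var}[T] = \sigma^{2} + \tau^{2}.
```

Here μ and σ describe the Gaussian component and τ the exponential tail, so a fit yields three parameters per condition that can be compared across category levels.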
Object representations in visual memory: evidence from visual illusions.
Ben-Shalom, Asaf; Ganel, Tzvi
2012-07-26
Human visual memory is considered to contain different levels of object representations. Representations in visual working memory (VWM) are thought to contain relatively elaborated information about object structure. Conversely, representations in iconic memory are thought to be more perceptual in nature. In four experiments, we tested the effects of two different categories of visual illusions on representations in VWM and in iconic memory. Unlike VWM that was affected by both types of illusions, iconic memory was immune to the effects of within-object contextual illusions and was affected only by illusions driven by between-objects contextual properties. These results show that iconic and visual working memory contain dissociable representations of object shape. These findings suggest that the global properties of the visual scene are processed prior to the processing of specific elements.
A hierarchical, retinotopic proto-organization of the primate visual system at birth
Arcaro, Michael J; Livingstone, Margaret S
2017-01-01
The adult primate visual system comprises a series of hierarchically organized areas. Each cortical area contains a topographic map of visual space, with different areas extracting different kinds of information from the retinal input. Here we asked to what extent the newborn visual system resembles the adult organization. We find that hierarchical, topographic organization is present at birth and therefore constitutes a proto-organization for the entire primate visual system. Even within inferior temporal cortex, this proto-organization was already present, prior to the emergence of category selectivity (e.g., faces or scenes). We propose that this topographic organization provides the scaffolding for the subsequent development of visual cortex that commences at the onset of visual experience. DOI: http://dx.doi.org/10.7554/eLife.26196.001 PMID:28671063
Hollingworth, Andrew; Henderson, John M
2004-07-01
In a change detection paradigm, the global orientation of a natural scene was incrementally changed in 1 degree intervals. In Experiments 1 and 2, participants demonstrated sustained change blindness to incremental rotation, often coming to consider a significantly different scene viewpoint as an unchanged continuation of the original view. Experiment 3 showed that participants who failed to detect the incremental rotation nevertheless reliably detected a single-step rotation back to the initial view. Together, these results demonstrate an important dissociation between explicit change detection and visual memory. Following a change, visual memory is updated to reflect the changed state of the environment, even if the change was not detected.
The perception of naturalness correlates with low-level visual features of environmental scenes.
Berman, Marc G; Hout, Michael C; Kardan, Omid; Hunter, MaryCarol R; Yourganov, Grigori; Henderson, John M; Hanayik, Taylor; Karimi, Hossein; Jonides, John
2014-01-01
Previous research has shown that interacting with natural environments vs. more urban or built environments can have salubrious psychological effects, such as improvements in attention and memory. Even viewing pictures of nature vs. pictures of built environments can produce similar effects. A major question is: What is it about natural environments that produces these benefits? Problematically, there are many differing qualities between natural and urban environments, making it difficult to narrow down the dimensions of nature that may lead to these benefits. In this study, we set out to uncover visual features that related to individuals' perceptions of naturalness in images. We quantified naturalness in two ways: first, implicitly using a multidimensional scaling analysis and second, explicitly with direct naturalness ratings. Features that seemed most related to perceptions of naturalness were related to the density of contrast changes in the scene, the density of straight lines in the scene, the average color saturation in the scene and the average hue diversity in the scene. We then trained a machine-learning algorithm to predict whether a scene was perceived as being natural or not based on these low-level visual features and we could do so with 81% accuracy. As such we were able to reliably predict subjective perceptions of naturalness with objective low-level visual features. Our results can be used in future studies to determine if these features, which are related to naturalness, may also lead to the benefits attained from interacting with nature.
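A hedged sketch of this kind of analysis is given below (Python/OpenCV/scikit-learn). The feature definitions, thresholds, file names and classifier choice are illustrative stand-ins; the study's exact features and model are not reproduced here.

```python
import cv2
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def naturalness_features(bgr):
    """Low-level features in the spirit of those reported: density of contrast
    changes (edges), density of straight lines, mean saturation, hue diversity."""
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    edges = cv2.Canny(gray, 100, 200)
    edge_density = edges.mean() / 255.0
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=60,
                            minLineLength=30, maxLineGap=5)
    line_density = 0.0 if lines is None else len(lines) / edges.size
    saturation = hsv[..., 1].mean() / 255.0
    hue_hist, _ = np.histogram(hsv[..., 0], bins=36, range=(0, 180))
    hue_hist = hue_hist / max(hue_hist.sum(), 1)
    nonzero = hue_hist[hue_hist > 0]
    hue_diversity = -(nonzero * np.log(nonzero)).sum()   # entropy of the hue histogram
    return [edge_density, line_density, saturation, hue_diversity]

# `images` and `is_natural` stand in for a rated scene data set (hypothetical files).
images = [cv2.imread("scene_%03d.jpg" % i) for i in range(100)]
is_natural = np.load("naturalness_labels.npy")            # hypothetical binary ratings

X = np.array([naturalness_features(im) for im in images])
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
print(cross_val_score(clf, X, is_natural, cv=5).mean())   # cf. the ~81% reported
```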
Saccadic Corollary Discharge Underlies Stable Visual Perception
Berman, Rebecca A.; Joiner, Wilsaan M.; Wurtz, Robert H.
2016-01-01
Saccadic eye movements direct the high-resolution foveae of our retinas toward objects of interest. With each saccade, the image jumps on the retina, causing a discontinuity in visual input. Our visual perception, however, remains stable. Philosophers and scientists over centuries have proposed that visual stability depends upon an internal neuronal signal that is a copy of the neuronal signal driving the eye movement, now referred to as a corollary discharge (CD) or efference copy. In the old world monkey, such a CD circuit for saccades has been identified extending from superior colliculus through MD thalamus to frontal cortex, but there is little evidence that this circuit actually contributes to visual perception. We tested the influence of this CD circuit on visual perception by first training macaque monkeys to report their perceived eye direction, and then reversibly inactivating the CD as it passes through the thalamus. We found that the monkey's perception changed; during CD inactivation, there was a difference between where the monkey perceived its eyes to be directed and where they were actually directed. Perception and saccade were decoupled. We established that the perceived eye direction at the end of the saccade was not derived from proprioceptive input from eye muscles, and was not altered by contextual visual information. We conclude that the CD provides internal information contributing to the brain's creation of perceived visual stability. More specifically, the CD might provide the internal saccade vector used to unite separate retinal images into a stable visual scene. SIGNIFICANCE STATEMENT Visual stability is one of the most remarkable aspects of human vision. The eyes move rapidly several times per second, displacing the retinal image each time. The brain compensates for this disruption, keeping our visual perception stable. A major hypothesis explaining this stability invokes a signal within the brain, a corollary discharge, that informs visual regions of the brain when and where the eyes are about to move. Such a corollary discharge circuit for eye movements has been identified in macaque monkey. We now show that selectively inactivating this brain circuit alters the monkey's visual perception. We conclude that this corollary discharge provides a critical signal that can be used to unite jumping retinal images into a consistent visual scene. PMID:26740647
NASA Technical Reports Server (NTRS)
Buccello-Stout, Regina R.; Cromwell, Ronita L.; Bloomberg, Jacob J.; Weaver, G. D.
2010-01-01
Research indicates that a main contributor to injury in older adults is falling. The decline in sensory systems limits information needed to successfully maneuver through the environment. The objective of this study was to determine if prolonged exposure to the realignment of perceptual-motor systems increases adaptability of balance, and if balance confidence improves after training. A total of 16 older adults between the ages of 65 and 85 were randomized to a control group (walking on a treadmill while viewing a static visual scene) and an experimental group (walking on a treadmill while viewing a rotating visual scene). Prior to visual exposure, participants completed six trials of walking through a soft foamed obstacle course. Participants came in twice a week for 4 weeks to complete training, walking on a treadmill while viewing the visual scene for 20 minutes each session. Participants completed the obstacle course after training and four weeks later. Average time, penalty, and Activity Balance Confidence Scale scores were computed for both groups across testing times. The older adults who trained significantly improved their time through the obstacle course, F(2, 28) = 9.41, p < 0.05, and reduced their penalty scores, F(2, 28) = 21.03, p < 0.05, compared to those who did not train. There was no difference in balance confidence scores between groups across testing times, F(2, 28) = 0.503, p > 0.05. Although the training group improved mobility through the obstacle course, there were no differences between the groups in balance confidence.
Moriya, Jun
2017-01-01
According to cognitive theories, verbal processing attenuates emotional processing, whereas visual imagery enhances emotional processing and contributes to the maintenance of social anxiety. Individuals with social anxiety report negative mental images in social situations. However, the general ability of visual mental imagery of neutral scenes in individuals with social anxiety is still unclear. The present study investigated the general ability of non-emotional mental imagery (vividness, preferences for imagery vs. verbal processing, and object or spatial imagery) and the moderating role of effortful control in attenuating social anxiety. The participants ( N = 231) completed five questionnaires. The results showed that social anxiety was not necessarily associated with all aspects of mental imagery. As suggested by theories, social anxiety was not associated with a preference for verbal processing. However, social anxiety was positively correlated with the visual imagery scale, especially the object imagery scale, which concerns the ability to construct pictorial images of individual objects. Further, it was negatively correlated with the spatial imagery scale, which concerns the ability to process information about spatial relations between objects. Although object imagery and spatial imagery positively and negatively predicted the degree of social anxiety, respectively, these effects were attenuated when socially anxious individuals had high effortful control. Specifically, in individuals with high effortful control, both object and spatial imagery were not associated with social anxiety. Socially anxious individuals might prefer to construct pictorial images of individual objects in natural scenes through object imagery. However, even in individuals who exhibit these features of mental imagery, effortful control could inhibit the increase in social anxiety.
Embodied attention and word learning by toddlers
Yu, Chen; Smith, Linda B.
2013-01-01
Many theories of early word learning begin with the uncertainty inherent to learning a word from its co-occurrence with a visual scene. However, the relevant visual scene for infant word learning is neither from the adult theorist’s view nor the mature partner’s view, but is rather from the learner’s personal view. Here we show that when 18-month old infants interacted with objects in play with their parents, they created moments in which a single object was visually dominant. If parents named the object during these moments of bottom-up selectivity, later forced-choice tests showed that infants learned the name, but did not when naming occurred during a less visually selective moment. The momentary visual input for parents and toddlers was captured via head cameras placed low on each participant’s forehead as parents played with and named objects for their infant. Frame-by-frame analyses of the head camera images at and around naming moments were conducted to determine the visual properties at input that were associated with learning. The analyses indicated that learning occurred when bottom-up visual information was clean and uncluttered. The sensory-motor behaviors of infants and parents were also analyzed to determine how their actions on the objects may have created these optimal visual moments for learning. The results are discussed with respect to early word learning, embodied attention, and the social role of parents in early word learning. PMID:22878116
Eye Movements Reveal the Dynamic Simulation of Speed in Language
ERIC Educational Resources Information Center
Speed, Laura J.; Vigliocco, Gabriella
2014-01-01
This study investigates how speed of motion is processed in language. In three eye-tracking experiments, participants were presented with visual scenes and spoken sentences describing fast or slow events (e.g., "The lion ambled/dashed to the balloon"). Results showed that looking time to relevant objects in the visual scene was affected…
NASA Technical Reports Server (NTRS)
Johnson, Walter W.; Kaiser, Mary K.
2003-01-01
Perspective synthetic displays that supplement, or supplant, the optical windows traditionally used for guidance and control of aircraft are accompanied by potentially significant human factors problems related to the optical geometric conformality of the display. Such geometric conformality is broken when optical features are not in the location they would be if directly viewed through a window. This often occurs when the scene is relayed or generated from a location different from the pilot's eyepoint. However, assuming no large visual/vestibular effects, a pilot can often learn to use such a display very effectively. Important problems may arise, however, when display accuracy or consistency is compromised, and this can usually be related to geometrical discrepancies between how the synthetic visual scene behaves and how the visual scene through a window behaves. In addition to these issues, this paper examines the potentially critical problem of the disorientation that can arise when both a synthetic display and a real window are present in a flight deck, and no consistent visual interpretation is available.
Visual search for changes in scenes creates long-term, incidental memory traces.
Utochkin, Igor S; Wolfe, Jeremy M
2018-05-01
Humans are very good at remembering large numbers of scenes over substantial periods of time. But how good are they at remembering changes to scenes? In this study, we tested scene memory and change detection two weeks after initial scene learning. In Experiments 1-3, scenes were learned incidentally during visual search for change. In Experiment 4, observers explicitly memorized scenes. At test, after two weeks observers were asked to discriminate old from new scenes, to recall a change that they had detected in the study phase, or to detect a newly introduced change in the memorization experiment. Next, they performed a change detection task, usually looking for the same change as in the study period. Scene recognition memory was found to be similar in all experiments, regardless of the study task. In Experiment 1, more difficult change detection produced better scene memory. Experiments 2 and 3 supported a "depth-of-processing" account for the effects of initial search and change detection on incidental memory for scenes. Of most interest, change detection was faster during the test phase than during the study phase, even when the observer had no explicit memory of having found that change previously. This result was replicated in two of our three change detection experiments. We conclude that scenes can be encoded incidentally as well as explicitly and that changes in those scenes can leave measurable traces even if they are not explicitly recalled.
Ho-Phuoc, Tien; Guyader, Nathalie; Landragin, Frédéric; Guérin-Dugué, Anne
2012-02-03
Since Treisman's theory, it has been generally accepted that color is an elementary feature that guides eye movements when looking at natural scenes. Hence, most computational models of visual attention predict eye movements using color as an important visual feature. In this paper, using experimental data, we show that color does not affect where observers look when viewing natural scene images. Neither colors nor abnormal colors modify observers' fixation locations when compared to the same scenes in grayscale. In the same way, we did not find any significant difference between the scanpaths under grayscale, color, or abnormal color viewing conditions. However, we observed a decrease in fixation duration for color and abnormal color, and this was particularly true at the beginning of scene exploration. Finally, we found that abnormal color modifies saccade amplitude distribution.
Simulated Prosthetic Vision: The Benefits of Computer-Based Object Recognition and Localization.
Macé, Marc J-M; Guivarch, Valérian; Denis, Grégoire; Jouffrais, Christophe
2015-07-01
Clinical trials with blind patients implanted with a visual neuroprosthesis showed that even the simplest tasks were difficult to perform with the limited vision restored with current implants. Simulated prosthetic vision (SPV) is a powerful tool to investigate the putative functions of the upcoming generations of visual neuroprostheses. Recent studies based on SPV showed that several generations of implants will be required before usable vision is restored. However, none of these studies relied on advanced image processing. High-level image processing could significantly reduce the amount of information required to perform visual tasks and help restore visuomotor behaviors, even with current low-resolution implants. In this study, we simulated a prosthetic vision device based on object localization in the scene. We evaluated the usability of this device for object recognition, localization, and reaching. We showed that a very low number of electrodes (e.g., nine) are sufficient to restore visually guided reaching movements with fair timing (10 s) and high accuracy. In addition, performance, both in terms of accuracy and speed, was comparable with 9 and 100 electrodes. Extraction of high level information (object recognition and localization) from video images could drastically enhance the usability of current visual neuroprosthesis. We suggest that this method-that is, localization of targets of interest in the scene-may restore various visuomotor behaviors. This method could prove functional on current low-resolution implants. The main limitation resides in the reliability of the vision algorithms, which are improving rapidly. Copyright © 2015 International Center for Artificial Organs and Transplantation and Wiley Periodicals, Inc.
Semantic guidance of eye movements in real-world scenes
Hwang, Alex D.; Wang, Hsueh-Cheng; Pomplun, Marc
2011-01-01
The perception of objects in our visual world is influenced by not only their low-level visual features such as shape and color, but also their high-level features such as meaning and semantic relations among them. While it has been shown that low-level features in real-world scenes guide eye movements during scene inspection and search, the influence of semantic similarity among scene objects on eye movements in such situations has not been investigated. Here we study guidance of eye movements by semantic similarity among objects during real-world scene inspection and search. By selecting scenes from the LabelMe object-annotated image database and applying Latent Semantic Analysis (LSA) to the object labels, we generated semantic saliency maps of real-world scenes based on the semantic similarity of scene objects to the currently fixated object or the search target. An ROC analysis of these maps as predictors of subjects’ gaze transitions between objects during scene inspection revealed a preference for transitions to objects that were semantically similar to the currently inspected one. Furthermore, during the course of a scene search, subjects’ eye movements were progressively guided toward objects that were semantically similar to the search target. These findings demonstrate substantial semantic guidance of eye movements in real-world scenes and show its importance for understanding real-world attentional control. PMID:21426914
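The construction of a semantic saliency map of the kind described can be sketched roughly as follows (Python/scikit-learn). The toy corpus, bounding boxes and target word are hypothetical, and TruncatedSVD over TF-IDF merely stands in for the large-corpus LSA space used in the study.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# A toy corpus of scene "documents" (one string of object labels per scene);
# in practice the LSA space would be built from a far larger corpus.
corpus = ["car road sign building tree", "plate fork knife table cup",
          "bed pillow lamp window curtain", "boat water dock rope buoy"]

vectorizer = TfidfVectorizer()
svd = TruncatedSVD(n_components=3, random_state=0)
svd.fit(vectorizer.fit_transform(corpus))                 # fit the LSA space

def lsa_vector(label):
    return svd.transform(vectorizer.transform([label]))

def semantic_saliency_map(objects, target, shape=(480, 640)):
    """Paint each labeled object's region with its LSA cosine similarity to the target."""
    target_vec = lsa_vector(target)
    saliency = np.zeros(shape, dtype=float)
    for label, (x, y, w, h) in objects:                   # (label, bounding box) pairs
        sim = cosine_similarity(lsa_vector(label), target_vec)[0, 0]
        saliency[y:y + h, x:x + w] = np.maximum(saliency[y:y + h, x:x + w], sim)
    return saliency

# Hypothetical annotations for one scene and a search target.
scene_objects = [("fork", (50, 300, 40, 20)), ("cup", (200, 280, 30, 40)),
                 ("window", (400, 50, 120, 150))]
smap = semantic_saliency_map(scene_objects, target="plate")
```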
An Improved Text Localization Method for Natural Scene Images
NASA Astrophysics Data System (ADS)
Jiang, Mengdi; Cheng, Jianghua; Chen, Minghui; Ku, Xishu
2018-01-01
In order to extract text information effectively from natural scene images with complex backgrounds, multi-orientation perspective and multiple languages, we present a new method based on an improved Stroke Feature Transform (SWT). Firstly, the Maximally Stable Extremal Region (MSER) method is used to detect text candidate regions. Secondly, the SWT algorithm is applied within the candidate regions, which improves edge detection compared with the traditional SWT method. Finally, Frequency-tuned (FT) visual saliency is introduced to remove non-text candidate regions. The experimental results show that the method achieves good robustness for complex backgrounds with multi-orientation perspective and various characters and font sizes.
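The first stage of such a pipeline (MSER candidate detection with crude geometric filtering) can be sketched with OpenCV as below; the stroke-width refinement and FT saliency stages are not reproduced, and the thresholds and file names are illustrative assumptions.

```python
import cv2

def text_candidates(bgr, min_area=60, max_area=8000):
    """MSER proposes candidate regions; simple size and aspect-ratio filters
    stand in here for the stroke-width and saliency checks used to reject
    non-text regions in the method described above."""
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    mser = cv2.MSER_create()
    regions, _ = mser.detectRegions(gray)
    boxes = []
    for pts in regions:
        x, y, w, h = cv2.boundingRect(pts)
        area, aspect = w * h, w / float(h)
        if min_area < area < max_area and 0.1 < aspect < 10:   # drop specks and long bars
            boxes.append((x, y, w, h))
    return boxes

# Usage on a hypothetical street-scene image.
image = cv2.imread("street_sign.jpg")
for (x, y, w, h) in text_candidates(image):
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 1)
cv2.imwrite("candidates.jpg", image)
```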
Photogrammetry and remote sensing for visualization of spatial data in a virtual reality environment
NASA Astrophysics Data System (ADS)
Bhagawati, Dwipen
2001-07-01
Researchers in many disciplines have started using the tool of Virtual Reality (VR) to gain new insights into problems in their respective disciplines. Recent advances in computer graphics, software and hardware technologies have created many opportunities for VR systems, advanced scientific and engineering applications being among them. In Geometronics, generally photogrammetry and remote sensing are used for management of spatial data inventory. VR technology can be suitably used for management of spatial data inventory. This research demonstrates usefulness of VR technology for inventory management by taking the roadside features as a case study. Management of roadside feature inventory involves positioning and visualization of the features. This research has developed a methodology to demonstrate how photogrammetric principles can be used to position the features using the video-logging images and GPS camera positioning and how image analysis can help produce appropriate texture for building the VR, which then can be visualized in a Cave Augmented Virtual Environment (CAVE). VR modeling was implemented in two stages to demonstrate the different approaches for modeling the VR scene. A simulated highway scene was implemented with the brute force approach, while modeling software was used to model the real world scene using feature positions produced in this research. The first approach demonstrates an implementation of the scene by writing C++ codes to include a multi-level wand menu for interaction with the scene that enables the user to interact with the scene. The interactions include editing the features inside the CAVE display, navigating inside the scene, and performing limited geographic analysis. The second approach demonstrates creation of a VR scene for a real roadway environment using feature positions determined in this research. The scene looks realistic with textures from the real site mapped on to the geometry of the scene. Remote sensing and digital image processing techniques were used for texturing the roadway features in this scene.
Jung, Wonmo; Bülthoff, Isabelle; Armann, Regine G M
2017-11-01
The brain can only attend to a fraction of all the information that is entering the visual system at any given moment. One way of overcoming the so-called bottleneck of selective attention (e.g., J. M. Wolfe, Võ, Evans, & Greene, 2011) is to make use of redundant visual information and extract summarized statistical information of the whole visual scene. Such ensemble representation occurs for low-level features of textures or simple objects, but it has also been reported for complex high-level properties. While the visual system has, for example, been shown to compute summary representations of facial expression, gender, or identity, it is less clear whether perceptual input from all parts of the visual field contributes equally to the ensemble percept. Here we extend the line of ensemble-representation research into the realm of race and look at the possibility that ensemble perception relies on weighting visual information differently depending on its origin from either the fovea or the visual periphery. We find that observers can judge the mean race of a set of faces, similar to judgments of mean emotion from faces and ensemble representations in low-level domains of visual processing. We also find that while peripheral faces seem to be taken into account for the ensemble percept, far more weight is given to stimuli presented foveally than peripherally. Whether this precision weighting of information stems from differences in the accuracy with which the visual system processes information across the visual field or from statistical inferences about the world needs to be determined by further research.
Modeling Of Object- And Scene-Prototypes With Hierarchically Structured Classes
NASA Astrophysics Data System (ADS)
Ren, Z.; Jensch, P.; Ameling, W.
1989-03-01
The success of knowledge-based image analysis methodology and implementation tools depends largely on an appropriately and efficiently built model in which the domain-specific context information about, and the inherent structure of, the observed image scene have been encoded. To identify an object in an application environment, a computer vision system needs to know, firstly, the description of the object to be found in an image or in an image sequence and, secondly, the corresponding relationships between object descriptions within the image sequence. This paper presents models of image objects and scenes by means of hierarchically structured classes. Using the topovisual formalism of graphs and higraphs, we are currently studying principally the relational aspect and data abstraction of the modeling in order to visualize the structural nature resident in image objects and scenes, and to formalize their descriptions. The goal is to expose the structure of the image scene and the correspondence of image objects in the low-level image interpretation process. The object-based system design approach has been applied to build the model base. We utilize the object-oriented programming language C++ for designing, testing and implementing the abstracted entity classes and the operation structures which have been modeled topovisually. The reference images used for modeling prototypes of objects and scenes are from industrial environments as well as medical applications.
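Although the work described was implemented in C++, the general idea of hierarchically structured object and scene prototypes can be sketched in a few lines (here Python, for consistency with the other sketches in this document; the class names, attributes and example scene are invented for illustration):

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ObjectPrototype:
    """Description of an object class to be found in images; attributes are
    abstracted from pixel data (e.g., expected shape or intensity statistics)."""
    name: str
    attributes: Dict[str, float] = field(default_factory=dict)
    parts: List["ObjectPrototype"] = field(default_factory=list)   # part-of hierarchy

@dataclass
class Relation:
    kind: str          # e.g., "left-of", "contains", "adjacent"
    subject: str
    object: str

@dataclass
class ScenePrototype:
    """A scene model: which object prototypes it contains and how they relate."""
    name: str
    objects: List[ObjectPrototype] = field(default_factory=list)
    relations: List[Relation] = field(default_factory=list)

# A hypothetical industrial-inspection scene prototype.
bolt = ObjectPrototype("bolt", {"min_area": 40.0})
plate = ObjectPrototype("plate", {"min_area": 500.0}, parts=[bolt])
scene = ScenePrototype("assembly", objects=[plate, bolt],
                       relations=[Relation("contains", "plate", "bolt")])
```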
Overt attention toward oriented objects in free-viewing barn owls.
Harmening, Wolf Maximilian; Orlowski, Julius; Ben-Shahar, Ohad; Wagner, Hermann
2011-05-17
Visual saliency based on orientation contrast is a perceptual product attributed to the functional organization of the mammalian brain. We examined this visual phenomenon in barn owls by mounting a wireless video microcamera on the owls' heads and confronting them with visual scenes that contained one differently oriented target among similarly oriented distracters. Without being confined by any particular task, the owls looked significantly longer, more often, and earlier at the target, thus exhibiting visual search strategies so far demonstrated in similar conditions only in primates. Given the considerable differences in phylogeny and the structure of visual pathways between owls and humans, these findings suggest that orientation saliency has computational optimality in a wide variety of ecological contexts, and thus constitutes a universal building block for efficient visual information processing in general.
Retinal ganglion cell maps in the brain: implications for visual processing.
Dhande, Onkar S; Huberman, Andrew D
2014-02-01
Everything the brain knows about the content of the visual world is built from the spiking activity of retinal ganglion cells (RGCs). As the output neurons of the eye, RGCs include ∼20 different subtypes, each responding best to a specific feature in the visual scene. Here we discuss recent advances in identifying where different RGC subtypes route visual information in the brain, including which targets they connect to and how their organization within those targets influences visual processing. We also highlight examples where causal links have been established between specific RGC subtypes, their maps of central connections and defined aspects of light-mediated behavior and we suggest the use of techniques that stand to extend these sorts of analyses to circuits underlying visual perception. Copyright © 2013. Published by Elsevier Ltd.
Li, X; Yang, Y; Ren, G
2009-06-16
Language is often perceived together with visual information. Recent experimental evidence indicates that, during spoken language comprehension, the brain can immediately integrate visual information with semantic or syntactic information from speech. Here we used the mismatch negativity to further investigate whether prosodic information from speech can be immediately integrated into a visual scene context, and especially the time course and automaticity of this integration process. Sixteen native Chinese speakers participated in the study. The materials included Chinese spoken sentences and picture pairs. In the audiovisual situation, relative to the concomitant pictures, the spoken sentence was appropriately accented in the standard stimuli but inappropriately accented in the two kinds of deviant stimuli. In the purely auditory situation, the speech sentences were presented without pictures. It was found that the deviants evoked mismatch responses in both the audiovisual and the purely auditory situations; the mismatch negativity in the purely auditory situation peaked at the same time as, but was weaker than, that evoked by the same deviant speech sounds in the audiovisual situation. This pattern of results suggests immediate integration of prosodic information from speech and visual information from pictures in the absence of focused attention.
Cultural differences in the lateral occipital complex while viewing incongruent scenes
Yang, Yung-Jui; Goh, Joshua; Hong, Ying-Yi; Park, Denise C.
2010-01-01
Converging behavioral and neuroimaging evidence indicates that culture influences the processing of complex visual scenes. Whereas Westerners focus on central objects and tend to ignore context, East Asians process scenes more holistically, attending to the context in which objects are embedded. We investigated cultural differences in contextual processing by manipulating the congruence of visual scenes presented in an fMR-adaptation paradigm. We hypothesized that East Asians would show greater adaptation to incongruent scenes, consistent with their tendency to process contextual relationships more extensively than Westerners. Sixteen Americans and 16 native Chinese were scanned while viewing sets of pictures consisting of a focal object superimposed upon a background scene. In half of the pictures objects were paired with congruent backgrounds, and in the other half objects were paired with incongruent backgrounds. We found that within both the right and left lateral occipital complexes, Chinese participants showed significantly greater adaptation to incongruent scenes than to congruent scenes relative to American participants. These results suggest that Chinese were more sensitive to contextual incongruity than were Americans and that they reacted to incongruent object/background pairings by focusing greater attention on the object. PMID:20083532
Real-time visual simulation of APT system based on RTW and Vega
NASA Astrophysics Data System (ADS)
Xiong, Shuai; Fu, Chengyu; Tang, Tao
2012-10-01
The Matlab/Simulink simulation model of an APT (acquisition, pointing and tracking) system is analyzed and established. The model's C code, which can be used for real-time simulation, is then generated by RTW (Real-Time Workshop). Practical experiments show that running the C code gives the same simulation result as running the Simulink model directly in the Matlab environment. MultiGen-Vega is a real-time 3D scene simulation software system. With it and OpenGL, an APT scene simulation platform is developed and used to render and display the virtual scenes of the APT system. To add the necessary graphics effects to the virtual scenes in real time, GLSL (OpenGL Shading Language) shaders are used on the programmable GPU. By calling the C code, the scene simulation platform can adjust the system parameters on-line and obtain the APT system's real-time simulation data to drive the scenes. Practical application shows that this visual simulation platform offers high efficiency, low cost and good simulation results.
Hippocampal gamma-band Synchrony and pupillary responses index memory during visual search.
Montefusco-Siegmund, Rodrigo; Leonard, Timothy K; Hoffman, Kari L
2017-04-01
Memory for scenes is supported by the hippocampus, among other interconnected structures, but the neural mechanisms related to this process are not well understood. To assess the role of the hippocampus in memory-guided scene search, we recorded local field potentials and multiunit activity from the hippocampus of macaques as they performed goal-directed search tasks using natural scenes. We additionally measured pupil size during scene presentation, which in humans is modulated by recognition memory. We found that both pupil dilation and search efficiency accompanied scene repetition, thereby indicating memory for scenes. Neural correlates included a brief increase in hippocampal multiunit activity and a sustained synchronization of unit activity to gamma band oscillations (50-70 Hz). The repetition effects on hippocampal gamma synchronization occurred when pupils were most dilated, suggesting an interaction between aroused, attentive processing and hippocampal correlates of recognition memory. These results suggest that the hippocampus may support memory-guided visual search through enhanced local gamma synchrony. © 2016 Wiley Periodicals, Inc.
Ultrafast scene detection and recognition with limited visual information
Hagmann, Carl Erick; Potter, Mary C.
2016-01-01
Humans can detect target color pictures of scenes depicting concepts like picnic or harbor in sequences of six or twelve pictures presented as briefly as 13 ms, even when the target is named after the sequence (Potter, Wyble, Hagmann, & McCourt, 2014). Such rapid detection suggests that feedforward processing alone enabled detection without recurrent cortical feedback. There is debate about whether coarse, global, low spatial frequencies (LSFs) provide predictive information to high cortical levels through the rapid magnocellular (M) projection of the visual path, enabling top-down prediction of possible object identities. To test the “Fast M” hypothesis, we compared detection of a named target across five stimulus conditions: unaltered color, blurred color, grayscale, thresholded monochrome, and LSF pictures. The pictures were presented for 13–80 ms in six-picture rapid serial visual presentation (RSVP) sequences. Blurred, monochrome, and LSF pictures were detected less accurately than normal color or grayscale pictures. When the target was named before the sequence, all picture types except LSF resulted in above-chance detection at all durations. Crucially, when the name was given only after the sequence, performance dropped and the monochrome and LSF pictures (but not the blurred pictures) were at or near chance. Thus, without advance information, monochrome and LSF pictures were rarely understood. The results offer only limited support for the Fast M hypothesis, suggesting instead that feedforward processing is able to activate conceptual representations without complementary reentrant processing. PMID:28255263
The effects of visual scenes on roll and pitch thresholds in pilots versus nonpilots.
Otakeno, Shinji; Matthews, Roger S J; Folio, Les; Previc, Fred H; Lessard, Charles S
2002-02-01
Previous studies have indicated that, compared with nonpilots, pilots rely more on vision than "seat-of-the-pants" sensations when presented with visual-vestibular conflict. The objective of this study was to evaluate whether pilots and nonpilots differ in their thresholds for tilt perception while viewing visual scenes depicting simulated flight. This study was conducted in the Advanced Spatial Disorientation Demonstrator (ASDD) at Brooks AFB, TX. There were 14 subjects (7 pilots and 7 nonpilots) who recorded tilt detection thresholds in pitch and roll while exposed to sub-threshold movement in each axis. During each test run, subjects were presented with computer-generated visual scenes depicting accelerating forward flight by day or night, and a blank (control) condition. The only significant effect detected by an analysis of variance (ANOVA) was that all subjects were more sensitive to tilt in roll than in pitch [F (2,24) = 18.96, p < 0.001]. Overall, pilots had marginally higher tilt detection thresholds compared with nonpilots (p = 0.055), but the type of visual scene had no significant effect on thresholds. In this study, pilots did not demonstrate greater visual dominance over vestibular and proprioceptive cues than nonpilots, but appeared to have higher pitch and roll thresholds overall. The finding of significantly lower detection thresholds in the roll axis vs. the pitch axis was an incidental finding for both subject groups.
Visual supports for shared reading with young children: the effect of static overlay design.
Wood Jackson, Carla; Wahlquist, Jordan; Marquis, Cassandra
2011-06-01
This study examined the effects of two types of static overlay design (visual scene display and grid display) on 39 children's use of a speech-generating device during shared storybook reading with an adult. This pilot project included two groups: preschool children with typical communication skills (n = 26) and with complex communication needs (n = 13). All participants engaged in shared reading with two books using each visual layout on a speech-generating device (SGD). The children averaged a greater number of activations when presented with a grid display during introductory exploration and free play. There was a large effect of the static overlay design on the number of silent hits, evidencing more silent hits with visual scene displays. On average, the children demonstrated relatively few spontaneous activations of the speech-generating device while the adult was reading, regardless of overlay design. When responding to questions, children with communication needs appeared to perform better when using visual scene displays, but the effect of display condition on the accuracy of responses to wh-questions was not statistically significant. In response to an open ended question, children with communication disorders demonstrated more frequent activations of the SGD using a grid display than a visual scene. Suggestions for future research as well as potential implications for designing AAC systems for shared reading with young children are discussed.
ERIC Educational Resources Information Center
Fletcher-Watson, S.; Collis, J. M.; Findlay, J. M.; Leekam, S. R.
2009-01-01
Change blindness describes the surprising difficulty of detecting large changes in visual scenes when changes occur during a visual disruption. In order to study the developmental course of this phenomenon, a modified version of the flicker paradigm, based on Rensink, O'Regan & Clark (1997), was given to three groups of children aged 6-12 years…
Reduced Change Blindness Suggests Enhanced Attention to Detail in Individuals with Autism
ERIC Educational Resources Information Center
Smith, Hayley; Milne, Elizabeth
2009-01-01
Background: The phenomenon of change blindness illustrates that a limited number of items within the visual scene are attended to at any one time. It has been suggested that individuals with autism focus attention on less contextually relevant aspects of the visual scene, show superior perceptual discrimination and notice details which are often…
Measuring familiarity for natural environments through visual images
William E. Hammitt
1979-01-01
An on-site visual preference methodology involving a pre-and-post rating of bog landscape photographs is discussed. Photographs were rated for familiarity as well as preference. Preference was shown to be closely related to familiarity, assuming visitors had the opportunity to view the scenes during the on-site hiking engagement. Scenes rated high on preference were...
Vos, Leia; Whitman, Douglas
2014-01-01
A considerable literature suggests that the right hemisphere is dominant in vigilance for novel and survival-related stimuli, such as predators, across a wide range of species. In contrast to vigilance for change, change blindness is a failure to detect obvious changes in a visual scene when they are obscured by a disruption in scene presentation. We studied lateralised change detection using a series of scenes with salient changes in either the left or the right visual field. In Study 1, left visual field changes were detected more rapidly than right visual field changes, confirming a right hemisphere advantage for change detection. Increasing stimulus difficulty resulted in more right visual field detections, and left hemisphere detection was more likely when the change had occurred in the right visual field on a prior trial. In Study 2, an intervening distractor task disrupted the influence of prior trials. Again, faster detection speeds were observed for left visual field changes, with a shift to a right visual field advantage with increasing time-to-detection. This suggests that a right hemisphere role in vigilance (catching attention) and a left hemisphere role in target evaluation (maintaining attention) are present at the earliest stage of change detection.
NASA Astrophysics Data System (ADS)
Sun, Xiao; Chai, Guobei; Liu, Wei; Bao, Wenzhuo; Zhao, Xiaoning; Ming, Delie
2018-02-01
Simple cells in primary visual cortex are believed to extract local edge information from a visual scene. In this paper, inspired by the different receptive field properties and visual information flow paths of neurons, an improved Combination of Receptive Fields (CORF) model incorporating non-classical receptive fields was proposed to simulate the responses of simple cells' receptive fields. Compared to the classical model, the proposed model better imitates the physiological structure of simple cells by taking the facilitation and suppression of non-classical receptive fields into account. On this basis, an edge detection algorithm was proposed as an application of the improved CORF model. Experimental results validate the robustness of the proposed algorithm to noise and background interference.
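The following is a minimal, heavily simplified sketch of the combination-of-receptive-fields idea, not the published model: aligned ON and OFF difference-of-Gaussians subunits are AND-combined on opposite sides of a model edge, and a crude non-classical-surround term suppresses responses in cluttered regions. All parameter values are illustrative assumptions.

```python
# Simplified CORF-like edge response (a sketch, not the published algorithm).
import numpy as np
from scipy.ndimage import gaussian_filter, shift

def dog(img, sigma=1.0, k=1.6):
    # Center-surround (LGN-like) response.
    img = np.asarray(img, dtype=float)
    return gaussian_filter(img, sigma) - gaussian_filter(img, k * sigma)

def sample(img, dy, dx):
    # Value of img at (y + dy, x + dx) for every pixel (y, x).
    return shift(img, (-dy, -dx), order=1, mode="nearest")

def corf_response(img, theta, n_sub=5, spacing=2.0, sigma=1.0, alpha=0.5):
    d = dog(img, sigma)
    on, off = np.maximum(d, 0.0), np.maximum(-d, 0.0)
    ex, ey = np.cos(theta), np.sin(theta)        # direction along the edge
    nx, ny = -np.sin(theta), np.cos(theta)       # normal to the edge
    resp = np.ones(np.shape(img))
    for o in (np.arange(n_sub) - (n_sub - 1) / 2) * spacing:
        resp *= (sample(on,  o * ey + spacing * ny, o * ex + spacing * nx) *
                 sample(off, o * ey - spacing * ny, o * ex - spacing * nx)
                 + 1e-9) ** (1.0 / n_sub)        # geometric (AND-like) pooling
    # Crude non-classical surround: suppress where the neighbourhood is busy.
    return np.maximum(resp - alpha * gaussian_filter(resp, 4 * sigma), 0.0)

def edge_map(img, n_orient=8):
    thetas = np.linspace(0, np.pi, n_orient, endpoint=False)
    return np.max([corf_response(img, t) for t in thetas], axis=0)
```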
The effect of non-visual working memory load on top-down modulation of visual processing
Rissman, Jesse; Gazzaley, Adam; D'Esposito, Mark
2009-01-01
While a core function of the working memory (WM) system is the active maintenance of behaviorally relevant sensory representations, it is also critical that distracting stimuli are appropriately ignored. We used functional magnetic resonance imaging to examine the role of domain-general WM resources in the top-down attentional modulation of task-relevant and irrelevant visual representations. In our dual-task paradigm, each trial began with the auditory presentation of six random (high load) or sequentially-ordered (low load) digits. Next, two relevant visual stimuli (e.g., faces), presented amongst two temporally interspersed visual distractors (e.g., scenes), were to be encoded and maintained across a 7-sec delay interval, after which memory for the relevant images and digits was probed. When taxed by high load digit maintenance, participants exhibited impaired performance on the visual WM task and a selective failure to attenuate the neural processing of task-irrelevant scene stimuli. The over-processing of distractor scenes under high load was indexed by elevated encoding activity in a scene-selective region-of-interest relative to low load and passive viewing control conditions, as well as by improved long-term recognition memory for these items. In contrast, the load manipulation did not affect participants' ability to upregulate activity in this region when scenes were task-relevant. These results highlight the critical role of domain-general WM resources in the goal-directed regulation of distractor processing. Moreover, the consequences of increased WM load in young adults closely resemble the effects of cognitive aging on distractor filtering [Gazzaley et al., (2005) Nature Neuroscience 8, 1298-1300], suggesting the possibility of a common underlying mechanism. PMID:19397858
Visual processing in the central bee brain.
Paulk, Angelique C; Dacks, Andrew M; Phillips-Portillo, James; Fellous, Jean-Marc; Gronenberg, Wulfila
2009-08-12
Visual scenes comprise enormous amounts of information from which nervous systems extract behaviorally relevant cues. In most model systems, little is known about the transformation of visual information as it occurs along visual pathways. We examined how visual information is transformed physiologically as it is communicated from the eye to higher-order brain centers using bumblebees, which are known for their visual capabilities. We recorded intracellularly in vivo from 30 neurons in the central bumblebee brain (the lateral protocerebrum) and compared these neurons to 132 neurons from more distal areas along the visual pathway, namely the medulla and the lobula. In these three brain regions (medulla, lobula, and central brain), we examined correlations between the neurons' branching patterns and their responses primarily to color, but also to motion stimuli. Visual neurons projecting to the anterior central brain were generally color sensitive, while neurons projecting to the posterior central brain were predominantly motion sensitive. The temporal response properties differed significantly between these areas, with an increase in spike time precision across trials and a decrease in average reliable spiking as visual information processing progressed from the periphery to the central brain. These data suggest that neurons along the visual pathway to the central brain not only are segregated with regard to the physical features of the stimuli (e.g., color and motion), but also differ in the way they encode stimuli, possibly to allow for efficient parallel processing to occur.
Memory-guided attention during active viewing of edited dynamic scenes.
Valuch, Christian; König, Peter; Ansorge, Ulrich
2017-01-01
Films, TV shows, and other edited dynamic scenes contain many cuts, which are abrupt transitions from one video shot to the next. Cuts occur within or between scenes, and often join together visually and semantically related shots. Here, we tested to which degree memory for the visual features of the precut shot facilitates shifting attention to the postcut shot. We manipulated visual similarity across cuts, and measured how this affected covert attention (Experiment 1) and overt attention (Experiments 2 and 3). In Experiments 1 and 2, participants actively viewed a target movie that randomly switched locations with a second, distractor movie at the time of the cuts. In Experiments 1 and 2, participants were able to deploy attention more rapidly and accurately to the target movie's continuation when visual similarity was high than when it was low. Experiment 3 tested whether this could be explained by stimulus-driven (bottom-up) priming by feature similarity, using one clip at screen center that was followed by two alternative continuations to the left and right. Here, even the highest similarity across cuts did not capture attention. We conclude that following cuts of high visual similarity, memory-guided attention facilitates the deployment of attention, but this effect is (top-down) dependent on the viewer's active matching of scene content across cuts.
Wing, Erik A.; Ritchey, Maureen; Cabeza, Roberto
2015-01-01
Neurobiological memory models assume memory traces are stored in neocortex, with pointers in the hippocampus, and are then reactivated during retrieval, yielding the experience of remembering. Whereas most prior neuroimaging studies on reactivation have focused on the reactivation of sets or categories of items, the current study sought to identify cortical patterns pertaining to memory for individual scenes. During encoding, participants viewed pictures of scenes paired with matching labels (e.g., “barn,” “tunnel”), and, during retrieval, they recalled the scenes in response to the labels and rated the quality of their visual memories. Using representational similarity analyses, we interrogated the similarity between activation patterns during encoding and retrieval both at the item level (individual scenes) and the set level (all scenes). The study yielded four main findings. First, in occipitotemporal cortex, memory success increased with encoding-retrieval similarity (ERS) at the item level but not at the set level, indicating the reactivation of individual scenes. Second, in ventrolateral pFC, memory increased with ERS for both item and set levels, indicating the recapitulation of memory processes that benefit encoding and retrieval of all scenes. Third, in retrosplenial/posterior cingulate cortex, ERS was sensitive to individual scene information irrespective of memory success, suggesting automatic activation of scene contexts. Finally, consistent with neurobiological models, hippocampal activity during encoding predicted the subsequent reactivation of individual items. These findings show the promise of studying memory with greater specificity by isolating individual mnemonic representations and determining their relationship to factors like the detail with which past events are remembered. PMID:25313659
Charrier, A; Tardif, C; Gepner, B
2017-02-01
Face and gaze avoidance are among the most characteristic and salient symptoms of autism spectrum disorders (ASD). Studies using eye tracking have highlighted early and lifelong ASD-specific abnormalities in attention to faces, such as decreased attention to internal facial features. These specificities could be partly explained by disorders in the perception and integration of rapid and complex information, such as that conveyed by facial movements and, more broadly, by the biological and physical environment. We therefore wished to test whether slowing down facial dynamics may improve the way children with ASD attend to a face. We used eye tracking to examine the gaze patterns of children with ASD aged 3 to 8 (n=23) and TD controls (n=29) while they viewed the face of a speaker telling a story. The story was divided into 6 sequences that were randomly displayed at 3 different speeds: a real-time speed (RT), a slow speed (S70=70% of RT speed), and a very slow speed (S50=50% of RT speed). S70 and S50 were displayed using software called Logiral™, which slows down visual and auditory stimuli simultaneously and without tone distortion. The visual scene was divided into four regions of interest (ROI): eyes region; mouth region; whole face region; outside the face region. The total time, number and mean duration of visual fixations on the whole visual scene and the four ROI were measured between and within the two groups. Compared to TD children, children with ASD spent significantly less time attending to the visual scenes and, when they looked at the scene, they spent less time scanning the speaker's face in general and her mouth in particular, and more time looking outside the facial area. Within the ASD group, the mean duration of fixation increased on the whole scene, and particularly on the mouth area, in S50 compared to RT. Children with mild autism spent more time looking at the face than the two other groups of ASD children, and spent more time attending to the face and mouth, with longer mean durations of visual fixation on the mouth and eyes, at the slow speeds (S50 and/or S70) than at RT. Slowing down facial dynamics enhanced looking time on the face, and particularly on the mouth and/or eyes, in a group of 23 children with ASD, and especially in a small subgroup with mild autism. Given the crucial role of reading the eyes for emotional processing and that of lip-reading for language processing, our present result and other converging ones could pave the way for novel socio-emotional and verbal rehabilitation methods for the autistic population. Further studies should investigate whether increased attention to the face, and particularly to the eyes and mouth, is correlated with emotional/social and/or verbal/language improvements. Copyright © 2016 L'Encéphale, Paris. Published by Elsevier Masson SAS. All rights reserved.
NASA Astrophysics Data System (ADS)
Qi, K.; Qingfeng, G.
2017-12-01
With the widespread use of High-Resolution Satellite (HRS) images, more and more research effort has been devoted to land-use scene classification. The task is difficult with HRS images, however, because of the complex backgrounds and the multiple land-cover classes or objects they contain. This article presents a multiscale deeply described correlaton model for land-use scene classification. Specifically, a convolutional neural network is introduced to learn and characterize local features at different scales. The learnt multiscale deep features are then used to generate visual words. The spatial arrangement of visual words is captured through adaptive vector quantized correlograms at different scales. Experiments on two publicly available land-use scene datasets demonstrate that the proposed model is compact yet discriminative for efficient representation of land-use scene images, and achieves classification results that are competitive with state-of-the-art methods.
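A rough sketch of a descriptor pipeline along these lines might look as follows. It is an assumption-laden stand-in, not the paper's implementation: a trivial patch statistic replaces the CNN features so the example stays self-contained, k-means forms the visual words, and a word co-occurrence correlogram over a few distance bins captures spatial arrangement.

```python
# Sketch: multiscale local features -> visual words -> co-occurrence correlogram.
import numpy as np
from sklearn.cluster import KMeans

def local_features(img, patch=8, stride=8):
    feats, coords = [], []
    for y in range(0, img.shape[0] - patch + 1, stride):
        for x in range(0, img.shape[1] - patch + 1, stride):
            p = img[y:y + patch, x:x + patch].astype(float)
            feats.append([p.mean(), p.std()])     # stand-in for deep features
            coords.append((y + patch // 2, x + patch // 2))
    return np.array(feats), np.array(coords, dtype=float)

def correlogram(words, coords, n_words, dist_bins=(0, 16, 32, 64)):
    # Count how often word pairs co-occur within each distance bin.
    hist = np.zeros((len(dist_bins) - 1, n_words, n_words))
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    for b in range(len(dist_bins) - 1):
        mask = (d >= dist_bins[b]) & (d < dist_bins[b + 1])
        for i, j in zip(*np.nonzero(mask)):
            hist[b, words[i], words[j]] += 1
    return hist.ravel() / max(hist.sum(), 1)      # normalized descriptor

def describe(images, scales=(1.0, 0.5), n_words=32):
    pooled, per_scale = [], []
    for img in images:
        for s in scales:
            step = int(round(1 / s))
            f, c = local_features(img[::step, ::step])   # crude multiscale by subsampling
            per_scale.append((f, c))
            pooled.append(f)
    km = KMeans(n_clusters=n_words, n_init=4).fit(np.vstack(pooled))
    # One descriptor per (image, scale); concatenating across scales gives the
    # final image representation a classifier (e.g. an SVM) would be trained on.
    return [correlogram(km.predict(f), c, n_words) for f, c in per_scale]
```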
To search or to like: Mapping fixations to differentiate two forms of incidental scene memory.
Choe, Kyoung Whan; Kardan, Omid; Kotabe, Hiroki P; Henderson, John M; Berman, Marc G
2017-10-01
We employed eye-tracking to investigate how performing different tasks on scenes (e.g., intentionally memorizing them, searching for an object, evaluating aesthetic preference) can affect eye movements during encoding and subsequent scene memory. We found that scene memorability decreased after visual search (one incidental encoding task) compared to intentional memorization, and that preference evaluation (another incidental encoding task) produced better memory, similar to the incidental memory boost previously observed for words and faces. By analyzing fixation maps, we found that although fixation map similarity could explain how eye movements during visual search impair incidental scene memory, it could not explain the incidental memory boost from aesthetic preference evaluation, implying that implicit mechanisms were at play. We conclude that not all incidental encoding tasks should be taken to be similar, as different mechanisms (e.g., explicit or implicit) lead to memory enhancements or decrements for different incidental encoding tasks.
Short-term memory for figure-ground organization in the visual cortex.
O'Herron, Philip; von der Heydt, Rüdiger
2009-03-12
Whether the visual system uses a buffer to store image information and the duration of that storage have been debated intensely in recent psychophysical studies. The long phases of stable perception of reversible figures suggest a memory that persists for seconds. But persistence of similar duration has not been found in signals of the visual cortex. Here, we show that figure-ground signals in the visual cortex can persist for a second or more after the removal of the figure-ground cues. When new figure-ground information is presented, the signals adjust rapidly, but when a figure display is changed to an ambiguous edge display, the signals decay slowly--a behavior that is characteristic of memory devices. Figure-ground signals represent the layout of objects in a scene, and we propose that a short-term memory for object layout is important in providing continuity of perception in the rapid stream of images flooding our eyes.
Feed-forward segmentation of figure-ground and assignment of border-ownership.
Supèr, Hans; Romeo, August; Keil, Matthias
2010-05-19
Figure-ground organization is the segmentation of visual information into objects and their surrounding backgrounds. Two main processes herein are boundary assignment and surface segregation, which rely on the integration of global scene information. Recurrent processing, either by intrinsic horizontal connections between neighbouring neurons or by feedback projections from higher visual areas, provides such information and is considered to be the neural substrate for figure-ground segmentation. In contrast, the role of feedforward projections in figure-ground segmentation is unknown. To better understand the role of feedforward connections in figure-ground organization, we constructed a feedforward spiking model using a biologically plausible neuron model. By means of surround inhibition, our simple 3-layered model performs figure-ground segmentation and one-sided border-ownership coding. We propose that the visual system uses feedforward suppression for figure-ground segmentation and border-ownership assignment.
Feed-Forward Segmentation of Figure-Ground and Assignment of Border-Ownership
Supèr, Hans; Romeo, August; Keil, Matthias
2010-01-01
Figure-ground organization is the segmentation of visual information into objects and their surrounding backgrounds. Two main processes herein are boundary assignment and surface segregation, which rely on the integration of global scene information. Recurrent processing, either by intrinsic horizontal connections between neighbouring neurons or by feedback projections from higher visual areas, provides such information and is considered to be the neural substrate for figure-ground segmentation. In contrast, the role of feedforward projections in figure-ground segmentation is unknown. To better understand the role of feedforward connections in figure-ground organization, we constructed a feedforward spiking model using a biologically plausible neuron model. By means of surround inhibition, our simple 3-layered model performs figure-ground segmentation and one-sided border-ownership coding. We propose that the visual system uses feedforward suppression for figure-ground segmentation and border-ownership assignment. PMID:20502718
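A far simpler, rate-based illustration of the surround-inhibition idea (not the spiking model itself, and with arbitrary parameters) is sketched below: activity that survives subtraction of a broad surround average marks the figure region against a plain background.

```python
# Toy illustration of surround inhibition as a figure-ground cue (a sketch only).
import numpy as np
from scipy.ndimage import gaussian_filter

def figure_map(local_activity, surround_sigma=15.0, k=1.0):
    # Subtract a broad surround average; what remains marks the figure.
    surround = gaussian_filter(local_activity, surround_sigma)
    return np.maximum(local_activity - k * surround, 0.0)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    scene = np.zeros((128, 128))
    scene[48:80, 48:80] = rng.random((32, 32))       # textured "figure" patch
    fg = figure_map(scene)
    print(fg[48:80, 48:80].mean() > fg[:32, :32].mean())   # True: figure beats background
```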
Web GIS in practice VII: stereoscopic 3-D solutions for online maps and virtual globes
Boulos, Maged N Kamel; Robinson, Larry R
2009-01-01
Because our pupils are about 6.5 cm apart, each eye views a scene from a different angle and sends a unique image to the visual cortex, which then merges the images from both eyes into a single picture. The slight difference between the right and left images allows the brain to properly perceive the 'third dimension' or depth in a scene (stereopsis). However, when a person views a conventional 2-D (two-dimensional) image representation of a 3-D (three-dimensional) scene on a conventional computer screen, each eye receives essentially the same information. Depth in such cases can only be approximately inferred from visual clues in the image, such as perspective, as only one image is offered to both eyes. The goal of stereoscopic 3-D displays is to project a slightly different image into each eye to achieve a much truer and realistic perception of depth, of different scene planes, and of object relief. This paper presents a brief review of a number of stereoscopic 3-D hardware and software solutions for creating and displaying online maps and virtual globes (such as Google Earth) in "true 3D", with costs ranging from almost free to multi-thousand pounds sterling. A practical account is also given of the experience of the USGS BRD UMESC (United States Geological Survey's Biological Resources Division, Upper Midwest Environmental Sciences Center) in setting up a low-cost, full-colour stereoscopic 3-D system. PMID:19849837
Web GIS in practice VII: stereoscopic 3-D solutions for online maps and virtual globes.
Boulos, Maged N Kamel; Robinson, Larry R
2009-10-22
Because our pupils are about 6.5 cm apart, each eye views a scene from a different angle and sends a unique image to the visual cortex, which then merges the images from both eyes into a single picture. The slight difference between the right and left images allows the brain to properly perceive the 'third dimension' or depth in a scene (stereopsis). However, when a person views a conventional 2-D (two-dimensional) image representation of a 3-D (three-dimensional) scene on a conventional computer screen, each eye receives essentially the same information. Depth in such cases can only be approximately inferred from visual clues in the image, such as perspective, as only one image is offered to both eyes. The goal of stereoscopic 3-D displays is to project a slightly different image into each eye to achieve a much truer and realistic perception of depth, of different scene planes, and of object relief. This paper presents a brief review of a number of stereoscopic 3-D hardware and software solutions for creating and displaying online maps and virtual globes (such as Google Earth) in "true 3D", with costs ranging from almost free to multi-thousand pounds sterling. A practical account is also given of the experience of the USGS BRD UMESC (United States Geological Survey's Biological Resources Division, Upper Midwest Environmental Sciences Center) in setting up a low-cost, full-colour stereoscopic 3-D system.
Web GIS in practice VII: stereoscopic 3-D solutions for online maps and virtual globes
Boulos, Maged N.K.; Robinson, Larry R.
2009-01-01
Because our pupils are about 6.5 cm apart, each eye views a scene from a different angle and sends a unique image to the visual cortex, which then merges the images from both eyes into a single picture. The slight difference between the right and left images allows the brain to properly perceive the 'third dimension' or depth in a scene (stereopsis). However, when a person views a conventional 2-D (two-dimensional) image representation of a 3-D (three-dimensional) scene on a conventional computer screen, each eye receives essentially the same information. Depth in such cases can only be approximately inferred from visual clues in the image, such as perspective, as only one image is offered to both eyes. The goal of stereoscopic 3-D displays is to project a slightly different image into each eye to achieve a much truer and realistic perception of depth, of different scene planes, and of object relief. This paper presents a brief review of a number of stereoscopic 3-D hardware and software solutions for creating and displaying online maps and virtual globes (such as Google Earth) in "true 3D", with costs ranging from almost free to multi-thousand pounds sterling. A practical account is also given of the experience of the USGS BRD UMESC (United States Geological Survey's Biological Resources Division, Upper Midwest Environmental Sciences Center) in setting up a low-cost, full-colour stereoscopic 3-D system.
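One classic low-cost technique for delivering a slightly different image to each eye on an ordinary 2-D monitor is the red-cyan anaglyph viewed through matching filter glasses (whether the review covers this particular option is not stated here). The sketch below builds one from a stereo pair; the file names are placeholders.

```python
# Minimal anaglyph sketch: left-eye image in the red channel, right-eye image
# in the green and blue channels, viewed with red-cyan glasses.
import numpy as np
from PIL import Image

def make_anaglyph(left_path, right_path, out_path="anaglyph.png"):
    left = np.asarray(Image.open(left_path).convert("RGB"))
    right = np.asarray(Image.open(right_path).convert("RGB"))
    out = np.zeros_like(left)
    out[..., 0] = left[..., 0]        # red channel from the left-eye view
    out[..., 1] = right[..., 1]       # green and blue channels from the right-eye view
    out[..., 2] = right[..., 2]
    Image.fromarray(out).save(out_path)

# make_anaglyph("left.png", "right.png")   # requires a stereo pair of equal size
```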
DspaceOgre 3D Graphics Visualization Tool
NASA Technical Reports Server (NTRS)
Jain, Abhinandan; Myin, Steven; Pomerantz, Marc I.
2011-01-01
This general-purpose 3D graphics visualization C++ tool is designed for visualization of simulation and analysis data for articulated mechanisms. Examples of such systems are vehicles, robotic arms, biomechanics models, and biomolecular structures. DspaceOgre builds upon the open-source Ogre3D graphics visualization library. It provides additional classes to support the management of complex scenes involving multiple viewpoints and different scene groups, and can be used as a remote graphics server. The software adds support for programs running at the graphics processing unit (GPU) level for better performance, and improves the messaging interface it exposes for use as a visualization server.
Evans, Karla K; Horowitz, Todd S; Howe, Piers; Pedersini, Roccardo; Reijnen, Ester; Pinto, Yair; Kuzmova, Yoana; Wolfe, Jeremy M
2011-09-01
A typical visual scene we encounter in everyday life is complex and filled with a huge amount of perceptual information. The term 'visual attention' describes a set of mechanisms that limit some processing to a subset of incoming stimuli. Attentional mechanisms shape what we see and what we can act upon. They allow for concurrent selection of some (preferably, relevant) information and inhibition of other information. This selection permits the reduction of complexity and informational overload. Selection can be determined both by the 'bottom-up' saliency of information from the environment and by the 'top-down' state and goals of the perceiver. Attentional effects can take the form of modulating or enhancing the selected information. A central role for selective attention is to enable the 'binding' of selected information into unified and coherent representations of objects in the outside world. In the overview of visual attention presented here we review the mechanisms and consequences of selection and inhibition over space and time. We examine theoretical, behavioral and neurophysiologic work done on visual attention. We also discuss the relations between attention and other cognitive processes such as automaticity and awareness. WIREs Cogn Sci 2011, 2, 503-514. DOI: 10.1002/wcs.127. Copyright © 2011 John Wiley & Sons, Ltd.
Correlation between Inter-Blink Interval and Episodic Encoding during Movie Watching.
Shin, Young Seok; Chang, Won-du; Park, Jinsick; Im, Chang-Hwan; Lee, Sang In; Kim, In Young; Jang, Dong Pyo
2015-01-01
Human eye blinking is cognitively suppressed to minimize loss of visual information for important real-world events. Despite the relationship between eye blinking and cognitive state, the effect of eye blinks on cognition in real-world environments has received limited research attention. In this study, we focused on the temporal pattern of inter-eye blink interval (IEBI) during movie watching and investigated its relationship with episodic memory. As a control condition, 24 healthy subjects watched a nature documentary that lacked a specific story line while electroencephalography was performed. Immediately after viewing the movie, the subjects were asked to report its most memorable scene. Four weeks later, subjects were asked to score 32 randomly selected scenes from the movie, based on how much they were able to remember and describe. The results showed that the average IEBI was significantly longer during the movie than in the control condition. In addition, the significant increase in IEBI when watching a movie coincided with the most memorable scenes of the movie. The results suggested that the interesting episodic narrative of the movie attracted the subjects' visual attention relative to the documentary clip that did not have a story line. In the episodic memory test executed four weeks later, memory performance was significantly positively correlated with IEBI (p<0.001). In summary, IEBI may be a reliable bio-marker of the degree of concentration on naturalistic content that requires visual attention, such as a movie.
Correlation between Inter-Blink Interval and Episodic Encoding during Movie Watching
Shin, Young Seok; Chang, Won-du; Park, Jinsick; Im, Chang-Hwan; Lee, Sang In; Kim, In Young; Jang, Dong Pyo
2015-01-01
Human eye blinking is cognitively suppressed to minimize loss of visual information for important real-world events. Despite the relationship between eye blinking and cognitive state, the effect of eye blinks on cognition in real-world environments has received limited research attention. In this study, we focused on the temporal pattern of inter-eye blink interval (IEBI) during movie watching and investigated its relationship with episodic memory. As a control condition, 24 healthy subjects watched a nature documentary that lacked a specific story line while electroencephalography was performed. Immediately after viewing the movie, the subjects were asked to report its most memorable scene. Four weeks later, subjects were asked to score 32 randomly selected scenes from the movie, based on how much they were able to remember and describe. The results showed that the average IEBI was significantly longer during the movie than in the control condition. In addition, the significant increase in IEBI when watching a movie coincided with the most memorable scenes of the movie. The results suggested that the interesting episodic narrative of the movie attracted the subjects’ visual attention relative to the documentary clip that did not have a story line. In the episodic memory test executed four weeks later, memory performance was significantly positively correlated with IEBI (p<0.001). In summary, IEBI may be a reliable bio-marker of the degree of concentration on naturalistic content that requires visual attention, such as a movie. PMID:26529091
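A minimal sketch of how such an analysis could be set up (an assumption for illustration, not the authors' code): mean inter-blink intervals are computed per scene from blink timestamps and correlated with later memory scores for the same scenes.

```python
# Sketch: mean inter-blink interval (IEBI) per scene and its correlation with memory.
import numpy as np
from scipy.stats import pearsonr

def mean_iebi(blink_times_s):
    blinks = np.sort(np.asarray(blink_times_s, dtype=float))
    return np.diff(blinks).mean() if len(blinks) > 1 else np.nan

def iebi_memory_correlation(blink_times_per_scene, memory_scores):
    # blink_times_per_scene: one array of blink timestamps per scene (hypothetical input);
    # memory_scores: later recall rating for each scene (hypothetical input).
    iebi = np.array([mean_iebi(b) for b in blink_times_per_scene])
    ok = ~np.isnan(iebi)
    return pearsonr(iebi[ok], np.asarray(memory_scores, dtype=float)[ok])
```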
Memel, Molly; Ryan, Lee
2017-06-01
The ability to remember associations between previously unrelated pieces of information is often impaired in older adults (Naveh-Benjamin, 2000). Unitization, the process of creating a perceptually or semantically integrated representation that includes both items in an associative pair, attenuates age-related associative deficits (Bastin et al., 2013; Ahmad et al., 2015; Zheng et al., 2015). Compared to non-unitized pairs, unitized pairs may rely less on hippocampally-mediated binding associated with recollection, and more on familiarity-based processes mediated by perirhinal cortex (PRC) and parahippocampal cortex (PHC). While unitization of verbal materials improves associative memory in older adults, less is known about the impact of visual integration. The present study determined whether visual integration improves associative memory in older adults by minimizing the need for hippocampal (HC) recruitment and shifting encoding to non-hippocampal medial temporal structures, such as the PRC and PHC. Young and older adults were presented with a series of objects paired with naturalistic scenes while undergoing fMRI scanning, and were later given an associative memory test. Visual integration was varied by presenting the object either next to the scene (Separated condition) or visually integrated within the scene (Combined condition). Visual integration improved associative memory among young and older adults to a similar degree by increasing the hit rate for intact pairs, but without increasing false alarms for recombined pairs, suggesting enhanced recollection rather than increased reliance on familiarity. Also contrary to expectations, visual integration resulted in increased hippocampal activation in both age groups, along with increases in PRC and PHC activation. Activation in all three MTL regions predicted discrimination performance during the Separated condition in young adults, while only a marginal relationship between PRC activation and performance was observed during the Combined condition. Older adults showed less overall activation in MTL regions compared to young adults, and associative memory performance was most strongly predicted by prefrontal, rather than MTL, activation. We suggest that visual integration benefits both young and older adults similarly, and provides a special case of unitization that may be mediated by recollective, rather than familiarity-based encoding processes. Copyright © 2017 Elsevier Ltd. All rights reserved.
Absolute Depth Sensitivity in Cat Primary Visual Cortex under Natural Viewing Conditions.
Pigarev, Ivan N; Levichkina, Ekaterina V
2016-01-01
Mechanisms of 3D perception, investigated in many laboratories, have defined depth either relative to the fixation plane or to other objects in the visual scene. It is obvious that for efficient perception of the 3D world, additional mechanisms of depth constancy could operate in the visual system to provide information about absolute distance. Neurons with properties reflecting some features of depth constancy have been described in the parietal and extrastriate occipital cortical areas. It has also been shown that, for some neurons in the visual area V1, responses to stimuli of constant angular size differ at close and remote distances. The present study was designed to investigate whether, in natural free gaze viewing conditions, neurons tuned to absolute depths can be found in the primary visual cortex (area V1). Single-unit extracellular activity was recorded from the visual cortex of waking cats sitting on a trolley in front of a large screen. The trolley was slowly approaching the visual scene, which consisted of stationary sinusoidal gratings of optimal orientation rear-projected over the whole surface of the screen. Each neuron was tested with two gratings, with spatial frequency of one grating being twice as high as that of the other. Assuming that a cell is tuned to a spatial frequency, its maximum response to the grating with a spatial frequency twice as high should be shifted to a distance half way closer to the screen in order to attain the same size of retinal projection. For hypothetical neurons selective to absolute depth, location of the maximum response should remain at the same distance irrespective of the type of stimulus. It was found that about 20% of neurons in our experimental paradigm demonstrated sensitivity to particular distances independently of the spatial frequencies of the gratings. We interpret these findings as an indication of the use of absolute depth information in the primary visual cortex.
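The viewing-distance logic behind this two-grating test can be written out under the small-angle approximation (a restatement of the reasoning above, not additional material from the paper):

```latex
% f : grating spatial frequency on the screen (cycles per unit length)
% d : distance between the eye and the screen
% One grating period of length 1/f subtends an angle of roughly 1/(f d) radians,
% so the spatial frequency projected on the retina is
f_{\mathrm{ret}} \;\approx\; f\, d \quad \text{(cycles per radian)} .
% Doubling f therefore reproduces the same retinal projection at half the distance,
(2f)\cdot \tfrac{d}{2} \;=\; f\, d ,
% which is why a cell tuned to retinal spatial frequency should shift its response
% peak to half the distance, while a cell tuned to absolute depth should not.
```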
Machine learning-based augmented reality for improved surgical scene understanding.
Pauly, Olivier; Diotte, Benoit; Fallavollita, Pascal; Weidert, Simon; Euler, Ekkehard; Navab, Nassir
2015-04-01
In orthopedic and trauma surgery, AR technology can support surgeons in the challenging task of understanding the spatial relationships between the anatomy, the implants and their tools. In this context, we propose a novel augmented visualization of the surgical scene that mixes intelligently the different sources of information provided by a mobile C-arm combined with a Kinect RGB-Depth sensor. Therefore, we introduce a learning-based paradigm that aims at (1) identifying the relevant objects or anatomy in both Kinect and X-ray data, and (2) creating an object-specific pixel-wise alpha map that permits relevance-based fusion of the video and the X-ray images within one single view. In 12 simulated surgeries, we show very promising results aiming at providing for surgeons a better surgical scene understanding as well as an improved depth perception. Copyright © 2014 Elsevier Ltd. All rights reserved.
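The relevance-based fusion step described above amounts to per-pixel alpha blending of the two image sources. A minimal sketch, assuming the learned pixel-wise relevance map is already available as an array:

```python
# Per-pixel alpha blending of an RGB video frame and an X-ray image.
import numpy as np

def fuse(video_rgb, xray_gray, alpha):
    # video_rgb: (H, W, 3) floats in [0, 1]; xray_gray: (H, W); alpha: (H, W) in [0, 1].
    xray_rgb = np.repeat(xray_gray[..., None], 3, axis=-1)
    return alpha[..., None] * video_rgb + (1.0 - alpha[..., None]) * xray_rgb
```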
How the deployment of attention determines what we see
Treisman, Anne
2007-01-01
Attention is a tool to adapt what we see to our current needs. It can be focused narrowly on a single object or spread over several or distributed over the scene as a whole. In addition to increasing or decreasing the number of attended objects, these different deployments may have different effects on what we see. This chapter describes some research both on focused attention and its use in binding features, and on distributed attention and the kinds of information we gain and lose with the attention window opened wide. One kind of processing that we suggest occurs automatically with distributed attention results in a statistical description of sets of similar objects. Another gives the gist of the scene, which may be inferred from sets of features registered in parallel. Flexible use of these different modes of attention allows us to reconcile sharp capacity limits with a richer understanding of the visual scene. PMID:17387378
Impact of LANDSAT MSS sensor differences on change detection analysis
NASA Technical Reports Server (NTRS)
Likens, W. C.; Wrigley, R. C.
1983-01-01
Some 512 by 512 pixel subwindows from simultaneously acquired scene pairs obtained by the LANDSAT 2, 3 and 4 multispectral scanners (MSS) were coregistered, using the LANDSAT 4 scenes as the base to which the other images were registered. Scattergrams between the coregistered scenes (a form of contingency analysis) were used to radiometrically compare data from the various sensors. Mode values were derived and used to visually fit a linear regression. Root mean square errors of the registration varied between 0.1 and 1.5 pixels. There appear to be no major problems preventing the use of LANDSAT 4 MSS with previous MSS sensors for change detection, provided the noise interference can be removed or minimized. Data normalizations for change detection should be based on the data themselves rather than solely on calibration information; this allows simultaneous normalization of the atmosphere as well as the radiometry.
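A minimal sketch of a scattergram-based radiometric normalization of this kind is shown below; it uses a least-squares fit through the per-column modes, whereas the original analysis fit the line visually, and the synthetic data merely stand in for two coregistered scenes.

```python
# Sketch: scattergram between coregistered bands, per-column mode, linear fit.
import numpy as np

def normalization_gain_offset(reference, target, n_levels=128):
    # 2-D histogram ("scattergram") of target vs. reference digital numbers.
    hist, xedges, yedges = np.histogram2d(
        reference.ravel(), target.ravel(), bins=n_levels)
    ref_centers = 0.5 * (xedges[:-1] + xedges[1:])
    tgt_centers = 0.5 * (yedges[:-1] + yedges[1:])
    valid = hist.sum(axis=1) > 0
    modes = tgt_centers[np.argmax(hist, axis=1)]   # mode of target per reference column
    gain, offset = np.polyfit(ref_centers[valid], modes[valid], deg=1)
    return gain, offset

def normalize(target, gain, offset):
    # Map target-sensor values onto the reference sensor's radiometry.
    return (target - offset) / gain

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.integers(0, 128, size=(512, 512)).astype(float)
    tgt = 1.1 * ref + 5 + rng.normal(0, 2, size=ref.shape)
    g, o = normalization_gain_offset(ref, tgt)
    print(f"estimated gain={g:.2f}, offset={o:.2f}")   # roughly 1.1 and 5
```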
Perceptual load in different regions of the visual scene and its relevance for driving.
Marciano, Hadas; Yeshurun, Yaffa
2015-06-01
The aim of this study was to better understand the role played by perceptual load, at both central and peripheral regions of the visual scene, in driving safety. Attention is a crucial factor in driving safety, and previous laboratory studies suggest that perceptual load is an important factor determining the efficiency of attentional selectivity. Yet, the effects of perceptual load on driving were never studied systematically. Using a driving simulator, we orthogonally manipulated the load levels at the road (central load) and its sides (peripheral load), while occasionally introducing critical events at one of these regions. Perceptual load affected driving performance at both regions of the visual scene. Critically, the effect was different for central versus peripheral load: Whereas load levels on the road mainly affected driving speed, load levels on its sides mainly affected the ability to detect critical events initiating from the roadsides. Moreover, higher levels of peripheral load impaired performance but mainly with low levels of central load, replicating findings with simple letter stimuli. Perceptual load has a considerable effect on driving, but the nature of this effect depends on the region of the visual scene at which the load is introduced. Given the observed importance of perceptual load, authors of future studies of driving safety should take it into account. Specifically, these findings suggest that our understanding of factors that may be relevant for driving safety would benefit from studying these factors under different levels of load at different regions of the visual scene. © 2014, Human Factors and Ergonomics Society.
ERIC Educational Resources Information Center
Altmann, Gerry T. M.; Kamide, Yuki
2009-01-01
Two experiments explored the mapping between language and mental representations of visual scenes. In both experiments, participants viewed, for example, a scene depicting a woman, a wine glass and bottle on the floor, an empty table, and various other objects. In Experiment 1, participants concurrently heard either "The woman will put the glass…
ERIC Educational Resources Information Center
Baxter, Mark G.; Browning, Philip G. F.; Mitchell, Anna S.
2008-01-01
Surgical disconnection of the frontal cortex and inferotemporal cortex severely impairs many aspects of visual learning and memory, including learning of new object-in-place scene memory problems, a monkey model of episodic memory. As part of a study of specialization within prefrontal cortex in visual learning and memory, we tested monkeys with…
Change Blindness Phenomena for Virtual Reality Display Systems.
Steinicke, Frank; Bruder, Gerd; Hinrichs, Klaus; Willemsen, Pete
2011-09-01
In visual perception, change blindness describes the phenomenon that persons viewing a visual scene may fail to detect significant changes in that scene. These phenomena have been observed in both computer-generated imagery and real-world scenes. Several studies have demonstrated that change blindness effects occur primarily during visual disruptions such as blinks or saccadic eye movements. However, until now the influence of stereoscopic vision on change blindness has not been studied thoroughly in the context of visual perception research. In this paper, we introduce change blindness techniques for stereoscopic virtual reality (VR) systems, providing the ability to substantially modify a virtual scene in a manner that is difficult for observers to perceive. We evaluate the techniques for semi-immersive VR systems, i.e., passive and active stereoscopic projection systems, as well as for an immersive VR system, i.e., a head-mounted display, and compare the results to those obtained under monoscopic viewing conditions. For stereoscopic viewing conditions, we found that change blindness phenomena occur with the same magnitude as in monoscopic viewing conditions. Furthermore, we have evaluated the potential of the presented techniques for allowing abrupt, and yet significant, changes of a stereoscopically displayed virtual reality environment.
Watanabe, Hiroshi; Teramoto, Wataru; Umemura, Hiroyuki
2007-01-01
Objective: We studied the effects of the presentation of a visual sign that warned subjects of acceleration around the yaw and pitch axes in virtual reality (VR) on their heart rate variability. Methods: Synchronization of the immersive virtual reality equipment (CAVE) and motion base system generated a driving scene and provided subjects with dynamic and wide-ranging depth information and vestibular input. The heart rate variability of 21 subjects was measured while the subjects observed a simulated driving scene for 16 minutes under three different conditions. Results: When the predictive sign of the acceleration appeared 3500 ms before the acceleration, the index of the activity of the autonomic nervous system (low/high frequency ratio; LF/HF ratio) of subjects did not change much, whereas when no sign appeared the LF/HF ratio increased over the observation time. When the predictive sign of the acceleration appeared 750 ms before the acceleration, no systematic change occurred. Conclusion: The visual sign which informed subjects of the acceleration affected the activity of the autonomic nervous system when it appeared long enough before the acceleration. Also, our results showed the importance of the interval between the sign and the event and the relationship between the gradual representation of events and their quantity. PMID:17903267
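For reference, the LF/HF ratio is conventionally estimated from the series of R-R intervals; the sketch below follows standard heart-rate-variability practice and is not necessarily the paper's exact pipeline.

```python
# Sketch: LF/HF ratio from R-R intervals via a Welch power spectrum.
import numpy as np
from scipy.signal import welch

def lf_hf_ratio(rr_intervals_s, fs=4.0):
    rr = np.asarray(rr_intervals_s, dtype=float)
    beat_times = np.cumsum(rr)
    # Resample the irregularly sampled R-R series onto a uniform grid.
    t = np.arange(beat_times[0], beat_times[-1], 1.0 / fs)
    rr_uniform = np.interp(t, beat_times, rr)
    f, pxx = welch(rr_uniform - rr_uniform.mean(), fs=fs, nperseg=min(256, len(t)))
    lf_band = (f >= 0.04) & (f < 0.15)          # conventional low-frequency band
    hf_band = (f >= 0.15) & (f < 0.40)          # conventional high-frequency band
    lf = np.trapz(pxx[lf_band], f[lf_band])
    hf = np.trapz(pxx[hf_band], f[hf_band])
    return lf / hf
```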
Large, David R; Crundall, Elizabeth; Burnett, Gary; Harvey, Catherine; Konstantopoulos, Panos
2016-07-01
Drivers' awareness of the rearward road scene is critical when contemplating or executing lane-change manoeuvres, such as overtaking. Preliminary investigations have speculated on the use of rear-facing cameras to relay images to displays mounted inside the car to create 'digital mirrors'. These may overcome many of the limitations associated with traditional 'wing' and rear-view mirrors, yet will inevitably affect drivers' normal visual scanning behaviour, and may force them to consider the rearward road scene from an unfamiliar perspective that is incongruent with their mental model of the outside world. We describe a study conducted within a medium-fidelity simulator aiming to explore the visual behaviour, driving performance and opinions of drivers while using internally located digital mirrors during different overtaking manoeuvres. Using a generic UK motorway scenario, thirty-eight experienced drivers conducted overtaking manoeuvres using each of five different layouts of digital mirrors with varying degrees of 'real-world' mapping. The results showed reductions in decision time for lane changes and eyes-off-road time while using the digital mirrors, when compared with baseline traditional reflective mirrors, suggesting that digital displays may enable drivers to more rapidly pick up the salient information from the rearward road scene. Subjectively, drivers preferred configurations that most closely matched existing mirror locations, where aspects of real-world mapping were largely preserved. The research highlights important human factors issues that require further investigation prior to further development/implementation of digital mirrors within vehicles. Future work should also aim to validate findings within real-world on-road environments whilst considering the effects of digital mirrors on other important visual behaviour characteristics, such as depth perception. Copyright © 2016 Elsevier Ltd and The Ergonomics Society. All rights reserved.
Semantic congruence affects hippocampal response to repetition of visual associations.
McAndrews, Mary Pat; Girard, Todd A; Wilkins, Leanne K; McCormick, Cornelia
2016-09-01
Recent research has shown complementary engagement of the hippocampus and medial prefrontal cortex (mPFC) in encoding and retrieving associations based on pre-existing or experimentally-induced schemas, such that the latter supports schema-congruent information whereas the former is more engaged for incongruent or novel associations. Here, we attempted to explore some of the boundary conditions in the relative involvement of those structures in short-term memory for visual associations. The current literature is based primarily on intentional evaluation of schema-target congruence and on study-test paradigms with relatively long delays between learning and retrieval. We used a continuous recognition paradigm to investigate hippocampal and mPFC activation to first and second presentations of scene-object pairs as a function of semantic congruence between the elements (e.g., beach-seashell versus schoolyard-lamp). All items were identical at first and second presentation, and the context scene, which was presented 500 ms prior to the appearance of the target object, was incidental to the task, which required a recognition response to the central target only. Very short lags (2-8 intervening stimuli) occurred between presentations. Encoding the targets with congruent contexts was associated with increased activation in visual cortical regions at initial presentation and faster response time at repetition, but we did not find enhanced activation in mPFC relative to incongruent stimuli at either presentation. We did observe enhanced activation in the right anterior hippocampus, as well as regions in visual and lateral temporal and frontal cortical regions, for the repetition of incongruent scene-object pairs. This pattern demonstrates rapid and incidental effects of schema processing in hippocampal, but not mPFC, engagement during continuous recognition. Copyright © 2016 Elsevier Ltd. All rights reserved.
Neural codes of seeing architectural styles
Choo, Heeyoung; Nasar, Jack L.; Nikrahei, Bardia; Walther, Dirk B.
2017-01-01
Images of iconic buildings, such as the CN Tower, instantly transport us to specific places, such as Toronto. Despite the substantial impact of architectural design on people’s visual experience of built environments, we know little about its neural representation in the human brain. In the present study, we have found patterns of neural activity associated with specific architectural styles in several high-level visual brain regions, but not in primary visual cortex (V1). This finding suggests that the neural correlates of the visual perception of architectural styles stem from style-specific complex visual structure beyond the simple features computed in V1. Surprisingly, the network of brain regions representing architectural styles included the fusiform face area (FFA) in addition to several scene-selective regions. Hierarchical clustering of error patterns further revealed that the FFA participated to a much larger extent in the neural encoding of architectural styles than entry-level scene categories. We conclude that the FFA is involved in fine-grained neural encoding of scenes at a subordinate-level, in our case, architectural styles of buildings. This study for the first time shows how the human visual system encodes visual aspects of architecture, one of the predominant and longest-lasting artefacts of human culture. PMID:28071765
Neural codes of seeing architectural styles.
Choo, Heeyoung; Nasar, Jack L; Nikrahei, Bardia; Walther, Dirk B
2017-01-10
Images of iconic buildings, such as the CN Tower, instantly transport us to specific places, such as Toronto. Despite the substantial impact of architectural design on people's visual experience of built environments, we know little about its neural representation in the human brain. In the present study, we have found patterns of neural activity associated with specific architectural styles in several high-level visual brain regions, but not in primary visual cortex (V1). This finding suggests that the neural correlates of the visual perception of architectural styles stem from style-specific complex visual structure beyond the simple features computed in V1. Surprisingly, the network of brain regions representing architectural styles included the fusiform face area (FFA) in addition to several scene-selective regions. Hierarchical clustering of error patterns further revealed that the FFA participated to a much larger extent in the neural encoding of architectural styles than entry-level scene categories. We conclude that the FFA is involved in fine-grained neural encoding of scenes at a subordinate-level, in our case, architectural styles of buildings. This study for the first time shows how the human visual system encodes visual aspects of architecture, one of the predominant and longest-lasting artefacts of human culture.
Moving through a multiplex holographic scene
NASA Astrophysics Data System (ADS)
Mrongovius, Martina
2013-02-01
This paper explores how movement can be used as a compositional element in installations of multiplex holograms. My holographic images are created from montages of hand-held video and photo-sequences. These spatially dynamic compositions are visually complex but anchored to landmarks and hints of the capturing process - such as the appearance of the photographer's shadow - to establish a sense of connection to the holographic scene. Moving around in front of the hologram, the viewer animates the holographic scene. A perception of motion then results from the viewer's bodily awareness of physical motion and the visual reading of dynamics within the scene or movement of perspective through a virtual suggestion of space. By linking and transforming the physical motion of the viewer with the visual animation, the viewer's bodily awareness - including proprioception, balance and orientation - play into the holographic composition. How multiplex holography can be a tool for exploring coupled, cross-referenced and transformed perceptions of movement is demonstrated with a number of holographic image installations. Through this process I expanded my creative composition practice to consider how dynamic and spatial scenes can be conveyed through the fragmented view of a multiplex hologram. This body of work was developed through an installation art practice and was the basis of my recently completed doctoral thesis: 'The Emergent Holographic Scene — compositions of movement and affect using multiplex holographic images'.
Ball, Felix; Elzemann, Anne; Busch, Niko A
2014-09-01
The change blindness paradigm, in which participants often fail to notice substantial changes in a scene, is a popular tool for studying scene perception, visual memory, and the link between awareness and attention. Some of the most striking and popular examples of change blindness have been demonstrated with digital photographs of natural scenes; in most studies, however, much simpler displays, such as abstract stimuli or "free-floating" objects, are used. Although simple displays have undeniable advantages, natural scenes remain a very useful and attractive stimulus for change blindness research. To assist researchers interested in using natural-scene stimuli in change blindness experiments, we provide here a step-by-step tutorial on how to produce changes in natural-scene images with a freely available image-processing tool (GIMP). We explain how changes in a scene can be made by deleting objects, relocating them within the scene, or changing the color of an object, in just a few simple steps. We also explain how the physical properties of such changes can be analyzed using GIMP and MATLAB (a high-level scientific programming tool). Finally, we present an experiment confirming that scenes manipulated according to our guidelines are effective in inducing change blindness, and demonstrating how change blindness relates both to the physical properties of the change and to inter-individual differences in performance measures. We expect that this tutorial will be useful for researchers interested in studying the mechanisms of change blindness, attention, or visual memory using natural scenes.
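For readers who prefer a scriptable route, a comparable change (recolouring a masked object) and a simple physical-change measure can be produced in Python. This is only a stand-in for the GIMP/MATLAB workflow the tutorial describes, and the file names are placeholders.

```python
# Sketch: recolour an object inside a hand-drawn mask and quantify the change.
import numpy as np
from PIL import Image

def recolor_object(scene_path, mask_path, out_path, hue_shift=60):
    scene = Image.open(scene_path).convert("RGB")
    mask = np.asarray(Image.open(mask_path).convert("L")) > 127   # white = object
    hsv = np.asarray(scene.convert("HSV")).copy()
    hsv[..., 0][mask] = (hsv[..., 0][mask].astype(int) + hue_shift) % 256
    changed = Image.fromarray(hsv, mode="HSV").convert("RGB")
    changed.save(out_path)
    return scene, changed, mask

def change_magnitude(original, changed, mask):
    a = np.asarray(original, dtype=float)
    b = np.asarray(changed, dtype=float)
    diff = np.abs(a - b).mean(axis=-1)
    # Two simple physical descriptors: changed area fraction and mean intensity difference.
    return mask.mean(), diff[mask].mean()

# original, changed, mask = recolor_object("scene.png", "object_mask.png", "scene_changed.png")
# area_fraction, mean_diff = change_magnitude(original, changed, mask)
```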
Sex differences in visual attention to erotic and non-erotic stimuli.
Lykins, Amy D; Meana, Marta; Strauss, Gregory P
2008-04-01
It has been suggested that sex differences in the processing of erotic material (e.g., memory, genital arousal, brain activation patterns) may also be reflected by differential attention to visual cues in erotic material. To test this hypothesis, we presented 20 heterosexual men and 20 heterosexual women with erotic and non-erotic images of heterosexual couples and tracked their eye movements during scene presentation. Results supported previous findings that erotic and non-erotic information was visually processed in a different manner by both men and women. Men looked at opposite sex figures significantly longer than did women, and women looked at same sex figures significantly longer than did men. Within-sex analyses suggested that men had a strong visual attention preference for opposite sex figures as compared to same sex figures, whereas women appeared to disperse their attention evenly between opposite and same sex figures. These differences, however, were not limited to erotic images but were evident in non-erotic images as well. No significant sex differences were found for attention to the contextual region of the scenes. Results were interpreted as potentially supportive of recent studies showing a greater non-specificity of sexual arousal in women. This interpretation assumes there is an erotic valence to images of the sex to which one orients, even when the image is not explicitly erotic. It also assumes a relationship between visual attention and erotic valence.
The use of visual cues for vehicle control and navigation
NASA Technical Reports Server (NTRS)
Hart, Sandra G.; Battiste, Vernol
1991-01-01
At least three levels of control are required to operate most vehicles: (1) inner-loop control to counteract the momentary effects of disturbances on vehicle position; (2) intermittent maneuvers to avoid obstacles; and (3) outer-loop control to maintain a planned route. Operators monitor dynamic optical relationships in their immediate surroundings to estimate momentary changes in forward, lateral, and vertical position, rates of change in speed and direction of motion, and distance from obstacles. The process of searching the external scene to find landmarks (for navigation) is intermittent and deliberate, while monitoring and responding to subtle changes in the visual scene (for vehicle control) is relatively continuous and 'automatic'. However, since operators may perform both tasks simultaneously, the dynamic optical cues available for a vehicle control task may be determined by the operator's direction of gaze for wayfinding. An attempt to relate the visual processes involved in vehicle control and wayfinding is presented. The frames of reference and information used by different operators (e.g., automobile drivers, airline pilots, and helicopter pilots) are reviewed with particular emphasis on the special problems encountered by helicopter pilots flying nap-of-the-earth (NOE). The goal of this overview is to describe the context within which different vehicle control tasks are performed and to suggest ways in which the use of visual cues for geographical orientation might influence visually guided control activities.
Dioptric defocus maps across the visual field for different indoor environments.
García, Miguel García; Ohlendorf, Arne; Schaeffel, Frank; Wahl, Siegfried
2018-01-01
One of the factors proposed to regulate eye growth is the error signal derived from defocus in the retina; this signal might arise from defocus not only in the fovea but across the whole visual field. Therefore, myopia could be better predicted by spatio-temporally mapping the 'environmental defocus' over the visual field. At present, no devices are available that could provide this information. A 'Kinect sensor v1' camera (Microsoft Corp.) and a portable eye tracker were used to develop a system for quantifying 'indoor defocus error signals' across the central 58° of the visual field. Dioptric differences relative to the fovea (assumed to be in focus) were recorded over the visual field and 'defocus maps' were generated for various scenes and tasks.
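As a minimal sketch of the quantity being mapped (an illustration under stated assumptions, not the authors' pipeline), the dioptric difference of each scene point relative to the fixated point can be computed from per-pixel distances, with the fixation assumed to be in focus:

```python
# Illustrative sketch: convert per-pixel scene distances (metres) into dioptric
# defocus relative to the fixated point, which is assumed to be in focus.
# Sign convention and depth source (e.g., a Kinect depth map) are assumptions.
import numpy as np

def defocus_map(distance_m, fixation_distance_m):
    """Defocus (dioptres) = 1/d_fixation - 1/d_point for every pixel."""
    distance_m = np.asarray(distance_m, dtype=float)
    with np.errstate(divide="ignore"):
        return 1.0 / fixation_distance_m - 1.0 / distance_m

# Example: a point at 0.5 m while fixating at 2 m is -1.5 D; a point at 4 m is +0.25 D.
depths = np.array([[0.5, 2.0, 4.0]])
print(defocus_map(depths, fixation_distance_m=2.0))   # [[-1.5, 0.0, 0.25]]
```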
An insect-inspired model for visual binding I: learning objects and their characteristics.
Northcutt, Brandon D; Dyhr, Jonathan P; Higgins, Charles M
2017-04-01
Visual binding is the process of associating the responses of visual interneurons in different visual submodalities, all of which are responding to the same object in the visual field. Recently identified neuropils in the insect brain termed optic glomeruli reside just downstream of the optic lobes and have an internal organization that could support visual binding. Working from anatomical similarities between optic and olfactory glomeruli, we have developed a model of visual binding based on common temporal fluctuations among signals of independent visual submodalities. Here we describe and demonstrate a neural network model capable both of refining selectivity of visual information in a given visual submodality, and of associating visual signals produced by different objects in the visual field by developing inhibitory neural synaptic weights representing the visual scene. We also show that this model is consistent with initial physiological data from optic glomeruli. Further, we discuss how this neural network model may be implemented in optic glomeruli at a neuronal level.
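To make the core idea of binding by common temporal fluctuations concrete, here is a deliberately simplified Python sketch: channels whose signals covary over time are grouped as likely belonging to the same object. The learning rule, inhibitory weights, and architecture of the paper's network are not reproduced; the signals and the correlation threshold below are made up for illustration.

```python
# Minimal sketch of binding-by-common-temporal-fluctuation (the general principle
# only, not the network described in the paper): channels whose responses covary
# over time are grouped as likely driven by the same object.
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(500)
object_a = (np.sin(t / 15.0) > 0).astype(float)        # one object's fluctuation
object_b = (np.sin(t / 7.0 + 1.0) > 0).astype(float)   # another object's fluctuation

# Four submodality channels: two driven by object A, two by object B, plus noise.
channels = np.stack([
    object_a + 0.2 * rng.standard_normal(t.size),
    object_a + 0.2 * rng.standard_normal(t.size),
    object_b + 0.2 * rng.standard_normal(t.size),
    object_b + 0.2 * rng.standard_normal(t.size),
])

corr = np.corrcoef(channels)     # temporal correlation between channels
binding = corr > 0.5             # simple threshold: which channels bind together
print(binding.astype(int))       # block structure groups channels (0,1) and (2,3)
```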
Short temporal asynchrony disrupts visual object recognition
Singer, Jedediah M.; Kreiman, Gabriel
2014-01-01
Humans can recognize objects and scenes in a small fraction of a second. The cascade of signals underlying rapid recognition might be disrupted by temporally jittering different parts of complex objects. Here we investigated the time course over which shape information can be integrated to allow for recognition of complex objects. We presented fragments of object images in an asynchronous fashion and behaviorally evaluated categorization performance. We observed that visual recognition was significantly disrupted by asynchronies of approximately 30 ms, suggesting that spatiotemporal integration begins to break down with even small deviations from simultaneity. However, moderate temporal asynchrony did not completely obliterate recognition; in fact, integration of visual shape information persisted even with an asynchrony of 100 ms. We describe the data with a concise model based on the dynamic reduction of uncertainty about what image was presented. These results emphasize the importance of timing in visual processing and provide strong constraints for the development of dynamical models of visual shape recognition. PMID:24819738
Depth reversals in stereoscopic displays driven by apparent size
NASA Astrophysics Data System (ADS)
Sacher, Gunnar; Hayes, Amy; Thornton, Ian M.; Sereno, Margaret E.; Malony, Allen D.
1998-04-01
In visual scenes, depth information is derived from a variety of monocular and binocular cues. When in conflict, a monocular cue is sometimes able to override the binocular information. We examined the accuracy of relative depth judgments in orthographic, stereoscopic displays and found that perceived relative size can override binocular disparity as a depth cue in a situation where the relative size information is itself generated from disparity information, not from retinal size difference. A size discrimination task confirmed the assumption that disparity information was perceived and used to generate apparent size differences. The tendency for the apparent size cue to override disparity information can be modulated by varying the strength of the apparent size cue. In addition, an analysis of reaction times provides supporting evidence for this novel depth reversal effect. We believe that human perception must be regarded as an important component of stereoscopic applications. Hence, if applications are to be effective and accurate, it is necessary to take into account the richness and complexity of the human visual perceptual system that interacts with them. We discuss implications of this and similar research for human performance in virtual environments, the design of visual presentations for virtual worlds, and the design of visualization tools.
Runway Texture and Grid Pattern Effects on Rate-of-Descent Perception
NASA Technical Reports Server (NTRS)
Schroeder, J. A.; Dearing, M. G.; Sweet, B. T.; Kaiser, M. K.; Rutkowski, Mike (Technical Monitor)
2001-01-01
To date, perceptual errors persist in determining descent rate from a computer-generated image in flight simulation. Pilots tend to touch down twice as hard in simulation as in flight, and more training time is needed in simulation before reaching steady-state performance. Barnes suggested that recognition of range may be the culprit, citing problems such as collimated objects, binocular vision, and poor resolution that lead to poor estimation of the velocity vector. Brown's study essentially ruled out the lack of binocular vision as the problem. Dorfel added specificity to the problem by showing that pilots underestimated range in simulated scenes by 50% when 800 ft from the runway threshold. Palmer and Petitt showed that pilots are able to distinguish between a 1.7 ft/sec and a 2.9 ft/sec sink rate when passively observing sink rates in a night scene. Platform motion also plays a role, as previous research has shown that the addition of substantial platform motion improves pilot estimates of vertical velocity and results in simulated touchdown rates more closely resembling flight. This experiment examined how specific variations in visual scene properties affect a pilot's perception of sink rate. It extended another experiment that focused on the visual and motion cues necessary for helicopter autorotations. In that experiment, pilots performed steep approaches to a runway. The visual content of the runway and its surroundings varied in two ways: texture and rectangular grid spacing. Four textures, including a no-texture case, were evaluated. Three grid spacings, including a no-grid case, were evaluated. The results showed that pilots better controlled their vertical descent rates when good texture cues were present. No significant differences were found for the grid manipulation. Using those visual scenes, a simple psychophysics experiment was performed to determine whether the variations in the visual scenes allowed pilots to better perceive vertical velocity. Pilots passively viewed a particular visual scene in which the vehicle was descending at two different rates and had to select which of the two rates they thought was faster. The difference between the two rates was adjusted with a staircase method, depending on whether or not the pilot was correct, until a minimum threshold between the two descent rates was reached. This process was repeated for all of the visual scenes to determine whether any of the visual scenes allowed pilots to perceive vertical velocity better than the others. All of the data have yet to be analyzed; however, neither the grid nor the texture manipulation revealed any statistically significant trends. On further examination of the staircase method employed, the lack of an evident trend may be due to the exit criterion used during the study. As such, the experiment will be repeated with an improved exit criterion in February. Results of this study will be presented in the submitted paper.
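The exact staircase rule and exit criterion used in the study are not given above; as a generic illustration of how such an adaptive procedure works, the following Python sketch shrinks the rate difference after correct responses and grows it after errors until a stopping rule is met. The simulated observer, step sizes, and exit criterion are all assumptions.

```python
# Illustrative sketch of a generic 1-up/1-down staircase (not the study's exact
# procedure): the difference between the two descent rates shrinks after correct
# responses and grows after errors, converging on a discrimination threshold.
import random

def staircase(simulated_threshold=0.6, start_diff=1.2, step=0.1,
              min_diff=0.1, max_trials=60):
    diff, reversals, last_correct = start_diff, 0, None
    for _ in range(max_trials):
        # Simulated observer: correct when the difference exceeds its internal
        # (hypothetical) threshold, otherwise a 50/50 guess.
        correct = diff >= simulated_threshold or random.random() < 0.5
        diff = max(min_diff, diff - step) if correct else diff + step
        if last_correct is not None and correct != last_correct:
            reversals += 1
        last_correct = correct
        if reversals >= 8 or diff <= min_diff:   # example exit criterion
            break
    return diff

print("estimated rate-difference threshold (ft/s):", round(staircase(), 2))
```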
Immediate use of prosody and context in predicting a syntactic structure.
Nakamura, Chie; Arai, Manabu; Mazuka, Reiko
2012-11-01
Numerous studies have reported an effect of prosodic information on parsing, but whether prosody can impact even the initial parsing decision is still not evident. In a visual world eye-tracking experiment, we investigated the influence of contrastive intonation and visual context on processing temporarily ambiguous relative clause sentences in Japanese. Our results showed that listeners used the prosodic cue to make a structural prediction before hearing disambiguating information. Importantly, the effect was limited to cases where the visual scene provided an appropriate context for the prosodic cue, thus eliminating the explanation that listeners have simply associated marked prosodic information with a less frequent structure. Furthermore, the influence of the prosodic information was also evident following disambiguating information, in a way that reflected the initial analysis. The current study demonstrates that prosody, when provided with an appropriate context, influences the initial syntactic analysis and also the subsequent cost at disambiguating information. The results also provide the first evidence for pre-head structural prediction driven by prosodic and contextual information with a head-final construction.
Expedient range enhanced 3-D robot colour vision
NASA Astrophysics Data System (ADS)
Jarvis, R. A.
1983-01-01
Computer vision has been chosen, in many cases, as offering the richest form of sensory information which can be utilized for guiding robotic manipulation. The present investigation is concerned with the problem of three-dimensional (3D) visual interpretation of colored objects in support of robotic manipulation of those objects with a minimum of semantic guidance. The scene 'interpretations' are aimed at providing basic parameters to guide robotic manipulation rather than to provide humans with a detailed description of what the scene 'means'. Attention is given to overall system configuration, hue transforms, a connectivity analysis, plan/elevation segmentations, range scanners, elevation/range segmentation, higher level structure, eye in hand research, and aspects of array and video stream processing.
Viewing the dynamics and control of visual attention through the lens of electrophysiology
Woodman, Geoffrey F.
2013-01-01
How we find what we are looking for in complex visual scenes is a seemingly simple ability that has taken half a century to unravel. The first study to use the term visual search showed that as the number of objects in a complex scene increases, observers’ reaction times increase proportionally (Green and Anderson, 1956). This observation suggests that our ability to process the objects in the scenes is limited in capacity. However, if it is known that the target will have a certain feature attribute, for example, that it will be red, then only an increase in the number of red items increases reaction time. This observation suggests that we can control which visual inputs receive the benefit of our limited capacity to recognize the objects, such as those defined by the color red, as the items we seek. The nature of the mechanisms that underlie these basic phenomena in the visual search literature has been more difficult to determine definitively. In this paper, I discuss how electrophysiological methods have provided us with the necessary tools to understand the nature of the mechanisms that give rise to the effects observed in the first visual search paper. I begin by describing how recordings of event-related potentials from humans and nonhuman primates have shown us how attention is deployed to possible target items in complex visual scenes. Then, I will discuss how event-related potential experiments have allowed us to directly measure the memory representations that are used to guide these deployments of attention to items with target-defining features. PMID:23357579
Transient cardio-respiratory responses to visually induced tilt illusions
NASA Technical Reports Server (NTRS)
Wood, S. J.; Ramsdell, C. D.; Mullen, T. J.; Oman, C. M.; Harm, D. L.; Paloski, W. H.
2000-01-01
Although the orthostatic cardio-respiratory response is primarily mediated by the baroreflex, studies have shown that vestibular cues also contribute in both humans and animals. We have demonstrated a visually mediated response to illusory tilt in some human subjects. Blood pressure, heart and respiration rate, and lung volume were monitored in 16 supine human subjects during two types of visual stimulation, and compared with responses to real passive whole body tilt from supine to head 80 degrees upright. Visual tilt stimuli consisted of either a static scene from an overhead mirror or constant velocity scene motion along different body axes generated by an ultra-wide dome projection system. Visual vertical cues were initially aligned with the longitudinal body axis. Subjective tilt and self-motion were reported verbally. Although significant changes in cardio-respiratory parameters to illusory tilts could not be demonstrated for the entire group, several subjects showed significant transient decreases in mean blood pressure resembling their initial response to passive head-up tilt. Changes in pulse pressure and a slight elevation in heart rate were noted. These transient responses are consistent with the hypothesis that visual-vestibular input contributes to the initial cardiovascular adjustment to a change in posture in humans. On average the static scene elicited perceived tilt without rotation. Dome scene pitch and yaw elicited perceived tilt and rotation, and dome roll motion elicited perceived rotation without tilt. A significant correlation between the magnitude of physiological and subjective reports could not be demonstrated.
The natural statistics of blur
Sprague, William W.; Cooper, Emily A.; Reissier, Sylvain; Yellapragada, Baladitya; Banks, Martin S.
2016-01-01
Blur from defocus can be both useful and detrimental for visual perception: It can be useful as a source of depth information and detrimental because it degrades image quality. We examined these aspects of blur by measuring the natural statistics of defocus blur across the visual field. Participants wore an eye-and-scene tracker that measured gaze direction, pupil diameter, and scene distances as they performed everyday tasks. We found that blur magnitude increases with increasing eccentricity. There is a vertical gradient in the distances that generate defocus blur: Blur below the fovea is generally due to scene points nearer than fixation; blur above the fovea is mostly due to points farther than fixation. There is no systematic horizontal gradient. Large blurs are generally caused by points farther rather than nearer than fixation. Consistent with the statistics, participants in a perceptual experiment perceived vertical blur gradients as slanted top-back whereas horizontal gradients were perceived equally as left-back and right-back. The tendency for people to see sharp as near and blurred as far is also consistent with the observed statistics. We calculated how many observations will be perceived as unsharp and found that perceptible blur is rare. Finally, we found that eye shape in ground-dwelling animals conforms to that required to put likely distances in best focus. PMID:27580043
Contrast statistics for foveated visual systems: fixation selection by minimizing contrast entropy
NASA Astrophysics Data System (ADS)
Raj, Raghu; Geisler, Wilson S.; Frazor, Robert A.; Bovik, Alan C.
2005-10-01
The human visual system combines a wide field of view with a high-resolution fovea and uses eye, head, and body movements to direct the fovea to potentially relevant locations in the visual scene. This strategy is sensible for a visual system with limited neural resources. However, for this strategy to be effective, the visual system needs sophisticated central mechanisms that efficiently exploit the varying spatial resolution of the retina. To gain insight into some of the design requirements of these central mechanisms, we have analyzed the effects of variable spatial resolution on local contrast in 300 calibrated natural images. Specifically, for each retinal eccentricity (which produces a certain effective level of blur), and for each value of local contrast observed at that eccentricity, we measured the probability distribution of the local contrast in the unblurred image. These conditional probability distributions can be regarded as posterior probability distributions for the "true" unblurred contrast, given an observed contrast at a given eccentricity. We find that these conditional probability distributions are adequately described by a few simple formulas. To explore how these statistics might be exploited by central perceptual mechanisms, we consider the task of selecting successive fixation points, where the goal on each fixation is to maximize total contrast information gained about the image (i.e., minimize total contrast uncertainty). We derive an entropy minimization algorithm and find that it performs optimally at reducing total contrast uncertainty and that it also works well at reducing the mean squared error between the original image and the image reconstructed from the multiple fixations. Our results show that measurements of local contrast alone could efficiently drive the scan paths of the eye when the goal is to gain as much information about the spatial structure of a scene as possible.
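The entropy-minimization algorithm itself is not reproduced above; as a rough approximation of the idea (not the authors' algorithm), the following Python sketch keeps a map of residual uncertainty, reduces it around each fixation more strongly near the fovea, and greedily fixates wherever the most uncertainty remains. The uncertainty map, fall-off function, and parameters are stand-ins.

```python
# Simplified sketch of greedy fixation selection by uncertainty reduction (an
# approximation of the idea in the abstract, not the derived algorithm): keep a
# residual-uncertainty map, reduce it around each fixation with an
# eccentricity-dependent fall-off, and fixate where the most uncertainty remains.
import numpy as np

def select_fixations(uncertainty, n_fixations=5, fovea_sigma=20.0):
    uncertainty = uncertainty.astype(float).copy()
    h, w = uncertainty.shape
    yy, xx = np.mgrid[0:h, 0:w]
    fixations = []
    for _ in range(n_fixations):
        y, x = np.unravel_index(np.argmax(uncertainty), uncertainty.shape)
        fixations.append((int(y), int(x)))
        # Resolution (and hence uncertainty reduction) falls off with eccentricity.
        ecc2 = (yy - y) ** 2 + (xx - x) ** 2
        resolved = np.exp(-ecc2 / (2.0 * fovea_sigma ** 2))
        uncertainty *= (1.0 - resolved)
    return fixations

initial = np.random.default_rng(1).random((120, 160))   # stand-in uncertainty map
print(select_fixations(initial, n_fixations=3))
```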
Enhanced RGB-D Mapping Method for Detailed 3D Indoor and Outdoor Modeling
Tang, Shengjun; Zhu, Qing; Chen, Wu; Darwish, Walid; Wu, Bo; Hu, Han; Chen, Min
2016-01-01
RGB-D sensors (sensors with an RGB camera and a depth camera) are novel sensing systems that capture RGB images along with pixel-wise depth information. Although they are widely used in various applications, RGB-D sensors have significant drawbacks for 3D dense mapping, including limited measurement ranges (e.g., within 3 m) and depth measurement errors that increase with distance from the sensor. In this paper, we present a novel approach to geometrically integrate the depth scene and the RGB scene to enlarge the measurement distance of RGB-D sensors and enrich the details of the model generated from depth images. First, precise calibration for RGB-D sensors is introduced. In addition to the calibration of internal and external parameters for both the IR camera and the RGB camera, the relative pose between the RGB camera and the IR camera is also calibrated. Second, to ensure the accuracy of RGB image poses, a refined method for rejecting false feature matches is introduced by combining the depth information and initial camera poses between frames of the RGB-D sensor. Then, a global optimization model is used to improve the accuracy of the camera poses, decreasing the inconsistencies between the depth frames in advance. In order to eliminate the geometric inconsistencies between the RGB scene and the depth scene, the scale ambiguity problem encountered during pose estimation with RGB image sequences is resolved by integrating the depth and visual information, and a robust rigid-transformation recovery method is developed to register the RGB scene to the depth scene. The benefit of the proposed joint optimization method is first evaluated with the publicly available benchmark datasets collected with Kinect. Then, the proposed method is examined by tests with two sets of datasets collected in both outdoor and indoor environments. The experimental results demonstrate the feasibility and robustness of the proposed method. PMID:27690028
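The paper's own transformation-recovery method is not detailed above; as a generic sketch of the kind of step involved, the following Python code estimates a similarity transform (scale, rotation, translation) that maps corresponding 3D points from an RGB-derived scene onto a depth scene using an Umeyama-style closed form, which also resolves the scale ambiguity of monocular RGB pose estimation. This is a standard textbook method, not necessarily the one used by the authors.

```python
# Generic sketch (not the authors' implementation): recover a similarity transform
# (scale s, rotation R, translation t) with dst ≈ s * R @ src + t from 3-D point
# correspondences, e.g. RGB-scene points (src) matched to depth-scene points (dst).
import numpy as np

def similarity_transform(src, dst):
    """Umeyama-style closed form; src and dst are (N, 3) arrays of matched points."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)
    U, S, Vt = np.linalg.svd(cov)
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:   # keep a proper rotation
        D[2, 2] = -1
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / src_c.var(axis=0).sum()
    t = mu_d - s * R @ mu_s
    return s, R, t

# Example with synthetic correspondences (pure scale + translation).
rng = np.random.default_rng(2)
src = rng.random((50, 3))
dst = 2.0 * src + np.array([0.1, -0.3, 0.5])
s, R, t = similarity_transform(src, dst)
print(round(s, 3), t.round(3))    # ~2.0 and the translation above
```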
Camouflage target detection via hyperspectral imaging plus information divergence measurement
NASA Astrophysics Data System (ADS)
Chen, Yuheng; Chen, Xinhua; Zhou, Jiankang; Ji, Yiqun; Shen, Weimin
2016-01-01
Target detection is one of the most important applications in remote sensing. Accurate discrimination of camouflaged targets now often relies on spectral imaging, owing to its ability to acquire high-resolution spectral and spatial information as well as the wealth of available data processing methods. In this paper, hyperspectral imaging together with a spectral information divergence measure is used to address the camouflage target detection problem. A self-developed visible-band hyperspectral imaging device is used to collect data cubes of an experimental scene, and spectral information divergences are then computed to discriminate camouflaged targets and anomalies. Full-band information divergences are measured to evaluate the detection performance both visually and quantitatively. Information divergence measurement proves to be a low-cost and effective tool for target detection and can be further developed for other target detection applications beyond spectral imaging.
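For reference, the spectral information divergence (SID) named above treats each spectrum as a probability distribution and sums the two relative entropies between them; larger values indicate more dissimilar spectra. The Python sketch below shows the measure itself; the band count and spectra are made up for illustration.

```python
# Sketch of the spectral information divergence (SID) measure: spectra are
# normalised to probability distributions and compared by a symmetric
# relative-entropy sum. Spectra below are hypothetical five-band examples.
import numpy as np

def sid(x, y, eps=1e-12):
    p = np.asarray(x, float) / (np.sum(x) + eps)
    q = np.asarray(y, float) / (np.sum(y) + eps)
    p, q = p + eps, q + eps
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

target = np.array([0.10, 0.15, 0.40, 0.30, 0.05])      # hypothetical target spectrum
background = np.array([0.20, 0.22, 0.21, 0.19, 0.18])  # hypothetical background
print(sid(target, target))       # ~0: identical spectra
print(sid(target, background))   # > 0: candidate camouflage anomaly
```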
Out of Mind, Out of Sight: Unexpected Scene Elements Frequently Go Unnoticed Until Primed.
Slavich, George M; Zimbardo, Philip G
2013-12-01
The human visual system employs a sophisticated set of strategies for scanning the environment and directing attention to stimuli that can be expected given the context and a person's past experience. Although these strategies enable us to navigate a very complex physical and social environment, they can also cause highly salient but unexpected stimuli to go completely unnoticed. To examine the generality of this phenomenon, we conducted eight studies that included 15 different experimental conditions and 1,577 participants in all. These studies revealed that a large majority of participants do not report having seen a woman in the center of an urban scene who was photographed in midair as she was committing suicide. Despite seeing the scene repeatedly, 46 % of all participants failed to report seeing a central figure and only 4.8 % reported seeing a falling person. Frequency of noticing the suicidal woman was highest for participants who read a narrative priming story that increased the extent to which she was schematically congruent with the scene. In contrast to this robust effect of inattentional blindness, a majority of participants reported seeing other peripheral objects in the visual scene that were equally difficult to detect, yet more consistent with the scene. Follow-up qualitative analyses revealed that participants reported seeing many elements that were not actually present, but which could have been expected given the overall context of the scene. Together, these findings demonstrate the robustness of inattentional blindness and highlight the specificity with which different visual primes may increase noticing behavior.
Quétard, Boris; Quinton, Jean-Charles; Colomb, Michèle; Pezzulo, Giovanni; Barca, Laura; Izaute, Marie; Appadoo, Owen Kevin; Mermillod, Martial
2015-09-01
Detecting a pedestrian while driving in the fog is one situation where the prior expectation about the target presence is integrated with the noisy visual input. We focus on how these sources of information influence the oculomotor behavior and are integrated within an underlying decision-making process. The participants had to judge whether high-/low-density fog scenes displayed on a computer screen contained a pedestrian or a deer by executing a mouse movement toward the response button (mouse-tracking). A variable road sign was added on the scene to manipulate expectations about target identity. We then analyzed the timing and amplitude of the deviation of mouse trajectories toward the incorrect response and, using an eye tracker, the detection time (before fixating the target) and the identification time (fixations on the target). Results revealed that expectation of the correct target results in earlier decisions with less deviation toward the alternative response, this effect being partially explained by the facilitation of target identification.
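The deviation analysis mentioned above follows a common mouse-tracking logic; as an illustration (not the authors' exact pipeline), the following Python sketch computes the maximum perpendicular deviation of a cursor trajectory from the straight line between its start and end points. The sample trajectory is hypothetical.

```python
# Sketch of a standard mouse-tracking measure (an illustration, not the study's
# exact analysis): the maximum perpendicular deviation of the cursor trajectory
# from the straight start-to-end line; larger values indicate more attraction
# toward the alternative response.
import numpy as np

def max_deviation(xs, ys):
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    start, end = np.array([xs[0], ys[0]]), np.array([xs[-1], ys[-1]])
    line = end - start
    norm = np.hypot(*line)
    # Signed perpendicular distance of every sample from the start-end line.
    dev = ((xs - start[0]) * line[1] - (ys - start[1]) * line[0]) / norm
    return float(dev[np.argmax(np.abs(dev))])

trajectory_x = [0, 1, 3, 6, 8, 10]       # hypothetical cursor samples
trajectory_y = [0, 2, 5, 6, 8, 10]
print(max_deviation(trajectory_x, trajectory_y))
```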
Cortical systems mediating visual attention to both objects and spatial locations
Shomstein, Sarah; Behrmann, Marlene
2006-01-01
Natural visual scenes consist of many objects occupying a variety of spatial locations. Given that the plethora of information cannot be processed simultaneously, the multiplicity of inputs compete for representation. Using event-related functional MRI, we show that attention, the mechanism by which a subset of the input is selected, is mediated by the posterior parietal cortex (PPC). Of particular interest is that PPC activity is differentially sensitive to the object-based properties of the input, with enhanced activation for those locations bound by an attended object. Of great interest too is the ensuing modulation of activation in early cortical regions, reflected as differences in the temporal profile of the blood oxygenation level-dependent (BOLD) response for within-object versus between-object locations. These findings indicate that object-based selection results from an object-sensitive reorienting signal issued by the PPC. The dynamic circuit between the PPC and earlier sensory regions then enables observers to attend preferentially to objects of interest in complex scenes. PMID:16840559
Neural Correlates of Contextual Cueing Are Modulated by Explicit Learning
ERIC Educational Resources Information Center
Westerberg, Carmen E.; Miller, Brennan B.; Reber, Paul J.; Cohen, Neal J.; Paller, Ken A.
2011-01-01
Contextual cueing refers to the facilitated ability to locate a particular visual element in a scene due to prior exposure to the same scene. This facilitation is thought to reflect implicit learning, as it typically occurs without the observer's knowledge that scenes repeat. Unlike most other implicit learning effects, contextual cueing can be…
Micro-Valences: Perceiving Affective Valence in Everyday Objects
Lebrecht, Sophie; Bar, Moshe; Barrett, Lisa Feldman; Tarr, Michael J.
2012-01-01
Perceiving the affective valence of objects influences how we think about and react to the world around us. Conversely, the speed and quality with which we visually recognize objects in a visual scene can vary dramatically depending on that scene’s affective content. Although typical visual scenes contain mostly “everyday” objects, affect perception in visual objects has been studied using somewhat atypical stimuli with strong affective valences (e.g., guns or roses). Here we explore whether affective valence must be strong or overt to exert an effect on our visual perception. We conclude that everyday objects carry subtle affective valences – “micro-valences” – which are intrinsic to their perceptual representation. PMID:22529828
Cohn, Neil; Taylor-Weiner, Amaro; Grossman, Suzanne
2012-01-01
Research on visual attention has shown that Americans tend to focus more on focal objects of a scene while Asians attend to the surrounding environment. The panels of comic books - the narrative frames in sequential images - highlight aspects of a scene comparably to how attention becomes focused on parts of a spatial array. Thus, we compared panels from American and Japanese comics to explore cross-cultural cognition beyond behavioral experimentation by looking at the expressive mediums produced by individuals from these cultures. This study compared the panels of two genres of American comics (Independent and Mainstream comics) with mainstream Japanese "manga" to examine how different cultures and genres direct attention through the framing of figures and scenes in comic panels. Both genres of American comics focused on whole scenes as much as individual characters, while Japanese manga individuated characters and parts of scenes. We argue that this framing of space in American and Japanese comic books simulates a viewer's integration of a visual scene, and is consistent with the research showing cross-cultural differences in the direction of attention.
Lobjois, Régis; Dagonneau, Virginie; Isableu, Brice
2016-11-01
Compared with driving or flight simulation, little is known about self-motion perception in riding simulation. The goal of this study was to examine whether or not continuous roll motion supports the sensation of leaning into bends in dynamic motorcycle simulation. To this end, riders were able to freely tune the visual scene and/or motorcycle simulator roll angle to find a pattern that matched their prior knowledge. Our results revealed idiosyncrasy in the combination of visual and proprioceptive information. Some subjects relied more on the visual dimension, but reported increased sickness symptoms with the visual roll angle. Others relied more on proprioceptive information, tuning the direction of the visual scenery to match three possible patterns. Our findings also showed that these two subgroups tuned the motorcycle simulator roll angle in a similar way. This suggests that sustained inertially specified roll motion has contributed to the sensation of leaning in spite of the occurrence of unexpected gravito-inertial stimulation during the tilt. Several hypotheses are discussed. Practitioner Summary: Self-motion perception in motorcycle simulation is a relatively new research area. We examined how participants combined visual and proprioceptive information. Findings revealed individual differences in the visual dimension. However, participants tuned the simulator roll angle similarly, supporting the hypothesis that sustained inertially specified roll motion contributes to a leaning sensation.
The Role of Prediction In Perception: Evidence From Interrupted Visual Search
Mereu, Stefania; Zacks, Jeffrey M.; Kurby, Christopher A.; Lleras, Alejandro
2014-01-01
Recent studies of rapid resumption—an observer’s ability to quickly resume a visual search after an interruption—suggest that predictions underlie visual perception. Previous studies showed that when the search display changes unpredictably after the interruption, rapid resumption disappears. This conclusion is at odds with our everyday experience, where the visual system seems to be quite efficient despite continuous changes of the visual scene; however, in the real world, changes can typically be anticipated based on previous knowledge. The present study aimed to evaluate whether changes to the visual display can be incorporated into the perceptual hypotheses, if observers are allowed to anticipate such changes. Results strongly suggest that an interrupted visual search can be rapidly resumed even when information in the display has changed after the interruption, so long as participants can not only anticipate the changes but are also aware that such changes might occur. PMID:24820440
History of Reading Struggles Linked to Enhanced Learning in Low Spatial Frequency Scenes
Schneps, Matthew H.; Brockmole, James R.; Sonnert, Gerhard; Pomplun, Marc
2012-01-01
People with dyslexia, who face lifelong struggles with reading, exhibit numerous associated low-level sensory deficits including deficits in focal attention. Countering this, studies have shown that struggling readers outperform typical readers in some visual tasks that integrate distributed information across an expanse. Though such abilities would be expected to facilitate scene memory, prior investigations using the contextual cueing paradigm failed to find corresponding advantages in dyslexia. We suggest that these studies were confounded by task-dependent effects exaggerating known focal attention deficits in dyslexia, and that, if natural scenes were used as the context, advantages would emerge. Here, we investigate this hypothesis by comparing college students with histories of severe lifelong reading difficulties (SR) and typical readers (TR) in contexts that vary attention load. We find no differences in contextual-cueing when spatial contexts are letter-like objects, or when contexts are natural scenes. However, the SR group significantly outperforms the TR group when contexts are low-pass filtered natural scenes [F(3, 39) = 3.15, p<.05]. These findings suggest that perception or memory for low spatial frequency components in scenes is enhanced in dyslexia. These findings are important because they suggest strengths for spatial learning in a population otherwise impaired, carrying implications for the education and support of students who face challenges in school. PMID:22558210
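The low-spatial-frequency condition described above uses low-pass filtered natural scenes; as a simple illustration of how such stimuli can be produced (the study's exact cutoff and filtering parameters are not reproduced here), a Gaussian blur acts as a low-pass filter:

```python
# Sketch of producing a low-spatial-frequency version of a scene with a Gaussian
# low-pass filter. The blur radius and file names are arbitrary illustrations,
# not the study's parameters.
from PIL import Image, ImageFilter

def low_pass_scene(path, blur_radius_px=8):
    scene = Image.open(path)
    return scene.filter(ImageFilter.GaussianBlur(radius=blur_radius_px))

# Example (hypothetical file): low_pass_scene("kitchen_scene.jpg").save("kitchen_lowpass.jpg")
```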
Extracting heading and temporal range from optic flow: Human performance issues
NASA Technical Reports Server (NTRS)
Kaiser, Mary K.; Perrone, John A.; Stone, Leland; Banks, Martin S.; Crowell, James A.
1993-01-01
Pilots are able to extract information about their vehicle motion and environmental structure from dynamic transformations in the out-the-window scene. In this presentation, we focus on the information in the optic flow which specifies vehicle heading and distance to objects in the environment, scaled to a temporal metric. In particular, we are concerned with modeling how the human operators extract the necessary information, and what factors impact their ability to utilize the critical information. In general, the psychophysical data suggest that the human visual system is fairly robust to degradations in the visual display, e.g., reduced contrast and resolution or restricted field of view. However, extraneous motion flow, i.e., introduced by sensor rotation, greatly compromises human performance. The implications of these models and data for enhanced/synthetic vision systems are discussed.
Kotabe, Hiroki P; Kardan, Omid; Berman, Marc G
2017-08-01
Natural environments have powerful aesthetic appeal linked to their capacity for psychological restoration. In contrast, disorderly environments are aesthetically aversive, and have various detrimental psychological effects. But in our research, we have repeatedly found that natural environments are perceptually disorderly. What could explain this paradox? We present 3 competing hypotheses: the aesthetic preference for naturalness is more powerful than the aesthetic aversion to disorder (the nature-trumps-disorder hypothesis); disorder is trivial to aesthetic preference in natural contexts (the harmless-disorder hypothesis); and disorder is aesthetically preferred in natural contexts (the beneficial-disorder hypothesis). Utilizing novel methods of perceptual study and diverse stimuli, we rule in the nature-trumps-disorder hypothesis and rule out the harmless-disorder and beneficial-disorder hypotheses. In examining perceptual mechanisms, we find evidence that high-level scene semantics are both necessary and sufficient for the nature-trumps-disorder effect. Necessity is evidenced by the effect disappearing in experiments utilizing only low-level visual stimuli (i.e., where scene semantics have been removed) and experiments utilizing a rapid-scene-presentation procedure that obscures scene semantics. Sufficiency is evidenced by the effect reappearing in experiments utilizing noun stimuli which remove low-level visual features. Furthermore, we present evidence that the interaction of scene semantics with low-level visual features amplifies the nature-trumps-disorder effect: the effect is weaker both when statistically adjusting for quantified low-level visual features and when using noun stimuli which remove low-level visual features. These results have implications for psychological theories bearing on the joint influence of low- and high-level perceptual inputs on affect and cognition, as well as for aesthetic design.
Comparison on driving fatigue related hemodynamics activated by auditory and visual stimulus
NASA Astrophysics Data System (ADS)
Deng, Zishan; Gao, Yuan; Li, Ting
2018-02-01
As one of the main causes of traffic accidents, driving fatigue deserves researchers' attention, and its detection and monitoring during long-term driving require new techniques. Since functional near-infrared spectroscopy (fNIRS) can be applied to detect cerebral hemodynamic responses, it is a promising candidate for fatigue level detection. Here, we performed three different kinds of experiments on a driver and recorded his cerebral hemodynamic responses during long hours of driving, utilizing our fNIRS-based device. Each experiment lasted for 7 hours, and one of three specific experimental tests, probing the driver's responses to sounds, traffic lights, and direction signs respectively, was done every hour. The results showed that visual stimuli induced fatigue more readily than auditory stimuli, and that visual stimuli from traffic-light scenes induced fatigue more readily than visual stimuli from direction-sign scenes in the first few hours. We also found that fatigue-related hemodynamic responses increased fastest for auditory stimuli, then for traffic-light scenes, and slowest for direction-sign scenes. Our study successfully compared auditory, visual color, and visual character stimuli in terms of their sensitivity for inducing driving fatigue, which is meaningful for driving safety management.
Loy Rodas, Nicolas; Barrera, Fernando; Padoy, Nicolas
2017-02-01
We present an approach to provide awareness of the harmful ionizing radiation generated during X-ray-guided minimally invasive procedures. A hand-held screen is used to display information related to radiation safety directly in the user's view, in a mobile augmented reality (AR) manner. Instead of using markers, we propose a method to track the observer's viewpoint, which relies on the use of multiple RGB-D sensors and combines equipment detection for tracking initialization with a KinectFusion-like approach for frame-to-frame tracking. Two of the sensors are ceiling-mounted and a third one is attached to the hand-held screen. The ceiling cameras keep an updated model of the room's layout, which is used to exploit context information and improve the relocalization procedure. The system is evaluated on a multicamera dataset generated inside an operating room (OR) and containing ground-truth poses of the AR display. This dataset includes a wide variety of sequences with different scene configurations, occlusions, motion in the scene, and abrupt viewpoint changes. Qualitative results illustrating the different AR visualization modes for radiation awareness provided by the system are also presented. Our approach allows the user to benefit from a large AR visualization area and permits recovery from tracking failure caused by large motions or changes in the scene simply by looking at a piece of equipment. The system enables the user to see the 3-D propagation of radiation, the medical staff's exposure, and/or the doses deposited on the patient's surface as seen through his own eyes.
Automated Selection Of Pictures In Sequences
NASA Technical Reports Server (NTRS)
Rorvig, Mark E.; Shelton, Robert O.
1995-01-01
Method of automated selection of film or video motion-picture frames for storage or examination developed. Beneficial in situations in which quantity of visual information available exceeds amount stored or examined by humans in reasonable amount of time, and/or necessary to reduce large number of motion-picture frames to few conveying significantly different information in manner intermediate between movie and comic book or storyboard. For example, computerized vision system monitoring industrial process programmed to sound alarm when changes in scene exceed normal limits.
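As an illustration of the frame-selection idea described above (not the original implementation; thresholds, frame source, and array sizes are made up), a frame can be kept only when it differs from the last kept frame by more than a set amount, reducing a long sequence to its distinct scenes:

```python
# Sketch of change-based key-frame selection: keep a frame only when it differs
# from the last kept frame by more than a threshold. Threshold and synthetic
# frames below are illustrative.
import numpy as np

def select_key_frames(frames, change_threshold=12.0):
    """frames: iterable of 2-D grayscale arrays; returns indices of kept frames."""
    kept, last = [], None
    for i, frame in enumerate(frames):
        frame = np.asarray(frame, dtype=float)
        if last is None or np.abs(frame - last).mean() > change_threshold:
            kept.append(i)
            last = frame
    return kept

# Example with synthetic frames: a static scene followed by one abrupt change.
rng = np.random.default_rng(3)
static = rng.integers(0, 255, (48, 64)).astype(float)
frames = [static + rng.normal(0, 2, static.shape) for _ in range(5)]
frames += [255 - static + rng.normal(0, 2, static.shape) for _ in range(5)]
print(select_key_frames(frames))   # expected: [0, 5]
```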
Bar, Moshe; Aminoff, Elissa; Schacter, Daniel L.
2009-01-01
The parahippocampal cortex (PHC) has been implicated both in episodic memory and in place/scene processing. We proposed that this region should instead be seen as intrinsically mediating contextual associations, and not place/scene processing or episodic memory exclusively. Given that place/scene processing and episodic memory both rely on associations, this modified framework provides a platform for reconciling what seemed like different roles assigned to the same region. Comparing scenes with scenes, we show here that the PHC responds significantly more strongly to scenes with rich contextual associations than to scenes of equal visual quality but fewer associations. This result provides the strongest support to the view that the PHC mediates contextual associations in general, rather than places or scenes proper, and necessitates a revision of current views, such as the view that the PHC contains a dedicated place/scene “module.” PMID:18716212
Traffic Signs in Complex Visual Environments
DOT National Transportation Integrated Search
1982-11-01
The effects of sign luminance on detection and recognition of traffic control devices are mediated through contrast with the immediate surround. Additionally, complex visual scenes are known to degrade visual performance with targets well above visual...
An Integrated Tone Mapping for High Dynamic Range Image Visualization
NASA Astrophysics Data System (ADS)
Liang, Lei; Pan, Jeng-Shyang; Zhuang, Yongjun
2018-01-01
There are two types of tone mapping operators for high dynamic range (HDR) image visualization. HDR images mapped by perceptual operators convey a strong sense of realism but lose local detail. Empirical operators can maximize the local detail of an HDR image, but their realism is weaker. A common tone mapping operator suitable for all applications is not available. This paper proposes a novel integrated tone mapping framework which can achieve conversion between empirical operators and perceptual operators. In this framework, the empirical operator is rendered based on an improved saliency map, which simulates the visual attention mechanism of the human eye in natural scenes. The results of objective evaluation demonstrate the effectiveness of the proposed solution.
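The paper's integrated framework is not reproduced above; as a baseline illustration of what any tone mapping operator does (compressing HDR luminance into a displayable range), here is a textbook Reinhard-style global operator in Python. It is shown only for orientation and is not the proposed method.

```python
# Minimal global tone-mapping sketch (a textbook Reinhard-style operator, shown
# only to illustrate the task; it is not the integrated framework in the paper).
import numpy as np

def tone_map(hdr_luminance, key=0.18, eps=1e-6):
    L = np.asarray(hdr_luminance, dtype=float)
    log_avg = np.exp(np.mean(np.log(L + eps)))     # scene's log-average luminance
    L_scaled = key * L / log_avg                   # map the average to a mid-grey "key"
    return L_scaled / (1.0 + L_scaled)             # compress into [0, 1)

hdr = np.array([[0.01, 0.5], [10.0, 5000.0]])      # toy HDR luminances
print(tone_map(hdr).round(3))
```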
NASA Astrophysics Data System (ADS)
Zlinszky, András; Schroiff, Anke; Otepka, Johannes; Mandlburger, Gottfried; Pfeifer, Norbert
2014-05-01
LIDAR point clouds hold valuable information for land cover and vegetation analysis, not only in the spatial distribution of the points but also in their various attributes. However, LIDAR point clouds are rarely used for visual interpretation, since for most users, the point cloud is difficult to interpret compared to passive optical imagery. Meanwhile, point cloud viewing software is available allowing interactive 3D interpretation, but typically only one attribute at a time. This results in a large number of points with the same colour, crowding the scene and often obscuring detail. We developed a scheme for mapping information from multiple LIDAR point attributes to the Red, Green, and Blue channels of a widely used LIDAR data format, which are otherwise mostly used to add information from imagery to create "photorealistic" point clouds. The possible combinations of parameters are therefore represented in a wide range of colours, but relative differences in individual parameter values of points can be well understood. The visualization was implemented in OPALS software, using a simple and robust batch script, and is viewer independent since the information is stored in the point cloud data file itself. In our case, the following colour channel assignment delivered best results: Echo amplitude in the Red, echo width in the Green and normalized height above a Digital Terrain Model in the Blue channel. With correct parameter scaling (but completely without point classification), points belonging to asphalt and bare soil are dark red, low grassland and crop vegetation are bright red to yellow, shrubs and low trees are green and high trees are blue. Depending on roof material and DTM quality, buildings are shown from red through purple to dark blue. Erroneously high or low points, or points with incorrect amplitude or echo width usually have colours contrasting from terrain or vegetation. This allows efficient visual interpretation of the point cloud in planar, profile and 3D views since it reduces crowding of the scene and delivers intuitive contextual information. The resulting visualization has proved useful for vegetation analysis for habitat mapping, and can also be applied as a first step for point cloud level classification. An interactive demonstration of the visualization script is shown during poster attendance, including the opportunity to view your own point cloud sample files.
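The colour-channel assignment described above (echo amplitude to Red, echo width to Green, normalized height to Blue) can be sketched in a few lines of Python; the percentile scaling below is a stand-in for the authors' (unspecified) parameter scaling, and the attribute arrays are synthetic.

```python
# Sketch of mapping LIDAR point attributes to RGB: amplitude -> Red,
# echo width -> Green, height above the DTM -> Blue. Percentile scaling is an
# assumed stand-in for the parameter scaling mentioned in the abstract.
import numpy as np

def scale(values, lo_pct=2, hi_pct=98):
    lo, hi = np.percentile(values, [lo_pct, hi_pct])
    return np.clip((values - lo) / (hi - lo + 1e-12), 0.0, 1.0)

def attributes_to_rgb(amplitude, echo_width, height_above_dtm):
    rgb = np.stack([scale(amplitude), scale(echo_width), scale(height_above_dtm)], axis=1)
    return (rgb * 255).astype(np.uint8)     # per-point RGB for a LAS-style file

# Example with synthetic per-point attributes.
rng = np.random.default_rng(4)
n = 5
print(attributes_to_rgb(rng.random(n), rng.random(n), rng.random(n) * 30.0))
```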
Guest Editor's introduction: Special issue on distributed virtual environments
NASA Astrophysics Data System (ADS)
Lea, Rodger
1998-09-01
Distributed virtual environments (DVEs) combine technology from 3D graphics, virtual reality and distributed systems to provide an interactive 3D scene that supports multiple participants. Each participant has a representation in the scene, often known as an avatar, and is free to navigate through the scene and interact with both the scene and other viewers of the scene. Changes to the scene, for example, position changes of one avatar as the associated viewer navigates through the scene, or changes to objects in the scene via manipulation, are propagated in real time to all viewers. This ensures that all viewers of a shared scene 'see' the same representation of it, allowing sensible reasoning about the scene. Early work on such environments was restricted to their use in simulation, in particular in military simulation. However, over recent years a number of interesting and potentially far-reaching attempts have been made to exploit the technology for a range of other uses, including: Social spaces. Such spaces can be seen as logical extensions of the familiar text chat space. In 3D social spaces avatars, representing participants, can meet in shared 3D scenes and in addition to text chat can use visual cues and even in some cases spatial audio. Collaborative working. A number of recent projects have attempted to explore the use of DVEs to facilitate computer-supported collaborative working (CSCW), where the 3D space provides a context and work space for collaboration. Gaming. The shared 3D space is already familiar, albeit in a constrained manner, to the gaming community. DVEs are a logical superset of existing 3D games and can provide a rich framework for advanced gaming applications. e-commerce. The ability to navigate through a virtual shopping mall and to look at, and even interact with, 3D representations of articles has appealed to the e-commerce community as it searches for the best method of presenting merchandise to electronic consumers. The technology needed to support these systems crosses a number of disciplines in computer science. These include, but are certainly not limited to, real-time graphics for the accurate and realistic representation of scenes, group communications for the efficient update of shared consistent scene data, user interface modelling to exploit the use of the 3D representation and multimedia systems technology for the delivery of streamed graphics and audio-visual data into the shared scene. It is this intersection of technologies and the overriding need to provide visual realism that places such high demands on the underlying distributed systems infrastructure and makes DVEs such fertile ground for distributed systems research. Two examples serve to show how DVE developers have exploited the unique aspects of their domain. Communications. The usual tension between latency and throughput is particularly noticeable within DVEs. To ensure the timely update of multiple viewers of a particular scene requires that such updates be propagated quickly. However, the sheer volume of changes to any one scene calls for techniques that minimize the number of distinct updates that are sent to the network. Several techniques have been used to address this tension; these include the use of multicast communications, and in particular multicast in wide-area networks to reduce actual message traffic. Multicast has been combined with general group communications to partition updates to related objects or users of a scene.
A less traditional approach has been the use of dead reckoning, whereby a client application that visualizes the scene calculates position updates by extrapolating movement based on previous information. This allows the system to reduce the number of communications needed to update objects that move in a stable manner within the scene.

Scaling. DVEs, especially those used for social spaces, are required to support large numbers of simultaneous users in potentially large shared scenes. The desire for scalability has driven different architectural designs, for example, the use of fully distributed architectures, which scale well but often suffer performance costs, versus centralized and hierarchical architectures, in which the inverse is true. However, DVEs have also exploited the spatial nature of their domain to address scalability and have pioneered techniques that exploit the semantics of the shared space to reduce data updates and so allow greater scalability. Several of the systems reported in this special issue apply a notion of area of interest to partition the scene and so reduce the participants in any data updates. The specification of area of interest differs between systems. One approach has been to exploit a geographical notion, i.e. a regular portion of a scene, or a semantic unit, such as a room or building. Another approach has been to define the area of interest as a spatial area associated with an avatar in the scene.

The five papers in this special issue have been chosen to highlight the distributed systems aspects of the DVE domain. The first paper, on the DIVE system, by Emmanuel Frécon and Mårten Stenius, explores the use of multicast and group communication in a fully peer-to-peer architecture. The developers of DIVE have focused on its use as the basis for collaborative work environments and have explored the issues associated with maintaining and updating large complicated scenes. The second paper, by Hiroaki Harada et al., describes the AGORA system, a DVE concentrating on social spaces and employing a novel communication technique that incorporates position update and vector information to support dead reckoning. The paper by Simon Powers et al. explores the application of DVEs to the gaming domain. They propose a novel architecture that separates higher-level game semantics (the conceptual model) and lower-level scene attributes (the dynamic model), both running on servers, from the actual visual representation (the visual model) running on the client. They claim a number of benefits from this approach, including better predictability and consistency. Wolfgang Broll discusses the SmallView system, which is an attempt to provide a toolkit for DVEs. One of the key features of SmallView is a sophisticated application-level protocol, DWTP, that provides support for a variety of communication models. The final paper, by Chris Greenhalgh, discusses the MASSIVE system, which has been used to explore the notion of awareness in the 3D space via the concept of `auras'. These auras define an area of interest for users and support a mapping between what a user is aware of and what data update rate the communications infrastructure can support. We hope that this selection of papers will serve to provide a clear introduction to the distributed system issues faced by the DVE community and the approaches they have taken in solving them.
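The dead-reckoning approach described earlier in this editorial is easy to sketch in a few lines of Python. The entity fields, the constant-velocity extrapolation and the fixed error threshold below are illustrative assumptions only, not taken from any of the systems in this issue; the point is simply that the owner of an object resends state only when remote viewers' extrapolations would have drifted too far.

    import math
    from dataclasses import dataclass

    @dataclass
    class EntityState:
        # Last state reported over the network.
        x: float
        y: float
        vx: float
        vy: float
        t: float  # timestamp of the report

    def extrapolate(state: EntityState, now: float) -> tuple[float, float]:
        # Dead-reckoned position: assume constant velocity since the last update.
        dt = now - state.t
        return state.x + state.vx * dt, state.y + state.vy * dt

    def needs_update(true_x: float, true_y: float, state: EntityState,
                     now: float, threshold: float = 0.5) -> bool:
        # The owning client compares its true position with what remote viewers
        # will have dead-reckoned, and sends a new update only when the
        # divergence exceeds the agreed threshold.
        px, py = extrapolate(state, now)
        return math.hypot(true_x - px, true_y - py) > threshold

    # Example: an avatar walking on a gently curving path.
    state = EntityState(x=0.0, y=0.0, vx=1.0, vy=0.0, t=0.0)
    for step in range(1, 6):
        now = step * 0.2
        true_x, true_y = now, 0.05 * now * now   # actual (slightly curving) path
        if needs_update(true_x, true_y, state, now):
            state = EntityState(true_x, true_y, 1.0, 0.1 * now, now)  # send update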
Finally, we wish to thank Hubert Le Van Gong for his tireless efforts in pulling together all these papers and both the referees and the authors of the papers for the time and effort in ensuring that their contributions teased out the interesting distributed systems issues for this special issue. † E-mail address: rodger@arch.sel.sony.com
Visualization of fluid dynamics at NASA Ames
NASA Technical Reports Server (NTRS)
Watson, Val
1989-01-01
The hardware and software currently used for visualization of fluid dynamics at NASA Ames are described. The software includes programs to create scenes (for example, particle traces representing the flow over an aircraft), programs to interactively view the scenes, and programs to control the creation of video tapes and 16 mm movies. The hardware includes high performance graphics workstations, a high speed network, digital video equipment, and film recorders.
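Particle traces of the kind mentioned above are typically computed by integrating a velocity field from a seed point. The following Python sketch is only an illustration of that idea, not the NASA Ames software: the analytic vortex field stands in for interpolated CFD data, and forward Euler integration is the simplest possible choice.

    import numpy as np

    def velocity(p):
        # Toy analytic flow field (a simple vortex); stands in for interpolated CFD data.
        x, y, z = p
        return np.array([-y, x, 0.1])  # swirl about the z-axis with slow axial drift

    def particle_trace(seed, dt=0.01, steps=500):
        # Integrate a particle path through the flow with forward Euler steps.
        path = [np.asarray(seed, dtype=float)]
        for _ in range(steps):
            p = path[-1]
            path.append(p + dt * velocity(p))
        return np.array(path)

    trace = particle_trace(seed=(1.0, 0.0, 0.0))
    print(trace.shape)   # (501, 3) polyline that a viewer could render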
Ward, Jamie; Hovard, Peter; Jones, Alicia; Rothen, Nicolas
2013-01-01
Memory has been shown to be enhanced in grapheme-color synaesthesia, and this enhancement extends to certain visual stimuli (that don't induce synaesthesia) as well as stimuli comprised of graphemes (which do). Previous studies have used a variety of testing procedures to assess memory in synaesthesia (e.g., free recall, recognition, associative learning) making it hard to know the extent to which memory benefits are attributable to the stimulus properties themselves, the testing method, participant strategies, or some combination of these factors. In the first experiment, we use the same testing procedure (recognition memory) for a variety of stimuli (written words, non-words, scenes, and fractals) and also check which memorization strategies were used. We demonstrate that grapheme-color synaesthetes show enhanced memory across all these stimuli, but this is not found for a non-visual type of synaesthesia (lexical-gustatory). In the second experiment, the memory advantage for scenes is explored further by manipulating the properties of the old and new images (changing color, orientation, or object presence). Again, grapheme-color synaesthetes show a memory advantage for scenes across all manipulations. Although recognition memory is generally enhanced in this study, the largest effects were found for abstract visual images (fractals) and scenes for which color can be used to discriminate old/new status. PMID:24187542
ERIC Educational Resources Information Center
Freeth, M.; Chapman, P.; Ropar, D.; Mitchell, P.
2010-01-01
Visual fixation patterns whilst viewing complex photographic scenes containing one person were studied in 24 high-functioning adolescents with Autism Spectrum Disorders (ASD) and 24 matched typically developing adolescents. Over two different scene presentation durations both groups spent a large, strikingly similar proportion of their viewing…
The artist's advantage: Better integration of object information across eye movements
Perdreau, Florian; Cavanagh, Patrick
2013-01-01
Over their careers, figurative artists spend thousands of hours analyzing objects and scene layout. We examined what impact this extensive training has on the ability to encode complex scenes, comparing participants with a wide range of training and drawing skills on a possible versus impossible objects task. We used a gaze-contingent display to control the amount of information the participants could sample on each fixation either from central or peripheral visual field. Test objects were displayed and participants reported, as quickly as possible, whether the object was structurally possible or not. Our results show that when viewing the image through a small central window, performance improved with the years of training, and to a lesser extent with the level of skill. This suggests that the extensive training itself confers an advantage for integrating object structure into more robust object descriptions. PMID:24349697
Nakashima, Ryoichi; Iwai, Ritsuko; Ueda, Sayako; Kumada, Takatsune
2015-01-01
When observers perceive several objects in a space, they should at the same time effectively perceive their own position as a viewpoint. However, little is known about observers’ percepts of their own spatial location based on the visual scene information viewed from that location. Previous studies indicate that two distinct visual spatial processes exist in locomotion situations: egocentric position perception and egocentric direction perception. Those studies examined such perceptions in information-rich visual environments where much dynamic and static visual information was available. This study examined these two perceptions in information-impoverished environments containing only static lane edge information (i.e., limited information). We investigated the visual factors associated with static lane edge information that may affect these perceptions. In particular, we examined the effects of two factors on egocentric direction and position perceptions. One is the “uprightness factor”: “far” visual information is seen at a higher location in the visual field than “near” visual information. The other is the “central vision factor”: observers usually look at “far” visual information using central (i.e., foveal) vision, whereas they view “near” visual information using peripheral vision. Experiment 1 examined the effect of the “uprightness factor” using normal and inverted road images. Experiment 2 examined the effect of the “central vision factor” using normal and transposed road images in which the upper half of the normal image was presented below the lower half. Experiment 3 aimed to replicate the results of Experiments 1 and 2. Results showed that egocentric direction perception is impaired by image inversion or image transposition, whereas egocentric position perception is robust against these image transformations. That is, both the “uprightness” and “central vision” factors are important for egocentric direction perception, but not for egocentric position perception. Therefore, the two visual spatial perceptions of observers’ own viewpoints are fundamentally dissociable. PMID:26648895
A bottom-up model of spatial attention predicts human error patterns in rapid scene recognition.
Einhäuser, Wolfgang; Mundhenk, T Nathan; Baldi, Pierre; Koch, Christof; Itti, Laurent
2007-07-20
Humans demonstrate a peculiar ability to detect complex targets in rapidly presented natural scenes. Recent studies suggest that (nearly) no focal attention is required for overall performance in such tasks. Little is known, however, of how detection performance varies from trial to trial and which stages in the processing hierarchy limit performance: bottom-up visual processing (attentional selection and/or recognition) or top-down factors (e.g., decision-making, memory, or alertness fluctuations)? To investigate the relative contribution of these factors, eight human observers performed an animal detection task in natural scenes presented at 20 Hz. Trial-by-trial performance was highly consistent across observers, far exceeding the prediction of independent errors. This consistency demonstrates that performance is not primarily limited by idiosyncratic factors but by visual processing. Two statistical stimulus properties, contrast variation in the target image and the information-theoretical measure of "surprise" in adjacent images, predict performance on a trial-by-trial basis. These measures are tightly related to spatial attention, demonstrating that spatial attention and rapid target detection share common mechanisms. To isolate the causal contribution of the surprise measure, eight additional observers performed the animal detection task in sequences that were reordered versions of those all subjects had correctly recognized in the first experiment. Reordering increased surprise before and/or after the target while keeping the target and distractors themselves unchanged. Surprise enhancement impaired target detection in all observers. Consequently, and contrary to several previously published findings, our results demonstrate that attentional limitations, rather than target recognition alone, affect the detection of targets in rapidly presented visual sequences.
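The "surprise" referred to in this abstract is, in information-theoretic terms, the divergence between an observer's beliefs about a feature before and after seeing a new frame. The Python sketch below computes surprise for a single Gaussian feature channel; the Gaussian belief model, the fixed observation noise, and the example values are simplifying assumptions for illustration, not the published model.

    import numpy as np

    def gaussian_kl(mu_post, var_post, mu_prior, var_prior):
        # KL( posterior || prior ) for two univariate Gaussians.
        return 0.5 * (np.log(var_prior / var_post)
                      + (var_post + (mu_post - mu_prior) ** 2) / var_prior
                      - 1.0)

    def surprise(prior_mu, prior_var, observation, obs_var=1.0):
        # Bayesian update of a Gaussian belief by one observation,
        # returning the surprise (in nats) plus the new belief.
        post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
        post_mu = post_var * (prior_mu / prior_var + observation / obs_var)
        s = gaussian_kl(post_mu, post_var, prior_mu, prior_var)
        return s, post_mu, post_var

    # Feed a stream of frame-wise feature values (e.g., local contrast) through the model.
    mu, var = 0.0, 10.0
    for value in [0.1, 0.2, 0.15, 3.0, 0.2]:   # the 4th frame is an outlier
        s, mu, var = surprise(mu, var, value)
        print(f"value={value:4.2f}  surprise={s:.3f} nats")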
How humans use visual optic flow to regulate stepping during walking.
Salinas, Mandy M; Wilken, Jason M; Dingwell, Jonathan B
2017-09-01
Humans use visual optic flow to regulate average walking speed. Among many possible strategies available, healthy humans walking on motorized treadmills allow fluctuations in stride length (Ln) and stride time (Tn) to persist across multiple consecutive strides, but rapidly correct deviations in stride speed (Sn = Ln/Tn) at each successive stride, n. Several experiments verified this stepping strategy when participants walked with no optic flow. This study determined how removing or systematically altering optic flow influenced people's stride-to-stride stepping control strategies. Participants walked on a treadmill with a virtual reality (VR) scene projected onto a 3 m tall, 180° semi-cylindrical screen in front of the treadmill. Five conditions were tested: blank screen ("BLANK"), static scene ("STATIC"), or moving scene with optic flow speed slower than ("SLOW"), matched to ("MATCH"), or faster than ("FAST") walking speed. Participants took shorter and faster strides and demonstrated increased stepping variability during the BLANK condition compared to the other conditions. Thus, when visual information was removed, individuals appeared to walk more cautiously. Optic flow influenced both how quickly humans corrected stride speed deviations and how successful they were at enacting this strategy to try to maintain approximately constant speed at each stride. These results were consistent with Weber's law: healthy adults corrected stride speed deviations more rapidly in the no-optic-flow condition (the lower intensity stimulus) than in contexts with non-zero optic flow. These results demonstrate how the temporal characteristics of optic flow influence the ability to correct speed fluctuations during walking. Copyright © 2017 Elsevier B.V. All rights reserved.
The effect of distraction on change detection in crowded acoustic scenes.
Petsas, Theofilos; Harrison, Jemma; Kashino, Makio; Furukawa, Shigeto; Chait, Maria
2016-11-01
In this series of behavioural experiments we investigated the effect of distraction on the maintenance of acoustic scene information in short-term memory. Stimuli were artificial acoustic 'scenes' composed of several (up to twelve) concurrent tone-pip streams ('sources'). A gap (1000 ms) was inserted partway through the 'scene'; changes, in the form of the appearance of a new source or the disappearance of an existing source, occurred after the gap in 50% of the trials. Listeners were instructed to monitor the unfolding 'soundscapes' for these events. Distraction was measured by presenting distractor stimuli during the gap. Experiment 1 used a dual-task design in which listeners were required to perform a task with varying attentional demands ('High Demand' vs. 'Low Demand') on brief auditory (Experiment 1a) or visual (Experiment 1b) signals presented during the gap. Experiments 2 and 3 required participants to ignore distractor sounds and focus on the change detection task. Our results demonstrate that the maintenance of scene information in short-term memory is influenced by the availability of attentional and/or processing resources during the gap, and that this dependence appears to be modality specific. We also show that these processes are susceptible to bottom-up driven distraction even in situations where the distractors are not novel but occur on each trial. Change detection performance was systematically linked with the independently determined perceptual salience of the distractor sound. The findings also demonstrate that the present task may be a useful objective means of determining relative perceptual salience. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
Vanmarcke, Steven; Mullin, Caitlin; Van der Hallen, Ruth; Evers, Kris; Noens, Ilse; Steyaert, Jean; Wagemans, Johan
2016-01-01
Typically developing (TD) adults are able to extract global information from natural images and to categorize them within a single glance. This study aimed at extending these findings to individuals with autism spectrum disorder (ASD) using a free description open-encoding paradigm. Participants were asked to freely describe what they saw when…
Dioptric defocus maps across the visual field for different indoor environments
García, Miguel García; Ohlendorf, Arne; Schaeffel, Frank; Wahl, Siegfried
2017-01-01
One of the factors proposed to regulate eye growth is the error signal derived from defocus in the retina, and this signal might arise from defocus not only in the fovea but across the whole visual field. Therefore, myopia could be better predicted by spatio-temporally mapping the ‘environmental defocus’ over the visual field. At present, no devices are available that could provide this information. A ‘Kinect sensor v1’ camera (Microsoft Corp.) and a portable eye tracker were used to develop a system for quantifying ‘indoor defocus error signals’ across the central 58° of the visual field. Dioptric differences relative to the fovea (assumed to be in focus) were recorded over the visual field, and ‘defocus maps’ were generated for various scenes and tasks. PMID:29359108
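The dioptric difference underlying such defocus maps follows directly from viewing distance: defocus in diopters is the difference between the reciprocal distances (in metres) of a scene point and the fixated point. Below is a minimal Python sketch under that assumption, taking a calibrated depth map in metres and a known fixation pixel; the camera/eye-tracker calibration and field-of-view handling used by the authors are omitted.

    import numpy as np

    def defocus_map(depth_m: np.ndarray, fix_row: int, fix_col: int) -> np.ndarray:
        # Dioptric defocus of every pixel relative to the fixated (assumed in-focus) point.
        # depth_m : 2-D array of viewing distances in metres (e.g., from a depth camera)
        # fix_row, fix_col : image coordinates of the current fixation from the eye tracker
        fixation_diopters = 1.0 / depth_m[fix_row, fix_col]
        return 1.0 / depth_m - fixation_diopters   # positive = nearer than fixation

    # Example: fixating an object at 2 m while a desk edge sits at 0.5 m.
    depth = np.full((4, 4), 2.0)
    depth[3, :] = 0.5
    print(defocus_map(depth, fix_row=0, fix_col=0))
    # bottom row shows +1.5 D of defocus relative to the fixated point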
Perceived Average Orientation Reflects Effective Gist of the Surface.
Cha, Oakyoon; Chong, Sang Chul
2018-03-01
The human ability to represent ensemble visual information, such as average orientation and size, has been suggested as the foundation of gist perception. To effectively summarize different groups of objects into the gist of a scene, observers should form ensembles separately for different groups, even when objects have similar visual features across groups. We hypothesized that the visual system utilizes perceptual groups characterized by spatial configuration and represents separate ensembles for different groups. Therefore, participants could not integrate ensembles of different perceptual groups on a task basis. We asked participants to determine the average orientation of visual elements comprising a surface with a contour situated inside. Although participants were asked to estimate the average orientation of all the elements, they ignored orientation signals embedded in the contour. This constraint may help the visual system to keep the visual features of occluding objects separate from those of the occluded objects.
Visual information processing; Proceedings of the Meeting, Orlando, FL, Apr. 20-22, 1992
NASA Technical Reports Server (NTRS)
Huck, Friedrich O. (Editor); Juday, Richard D. (Editor)
1992-01-01
Topics discussed in these proceedings include nonlinear processing and communications; feature extraction and recognition; image gathering, interpolation, and restoration; image coding; and wavelet transform. Papers are presented on noise reduction for signals from nonlinear systems; driving nonlinear systems with chaotic signals; edge detection and image segmentation of space scenes using fractal analyses; a vision system for telerobotic operation; a fidelity analysis of image gathering, interpolation, and restoration; restoration of images degraded by motion; and information, entropy, and fidelity in visual communication. Attention is also given to image coding methods and their assessment, hybrid JPEG/recursive block coding of images, modified wavelets that accommodate causality, modified wavelet transform for unbiased frequency representation, and continuous wavelet transform of one-dimensional signals by Fourier filtering.
Scene Recognition for Indoor Localization Using a Multi-Sensor Fusion Approach.
Liu, Mengyun; Chen, Ruizhi; Li, Deren; Chen, Yujin; Guo, Guangyi; Cao, Zhipeng; Pan, Yuanjin
2017-12-08
After decades of research, there is still no solution for indoor localization like the GNSS (Global Navigation Satellite System) solution for outdoor environments. The major reasons for this are the complex spatial topology and the RF transmission environment. To deal with these problems, an indoor scene constrained method for localization is proposed in this paper, inspired by the visual cognition ability of the human brain and by progress in the computer vision field on high-level image understanding. Furthermore, a multi-sensor fusion method is implemented on a commercial smartphone, using its cameras, WiFi and inertial sensors. Compared to former research, the camera on a smartphone is used to "see" which scene the user is in. With this information, a particle filter algorithm constrained by scene information is adopted to determine the final location. For indoor scene recognition, we take advantage of deep learning, which has been proven to be highly effective in the computer vision community. For the particle filter, both WiFi and magnetic field signals are used to update the weights of particles. Similar to other fingerprinting localization methods, there are two stages in the proposed system: offline training and online localization. In the offline stage, an indoor scene model is trained with Caffe (one of the most popular open source frameworks for deep learning) and a fingerprint database is constructed from user trajectories in different scenes. To reduce the volume of training data required for deep learning, a fine-tuning approach is adopted for model training. In the online stage, the camera in a smartphone is used to recognize the initial scene. Then a particle filter algorithm is used to fuse the sensor data and determine the final location. To prove the effectiveness of the proposed method, an Android client and a web server were implemented. The Android client is used to collect data and locate a user. The web server is developed for indoor scene model training and communication with the Android client. To evaluate the performance, comparison experiments were conducted and the results demonstrate that a positioning accuracy of 1.32 m at 95% is achievable with the proposed solution. Both positioning accuracy and robustness are enhanced compared to approaches without the scene constraint, including commercial products such as IndoorAtlas. PMID:29292761
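As a rough illustration of the particle-filter stage described above, the Python sketch below weights particles by how well an observed WiFi received-signal-strength (RSS) vector matches a fingerprint predicted at each particle's position, then resamples. The Gaussian observation model, the toy path-loss fingerprint, and all parameter values are assumptions made for illustration, not the authors' implementation.

    import numpy as np

    rng = np.random.default_rng(0)

    def predict(particles, step_len, heading, noise=0.3):
        # Motion update from inertial sensors (pedestrian dead reckoning style).
        dx = step_len * np.cos(heading) + rng.normal(0, noise, len(particles))
        dy = step_len * np.sin(heading) + rng.normal(0, noise, len(particles))
        return particles + np.column_stack([dx, dy])

    def update_weights(particles, observed_rss, fingerprint, sigma=6.0):
        # Weight each particle by how well the observed WiFi RSS vector matches
        # the fingerprint predicted at the particle's position.
        weights = np.empty(len(particles))
        for i, p in enumerate(particles):
            expected = fingerprint(p)                  # RSS vector expected at p (dBm)
            err = np.linalg.norm(observed_rss - expected)
            weights[i] = np.exp(-0.5 * (err / sigma) ** 2)
        weights += 1e-12
        return weights / weights.sum()

    def resample(particles, weights):
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        return particles[idx]

    # Toy fingerprint: RSS of two access points falls off with distance from known positions.
    aps = np.array([[0.0, 0.0], [10.0, 5.0]])
    def fingerprint(pos):
        d = np.linalg.norm(aps - pos, axis=1) + 1.0
        return -40.0 - 20.0 * np.log10(d)

    particles = rng.uniform([0, 0], [10, 5], size=(500, 2))
    particles = predict(particles, step_len=0.7, heading=0.0)
    w = update_weights(particles, observed_rss=fingerprint(np.array([3.0, 2.0])), fingerprint=fingerprint)
    particles = resample(particles, w)
    print(particles.mean(axis=0))   # rough position estimate after one predict/update/resample cycle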
Do reference surfaces influence exocentric pointing?
Doumen, M J A; Kappers, A M L; Koenderink, J J
2008-06-01
All elements of the visual field are known to influence the perception of the egocentric distances of objects. Not only the ground surface of a scene, but also the surface at the back or other objects in the scene can affect an observer's egocentric distance estimation of an object. We tested whether this is also true for exocentric direction estimations. We used an exocentric pointing task to test whether the presence of poster-boards in the visual scene would influence the perception of the exocentric direction between two test-objects. In this task the observer has to direct a pointer, with a remote control, to a target. We placed the poster-boards at various positions in the visual field to test whether these boards would affect the settings of the observer. We found that they only affected the settings when they directly served as a reference for orienting the pointer to the target.
The Faces in Infant-Perspective Scenes Change over the First Year of Life
Jayaraman, Swapnaa; Fausey, Caitlin M.; Smith, Linda B.
2015-01-01
Mature face perception has its origins in the face experiences of infants. However, little is known about the basic statistics of faces in early visual environments. We used head cameras to capture and analyze over 72,000 infant-perspective scenes from 22 infants aged 1-11 months as they engaged in daily activities. The frequency of faces in these scenes declined markedly with age: for the youngest infants, faces were present 15 minutes in every waking hour but only 5 minutes for the oldest infants. In general, the available faces were well characterized by three properties: (1) they belonged to relatively few individuals; (2) they were close and visually large; and (3) they presented views showing both eyes. These three properties most strongly characterized the face corpora of our youngest infants and constitute environmental constraints on the early development of the visual system. PMID:26016988
Short-Term Memory for Figure-Ground Organization in the Visual Cortex
O’Herron, Philip; von der Heydt, Rüdiger
2009-01-01
Whether the visual system uses a buffer to store image information and the duration of that storage have been debated intensely in recent psychophysical studies. The long phases of stable perception of reversible figures suggest a memory that persists for seconds. But persistence of similar duration has not been found in signals of the visual cortex. Here we show that figure-ground signals in the visual cortex can persist for a second or more after the removal of the figure-ground cues. When new figure-ground information is presented, the signals adjust rapidly, but when a figure display is changed to an ambiguous edge display, the signals decay slowly – a behavior that is characteristic of memory devices. Figure-ground signals represent the layout of objects in a scene, and we propose that a short-term memory for object layout is important in providing continuity of perception in the rapid stream of images flooding our eyes. PMID:19285475
Face, Body, and Center of Gravity Mediate Person Detection in Natural Scenes
ERIC Educational Resources Information Center
Bindemann, Markus; Scheepers, Christoph; Ferguson, Heather J.; Burton, A. Mike
2010-01-01
Person detection is an important prerequisite of social interaction, but is not well understood. Following suggestions that people in the visual field can capture a viewer's attention, this study examines the role of the face and the body for person detection in natural scenes. We observed that viewers tend first to look at the center of a scene,…
The Effect of Scene Variation on the Redundant Use of Color in Definite Reference
ERIC Educational Resources Information Center
Koolen, Ruud; Goudbeek, Martijn; Krahmer, Emiel
2013-01-01
This study investigates to what extent the amount of variation in a visual scene causes speakers to mention the attribute color in their definite target descriptions, focusing on scenes in which this attribute is not needed for identification of the target. The results of our three experiments show that speakers are more likely to redundantly…
Radiologists remember mountains better than radiographs, or do they?
Evans, Karla K; Marom, Edith M; Godoy, Myrna C B; Palacio, Diana; Sagebiel, Tara; Cuellar, Sonia Betancourt; McEntee, Mark; Tian, Charles; Brennan, Patrick C; Haygood, Tamara Miner
2016-01-01
Expertise with encoding material has been shown to aid long-term memory for that material. It is not clear how relevant this expertise is for image memorability (e.g., radiologists' memory for radiographs), or how robust it is over time. In two studies, we tested scene memory using a standard long-term memory paradigm. One compared the performance of radiologists to naïve observers on two image sets, chest radiographs and everyday scenes; the other compared radiologists' memory with immediate as opposed to delayed recognition tests using musculoskeletal radiographs and forest scenes. Radiologists' memory was better than novices' for images of expertise but no different for everyday scenes. With the heterogeneity of the image sets equated, radiologists' expertise with radiographs afforded them better memory for the musculoskeletal radiographs than the forest scenes. Enhanced memory for images of expertise disappeared over time, resulting in chance-level performance for both image sets after weeks of delay. Expertise with the material is important for visual memorability but not to the same extent as idiosyncratic detail and variability of the image set. The similar memory decline over time for images of expertise as for everyday scenes further suggests that extended familiarity with an image is not a robust factor for visual memorability.
Volumetric visualization of multiple-return LIDAR data: Using voxels
Stoker, Jason M.
2009-01-01
Elevation data are an important component in the visualization and analysis of geographic information. The creation and display of 3D models representing bare earth, vegetation, and surface structures have become a major focus of light detection and ranging (lidar) remote sensing research in the past few years. Lidar is an active sensor that records the distance, or range, of a laser usually fired from an airplane, helicopter, or satellite. By converting the millions of 3D lidar returns from a system into bare ground, vegetation, or structural elevation information, extremely accurate, high-resolution elevation models can be derived and produced to visualize and quantify scenes in three dimensions. These data can be used to produce high-resolution bare-earth digital elevation models; quantitative estimates of vegetative features such as canopy height, canopy closure, and biomass; and models of urban areas such as building footprints and 3D city models.
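The voxel representation at the heart of this approach can be sketched in a few lines of Python: bin the 3D returns into a regular grid and attach an attribute (here, simply the number of returns) to each occupied cell. The grid resolution and the count-based attribute are illustrative choices for this sketch, not the workflow described in the paper.

    import numpy as np

    def voxelize(points: np.ndarray, voxel_size: float):
        # Bin lidar returns (N x 3 array of x, y, z in metres) into a sparse voxel grid.
        # Returns a dict mapping integer voxel indices (i, j, k) to the number of returns
        # in that voxel; per-voxel density is one simple attribute to render volumetrically.
        origin = points.min(axis=0)
        idx = np.floor((points - origin) / voxel_size).astype(int)
        counts = {}
        for key in map(tuple, idx):
            counts[key] = counts.get(key, 0) + 1
        return counts, origin

    # Example with synthetic returns: a flat ground plane plus a small "canopy" cluster.
    rng = np.random.default_rng(1)
    ground = np.column_stack([rng.uniform(0, 20, 2000), rng.uniform(0, 20, 2000), rng.normal(0, 0.1, 2000)])
    canopy = rng.normal([10, 10, 8], [1.5, 1.5, 1.0], size=(500, 3))
    grid, origin = voxelize(np.vstack([ground, canopy]), voxel_size=1.0)
    print(len(grid), "occupied voxels")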
Mediated-reality magnification for macular degeneration rehabilitation
NASA Astrophysics Data System (ADS)
Martin-Gonzalez, Anabel; Kotliar, Konstantin; Rios-Martinez, Jorge; Lanzl, Ines; Navab, Nassir
2014-10-01
Age-related macular degeneration (AMD) is a gradually progressive eye condition and one of the leading causes of blindness and low vision in the Western world. Prevailing optical visual aids compensate for part of the lost visual function but omit helpful complementary information. This paper proposes an efficient magnification technique, which can be implemented on a head-mounted display, for improving the vision of patients with AMD while preserving global information about the scene. Performance of the magnification approach is evaluated by simulating central vision loss in normally sighted subjects. Visual perception was measured as a function of text reading speed and map route following speed. Statistical analysis of the experimental results suggests that our magnification method improves reading speed 1.2 times and spatial orientation for finding routes on a map 1.5 times compared to a conventional magnification approach, and is thus capable of enhancing the peripheral vision and quality of life of AMD subjects.
Pyramidal neurovision architecture for vision machines
NASA Astrophysics Data System (ADS)
Gupta, Madan M.; Knopf, George K.
1993-08-01
The vision system employed by an intelligent robot must be active; active in the sense that it must be capable of selectively acquiring the minimal amount of relevant information for a given task. An efficient active vision system architecture that is based loosely upon the parallel-hierarchical (pyramidal) structure of the biological visual pathway is presented in this paper. Although the computational architecture of the proposed pyramidal neuro-vision system is far less sophisticated than the architecture of the biological visual pathway, it does retain some essential features such as the converging multilayered structure of its biological counterpart. In terms of visual information processing, the neuro-vision system is constructed from a hierarchy of several interactive computational levels, whereupon each level contains one or more nonlinear parallel processors. Computationally efficient vision machines can be developed by utilizing both the parallel and serial information processing techniques within the pyramidal computing architecture. A computer simulation of a pyramidal vision system for active scene surveillance is presented.
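The converging multilayered idea can be illustrated with an ordinary image pyramid, in which each level is a blurred, subsampled copy of the one below, so that coarse levels can be scanned cheaply before finer levels are consulted. The box filter and the number of levels in this Python sketch are illustrative choices only, not the nonlinear processors described in the paper.

    import numpy as np

    def downsample(img: np.ndarray) -> np.ndarray:
        # One pyramid step: 2x2 box-filter average followed by subsampling.
        h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
        img = img[:h, :w]
        return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2])

    def build_pyramid(img: np.ndarray, levels: int = 4):
        # Return the image plus progressively coarser, lower-resolution copies.
        pyramid = [img.astype(float)]
        for _ in range(levels - 1):
            pyramid.append(downsample(pyramid[-1]))
        return pyramid

    scene = np.random.default_rng(0).random((256, 256))
    for level, im in enumerate(build_pyramid(scene)):
        print(f"level {level}: {im.shape}")   # 256x256, 128x128, 64x64, 32x32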
NASA Technical Reports Server (NTRS)
Sweet, Barbara T.; Kaiser, Mary K.
2013-01-01
Although current simulator visual systems can achieve extremely realistic levels of detail, they do not completely replicate the experience of a pilot sitting in the cockpit, looking at the outside world. Some differences in experience are due to visual artifacts, or perceptual features that would not be present in a naturally viewed scene. Others are due to features that are missing from the simulated scene. In this paper, these differences will be defined and discussed. The significance of these differences will be examined as a function of several particular operational tasks. A framework to facilitate the choice of visual system characteristics based on operational task requirements will be proposed.
Narrative comprehension and production in children with SLI: An eye movement study
Andreu, Llorenç; Sanz-Torrent, Monica; Olmos, Joan Guàrdia; MacWhinney, Brian
2014-01-01
This study investigates narrative comprehension and production in children with specific language impairment (SLI). Twelve children with SLI (mean age 5;8 years) and 12 typically developing children (mean age 5;6 years) participated in an eye-tracking experiment designed to investigate online narrative comprehension and production in Catalan- and Spanish-speaking children with SLI. The comprehension task involved the recording of eye movements during the visual exploration of successive scenes in a story while listening to the associated narrative. With regard to production, the children were asked to retell the story, while once again looking at the scenes, as their eye movements were monitored. During narrative production, children with SLI looked at the most semantically relevant areas of the scenes fewer times than their age-matched controls, but no differences were found in narrative comprehension. Moreover, the analyses of speech production revealed that children with SLI retained less information and made more semantic and syntactic errors during retelling. Implications for theories that characterize SLI are discussed. PMID:21453036
Tung, Nicole D; Barr, Jason; Sheppard, Dion J; Elliot, Douglas A; Tottey, Leah S; Walsh, Kevan A J
2015-05-01
The delivery of forensic science evidence in a clear and understandable manner is an important aspect of a forensic scientist's role during expert witness delivery in a courtroom trial. This article describes an Integrated Evidence Platform (IEP) system based on spherical photography which allows the audience to view the crime scene via a virtual tour and view the forensic scientist's evidence and results in context. Equipment and software programmes used in the creation of the IEP include a Nikon DSLR camera, a Seitz Roundshot VR Drive, PTGui Pro, and Tourweaver Professional Edition. The IEP enables a clear visualization of the crime scene, with embedded information such as photographs of items of interest, complex forensic evidence, the results of laboratory analyses, and scientific opinion evidence presented in context. The IEP has resulted in significant improvements to the pretrial disclosure of forensic results, enhanced the delivery of evidence in court, and improved the jury's understanding of the spatial relationship between results. © 2015 American Academy of Forensic Sciences.
Extended census transform histogram for land-use scene classification
NASA Astrophysics Data System (ADS)
Yuan, Baohua; Li, Shijin
2017-04-01
With the popular use of high-resolution satellite images, more and more research effort has been focused on land-use scene classification. In scene classification, effective visual features can significantly boost the final performance. As a typical texture descriptor, the census transform histogram (CENTRIST) has emerged as a very powerful tool due to its effective representation ability. However, the most prominent limitation of CENTRIST is its small spatial support area, which may not be sufficient to capture the key texture characteristics. We propose an extended CENTRIST (eCENTRIST), which is made up of three subschemes computed over a larger neighborhood scale. The proposed eCENTRIST not only inherits the advantages of CENTRIST but also encodes more useful information about local structures. Meanwhile, a multichannel eCENTRIST, which can capture the interactions among multichannel images, is developed to obtain higher categorization accuracy rates. Experimental results demonstrate that the proposed method can achieve competitive performance when compared to state-of-the-art methods.
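CENTRIST builds on the census transform: each pixel is encoded as an 8-bit code recording whether it is at least as bright as each of its eight neighbours, and an image (or image block) is described by the histogram of these 256 codes. The Python sketch below computes a basic 3x3 census transform histogram; the larger-neighbourhood subschemes of eCENTRIST and its multichannel variant are not reproduced here.

    import numpy as np

    def census_transform_histogram(img: np.ndarray) -> np.ndarray:
        # Basic CENTRIST-style descriptor: 256-bin histogram of 3x3 census codes.
        img = img.astype(float)
        center = img[1:-1, 1:-1]
        codes = np.zeros_like(center, dtype=np.uint8)
        bit = 0
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                neighbour = img[1 + dr:img.shape[0] - 1 + dr, 1 + dc:img.shape[1] - 1 + dc]
                codes |= ((center >= neighbour).astype(np.uint8) << bit)
                bit += 1
        hist = np.bincount(codes.ravel(), minlength=256).astype(float)
        return hist / hist.sum()   # normalized so images of different sizes are comparable

    patch = np.random.default_rng(0).integers(0, 255, size=(64, 64))
    descriptor = census_transform_histogram(patch)
    print(descriptor.shape)   # (256,)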
Temporal dynamics of encoding, storage and reallocation of visual working memory
Bays, Paul M; Gorgoraptis, Nikos; Wee, Natalie; Marshall, Louise; Husain, Masud
2012-01-01
The process of encoding a visual scene into working memory has previously been studied using binary measures of recall. Here we examine the temporal evolution of memory resolution, based on observers’ ability to reproduce the orientations of objects presented in brief, masked displays. Recall precision was accurately described by the interaction of two independent constraints: an encoding limit that determines the maximum rate at which information can be transferred into memory, and a separate storage limit that determines the maximum fidelity with which information can be maintained. Recall variability decreased incrementally with time, consistent with a parallel encoding process in which visual information from multiple objects accumulates simultaneously in working memory. No evidence was observed for a limit on the number of items stored. Cueing one display item with a brief flash led to rapid development of a recall advantage for that item. This advantage was short-lived if the cue was simply a salient visual event, but was maintained if it indicated an object of particular relevance to the task. These cueing effects were observed even for items that had already been encoded into memory, indicating that limited memory resources can be rapidly reallocated to prioritize salient or goal-relevant information. PMID:21911739
On the role of working memory in spatial contextual cueing.
Travis, Susan L; Mattingley, Jason B; Dux, Paul E
2013-01-01
The human visual system receives more information than can be consciously processed. To overcome this capacity limit, we employ attentional mechanisms to prioritize task-relevant (target) information over less relevant (distractor) information. Regularities in the environment can facilitate the allocation of attention, as demonstrated by the spatial contextual cueing paradigm. When observers are exposed repeatedly to a scene and invariant distractor information, learning from earlier exposures enhances the search for the target. Here, we investigated whether spatial contextual cueing draws on spatial working memory resources and, if so, at what level of processing working memory load has its effect. Participants performed 2 tasks concurrently: a visual search task, in which the spatial configuration of some search arrays occasionally repeated, and a spatial working memory task. Increases in working memory load significantly impaired contextual learning. These findings indicate that spatial contextual cueing utilizes working memory resources.
Top-down influences on visual attention during listening are modulated by observer sex.
Shen, John; Itti, Laurent
2012-07-15
In conversation, women have a small advantage in decoding non-verbal communication compared to men. In light of these findings, we sought to determine whether sex differences also existed in visual attention during a related listening task, and if so, if the differences existed among attention to high-level aspects of the scene or to conspicuous visual features. Using eye-tracking and computational techniques, we present direct evidence that men and women orient attention differently during conversational listening. We tracked the eyes of 15 men and 19 women who watched and listened to 84 clips featuring 12 different speakers in various outdoor settings. At the fixation following each saccadic eye movement, we analyzed the type of object that was fixated. Men gazed more often at the mouth and women at the eyes of the speaker. Women more often exhibited "distracted" saccades directed away from the speaker and towards a background scene element. Examining the multi-scale center-surround variation in low-level visual features (static: color, intensity, orientation, and dynamic: motion energy), we found that men consistently selected regions which expressed more variation in dynamic features, which can be attributed to a male preference for motion and a female preference for areas that may contain nonverbal information about the speaker. In sum, significant differences were observed, which we speculate arise from different integration strategies of visual cues in selecting the final target of attention. Our findings have implications for studies of sex in nonverbal communication, as well as for more predictive models of visual attention. Published by Elsevier Ltd.
How virtual reality works: illusions of vision in "real" and virtual environments
NASA Astrophysics Data System (ADS)
Stark, Lawrence W.
1995-04-01
Visual illusions abound in normal vision--illusions of clarity and completeness, of continuity in time and space, of presence and vivacity--and are part and parcel of the visual world in which we live. These illusions are discussed in terms of the human visual system, with its high-resolution fovea, moved from point to point in the visual scene by rapid saccadic eye movements (EMs). This sampling of visual information is supplemented by a low-resolution, wide peripheral field of view, especially sensitive to motion. Cognitive-spatial models controlling perception, imagery, and 'seeing' also control the EMs that shift the fovea in the Scanpath mode. These illusions provide for presence, the sense of being within an environment. They equally well lead to 'Telepresence,' the sense of being within a virtual display, especially if the operator is intensely interacting within an eye-hand and head-eye human-machine interface that provides for congruent visual and motor frames of reference. Interaction, immersion, and interest compel telepresence; intuitive functioning and engineered information flows can optimize human adaptation to the artificial new world of virtual reality, as virtual reality expands into entertainment, simulation, telerobotics, scientific visualization, and other professional work.
Adeli, Hossein; Vitu, Françoise; Zelinsky, Gregory J
2017-02-08
Modern computational models of attention predict fixations using saliency maps and target maps, which prioritize locations for fixation based on feature contrast and target goals, respectively. But whereas many such models are biologically plausible, none have looked to the oculomotor system for design constraints or parameter specification. Conversely, although most models of saccade programming are tightly coupled to underlying neurophysiology, none have been tested using real-world stimuli and tasks. We combined the strengths of these two approaches in MASC, a model of attention in the superior colliculus (SC) that captures known neurophysiological constraints on saccade programming. We show that MASC predicted the fixation locations of humans freely viewing naturalistic scenes and performing exemplar and categorical search tasks, a breadth achieved by no other existing model. Moreover, it did this as well or better than its more specialized state-of-the-art competitors. MASC's predictive success stems from its inclusion of high-level but core principles of SC organization: an over-representation of foveal information, size-invariant population codes, cascaded population averaging over distorted visual and motor maps, and competition between motor point images for saccade programming, all of which cause further modulation of priority (attention) after projection of saliency and target maps to the SC. Only by incorporating these organizing brain principles into our models can we fully understand the transformation of complex visual information into the saccade programs underlying movements of overt attention. With MASC, a theoretical footing now exists to generate and test computationally explicit predictions of behavioral and neural responses in visually complex real-world contexts. SIGNIFICANCE STATEMENT The superior colliculus (SC) performs a visual-to-motor transformation vital to overt attention, but existing SC models cannot predict saccades to visually complex real-world stimuli. We introduce a brain-inspired SC model that outperforms state-of-the-art image-based competitors in predicting the sequences of fixations made by humans performing a range of everyday tasks (scene viewing and exemplar and categorical search), making clear the value of looking to the brain for model design. This work is significant in that it will drive new research by making computationally explicit predictions of SC neural population activity in response to naturalistic stimuli and tasks. It will also serve as a blueprint for the construction of other brain-inspired models, helping to usher in the next generation of truly intelligent autonomous systems. Copyright © 2017 the authors 0270-6474/17/371453-15$15.00/0.
Scene perception and the visual control of travel direction in navigating wood ants
Collett, Thomas S.; Lent, David D.; Graham, Paul
2014-01-01
This review reflects a few of Mike Land's many and varied contributions to visual science. In it, we show for wood ants, as Mike has done for a variety of animals, including readers of this piece, what can be learnt from a detailed analysis of an animal's visually guided eye, head or body movements. In the case of wood ants, close examination of their body movements, as they follow visually guided routes, is starting to reveal how they perceive and respond to their visual world and negotiate a path within it. We describe first some of the mechanisms that underlie the visual control of their paths, emphasizing that vision is not the ant's only sense. In the second part, we discuss how remembered local shape-dependent and global shape-independent features of a visual scene may interact in guiding the ant's path. PMID:24395962
Keshner, E A; Dhaher, Y
2008-07-01
Multiplanar environmental motion could generate head instability, particularly if the visual surround moves in planes orthogonal to a physical disturbance. We combined sagittal plane surface translations with visual field disturbances in 12 healthy (29-31 years) and 3 visually sensitive (27-57 years) adults. Center of pressure (COP), peak head angles, and RMS values of head motion were calculated and a three-dimensional model of joint motion was developed to examine gross head motion in three planes. We found that subjects standing quietly in front of a visual scene translating in the sagittal plane produced significantly greater (p<0.003) head motion in yaw than when on a translating platform. However, when the platform was translated in the dark or with a visual scene rotating in roll, head motion orthogonal to the plane of platform motion significantly increased (p<0.02). Visually sensitive subjects having no history of vestibular disorder produced large, delayed compensatory head motion. Orthogonal head motions were significantly greater in visually sensitive than in healthy subjects in the dark (p<0.05) and with a stationary scene (p<0.01). We concluded that motion of the visual field could modify compensatory response kinematics of a freely moving head in planes orthogonal to the direction of a physical perturbation. These results suggest that the mechanisms controlling head orientation in space are distinct from those that control trunk orientation in space. These behaviors would have been missed if only COP data were considered. Data suggest that rehabilitation training can be enhanced by combining visual and mechanical perturbation paradigms.
Long-Term Memories Bias Sensitivity and Target Selection in Complex Scenes
Patai, Eva Zita; Doallo, Sonia; Nobre, Anna Christina
2014-01-01
In everyday situations we often rely on our memories to find what we are looking for in our cluttered environment. Recently, we developed a new experimental paradigm to investigate how long-term memory (LTM) can guide attention, and showed how the pre-exposure to a complex scene in which a target location had been learned facilitated the detection of the transient appearance of the target at the remembered location (Summerfield, Lepsien, Gitelman, Mesulam, & Nobre, 2006; Summerfield, Rao, Garside, & Nobre, 2011). The present study extends these findings by investigating whether and how LTM can enhance perceptual sensitivity to identify targets occurring within their complex scene context. Behavioral measures showed superior perceptual sensitivity (d′) for targets located in remembered spatial contexts. We used the N2pc event-related potential to test whether LTM modulated the process of selecting the target from its scene context. Surprisingly, in contrast to effects of visual spatial cues or implicit contextual cueing, LTM for target locations significantly attenuated the N2pc potential. We propose that the mechanism by which these explicitly available LTMs facilitate perceptual identification of targets may differ from mechanisms triggered by other types of top-down sources of information. PMID:23016670
Color constancy in a scene with bright colors that do not have a fully natural surface appearance.
Fukuda, Kazuho; Uchikawa, Keiji
2014-04-01
Theoretical and experimental approaches have proposed that color constancy involves a correction related to some average of stimulation over the scene, and some of the studies showed that the average gives greater weight to surrounding bright colors. However, in a natural scene, high-luminance elements do not necessarily carry information about the scene illuminant when the luminance is too high for it to appear as a natural object color. The question is how a surrounding color's appearance mode influences its contribution to the degree of color constancy. Here the stimuli were simple geometric patterns, and the luminance of surrounding colors was tested over the range beyond the luminosity threshold. Observers performed perceptual achromatic setting on the test patch in order to measure the degree of color constancy and evaluated the surrounding bright colors' appearance mode. Broadly, our results support the assumption that the visual system counts only the colors in the object-color appearance for color constancy. However, detailed analysis indicated that surrounding colors without a fully natural object-color appearance had some sort of influence on color constancy. Consideration of this contribution of unnatural object color might be important for precise modeling of human color constancy.
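The "average of stimulation over the scene" idea can be made concrete with a gray-world style correction in which the illuminant estimate is a (possibly luminance-weighted) mean of scene colors, optionally excluding pixels bright enough to appear self-luminous. The weighting scheme and the exclusion threshold in the Python sketch below are illustrative assumptions, not the authors' model.

    import numpy as np

    def estimate_illuminant(img: np.ndarray, luminosity_threshold: float = 0.95,
                            brightness_weighting: bool = True) -> np.ndarray:
        # Estimate the scene illuminant as a weighted average of pixel colors.
        # img : H x W x 3 array of linear RGB in [0, 1]
        # Pixels above the luminosity threshold are treated as self-luminous
        # (not a natural object color) and excluded from the average.
        lum = img.mean(axis=2)
        surface = lum < luminosity_threshold          # keep only object-color pixels
        weights = lum[surface] if brightness_weighting else np.ones(surface.sum())
        colors = img[surface]
        return (colors * weights[:, None]).sum(axis=0) / weights.sum()

    def correct(img: np.ndarray) -> np.ndarray:
        # Von Kries style correction: divide each channel by the illuminant estimate.
        illum = estimate_illuminant(img)
        return np.clip(img / illum * illum.mean(), 0.0, 1.0)

    rng = np.random.default_rng(0)
    scene = np.clip(rng.random((32, 32, 3)) * np.array([1.0, 0.8, 0.6]), 0, 1)  # warm color cast
    print(estimate_illuminant(scene))   # roughly proportional to the cast [1.0, 0.8, 0.6]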
Context matters: Anterior and posterior cortical midline responses to sad movie scenes.
Schlochtermeier, L H; Pehrs, C; Bakels, J-H; Jacobs, A M; Kappelhoff, H; Kuchinke, L
2017-04-15
Narrative movies can create powerful emotional responses. While recent research has advanced the understanding of neural networks involved in immersive movie viewing, their modulation within a movie's dynamic context remains inconclusive. In this study, 24 healthy participants passively watched sad scene climaxes taken from 24 romantic comedies, while brain activity was measured using functional magnetic resonance imaging (fMRI). To study the effects of context, the sad scene climaxes were presented with a coherent scene context, with a replaced non-coherent context, or without context. In a second viewing, the same clips were rated continuously for sadness. The ratings varied over time with peaks of experienced sadness within the assumed climax intervals. Activations in anterior and posterior cortical midline regions increased when the climaxes were presented with either coherent or replaced context, while activation in the temporal gyri decreased. This difference was more pronounced for the coherent context condition. Psychophysiological interaction (PPI) analyses showed a context-dependent coupling of midline regions with occipital visual and sub-cortical reward regions. Our results demonstrate the pivotal role of midline structures and their interaction with perceptual and reward areas in processing contextually embedded socio-emotional information in movies. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Assadi, Amir H.
2001-11-01
Perceptual geometry is an emerging field of interdisciplinary research whose objectives focus on the study of geometry from the perspective of visual perception and, in turn, the application of such geometric findings to the ecological study of vision. Perceptual geometry attempts to answer fundamental questions in the perception of form and the representation of space through a synthesis of cognitive and biological theories of visual perception with geometric theories of the physical world. Perception of form and space are among the fundamental problems in vision science. In recent cognitive and computational models of human perception, natural scenes are used systematically as preferred visual stimuli. Among the key problems in the perception of form and space, we have examined the perception of the geometry of natural surfaces and curves, e.g., as encountered in the observer's environment. Besides a systematic mathematical foundation for a remarkably general framework, the advantages of the Gestalt theory of natural surfaces include a concrete computational approach for simulating or recreating images whose geometric invariants and quantities might be perceived and estimated by an observer. The latter is at the very foundation of understanding the nature of the perception of space and form, and of the (computer graphics) problem of rendering scenes that visually invoke virtual presence.
Acute stress influences the discrimination of complex scenes and complex faces in young healthy men.
Paul, M; Lech, R K; Scheil, J; Dierolf, A M; Suchan, B; Wolf, O T
2016-04-01
The stress-induced release of glucocorticoids has been demonstrated to influence hippocampal functions via the modulation of specific receptors. At the behavioral level, stress is known to influence hippocampus-dependent long-term memory. In recent years, studies have consistently associated the hippocampus with the non-mnemonic perception of scenes, while adjacent regions in the medial temporal lobe have been associated with the perception of objects and faces. So far it is not known whether and how stress influences such non-mnemonic perceptual processes. In a behavioral study, fifty male participants were subjected either to the stressful socially evaluated cold-pressor test or to a non-stressful control procedure before they completed a visual discrimination task comprising scenes and faces. The complexity of the face and scene stimuli was manipulated to create easy and difficult conditions. A significant three-way interaction between stress, stimulus type and complexity was found. Stressed participants tended to commit more errors in the complex scenes condition. For complex faces, a descriptive tendency in the opposite direction (fewer errors under stress) was observed. As a result, the difference between the number of errors for scenes and the number of errors for faces was significantly larger in the stress group. These results indicate that, beyond the effects of stress on long-term memory, stress influences the discrimination of spatial information, especially when perception is characterized by high complexity. Copyright © 2016 Elsevier Ltd. All rights reserved.
Contextual modulation of primary visual cortex by auditory signals.
Petro, L S; Paton, A T; Muckli, L
2017-02-19
Early visual cortex receives non-feedforward input from lateral and top-down connections (Muckli & Petro 2013 Curr. Opin. Neurobiol. 23, 195-201. (doi:10.1016/j.conb.2013.01.020)), including long-range projections from auditory areas. Early visual cortex can code for high-level auditory information, with neural patterns representing natural sound stimulation (Vetter et al. 2014 Curr. Biol. 24, 1256-1262. (doi:10.1016/j.cub.2014.04.020)). We discuss a number of questions arising from these findings. What is the adaptive function of bimodal representations in visual cortex? What type of information projects from auditory to visual cortex? What are the anatomical constraints of auditory information in V1, for example, periphery versus fovea, superficial versus deep cortical layers? Is there a putative neural mechanism we can infer from human neuroimaging data and recent theoretical accounts of cortex? We also present data showing we can read out high-level auditory information from the activation patterns of early visual cortex even when visual cortex receives simple visual stimulation, suggesting independent channels for visual and auditory signals in V1. We speculate which cellular mechanisms allow V1 to be contextually modulated by auditory input to facilitate perception, cognition and behaviour. Beyond cortical feedback that facilitates perception, we argue that there is also feedback serving counterfactual processing during imagery, dreaming and mind wandering, which is not relevant for immediate perception but for behaviour and cognition over a longer time frame. This article is part of the themed issue 'Auditory and visual scene analysis'. © 2017 The Authors.
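Reading out auditory information from early visual cortex activation patterns, as described above, is typically done with cross-validated multivariate pattern classification. The sketch below is a generic decoding pipeline on synthetic data, not the authors' analysis; voxel counts, labels, and the classifier choice are placeholders.

```python
# Generic cross-validated decoding sketch (illustrative; not the authors' pipeline).
# X: trials x V1-voxel activation patterns, y: auditory sound-category labels.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.standard_normal((60, 500))          # hypothetical voxel patterns
y = np.repeat([0, 1, 2], 20)                # three hypothetical sound categories

clf = make_pipeline(StandardScaler(), LinearSVC())
scores = cross_val_score(clf, X, y, cv=5)   # chance is ~0.33 for three balanced classes
print(scores.mean())
```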
Contextual modulation of primary visual cortex by auditory signals
Paton, A. T.
2017-01-01
Early visual cortex receives non-feedforward input from lateral and top-down connections (Muckli & Petro 2013 Curr. Opin. Neurobiol. 23, 195–201. (doi:10.1016/j.conb.2013.01.020)), including long-range projections from auditory areas. Early visual cortex can code for high-level auditory information, with neural patterns representing natural sound stimulation (Vetter et al. 2014 Curr. Biol. 24, 1256–1262. (doi:10.1016/j.cub.2014.04.020)). We discuss a number of questions arising from these findings. What is the adaptive function of bimodal representations in visual cortex? What type of information projects from auditory to visual cortex? What are the anatomical constraints of auditory information in V1, for example, periphery versus fovea, superficial versus deep cortical layers? Is there a putative neural mechanism we can infer from human neuroimaging data and recent theoretical accounts of cortex? We also present data showing we can read out high-level auditory information from the activation patterns of early visual cortex even when visual cortex receives simple visual stimulation, suggesting independent channels for visual and auditory signals in V1. We speculate which cellular mechanisms allow V1 to be contextually modulated by auditory input to facilitate perception, cognition and behaviour. Beyond cortical feedback that facilitates perception, we argue that there is also feedback serving counterfactual processing during imagery, dreaming and mind wandering, which is not relevant for immediate perception but for behaviour and cognition over a longer time frame. This article is part of the themed issue ‘Auditory and visual scene analysis’. PMID:28044015
Early Visual Foraging in Relationship to Familial Risk for Autism and Hyperactivity/Inattention.
Gliga, Teodora; Smith, Tim J; Likely, Noreen; Charman, Tony; Johnson, Mark H
2015-12-04
Information foraging is atypical in both autism spectrum disorders (ASDs) and ADHD; however, while ASD is associated with restricted exploration and preference for sameness, ADHD is characterized by hyperactivity and increased novelty seeking. Here, we ask whether similar biases are present in visual foraging in younger siblings of children with a diagnosis of ASD with or without additional high levels of hyperactivity and inattention. Fifty-four low-risk controls (LR) and 50 high-risk siblings (HR) took part in an eye-tracking study at 8 and 14 months and at 3 years of age. At 8 months, siblings of children with ASD and low levels of hyperactivity/inattention (HR/ASD-HI) were more likely to return to previously visited areas in the visual scene than were LR and siblings of children with ASD and high levels of hyperactivity/inattention (HR/ASD+HI). We show that visual foraging is atypical in infants at risk for ASD. We also reveal a paradoxical effect, in that additional family risk for ADHD core symptoms mitigates the effect of ASD risk on visual information foraging. © The Author(s) 2015.
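The foraging measure described above, the likelihood of returning to previously visited areas, can be operationalized as a revisit rate over a sequence of fixated regions. The sketch below is one plausible formulation, not the study's analysis code; the exclusion of immediate refixations is an assumption.

```python
# Illustrative revisit-rate sketch (assumed measure; not the study's analysis code).
def revisit_rate(fixation_regions):
    """Proportion of region-to-region moves that land on a region visited earlier,
    excluding immediate refixations of the current region."""
    visited, revisits, moves = set(), 0, 0
    previous = None
    for region in fixation_regions:
        if previous is not None and region != previous:
            moves += 1
            if region in visited:
                revisits += 1
        visited.add(region)
        previous = region
    return revisits / moves if moves else 0.0

print(revisit_rate(["A", "B", "A", "C", "B", "D"]))  # 2 of 5 moves return to old regions
```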
Bag of Lines (BoL) for Improved Aerial Scene Representation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sridharan, Harini; Cheriyadat, Anil M.
2014-09-22
Feature representation is a key step in automated visual content interpretation. In this letter, we present a robust feature representation technique, referred to as bag of lines (BoL), for high-resolution aerial scenes. The proposed technique involves extracting and compactly representing low-level line primitives from the scene. The compact scene representation is generated by counting the different types of lines representing various linear structures in the scene. Through extensive experiments, we show that the proposed scene representation is invariant to scale changes and scene conditions and can discriminate urban scene categories accurately. We compare the BoL representation with the popular scale-invariant feature transform (SIFT) and Gabor wavelets for their classification and clustering performance on an aerial scene database consisting of images acquired by sensors with different spatial resolutions. The proposed BoL representation outperforms the SIFT- and Gabor-based representations.
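The BoL representation counts line primitives by type to form a compact scene descriptor. The sketch below approximates that idea with OpenCV's probabilistic Hough transform and a joint orientation-length histogram; it is an assumed simplification, not the published implementation, and all thresholds and bin counts are illustrative.

```python
# Simplified bag-of-lines sketch: detect line segments, then histogram them by
# orientation and length. An assumed approximation, not the published BoL code.
import cv2
import numpy as np

def bag_of_lines(gray_image, n_orient_bins=8, n_length_bins=4, max_length=200.0):
    edges = cv2.Canny(gray_image, 50, 150)
    segments = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=40,
                               minLineLength=20, maxLineGap=5)
    hist = np.zeros((n_orient_bins, n_length_bins))
    if segments is None:
        return hist.ravel()
    for x1, y1, x2, y2 in segments[:, 0]:
        angle = np.arctan2(y2 - y1, x2 - x1) % np.pi           # undirected orientation
        length = np.hypot(x2 - x1, y2 - y1)
        o = min(int(angle / np.pi * n_orient_bins), n_orient_bins - 1)
        l = min(int(length / max_length * n_length_bins), n_length_bins - 1)
        hist[o, l] += 1
    hist = hist.ravel()
    return hist / hist.sum() if hist.sum() else hist           # normalise for scale tolerance
```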
How music alters a kiss: superior temporal gyrus controls fusiform-amygdalar effective connectivity.
Pehrs, Corinna; Deserno, Lorenz; Bakels, Jan-Hendrik; Schlochtermeier, Lorna H; Kappelhoff, Hermann; Jacobs, Arthur M; Fritz, Thomas Hans; Koelsch, Stefan; Kuchinke, Lars
2014-11-01
While watching movies, the brain integrates the visual information and the musical soundtrack into a coherent percept. Multisensory integration can lead to emotion elicitation on which soundtrack valences may have a modulatory impact. Here, dynamic kissing scenes from romantic comedies were presented to 22 participants (13 females) during functional magnetic resonance imaging scanning. The kissing scenes were either accompanied by happy music, sad music or no music. Evidence from cross-modal studies motivated a predefined three-region network for multisensory integration of emotion, consisting of fusiform gyrus (FG), amygdala (AMY) and anterior superior temporal gyrus (aSTG). The interactions in this network were investigated using dynamic causal models of effective connectivity. This revealed bilinear modulations by happy and sad music with suppression effects on the connectivity from FG and AMY to aSTG. Non-linear dynamic causal modeling showed a suppressive gating effect of aSTG on fusiform-amygdalar connectivity. In conclusion, fusiform to amygdala coupling strength is modulated via feedback through aSTG as region for multisensory integration of emotional material. This mechanism was emotion-specific and more pronounced for sad music. Therefore, soundtrack valences may modulate emotion elicitation in movies by differentially changing preprocessed visual information to the amygdala. © The Author (2013). Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
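The bilinear and nonlinear dynamic causal models used in this connectivity analysis follow the standard DCM state equations, reproduced here only to show where the reported effects enter the model: modulatory inputs (the happy or sad soundtrack) act through the B matrices, while activity in a gating region such as aSTG acts through the D matrices of the nonlinear extension.

```latex
% Bilinear DCM: inputs u_j (e.g., happy vs. sad music) modulate coupling via B^{(j)}.
\dot{x} = \Bigl(A + \sum_{j} u_j B^{(j)}\Bigr) x + C u
% Nonlinear DCM: activity x_i in a gating region (e.g., aSTG) modulates coupling via D^{(i)}.
\dot{x} = \Bigl(A + \sum_{j} u_j B^{(j)} + \sum_{i} x_i D^{(i)}\Bigr) x + C u
```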
How music alters a kiss: superior temporal gyrus controls fusiform–amygdalar effective connectivity
Deserno, Lorenz; Bakels, Jan-Hendrik; Schlochtermeier, Lorna H.; Kappelhoff, Hermann; Jacobs, Arthur M.; Fritz, Thomas Hans; Koelsch, Stefan; Kuchinke, Lars
2014-01-01
While watching movies, the brain integrates the visual information and the musical soundtrack into a coherent percept. Multisensory integration can lead to emotion elicitation on which soundtrack valences may have a modulatory impact. Here, dynamic kissing scenes from romantic comedies were presented to 22 participants (13 females) during functional magnetic resonance imaging scanning. The kissing scenes were either accompanied by happy music, sad music or no music. Evidence from cross-modal studies motivated a predefined three-region network for multisensory integration of emotion, consisting of fusiform gyrus (FG), amygdala (AMY) and anterior superior temporal gyrus (aSTG). The interactions in this network were investigated using dynamic causal models of effective connectivity. This revealed bilinear modulations by happy and sad music with suppression effects on the connectivity from FG and AMY to aSTG. Non-linear dynamic causal modeling showed a suppressive gating effect of aSTG on fusiform–amygdalar connectivity. In conclusion, fusiform to amygdala coupling strength is modulated via feedback through aSTG as region for multisensory integration of emotional material. This mechanism was emotion-specific and more pronounced for sad music. Therefore, soundtrack valences may modulate emotion elicitation in movies by differentially changing preprocessed visual information to the amygdala. PMID:24298171
Adaptation of facial synthesis to parameter analysis in MPEG-4 visual communication
NASA Astrophysics Data System (ADS)
Yu, Lu; Zhang, Jingyu; Liu, Yunhai
2000-12-01
In MPEG-4, Facial Definition Parameters (FDPs) and Facial Animation Parameters (FAPs) are defined to animate a facial object. Most previous facial animation reconstruction systems focused on synthesizing animation from manually or automatically generated FAPs, rather than from FAPs extracted from natural video scenes. In this paper, an analysis-synthesis MPEG-4 visual communication system is established, in which facial animation is reconstructed from FAPs extracted from natural video scenes.
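In MPEG-4 facial animation, each FAP displaces a facial feature point along a defined axis, scaled by a face-specific Facial Animation Parameter Unit (FAPU). The sketch below shows one plausible way extracted FAP values could drive a face mesh; the table layout, FAP id, axis, and FAPU values are hypothetical and are not taken from the paper or from the standard's exact tables.

```python
# Illustrative FAP-application sketch (assumed structure; not from the paper).
# Each FAP moves one feature point along a fixed axis, scaled by a FAPU derived
# from the face model's proportions (e.g., mouth width, eye separation).
import numpy as np

def apply_faps(vertices, fap_values, fap_table, fapu):
    """vertices: (N, 3) neutral face mesh; fap_values: {fap_id: amplitude};
    fap_table: {fap_id: (vertex_index, unit_axis, fapu_name)} -- hypothetical layout."""
    animated = vertices.copy()
    for fap_id, amplitude in fap_values.items():
        vertex_index, axis, fapu_name = fap_table[fap_id]
        animated[vertex_index] += amplitude * fapu[fapu_name] * np.asarray(axis)
    return animated

fapu = {"MW": 0.01, "ENS": 0.008}                     # hypothetical FAPU sizes
table = {3: (10, (0.0, -1.0, 0.0), "MW")}             # hypothetical FAP driving a jaw vertex
mesh = np.zeros((20, 3))
print(apply_faps(mesh, {3: 128.0}, table, fapu)[10])  # displaced feature point
```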
Guidance for Development of a Flight Simulator Specification
2007-05-01
the simulated line of sight to the moon is less than one degree, and that the moon appears to move smoothly across the visual scene. The phase of the...Agencies have adopted the definition used by Optics Companies (this definition has also been adopted in this revision of the Air Force Guide...simulators that require tracking the target as it slues across the displayed scene, such as with air-to-ground or air-to-air combat tasks. Visual systems