Figure-Ground Organization in Visual Cortex for Natural Scenes
2016-01-01
Figure-ground organization and border-ownership assignment are essential for understanding natural scenes. It has been shown that many neurons in the macaque visual cortex signal border-ownership in displays of simple geometric shapes such as squares, but how well these neurons resolve border-ownership in natural scenes is not known. We studied area V2 neurons in behaving macaques with static images of complex natural scenes. We found that about half of the neurons were border-ownership selective for contours in natural scenes, and this selectivity originated from the image context. The border-ownership signals emerged within 70 ms after stimulus onset, only ∼30 ms after response onset. A substantial fraction of neurons were highly consistent across scenes. Thus, the cortical mechanisms of figure-ground organization are fast and efficient even in images of complex natural scenes. Understanding how the brain performs this task so fast remains a challenge. PMID:28058269
Gaze Control in Complex Scene Perception
2004-01-01
retained in memory from previously attended objects in natural scenes. Psychonomic Bulletin & Review, 8, 761-768. Henderson, J. M., Falk, R. J., Minut, S., Dyer, F. C., & Mahadevan, S. (2001). Gaze control for face
Wagner, Dylan D; Kelley, William M; Heatherton, Todd F
2011-12-01
People are able to rapidly infer complex personality traits and mental states even from the most minimal person information. Research has shown that when observers view a natural scene containing people, they spend a disproportionate amount of their time looking at the social features (e.g., faces, bodies). Does this preference for social features merely reflect the biological salience of these features or are observers spontaneously attempting to make sense of complex social dynamics? Using functional neuroimaging, we investigated neural responses to social and nonsocial visual scenes in a large sample of participants (n = 48) who varied on an individual difference measure assessing empathy and mentalizing (i.e., empathizing). Compared with other scene categories, viewing natural social scenes activated regions associated with social cognition (e.g., dorsomedial prefrontal cortex and temporal poles). Moreover, activity in these regions during social scene viewing was strongly correlated with individual differences in empathizing. These findings offer neural evidence that observers spontaneously engage in social cognition when viewing complex social material but that the degree to which people do so is mediated by individual differences in trait empathizing.
A category adjustment approach to memory for spatial location in natural scenes.
Holden, Mark P; Curby, Kim M; Newcombe, Nora S; Shipley, Thomas F
2010-05-01
Memories for spatial locations often show systematic errors toward the central value of the surrounding region. This bias has been explained using a Bayesian model in which fine-grained and categorical information are combined (Huttenlocher, Hedges, & Duncan, 1991). However, experiments testing this model have largely used locations contained in simple geometric shapes. Use of this paradigm raises 2 issues. First, do results generalize to the complex natural world? Second, what types of information might be used to segment complex spaces into constituent categories? Experiment 1 addressed the 1st question by showing a bias toward prototypical values in memory for spatial locations in complex natural scenes. Experiment 2 addressed the 2nd question by manipulating the availability of basic visual cues (using color negatives) or of semantic information about the scene (using inverted images). Error patterns suggest that both perceptual and conceptual information are involved in segmentation. The possible neurological foundations of location memory of this kind are discussed. PsycINFO Database Record (c) 2010 APA, all rights reserved.
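The Bayesian category-adjustment model cited here (Huttenlocher, Hedges, & Duncan, 1991) combines a noisy fine-grained memory with the category prototype, each weighted by its reliability. A minimal sketch of that combination rule; the function name and the specific variances are illustrative, not taken from the paper:

```python
import numpy as np

def category_adjusted_estimate(fine_grained, prototype, sigma_fine, sigma_cat):
    """Combine a noisy fine-grained memory with the category prototype.

    The fine-grained value is weighted by its reliability: the larger its
    noise (sigma_fine) relative to the category spread (sigma_cat), the more
    the estimate is pulled toward the prototype (the region's central value).
    """
    w = sigma_cat**2 / (sigma_cat**2 + sigma_fine**2)  # weight on fine-grained memory
    return w * fine_grained + (1.0 - w) * prototype

# A location remembered at 10 units with memory noise sigma = 1, inside a
# region whose prototype (central value) is 0 with spread sigma = 2:
estimate = category_adjusted_estimate(10.0, 0.0, sigma_fine=1.0, sigma_cat=2.0)
# w = 4 / (4 + 1) = 0.8, so the estimate is biased from 10.0 toward 0.0: 8.0
```

Increasing the memory noise shifts weight toward the prototype, reproducing the systematic bias toward central values that the experiments report.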
The lawful imprecision of human surface tilt estimation in natural scenes
Kim, Seha; Burge, Johannes
2018-01-31
Estimating local surface orientation (slant and tilt) is fundamental to recovering the three-dimensional structure of the environment. It is unknown how well humans perform this task in natural scenes. Here, with a database of natural stereo-images having groundtruth surface orientation at each pixel, we find dramatic differences in human tilt estimation with natural and artificial stimuli. Estimates are precise and unbiased with artificial stimuli and imprecise and strongly biased with natural stimuli. An image-computable Bayes optimal model grounded in natural scene statistics predicts human bias, precision, and trial-by-trial errors without fitting parameters to the human data. The similarities between human and model performance suggest that the complex human performance patterns with natural stimuli are lawful, and that human visual systems have internalized local image and scene statistics to optimally infer the three-dimensional structure of the environment. These results generalize our understanding of vision from the lab to the real world. © 2018, Kim et al. PMID:29384477
The Nature and Timing of Tele-Pseudoscopic Experiences
Hill, Harold; Allison, Robert S
2016-01-01
Interchanging the left and right eye views of a scene (pseudoscopic viewing) has been reported to produce vivid stereoscopic effects under certain conditions. In two separate field studies, we examined the experiences of 124 observers (76 in Study 1 and 48 in Study 2) while pseudoscopically viewing a distant natural outdoor scene. We found large individual differences in both the nature and the timing of their pseudoscopic experiences. While some observers failed to notice anything unusual about the pseudoscopic scene, most experienced multiple pseudoscopic phenomena, including apparent scene depth reversals, apparent object shape reversals, apparent size and flatness changes, apparent reversals of border ownership, and even complex illusory foreground surfaces. When multiple effects were experienced, patterns of co-occurrence suggested possible causal relationships between apparent scene depth reversals and several other pseudoscopic phenomena. The latency for experiencing pseudoscopic phenomena was found to correlate significantly with observer visual acuity, but not stereoacuity, in both studies. PMID:27482368
Some observations on value and greatness in drama.
Mandelbaum, George
2011-04-01
This paper argues that value in drama partly results from the nature of the resistance in a scene, resistance used in its common, everyday meaning. A playwright's ability to imagine and present such resistance rests on several factors, including his sublimation of the fantasies that underpin his work. Such sublimation is evident in Chekhov's continuing reworking in his plays of a fantasy that found its initial embodiment for him in one of the central scenes in Hamlet. The increasingly higher value of the scenes Chekhov wrote as he repeatedly reworked Shakespeare's scene resulted from his increasing sublimation of the initial fantasy and is reflected in the ever more complex nature of the resistance found in Chekhov's scenes, resistance that, in turn, created an ever more life-like, three-dimensional central character in the scenes. Copyright © 2011 Institute of Psychoanalysis.
Scene analysis in the natural environment
Lewicki, Michael S.; Olshausen, Bruno A.; Surlykke, Annemarie; Moss, Cynthia F.
2014-01-01
The problem of scene analysis has been studied in a number of different fields over the past decades. These studies have led to important insights into problems of scene analysis, but not all of these insights are widely appreciated, and there remain critical shortcomings in current approaches that hinder further progress. Here we take the view that scene analysis is a universal problem solved by all animals, and that we can gain new insight by studying the problems that animals face in complex natural environments. In particular, the jumping spider, songbird, echolocating bat, and electric fish, all exhibit behaviors that require robust solutions to scene analysis problems encountered in the natural environment. By examining the behaviors of these seemingly disparate animals, we emerge with a framework for studying scene analysis comprising four essential properties: (1) the ability to solve ill-posed problems, (2) the ability to integrate and store information across time and modality, (3) efficient recovery and representation of 3D scene structure, and (4) the use of optimal motor actions for acquiring information to progress toward behavioral goals. PMID:24744740
Flies and humans share a motion estimation strategy that exploits natural scene statistics
Clark, Damon A.; Fitzgerald, James E.; Ales, Justin M.; Gohl, Daryl M.; Silies, Marion A.; Norcia, Anthony M.; Clandinin, Thomas R.
2014-01-01
Sighted animals extract motion information from visual scenes by processing spatiotemporal patterns of light falling on the retina. The dominant models for motion estimation exploit intensity correlations only between pairs of points in space and time. Moving natural scenes, however, contain more complex correlations. Here we show that fly and human visual systems encode the combined direction and contrast polarity of moving edges using triple correlations that enhance motion estimation in natural environments. Both species extract triple correlations with neural substrates tuned for light or dark edges, and sensitivity to specific triple correlations is retained even as light and dark edge motion signals are combined. Thus, both species separately process light and dark image contrasts to capture motion signatures that can improve estimation accuracy. This striking convergence argues that statistical structures in natural scenes have profoundly affected visual processing, driving a common computational strategy over 500 million years of evolution. PMID:24390225
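The direction selectivity of triple correlations can be demonstrated numerically. The sketch below compares a three-point correlator with its spatial mirror on a drifting one-dimensional pattern; the particular correlator and stimulus are illustrative choices for exposition, not the exact ones analyzed in the study:

```python
import numpy as np

def triple_correlators(movie):
    """Mirror pair of three-point correlators on a space-time intensity movie.

    movie has shape (time, space) with circular spatial boundaries.
    a correlates s(x,t) * s(x,t+1) * s(x+1,t+1); b is its spatial mirror.
    """
    s0, s1 = movie[:-1], movie[1:]
    a = np.mean(s0 * s1 * np.roll(s1, -1, axis=1))  # third point one pixel right
    b = np.mean(s0 * s1 * np.roll(s1, 1, axis=1))   # third point one pixel left
    return a, b

# A spatially asymmetric (non-Gaussian) pattern drifting one pixel per frame.
pattern = np.array([3.0, 1.0, 0.0, 0.0, 2.0, 0.0, 1.0, 0.0])
rightward = np.array([np.roll(pattern, t) for t in range(16)])
leftward = np.array([np.roll(pattern, -t) for t in range(16)])

a_r, b_r = triple_correlators(rightward)
a_l, b_l = triple_correlators(leftward)
# The asymmetry between the mirror correlators reverses with motion
# direction: a_r > b_r for rightward motion, a_l < b_l for leftward.
```

Pairwise correlators cannot distinguish these two movies from their mirror images; the unequal mirror-pair values are exactly the extra information a triple-correlation strategy exploits.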
Statistics of natural binaural sounds.
Młynarski, Wiktor; Jost, Jürgen
2014-01-01
Binaural sound localization is usually considered a discrimination task, where interaural phase (IPD) and level (ILD) disparities at narrowly tuned frequency channels are used to identify the position of a sound source. In natural conditions, however, binaural circuits are exposed to stimulation by sound waves originating from multiple, often moving and overlapping sources. The statistics of binaural cues therefore depend on the acoustic properties and spatial configuration of the environment. The distributions of naturally encountered cues and their dependence on the physical properties of an auditory scene had not been studied before. In the present work we analyzed the statistics of naturally encountered binaural sounds. We performed binaural recordings of three auditory scenes with varying spatial configuration and analyzed the empirical cue distributions from each scene. We found that certain properties, such as the spread of IPD distributions and the overall shape of ILD distributions, do not vary strongly between different auditory scenes. Moreover, we found that ILD distributions vary much less across frequency channels, and IPDs often attain much higher values, than would be predicted from head filtering properties. To understand the complexity of the binaural hearing task in the natural environment, sound waveforms were analyzed by performing Independent Component Analysis (ICA). Properties of the learned basis functions indicate that in natural conditions the sound waves in each ear are predominantly generated by independent sources. This implies that real-world sound localization must rely on mechanisms more complex than mere cue extraction. PMID:25285658
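The two cues analyzed in this work can be computed directly from a stereo pair. A minimal sketch with a synthetic source; the sample rate, frequency, delay, and gain are illustrative values, not the recording conditions of the study:

```python
import numpy as np

fs = 8000                 # sample rate (Hz)
f = 500.0                 # source frequency; an integer number of cycles below
t = np.arange(fs) / fs    # one second of samples
tau = 0.0005              # interaural time delay (s)
left = np.sin(2 * np.pi * f * t)
right = 0.5 * np.sin(2 * np.pi * f * (t - tau))   # attenuated and delayed

# Interaural level difference (ILD) in dB from the RMS amplitudes.
ild = 20 * np.log10(np.sqrt(np.mean(left**2)) / np.sqrt(np.mean(right**2)))

# Interaural phase difference (IPD) at the dominant frequency bin,
# read off the phase of the cross-spectrum.
L, R = np.fft.rfft(left), np.fft.rfft(right)
k = np.argmax(np.abs(L))               # bin of the source frequency
ipd = np.angle(L[k] * np.conj(R[k]))   # equals 2*pi*f*tau for this stimulus

# ild is about 6.02 dB (an amplitude factor of 2); ipd is about pi/2 rad.
```

Real scenes replace the single sinusoid with overlapping broadband sources, which is exactly why the empirical IPD/ILD distributions studied here deviate from what a single-source head-filtering model predicts.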
An Improved Text Localization Method for Natural Scene Images
NASA Astrophysics Data System (ADS)
Jiang, Mengdi; Cheng, Jianghua; Chen, Minghui; Ku, Xishu
2018-01-01
In order to extract text information effectively from natural scene images with complex backgrounds, multi-orientation perspective, and multiple languages, we present a new method based on an improved Stroke Width Transform (SWT). First, the Maximally Stable Extremal Regions (MSER) method is used to detect candidate text regions. Second, the SWT algorithm is applied within the candidate regions, which improves edge detection compared with the traditional SWT method. Finally, Frequency-tuned (FT) visual saliency is introduced to remove non-text candidate regions. Experimental results show that the method achieves good robustness for complex backgrounds with multi-orientation perspective and varied characters and font sizes.
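The candidate-detection stage of such pipelines can be illustrated with a toy connected-component pass. Plain BFS labeling stands in here for MSER, which is considerably more involved, and the area and aspect-ratio thresholds are illustrative, not the paper's:

```python
import numpy as np
from collections import deque

def text_candidates(binary, min_area=4, max_aspect=5.0):
    """Label connected components in a binary image and keep character-like ones.

    A stand-in for MSER: real candidate detection is far more robust, but the
    subsequent filtering by area and aspect ratio is the same in spirit.
    """
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    candidates = []
    for sy in range(h):
        for sx in range(w):
            if not binary[sy, sx] or seen[sy, sx]:
                continue
            queue, pixels = deque([(sy, sx)]), []
            seen[sy, sx] = True
            while queue:                      # breadth-first component walk
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and binary[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            ys, xs = zip(*pixels)
            bh, bw = max(ys) - min(ys) + 1, max(xs) - min(xs) + 1
            aspect = max(bh, bw) / min(bh, bw)
            if len(pixels) >= min_area and aspect <= max_aspect:
                candidates.append((min(xs), min(ys), bw, bh))  # x, y, w, h
    return candidates

img = np.zeros((10, 10), dtype=bool)
img[2:6, 1:4] = True     # a character-sized blob -> kept
img[8, 8] = True         # an isolated speck -> rejected by min_area
boxes = text_candidates(img)
# boxes == [(1, 2, 3, 4)]
```

The surviving boxes are what a stroke-width or saliency stage (SWT, FT) would then verify or reject.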
Rolls, Edmund T.; Webb, Tristan J.
2014-01-01
Searching for and recognizing objects in complex natural scenes is implemented by multiple saccades until the eyes reach within the reduced receptive field sizes of inferior temporal cortex (IT) neurons. We analyze and model how the dorsal and ventral visual streams both contribute to this. Saliency detection in the dorsal visual system, including area LIP, is modeled by graph-based visual saliency, and allows the eyes to fixate potential objects within several degrees. Visual information at the fixated location, subtending approximately 9° corresponding to the receptive fields of IT neurons, is then passed through a four-layer hierarchical model of the ventral cortical visual system, VisNet. We show that VisNet can be trained using a synaptic modification rule with a short-term memory trace of recent neuronal activity to capture both the required view and translation invariances, allowing the model to achieve approximately 90% correct object recognition for 4 objects shown in any view across a range of 135° anywhere in a scene. The model was able to generalize correctly within the four trained views and the 25 trained translations. This approach analyses the principles by which complementary computations in the dorsal and ventral visual cortical streams enable objects to be located and recognized in complex natural scenes. PMID:25161619
Scan patterns when viewing natural scenes: Emotion, complexity, and repetition
Bradley, Margaret M.; Houbova, Petra; Miccoli, Laura; Costa, Vincent D.; Lang, Peter J.
2011-01-01
Eye movements were monitored during picture viewing and effects of hedonic content, perceptual composition, and repetition on scanning assessed. In Experiment 1, emotional and neutral pictures that were figure-ground compositions or more complex scenes were presented for a 6 s free viewing period. Viewing emotional pictures or complex scenes prompted more fixations and broader scanning of the visual array, compared to neutral pictures or simple figure-ground compositions. Effects of emotion and composition were independent, supporting the hypothesis that these oculomotor indices reflect enhanced information seeking. Experiment 2 tested an orienting hypothesis by repeatedly presenting the same pictures. Although repetition altered specific scan patterns, emotional, compared to neutral, picture viewing continued to prompt oculomotor differences, suggesting that motivationally relevant cues enhance information seeking in appetitive and defensive contexts. PMID:21649664
Computer-generated hologram calculation for real scenes using a commercial portable plenoptic camera
NASA Astrophysics Data System (ADS)
Endo, Yutaka; Wakunami, Koki; Shimobaba, Tomoyoshi; Kakue, Takashi; Arai, Daisuke; Ichihashi, Yasuyuki; Yamamoto, Kenji; Ito, Tomoyoshi
2015-12-01
This paper shows the process used to calculate a computer-generated hologram (CGH) for real scenes under natural light using a commercial portable plenoptic camera. In the CGH calculation, a light field captured with the commercial plenoptic camera is converted into a complex amplitude distribution. Then the converted complex amplitude is propagated to a CGH plane. We tested both numerical and optical reconstructions of the CGH and showed that the CGH calculation from captured data with the commercial plenoptic camera was successful.
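The propagation step described here, carrying a converted complex amplitude to the CGH plane, is typically implemented with an angular-spectrum (or Fresnel) transfer function. A minimal angular-spectrum sketch; the grid size, pixel pitch, wavelength, and distance are illustrative, not the paper's capture parameters:

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, dx, z):
    """Propagate a sampled complex amplitude a distance z (z < 0 propagates back).

    Evanescent components are discarded; with dx > wavelength / 2 every sampled
    frequency propagates, so forward-then-backward propagation is lossless.
    """
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)
    fx2 = fx[None, :]**2 + fx[:, None]**2
    arg = 1.0 / wavelength**2 - fx2                # squared longitudinal frequency
    h = np.where(arg > 0,
                 np.exp(2j * np.pi * z * np.sqrt(np.maximum(arg, 0.0))), 0)
    return np.fft.ifft2(np.fft.fft2(field) * h)

# A Gaussian test field on a 64 x 64 grid with 1 um pixels, 0.5 um wavelength.
n, dx, wavelength = 64, 1e-6, 0.5e-6
y, x = np.mgrid[:n, :n] - n // 2
field = np.exp(-(x**2 + y**2) / 100.0).astype(complex)
hologram_plane = angular_spectrum_propagate(field, wavelength, dx, 1e-3)
recovered = angular_spectrum_propagate(hologram_plane, wavelength, dx, -1e-3)
# recovered matches the original field to numerical precision
```

Numerically propagating back, as in the last line, is the same check as the paper's numerical reconstruction of the CGH.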
Idiosyncratic characteristics of saccadic eye movements when viewing different visual environments.
Andrews, T J; Coppola, D M
1999-08-01
Eye position was recorded in different viewing conditions to assess whether the temporal and spatial characteristics of saccadic eye movements in different individuals are idiosyncratic. Our aim was to determine the degree to which oculomotor control is based on endogenous factors. A total of 15 naive subjects viewed five visual environments: (1) The absence of visual stimulation (i.e. a dark room); (2) a repetitive visual environment (i.e. simple textured patterns); (3) a complex natural scene; (4) a visual search task; and (5) reading text. Although differences in visual environment had significant effects on eye movements, idiosyncrasies were also apparent. For example, the mean fixation duration and size of an individual's saccadic eye movements when passively viewing a complex natural scene covaried significantly with those same parameters in the absence of visual stimulation and in a repetitive visual environment. In contrast, an individual's spatio-temporal characteristics of eye movements during active tasks such as reading text or visual search covaried together, but did not correlate with the pattern of eye movements detected when viewing a natural scene, simple patterns or in the dark. These idiosyncratic patterns of eye movements in normal viewing reveal an endogenous influence on oculomotor control. The independent covariance of eye movements during different visual tasks shows that saccadic eye movements during active tasks like reading or visual search differ from those engaged during the passive inspection of visual scenes.
A Corticothalamic Circuit Model for Sound Identification in Complex Scenes
Otazu, Gonzalo H.; Leibold, Christian
2011-01-01
The identification of the sound sources present in the environment is essential for the survival of many animals. However, these sounds are not presented in isolation, as natural scenes consist of a superposition of sounds originating from multiple sources. The identification of a source under these circumstances is a complex computational problem that is readily solved by most animals. We present a model of the thalamocortical circuit that performs level-invariant recognition of auditory objects in complex auditory scenes. The circuit identifies the objects present from a large dictionary of possible elements and operates reliably for real sound signals with multiple concurrently active sources. The key model assumption is that the activities of some cortical neurons encode the difference between the observed signal and an internal estimate. Reanalysis of awake auditory cortex recordings revealed neurons with patterns of activity corresponding to such an error signal. PMID:21931668
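The key assumption, cortical neurons encoding the difference between the observed signal and an internal estimate, is the error unit of a predictive-coding loop. A minimal sketch of such a loop; the dictionary, input, and learning rate are illustrative stand-ins, not the model's actual circuit or parameters:

```python
import numpy as np

# Dictionary of possible auditory objects (columns are unit-norm "spectra").
D = np.array([[1.0, 0.0, 0.6],
              [0.0, 1.0, 0.8]])

signal = D @ np.array([1.0, 0.5, 0.3])   # a superposition of active sources

a = np.zeros(3)              # internal estimate of source activations
errors = []
for _ in range(100):
    e = signal - D @ a       # "error neurons": observed minus predicted signal
    errors.append(np.linalg.norm(e))
    a += 0.2 * D.T @ e       # estimate units integrate the fed-back error

# The residual shrinks toward zero as the estimate explains the scene,
# so sustained activity in the error population flags unexplained input.
```

Neurons whose firing tracks `e` rather than `signal` are the signature the authors report finding in the reanalyzed awake auditory cortex recordings.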
Improved disparity map analysis through the fusion of monocular image segmentations
NASA Technical Reports Server (NTRS)
Perlant, Frederic P.; Mckeown, David M.
1991-01-01
The focus here is to examine how estimates of three-dimensional scene structure, as encoded in a scene disparity map, can be improved by analysis of the original monocular imagery. Surface illumination information is exploited by segmenting the monocular image into fine surface patches of nearly homogeneous intensity, which are used to remove mismatches generated during stereo matching. These patches guide a statistical analysis of the disparity map based on the assumption that such patches correspond closely to physical surfaces in the scene. The technique is largely independent of whether the initial disparity map was generated by automated area-based or feature-based stereo matching. Stereo analysis results are presented for a complex urban scene containing various man-made and natural features. This scene poses a variety of problems, including low building height with respect to the stereo baseline, buildings and roads in complex terrain, and highly textured buildings and terrain. Improvements due to monocular fusion with a set of different region-based image segmentations are demonstrated. The generality of this approach to stereo analysis and its utility in the development of general three-dimensional scene interpretation systems are also discussed.
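The segment-guided statistical analysis described above can be illustrated with a simple per-segment filter: within each homogeneous patch, disparities far from the patch's median are treated as stereo mismatches. The outlier threshold is an illustrative choice, not the paper's:

```python
import numpy as np

def clean_disparity(disparity, segments, max_dev=2.0):
    """Replace disparities that deviate strongly from their segment's median.

    Assumes each segment label marks a nearly homogeneous surface patch, so
    its true disparities are smooth; large deviations are treated as
    stereo-matching errors and snapped to the patch median.
    """
    out = disparity.astype(float).copy()
    for label in np.unique(segments):
        mask = segments == label
        med = np.median(disparity[mask])
        bad = mask & (np.abs(disparity - med) > max_dev)
        out[bad] = med
    return out

disparity = np.array([[5.0, 5.0, 12.0],    # 12.0 is a mismatch on patch 0
                      [5.0, 9.0, 9.0]])    # patch 1 is genuinely nearer
segments = np.array([[0, 0, 0],
                     [0, 1, 1]])
cleaned = clean_disparity(disparity, segments)
# the 12.0 outlier snaps to its patch median of 5.0; patch 1 is untouched
```

Because the filter only consults the disparity values and the segmentation, it is indifferent to whether the map came from area-based or feature-based matching, which is the independence the abstract notes.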
Top-down control of visual perception: attention in natural vision.
Rolls, Edmund T
2008-01-01
Top-down perceptual influences can bias (or pre-empt) perception. In natural scenes, the receptive fields of neurons in the inferior temporal visual cortex (IT) shrink to become close to the size of objects. This facilitates the read-out of information from the ventral visual system, because the information is primarily about the object at the fovea. Top-down attentional influences are much less evident in natural scenes than when objects are shown against blank backgrounds, though are still present. It is suggested that the reduced receptive-field size in natural scenes, and the effects of top-down attention contribute to change blindness. The receptive fields of IT neurons in complex scenes, though including the fovea, are frequently asymmetric around the fovea, and it is proposed that this is the solution the IT uses to represent multiple objects and their relative spatial positions in a scene. Networks that implement probabilistic decision-making are described, and it is suggested that, when in perceptual systems they take decisions (or 'test hypotheses'), they influence lower-level networks to bias visual perception. Finally, it is shown that similar processes extend to systems involved in the processing of emotion-provoking sensory stimuli, in that word-level cognitive states provide top-down biasing that reaches as far down as the orbitofrontal cortex, where, at the first stage of affective representations, olfactory, taste, flavour, and touch processing is biased (or pre-empted) in humans.
Pedale, Tiziana; Santangelo, Valerio
2015-01-01
One of the most important issues in the study of cognition is to understand which factors determine the internal representation of the external world. Previous literature has begun to highlight the impact of low-level sensory features (indexed by saliency maps) in driving attention selection, thereby increasing the probability that objects presented in complex natural scenes will be successfully encoded into working memory (WM) and then correctly remembered. Here we asked whether the probability of retrieving high-saliency objects modulates the overall contents of WM by decreasing the probability of retrieving other, lower-saliency objects. We presented pictures of natural scenes for 4 s. After a retention period of 8 s, we asked participants to verbally report as many objects/details as possible from the previous scenes. We then computed how often the objects located at the peaks of maximal or minimal saliency in the scene (as indexed by a saliency map; Itti et al., 1998) were recollected by participants. Results showed that maximal-saliency objects were recollected more often, and earlier in the stream of successfully reported items, than minimal-saliency objects, indicating that bottom-up sensory salience increases recollection probability and facilitates access to memory representations at retrieval. Moreover, recollection of the maximal- (but not the minimal-) saliency objects predicted the overall number of successfully recollected objects: the higher the probability of having successfully reported the most salient object in the scene, the lower the number of recollected objects. These findings highlight that bottom-up sensory saliency modulates the current contents of WM during recollection of objects from natural scenes, most likely by reducing the resources available to encode and then retrieve other (lower-saliency) objects. PMID:25741266
Text String Detection from Natural Scenes by Structure-based Partition and Grouping
Yi, Chucai; Tian, YingLi
2012-01-01
Text information in natural scene images serves as important clues for many image-based applications such as scene understanding, content-based image retrieval, assistive navigation, and automatic geocoding. However, locating text from complex background with multiple colors is a challenging task. In this paper, we explore a new framework to detect text strings with arbitrary orientations in complex natural scene images. Our proposed framework of text string detection consists of two steps: 1) Image partition to find text character candidates based on local gradient features and color uniformity of character components. 2) Character candidate grouping to detect text strings based on joint structural features of text characters in each text string such as character size differences, distances between neighboring characters, and character alignment. By assuming that a text string has at least three characters, we propose two algorithms of text string detection: 1) adjacent character grouping method, and 2) text line grouping method. The adjacent character grouping method calculates the sibling groups of each character candidate as string segments and then merges the intersecting sibling groups into text string. The text line grouping method performs Hough transform to fit text line among the centroids of text candidates. Each fitted text line describes the orientation of a potential text string. The detected text string is presented by a rectangle region covering all characters whose centroids are cascaded in its text line. To improve efficiency and accuracy, our algorithms are carried out in multi-scales. The proposed methods outperform the state-of-the-art results on the public Robust Reading Dataset which contains text only in horizontal orientation. 
Furthermore, the effectiveness of our methods to detect text strings with arbitrary orientations is evaluated on the Oriented Scene Text Dataset collected by ourselves containing text strings in non-horizontal orientations. PMID:21411405
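The adjacent character grouping step described above can be sketched as chaining character candidates that have similar heights and small horizontal gaps, then keeping strings of at least three characters. The specific gap and height-ratio thresholds below are illustrative, not the paper's:

```python
def group_adjacent_characters(boxes, max_gap_ratio=2.0, max_height_ratio=2.0):
    """Chain character boxes (x, y, w, h) into left-to-right string segments.

    Two neighbors join the same string when their horizontal gap is small
    relative to their height and their heights are similar; strings shorter
    than three characters are dropped, matching the paper's assumption that
    a text string has at least three characters.
    """
    boxes = sorted(boxes)                     # left to right by x
    strings, current = [], [boxes[0]]
    for prev, box in zip(boxes, boxes[1:]):
        gap = box[0] - (prev[0] + prev[2])    # horizontal spacing
        h1, h2 = prev[3], box[3]
        similar = max(h1, h2) / min(h1, h2) <= max_height_ratio
        close = gap <= max_gap_ratio * max(h1, h2)
        if similar and close:
            current.append(box)
        else:
            strings.append(current)
            current = [box]
    strings.append(current)
    return [s for s in strings if len(s) >= 3]

chars = [(0, 10, 8, 12), (10, 10, 8, 12), (20, 11, 8, 12),   # a 3-char word
         (100, 50, 8, 12)]                                   # an isolated mark
lines = group_adjacent_characters(chars)
# one string of three characters survives; the isolated box is dropped
```

The paper's text line grouping variant replaces this left-to-right chaining with a Hough fit over candidate centroids, which is what lifts the method to arbitrary orientations.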
Text string detection from natural scenes by structure-based partition and grouping.
Yi, Chucai; Tian, YingLi
2011-09-01
Text information in natural scene images serves as important clues for many image-based applications such as scene understanding, content-based image retrieval, assistive navigation, and automatic geocoding. However, locating text from a complex background with multiple colors is a challenging task. In this paper, we explore a new framework to detect text strings with arbitrary orientations in complex natural scene images. Our proposed framework of text string detection consists of two steps: 1) image partition to find text character candidates based on local gradient features and color uniformity of character components and 2) character candidate grouping to detect text strings based on joint structural features of text characters in each text string such as character size differences, distances between neighboring characters, and character alignment. By assuming that a text string has at least three characters, we propose two algorithms of text string detection: 1) adjacent character grouping method and 2) text line grouping method. The adjacent character grouping method calculates the sibling groups of each character candidate as string segments and then merges the intersecting sibling groups into text string. The text line grouping method performs Hough transform to fit text line among the centroids of text candidates. Each fitted text line describes the orientation of a potential text string. The detected text string is presented by a rectangle region covering all characters whose centroids are cascaded in its text line. To improve efficiency and accuracy, our algorithms are carried out in multi-scales. The proposed methods outperform the state-of-the-art results on the public Robust Reading Dataset, which contains text only in horizontal orientation. 
Furthermore, the effectiveness of our methods in detecting text strings with arbitrary orientations is evaluated on the Oriented Scene Text Dataset, collected by ourselves, which contains text strings in non-horizontal orientations. PMID:21411405
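The text line grouping step lends itself to a compact illustration. The sketch below is a hypothetical re-implementation in Python, not the authors' code: each character centroid votes in a Hough accumulator over quantized line orientations, and any (theta, rho) bin that collects at least three votes (the minimum-string assumption above) yields a candidate text string, whether horizontal or not.

```python
import math
from collections import defaultdict

def hough_line_groups(centroids, n_theta=36, rho_step=10.0, min_chars=3):
    """Group character centroids into candidate text lines.

    Each centroid votes for every quantized line (theta, rho) passing
    through it; bins with at least min_chars votes become candidate
    text strings, so non-horizontal lines are found as naturally as
    horizontal ones.
    """
    acc = defaultdict(list)
    for (x, y) in centroids:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            # Normal form of a line: rho = x*cos(theta) + y*sin(theta)
            rho = x * math.cos(theta) + y * math.sin(theta)
            acc[(t, round(rho / rho_step))].append((x, y))
    return [pts for pts in acc.values() if len(pts) >= min_chars]

# Three collinear centroids form a line; the fourth point is an outlier.
groups = hough_line_groups([(0, 50), (20, 50), (40, 50), (100, 200)])
```

The bins here are deliberately coarse; a real detector would refine each candidate line (e.g., by least-squares fitting through its voters) and then bound the string with a rectangular region as described above.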
Salient contour extraction from complex natural scene in night vision image
NASA Astrophysics Data System (ADS)
Han, Jing; Yue, Jiang; Zhang, Yi; Bai, Lian-fa
2014-03-01
The theory of center-surround interaction in the non-classical receptive field can be applied in night vision information processing. In this work, an optimized compound receptive field modulation method is proposed to extract salient contours from complex natural scenes in low-light-level (LLL) and infrared images. The core idea is that multi-feature analysis can recognize the inhomogeneity in modulatory coverage more accurately, and that center-surround pairs whose grouping structure satisfies the Gestalt rules deserve a high connection probability. Computationally, a multi-feature contrast-weighted inhibition model is presented to suppress the background and lower mutual inhibition among contour elements; a fuzzy connection facilitation model is proposed to enhance contour responses, connect discontinuous contours, and further eliminate randomly distributed noise and texture; and a multi-scale iterative attention method is designed to accomplish the dynamic modulation process and extract contours of targets at multiple sizes. This work provides a series of high-performance, biologically motivated computational visual models for contour detection in cluttered night vision scenes.
PHOG analysis of self-similarity in aesthetic images
NASA Astrophysics Data System (ADS)
Amirshahi, Seyed Ali; Koch, Michael; Denzler, Joachim; Redies, Christoph
2012-03-01
In recent years, there have been efforts in defining the statistical properties of aesthetic photographs and artworks using computer vision techniques. However, it is still an open question how to distinguish aesthetic from non-aesthetic images with a high recognition rate. This is possibly because aesthetic perception is also influenced by a large number of cultural variables. Nevertheless, the search for statistical properties of aesthetic images has not been futile. For example, we have shown that the radially averaged power spectrum of monochrome artworks of Western and Eastern provenance falls off according to a power law with increasing spatial frequency (1/f² characteristics). This finding implies that this particular subset of artworks possesses a Fourier power spectrum that is self-similar across different scales of spatial resolution. Other types of aesthetic images, such as cartoons, comics and mangas, also display this type of self-similarity, as do photographs of complex natural scenes. Since the human visual system is adapted to encode images of natural scenes in a particularly efficient way, we have argued that artists imitate these statistics in their artworks. In support of this notion, we presented results showing that artists portray human faces with the self-similar Fourier statistics of complex natural scenes, although real-world photographs of faces are not self-similar. In view of these previous findings, we investigated other statistical measures of self-similarity to characterize aesthetic and non-aesthetic images. In the present work, we propose a novel measure of self-similarity that is based on the Pyramid Histogram of Oriented Gradients (PHOG). For every image, we first calculate the PHOG up to pyramid level 3. The similarity between the histogram of each section at a particular level and the histogram of its parent section at the previous level (or the histogram of the whole image at the ground level) is then calculated.
The proposed approach is tested on datasets of aesthetic and non-aesthetic categories of monochrome images. The aesthetic image datasets comprise a large variety of artworks of Western provenance. Other man-made aesthetically pleasing images, such as comics, cartoons and mangas, were also studied. For comparison, a database of natural scene photographs is used, as well as datasets of photographs of plants, simple objects and faces that are in general of low aesthetic value. As expected, natural scenes exhibit the highest degree of PHOG self-similarity. Images of artworks also show high self-similarity values, followed by cartoons, comics and mangas. On average, other (non-aesthetic) image categories are less self-similar in the PHOG analysis. A measure of scale-invariant self-similarity (PHOG) allows a good separation of the different aesthetic and non-aesthetic image categories. Our results provide further support for the notion that, like complex natural scenes, images of artworks display a higher degree of self-similarity across different scales of resolution than other image categories. Whether the high degree of self-similarity is the basis for the perception of beauty in both complex natural scenery and artworks remains to be investigated.
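The measure can be sketched in a few dozen lines of pure Python. This is a toy illustration, not the authors' implementation, and it assumes histogram intersection as the similarity function (the abstract does not specify the choice):

```python
import math

def orientation_hist(img, x0, y0, w, h, bins=8):
    """Magnitude-weighted gradient-orientation histogram of one image
    section, using central differences; orientations folded into [0, pi)."""
    hist = [0.0] * bins
    for y in range(max(y0, 1), min(y0 + h, len(img) - 1)):
        for x in range(max(x0, 1), min(x0 + w, len(img[0]) - 1)):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag > 0.0:
                ang = math.atan2(gy, gx) % math.pi
                hist[min(int(ang / math.pi * bins), bins - 1)] += mag
    total = sum(hist)
    return [v / total for v in hist] if total else hist

def phog_self_similarity(img, levels=3, bins=8):
    """Compare each section's histogram with its parent section one
    level up (the whole image is the ground level) via histogram
    intersection, averaged over all sections and levels."""
    H, W = len(img), len(img[0])
    sims = []
    for level in range(1, levels + 1):
        n = 2 ** level                          # n x n sections here
        pw, ph = W // max(n // 2, 1), H // max(n // 2, 1)
        for i in range(n):
            for j in range(n):
                child = orientation_hist(
                    img, j * W // n, i * H // n, W // n, H // n, bins)
                parent = orientation_hist(
                    img, (j // 2) * pw, (i // 2) * ph, pw, ph, bins)
                sims.append(sum(min(a, b) for a, b in zip(child, parent)))
    return sum(sims) / len(sims)

# A uniform diagonal gradient is perfectly self-similar across scales.
img = [[float(x + y) for x in range(32)] for y in range(32)]
score = phog_self_similarity(img)   # 1.0 for this degenerate example
```

An image with a perfectly uniform gradient yields identical histograms at every section and hence the maximum score of 1; natural scenes and artworks would fall below that, and less self-similar categories lower still.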
Colour agnosia impairs the recognition of natural but not of non-natural scenes.
Nijboer, Tanja C W; Van Der Smagt, Maarten J; Van Zandvoort, Martine J E; De Haan, Edward H F
2007-03-01
Scene recognition can be enhanced by appropriate colour information, yet the level of visual processing at which colour exerts its effects is still unclear. It has been suggested that colour supports low-level sensory processing, while others have claimed that colour information aids semantic categorization and recognition of objects and scenes. We investigated the effect of colour on scene recognition in a case of colour agnosia, M.A.H. In a scene identification task, participants had to name images of natural or non-natural scenes in six different formats. Irrespective of scene format, M.A.H. was much slower on the natural than on the non-natural scenes. As expected, neither M.A.H. nor control participants showed any difference in performance for the non-natural scenes. However, for the natural scenes, appropriate colour facilitated scene recognition in control participants (i.e., shorter reaction times), whereas M.A.H.'s performance did not differ across formats. Our data thus support the hypothesis that the effect of colour occurs at the level of learned associations.
Viewing the dynamics and control of visual attention through the lens of electrophysiology
Woodman, Geoffrey F.
2013-01-01
How we find what we are looking for in complex visual scenes is a seemingly simple ability that has taken half a century to unravel. The first study to use the term visual search showed that as the number of objects in a complex scene increases, observers’ reaction times increase proportionally (Green and Anderson, 1956). This observation suggests that our ability to process the objects in the scenes is limited in capacity. However, if it is known that the target will have a certain feature attribute, for example, that it will be red, then only an increase in the number of red items increases reaction time. This observation suggests that we can control which visual inputs receive the benefit of our limited capacity to recognize the objects, such as those defined by the color red, as the items we seek. The nature of the mechanisms that underlie these basic phenomena in the literature on visual search has been more difficult to definitively determine. In this paper, I discuss how electrophysiological methods have provided us with the necessary tools to understand the nature of the mechanisms that give rise to the effects observed in the first visual search paper. I begin by describing how recordings of event-related potentials from humans and nonhuman primates have shown us how attention is deployed to possible target items in complex visual scenes. Then, I will discuss how event-related potential experiments have allowed us to directly measure the memory representations that are used to guide these deployments of attention to items with target-defining features. PMID:23357579
Coding of navigational affordances in the human visual system
Epstein, Russell A.
2017-01-01
A central component of spatial navigation is determining where one can and cannot go in the immediate environment. We used fMRI to test the hypothesis that the human visual system solves this problem by automatically identifying the navigational affordances of the local scene. Multivoxel pattern analyses showed that a scene-selective region of dorsal occipitoparietal cortex, known as the occipital place area, represents pathways for movement in scenes in a manner that is tolerant to variability in other visual features. These effects were found in two experiments: One using tightly controlled artificial environments as stimuli, the other using a diverse set of complex, natural scenes. A reconstruction analysis demonstrated that the population codes of the occipital place area could be used to predict the affordances of novel scenes. Taken together, these results reveal a previously unknown mechanism for perceiving the affordance structure of navigable space. PMID:28416669
Marin, Manuela M.; Leder, Helmut
2013-01-01
Subjective complexity has been found to be related to hedonic measures of preference, pleasantness and beauty, but there is no consensus about the nature of this relationship in the visual and musical domains. Moreover, the affective content of stimuli has been largely neglected so far in the study of complexity but is crucial in many everyday contexts and in aesthetic experiences. We thus propose a cross-domain approach that acknowledges the multidimensional nature of complexity and that uses a wide range of objective complexity measures combined with subjective ratings. In four experiments, we employed pictures of affective environmental scenes, representational paintings, and Romantic solo and chamber music excerpts. Stimuli were pre-selected to vary in emotional content (pleasantness and arousal) and complexity (low versus high number of elements). For each set of stimuli, in a between-subjects design, ratings of familiarity, complexity, pleasantness and arousal were obtained for a presentation time of 25 s from 152 participants. In line with Berlyne’s collative-motivation model, statistical analyses controlling for familiarity revealed a positive relationship between subjective complexity and arousal, and the highest correlations were observed for musical stimuli. Evidence for a mediating role of arousal in the complexity-pleasantness relationship was demonstrated in all experiments, but was only significant for females with regard to music. The direction and strength of the linear relationship between complexity and pleasantness depended on the stimulus type and gender. For environmental scenes, the root mean square contrast measures and measures of compressed file size correlated best with subjective complexity, whereas only edge detection based on phase congruency yielded equivalent results for representational paintings. 
Measures of compressed file size and event density also showed positive correlations with complexity and arousal in music, which is relevant for the discussion on which aspects of complexity are domain-specific and which are domain-general. PMID:23977295
Differential Visual Processing of Animal Images, with and without Conscious Awareness
Zhu, Weina; Drewes, Jan; Peatfield, Nicholas A.; Melcher, David
2016-01-01
The human visual system can quickly and efficiently extract categorical information from a complex natural scene. The rapid detection of animals in a scene is one compelling example of this phenomenon, and it suggests the automatic processing of at least some types of categories with little or no attentional requirements (Li et al., 2002, 2005). The aim of this study is to investigate whether the remarkable capability to categorize complex natural scenes exists in the absence of awareness, based on recent reports that “invisible” stimuli, which do not reach conscious awareness, can still be processed by the human visual system (Pasley et al., 2004; Williams et al., 2004; Fang and He, 2005; Jiang et al., 2006, 2007; Kaunitz et al., 2011a). In two experiments, we recorded event-related potentials (ERPs) in response to animal and non-animal/vehicle stimuli in both aware and unaware conditions in a continuous flash suppression (CFS) paradigm. Our results indicate that even in the “unseen” condition, the brain responds differently to animal and non-animal/vehicle images, consistent with rapid activation of animal-selective feature detectors prior to, or outside of, suppression by the CFS mask. PMID:27790106
Direct versus indirect processing changes the influence of color in natural scene categorization.
Otsuka, Sachio; Kawaguchi, Jun
2009-10-01
Using a negative priming (NP) paradigm, we examined whether participants would categorize color and grayscale images of natural scenes that were presented peripherally and ignored. We focused on (1) attentional resources allocated to natural scenes and (2) direct versus indirect processing of them. We set up low and high attention-load conditions, based on the set size of the searched stimuli in the prime display (one and five). Participants were required to detect and categorize the target objects in natural scenes in a central visual search task, ignoring peripheral natural images in both the prime and probe displays. The results showed that, irrespective of attention load, NP was observed for color scenes but not for grayscale scenes. We did not observe any effect of color information in central visual search, where participants responded directly to natural scenes. These results indicate that, in a situation in which participants indirectly process natural scenes, color information is critical to object categorization, but when the scenes are processed directly, color information does not contribute to categorization.
Collet, Anne-Claire; Fize, Denis; VanRullen, Rufin
2015-01-01
Rapid visual categorization is a crucial ability for survival of many animal species, including monkeys and humans. In real conditions, objects (either animate or inanimate) are never isolated but embedded in a complex background made of multiple elements. It has been shown in humans and monkeys that the contextual background can either enhance or impair object categorization, depending on context/object congruency (for example, an animal in a natural vs. man-made environment). Moreover, a scene is not only a collection of objects; it also has global physical features (i.e., the phase and amplitude of Fourier spatial frequencies) which help define its gist. In our experiment, we aimed to explore and compare the contribution of the amplitude spectrum of scenes to the context-object congruency effect in monkeys and humans. We designed a rapid visual categorization task, Animal versus Non-Animal, using as contexts both real scene photographs and noisy backgrounds built from the amplitude spectrum of real scenes but with randomized phase spectrum. We showed that even if the contextual congruency effect was comparable in both species when the context was a real scene, it differed when the foreground object was surrounded by a noisy background: in monkeys we found a similar congruency effect in both conditions, but in humans the congruency effect was absent (or even reversed) when the context was a noisy background. PMID:26207915
Cichy, Radoslaw Martin; Teng, Santani
2017-01-01
In natural environments, visual and auditory stimulation elicit responses across a large set of brain regions in a fraction of a second, yielding representations of the multimodal scene and its properties. The rapid and complex neural dynamics underlying visual and auditory information processing pose major challenges to human cognitive neuroscience. Brain signals measured non-invasively are inherently noisy, the format of neural representations is unknown, and transformations between representations are complex and often nonlinear. Further, no single non-invasive brain measurement technique provides a spatio-temporally integrated view. In this opinion piece, we argue that progress can be made by a concerted effort based on three pillars of recent methodological development: (i) sensitive analysis techniques such as decoding and cross-classification, (ii) complex computational modelling using models such as deep neural networks, and (iii) integration across imaging methods (magnetoencephalography/electroencephalography, functional magnetic resonance imaging) and models, e.g. using representational similarity analysis. We showcase two recent efforts that have been undertaken in this spirit and provide novel results about visual and auditory scene analysis. Finally, we discuss the limits of this perspective and sketch a concrete roadmap for future research. This article is part of the themed issue ‘Auditory and visual scene analysis’. PMID:28044019
Unplanned Complex Suicide-A Consideration of Multiple Methods.
Ateriya, Navneet; Kanchan, Tanuj; Shekhawat, Raghvendra Singh; Setia, Puneet; Saraf, Ashish
2018-05-01
Detailed death investigations are mandatory to determine the exact cause and manner of death in non-natural deaths. In this context, the use of multiple methods in suicide poses a challenge for investigators, especially when the choice of methods used to cause death is unplanned. There is an increased likelihood that doubts of homicide are raised in cases of unplanned complex suicides. A case of complex suicide is reported where the victim resorted to multiple methods to end his life, in what appeared to be an unplanned variant based on the death scene investigations. A meticulous crime scene examination, interviews of the victim's relatives and other witnesses, and a thorough autopsy are warranted to conclude on the cause and manner of death in all such cases. © 2017 American Academy of Forensic Sciences.
The Identification and Modeling of Visual Cue Usage in Manual Control Task Experiments
NASA Technical Reports Server (NTRS)
Sweet, Barbara Townsend; Trejo, Leonard J. (Technical Monitor)
1999-01-01
Many fields of endeavor require humans to conduct manual control tasks while viewing a perspective scene. Manual control refers to tasks in which continuous, or nearly continuous, control adjustments are required. Examples include flying an aircraft, driving a car, and riding a bicycle. Perspective scenes can arise through natural viewing of the world, simulation of a scene (as in flight simulators), or through imaging devices (such as the cameras on an unmanned aerospace vehicle). Designers frequently have some degree of control over the content and characteristics of a perspective scene; airport designers can choose runway markings, vehicle designers can influence the size and shape of windows, as well as the location of the pilot, and simulator database designers can choose scene complexity and content. Little theoretical framework exists to help designers determine the answers to questions related to perspective scene content. An empirical approach is most commonly used to determine optimum perspective scene configurations. The goal of the research effort described in this dissertation has been to provide a tool for modeling the characteristics of human operators conducting manual control tasks with perspective-scene viewing. This is done for the purpose of providing an algorithmic, as opposed to empirical, method for analyzing the effects of changing perspective scene content for closed-loop manual control tasks.
The new generation of OpenGL support in ROOT
NASA Astrophysics Data System (ADS)
Tadel, M.
2008-07-01
OpenGL has been promoted to become the main 3D rendering engine of the ROOT framework. This required a major re-modularization of OpenGL support on all levels, from basic window-system specific interface to medium-level object-representation and top-level scene management. This new architecture allows seamless integration of external scene-graph libraries into the ROOT OpenGL viewer as well as inclusion of ROOT 3D scenes into external GUI and OpenGL-based 3D-rendering frameworks. Scene representation was removed from inside of the viewer, allowing scene-data to be shared among several viewers and providing for a natural implementation of multi-view canvas layouts. The object-graph traversal infrastructure allows free mixing of 3D and 2D-pad graphics and makes implementation of ROOT canvas in pure OpenGL possible. Scene-elements representing ROOT objects trigger automatic instantiation of user-provided rendering-objects based on the dictionary information and class-naming convention. Additionally, a finer, per-object control over scene-updates is available to the user, allowing overhead-free maintenance of dynamic 3D scenes and creation of complex real-time animations. User-input handling was modularized as well, making it easy to support application-specific scene navigation, selection handling and tool management.
Brown, Daniel K; Barton, Jo L; Gladwell, Valerie F
2013-06-04
A randomized crossover study explored whether viewing different scenes prior to a stressor altered autonomic function during the recovery from the stressor. The two scenes were (a) nature (composed of trees, grass, fields) or (b) built (composed of man-made, urban scenes lacking natural characteristics) environments. Autonomic function was assessed using noninvasive techniques of heart rate variability; in particular, time domain analyses evaluated parasympathetic activity, using the root-mean-square of successive differences (RMSSD). During stress, secondary cardiovascular markers (heart rate, systolic and diastolic blood pressure) showed significant increases from baseline which did not differ between the two viewing conditions. Parasympathetic activity, however, was significantly higher in recovery following the stressor in the nature-viewing condition compared to viewing scenes depicting built environments (RMSSD; 50.0 ± 31.3 vs 34.8 ± 14.8 ms). Thus, viewing nature scenes prior to a stressor alters autonomic activity in the recovery period. The secondary aim was to examine autonomic function during viewing of the two scenes. Standard deviation of R-R intervals (SDRR), as change from baseline, during the first 5 min of viewing nature scenes was greater than during built scenes. Overall, this suggests that nature can elicit improvements in the recovery process following a stressor. PMID:23590163
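The time-domain index used above, RMSSD, is straightforward to compute from a sequence of R-R intervals. A minimal sketch with made-up interval values (illustrative only, not the study's analysis code):

```python
import math

def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between adjacent
    R-R intervals (in ms), a standard time-domain heart rate
    variability index of parasympathetic (vagal) activity."""
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Made-up beat-to-beat intervals in milliseconds:
hrv = rmssd([800, 810, 790, 805, 820])   # about 15.4 ms
```

Because RMSSD is built from beat-to-beat differences rather than overall variance, it emphasizes the fast, respiration-linked fluctuations attributed to parasympathetic control, which is why the study uses it alongside SDRR.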
A view not to be missed: Salient scene content interferes with cognitive restoration
Van der Jagt, Alexander P. N.; Craig, Tony; Brewer, Mark J.; Pearson, David G.
2017-01-01
Attention Restoration Theory (ART) states that built scenes place greater load on attentional resources than natural scenes. This is explained in terms of "hard" and "soft" fascination of built and natural scenes. Given a lack of direct empirical evidence for this assumption we propose that perceptual saliency of scene content can function as an empirically derived indicator of fascination. Saliency levels were established by measuring speed of scene category detection using a Go/No-Go detection paradigm. Experiment 1 shows that built scenes are more salient than natural scenes. Experiment 2 replicates these findings using greyscale images, ruling out a colour-based response strategy, and additionally shows that built objects in natural scenes affect saliency to a greater extent than the reverse. Experiment 3 demonstrates that the saliency of scene content is directly linked to cognitive restoration using an established restoration paradigm. Overall, these findings demonstrate an important link between the saliency of scene content and related cognitive restoration. PMID:28723975
Rotation-invariant features for multi-oriented text detection in natural images.
Yao, Cong; Zhang, Xin; Bai, Xiang; Liu, Wenyu; Ma, Yi; Tu, Zhuowen
2013-01-01
Texts in natural scenes carry rich semantic information, which can be used to assist a wide range of applications, such as object recognition, image/video retrieval, mapping/navigation, and human-computer interaction. However, most existing systems are designed to detect and recognize horizontal (or near-horizontal) texts. Due to the increasing popularity of mobile-computing devices and applications, detecting texts of varying orientations from natural images under less controlled conditions has become an important but challenging task. In this paper, we propose a new algorithm to detect texts of varying orientations. Our algorithm is based on a two-level classification scheme and two sets of features specially designed for capturing the intrinsic characteristics of texts. To better evaluate the proposed method and compare it with the competing algorithms, we generate a comprehensive dataset with various types of texts in diverse real-world scenes. We also propose a new evaluation protocol, which is more suitable for benchmarking algorithms for detecting texts in varying orientations. Experiments on benchmark datasets demonstrate that our system compares favorably with the state-of-the-art algorithms when handling horizontal texts and achieves significantly enhanced performance on variant texts in complex natural scenes.
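As a toy illustration of the general idea of rotation invariance (not the specific features proposed in the paper), an orientation histogram can be made invariant to patch rotation by circularly shifting it so that its dominant bin comes first; rotating the patch then merely permutes the bins and leaves the descriptor unchanged:

```python
def rotation_invariant_hist(hist):
    """Circularly shift an orientation histogram so its dominant bin
    comes first, canceling the effect of rotating the underlying patch."""
    k = max(range(len(hist)), key=hist.__getitem__)
    return hist[k:] + hist[:k]

# The same edge pattern rotated by one orientation bin gives the
# same descriptor after normalization:
a = rotation_invariant_hist([1, 5, 2, 0])
b = rotation_invariant_hist([0, 1, 5, 2])   # rotated copy of the first
```

Detectors built on such descriptors can then feed a classifier cascade, as in the two-level classification scheme described above.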
Correlated Topic Vector for Scene Classification.
Wei, Pengxu; Qin, Fei; Wan, Fang; Zhu, Yi; Jiao, Jianbin; Ye, Qixiang
2017-07-01
Scene images usually involve semantic correlations, particularly when considering large-scale image data sets. This paper proposes a novel generative image representation, the correlated topic vector, to model such semantic correlations. Derived from the correlated topic model, the correlated topic vector naturally exploits the correlations among topics, which are seldom considered in conventional feature encoding, e.g., the Fisher vector, but do exist in scene images. The involvement of correlations is expected to increase the discriminative capability of the learned generative model and consequently improve recognition accuracy. Incorporated with the Fisher kernel method, the correlated topic vector inherits the advantages of the Fisher vector. The contributions of visual words to the topics are further employed within the Fisher kernel framework to indicate the differences among scenes. Combined with deep convolutional neural network (CNN) features and a Gibbs sampling solution, the correlated topic vector shows great potential when processing large-scale and complex scene image data sets. Experiments on two scene image data sets demonstrate that the correlated topic vector significantly improves on the deep CNN features and outperforms existing Fisher kernel-based features.
Intrinsic dimensionality predicts the saliency of natural dynamic scenes.
Vig, Eleonora; Dorr, Michael; Martinetz, Thomas; Barth, Erhardt
2012-06-01
Since visual attention-based computer vision applications have gained popularity, ever more complex, biologically inspired models seem to be needed to predict salient locations (or interest points) in naturalistic scenes. In this paper, we explore how far one can go in predicting eye movements by using only basic signal processing, such as image representations derived from efficient coding principles, and machine learning. To this end, we gradually increase the complexity of a model from simple single-scale saliency maps computed on grayscale videos to spatiotemporal multiscale and multispectral representations. Using a large collection of eye movements on high-resolution videos, supervised learning techniques fine-tune the free parameters whose addition is inevitable with increasing complexity. The proposed model, although very simple, demonstrates significant improvement in predicting salient locations in naturalistic videos over four selected baseline models and two distinct data labeling scenarios.
How affective information from faces and scenes interacts in the brain
Vandenbulcke, Mathieu; Sinke, Charlotte B. A.; Goebel, Rainer; de Gelder, Beatrice
2014-01-01
Facial expression perception can be influenced by the natural visual context in which the face is perceived. We performed an fMRI experiment presenting participants with fearful or neutral faces against threatening or neutral background scenes. Triangles and scrambled scenes served as control stimuli. The results showed that the valence of the background influences face selective activity in the right anterior parahippocampal place area (PPA) and subgenual anterior cingulate cortex (sgACC) with higher activation for neutral backgrounds compared to threatening backgrounds (controlled for isolated background effects) and that this effect correlated with trait empathy in the sgACC. In addition, the left fusiform gyrus (FG) responds to the affective congruence between face and background scene. The results show that valence of the background modulates face processing and support the hypothesis that empathic processing in sgACC is inhibited when affective information is present in the background. In addition, the findings reveal a pattern of complex scene perception showing a gradient of functional specialization along the posterior–anterior axis: from sensitivity to the affective content of scenes (extrastriate body area: EBA and posterior PPA), over scene emotion–face emotion interaction (left FG) via category–scene interaction (anterior PPA) to scene–category–personality interaction (sgACC). PMID:23956081
Mizuhara, Hiroaki; Sato, Naoyuki; Yamaguchi, Yoko
2015-05-01
Neural oscillations are crucial for revealing dynamic cortical networks and may serve as a mechanism of inter-cortical communication, especially in association with mnemonic function. The interplay of slow and fast oscillations might dynamically coordinate the mnemonic cortical circuits to rehearse stored items during working memory retention. We recorded simultaneous EEG-fMRI during a working memory task involving a natural scene to verify whether cortical networks emerge with the neural oscillations for memory of the natural scene. Slow EEG power was enhanced in association with better accuracy of working memory retention, and was accompanied by cortical activity in the mnemonic circuits for the natural scene. The fast oscillation showed phase-amplitude coupling to the slow oscillation, and its power was tightly coupled with cortical activity representing the visual images of natural scenes. The mnemonic cortical circuit with the slow neural oscillations would rehearse the distributed natural scene representations with the fast oscillation for working memory retention. The coincidence of the natural scene representations could be obtained by the slow oscillation phase to create a coherent whole of the natural scene in working memory. Copyright © 2015 Elsevier Inc. All rights reserved.
Texture metric that predicts target detection performance
NASA Astrophysics Data System (ADS)
Culpepper, Joanne B.
2015-12-01
Two texture metrics based on gray level co-occurrence error (GLCE) are used to predict probability of detection and mean search time. The two texture metrics are local clutter metrics based on the statistics of GLCE probability distributions. The degree of correlation between various clutter metrics and the target detection performance for the nine military vehicles in complex natural scenes found in the Search_2 dataset is presented. Comparison is also made with four other common clutter metrics from the literature: root sum of squares, Doyle, statistical variance, and target structure similarity. The experimental results show that the GLCE energy metric is a better predictor of target detection performance when searching for targets in natural scenes than the other clutter metrics studied.
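GLCE metrics build on gray level co-occurrence statistics. The paper's actual error-based metrics are not reproduced here; the following is a minimal sketch of the underlying co-occurrence matrix and its energy statistic, with illustrative function names and a pure-Python representation of a grayscale image as nested lists:

```python
from collections import Counter

def glcm(image, dx=1, dy=0):
    """Normalized gray level co-occurrence matrix as a dict {(i, j): p},
    counting pixel pairs separated by the offset (dx, dy)."""
    rows, cols = len(image), len(image[0])
    pairs = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                pairs[(image[r][c], image[r2][c2])] += 1
    total = sum(pairs.values())
    return {k: v / total for k, v in pairs.items()}

def glcm_energy(image, dx=1, dy=0):
    """Energy (angular second moment): sum of squared co-occurrence
    probabilities. Uniform textures score high, cluttered ones low."""
    return sum(p * p for p in glcm(image, dx, dy).values())
```

A flat patch yields the maximum energy of 1.0, while a checkerboard patch spreads probability mass over several gray-level pairs and scores lower, which is the sense in which such a statistic can serve as a local clutter measure.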
Comparison of algorithms for blood stain detection applied to forensic hyperspectral imagery
NASA Astrophysics Data System (ADS)
Yang, Jie; Messinger, David W.; Mathew, Jobin J.; Dube, Roger R.
2016-05-01
Blood stains are among the most important types of evidence in forensic investigation. They contain valuable DNA information, and the pattern of the stains can suggest specifics about the nature of the violence that transpired at the scene. Early detection of blood stains is particularly important since blood reacts physically and chemically with air and materials over time. Accurate identification of blood remnants, including regions that might have been intentionally cleaned, is an important aspect of forensic investigation. Hyperspectral imaging is a potential method for detecting blood stains because it is non-contact and provides substantial spectral information that can be used to identify regions in a scene with trace amounts of blood. Scenes of violent crime can be highly complex, given the range of material types and conditions on which blood stains may appear. Some stains are hard to detect by the unaided eye, especially if a conscious effort to clean the scene has occurred (we refer to these as "latent" blood stains). In this paper we present the initial results of a study of hyperspectral imaging algorithms for blood detection in complex scenes. We describe a hyperspectral imaging system which generates images covering the 400 nm - 700 nm visible range with a spectral resolution of 10 nm. Three image sets of 31 wavelength bands were generated using this camera for a simulated indoor crime scene in which blood stains were placed on a T-shirt and walls. To detect blood stains in the scene, Principal Component Analysis (PCA), Subspace Reed-Xiaoli Detection (SRXD), and Topological Anomaly Detection (TAD) algorithms were used. Comparison of the three hyperspectral image analysis techniques shows that TAD is most suitable for detecting blood stains and discovering latent blood stains.
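Detectors in the Reed-Xiaoli (RX) family, of which the SRXD variant used above is a subspace refinement, score each pixel spectrum by its Mahalanobis distance from background statistics. A minimal two-band sketch, assuming illustrative function names and a closed-form 2x2 covariance inverse (the actual SRXD and TAD algorithms are considerably more elaborate):

```python
def mean_cov2(pixels):
    """Mean vector and 2x2 covariance of a list of 2-band spectra."""
    n = len(pixels)
    m0 = sum(p[0] for p in pixels) / n
    m1 = sum(p[1] for p in pixels) / n
    c00 = sum((p[0] - m0) ** 2 for p in pixels) / n
    c11 = sum((p[1] - m1) ** 2 for p in pixels) / n
    c01 = sum((p[0] - m0) * (p[1] - m1) for p in pixels) / n
    return (m0, m1), ((c00, c01), (c01, c11))

def rx_score(pixel, mean, cov):
    """Squared Mahalanobis distance of a pixel spectrum from the
    background distribution; large values flag spectral anomalies."""
    (m0, m1), ((a, b), (_, d)) = mean, cov
    det = a * d - b * b
    x0, x1 = pixel[0] - m0, pixel[1] - m1
    # inverse of [[a, b], [b, d]] is [[d, -b], [-b, a]] / det
    return (d * x0 * x0 - 2 * b * x0 * x1 + a * x1 * x1) / det
```

A pixel whose spectrum matches the background mean scores near zero, while a spectrally distinct stain scores high, regardless of its absolute brightness, which is what makes the statistic useful for latent stains.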
Lescroart, Mark D.; Stansbury, Dustin E.; Gallant, Jack L.
2015-01-01
Perception of natural visual scenes activates several functional areas in the human brain, including the Parahippocampal Place Area (PPA), Retrosplenial Complex (RSC), and the Occipital Place Area (OPA). It is currently unclear what specific scene-related features are represented in these areas. Previous studies have suggested that PPA, RSC, and/or OPA might represent at least three qualitatively different classes of features: (1) 2D features related to Fourier power; (2) 3D spatial features such as the distance to objects in a scene; or (3) abstract features such as the categories of objects in a scene. To determine which of these hypotheses best describes the visual representation in scene-selective areas, we applied voxel-wise modeling (VM) to BOLD fMRI responses elicited by a set of 1386 images of natural scenes. VM provides an efficient method for testing competing hypotheses by comparing predictions of brain activity based on encoding models that instantiate each hypothesis. Here we evaluated three different encoding models that instantiate each of the three hypotheses listed above. We used linear regression to fit each encoding model to the fMRI data recorded from each voxel, and we evaluated each fit model by estimating the amount of variance it predicted in a withheld portion of the data set. We found that voxel-wise models based on Fourier power or the subjective distance to objects in each scene predicted much of the variance predicted by a model based on object categories. Furthermore, the response variance explained by these three models is largely shared, and the individual models explain little unique variance in responses. Based on an evaluation of previous studies and the data we present here, we conclude that there is currently no good basis to favor any one of the three alternative hypotheses about visual representation in scene-selective areas. We offer suggestions for further studies that may help resolve this issue. PMID:26594164
Guidance of visual search by memory and knowledge.
Hollingworth, Andrew
2012-01-01
To behave intelligently in the world, humans must be able to find objects efficiently within the complex environments they inhabit. A growing proportion of the literature on visual search is devoted to understanding this type of natural search. In the present chapter, I review the literature on visual search through natural scenes, focusing on the role of memory and knowledge in guiding attention to task-relevant objects.
The influence of behavioral relevance on the processing of global scene properties: An ERP study.
Hansen, Natalie E; Noesen, Birken T; Nador, Jeffrey D; Harel, Assaf
2018-05-02
Recent work studying the temporal dynamics of visual scene processing (Harel et al., 2016) has found that global scene properties (GSPs) modulate the amplitude of early event-related potentials (ERPs). It is still not clear, however, to what extent the processing of these GSPs is influenced by their behavioral relevance, determined by the goals of the observer. To address this question, we investigated how behavioral relevance, operationalized by task context, impacts the electrophysiological responses to GSPs. In a set of two experiments we recorded ERPs while participants viewed images of real-world scenes varying along two GSPs: naturalness (manmade/natural) and spatial expanse (open/closed). In Experiment 1, very little attention to scene content was required, as participants viewed the scenes while performing an orthogonal fixation-cross task. In Experiment 2, participants saw the same scenes but now had to actively categorize them, based either on their naturalness or their spatial expanse. We found that task context had very little impact on the early ERP responses to the naturalness and spatial expanse of the scenes: P1, N1, and P2 could distinguish between open and closed scenes and between manmade and natural scenes across both experiments. Further, the specific effects of naturalness and spatial expanse on the ERP components were largely unaffected by their relevance for the task. A task effect was found at the N1 and P2 level, but this effect was manifest across all scene dimensions, indicating a general effect rather than an interaction between task context and GSPs. Together, these findings suggest that the extraction of global scene information reflected in the early ERP components is rapid and little influenced by top-down, observer-based goals. Copyright © 2018 Elsevier Ltd. All rights reserved.
The Incongruency Advantage for Environmental Sounds Presented in Natural Auditory Scenes
Gygi, Brian; Shafiro, Valeriy
2011-01-01
The effect of context on the identification of common environmental sounds (e.g., dogs barking or cars honking) was tested by embedding them in familiar auditory background scenes (street ambience, restaurants). Initial results with subjects trained on both the scenes and the sounds to be identified showed a significant advantage of about 5 percentage points better accuracy for sounds that were contextually incongruous with the background scene (e.g., a rooster crowing in a hospital). Further studies with naïve (untrained) listeners showed that this Incongruency Advantage (IA) is level-dependent: there is no advantage for incongruent sounds lower than a Sound/Scene ratio (So/Sc) of −7.5 dB, but there is about 5 percentage points better accuracy for sounds with greater So/Sc. Testing a new group of trained listeners on a larger corpus of sounds and scenes showed that the effect is robust and not confined to specific stimulus set. Modeling using spectral-temporal measures showed that neither analyses based on acoustic features, nor semantic assessments of sound-scene congruency can account for this difference, indicating the Incongruency Advantage is a complex effect, possibly arising from the sensitivity of the auditory system to new and unexpected events, under particular listening conditions. PMID:21355664
Two Distinct Scene-Processing Networks Connecting Vision and Memory.
Baldassano, Christopher; Esteva, Andre; Fei-Fei, Li; Beck, Diane M
2016-01-01
A number of regions in the human brain are known to be involved in processing natural scenes, but the field has lacked a unifying framework for understanding how these different regions are organized and interact. We provide evidence from functional connectivity and meta-analyses for a new organizational principle, in which scene processing relies upon two distinct networks that split the classically defined parahippocampal place area (PPA). The first network of strongly connected regions consists of the occipital place area/transverse occipital sulcus and posterior PPA, which contain retinotopic maps and are not strongly coupled to the hippocampus at rest. The second network consists of the caudal inferior parietal lobule, retrosplenial complex, and anterior PPA, which connect to the hippocampus (especially anterior hippocampus), and are implicated in both visual and nonvisual tasks, including episodic memory and navigation. We propose that these two distinct networks capture the primary functional division among scene-processing regions, between those that process visual features from the current view of a scene and those that connect information from a current scene view with a much broader temporal and spatial context. This new framework for understanding the neural substrates of scene-processing bridges results from many lines of research, and makes specific functional predictions.
ERIC Educational Resources Information Center
Rieger, Jochem W.; Kochy, Nick; Schalk, Franziska; Gruschow, Marcus; Heinze, Hans-Jochen
2008-01-01
The visual system rapidly extracts information about objects from the cluttered natural environment. In 5 experiments, the authors quantified the influence of orientation and semantics on the classification speed of objects in natural scenes, particularly with regard to object-context interactions. Natural scene photographs were presented in an…
Ball, Felix; Elzemann, Anne; Busch, Niko A
2014-09-01
The change blindness paradigm, in which participants often fail to notice substantial changes in a scene, is a popular tool for studying scene perception, visual memory, and the link between awareness and attention. Some of the most striking and popular examples of change blindness have been demonstrated with digital photographs of natural scenes; in most studies, however, much simpler displays, such as abstract stimuli or "free-floating" objects, are typically used. Although simple displays have undeniable advantages, natural scenes remain a very useful and attractive stimulus for change blindness research. To assist researchers interested in using natural-scene stimuli in change blindness experiments, we provide here a step-by-step tutorial on how to produce changes in natural-scene images with a freely available image-processing tool (GIMP). We explain how changes in a scene can be made by deleting objects or relocating them within the scene or by changing the color of an object, in just a few simple steps. We also explain how the physical properties of such changes can be analyzed using GIMP and MATLAB (a high-level scientific programming tool). Finally, we present an experiment confirming that scenes manipulated according to our guidelines are effective in inducing change blindness and demonstrating the relationship between change blindness and the physical properties of the change and inter-individual differences in performance measures. We expect that this tutorial will be useful for researchers interested in studying the mechanisms of change blindness, attention, or visual memory using natural scenes.
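The tutorial's analysis of the physical properties of changes is done in GIMP and MATLAB and is not reproduced here. As an illustration of the kind of measurement involved, a hypothetical sketch that quantifies the change between the pre- and post-manipulation versions of a grayscale scene (function name and threshold are illustrative assumptions):

```python
def change_stats(before, after, threshold=10):
    """Compare two same-size grayscale images (nested lists of gray
    levels). Returns (fraction of pixels whose level changed by more
    than `threshold`, mean absolute pixel difference)."""
    diffs = [abs(a - b)
             for row_b, row_a in zip(before, after)
             for b, a in zip(row_b, row_a)]
    changed = sum(d > threshold for d in diffs)
    return changed / len(diffs), sum(diffs) / len(diffs)
```

Such per-change measurements allow the physical magnitude of a manipulation (e.g., a deleted or recolored object) to be related to detection performance across observers.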
[Perception of objects and scenes in age-related macular degeneration].
Tran, T H C; Boucart, M
2012-01-01
Vision related quality of life questionnaires suggest that patients with AMD exhibit difficulties in finding objects and in mobility. In the natural environment, objects seldom appear in isolation. They appear in a spatial context which may obscure them in part or place obstacles in the patient's path. Furthermore, the luminance of a natural scene varies as a function of the hour of the day and the light source, which can alter perception. This study aims to evaluate recognition of objects and natural scenes by patients with AMD, by using photographs of such scenes. Studies demonstrate that AMD patients are able to categorize scenes as nature scenes or urban scenes and to discriminate indoor from outdoor scenes with a high degree of precision. They detect objects better in isolation, in color, or against a white background than in their natural contexts. These patients encounter more difficulties than normally sighted individuals in detecting objects in a low-contrast, black-and-white scene. These results may have implications for rehabilitation, for layout of texts and magazines for the reading-impaired and for the rearrangement of the spatial environment of older AMD patients in order to facilitate mobility, finding objects and reducing the risk of falls. Copyright © 2011 Elsevier Masson SAS. All rights reserved.
A three-layer model of natural image statistics.
Gutmann, Michael U; Hyvärinen, Aapo
2013-11-01
An important property of visual systems is to be simultaneously both selective to specific patterns found in the sensory input and invariant to possible variations. Selectivity and invariance (tolerance) are opposing requirements. It has been suggested that they could be joined by iterating a sequence of elementary selectivity and tolerance computations. It is, however, unknown what should be selected or tolerated at each level of the hierarchy. We approach this issue by learning the computations from natural images. We propose and estimate a probabilistic model of natural images that consists of three processing layers. Two natural image data sets are considered: image patches, and complete visual scenes downsampled to the size of small patches. For both data sets, we find that in the first two layers, simple and complex cell-like computations are performed. In the third layer, we mainly find selectivity to longer contours; for patch data, we further find some selectivity to texture, while for the downsampled complete scenes, some selectivity to curvature is observed. Copyright © 2013 Elsevier Ltd. All rights reserved.
Developmental changes in attention to faces and bodies in static and dynamic scenes.
Stoesz, Brenda M; Jakobson, Lorna S
2014-01-01
Typically developing individuals show a strong visual preference for faces and face-like stimuli; however, this may come at the expense of attending to bodies or to other aspects of a scene. The primary goal of the present study was to provide additional insight into the development of attentional mechanisms that underlie perception of real people in naturalistic scenes. We examined the looking behaviors of typical children, adolescents, and young adults as they viewed static and dynamic scenes depicting one or more people. Overall, participants showed a bias to attend to faces more than to other parts of the scenes. Adding motion cues led to a reduction in the number, but an increase in the average duration, of face fixations in single-character scenes. When multiple characters appeared in a scene, motion-related effects were attenuated and participants shifted their gaze from faces to bodies, or made off-screen glances. Children showed the largest effects related to the introduction of motion cues or additional characters, suggesting that they find dynamic faces difficult to process, and are especially prone to look away from faces when viewing complex social scenes, a strategy that could reduce the cognitive and affective load imposed by having to divide one's attention between multiple faces. Our findings provide new insights into the typical development of social attention during natural scene viewing, and lay the foundation for future work examining gaze behaviors in typical and atypical development.
Temporal and spatial adaptation of transient responses to local features
O'Carroll, David C.; Barnett, Paul D.; Nordström, Karin
2012-01-01
Interpreting visual motion within the natural environment is a challenging task, particularly considering that natural scenes vary enormously in brightness, contrast and spatial structure. The performance of current models for the detection of self-generated optic flow depends critically on these very parameters, but despite this, animals manage to successfully navigate within a broad range of scenes. Within global scenes local areas with more salient features are common. Recent work has highlighted the influence that local, salient features have on the encoding of optic flow, but it has been difficult to quantify how local transient responses affect responses to subsequent features and thus contribute to the global neural response. To investigate this in more detail we used experimenter-designed stimuli and recorded intracellularly from motion-sensitive neurons. We limited the stimulus to a small vertically elongated strip, to investigate local and global neural responses to pairs of local "doublet" features that were designed to interact with each other in the temporal and spatial domain. We show that the passage of a high-contrast doublet feature produces a complex transient response from local motion detectors consistent with predictions of a simple computational model. In the neuron, the passage of a high-contrast feature induces a local reduction in responses to subsequent low-contrast features. However, this neural contrast gain reduction appears to be recruited only when features stretch vertically (i.e., orthogonal to the direction of motion) across at least several aligned neighboring ommatidia. Horizontal displacement of the components of elongated features abolishes the local adaptation effect. It is thus likely that features in natural scenes with vertically aligned edges, such as tree trunks, recruit the greatest amount of response suppression. This property could emphasize the local responses to such features vs. those in nearby texture within the scene. PMID:23087617
Eisenberger, Robert; Sucharski, Ivan L; Yalowitz, Steven; Kent, Robert J; Loomis, Ross J; Jones, Jason R; Paylor, Sarah; Aselage, Justin; Mueller, Meta Steiger; McLaughlin, John P
2010-04-01
Eight studies assessed the motive for sensory pleasure (MSP) involving a general disposition to enjoy and pursue pleasant nature-related experiences and avoid unpleasant nature-related experiences. The stated enjoyment of pleasant sights, smells, sounds, and tactile sensations formed a unitary construct that was distinct from sensation seeking, novelty preference, and need for cognition. MSP was found to be related to (a) enjoyment of pleasant nature scenes and music of high but not low clarity; (b) enjoyment of writings that portrayed highly detailed nature scenes; (c) enjoyment of pleasantly themed paintings and dislike of unpleasant paintings, as distinct from findings with Openness to Experience; (d) choice of pleasant nature scenes over exciting or intellectually stimulating scenes; (e) view duration and memory of artistically rendered quilts; (f) interest in detailed information about nature scenes; and (g) frequency of sensory-type suggestions for improvement of a museum exhibit.
Animal Detection in Natural Images: Effects of Color and Image Database
Zhu, Weina; Drewes, Jan; Gegenfurtner, Karl R.
2013-01-01
The visual system has a remarkable ability to extract categorical information from complex natural scenes. In order to elucidate the role of low-level image features for the recognition of objects in natural scenes, we recorded saccadic eye movements and event-related potentials (ERPs) in two experiments, in which human subjects had to detect animals in previously unseen natural images. We used a new natural image database (ANID) that is free of some of the potential artifacts that have plagued the widely used COREL images. Color and grayscale images picked from the ANID and COREL databases were used. In all experiments, color images induced a greater N1 EEG component at earlier time points than grayscale images. We suggest that this influence of color in animal detection may be masked by later processes when measuring reaction times. The ERP results of go/nogo and forced choice tasks were similar to those reported earlier. The non-animal stimuli induced a bigger N1 than animal stimuli in both the COREL and ANID databases. This result indicates that ultra-fast processing of animal images is possible irrespective of the particular database. With the ANID images, the difference between color and grayscale images is more pronounced than with the COREL images. The earlier use of the COREL images might have led to an underestimation of the contribution of color. Therefore, we conclude that the ANID image database is better suited for the investigation of the processing of natural scenes than other commonly used databases. PMID:24130744
Where's Wally: the influence of visual salience on referring expression generation.
Clarke, Alasdair D F; Elsner, Micha; Rohde, Hannah
2013-01-01
Referring expression generation (REG) presents the converse problem to visual search: given a scene and a specified target, how does one generate a description which would allow somebody else to quickly and accurately locate the target? Previous work in psycholinguistics and natural language processing has failed to find an important and integrated role for vision in this task. That previous work, which relies largely on simple scenes, tends to treat vision as a pre-process for extracting feature categories that are relevant to disambiguation. However, the visual search literature suggests that some descriptions are better than others at enabling listeners to search efficiently within complex stimuli. This paper presents a study testing whether participants are sensitive to visual features that allow them to compose such "good" descriptions. Our results show that visual properties (salience, clutter, area, and distance) influence REG for targets embedded in images from the Where's Wally? books. Referring expressions for large targets are shorter than those for smaller targets, and expressions about targets in highly cluttered scenes use more words. We also find that participants are more likely to mention non-target landmarks that are large, salient, and in close proximity to the target. These findings identify a key role for visual salience in language production decisions and highlight the importance of scene complexity for REG.
Bradley, Margaret M.; Lang, Peter J.
2013-01-01
During rapid serial visual presentation (RSVP), the perceptual system is confronted with a rapidly changing array of sensory information demanding resolution. At rapid rates of presentation, previous studies have found an early (e.g., 150–280 ms) negativity over occipital sensors that is enhanced when emotional, as compared with neutral, pictures are viewed, suggesting facilitated perception. In the present study, we explored how picture composition and the presence of people in the image affect perceptual processing of pictures of natural scenes. Using RSVP, pictures that differed in perceptual composition (figure–ground or scenes), content (presence of people or not), and emotional content (emotionally arousing or neutral) were presented in a continuous stream for 330 ms each with no intertrial interval. In both subject and picture analyses, all three variables affected the amplitude of occipital negativity, with the greatest enhancement for figure–ground compositions (as compared with scenes), irrespective of content and emotional arousal, supporting an interpretation that ease of perceptual processing is associated with enhanced occipital negativity. Viewing emotional pictures prompted enhanced negativity only for pictures that depicted people, suggesting that specific features of emotionally arousing images are associated with facilitated perceptual processing, rather than all emotional content. PMID:23780520
Compressed digital holography: from micro towards macro
NASA Astrophysics Data System (ADS)
Schretter, Colas; Bettens, Stijn; Blinder, David; Pesquet-Popescu, Béatrice; Cagnazzo, Marco; Dufaux, Frédéric; Schelkens, Peter
2016-09-01
signal processing methods from software-driven computer engineering and applied mathematics. The compressed sensing theory in particular established a practical framework for reconstructing the scene content using few linear combinations of complex measurements and a sparse prior for regularizing the solution. Compressed sensing found direct applications in digital holography for microscopy. Indeed, the wave propagation phenomenon in free space mixes in a natural way the spatial distribution of point sources from the 3-dimensional scene. As the 3-dimensional scene is mapped to a 2-dimensional hologram, the hologram samples form a compressed representation of the scene as well. This overview paper discusses contributions in the field of compressed digital holography at the micro scale. Then, an outreach on future extensions towards the real-size macro scale is discussed. Thanks to advances in sensor technologies, increasing computing power and the recent improvements in sparse digital signal processing, holographic modalities are on the verge of practical high-quality visualization at a macroscopic scale where much higher resolution holograms must be acquired and processed on the computer.
ERIC Educational Resources Information Center
Greene, Michelle R.; Oliva, Aude
2009-01-01
Human observers are able to rapidly and accurately categorize natural scenes, but the representation mediating this feat is still unknown. Here we propose a framework of rapid scene categorization that does not segment a scene into objects and instead uses a vocabulary of global, ecological properties that describe spatial and functional aspects…
Guidance of Attention to Objects and Locations by Long-Term Memory of Natural Scenes
ERIC Educational Resources Information Center
Becker, Mark W.; Rasmussen, Ian P.
2008-01-01
Four flicker change-detection experiments demonstrate that scene-specific long-term memory guides attention to both behaviorally relevant locations and objects within a familiar scene. Participants performed an initial block of change-detection trials, detecting the addition of an object to a natural scene. After a 30-min delay, participants…
Statistical regularities of art images and natural scenes: spectra, sparseness and nonlinearities.
Graham, Daniel J; Field, David J
2007-01-01
Paintings are the product of a process that begins with ordinary vision in the natural world and ends with manipulation of pigments on canvas. Because artists must produce images that can be seen by a visual system that is thought to take advantage of statistical regularities in natural scenes, artists are likely to replicate many of these regularities in their painted art. We have tested this notion by computing basic statistical properties and modeled cell response properties for a large set of digitized paintings and natural scenes. We find that both representational and non-representational (abstract) paintings from our sample (124 images) show basic similarities to a sample of natural scenes in terms of their spatial frequency amplitude spectra, but the paintings and natural scenes show significantly different mean amplitude spectrum slopes. We also find that the intensity distributions of paintings show a lower skewness and sparseness than natural scenes. We account for this by considering the range of luminances found in the environment compared to the range available in the medium of paint. A painting's range is limited by the reflective properties of its materials. We argue that artists do not simply scale the intensity range down but use a compressive nonlinearity. In our studies, modeled retinal and cortical filter responses to the images were less sparse for the paintings than for the natural scenes. But when a compressive nonlinearity was applied to the images, both the paintings' sparseness and the modeled responses to the paintings showed the same or greater sparseness compared to the natural scenes. This suggests that artists achieve some degree of nonlinear compression in their paintings. Because paintings have captivated humans for millennia, finding basic statistical regularities in paintings' spatial structure could grant insights into the range of spatial patterns that humans find compelling.
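The two statistics at the heart of this comparison, the spatial-frequency amplitude-spectrum slope and sparseness, are straightforward to compute. The sketch below uses a naive 2-D DFT (adequate for tiny demo images) and kurtosis as a simple sparseness measure; bin choices and the use of kurtosis are illustrative assumptions, not the authors' exact pipeline.

```python
import cmath, math

def amplitude_spectrum_slope(img):
    """Least-squares slope of log amplitude vs. log spatial frequency for a
    small square grayscale image (nested lists). Natural scenes typically
    show slopes near -1 on this amplitude scale; white noise is near 0."""
    n = len(img)
    # naive 2-D DFT magnitude (O(n^4), fine for tiny images)
    amp = {}
    for u in range(n):
        for v in range(n):
            s = sum(img[y][x] * cmath.exp(-2j * math.pi * (u * y + v * x) / n)
                    for y in range(n) for x in range(n))
            amp[(u, v)] = abs(s)
    # radially average over integer frequency bins, skipping DC
    bins = {}
    for (u, v), a in amp.items():
        fu = u if u <= n // 2 else u - n
        fv = v if v <= n // 2 else v - n
        r = round(math.hypot(fu, fv))
        if 1 <= r <= n // 2:
            bins.setdefault(r, []).append(a)
    pts = [(math.log(r), math.log(sum(a) / len(a))) for r, a in sorted(bins.items())]
    k = len(pts)
    mx = sum(x for x, _ in pts) / k
    my = sum(y for _, y in pts) / k
    return sum((x - mx) * (y - my) for x, y in pts) / sum((x - mx) ** 2 for x, _ in pts)

def kurtosis(values):
    """Fourth standardized moment, a simple sparseness measure:
    Gaussian data gives 3; sparser (heavy-tailed) distributions give more."""
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / n
    return sum((v - m) ** 4 for v in values) / (n * var ** 2)
```

A painting whose intensities have been compressively remapped would show lower kurtosis than the scene it depicts, which is the effect the authors quantify.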
The perception of naturalness correlates with low-level visual features of environmental scenes.
Berman, Marc G; Hout, Michael C; Kardan, Omid; Hunter, MaryCarol R; Yourganov, Grigori; Henderson, John M; Hanayik, Taylor; Karimi, Hossein; Jonides, John
2014-01-01
Previous research has shown that interacting with natural environments vs. more urban or built environments can have salubrious psychological effects, such as improvements in attention and memory. Even viewing pictures of nature vs. pictures of built environments can produce similar effects. A major question is: What is it about natural environments that produces these benefits? Problematically, there are many differing qualities between natural and urban environments, making it difficult to narrow down the dimensions of nature that may lead to these benefits. In this study, we set out to uncover visual features that related to individuals' perceptions of naturalness in images. We quantified naturalness in two ways: first, implicitly using a multidimensional scaling analysis and second, explicitly with direct naturalness ratings. Features that seemed most related to perceptions of naturalness were related to the density of contrast changes in the scene, the density of straight lines in the scene, the average color saturation in the scene and the average hue diversity in the scene. We then trained a machine-learning algorithm to predict whether a scene was perceived as being natural or not based on these low-level visual features and we could do so with 81% accuracy. As such we were able to reliably predict subjective perceptions of naturalness with objective low-level visual features. Our results can be used in future studies to determine if these features, which are related to naturalness, may also lead to the benefits attained from interacting with nature.
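The low-level features the authors relate to perceived naturalness (density of contrast changes, color saturation, hue diversity) are easy to prototype. The sketch below is a minimal stand-in with an illustrative edge threshold and hue-bin count, not the paper's exact feature definitions.

```python
import colorsys, math

def naturalness_features(img):
    """img: H x W nested list of (r, g, b) tuples in [0, 1].
    Returns (edge_density, mean_saturation, hue_diversity)."""
    h, w = len(img), len(img[0])
    lum = [[(r + g + b) / 3 for (r, g, b) in row] for row in img]
    # density of contrast changes: fraction of pixels with a strong
    # local luminance gradient (0.1 is an illustrative threshold)
    edges = 0
    for y in range(h - 1):
        for x in range(w - 1):
            gx = lum[y][x + 1] - lum[y][x]
            gy = lum[y + 1][x] - lum[y][x]
            if math.hypot(gx, gy) > 0.1:
                edges += 1
    edge_density = edges / ((h - 1) * (w - 1))
    hsv = [colorsys.rgb_to_hsv(*px) for row in img for px in row]
    mean_sat = sum(s for _, s, _ in hsv) / len(hsv)
    # hue diversity: Shannon entropy of a 12-bin hue histogram
    bins = [0] * 12
    for hue, _, _ in hsv:
        bins[min(int(hue * 12), 11)] += 1
    n = len(hsv)
    hue_div = -sum((b / n) * math.log2(b / n) for b in bins if b)
    return edge_density, mean_sat, hue_div
```

Feature vectors like these could then be fed to any standard classifier to predict naturalness ratings, as the study does with its machine-learning model.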
Santangelo, Valerio; Di Francesco, Simona Arianna; Mastroberardino, Serena; Macaluso, Emiliano
2015-12-01
The brief presentation of a complex scene entails that only a few objects can be selected, processed in depth, and stored in memory. Both low-level sensory salience and high-level context-related factors (e.g., the conceptual match/mismatch between objects and scene context) contribute to this selection process, but how the interplay between these factors affects memory encoding is largely unexplored. Here, during fMRI we presented participants with pictures of everyday scenes. After a short retention interval, participants judged the position of a target object extracted from the initial scene. The target object could be either congruent or incongruent with the context of the scene, and could be located in a region of the image with maximal or minimal salience. Behaviourally, we found a reduced impact of saliency on visuospatial working memory performance when the target was out-of-context. Encoding-related fMRI results showed that context-congruent targets activated dorsoparietal regions, while context-incongruent targets de-activated the ventroparietal cortex. Saliency modulated activity both in dorsal and ventral regions, with larger context-related effects for salient targets. These findings demonstrate the joint contribution of knowledge-based and saliency-driven attention for memory encoding, highlighting a dissociation between dorsal and ventral parietal regions. © 2015 Wiley Periodicals, Inc.
Lu, Kun-Han; Hung, Shao-Chin; Wen, Haiguang; Marussich, Lauren; Liu, Zhongming
2016-01-01
Complex, sustained, dynamic, and naturalistic visual stimulation can evoke distributed brain activities that are highly reproducible within and across individuals. However, the precise origins of such reproducible responses remain incompletely understood. Here, we employed concurrent functional magnetic resonance imaging (fMRI) and eye tracking to investigate the experimental and behavioral factors that influence fMRI activity and its intra- and inter-subject reproducibility during repeated movie stimuli. We found that widely distributed and highly reproducible fMRI responses were attributed primarily to the high-level natural content in the movie. In the absence of such natural content, low-level visual features alone in a spatiotemporally scrambled control stimulus evoked significantly reduced degree and extent of reproducible responses, which were mostly confined to the primary visual cortex (V1). We also found that the varying gaze behavior affected the cortical response at the peripheral part of V1 and in the oculomotor network, with minor effects on the response reproducibility over the extrastriate visual areas. Lastly, scene transitions in the movie stimulus due to film editing partly caused the reproducible fMRI responses at widespread cortical areas, especially along the ventral visual pathway. Therefore, the naturalistic nature of a movie stimulus is necessary for driving highly reliable visual activations. In a movie-stimulation paradigm, scene transitions and individuals’ gaze behavior should be taken as potential confounding factors in order to properly interpret cortical activity that supports natural vision. PMID:27564573
Thake, Carol L; Bambling, Matthew; Edirippulige, Sisira; Marx, Eric
2017-10-01
Research supports therapeutic use of nature scenes in healthcare settings, particularly to reduce stress. However, limited literature is available to provide a cohesive guide for selecting scenes that may provide optimal therapeutic effect. This study produced and tested a replicable process for selecting nature scenes with therapeutic potential. Psychoevolutionary theory informed the construction of the Importance for Survival Scale (IFSS), and its usefulness for identifying scenes that people generally prefer to view and that hold potential to reduce stress was tested. Relationships between Importance for Survival (IFS), preference, and restoration were tested. General community participants (N = 20 males, 20 females; mean age = 48 years) Q-sorted sets of landscape photographs (preranked by the researcher in terms of IFS using the IFSS) from most to least preferred, and then completed the Short-Version Revised Restoration Scale in response to viewing a selection of the scenes. Results showed significant positive relationships between IFS and both scene preference (large effect) and restoration potential (medium effect), as well as between scene preference and restoration potential across the levels of IFS (medium effect) and for individual participants and scenes (large effect). IFS was supported as a framework for identifying nature scenes that people will generally prefer to view and that hold potential for restoration from emotional distress; however, greater therapeutic potential may be expected when people can choose which of the scenes they would prefer to view. Evidence for the effectiveness of the IFSS was produced.
The Relationship Between Online Visual Representation of a Scene and Long-Term Scene Memory
ERIC Educational Resources Information Center
Hollingworth, Andrew
2005-01-01
In 3 experiments the author investigated the relationship between the online visual representation of natural scenes and long-term visual memory. In a change detection task, a target object either changed or remained the same from an initial image of a natural scene to a test image. Two types of changes were possible: rotation in depth, or…
Dimensionality of visual complexity in computer graphics scenes
NASA Astrophysics Data System (ADS)
Ramanarayanan, Ganesh; Bala, Kavita; Ferwerda, James A.; Walter, Bruce
2008-02-01
How do human observers perceive visual complexity in images? This problem is especially relevant for computer graphics, where a better understanding of visual complexity can aid in the development of more advanced rendering algorithms. In this paper, we describe a study of the dimensionality of visual complexity in computer graphics scenes. We conducted an experiment where subjects judged the relative complexity of 21 high-resolution scenes, rendered with photorealistic methods. Scenes were gathered from web archives and varied in theme, number and layout of objects, material properties, and lighting. We analyzed the pooled subject responses using multidimensional scaling. This analysis embedded the stimulus images in a two-dimensional space, with axes that roughly corresponded to "numerosity" and "material / lighting complexity". In a follow-up analysis, we derived a one-dimensional complexity ordering of the stimulus images. We compared this ordering with several computable complexity metrics, such as scene polygon count and JPEG compression size, and found only weak correlations. Understanding the differences between these measures can lead to the design of more efficient rendering algorithms in computer graphics.
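Compression-based complexity metrics like the JPEG-file-size measure mentioned above are easy to prototype. The sketch below substitutes zlib for JPEG (JPEG encoding needs an imaging library) as an illustrative stand-in: the less compressible the pixel data, the higher the score.

```python
import zlib

def complexity_score(pixels):
    """Compression-based complexity proxy for raw pixel data.
    pixels: iterable of byte values (0-255). Incompressible data scores
    near (or slightly above) 1.0; highly regular data scores near 0."""
    raw = bytes(pixels)
    return len(zlib.compress(raw, 9)) / len(raw)
```

As the study found for its scenes, such a metric captures pixel-level regularity but need not track perceived complexity, which loaded instead on object numerosity and material/lighting factors.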
Fusion of monocular cues to detect man-made structures in aerial imagery
NASA Technical Reports Server (NTRS)
Shufelt, Jefferey; Mckeown, David M.
1991-01-01
The extraction of buildings from aerial imagery is a complex problem for automated computer vision. It requires locating regions in a scene that possess properties distinguishing them as man-made objects as opposed to naturally occurring terrain features. It is reasonable to assume that no single detection method can correctly delineate or verify buildings in every scene. A cooperative-methods paradigm is useful in approaching the building extraction problem. Using this paradigm, each extraction technique provides information which can be added or assimilated into an overall interpretation of the scene. Thus, the main objective is to explore the development of a computer vision system that integrates the results of various scene analysis techniques into an accurate and robust interpretation of the underlying three-dimensional scene. The problem of building hypothesis fusion in aerial imagery is discussed. Building extraction techniques are briefly surveyed, including four building extraction, verification, and clustering systems. A method for fusing the symbolic data generated by these systems is described, and applied to monocular image and stereo image data sets. Evaluation methods for the fusion results are described, and the fusion results are analyzed using these methods.
A novel scene management technology for complex virtual battlefield environment
NASA Astrophysics Data System (ADS)
Sheng, Changchong; Jiang, Libing; Tang, Bo; Tang, Xiaoan
2018-04-01
Efficient scene management of a virtual environment is an important topic in real-time computer visualization and has a decisive influence on rendering efficiency. However, traditional scene management methods are not suitable for complex virtual battlefield environments. This paper combines the advantages of traditional scene graph technology and spatial data structure methods. Using the idea of separating management from rendering, a loose object-oriented scene graph structure is established to manage the entity model data in the scene, and a performance-oriented quad-tree structure is created for traversal and rendering. In addition, a collaborative update relationship between the two structural trees is designed to achieve efficient scene management. Compared with previous scene management methods, this method is more efficient and meets the needs of real-time visualization.
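The paper's exact structures are not given in the abstract, but the quad-tree half of such a design typically looks like the following sketch: objects are pushed down into quadrants when a node overflows, and the rendering pass queries only the region currently in view. The capacity and coordinates here are illustrative assumptions.

```python
class Quadtree:
    """Point quad-tree over a square region [x, x+size) x [y, y+size)."""

    def __init__(self, x, y, size, capacity=4):
        self.x, self.y, self.size = x, y, size
        self.capacity = capacity
        self.objects = []      # (ox, oy, payload) stored at this node
        self.children = None   # four sub-quadrants once split

    def insert(self, ox, oy, payload):
        if not (self.x <= ox < self.x + self.size and
                self.y <= oy < self.y + self.size):
            return False                      # outside this node's bounds
        if self.children is None and len(self.objects) < self.capacity:
            self.objects.append((ox, oy, payload))
            return True
        if self.children is None:
            self._split()
        return any(c.insert(ox, oy, payload) for c in self.children)

    def _split(self):
        h = self.size / 2
        self.children = [Quadtree(self.x,     self.y,     h, self.capacity),
                         Quadtree(self.x + h, self.y,     h, self.capacity),
                         Quadtree(self.x,     self.y + h, h, self.capacity),
                         Quadtree(self.x + h, self.y + h, h, self.capacity)]
        for o in self.objects:                # redistribute into quadrants
            any(c.insert(*o) for c in self.children)
        self.objects = []

    def query(self, qx, qy, qsize):
        """Collect payloads inside an axis-aligned query square
        (e.g., the footprint of the view frustum)."""
        if (qx >= self.x + self.size or qx + qsize <= self.x or
                qy >= self.y + self.size or qy + qsize <= self.y):
            return []                         # no overlap: prune this branch
        found = [p for (ox, oy, p) in self.objects
                 if qx <= ox < qx + qsize and qy <= oy < qy + qsize]
        if self.children:
            for c in self.children:
                found += c.query(qx, qy, qsize)
        return found
```

The pruning in `query` is what makes the structure attractive for rendering: subtrees wholly outside the view region are never traversed.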
Motivational Objects in Natural Scenes (MONS): A Database of >800 Objects.
Schomaker, Judith; Rau, Elias M; Einhäuser, Wolfgang; Wittmann, Bianca C
2017-01-01
In daily life, we are surrounded by objects with pre-existing motivational associations. However, these are rarely controlled for in experiments with natural stimuli. Research on natural stimuli would therefore benefit from stimuli with well-defined motivational properties; in turn, such stimuli also open new paths in research on motivation. Here we introduce a database of Motivational Objects in Natural Scenes (MONS). The database consists of 107 scenes. Each scene contains 2 to 7 objects placed at approximately equal distance from the scene center. Each scene was photographed creating 3 versions, with one object ("critical object") being replaced to vary the overall motivational value of the scene (appetitive, aversive, and neutral), while maintaining high visual similarity between the three versions. Ratings on motivation, valence, arousal and recognizability were obtained using internet-based questionnaires. Since the main objective was to provide stimuli of well-defined motivational value, three motivation scales were used: (1) Desire to own the object; (2) Approach/Avoid; (3) Desire to interact with the object. Three sets of ratings were obtained in independent sets of observers: for all 805 objects presented on a neutral background, for 321 critical objects presented in their scene context, and for the entire scenes. On the basis of the motivational ratings, objects were subdivided into aversive, neutral, and appetitive categories. The MONS database will provide a standardized basis for future studies on motivational value under realistic conditions.
NASA Astrophysics Data System (ADS)
Chandler, Damon M.; Field, David J.
2007-04-01
Natural scenes, like almost all natural data sets, show considerable redundancy. Although many forms of redundancy have been investigated (e.g., pixel distributions, power spectra, contour relationships, etc.), estimates of the true entropy of natural scenes have been largely considered intractable. We describe a technique for estimating the entropy and relative dimensionality of image patches based on a function we call the proximity distribution (a nearest-neighbor technique). The advantage of this function over simple statistics such as the power spectrum is that the proximity distribution is dependent on all forms of redundancy. We demonstrate that this function can be used to estimate the entropy (redundancy) of 3×3 patches of known entropy as well as 8×8 patches of Gaussian white noise, natural scenes, and noise with the same power spectrum as natural scenes. The techniques are based on assumptions regarding the intrinsic dimensionality of the data, and although the estimates depend on an extrapolation model for images larger than 3×3, we argue that this approach provides the best current estimates of the entropy and compressibility of natural-scene patches and that it provides insights into the efficiency of any coding strategy that aims to reduce redundancy. We show that the sample of 8×8 patches of natural scenes used in this study has less than half the entropy of 8×8 white noise and less than 60% of the entropy of noise with the same power spectrum. In addition, given a finite number of samples (<2^20) drawn randomly from the space of 8×8 patches, the subspace of 8×8 natural-scene patches shows a dimensionality that depends on the sampling density and that for low densities is significantly lower dimensional than the space of 8×8 patches of white noise and noise with the same power spectrum.
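The proximity-distribution idea, estimating entropy from nearest-neighbor distances, can be illustrated in one dimension with the classic Kozachenko-Leonenko estimator. This is a sketch of the general approach, not the authors' exact multi-dimensional procedure with its extrapolation model.

```python
import math

def nn_entropy_1d(xs):
    """Kozachenko-Leonenko nearest-neighbour estimate of differential
    entropy (in nats) for 1-D samples. Assumes no duplicate values
    (a zero nearest-neighbour distance would make the log undefined)."""
    n = len(xs)
    s = sorted(xs)
    total = 0.0
    for i in range(n):
        left = s[i] - s[i - 1] if i > 0 else float('inf')
        right = s[i + 1] - s[i] if i < n - 1 else float('inf')
        r = min(left, right)                    # distance to nearest neighbour
        total += math.log(2 * (n - 1) * r)
    gamma = 0.5772156649015329                  # Euler-Mascheroni constant
    return total / n + gamma
```

Because the estimator depends only on inter-sample distances, it is sensitive to every form of redundancy in the data, which is the property the paper exploits for image patches.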
The Neural Dynamics of Attentional Selection in Natural Scenes.
Kaiser, Daniel; Oosterhof, Nikolaas N; Peelen, Marius V
2016-10-12
The human visual system can only represent a small subset of the many objects present in cluttered scenes at any given time, such that objects compete for representation. Despite these processing limitations, the detection of object categories in cluttered natural scenes is remarkably rapid. How does the brain efficiently select goal-relevant objects from cluttered scenes? In the present study, we used multivariate decoding of magneto-encephalography (MEG) data to track the neural representation of within-scene objects as a function of top-down attentional set. Participants detected categorical targets (cars or people) in natural scenes. The presence of these categories within a scene was decoded from MEG sensor patterns by training linear classifiers on differentiating cars and people in isolation and testing these classifiers on scenes containing one of the two categories. The presence of a specific category in a scene could be reliably decoded from MEG response patterns as early as 160 ms, despite substantial scene clutter and variation in the visual appearance of each category. Strikingly, we find that these early categorical representations fully depend on the match between visual input and top-down attentional set: only objects that matched the current attentional set were processed to the category level within the first 200 ms after scene onset. A sensor-space searchlight analysis revealed that this early attention bias was localized to lateral occipitotemporal cortex, reflecting top-down modulation of visual processing. These results show that attention quickly resolves competition between objects in cluttered natural scenes, allowing for the rapid neural representation of goal-relevant objects. Efficient attentional selection is crucial in many everyday situations. For example, when driving a car, we need to quickly detect obstacles, such as pedestrians crossing the street, while ignoring irrelevant objects. 
How can humans efficiently perform such tasks, given the multitude of objects contained in real-world scenes? Here we used multivariate decoding of magnetoencephalography data to characterize the neural underpinnings of attentional selection in natural scenes with high temporal precision. We show that brain activity quickly tracks the presence of objects in scenes, but crucially only for those objects that were immediately relevant for the participant. These results provide evidence for fast and efficient attentional selection that mediates the rapid detection of goal-relevant objects in real-world environments. Copyright © 2016 the authors.
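The cross-decoding logic of the study, train on categories presented in isolation, then test on cluttered scenes, can be sketched with a toy nearest-centroid classifier on simulated sensor patterns. All numbers below are illustrative assumptions; the study used linear classifiers on real MEG sensor data.

```python
def centroid(vectors):
    """Componentwise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def train(isolated):
    """isolated: dict mapping category label -> list of sensor patterns
    recorded for that category in isolation."""
    return {label: centroid(vs) for label, vs in isolated.items()}

def predict(model, x):
    """Assign x to the category with the nearest training centroid."""
    def d2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(model, key=lambda label: d2(x, model[label]))
```

If scene responses carry an attenuated, noisy version of the isolated-category pattern, the classifier trained on isolation still generalizes to scenes, which is the signature of a shared categorical representation.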
Detecting text in natural scenes with multi-level MSER and SWT
NASA Astrophysics Data System (ADS)
Lu, Tongwei; Liu, Renjun
2018-04-01
Text detection in natural scenes is hampered by factors such as complex backgrounds, variable viewing angles, and diverse scripts, which lead to poor detection results. To address these problems, a new text detection method is proposed, consisting of two main stages: candidate region extraction and text region detection. In the first stage, the method applies multiple scale transformations of the original image and multiple thresholds of maximally stable extremal regions (MSER) to extract candidate character regions comprehensively. In the second stage, stroke width transform (SWT) maps are computed for the candidate regions, and cascaded classifiers are used to prune non-text regions. The proposed method was evaluated on the standard ICDAR2011 benchmark and on a dataset of our own. The experimental results show that the proposed method achieves a clear improvement over other text detection methods.
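The intuition behind SWT-based pruning is that genuine text has strokes of near-constant width, while arbitrary blobs do not. The sketch below is a drastically simplified, hypothetical 1-D illustration of that idea: it measures horizontal run lengths in a binarized region and rejects regions whose run widths vary too much. The real SWT traces gradient rays in 2-D; the threshold here is illustrative.

```python
def stroke_widths(bitmap):
    """Horizontal run lengths of foreground pixels in a binary bitmap
    (a 1-D simplification of the stroke width transform)."""
    widths = []
    for row in bitmap:
        run = 0
        for px in row + [0]:          # trailing 0 flushes the final run
            if px:
                run += 1
            elif run:
                widths.append(run)
                run = 0
    return widths

def looks_like_text(bitmap, max_cv=0.5):
    """Text-like regions have low stroke-width variance relative to the
    mean (coefficient of variation below an illustrative threshold)."""
    w = stroke_widths(bitmap)
    if not w:
        return False
    mean = sum(w) / len(w)
    var = sum((x - mean) ** 2 for x in w) / len(w)
    return (var ** 0.5) / mean <= max_cv
```

A cascaded classifier in the full pipeline would combine such stroke statistics with other geometric features rather than using a single threshold.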
Image Enhancement for Astronomical Scenes
2013-09-01
address this problem in the context of natural scenes. However, these techniques often misbehave when confronted with low-SNR scenes that are also mostly empty space. We compare two classes of
Detection of chromatic and luminance distortions in natural scenes.
Jennings, Ben J; Wang, Karen; Menzies, Samantha; Kingdom, Frederick A A
2015-09-01
A number of studies have measured visual thresholds for detecting spatial distortions applied to images of natural scenes. In one study, Bex [J. Vis. 10(2), 1 (2010)] measured sensitivity to sinusoidal spatial modulations of image scale. Here, we measure sensitivity to sinusoidal scale distortions applied to the chromatic, luminance, or both layers of natural scene images. We first established that sensitivity does not depend on whether the undistorted comparison image was of the same or of a different scene. Next, we found that, when the luminance but not chromatic layer was distorted, performance was the same regardless of whether the chromatic layer was present, absent, or phase-scrambled; in other words, the chromatic layer, in whatever form, did not affect sensitivity to the luminance layer distortion. However, when the chromatic layer was distorted, sensitivity was higher when the luminance layer was intact compared to when absent or phase-scrambled. These detection threshold results complement the appearance of periodic distortions of the image scale: when the luminance layer is distorted visibly, the scene appears distorted, but when the chromatic layer is distorted visibly, there is little apparent scene distortion. We conclude that (a) observers have a built-in sense of how a normal image of a natural scene should appear, and (b) the detection of distortion in, as well as the apparent distortion of, natural scene images is mediated predominantly by the luminance layer and not the chromatic layer.
Fixations on objects in natural scenes: dissociating importance from salience
't Hart, Bernard M.; Schmidt, Hannah C. E. F.; Roth, Christine; Einhäuser, Wolfgang
2013-01-01
The relation of selective attention to understanding of natural scenes has been subject to intense behavioral research and computational modeling, and gaze is often used as a proxy for such attention. The probability of an image region to be fixated typically correlates with its contrast. However, this relation does not imply a causal role of contrast. Rather, contrast may relate to an object's “importance” for a scene, which in turn drives attention. Here we operationalize importance by the probability that an observer names the object as characteristic for a scene. We modify luminance contrast of either a frequently named (“common”/“important”) or a rarely named (“rare”/“unimportant”) object, track the observers' eye movements during scene viewing and ask them to provide keywords describing the scene immediately after. When no object is modified relative to the background, important objects draw more fixations than unimportant ones. Increases of contrast make an object more likely to be fixated, irrespective of whether it was important for the original scene, while decreases in contrast have little effect on fixations. Any contrast modification makes originally unimportant objects more important for the scene. Finally, important objects are fixated more centrally than unimportant objects, irrespective of contrast. Our data suggest a dissociation between object importance (relevance for the scene) and salience (relevance for attention). If an object obeys natural scene statistics, important objects are also salient. However, when natural scene statistics are violated, importance and salience are differentially affected. Object salience is modulated by the expectation about object properties (e.g., formed by context or gist), and importance by the violation of such expectations. In addition, the dependence of fixated locations within an object on the object's importance suggests an analogy to the effects of word frequency on landing positions in reading. 
PMID:23882251
NASA Astrophysics Data System (ADS)
van der Linde, Ian; Rajashekar, Umesh; Cormack, Lawrence K.; Bovik, Alan C.
2005-03-01
Recent years have seen a resurgent interest in eye movements during natural scene viewing. Aspects of eye movements that are driven by low-level image properties are of particular interest due to their applicability to biologically motivated artificial vision and surveillance systems. In this paper, we report an experiment in which we recorded observers' eye movements while they viewed calibrated greyscale images of natural scenes. Immediately after viewing each image, observers were shown a test patch and asked to indicate if they thought it was part of the image they had just seen. The test patch was either randomly selected from a different image from the same database or, unbeknownst to the observer, selected from either the first or last location fixated on the image just viewed. We find that several low-level image properties differed significantly relative to the observers' ability to successfully designate each patch. We also find that the differences between patch statistics for first and last fixations are small compared to the differences between hit and miss responses. The goal of the paper was to measure, in a natural setting without an explicit cognitive task, the image properties that facilitate visual memory, additionally observing the role played by the temporal location (first or last fixation) of the test patch. We propose that a memorability map of a complex natural scene may be constructed to represent the low-level memorability of local regions in a similar fashion to the familiar saliency map, which records bottom-up fixation attractors.
Surface-illuminant ambiguity and color constancy: effects of scene complexity and depth cues.
Kraft, James M; Maloney, Shannon I; Brainard, David H
2002-01-01
Two experiments were conducted to study how scene complexity and cues to depth affect human color constancy. Specifically, two levels of scene complexity were compared. The low-complexity scene contained two walls with the same surface reflectance and a test patch which provided no information about the illuminant. In addition to the surfaces visible in the low-complexity scene, the high-complexity scene contained two rectangular solid objects and 24 paper samples with diverse surface reflectances. Observers viewed illuminated objects in an experimental chamber and adjusted the test patch until it appeared achromatic. Achromatic settings made under two different illuminants were used to compute an index that quantified the degree of constancy. Two experiments were conducted: one in which observers viewed the stimuli directly, and one in which they viewed the scenes through an optical system that reduced cues to depth. In each experiment, constancy was assessed for two conditions. In the valid-cue condition, many cues provided valid information about the illuminant change. In the invalid-cue condition, some image cues provided invalid information. Four broad conclusions are drawn from the data: (a) constancy is generally better in the valid-cue condition than in the invalid-cue condition; (b) for the stimulus configuration used, increasing image complexity has little effect in the valid-cue condition but leads to increased constancy in the invalid-cue condition; (c) for the stimulus configuration used, reducing cues to depth has little effect for either constancy condition; and (d) there is moderate individual variation in the degree of constancy exhibited, particularly in the degree to which the complexity manipulation affects performance.
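One common form of constancy index (used here as an illustrative sketch, not necessarily the authors' exact formula) compares the shift in the observer's achromatic settings across illuminants with the shift in the illuminant chromaticity itself:

```python
import math

def constancy_index(setting_1, setting_2, illum_1, illum_2):
    """Constancy index in a chromaticity plane.
    Each argument is an (x, y) chromaticity pair: the observer's achromatic
    setting under each illuminant, and the two illuminant chromaticities.
    1 = perfect constancy (settings do not move), 0 = no constancy
    (settings move exactly as far as the illuminant)."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])
    return 1.0 - dist(setting_1, setting_2) / dist(illum_1, illum_2)
```

Indices between 0 and 1 then quantify partial constancy, which is what allows the valid-cue and invalid-cue conditions to be compared on a common scale.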
Marin, Manuela M.; Lampatz, Allegra; Wandl, Michaela; Leder, Helmut
2016-01-01
In his seminal book on esthetics, Berlyne (1971) posited an inverted-U relationship between complexity and hedonic tone in arts appreciation; however, converging evidence for his theory is still missing. The disregard of the multidimensionality of complexity may explain some of the divergent results. Here, we argue that definitions of hedonic tone are manifold and systematically examine whether the nature of the relationship between complexity and hedonic tone is determined by the specific measure of hedonic tone. In Experiment 1, we studied three picture categories with similar affective and semantic contents: 96 affective environmental scenes, which were also converted into 96 cartoons, and 96 representational paintings. Complexity varied along the dimension of elements. In a between-subjects design, each stimulus was presented for 5 s to 206 female participants. Subjective ratings of hedonic tone (either beauty, pleasantness or liking), arousal, complexity and familiarity were collected in three conditions per stimulus set. Complexity and arousal were positively associated in all conditions, with the strongest association observed for paintings. For environmental scenes and cartoons, there was no significant association between complexity and hedonic tone, and the three measures of hedonic tone were highly correlated (all rs > 0.85). As predicted, in paintings the measures of hedonic tone were less strongly correlated (all rs > 0.73), and when controlling for familiarity, the association with complexity was significantly positive for beauty (rs = 0.26), weakly negative for pleasantness (rs = -0.16) and not present for liking. Experiment 2 followed a similar approach and 77 female participants, all non-musicians, rated 92 musical excerpts (15 s) in three conditions of hedonic tone (either beauty, pleasantness or liking). Results indicated a strong relationship between complexity and arousal (all rs > 0.85). 
When controlling for familiarity effects, the relationship between complexity and beauty followed an inverted-U curve, whereas the relationship between complexity and pleasantness was negative (rs = -0.26) and the one between complexity and liking positive (rs = 0.29). We relate our results to Berlyne’s theory and the latest findings in neuroaesthetics, proposing that future studies need to acknowledge the multifaceted nature of hedonic tone in esthetic experiences of artforms. PMID:27867350
Motivational Objects in Natural Scenes (MONS): A Database of >800 Objects
Schomaker, Judith; Rau, Elias M.; Einhäuser, Wolfgang; Wittmann, Bianca C.
2017-01-01
In daily life, we are surrounded by objects with pre-existing motivational associations. However, these are rarely controlled for in experiments with natural stimuli. Research on natural stimuli would therefore benefit from stimuli with well-defined motivational properties; in turn, such stimuli also open new paths in research on motivation. Here we introduce a database of Motivational Objects in Natural Scenes (MONS). The database consists of 107 scenes. Each scene contains 2 to 7 objects placed at approximately equal distance from the scene center. Each scene was photographed in three versions, with one object (the “critical object”) replaced to vary the overall motivational value of the scene (appetitive, aversive, and neutral) while maintaining high visual similarity between the three versions. Ratings on motivation, valence, arousal and recognizability were obtained using internet-based questionnaires. Since the main objective was to provide stimuli of well-defined motivational value, three motivation scales were used: (1) Desire to own the object; (2) Approach/Avoid; (3) Desire to interact with the object. Three sets of ratings were obtained in independent sets of observers: for all 805 objects presented on a neutral background, for 321 critical objects presented in their scene context, and for the entire scenes. On the basis of the motivational ratings, objects were subdivided into aversive, neutral, and appetitive categories. The MONS database will provide a standardized basis for future studies on motivational value under realistic conditions. PMID:29033870
Recognising the forest, but not the trees: an effect of colour on scene perception and recognition.
Nijboer, Tanja C W; Kanai, Ryota; de Haan, Edward H F; van der Smagt, Maarten J
2008-09-01
Colour has been shown to facilitate the recognition of scene images, but only when these images contain natural scenes, for which colour is 'diagnostic'. Here we investigate whether colour can also facilitate memory for scene images, and whether this would hold for natural scenes in particular. In the first experiment, participants studied a set of colour and greyscale natural and man-made scene images. Next, the same images were presented, randomly mixed with a different set. Participants were asked to indicate whether they had seen the images during the study phase. Surprisingly, performance was better for greyscale than for coloured images, and this difference was due to a higher false alarm rate for both natural and man-made coloured scenes. We hypothesized that this increase in false alarm rate was due to a shift from scrutinizing details of the image to recognition of the gist of the (coloured) image. A second experiment, utilizing images without a nameable gist, confirmed this hypothesis, as participants now performed equally well on greyscale and coloured images. In the final experiment we specifically targeted the more detail-based perception and recognition of greyscale images versus the more gist-based perception and recognition of coloured images with a change-detection paradigm. The results show that changes to images are detected faster when image pairs were presented in greyscale than in colour. This counterintuitive result held for both natural and man-made scenes (but not for scenes without a nameable gist), and thus corroborates the shift from more detailed processing of greyscale images to more gist-based processing of coloured images.
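The greyscale advantage above rests on a difference in false alarm rates, which signal detection theory separates from overall accuracy and response bias. A hedged sketch of the standard sensitivity index d' (the rates below are illustrative, not the study's data):

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# illustrative rates: equal hits, but colour images draw more false alarms,
# so sensitivity is lower even though the hit rate is unchanged
d_greyscale = d_prime(0.80, 0.10)
d_colour = d_prime(0.80, 0.25)
```

Reporting d' alongside raw error rates makes clear whether a manipulation changed perceptual sensitivity or only the observers' willingness to say "old".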
Statistical regularities in art: Relations with visual coding and perception.
Graham, Daniel J; Redies, Christoph
2010-07-21
Since at least 1935, vision researchers have used art stimuli to test human response to complex scenes. This is sensible given the "inherent interestingness" of art and its relation to the natural visual world. The use of art stimuli has remained popular, especially in eye tracking studies. Moreover, stimuli in common use by vision scientists are inspired by the work of famous artists (e.g., Mondrians). Artworks are also popular in vision science as illustrations of a host of visual phenomena, such as depth cues and surface properties. However, until recently, there has been scant consideration of the spatial, luminance, and color statistics of artwork, and even less study of ways that regularities in such statistics could affect visual processing. Furthermore, the relationship between regularities in art images and those in natural scenes has received little or no attention. In the past few years, there has been a concerted effort to study statistical regularities in art as they relate to neural coding and visual perception, and art stimuli have begun to be studied in rigorous ways, as natural scenes have been. In this minireview, we summarize quantitative studies of links between regular statistics in artwork and processing in the visual stream. The results of these studies suggest that art is especially germane to understanding human visual coding and perception, and it therefore warrants wider study. Copyright 2010 Elsevier Ltd. All rights reserved.
Cultural differences in the lateral occipital complex while viewing incongruent scenes
Yang, Yung-Jui; Goh, Joshua; Hong, Ying-Yi; Park, Denise C.
2010-01-01
Converging behavioral and neuroimaging evidence indicates that culture influences the processing of complex visual scenes. Whereas Westerners focus on central objects and tend to ignore context, East Asians process scenes more holistically, attending to the context in which objects are embedded. We investigated cultural differences in contextual processing by manipulating the congruence of visual scenes presented in an fMR-adaptation paradigm. We hypothesized that East Asians would show greater adaptation to incongruent scenes, consistent with their tendency to process contextual relationships more extensively than Westerners. Sixteen Americans and 16 native Chinese were scanned while viewing sets of pictures consisting of a focal object superimposed upon a background scene. In half of the pictures objects were paired with congruent backgrounds, and in the other half objects were paired with incongruent backgrounds. We found that within both the right and left lateral occipital complexes, Chinese participants showed significantly greater adaptation to incongruent scenes than to congruent scenes relative to American participants. These results suggest that Chinese were more sensitive to contextual incongruity than were Americans and that they reacted to incongruent object/background pairings by focusing greater attention on the object. PMID:20083532
Institute for Brain and Neural Systems
2009-10-06
to deal with computational complexity when analyzing large amounts of information in visual scenes. It seems natural that in addition to exploring...algorithms using methods from statistical pattern recognition and machine learning. Over the last fifteen years, significant advances had been made in...recognition, robustness to noise and ability to cope with significant variations in lighting conditions. Identifying an occluded target adds another layer of
Local spatial frequency analysis for computer vision
NASA Technical Reports Server (NTRS)
Krumm, John; Shafer, Steven A.
1990-01-01
A sense of vision is a prerequisite for a robot to function in an unstructured environment. However, real-world scenes contain many interacting phenomena that lead to complex images which are difficult to interpret automatically. Typical computer vision research proceeds by analyzing various effects in isolation (e.g., shading, texture, stereo, defocus), usually on images devoid of realistic complicating factors. This leads to specialized algorithms which fail on real-world images. Part of this failure is due to the dichotomy of useful representations for these phenomena. Some effects are best described in the spatial domain, while others are more naturally expressed in frequency. In order to resolve this dichotomy, we present the combined space/frequency representation which, for each point in an image, shows the spatial frequencies at that point. Within this common representation, we develop a set of simple, natural theories describing phenomena such as texture, shape, aliasing and lens parameters. We show these theories lead to algorithms for shape from texture and for dealiasing image data. The space/frequency representation should be a key aid in untangling the complex interaction of phenomena in images, allowing automatic understanding of real-world scenes.
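The combined space/frequency representation described above can be approximated by a windowed Fourier transform evaluated around each image point. A minimal sketch (the window size and Hann taper are illustrative choices, not the paper's exact construction):

```python
import numpy as np

def local_spectrum(image, y, x, window=16):
    """Local spatial-frequency content at (y, x): the magnitude spectrum
    of a Hann-windowed patch centred on that point."""
    h = window // 2
    patch = image[y - h:y + h, x - h:x + h].astype(float)
    hann = np.hanning(window)
    patch = patch * np.outer(hann, hann)       # taper to reduce edge leakage
    return np.abs(np.fft.fftshift(np.fft.fft2(patch)))

# synthetic "texture": horizontal sinusoid with 4 cycles per 16-pixel window
yy, xx = np.mgrid[0:64, 0:64]
image = np.sin(2 * np.pi * xx * 4 / 16)

spec = local_spectrum(image, 32, 32)
# the dominant spatial frequency appears as peaks offset from the centre
peak = np.unravel_index(np.argmax(spec), spec.shape)
```

Evaluating this spectrum at every point yields the four-dimensional space/frequency volume in which texture gradients, aliasing, and defocus all become explicit.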
ERIC Educational Resources Information Center
Freeth, M.; Chapman, P.; Ropar, D.; Mitchell, P.
2010-01-01
Visual fixation patterns whilst viewing complex photographic scenes containing one person were studied in 24 high-functioning adolescents with Autism Spectrum Disorders (ASD) and 24 matched typically developing adolescents. Over two different scene presentation durations both groups spent a large, strikingly similar proportion of their viewing…
Effects of chromatic image statistics on illumination induced color differences.
Lucassen, Marcel P; Gevers, Theo; Gijsenij, Arjan; Dekker, Niels
2013-09-01
We measure the color fidelity of visual scenes that are rendered under different (simulated) illuminants and shown on a calibrated LCD display. Observers make triad illuminant comparisons involving the renderings from two chromatic test illuminants and one achromatic reference illuminant shown simultaneously. Four chromatic test illuminants are used: two along the daylight locus (yellow and blue), and two perpendicular to it (red and green). The observers select the rendering having the best color fidelity, thereby indirectly judging which of the two test illuminants induces the smallest color differences compared to the reference. Both multicolor test scenes and natural scenes are studied. The multicolor scenes are synthesized and represent ellipsoidal distributions in CIELAB chromaticity space having the same mean chromaticity but different chromatic orientations. We show that, for those distributions, color fidelity is best when the vector of the illuminant change (pointing from neutral to chromatic) is parallel to the major axis of the scene's chromatic distribution. For our selection of natural scenes, which generally have much broader chromatic distributions, we measure a higher color fidelity for the yellow and blue illuminants than for red and green. Scrambled versions of the natural images are also studied to exclude possible semantic effects. We quantitatively predict the average observer response (i.e., the illuminant probability) with four types of models, differing in the extent to which they incorporate information processing by the visual system. Results show different levels of performance for the models, and different levels for the multicolor scenes and the natural scenes. Overall, models based on the scene averaged color difference have the best performance. We discuss how color constancy algorithms may be improved by exploiting knowledge of the chromatic distribution of the visual scene.
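A "scene averaged color difference" model of the kind evaluated above can be sketched as the mean per-pixel CIELAB difference between two renderings. The CIE76 formula below is the simplest variant; the study may use a more refined color-difference metric:

```python
import numpy as np

def mean_delta_e(lab_ref, lab_test):
    """Scene-averaged CIE76 colour difference between two renderings.

    lab_ref, lab_test: arrays of shape (..., 3) holding CIELAB (L*, a*, b*).
    """
    return float(np.mean(np.linalg.norm(lab_ref - lab_test, axis=-1)))

# toy scene: 4 patches under a reference vs. a yellow-shifted test illuminant
lab_reference = np.array([[50.0, 0.0, 0.0],
                          [60.0, 10.0, -5.0],
                          [40.0, -8.0, 12.0],
                          [70.0, 3.0, 3.0]])
lab_yellowish = lab_reference + np.array([0.0, 0.0, 4.0])  # uniform +b* shift

de = mean_delta_e(lab_reference, lab_yellowish)
# a uniform shift of 4 in b* gives a mean delta E of exactly 4
```

Comparing this scalar across candidate illuminants predicts which rendering an observer should judge as more faithful, mirroring the triad comparison task.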
Visual flight control in naturalistic and artificial environments.
Baird, Emily; Dacke, Marie
2012-12-01
Although the visual flight control strategies of flying insects have evolved to cope with the complexity of the natural world, studies investigating this behaviour have typically been performed indoors using simplified two-dimensional artificial visual stimuli. How well do the results from these studies reflect the natural behaviour of flying insects, considering the radical differences in contrast, spatial composition, colour and dimensionality between these visual environments? Here, we aim to answer this question by investigating the effect of three- and two-dimensional naturalistic and artificial scenes on bumblebee flight control in an outdoor setting and comparing the results with those of similar experiments performed in an indoor setting. In particular, we focus on investigating the effect of axial (front-to-back) visual motion cues on ground speed and centring behaviour. Our results suggest that, in general, ground speed control and centring behaviour in bumblebees are not affected by whether the visual scene is two- or three-dimensional, naturalistic or artificial, or whether the experiment is conducted indoors or outdoors. The only effect that we observe between naturalistic and artificial scenes on flight control is that when the visual scene is three-dimensional and the visual information on the floor is minimised, bumblebees fly further from the midline of the tunnel. The findings presented here have implications not only for understanding the mechanisms of visual flight control in bumblebees, but also for the results of past and future investigations into visually guided flight control in other insects.
Acute stress influences the discrimination of complex scenes and complex faces in young healthy men.
Paul, M; Lech, R K; Scheil, J; Dierolf, A M; Suchan, B; Wolf, O T
2016-04-01
The stress-induced release of glucocorticoids has been demonstrated to influence hippocampal functions via the modulation of specific receptors. At the behavioral level, stress is known to influence hippocampus-dependent long-term memory. In recent years, studies have consistently associated the hippocampus with the non-mnemonic perception of scenes, while adjacent regions in the medial temporal lobe were associated with the perception of objects and faces. So far it is not known whether and how stress influences non-mnemonic perceptual processes. In a behavioral study, fifty male participants were subjected either to the stressful socially evaluated cold-pressor test or to a non-stressful control procedure before they completed a visual discrimination task comprising scenes and faces. The complexity of the face and scene stimuli was manipulated in easy and difficult conditions. A significant three-way interaction between stress, stimulus type and complexity was found. Stressed participants tended to commit more errors in the complex scenes condition. For complex faces, a descriptive tendency in the opposite direction (fewer errors under stress) was observed. As a result, the difference between the number of errors for scenes and errors for faces was significantly larger in the stress group. These results indicate that, beyond the effects of stress on long-term memory, stress influences the discrimination of spatial information, especially when the perception is characterized by a high complexity. Copyright © 2016 Elsevier Ltd. All rights reserved.
Ho-Phuoc, Tien; Guyader, Nathalie; Landragin, Frédéric; Guérin-Dugué, Anne
2012-02-03
Since Treisman's theory, it has been generally accepted that color is an elementary feature that guides eye movements when looking at natural scenes. Hence, most computational models of visual attention predict eye movements using color as an important visual feature. In this paper, using experimental data, we show that color does not affect where observers look when viewing natural scene images. Neither colors nor abnormal colors modify observers' fixation locations when compared to the same scenes in grayscale. In the same way, we did not find any significant difference between the scanpaths under grayscale, color, or abnormal color viewing conditions. However, we observed a decrease in fixation duration for color and abnormal color, and this was particularly true at the beginning of scene exploration. Finally, we found that abnormal color modifies saccade amplitude distribution.
Modulation of visually evoked movement responses in moving virtual environments.
Reed-Jones, Rebecca J; Vallis, Lori Ann
2009-01-01
Virtual-reality technology is being increasingly used to understand how humans perceive and act in the moving world around them. What is currently not clear is how virtual reality technology is perceived by human participants and what virtual scenes are effective in evoking movement responses to visual stimuli. We investigated the effect of virtual-scene context on human responses to a virtual visual perturbation. We hypothesised that exposure to a natural scene that matched the visual expectancies of the natural world would create a perceptual set towards presence, and thus visual guidance of body movement in a subsequently presented virtual scene. Results supported this hypothesis; responses to a virtual visual perturbation presented in an ambiguous virtual scene were increased when participants first viewed a scene that consisted of natural landmarks which provided 'real-world' visual motion cues. Further research in this area will provide a basis of knowledge for the effective use of this technology in the study of human movement responses.
Residual attention guidance in blindsight monkeys watching complex natural scenes.
Yoshida, Masatoshi; Itti, Laurent; Berg, David J; Ikeda, Takuro; Kato, Rikako; Takaura, Kana; White, Brian J; Munoz, Douglas P; Isa, Tadashi
2012-08-07
Patients with damage to primary visual cortex (V1) demonstrate residual performance on laboratory visual tasks despite denial of conscious seeing (blindsight) [1]. After a period of recovery, which suggests a role for plasticity [2], visual sensitivity higher than chance is observed in humans and monkeys for simple luminance-defined stimuli, grating stimuli, moving gratings, and other stimuli [3-7]. Some residual cognitive processes including bottom-up attention and spatial memory have also been demonstrated [8-10]. To date, little is known about blindsight with natural stimuli and spontaneous visual behavior. In particular, is orienting attention toward salient stimuli during free viewing still possible? We used a computational saliency map model to analyze spontaneous eye movements of monkeys with blindsight from unilateral ablation of V1. Despite general deficits in gaze allocation, monkeys were significantly attracted to salient stimuli. The contribution of orientation features to salience was nearly abolished, whereas contributions of motion, intensity, and color features were preserved. Control experiments employing laboratory stimuli confirmed the free-viewing finding that lesioned monkeys retained color sensitivity. Our results show that attention guidance over complex natural scenes is preserved in the absence of V1, thereby directly challenging theories and models that crucially depend on V1 to compute the low-level visual features that guide attention. Copyright © 2012 Elsevier Ltd. All rights reserved.
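Saliency models of the Itti-Koch family used in this study combine several feature channels (intensity, color, orientation, motion); the intensity channel alone can be sketched as a centre-surround difference of Gaussian-blurred copies. A minimal illustration, not the authors' implementation:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def intensity_saliency(image, center_sigma=2, surround_sigma=8):
    """Minimal centre-surround saliency for the intensity channel:
    absolute difference between fine and coarse Gaussian-blurred copies."""
    center = gaussian_filter(image.astype(float), center_sigma)
    surround = gaussian_filter(image.astype(float), surround_sigma)
    sal = np.abs(center - surround)
    return sal / (sal.max() + 1e-12)           # normalise to [0, 1]

# a dim background with one bright blob: the blob should be the saliency peak
scene = np.zeros((64, 64))
scene[30:34, 40:44] = 1.0

sal = intensity_saliency(scene)
peak = np.unravel_index(np.argmax(sal), sal.shape)
```

In the study's analysis, a feature's contribution is assessed by how strongly gaze lands on locations where that channel's map is high; ablating V1 nearly abolished the orientation channel's contribution while sparing intensity, color, and motion.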
Object detection in natural scenes: Independent effects of spatial and category-based attention.
Stein, Timo; Peelen, Marius V
2017-04-01
Humans are remarkably efficient in detecting highly familiar object categories in natural scenes, with evidence suggesting that such object detection can be performed in the (near) absence of attention. Here we systematically explored the influences of both spatial attention and category-based attention on the accuracy of object detection in natural scenes. Manipulating both types of attention additionally allowed for addressing how these factors interact: whether the requirement for spatial attention depends on the extent to which observers are prepared to detect a specific object category; that is, on category-based attention. The results showed that the detection of targets from one category (animals or vehicles) was better than the detection of targets from two categories (animals and vehicles), demonstrating the beneficial effect of category-based attention. This effect did not depend on the semantic congruency of the target object and the background scene, indicating that observers attended to visual features diagnostic of the foreground target objects from the cued category. Importantly, in three experiments the detection of objects in scenes presented in the periphery was significantly impaired when observers simultaneously performed an attentionally demanding task at fixation, showing that spatial attention affects natural scene perception. In all experiments, the effects of category-based attention and spatial attention on object detection performance were additive rather than interactive. Finally, neither spatial nor category-based attention influenced metacognitive ability for object detection performance. These findings demonstrate that efficient object detection in natural scenes is independently facilitated by spatial and category-based attention.
Face, Body, and Center of Gravity Mediate Person Detection in Natural Scenes
ERIC Educational Resources Information Center
Bindemann, Markus; Scheepers, Christoph; Ferguson, Heather J.; Burton, A. Mike
2010-01-01
Person detection is an important prerequisite of social interaction, but is not well understood. Following suggestions that people in the visual field can capture a viewer's attention, this study examines the role of the face and the body for person detection in natural scenes. We observed that viewers tend first to look at the center of a scene,…
Sparse Coding of Natural Human Motion Yields Eigenmotions Consistent Across People
NASA Astrophysics Data System (ADS)
Thomik, Andreas; Faisal, A. Aldo
2015-03-01
Providing a precise mathematical description of the structure of natural human movement is a challenging problem. We use a data-driven approach to seek a generative model of movement capturing the underlying simplicity of spatial and temporal structure of behaviour observed in daily life. In perception, the analysis of natural scenes has shown that sparse codes of such scenes are information theoretic efficient descriptors with direct neuronal correlates. Translating from perception to action, we identify a generative model of movement generation by the human motor system. Using wearable full-hand motion capture, we measure the digit movement of the human hand in daily life. We learn a dictionary of "eigenmotions" which we use for sparse encoding of the movement data. We show that the dictionaries are generally well preserved across subjects with small deviations accounting for individuality of the person and variability in tasks. Further, the dictionary elements represent motions which can naturally describe hand movements. Our findings suggest the motor system can compose complex movement behaviours out of the spatially and temporally sparse activation of "eigenmotion" neurons, and is consistent with data on grasp-type specificity of specialised neurons in the premotor cortex. Andreas is supported by the Luxemburg Research Fund (1229297).
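Sparse coding of movement data of the kind described above can be illustrated with a greedy matching-pursuit encoder against a fixed dictionary. The "eigenmotion" atoms below are random stand-ins; the study learns its dictionary from motion-capture data:

```python
import numpy as np

def matching_pursuit(x, D, k):
    """Greedy sparse code of x using at most k atoms of dictionary D
    (columns assumed unit norm)."""
    residual = x.astype(float).copy()
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        scores = D.T @ residual               # correlation with each atom
        j = int(np.argmax(np.abs(scores)))
        coef[j] += scores[j]
        residual -= scores[j] * D[:, j]
    return coef

rng = np.random.default_rng(1)
# stand-in "eigenmotion" dictionary: 8 unit-norm atoms over 20 joint angles
D = rng.normal(size=(20, 8))
D /= np.linalg.norm(D, axis=0)

# a movement frame composed of only two atoms, so it is inherently sparse
true_coef = np.zeros(8)
true_coef[2], true_coef[5] = 1.5, -0.8
x = D @ true_coef

coef = matching_pursuit(x, D, k=4)
recon_err = np.linalg.norm(x - D @ coef) / np.linalg.norm(x)
```

The point of the sketch is the representation: a high-dimensional hand posture is reconstructed from the activation of only a few dictionary elements, mirroring the proposed sparse "eigenmotion" code.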
Short report: the effect of expertise in hiking on recognition memory for mountain scenes.
Kawamura, Satoru; Suzuki, Sae; Morikawa, Kazunori
2007-10-01
The nature of an expert memory advantage that does not depend on stimulus structure or chunking was examined, using more ecologically valid stimuli in the context of a more natural activity than previously studied domains. Do expert hikers and novice hikers see and remember mountain scenes differently? In the present experiment, 18 novice hikers and 17 expert hikers were presented with 60 photographs of scenes from hiking trails. These scenes differed in the degree of functional aspects that implied some action possibilities or dangers. The recognition test revealed that the memory performance of experts was significantly superior to that of novices for scenes with highly functional aspects. The memory performance for the scenes with few functional aspects did not differ between novices and experts. These results suggest that experts pay more attention to, and thus remember better, scenes with functional meanings than do novices.
Cultural differences in attention: Eye movement evidence from a comparative visual search task.
Alotaibi, Albandri; Underwood, Geoffrey; Smith, Alastair D
2017-10-01
Individual differences in visual attention have been linked to thinking style: analytic thinking (common in individualistic cultures) is thought to promote attention to detail and focus on the most important part of a scene, whereas holistic thinking (common in collectivist cultures) promotes attention to the global structure of a scene and the relationship between its parts. However, this theory is primarily based on relatively simple judgement tasks. We compared groups from Great Britain (an individualist culture) and Saudi Arabia (a collectivist culture) on a more complex comparative visual search task, using simple natural scenes. A higher overall number of fixations for Saudi participants, along with longer search times, indicated less efficient search behaviour than British participants. Furthermore, intra-group comparisons of scan-path for Saudi participants revealed less similarity than within the British group. Together, these findings suggest that there is a positive relationship between an analytic cognitive style and controlled attention. Copyright © 2017 Elsevier Inc. All rights reserved.
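Intra-group scan-path similarity of the kind reported above is often quantified with a string-edit measure over fixated scene regions; whether the authors used this exact metric is not stated, so the sketch below is illustrative:

```python
def edit_distance(a, b):
    """Levenshtein distance between two fixation-region strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def scanpath_similarity(a, b):
    """1 = identical region sequences, 0 = maximally different."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))

# fixation sequences coded as letters naming gridded scene regions
sim_same = scanpath_similarity("ABCD", "ABCD")   # -> 1.0
sim_close = scanpath_similarity("ABCD", "ABED")
```

Averaging pairwise similarities within each group then gives the intra-group homogeneity score that the comparison between Saudi and British participants rests on.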
Some of the thousand words a picture is worth.
Mandler, J M; Johnson, N S
1976-09-01
The effects of real-world schemata on recognition of complex pictures were studied. Two kinds of pictures were used: pictures of objects forming real-world scenes and unorganized collections of the same objects. The recognition test employed distractors that varied four types of information: inventory, spatial location, descriptive, and spatial composition. Results emphasized the selective nature of schemata, since superior recognition of one kind of information was offset by loss of another. Spatial location information was better recognized in real-world scenes and spatial composition information was better recognized in unorganized scenes. Organized and unorganized pictures did not differ with respect to inventory and descriptive information. The longer the pictures were studied, the longer subjects took to recognize them. Reaction time for hits, misses, and false alarms increased dramatically as presentation time increased from 5 to 60 sec. It was suggested that detection of a difference in a distractor terminated search, but that when no difference was detected, an exhaustive search of the available information took place.
Miskovic, Vladimir; Martinovic, Jasna; Wieser, Matthias J; Petro, Nathan M; Bradley, Margaret M; Keil, Andreas
2015-03-01
Emotionally arousing scenes readily capture visual attention, prompting amplified neural activity in sensory regions of the brain. The physical stimulus features and related information channels in the human visual system that contribute to this modulation, however, are not known. Here, we manipulated low-level physical parameters of complex scenes varying in hedonic valence and emotional arousal in order to target the relative contributions of luminance based versus chromatic visual channels to emotional perception. Stimulus-evoked brain electrical activity was measured during picture viewing and used to quantify neural responses sensitive to lower-tier visual cortical involvement (steady-state visual evoked potentials) as well as the late positive potential, reflecting a more distributed cortical event. Results showed that the enhancement for emotional content was stimulus-selective when examining the steady-state segments of the evoked visual potentials. Response amplification was present only for low spatial frequency, grayscale stimuli, and not for high spatial frequency, red/green stimuli. In contrast, the late positive potential was modulated by emotion regardless of the scene's physical properties. Our findings are discussed in relation to neurophysiologically plausible constraints operating at distinct stages of the cortical processing stream. Copyright © 2015 Elsevier B.V. All rights reserved.
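Steady-state visual evoked potential amplitude is conventionally read off the signal's spectrum at the stimulation frequency. A minimal sketch on synthetic data (the sampling rate and flicker frequency below are illustrative, not the study's parameters):

```python
import numpy as np

def ssvep_amplitude(signal, fs, f_drive):
    """Amplitude of the steady-state response at the driving frequency,
    read off the discrete Fourier spectrum."""
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    idx = int(np.argmin(np.abs(freqs - f_drive)))
    return 2.0 * np.abs(spectrum[idx]) / n

# synthetic 2 s recording: a 15 Hz flicker response embedded in noise
fs, f_drive = 500.0, 15.0
t = np.arange(0, 2.0, 1.0 / fs)
rng = np.random.default_rng(7)
signal = 1.2 * np.sin(2 * np.pi * f_drive * t) + 0.3 * rng.normal(size=t.size)

amp = ssvep_amplitude(signal, fs, f_drive)
```

Because the driving frequency is known exactly, this narrow-band readout isolates lower-tier visual cortical activity from broadband components such as the late positive potential.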
Neural Correlates of Auditory Figure-Ground Segregation Based on Temporal Coherence
Teki, Sundeep; Barascud, Nicolas; Picard, Samuel; Payne, Christopher; Griffiths, Timothy D.; Chait, Maria
2016-01-01
To make sense of natural acoustic environments, listeners must parse complex mixtures of sounds that vary in frequency, space, and time. Emerging work suggests that, in addition to the well-studied spectral cues for segregation, sensitivity to temporal coherence—the coincidence of sound elements in and across time—is also critical for the perceptual organization of acoustic scenes. Here, we examine pre-attentive, stimulus-driven neural processes underlying auditory figure-ground segregation using stimuli that capture the challenges of listening in complex scenes where segregation cannot be achieved based on spectral cues alone. Signals (“stochastic figure-ground”: SFG) comprised a sequence of brief broadband chords containing random pure tone components that vary from 1 chord to another. Occasional tone repetitions across chords are perceived as “figures” popping out of a stochastic “ground.” Magnetoencephalography (MEG) measurement in naïve, distracted, human subjects revealed robust evoked responses, commencing from about 150 ms after figure onset that reflect the emergence of the “figure” from the randomly varying “ground.” Neural sources underlying this bottom-up driven figure-ground segregation were localized to planum temporale, and the intraparietal sulcus, demonstrating that this area, outside the “classic” auditory system, is also involved in the early stages of auditory scene analysis. PMID:27325682
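The stochastic figure-ground stimulus described above can be sketched as a chord sequence in which a fixed subset of frequencies repeats across consecutive chords. The frequency pool and timing parameters below are illustrative, not the exact stimulus specification:

```python
import numpy as np

def sfg_chords(n_chords=40, tones_per_chord=10, figure_size=4,
               figure_start=20, figure_len=8, seed=0):
    """Stochastic figure-ground (SFG) stimulus as a list of chord frequency arrays.

    Each chord holds random pure-tone frequencies; for figure_len consecutive
    chords, figure_size frequencies repeat across chords and pop out as the
    'figure' against the randomly varying 'ground'."""
    rng = np.random.default_rng(seed)
    pool = np.arange(200, 4200, 100)           # assumed frequency pool (Hz)
    figure = rng.choice(pool, size=figure_size, replace=False)
    ground_pool = np.setdiff1d(pool, figure)
    chords = []
    for i in range(n_chords):
        if figure_start <= i < figure_start + figure_len:
            ground = rng.choice(ground_pool,
                                size=tones_per_chord - figure_size,
                                replace=False)
            chords.append(np.concatenate([figure, ground]))
        else:
            chords.append(rng.choice(pool, size=tones_per_chord, replace=False))
    return chords, figure

chords, figure = sfg_chords()
```

Because every individual chord is statistically identical, the figure is detectable only by integrating across chords, which is exactly the temporal-coherence computation the study probes.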
Ghodrati, Masoud; Ghodousi, Mahrad; Yoonessi, Ali
2016-01-01
Humans are fast and accurate in categorizing complex natural images. It is, however, unclear what features of visual information are exploited by brain to perceive the images with such speed and accuracy. It has been shown that low-level contrast statistics of natural scenes can explain the variance of amplitude of event-related potentials (ERP) in response to rapidly presented images. In this study, we investigated the effect of these statistics on frequency content of ERPs. We recorded ERPs from human subjects, while they viewed natural images each presented for 70 ms. Our results showed that Weibull contrast statistics, as a biologically plausible model, explained the variance of ERPs the best, compared to other image statistics that we assessed. Our time-frequency analysis revealed a significant correlation between these statistics and ERPs' power within theta frequency band (~3-7 Hz). This is interesting, as theta band is believed to be involved in context updating and semantic encoding. This correlation became significant at ~110 ms after stimulus onset, and peaked at 138 ms. Our results show that not only the amplitude but also the frequency of neural responses can be modulated with low-level contrast statistics of natural images and highlights their potential role in scene perception.
Ghodrati, Masoud; Ghodousi, Mahrad; Yoonessi, Ali
2016-01-01
Humans are fast and accurate in categorizing complex natural images. It is, however, unclear which features of visual information are exploited by the brain to perceive images with such speed and accuracy. It has been shown that low-level contrast statistics of natural scenes can explain the variance of the amplitude of event-related potentials (ERPs) in response to rapidly presented images. In this study, we investigated the effect of these statistics on the frequency content of ERPs. We recorded ERPs from human subjects while they viewed natural images, each presented for 70 ms. Our results showed that Weibull contrast statistics, as a biologically plausible model, explained the variance of ERPs best, compared to the other image statistics that we assessed. Our time-frequency analysis revealed a significant correlation between these statistics and ERP power within the theta frequency band (~3–7 Hz). This is interesting, as the theta band is believed to be involved in context updating and semantic encoding. This correlation became significant at ~110 ms after stimulus onset and peaked at 138 ms. Our results show that not only the amplitude but also the frequency of neural responses can be modulated by low-level contrast statistics of natural images, and highlight their potential role in scene perception. PMID:28018197
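The Weibull contrast model referenced above characterizes a scene by fitting a two-parameter Weibull distribution to the histogram of local contrast values; the shape and scale parameters summarize the scene's contrast statistics. As a hedged illustration (the authors' exact contrast measure and fitting procedure are not specified in the abstract), a minimal sketch using gradient magnitude and a maximum-likelihood fit might look like:

```python
import numpy as np
from scipy import stats
from scipy.ndimage import sobel

def weibull_contrast_params(img):
    """Fit a two-parameter Weibull to the distribution of local
    contrast (here, gradient magnitude). Returns (shape, scale)."""
    gx = sobel(img, axis=0)
    gy = sobel(img, axis=1)
    mag = np.hypot(gx, gy).ravel()
    mag = mag[mag > 0]  # Weibull support is strictly positive
    shape, _loc, scale = stats.weibull_min.fit(mag, floc=0)
    return shape, scale

# toy stand-in for a natural image: uniform noise
rng = np.random.default_rng(0)
img = rng.random((64, 64))
shape, scale = weibull_contrast_params(img)
```

In the literature on this model, the scale parameter tracks overall contrast energy while the shape parameter tracks the sparseness of edge structure; both framing details here are illustrative rather than the study's pipeline.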
Parallel programming of saccades during natural scene viewing: evidence from eye movement positions.
Wu, Esther X W; Gilani, Syed Omer; van Boxtel, Jeroen J A; Amihai, Ido; Chua, Fook Kee; Yen, Shih-Cheng
2013-10-24
Previous studies have shown that saccade plans during natural scene viewing can be programmed in parallel. This evidence comes mainly from temporal indicators, i.e., fixation durations and latencies. In the current study, we asked whether eye movement positions recorded during scene viewing also reflect parallel programming of saccades. As participants viewed scenes in preparation for a memory task, their inspection of the scene was suddenly disrupted by a transition to another scene. We examined whether saccades after the transition were invariably directed immediately toward the center or were contingent on saccade onset times relative to the transition. The results, which showed a dissociation in eye movement behavior between two groups of saccades after the scene transition, supported the parallel programming account. Saccades with relatively long onset times (>100 ms) after the transition were directed immediately toward the center of the scene, probably to restart scene exploration. Saccades with short onset times (<100 ms) moved to the center only one saccade later. Our data on eye movement positions provide novel evidence of parallel programming of saccades during scene viewing. Additionally, results from the analyses of intersaccadic intervals were also consistent with the parallel programming hypothesis.
Goal-Side Selection in Soccer Penalty Kicking When Viewing Natural Scenes
Weigelt, Matthias; Memmert, Daniel
2012-01-01
The present study investigates the influence of goalkeeper displacement on goal-side selection in soccer penalty kicking. Facing a penalty situation, participants viewed photo-realistic images of a goalkeeper and a soccer goal. In the action selection task, they were asked to kick to the greater goal-side, and in the perception task, they indicated the position of the goalkeeper on the goal line. To this end, the goalkeeper was depicted in a regular goalkeeping posture, standing either in the exact middle of the goal or displaced at different distances to the left or right of the goal's center. Results showed that the goalkeeper's position on the goal line systematically affected goal-side selection, even when participants were not aware of the displacement. These findings provide further support for the notion that the implicit processing of the stimulus layout in natural scenes can affect action selection in complex environments, such as in soccer penalty shooting. PMID:22973246
Fu, Qiufang; Liu, Yong-Jin; Dienes, Zoltan; Wu, Jianhui; Chen, Wenfeng; Fu, Xiaolan
2016-07-01
A fundamental question in vision research is whether visual recognition is determined by edge-based information (e.g., edges, lines, and conjunctions) or surface-based information (e.g., color, brightness, and texture). To investigate this question, we manipulated the stimulus onset asynchrony (SOA) between the scene and the mask in a backward masking task of natural scene categorization. The behavioral results showed that correct classification was higher for line-drawings than for color photographs when the SOA was 13 ms, but lower when the SOA was longer. The ERP results revealed that most latencies of early components were shorter for the line-drawings than for the color photographs, and the latencies gradually increased with the SOA for the color photographs but not for the line-drawings. The results provide new evidence that edge-based information is the primary determinant of natural scene categorization, receiving priority processing; by contrast, surface information takes longer to facilitate natural scene categorization. Copyright © 2016 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Bacon-Mace, Nadege; Kirchner, Holle; Fabre-Thorpe, Michele; Thorpe, Simon J.
2007-01-01
Using manual responses, human participants are remarkably fast and accurate at deciding if a natural scene contains an animal, but recent data show that they are even faster to indicate with saccadic eye movements which of 2 scenes contains an animal. How could it be that 2 images can apparently be processed faster than a single image? To better…
Marin, Manuela M; Lampatz, Allegra; Wandl, Michaela; Leder, Helmut
2016-01-01
In his seminal book on esthetics, Berlyne (1971) posited an inverted-U relationship between complexity and hedonic tone in arts appreciation; however, converging evidence for his theory is still missing. The disregard of the multidimensionality of complexity may explain some of the divergent results. Here, we argue that definitions of hedonic tone are manifold, and we systematically examine whether the nature of the relationship between complexity and hedonic tone is determined by the specific measure of hedonic tone. In Experiment 1, we studied three picture categories with similar affective and semantic contents: 96 affective environmental scenes, which were also converted into 96 cartoons, and 96 representational paintings. Complexity varied along the dimension of elements. In a between-subjects design, each stimulus was presented for 5 s to 206 female participants. Subjective ratings of hedonic tone (either beauty, pleasantness, or liking), arousal, complexity, and familiarity were collected in three conditions per stimulus set. Complexity and arousal were positively associated in all conditions, with the strongest association observed for paintings. For environmental scenes and cartoons, there was no significant association between complexity and hedonic tone, and the three measures of hedonic tone were highly correlated (all rs > 0.85). As predicted, in paintings the measures of hedonic tone were less strongly correlated (all rs > 0.73), and when controlling for familiarity, the association with complexity was significantly positive for beauty (rs = 0.26), weakly negative for pleasantness (rs = -0.16), and not present for liking. Experiment 2 followed a similar approach: 77 female participants, all non-musicians, rated 92 musical excerpts (15 s) in three conditions of hedonic tone (either beauty, pleasantness, or liking). Results indicated a strong relationship between complexity and arousal (all rs > 0.85). When controlling for familiarity effects, the relationship between complexity and beauty followed an inverted-U curve, whereas the relationship between complexity and pleasantness was negative (rs = -0.26) and the one between complexity and liking positive (rs = 0.29). We relate our results to Berlyne's theory and the latest findings in neuroaesthetics, proposing that future studies need to acknowledge the multifaceted nature of hedonic tone in esthetic experiences of artforms.
Manipulating the content of dynamic natural scenes to characterize response in human MT/MST.
Durant, Szonya; Wall, Matthew B; Zanker, Johannes M
2011-09-09
Optic flow is one of the most important sources of information for enabling human navigation through the world. A striking finding from single-cell studies in monkeys is the rapid saturation of response of MT/MST areas with the density of optic flow type motion information. These results are reflected psychophysically in human perception in the saturation of motion aftereffects. We began by comparing responses to natural optic flow scenes in human visual brain areas to responses to the same scenes with inverted contrast (photo negative). This changes scene familiarity while preserving local motion signals. This manipulation had no effect; however, the response was only correlated with the density of local motion (calculated by a motion correlation model) in V1, not in MT/MST. To further investigate this, we manipulated the visible proportion of natural dynamic scenes and found that areas MT and MST did not increase in response over a 16-fold increase in the amount of information presented, i.e., response had saturated. This makes sense in light of the sparseness of motion information in natural scenes, suggesting that the human brain is well adapted to exploit a small amount of dynamic signal and extract information important for survival.
The effects of views of nature on autonomic control.
Gladwell, V F; Brown, D K; Barton, J L; Tarvainen, M P; Kuoppa, P; Pretty, J; Suddaby, J M; Sandercock, G R H
2012-09-01
Previous studies have shown that nature improves mood and self-esteem and reduces blood pressure. Walking within a natural environment has been suggested to alter autonomic nervous system control, but the mechanisms are not fully understood. Heart rate variability (HRV) is a non-invasive method of assessing autonomic control and can give an insight into vagal modulation. Our hypothesis was that viewing nature alone within a controlled laboratory environment would induce higher levels of HRV than viewing built scenes. Heart rate (HR) and blood pressure (BP) were measured while participants viewed different scenes in a controlled environment. HRV was used to investigate alterations in autonomic activity, specifically parasympathetic activity. Each participant lay in the semi-supine position in a laboratory while we recorded 5 min (n = 29) of ECG, BP, and respiration as they viewed two collections of slides (one containing nature views and the other built scenes). During viewing of nature, markers of parasympathetic activity were increased in both studies. The root mean square of successive differences increased by 4.2 ± 7.7 ms (t = 2.9, p = 0.008) and the natural logarithm of high-frequency power increased by 0.19 ± 0.36 ms² Hz⁻¹ (t = 2.9, p = 0.007) compared to built scenes. Mean HR and BP were not significantly altered. This study provides evidence that autonomic control of the heart is altered by the simple act of viewing natural scenes, with an increase in vagal activity.
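The root mean square of successive differences (RMSSD) reported above is one of the simplest time-domain HRV markers of vagal activity. A minimal sketch of its computation from a series of RR intervals (the example values are illustrative, not data from the study):

```python
import numpy as np

def rmssd(rr_ms):
    """Root mean square of successive differences of RR intervals
    (in ms): a standard time-domain marker of vagal activity."""
    d = np.diff(np.asarray(rr_ms, dtype=float))
    return float(np.sqrt(np.mean(d ** 2)))

rr = [800, 810, 790, 805, 795, 820]  # illustrative RR intervals (ms)
value = rmssd(rr)                    # ≈ 17.03 ms
```

In practice RR series are first cleaned of ectopic beats and artifacts before RMSSD is computed; that preprocessing is omitted here.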
Scene text recognition in mobile applications by character descriptor and structure configuration.
Yi, Chucai; Tian, Yingli
2014-07-01
Text characters and strings in natural scenes can provide valuable information for many applications. Extracting text directly from natural scene images or videos is a challenging task because of diverse text patterns and variant background interference. This paper proposes a method of scene text recognition from detected text regions. In text detection, our previously proposed algorithms are applied to obtain text regions from a scene image. First, we design a discriminative character descriptor by combining several state-of-the-art feature detectors and descriptors. Second, we model the character structure of each character class by designing stroke configuration maps. Our algorithm design is compatible with the application of scene text extraction on smart mobile devices. An Android-based demo system was developed to show the effectiveness of our proposed method for extracting scene text information from nearby objects. The demo system also provides insight into algorithm design and performance improvement for scene text extraction. Evaluation results on benchmark data sets demonstrate that our proposed text recognition scheme is comparable with the best existing methods.
Torralbo, Ana; Walther, Dirk B.; Chai, Barry; Caddigan, Eamon; Fei-Fei, Li; Beck, Diane M.
2013-01-01
Within the range of images that we might categorize as a “beach”, for example, some will be more representative of that category than others. Here we first confirmed that humans could categorize “good” exemplars better than “bad” exemplars of six scene categories and then explored whether brain regions previously implicated in natural scene categorization showed a similar sensitivity to how well an image exemplifies a category. In a behavioral experiment participants were more accurate and faster at categorizing good than bad exemplars of natural scenes. In an fMRI experiment participants passively viewed blocks of good or bad exemplars from the same six categories. A multi-voxel pattern classifier trained to discriminate among category blocks showed higher decoding accuracy for good than bad exemplars in the PPA, RSC and V1. This difference in decoding accuracy cannot be explained by differences in overall BOLD signal, as average BOLD activity was either equivalent or higher for bad than good scenes in these areas. These results provide further evidence that V1, RSC and the PPA not only contain information relevant for natural scene categorization, but their activity patterns mirror the fundamentally graded nature of human categories. Analysis of the image statistics of our good and bad exemplars shows that variability in low-level features and image structure is higher among bad than good exemplars. A simulation of our neuroimaging experiment suggests that such a difference in variance could account for the observed differences in decoding accuracy. These results are consistent with both low-level models of scene categorization and models that build categories around a prototype. PMID:23555588
Emotions' Impact on Viewing Behavior under Natural Conditions
Kaspar, Kai; Hloucal, Teresa-Maria; Kriz, Jürgen; Canzler, Sonja; Gameiro, Ricardo Ramos; Krapp, Vanessa; König, Peter
2013-01-01
Human overt attention under natural conditions is guided by stimulus features as well as by higher cognitive components, such as task and emotional context. In contrast to the considerable progress regarding the former, insight into the interaction of emotions and attention is limited. Here we investigate the influence of the current emotional context on viewing behavior under natural conditions. In two eye-tracking studies participants freely viewed complex scenes embedded in sequences of emotion-laden images. The latter primes constituted specific emotional contexts for neutral target images. Viewing behavior toward target images embedded into sets of primes was affected by the current emotional context, revealing the intensity of the emotional context as a significant moderator. The primes themselves were not scanned in different ways when presented within a block (Study 1), but when presented individually, negative primes were more actively scanned than positive primes (Study 2). These divergent results suggest an interaction between emotional priming and further context factors. Additionally, in most cases primes were scanned more actively than target images. Interestingly, the mere presence of emotion-laden stimuli in a set of images of different categories slowed down viewing activity overall, but the known effect of image category was not affected. Finally, viewing behavior remained largely constant on single images as well as across the targets' post-prime positions (Study 2). We conclude that the emotional context significantly influences the exploration of complex scenes and the emotional context has to be considered in predictions of eye-movement patterns. PMID:23326353
Memory, emotion, and pupil diameter: Repetition of natural scenes.
Bradley, Margaret M; Lang, Peter J
2015-09-01
Recent studies have suggested that pupil diameter, like the "old-new" ERP, may be a measure of memory. Because the amplitude of the old-new ERP is enhanced for items encoded in the context of repetitions that are distributed (spaced), compared to massed (contiguous), we investigated whether pupil diameter is similarly sensitive to repetition. Emotional and neutral pictures of natural scenes were viewed once or repeated with massed (contiguous) or distributed (spaced) repetition during incidental free viewing and then tested on an explicit recognition test. Although an old-new difference in pupil diameter was found during successful recognition, pupil diameter was not enhanced for distributed, compared to massed, repetitions during either recognition or initial free viewing. Moreover, whereas a significant old-new difference was found for erotic scenes that had been seen only once during encoding, this difference was absent when erotic scenes were repeated. Taken together, the data suggest that pupil diameter is not a straightforward index of prior occurrence for natural scenes. © 2015 Society for Psychophysiological Research.
Kotabe, Hiroki P; Kardan, Omid; Berman, Marc G
2017-08-01
Natural environments have powerful aesthetic appeal linked to their capacity for psychological restoration. In contrast, disorderly environments are aesthetically aversive and have various detrimental psychological effects. But in our research, we have repeatedly found that natural environments are perceptually disorderly. What could explain this paradox? We present 3 competing hypotheses: the aesthetic preference for naturalness is more powerful than the aesthetic aversion to disorder (the nature-trumps-disorder hypothesis); disorder is trivial to aesthetic preference in natural contexts (the harmless-disorder hypothesis); and disorder is aesthetically preferred in natural contexts (the beneficial-disorder hypothesis). Utilizing novel methods of perceptual study and diverse stimuli, we rule in the nature-trumps-disorder hypothesis and rule out the harmless-disorder and beneficial-disorder hypotheses. In examining perceptual mechanisms, we find evidence that high-level scene semantics are both necessary and sufficient for the nature-trumps-disorder effect. Necessity is evidenced by the effect disappearing in experiments utilizing only low-level visual stimuli (i.e., where scene semantics have been removed) and experiments utilizing a rapid-scene-presentation procedure that obscures scene semantics. Sufficiency is evidenced by the effect reappearing in experiments utilizing noun stimuli, which remove low-level visual features. Furthermore, we present evidence that the interaction of scene semantics with low-level visual features amplifies the nature-trumps-disorder effect: the effect is weaker both when statistically adjusting for quantified low-level visual features and when using noun stimuli that remove them. These results have implications for psychological theories bearing on the joint influence of low- and high-level perceptual inputs on affect and cognition, as well as for aesthetic design.
(PsycINFO Database Record (c) 2017 APA, all rights reserved).
Modulation of Temporal Precision in Thalamic Population Responses to Natural Visual Stimuli
Desbordes, Gaëlle; Jin, Jianzhong; Alonso, Jose-Manuel; Stanley, Garrett B.
2010-01-01
Natural visual stimuli have highly structured spatial and temporal properties which influence the way visual information is encoded in the visual pathway. In response to natural scene stimuli, neurons in the lateral geniculate nucleus (LGN) are temporally precise, on a time scale of 10–25 ms, both within single cells and across cells within a population. This time scale, established by non-stimulus-driven elements of neuronal firing, is significantly shorter than that of natural scenes, yet is critical for the neural representation of the spatial and temporal structure of the scene. Here, a generalized linear model (GLM) that combines stimulus-driven elements with spike-history dependence associated with intrinsic cellular dynamics is shown to predict the fine timing precision of LGN responses to natural scene stimuli, the corresponding correlation structure across nearby neurons in the population, and the continuous modulation of spike timing precision and latency across neurons. A single model captured the experimentally observed neural response, across different levels of contrast and different classes of visual stimuli, through interactions between the stimulus correlation structure and the nonlinearity in spike generation and spike-history dependence. Given the sensitivity of the thalamocortical synapse to closely timed spikes and the importance of fine timing precision for the faithful representation of natural scenes, the modulation of thalamic population timing over these time scales is likely important for cortical representations of the dynamic natural visual environment. PMID:21151356
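The GLM described combines a stimulus filter with spike-history dependence through an exponential nonlinearity. A toy generative sketch of this model class (the filter shapes, time step, and parameter values are placeholder assumptions, not the authors' fitted model):

```python
import numpy as np

def simulate_glm(stim, k, h, dt=0.001, rng=None):
    """Toy GLM spike generator: the conditional intensity is
    exp(stimulus-filter drive + spike-history drive); spikes are
    drawn as Bernoulli events in bins of width dt seconds."""
    rng = rng if rng is not None else np.random.default_rng(0)
    drive = np.convolve(stim, k)[: len(stim)]  # causal stimulus filtering
    spikes = np.zeros(len(stim))
    for t in range(len(stim)):
        hist = spikes[max(0, t - len(h)):t][::-1]  # most recent bin first
        lam = np.exp(drive[t] + hist @ h[: len(hist)])
        spikes[t] = float(rng.random() < lam * dt)
    return spikes

# constant weak stimulus; suppressive history filter mimics refractoriness
stim = np.full(2000, 2.0)
k = np.array([1.0])          # instantaneous stimulus filter (assumed)
h = np.array([-5.0, -2.0])   # refractory-like history filter (assumed)
spikes = simulate_glm(stim, k, h)
```

The spike-history term is what lets such models reproduce fine spike-timing precision beyond what the stimulus filter alone predicts, which is the mechanism the abstract highlights.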
Auditory salience using natural soundscapes.
Huang, Nicholas; Elhilali, Mounya
2017-03-01
Salience describes the phenomenon by which an object stands out from a scene. While its underlying processes are extensively studied in vision, mechanisms of auditory salience remain largely unknown. Previous studies have used well-controlled auditory scenes to shed light on some of the acoustic attributes that drive the salience of sound events. Unfortunately, the use of constrained stimuli in addition to a lack of well-established benchmarks of salience judgments hampers the development of comprehensive theories of sensory-driven auditory attention. The present study explores auditory salience in a set of dynamic natural scenes. A behavioral measure of salience is collected by having human volunteers listen to two concurrent scenes and indicate continuously which one attracts their attention. By using natural scenes, the study takes a data-driven rather than experimenter-driven approach to exploring the parameters of auditory salience. The findings indicate that the space of auditory salience is multidimensional (spanning loudness, pitch, spectral shape, as well as other acoustic attributes), nonlinear and highly context-dependent. Importantly, the results indicate that contextual information about the entire scene over both short and long scales needs to be considered in order to properly account for perceptual judgments of salience.
Number of perceptually distinct surface colors in natural scenes.
Marín-Franch, Iván; Foster, David H
2010-09-30
The ability to perceptually identify distinct surfaces in natural scenes by virtue of their color depends not only on the relative frequency of surface colors but also on the probabilistic nature of observer judgments. Previous methods of estimating the number of discriminable surface colors, whether based on theoretical color gamuts or recorded from real scenes, have taken a deterministic approach. Thus, a three-dimensional representation of the gamut of colors is divided into elementary cells or points which are spaced at one discrimination-threshold unit intervals and which are then counted. In this study, information-theoretic methods were used to take into account both differing surface-color frequencies and observer response uncertainty. Spectral radiances were calculated from 50 hyperspectral images of natural scenes and were represented in a perceptually almost uniform color space. The average number of perceptually distinct surface colors was estimated as 7.3 × 10³, much smaller than that based on counting methods. This number is also much smaller than the number of distinct points in a scene that are, in principle, available for reliable identification under illuminant changes, suggesting that color constancy, or the lack of it, does not generally determine the limit on the use of color for surface identification.
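The deterministic counting approach that this study improves upon can be made concrete: quantize a perceptually (roughly) uniform color space into cells one discrimination threshold wide and count the occupied cells. A minimal sketch (the hyperspectral pipeline and threshold model are simplified assumptions):

```python
import numpy as np

def count_distinct_colors(points, threshold=1.0):
    """Quantize color coordinates (e.g., CIELAB) into cubic cells of
    one discrimination-threshold width and count occupied cells."""
    cells = np.floor(np.asarray(points, dtype=float) / threshold).astype(int)
    return len({tuple(c) for c in cells})

# the first two points fall in the same unit cell, the third does not
n = count_distinct_colors([[0.0, 0.0, 0.0],
                           [0.4, 0.0, 0.0],
                           [2.0, 0.0, 0.0]])
```

The information-theoretic estimate in the abstract differs precisely by weighting cells by how often their colors occur and by how reliably an observer can tell them apart, rather than counting every occupied cell equally.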
Xiang, Juanjuan; Simon, Jonathan; Elhilali, Mounya
2010-01-01
Processing of complex acoustic scenes depends critically on the temporal integration of sensory information as sounds evolve naturally over time. It has been previously speculated that this process is guided by both innate mechanisms of temporal processing in the auditory system, as well as top-down mechanisms of attention, and possibly other schema-based processes. In an effort to unravel the neural underpinnings of these processes and their role in scene analysis, we combine Magnetoencephalography (MEG) with behavioral measures in humans in the context of polyrhythmic tone sequences. While maintaining unchanged sensory input, we manipulate subjects’ attention to one of two competing rhythmic streams in the same sequence. The results reveal that the neural representation of the attended rhythm is significantly enhanced both in its steady-state power and spatial phase coherence relative to its unattended state, closely correlating with its perceptual detectability for each listener. Interestingly, the data reveals a differential efficiency of rhythmic rates of the order of few hertz during the streaming process, closely following known neural and behavioral measures of temporal modulation sensitivity in the auditory system. These findings establish a direct link between known temporal modulation tuning in the auditory system (particularly at the level of auditory cortex) and the temporal integration of perceptual features in a complex acoustic scene, while mediated by processes of attention. PMID:20826671
Fractal dimension and the navigational information provided by natural scenes.
Shamsyeh Zahedi, Moosarreza; Zeil, Jochen
2018-01-01
Recent work on virtual reality navigation in humans has suggested that navigational success is inversely correlated with the fractal dimension (FD) of artificial scenes. Here we investigate the generality of this claim by analysing the relationship between the fractal dimension of natural insect navigation environments and a quantitative measure of the navigational information content of natural scenes. We show that the fractal dimension of natural scenes is in general inversely proportional to the information they provide to navigating agents about heading direction, as measured by the rotational image difference function (rotIDF). The rotIDF determines the precision and accuracy with which the orientation of a reference image can be recovered or maintained and the range over which a gradient descent in image differences will find the minimum of the rotIDF, that is, the reference orientation. However, scenes with similar fractal dimension can differ significantly in the depth of the rotIDF, because FD does not discriminate between the orientations of edges, while the rotIDF is mainly affected by edge orientation parallel to the axis of rotation. We present a new equation for the rotIDF relating navigational information to quantifiable image properties such as contrast to show (1) that for any given scene the maximum value of the rotIDF (its depth) is proportional to pixel variance and (2) that FD is inversely proportional to pixel variance. This contrast dependence, together with scene differences in orientation statistics, explains why there is no strict relationship between FD and navigational information. Our experimental data and their numerical analysis corroborate these results.
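The rotational image difference function (rotIDF) used here is typically computed as the RMS pixel difference between a reference panoramic image and the same scene rotated through each azimuth; its minimum marks the reference orientation. A minimal sketch, treating panoramas as 2-D arrays whose columns wrap around in azimuth:

```python
import numpy as np

def rot_idf(pano, ref):
    """RMS pixel difference between ref and pano rotated (column-
    shifted) by every possible azimuth step; the minimum marks the
    reference orientation."""
    w = pano.shape[1]
    return np.array([
        np.sqrt(np.mean((np.roll(pano, s, axis=1) - ref) ** 2))
        for s in range(w)
    ])

rng = np.random.default_rng(1)
pano = rng.random((32, 90))   # 90 azimuth bins of 4 deg each (assumed)
idf = rot_idf(pano, pano)     # compare the panorama against itself
```

Consistent with point (1) in the abstract, the depth of this function (its maximum minus its minimum) grows with pixel variance: a low-contrast panorama yields a shallow rotIDF and thus weaker heading information.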
NASA Astrophysics Data System (ADS)
Keane, Tommy P.; Cahill, Nathan D.; Tarduno, John A.; Jacobs, Robert A.; Pelz, Jeff B.
2014-02-01
Mobile eye-tracking provides the fairly unique opportunity to record and elucidate cognition in action. In our research, we are searching for patterns in, and distinctions between, the visual-search performance of experts and novices in the geosciences. Traveling to regions resulting from various geological processes as part of an introductory field studies course in geology, we record the prima facie gaze patterns of experts and novices when they are asked to determine the modes of geological activity that have formed the scene-view presented to them. Recording eye video and scene video in natural settings generates complex imagery that requires advanced applications of computer vision research to generate registrations and mappings between the views of separate observers. By developing such mappings, we can place many observers into a single mathematical space where we can spatio-temporally analyze inter- and intra-subject fixations, saccades, and head motions. While working towards perfecting these mappings, we developed an updated experiment setup that allowed us to statistically analyze intra-subject eye-movement events without the need for a common domain. Through such analyses we are finding statistical differences between novices and experts in these visual-search tasks. In the course of this research we have developed a unified, open-source, software framework for processing, visualization, and interaction of mobile eye-tracking and high-resolution panoramic imagery.
Detecting natural occlusion boundaries using local cues
DiMattina, Christopher; Fox, Sean A.; Lewicki, Michael S.
2012-01-01
Occlusion boundaries and junctions provide important cues for inferring three-dimensional scene organization from two-dimensional images. Although several investigators in machine vision have developed algorithms for detecting occlusions and other edges in natural images, relatively few psychophysics or neurophysiology studies have investigated what features are used by the visual system to detect natural occlusions. In this study, we addressed this question using a psychophysical experiment where subjects discriminated image patches containing occlusions from patches containing surfaces. Image patches were drawn from a novel occlusion database containing labeled occlusion boundaries and textured surfaces in a variety of natural scenes. Consistent with related previous work, we found that relatively large image patches were needed to attain reliable performance, suggesting that human subjects integrate complex information over a large spatial region to detect natural occlusions. By defining machine observers using a set of previously studied features measured from natural occlusions and surfaces, we demonstrate that simple features defined at the spatial scale of the image patch are insufficient to account for human performance in the task. To define machine observers using a more biologically plausible multiscale feature set, we trained standard linear and neural network classifiers on the rectified outputs of a Gabor filter bank applied to the image patches. We found that simple linear classifiers could not match human performance, while a neural network classifier combining filter information across location and spatial scale compared well. These results demonstrate the importance of combining a variety of cues defined at multiple spatial scales for detecting natural occlusions. PMID:23255731
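The biologically plausible feature set described above, rectified outputs of a multiscale Gabor filter bank applied to image patches, can be sketched as follows. The kernel sizes, wavelengths, orientations, and position pooling are illustrative assumptions, not the authors' exact bank:

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(size, wavelength, theta, sigma):
    """Odd-phase Gabor kernel at orientation theta (radians)."""
    ax = np.arange(size) - size // 2
    x, y = np.meshgrid(ax, ax)
    xr = x * np.cos(theta) + y * np.sin(theta)
    return (np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
            * np.sin(2.0 * np.pi * xr / wavelength))

def rectified_gabor_features(patch, wavelengths=(4, 8), n_orient=4):
    """Half-wave-rectified Gabor responses pooled over position:
    two features (on/off) per scale and orientation."""
    feats = []
    for lam in wavelengths:
        for i in range(n_orient):
            k = gabor_kernel(4 * lam + 1, lam, np.pi * i / n_orient, lam / 2)
            r = fftconvolve(patch, k, mode="valid")
            feats += [np.maximum(r, 0).mean(), np.maximum(-r, 0).mean()]
    return np.array(feats)

rng = np.random.default_rng(0)
patch = rng.random((64, 64))
feats = rectified_gabor_features(patch)  # 2 scales x 4 orientations x 2 signs
```

A linear classifier on such a pooled vector discards where in the patch each response occurred, which is one plausible reading of why the abstract's linear observers fell short of the network that combined filter information across location and scale.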
Guidance of attention to objects and locations by long-term memory of natural scenes.
Becker, Mark W; Rasmussen, Ian P
2008-11-01
Four flicker change-detection experiments demonstrate that scene-specific long-term memory guides attention to both behaviorally relevant locations and objects within a familiar scene. Participants performed an initial block of change-detection trials, detecting the addition of an object to a natural scene. After a 30-min delay, participants performed an unanticipated 2nd block of trials. When the same scene occurred in the 2nd block, the change within the scene was (a) identical to the original change, (b) a new object appearing in the original change location, (c) the same object appearing in a new location, or (d) a new object appearing in a new location. Results suggest that attention is rapidly allocated to previously relevant locations and then to previously relevant objects. This pattern of locations dominating objects remained when object identity information was made more salient. Eye tracking verified that scene memory results in more direct scan paths to previously relevant locations and objects. This contextual guidance suggests that a high-capacity long-term memory for scenes is used to insure that limited attentional capacity is allocated efficiently rather than being squandered.
On the Encoding of Panoramic Visual Scenes in Navigating Wood Ants.
Buehlmann, Cornelia; Woodgate, Joseph L; Collett, Thomas S
2016-08-08
A natural visual panorama is a complex stimulus formed of many component shapes. It gives an animal a sense of place and supplies guiding signals for controlling the animal's direction of travel [1]. Insects with their economical neural processing [2] are good subjects for analyzing the encoding and memory of such scenes [3-5]. Honeybees [6] and ants [7, 8] foraging from their nest can follow habitual routes guided only by visual cues within a natural panorama. Here, we analyze the headings that ants adopt when a familiar panorama composed of two or three shapes is manipulated by removing a shape or by replacing training shapes with unfamiliar ones. We show that (1) ants recognize a component shape not only through its particular visual features, but also by its spatial relation to other shapes in the scene, and that (2) each segmented shape [9] contributes its own directional signal to generating the ant's chosen heading. We found earlier that ants trained to a feeder placed to one side of a single shape [10] and tested with shapes of different widths learn the retinal position of the training shape's center of mass (CoM) [11, 12] when heading toward the feeder. They then guide themselves by placing the shape's CoM in the remembered retinal position [10]. This use of CoM in a one-shape panorama combined with the results here suggests that the ants' memory of a multi-shape panorama comprises the retinal positions of the horizontal CoMs of each major component shape within the scene, bolstered by local descriptors of that shape. Copyright © 2016 Elsevier Ltd. All rights reserved.
Automatic temperature computation for realistic IR simulation
NASA Astrophysics Data System (ADS)
Le Goff, Alain; Kersaudy, Philippe; Latger, Jean; Cathala, Thierry; Stolte, Nilo; Barillot, Philippe
2000-07-01
Polygon temperature computation in 3D virtual scenes is fundamental for IR image simulation. This article describes in detail the temperature calculation software and its current extensions, briefly presented in [1]. This software, called MURET, is used by the simulation workshop CHORALE of the French DGA. MURET is a one-dimensional thermal software package that accurately takes into account the material thermal attributes of the three-dimensional scene and the variation of the environment characteristics (atmosphere) as a function of time. Concerning the environment, absorbed incident fluxes are computed wavelength by wavelength, every half hour, during the 24 hours preceding the time of the simulation. For each polygon, incident fluxes are composed of direct solar fluxes and sky illumination (including diffuse solar fluxes). Concerning the materials, classical thermal attributes, such as conductivity, absorption, spectral emissivity, density, specific heat, thickness, and convection coefficients, are associated with each of several layers. In the future, MURET will be able to simulate permeable natural materials (water influence) and natural vegetation (woods). This model of thermal attributes yields a very accurate polygon temperature computation for the complex 3D databases often found in CHORALE simulations. The kernel of MURET consists of an efficient ray tracer, which computes the history (over 24 hours) of the shadowed parts of the 3D scene, and a library responsible for the thermal computations. The main originality concerns the way the heating fluxes are computed: using ray tracing, the flux received at each 3D point of the scene accurately accounts for the masking (hidden surfaces) between objects. In addition, this library supplies other thermal modules, such as a thermal shadows computation tool.
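The kind of one-dimensional flux balance that drives such a temperature history can be sketched as below. This is a generic single-layer model with invented material constants, not MURET's actual formulation.

```python
import numpy as np

# Hypothetical parameters (illustrative, not MURET's values): one opaque layer.
rho, c, thickness = 2400.0, 880.0, 0.10   # density, specific heat, thickness
h = 15.0                                   # convection coefficient (W/m^2/K)
eps, sigma = 0.9, 5.67e-8                  # emissivity, Stefan-Boltzmann const
T_air = 293.0                              # ambient air temperature (K)

def step(T, absorbed_flux, dt=60.0):
    """Explicit Euler update of surface temperature from the flux balance:
    absorbed flux minus convective loss minus radiative loss."""
    net = absorbed_flux - h * (T - T_air) - eps * sigma * T**4
    return T + dt * net / (rho * c * thickness)

# Simulate 24 h with a half-hourly absorbed-flux history, as the text describes.
t = np.arange(0, 24 * 3600, 1800.0)
flux = np.maximum(0.0, 600.0 * np.sin(2 * np.pi * (t / 86400.0 - 0.25)))
T = 280.0
for q in flux:
    for _ in range(30):            # 30 sub-steps of 60 s per half hour
        T = step(T, q)
```

In practice each half-hourly absorbed flux (direct solar, sky illumination) would come from the ray tracer's shadow history rather than the sine profile assumed here.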
Three-dimensional scene encryption and display based on computer-generated holograms.
Kong, Dezhao; Cao, Liangcai; Jin, Guofan; Javidi, Bahram
2016-10-10
An optical encryption and display method for a three-dimensional (3D) scene is proposed based on computer-generated holograms (CGHs) using a single phase-only spatial light modulator. The 3D scene is encoded as one complex Fourier CGH. The Fourier CGH is then decomposed into two phase-only CGHs with random distributions by the vector stochastic decomposition algorithm. The two CGHs are interleaved as one final phase-only CGH for optical encryption and reconstruction. The proposed method can support high-level nonlinear optical 3D scene security and complex amplitude modulation of the optical field. The exclusive phase key offers strong resistance to decryption attacks. Experimental results demonstrate the validity of the method.
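Splitting one complex hologram into two phase-only holograms rests on a two-phasor identity, sketched below in isolation. The paper's vector stochastic decomposition additionally randomizes the split; the function name and scaling here are assumptions for the example.

```python
import numpy as np

def decompose_phase_only(H):
    """Split a complex hologram H into two unit-amplitude (phase-only)
    holograms H1, H2 plus a scale s such that s * (H1 + H2) == H.
    Identity: any phasor A*exp(i*phi) with A <= 2 equals
    exp(i*(phi+d)) + exp(i*(phi-d)) where d = arccos(A / 2)."""
    s = np.abs(H).max() / 2.0            # scale so every amplitude is <= 2
    Hn = H / s
    A, phi = np.abs(Hn), np.angle(Hn)
    d = np.arccos(np.clip(A / 2.0, 0.0, 1.0))
    return np.exp(1j * (phi + d)), np.exp(1j * (phi - d)), s
```

Either phase-only component alone reveals nothing about the amplitude structure, which is why the second CGH can serve as a phase key.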
The influence of color on emotional perception of natural scenes.
Codispoti, Maurizio; De Cesarei, Andrea; Ferrari, Vera
2012-01-01
Is color a critical factor when processing the emotional content of natural scenes? Under challenging perceptual conditions, such as when pictures are briefly presented, color might facilitate scene segmentation and/or function as a semantic cue via association with scene-relevant concepts (e.g., red and blood/injury). To clarify the influence of color on affective picture perception, we compared the late positive potentials (LPP) to color versus grayscale pictures, presented for very brief (24 ms) and longer (6 s) exposure durations. Results indicated that removing color information had no effect on the affective modulation of the LPP, regardless of exposure duration. These findings imply that the recognition of the emotional content of scenes, even when presented very briefly, does not critically rely on color information. Copyright © 2011 Society for Psychophysiological Research.
ERIC Educational Resources Information Center
Kim, Nanyoung
2009-01-01
In this article, the author describes an underwater scene composition for elementary-education majors. This project deals with watercolor with crayon or oil-pastel resist (medium); the beauty of nature represented by fish in the underwater scene (theme); texture and pattern (design elements); drawing simple forms (drawing skill); and composition…
Deconstructing Visual Scenes in Cortex: Gradients of Object and Spatial Layout Information
Kravitz, Dwight J.; Baker, Chris I.
2013-01-01
Real-world visual scenes are complex, cluttered, and heterogeneous stimuli that engage scene- and object-selective cortical regions including parahippocampal place area (PPA), retrosplenial complex (RSC), and lateral occipital complex (LOC). To understand the unique contribution of each region to distributed scene representations, we generated predictions based on a neuroanatomical framework adapted from monkey and tested them using minimal scenes in which we independently manipulated both spatial layout (open, closed, and gradient) and object content (furniture, e.g., bed, dresser). Commensurate with its strong connectivity with posterior parietal cortex, RSC evidenced strong spatial layout information but no object information, and its response was not even modulated by object presence. In contrast, LOC, which lies within the ventral visual pathway, contained strong object information but no background information. Finally, PPA, which is connected with both the dorsal and the ventral visual pathway, showed information about both objects and spatial backgrounds and was sensitive to the presence or absence of either. These results suggest that 1) LOC, PPA, and RSC have distinct representations, emphasizing different aspects of scenes, 2) the specific representations in each region are predictable from their patterns of connectivity, and 3) PPA combines both spatial layout and object information as predicted by connectivity. PMID:22473894
Eye Movements and Visual Memory for Scenes
2005-01-01
Scene memory research has demonstrated that the memory representation of a semantically inconsistent object in a scene is more detailed and/or complete... memory during scene viewing, then changes to semantically inconsistent objects (which should be represented more com- pletely) should be detected more... semantic description. Due to the surprise nature of the visual memory test, any learning that occurred during the search portion of the experiment was
Anticipatory scene representation in preschool children's recall and recognition memory.
Kreindel, Erica; Intraub, Helene
2017-09-01
Behavioral and neuroscience research on boundary extension (false memory beyond the edges of a view of a scene) has provided new insights into the constructive nature of scene representation, and motivates questions about development. Early research with children (as young as 6-7 years) was consistent with boundary extension, but relied on an analysis of spatial errors in drawings which are open to alternative explanations (e.g. drawing ability). Experiment 1 replicated and extended prior drawing results with 4-5-year-olds and adults. In Experiment 2, a new, forced-choice immediate recognition memory test was implemented with the same children. On each trial, a card (photograph of a simple scene) was immediately replaced by a test card (identical view and either a closer or more wide-angle view) and participants indicated which one matched the original view. Error patterns supported boundary extension; identical photographs were more frequently rejected when the closer view was the original view, than vice versa. This asymmetry was not attributable to a selection bias (guessing tasks; Experiments 3-5). In Experiment 4, working memory load was increased by presenting more expansive views of more complex scenes. Again, children exhibited boundary extension, but now adults did not, unless stimulus duration was reduced to 5 s (limiting time to implement strategies; Experiment 5). We propose that like adults, children interpret photographs as views of places in the world; they extrapolate the anticipated continuation of the scene beyond the view and misattribute it to having been seen. Developmental differences in source attribution decision processes provide an explanation for the age-related differences observed. © 2016 John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Graham, James; Ternovskiy, Igor V.
2013-06-01
We applied a two-stage unsupervised hierarchical learning system to model complex dynamic surveillance and cyber space monitoring systems using a non-commercial version of the NeoAxis visualization software. The hierarchical scene learning and recognition approach is based on hierarchical expectation maximization, and was linked to a 3D graphics engine for validation of learning and classification results and understanding the human-autonomous system relationship. Scene recognition is performed by taking synthetically generated data and feeding it to a dynamic logic algorithm. The algorithm performs hierarchical recognition of the scene by first examining the features of the objects to determine which objects are present, and then determines the scene based on the objects present. This paper presents a framework within which low-level data linked to higher-level visualization can provide support to a human operator and be evaluated in a detailed and systematic way.
Scene perception in posterior cortical atrophy: categorization, description and fixation patterns.
Shakespeare, Timothy J; Yong, Keir X X; Frost, Chris; Kim, Lois G; Warrington, Elizabeth K; Crutch, Sebastian J
2013-01-01
Partial or complete Balint's syndrome is a core feature of the clinico-radiological syndrome of posterior cortical atrophy (PCA), in which individuals experience a progressive deterioration of cortical vision. Although multi-object arrays are frequently used to detect simultanagnosia in the clinical assessment and diagnosis of PCA, to date there have been no group studies of scene perception in patients with the syndrome. The current study involved three linked experiments conducted in PCA patients and healthy controls. Experiment 1 evaluated the accuracy and latency of complex scene perception relative to individual faces and objects (color and grayscale) using a categorization paradigm. PCA patients were both less accurate (faces < scenes < objects) and slower (scenes < objects < faces) than controls on all categories, with performance strongly associated with their level of basic visual processing impairment; patients also showed a small advantage for color over grayscale stimuli. Experiment 2 involved free description of real world scenes. PCA patients generated fewer features and more misperceptions than controls, though perceptual errors were always consistent with the patient's global understanding of the scene (whether correct or not). Experiment 3 used eye tracking measures to compare patient and control eye movements over initial and subsequent fixations of scenes. Patients' fixation patterns were significantly different to those of young and age-matched controls, with comparable group differences for both initial and subsequent fixations. Overall, these findings describe the variability in everyday scene perception exhibited by individuals with PCA, and indicate the importance of exposure duration in the perception of complex scenes.
Cortical Representations of Speech in a Multitalker Auditory Scene.
Puvvada, Krishna C; Simon, Jonathan Z
2017-09-20
The ability to parse a complex auditory scene into perceptual objects is facilitated by a hierarchical auditory system. Successive stages in the hierarchy transform an auditory scene of multiple overlapping sources, from peripheral tonotopically based representations in the auditory nerve, into perceptually distinct auditory-object-based representations in the auditory cortex. Here, using magnetoencephalography recordings from men and women, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in distinct hierarchical stages of the auditory cortex. Using systems-theoretic methods of stimulus reconstruction, we show that the primary-like areas in the auditory cortex contain dominantly spectrotemporal-based representations of the entire auditory scene. Here, both attended and ignored speech streams are represented with almost equal fidelity, and a global representation of the full auditory scene with all its streams is a better candidate neural representation than that of individual streams being represented separately. We also show that higher-order auditory cortical areas, by contrast, represent the attended stream separately and with significantly higher fidelity than unattended streams. Furthermore, the unattended background streams are more faithfully represented as a single unsegregated background object rather than as separated objects. Together, these findings demonstrate the progression of the representations and processing of a complex acoustic scene up through the hierarchy of the human auditory cortex. SIGNIFICANCE STATEMENT Using magnetoencephalography recordings from human listeners in a simulated cocktail party environment, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in separate hierarchical stages of the auditory cortex. 
We show that the primary-like areas in the auditory cortex use a dominantly spectrotemporal-based representation of the entire auditory scene, with both attended and unattended speech streams represented with almost equal fidelity. We also show that higher-order auditory cortical areas, by contrast, represent an attended speech stream separately from, and with significantly higher fidelity than, unattended speech streams. Furthermore, the unattended background streams are represented as a single undivided background object rather than as distinct background objects. Copyright © 2017 the authors.
Victor, Jonathan D; Mechler, Ferenc; Ohiorhenuan, Ifije; Schmid, Anita M; Purpura, Keith P
2009-12-01
A full understanding of the computations performed in primary visual cortex is an important yet elusive goal. Receptive field models consisting of cascades of linear filters and static nonlinearities may be adequate to account for responses to simple stimuli such as gratings and random checkerboards, but their predictions of responses to complex stimuli such as natural scenes are only approximately correct. It is unclear whether these discrepancies are limited to quantitative inaccuracies that reflect well-recognized mechanisms such as response normalization, gain controls, and cross-orientation suppression or, alternatively, imply additional qualitative features of the underlying computations. To address this question, we examined responses of V1 and V2 neurons in the monkey and area 17 neurons in the cat to two-dimensional Hermite functions (TDHs). TDHs are intermediate in complexity between traditional analytic stimuli and natural scenes and have mathematical properties that facilitate their use to test candidate models. By exploiting these properties, along with the laminar organization of V1, we identify qualitative aspects of neural computations beyond those anticipated from the above-cited model framework. Specifically, we find that V1 neurons receive signals from orientation-selective mechanisms that are highly nonlinear: they are sensitive to phase correlations, not just spatial frequency content. That is, the behavior of V1 neurons departs from that of linear-nonlinear cascades with standard modulatory mechanisms in a qualitative manner: even relatively simple stimuli evoke responses that imply complex spatial nonlinearities. The presence of these findings in the input layers suggests that these nonlinearities act in a feedback fashion.
Research in interactive scene analysis
NASA Technical Reports Server (NTRS)
Tenenbaum, J. M.; Garvey, T. D.; Weyl, S. A.; Wolf, H. C.
1975-01-01
An interactive scene interpretation system (ISIS) was developed as a tool for constructing and experimenting with man-machine and automatic scene analysis methods tailored for particular image domains. A recently developed region analysis subsystem based on the paradigm of Brice and Fennema is described. Using this subsystem a series of experiments was conducted to determine good criteria for initially partitioning a scene into atomic regions and for merging these regions into a final partition of the scene along object boundaries. Semantic (problem-dependent) knowledge is essential for complete, correct partitions of complex real-world scenes. An interactive approach to semantic scene segmentation was developed and demonstrated on both landscape and indoor scenes. This approach provides a reasonable methodology for segmenting scenes that cannot be processed completely automatically, and is a promising basis for a future automatic system. A program is described that can automatically generate strategies for finding specific objects in a scene based on manually designated pictorial examples.
Superordinate Level Processing Has Priority Over Basic-Level Processing in Scene Gist Recognition
Sun, Qi; Zheng, Yang; Sun, Mingxia; Zheng, Yuanjie
2016-01-01
By combining a perceptual discrimination task and a visuospatial working memory task, the present study examined the effects of visuospatial working memory load on the hierarchical processing of scene gist. In the perceptual discrimination task, two scene images from the same (manmade–manmade pairing or natural–natural pairing) or different superordinate level categories (manmade–natural pairing) were presented simultaneously, and participants were asked to judge whether these two images belonged to the same basic-level category (e.g., street–street pairing) or not (e.g., street–highway pairing). In the concurrent working memory task, spatial load (position-based load in Experiment 1) and object load (figure-based load in Experiment 2) were manipulated. The results were as follows: (a) spatial load and object load have stronger effects on discrimination of same basic-level scene pairings than same superordinate level scene pairings; (b) spatial load has a larger impact on the discrimination of scene pairings at early stages than at later stages; on the contrary, object information has a larger influence at later stages than at early stages. It follows that superordinate level processing has priority over basic-level processing in scene gist recognition, and that spatial information contributes to the earlier and object information to the later stages of scene gist recognition. PMID:28382195
Metabolic Mapping of the Brain's Response to Visual Stimulation: Studies in Humans.
ERIC Educational Resources Information Center
Phelps, Michael E.; Kuhl, David E.
1981-01-01
Studies demonstrate increasing glucose metabolic rates in human primary (PVC) and association (AVC) visual cortex as the complexity of visual scenes increases. AVC activity increased more rapidly with scene complexity than PVC activity and rose above the local metabolic activity of control subjects with eyes closed, indicating the wide range and metabolic reserve of visual…
CB Database: A change blindness database for objects in natural indoor scenes.
Sareen, Preeti; Ehinger, Krista A; Wolfe, Jeremy M
2016-12-01
Change blindness has been a topic of interest in cognitive sciences for decades. Change detection experiments are frequently used for studying various research topics such as attention and perception. However, creating change detection stimuli is tedious and there is no open repository of such stimuli using natural scenes. We introduce the Change Blindness (CB) Database with object changes in 130 colored images of natural indoor scenes. Size and eccentricity are provided for all the changes, as well as reaction time data from a baseline experiment. In addition, we have two specialized satellite databases that are subsets of the 130 images. In one set, changes are seen in rooms or in mirrors in those rooms (Mirror Change Database). In the other, changes occur in a room or out a window (Window Change Database). Both sets have controlled background, change size, and eccentricity. The CB Database is intended to provide researchers with a stimulus set of natural scenes with defined stimulus parameters that can be used for a wide range of experiments. The CB Database can be found at http://search.bwh.harvard.edu/new/CBDatabase.html .
Color constancy in natural scenes explained by global image statistics
Foster, David H.; Amano, Kinjiro; Nascimento, Sérgio M. C.
2007-01-01
To what extent do observers' judgments of surface color with natural scenes depend on global image statistics? To address this question, a psychophysical experiment was performed in which images of natural scenes under two successive daylights were presented on a computer-controlled high-resolution color monitor. Observers reported whether there was a change in reflectance of a test surface in the scene. The scenes were obtained with a hyperspectral imaging system and included variously trees, shrubs, grasses, ferns, flowers, rocks, and buildings. Discrimination performance, quantified on a scale of 0 to 1 with a color-constancy index, varied from 0.69 to 0.97 over 21 scenes and two illuminant changes, from a correlated color temperature of 25,000 K to 6700 K and from 4000 K to 6700 K. The best account of these effects was provided by receptor-based rather than colorimetric properties of the images. Thus, in a linear regression, 43% of the variance in constancy index was explained by the log of the mean relative deviation in spatial cone-excitation ratios evaluated globally across the two images of a scene. A further 20% was explained by including the mean chroma of the first image and its difference from that of the second image and a further 7% by the mean difference in hue. Together, all four global color properties accounted for 70% of the variance and provided a good fit to the effects of scene and of illuminant change on color constancy, and, additionally, of changing test-surface position. By contrast, a spatial-frequency analysis of the images showed that the gradient of the luminance amplitude spectrum accounted for only 5% of the variance. PMID:16961965
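How such variance-explained figures are obtained can be sketched with ordinary least squares on synthetic stand-ins for the four global color statistics. All numbers below are simulated for illustration; none of the study's measurements are reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 21 * 2  # 21 scenes x 2 illuminant changes, as in the study

# Hypothetical global image statistics (synthetic stand-ins):
X = np.column_stack([
    rng.normal(size=n),  # log mean relative deviation in cone-excitation ratios
    rng.normal(size=n),  # mean chroma of the first image
    rng.normal(size=n),  # chroma difference between the two images
    rng.normal(size=n),  # mean hue difference
])
beta_true = np.array([-0.10, 0.03, -0.02, -0.01])   # invented effect sizes
y = 0.85 + X @ beta_true + rng.normal(scale=0.02, size=n)  # constancy index

# Ordinary least squares with an intercept, and the variance explained (R^2)
Xd = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
yhat = Xd @ beta
r2 = 1.0 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)
```

Adding predictors one at a time, as the abstract describes (43%, then +20%, +7%), corresponds to tracking the increase in this R² as columns are appended to `X`.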
Spatial Correlations in Natural Scenes Modulate Response Reliability in Mouse Visual Cortex
Rikhye, Rajeev V.
2015-01-01
Intrinsic neuronal variability significantly limits information encoding in the primary visual cortex (V1). Certain stimuli can suppress this intertrial variability to increase the reliability of neuronal responses. In particular, responses to natural scenes, which have broadband spatiotemporal statistics, are more reliable than responses to stimuli such as gratings. However, very little is known about which stimulus statistics modulate reliable coding and how this occurs at the neural ensemble level. Here, we sought to elucidate the role that spatial correlations in natural scenes play in reliable coding. We developed a novel noise-masking method to systematically alter spatial correlations in natural movies, without altering their edge structure. Using high-speed two-photon calcium imaging in vivo, we found that responses in mouse V1 were much less reliable at both the single neuron and population level when spatial correlations were removed from the image. This change in reliability was due to a reorganization of between-neuron correlations. Strongly correlated neurons formed ensembles that reliably and accurately encoded visual stimuli, whereas reducing spatial correlations reduced the activation of these ensembles, leading to an unreliable code. Together with an ensemble-specific normalization model, these results suggest that the coordinated activation of specific subsets of neurons underlies the reliable coding of natural scenes. SIGNIFICANCE STATEMENT The natural environment is rich with information. To process this information with high fidelity, V1 neurons have to be robust to noise and, consequentially, must generate responses that are reliable from trial to trial. While several studies have hinted that both stimulus attributes and population coding may reduce noise, the details remain unclear. Specifically, what features of natural scenes are important and how do they modulate reliability? 
This study is the first to investigate the role of spatial correlations, which are a fundamental attribute of natural scenes, in shaping stimulus coding by V1 neurons. Our results provide new insights into how stimulus spatial correlations reorganize the correlated activation of specific ensembles of neurons to ensure accurate information processing in V1. PMID:26511254
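A minimal illustration of why natural scenes carry strong spatial correlations, and how manipulating the spectrum changes them, assuming the well-known roughly 1/f amplitude spectrum of natural images. This is not the paper's noise-masking procedure, which alters correlations while preserving edge structure.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 128

def neighbor_corr(img):
    """Correlation between horizontally adjacent pixels."""
    a, b = img[:, :-1].ravel(), img[:, 1:].ravel()
    return np.corrcoef(a, b)[0, 1]

# Spatially uncorrelated field: adjacent pixels are independent.
white = rng.normal(size=(N, N))

# Impose a 1/f amplitude spectrum (typical of natural scenes) on the same
# phases, which introduces long-range spatial correlations.
fy = np.fft.fftfreq(N)[:, None]
fx = np.fft.fftfreq(N)[None, :]
f = np.sqrt(fx**2 + fy**2)
f[0, 0] = 1.0  # avoid division by zero at the DC component
pink = np.real(np.fft.ifft2(np.fft.fft2(white) / f))
```

Here `pink` has strongly correlated neighboring pixels while `white` does not, even though both contain the same phase structure; removing those correlations is, in spirit, the inverse of this operation.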
A systematic comparison between visual cues for boundary detection.
Mély, David A; Kim, Junkyung; McGill, Mason; Guo, Yuliang; Serre, Thomas
2016-03-01
The detection of object boundaries is a critical first step for many visual processing tasks. Multiple cues (we consider luminance, color, motion and binocular disparity) available in the early visual system may signal object boundaries but little is known about their relative diagnosticity and how to optimally combine them for boundary detection. This study thus aims at understanding how early visual processes inform boundary detection in natural scenes. We collected color binocular video sequences of natural scenes to construct a video database. Each scene was annotated with two full sets of ground-truth contours (one set limited to object boundaries and another set which included all edges). We implemented an integrated computational model of early vision that spans all considered cues, and then assessed their diagnosticity by training machine learning classifiers on individual channels. Color and luminance were found to be most diagnostic while stereo and motion were least. Combining all cues yielded a significant improvement in accuracy beyond that of any cue in isolation. Furthermore, the accuracy of individual cues was found to be a poor predictor of their unique contribution for the combination. This result suggested a complex interaction between cues, which we further quantified using regularization techniques. Our systematic assessment of the accuracy of early vision models for boundary detection together with the resulting annotated video dataset should provide a useful benchmark towards the development of higher-level models of visual processing. Copyright © 2016 Elsevier Ltd. All rights reserved.
Keefe, Bruce D; Wincenciak, Joanna; Jellema, Tjeerd; Ward, James W; Barraclough, Nick E
2016-07-01
When observing another individual's actions, we can both recognize their actions and infer their beliefs concerning the physical and social environment. The extent to which visual adaptation influences action recognition and conceptually later stages of processing involved in deriving the belief state of the actor remains unknown. To explore this we used virtual reality (life-size photorealistic actors presented in stereoscopic three dimensions) to see how visual adaptation influences the perception of individuals in naturally unfolding social scenes at increasingly higher levels of action understanding. We presented scenes in which one actor picked up boxes (of varying number and weight), after which a second actor picked up a single box. Adaptation to the first actor's behavior systematically changed perception of the second actor. Aftereffects increased with the duration of the first actor's behavior, declined exponentially over time, and were independent of view direction. Inferences about the second actor's expectation of box weight were also distorted by adaptation to the first actor. Distortions in action recognition and actor expectations did not, however, extend across different actions, indicating that adaptation is not acting at an action-independent abstract level but rather at an action-dependent level. We conclude that although adaptation influences more complex inferences about belief states of individuals, this is likely to be a result of adaptation at an earlier action recognition stage rather than adaptation operating at a higher, more abstract level in mentalizing or simulation systems.
2009-01-22
CAPE CANAVERAL, Fla. – The sun rising over the Launch Complex 39 Area turn basin at NASA's Kennedy Space Center in Florida casts a brilliant flame in the water. At right is the U.S. flag on the grounds of the NASA News Center. Kennedy is surrounded by water: the Banana River, Banana Creek, Indian River Lagoon and the Atlantic Ocean, all of which provide scenes of beauty and nature that contrast with the high technology and power of the center. Photo credit: NASA/Ben Smegelsky
Fixation and saliency during search of natural scenes: the case of visual agnosia.
Foulsham, Tom; Barton, Jason J S; Kingstone, Alan; Dewhurst, Richard; Underwood, Geoffrey
2009-07-01
Models of eye movement control in natural scenes often distinguish between stimulus-driven processes (which guide the eyes to visually salient regions) and those based on task and object knowledge (which depend on expectations or identification of objects and scene gist). In the present investigation, the eye movements of a patient with visual agnosia were recorded while she searched for objects within photographs of natural scenes and compared to those made by students and age-matched controls. Agnosia is assumed to disrupt the top-down knowledge available in this task, and so may increase the reliance on bottom-up cues. The patient's deficit in object recognition was seen in poor search performance and inefficient scanning. The low-level saliency of target objects had an effect on responses in visual agnosia, and the most salient region in the scene was more likely to be fixated by the patient than by controls. An analysis of model-predicted saliency at fixation locations indicated a closer match between fixations and low-level saliency in agnosia than in controls. These findings are discussed in relation to saliency-map models and the balance between high and low-level factors in eye guidance.
Scene perception in posterior cortical atrophy: categorization, description and fixation patterns
Shakespeare, Timothy J.; Yong, Keir X. X.; Frost, Chris; Kim, Lois G.; Warrington, Elizabeth K.; Crutch, Sebastian J.
2013-01-01
Partial or complete Balint's syndrome is a core feature of the clinico-radiological syndrome of posterior cortical atrophy (PCA), in which individuals experience a progressive deterioration of cortical vision. Although multi-object arrays are frequently used to detect simultanagnosia in the clinical assessment and diagnosis of PCA, to date there have been no group studies of scene perception in patients with the syndrome. The current study involved three linked experiments conducted in PCA patients and healthy controls. Experiment 1 evaluated the accuracy and latency of complex scene perception relative to individual faces and objects (color and grayscale) using a categorization paradigm. PCA patients were both less accurate (faces < scenes < objects) and slower (scenes < objects < faces) than controls on all categories, with performance strongly associated with their level of basic visual processing impairment; patients also showed a small advantage for color over grayscale stimuli. Experiment 2 involved free description of real world scenes. PCA patients generated fewer features and more misperceptions than controls, though perceptual errors were always consistent with the patient's global understanding of the scene (whether correct or not). Experiment 3 used eye tracking measures to compare patient and control eye movements over initial and subsequent fixations of scenes. Patients' fixation patterns were significantly different to those of young and age-matched controls, with comparable group differences for both initial and subsequent fixations. Overall, these findings describe the variability in everyday scene perception exhibited by individuals with PCA, and indicate the importance of exposure duration in the perception of complex scenes. PMID:24106469
Social relevance drives viewing behavior independent of low-level salience in rhesus macaques
Solyst, James A.; Buffalo, Elizabeth A.
2014-01-01
Quantifying attention to social stimuli during the viewing of complex social scenes with eye tracking has proven to be a sensitive method in the diagnosis of autism spectrum disorders years before average clinical diagnosis. Rhesus macaques provide an ideal model for understanding the mechanisms underlying social viewing behavior, but to date no comparable behavioral task has been developed for use in monkeys. Using a novel scene-viewing task, we monitored the gaze of three rhesus macaques while they freely viewed well-controlled composed social scenes and analyzed the time spent viewing objects and monkeys. In each of six behavioral sessions, monkeys viewed a set of 90 images (540 unique scenes) with each image presented twice. In two-thirds of the repeated scenes, either a monkey or an object was replaced with a novel item (manipulated scenes). When viewing a repeated scene, monkeys made longer fixations and shorter saccades, shifting from a rapid orienting to global scene contents to a more local analysis of fewer items. In addition to this repetition effect, in manipulated scenes, monkeys demonstrated robust memory by spending more time viewing the replaced items. By analyzing attention to specific scene content, we found that monkeys strongly preferred to view conspecifics and that this was not related to their salience in terms of low-level image features. A model-free analysis of viewing statistics found that the monkeys in a scene who were viewed earlier and longer had direct gaze and redder sex skin around the face and rump, two important visual social cues. These data provide a quantification of viewing strategy, memory and social preferences in rhesus macaques viewing complex social scenes, and they provide an important baseline against which to compare the effects of therapeutics aimed at enhancing social cognition. PMID:25414633
History of Reading Struggles Linked to Enhanced Learning in Low Spatial Frequency Scenes
Schneps, Matthew H.; Brockmole, James R.; Sonnert, Gerhard; Pomplun, Marc
2012-01-01
People with dyslexia, who face lifelong struggles with reading, exhibit numerous associated low-level sensory deficits including deficits in focal attention. Countering this, studies have shown that struggling readers outperform typical readers in some visual tasks that integrate distributed information across an expanse. Though such abilities would be expected to facilitate scene memory, prior investigations using the contextual cueing paradigm failed to find corresponding advantages in dyslexia. We suggest that these studies were confounded by task-dependent effects exaggerating known focal attention deficits in dyslexia, and that, if natural scenes were used as the context, advantages would emerge. Here, we investigate this hypothesis by comparing college students with histories of severe lifelong reading difficulties (SR) and typical readers (TR) in contexts that vary attention load. We find no differences in contextual-cueing when spatial contexts are letter-like objects, or when contexts are natural scenes. However, the SR group significantly outperforms the TR group when contexts are low-pass filtered natural scenes [F(3, 39) = 3.15, p<.05]. These findings suggest that perception or memory for low spatial frequency components in scenes is enhanced in dyslexia. These findings are important because they suggest strengths for spatial learning in a population otherwise impaired, carrying implications for the education and support of students who face challenges in school. PMID:22558210
Finding the Cause: Verbal Framing Helps Children Extract Causal Evidence Embedded in a Complex Scene
ERIC Educational Resources Information Center
Butler, Lucas P.; Markman, Ellen M.
2012-01-01
In making causal inferences, children must both identify a causal problem and selectively attend to meaningful evidence. Four experiments demonstrate that verbally framing an event ("Which animals make Lion laugh?") helps 4-year-olds extract evidence from a complex scene to make accurate causal inferences. Whereas framing was unnecessary when…
Recent Experiments Conducted with the Wide-Field Imaging Interferometry Testbed (WIIT)
NASA Technical Reports Server (NTRS)
Leisawitz, David T.; Juanola-Parramon, Roser; Bolcar, Matthew; Iacchetta, Alexander S.; Maher, Stephen F.; Rinehart, Stephen A.
2016-01-01
The Wide-field Imaging Interferometry Testbed (WIIT) was developed at NASA's Goddard Space Flight Center to demonstrate and explore the practical limitations inherent in wide field-of-view double Fourier (spatio-spectral) interferometry. The testbed delivers high-quality interferometric data and is capable of observing spatially and spectrally complex hyperspectral test scenes. Although WIIT operates at visible wavelengths, by design the data are representative of those from a space-based far-infrared observatory. We used WIIT to observe a calibrated, independently characterized test scene of modest spatial and spectral complexity, and an astronomically realistic test scene of much greater spatial and spectral complexity. This paper describes the experimental setup, summarizes the performance of the testbed, and presents representative data.
Everyone knows what is interesting: Salient locations which should be fixated
Masciocchi, Christopher Michael; Mihalas, Stefan; Parkhurst, Derrick; Niebur, Ernst
2010-01-01
Most natural scenes are too complex to be perceived instantaneously in their entirety. Observers therefore have to select parts of them and process these parts sequentially. We study how this selection and prioritization process is performed by humans at two different levels. One is the overt attention mechanism of saccadic eye movements in a free-viewing paradigm. The second is a conscious decision process in which we asked observers which points in a scene they considered the most interesting. We find in a very large participant population (more than one thousand) that observers largely agree on which points they consider interesting. Their selections are also correlated with the eye movement pattern of different subjects. Both are correlated with predictions of a purely bottom-up saliency map model. Thus, bottom-up saliency influences cognitive processes as far removed from the sensory periphery as in the conscious choice of what an observer considers interesting. PMID:20053088
On a common circle: natural scenes and Gestalt rules.
Sigman, M; Cecchi, G A; Gilbert, C D; Magnasco, M O
2001-02-13
To understand how the human visual system analyzes images, it is essential to know the structure of the visual environment. In particular, natural images display consistent statistical properties that distinguish them from random luminance distributions. We have studied the geometric regularities of oriented elements (edges or line segments) present in an ensemble of visual scenes, asking how much information the presence of a segment in a particular location of the visual scene carries about the presence of a second segment at different relative positions and orientations. We observed strong long-range correlations in the distribution of oriented segments that extend over the whole visual field. We further show that a very simple geometric rule, cocircularity, predicts the arrangement of segments in natural scenes, and that different geometrical arrangements show relevant differences in their scaling properties. Our results show similarities to geometric features of previous physiological and psychophysical studies. We discuss the implications of these findings for theories of early vision.
Perception of Objects in Natural Scenes: Is It Really Attention Free?
ERIC Educational Resources Information Center
Evans, Karla K.; Treisman, Anne
2005-01-01
Studies have suggested attention-free semantic processing of natural scenes in which concurrent tasks leave category detection unimpaired (e.g., F. Li, R. VanRullen, C. Koch, & P. Perona, 2002). Could this ability reflect detection of disjunctive feature sets rather than high-level binding? Participants detected an animal target in a rapid serial…
Miskovic, Vladimir; Martinovic, Jasna; Wieser, Matthias M.; Petro, Nathan M.; Bradley, Margaret M.; Keil, Andreas
2015-01-01
Emotionally arousing scenes readily capture visual attention, prompting amplified neural activity in sensory regions of the brain. The physical stimulus features and related information channels in the human visual system that contribute to this modulation, however, are not known. Here, we manipulated low-level physical parameters of complex scenes varying in hedonic valence and emotional arousal in order to target the relative contributions of luminance based versus chromatic visual channels to emotional perception. Stimulus-evoked brain electrical activity was measured during picture viewing and used to quantify neural responses sensitive to lower-tier visual cortical involvement (steady-state visual evoked potentials) as well as the late positive potential, reflecting a more distributed cortical event. Results showed that the enhancement for emotional content was stimulus-selective when examining the steady-state segments of the evoked visual potentials. Response amplification was present only for low spatial frequency, grayscale stimuli, and not for high spatial frequency, red/green stimuli. In contrast, the late positive potential was modulated by emotion regardless of the scene’s physical properties. Our findings are discussed in relation to neurophysiologically plausible constraints operating at distinct stages of the cortical processing stream. PMID:25640949
The Incongruency Advantage for Environmental Sounds Presented in Natural Auditory Scenes
ERIC Educational Resources Information Center
Gygi, Brian; Shafiro, Valeriy
2011-01-01
The effect of context on the identification of common environmental sounds (e.g., dogs barking or cars honking) was tested by embedding them in familiar auditory background scenes (street ambience, restaurants). Initial results with subjects trained on both the scenes and the sounds to be identified showed a significant advantage of about five…
Anticipatory Scene Representation in Preschool Children's Recall and Recognition Memory
ERIC Educational Resources Information Center
Kreindel, Erica; Intraub, Helene
2017-01-01
Behavioral and neuroscience research on boundary extension (false memory beyond the edges of a view of a scene) has provided new insights into the constructive nature of scene representation, and motivates questions about development. Early research with children (as young as 6-7 years) was consistent with boundary extension, but relied on an…
ERIC Educational Resources Information Center
Gygi, Brian; Shafiro, Valeriy
2013-01-01
Purpose: Previously, Gygi and Shafiro (2011) found that when environmental sounds are semantically incongruent with the background scene (e.g., horse galloping in a restaurant), they can be identified more accurately by young normal-hearing listeners (YNH) than sounds congruent with the scene (e.g., horse galloping at a racetrack). This study…
Gestalt-like constraints produce veridical (Euclidean) percepts of 3D indoor scenes
Kwon, TaeKyu; Li, Yunfeng; Sawada, Tadamasa; Pizlo, Zygmunt
2015-01-01
This study, influenced by Gestalt ideas, extends our prior work on the role of a priori constraints in the veridical perception of 3D shapes to the perception of 3D scenes. Our experiments tested how human subjects perceive the layout of a naturally-illuminated indoor scene that contains common symmetrical 3D objects standing on a horizontal floor. In one task, the subject was asked to draw a top view of a scene that was viewed either monocularly or binocularly. The top views the subjects reconstructed were configured accurately except for their overall size. These size errors varied from trial to trial and were shown to most likely result from a response bias. There was little, if any, evidence of systematic distortions of the subjects' perceived visual space, the kind of distortions that have been reported in numerous experiments run under very unnatural conditions. Having shown this, we proceeded to use Foley's (Vision Research 12 (1972) 323–332) isosceles right triangle experiment to test the intrinsic geometry of visual space directly. This was done with natural viewing, with the impoverished viewing conditions Foley had used, as well as with a number of intermediate viewing conditions. Our subjects produced very accurate triangles when the viewing conditions were natural, but their performance deteriorated systematically as the viewing conditions were progressively impoverished. Their perception of visual space became more compressed as their natural visual environment was degraded. Once this was shown, we developed a computational model that emulated the most salient features of our psychophysical results. We concluded that human observers see 3D scenes veridically when they view natural 3D objects within natural 3D environments. PMID:26525845
The Southampton-York Natural Scenes (SYNS) dataset: Statistics of surface attitude
Adams, Wendy J.; Elder, James H.; Graf, Erich W.; Leyland, Julian; Lugtigheid, Arthur J.; Muryy, Alexander
2016-01-01
Recovering 3D scenes from 2D images is an under-constrained task; optimal estimation depends upon knowledge of the underlying scene statistics. Here we introduce the Southampton-York Natural Scenes dataset (SYNS: https://syns.soton.ac.uk), which provides comprehensive scene statistics useful for understanding biological vision and for improving machine vision systems. In order to capture the diversity of environments that humans encounter, scenes were surveyed at random locations within 25 indoor and outdoor categories. Each survey includes (i) spherical LiDAR range data (ii) high-dynamic range spherical imagery and (iii) a panorama of stereo image pairs. We envisage many uses for the dataset and present one example: an analysis of surface attitude statistics, conditioned on scene category and viewing elevation. Surface normals were estimated using a novel adaptive scale selection algorithm. Across categories, surface attitude below the horizon is dominated by the ground plane (0° tilt). Near the horizon, probability density is elevated at 90°/270° tilt due to vertical surfaces (trees, walls). Above the horizon, probability density is elevated near 0° slant due to overhead structure such as ceilings and leaf canopies. These structural regularities represent potentially useful prior assumptions for human and machine observers, and may predict human biases in perceived surface attitude. PMID:27782103
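The surface-attitude analysis described above can be sketched minimally: estimate a surface normal for a local patch of range data, then convert it to slant (angle from the viewing axis) and tilt (direction of the normal's image-plane projection). The PCA-based normal estimate below is a standard approach, not the paper's adaptive scale selection algorithm; the patch data and angle conventions are illustrative assumptions.

```python
import numpy as np

def surface_normal(points):
    """Estimate the normal of a local range-data patch via PCA: the
    eigenvector with the smallest eigenvalue of the patch covariance."""
    centered = points - points.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(centered.T @ centered)
    normal = eigvecs[:, 0]        # direction of least variance
    if normal[2] < 0:             # orient consistently (toward +z)
        normal = -normal
    return normal

def slant_tilt(normal):
    """Slant: angle between the normal and the z axis, in degrees.
    Tilt: direction of the normal's projection onto the x-y plane."""
    slant = np.degrees(np.arccos(np.clip(normal[2], -1.0, 1.0)))
    tilt = np.degrees(np.arctan2(normal[1], normal[0])) % 360.0
    return slant, tilt

# A patch sampled from a horizontal ground plane should give ~0 deg slant,
# consistent with the dominant below-horizon ground-plane mode in the abstract.
rng = np.random.default_rng(0)
xy = rng.uniform(-1.0, 1.0, size=(200, 2))
ground_patch = np.column_stack([xy, np.zeros(200)])
slant, tilt = slant_tilt(surface_normal(ground_patch))
```

Pooling such slant/tilt estimates over many patches, conditioned on scene category and elevation, yields the attitude distributions the abstract describes.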
The Perception of Concurrent Sound Objects in Harmonic Complexes Impairs Gap Detection
ERIC Educational Resources Information Center
Leung, Ada W. S.; Jolicoeur, Pierre; Vachon, Francois; Alain, Claude
2011-01-01
Since the introduction of the concept of auditory scene analysis, there has been a paucity of work focusing on the theoretical explanation of how attention is allocated within a complex auditory scene. Here we examined signal detection in situations that promote either the fusion of tonal elements into a single sound object or the segregation of a…
Fowlkes, Charless C.; Banks, Martin S.
2010-01-01
The shape of the contour separating two regions strongly influences judgments of which region is “figure” and which is “ground.” Convexity and other figure-ground cues are generally assumed to indicate only which region is nearer, but nothing about how much the regions are separated in depth. To determine the depth information conveyed by convexity, we examined natural scenes and found that depth steps across surfaces with convex silhouettes are likely to be larger than steps across surfaces with concave silhouettes. In a psychophysical experiment, we found that humans exploit this correlation. For a given binocular disparity, observers perceived more depth when the near surface's silhouette was convex rather than concave. We estimated the depth distributions observers used in making those judgments: they were similar to the natural-scene distributions. Our findings show that convexity should be reclassified as a metric depth cue. They also suggest that the dichotomy between metric and nonmetric depth cues is false and that the depth information provided by many cues should be evaluated with respect to natural-scene statistics. Finally, the findings provide an explanation for why figure-ground cues modulate the responses of disparity-sensitive cells in visual cortex. PMID:20505093
Color constancy in a scene with bright colors that do not have a fully natural surface appearance.
Fukuda, Kazuho; Uchikawa, Keiji
2014-04-01
Theoretical and experimental approaches have proposed that color constancy involves a correction related to some average of stimulation over the scene, and some of the studies showed that the average gives greater weight to surrounding bright colors. However, in a natural scene, high-luminance elements do not necessarily carry information about the scene illuminant when the luminance is too high for it to appear as a natural object color. The question is how a surrounding color's appearance mode influences its contribution to the degree of color constancy. Here the stimuli were simple geometric patterns, and the luminance of surrounding colors was tested over the range beyond the luminosity threshold. Observers performed perceptual achromatic setting on the test patch in order to measure the degree of color constancy and evaluated the surrounding bright colors' appearance mode. Broadly, our results support the assumption that the visual system counts only the colors in the object-color appearance for color constancy. However, detailed analysis indicated that surrounding colors without a fully natural object-color appearance had some sort of influence on color constancy. Consideration of this contribution of unnatural object color might be important for precise modeling of human color constancy.
On validating remote sensing simulations using coincident real data
NASA Astrophysics Data System (ADS)
Wang, Mingming; Yao, Wei; Brown, Scott; Goodenough, Adam; van Aardt, Jan
2016-05-01
The remote sensing community often requires data simulation, either via spectral/spatial downsampling or through virtual, physics-based models, to assess systems and algorithms. The Digital Imaging and Remote Sensing Image Generation (DIRSIG) model is one such first-principles, physics-based model for simulating imagery for a range of modalities. Complex simulation of vegetation environments has subsequently become possible as scene rendering technology and software have advanced. This in turn has raised questions about the validity of such complex models, given multiple scattering, bidirectional reflectance distribution function (BRDF), and other phenomena that could affect results for complex vegetation scenes. We selected three sites, located in the Pacific Southwest domain (Fresno, CA) of the National Ecological Observatory Network (NEON). These sites represent oak savanna, hardwood forests, and conifer-manzanita-mixed forests. We constructed corresponding virtual scenes, using airborne LiDAR and imaging spectroscopy data from NEON, ground-based LiDAR data, and field-collected spectra to characterize the scenes. Imaging spectroscopy data for these virtual sites were then generated using the DIRSIG simulation environment. This simulated imagery was compared to real AVIRIS imagery (15m spatial resolution; 12 pixels/scene) and NEON Airborne Observation Platform (AOP) data (1m spatial resolution; 180 pixels/scene). These tests used a distribution-comparison approach for select spectral statistics (e.g., statistics that establish the spectra's shape) for each simulated-versus-real distribution pair. The initial comparison of the spectral distributions indicated that the shapes of the spectra from the virtual and real sites were closely matched.
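The abstract does not name the specific distribution-comparison statistic used; a per-band two-sample Kolmogorov-Smirnov test is one plausible sketch of comparing simulated and real per-pixel spectral distributions. The synthetic reflectance arrays below are stand-ins for illustration only.

```python
import numpy as np
from scipy.stats import ks_2samp

def compare_band_distributions(sim_pixels, real_pixels):
    """Per-band two-sample Kolmogorov-Smirnov test between simulated and
    real per-pixel spectra; both arrays have shape (n_pixels, n_bands).
    Returns a list of (band, KS statistic, p-value)."""
    return [(b, *ks_2samp(sim_pixels[:, b], real_pixels[:, b]))
            for b in range(sim_pixels.shape[1])]

# Synthetic stand-ins: band 0 distributions match; band 1 is shifted,
# as if the simulation mis-modeled reflectance in that band.
rng = np.random.default_rng(1)
sim = np.column_stack([rng.normal(0.30, 0.05, 500), rng.normal(0.30, 0.05, 500)])
real = np.column_stack([rng.normal(0.30, 0.05, 500), rng.normal(0.45, 0.05, 500)])
results = compare_band_distributions(sim, real)
for band, stat, p in results:
    print(f"band {band}: KS statistic {stat:.3f}, p = {p:.3g}")
```

A small KS statistic (large p) for a band indicates that the simulated and real distributions are statistically indistinguishable at that wavelength.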
Experiencing simultanagnosia through windowed viewing of complex social scenes.
Dalrymple, Kirsten A; Birmingham, Elina; Bischof, Walter F; Barton, Jason J S; Kingstone, Alan
2011-01-07
Simultanagnosia is a disorder of visual attention, defined as an inability to see more than one object at once. It has been conceived as being due to a constriction of the visual "window" of attention, a metaphor that we examine in the present article. A simultanagnosic patient (SL) and two non-simultanagnosic control patients (KC and ES) described social scenes while their eye movements were monitored. These data were compared to a group of healthy subjects who described the same scenes under the same conditions as the patients, or through an aperture that restricted their vision to a small portion of the scene. Experiment 1 demonstrated that SL showed unusually low proportions of fixations to the eyes in social scenes, which contrasted with all other participants who demonstrated the standard preferential bias toward eyes. Experiments 2 and 3 revealed that when healthy participants viewed scenes through a window that was contingent on where they looked (Experiment 2) or where they moved a computer mouse (Experiment 3), their behavior closely mirrored that of patient SL. These findings suggest that a constricted window of visual processing has important consequences for how simultanagnosic patients explore their world. Our paradigm's capacity to mimic simultanagnosic behaviors while viewing complex scenes implies that it may be a valid way of modeling simultanagnosia in healthy individuals, providing a useful tool for future research. More broadly, our results support the thesis that people fixate the eyes in social scenes because they are informative to the meaning of the scene. Copyright © 2010 Elsevier B.V. All rights reserved.
Foulsham, Tom; Alan, Rana; Kingstone, Alan
2011-10-01
Previous research has demonstrated that search and memory for items within natural scenes can be disrupted by "scrambling" the images. In the present study, we asked how disrupting the structure of a scene through scrambling might affect the control of eye fixations in either a search task (Experiment 1) or a memory task (Experiment 2). We found that the search decrement in scrambled scenes was associated with poorer guidance of the eyes to the target. Across both tasks, scrambling led to shorter fixations and longer saccades, and more distributed, less selective overt attention, perhaps corresponding to an ambient mode of processing. These results confirm that scene structure has widespread effects on the guidance of eye movements in scenes. Furthermore, the results demonstrate the trade-off between scene structure and visual saliency, with saliency having more of an effect on eye guidance in scrambled scenes.
NASA Astrophysics Data System (ADS)
Rengarajan, Rajagopalan; Goodenough, Adam A.; Schott, John R.
2016-10-01
Many remote sensing applications rely on simulated scenes to perform complex interaction and sensitivity studies that are not possible with real-world scenes. These applications include the development and validation of new and existing algorithms, understanding of the sensor's performance prior to launch, and trade studies to determine ideal sensor configurations. The accuracy of these applications is dependent on the realism of the modeled scenes and sensors. The Digital Image and Remote Sensing Image Generation (DIRSIG) tool has been used extensively to model the complex spectral and spatial texture variation expected in large city-scale scenes and natural biomes. In the past, material properties that were used to represent targets in the simulated scenes were often assumed to be Lambertian in the absence of hand-measured directional data. However, this assumption presents a limitation for new algorithms that need to recognize the anisotropic behavior of targets. We have developed a new method to model and simulate large-scale high-resolution terrestrial scenes by combining bi-directional reflectance distribution function (BRDF) products from Moderate Resolution Imaging Spectroradiometer (MODIS) data, high spatial resolution data, and hyperspectral data. The high spatial resolution data is used to separate materials and add textural variations to the scene, and the directional hemispherical reflectance from the hyperspectral data is used to adjust the magnitude of the MODIS BRDF. In this method, the shape of the BRDF is preserved since it changes very slowly, but its magnitude is varied based on the high resolution texture and hyperspectral data. In addition to the MODIS derived BRDF, target/class specific BRDF values or functions can also be applied to features of specific interest. The purpose of this paper is to discuss the techniques and the methodology used to model a forest region at a high resolution. 
The scenes simulated using this method for varying view angles show the expected variations in reflectance due to the BRDF effects of the Harvard Forest. The effectiveness of this technique for simulating real sensor data is evaluated by comparing the simulated data with Landsat 8 Operational Land Imager (OLI) data over the Harvard Forest. Regions of interest were selected from the simulated and real data for different targets and their top-of-atmosphere (TOA) radiances were compared. After adjusting for a scaling correction due to the difference in atmospheric conditions between the simulated and real data, the TOA radiance is found to agree within 5% in the NIR band and 10% in the visible bands for forest targets under similar illumination conditions. The technique presented in this paper can be extended to other biomes (e.g., desert and agricultural regions) by using the appropriate geographic regions. Since the entire scene is constructed in a simulated environment, parameters such as the BRDF or its effects can be analyzed for general or target-specific algorithm improvements. The modeling and simulation techniques can also serve as a baseline for the development and comparison of new sensor designs and for investigating the operational and environmental factors that affect sensor constellations such as the Sentinel and Landsat missions.
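The core idea in the method above, preserving the MODIS BRDF's angular shape while rescaling its magnitude to match a measured directional-hemispherical reflectance (DHR), can be sketched as follows. The hemispherical integration grid, the fixed 30° illumination angle, and the function signatures are illustrative assumptions, not DIRSIG's API.

```python
import numpy as np

def scale_brdf_to_dhr(brdf_shape, target_dhr, theta_i=np.radians(30.0),
                      n_theta=64, n_phi=128):
    """Rescale a BRDF so its DHR matches a target value, preserving shape.
    brdf_shape(theta_i, theta_r, phi) -> BRDF value [1/sr].
    DHR = integral over the hemisphere of f_r * cos(theta_r) dOmega."""
    # Midpoint rule over the reflection hemisphere.
    theta = (np.arange(n_theta) + 0.5) * (np.pi / 2) / n_theta
    phi = np.linspace(0.0, 2 * np.pi, n_phi, endpoint=False)
    th, ph = np.meshgrid(theta, phi, indexing="ij")
    f = brdf_shape(theta_i, th, ph)
    d_theta = (np.pi / 2) / n_theta
    d_phi = 2 * np.pi / n_phi
    dhr = np.sum(f * np.cos(th) * np.sin(th)) * d_theta * d_phi
    scale = target_dhr / dhr
    # Same angular shape, new magnitude.
    return lambda ti, tr, p: scale * brdf_shape(ti, tr, p)

# Sanity check with a Lambertian shape: f_r = 1/pi has DHR = 1, so scaling
# to a target DHR of 0.4 should yield f_r = 0.4/pi at every angle.
lambertian = lambda ti, tr, p: np.full_like(tr, 1.0 / np.pi)
scaled = scale_brdf_to_dhr(lambertian, 0.4)
```

In the method described above, the target DHR would come from the hyperspectral data and the shape from the MODIS BRDF product, with per-pixel texture further modulating the magnitude.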
Rapid natural scene categorization in the near absence of attention
Li, Fei Fei; VanRullen, Rufin; Koch, Christof; Perona, Pietro
2002-01-01
What can we see when we do not pay attention? It is well known that we can be “blind” even to major aspects of natural scenes when we attend elsewhere. The only tasks that do not need attention appear to be carried out in the early stages of the visual system. Contrary to this common belief, we report that subjects can rapidly detect animals or vehicles in briefly presented novel natural scenes while simultaneously performing another attentionally demanding task. By comparison, they are unable to discriminate large T's from L's, or bisected two-color disks from their mirror images under the same conditions. We conclude that some visual tasks associated with “high-level” cortical areas may proceed in the near absence of attention. PMID:12077298
Banno, Hayaki; Saiki, Jun
2015-03-01
Recent studies have sought to determine which levels of categories are processed first in visual scene categorization and have shown that the natural and man-made superordinate-level categories are understood faster than are basic-level categories. The current study examined the robustness of the superordinate-level advantage in a visual scene categorization task. A go/no-go categorization task was evaluated with response time distribution analysis using an ex-Gaussian template. A visual scene was categorized as either superordinate or basic level, and two basic-level categories forming a superordinate category were judged as either similar or dissimilar to each other. First, outdoor/indoor and natural/man-made groupings were used as superordinate categories to investigate whether the advantage could be generalized beyond the natural/man-made boundary. Second, a set of images forming a superordinate category was manipulated. We predicted that decreasing image set similarity within the superordinate-level category would work against the speed advantage. We found that basic-level categorization was faster than outdoor/indoor categorization when the outdoor category comprised dissimilar basic-level categories. Our results indicate that the superordinate-level advantage in visual scene categorization is labile across different categories and category structures. © 2015 SAGE Publications.
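The ex-Gaussian response-time analysis mentioned above can be approximated, under simple assumptions, by method-of-moments estimation of the three parameters (mu, sigma, tau). This is only an illustrative sketch; the study's template-based fitting procedure is more sophisticated:

```python
import statistics as st

def exgauss_moments(rts):
    """Method-of-moments estimates of ex-Gaussian parameters
    (mu, sigma, tau) from a sample of response times, using:
      mean = mu + tau
      var  = sigma^2 + tau^2
      skew = 2*tau^3 / (sigma^2 + tau^2)^(3/2)."""
    n = len(rts)
    m = st.fmean(rts)
    s = st.pstdev(rts)
    skew = sum((x - m) ** 3 for x in rts) / (n * s ** 3)
    tau = s * (skew / 2) ** (1 / 3) if skew > 0 else 0.0
    sigma2 = max(s ** 2 - tau ** 2, 0.0)
    return m - tau, sigma2 ** 0.5, tau
```

A large tau relative to mu indicates a heavy slow tail in the RT distribution, which is the component such analyses typically compare across categorization conditions.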
Ryals, Anthony J.; Wang, Jane X.; Polnaszek, Kelly L.; Voss, Joel L.
2015-01-01
Although the hippocampus unequivocally supports explicit/declarative memory, fewer findings have demonstrated its role in implicit expressions of memory. We tested for hippocampal contributions to an implicit expression of configural/relational memory for complex scenes using eye-movement tracking during functional magnetic resonance imaging (fMRI) scanning. Participants studied scenes and were later tested using scenes that resembled study scenes in their overall feature configuration but comprised different elements. These configurally similar scenes were used to limit explicit memory, and were intermixed with new scenes that did not resemble studied scenes. Scene configuration memory was expressed through eye movements reflecting exploration overlap (EO), which is the viewing of the same scene locations at both study and test. EO reliably discriminated similar study-test scene pairs from study-new scene pairs, was reliably greater for similarity-based recognition hits than for misses, and correlated with hippocampal fMRI activity. In contrast, subjects could not reliably discriminate similar from new scenes by overt judgments, although ratings of familiarity were slightly higher for similar than new scenes. Hippocampal fMRI correlates of this weak explicit memory were distinct from EO-related activity. These findings collectively suggest that EO was an implicit expression of scene configuration memory associated with hippocampal activity. Visual exploration can therefore reflect implicit hippocampal-related memory processing that can be observed in eye-movement behavior during naturalistic scene viewing. PMID:25620526
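One simple way to operationalize an exploration-overlap (EO) style measure is to discretize fixations onto a spatial grid and take the overlap of visited cells between study and test. The study's exact metric may differ; this is a hedged sketch with made-up fixation coordinates:

```python
def exploration_overlap(study_fix, test_fix, cell=64):
    """Proportion of grid cells fixated at BOTH study and test,
    relative to all cells fixated at either (a Jaccard-style overlap;
    the paper's precise definition may differ). Fixations are (x, y)
    pixel coordinates; `cell` is the grid cell size in pixels."""
    def cells(fixations):
        return {(int(x // cell), int(y // cell)) for x, y in fixations}
    a, b = cells(study_fix), cells(test_fix)
    return len(a & b) / len(a | b) if (a or b) else 0.0

study = [(100, 120), (300, 310), (500, 90)]
test  = [(110, 125), (305, 300), (700, 400)]
eo = exploration_overlap(study, test)   # 2 shared cells of 4 -> 0.5
```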
Chromatic blur perception in the presence of luminance contrast.
Jennings, Ben J; Kingdom, Frederick A A
2017-06-01
Hel-Or showed that blurring the chromatic but not the luminance layer of an image of a natural scene failed to elicit any impression of blur. Subsequent studies have suggested that this effect is due either to chromatic blur being masked by spatially contiguous luminance edges in the scene (Journal of Vision 13 (2013) 14), or to a relatively compressed transducer function for chromatic blur (Journal of Vision 15 (2015) 6). To test between the two explanations we conducted experiments using as stimuli both images of natural scenes and simple edges. First, we found that in colour-and-luminance images of natural scenes more chromatic blur was needed to perceptually match a given level of blur in an isoluminant, i.e. colour-only, scene. However, when the luminance layer in the scene was rotated relative to the chromatic layer, thus removing the colour-luminance edge correlations, the matched blur levels were near equal. Both results are consistent with Sharman et al.'s explanation. Second, when observers matched the blurs of luminance-only and isoluminant scenes, the matched blurs were equal, against Kingdom et al.'s prediction. Third, we measured the perceived blur in a square-wave as a function of (i) contrast, (ii) the number of luminance edges, and (iii) the relative spatial phase between the colour and luminance edges. We found that the perceived chromatic blur depended on both relative phase and the number of luminance edges, or on the luminance contrast when only a single edge was present. We conclude that the Hel-Or effect is largely due to masking of chromatic blur by spatially contiguous luminance edges. Copyright © 2017 Elsevier Ltd. All rights reserved.
Overt attention in natural scenes: objects dominate features.
Stoll, Josef; Thrun, Michael; Nuthmann, Antje; Einhäuser, Wolfgang
2015-02-01
Whether overt attention in natural scenes is guided by object content or by low-level stimulus features has become a matter of intense debate. Experimental evidence seemed to indicate that once object locations in a scene are known, salience models provide little extra explanatory power. This approach has recently been criticized for using inadequate models of early salience; and indeed, state-of-the-art salience models outperform trivial object-based models that assume a uniform distribution of fixations on objects. Here we propose to use object-based models that take a preferred viewing location (PVL) close to the centre of objects into account. In experiment 1, we demonstrate that, when including this comparably subtle modification, object-based models are again on par with state-of-the-art salience models in predicting fixations in natural scenes. One possible interpretation of these results is that objects rather than early salience dominate attentional guidance. In this view, early-salience models predict fixations through the correlation of their features with object locations. To test this hypothesis directly, in two additional experiments we reduced low-level salience in image areas of high object content. For these modified stimuli, the object-based model predicted fixations significantly better than early salience. This finding held in an object-naming task (experiment 2) and a free-viewing task (experiment 3). These results provide further evidence for object-based fixation selection--and by inference object-based attentional guidance--in natural scenes. Copyright © 2014 The Authors. Published by Elsevier Ltd. All rights reserved.
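A minimal version of an object-based fixation model with a preferred viewing location (PVL) can be written as a Gaussian centred on each object, with width scaled to object size. The parameters below are illustrative, not those of the paper:

```python
import math

def pvl_likelihood(fix, objects, sigma_frac=0.25):
    """Toy fixation likelihood under an object-based model with a
    preferred viewing location: a Gaussian centred on each object's
    centre, width proportional to object extent. Objects are
    (cx, cy, width, height); sigma_frac is a made-up parameter."""
    total = 0.0
    for cx, cy, w, h in objects:
        sx, sy = sigma_frac * w, sigma_frac * h
        dx, dy = fix[0] - cx, fix[1] - cy
        total += math.exp(-0.5 * ((dx / sx) ** 2 + (dy / sy) ** 2)) / (
            2 * math.pi * sx * sy)
    return total / len(objects)

objects = [(200, 150, 120, 80), (520, 300, 90, 90)]
near_centre = pvl_likelihood((205, 155), objects)  # fixation near a centre
off_object  = pvl_likelihood((400, 100), objects)  # fixation in background
```

A uniform-on-object model, by contrast, would give every point inside an object the same likelihood, which is the trivial baseline the authors improve upon.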
Automatic acquisition of motion trajectories: tracking hockey players
NASA Astrophysics Data System (ADS)
Okuma, Kenji; Little, James J.; Lowe, David
2003-12-01
Computer systems that have the capability of analyzing complex and dynamic scenes play an essential role in video annotation. Scenes can be complex in such a way that there are many cluttered objects with different colors, shapes and sizes, and can be dynamic with multiple interacting moving objects and a constantly changing background. In reality, there are many scenes that are complex, dynamic, and challenging enough for computers to describe. These scenes include games of sports, air traffic, car traffic, street intersections, and cloud transformations. Our research is about the challenge of inventing a descriptive computer system that analyzes scenes of hockey games where multiple moving players interact with each other on a constantly moving background due to camera motions. Ultimately, such a computer system should be able to acquire reliable data by extracting the players' motion as trajectories, query them by analyzing the descriptive information in the data, and predict the motions of some hockey players based on the result of the query. Among these three major aspects of the system, we primarily focus on visual information of the scenes, that is, how to automatically acquire motion trajectories of hockey players from video. More accurately, we automatically analyze the hockey scenes by estimating parameters (i.e., pan, tilt, and zoom) of the broadcast cameras, tracking hockey players in those scenes, and constructing a visual description of the data by displaying trajectories of those players. Many technical problems in vision such as fast and unpredictable players' motions and rapid camera motions make our challenge worth tackling. To the best of our knowledge, there have not been any automatic video annotation systems for hockey developed in the past.
Although there are many obstacles to overcome, our efforts and accomplishments would hopefully establish the infrastructure of the automatic hockey annotation system and become a milestone for research in automatic video annotation in this domain.
Iconic memory for the gist of natural scenes.
Clarke, Jason; Mack, Arien
2014-11-01
Does iconic memory contain the gist of multiple scenes? Three experiments were conducted. In the first, four scenes from different basic-level categories were briefly presented in one of two conditions: a cue or a no-cue condition. The cue condition was designed to provide an index of the contents of iconic memory of the display. Subjects were more sensitive to scene gist in the cue condition than in the no-cue condition. In the second, the scenes came from the same basic-level category. We found no difference in sensitivity between the two conditions. In the third, six scenes from different basic-level categories were presented in the visual periphery. Subjects were more sensitive to scene gist in the cue condition. These results suggest that scene gist is contained in iconic memory even in the visual periphery; however, iconic representations are not sufficiently detailed to distinguish between scenes coming from the same category. Copyright © 2014 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Sanocki, Thomas; Sulman, Noah
2013-01-01
Three experiments measured the efficiency of monitoring complex scenes composed of changing objects, or events. All events lasted about 4 s, but in a given block of trials, could be of a single type (single task) or of multiple types (multitask, with a total of four event types). Overall accuracy of detecting target events amid distractors was…
Robust colour constancy in red-green dichromats
Linhares, João M. M.; Moreira, Humberto; Lillo, Julio; Nascimento, Sérgio M. C.
2017-01-01
Colour discrimination has been widely studied in red-green (R-G) dichromats but the extent to which their colour constancy is affected remains unclear. This work estimated the extent of colour constancy for four normal trichromatic observers and seven R-G dichromats when viewing natural scenes under simulated daylight illuminants. Hyperspectral imaging data from natural scenes were used to generate the stimuli on a calibrated CRT display. In experiment 1, observers viewed a reference scene illuminated by daylight with a correlated colour temperature (CCT) of 6700K; observers then viewed sequentially two versions of the same scene, one illuminated by either a higher or lower CCT (condition 1, pure CCT change with constant luminance) or a higher or lower average luminance (condition 2, pure luminance change with a constant CCT). The observers’ task was to identify the version of the scene that looked different from the reference scene. Thresholds for detecting a pure CCT change or a pure luminance change were estimated, and it was found that those for R-G dichromats were marginally higher than for normal trichromats regarding CCT. In experiment 2, observers viewed sequentially a reference scene and a comparison scene with a CCT change or a luminance change above threshold for each observer. The observers’ task was to identify whether or not the change was an intensity change. No significant differences were found between the responses of normal trichromats and dichromats. These data suggest robust colour constancy mechanisms along daylight locus in R-G dichromacy. PMID:28662218
A test of size-scaling and relative-size hypotheses for the moon illusion.
Redding, Gordon M
2002-11-01
In two experiments participants reproduced the size of the moon in pictorial scenes under two conditions: when the scene element was normally oriented, producing a depth gradient like a floor, or when the scene element was inverted, producing a depth gradient like a ceiling. Target moons were located near to or far from the scene element. Consistent with size constancy scaling, the illusion reversed when the "floor" of a pictorial scene was inverted to represent a "ceiling." Relative size contrast predicted a reduction or increase in the illusion with no change in direction. The relation between pictorial and natural moon illusions is discussed.
Mining Very High Resolution INSAR Data Based On Complex-GMRF Cues And Relevance Feedback
NASA Astrophysics Data System (ADS)
Singh, Jagmal; Popescu, Anca; Soccorsi, Matteo; Datcu, Mihai
2012-01-01
With the increase in the number of remote sensing satellites, the number of image-data scenes in our repositories is also increasing, and a large quantity of these scenes are never retrieved and used. Automatic retrieval of desired image data using query by image content, to fully utilize the huge repository volume, is therefore becoming of great interest. Generally, different users are interested in scenes containing different kinds of objects and structures, so it is important to analyze all the image information mining (IIM) methods so that it is easier for a user to select a method depending on his or her requirements. We concentrate our study on high-resolution SAR images, and we propose to use InSAR observations instead of single look complex (SLC) images alone for mining scenes containing coherent objects such as high-rise buildings. However, for objects with less coherence, such as areas with vegetation cover, SLC images exhibit better performance. We demonstrate an IIM performance comparison using complex Gauss-Markov random fields as texture descriptors for image patches and SVM relevance feedback.
A bottom-up model of spatial attention predicts human error patterns in rapid scene recognition.
Einhäuser, Wolfgang; Mundhenk, T Nathan; Baldi, Pierre; Koch, Christof; Itti, Laurent
2007-07-20
Humans demonstrate a peculiar ability to detect complex targets in rapidly presented natural scenes. Recent studies suggest that (nearly) no focal attention is required for overall performance in such tasks. Little is known, however, of how detection performance varies from trial to trial and which stages in the processing hierarchy limit performance: bottom-up visual processing (attentional selection and/or recognition) or top-down factors (e.g., decision-making, memory, or alertness fluctuations)? To investigate the relative contribution of these factors, eight human observers performed an animal detection task in natural scenes presented at 20 Hz. Trial-by-trial performance was highly consistent across observers, far exceeding the prediction of independent errors. This consistency demonstrates that performance is not primarily limited by idiosyncratic factors but by visual processing. Two statistical stimulus properties, contrast variation in the target image and the information-theoretical measure of "surprise" in adjacent images, predict performance on a trial-by-trial basis. These measures are tightly related to spatial attention, demonstrating that spatial attention and rapid target detection share common mechanisms. To isolate the causal contribution of the surprise measure, eight additional observers performed the animal detection task in sequences that were reordered versions of those all subjects had correctly recognized in the first experiment. Reordering increased surprise before and/or after the target while keeping the target and distractors themselves unchanged. Surprise enhancement impaired target detection in all observers. Consequently, and contrary to several previously published findings, our results demonstrate that attentional limitations, rather than target recognition alone, affect the detection of targets in rapidly presented visual sequences.
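The "surprise" measure referenced above is, in Itti and Baldi's information-theoretic formulation, the KL divergence between posterior and prior beliefs after observing a frame. A toy discrete version (the actual model operates on continuous distributions over low-level visual features):

```python
import math

def bayesian_surprise(prior, likelihood):
    """Surprise as KL(posterior || prior) over a discrete set of
    hypotheses: update beliefs with Bayes' rule, then measure how far
    the posterior moved from the prior (in nats)."""
    evidence = sum(p * l for p, l in zip(prior, likelihood))
    posterior = [p * l / evidence for p, l in zip(prior, likelihood)]
    kl = sum(q * math.log(q / p) for q, p in zip(posterior, prior) if q > 0)
    return posterior, kl

prior = [0.5, 0.3, 0.2]
# A frame equally likely under all hypotheses leaves beliefs unchanged:
expected_frame = bayesian_surprise(prior, [0.5, 0.5, 0.5])[1]
# A frame strongly favouring an unlikely hypothesis is surprising:
surprising_frame = bayesian_surprise(prior, [0.05, 0.1, 0.9])[1]
```

In the experiment, reordering sequences to raise surprise around the target (while leaving target and distractor frames themselves unchanged) is what impaired detection.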
Menzel, Claudia; Hayn-Leichsenring, Gregor U; Langner, Oliver; Wiese, Holger; Redies, Christoph
2015-01-01
We investigated whether low-level processed image properties that are shared by natural scenes and artworks - but not veridical face photographs - affect the perception of facial attractiveness and age. Specifically, we considered the slope of the radially averaged Fourier power spectrum in a log-log plot. This slope is a measure of the distribution of spatial frequency power in an image. Images of natural scenes and artworks possess - compared to face images - a relatively shallow slope (i.e., increased high spatial frequency power). Since aesthetic perception might be based on the efficient processing of images with natural scene statistics, we assumed that the perception of facial attractiveness might also be affected by these properties. We calculated Fourier slope and other beauty-associated measurements in face images and correlated them with ratings of attractiveness and age of the depicted persons (Study 1). We found that Fourier slope - in contrast to the other tested image properties - did not predict attractiveness ratings when we controlled for age. In Study 2A, we overlaid face images with random-phase patterns with different statistics. Patterns with a slope similar to those in natural scenes and artworks resulted in lower attractiveness and higher age ratings. In Studies 2B and 2C, we directly manipulated the Fourier slope of face images and found that images with shallower slopes were rated as more attractive. Additionally, attractiveness of unaltered faces was affected by the Fourier slope of a random-phase background (Study 3). Faces in front of backgrounds with statistics similar to natural scenes and faces were rated as more attractive. We conclude that facial attractiveness ratings are affected by specific image properties. An explanation might be the efficient coding hypothesis.
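Once the power spectrum has been radially averaged, the Fourier slope described above reduces to an ordinary least-squares fit in log-log coordinates. A minimal sketch, assuming the 2-D FFT and radial averaging have been done upstream:

```python
import math

def fourier_slope(freqs, power):
    """Slope of the radially averaged power spectrum in log-log
    coordinates, by ordinary least squares. `power` is the already
    radially averaged spectrum at spatial frequencies `freqs`."""
    xs = [math.log(f) for f in freqs]
    ys = [math.log(p) for p in power]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

# A 1/f^2 spectrum should give a slope of -2; natural scenes typically
# fall near this value, with "shallower" meaning closer to 0.
freqs = [1, 2, 4, 8, 16, 32]
power = [1 / f ** 2 for f in freqs]
slope = fourier_slope(freqs, power)
```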
Lechtzin, Noah; Busse, Anne M; Smith, Michael T; Grossman, Stuart; Nesbit, Suzanne; Diette, Gregory B
2010-09-01
Bone marrow aspiration and biopsy (BMAB) is painful when performed with only local anesthetic. Our objective was to determine whether viewing nature scenes and listening to nature sounds can reduce pain during BMAB. This was a randomized, controlled clinical trial. Adult patients undergoing outpatient BMAB with only local anesthetic were assigned to use either a nature scene with accompanying nature sounds, city scene with city sounds, or standard care. The primary outcome was a visual analog scale (0-10) of pain. Prespecified secondary analyses included categorizing pain as mild and moderate to severe and using multiple logistic regression to adjust for potential confounding variables. One hundred and twenty (120) subjects were enrolled: 44 in the Nature arm, 39 in the City arm, and 37 in the Standard Care arm. The mean pain scores, which were the primary outcome, were not significantly different between the three arms. A higher proportion in the Standard Care arm had moderate-to-severe pain (pain rating ≥4) than in the Nature arm (78.4% versus 60.5%), though this was not statistically significant (p = 0.097). This difference was statistically significant after adjusting for differences in the operators who performed the procedures (odds ratio = 3.71, p = 0.02). We confirmed earlier findings showing that BMAB is poorly tolerated. While mean pain scores were not significantly different between the study arms, secondary analyses suggest that viewing a nature scene while listening to nature sounds is a safe, inexpensive method that may reduce pain during BMAB. This approach should be considered to alleviate pain during invasive procedures.
Some findings on prospect and refuge: I.
Stamps, Arthur E
2008-02-01
Prospect and refuge theory suggests that preferences for environments are based on prospect (the unimpeded opportunity to see) and refuge (the opportunity to hide). This article reports two experiments on how well four factors derived from prospect and refuge theory predicted responses of comfort or liking. The factors were prospect (depth of view), refuge (presence of protective regions in front of the observer or occluding edges that might indicate possibilities of escape), direction of light (either front lighting or back lighting), and venue (natural or built environments). Exp. 1 had 16 landscape scenes and 29 participants; Exp. 2 had 16 landscapes, 14 rooms, and 18 participants. Empirical support was obtained for the claim that people will like gazing out over scenes of distant mountains. For venue, built scenes were preferred over scenes of nature. Results for refuge were ambiguous, and those for direction of light were nil.
Acoustic simulation in architecture with parallel algorithm
NASA Astrophysics Data System (ADS)
Li, Xiaohong; Zhang, Xinrong; Li, Dan
2004-03-01
Motivated by the complexity of architectural environments and the need for real-time simulation of architectural acoustics, a parallel radiosity algorithm was developed. The distribution of sound energy in the scene is solved with this method. The impulse responses between sources and receivers in each frequency segment, computed across multiple processes, are then combined into the whole frequency response. A numerical experiment shows that the parallel algorithm improves the efficiency of acoustic simulation for complex scenes.
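The radiosity method iterates a gather step over surface patches: each patch's outgoing energy is its emission plus the reflected fraction of what it receives from every other patch. A sequential sketch of one Jacobi update with made-up patch data (the paper distributes this gather across processes):

```python
def radiosity_step(B, E, rho, F):
    """One Jacobi update of the radiosity equation
        B_i = E_i + rho_i * sum_j F_ij * B_j
    where B is patch energy, E emission, rho reflection coefficients,
    and F the form factors (fraction of energy leaving patch j that
    reaches patch i). The inner sum is the per-patch gather that a
    parallel implementation would farm out across processes."""
    n = len(B)
    return [E[i] + rho[i] * sum(F[i][j] * B[j] for j in range(n))
            for i in range(n)]

E   = [1.0, 0.0, 0.0]          # only patch 0 emits sound energy
rho = [0.3, 0.5, 0.4]          # reflection coefficients (< 1, so it converges)
F   = [[0.0, 0.6, 0.4],        # form factors; each row sums to 1
       [0.5, 0.0, 0.5],
       [0.4, 0.6, 0.0]]
B = E[:]
for _ in range(50):            # iterate to (near) convergence
    B = radiosity_step(B, E, rho, F)
```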
Cornelissen, Tim H W; Võ, Melissa L-H
2017-01-01
People have an amazing ability to identify objects and scenes with only a glimpse. How automatic is this scene and object identification? Are scene and object semantics-let alone their semantic congruity-processed to a degree that modulates ongoing gaze behavior even if they are irrelevant to the task at hand? Objects that do not fit the semantics of the scene (e.g., a toothbrush in an office) are typically fixated longer and more often than objects that are congruent with the scene context. In this study, we overlaid a letter T onto photographs of indoor scenes and instructed participants to search for it. Some of these background images contained scene-incongruent objects. Despite their lack of relevance to the search, we found that participants spent more time in total looking at semantically incongruent compared to congruent objects in the same position of the scene. Subsequent tests of explicit and implicit memory showed that participants did not remember many of the inconsistent objects and no more of the consistent objects. We argue that when we view natural environments, scene and object relationships are processed obligatorily, such that irrelevant semantic mismatches between scene and object identity can modulate ongoing eye-movement behavior.
Memory for sound, with an ear toward hearing in complex auditory scenes.
Snyder, Joel S; Gregg, Melissa K
2011-10-01
An area of research that has experienced recent growth is the study of memory during perception of simple and complex auditory scenes. These studies have provided important information about how well auditory objects are encoded in memory and how well listeners can notice changes in auditory scenes. These are significant developments because they present an opportunity to better understand how we hear in realistic situations, how higher-level aspects of hearing such as semantics and prior exposure affect perception, and the similarities and differences between auditory perception and perception in other modalities, such as vision and touch. The research also poses exciting challenges for behavioral and neural models of how auditory perception and memory work.
Automated synthetic scene generation
NASA Astrophysics Data System (ADS)
Givens, Ryan N.
Physics-based simulations generate synthetic imagery to help organizations anticipate system performance of proposed remote sensing systems. However, manually constructing synthetic scenes which are sophisticated enough to capture the complexity of real-world sites can take days to months depending on the size of the site and desired fidelity of the scene. This research, sponsored by the Air Force Research Laboratory's Sensors Directorate, successfully developed an automated approach to fuse high-resolution RGB imagery, lidar data, and hyperspectral imagery and then extract the necessary scene components. The method greatly reduces the time and money required to generate realistic synthetic scenes and developed new approaches to improve material identification using information from all three of the input datasets.
Temporal variations of natural soil salinity in an arid environment using satellite images
NASA Astrophysics Data System (ADS)
Gutierrez, M.; Johnson, E.
2010-11-01
In many remote arid areas the scarce amount of conventional soil salinity data precludes detailed analyses of salinity variations for the purpose of predicting its impact on agricultural production. A tool that is an appropriate surrogate for on-ground testing in determining temporal variations of soil salinity is Landsat satellite data. In this study six Landsat scenes over El Cuervo, a closed basin adjacent to the middle Rio Conchos basin in northern Mexico, were used to show temporal variation of natural salts from 1986 to 2005. Natural salts were inferred from ground reference data and spectral responses. Transformations used were Tasseled Cap, Principal Components and several (band) ratios. Classification of each scene was performed from the development of Regions Of Interest derived from geochemical data collected by SGM, spectral responses derived from ENVI software, and a small amount of field data collected by the authors. The resultant land cover classes showed a relationship between climatic drought and areal coverage of natural salts. When little precipitation occurred in the three months prior to the capture of the Landsat scene, approximately 15%-20% of the area was classified as salt. By comparison, practically no salt was classified in the scenes from the wetter years of 1992 and 2005.
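Reduced to its simplest form, the classification step above amounts to thresholding a transformed image and reporting the fraction of the scene assigned to the salt class. A toy version (the ratio values and threshold are illustrative; the study used Tasseled Cap, principal components, and ROI-trained classes):

```python
def salt_fraction(ratio_image, threshold=1.3):
    """Fraction of pixels classified as salt in a band-ratio image,
    using a single hypothetical threshold. Real classifications use
    trained regions of interest rather than one global cutoff."""
    pixels = [v for row in ratio_image for v in row]
    salt = sum(1 for v in pixels if v >= threshold)
    return salt / len(pixels)

# A tiny made-up band-ratio image (bright salt crusts -> high ratio)
ratio = [[1.5, 1.1, 1.4, 1.0],
         [0.9, 1.6, 1.2, 1.35],
         [1.0, 1.1, 0.8, 1.2]]
frac = salt_fraction(ratio)    # 4 of 12 pixels -> 1/3
```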
Schomaker, Judith; Walper, Daniel; Wittmann, Bianca C; Einhäuser, Wolfgang
2017-04-01
In addition to low-level stimulus characteristics and current goals, our previous experience with stimuli can also guide attentional deployment. It remains unclear, however, if such effects act independently or whether they interact in guiding attention. In the current study, we presented natural scenes including everyday objects that differed in affective-motivational impact. In the first free-viewing experiment, we presented visually-matched triads of scenes in which one critical object was replaced that varied mainly in terms of motivational value, but also in terms of valence and arousal, as confirmed by ratings by a large set of observers. Treating motivation as a categorical factor, we found that it affected gaze. A linear-effect model showed that arousal, valence, and motivation predicted fixations above and beyond visual characteristics, like object size, eccentricity, or visual salience. In a second experiment, we experimentally investigated whether the effects of emotion and motivation could be modulated by visual salience. In a medium-salience condition, we presented the same unmodified scenes as in the first experiment. In a high-salience condition, we retained the saturation of the critical object in the scene and decreased the saturation of the background, and in a low-salience condition, we desaturated the critical object while retaining the original saturation of the background. We found that highly salient objects guided gaze, but still found additional additive effects of arousal, valence and motivation, confirming that higher-level factors can also guide attention, as measured by fixations towards objects in natural scenes. Copyright © 2017 Elsevier Ltd. All rights reserved.
Visible-Infrared Hyperspectral Image Projector
NASA Technical Reports Server (NTRS)
Bolcar, Matthew
2013-01-01
The VisIR HIP generates spatially-spectrally complex scenes. The generated scenes simulate real-world targets viewed by various remote sensing instruments. The VisIR HIP consists of two subsystems: a spectral engine and a spatial engine. The spectral engine generates spectrally complex uniform illumination that spans the wavelength range between 380 nm and 1,600 nm. The spatial engine generates two-dimensional gray-scale scenes. When combined, the two engines are capable of producing two-dimensional scenes with a unique spectrum at each pixel. The VisIR HIP can be used to calibrate any spectrally sensitive remote-sensing instrument. Tests were conducted on the Wide-field Imaging Interferometer Testbed at NASA's Goddard Space Flight Center. The device is a variation of the calibrated hyperspectral image projector developed by the National Institute of Standards and Technology in Gaithersburg, MD. It uses Gooch & Housego Visible and Infrared OL490 Agile Light Sources to generate arbitrary spectra. The two light sources are coupled to a digital light processing (DLP™) digital mirror device (DMD) that serves as the spatial engine. Scenes are displayed on the DMD synchronously with the desired spectrum. Scene/spectrum combinations are displayed in rapid succession, over time intervals that are short compared to the integration time of the system under test.
Manhole Cover Detection Using Vehicle-Based Multi-Sensor Data
NASA Astrophysics Data System (ADS)
Ji, S.; Shi, Y.; Shi, Z.
2012-07-01
A new method combining multi-view matching and feature extraction is developed to detect manhole covers on streets using close-range images together with GPS/IMU and LiDAR data. Manhole covers are important road-traffic targets, like traffic signs, traffic lights and zebra crossings, but have more uniform shapes. However, differences in shooting angle and distance, ground material, complex street scenes (especially shadows), and cars on the road strongly affect the cover detection rate. This paper introduces a new edge-detection and feature-extraction method to overcome these difficulties and greatly improve the detection rate. The LiDAR data are used for scene segmentation, excluding street furniture and cars from the roads. A Canny-based edge detector sensitive to arcs and ellipses is then applied to the segmented road scene; regions of interest containing arcs are extracted and fitted to ellipses. The ellipses are resampled for invariance to shooting angle and distance, and matched across adjacent images to verify that they are indeed covers. More than 1000 images with different scenes were used in our tests and the detection rate was analyzed. The results verify that our method has advantages for correct cover detection in complex street scenes.
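The ellipse-fitting step mentioned above can be illustrated with a generic direct least-squares conic fit, assuming candidate edge points lie on an elliptical rim. This is a sketch of that one step only; it does not reproduce the paper's full pipeline (LiDAR segmentation, the Canny variant, or multi-view matching).

```python
import numpy as np

def fit_conic(x, y):
    """Fit A x^2 + B xy + C y^2 + D x + E y + F = 0 by least squares:
    the coefficient vector is the right singular vector of the design
    matrix associated with its smallest singular value."""
    D = np.column_stack([x**2, x * y, y**2, x, y, np.ones_like(x)])
    _, _, vt = np.linalg.svd(D)
    return vt[-1]

def conic_centre(c):
    """Centre of the conic: where both partial derivatives vanish."""
    A, B, C, D, E, _ = c
    return np.linalg.solve([[2 * A, B], [B, 2 * C]], [-D, -E])

# Synthetic "manhole rim": a rotated ellipse seen at an oblique angle.
t = np.linspace(0, 2 * np.pi, 50, endpoint=False)
phi, cx, cy = 0.5, 3.0, -2.0
x = cx + 2.0 * np.cos(t) * np.cos(phi) - 1.0 * np.sin(t) * np.sin(phi)
y = cy + 2.0 * np.cos(t) * np.sin(phi) + 1.0 * np.sin(t) * np.cos(phi)

centre = conic_centre(fit_conic(x, y))
print(np.round(centre, 3))   # recovers the true centre (3, -2)
```

With exact points the design matrix has a one-dimensional null space, so the fit is exact up to numerical precision; with noisy edge points the same SVD gives the least-squares conic.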
Werner, Annette
2014-11-01
Illumination in natural scenes changes at multiple temporal and spatial scales: slow changes in global illumination occur over the course of a day, and we encounter fast, localised illumination changes when visually exploring the non-uniform light field of three-dimensional scenes; in addition, very long-term chromatic variations may come from the environment, for example seasonal changes. In this context, I consider the temporal and spatial properties of chromatic adaptation and discuss their functional significance for colour constancy in three-dimensional scenes. A process of fast spatial tuning in chromatic adaptation is proposed as a possible sensory mechanism for linking colour constancy to the spatial structure of a scene. The observed middle-wavelength selectivity of this process is particularly suitable for adaptation to the mean chromaticity and the compensation of interreflections in natural scenes. Two types of sensory colour constancy are distinguished, based on the functional differences of their temporal and spatial scales: a slow type, operating at a global scale for the compensation of the ambient illumination; and a fast colour constancy, which is locally restricted and well suited to compensate region-specific variations in the light field of three-dimensional scenes. Copyright © 2014 Elsevier B.V. All rights reserved.
Falomir, Zoe; Kluth, Thomas
2017-06-24
The challenge of describing 3D real scenes is tackled in this paper using qualitative spatial descriptors. A key question is which qualitative descriptors to use and how they must be organized to produce a suitable cognitive explanation. To find answers, a survey test was carried out in which human participants openly described a scene containing some pieces of furniture. The data obtained in this survey were analysed and, taking them into account, the QSn3D computational approach was developed, which uses an Xbox 360 Kinect to obtain 3D data from a real indoor scene. Object features are computed on these 3D data to identify objects in indoor scenes. The object orientation is computed, and qualitative spatial relations between the objects are extracted. These qualitative spatial relations are the input to a grammar which applies saliency rules obtained from the survey study and generates cognitive natural-language descriptions of scenes. Moreover, these qualitative descriptors can be expressed as first-order logical facts in Prolog for further reasoning. Finally, a validation study was carried out to test whether the descriptions provided by the QSn3D approach are human-readable. The results show that their acceptability is higher than 82%.
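As a rough illustration of how qualitative spatial relations might be derived from 3D object positions, the sketch below maps pairs of object centroids to coarse left/right and front/behind labels and prints them as Prolog-style facts. The relation names and coordinate convention are invented for the example; QSn3D's actual descriptors come from its Kinect pipeline and survey-based grammar.

```python
# Sketch of deriving coarse qualitative spatial relations from object
# centroids (relation names and the camera-coordinate convention are
# invented here; QSn3D derives its descriptors from Kinect 3D data).

def qualitative_relation(a, b):
    """Relation of object a with respect to object b, from (x, z)
    centroids: x grows rightward, z grows away from the camera."""
    dx, dz = a[0] - b[0], a[1] - b[1]
    horiz = "left_of" if dx < 0 else "right_of"
    depth = "in_front_of" if dz < 0 else "behind"
    return horiz, depth

table, chair = (0.0, 2.0), (0.6, 2.5)
horiz, depth = qualitative_relation(table, chair)
print(f"{horiz}(table, chair). {depth}(table, chair).")  # Prolog-style facts
```

Facts in this form can be asserted directly into a Prolog knowledge base for the kind of further reasoning the abstract mentions.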
Chromatic information and feature detection in fast visual analysis
Del Viva, Maria M.; Punzi, Giovanni; Shevell, Steven K.; ...
2016-08-01
The visual system is able to recognize a scene based on a sketch made of very simple features. This ability is likely crucial for survival, when fast image recognition is necessary, and it is believed that a primal sketch is extracted very early in visual processing. Such highly simplified representations can be sufficient for accurate object discrimination, but an open question is the role played by color in this process. Rich color information is available in natural scenes, yet artists' sketches are usually monochromatic, and black-and-white movies provide compelling representations of real-world scenes. Also, the contrast sensitivity of color is low at fine spatial scales. We approach the question from the perspective of optimal information processing by a system endowed with limited computational resources. We show that when such limitations are taken into account, the intrinsic statistical properties of natural scenes imply that the most effective strategy is to ignore fine-scale color features and devote most of the bandwidth to gray-scale information. We find confirmation of these information-based predictions in psychophysical measurements of fast-viewing discrimination of natural scenes. As a result, we conclude that the lack of colored features in our visual representation, and our overall low sensitivity to high-frequency color components, are a consequence of an adaptation process, optimizing the size and power consumption of our brain for the visual world we live in.
Spontaneous Action Representation in Smokers when Watching Movie Characters Smoke
Wagner, Dylan D.; Cin, Sonya Dal; Sargent, James D.; Kelley, William M.; Heatherton, Todd F.
2013-01-01
Do smokers simulate smoking when they see someone else smoke? For regular smokers, smoking is such a highly practiced motor skill that it often occurs automatically, without conscious awareness. Research on the brain basis of action observation has delineated a frontoparietal network that is commonly recruited when people observe, plan or imitate actions. Here, we investigated whether this action observation network would be preferentially recruited in smokers when viewing complex smoking cues, such as those occurring in motion pictures. Seventeen right-handed smokers and seventeen non-smokers watched a popular movie while undergoing functional magnetic resonance imaging. Using a natural stimulus, such as a movie, allowed us to keep both smoking and non-smoking participants naïve to the goals of the experiment. Brain activity evoked by scenes of movie smoking was contrasted with non-smoking control scenes matched for frequency and duration. Compared to non-smokers, smokers showed greater activity in the left anterior intraparietal sulcus and inferior frontal gyrus, both regions involved in the simulation of contralateral hand-based gestures, when viewing smoking vs. control scenes. These results demonstrate that smokers spontaneously represent the action of smoking when viewing others smoke, the consequence of which may make it more difficult to abstain from smoking. PMID:21248113
Feature diagnosticity and task context shape activity in human scene-selective cortex.
Lowe, Matthew X; Gallivan, Jason P; Ferber, Susanne; Cant, Jonathan S
2016-01-15
Scenes are constructed from multiple visual features, yet previous research investigating scene processing has often focused on the contributions of single features in isolation. In the real world, features rarely exist independently of one another and likely converge to inform scene identity in unique ways. Here, we utilize fMRI and pattern classification techniques to examine the interactions between task context (i.e., attend to diagnostic global scene features; texture or layout) and high-level scene attributes (content and spatial boundary) to test the novel hypothesis that scene-selective cortex represents multiple visual features, the importance of which varies according to their diagnostic relevance across scene categories and task demands. Our results show for the first time that scene representations are driven by interactions between multiple visual features and high-level scene attributes. Specifically, univariate analysis of scene-selective cortex revealed that task context and feature diagnosticity shape activity differentially across scene categories. Examination using multivariate decoding methods revealed results consistent with univariate findings, but also evidence for an interaction between high-level scene attributes and diagnostic visual features within scene categories. Critically, these findings suggest visual feature representations are not distributed uniformly across scene categories but are shaped by task context and feature diagnosticity. Thus, we propose that scene-selective cortex constructs a flexible representation of the environment by integrating multiple diagnostically relevant visual features, the nature of which varies according to the particular scene being perceived and the goals of the observer. Copyright © 2015 Elsevier Inc. All rights reserved.
Effect of manmade pixels on the inherent dimension of natural material distributions
NASA Astrophysics Data System (ADS)
Schlamm, Ariel; Messinger, David; Basener, William
2009-05-01
The inherent dimension of hyperspectral data may be a useful metric for discriminating between the presence of manmade and natural materials in a scene without relying on spectral signatures taken from libraries. Previously, a simple geometric method for approximating the inherent dimension was introduced, along with results from its application to single-material clusters. This method uses an estimate of the slope from a graph based on point-density estimation in the spectral space. Other information that may aid in the discrimination between manmade and natural materials can also be gathered from the plot. In order to use these measures to differentiate between the two material types, the effect of the inclusion of manmade pixels on the phenomenology of the background distribution must be evaluated. Here, a procedure for injecting manmade pixels into a natural region of a scene is discussed. The results of dimension estimation on natural scenes with varying amounts of manmade pixels injected are presented, indicating that these metrics can be sensitive to the presence of manmade phenomenology in an image.
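The slope-based dimension estimate can be illustrated with a closely related point-density measure, the correlation dimension: count point pairs within a radius and read the dimension off the slope of log pair-count versus log radius. This is a sketch in the spirit of the method, not the paper's exact procedure, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

def correlation_dimension(points, radii):
    """Slope of log pair-count versus log radius: a simple point-density
    estimate of inherent dimension (illustrative, not the paper's method)."""
    diff = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diff, axis=-1)[np.triu_indices(len(points), k=1)]
    counts = np.array([(dists < r).mean() for r in radii])
    slope, _ = np.polyfit(np.log(radii), np.log(counts), 1)
    return slope

# A "natural background" cluster: a flat 2-D sheet sitting in a
# 10-band spectral space (so the inherent dimension is 2, not 10).
sheet = rng.uniform(size=(600, 2))
embed = np.hstack([sheet, np.zeros((600, 8))])
radii = np.logspace(-1.2, -0.4, 8)

dim = correlation_dimension(embed, radii)
print(round(dim, 2))   # close to 2 for the two-dimensional sheet
```

Injecting pixels drawn from a different, higher-dimensional distribution into `embed` would perturb this slope, which is the sensitivity the abstract evaluates.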
Kanda, Hideyuki; Okamura, Tomonori; Turin, Tanvir Chowdhury; Hayakawa, Takehito; Kadowaki, Takashi; Ueshima, Hirotsugu
2006-06-01
Japanese serial television dramas are becoming very popular overseas, particularly in other Asian countries. Exposure to smoking scenes in movies and television dramas has been known to trigger initiation of habitual smoking in young people. Smoking scenes in Japanese dramas may affect the smoking behavior of many young Asians. We examined smoking scenes and smoking-related items in serial television dramas targeting young audiences in Japan during the same season in two consecutive years. Fourteen television dramas targeting the young audience broadcast between July and September in 2001 and 2002 were analyzed. A total of 136 h 42 min of television programs was divided into unit scenes of 3 min (a total of 2734 unit scenes). All the unit scenes were reviewed for smoking scenes and smoking-related items. Of the 2734 3-min unit scenes, 205 (7.5%) were actual smoking scenes and 387 (14.2%) depicted smoking environments with the presence of smoking-related items, such as ash trays. In 185 unit scenes (90.2% of total smoking scenes), actors were shown smoking. Actresses were less frequently shown smoking (9.8% of total smoking scenes). Smoking characters in dramas were in the 20-49 age group in 193 unit scenes (94.1% of total smoking scenes). In 96 unit scenes (46.8% of total smoking scenes), at least one non-smoker was present in the smoking scenes. The smoking locations were mainly indoors, including offices, restaurants and homes (122 unit scenes, 59.6%). The most common smoking-related items shown were ash trays (in 45.5% of smoking-item-related scenes) and cigarettes (in 30.2% of smoking-item-related scenes). Only 3 unit scenes (0.1% of all scenes) promoted smoking prohibition. This was a descriptive study to examine the nature of smoking scenes observed in Japanese television dramas from a public health perspective.
Neuroscience-Enabled Complex Visual Scene Understanding
2012-04-12
some cases, it is hard to precisely say where or what we are looking at since a complex task governs eye fixations, for example in driving. While in...another object (say, a door) can be resolved using the prior information about the scene. This knowledge can be provided from gist models, such as one...separation and combination of class-dependent features for handwriting recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 21, no. 10, pp. 1089
Learning what to expect (in visual perception)
Seriès, Peggy; Seitz, Aaron R.
2013-01-01
Expectations are known to greatly affect our experience of the world. A growing theory in computational neuroscience is that perception can be successfully described using Bayesian inference models and that the brain is “Bayes-optimal” under some constraints. In this context, expectations are particularly interesting, because they can be viewed as prior beliefs in the statistical inference process. A number of questions remain unsolved, however, for example: How fast do priors change over time? Are there limits in the complexity of the priors that can be learned? How do an individual’s priors compare to the true scene statistics? Can we unlearn priors that are thought to correspond to natural scene statistics? Where and what are the neural substrate of priors? Focusing on the perception of visual motion, we here review recent studies from our laboratories and others addressing these issues. We discuss how these data on motion perception fit within the broader literature on perceptual Bayesian priors, perceptual expectations, and statistical and perceptual learning and review the possible neural basis of priors. PMID:24187536
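The idea that expectations act as Bayesian priors can be made concrete with a toy observer for speed perception. The sketch below assumes an exponential "slow speed" prior and models lower stimulus contrast as a broader likelihood; the specific distribution shapes and scales are illustrative, not taken from the reviewed studies.

```python
import numpy as np

# Toy Bayesian observer for speed perception (a sketch, not the reviewed
# models): an exponential "slow speed" prior combined with a Gaussian
# likelihood. Lower stimulus contrast is modeled as a broader likelihood,
# so the prior pulls the estimate toward slower speeds.

speeds = np.linspace(0.0, 10.0, 1001)
prior = np.exp(-speeds / 1.5)              # illustrative prior scale

def perceived_speed(true_speed, likelihood_sd):
    likelihood = np.exp(-0.5 * ((speeds - true_speed) / likelihood_sd) ** 2)
    posterior = prior * likelihood
    posterior /= posterior.sum()
    return (speeds * posterior).sum()      # posterior-mean estimate

high_contrast = perceived_speed(5.0, 0.5)  # narrow likelihood
low_contrast = perceived_speed(5.0, 2.0)   # broad likelihood
print(high_contrast > low_contrast)        # True: slower percept at low contrast
```

The qualitative prediction, a stronger prior-induced bias when sensory evidence is weak, is the kind of behavioral signature used to infer the priors discussed above.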
Cortical systems mediating visual attention to both objects and spatial locations
Shomstein, Sarah; Behrmann, Marlene
2006-01-01
Natural visual scenes consist of many objects occupying a variety of spatial locations. Given that the plethora of information cannot be processed simultaneously, the multiplicity of inputs compete for representation. Using event-related functional MRI, we show that attention, the mechanism by which a subset of the input is selected, is mediated by the posterior parietal cortex (PPC). Of particular interest is that PPC activity is differentially sensitive to the object-based properties of the input, with enhanced activation for those locations bound by an attended object. Of great interest too is the ensuing modulation of activation in early cortical regions, reflected as differences in the temporal profile of the blood oxygenation level-dependent (BOLD) response for within-object versus between-object locations. These findings indicate that object-based selection results from an object-sensitive reorienting signal issued by the PPC. The dynamic circuit between the PPC and earlier sensory regions then enables observers to attend preferentially to objects of interest in complex scenes. PMID:16840559
Making space for criminalistics: Hans Gross and fin-de-siècle CSI
Burney, Ian; Pemberton, Neil
2013-01-01
This article explores the articulation of a novel forensic object—the ‘crime scene’—and its corresponding expert—the investigating officer. Through a detailed engagement with the work of the late nineteenth-century Austrian jurist and criminalist Hans Gross, it analyses the dynamic and reflexive nature of this model of ‘CSI’, emphasising the material, physical, psychological and instrumental means through which the crime scene as a delineated space, and its investigator as a disciplined agent operating within it, jointly came into being. It has a further, historiographic, aim: to move away from the commonplace emphasis in histories of forensics on fin-de-siècle criminology and toward its comparatively under-explored contemporary, criminalistics. In so doing, it opens up new ways of thinking about the crime scene as a defining feature of our present-day forensic culture that recognise its historical contingency and the complex processes at work in its creation and development. PMID:23036861
Linking brain, mind and behavior.
Makeig, Scott; Gramann, Klaus; Jung, Tzyy-Ping; Sejnowski, Terrence J; Poizner, Howard
2009-08-01
Cortical brain areas and dynamics evolved to organize motor behavior in our three-dimensional environment also support more general human cognitive processes. Yet traditional brain imaging paradigms typically allow and record only minimal participant behavior, then reduce the recorded data to single map features of averaged responses. To more fully investigate the complex links between distributed brain dynamics and motivated natural behavior, we propose the development of wearable mobile brain/body imaging (MoBI) systems that continuously capture the wearer's high-density electrical brain and muscle signals, three-dimensional body movements, audiovisual scene and point of regard, plus new data-driven analysis methods to model their interrelationships. The new imaging modality should allow new insights into how spatially distributed brain dynamics support natural human cognition and agency.
Long-Term Memories Bias Sensitivity and Target Selection in Complex Scenes
Patai, Eva Zita; Doallo, Sonia; Nobre, Anna Christina
2014-01-01
In everyday situations we often rely on our memories to find what we are looking for in our cluttered environment. Recently, we developed a new experimental paradigm to investigate how long-term memory (LTM) can guide attention, and showed how the pre-exposure to a complex scene in which a target location had been learned facilitated the detection of the transient appearance of the target at the remembered location (Summerfield, Lepsien, Gitelman, Mesulam, & Nobre, 2006; Summerfield, Rao, Garside, & Nobre, 2011). The present study extends these findings by investigating whether and how LTM can enhance perceptual sensitivity to identify targets occurring within their complex scene context. Behavioral measures showed superior perceptual sensitivity (d′) for targets located in remembered spatial contexts. We used the N2pc event-related potential to test whether LTM modulated the process of selecting the target from its scene context. Surprisingly, in contrast to effects of visual spatial cues or implicit contextual cueing, LTM for target locations significantly attenuated the N2pc potential. We propose that the mechanism by which these explicitly available LTMs facilitate perceptual identification of targets may differ from mechanisms triggered by other types of top-down sources of information. PMID:23016670
Ogourtsova, Tatiana; Archambault, Philippe; Sangani, Samir; Lamontagne, Anouk
2018-01-01
Unilateral spatial neglect (USN) is a highly prevalent and disabling poststroke impairment. USN is traditionally assessed with paper-and-pencil tests that lack ecological validity and generalization to real-life situations, and that are easily compensated for in chronic stages. Virtual reality (VR) can, however, counteract these limitations. We aimed to examine the feasibility of a novel assessment of USN symptoms in a functional shopping activity, the Ecological VR-based Evaluation of Neglect Symptoms (EVENS). EVENS is immersive and consists of simple and complex 3-dimensional scenes depicting grocery shopping shelves, where joystick-based object detection and navigation tasks are performed while seated. Effects of virtual scene complexity on navigational and detection abilities were determined in patients with (USN+, n = 12) and without (USN-, n = 15) USN following a right hemisphere stroke and in age-matched healthy controls (HC, n = 9). Longer detection times, larger mediolateral deviations from ideal paths and longer navigation times were found in the USN+ versus the USN- and HC groups, particularly in the complex scene. EVENS detected lateralized and nonlateralized USN-related deficits, performance alterations that were dependent on or independent of USN severity, and performance alterations in 3 USN- subjects versus HC. EVENS's changing environmental complexity, along with the functional tasks of far-space detection and navigation, can potentially be clinically relevant and warrants further empirical investigation. Findings are discussed in terms of attentional models, lateralized versus nonlateralized deficits in USN, and task-specific mechanisms.
McMahon, David B T; Russ, Brian E; Elnaiem, Heba D; Kurnikova, Anastasia I; Leopold, David A
2015-04-08
Several visual areas within the STS of the macaque brain respond strongly to faces and other biological stimuli. Determining the principles that govern neural responses in this region has proven challenging, due in part to the inherently complex stimulus domain of dynamic biological stimuli that are not captured by an easily parameterized stimulus set. Here we investigated neural responses in one fMRI-defined face patch in the anterior fundus (AF) of the STS while macaques freely view complex videos rich with natural social content. Longitudinal single-unit recordings allowed for the accumulation of each neuron's responses to repeated video presentations across sessions. We found that individual neurons, while diverse in their response patterns, were consistently and deterministically driven by the video content. We used principal component analysis to compute a family of eigenneurons, which summarized 24% of the shared population activity in the first two components. We found that the most prominent component of AF activity reflected an interaction between visible body region and scene layout. Close-up shots of faces elicited the strongest neural responses, whereas far away shots of faces or close-up shots of hindquarters elicited weak or inhibitory responses. Sensitivity to the apparent proximity of faces was also observed in gamma band local field potential. This category-selective sensitivity to spatial scale, together with the known exchange of anatomical projections of this area with regions involved in visuospatial analysis, suggests that the AF face patch may be specialized in aspects of face perception that pertain to the layout of a social scene.
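The eigenneuron computation described above is, at heart, a principal component analysis of the neuron-by-time response matrix. The sketch below runs it on synthetic data in which 40 model neurons share two underlying signals; the fraction of variance in the first two components plays the role of the 24% reported above. All data here are invented.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic population: 40 model neurons, 500 video time points, driven by
# two shared signals plus private noise (all invented for the illustration).
t = np.linspace(0.0, 2.0 * np.pi, 500)
shared = np.stack([np.sin(3 * t), np.cos(5 * t)])       # (2, 500)
weights = rng.normal(size=(40, 2))
responses = weights @ shared + 0.8 * rng.normal(size=(40, 500))

# "Eigenneurons": principal components of the neuron-by-time matrix,
# obtained from the SVD of the row-centered responses.
X = responses - responses.mean(axis=1, keepdims=True)
_, s, _ = np.linalg.svd(X, full_matrices=False)
var_explained = s ** 2 / (s ** 2).sum()
shared_frac = var_explained[:2].sum()
print(round(shared_frac, 2))   # the first two components carry the shared signal
```

Because only two signals are shared across the synthetic population, the leading two components absorb most of the shared variance while the private noise spreads across the remaining components.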
IR characteristic simulation of city scenes based on radiosity model
NASA Astrophysics Data System (ADS)
Xiong, Xixian; Zhou, Fugen; Bai, Xiangzhi; Yu, Xiyu
2013-09-01
Reliable modeling of thermal infrared (IR) signatures of real-world city scenes is required for signature management of civil and military platforms. Traditional modeling methods generally assume that scene objects are individual entities during the physical processes occurring in the infrared range. In reality, however, the physical scene involves convective and conductive interactions between objects, as well as radiative interactions between objects. A method based on a radiosity model describes these complex effects. It has been developed to enable an accurate simulation of the radiance distribution of city scenes. Firstly, the physical processes affecting the IR characteristics of city scenes were described. Secondly, heat balance equations were formed by combining the atmospheric conditions, shadow maps and the geometry of the scene. Finally, a finite difference method was used to calculate the kinetic temperature of object surfaces. A radiosity model was introduced to describe the scattering of radiation between surface elements in the scene. By synthesizing the radiance distribution of objects in the infrared range, we could obtain the IR characteristics of the scene. Real infrared images and model predictions were shown and compared. The results demonstrate that this method can realistically simulate the IR characteristics of city scenes. It effectively displays infrared shadow effects and the radiative interactions between objects in city scenes.
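The kinetic-temperature step can be sketched as an explicit finite-difference update of a surface energy balance (absorbed solar flux, convective exchange with the air, and emitted thermal radiation). All material constants below are illustrative, not values from the paper, and the radiosity coupling between surfaces is omitted.

```python
SIGMA = 5.67e-8   # Stefan-Boltzmann constant, W m^-2 K^-4

def step_temperature(T, solar, T_air, dt, alpha=0.7, eps=0.9, h=10.0, C=2.0e5):
    """One explicit Euler step of a surface energy balance:
    absorbed solar + convective exchange - thermal emission.
    C is an areal heat capacity (J m^-2 K^-1); all constants are toy values."""
    net_flux = alpha * solar + h * (T_air - T) - eps * SIGMA * T ** 4
    return T + dt * net_flux / C

T = 290.0                  # K, initial surface temperature
for _ in range(24 * 60):   # one day of 1-minute steps, constant forcing
    T = step_temperature(T, solar=400.0, T_air=295.0, dt=60.0)
print(round(T, 1))         # settles near its radiative-convective equilibrium
```

In the full model, the net flux would additionally include the radiosity term, radiation scattered from the other surface elements visible to this one, which couples all the surface temperature equations together.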
Weakly Supervised Segmentation-Aided Classification of Urban Scenes from 3d LIDAR Point Clouds
NASA Astrophysics Data System (ADS)
Guinard, S.; Landrieu, L.
2017-05-01
We consider the problem of the semantic classification of 3D LiDAR point clouds obtained from urban scenes when the training set is limited. We propose a non-parametric segmentation model for urban scenes composed of anthropic objects of simple shapes, partitioning the scene into geometrically homogeneous segments whose size is determined by the local complexity. This segmentation can be integrated into a conditional random field (CRF) classifier in order to capture the high-level structure of the scene. For each cluster, this allows us to aggregate the noisy predictions of a weakly supervised classifier to produce a higher-confidence data term. We demonstrate the improvement provided by our method on two publicly available large-scale data sets.
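The aggregation step can be sketched on synthetic data: within each segment, the noisy per-point labels of a weak classifier are pooled by majority vote, which sharply raises accuracy when segments are, as here by construction, homogeneous in class. The segment layout and classifier error rate are invented; the actual paper folds the pooled term into a CRF rather than voting directly.

```python
import numpy as np

rng = np.random.default_rng(2)

# Three homogeneous segments, 200 points each (synthetic stand-ins for
# geometrically-homogeneous clusters of an urban point cloud).
true_class = np.repeat([0, 1, 2], 200)
segment_id = np.repeat([0, 1, 2], 200)

# Weak classifier: correct with probability 0.6, otherwise a random
# wrong class (error rate and layout are invented for the demo).
wrong = (true_class + rng.integers(1, 3, size=600)) % 3
noisy = np.where(rng.uniform(size=600) < 0.6, true_class, wrong)

# Aggregation: majority vote of the noisy labels within each segment.
pooled = noisy.copy()
for s in np.unique(segment_id):
    mask = segment_id == s
    pooled[mask] = np.bincount(noisy[mask]).argmax()

noisy_acc = (noisy == true_class).mean()
pooled_acc = (pooled == true_class).mean()
print(noisy_acc < pooled_acc)   # True: pooling yields a higher-confidence term
```

With 200 points per segment and a 60%-correct classifier, the majority label is virtually certain to be the true one, which is why segment-level pooling produces the "higher confidence data term" the abstract describes.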
Higher-order scene statistics of breast images
NASA Astrophysics Data System (ADS)
Abbey, Craig K.; Sohl-Dickstein, Jascha N.; Olshausen, Bruno A.; Eckstein, Miguel P.; Boone, John M.
2009-02-01
Researchers studying human and computer vision have found description and construction of these systems greatly aided by analysis of the statistical properties of naturally occurring scenes. More specifically, it has been found that receptive fields with directional selectivity and bandwidth properties similar to mammalian visual systems are more closely matched to the statistics of natural scenes. It is argued that this allows for sparse representation of the independent components of natural images [Olshausen and Field, Nature, 1996]. These theories have important implications for medical image perception. For example, will a system that is designed to represent the independent components of natural scenes, where objects occlude one another and illumination is typically reflected, be appropriate for X-ray imaging, where features superimpose on one another and illumination is transmissive? In this research we begin to examine these issues by evaluating higher-order statistical properties of breast images from X-ray projection mammography (PM) and dedicated breast computed tomography (bCT). We evaluate kurtosis in responses of octave bandwidth Gabor filters applied to PM and to coronal slices of bCT scans. We find that kurtosis in PM rises and quickly saturates for filter center frequencies with an average value above 0.95. By contrast, kurtosis in bCT peaks near 0.20 cyc/mm with kurtosis of approximately 2. Our findings suggest that the human visual system may be tuned to represent breast tissue more effectively in bCT over a specific range of spatial frequencies.
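The filter statistic used above is the (excess) kurtosis of Gabor-filter responses. The sketch below computes it for a toy "edge" image versus Gaussian noise: sparse, edge-driven responses are heavy-tailed and yield higher kurtosis. The filter parameters and images are illustrative, not those of the study.

```python
import numpy as np

rng = np.random.default_rng(1)

def gabor(size=21, freq=0.2, sigma=4.0):
    """An odd-phase, vertically oriented Gabor kernel (generic construction;
    parameters are illustrative, not those of the study)."""
    ax = np.arange(size) - size // 2
    x, y = np.meshgrid(ax, ax)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return envelope * np.sin(2 * np.pi * freq * x)

def excess_kurtosis(v):
    v = v - v.mean()
    return (v ** 4).mean() / (v ** 2).mean() ** 2 - 3.0

def filter_image(img, kernel):
    # circular convolution via FFT -- adequate for a statistics demo
    return np.real(np.fft.ifft2(np.fft.fft2(img) *
                                np.fft.fft2(kernel, s=img.shape)))

edges = np.zeros((128, 128))
edges[:, 64:] = 1.0                      # one vertical luminance edge
noise = rng.normal(size=(128, 128))      # Gaussian "non-sparse" control

k = gabor()
k_edge = excess_kurtosis(filter_image(edges, k).ravel())
k_noise = excess_kurtosis(filter_image(noise, k).ravel())
print(k_edge > k_noise)                  # True: edge responses are heavy-tailed
```

High kurtosis signals a sparse response distribution (mostly near-zero values with rare large ones), which is the property the breast-image analysis above measures across filter center frequencies.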
Dynamic Target Acquisition: Empirical Models of Operator Performance.
1980-08-01
[Table excerpts, 30,000 ft initial slant range: Signature × Scene Complexity means for active-target FLIR of 22794 (low), 20162 (medium) and 20449 (high), with inactive-target values truncated; ordered means for the Signature × Scene Complexity interaction span 14867-22794, and for the Signature × Speed interaction span 13429-24491.]
Efficient summary statistical representation when change localization fails.
Haberman, Jason; Whitney, David
2011-10-01
People are sensitive to the summary statistics of the visual world (e.g., average orientation/speed/facial expression). We readily derive this information from complex scenes, often without explicit awareness. Given the fundamental and ubiquitous nature of summary statistical representation, we tested whether this kind of information is subject to the attentional constraints imposed by change blindness. We show that information regarding the summary statistics of a scene is available despite limited conscious access. In a novel experiment, we found that while observers can suffer from change blindness (i.e., not localize where change occurred between two views of the same scene), observers could nevertheless accurately report changes in the summary statistics (or "gist") about the very same scene. In the experiment, observers saw two successively presented sets of 16 faces that varied in expression. Four of the faces in the first set changed from one emotional extreme (e.g., happy) to another (e.g., sad) in the second set. Observers performed poorly when asked to locate any of the faces that changed (change blindness). However, when asked about the ensemble (which set was happier, on average), observer performance remained high. Observers were sensitive to the average expression even when they failed to localize any specific object change. That is, even when observers could not locate the very faces driving the change in average expression between the two sets, they nonetheless derived a precise ensemble representation. Thus, the visual system may be optimized to process summary statistics in an efficient manner, allowing it to operate despite minimal conscious access to the information presented.
A review of visual perception mechanisms that regulate rapid adaptive camouflage in cuttlefish.
Chiao, Chuan-Chin; Chubb, Charles; Hanlon, Roger T
2015-09-01
We review recent research on the visual mechanisms of rapid adaptive camouflage in cuttlefish. These neurophysiologically complex marine invertebrates can camouflage themselves against almost any background, yet their ability to quickly (0.5-2 s) alter their body patterns on different visual backgrounds poses a vexing challenge: how to pick the correct body pattern amongst their repertoire. The ability of cuttlefish to change appropriately requires a visual system that can rapidly assess complex visual scenes and produce the motor responses-the neurally controlled body patterns-that achieve camouflage. Using specifically designed visual backgrounds and assessing the corresponding body patterns quantitatively, we and others have uncovered several aspects of scene variation that are important in regulating cuttlefish patterning responses. These include spatial scale of background pattern, background intensity, background contrast, object edge properties, object contrast polarity, object depth, and the presence of 3D objects. Moreover, arm postures and skin papillae are also regulated visually for additional aspects of concealment. By integrating these visual cues, cuttlefish are able to rapidly select appropriate body patterns for concealment throughout diverse natural environments. This sensorimotor approach of studying cuttlefish camouflage thus provides unique insights into the mechanisms of visual perception in an invertebrate image-forming eye.
Effects of aging on neural connectivity underlying selective memory for emotional scenes
Waring, Jill D.; Addis, Donna Rose; Kensinger, Elizabeth A.
2012-01-01
Older adults show age-related reductions in memory for neutral items within complex visual scenes, but just like young adults, older adults exhibit a memory advantage for emotional items within scenes compared with the background scene information. The present study examined young and older adults’ encoding-stage effective connectivity for selective memory of emotional items versus memory for both the emotional item and its background. In a functional magnetic resonance imaging (fMRI) study, participants viewed scenes containing either positive or negative items within neutral backgrounds. Outside the scanner, participants completed a memory test for items and backgrounds. Irrespective of scene content being emotionally positive or negative, older adults had stronger positive connections among frontal regions and from frontal regions to medial temporal lobe structures than did young adults, especially when items and backgrounds were subsequently remembered. These results suggest there are differences between young and older adults’ connectivity accompanying the encoding of emotional scenes. Older adults may require more frontal connectivity to encode all elements of a scene rather than just encoding the emotional item. PMID:22542836
NASA Astrophysics Data System (ADS)
Sun, Z.; Xu, Y.; Hoegner, L.; Stilla, U.
2018-05-01
In this work, we propose a classification method designed for the labeling of MLS point clouds, with detrended geometric features extracted from the points of the supervoxel-based local context. To achieve the analysis of complex 3D urban scenes, acquired points of the scene should be tagged with individual labels of different classes. Thus, assigning a unique label to the points of an object belonging to the same category plays an essential role in the entire 3D scene analysis workflow. Although many studies in this field have been reported, the task remains challenging. Specifically, in this work: 1) A novel geometric feature extraction method, which detrends the redundant and non-salient information in the local context, is proposed and proves effective for extracting local geometric features from the 3D scene. 2) Instead of using individual points as basic elements, the supervoxel-based local context is designed to encapsulate the geometric characteristics of points, providing a flexible and robust solution for feature extraction. 3) Experiments on a complex urban scene with manually labeled ground truth are conducted, and the performance of the proposed method is compared against different methods. On the test dataset, we obtain an overall accuracy of 0.92 when assigning eight semantic classes.
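Covariance-eigenvalue shape features are a common way to characterize the local geometry of a point-cloud neighborhood such as a supervoxel. The sketch below is illustrative only: the function name and the exact feature set are assumptions, and mean-centering merely plays the role of removing the local trend; it is not the authors' "detrended" feature implementation.

```python
import numpy as np

def local_eigen_features(points):
    """Eigenvalue-based shape features of a local 3D neighborhood:
    linearity, planarity, and sphericity, derived from the covariance of
    the mean-centered points. Hypothetical sketch, not the paper's exact
    detrended features. points: (N, 3) array of coordinates."""
    c = points - points.mean(axis=0)
    lam = np.sort(np.linalg.eigvalsh(c.T @ c / len(points)))[::-1]
    lam = np.maximum(lam, 1e-12)          # guard against degenerate zeros
    linearity = (lam[0] - lam[1]) / lam[0]
    planarity = (lam[1] - lam[2]) / lam[0]
    sphericity = lam[2] / lam[0]
    return linearity, planarity, sphericity
```

A pole-like object yields high linearity, a facade or road patch yields high planarity, and vegetation yields higher sphericity; such per-supervoxel features would then feed a point-cloud classifier.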
Adaptation of facial synthesis to parameter analysis in MPEG-4 visual communication
NASA Astrophysics Data System (ADS)
Yu, Lu; Zhang, Jingyu; Liu, Yunhai
2000-12-01
In MPEG-4, Facial Definition Parameters (FDPs) and Facial Animation Parameters (FAPs) are defined to animate a facial object. Most previous facial animation reconstruction systems focused on synthesizing animation from manually or automatically generated FAPs, but not from FAPs extracted from natural video scenes. In this paper, an analysis-synthesis MPEG-4 visual communication system is established, in which facial animation is reconstructed from FAPs extracted from natural video scenes.
A distributed code for color in natural scenes derived from center-surround filtered cone signals
Kellner, Christian J.; Wachtler, Thomas
2013-01-01
In the retina of trichromatic primates, chromatic information is encoded in an opponent fashion and transmitted to the lateral geniculate nucleus (LGN) and visual cortex via parallel pathways. Chromatic selectivities of neurons in the LGN form two separate clusters, corresponding to two classes of cone opponency. In the visual cortex, however, the chromatic selectivities are more distributed, which is in accordance with a population code for color. Previous studies of cone signals in natural scenes typically found opponent codes with chromatic selectivities corresponding to two directions in color space. Here we investigated how the non-linear spatio-chromatic filtering in the retina influences the encoding of color signals. Cone signals were derived from hyper-spectral images of natural scenes and preprocessed by center-surround filtering and rectification, resulting in parallel ON and OFF channels. Independent Component Analysis (ICA) on these signals yielded a highly sparse code with basis functions that showed spatio-chromatic selectivities. In contrast to previous analyses of linear transformations of cone signals, chromatic selectivities were not restricted to two main chromatic axes, but were more continuously distributed in color space, similar to the population code of color in the early visual cortex. Our results indicate that spatio-chromatic processing in the retina leads to a more distributed and more efficient code for natural scenes. PMID:24098289
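The preprocessing pipeline described above can be sketched in numpy: center-surround filtering (crudely approximated here by subtracting the local mean), half-wave rectification into parallel ON and OFF channels, and ICA on the result. The data, the filter approximation, and the compact FastICA implementation are illustrative assumptions, not the authors' code.

```python
import numpy as np

def fastica(X, n_components, n_iter=200, seed=0):
    """Minimal symmetric FastICA with a tanh nonlinearity.
    X: (samples, features). Returns the unmixing matrix in input space."""
    X = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)        # whitening
    K = (Vt[:n_components] / s[:n_components, None]) * np.sqrt(len(X))
    Z = X @ K.T                                             # whitened data
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n_components, n_components))
    for _ in range(n_iter):
        G = np.tanh(Z @ W.T)
        W = (G.T @ Z) / len(Z) - np.diag((1 - G**2).mean(axis=0)) @ W
        u, _, vt = np.linalg.svd(W)                         # symmetric
        W = u @ vt                                          # decorrelation
    return W @ K

# Hypothetical 'cone signal' patches: rows are image patches.
rng = np.random.default_rng(1)
patches = rng.standard_normal((500, 48))

# Center-surround filtering, approximated by subtracting the patch mean
# (the 'surround') from each sample (the 'center').
centered = patches - patches.mean(axis=1, keepdims=True)

# Half-wave rectification into parallel ON and OFF channels.
signals = np.hstack([np.maximum(centered, 0), np.maximum(-centered, 0)])

unmixing = fastica(signals, n_components=8)
```

The rows of `unmixing` (or the columns of its pseudoinverse) play the role of the spatio-chromatic basis functions analyzed in the study.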
Significance of Crime Scene Visit by Forensic Pathologist in Cases of Atypical Firearm Injuries.
Thejaswi, H T; Kumar, A; Krishna, K
2015-01-01
Deaths due to firearms are some of the most interesting and contentious cases that a forensic pathologist/autopsy surgeon encounters in practice. Whenever there is ambiguity regarding the nature or sequence of events in any unnatural death, including those caused by firearms, the practice of visiting the crime scene should be encouraged, especially in a country like India, where autopsy surgeons often neglect it. Here we present a case report in which the autopsy findings were inconsistent with the alleged history. Witnesses heard about four to six gunshot sounds, whereas only two spent cartridge cases were retrieved from the crime scene. The authors identified the atypical nature of the firearm injuries sustained by the victims, which could have been produced by just two bullets. A crime scene visit was undertaken, during which we discovered that an echo effect could account for the four to six sounds heard. Further, using a computer software program, the positions of the gunman and the victims and the trajectories of the two bullets were reconstructed.
Advanced Weapon System (AWS) Sensor Prediction Techniques Study. Volume II
1981-09-01
models are suggested. Courant Computer Science Report #9, December 1975. Scene Analysis: A Survey. Carl Weiman, Courant Institute of...some crucial differences. In the psychological model of mechanical vision, the aim of scene analysis is to perceive and understand 2-D images of 3-D scenes. The meaning of this analogy can be clarified using a rudimentary informational model; this yields a natural hierarchy from physical
Brady, Timothy F; Oliva, Aude
2008-07-01
Recent work has shown that observers can parse streams of syllables, tones, or visual shapes and learn statistical regularities in them without conscious intent (e.g., learn that A is always followed by B). Here, we demonstrate that these statistical-learning mechanisms can operate at an abstract, conceptual level. In Experiments 1 and 2, observers incidentally learned which semantic categories of natural scenes covaried (e.g., kitchen scenes were always followed by forest scenes). In Experiments 3 and 4, category learning with images of scenes transferred to words that represented the categories. In each experiment, the category of the scenes was irrelevant to the task. Together, these results suggest that statistical-learning mechanisms can operate at a categorical level, enabling generalization of learned regularities using existing conceptual knowledge. Such mechanisms may guide learning in domains as disparate as the acquisition of causal knowledge and the development of cognitive maps from environmental exploration.
The occipital place area represents the local elements of scenes
Kamps, Frederik S.; Julian, Joshua B.; Kubilius, Jonas; Kanwisher, Nancy; Dilks, Daniel D.
2016-01-01
Neuroimaging studies have identified three scene-selective regions in human cortex: parahippocampal place area (PPA), retrosplenial complex (RSC), and occipital place area (OPA). However, precisely what scene information each region represents is not clear, especially for the least studied, more posterior OPA. Here we hypothesized that OPA represents local elements of scenes within two independent, yet complementary scene descriptors: spatial boundary (i.e., the layout of external surfaces) and scene content (e.g., internal objects). If OPA processes the local elements of spatial boundary information, then it should respond to these local elements (e.g., walls) themselves, regardless of their spatial arrangement. Indeed, we found that OPA, but not PPA or RSC, responded similarly to images of intact rooms and these same rooms in which the surfaces were fractured and rearranged, disrupting the spatial boundary. Next, if OPA represents the local elements of scene content information, then it should respond more when more such local elements (e.g., furniture) are present. Indeed, we found that OPA, but not PPA or RSC, responded more to multiple than single pieces of furniture. Taken together, these findings reveal that OPA analyzes local scene elements – both in spatial boundary and scene content representation – while PPA and RSC represent global scene properties. PMID:26931815
Neural correlates of contextual cueing are modulated by explicit learning.
Westerberg, Carmen E; Miller, Brennan B; Reber, Paul J; Cohen, Neal J; Paller, Ken A
2011-10-01
Contextual cueing refers to the facilitated ability to locate a particular visual element in a scene due to prior exposure to the same scene. This facilitation is thought to reflect implicit learning, as it typically occurs without the observer's knowledge that scenes repeat. Unlike most other implicit learning effects, contextual cueing can be impaired following damage to the medial temporal lobe. Here we investigated neural correlates of contextual cueing and explicit scene memory in two participant groups. Only one group was explicitly instructed about scene repetition. Participants viewed a sequence of complex scenes that depicted a landscape with five abstract geometric objects. Superimposed on each object was a letter T or L rotated left or right by 90°. Participants responded according to the target letter (T) orientation. Responses were highly accurate for all scenes. Response speeds were faster for repeated versus novel scenes. The magnitude of this contextual cueing did not differ between the two groups. Also, in both groups repeated scenes yielded reduced hemodynamic activation compared with novel scenes in several regions involved in visual perception and attention, and reductions in some of these areas were correlated with response-time facilitation. In the group given instructions about scene repetition, recognition memory for scenes was superior and was accompanied by medial temporal and more anterior activation. Thus, strategic factors can promote explicit memorization of visual scene information, which appears to engage additional neural processing beyond what is required for implicit learning of object configurations and target locations in a scene. Copyright © 2011 Elsevier Ltd. All rights reserved.
Interaction between scene-based and array-based contextual cueing.
Rosenbaum, Gail M; Jiang, Yuhong V
2013-07-01
Contextual cueing refers to the cueing of spatial attention by repeated spatial context. Previous studies have demonstrated distinctive properties of contextual cueing by background scenes and by an array of search items. Whereas scene-based contextual cueing reflects explicit learning of the scene-target association, array-based contextual cueing is supported primarily by implicit learning. In this study, we investigated the interaction between scene-based and array-based contextual cueing. Participants searched for a target that was predicted by both the background scene and the locations of distractor items. We tested three possible patterns of interaction: (1) The scene and the array could be learned independently, in which case cueing should be expressed even when only one cue was preserved; (2) the scene and array could be learned jointly, in which case cueing should occur only when both cues were preserved; (3) overshadowing might occur, in which case learning of the stronger cue should preclude learning of the weaker cue. In several experiments, we manipulated the nature of the contextual cues present during training and testing. We also tested explicit awareness of scenes, scene-target associations, and arrays. The results supported the overshadowing account: Specifically, scene-based contextual cueing precluded array-based contextual cueing when both were predictive of the location of a search target. We suggest that explicit, endogenous cues dominate over implicit cues in guiding spatial attention.
NASA Astrophysics Data System (ADS)
Assadi, Amir H.
2001-11-01
Perceptual geometry is an emerging field of interdisciplinary research whose objectives focus on the study of geometry from the perspective of visual perception and, in turn, the application of such geometric findings to the ecological study of vision. Perceptual geometry attempts to answer fundamental questions in perception of form and representation of space through synthesis of cognitive and biological theories of visual perception with geometric theories of the physical world. Perception of form and space is among the fundamental problems in vision science. In recent cognitive and computational models of human perception, natural scenes are used systematically as preferred visual stimuli. Among the key problems in perception of form and space, we have examined the perceived geometry of natural surfaces and curves, e.g., those in the observer's environment. Besides a systematic mathematical foundation for a remarkably general framework, the advantages of the Gestalt theory of natural surfaces include a concrete computational approach to simulate or recreate images whose geometric invariants and quantities might be perceived and estimated by an observer. The latter is at the very foundation of understanding the nature of perception of space and form, and the (computer graphics) problem of rendering scenes to visually invoke virtual presence.
Estimating 3D tilt from local image cues in natural scenes
Burge, Johannes; McCann, Brian C.; Geisler, Wilson S.
2016-01-01
Estimating three-dimensional (3D) surface orientation (slant and tilt) is an important first step toward estimating 3D shape. Here, we examine how three local image cues from the same location (disparity gradient, luminance gradient, and dominant texture orientation) should be combined to estimate 3D tilt in natural scenes. We collected a database of natural stereoscopic images with precisely co-registered range images that provide the ground-truth distance at each pixel location. We then analyzed the relationship between ground-truth tilt and image cue values. Our analysis is free of assumptions about the joint probability distributions and yields the Bayes optimal estimates of tilt, given the cue values. Rich results emerge: (a) typical tilt estimates are only moderately accurate and strongly influenced by the cardinal bias in the prior probability distribution; (b) when cue values are similar, or when slant is greater than 40°, estimates are substantially more accurate; (c) when luminance and texture cues agree, they often veto the disparity cue, and when they disagree, they have little effect; and (d) simplifying assumptions common in the cue combination literature are often justified for estimating tilt in natural scenes. The fact that tilt estimates are typically not very accurate is consistent with subjective impressions from viewing small patches of natural scenes. The fact that estimates are substantially more accurate for a subset of image locations is also consistent with subjective impressions and with the hypothesis that perceived surface orientation, at more global scales, is achieved by interpolation or extrapolation from estimates at key locations. PMID:27738702
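An assumption-free estimator of this kind can be sketched as a nonparametric conditional mean: bin the cue values and take the circular mean of the ground-truth tilts in each bin (tilt has a 180° period). This one-cue illustration is a simplification: the study conditions on three cues jointly, and the function name and binning scheme are assumptions.

```python
import numpy as np

def conditional_mean_tilt(cue, tilt_deg, n_bins=16):
    """Per-bin tilt estimate: the circular mean of the ground-truth tilts
    observed for similar cue values. Because tilt wraps with a 180-degree
    period, doubled-angle unit vectors are averaged rather than raw
    degrees."""
    edges = np.linspace(cue.min(), cue.max(), n_bins + 1)
    idx = np.clip(np.digitize(cue, edges) - 1, 0, n_bins - 1)
    est = np.full(n_bins, np.nan)
    for b in range(n_bins):
        t = np.deg2rad(2.0 * tilt_deg[idx == b])
        if t.size:
            est[b] = np.rad2deg(np.arctan2(np.sin(t).mean(),
                                           np.cos(t).mean())) / 2.0
    return edges, est
```

With ground-truth tilt available (as in the co-registered range data), the table `est` is the Bayes estimate of tilt given the binned cue value, with no distributional assumptions.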
Variability of eye movements when viewing dynamic natural scenes.
Dorr, Michael; Martinetz, Thomas; Gegenfurtner, Karl R; Barth, Erhardt
2010-08-26
How similar are the eye movement patterns of different subjects when free viewing dynamic natural scenes? We collected a large database of eye movements from 54 subjects on 18 high-resolution videos of outdoor scenes and measured their variability using the Normalized Scanpath Saliency, which we extended to the temporal domain. Even though up to about 80% of subjects looked at the same image region in some video parts, variability usually was much greater. Eye movements on natural movies were then compared with eye movements in several control conditions. "Stop-motion" movies had almost identical semantic content as the original videos but lacked continuous motion. Hollywood action movie trailers were used to probe the upper limit of eye movement coherence that can be achieved by deliberate camera work, scene cuts, etc. In a "repetitive" condition, subjects viewed the same movies ten times each over the course of 2 days. Results show several systematic differences between conditions both for general eye movement parameters such as saccade amplitude and fixation duration and for eye movement variability. Most importantly, eye movements on static images are initially driven by stimulus onset effects and later, more so than on continuous videos, by subject-specific idiosyncrasies; eye movements on Hollywood movies are significantly more coherent than those on natural movies. We conclude that the stimuli types often used in laboratory experiments, static images and professionally cut material, are not very representative of natural viewing behavior. All stimuli and gaze data are publicly available at http://www.inb.uni-luebeck.de/tools-demos/gaze.
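The static (per-frame) form of the Normalized Scanpath Saliency can be sketched as below; the paper's temporal extension aggregates such values over time. The function name and the (row, col) coordinate convention are assumptions.

```python
import numpy as np

def nss(saliency_map, fixations):
    """Normalized Scanpath Saliency: z-score the map, then average the
    z-scored values at the fixated pixel locations. Values above zero
    mean fixations land on above-average regions of the map.
    fixations: iterable of (row, col) coordinates."""
    z = (np.asarray(saliency_map, float) - np.mean(saliency_map))
    z = z / np.std(saliency_map)
    return float(np.mean([z[r, c] for r, c in fixations]))
```

To measure inter-subject variability, each subject's fixations are scored against a "saliency map" built from the other subjects' fixations; high NSS means the subjects looked at the same places.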
Hobbs, Jennifer A; Towal, R Blythe; Hartmann, Mitra J Z
2015-08-01
Analysis of natural scene statistics has been a powerful approach for understanding neural coding in the auditory and visual systems. In the field of somatosensation, it has been more challenging to quantify the natural tactile scene, in part because somatosensory signals are so tightly linked to the animal's movements. The present work takes a step towards quantifying the natural tactile scene for the rat vibrissal system by simulating rat whisking motions to systematically investigate the probabilities of whisker-object contact in naturalistic environments. The simulations permit an exhaustive search through the complete space of possible contact patterns, thereby allowing for the characterization of the patterns that would most likely occur during long sequences of natural exploratory behavior. We specifically quantified the probabilities of 'concomitant contact', that is, given that a particular whisker makes contact with a surface during a whisk, what is the probability that each of the other whiskers will also make contact with the surface during that whisk? Probabilities of concomitant contact were quantified in simulations that assumed increasingly naturalistic conditions: first, the space of all possible head poses; second, the space of behaviorally preferred head poses as measured experimentally; and third, common head poses in environments such as cages and burrows. As environments became more naturalistic, the probability distributions shifted from exhibiting a 'row-wise' structure to a more diagonal structure. Results also reveal that the rat appears to use motor strategies (e.g. head pitches) that generate contact patterns that are particularly well suited to extract information in the presence of uncertainty. © 2015. Published by The Company of Biologists Ltd.
Psychophysiological responses and restorative values of wilderness environments
Chun-Yen Chang; Ping-Kun Chen; William E. Hammitt; Lisa Machnik
2007-01-01
Scenes of natural areas were used as stimuli to analyze the psychological and physiological responses of subjects while viewing wildland scenes. Attention Restoration Theory (Kaplan 1995) and theorized components of restorative environments were used as an orientation for selection of the visual stimuli. Conducted in Taiwan, the studies recorded the psychophysiological...
Disbergen, Niels R.; Valente, Giancarlo; Formisano, Elia; Zatorre, Robert J.
2018-01-01
Polyphonic music listening well exemplifies processes typically involved in daily auditory scene analysis situations, relying on an interactive interplay between bottom-up and top-down processes. Most studies investigating scene analysis have used elementary auditory scenes; however, real-world scene analysis is far more complex. In particular, music, contrary to most other natural auditory scenes, can be perceived by either integrating or, under attentive control, segregating sound streams, often carried by different instruments. One of the prominent bottom-up cues contributing to multi-instrument music perception is their timbre difference. In this work, we introduce and validate a novel paradigm designed to investigate, within naturalistic musical auditory scenes, attentive modulation as well as its interaction with bottom-up processes. Two psychophysical experiments are described, employing custom-composed two-voice polyphonic music pieces within a framework implementing a behavioral performance metric to validate listener instructions requiring either integration or segregation of scene elements. In Experiment 1, the listeners' locus of attention was switched between individual instruments or the aggregate (i.e., both instruments together), via a task requiring the detection of temporal modulations (i.e., triplets) incorporated within or across instruments. Subjects responded post-stimulus whether triplets were present in the to-be-attended instrument(s). Experiment 2 introduced the bottom-up manipulation by adding a three-level morphing of instrument timbre distance to the attentional framework. The task was designed to be used within neuroimaging paradigms; Experiment 2 was additionally validated behaviorally in the functional Magnetic Resonance Imaging (fMRI) environment. Experiment 1 subjects (N = 29, non-musicians) completed the task at high levels of accuracy, showing no group differences between any experimental conditions. 
Nineteen listeners also participated in Experiment 2, showing a main effect of instrument timbre distance, even though within attention-condition timbre-distance contrasts did not demonstrate any timbre effect. Correlation of overall scores with morph-distance effects, computed by subtracting the largest from the smallest timbre-distance scores, showed an influence of general task difficulty on the timbre-distance effect. Comparison of laboratory and fMRI data showed that scanner noise had no adverse effect on task performance. These experimental paradigms enable the study of both bottom-up and top-down contributions to auditory stream segregation and integration in psychophysical and neuroimaging experiments. PMID:29563861
Interactive physically-based sound simulation
NASA Astrophysics Data System (ADS)
Raghuvanshi, Nikunj
The realization of interactive, immersive virtual worlds requires the ability to present a realistic audio experience that convincingly complements their visual rendering. Physical simulation is a natural way to achieve such realism, enabling deeply immersive virtual worlds. However, physically-based sound simulation is very computationally expensive owing to the high-frequency, transient oscillations underlying audible sounds. The increasing computational power of desktop computers has served to reduce the gap between required and available computation, and it has become possible to bridge this gap further by using a combination of algorithmic improvements that exploit the physical, as well as perceptual properties of audible sounds. My thesis is a step in this direction. My dissertation concentrates on developing real-time techniques for both sub-problems of sound simulation: synthesis and propagation. Sound synthesis is concerned with generating the sounds produced by objects due to elastic surface vibrations upon interaction with the environment, such as collisions. I present novel techniques that exploit human auditory perception to simulate scenes with hundreds of sounding objects undergoing impact and rolling in real time. Sound propagation is the complementary problem of modeling the high-order scattering and diffraction of sound in an environment as it travels from source to listener. I discuss my work on a novel numerical acoustic simulator (ARD) that is hundred times faster and consumes ten times less memory than a high-accuracy finite-difference technique, allowing acoustic simulations on previously-intractable spaces, such as a cathedral, on a desktop computer. 
Lastly, I present my work on interactive sound propagation that leverages my ARD simulator to render the acoustics of arbitrary static scenes for multiple moving sources and a moving listener in real time, while accounting for scene-dependent effects such as low-pass filtering and smooth attenuation behind obstructions, reverberation, scattering from complex geometry and sound focusing. This is enabled by a novel compact representation that takes a thousand times less memory than a direct scheme, thus reducing memory footprints to fit within available main memory. To the best of my knowledge, this is the only technique and system in existence to demonstrate auralization of physical wave-based effects in real-time on large, complex 3D scenes.
Sofer, Imri; Crouzet, Sébastien M.; Serre, Thomas
2015-01-01
Observers can rapidly perform a variety of visual tasks such as categorizing a scene as open, as outdoor, or as a beach. Although we know that different tasks are typically associated with systematic differences in behavioral responses, to date, little is known about the underlying mechanisms. Here, we implemented a single integrated paradigm that links perceptual processes with categorization processes. Using a large image database of natural scenes, we trained machine-learning classifiers to derive quantitative measures of task-specific perceptual discriminability based on the distance between individual images and different categorization boundaries. We showed that the resulting discriminability measure accurately predicts variations in behavioral responses across categorization tasks and stimulus sets. We further used the model to design an experiment, which challenged previous interpretations of the so-called “superordinate advantage.” Overall, our study suggests that observed differences in behavioral responses across rapid categorization tasks reflect natural variations in perceptual discriminability. PMID:26335683
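The discriminability measure can be sketched as the signed distance of each image's feature vector to a learned category boundary. Here a linear boundary is fit by least squares as a stand-in for the study's trained classifiers; the function name, fitting method, and toy data are assumptions.

```python
import numpy as np

def boundary_distance(X, y):
    """Signed Euclidean distance of each sample to a linear category
    boundary fit by least squares on labels y in {-1, +1}. The magnitude
    |distance| serves as a per-image discriminability proxy: images far
    from the boundary should be categorized faster and more accurately."""
    A = np.hstack([X, np.ones((len(X), 1))])   # append intercept column
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return (A @ w) / np.linalg.norm(w[:-1])
```

Ranking images by this distance for each task (open vs. closed, outdoor vs. indoor, beach vs. non-beach) yields the task-specific discriminability values that the study relates to behavioral responses.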
Dima, Diana C; Perry, Gavin; Singh, Krish D
2018-06-11
In navigating our environment, we rapidly process and extract meaning from visual cues. However, the relationship between visual features and categorical representations in natural scene perception is still not well understood. Here, we used natural scene stimuli from different categories and filtered at different spatial frequencies to address this question in a passive viewing paradigm. Using representational similarity analysis (RSA) and cross-decoding of magnetoencephalography (MEG) data, we show that categorical representations emerge in human visual cortex at ∼180 ms and are linked to spatial frequency processing. Furthermore, dorsal and ventral stream areas reveal temporally and spatially overlapping representations of low and high-level layer activations extracted from a feedforward neural network. Our results suggest that neural patterns from extrastriate visual cortex switch from low-level to categorical representations within 200 ms, highlighting the rapid cascade of processing stages essential in human visual perception. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Gaussian Radial Basis Function for Efficient Computation of Forest Indirect Illumination
NASA Astrophysics Data System (ADS)
Abbas, Fayçal; Babahenini, Mohamed Chaouki
2018-06-01
Global illumination of natural scenes such as forests in real time is one of the most complex problems to solve, because of the multiple inter-reflections between the light and the materials of the objects composing the scene. The major difficulty that arises is visibility computation: visibility must be computed for the whole set of leaves visible from the center of a given leaf, and given the enormous number of leaves present in a tree, repeating this computation for every leaf severely reduces performance. We describe a new approach that approximates visibility queries and proceeds in two steps. The first step generates a point cloud representing the foliage; we assume the point cloud is composed of two classes (visible, not visible) that are not linearly separable. The second step classifies the point cloud by applying the Gaussian radial basis function, which measures similarity in terms of distance between each leaf and a landmark leaf. This approximates the visibility queries, extracting the leaves that will be used to calculate the amount of indirect illumination exchanged between neighboring leaves. Our approach treats light exchange in a forest scene efficiently, computes quickly, and produces images of good visual quality, all while taking advantage of the immense computational power of the GPU.
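The second step, similarity-based selection with a Gaussian RBF, can be sketched as follows (the leaf positions, sigma, and threshold are illustrative assumptions; in the paper the classification would be driven by actual visibility data):

```python
import numpy as np

def rbf_similarity(leaves, landmark, sigma=1.0):
    # Gaussian radial basis function: similarity in terms of distance
    # between each leaf position and a landmark leaf.
    d2 = np.sum((leaves - landmark) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

rng = np.random.default_rng(1)
leaves = rng.uniform(0, 10, size=(500, 3))   # toy point cloud for the foliage
landmark = np.array([5.0, 5.0, 5.0])         # landmark leaf

sim = rbf_similarity(leaves, landmark, sigma=2.0)
# Leaves above a similarity threshold are treated as the neighbors whose
# indirect-illumination exchange is computed; the rest are culled.
visible = leaves[sim > 0.5]
```

The culling is what makes the approach fast: the expensive per-pair light exchange is only evaluated for the small retained subset.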
Using Bayesian neural networks to classify forest scenes
NASA Astrophysics Data System (ADS)
Vehtari, Aki; Heikkonen, Jukka; Lampinen, Jouko; Juujarvi, Jouni
1998-10-01
We present results that compare the performance of Bayesian learning methods for neural networks on the task of classifying forest scenes into trees and background. The classification task is demanding due to the texture richness of the trees, occlusions of the forest scene objects, and the diverse lighting conditions in operation. This makes it difficult to determine which image features are optimal for the classification. A natural way to proceed is to extract many different types of potentially suitable features and to evaluate their usefulness in later processing stages. One approach to coping with a large number of features is to use Bayesian methods to control the model complexity. Bayesian learning places a prior on the model parameters, combines it with evidence from the training data, and then integrates over the resulting posterior to make predictions. With this method, we can use large networks and many features without fear of overfitting. For this classification task we compare two Bayesian learning methods for multi-layer perceptron (MLP) neural networks: (1) the evidence framework of MacKay, which uses a Gaussian approximation to the posterior weight distribution and maximizes the evidence with respect to the hyperparameters; and (2) a Markov chain Monte Carlo (MCMC) method due to Neal, in which the posterior distribution of the network parameters is numerically integrated using MCMC sampling. As baseline classifiers for comparison we use (3) an MLP early-stopping committee, (4) K-nearest-neighbor, and (5) Classification And Regression Trees.
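The key Bayesian idea above, integrating predictions over the posterior rather than using a single weight estimate, can be illustrated with a toy logistic model standing in for the MLP (the posterior samples below are synthetic Gaussian draws, not real MCMC output):

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical posterior samples of a logistic classifier's weights
# (stand-ins for MCMC draws over MLP parameters).
w_samples = rng.normal(loc=[2.0, -1.0], scale=0.5, size=(1000, 2))
x = np.array([0.8, 0.3])  # one feature vector (e.g. texture features)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Bayesian prediction: average the prediction over posterior samples,
# rather than plugging in a single point-estimate weight vector.
p_bayes = sigmoid(w_samples @ x).mean()
p_point = sigmoid(w_samples.mean(axis=0) @ x)
```

Averaging over the posterior softens predictions in proportion to parameter uncertainty, which is why large networks can be used without overfitting.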
2012-01-01
Background In research on event-related potentials (ERP) to emotional pictures, greater attention to emotional than neutral stimuli (i.e., motivated attention) is commonly indexed by two difference waves between emotional and neutral stimuli: the early posterior negativity (EPN) and the late positive potential (LPP). Evidence suggests that if attention is directed away from the pictures, then the emotional effects on EPN and LPP are eliminated. However, a few studies have found residual, emotional effects on EPN and LPP. In these studies, pictures were shown at fixation, and picture composition was that of simple figures rather than that of complex scenes. Because figures elicit larger LPP than do scenes, figures might capture and hold attention more strongly than do scenes. Here, we showed negative and neutral pictures of figures and scenes and tested first, whether emotional effects are larger to figures than scenes for both EPN and LPP, and second, whether emotional effects on EPN and LPP are reduced less for unattended figures than scenes. Results Emotional effects on EPN and LPP were larger for figures than scenes. When pictures were unattended, emotional effects on EPN increased for scenes but tended to decrease for figures, whereas emotional effects on LPP decreased similarly for figures and scenes. Conclusions Emotional effects on EPN and LPP were larger for figures than scenes, but these effects did not resist manipulations of attention more strongly for figures than scenes. These findings imply that the emotional content captures attention more strongly for figures than scenes, but that the emotional content does not hold attention more strongly for figures than scenes. PMID:22607397
Change deafness for real spatialized environmental scenes.
Gaston, Jeremy; Dickerson, Kelly; Hipp, Daniel; Gerhardstein, Peter
2017-01-01
The everyday auditory environment is complex and dynamic; often, multiple sounds co-occur and compete for a listener's cognitive resources. 'Change deafness', framed as the auditory analog to the well-documented phenomenon of 'change blindness', describes the finding that changes presented within complex environments are often missed. The present study examines a number of stimulus factors that may influence change deafness under real-world listening conditions. Specifically, an AX (same-different) discrimination task was used to examine the effects of both spatial separation over a loudspeaker array and the type of change (sound source additions and removals) on discrimination of changes embedded in complex backgrounds. Results using signal detection theory and accuracy analyses indicated that, under most conditions, errors were significantly reduced for spatially distributed relative to non-spatial scenes. A second goal of the present study was to evaluate a possible link between memory for scene contents and change discrimination. Memory was evaluated by presenting a cued recall test following each trial of the discrimination task. Results using signal detection theory and accuracy analyses indicated that recall ability was similar in terms of accuracy, but there were reductions in sensitivity compared to previous reports. Finally, the present study used a large and representative sample of outdoor, urban, and environmental sounds, presented in unique combinations of nearly 1000 trials per participant. This enabled the exploration of the relationship between change perception and the perceptual similarity between change targets and background scene sounds. These (post hoc) analyses suggest both a categorical and a stimulus-level relationship between scene similarity and the magnitude of change errors.
Scene-Aware Adaptive Updating for Visual Tracking via Correlation Filters
Zhang, Sirou; Qiao, Xiaoya
2017-01-01
In recent years, visual object tracking has been widely used in military guidance, human-computer interaction, road traffic, scene monitoring, and many other fields. Tracking algorithms based on correlation filters have shown good performance in terms of accuracy and tracking speed. However, their performance is not satisfactory in scenes with scale variation, deformation, and occlusion. In this paper, we propose a scene-aware adaptive updating mechanism for visual tracking via a kernel correlation filter (KCF). First, a low-complexity scale estimation method is presented, in which weighted responses at five candidate scales determine the final target scale. Then, an adaptive updating mechanism is presented based on scene classification: we classify video scenes into four categories by video content analysis. According to the target's scene, we exploit the adaptive updating mechanism to update the kernel correlation filter, improving the robustness of the tracker, especially in scenes with scale variation, deformation, and occlusion. We evaluate our tracker on the CVPR2013 benchmark. The experimental results obtained with the proposed algorithm are improved by 33.3%, 15%, 6%, 21.9% and 19.8% compared to those of the KCF tracker on scenes with scale variation, partial or long-time large-area occlusion, deformation, fast motion, and out-of-view, respectively. PMID:29140311
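The two mechanisms above can be sketched in a few lines; note that the scale factors, weights, scene-class names, and learning rates below are illustrative guesses, not the paper's tuned values:

```python
import numpy as np

# (1) Scale estimation: pick the best of five candidate scales, with each
# scale's peak filter response multiplied by a fixed weight that biases
# the choice toward "no scale change".
SCALES = np.array([0.90, 0.95, 1.00, 1.05, 1.10])
SCALE_WEIGHTS = np.array([0.95, 0.98, 1.00, 0.98, 0.95])

def estimate_scale(responses):
    """responses: peak correlation-filter response at each candidate scale."""
    return SCALES[np.argmax(responses * SCALE_WEIGHTS)]

# (2) Scene-aware adaptive update: a per-scene-class learning rate for the
# linear-interpolation model update used by KCF-style trackers.
RATES = {"stable": 0.02, "scale_variation": 0.015,
         "deformation": 0.01, "occlusion": 0.0}   # freeze under occlusion

def update_model(model, new_observation, scene_class):
    lr = RATES[scene_class]
    return (1.0 - lr) * model + lr * new_observation
```

Freezing (or slowing) the update under occlusion prevents the filter from learning the occluder's appearance, which is the main failure mode of a fixed learning rate.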
Real-time scene and signature generation for ladar and imaging sensors
NASA Astrophysics Data System (ADS)
Swierkowski, Leszek; Christie, Chad L.; Antanovskii, Leonid; Gouthas, Efthimios
2014-05-01
This paper describes development of two key functionalities within the VIRSuite scene simulation program, broadening its scene generation capabilities and increasing accuracy of thermal signatures. Firstly, a new LADAR scene generation module has been designed. It is capable of simulating range imagery for Geiger mode LADAR, in addition to the already existing functionality for linear mode systems. Furthermore, a new 3D heat diffusion solver has been developed within the VIRSuite signature prediction module. It is capable of calculating the temperature distribution in complex three-dimensional objects for enhanced dynamic prediction of thermal signatures. With these enhancements, VIRSuite is now a robust tool for conducting dynamic simulation for missiles with multi-mode seekers.
Research on the generation of the background with sea and sky in infrared scene
NASA Astrophysics Data System (ADS)
Dong, Yan-zhi; Han, Yan-li; Lou, Shu-li
2008-03-01
It is important for scene generation to preserve the texture of infrared images when simulating anti-ship infrared imaging guidance. We studied the fractal method and applied it to infrared scene generation. We adopted the horizontal-vertical (HV) partition method to encode the original image. Based on the properties of infrared images with sea-sky backgrounds, we used a Local Iterated Function System (LIFS) to decrease the computational complexity and increase the processing rate. Representative results are presented; they show that the fractal method preserves the texture of infrared images well and can be widely used in infrared scene generation.
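Fractal synthesis of this kind is commonly illustrated with midpoint displacement, which preserves texture statistics across scales; a one-dimensional sketch follows (this is a generic fractal generator, not the paper's HV-partition/LIFS encoder):

```python
import numpy as np

def midpoint_displacement(n_iters, roughness=0.6, seed=0):
    # 1-D fractal profile: repeatedly insert midpoints and perturb them
    # by a noise amplitude that shrinks each iteration, so small-scale
    # detail statistically resembles large-scale structure.
    rng = np.random.default_rng(seed)
    pts = np.array([0.0, 0.0])   # fixed endpoints
    amp = 1.0
    for _ in range(n_iters):
        mids = (pts[:-1] + pts[1:]) / 2 + rng.normal(0, amp, len(pts) - 1)
        out = np.empty(len(pts) + len(mids))
        out[0::2], out[1::2] = pts, mids   # interleave old points and midpoints
        pts, amp = out, amp * roughness
    return pts

profile = midpoint_displacement(8)   # 2**8 + 1 = 257 samples
```

A 2-D variant of the same idea (e.g. diamond-square) would produce a texture field suitable as a synthetic sea or sky background layer.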
Kumar, Manoj; Federmeier, Kara D; Fei-Fei, Li; Beck, Diane M
2017-07-15
A long-standing core question in cognitive science is whether different modalities and representation types (pictures, words, sounds, etc.) access a common store of semantic information. Although different input types have been shown to activate a shared network of brain regions, this does not necessitate that there is a common representation, as the neurons in these regions could still differentially process the different modalities. However, multi-voxel pattern analysis can be used to assess whether, e.g., pictures and words evoke a similar pattern of activity, such that the patterns that separate categories in one modality transfer to the other. Prior work using this method has found support for a common code, but has two limitations: studies have either examined only disparate categories (e.g., animals vs. tools) that are known to activate different brain regions, raising the possibility that the pattern separation and inferred similarity reflect only large-scale differences between the categories, or they have been limited to individual object representations. By using natural scene categories, we not only extend the current literature on cross-modal representations beyond objects but also, because natural scene categories activate a common set of brain regions, identify a more fine-grained (i.e., higher spatial resolution) common representation. Specifically, we studied picture- and word-based representations of natural scene stimuli from four different categories: beaches, cities, highways, and mountains. Participants passively viewed blocks of either phrases (e.g., "sandy beach") describing scenes or photographs from those same scene categories. To determine whether the phrases and pictures evoke a common code, we asked whether a classifier trained on one stimulus type (e.g., phrase stimuli) would transfer (i.e., cross-decode) to the other stimulus type (e.g., picture stimuli).
The analysis revealed cross-decoding in the occipitotemporal, posterior parietal and frontal cortices. This similarity of neural activity patterns across the two input types, for categories that co-activate local brain regions, provides strong evidence of a common semantic code for pictures and words in the brain. Copyright © 2017 Elsevier Inc. All rights reserved.
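The cross-decoding logic above, train a classifier on one modality's voxel patterns and test it on the other's, can be sketched with synthetic data (the "voxel patterns" below are simulated; real analyses would use fMRI beta patterns per block):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
# Synthetic voxel patterns for 4 scene categories; pictures and phrases
# share a category-specific component plus modality-independent noise.
n_per, n_vox, n_cat = 30, 50, 4
category_codes = rng.normal(0, 1, (n_cat, n_vox))  # shared semantic code

def make_patterns(noise):
    X = np.vstack([category_codes[c] + rng.normal(0, noise, (n_per, n_vox))
                   for c in range(n_cat)])
    y = np.repeat(np.arange(n_cat), n_per)
    return X, y

X_pic, y_pic = make_patterns(noise=1.0)   # picture blocks
X_phr, y_phr = make_patterns(noise=1.0)   # phrase blocks

# Cross-decoding: train on pictures, test on phrases (chance = 0.25).
clf = LogisticRegression(max_iter=1000).fit(X_pic, y_pic)
cross_acc = clf.score(X_phr, y_phr)
```

Above-chance cross-decoding is the signature of a common code: the category boundaries learned from one input type carry over to the other.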
Wu, Chia-Chien; Wang, Hsueh-Cheng; Pomplun, Marc
2014-12-01
A previous study (Vision Research 51 (2011) 1192-1205) found evidence for semantic guidance of visual attention during the inspection of real-world scenes, i.e., an influence of semantic relationships among scene objects on overt shifts of attention. In particular, the results revealed an observer bias toward gaze transitions between semantically similar objects. However, this effect is not necessarily indicative of semantic processing of individual objects but may be mediated by knowledge of the scene gist, which does not require object recognition, or by known spatial dependency among objects. To examine the mechanisms underlying semantic guidance, in the present study, participants were asked to view a series of displays with the scene gist excluded and spatial dependency varied. Our results show that spatial dependency among objects seems to be sufficient to induce semantic guidance. Scene gist, on the other hand, does not seem to affect how observers use semantic information to guide attention while viewing natural scenes. Extracting semantic information mainly based on spatial dependency may be an efficient strategy of the visual system that only adds little cognitive load to the viewing task. Copyright © 2014 Elsevier Ltd. All rights reserved.
Measurements of scene spectral radiance variability
NASA Astrophysics Data System (ADS)
Seeley, Juliette A.; Wack, Edward C.; Mooney, Daniel L.; Muldoon, Michael; Shey, Shen; Upham, Carolyn A.; Harvey, John M.; Czerwinski, Richard N.; Jordan, Michael P.; Vallières, Alexandre; Chamberland, Martin
2006-05-01
Detection performance of LWIR passive standoff chemical agent sensors is strongly influenced by various scene parameters, such as atmospheric conditions, temperature contrast, concentration-path length product (CL), agent absorption coefficient, and scene spectral variability. Although temperature contrast, CL, and agent absorption coefficient affect the detected signal in a predictable manner, fluctuations in background scene spectral radiance have less intuitive consequences. The spectral nature of the scene is not problematic in and of itself; instead it is spatial and temporal fluctuations in the scene spectral radiance that cannot be entirely corrected for with data processing. In addition, the consequence of such variability is a function of the spectral signature of the agent that is being detected and is thus different for each agent. To bracket the performance of background-limited (low sensor NEDN), passive standoff chemical sensors in the range of relevant conditions, assessment of real scene data is necessary [1]. Currently, such data is not widely available [2]. To begin to span the range of relevant scene conditions, we have acquired high fidelity scene spectral radiance measurements with a Telops FTIR imaging spectrometer [3]. We have acquired data in a variety of indoor and outdoor locations at different times of day and year. Some locations include indoor office environments, airports, urban and suburban scenes, waterways, and forest. We report agent-dependent clutter measurements for three of these backgrounds.
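One standard way scene spectral clutter enters detection performance is through a whitened matched-filter statistic, in which the background's spectral covariance determines how well an agent signature separates from clutter. A simplified sketch with synthetic spectra (not Telops data, and not necessarily the authors' detection algorithm):

```python
import numpy as np

def clutter_matched_filter(scene, agent_sig):
    # Classic detection statistic s^T C^-1 (x - mean): score each pixel
    # spectrum against the agent signature, whitened by the scene's
    # spectral clutter covariance C.
    mean = scene.mean(axis=0)
    resid = scene - mean
    C = resid.T @ resid / len(scene) + 1e-6 * np.eye(scene.shape[1])
    w = np.linalg.solve(C, agent_sig)
    return resid @ w

rng = np.random.default_rng(4)
n_pix, n_bands = 2000, 32
agent = np.zeros(n_bands)
agent[10:14] = 1.0                       # toy absorption band
scene = rng.normal(0, 1, (n_pix, n_bands))
scene[:50] += 0.8 * agent                # 50 "plume" pixels
scores = clutter_matched_filter(scene, agent)
```

The more the clutter covariance overlaps the agent's spectral signature, the weaker the whitened response, which is why the measured agent-dependent clutter statistics matter.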
Positive emotion can protect against source memory impairment.
MacKenzie, Graham; Powell, Tim F; Donaldson, David I
2015-01-01
Despite widespread belief that memory is enhanced by emotion, evidence also suggests that emotion can impair memory. Here we test predictions inspired by object-based binding theory, which states that memory enhancement or impairment depends on the nature of the information to be retrieved. We investigated emotional memory in the context of source retrieval, using images of scenes that were negative, neutral or positive in valence. At study each scene was paired with a colour and during retrieval participants reported the source colour for recognised scenes. Critically, we isolated effects of valence by equating stimulus arousal across conditions. In Experiment 1 colour borders surrounded scenes at study: memory impairment was found for both negative and positive scenes. Experiment 2 used colours superimposed over scenes at study: valence affected source retrieval, with memory impairment for negative scenes only. These findings challenge current theories of emotional memory by showing that emotion can impair memory for both intrinsic and extrinsic source information, even when arousal is equated between emotional and neutral stimuli, and by dissociating the effects of positive and negative emotion on episodic memory retrieval.
NASA Astrophysics Data System (ADS)
Kypraios, Ioannis; Young, Rupert C. D.; Chatwin, Chris R.; Birch, Phil M.
2009-04-01
The window unit in the design of the complex logarithmic r-θ mapping for the hybrid optical neural network filter allows multiple objects of the same class to be detected within the input image. Additionally, the architecture of the neural network unit of the complex logarithmic r-θ mapping for the hybrid optical neural network filter becomes attractive for accommodating the recognition of multiple objects of different classes within the input image by modifying the output layer of the unit. We test the overall filter for recognition of multiple objects of the same and of different classes within cluttered input images and video sequences of cluttered scenes. The logarithmic r-θ mapping for the hybrid optical neural network filter is shown to exhibit, with a single pass over the input data, simultaneous in-plane rotation, out-of-plane rotation, scale, log r-θ map translation, and shift invariance, together with good clutter tolerance, correctly recognizing the different objects within the cluttered scenes. We also record additional information extracted from the cluttered scenes about the objects' relative position, scale, and in-plane rotation.
Fractal active contour model for segmenting the boundary of man-made target in nature scenes
NASA Astrophysics Data System (ADS)
Li, Min; Tang, Yandong; Wang, Lidi; Shi, Zelin
2006-02-01
In this paper, a novel geometric active contour model based on a fractal dimension feature is presented for extracting the boundary of man-made targets in natural scenes. In order to suppress natural clutter, an adaptive weighting function is defined using the fractal dimension feature. This weighting function is then introduced into the geodesic active contour model to detect the boundary of the man-made target. A curve driven by the proposed model can evolve gradually from its initial position to the target boundary without being disturbed by natural clutter, even if the initial curve is far from the true boundary. Experimental results validate the effectiveness and feasibility of our model.
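The fractal dimension feature driving the weighting function can be estimated by box counting; a minimal sketch on binary masks (the paper's exact estimator and weighting form are not specified here):

```python
import numpy as np

def box_counting_dimension(mask, sizes=(2, 4, 8, 16)):
    # Estimate fractal dimension of a binary mask by box counting:
    # count occupied s-by-s boxes at each scale, then fit the slope of
    # log(count) versus log(1/size).
    counts = []
    for s in sizes:
        h, w = mask.shape
        grid = mask[:h - h % s, :w - w % s].reshape(h // s, s, w // s, s)
        counts.append(np.count_nonzero(grid.any(axis=(1, 3))))
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope

# A straight edge should score near dimension 1; a filled region near 2.
line = np.zeros((64, 64), dtype=bool); line[32, :] = True
square = np.ones((64, 64), dtype=bool)
d_line = box_counting_dimension(line)
d_square = box_counting_dimension(square)
```

Man-made boundaries tend toward low (near-integer) dimension while natural clutter such as foliage scores higher, which is what lets a dimension-based weight suppress the clutter term in the contour evolution.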
Changing scenes: memory for naturalistic events following change blindness.
Mäntylä, Timo; Sundström, Anna
2004-11-01
Research on scene perception indicates that viewers often fail to detect large changes to scene regions when these changes occur during a visual disruption such as a saccade or a movie cut. In two experiments, we examined whether this relative inability to detect changes would produce systematic biases in event memory. In Experiment 1, participants decided whether two successively presented images were the same or different, followed by a memory task, in which they recalled the content of the viewed scene. In Experiment 2, participants viewed a short video, in which an actor carried out a series of daily activities, and central scenes' attributes were changed during a movie cut. A high degree of change blindness was observed in both experiments, and these effects were related to scene complexity (Experiment 1) and level of retrieval support (Experiment 2). Most important, participants reported the changed, rather than the initial, event attributes following a failure in change detection. These findings suggest that attentional limitations during encoding contribute to biases in episodic memory.
The Nature of Change Detection and Online Representations of Scenes
ERIC Educational Resources Information Center
Ryan, Jennifer D.; Cohen, Neal J.
2004-01-01
This article provides evidence for implicit change detection and for the contribution of multiple memory sources to online representations. Multiple eye-movement measures distinguished original from changed scenes, even when college students had no conscious awareness for the change. Patients with amnesia showed a systematic deficit on 1 class of…
Hippocampal gamma-band Synchrony and pupillary responses index memory during visual search.
Montefusco-Siegmund, Rodrigo; Leonard, Timothy K; Hoffman, Kari L
2017-04-01
Memory for scenes is supported by the hippocampus, among other interconnected structures, but the neural mechanisms related to this process are not well understood. To assess the role of the hippocampus in memory-guided scene search, we recorded local field potentials and multiunit activity from the hippocampus of macaques as they performed goal-directed search tasks using natural scenes. We additionally measured pupil size during scene presentation, which in humans is modulated by recognition memory. We found that both pupil dilation and search efficiency accompanied scene repetition, thereby indicating memory for scenes. Neural correlates included a brief increase in hippocampal multiunit activity and a sustained synchronization of unit activity to gamma band oscillations (50-70 Hz). The repetition effects on hippocampal gamma synchronization occurred when pupils were most dilated, suggesting an interaction between aroused, attentive processing and hippocampal correlates of recognition memory. These results suggest that the hippocampus may support memory-guided visual search through enhanced local gamma synchrony. © 2016 Wiley Periodicals, Inc.
Smith, Tim J; Mital, Parag K
2013-07-17
Does viewing task influence gaze during dynamic scene viewing? Research into the factors influencing gaze allocation during free viewing of dynamic scenes has reported that the gaze of multiple viewers clusters around points of high motion (attentional synchrony), suggesting that gaze may be primarily under exogenous control. However, the influence of viewing task on gaze behavior in static scenes and during real-world interaction has been widely demonstrated. To dissociate exogenous from endogenous factors during dynamic scene viewing we tracked participants' eye movements while they (a) freely watched unedited videos of real-world scenes (free viewing) or (b) quickly identified where the video was filmed (spot-the-location). Static scenes were also presented as controls for scene dynamics. Free viewing of dynamic scenes showed greater attentional synchrony, longer fixations, and more gaze to people and areas of high flicker compared with static scenes. These differences were minimized by the viewing task. In comparison with the free viewing of dynamic scenes, during the spot-the-location task fixation durations were shorter, saccade amplitudes were longer, and gaze exhibited less attentional synchrony and was biased away from areas of flicker and people. These results suggest that the viewing task can have a significant influence on gaze during a dynamic scene but that endogenous control is slow to kick in as initial saccades default toward the screen center, areas of high motion and people before shifting to task-relevant features. This default-like viewing behavior returns after the viewing task is completed, confirming that gaze behavior is more predictable during free viewing of dynamic than static scenes but that this may be due to natural correlation between regions of interest (e.g., people) and motion.
Scanning silence: mental imagery of complex sounds.
Bunzeck, Nico; Wuestenberg, Torsten; Lutz, Kai; Heinze, Hans-Jochen; Jancke, Lutz
2005-07-15
In this functional magnetic resonance imaging (fMRI) study, we investigated the neural basis of mental auditory imagery of familiar complex sounds that did not contain language or music. In the first condition (perception), the subjects watched familiar scenes and listened to the corresponding sounds that were presented simultaneously. In the second condition (imagery), the same scenes were presented silently and the subjects had to mentally imagine the appropriate sounds. During the third condition (control), the participants watched a scrambled version of the scenes without sound. To overcome the disadvantages of the stray acoustic scanner noise in auditory fMRI experiments, we applied sparse temporal sampling technique with five functional clusters that were acquired at the end of each movie presentation. Compared to the control condition, we found bilateral activations in the primary and secondary auditory cortices (including Heschl's gyrus and planum temporale) during perception of complex sounds. In contrast, the imagery condition elicited bilateral hemodynamic responses only in the secondary auditory cortex (including the planum temporale). No significant activity was observed in the primary auditory cortex. The results show that imagery and perception of complex sounds that do not contain language or music rely on overlapping neural correlates of the secondary but not primary auditory cortex.
Eye movements, visual search and scene memory, in an immersive virtual environment.
Kit, Dmitry; Katz, Leor; Sullivan, Brian; Snyder, Kat; Ballard, Dana; Hayhoe, Mary
2014-01-01
Visual memory has been demonstrated to play a role in both visual search and attentional prioritization in natural scenes. However, it has been studied predominantly in experimental paradigms using multiple two-dimensional images. Natural experience, however, entails prolonged immersion in a limited number of three-dimensional environments. The goal of the present experiment was to recreate circumstances comparable to natural visual experience in order to evaluate the role of scene memory in guiding eye movements in a natural environment. Subjects performed a continuous visual-search task within an immersive virtual-reality environment over three days. We found that, similar to two-dimensional contexts, viewers rapidly learn the location of objects in the environment over time, and use spatial memory to guide search. Incidental fixations did not provide obvious benefit to subsequent search, suggesting that semantic contextual cues may often be just as efficient, or that many incidentally fixated items are not held in memory in the absence of a specific task. On the third day of the experience in the environment, previous search items changed in color. These items were fixated upon with increased probability relative to control objects, suggesting that memory-guided prioritization (or Surprise) may be a robust mechanism for attracting gaze to novel features of natural environments, in addition to task factors and simple spatial saliency.
NASA Astrophysics Data System (ADS)
Qi, K.; Qingfeng, G.
2017-12-01
With the widespread use of high-resolution satellite (HRS) images, more and more research effort has been placed on land-use scene classification. However, the task is difficult with HRS images because of their complex backgrounds and multiple land-cover classes or objects. This article presents a multiscale deeply described correlaton model for land-use scene classification. Specifically, a convolutional neural network is introduced to learn and characterize the local features at different scales. The learnt multiscale deep features are then used to generate visual words. The spatial arrangement of visual words is captured through adaptive vector-quantized correlograms at different scales. Experiments on two publicly available land-use scene datasets demonstrate that the proposed model is compact and yet discriminative for efficient representation of land-use scene images, and achieves competitive classification results against state-of-the-art methods.
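The correlogram idea, encoding how pairs of visual words co-occur at given spatial offsets, can be sketched as follows (random word labels stand in for quantized multiscale CNN features, and the adaptive quantization step is omitted):

```python
import numpy as np

def word_correlogram(word_map, n_words, distance):
    # For a grid of visual-word labels, count how often each word pair
    # co-occurs at a given horizontal/vertical offset, then normalize.
    # This captures the spatial arrangement of words, not just their
    # frequencies.
    C = np.zeros((n_words, n_words))
    h, w = word_map.shape
    for dy, dx in [(0, distance), (distance, 0)]:
        a = word_map[:h - dy, :w - dx]
        b = word_map[dy:, dx:]
        np.add.at(C, (a.ravel(), b.ravel()), 1)
    return C / C.sum()

rng = np.random.default_rng(5)
# Stand-in for a map of quantized multiscale CNN features (visual words).
words = rng.integers(0, 16, size=(32, 32))
corr1 = word_correlogram(words, 16, distance=1)
corr4 = word_correlogram(words, 16, distance=4)
feature = np.concatenate([corr1.ravel(), corr4.ravel()])  # scene descriptor
```

Concatenating correlograms at several distances (and feature scales) yields the kind of compact, arrangement-aware descriptor the model feeds to a classifier.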
Assessment of synthetic image fidelity
NASA Astrophysics Data System (ADS)
Mitchell, Kevin D.; Moorhead, Ian R.; Gilmore, Marilyn A.; Watson, Graham H.; Thomson, Mitch; Yates, T.; Troscianko, Tomasz; Tolhurst, David J.
2000-07-01
Computer generated imagery is increasingly used for a wide variety of purposes ranging from computer games to flight simulators to camouflage and sensor assessment. The fidelity required for this imagery is dependent on the anticipated use; for example, when used for camouflage design it must be physically correct both spectrally and spatially. The rendering techniques used will also depend upon the waveband being simulated, the spatial resolution of the sensor, and the required frame rate. Rendering of natural outdoor scenes is particularly demanding, because of the statistical variation in materials and illumination, atmospheric effects, and the complex geometric structures of objects such as trees. The accuracy of simulated imagery has tended to be assessed subjectively in the past. First and second order statistics do not capture many of the essential characteristics of natural scenes, and direct pixel comparison would impose an unachievable demand on the synthetic imagery. For many applications, such as camouflage design, it is important that any metrics used work in both visible and infrared wavebands. We are investigating a variety of different methods of comparing real and synthetic imagery, and of comparing synthetic imagery rendered to different levels of fidelity. These techniques include independent component analysis, higher order statistics, and models of human contrast perception. This paper presents an overview of the analyses we have carried out and some initial results, along with some preliminary conclusions regarding the fidelity of synthetic imagery.
Forced to remember: when memory is biased by salient information.
Santangelo, Valerio
2015-04-15
The last decades have seen rapid growth in attempts to understand the key factors involved in the internal memory representation of the external world. Visual salience has been found to provide a major contribution in predicting the probability that an item/object embedded in a complex setting (i.e., a natural scene) will be encoded and then remembered later on. Here I review the existing literature highlighting the impact of perceptual salience (based on low-level sensory features) and semantic salience (based on high-level knowledge) on short-term memory representation, along with the neural mechanisms underpinning the interplay between these factors. The available evidence reveals that both perceptual and semantic factors affect attention selection mechanisms during the encoding of natural scenes. By biasing internal memory representation, both perceptual and semantic factors increase the probability of remembering high-saliency items to the detriment of low-saliency ones. The evidence also highlights an interplay between these factors, with a reduced impact of perceptual salience on memory representation as the availability of semantically salient information increases. The neural mechanisms underpinning this interplay involve the activation of different portions of the frontoparietal attention control network: ventral regions support the assignment of selection/encoding priorities based on high-level semantics, while the involvement of dorsal regions reflects priority assignment based on low-level sensory features. Copyright © 2015 Elsevier B.V. All rights reserved.
Bölte, Jens; Hofmann, Reinhild; Meier, Claudine C.; Dobel, Christian
2018-01-01
At the interface between scene perception and speech production, we investigated how rapidly action scenes can activate semantic and lexical information. Experiment 1 examined how complex action-scene primes, presented for 150 ms, 100 ms, or 50 ms and subsequently masked, influenced the speed with which immediately following action-picture targets are named. Prime and target actions were either identical, showed the same action with different actors and environments, or were unrelated. Relative to unrelated primes, identical and same-action primes facilitated naming the target action, even when presented for 50 ms. In Experiment 2, neutral primes assessed the direction of effects. Identical and same-action scenes induced facilitation but unrelated actions induced interference. In Experiment 3, written verbs were used as targets for naming, preceded by action primes. When target verbs denoted the prime action, clear facilitation was obtained. In contrast, interference was observed when target verbs were phonologically similar, but otherwise unrelated, to the names of prime actions. This is clear evidence for word-form activation by masked action scenes. Masked action pictures thus provide conceptual information that is detailed enough to facilitate apprehension and naming of immediately following scenes. Masked actions even activate their word-form information–as is evident when targets are words. We thus show how language production can be primed with briefly flashed masked action scenes, in answer to long-standing questions in scene processing. PMID:29652939
Using articulated scene models for dynamic 3d scene analysis in vista spaces
NASA Astrophysics Data System (ADS)
Beuter, Niklas; Swadzba, Agnes; Kummert, Franz; Wachsmuth, Sven
2010-09-01
In this paper we describe an efficient but detailed new approach to analyze complex dynamic scenes directly in 3D. The arising information is important for mobile robots to solve tasks in the area of household robotics. In our work a mobile robot builds an articulated scene model by observing the environment in the visual field, or rather in the so-called vista space. The articulated scene model consists of essential knowledge about the static background, about autonomously moving entities like humans or robots and finally, in contrast to existing approaches, information about articulated parts. These parts describe movable objects like chairs, doors or other tangible entities, which could be moved by an agent. The combination of the static scene, the self-moving entities and the movable objects in one articulated scene model enhances the calculation of each single part. The reconstruction process for parts of the static scene benefits from removal of the dynamic parts and, in turn, the moving parts can be extracted more easily through the knowledge about the background. In our experiments we show that the system simultaneously delivers an accurate static background model, moving persons and movable objects. This information of the articulated scene model enables a mobile robot to detect and keep track of interaction partners, to navigate safely through the environment and finally, to strengthen the interaction with the user through the knowledge about the 3D articulated objects and 3D scene analysis.
NASA Astrophysics Data System (ADS)
Tsifouti, A.; Triantaphillidou, S.; Larabi, M. C.; Doré, G.; Bilissi, E.; Psarrou, A.
2015-01-01
In this investigation we study the effects of compression and frame rate reduction on the performance of four video analytics (VA) systems utilizing a low complexity scenario, such as the Sterile Zone (SZ). Additionally, we identify the most influential scene parameters affecting the performance of these systems. The SZ scenario is a scene consisting of a fence, not to be trespassed, and an area with grass. The VA system needs to alarm when there is an intruder (attack) entering the scene. The work includes testing of the systems with uncompressed and compressed (using H.264/MPEG-4 AVC at 25 and 5 frames per second) footage, consisting of quantified scene parameters. The scene parameters include descriptions of scene contrast, camera to subject distance, and attack portrayal. Additional footage, including only distractions (no attacks) is also investigated. Results have shown that every system has performed differently for each compression/frame rate level, whilst overall, compression has not adversely affected the performance of the systems. Frame rate reduction has decreased performance and scene parameters have influenced the behavior of the systems differently. Most false alarms were triggered with a distraction clip, including abrupt shadows through the fence. Findings could contribute to the improvement of VA systems.
The singular nature of auditory and visual scene analysis in autism
Lin, I.-Fan; Shirama, Aya; Kato, Nobumasa
2017-01-01
Individuals with autism spectrum disorder often have difficulty acquiring relevant auditory and visual information in daily environments, despite not being diagnosed as hearing impaired or having low vision. Recent psychophysical and neurophysiological studies have shown that autistic individuals have highly specific individual differences at various levels of information processing, including feature extraction, automatic grouping and top-down modulation in auditory and visual scene analysis. Comparison of the characteristics of scene analysis between auditory and visual modalities reveals some essential commonalities, which could provide clues about the underlying neural mechanisms. Further progress in this line of research may suggest effective methods for diagnosing and supporting autistic individuals. This article is part of the themed issue ‘Auditory and visual scene analysis'. PMID:28044025
GeoPAT: A toolbox for pattern-based information retrieval from large geospatial databases
NASA Astrophysics Data System (ADS)
Jasiewicz, Jarosław; Netzel, Paweł; Stepinski, Tomasz
2015-07-01
Geospatial Pattern Analysis Toolbox (GeoPAT) is a collection of GRASS GIS modules for carrying out pattern-based geospatial analysis of images and other spatial datasets. The need for pattern-based analysis arises when images/rasters contain rich spatial information either because of their very high resolution or their very large spatial extent. Elementary units of pattern-based analysis are scenes - patches of surface consisting of a complex arrangement of individual pixels (patterns). GeoPAT modules implement popular GIS algorithms, such as query, overlay, and segmentation, to operate on the grid of scenes. To achieve these capabilities GeoPAT includes a library of scene signatures - compact numerical descriptors of patterns, and a library of distance functions - providing numerical means of assessing dissimilarity between scenes. Ancillary GeoPAT modules use these functions to construct a grid of scenes or to assign signatures to individual scenes having regular or irregular geometries. Thus GeoPAT combines knowledge retrieval from patterns with mapping tasks within a single integrated GIS environment. GeoPAT is designed to identify and analyze complex, highly generalized classes in spatial datasets. Examples include distinguishing between different styles of urban settlements using VHR images, delineating different landscape types in land cover maps, and mapping physiographic units from DEM. The concept of pattern-based spatial analysis is explained and the roles of all modules and functions are described. A case study example pertaining to delineation of landscape types in a subregion of NLCD is given. Performance evaluation is included to highlight GeoPAT's applicability to very large datasets. The GeoPAT toolbox is available for download from
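GeoPAT's core design, per-scene signatures compared by a distance function, can be sketched in miniature. The code below is a hypothetical stand-in, not GeoPAT code: the signature is a plain normalized category histogram (GeoPAT supports richer pattern signatures), and dissimilarity is the Jensen-Shannon divergence, one of the measures such toolboxes typically offer:

```python
import math

def signature(scene):
    """Scene signature: normalized histogram of category labels in a patch.
    (A plain histogram is the simplest stand-in for GeoPAT's richer
    pattern signatures.)"""
    counts = {}
    for row in scene:
        for cat in row:
            counts[cat] = counts.get(cat, 0) + 1
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

def jensen_shannon(p, q):
    """Dissimilarity between signatures: 0 = identical, 1 = disjoint (bits)."""
    m = {c: (p.get(c, 0) + q.get(c, 0)) / 2 for c in set(p) | set(q)}
    def kl(a):
        return sum(v * math.log2(v / m[c]) for c, v in a.items() if v > 0)
    return 0.5 * kl(p) + 0.5 * kl(q)

# Two toy 3x3 "scenes" of land-cover codes (u = urban, g = grass, w = water).
urban = [["u", "u", "g"], ["u", "u", "g"], ["u", "g", "g"]]
rural = [["g", "g", "w"], ["g", "g", "w"], ["g", "g", "g"]]
print(round(jensen_shannon(signature(urban), signature(rural)), 3))
```

With per-scene signatures and a distance function in hand, query, overlay and segmentation all reduce to nearest-signature comparisons over the grid of scenes.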
NASA Astrophysics Data System (ADS)
Menze, Moritz; Heipke, Christian; Geiger, Andreas
2018-06-01
This work investigates the estimation of dense three-dimensional motion fields, commonly referred to as scene flow. While great progress has been made in recent years, large displacements and adverse imaging conditions as observed in natural outdoor environments are still very challenging for current approaches to reconstruction and motion estimation. In this paper, we propose a unified random field model which reasons jointly about 3D scene flow as well as the location, shape and motion of vehicles in the observed scene. We formulate the problem as the task of decomposing the scene into a small number of rigidly moving objects sharing the same motion parameters. Thus, our formulation effectively introduces long-range spatial dependencies which commonly employed local rigidity priors are lacking. Our inference algorithm then estimates the association of image segments and object hypotheses together with their three-dimensional shape and motion. We demonstrate the potential of the proposed approach by introducing a novel challenging scene flow benchmark which allows for a thorough comparison of the proposed scene flow approach with respect to various baseline models. In contrast to previous benchmarks, our evaluation is the first to provide stereo and optical flow ground truth for dynamic real-world urban scenes at large scale. Our experiments reveal that rigid motion segmentation can be utilized as an effective regularizer for the scene flow problem, improving upon existing two-frame scene flow methods. At the same time, our method yields plausible object segmentations without requiring an explicitly trained recognition model for a specific object class.
Measuring familiarity for natural environments through visual images
William E. Hammitt
1979-01-01
An on-site visual preference methodology involving a pre-and-post rating of bog landscape photographs is discussed. Photographs were rated for familiarity as well as preference. Preference was shown to be closely related to familiarity, assuming visitors had the opportunity to view the scenes during the on-site hiking engagement. Scenes rated high on preference were...
Effects of capacity limits, memory loss, and sound type in change deafness.
Gregg, Melissa K; Irsik, Vanessa C; Snyder, Joel S
2017-11-01
Change deafness, the inability to notice changes to auditory scenes, has the potential to provide insights about sound perception in busy situations typical of everyday life. We determined the extent to which change deafness to sounds is due to the capacity of processing multiple sounds and the loss of memory for sounds over time. We also determined whether these processing limitations work differently for varying types of sounds within a scene. Auditory scenes composed of naturalistic sounds, spectrally dynamic unrecognizable sounds, tones, and noise rhythms were presented in a change-detection task. On each trial, two scenes were presented that were the same or different. We manipulated the number of sounds within each scene to measure memory capacity and the silent interval between scenes to measure memory loss. For all sounds, change detection was worse as scene size increased, demonstrating the importance of capacity limits. Change detection to the natural sounds did not deteriorate much as the interval between scenes increased up to 2,000 ms, but it did deteriorate substantially with longer intervals. For artificial sounds, in contrast, change-detection performance suffered even for very short intervals. The results suggest that change detection is generally limited by capacity, regardless of sound type, but that auditory memory is more enduring for sounds with naturalistic acoustic structures.
Turk, Elisabeth E
2010-06-01
Hypothermia refers to a situation where there is a drop in body core temperature below 35 degrees C. It is a potentially fatal condition. In forensic medicine and pathology, cases of hypothermia often pose a special challenge to experts because of their complex nature, and the often absent or nonspecific nature of morphological findings. The scene of the incident may raise suspicions of a crime initially, due to phenomena such as terminal burrowing behavior and paradoxical undressing. An element of hypothermia often contributes to the cause of death in drug- and alcohol-related fatalities, in the homeless, in immersion deaths, in accidents and in cases of abuse or neglect, making the condition extremely relevant to forensic medical specialists. The aim of this review is to give an overview of the pathophysiological aspects of hypothermia and to illustrate different aspects relevant to forensic medical casework.
(In) Sensitivity to spatial distortion in natural scenes
Bex, Peter J.
2010-01-01
The perception of object structure in the natural environment is remarkably stable under large variation in image size and projection, especially given our insensitivity to spatial position outside the fovea. Sensitivity to periodic spatial distortions that were introduced into one quadrant of gray-scale natural images was measured in a 4AFC task. Observers were able to detect the presence of distortions in unfamiliar images even though they did not significantly affect the amplitude spectrum. Sensitivity depended on the spatial period of the distortion and on the image structure at the location of the distortion. The results suggest that the detection of distortion involves decisions made in the late stages of image perception and is based on an expectation of the typical structure of natural scenes. PMID:20462324
An exploratory study into the effects of extraordinary nature on emotions, mood, and prosociality
Joye, Yannick; Bolderdijk, Jan Willem
2015-01-01
Environmental psychology research has demonstrated that exposure to mundane natural environments can be psychologically beneficial, and can, for instance, improve individuals' mood and concentration. However, little research has yet examined the psychological benefits of extraordinary, awe-evoking kinds of nature, such as spectacular mountain scenes or impressive waterfalls. In this study, we aimed to address the underrepresentation of such extraordinary nature in research on human—nature interactions. Specifically, we examined whether watching a picture slideshow of awesome as opposed to mundane nature differentially affected individuals' emotions, mood, social value orientation (SVO), and their willingness to donate something to others. Our analyses revealed that, compared to mundane nature and a neutral condition, watching awesome natural scenes and phenomena had some unique and pronounced emotional effects (e.g., feeling small and humble), triggered the most mood improvement, and led to a more prosocial SVO. We found that participants' willingness to donate did not differ significantly for any of the conditions. PMID:25674067
Rapid detection of person information in a naturalistic scene.
Fletcher-Watson, Sue; Findlay, John M; Leekam, Susan R; Benson, Valerie
2008-01-01
A preferential-looking paradigm was used to investigate how gaze is distributed in naturalistic scenes. Two scenes were presented side by side: one contained a single person (person-present) and one did not (person-absent). Eye movements were recorded, the principal measures being the time spent looking at each region of the scenes, and the latency and location of the first fixation within each trial. We studied gaze patterns during free viewing, and also in a task requiring gender discrimination of the human figure depicted. Results indicated a strong bias towards looking to the person-present scene. This bias was present on the first fixation after image presentation, confirming previous findings of ultra-rapid processing of complex information. Faces attracted disproportionately many fixations, the preference emerging in the first fixation and becoming stronger in the following ones. These biases were exaggerated in the gender-discrimination task. A tendency to look at the object being fixated by the person in the scene was shown to be strongest at a slightly later point in the gaze sequence. We conclude that human bodies and faces are subject to special perceptual processing when presented as part of a naturalistic scene.
The Influence of Familiarity on Affective Responses to Natural Scenes
NASA Astrophysics Data System (ADS)
Sanabria Z., Jorge C.; Cho, Youngil; Yamanaka, Toshimasa
This kansei study explored how familiarity with image-word combinations influences affective states. Stimuli were obtained from Japanese print advertisements (ads), and consisted of images (e.g., natural-scene backgrounds) and their corresponding headlines (advertising copy). Initially, a group of subjects evaluated their level of familiarity with images and headlines independently, and stimuli were filtered based on the results. In the main experiment, a different group of subjects rated their pleasure and arousal to, and familiarity with, image-headline combinations. The Self-Assessment Manikin (SAM) scale was used to evaluate pleasure and arousal, and a bipolar scale was used to evaluate familiarity. The results showed a high correlation between familiarity and pleasure, but low correlation between familiarity and arousal. The characteristics of the stimuli, and their effect on the variables of pleasure, arousal and familiarity, were explored through ANOVA. It is suggested that, in the case of natural-scene ads, familiarity with image-headline combinations may increase the pleasure response to the ads, and that certain components in the images (e.g., water) may increase arousal levels.
Baudot, Pierre; Levy, Manuel; Marre, Olivier; Monier, Cyril; Pananceau, Marc; Frégnac, Yves
2013-01-01
Synaptic noise is thought to be a limiting factor for computational efficiency in the brain. In visual cortex (V1), ongoing activity is present in vivo, and spiking responses to simple stimuli are highly unreliable across trials. Stimulus statistics used to plot receptive fields, however, are quite different from those experienced during natural visuomotor exploration. We recorded V1 neurons intracellularly in the anaesthetized and paralyzed cat and compared their spiking and synaptic responses to full field natural images animated by simulated eye-movements to those evoked by simpler (grating) or higher dimensionality statistics (dense noise). In most cells, natural scene animation was the only condition where high temporal precision (in the 10–20 ms range) was maintained during sparse and reliable activity. At the subthreshold level, irregular but highly reproducible membrane potential dynamics were observed, even during long (several 100 ms) “spike-less” periods. We showed that both the spatial structure of natural scenes and the temporal dynamics of eye-movements increase the signal-to-noise ratio by a non-linear amplification of the signal combined with a reduction of the subthreshold contextual noise. 
These data support the view that the sparsening and the time precision of the neural code in V1 may depend primarily on three factors: (1) broadband input spectrum: the bandwidth must be rich enough for recruiting optimally the diversity of spatial and time constants during recurrent processing; (2) tight temporal interplay of excitation and inhibition: conductance measurements demonstrate that natural scene statistics narrow selectively the duration of the spiking opportunity window during which the balance between excitation and inhibition changes transiently and reversibly; (3) signal energy in the lower frequency band: a minimal level of power is needed below 10 Hz to reach consistently the spiking threshold, a situation rarely reached with visual dense noise. PMID:24409121
Scene-based nonuniformity correction with video sequences and registration.
Hardie, R C; Hayat, M M; Armstrong, E; Yasuda, B
2000-03-10
We describe a new, to our knowledge, scene-based nonuniformity correction algorithm for array detectors. The algorithm relies on the ability to register a sequence of observed frames in the presence of the fixed-pattern noise caused by pixel-to-pixel nonuniformity. In low-to-moderate levels of nonuniformity, sufficiently accurate registration may be possible with standard scene-based registration techniques. If the registration is accurate, and motion exists between the frames, then groups of independent detectors can be identified that observe the same irradiance (or true scene value). These detector outputs are averaged to generate estimates of the true scene values. With these scene estimates, and the corresponding observed values through a given detector, a curve-fitting procedure is used to estimate the individual detector response parameters. These can then be used to correct for detector nonuniformity. The strength of the algorithm lies in its simplicity and low computational complexity. Experimental results, to illustrate the performance of the algorithm, include the use of visible-range imagery with simulated nonuniformity and infrared imagery with real nonuniformity.
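The correction pipeline described here (register frames, average co-registered detector outputs into scene estimates, then curve-fit each detector's response) can be sketched on a toy one-dimensional focal plane. All specifics below (two detectors, a one-sample-per-frame shift, the gain and offset values) are hypothetical, registration is assumed to be known exactly, and the detector response is taken as linear:

```python
def fit_line(xs, ys):
    """Least-squares fit of ys ≈ gain * xs + offset for one detector."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    gain = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
    return gain, my - gain * mx

# Toy 1-D focal plane: the scene shifts one sample per frame, so detector i
# in frame t observes scene sample t + i (registration assumed known).
scene = [3.0, 7.0, 2.0, 9.0, 5.0, 8.0, 4.0]
gains, offsets = [1.2, 0.8], [5.0, -2.0]   # unknown in practice
n_frames, n_det = 6, 2
obs = {(t, i): gains[i] * scene[t + i] + offsets[i]
       for t in range(n_frames) for i in range(n_det)}

# Step 1: true-scene estimates = average of all detector outputs that saw
# the same irradiance (sample k is seen by detector i in frame k - i).
est = {}
for k in range(len(scene)):
    vals = [obs[(k - i, i)] for i in range(n_det) if 0 <= k - i < n_frames]
    if len(vals) == n_det:          # keep only fully observed samples
        est[k] = sum(vals) / len(vals)

# Step 2: per-detector curve fit of observed output vs. scene estimate.
params = [fit_line([est[k] for k in sorted(est)],
                   [obs[(k - i, i)] for k in sorted(est)])
          for i in range(n_det)]

# Step 3: invert each detector's response; detectors now agree (up to one
# global gain/offset ambiguity) on every shared scene sample.
corrected = {(t, i): (o - params[i][1]) / params[i][0]
             for (t, i), o in obs.items()}
print([round(g, 3) for g, b in params])  # → [1.2, 0.8]
```

The recovered parameters are only relative (a global gain and offset applied to the whole scene is unobservable), which is why the meaningful check is that detectors agree after correction rather than that absolute scene values are recovered.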
NASA Technical Reports Server (NTRS)
Franks, Shannon; Masek, Jeffrey G.; Headley, Rachel M.; Gasch, John; Arvidson, Terry
2009-01-01
The Global Land Survey (GLS) 2005 is a cloud-free, orthorectified collection of Landsat imagery acquired during the 2004-2007 epoch intended to support global land-cover and ecological monitoring. Due to the numerous complexities in selecting imagery for the GLS2005, NASA and the U.S. Geological Survey (USGS) sponsored the development of an automated scene selection tool, the Large Area Scene Selection Interface (LASSI), to aid in the selection of imagery for this data set. This innovative approach to scene selection applied a user-defined weighting system to various scene parameters: image cloud cover, image vegetation greenness, choice of sensor, and the ability of the Landsat 7 Scan Line Corrector (SLC)-off pair to completely fill image gaps, among others. The parameters considered in scene selection were weighted according to their relative importance to the data set, along with the algorithm's sensitivity to that weight. This paper describes the methodology and analysis that established the parameter weighting strategy, as well as the post-screening processes used in selecting the optimal data set for GLS2005.
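The user-defined weighting system over scene parameters can be sketched as a simple weighted score per candidate acquisition. The weights, sign conventions and parameter values below are illustrative, not LASSI's actual scheme:

```python
def scene_score(scene, weights):
    """Weighted suitability of one candidate acquisition. Cloud cover is
    inverted so that less cloud scores higher; weight values and the
    choice of parameters are hypothetical."""
    return (weights["cloud"] * (1.0 - scene["cloud_cover"])
            + weights["green"] * scene["greenness"]
            + weights["gap_fill"] * scene["gap_fill"])

# Hypothetical weights over three of the parameters named in the abstract.
weights = {"cloud": 0.5, "green": 0.3, "gap_fill": 0.2}
candidates = [
    {"id": "A", "cloud_cover": 0.10, "greenness": 0.80, "gap_fill": 0.90},
    {"id": "B", "cloud_cover": 0.40, "greenness": 0.95, "gap_fill": 0.95},
]
best = max(candidates, key=lambda s: scene_score(s, weights))
print(best["id"])  # → A
```

A weighting of this kind is exactly what the paper's sensitivity analysis probes: how strongly the selected data set changes when one weight is perturbed.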
Assessing Multiple Object Tracking in Young Children Using a Game
ERIC Educational Resources Information Center
Ryokai, Kimiko; Farzin, Faraz; Kaltman, Eric; Niemeyer, Greg
2013-01-01
Visual tracking of multiple objects in a complex scene is a critical survival skill. When we attempt to safely cross a busy street, follow a ball's position during a sporting event, or monitor children in a busy playground, we rely on our brain's capacity to selectively attend to and track the position of specific objects in a dynamic scene. This…
Psychophysical Criteria for Visual Simulation Systems.
1980-05-01
definitive data were found to establish detection thresholds; therefore, this is one area where a psychophysical study was recommended. Differential size...The specific functional relationships needing quantification were the following: 1. The effect of Horizontal Aniseikonia on Target Detection and...Transition Technique 6. The Effects of Scene Complexity and Separation on the Detection of Scene Misalignment 7. Absolute Brightness Levels in
ERIC Educational Resources Information Center
Wilkinson, Krista M.; Light, Janice
2011-01-01
Purpose: Many individuals with complex communication needs may benefit from visual aided augmentative and alternative communication systems. In visual scene displays (VSDs), language concepts are embedded into a photograph of a naturalistic event. Humans play a central role in communication development and might be important elements in VSDs.…
Children Use Object-Level Category Knowledge to Detect Changes in Complex Auditory Scenes
ERIC Educational Resources Information Center
Vanden Bosch der Nederlanden, Christina M.; Snyder, Joel S.; Hannon, Erin E.
2016-01-01
Children interact with and learn about all types of sound sources, including dogs, bells, trains, and human beings. Although it is clear that knowledge of semantic categories for everyday sights and sounds develops during childhood, there are very few studies examining how children use this knowledge to make sense of auditory scenes. We used a…
Local statistics of retinal optic flow for self-motion through natural sceneries.
Calow, Dirk; Lappe, Markus
2007-12-01
Image analysis in the visual system is well adapted to the statistics of natural scenes. Investigations of natural image statistics have so far mainly focused on static features. The present study is dedicated to the measurement and the analysis of the statistics of optic flow generated on the retina during locomotion through natural environments. Natural locomotion includes bouncing and swaying of the head and eye movement reflexes that stabilize gaze onto interesting objects in the scene while walking. We investigate the dependencies of the local statistics of optic flow on the depth structure of the natural environment and on the ego-motion parameters. To measure these dependencies we estimate the mutual information between correlated data sets. We analyze the results with respect to the variation of the dependencies over the visual field, since the visual motions in the optic flow vary depending on visual field position. We find that retinal flow direction and retinal speed show only minor statistical interdependencies. Retinal speed is statistically tightly connected to the depth structure of the scene. Retinal flow direction is statistically mostly driven by the relation between the direction of gaze and the direction of ego-motion. These dependencies differ at different visual field positions such that certain areas of the visual field provide more information about ego-motion and other areas provide more information about depth. The statistical properties of natural optic flow may be used to tune the performance of artificial vision systems based on human imitating behavior, and may be useful for analyzing properties of natural vision systems.
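The paper's dependency measure, mutual information between correlated data sets, can be estimated for discrete (binned) variables directly from histograms. The sketch below is a generic estimator with toy data of my own, not the paper's flow statistics:

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """MI in bits between two paired discrete sequences, estimated from
    the empirical joint and marginal histograms."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

# Toy stand-in for the paper's analysis: coarsely binned scene depth vs.
# coarsely binned retinal speed. A deterministic coupling carries a full
# bit here; a balanced but unrelated pairing carries none.
depth = [0, 0, 1, 1, 0, 0, 1, 1]
speed_coupled = [1, 1, 0, 0, 1, 1, 0, 0]    # fully determined by depth
speed_unrelated = [0, 1, 0, 1, 0, 1, 0, 1]  # balanced but independent
print(mutual_information(depth, speed_coupled))    # 1.0 bit
print(mutual_information(depth, speed_unrelated))  # 0.0 bits
```

With real flow data the binning itself biases the estimate, so studies of this kind typically apply finer binning and bias corrections than this sketch does.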
Learning Relative Motion Concepts in Immersive and Non-immersive Virtual Environments
NASA Astrophysics Data System (ADS)
Kozhevnikov, Michael; Gurlitt, Johannes; Kozhevnikov, Maria
2013-12-01
The focus of the current study is to understand which unique features of an immersive virtual reality environment have the potential to improve learning relative motion concepts. Thirty-seven undergraduate students learned relative motion concepts using computer simulation either in immersive virtual environment (IVE) or non-immersive desktop virtual environment (DVE) conditions. Our results show that after the simulation activities, both IVE and DVE groups exhibited a significant shift toward a scientific understanding in their conceptual models and epistemological beliefs about the nature of relative motion, and also a significant improvement on relative motion problem-solving tests. In addition, we analyzed students' performance on one-dimensional and two-dimensional questions in the relative motion problem-solving test separately and found that after training in the simulation, the IVE group performed significantly better than the DVE group on solving two-dimensional relative motion problems. We suggest that egocentric encoding of the scene in IVE (where the learner constitutes a part of a scene they are immersed in), as compared to allocentric encoding on a computer screen in DVE (where the learner is looking at the scene from "outside"), is more beneficial than DVE for studying more complex (two-dimensional) relative motion problems. Overall, our findings suggest that such aspects of virtual realities as immersivity, first-hand experience, and the possibility of changing different frames of reference can facilitate understanding abstract scientific phenomena and help in displacing intuitive misconceptions with more accurate mental models.
Multi- and hyperspectral scene modeling
NASA Astrophysics Data System (ADS)
Borel, Christoph C.; Tuttle, Ronald F.
2011-06-01
This paper shows how to use a public domain raytracer, POV-Ray (Persistence Of Vision Raytracer), to render multi- and hyper-spectral scenes. The scripting environment allows automatic changing of the reflectance and transmittance parameters. The radiosity rendering mode allows accurate simulation of multiple reflections between surfaces and also allows semi-transparent surfaces such as plant leaves. We show that POV-Ray computes occlusion accurately using a test scene with two blocks under a uniform sky. A complex scene representing a plant canopy is generated using a few lines of script. With appropriate rendering settings, shadows cast by leaves are rendered in many bands. Comparing single and multiple reflection renderings, the effect of multiple reflections is clearly visible and accounts for 25% of the overall apparent canopy reflectance in the near infrared.
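The band-by-band scripting workflow can be sketched from Python: substitute per-band optical constants into a POV-Ray scene template, write one .pov file per band, and render each file with the POV-Ray command line. The band names, reflectance values and one-sphere "canopy" below are hypothetical placeholders, not the paper's scene:

```python
# One render per simulated waveband: rewrite a #declare'd reflectance in a
# POV-Ray template and emit a .pov file per band. Values are hypothetical
# (e.g. a leaf is dark in red, bright in the near infrared).
bands = {"red_660nm": 0.08, "nir_860nm": 0.45}

template = """#declare LeafReflectance = {refl};
sphere {{ <0, 1, 0>, 1 pigment {{ rgb LeafReflectance }} }}
"""

for name, refl in bands.items():
    with open("canopy_{}.pov".format(name), "w") as f:
        f.write(template.format(refl=refl))
```

Each emitted file would then be rendered separately (and, for the paper's purposes, with radiosity enabled so multiple reflections between leaves are simulated per band).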
Global Transsaccadic Change Blindness During Scene Perception
2003-09-01
objects in natural scenes. Psychonomic Bulletin & Review, 8, 761–768. Irwin, D.E. (1991). Information integration across saccadic eye... Psychonomic Bulletin & Review, 5, 644–649. Tanaka, K. (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience... Bulletin & Review, 8, 753–760. Hoffman, J.R., & Subramanian, B. (1995). The role of visual attention in saccadic eye movements. Perception
Earth observation taken during STS-102
2001-06-14
STS102-342-024 (8-21 March 2001)--- The largest percentage of astronaut out-the-window photography is scientific in nature. However, occasionally scenes such as this one showing the moon over Earth's airglow are irresistible for crew members with cameras. The 35mm scene was recorded by one of the STS-102 astronauts from the aft flight deck of the Space Shuttle Discovery.
Blind prediction of natural video quality.
Saad, Michele A; Bovik, Alan C; Charrier, Christophe
2014-03-01
We propose a blind (no reference or NR) video quality evaluation model that is nondistortion specific. The approach relies on a spatio-temporal model of video scenes in the discrete cosine transform domain, and on a model that characterizes the type of motion occurring in the scenes, to predict video quality. We use the models to define video statistics and perceptual features that are the basis of a video quality assessment (VQA) algorithm that does not require the presence of a pristine video to compare against in order to predict a perceptual quality score. The contributions of this paper are threefold. 1) We propose a spatio-temporal natural scene statistics (NSS) model for videos. 2) We propose a motion model that quantifies motion coherency in video scenes. 3) We show that the proposed NSS and motion coherency models are appropriate for quality assessment of videos, and we utilize them to design a blind VQA algorithm that correlates highly with human judgments of quality. The proposed algorithm, called video BLIINDS, is tested on the LIVE VQA database and on the EPFL-PoliMi video database and shown to perform close to the level of top performing reduced and full reference VQA algorithms.
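The DCT-domain frame-difference statistics that such a model builds on can be illustrated with a minimal sketch. The `ac_energy_fraction` feature below (the fraction of DCT energy outside the DC term of a frame difference) is a simplified, hypothetical stand-in for the paper's actual NSS and motion-coherency features.

```python
import math

def dct2(block):
    """Plain 2-D DCT-II of a square block (O(n^4); fine for small blocks)."""
    n = len(block)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = sum(block[x][y]
                    * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                    * math.cos(math.pi * (2 * y + 1) * v / (2 * n))
                    for x in range(n) for y in range(n))
            cu = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
            cv = math.sqrt(1.0 / n) if v == 0 else math.sqrt(2.0 / n)
            out[u][v] = cu * cv * s
    return out

def ac_energy_fraction(prev_frame, next_frame):
    """Fraction of DCT energy outside the DC term of a frame difference:
    a crude spatio-temporal activity statistic (illustrative only)."""
    n = len(prev_frame)
    diff = [[next_frame[i][j] - prev_frame[i][j] for j in range(n)]
            for i in range(n)]
    coeffs = dct2(diff)
    total = sum(c * c for row in coeffs for c in row)
    if total == 0:
        return 0.0
    return (total - coeffs[0][0] ** 2) / total
```

A static scene (frames differing only by a uniform brightness shift) yields a feature near zero, while structured temporal change pushes it up; a real NSS model would fit parametric distributions to these coefficients rather than use a single ratio.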
Scene construction in schizophrenia.
Raffard, Stéphane; D'Argembeau, Arnaud; Bayard, Sophie; Boulenger, Jean-Philippe; Van der Linden, Martial
2010-09-01
Recent research has revealed that schizophrenia patients are impaired in remembering the past and imagining the future. In this study, we examined patients' ability to engage in scene construction (i.e., the process of mentally generating and maintaining a complex and coherent scene), which is a key part of retrieving past experiences and episodic future thinking. 24 participants with schizophrenia and 25 healthy controls were asked to imagine new fictitious experiences and described their mental representations of the scenes in as much detail as possible. Descriptions were scored according to various dimensions (e.g., sensory details, spatial reference), and participants also provided ratings of their subjective experience when imagining the scenes (e.g., their sense of presence, the perceived similarity of imagined events to past experiences). Imagined scenes contained fewer phenomenological details (d = 1.11) and were more fragmented (d = 2.81) in schizophrenia patients compared to controls. Furthermore, positive symptoms were positively correlated with the sense of presence (r = .43) and the perceived similarity of imagined events to past episodes (r = .47), whereas negative symptoms were negatively related to the overall richness of the imagined scenes (r = -.43). The results suggest that schizophrenia patients' impairments in remembering the past and imagining the future are, at least in part, due to deficits in the process of scene construction. The relationships between the characteristics of imagined scenes and positive and negative symptoms could be related to reality monitoring deficits and difficulties in strategic retrieval processes, respectively. Copyright 2010 APA, all rights reserved.
An Attempt at Assessing Preferences for Natural Landscapes
ERIC Educational Resources Information Center
Calvin, James S.; And Others
1972-01-01
Investigation of ways in which man makes a psychological assessment of his environment. Concerned with variables in the environment itself, fifteen photographs of natural landscape scenes were rated on each of twenty-one semantic differential scales by college students. Two major dimensions emerged: natural scenic beauty and a natural force…
Eye Movements, Visual Search and Scene Memory, in an Immersive Virtual Environment
Sullivan, Brian; Snyder, Kat; Ballard, Dana; Hayhoe, Mary
2014-01-01
Visual memory has been demonstrated to play a role in both visual search and attentional prioritization in natural scenes. However, it has been studied predominantly in experimental paradigms using multiple two-dimensional images. Natural experience, in contrast, entails prolonged immersion in a limited number of three-dimensional environments. The goal of the present experiment was to recreate circumstances comparable to natural visual experience in order to evaluate the role of scene memory in guiding eye movements in a natural environment. Subjects performed a continuous visual-search task within an immersive virtual-reality environment over three days. We found that, similar to two-dimensional contexts, viewers rapidly learn the location of objects in the environment over time, and use spatial memory to guide search. Incidental fixations did not provide obvious benefit to subsequent search, suggesting that semantic contextual cues may often be just as efficient, or that many incidentally fixated items are not held in memory in the absence of a specific task. On the third day of the experience in the environment, previous search items changed in color. These items were fixated upon with increased probability relative to control objects, suggesting that memory-guided prioritization (or Surprise) may be a robust mechanism for attracting gaze to novel features of natural environments, in addition to task factors and simple spatial saliency. PMID:24759905
How color enhances visual memory for natural scenes.
Spence, Ian; Wong, Patrick; Rusan, Maria; Rastegar, Naghmeh
2006-01-01
We offer a framework for understanding how color operates to improve visual memory for images of the natural environment, and we present an extensive data set that quantifies the contribution of color in the encoding and recognition phases. Using a continuous recognition task with colored and monochrome gray-scale images of natural scenes at short exposure durations, we found that color enhances recognition memory by conferring an advantage during encoding and by strengthening the encoding-specificity effect. Furthermore, because the pattern of performance was similar at all exposure durations, and because form and color are processed in different areas of cortex, the results imply that color must be bound as an integral part of the representation at the earliest stages of processing.
Neural representations of contextual guidance in visual search of real-world scenes.
Preston, Tim J; Guo, Fei; Das, Koel; Giesbrecht, Barry; Eckstein, Miguel P
2013-05-01
Exploiting scene context and object-object co-occurrence is critical in guiding eye movements and facilitating visual search, yet the mediating neural mechanisms are unknown. We used functional magnetic resonance imaging while observers searched for target objects in scenes and used multivariate pattern analyses (MVPA) to show that the lateral occipital complex (LOC) can predict the coarse spatial location of observers' expectations about the likely location of 213 different targets absent from the scenes. In addition, we found weaker but significant representations of context location in an area related to the orienting of attention (intraparietal sulcus, IPS) as well as a region related to scene processing (retrosplenial cortex, RSC). Importantly, the degree of agreement among 100 independent raters about the likely location to contain a target object in a scene correlated with LOC's ability to predict the contextual location while weaker but significant effects were found in IPS, RSC, the human motion area, and early visual areas (V1, V3v). When contextual information was made irrelevant to observers' behavioral task, the MVPA analysis of LOC and the other areas' activity ceased to predict the location of context. Thus, our findings suggest that the likely locations of targets in scenes are represented in various visual areas with LOC playing a key role in contextual guidance during visual search of objects in real scenes.
Constrained sampling experiments reveal principles of detection in natural scenes.
Sebastian, Stephen; Abrams, Jared; Geisler, Wilson S
2017-07-11
A fundamental everyday visual task is to detect target objects within a background scene. Using relatively simple stimuli, vision science has identified several major factors that affect detection thresholds, including the luminance of the background, the contrast of the background, the spatial similarity of the background to the target, and uncertainty due to random variations in the properties of the background and in the amplitude of the target. Here we use an experimental approach based on constrained sampling from multidimensional histograms of natural stimuli, together with a theoretical analysis based on signal detection theory, to discover how these factors affect detection in natural scenes. We sorted a large collection of natural image backgrounds into multidimensional histograms, where each bin corresponds to a particular luminance, contrast, and similarity. Detection thresholds were measured for a subset of bins spanning the space, where a natural background was randomly sampled from a bin on each trial. In low-uncertainty conditions, both the background bin and the amplitude of the target were fixed, and, in high-uncertainty conditions, they varied randomly on each trial. We found that thresholds increase approximately linearly along all three dimensions and that detection accuracy is unaffected by background bin and target amplitude uncertainty. The results are predicted from first principles by a normalized matched-template detector, where the dynamic normalizing gain factor follows directly from the statistical properties of the natural backgrounds. The results provide an explanation for classic laws of psychophysics and their underlying neural mechanisms.
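A normalized matched-template detector of the kind the analysis invokes can be sketched as follows. The normalizing gain used here (a simple function of patch luminance and contrast) is an illustrative stand-in for the gain derived from the statistics of natural backgrounds in the paper, and the template and stimulus values are invented.

```python
def normalized_template_response(patch, template):
    """Matched-template response divided by a normalizing gain set by the
    patch's mean luminance and contrast (illustrative normalization)."""
    n = len(patch)
    mean = sum(patch) / n
    contrast = (sum((p - mean) ** 2 for p in patch) / n) ** 0.5
    gain = 1.0 + mean + contrast  # hypothetical normalizing pool
    return sum(p * t for p, t in zip(patch, template)) / gain

# Zero-mean template, so a flat background contributes nothing to the dot
# product; the target is the template added at some amplitude.
template = [-1.0, 2.0, -1.0, -1.0, 2.0, -1.0]
background = [10.0] * 6
amplitude = 3.0
present = [b + amplitude * t for b, t in zip(background, template)]

r_absent = normalized_template_response(background, template)
r_present = normalized_template_response(present, template)
```

Detection then reduces to comparing the response against a criterion; because the gain grows with background luminance and contrast, thresholds rise along those dimensions, consistent with the approximately linear threshold laws reported above.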
Azzopardi, George; Petkov, Nicolai
2014-01-01
The remarkable abilities of the primate visual system have inspired the construction of computational models of some visual neurons. We propose a trainable hierarchical object recognition model, which we call S-COSFIRE (S stands for Shape and COSFIRE stands for Combination Of Shifted FIlter REsponses) and use it to localize and recognize objects of interest embedded in complex scenes. It is inspired by the visual processing in the ventral stream (V1/V2 → V4 → TEO). Recognition and localization of objects embedded in complex scenes is important for many computer vision applications. Most existing methods require prior segmentation of the objects from the background, which in turn requires recognition. An S-COSFIRE filter is automatically configured to be selective for an arrangement of contour-based features that belong to a prototype shape specified by an example. The configuration comprises selecting relevant vertex detectors and determining certain blur and shift parameters. The response is computed as the weighted geometric mean of the blurred and shifted responses of the selected vertex detectors. S-COSFIRE filters share similar properties with some neurons in inferotemporal cortex, which provided inspiration for this work. We demonstrate the effectiveness of S-COSFIRE filters in two applications: letter and keyword spotting in handwritten manuscripts and object spotting in complex scenes for the computer vision system of a domestic robot. S-COSFIRE filters are effective to recognize and localize (deformable) objects in images of complex scenes without requiring prior segmentation. They are versatile trainable shape detectors, conceptually simple and easy to implement. The presented hierarchical shape representation contributes to a better understanding of the brain and to more robust computer vision algorithms. PMID:25126068
Berry, Meredith S.; Repke, Meredith A.; Nickerson, Norma P.; Conway, Lucian G.; Odum, Amy L.; Jordan, Kerry E.
2015-01-01
Impulsivity in delay discounting is associated with maladaptive behaviors such as overeating and drug and alcohol abuse. Researchers have recently noted that delay discounting, even when measured by a brief laboratory task, may be the best predictor of human health-related behaviors (e.g., exercise) currently available. Identifying techniques to decrease impulsivity in delay discounting, therefore, could help improve decision-making on a global scale. Visual exposure to natural environments is one recent approach shown to decrease impulsive decision-making in a delay discounting task, although the mechanism driving this result is currently unknown. The present experiment was thus designed to evaluate not only whether visual exposure to natural (mountains, lakes) relative to built (buildings, cities) environments resulted in less impulsivity, but also whether this exposure influenced time perception. Participants were randomly assigned to either a natural environment condition or a built environment condition. Participants viewed photographs of either natural scenes or built scenes before and during a delay discounting task in which they made choices about receiving immediate or delayed hypothetical monetary outcomes. Participants also completed an interval bisection task in which natural or built stimuli were judged as relatively longer or shorter presentation durations. Following the delay discounting and interval bisection tasks, additional measures of time perception were administered, including how many minutes participants thought had passed during the session and a scale measurement of whether time "flew" or "dragged" during the session. Participants exposed to natural as opposed to built scenes were less impulsive and also reported longer subjective session times, although no differences across groups were revealed with the interval bisection task. 
These results are the first to suggest that decreased impulsivity from exposure to natural as opposed to built environments may be related to lengthened time perception. PMID:26558610
Sadeghi, Zahra; McClelland, James L; Hoffman, Paul
2015-09-01
An influential position in lexical semantics holds that semantic representations for words can be derived through analysis of patterns of lexical co-occurrence in large language corpora. Firth (1957) famously summarised this principle as "you shall know a word by the company it keeps". We explored whether the same principle could be applied to non-verbal patterns of object co-occurrence in natural scenes. We performed latent semantic analysis (LSA) on a set of photographed scenes in which all of the objects present had been manually labelled. This resulted in a representation of objects in a high-dimensional space in which similarity between two objects indicated the degree to which they appeared in similar scenes. These representations revealed similarities among objects belonging to the same taxonomic category (e.g., items of clothing) as well as cross-category associations (e.g., between fruits and kitchen utensils). We also compared representations generated from this scene dataset with two established methods for elucidating semantic representations: (a) a published database of semantic features generated verbally by participants and (b) LSA applied to a linguistic corpus in the usual fashion. Statistical comparisons of the three methods indicated significant association between the structures revealed by each method, with the scene dataset displaying greater convergence with feature-based representations than did LSA applied to linguistic data. The results indicate that information about the conceptual significance of objects can be extracted from their patterns of co-occurrence in natural environments, opening the possibility for such data to be incorporated into existing models of conceptual representation. Copyright © 2014 The Authors. Published by Elsevier Ltd.. All rights reserved.
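A toy version of the object co-occurrence analysis can be sketched by computing similarity between objects directly from an object-by-scene occurrence matrix. The objects and scenes below are invented, and the SVD dimensionality-reduction step of full LSA is omitted; cosine similarity over raw occurrence vectors serves as a simplified stand-in.

```python
import math

def cosine(u, v):
    """Cosine similarity between two occurrence vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Toy object-by-scene matrix: rows are objects, columns are scenes,
# 1 = object was labelled as present in that scene (invented data).
occurrence = {
    "fork":   [1, 1, 0, 0, 1],
    "knife":  [1, 1, 0, 0, 0],
    "car":    [0, 0, 1, 1, 0],
    "street": [0, 0, 1, 1, 1],
}

sim_kitchen = cosine(occurrence["fork"], occurrence["knife"])
sim_cross = cosine(occurrence["fork"], occurrence["car"])
```

Objects that occur in the same scenes (fork/knife) come out far more similar than objects from unrelated contexts (fork/car), which is the pattern the full analysis exploits at scale.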
Scene-based nonuniformity correction technique for infrared focal-plane arrays.
Liu, Yong-Jin; Zhu, Hong; Zhao, Yi-Gong
2009-04-20
A scene-based nonuniformity correction algorithm is presented to compensate for the gain and bias nonuniformity in infrared focal-plane array sensors; it can be separated into three parts. First, an interframe-prediction method is used to estimate the true scene, since nonuniformity correction is a typical blind-estimation problem in which both scene values and detector parameters are unavailable. Second, the estimated scene, along with its corresponding observed data obtained by the detectors, is employed to update the gain and the bias by means of a line-fitting technique. Finally, with these nonuniformity parameters, the compensated output of each detector is obtained by computing a very simple formula. The advantages of the proposed algorithm lie in its low computational complexity, low storage requirements, and ability to capture temporal drifts in the nonuniformity parameters. The performance of every module is demonstrated with simulated and real infrared image sequences. Experimental results indicate that the proposed algorithm exhibits a superior correction effect.
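The second and third steps (line fitting and correction) can be sketched in a few lines, assuming the interframe-predicted scene estimate is already available. The detector model and simulated values below are illustrative, not from the paper.

```python
def fit_gain_bias(scene, observed):
    """Least-squares line fit of observed = gain * scene + bias
    for one detector, across several frames."""
    n = len(scene)
    mx = sum(scene) / n
    my = sum(observed) / n
    sxx = sum((x - mx) ** 2 for x in scene)
    sxy = sum((x - mx) * (y - my) for x, y in zip(scene, observed))
    gain = sxy / sxx
    bias = my - gain * mx
    return gain, bias

def correct(observed, gain, bias):
    """Invert the detector model to recover the scene value."""
    return (observed - bias) / gain

# Simulated detector with fixed-pattern gain/bias error (invented values).
true_scene = [20.0, 35.0, 50.0, 65.0, 80.0]
observed = [1.2 * s + 7.5 for s in true_scene]

gain, bias = fit_gain_bias(true_scene, observed)
recovered = [correct(o, gain, bias) for o in observed]
```

In the full algorithm this fit is run per detector and updated over time, which is what lets it track temporal drifts in the nonuniformity parameters.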
The role of memory for visual search in scenes
Võ, Melissa Le-Hoa; Wolfe, Jeremy M.
2014-01-01
Many daily activities involve looking for something. The ease with which these searches are performed often allows one to forget that searching represents complex interactions between visual attention and memory. While a clear understanding exists of how search efficiency will be influenced by visual features of targets and their surrounding distractors or by the number of items in the display, the role of memory in search is less well understood. Contextual cueing studies have shown that implicit memory for repeated item configurations can facilitate search in artificial displays. When searching more naturalistic environments, other forms of memory come into play. For instance, semantic memory provides useful information about which objects are typically found where within a scene, and episodic scene memory provides information about where a particular object was seen the last time a particular scene was viewed. In this paper, we will review work on these topics, with special emphasis on the role of memory in guiding search in organized, real-world scenes. PMID:25684693
Traffic Signs in Complex Visual Environments
DOT National Transportation Integrated Search
1982-11-01
The effects of sign luminance on detection and recognition of traffic control devices is mediated through contrast with the immediate surround. Additionally, complex visual scenes are known to degrade visual performance with targets well above visual...
Color constancy influenced by unnatural spatial structure.
Mizokami, Yoko; Yaguchi, Hirohisa
2014-04-01
The recognition of spatial structures is important for color constancy because we cannot identify an object's color under different illuminations without knowing which space it is in and how that space is illuminated. To show the importance of the natural structure of environments for color constancy, we investigated the way in which color appearance was affected by unnatural viewing conditions in which the spatial structure was distorted. Observers judged the color of a test patch placed in the center of a small room illuminated by white or reddish light, as well as in two rooms illuminated by white and reddish light, respectively. In the natural viewing condition, an observer saw the room(s) through a viewing window, whereas in the unnatural viewing condition the scene structure was scrambled by a kaleidoscope-type viewing box. Results of the single-room condition with one illuminant color showed little difference in color constancy between the two viewing conditions. However, constancy decreased in the two-room condition with a more complex arrangement of space and illumination. The patch's appearance under the unnatural viewing condition was more influenced by simultaneous contrast than its appearance under the natural viewing condition. Color appearance under white illumination also appeared more stable than that under reddish illumination. These findings suggest that natural spatial structure plays an important role in color constancy in a complex environment.
Chen, Juan; Sperandio, Irene; Goodale, Melvyn Alan
2015-01-01
Objects rarely appear in isolation in natural scenes. Although many studies have investigated how nearby objects influence perception in cluttered scenes (i.e., crowding), none has studied how nearby objects influence visually guided action. In Experiment 1, we found that participants could scale their grasp to the size of a crowded target even when they could not perceive its size, demonstrating for the first time that neurologically intact participants can use visual information that is not available to conscious report to scale their grasp to real objects in real scenes. In Experiments 2 and 3, we found that changing the eccentricity of the display and the orientation of the flankers had no effect on grasping but strongly affected perception. The differential effects of eccentricity and flanker orientation on perception and grasping show that the known differences in retinotopy between the ventral and dorsal streams are reflected in the way in which people deal with targets in cluttered scenes. © The Author(s) 2014.
[Visual representation of natural scenes in flicker changes].
Nakashima, Ryoichi; Yokosawa, Kazuhiko
2010-08-01
Coherence theory in scene perception (Rensink, 2002) assumes the retention of volatile object representations on which attention is not focused. On the other hand, visual memory theory in scene perception (Hollingworth & Henderson, 2002) assumes that robust object representations are retained. In this study, we hypothesized that the difference between these two theories derives from differences in the experimental tasks on which they are based. To verify this hypothesis, we examined the properties of visual representation using a change-detection and memory task in a flicker paradigm. We measured the representations formed when participants were instructed to search for a change in a scene, and compared them with intentional memory representations. The visual representations were retained in visual long-term memory even in the flicker paradigm, and were as robust as the intentional memory representations. However, the results indicate that the representations are unavailable for explicitly localizing a scene change but are available for answering the recognition test. This suggests that coherence theory and visual memory theory are compatible.
A lighting metric for quantitative evaluation of accent lighting systems
NASA Astrophysics Data System (ADS)
Acholo, Cyril O.; Connor, Kenneth A.; Radke, Richard J.
2014-09-01
Accent lighting is critical for artwork and sculpture lighting in museums, and for subject lighting on stage and in film and television. The research problem of designing effective lighting in such settings has been revived recently with the rise of light-emitting-diode-based solid state lighting. In this work, we propose an easy-to-apply quantitative measure of a scene's visual quality as perceived by human viewers. We consider a well-accent-lit scene as one that maximizes the information about the scene (in an information-theoretic sense) available to the viewer. We propose a metric based on the entropy of the distribution of colors, which are extracted from an image of the scene from the viewer's perspective. We demonstrate that optimizing the metric as a function of illumination configuration (i.e., position, orientation, and spectral composition) results in natural, pleasing accent lighting. We use a photorealistic simulation tool to validate the functionality of our proposed approach, showing its successful application to two- and three-dimensional scenes.
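The entropy-of-colors metric can be sketched as follows for an image given as a list of 8-bit RGB pixels. The quantization into 8 levels per channel is an assumed implementation detail, not necessarily the paper's binning.

```python
import math
from collections import Counter

def color_entropy(pixels, bins=8):
    """Shannon entropy (bits) of the quantized color distribution of an
    image supplied as a list of 8-bit (r, g, b) tuples."""
    counts = Counter((r * bins // 256, g * bins // 256, b * bins // 256)
                     for r, g, b in pixels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values())
```

An image washed out to a single color scores zero, while a scene whose colors are revealed evenly by the lighting scores higher; the optimization step then searches luminaire position, orientation, and spectrum to maximize this score.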
Camouflage and visual perception
Troscianko, Tom; Benton, Christopher P.; Lovell, P. George; Tolhurst, David J.; Pizlo, Zygmunt
2008-01-01
How does an animal conceal itself from visual detection by other animals? This review paper seeks to identify general principles that may apply in this broad area. It considers mechanisms of visual encoding, of grouping and object encoding, and of search. In most cases, the evidence base comes from studies of humans or species whose vision approximates to that of humans. The effort is hampered by a relatively sparse literature on visual function in natural environments and with complex foraging tasks. However, some general constraints emerge as being potentially powerful principles in understanding concealment—a ‘constraint’ here means a set of simplifying assumptions. Strategies that disrupt the unambiguous encoding of discontinuities of intensity (edges), and of other key visual attributes, such as motion, are key here. Similar strategies may also defeat grouping and object-encoding mechanisms. Finally, the paper considers how we may understand the processes of search for complex targets in complex scenes. The aim is to provide a number of pointers towards issues, which may be of assistance in understanding camouflage and concealment, particularly with reference to how visual systems can detect the shape of complex, concealed objects. PMID:18990671
Texture classification using autoregressive filtering
NASA Technical Reports Server (NTRS)
Lawton, W. M.; Lee, M.
1984-01-01
A general theory of image texture models is proposed and its applicability to the problem of scene segmentation using texture classification is discussed. An algorithm, based on half-plane autoregressive filtering, which optimally utilizes second order statistics to discriminate between texture classes represented by arbitrary wide sense stationary random fields is described. Empirical results of applying this algorithm to natural and synthesized scenes are presented and future research is outlined.
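A drastically simplified, one-dimensional AR(1) version of autoregressive texture classification, for illustration only (the paper uses half-plane autoregressive filtering of two-dimensional wide-sense-stationary fields): fit a prediction coefficient to a pixel sequence and assign the nearest class prototype. The class prototypes are invented.

```python
def ar1_coeff(seq):
    """Least-squares AR(1) coefficient a in x[t] ~= a * x[t-1]."""
    num = sum(seq[t] * seq[t - 1] for t in range(1, len(seq)))
    den = sum(seq[t - 1] ** 2 for t in range(1, len(seq)))
    return num / den

def classify(seq, class_coeffs):
    """Assign seq to the texture class with the nearest AR(1) coefficient."""
    a = ar1_coeff(seq)
    return min(class_coeffs, key=lambda name: abs(class_coeffs[name] - a))

# Invented prototype coefficients for two texture classes.
classes = {"smooth": 0.95, "oscillatory": -0.95}
```

Slowly varying (smooth) textures yield coefficients near +1, rapidly alternating ones near -1; the full algorithm does the analogous fit with a half-plane neighborhood in 2-D and discriminates on the resulting prediction statistics.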
Atmospheric correction analysis on LANDSAT data over the Amazon region. [Manaus, Brazil
NASA Technical Reports Server (NTRS)
Parada, N. D. J. (Principal Investigator); Dias, L. A. V.; Dossantos, J. R.; Formaggio, A. R.
1983-01-01
The Amazon Region's natural resources were studied in two ways and the results compared. A LANDSAT scene and its attributes were selected, and a maximum likelihood classification was made. The scene was then atmospherically corrected, taking into account Amazonian peculiarities revealed by ground truth for the same area, and classified again. Comparison shows that the classification improves with atmospherically corrected images.
ERIC Educational Resources Information Center
Torralba, Antonio; Oliva, Aude; Castelhano, Monica S.; Henderson, John M.
2006-01-01
Many experiments have shown that the human visual system makes extensive use of contextual information for facilitating object search in natural scenes. However, the question of how to formally model contextual influences is still open. On the basis of a Bayesian framework, the authors present an original approach of attentional guidance by global…
Sherman, Aleksandra; Grabowecky, Marcia; Suzuki, Satoru
2015-08-01
What shapes art appreciation? Much research has focused on the importance of visual features themselves (e.g., symmetry, natural scene statistics) and of the viewer's experience and expertise with specific artworks. However, even after taking these factors into account, there are considerable individual differences in art preferences. Our new result suggests that art preference is also influenced by the compatibility between visual properties and the characteristics of the viewer's visual system. Specifically, we have demonstrated, using 120 artworks from diverse periods, cultures, genres, and styles, that art appreciation is increased when the level of visual complexity within an artwork is compatible with the viewer's visual working memory capacity. The result highlights the importance of the interaction between visual features and the beholder's general visual capacity in shaping art appreciation. (c) 2015 APA, all rights reserved.
Virtual environments for scene of crime reconstruction and analysis
NASA Astrophysics Data System (ADS)
Howard, Toby L. J.; Murta, Alan D.; Gibson, Simon
2000-02-01
This paper describes research conducted in collaboration with Greater Manchester Police (UK), to evaluate the utility of Virtual Environments for scene of crime analysis, forensic investigation, and law enforcement briefing and training. We present an illustrated case study of the construction of a high-fidelity virtual environment, intended to match a particular real-life crime scene as closely as possible. We describe and evaluate the combination of several approaches including: the use of the Manchester Scene Description Language for constructing complex geometrical models; the application of a radiosity rendering algorithm with several novel features based on human perceptual considerations; texture extraction from forensic photography; and experiments with interactive walkthroughs and large-screen stereoscopic display of the virtual environment implemented using the MAVERIK system. We also discuss the potential applications of Virtual Environment techniques in the Law Enforcement and Forensic communities.
Using 3D range cameras for crime scene documentation and legal medicine
NASA Astrophysics Data System (ADS)
Cavagnini, Gianluca; Sansoni, Giovanna; Trebeschi, Marco
2009-01-01
Crime scene documentation and legal medicine analysis are part of a very complex process which is aimed at identifying the offender starting from the collection of the evidence at the scene. This part of the investigation is very critical, since the crime scene is extremely volatile, and once it is removed, it cannot be precisely recreated. For this reason, the documentation process should be as complete as possible, with minimum invasiveness. The use of optical 3D imaging sensors has been considered as a possible aid in the documentation step, since (i) the measurement is contactless and (ii) the process required for editing and modeling the 3D data is quite similar to the reverse engineering procedures originally developed for the manufacturing field. In this paper we show the most important results obtained in the experimentation.
Robust traffic sign detection using fuzzy shape recognizer
NASA Astrophysics Data System (ADS)
Li, Lunbo; Li, Jun; Sun, Jianhong
2009-10-01
A novel fuzzy approach for the detection of traffic signs in natural environments is presented. More than 3000 road images were collected under different weather conditions by a digital camera, and used for testing this approach. Every RGB image was converted into HSV colour space, and segmented by the hue and saturation thresholds. A symmetrical detector was used to extract the local features of the regions of interest (ROI), and the shape of ROI was determined by a fuzzy shape recognizer which invoked a set of fuzzy rules. The experimental results show that the proposed algorithm is translation, rotation and scaling invariant, and gives reliable shape recognition in complex traffic scenes where clustering and partial occlusion normally occur.
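The hue/saturation segmentation step can be sketched as follows using Python's standard `colorsys` module. The red-hue window and saturation threshold are illustrative values, not the authors' tuned thresholds, and the fuzzy shape-recognition stage is omitted.

```python
import colorsys

def is_red_sign_pixel(rgb, sat_min=0.5):
    """Convert an 8-bit RGB pixel to HSV and threshold: hue near red
    and sufficient saturation (thresholds are illustrative)."""
    r, g, b = rgb
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    near_red = h <= 30.0 / 360.0 or h >= 330.0 / 360.0
    return near_red and s >= sat_min

def segment(pixels):
    """Return indices of candidate sign pixels (the ROI mask)."""
    return [i for i, p in enumerate(pixels) if is_red_sign_pixel(p)]
```

Working in HSV makes the threshold largely insensitive to illumination changes across weather conditions, since brightness variation moves V while leaving hue roughly fixed; the resulting ROIs are then passed to the symmetry detector and fuzzy shape rules.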
Di Dio, Cinzia; Ardizzi, Martina; Massaro, Davide; Di Cesare, Giuseppe; Gilli, Gabriella; Marchetti, Antonella; Gallese, Vittorio
2016-01-01
Movement perception and its role in aesthetic experience have often been studied, within empirical aesthetics, in relation to the human body. No such specificity has been defined in neuroimaging studies with respect to contents lacking a human form. The aim of this work was to explore, through functional magnetic resonance imaging (fMRI), how perceived movement is processed during the aesthetic judgment of paintings using two types of content: human subjects and scenes of nature. Participants, untutored in the arts, were shown the stimuli and asked to make aesthetic judgments. Additionally, they were instructed to observe the paintings and to rate their perceived movement in separate blocks. Observation highlighted spontaneous processes associated with aesthetic experience, whereas movement judgment outlined activations specifically related to movement processing. The ratings recorded during aesthetic judgment revealed that nature scenes received higher scores than human content paintings. The imaging data showed similar activation, relative to baseline, for all stimuli in the three tasks, including activation of occipito-temporal areas, posterior parietal, and premotor cortices. Contrast analyses within the aesthetic judgment task showed that human content activated, relative to nature, precuneus, fusiform gyrus, and posterior temporal areas, whose activation was prominent for dynamic human paintings. In contrast, nature scenes activated, relative to human stimuli, occipital and posterior parietal cortex/precuneus, involved in visuospatial exploration and pragmatic coding of movement, as well as central insula. Static nature paintings further activated, relative to dynamic nature stimuli, central and posterior insula. Besides insular activation, which was specific for aesthetic judgment, we found a large overlap in the activation pattern characterizing each stimulus dimension (content and dynamism) across observation, aesthetic judgment, and movement judgment tasks.
These findings support the idea that the aesthetic evaluation of artworks depicting both human subjects and nature scenes involves a motor component, and that the associated neural processes occur quite spontaneously in the viewer. Furthermore, considering the functional roles of posterior and central insula, we suggest that nature paintings may evoke aesthetic processes requiring an additional proprioceptive and sensori-motor component implemented by “motor accessibility” to the represented scenario, which is needed to judge the aesthetic value of the observed painting. PMID:26793087
Auditory scene analysis in school-aged children with developmental language disorders
Sussman, E.; Steinschneider, M.; Lee, W.; Lawson, K.
2014-01-01
Natural sound environments are dynamic, with overlapping acoustic input originating from simultaneously active sources. A key function of the auditory system is to integrate sensory inputs that belong together and segregate those that come from different sources. We hypothesized that this skill is impaired in individuals with phonological processing difficulties. There is considerable disagreement about whether phonological impairments observed in children with developmental language disorders can be attributed to specific linguistic deficits or to more general acoustic processing deficits. However, most tests of general auditory abilities have been conducted with a single set of sounds. We assessed the ability of school-aged children (7–15 years) to parse complex auditory non-speech input, and determined whether the presence of phonological processing impairments was associated with stream perception performance. A key finding was that children with language impairments did not show the same developmental trajectory for stream perception as typically developing children. In addition, children with language impairments required larger frequency separations between sounds to hear distinct streams compared to age-matched peers. Furthermore, phonological processing ability was a significant predictor of stream perception measures, but only in the older age groups. No such association was found in the youngest children. These results indicate that children with language impairments have difficulty parsing speech streams, or identifying individual sound events when there are competing sound sources. We conclude that language group differences may in part reflect fundamental maturational disparities in the analysis of complex auditory scenes. PMID:24548430
Reconstruction and simplification of urban scene models based on oblique images
NASA Astrophysics Data System (ADS)
Liu, J.; Guo, B.
2014-08-01
We describe multi-view stereo reconstruction and simplification algorithms for urban scene models based on oblique images. The complexity, diversity, and density of urban scenes make it difficult to build city models from oblique images; however, urban scenes also contain many flat surfaces. One of our key contributions is a dense matching algorithm based on Self-Adaptive Patches tailored to urban scenes. The basic idea of match propagation based on Self-Adaptive Patches is to build patches centered on seed points that are already matched. The extent and shape of the patches adapt automatically to the objects in the urban scene: where the surface is flat, the patch grows larger; where the surface is rough, the patch shrinks. The other contribution is that the mesh generated by Graph Cuts is a 2-manifold surface satisfying the half-edge data structure, achieved by clustering and re-marking tetrahedrons in the s-t graph. The purpose of obtaining a 2-manifold surface is to simplify the mesh with an edge-collapse algorithm that preserves and accentuates the features of buildings.
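A toy sketch of the Self-Adaptive Patch idea: grow a square patch around a seed point while the local depth surface stays flat. The variance test and all parameter values are illustrative assumptions, not the authors' actual criteria:

```python
def adaptive_patch_radius(depth, y, x, r_min=1, r_max=7, flat_tol=0.01):
    """Grow a square patch around seed (y, x) while the local depth surface
    stays flat (variance below flat_tol).

    A simplified stand-in for the Self-Adaptive Patch extent rule: flat
    surfaces yield large patches, rough surfaces yield small ones.
    """
    for r in range(r_min, r_max + 1):
        window = [depth[i][j]
                  for i in range(max(0, y - r), min(len(depth), y + r + 1))
                  for j in range(max(0, x - r), min(len(depth[0]), x + r + 1))]
        mean = sum(window) / len(window)
        var = sum((d - mean) ** 2 for d in window) / len(window)
        if var > flat_tol:
            return max(r_min, r - 1)  # last radius that was still flat
    return r_max
```

On a perfectly flat depth field the patch grows to `r_max`; near a depth discontinuity (e.g. a building edge) growth stops one step before the patch crosses it.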
Is moral beauty different from facial beauty? Evidence from an fMRI study
Wang, Tingting; Mo, Ce; Tan, Li Hai; Cant, Jonathan S.; Zhong, Luojin; Cupchik, Gerald
2015-01-01
Is moral beauty different from facial beauty? Two functional magnetic resonance imaging experiments were performed to answer this question. Experiment 1 investigated the network of moral aesthetic judgments and facial aesthetic judgments. Participants performed aesthetic judgments and gender judgments on both faces and scenes containing moral acts. The conjunction analysis of the contrasts ‘facial aesthetic judgment > facial gender judgment’ and ‘scene moral aesthetic judgment > scene gender judgment’ identified the common involvement of the orbitofrontal cortex (OFC), inferior temporal gyrus and medial superior frontal gyrus, suggesting that both types of aesthetic judgments are based on the orchestration of perceptual, emotional and cognitive components. Experiment 2 examined the network of facial beauty and moral beauty during implicit perception. Participants performed a non-aesthetic judgment task on both faces (beautiful vs common) and scenes (containing morally beautiful vs neutral information). We observed that facial beauty (beautiful faces > common faces) involved both the cortical reward region OFC and the subcortical reward region putamen, whereas moral beauty (moral beauty scenes > moral neutral scenes) only involved the OFC. Moreover, compared with facial beauty, moral beauty spanned a larger-scale cortical network, indicating more advanced and complex cerebral representations characterizing moral beauty. PMID:25298010
Ahmad, Fahad N; Moscovitch, Morris; Hockley, William E
2017-04-01
Konkle, Brady, Alvarez and Oliva (Psychological Science, 21, 1551-1556, 2010) showed that participants have an exceptional long-term memory (LTM) for photographs of scenes. We examined to what extent participants' exceptional LTM for scenes is determined by presentation time during encoding. In addition, at retrieval, we varied the nature of the lures in a forced-choice recognition task so that they resembled the target in gist (i.e., global or categorical) information, but were distinct in verbatim information (e.g., an "old" beach scene and a similar "new" beach scene; exemplar condition) or vice versa (e.g., a beach scene and a new scene from a novel category; novel condition). In Experiment 1, half of the list of scenes was presented for 1 s, whereas the other half was presented for 4 s. We found lower performance for shorter study presentation time in the exemplar test condition and similar performance for both study presentation times in the novel test condition. In Experiment 2, participants showed similar performance in an exemplar test for which the lure was of a different category but a category that was used at study. In Experiment 3, when presentation time was lowered to 500 ms, recognition accuracy was reduced in both novel and exemplar test conditions. A less detailed memorial representation of the studied scene containing more gist (i.e., meaning) than verbatim (i.e., surface or perceptual details) information is retrieved from LTM after a short compared to a long study presentation time. We conclude that our findings support fuzzy-trace theory.
Vanmarcke, Steven; Wagemans, Johan
2015-01-01
In everyday life, we are generally able to dynamically understand and adapt to socially (ir)relevant encounters, and to make appropriate decisions about them. All of this requires an impressive ability to rapidly filter and extract the most informative aspects of a complex visual scene. Such rapid gist perception can be assessed in multiple ways. In the ultrafast categorization paradigm developed by Simon Thorpe et al. (1996), participants get a clear categorization task in advance and succeed at detecting the target object of interest (animal) almost perfectly (even with 20 ms exposures). Since this pioneering work, follow-up studies have consistently reported population-level reaction time differences across categorization tasks, indicating a superordinate advantage (animal versus dog) and effects of perceptual similarity (animals versus vehicles) and object category size (natural versus animal versus dog). In this study, we replicated and extended these separate findings by using a systematic collection of different categorization tasks (varying in presentation time, task demands, and stimuli) and focusing on individual differences in terms of, e.g., gender and intelligence. In addition to replicating the main findings from the literature, we found subtle yet consistent gender differences (women faster than men). PMID:26034569
Correction techniques for depth errors with stereo three-dimensional graphic displays
NASA Technical Reports Server (NTRS)
Parrish, Russell V.; Holden, Anthony; Williams, Steven P.
1992-01-01
Three-dimensional (3-D), 'real-world' pictorial displays that incorporate 'true' depth cues via stereopsis techniques have proved effective for displaying complex information in a natural way to enhance situational awareness and to improve pilot/vehicle performance. In such displays, the display designer must map the depths in the real world to the depths available with the stereo display system. However, empirical data have shown that the human subject does not perceive the information at exactly the depth at which it is mathematically placed. Head movements can also seriously distort the depth information that is embedded in stereo 3-D displays because the transformations used in mapping the visual scene to the depth-viewing volume (DVV) depend intrinsically on the viewer location. The goal of this research was to provide two correction techniques: the first corrects the visual-scene-to-DVV mapping for human perception errors, and the second (based on head-positioning sensor input data) corrects for errors induced by head movements. Empirical data are presented to validate both correction techniques. A combination of the two effectively eliminates the distortions of depth information embedded in stereo 3-D displays.
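The first correction technique can be illustrated with a minimal sketch: measure where subjects actually perceive probes placed at known display depths, fit the error, and pre-distort new depths by inverting the fit. The linear model and function names are assumptions for illustration, not the paper's actual formulation:

```python
def fit_linear_correction(displayed, perceived):
    """Least-squares fit of perceived = a * displayed + b from calibration
    data (pairs of displayed depths and subject-reported perceived depths)."""
    n = len(displayed)
    mx = sum(displayed) / n
    my = sum(perceived) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(displayed, perceived))
         / sum((x - mx) ** 2 for x in displayed))
    b = my - a * mx
    return a, b

def corrected_depth(target, a, b):
    """Depth to actually display so the subject perceives `target`:
    invert the fitted perception model."""
    return (target - b) / a
```

In practice the second technique would additionally re-derive the scene-to-DVV transform from head-tracker data on every frame.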
Web pages: What can you see in a single fixation?
Jahanian, Ali; Keshvari, Shaiyan; Rosenholtz, Ruth
2018-01-01
Research in human vision suggests that in a single fixation, humans can extract a significant amount of information from a natural scene, e.g. the semantic category, spatial layout, and object identities. This ability is useful, for example, for quickly determining location, navigating around obstacles, detecting threats, and guiding eye movements to gather more information. In this paper, we ask a new question: What can we see at a glance at a web page - an artificial yet complex "real world" stimulus? Is it possible to notice the type of website, or where the relevant elements are, with only a glimpse? We find that observers, fixating at the center of a web page shown for only 120 milliseconds, are well above chance at classifying the page into one of ten categories. Furthermore, this ability is supported in part by text that they can read at a glance. Users can also understand the spatial layout well enough to reliably localize the menu bar and to detect ads, even though the latter are often camouflaged among other graphical elements. We discuss the parallels between web page gist and scene gist, and the implications of our findings for both vision science and human-computer interaction.
Alcohol imagery on New Zealand television
McGee, Rob; Ketchel, Juanita; Reeder, Anthony I
2007-01-01
Background To examine the extent and nature of alcohol imagery on New Zealand (NZ) television, a content analysis of 98 hours of prime-time television programs and advertising was carried out over 7 consecutive days' viewing in June/July 2004. The main outcome measures were number of scenes in programs, trailers and advertisements depicting alcohol imagery; the extent of critical versus neutral and promotional imagery; and the mean number of scenes with alcohol per hour, and characteristics of scenes in which alcohol featured. Results There were 648 separate depictions of alcohol imagery across the week, with an average of one scene every nine minutes. Scenes depicting uncritical imagery outnumbered scenes showing possible adverse health consequences of drinking by 12 to 1. Conclusion The evidence points to a large amount of alcohol imagery incidental to storylines in programming on NZ television. Alcohol is also used in many advertisements to market non-alcohol goods and services. More attention needs to be paid to the extent of alcohol imagery on television from the industry, the government and public health practitioners. Health education with young people could raise critical awareness of the way alcohol imagery is presented on television. PMID:17270053
Human vision is attuned to the diffuseness of natural light
Morgenstern, Yaniv; Geisler, Wilson S.; Murray, Richard F.
2014-01-01
All images are highly ambiguous, and to perceive 3-D scenes, the human visual system relies on assumptions about what lighting conditions are most probable. Here we show that human observers' assumptions about lighting diffuseness are well matched to the diffuseness of lighting in real-world scenes. We use a novel multidirectional photometer to measure lighting in hundreds of environments, and we find that the diffuseness of natural lighting falls in the same range as previous psychophysical estimates of the visual system's assumptions about diffuseness. We also find that natural lighting is typically directional enough to override human observers' assumption that light comes from above. Furthermore, we find that, although human performance on some tasks is worse in diffuse light, this can be largely accounted for by intrinsic task difficulty. These findings suggest that human vision is attuned to the diffuseness levels of natural lighting conditions. PMID:25139864
Luminance cues constrain chromatic blur discrimination in natural scene stimuli.
Sharman, Rebecca J; McGraw, Paul V; Peirce, Jonathan W
2013-03-22
Introducing blur into the color components of a natural scene has very little effect on its percept, whereas blur introduced into the luminance component is very noticeable. Here we quantify the dominance of luminance information in blur detection and examine a number of potential causes. We show that the interaction between chromatic and luminance information is not explained by reduced acuity or spatial resolution limitations for chromatic cues, the effective contrast of the luminance cue, or chromatic and achromatic statistical regularities in the images. Regardless of the quality of chromatic information, the visual system gives primacy to luminance signals when determining edge location. In natural viewing, luminance information appears to be specialized for detecting object boundaries while chromatic information may be used to determine surface properties.
ERIC Educational Resources Information Center
Golding, Barry; Campbell, Coral
2009-01-01
In this article, the authors set the scene for this research volume. They sought to emphasize and broaden their interest and concern about their "Learning to be drier" theme in this edition to the 77 per cent of Australians who live within 50 km of the Australian coast, the majority of whom also live in major cities and urban complexes.…
Large Capacity of Conscious Access for Incidental Memories in Natural Scenes.
Kaunitz, Lisandro N; Rowe, Elise G; Tsuchiya, Naotsugu
2016-09-01
When searching a crowd, people can detect a target face only by direct fixation and attention. Once the target is found, it is consciously experienced and remembered, but what is the perceptual fate of the fixated nontarget faces? Whereas introspection suggests that one may remember nontargets, previous studies have proposed that almost no memory should be retained. Using a gaze-contingent paradigm, we asked subjects to visually search for a target face within a crowded natural scene and then tested their memory for nontarget faces, as well as their confidence in those memories. Subjects remembered up to seven fixated, nontarget faces with more than 70% accuracy. Memory accuracy was correlated with trial-by-trial confidence ratings, which implies that the memory was consciously maintained and accessed. When the search scene was inverted, no more than three nontarget faces were remembered. These findings imply that incidental memory for faces, such as those recalled by eyewitnesses, is more reliable than is usually assumed. © The Author(s) 2016.
Wood, Carly; Angus, Caroline; Pretty, Jules; Sandercock, Gavin; Barton, Jo
2013-01-01
This study assessed whether exercising whilst viewing natural or built scenes affected self-esteem (SE) and mood in adolescents. Twenty-five adolescents participated in three exercise tests on consecutive days. A graded exercise test established the work rate equivalent to 50% heart rate reserve for use in subsequent constant load tests (CLTs). Participants undertook two 15-min CLTs in random order viewing scenes of either natural or built environments. Participants completed Rosenberg's SE scale and the adolescent profile of mood states questionnaire pre- and post-exercise. There was a significant main effect for SE (F(1) = 6.10; P < 0.05) and mood (F(6) = 5.29; P < 0.001) due to exercise, but no effect of viewing different environmental scenes (P > 0.05). Short bouts of moderate physical activity can have a positive impact on SE and mood in adolescents. Future research should incorporate field studies to examine the psychological effects of contact with real environments.
4D light-field sensing system for people counting
NASA Astrophysics Data System (ADS)
Hou, Guangqi; Zhang, Chi; Wang, Yunlong; Sun, Zhenan
2016-03-01
Counting the number of people is still an important task in social security applications, and a few methods based on video surveillance have been proposed in recent years. In this paper, we design a novel optical sensing system to directly acquire the depth map of the scene from one light-field camera. The light-field sensing system can count the number of people crossing the passageway, and record the direction and intensity of rays at a snapshot without any assistant light devices. Depth maps are extracted from the raw light-ray sensing data. Our smart sensing system is equipped with a passive imaging sensor, which is able to naturally discern the depth difference between the head and shoulders for each person. Then a human model is built. By detecting the human model in light-field images, the number of people passing through the scene can be counted rapidly. We verify the feasibility and accuracy of the sensing system by capturing real-world scenes with single and multiple people passing under natural illumination.
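The head/shoulder depth cue lends itself to a simple counting sketch: threshold the overhead depth map at an (assumed) head height and count connected regions. The function name, threshold, and connectivity choice are illustrative, not the authors' actual pipeline:

```python
def count_heads(depth, head_max_depth=1.6):
    """Count people in an overhead depth map by counting 4-connected
    regions of pixels closer to the camera than `head_max_depth` (metres).

    The threshold is an illustrative assumption, chosen so that heads,
    but not shoulders or floor, fall below it.
    """
    rows, cols = len(depth), len(depth[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for i in range(rows):
        for j in range(cols):
            if depth[i][j] < head_max_depth and not seen[i][j]:
                count += 1           # found a new head blob
                stack = [(i, j)]     # flood-fill its connected region
                while stack:
                    y, x = stack.pop()
                    if (0 <= y < rows and 0 <= x < cols
                            and not seen[y][x]
                            and depth[y][x] < head_max_depth):
                        seen[y][x] = True
                        stack.extend([(y + 1, x), (y - 1, x),
                                      (y, x + 1), (y, x - 1)])
    return count
```

A real system would additionally fit the head-and-shoulders profile of the human model to reject head-sized non-human objects.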
Subramanian, Ramanathan; Shankar, Divya; Sebe, Nicu; Melcher, David
2014-03-26
A basic question in vision research regards where people look in complex scenes and how this influences their performance in various tasks. Previous studies with static images have demonstrated a close link between where people look and what they remember. Here, we examined the pattern of eye movements when participants watched neutral and emotional clips from Hollywood-style movies. Participants answered multiple-choice memory questions concerning visual and auditory scene details immediately upon viewing 1-min-long neutral or emotional movie clips. Fixations were more narrowly focused for emotional clips, and immediate memory for object details was worse compared to matched neutral scenes, implying preferential attention to emotional events. Although we found the expected correlation between where people looked and what they remembered for neutral clips, this relationship broke down for emotional clips. When participants were subsequently presented with key frames (static images) extracted from the movie clips such that presentation duration of the target objects (TOs) corresponding to the multiple-choice questions was matched and the earlier questions were repeated, more fixations were observed on the TOs, and memory performance also improved significantly, confirming that emotion modulates the relationship between gaze position and memory performance. Finally, in a long-term memory test, old/new recognition performance was significantly better for emotional scenes as compared to neutral scenes. Overall, these results are consistent with the hypothesis that emotional content draws eye fixations and strengthens memory for the scene gist while weakening encoding of peripheral scene details.
Miller, Laurence
2010-01-01
Effective emergency mental health intervention for victims of crime, natural disaster or terrorism begins the moment the first responders arrive. This article describes a range of on-scene crisis intervention options, including verbal communication, body language, behavioral strategies, and interpersonal style. The correct intervention in the first few moments and hours of a crisis can profoundly influence the recovery course of victims and survivors of catastrophic events.
Euros, Pounds and Albion at Arms: European Monetary Policy and British Defence in the 21st Century
2004-09-01
William Shakespeare, Hamlet, Prince of Denmark, act 1, scene 2. William Shakespeare, Richard the Second, act 2, scene 1. Hugo Young, This Blessed Plot: Britain... Queen: Good Hamlet, cast thy nighted colour... 'tis common. All that lives must die, Passing through nature to eternity. Hamlet: Ay, madam, it is common. Queen: If it
Parameterization of sparse vegetation in thermal images of natural ground landscapes
NASA Astrophysics Data System (ADS)
Agassi, Eyal; Ben-Yosef, Nissim
1997-10-01
The radiant statistics of thermal images of desert terrain scenes and their temporal behavior have been fully understood and well modeled. Unlike desert scenes, most natural terrestrial landscapes contain vegetative objects. A plant is a living object that regulates its temperature through evapotranspiration of leaf stomata, and plant interaction with the outside world is influenced by its physiological processes. Therefore, the heat balance equation for a vegetative object differs from that for an inorganic surface element. Despite this difficulty, plants can be incorporated into the desert surface model when an effective heat conduction parameter is associated with vegetation. Due to evapotranspiration, the effective heat conduction of plants during daytime is much higher than at night. As a result, plants (mainly trees and bushes) are usually the coldest objects in the scene in the daytime while they are not necessarily the warmest objects at night. The parameterization of vegetative objects in terms of effective heat conduction enables the extension of the desert terrain model to scenes with sparse vegetation and the estimation of their radiant statistics and their diurnal behavior. The effective heat conduction image can serve as a tool for vegetation type classification and assessment of the dominant physical process that determines thermal image properties.
Sayer, James R; Buonarosa, Mary Lynn
2008-01-01
This study examines the effects of high-visibility garment design on daytime pedestrian conspicuity in work zones. Factors assessed were garment color, amount of background material, pedestrian arm motion, scene complexity, and driver age. The study was conducted in naturalistic conditions on public roads in real traffic. Drivers drove two passes on a 31-km route and indicated when they detected pedestrians outfitted in the fluorescent garments. The locations of the vehicle and the pedestrian were recorded. Detection distances between fluorescent yellow-green and fluorescent red-orange garments were not significantly different, nor were there any significant two-way interactions involving garment color. Pedestrians were detected at longer distances in lower complexity scenes. Arm motion significantly increased detection distances for pedestrians wearing a Class 2 vest, but had little added benefit on detection distances for pedestrians wearing a Class 2 jacket. Daytime detection distances for pedestrians wearing Class 2 or Class 3 garments are longest when the complexity of the surround is low. The more background information a driver has to search through, the longer it is likely to take the driver to locate a pedestrian--even when wearing a high-visibility garment. These findings will provide information to safety garment manufacturers about characteristics of high-visibility safety garments which make them effective for daytime use.
Knight, Rod; Fast, Danya; DeBeck, Kora; Shoveller, Jean; Small, Will
2017-05-02
Urban drug "scenes" have been identified as important risk environments that shape the health of street-entrenched youth. New knowledge is needed to inform policy and programing interventions to help reduce youths' drug scene involvement and related health risks. The aim of this study was to identify how young people envisioned exiting a local, inner-city drug scene in Vancouver, Canada, as well as the individual, social and structural factors that shaped their experiences. Between 2008 and 2016, we draw on 150 semi-structured interviews with 75 street-entrenched youth. We also draw on data generated through ethnographic fieldwork conducted with a subgroup of 25 of these youth between. Youth described that, in order to successfully exit Vancouver's inner city drug scene, they would need to: (a) secure legitimate employment and/or obtain education or occupational training; (b) distance themselves - both physically and socially - from the urban drug scene; and (c) reduce their drug consumption. As youth attempted to leave the scene, most experienced substantial social and structural barriers (e.g., cycling in and out of jail, the need to access services that are centralized within a place that they are trying to avoid), in addition to managing complex individual health issues (e.g., substance dependence). Factors that increased youth's capacity to successfully exit the drug scene included access to various forms of social and cultural capital operating outside of the scene, including supportive networks of friends and/or family, as well as engagement with addiction treatment services (e.g., low-threshold access to methadone) to support cessation or reduction of harmful forms of drug consumption. 
Policies and programming interventions that can facilitate young people's efforts to reduce engagement with Vancouver's inner-city drug scene are critically needed, including meaningful educational and/or occupational training opportunities, 'low threshold' addiction treatment services, as well as access to supportive housing outside of the scene.
Scene analysis for effective visual search in rough three-dimensional-modeling scenes
NASA Astrophysics Data System (ADS)
Wang, Qi; Hu, Xiaopeng
2016-11-01
Visual search is a fundamental technology in the computer vision community. It is difficult to find an object in a complex scene when similar distracters exist in the background. We propose a target search method for rough three-dimensional-modeling scenes based on visual salience theory and a camera imaging model. We define the salience of objects (or features) and explain how salience measurements of objects are calculated. We also present a type of search path that guides the search to the target through salient objects. Along the search path, as each preceding object is localized, the search region of each subsequent object shrinks; the regions are calculated through the imaging model and an optimization method. The experimental results indicate that the proposed method is capable of resolving the ambiguities resulting from distracters that share visual features with the target, improving search speed by over 50%.
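One way to sketch the salience-ordered search path: score each object by how far its features lie from every other object's, then visit objects from most to least salient so that distinctive landmarks are localized first. The Euclidean distance measure and the feature vectors are assumptions for illustration, not the paper's actual salience definition:

```python
def salience(features):
    """Per-object salience as the minimum feature-space distance to any
    other object in the scene: objects resembling distracters score low."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return [min(dist(f, g) for j, g in enumerate(features) if j != i)
            for i, f in enumerate(features)]

def search_path(features):
    """Visit objects from most to least salient, so unambiguous landmarks
    are localized first and later search regions can shrink."""
    s = salience(features)
    return sorted(range(len(features)), key=lambda i: -s[i])
```

Here two near-identical objects mutually suppress each other's salience, so the search starts at the isolated, unambiguous one.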
Graham, Daniel J; Field, David J
2008-01-01
Two recent studies suggest that natural scenes and paintings show similar statistical properties. But does the content or region of origin of an artwork affect its statistical properties? We addressed this question by having judges place paintings from a large, diverse collection into one of three subject-matter categories using a forced-choice paradigm. Basic statistics for images whose categorization was agreed on by all judges showed no significant differences between those judged to be 'landscape' and 'portrait/still-life', but these two classes differed from paintings judged to be 'abstract'. All categories showed basic spatial statistical regularities similar to those typical of natural scenes. A test of the full painting collection (140 images) with respect to the works' place of origin (provenance) showed significant differences between Eastern works and Western ones, differences which we find are likely related to the materials and the choice of background color. Although artists deviate slightly from reproducing natural statistics in abstract art (compared to representational art), the great majority of human art likely shares basic statistical limitations. We argue that statistical regularities in art are rooted in the need to make art visible to the eye, not in the inherent aesthetic value of natural-scene statistics, and we suggest that variability in spatial statistics may be generally imposed by manufacture.
IKONOS geometric characterization
Helder, Dennis; Coan, Michael; Patrick, Kevin; Gaska, Peter
2003-01-01
The IKONOS spacecraft acquired images on July 3, 17, and 25, and August 13, 2001 of Brookings SD, a small city in east central South Dakota, and on May 22, June 30, and July 30, 2000, of the rural area around the EROS Data Center. South Dakota State University (SDSU) evaluated the Brookings scenes and the USGS EROS Data Center (EDC) evaluated the other scenes. The images evaluated by SDSU utilized various natural objects and man-made features as identifiable targets randomly distributed throughout the scenes, while the images evaluated by EDC utilized pre-marked artificial points (panel points) to provide the best possible targets distributed in a grid pattern. Space Imaging provided products at different processing levels to each institution. For each scene, the pixel (line, sample) locations of the various targets were compared to field-observed, survey-grade Global Positioning System locations. Patterns of error distribution for each product were plotted, and a variety of statistical statements of accuracy are made. The IKONOS sensor also acquired 12 pairs of stereo images of globally distributed scenes between April 2000 and April 2001. For each scene, analysts at the National Imagery and Mapping Agency (NIMA) compared derived photogrammetric coordinates to their corresponding NIMA field-surveyed ground control points (GCPs). NIMA analysts determined horizontal and vertical accuracies by averaging the differences between the derived photogrammetric points and the field-surveyed GCPs for all 12 stereo pairs. Patterns of error distribution for each scene are presented.
Evaluation of a Passive Nature Viewing Program Set to Music.
Cadman, Sally J
2014-09-01
Research has revealed that passive nature viewing (viewing nature scenes without actually being in nature) has many health benefits, but little is known about the best method of offering this complementary modality. The purpose of this pilot program was to evaluate the impact of a passive nature viewing program set to music on stress reduction in adults living in the community. A pre- and postsurvey design, along with weekly recordings of stress and relaxation levels, was used to evaluate the effect of this passive nature viewing program on stress reduction. Participants watched one of three preselected nature scenes for 5 minutes a day over 1 month and rated their stress and relaxation levels weekly on a 100-mm Visual Analogue Scale before and after viewing the nature DVD. Quantitative analyses were not performed because of the small number of subjects (n = 10) completing the study. Qualitative analysis found five key categories that have an impact on program use: (a) technology, (b) personal preferences, (c) time, (d) immersion, and (e) use of the program. Holistic nurses may consider integrating patient preferences and immersion strategies in the design of future passive nature viewing programs to reduce attrition and improve success. © The Author(s) 2013.
Parahippocampal and retrosplenial contributions to human spatial navigation
Epstein, Russell A.
2010-01-01
Spatial navigation is a core cognitive ability in humans and animals. Neuroimaging studies have identified two functionally-defined brain regions that activate during navigational tasks and also during passive viewing of navigationally-relevant stimuli such as environmental scenes: the parahippocampal place area (PPA) and the retrosplenial complex (RSC). Recent findings indicate that the PPA and RSC play distinct and complementary roles in spatial navigation, with the PPA more concerned with representation of the local visual scene and RSC more concerned with situating the scene within the broader spatial environment. These findings are a first step towards understanding the separate components of the cortical network that mediates spatial navigation in humans. PMID:18760955
Neural Correlates of Divided Attention in Natural Scenes.
Fagioli, Sabrina; Macaluso, Emiliano
2016-09-01
Individuals are able to split attention between separate locations, but divided spatial attention incurs the additional requirement of monitoring multiple streams of information. Here, we investigated divided attention using photos of natural scenes, where the rapid categorization of familiar objects and prior knowledge about the likely positions of objects in the real world might affect the interplay between these spatial and nonspatial factors. Sixteen participants underwent fMRI during an object detection task. They were presented with scenes containing either a person or a car, located on the left or right side of the photo. Participants monitored either one or both object categories, in one or both visual hemifields. First, we investigated the interplay between spatial and nonspatial attention by comparing conditions of divided attention between categories and/or locations. We then assessed the contribution of top-down processes versus stimulus-driven signals by separately testing the effects of divided attention in target and nontarget trials. The results revealed activation of a bilateral frontoparietal network when dividing attention between the two object categories versus attending to a single category but no main effect of dividing attention between spatial locations. Within this network, the left dorsal premotor cortex and the left intraparietal sulcus were found to combine task- and stimulus-related signals. These regions showed maximal activation when participants monitored two categories at spatially separate locations and the scene included a nontarget object. We conclude that the dorsal frontoparietal cortex integrates top-down and bottom-up signals in the presence of distractors during divided attention in real-world scenes.
How emotion leads to selective memory: neuroimaging evidence.
Waring, Jill D; Kensinger, Elizabeth A
2011-06-01
Often memory for emotionally arousing items is enhanced relative to neutral items within complex visual scenes, but this enhancement can come at the expense of memory for peripheral background information. This 'trade-off' effect has been elicited by a range of stimulus valence and arousal levels, yet the magnitude of the effect has been shown to vary with these factors. Using fMRI, this study investigated the neural mechanisms underlying this selective memory for emotional scenes. Further, we examined how these processes are affected by stimulus dimensions of arousal and valence. The trade-off effect in memory occurred for low to high arousal positive and negative scenes. There was a core emotional memory network associated with the trade-off among all the emotional scene types, however, there were additional regions that were uniquely associated with the trade-off for each individual scene type. These results suggest that there is a common network of regions associated with the emotional memory trade-off effect, but that valence and arousal also independently affect the neural activity underlying the effect. Copyright © 2011 Elsevier Ltd. All rights reserved.
LivePhantom: Retrieving Virtual World Light Data to Real Environments.
Kolivand, Hoshang; Billinghurst, Mark; Sunar, Mohd Shahrizal
2016-01-01
To achieve realistic Augmented Reality (AR), shadows play an important role in creating a 3D impression of a scene. Casting virtual shadows on real and virtual objects is one of the topics of research being conducted in this area. In this paper, we propose a new method for creating complex AR indoor scenes using real time depth detection to exert virtual shadows on virtual and real environments. A Kinect camera was used to produce a depth map for the physical scene mixing into a single real-time transparent tacit surface. Once this is created, the camera's position can be tracked from the reconstructed 3D scene. Real objects are represented by virtual object phantoms in the AR scene enabling users holding a webcam and a standard Kinect camera to capture and reconstruct environments simultaneously. The tracking capability of the algorithm is shown and the findings are assessed drawing upon qualitative and quantitative methods making comparisons with previous AR phantom generation applications. The results demonstrate the robustness of the technique for realistic indoor rendering in AR systems.
Sensor-Topology Based Simplicial Complex Reconstruction from Mobile Laser Scanning
NASA Astrophysics Data System (ADS)
Guinard, S.; Vallet, B.
2018-05-01
We propose a new method for the reconstruction of simplicial complexes (combining points, edges and triangles) from 3D point clouds from Mobile Laser Scanning (MLS). Our main goal is to produce a reconstruction of a scene that is adapted to the local geometry of objects. Our method uses the inherent topology of the MLS sensor to define a spatial adjacency relationship between points. We then investigate each possible connexion between adjacent points and filter them by searching collinear structures in the scene, or structures perpendicular to the laser beams. Next, we create triangles for each triplet of self-connected edges. Last, we improve this method with a regularization based on the co-planarity of triangles and collinearity of remaining edges. We compare our results to a naive simplicial complexes reconstruction based on edge length.
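The sensor-topology idea above can be sketched in a few lines: points ordered by the scanner form a grid, grid neighbours give candidate edges, and a triangle is emitted wherever a triplet of accepted edges closes. This toy version uses plain edge-length filtering in place of the paper's collinearity/perpendicularity tests and regularization (function and variable names are hypothetical):

```python
import numpy as np

def grid_edges_triangles(pts, max_len):
    """pts: (rows, cols, 3) array of MLS points ordered by sensor topology.
    Connect each point to its right/down neighbour when the edge is shorter
    than max_len, then close a triangle for each triplet of accepted edges."""
    rows, cols, _ = pts.shape
    ok = lambda a, b: np.linalg.norm(pts[a] - pts[b]) < max_len
    edges, tris = set(), []
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and ok((r, c), (r, c + 1)):   # right neighbour
                edges.add(((r, c), (r, c + 1)))
            if r + 1 < rows and ok((r, c), (r + 1, c)):   # down neighbour
                edges.add(((r, c), (r + 1, c)))
    for r in range(rows - 1):
        for c in range(cols - 1):
            # upper-left triangle of a grid cell, if all three edges are short
            if (((r, c), (r, c + 1)) in edges and ((r, c), (r + 1, c)) in edges
                    and ok((r, c + 1), (r + 1, c))):
                tris.append(((r, c), (r, c + 1), (r + 1, c)))
    return edges, tris

# Four points on a unit grid: all four axis edges and one triangle survive
pts = np.array([[[0, 0, 0], [1, 0, 0]],
                [[0, 1, 0], [1, 1, 0]]], dtype=float)
edges, tris = grid_edges_triangles(pts, max_len=1.6)
```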
Hollingworth, Andrew; Henderson, John M
2004-07-01
In a change detection paradigm, the global orientation of a natural scene was incrementally changed in 1 degree intervals. In Experiments 1 and 2, participants demonstrated sustained change blindness to incremental rotation, often coming to consider a significantly different scene viewpoint as an unchanged continuation of the original view. Experiment 3 showed that participants who failed to detect the incremental rotation nevertheless reliably detected a single-step rotation back to the initial view. Together, these results demonstrate an important dissociation between explicit change detection and visual memory. Following a change, visual memory is updated to reflect the changed state of the environment, even if the change was not detected.
Research on 3D virtual campus scene modeling based on 3ds Max and VRML
NASA Astrophysics Data System (ADS)
Kang, Chuanli; Zhou, Yanliu; Liang, Xianyue
2015-12-01
With the rapid development of modern technology, digital information management and virtual reality simulation have become research hotspots. A 3D virtual campus model can not only represent real-world objects in a natural, realistic, and vivid way, but can also extend the real campus in the dimensions of time and space, combining the school environment with information. This paper mainly uses 3ds Max to create three-dimensional models of campus buildings, special land parcels, and other objects. Dynamic interactive functions are then realized by programming the object models from 3ds Max in VRML. This research focuses on virtual campus scene modeling technology and VRML scene design, and on optimization strategies for the various real-time processing techniques used in the scene design process, guaranteeing the quality of texture map images while improving the running speed of image texture mapping. According to the features and architecture of Guilin University of Technology, 3ds Max, AutoCAD and VRML were used to model the different objects of the virtual campus. Finally, the resulting virtual campus scene is summarized.
Video Segmentation Descriptors for Event Recognition
2014-12-08
Velastin, 3D Extended Histogram of Oriented Gradients (3DHOG) for Classification of Road Users in Urban Scenes, BMVC, 2009. [3] M.-Y. Chen and A. Hauptmann...computed on 3D volume outputted by the hierarchical segmentation. Each video is described as follows. Each supertube is temporally divided in n-frame...strength of these descriptors is their adaptability to the scene variations since they are grounded on a video segmentation. This makes them naturally robust
Statistics of high-level scene context.
Greene, Michelle R
2013-01-01
Context is critical for recognizing environments and for searching for objects within them: contextual associations have been shown to modulate reaction time and object recognition accuracy, as well as influence the distribution of eye movements and patterns of brain activations. However, we have not yet systematically quantified the relationships between objects and their scene environments. Here I seek to fill this gap by providing descriptive statistics of object-scene relationships. A total of 48,167 objects were hand-labeled in 3499 scenes using the LabelMe tool (Russell et al., 2008). From these data, I computed a variety of descriptive statistics at three different levels of analysis: the ensemble statistics that describe the density and spatial distribution of unnamed "things" in the scene; the bag of words level where scenes are described by the list of objects contained within them; and the structural level where the spatial distribution and relationships between the objects are measured. The utility of each level of description for scene categorization was assessed through the use of linear classifiers, and the plausibility of each level for modeling human scene categorization is discussed. Of the three levels, ensemble statistics were found to be the most informative (per feature), and also best explained human patterns of categorization errors. Although a bag of words classifier had similar performance to human observers, it had a markedly different pattern of errors. However, certain objects are more useful than others, and ceiling classification performance could be achieved using only the 64 most informative objects. As object location tends not to vary as a function of category, structural information provided little additional information.
Additionally, these data provide valuable information on natural scene redundancy that can be exploited for machine vision, and can help the visual cognition community to design experiments guided by statistics rather than intuition.
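The two simpler levels of description above lend themselves to a compact sketch, assuming LabelMe-style annotations reduced to per-scene object lists (the scene data here are invented, not the study's corpus):

```python
from collections import Counter

def ensemble_and_bag_of_words(scenes):
    """scenes: dict mapping scene name -> list of labeled objects.
    Returns the ensemble-level object density (mean objects per scene)
    and a bag-of-words count vector for each scene."""
    density = sum(len(objs) for objs in scenes.values()) / len(scenes)
    bags = {name: Counter(objs) for name, objs in scenes.items()}
    return density, bags

# Hypothetical hand-labeled scenes
scenes = {
    "kitchen_01": ["sink", "cup", "cup", "window"],
    "street_01":  ["car", "person", "tree", "tree", "sign", "window"],
}
density, bags = ensemble_and_bag_of_words(scenes)
```

A linear classifier of the kind the study evaluates would then be trained on either the ensemble features or the bag-of-words vectors.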
Is moral beauty different from facial beauty? Evidence from an fMRI study.
Wang, Tingting; Mo, Lei; Mo, Ce; Tan, Li Hai; Cant, Jonathan S; Zhong, Luojin; Cupchik, Gerald
2015-06-01
Is moral beauty different from facial beauty? Two functional magnetic resonance imaging experiments were performed to answer this question. Experiment 1 investigated the network of moral aesthetic judgments and facial aesthetic judgments. Participants performed aesthetic judgments and gender judgments on both faces and scenes containing moral acts. The conjunction analysis of the contrasts 'facial aesthetic judgment > facial gender judgment' and 'scene moral aesthetic judgment > scene gender judgment' identified the common involvement of the orbitofrontal cortex (OFC), inferior temporal gyrus and medial superior frontal gyrus, suggesting that both types of aesthetic judgments are based on the orchestration of perceptual, emotional and cognitive components. Experiment 2 examined the network of facial beauty and moral beauty during implicit perception. Participants performed a non-aesthetic judgment task on both faces (beautiful vs common) and scenes (containing morally beautiful vs neutral information). We observed that facial beauty (beautiful faces > common faces) involved both the cortical reward region OFC and the subcortical reward region putamen, whereas moral beauty (moral beauty scenes > moral neutral scenes) only involved the OFC. Moreover, compared with facial beauty, moral beauty spanned a larger-scale cortical network, indicating more advanced and complex cerebral representations characterizing moral beauty. © The Author (2014). Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
Integrating mechanisms of visual guidance in naturalistic language production.
Coco, Moreno I; Keller, Frank
2015-05-01
Situated language production requires the integration of visual attention and linguistic processing. Previous work has not conclusively disentangled the role of perceptual scene information and structural sentence information in guiding visual attention. In this paper, we present an eye-tracking study that demonstrates that three types of guidance, perceptual, conceptual, and structural, interact to control visual attention. In a cued language production experiment, we manipulate perceptual (scene clutter) and conceptual guidance (cue animacy) and measure structural guidance (syntactic complexity of the utterance). Analysis of the time course of language production, before and during speech, reveals that all three forms of guidance affect the complexity of visual responses, quantified in terms of the entropy of attentional landscapes and the turbulence of scan patterns, especially during speech. We find that perceptual and conceptual guidance mediate the distribution of attention in the scene, whereas structural guidance closely relates to scan pattern complexity. Furthermore, the eye-voice spans of the cued object and its perceptual competitor are similar, with latency mediated by both perceptual and structural guidance. These results rule out a strict interpretation of structural guidance as the single dominant form of visual guidance in situated language production. Rather, the phase of the task and the associated demands of cross-modal cognitive processing determine the mechanisms that guide attention.
You think you know where you looked? You better look again.
Võ, Melissa L-H; Aizenman, Avigael M; Wolfe, Jeremy M
2016-10-01
People are surprisingly bad at knowing where they have looked in a scene. We tested participants' ability to recall their own eye movements in 2 experiments using natural or artificial scenes. In each experiment, participants performed a change-detection (Exp.1) or search (Exp.2) task. On 25% of trials, after 3 seconds of viewing the scene, participants were asked to indicate where they thought they had just fixated. They responded by making mouse clicks on 12 locations in the unchanged scene. After 135 trials, observers saw 10 new scenes and were asked to put 12 clicks where they thought someone else would have looked. Although observers located their own fixations more successfully than a random model, their performance was no better than when they were guessing someone else's fixations. Performance with artificial scenes was worse, though judging one's own fixations was slightly superior. Even after repeating the fixation-location task on 30 scenes immediately after scene viewing, performance was far from the prediction of an ideal observer. Memory for our own fixation locations appears to add next to nothing beyond what common sense tells us about the likely fixations of others. These results have important implications for socially important visual search tasks. For example, a radiologist might think he has looked at "everything" in an image, but eye tracking data suggest that this is not so. Such shortcomings might be avoided by providing observers with better insights of where they have looked. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Selective looking at natural scenes: Hedonic content and gender.
Bradley, Margaret M; Costa, Vincent D; Lang, Peter J
2015-10-01
Choice viewing behavior when looking at affective scenes was assessed to examine differences due to hedonic content and gender by monitoring eye movements in a selective looking paradigm. On each trial, participants viewed a pair of pictures that included a neutral picture together with an affective scene depicting either contamination, mutilation, threat, food, nude males, or nude females. The duration of time that gaze was directed to each picture in the pair was determined from eye fixations. Results indicated that viewing choices varied with both hedonic content and gender. Initially, gaze duration for both men and women was heightened when viewing all affective contents, but was subsequently followed by significant avoidance of scenes depicting contamination or nude males. Gender differences were most pronounced when viewing pictures of nude females, with men continuing to devote longer gaze time to pictures of nude females throughout viewing, whereas women avoided scenes of nude people, whether male or female, later in the viewing interval. For women, reported disgust of sexual activity was also inversely related to gaze duration for nude scenes. Taken together, selective looking as indexed by eye movements reveals differential perceptual intake as a function of specific content, gender, and individual differences. Copyright © 2015 Elsevier B.V. All rights reserved.
Modeling Of Object- And Scene-Prototypes With Hierarchically Structured Classes
NASA Astrophysics Data System (ADS)
Ren, Z.; Jensch, P.; Ameling, W.
1989-03-01
The success of knowledge-based image analysis methodology and implementation tools depends largely on an appropriately and efficiently built model in which the domain-specific context information about, and the inherent structure of, the observed image scene has been encoded. For identifying an object in an application environment, a computer vision system needs to know, first, the description of the object to be found in an image or an image sequence and, second, the corresponding relationships between object descriptions within the image sequence. This paper presents models of image objects and scenes by means of hierarchically structured classes. Using the topovisual formalism of graphs and higraphs, we are currently studying principally the relational aspect and data abstraction of the modeling in order to visualize the structural nature resident in image objects and scenes, and to formalize their descriptions. The goal is to expose the structure of the image scene and the correspondence of image objects in the low-level image interpretation process. The object-based system design approach has been applied to build the model base. We utilize the object-oriented programming language C++ for designing, testing and implementing the abstracted entity classes and the operation structures which have been modeled topovisually. The reference images used for modeling prototypes of objects and scenes are from industrial environments as well as medical applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arthur Bleeker, PNNL
2015-03-11
SVF is a full-featured OpenGL 3D framework that allows for rapid creation of complex visualizations. The SVF framework handles much of the lifecycle and complex tasks required for a 3D visualization. Unlike a game framework, SVF was designed to use fewer resources, work well in a windowed environment, and only render when necessary. The scene also takes advantage of multiple threads to free up the UI thread as much as possible. Shapes (actors) in the scene are created by adding or removing functionality (through support objects) during runtime. This allows a highly flexible and dynamic means of creating highly complex actors without the code complexity (it also helps overcome the lack of multiple inheritance in Java). All classes are highly customizable, and there are abstract classes which are intended to be subclassed to allow a developer to create more complex and highly performant actors. There are multiple demos included in the framework to help the developer get started, and they show off nearly all of the functionality. Some simple shapes (actors) are already created for you, such as text, bordered text, radial text, text area, complex paths, NURBS paths, cube, disk, grid, plane, geometric shapes, and volumetric area. It also comes with various camera types for viewing that can be dragged, zoomed, and rotated. Picking or selecting items in the scene can be accomplished in various ways depending on your needs (raycasting or color picking). The framework currently has functionality for tooltips, animation, actor pools, color gradients, 2D physics, text, 1D/2D/3D textures, children, blending, clipping planes, view frustum culling, custom shaders, and custom actor states.
Review of the CIE system of colorimetry and its use in dentistry.
Westland, Stephen
2003-01-01
When we observe the light reflected from surfaces in a scene or look directly at light emitted by light sources, we experience the sensation of color. Color is just one attribute of a complex and not fully understood set of properties that define the appearance of our surroundings. To measure or specify the color of an object, we need to take into account the nature of the light under which the object is viewed, the spectral reflectance properties of the surface, and the properties of the human color vision system. In this article the CIE system of colorimetry is briefly reviewed and its limitations are described. The consequences of these limitations for color measurement in dentistry are discussed.
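The three ingredients the review lists (the light source, the surface reflectance, and the observer's color-matching functions) combine into CIE tristimulus values by summation over wavelength. A sketch with toy spectral curves standing in for the real CIE tables (the Gaussian "color-matching functions" below are invented stand-ins, not CIE data):

```python
import numpy as np

def tristimulus(S, R, xbar, ybar, zbar, dl=10.0):
    """CIE tristimulus values from illuminant S, reflectance R, and
    color-matching functions, all sampled at the same wavelengths.
    k normalizes so a perfect reflecting diffuser (R = 1) has Y = 100."""
    k = 100.0 / np.sum(S * ybar * dl)
    X = k * np.sum(S * R * xbar * dl)
    Y = k * np.sum(S * R * ybar * dl)
    Z = k * np.sum(S * R * zbar * dl)
    return X, Y, Z

# Hypothetical coarse 400-700 nm sampling (real CMFs are tabulated by the CIE)
wl = np.arange(400.0, 701.0, 10.0)
S = np.ones_like(wl)                          # equal-energy illuminant E
xbar = np.exp(-0.5 * ((wl - 600) / 40) ** 2)  # toy stand-ins for CMFs
ybar = np.exp(-0.5 * ((wl - 550) / 40) ** 2)
zbar = np.exp(-0.5 * ((wl - 450) / 40) ** 2)
gray = np.full_like(wl, 0.5)                  # flat 50% reflector
X, Y, Z = tristimulus(S, gray, xbar, ybar, zbar)
```

With R = 1 everywhere the normalization gives Y = 100, the conventional value for a perfect white; the flat 50% reflector lands at Y = 50.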
Recent progress in heteronuclear long-range NMR of complex carbohydrates: 3D H2BC and clean HMBC.
Meier, Sebastian; Petersen, Bent O; Duus, Jens Ø; Sørensen, Ole W
2009-11-02
The new NMR experiments 3D H2BC and clean HMBC are explored for challenging applications to a complex carbohydrate at natural abundance of (13)C. The 3D H2BC experiment is crucial for sequential assignment as it yields heteronuclear one- and two-bond together with COSY correlations for the (1)H spins, all in a single spectrum with good resolution and non-informative diagonal-type peaks suppressed. Clean HMBC is a remedy for the ubiquitous problem of strong coupling induced one-bond correlation artifacts in HMBC spectra of carbohydrates. Both experiments work well for one of the largest carbohydrates whose structure has been determined by NMR, not least due to the enhanced resolution offered by the third dimension in 3D H2BC and the improved spectral quality due to artifact suppression in clean HMBC. Hence these new experiments set the scene to take advantage of the sensitivity boost achieved by the latest generation of cold probes for NMR structure determination of even larger and more complex carbohydrates in solution.
NASA Astrophysics Data System (ADS)
Keane, Tommy P.; Saber, Eli; Rhody, Harvey; Savakis, Andreas; Raj, Jeffrey
2012-04-01
Contemporary research in automated panorama creation utilizes camera calibration or extensive knowledge of camera locations and relations to each other to achieve successful results. Research in image registration attempts to restrict these same camera parameters or apply complex point-matching schemes to overcome the complications found in real-world scenarios. This paper presents a novel automated panorama creation algorithm by developing an affine transformation search based on maximized mutual information (MMI) for region-based registration. Standard MMI techniques have been limited to applications with airborne/satellite imagery or medical images. We show that a novel MMI algorithm can approximate an accurate registration between views of realistic scenes of varying depth distortion. The proposed algorithm has been developed using stationary, color, surveillance video data for a scenario with no a priori camera-to-camera parameters. This algorithm is robust for strict- and nearly-affine-related scenes, while providing a useful approximation for the overlap regions in scenes related by a projective homography or a more complex transformation, allowing for a set of efficient and accurate initial conditions for pixel-based registration.
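The core of an MMI search is a mutual-information score computed from the joint intensity histogram of the two views, maximized over candidate transforms. A stripped-down sketch that searches only horizontal shifts rather than a full affine transform (all names are hypothetical, and the random images stand in for the two camera views):

```python
import numpy as np

def mutual_information(a, b, bins=32):
    # MI in bits, from the joint intensity histogram of two same-size images
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / np.outer(px, py)[nz])))

def best_shift(fixed, moving, shifts):
    # Exhaustive 1-D stand-in for the affine search: keep the shift whose
    # warped image shares the most information with the fixed view
    return max(shifts, key=lambda s: mutual_information(
        fixed, np.roll(moving, s, axis=1)))

rng = np.random.default_rng(0)
fixed = rng.random((64, 64))
moving = np.roll(fixed, 5, axis=1)   # second "camera view": a shifted copy
```

Because MI depends only on the co-occurrence of intensities, the same score works even when the two views differ in brightness or sensor response, which is what makes it attractive for multi-camera registration.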
Soft Mixer Assignment in a Hierarchical Generative Model of Natural Scene Statistics
Schwartz, Odelia; Sejnowski, Terrence J.; Dayan, Peter
2010-01-01
Gaussian scale mixture models offer a top-down description of signal generation that captures key bottom-up statistical characteristics of filter responses to images. However, the pattern of dependence among the filters for this class of models is prespecified. We propose a novel extension to the gaussian scale mixture model that learns the pattern of dependence from observed inputs and thereby induces a hierarchical representation of these inputs. Specifically, we propose that inputs are generated by gaussian variables (modeling local filter structure), multiplied by a mixer variable that is assigned probabilistically to each input from a set of possible mixers. We demonstrate inference of both components of the generative model, for synthesized data and for different classes of natural images, such as a generic ensemble and faces. For natural images, the mixer variable assignments show invariances resembling those of complex cells in visual cortex; the statistics of the gaussian components of the model are in accord with the outputs of divisive normalization models. We also show how our model helps interrelate a wide range of models of image statistics and cortical processing. PMID:16999575
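The generative model described, gaussian variables multiplied by a probabilistically assigned mixer, is easy to simulate, and the heavy-tailed marginals that result are what let the model match filter-response statistics. A sketch with invented parameter values (the two-mixer setup and probabilities are illustrative, not the paper's fitted model):

```python
import numpy as np

def sample_gsm(n, mixers, probs, dim=2, rng=None):
    """Draw n samples from a gaussian scale mixture: each sample is a
    gaussian vector (local filter structure) multiplied by a mixer value
    chosen probabilistically from `mixers` with probabilities `probs`."""
    rng = rng or np.random.default_rng(0)
    assignments = rng.choice(len(mixers), size=n, p=probs)  # mixer assignment
    v = np.asarray(mixers, float)[assignments]
    g = rng.standard_normal((n, dim))                       # gaussian component
    return v[:, None] * g, assignments

x, z = sample_gsm(10000, mixers=[0.5, 4.0], probs=[0.7, 0.3])
```

The marginal kurtosis of x greatly exceeds the gaussian value of 3, reproducing the sparse, heavy-tailed histograms characteristic of filter outputs on natural images; inference in the full model runs this generation in reverse, recovering the mixer assignment from observed inputs.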
Multispectral Terrain Background Simulation Techniques For Use In Airborne Sensor Evaluation
NASA Astrophysics Data System (ADS)
Weinberg, Michael; Wohlers, Ronald; Conant, John; Powers, Edward
1988-08-01
A background simulation code developed at Aerodyne Research, Inc., called AERIE is designed to reflect the major sources of clutter that are of concern to staring and scanning sensors of the type being considered for various airborne threat warning (both aircraft and missiles) sensors. The code is a first principles model that could be used to produce a consistent image of the terrain for various spectral bands, i.e., provide the proper scene correlation both spectrally and spatially. The code utilizes both topographic and cultural features to model terrain, typically from DMA data, with a statistical overlay of the critical underlying surface properties (reflectance, emittance, and thermal factors) to simulate the resulting texture in the scene. Strong solar scattering from water surfaces is included with allowance for wind driven surface roughness. Clouds can be superimposed on the scene using physical cloud models and an analytical representation of the reflectivity obtained from scattering off spherical particles. The scene generator is augmented by collateral codes that allow for the generation of images at finer resolution. These codes provide interpolation of the basic DMA databases using fractal procedures that preserve the high frequency power spectral density behavior of the original scene. Scenes are presented illustrating variations in altitude, radiance, resolution, material, thermal factors, and emissivities. The basic models utilized for simulation of the various scene components and various "engineering level" approximations are incorporated to reduce the computational complexity of the simulation.
Hesse, Janis K; Tsao, Doris Y
2016-11-02
Segmentation and recognition of objects in a visual scene are two problems that are hard to solve separately from each other. When segmenting an ambiguous scene, it is helpful to already know the present objects and their shapes. However, for recognizing an object in clutter, one would like to consider its isolated segment alone to avoid confounds from features of other objects. Border-ownership cells (Zhou et al., 2000) appear to play an important role in segmentation, as they signal the side-of-figure of artificial stimuli. The present work explores the role of border-ownership cells in dorsal macaque visual areas V2 and V3 in the segmentation of natural object stimuli and locally ambiguous stimuli. We report two major results. First, compared with previous estimates, we found a smaller percentage of cells that were consistent across artificial stimuli used previously. Second, we found that the average response of those neurons that did respond consistently to the side-of-figure of artificial stimuli also consistently signaled, as a population, the side-of-figure for borders of single faces, occluding faces and, with higher latencies, even stimuli with illusory contours, such as Mooney faces and natural faces completely missing local edge information. In contrast, the local edge or the outlines of the face alone could not always evoke a significant border-ownership signal. Our results underscore that border ownership is coded by a population of cells, and indicate that these cells integrate a variety of cues, including low-level features and global object context, to compute the segmentation of the scene. To distinguish different objects in a natural scene, the brain must segment the image into regions corresponding to objects. The so-called "border-ownership" cells appear to be dedicated to this task, as they signal for a given edge on which side the object is that owns it. 
Here, we report that individual border-ownership cells are unreliable when tested across a battery of artificial stimuli used previously but can signal border-ownership consistently as a population. We show that these border-ownership population signals are also suited for signaling border-ownership for natural objects and, at longer latency, even for stimuli without local edge information. Our results suggest that border-ownership cells integrate both local, low-level and global, high-level cues to segment the scene.
Generation of binary holograms for deep scenes captured with a camera and a depth sensor
NASA Astrophysics Data System (ADS)
Leportier, Thibault; Park, Min-Chul
2017-01-01
This work presents binary hologram generation from images of a real object acquired from a Kinect sensor. Since hologram calculation from a point-cloud or polygon model presents a heavy computational burden, we adopted a depth-layer approach to generate the holograms. This method enables us to obtain holographic data of large scenes quickly. Our investigations focus on the performance of different methods, iterative and noniterative, to convert complex holograms into binary format. Comparisons were performed to examine the reconstruction of the binary holograms at different depths. We also propose to modify the direct binary search algorithm to take into account several reference image planes. Then, deep scenes featuring multiple planes of interest can be reconstructed with better efficiency.
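A depth-layer hologram of the kind described above can be sketched by propagating each depth layer to the hologram plane with the angular spectrum method and summing the resulting fields. This is a minimal illustration under assumed wavelength and pixel-pitch values, not the authors' implementation; evanescent components are simply suppressed.

```python
import numpy as np

def angular_spectrum(field, z, wavelength, pitch):
    """Propagate a complex field by distance z using the angular spectrum method."""
    n = field.shape[0]
    f = np.fft.fftfreq(n, d=pitch)            # spatial frequencies (cycles/m)
    fx, fy = np.meshgrid(f, f)
    arg = 1.0 / wavelength**2 - fx**2 - fy**2
    kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))  # clip evanescent waves
    transfer = np.exp(1j * kz * z)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

def layer_hologram(layers, depths, wavelength=633e-9, pitch=8e-6):
    """Hologram-plane field: the sum of every depth layer propagated to z=0."""
    holo = np.zeros_like(layers[0], dtype=complex)
    for amp, z in zip(layers, depths):
        holo += angular_spectrum(amp.astype(complex), z, wavelength, pitch)
    return holo
```

Because the cost is one FFT pair per layer rather than one term per scene point, this is why the layer approach scales well to large captured scenes.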
Parametric Coding of the Size and Clutter of Natural Scenes in the Human Brain
Park, Soojin; Konkle, Talia; Oliva, Aude
2015-01-01
Estimating the size of a space and its degree of clutter are effortless and ubiquitous tasks of moving agents in a natural environment. Here, we examine how regions along the occipital–temporal lobe respond to pictures of indoor real-world scenes that parametrically vary in their physical “size” (the spatial extent of a space bounded by walls) and functional “clutter” (the organization and quantity of objects that fill up the space). Using a linear regression model on multivoxel pattern activity across regions of interest, we find evidence that both properties of size and clutter are represented in the patterns of parahippocampal cortex, while the retrosplenial cortex activity patterns are predominantly sensitive to the size of a space, rather than the degree of clutter. Parametric whole-brain analyses confirmed these results. Importantly, this size and clutter information was represented in a way that generalized across different semantic categories. These data provide support for a property-based representation of spaces, distributed across multiple scene-selective regions of the cerebral cortex. PMID:24436318
New insights into ambient and focal visual fixations using an automatic classification algorithm
Follet, Brice; Le Meur, Olivier; Baccino, Thierry
2011-01-01
Overt visual attention is the act of directing the eyes toward a given area. These eye movements are characterised by saccades and fixations. A debate currently surrounds the role of visual fixations. Do they all have the same role in the free viewing of natural scenes? Recent studies suggest that at least two types of visual fixations exist: focal and ambient. The former is believed to be used to inspect local areas accurately, whereas the latter is used to obtain the context of the scene. We investigated the use of an automated system to cluster visual fixations in two groups using four types of natural scene images. We found new evidence to support a focal–ambient dichotomy. Our data indicate that the determining factor is the saccade amplitude. The dependence on the low-level visual features and the time course of these two kinds of visual fixations were examined. Our results demonstrate that there is an interplay between both fixation populations and that focal fixations are more dependent on low-level visual features than are ambient fixations. PMID:23145248
Sanford, Michelle R
2017-01-01
The application of insect and arthropod information to medicolegal death investigations is one of the more exacting applications of entomology. Historically limited to homicide investigations, the integration of full time forensic entomology services to the medical examiner's office in Harris County has opened up the opportunity to apply entomology to a wide variety of manner of death classifications and types of scenes to make observations on a number of different geographical and species-level trends in Harris County, Texas, USA. In this study, a retrospective analysis was made of 203 forensic entomology cases analyzed during the course of medicolegal death investigations performed by the Harris County Institute of Forensic Sciences in Houston, TX, USA from January 2013 through April 2016. These cases included all manner of death classifications, stages of decomposition and a variety of different scene types that were classified into decedents transported from the hospital (typically associated with myiasis or sting allergy; 3.0%), outdoor scenes (32.0%) or indoor scenes (65.0%). Ambient scene air temperature at the time of scene investigation was the only significantly different factor observed between indoor and outdoor scenes, with average indoor scene temperature being slightly cooler (25.2°C) than that observed outdoors (28.0°C). Relative humidity was not found to be significantly different between scene types. Most of the indoor scenes were classified as natural (43.3%) whereas most of the outdoor scenes were classified as homicides (12.3%). All other manner of death classifications came from both indoor and outdoor scenes. Several species were found to be significantly associated with indoor scenes as indicated by a binomial test, including Blaesoxipha plinthopyga (Wiedemann) (Diptera: Sarcophagidae), all Sarcophagidae (including B. plinthopyga), Megaselia scalaris Loew (Diptera: Phoridae), Synthesiomyia nudiseta Wulp (Diptera: Muscidae) and Lucilia cuprina (Wiedemann) (Diptera: Calliphoridae). The only species that was a significant indicator of an outdoor scene was Lucilia eximia (Wiedemann) (Diptera: Calliphoridae). All other insect species that were collected in five or more cases were collected from both indoor and outdoor scenes. A species list with month of collection and basic scene characteristics with the length of the estimated time of colonization is also presented. The data presented here provide valuable casework-related species data for Harris County, TX and nearby areas on the Gulf Coast that can be used to compare to other climate regions with other species assemblages and to assist in identifying new species introductions to the area. This study also highlights the importance of potential sources of uncertainty in preparation and interpretation of forensic entomology reports from different scene types. PMID:28604832
Generation of binary holograms with a Kinect sensor for a high speed color holographic display
NASA Astrophysics Data System (ADS)
Leportier, Thibault; Park, Min-Chul; Yano, Sumio; Son, Jung-Young
2017-05-01
The Kinect sensor captures a real scene with a camera and a depth sensor. A virtual model of the scene can then be obtained with a point-cloud representation, from which a complex hologram can be computed. However, complex data cannot be used directly because display devices cannot handle amplitude and phase modulation at the same time. Binary holograms are commonly used since they present several advantages. Among the methods that have been proposed to convert holograms into a binary format, direct binary search (DBS) not only gives the best performance, it also offers the possibility of choosing the display parameters of the binary hologram differently from those of the original complex hologram. Since wavelength and reconstruction distance can be modified, chromatic aberrations can be compensated. In this study, we examine the potential of DBS for RGB holographic display.
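The direct binary search idea can be sketched as a greedy loop: start from a random binary pattern, trial-flip one pixel at a time, and keep a flip only if it lowers the error between the binary hologram's reconstruction and that of the target. The toy "reconstruction" below is simply a far-field (Fourier) amplitude; real implementations use the display's actual propagation model, and all names here are illustrative.

```python
import numpy as np

def binarize_dbs(target, sweeps=1, rng=None):
    """Direct binary search (sketch): greedily flip pixels of a binary
    hologram, keeping only flips that reduce the reconstruction error."""
    rng = np.random.default_rng(rng)
    goal = np.abs(np.fft.fft2(target))        # desired reconstruction amplitude

    def error(b):
        return float(np.sum((np.abs(np.fft.fft2(b)) - goal) ** 2))

    h = (rng.random(target.shape) > 0.5).astype(float)  # random binary start
    e = error(h)
    for _ in range(sweeps):
        for idx in np.ndindex(*h.shape):
            h[idx] = 1.0 - h[idx]             # trial flip
            e_trial = error(h)
            if e_trial < e:
                e = e_trial                   # keep the improvement
            else:
                h[idx] = 1.0 - h[idx]         # revert
    return h, e
```

Because only improving flips are accepted, the error is non-increasing over sweeps, which is the property that makes DBS outperform single-pass thresholding.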
Katz, Matthew L.; Viney, Tim J.; Nikolic, Konstantin
2016-01-01
Sensory stimuli are encoded by diverse kinds of neurons but the identities of the recorded neurons that are studied are often unknown. We explored in detail the firing patterns of eight previously defined genetically-identified retinal ganglion cell (RGC) types from a single transgenic mouse line. We first introduce a new technique of deriving receptive field vectors (RFVs) which utilises a modified form of mutual information (“Quadratic Mutual Information”). We analysed the firing patterns of RGCs during presentation of short duration (~10 second) complex visual scenes (natural movies). We probed the high dimensional space formed by the visual input for a much smaller dimensional subspace of RFVs that give the most information about the response of each cell. The new technique is very efficient and fast and the derivation of novel types of RFVs formed by the natural scene visual input was possible even with limited numbers of spikes per cell. This approach enabled us to estimate the 'visual memory' of each cell type and the corresponding receptive field area by calculating Mutual Information as a function of the number of frames and radius. Finally, we made predictions of biologically relevant functions based on the RFVs of each cell type. RGC class analysis was complemented with results for the cells’ response to simple visual input in the form of black and white spot stimulation, and their classification on several key physiological metrics. Thus RFVs lead to predictions of biological roles based on limited data and facilitate analysis of sensory-evoked spiking data from defined cell types. PMID:26845435
Adaptation to Skew Distortions of Natural Scenes and Retinal Specificity of Its Aftereffects
Habtegiorgis, Selam W.; Rifai, Katharina; Lappe, Markus; Wahl, Siegfried
2017-01-01
Image skew is one of the prominent distortions that exist in optical elements, such as in spectacle lenses. The present study evaluates adaptation to image skew in dynamic natural images. Moreover, the cortical levels involved in skew coding were probed using retinal specificity of skew adaptation aftereffects. Left and right skewed natural image sequences were shown to observers as adapting stimuli. The point of subjective equality (PSE), i.e., the skew amplitude in simple geometrical patterns that is perceived to be unskewed, was used to quantify the aftereffect of each adapting skew direction. The PSE, in a two-alternative forced choice paradigm, shifted toward the adapting skew direction. Moreover, significant adaptation aftereffects were obtained not only at adapted, but also at non-adapted retinal locations during fixation. Skew adaptation information was transferred partially to non-adapted retinal locations. Thus, adaptation to skewed natural scenes induces coordinated plasticity in lower and higher cortical areas of the visual pathway. PMID:28751870
The benefits of mystery in nature on attention: assessing the impacts of presentation duration
Szolosi, Andrew M.; Watson, Jason M.; Ruddell, Edward J.
2014-01-01
Although research has provided prodigious evidence in support of the cognitive benefits that natural settings have over urban settings, all nature is not equal. Within nature, natural settings that contain mystery are often among the most preferred nature scenes. With the prospect of acquiring new information, scenes of this type could more effectively elicit a person's sense of fascination, enabling that person to rest the more effortful forms of attention. The present study examined the direct cognitive benefits that mystery in nature has on attention. Settings of this sort presumably evoke a form of attention that is undemanding or effortless. In order to investigate that notion, participants (n = 144) completed a Recognition Memory Task (RMT) that evaluated recognition performance based on the presence of mystery and presentation duration (300 ms, 1 s, 5 s, and 10 s). Results revealed that with additional viewing time, images perceived high in mystery achieved greater improvements in recognition performance when compared to those images perceived low in mystery. Tests for mediation showed that the effect mystery had on recognition performance occurred through perceptions of fascination. Implications of these and other findings are discussed in the context of Attention Restoration Theory. PMID:25505441
The effects of nature images on pain in a simulated hospital patient room.
Vincent, Ellen; Battisto, Dina; Grimes, Larry; McCubbin, James
2010-01-01
Views of nature have been reported to relieve stress and pain, making nature an ideal medium for use in healthcare settings. In hospitals whose design does not allow for a view of nature, virtual and surrogate views of nature may be viable therapeutic options. This study tests the effects of specific nature images, as defined by Appleton's prospect refuge theory of landscape preference, on participants experiencing pain. The hypotheses were: (1) Nature views are variable in their impact on specific psychological and physiological health status indicators; and (2) Prospect and refuge nature scenes are more therapeutic than hazard nature scenes. The research question was (1) Which nature image categories are most therapeutic as evidenced by reduced pain and positive mood? An experiment using mixed methods assessed the effects of four different nature scenes on physiological (blood pressure, heart rate) and psychological (mood) responses when a person was subjected to a pain stressor. Four groups were subjected to a specific nature image category of prospect, refuge, hazard, or mixed prospect and refuge; the fifth group viewed no image. The Short-Form McGill Pain Questionnaire and the Profile of Mood States survey instruments were used to assess pain and mood, respectively. Continuous physiological readings of heart rate and blood pressure were collected. Pain was induced through a cold pressor task, which required participants to immerse their nondominant hand in ice water for up to 120 seconds. The mixed prospect and refuge image treatment showed significantly lower sensory pain responses, and the no-image treatment indicated significantly higher affective pain perception responses. The hazard image treatment had significantly lower diastolic blood pressure readings during the pain treatment, but it also had significantly high total mood disturbance. 
Although there was no clear "most" therapeutic image, the mixed prospect and refuge image showed significant potential to reduce sensory pain. The hazard image was the most effective at distracting participants from pain, but it should not be considered a positive distraction because it also received the highest mood disturbance scores of all groups.
Learned Compact Local Feature Descriptor for Tls-Based Geodetic Monitoring of Natural Outdoor Scenes
NASA Astrophysics Data System (ADS)
Gojcic, Z.; Zhou, C.; Wieser, A.
2018-05-01
The advantages of terrestrial laser scanning (TLS) for geodetic monitoring of man-made and natural objects are not yet fully exploited. Herein we address one of the open challenges by proposing feature-based methods for identification of corresponding points in point clouds of two or more epochs. We propose a learned compact feature descriptor tailored for point clouds of natural outdoor scenes obtained using TLS. We evaluate our method both on a benchmark dataset and on a specially acquired outdoor dataset resembling a simplified monitoring scenario, in which we successfully estimate 3D displacement vectors of a rock that has been displaced between the scans. We show that the proposed descriptor generalizes to unseen data and achieves state-of-the-art performance while being time efficient at the matching step due to its low dimension.
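The matching step can be sketched as nearest-neighbour search in descriptor space with a mutual-consistency check, followed by a displacement estimate from the matched point pairs. The learned descriptor itself is not reproduced here; this is a generic illustration with hypothetical names, assuming descriptors are already computed for both epochs.

```python
import numpy as np

def match_and_displace(points_a, desc_a, points_b, desc_b):
    """Nearest-neighbour descriptor matching between two scan epochs,
    then a mean 3D displacement estimate from the mutual matches."""
    # pairwise descriptor distances, shape (n_a, n_b)
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    nn = d.argmin(axis=1)                     # best match in epoch B for each A
    back = d.argmin(axis=0)                   # best match in epoch A for each B
    keep = back[nn] == np.arange(len(points_a))   # mutual-consistency check
    disp = points_b[nn[keep]] - points_a[keep]    # per-match 3D displacement
    return disp.mean(axis=0), int(keep.sum())
```

The mutual check discards ambiguous correspondences, which matters in natural scenes where vegetation produces many near-duplicate descriptors.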
Image Feature Types and Their Predictions of Aesthetic Preference and Naturalness
Ibarra, Frank F.; Kardan, Omid; Hunter, MaryCarol R.; Kotabe, Hiroki P.; Meyer, Francisco A. C.; Berman, Marc G.
2017-01-01
Previous research has investigated ways to quantify the visual information of a scene in terms of a visual processing hierarchy, i.e., making sense of the visual environment by segmentation and integration of elementary sensory input. Guided by this research, studies have developed categories for low-level visual features (e.g., edges, colors), high-level visual features (scene-level entities that convey semantic information such as objects), and models of how those features predict aesthetic preference and naturalness. For example, in Kardan et al. (2015a), 52 participants provided aesthetic preference and naturalness ratings, which are used in the current study, for 307 images of mixed natural and urban content. Kardan et al. (2015a) then developed a model using low-level features to predict aesthetic preference and naturalness and could do so with high accuracy. What has yet to be explored is the ability of higher-level visual features (e.g., horizon line position relative to viewer, geometry of building distribution relative to visual access) to predict aesthetic preference and naturalness of scenes, and whether higher-level features mediate some of the association between the low-level features and aesthetic preference or naturalness. In this study we investigated these relationships and found that low- and high-level features explain 68.4% of the variance in aesthetic preference ratings and 88.7% of the variance in naturalness ratings. Additionally, several high-level features mediated the relationship between the low-level visual features and aesthetic preference. In a multiple mediation analysis, the high-level feature mediators accounted for over 50% of the variance in predicting aesthetic preference. These results show that high-level visual features play a prominent role in predicting aesthetic preference, but do not completely eliminate the predictive power of the low-level visual features. 
These strong predictors provide powerful insights for future research relating to landscape and urban design with the aim of maximizing subjective well-being, which could lead to improved health outcomes on a larger scale. PMID:28503158
Text Line Detection from Rectangle Traffic Panels of Natural Scene
NASA Astrophysics Data System (ADS)
Wang, Shiyuan; Huang, Linlin; Hu, Jian
2018-01-01
Traffic sign detection and recognition is very important for intelligent transportation. Among traffic signs, traffic panels contain rich information. However, due to low resolution and blur in rectangular traffic panels, it is difficult to extract the characters and symbols. In this paper, we propose a coarse-to-fine method to detect Chinese characters on traffic panels in natural scenes. First, given a traffic panel, color quantization is applied to extract candidate regions of Chinese characters. Second, a multi-stage learning-based filter is applied to discard the non-character regions. Third, we aggregate the characters into text lines by a distance metric learning method. Experimental results on real traffic images from Baidu Street View demonstrate the effectiveness of the proposed method.
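The color quantization step that produces candidate character regions can be sketched as uniform quantization: each RGB channel is mapped to a small number of levels, so pixels of similar color (e.g., the white strokes of characters on a blue panel) collapse into the same bin. This is a minimal stand-in for whatever quantizer the authors used; the function name and level count are illustrative.

```python
import numpy as np

def quantize_colors(image, levels=4):
    """Uniform color quantization: map each 8-bit RGB channel to `levels` bins,
    then re-expand the bin indices to displayable 0..255 values."""
    img = np.asarray(image, dtype=float)
    q = np.floor(img / 256.0 * levels).clip(0, levels - 1)   # bin index per channel
    return (q * (255.0 / (levels - 1))).astype(np.uint8)
```

Candidate regions would then be the connected groups of pixels sharing the quantized color closest to the expected character color.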
A hybrid approach for text detection in natural scenes
NASA Astrophysics Data System (ADS)
Wang, Runmin; Sang, Nong; Wang, Ruolin; Kuang, Xiaoqin
2013-10-01
In this paper, a hybrid approach is proposed to detect text in natural scenes. It proceeds in the following steps. First, the edge map and the text saliency region are obtained. Second, text candidate regions are detected by a connected-component (CC) based method and identified by an off-line trained HOG classifier. The remaining CCs are then grouped into text lines with some heuristic strategies to make up for false negatives. Finally, the text lines are broken into separate words. The performance of the proposed approach is evaluated on the location detection database of the ICDAR 2003 robust reading competition. Experimental results demonstrate the validity of our approach, which is competitive with other state-of-the-art algorithms.
Wiltshire, Patricia E J; Hawksworth, David L; Webb, Judy A; Edwards, Kevin J
2015-09-01
The body of a murdered woman was found on the planted periphery of a busy roundabout in Dundee, United Kingdom. A suspect was apprehended, and his footwear yielded a palynological (botanical and mycological) profile similar to that obtained from the ground and vegetation of the crime scene, and to that of the victim's clothing. The sources of palynomorphs at the roundabout were the in situ vegetation and macerated woody mulch which had been laid on the ground surface. The degree of rarity of individual forensic markers, the complexity of the overall profile, and the application of both botanical and mycological expertise led to a high level of resolution in the results, enabling the exhibits to be linked to the crime scene. The suspect was convicted of murder. The interpretation of the results allowed conclusions which added to the list of essential protocols for crime scene sampling, as well as the requirement for advanced expertise in identification.
Navigable points estimation for mobile robots using binary image skeletonization
NASA Astrophysics Data System (ADS)
Martinez S., Fernando; Jacinto G., Edwar; Montiel A., Holman
2017-02-01
This paper describes the use of image skeletonization for the estimation of all the navigable points inside a mobile-robot navigation scene. Those points are used for computing a valid navigation path using standard methods. The main idea is to find the middle and extreme points of the obstacles in the scene, taking into account the robot size, and create a map of navigable points, in order to reduce the amount of information passed to the planning algorithm. Those points are located by means of the skeletonization of a binary image of the obstacles and the scene background, along with some other digital image processing algorithms. The proposed algorithm automatically gives a variable number of navigable points per obstacle, depending on the complexity of its shape. We also show how some of the algorithm's parameters can be changed to adjust the final number of resulting key points. The results shown here were obtained by applying different kinds of digital image processing algorithms to static scenes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
O'Donnell, T.J.; Olson, A.J.
1981-08-01
GRAMPS, a graphics language interpreter, has been developed in FORTRAN 77 to be used in conjunction with an interactive vector display list processor (Evans and Sutherland Multi-Picture-System). Several features of the language make it very useful and convenient for real-time scene construction, manipulation and animation. The GRAMPS language syntax allows natural interaction with scene elements as well as easy, interactive assignment of graphics input devices. GRAMPS facilitates the creation, manipulation and copying of complex nested picture structures. The language has a powerful macro feature that enables new graphics commands to be developed and incorporated interactively. Animation may be achieved in GRAMPS by two different, yet mutually compatible means. Picture structures may contain framed data, which consist of a sequence of fixed objects. These structures may be displayed sequentially to give a traditional frame-animation effect. In addition, transformation information on picture structures may be saved at any time in the form of new macro commands that will transform these structures from one saved state to another in a specified number of steps, yielding an interpolated transformation animation effect. An overview of the GRAMPS command structure is given and several examples of application of the language to molecular modeling and animation are presented.
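The interpolated-transformation animation described above amounts to stepping between two saved transform states in a fixed number of increments. A minimal sketch of that idea (linear interpolation of 4x4 transform matrices; names hypothetical, and not GRAMPS's FORTRAN implementation) is:

```python
import numpy as np

def interpolate_transforms(t_start, t_end, steps):
    """Linearly interpolate between two saved 4x4 transform states,
    yielding `steps` matrices for a smooth transformation animation."""
    return [t_start + (t_end - t_start) * t
            for t in np.linspace(0.0, 1.0, steps)]
```

Each returned matrix would be applied to the picture structure for one animation frame; a real system would interpolate rotations on the rotation manifold rather than element-wise.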
Learning-dependent plasticity with and without training in the human brain.
Zhang, Jiaxiang; Kourtzi, Zoe
2010-07-27
Long-term experience through development and evolution and shorter-term training in adulthood have both been suggested to contribute to the optimization of visual functions that mediate our ability to interpret complex scenes. However, the brain plasticity mechanisms that mediate the detection of objects in cluttered scenes remain largely unknown. Here, we combine behavioral and functional MRI (fMRI) measurements to investigate the human-brain mechanisms that mediate our ability to learn statistical regularities and detect targets in clutter. We show two different routes to visual learning in clutter with discrete brain plasticity signatures. Specifically, opportunistic learning of regularities typical in natural contours (i.e., collinearity) can occur simply through frequent exposure, generalize across untrained stimulus features, and shape processing in occipitotemporal regions implicated in the representation of global forms. In contrast, learning to integrate discontinuities (i.e., elements orthogonal to contour paths) requires task-specific training (bootstrap-based learning), is stimulus-dependent, and enhances processing in intraparietal regions implicated in attention-gated learning. We propose that long-term experience with statistical regularities may facilitate opportunistic learning of collinear contours, whereas learning to integrate discontinuities entails bootstrap-based training for the detection of contours in clutter. These findings provide insights in understanding how long-term experience and short-term training interact to shape the optimization of visual recognition processes.
Robust algebraic image enhancement for intelligent control systems
NASA Technical Reports Server (NTRS)
Lerner, Bao-Ting; Morrelli, Michael
1993-01-01
Robust vision capability for intelligent control systems has been an elusive goal in image processing. The computationally intensive techniques necessary for conventional image processing make real-time applications, such as object tracking and collision avoidance, difficult. In order to endow an intelligent control system with the needed vision robustness, an adequate image enhancement subsystem capable of compensating for the wide variety of real-world degradations must exist between the image capturing and the object recognition subsystems. This enhancement stage must be adaptive and must operate with consistency in the presence of both statistical and shape-based noise. To deal with this problem, we have developed an innovative algebraic approach which provides a sound mathematical framework for image representation and manipulation. Our image model provides a natural platform from which to pursue dynamic scene analysis, and its incorporation into a vision system would serve as the front end to an intelligent control system. We have developed a unique polynomial representation of gray-level imagery and applied this representation to develop polynomial operators on complex gray-level scenes. This approach is highly advantageous since polynomials can be manipulated very easily and are readily understood, thus providing a very convenient environment for image processing. Our model presents a highly structured and compact algebraic representation of gray-level images which can be viewed as fuzzy sets.
Miconi, Thomas; Groomes, Laura; Kreiman, Gabriel
2016-01-01
When searching for an object in a scene, how does the brain decide where to look next? Visual search theories suggest the existence of a global “priority map” that integrates bottom-up visual information with top-down, target-specific signals. We propose a mechanistic model of visual search that is consistent with recent neurophysiological evidence, can localize targets in cluttered images, and predicts single-trial behavior in a search task. This model posits that a high-level retinotopic area selective for shape features receives global, target-specific modulation and implements local normalization through divisive inhibition. The normalization step is critical to prevent highly salient bottom-up features from monopolizing attention. The resulting activity pattern constitutes a priority map that tracks the correlation between local input and target features. The maximum of this priority map is selected as the locus of attention. The visual input is then spatially enhanced around the selected location, allowing object-selective visual areas to determine whether the target is present at this location. This model can localize objects both in array images and when objects are pasted in natural scenes. The model can also predict single-trial human fixations, including those in error and target-absent trials, in a search task involving complex objects. PMID:26092221
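The core computation of such a model (target-weighted feature responses divided by pooled local activity, with the map maximum chosen as the next fixation) can be sketched in a few lines. This is a toy illustration of the mechanism, not the authors' model; array shapes and parameter names are assumptions.

```python
import numpy as np

def priority_map(feature_maps, target_weights, sigma=1e-3):
    """Toy priority map: top-down modulated feature responses,
    divisively normalized by pooled bottom-up activity.

    feature_maps   : (F, H, W) bottom-up responses of F feature channels
    target_weights : (F,) top-down gains favoring the target's features
    """
    modulated = feature_maps * target_weights[:, None, None]
    # divisive inhibition: each location divided by total local activity
    pooled = feature_maps.sum(axis=0) + sigma
    pmap = modulated.sum(axis=0) / pooled
    # the map maximum becomes the next locus of attention
    locus = np.unravel_index(np.argmax(pmap), pmap.shape)
    return pmap, locus
```

The division by pooled activity is what keeps a highly salient but target-irrelevant location from dominating: its bottom-up response inflates the denominator as much as the numerator.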
The use of higher-order statistics in rapid object categorization in natural scenes.
Banno, Hayaki; Saiki, Jun
2015-02-04
We can rapidly and efficiently recognize many types of objects embedded in complex scenes. What information supports this object recognition is a fundamental question for understanding our visual processing. We investigated the eccentricity-dependent role of shape and statistical information for ultrarapid object categorization, using the higher-order statistics proposed by Portilla and Simoncelli (2000). Synthesized textures computed by their algorithms have the same higher-order statistics as the originals, while the global shapes were destroyed. We used the synthesized textures to manipulate the availability of shape information separately from the statistics. We hypothesized that shape makes a greater contribution to central vision than to peripheral vision and that statistics show the opposite pattern. Results did not show contributions clearly biased by eccentricity. Statistical information demonstrated a robust contribution not only in peripheral but also in central vision. For shape, the results supported the contribution in both central and peripheral vision. Further experiments revealed some interesting properties of the statistics. They are available for a limited time, attributable to the presence or absence of animals without shape, and predict how easily humans detect animals in original images. Our data suggest that when facing the time constraint of categorical processing, higher-order statistics underlie our significant performance for rapid categorization, irrespective of eccentricity.
Buttafuoco, Arianna; Pedale, Tiziana; Buchanan, Tony W; Santangelo, Valerio
2018-02-01
Emotional events are thought to have privileged access to attention and memory, consuming resources needed to encode competing emotionally neutral stimuli. However, it is not clear whether this detrimental effect is automatic or depends on the successful maintenance of the specific emotional object within working memory. Here, participants viewed everyday scenes including an emotional object among other neutral objects followed by a free-recollection task. Results showed that emotional objects, irrespective of their perceptual saliency, were recollected more often than neutral objects. The probability of being recollected increased as a function of the arousal of the emotional objects, specifically for negative objects. Successful recollection of emotional objects (positive or negative) from a scene reduced the overall number of recollected neutral objects from the same scene. This indicates that only emotional stimuli that are efficient in grabbing (and then consuming) available attentional resources play a crucial role during the encoding of competing information, with a subsequent bias in the recollection of neutral representations.
On the effectiveness of noise masks: naturalistic vs. un-naturalistic image statistics.
Hansen, Bruce C; Hess, Robert F
2012-05-01
It has been argued that the human visual system is optimized for identification of broadband objects embedded in stimuli possessing orientation-averaged power spectrum fall-offs that obey the 1/f^β relationship typically observed in natural scene imagery (i.e., β=2.0 on logarithmic axes). Here, we were interested in whether individual spatial channels leading to recognition are functionally optimized for narrowband targets when masked by noise possessing naturalistic image statistics (β=2.0). The current study therefore explores the impact of variable-β noise masks on the identification of narrowband target stimuli ranging in spatial complexity, while simultaneously controlling for physical or perceived differences between the masks. The results show that β=2.0 noise masks produce the largest identification thresholds regardless of target complexity, and thus do not seem to yield functionally optimized channel processing. The differential masking effects are discussed in the context of contrast gain control. Copyright © 2012 Elsevier Ltd. All rights reserved.
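Noise with an orientation-averaged power spectrum falling off as 1/f^β, the mask family used here, is commonly synthesized by shaping the amplitude spectrum of random-phase noise. A minimal sketch (the function name and the unit-variance normalization are illustrative choices, not the authors' stimulus code):

```python
import numpy as np

def noise_1_over_f(size, beta, rng=None):
    """Generate a size x size noise image whose power spectrum
    falls off as 1/f**beta (beta=2.0 approximates natural scenes)."""
    rng = np.random.default_rng(rng)
    fy = np.fft.fftfreq(size)[:, None]
    fx = np.fft.fftfreq(size)[None, :]
    f = np.hypot(fx, fy)
    f[0, 0] = 1.0                       # avoid division by zero at DC
    amplitude = f ** (-beta / 2.0)      # power ~ amplitude**2 ~ 1/f**beta
    amplitude[0, 0] = 0.0               # zero-mean image: remove DC
    phase = rng.uniform(0.0, 2.0 * np.pi, (size, size))
    spectrum = amplitude * np.exp(1j * phase)
    img = np.real(np.fft.ifft2(spectrum))
    return img / img.std()              # normalize to unit contrast energy
```

Setting beta=0 in the same routine yields white noise, so a single parameter sweeps the mask family from un-naturalistic to naturalistic statistics.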
3D Reasoning from Blocks to Stability.
Zhaoyin Jia; Gallagher, Andrew C; Saxena, Ashutosh; Chen, Tsuhan
2015-05-01
Objects occupy physical space and obey physical laws. To truly understand a scene, we must reason about the space that objects in it occupy, and how the objects stably support one another. In other words, we seek to understand which objects would, if moved, cause other objects to fall. This 3D volumetric reasoning is important for many scene understanding tasks, ranging from segmentation of objects to the perception of rich, physically well-founded 3D interpretations of the scene. In this paper, we propose a new algorithm to parse a single RGB-D image with 3D block units while jointly reasoning about the segments, volumes, supporting relationships, and object stability. Our algorithm is based on the intuition that a good 3D representation of the scene is one that fits the depth data well, and is a stable, self-supporting arrangement of objects (i.e., one that does not topple). We design an energy function for representing the quality of the block representation based on these properties. Our algorithm fits 3D blocks to the depth values corresponding to image segments, and iteratively optimizes the energy function. Our proposed algorithm is the first to consider stability of objects in complex arrangements for reasoning about the underlying structure of the scene. Experimental results show that our stability-reasoning framework improves RGB-D segmentation and scene volumetric representation.
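The intuition (fit the depth data, prefer self-supporting arrangements) can be written as a toy energy plus a one-dimensional stability test. This is a hedged sketch: the weights, the support test, and the residual form are invented for illustration and are not the paper's energy function.

```python
def is_supported(com_x, support_xmin, support_xmax):
    """Toy 1D stability test: a block is stable if its center of
    mass projects inside the footprint of the block beneath it."""
    return support_xmin <= com_x <= support_xmax

def scene_energy(depth_residuals, n_unsupported, w_fit=1.0, w_stab=5.0):
    """Illustrative energy for a candidate block parse (lower is
    better): penalize poor fit of the 3D blocks to the depth data,
    and penalize blocks that would topple."""
    fit_term = sum(r * r for r in depth_residuals)
    return w_fit * fit_term + w_stab * n_unsupported
```

In the paper the energy is optimized iteratively over joint segment, volume, and support hypotheses; the sketch only shows how the two competing terms trade off.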
Moving through a multiplex holographic scene
NASA Astrophysics Data System (ADS)
Mrongovius, Martina
2013-02-01
This paper explores how movement can be used as a compositional element in installations of multiplex holograms. My holographic images are created from montages of hand-held video and photo-sequences. These spatially dynamic compositions are visually complex but anchored to landmarks and hints of the capturing process, such as the appearance of the photographer's shadow, to establish a sense of connection to the holographic scene. Moving around in front of the hologram, the viewer animates the holographic scene. A perception of motion then results from the viewer's bodily awareness of physical motion and the visual reading of dynamics within the scene or movement of perspective through a virtual suggestion of space. By linking and transforming the physical motion of the viewer with the visual animation, the viewer's bodily awareness, including proprioception, balance and orientation, plays into the holographic composition. How multiplex holography can be a tool for exploring coupled, cross-referenced and transformed perceptions of movement is demonstrated with a number of holographic image installations. Through this process I expanded my creative composition practice to consider how dynamic and spatial scenes can be conveyed through the fragmented view of a multiplex hologram. This body of work was developed through an installation art practice and was the basis of my recently completed doctoral thesis: 'The Emergent Holographic Scene — compositions of movement and affect using multiplex holographic images'.
Functional anatomy of temporal organisation and domain-specificity of episodic memory retrieval.
Kwok, Sze Chai; Shallice, Tim; Macaluso, Emiliano
2012-10-01
Episodic memory provides information about the "when" of events as well as "what" and "where" they happened. Using functional imaging, we investigated the domain specificity of retrieval-related processes following encoding of complex, naturalistic events. Subjects watched a 42-min TV episode, and 24h later, made discriminative choices of scenes from the clip during fMRI. Subjects were presented with two scenes and required to either choose the scene that happened earlier in the film (Temporal), or the scene with a correct spatial arrangement (Spatial), or the scene that had been shown (Object). We identified a retrieval network comprising the precuneus, lateral and dorsal parietal cortex, middle frontal and medial temporal areas. The precuneus and angular gyrus are associated with temporal retrieval, with precuneal activity correlating negatively with temporal distance between two happenings at encoding. A dorsal fronto-parietal network engages during spatial retrieval, while antero-medial temporal regions activate during object-related retrieval. We propose that access to episodic memory traces involves different processes depending on task requirements. These include memory-searching within an organised knowledge structure in the precuneus (Temporal task), online maintenance of spatial information in dorsal fronto-parietal cortices (Spatial task) and combining scene-related spatial and non-spatial information in the hippocampus (Object task). Our findings support the proposal of process-specific dissociations of retrieval. Copyright © 2012 Elsevier Ltd. All rights reserved.
Out of Mind, Out of Sight: Unexpected Scene Elements Frequently Go Unnoticed Until Primed.
Slavich, George M; Zimbardo, Philip G
2013-12-01
The human visual system employs a sophisticated set of strategies for scanning the environment and directing attention to stimuli that can be expected given the context and a person's past experience. Although these strategies enable us to navigate a very complex physical and social environment, they can also cause highly salient, but unexpected stimuli to go completely unnoticed. To examine the generality of this phenomenon, we conducted eight studies that included 15 different experimental conditions and 1,577 participants in all. These studies revealed that a large majority of participants do not report having seen a woman in the center of an urban scene who was photographed in midair as she was committing suicide. Despite seeing the scene repeatedly, 46% of all participants failed to report seeing a central figure and only 4.8% reported seeing a falling person. Frequency of noticing the suicidal woman was highest for participants who read a narrative priming story that increased the extent to which she was schematically congruent with the scene. In contrast to this robust effect of inattentional blindness, a majority of participants reported seeing other peripheral objects in the visual scene that were equally difficult to detect, yet more consistent with the scene. Follow-up qualitative analyses revealed that participants reported seeing many elements that were not actually present, but which could have been expected given the overall context of the scene. Together, these findings demonstrate the robustness of inattentional blindness and highlight the specificity with which different visual primes may increase noticing behavior.
Statistics of high-level scene context
Greene, Michelle R.
2013-01-01
Context is critical for recognizing environments and for searching for objects within them: contextual associations have been shown to modulate reaction time and object recognition accuracy, as well as influence the distribution of eye movements and patterns of brain activations. However, we have not yet systematically quantified the relationships between objects and their scene environments. Here I seek to fill this gap by providing descriptive statistics of object-scene relationships. A total of 48,167 objects were hand-labeled in 3499 scenes using the LabelMe tool (Russell et al., 2008). From these data, I computed a variety of descriptive statistics at three different levels of analysis: the ensemble statistics that describe the density and spatial distribution of unnamed “things” in the scene; the bag of words level where scenes are described by the list of objects contained within them; and the structural level where the spatial distribution and relationships between the objects are measured. The utility of each level of description for scene categorization was assessed through the use of linear classifiers, and the plausibility of each level for modeling human scene categorization is discussed. Of the three levels, ensemble statistics were found to be the most informative (per feature), and also best explained human patterns of categorization errors. Although a bag of words classifier had similar performance to human observers, it had a markedly different pattern of errors. However, certain objects are more useful than others, and ceiling classification performance could be achieved using only the 64 most informative objects. As object location tends not to vary as a function of category, structural information provided little additional information.
Additionally, these data provide valuable information on natural scene redundancy that can be exploited for machine vision, and can help the visual cognition community to design experiments guided by statistics rather than intuition. PMID:24194723
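The bag-of-words level of description lends itself to a compact sketch: represent each scene by object counts over a vocabulary and classify with a simple linear rule. The nearest-centroid classifier below is an illustrative stand-in, not the linear classifier used in the study, and the vocabulary format is assumed.

```python
import numpy as np

def bag_of_words(scene_labels, vocab):
    """Count vector over an object vocabulary for one scene.
    vocab maps object name -> feature index."""
    v = np.zeros(len(vocab))
    for obj in scene_labels:
        if obj in vocab:
            v[vocab[obj]] += 1
    return v

class NearestCentroid:
    """Minimal linear classifier: assign a scene to the category
    whose mean bag-of-words vector is closest in Euclidean distance."""
    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self.centroids_ = np.array(
            [X[[i for i, c in enumerate(y) if c == k]].mean(axis=0)
             for k in self.classes_])
        return self

    def predict(self, X):
        d = ((X[:, None, :] - self.centroids_[None]) ** 2).sum(-1)
        return [self.classes_[i] for i in d.argmin(axis=1)]
```

Restricting `vocab` to the most informative objects mirrors the paper's observation that ceiling performance needs only 64 of them.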
Rapid discrimination of visual scene content in the human brain.
Anokhin, Andrey P; Golosheykin, Simon; Sirevaag, Erik; Kristjansson, Sean; Rohrbaugh, John W; Heath, Andrew C
2006-06-06
The rapid evaluation of complex visual environments is critical for an organism's adaptation and survival. Previous studies have shown that emotionally significant visual scenes, both pleasant and unpleasant, elicit a larger late positive wave in the event-related brain potential (ERP) than emotionally neutral pictures. The purpose of the present study was to examine whether neuroelectric responses elicited by complex pictures discriminate between specific, biologically relevant contents of the visual scene and to determine how early in the picture processing this discrimination occurs. Subjects (n = 264) viewed 55 color slides differing in both scene content and emotional significance. No categorical judgments or responses were required. Consistent with previous studies, we found that emotionally arousing pictures, regardless of their content, produce a larger late positive wave than neutral pictures. However, when pictures were further categorized by content, anterior ERP components in a time window between 200 and 600 ms following stimulus onset showed a high selectivity for pictures with erotic content compared to other pictures regardless of their emotional valence (pleasant, neutral, and unpleasant) or emotional arousal. The divergence of ERPs elicited by erotic and non-erotic contents started at 185 ms post-stimulus in the fronto-central midline region, with a later onset in parietal regions. This rapid, selective, and content-specific processing of erotic materials and its dissociation from other pictures (including emotionally positive pictures) suggests the existence of a specialized neural network for prioritized processing of a distinct category of biologically relevant stimuli with high adaptive and evolutionary significance.
Auditory Scene Analysis: The Sweet Music of Ambiguity
Pressnitzer, Daniel; Suied, Clara; Shamma, Shihab A.
2011-01-01
In this review paper aimed at the non-specialist, we explore the use that neuroscientists and musicians have made of perceptual illusions based on ambiguity. The pivotal issue is auditory scene analysis (ASA), or what enables us to make sense of complex acoustic mixtures in order to follow, for instance, a single melody in the midst of an orchestra. In general, ASA uncovers the most likely physical causes that account for the waveform collected at the ears. However, the acoustical problem is ill-posed and it must be solved from noisy sensory input. Recently, the neural mechanisms implicated in the transformation of ambiguous sensory information into coherent auditory scenes have been investigated using so-called bistability illusions (where an unchanging ambiguous stimulus evokes a succession of distinct percepts in the mind of the listener). After reviewing some of those studies, we turn to music, which arguably provides some of the most complex acoustic scenes that a human listener will ever encounter. Interestingly, musicians will not always aim at making each physical source intelligible, but rather express one or more melodic lines with a small or large number of instruments. By means of a few musical illustrations and by using a computational model inspired by neuro-physiological principles, we suggest that this relies on a detailed (if perhaps implicit) knowledge of the rules of ASA and of its inherent ambiguity. We then put forward the opinion that some degree of perceptual ambiguity may contribute to our appreciation of music. PMID:22174701
The auditory scene: an fMRI study on melody and accompaniment in professional pianists.
Spada, Danilo; Verga, Laura; Iadanza, Antonella; Tettamanti, Marco; Perani, Daniela
2014-11-15
The auditory scene is a mental representation of individual sounds extracted from the summed sound waveform reaching the ears of the listeners. Musical contexts represent particularly complex cases of auditory scenes. In such a scenario, melody may be seen as the main object moving on a background represented by the accompaniment. Both melody and accompaniment vary in time according to harmonic rules, forming a typical texture with melody in the most prominent, salient voice. In the present sparse acquisition functional magnetic resonance imaging study, we investigated the interplay between melody and accompaniment in trained pianists, by observing the activation responses elicited by processing: (1) melody placed in the upper and lower texture voices, leading to, respectively, a higher and lower auditory salience; (2) harmonic violations occurring in either the melody, the accompaniment, or both. The results indicated that the neural activation elicited by the processing of polyphonic compositions in expert musicians depends upon the upper versus lower position of the melodic line in the texture, and showed an overall greater activation for the harmonic processing of melody over accompaniment. Both these two predominant effects were characterized by the involvement of the posterior cingulate cortex and precuneus, among other associative brain regions. We discuss the prominent role of the posterior medial cortex in the processing of melodic and harmonic information in the auditory stream, and propose to frame this processing in relation to the cognitive construction of complex multimodal sensory imagery scenes. Copyright © 2014 Elsevier Inc. All rights reserved.
Henninger, Jörg; Krahe, Rüdiger; Kirschbaum, Frank; Grewe, Jan; Benda, Jan
2018-06-13
Sensory systems evolve in the ecological niches that each species is occupying. Accordingly, encoding of natural stimuli by sensory neurons is expected to be adapted to the statistics of these stimuli. For a direct quantification of sensory scenes, we tracked natural communication behavior of male and female weakly electric fish, Apteronotus rostratus , in their Neotropical rainforest habitat with high spatiotemporal resolution over several days. In the context of courtship, we observed large quantities of electrocommunication signals. Echo responses, acknowledgment signals, and their synchronizing role in spawning demonstrated the behavioral relevance of these signals. In both courtship and aggressive contexts, we observed robust behavioral responses in stimulus regimes that have so far been neglected in electrophysiological studies of this well characterized sensory system and that are well beyond the range of known best frequency and amplitude tuning of the electroreceptor afferents' firing rate modulation. Our results emphasize the importance of quantifying sensory scenes derived from freely behaving animals in their natural habitats for understanding the function and evolution of neural systems. SIGNIFICANCE STATEMENT The processing mechanisms of sensory systems have evolved in the context of the natural lives of organisms. To understand the functioning of sensory systems therefore requires probing them in the stimulus regimes in which they evolved. We took advantage of the continuously generated electric fields of weakly electric fish to explore electrosensory stimulus statistics in their natural Neotropical habitat. Unexpectedly, many of the electrocommunication signals recorded during courtship, spawning, and aggression had much smaller amplitudes or higher frequencies than stimuli used so far in neurophysiological characterizations of the electrosensory system. 
Our results demonstrate that quantifying sensory scenes derived from freely behaving animals in their natural habitats is essential to avoid biases in the choice of stimuli used to probe brain function. Copyright © 2018 the authors.
Fixed Pattern Noise pixel-wise linear correction for crime scene imaging CMOS sensor
NASA Astrophysics Data System (ADS)
Yang, Jie; Messinger, David W.; Dube, Roger R.; Ientilucci, Emmett J.
2017-05-01
Filtered multispectral imaging might be a useful method for crime scene documentation and evidence detection due to its abundant spectral information as well as its non-contact and non-destructive nature. A low-cost, portable multispectral crime scene imaging device would be highly useful and efficient. The second-generation crime scene imaging system uses a CMOS imaging sensor to capture the spatial scene and bandpass Interference Filters (IFs) to capture spectral information. Unfortunately, CMOS sensors suffer from severe spatial non-uniformity compared to CCD sensors, and the major cause is Fixed Pattern Noise (FPN). IFs suffer from a "blue shift" effect and introduce spatially and spectrally correlated errors. Therefore, FPN correction is critical to enhance crime scene image quality and is also helpful for spatial-spectral noise de-correlation. In this paper, a pixel-wise linear radiance to Digital Count (DC) conversion model is constructed for the crime scene imaging CMOS sensor. The pixel-wise conversion gain G(i,j) and Dark Signal Non-Uniformity (DSNU) Z(i,j) are calculated. The conversion gain is decomposed into four components: an FPN row component, an FPN column component, a defects component, and the effective photo response signal component. The conversion gain is then corrected by averaging out the FPN row and column components and the defects component, so that the sensor conversion gain is uniform. Based on the corrected conversion gain and the incident radiance estimated by inverting the pixel-wise linear radiance-to-DC model, the spatial uniformity of the corrected image can be enhanced to 7 times that of the raw image; the larger the image DC value within its dynamic range, the greater the enhancement.
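The pixel-wise linear model DC(i,j) = G(i,j)·L + Z(i,j), its least-squares fit from flat-field frames, the row/column gain flattening, and the model inversion can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code; the fitting method and the way row/column components are estimated are assumptions.

```python
import numpy as np

def fit_pixelwise_linear(radiances, frames):
    """Fit DC[i,j] = G[i,j] * L + Z[i,j] per pixel by least squares
    over flat-field frames captured at known radiances L."""
    L = np.asarray(radiances, dtype=float)
    F = np.stack(frames).astype(float)            # (n_exposures, H, W)
    Lm, Fm = L.mean(), F.mean(axis=0)
    G = ((L - Lm)[:, None, None] * (F - Fm)).sum(0) / ((L - Lm) ** 2).sum()
    Z = Fm - G * Lm                               # DSNU (dark offset)
    return G, Z

def flatten_gain(G):
    """Remove row- and column-structured FPN from the gain map by
    averaging out their deviations from the global mean gain."""
    row = G.mean(axis=1, keepdims=True) - G.mean()
    col = G.mean(axis=0, keepdims=True) - G.mean()
    return G - row - col

def correct(frame, G, Z):
    """Invert the linear model to estimate incident radiance."""
    return (frame - Z) / G
```

With a noiseless synthetic sensor the fit recovers G and Z exactly, and flattening a gain map that is purely row+column structured yields a uniform gain.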
Binocular contrast-gain control for natural scenes: Image structure and phase alignment.
Huang, Pi-Chun; Dai, Yu-Ming
2018-05-01
In the context of natural scenes, we applied the pattern-masking paradigm to investigate how image structure and phase alignment affect contrast-gain control in binocular vision. We measured the discrimination thresholds of bandpass-filtered natural-scene images (targets) under various types of pedestals. Our first experiment had four pedestal types: bandpass-filtered pedestals, unfiltered pedestals, notch-filtered pedestals (which enabled removal of the spatial frequency), and misaligned pedestals (which involved rotation of unfiltered pedestals). Our second experiment featured six types of pedestals: bandpass-filtered, unfiltered, and notch-filtered pedestals, and the corresponding phase-scrambled pedestals. The thresholds were compared for monocular, binocular, and dichoptic viewing configurations. The bandpass-filtered pedestal and unfiltered pedestals showed classic dipper shapes; the dipper shapes of the notch-filtered, misaligned, and phase-scrambled pedestals were weak. We adopted a two-stage binocular contrast-gain control model to describe our results. We deduced that the phase-alignment information influenced the contrast-gain control mechanism before the binocular summation stage and that the phase-alignment information and structural misalignment information caused relatively strong divisive inhibition in the monocular and interocular suppression stages. When the pedestals were phase-scrambled, the elimination of the interocular suppression processing was the most convincing explanation of the results. Thus, our results indicated that both phase-alignment information and similar image structures cause strong interocular suppression. Copyright © 2018 Elsevier Ltd. All rights reserved.
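A two-stage binocular contrast-gain-control model of the kind adopted here (in the spirit of Meese, Georgeson & Baker, 2006) can be sketched as follows. The parameter values are illustrative, not the fitted ones, and the exact suppression terms in the study may differ; the sketch shows the structure: interocular divisive suppression, then binocular summation, then a second gain-control stage.

```python
def two_stage_response(cl, cr, m=1.3, s=1.0, p=8.0, q=6.5, z=0.01):
    """Stage 1: each eye's contrast (cl, cr) is divisively
    suppressed by the summed monocular and interocular input.
    The outputs are summed binocularly, and stage 2 applies a
    second divisive gain control to the binocular signal."""
    left = cl ** m / (s + cl + cr)     # cr suppresses cl (interocular)
    right = cr ** m / (s + cl + cr)
    b = left + right                   # binocular summation
    return b ** p / (z + b ** q)       # stage 2 gain control
```

In this structure, binocular presentation yields a larger response than monocular presentation of the same contrast, and phase-scrambling manipulations are naturally modeled as weakening the interocular suppression term.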
San Francisco Bay, California as seen from STS-59
1994-04-14
STS059-213-009 (9-20 April 1994) --- San Francisco Bay. Orient with the sea up. The delta of the combined Sacramento and San Joaquin Rivers occupies the foreground, San Francisco Bay the middle distance, and the Pacific Ocean the rest. Variations in water color caused both by sediment load and by wind streaking strike the eye. Man-made features dominate this scene. The Lafayette/Concord complex is left of the bay head, Vallejo is to the right, the Berkeley/Oakland complex rims the shoreline of the main bay, and San Francisco fills the peninsula beyond. Salt-evaporation ponds contain differently-colored algae depending on salinity. The low altitude (less than 120 nautical miles) and unusually-clear air combine to provide unusually-strong green colors in this Spring scene. Hasselblad camera.
A neighboring structure reconstructed matching algorithm based on LARK features
NASA Astrophysics Data System (ADS)
Xue, Taobei; Han, Jing; Zhang, Yi; Bai, Lianfa
2015-11-01
To address the low contrast and high noise of infrared images, and the randomness and ambient occlusion of their objects, this paper presents a neighboring structure reconstructed matching (NSRM) algorithm based on LARK features. The neighboring structure relationships of a local window are modeled with a non-negative linear reconstruction method to build a neighboring structure relationship matrix. The LARK feature matrix and the NSRM matrix are then processed separately to obtain two different similarity images. By fusing and analyzing the two similarity images, infrared objects are detected and marked by non-maximum suppression. The NSRM approach is extended to detect infrared objects with incompact structure. High performance is demonstrated on an infrared body dataset, with a lower false-detection rate than conventional methods in complex natural scenes.
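The non-negative linear reconstruction at the heart of the NSRM step can be sketched with projected gradient descent: express a center descriptor as a non-negative combination of its neighbors, and use the weights as the structure relationships. The solver and variable names are assumptions for illustration; the paper does not specify this particular optimizer.

```python
import numpy as np

def nn_reconstruct(center, neighbors, iters=2000, lr=0.01):
    """Reconstruct `center` (a d-dim descriptor) as a non-negative
    linear combination of `neighbors` (list of d-dim descriptors)
    via projected gradient descent on the least-squares objective.
    Returns the weight vector encoding neighboring-structure
    relationships, one entry per neighbor."""
    A = np.stack(neighbors, axis=1)        # (d, k) design matrix
    w = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ w - center)      # gradient of 0.5*||A w - c||^2
        w = np.maximum(0.0, w - lr * grad) # project onto w >= 0
    return w
```

Stacking such weight vectors over all windows would give the NSRM matrix that is fused with the LARK similarity image.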
Oliver, Daniel G.; Serovich, Julianne M.; Mason, Tina L.
2006-01-01
In this paper we discuss the complexities of interview transcription. While often seen as a behind-the-scenes task, we suggest that transcription is a powerful act of representation. Transcription is practiced in multiple ways, often using naturalism, in which every utterance is captured in as much detail as possible, and/or denaturalism, in which grammar is corrected, interview noise (e.g., stutters, pauses, etc.) is removed and nonstandard accents (i.e., non-majority) are standardized. In this article, we discuss the constraints and opportunities of our transcription decisions and point to an intermediate, reflective step. We suggest that researchers incorporate reflection into their research design by interrogating their transcription decisions and the possible impact these decisions may have on participants and research outcomes. PMID:16534533
GPU accelerated edge-region based level set evolution constrained by 2D gray-scale histogram.
Balla-Arabé, Souleymane; Gao, Xinbo; Wang, Bin
2013-07-01
Due to its intrinsic nature, which makes it easy to handle complex shapes and topological changes, the level set method (LSM) has been widely used in image segmentation. Nevertheless, the LSM is computationally expensive, which limits its applications in real-time systems. For this purpose, we propose a new level set algorithm, which uses edge, region, and 2D histogram information simultaneously in order to efficiently segment objects of interest in a given scene. The computational complexity of the proposed LSM is greatly reduced by using the highly parallelizable lattice Boltzmann method (LBM) with a body force to solve the level set equation (LSE). The body force is the link with the image data and is defined from the proposed LSE. The proposed LSM is then implemented on an NVIDIA graphics processing unit to take full advantage of the LBM's local nature. The new algorithm is effective, robust against noise, independent of the initial contour, fast, and highly parallelizable. The edge and region information enable the method to detect objects with and without edges, and the 2D histogram information ensures its effectiveness in a noisy environment. Experimental results on synthetic and real images demonstrate subjectively and objectively the performance of the proposed method.
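The coupling of the level set equation to an LBM solver with a body force can be sketched on a D2Q9 lattice. This is a schematic reading only: in the paper the body force is built from edge, region, and 2D-histogram terms of the image, whereas here it is just an input array, and the relaxation setup is a standard BGK diffusion scheme rather than the authors' exact discretization.

```python
import numpy as np

# D2Q9 lattice: weights and velocity vectors
W = np.array([4/9] + [1/9] * 4 + [1/36] * 4)
E = np.array([(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
              (1, 1), (-1, 1), (-1, -1), (1, -1)])

def lbm_step(f, force, tau=1.0):
    """One lattice-Boltzmann (BGK) update of the level-set field.
    f: (9, H, W) particle distributions; force: (H, W) body force
    coupling the evolution to image data. Returns the updated
    distributions and the level-set function phi = sum_i f_i."""
    phi = f.sum(axis=0)
    feq = W[:, None, None] * phi                     # equilibrium (diffusion)
    f = f - (f - feq) / tau + W[:, None, None] * force  # collide + force
    for i, (ex, ey) in enumerate(E):                 # streaming step
        f[i] = np.roll(np.roll(f[i], ex, axis=1), ey, axis=0)
    return f, f.sum(axis=0)
```

Because each site only touches its eight neighbors, every collision and streaming operation is independent per pixel, which is what makes the GPU mapping straightforward.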
Searching for simplicity in the analysis of neurons and behavior
Stephens, Greg J.; Osborne, Leslie C.; Bialek, William
2011-01-01
What fascinates us about animal behavior is its richness and complexity, but understanding behavior and its neural basis requires a simpler description. Traditionally, simplification has been imposed by training animals to engage in a limited set of behaviors, by hand scoring behaviors into discrete classes, or by limiting the sensory experience of the organism. An alternative is to ask whether we can search through the dynamics of natural behaviors to find explicit evidence that these behaviors are simpler than they might have been. We review two mathematical approaches to simplification, dimensionality reduction and the maximum entropy method, and we draw on examples from different levels of biological organization, from the crawling behavior of Caenorhabditis elegans to the control of smooth pursuit eye movements in primates, and from the coding of natural scenes by networks of neurons in the retina to the rules of English spelling. In each case, we argue that the explicit search for simplicity uncovers new and unexpected features of the biological system and that the evidence for simplification gives us a language with which to phrase new questions for the next generation of experiments. The fact that similar mathematical structures succeed in taming the complexity of very different biological systems hints that there is something more general to be discovered. PMID:21383186
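Of the two mathematical approaches reviewed, dimensionality reduction is the easier to sketch concretely: project mean-centered observations onto the top principal components and report how much variance they capture. A minimal PCA via SVD (illustrative, not tied to any dataset in the review):

```python
import numpy as np

def pca(X, k):
    """Reduce rows of X (observations) to k dimensions by projecting
    mean-centered data onto the top-k principal components.
    Returns the projections and the fraction of variance explained."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (S[:k] ** 2).sum() / (S ** 2).sum()
    return Xc @ Vt[:k].T, explained
```

The "eigenworm" description of C. elegans crawling cited in the review is exactly this computation applied to body-posture angles: a handful of components suffices because the explained-variance fraction saturates quickly.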
A Modeling Method of Fluttering Leaves Based on Point Cloud
NASA Astrophysics Data System (ADS)
Tang, J.; Wang, Y.; Zhao, Y.; Hao, W.; Ning, X.; Lv, K.; Shi, Z.; Zhao, M.
2017-09-01
Leaves falling gently or fluttering are a common phenomenon in natural scenes. The authenticity of falling leaves plays an important part in the dynamic modeling of natural scenes, and falling-leaf models have wide applications in animation and virtual reality. We propose a novel modeling method for fluttering leaves based on point clouds. According to the shape and weight of the leaves and the wind speed, three basic falling trajectories are defined: rotation falling, roll falling, and screw roll falling. In addition, a parallel algorithm based on OpenMP is implemented to satisfy real-time requirements in practical applications. Experimental results demonstrate that the proposed method is amenable to the incorporation of a variety of desirable effects.
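The three basic trajectories could be parameterized as simple closed-form curves. This is an illustrative sketch only; the paper's actual equations, which couple leaf shape, weight, and wind speed, are not reproduced here, and the amplitude/frequency parameters below are invented.

```python
import numpy as np

def rotation_fall(t, v=1.0, omega=4.0, a=0.3):
    """Leaf swings from side to side while descending (rotation falling)."""
    return np.array([a * np.sin(omega * t), 0.0, -v * t])

def roll_fall(t, v=1.0, omega=4.0, drift=0.2):
    """Leaf tumbles end over end and drifts sideways (roll falling)."""
    return np.array([drift * t, 0.1 * np.cos(omega * t), -v * t])

def screw_roll_fall(t, v=1.0, omega=4.0, r=0.2):
    """Leaf spirals downward along a helix (screw roll falling)."""
    return np.array([r * np.cos(omega * t), r * np.sin(omega * t), -v * t])

# sample one trajectory over two seconds at 60 frames per second
traj = np.stack([screw_roll_fall(t) for t in np.linspace(0.0, 2.0, 120)])
```

Since each frame of each leaf is independent given its parameters, evaluating many leaves in parallel (the paper uses OpenMP) is straightforward.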
Conjoint representation of texture ensemble and location in the parahippocampal place area.
Park, Jeongho; Park, Soojin
2017-04-01
Texture provides crucial information about the category or identity of a scene. Nonetheless, not much is known about how the texture information in a scene is represented in the brain. Previous studies have shown that the parahippocampal place area (PPA), a scene-selective part of visual cortex, responds to simple patches of texture ensemble. However, in natural scenes textures exist in spatial context within a scene. Here we tested two hypotheses that make different predictions on how textures within a scene context are represented in the PPA. The Texture-Only hypothesis suggests that the PPA represents texture ensemble (i.e., the kind of texture) as is, irrespective of its location in the scene. On the other hand, the Texture and Location hypothesis suggests that the PPA represents texture and its location within a scene (e.g., ceiling or wall) conjointly. We tested these two hypotheses across two experiments, using different but complementary methods. In experiment 1, by using multivoxel pattern analysis (MVPA) and representational similarity analysis, we found that the representational similarity of the PPA activation patterns was significantly explained by the Texture-Only hypothesis but not by the Texture and Location hypothesis. In experiment 2, using a repetition suppression paradigm, we found no repetition suppression for scenes that had the same texture ensemble but differed in location (supporting the Texture and Location hypothesis). On the basis of these results, we propose a framework that reconciles contrasting results from MVPA and repetition suppression and draw conclusions about how texture is represented in the PPA. NEW & NOTEWORTHY This study investigates how the parahippocampal place area (PPA) represents texture information within a scene context.
We claim that texture is represented in the PPA at multiple levels: the texture ensemble information at the across-voxel level and the conjoint information of texture and its location at the within-voxel level. The study proposes a working hypothesis that reconciles contrasting results from multivoxel pattern analysis and repetition suppression, suggesting that the methods are complementary to each other but not necessarily interchangeable. Copyright © 2017 the American Physiological Society.
NASA Technical Reports Server (NTRS)
Parrish, Russell V.; Busquets, Anthony M.; Williams, Steven P.; Nold, Dean E.
2003-01-01
A simulation study was conducted in 1994 at Langley Research Center that used 12 commercial airline pilots repeatedly flying complex Microwave Landing System (MLS)-type approaches to parallel runways under Category IIIc weather conditions. Two sensor insert concepts of 'Synthetic Vision Systems' (SVS) were used in the simulated flights, with a more conventional electro-optical display (similar to a Head-Up Display with raster capability for sensor imagery), flown under less restrictive visibility conditions, used as a control condition. The SVS concepts combined the sensor imagery with a computer-generated image (CGI) of an out-the-window scene based on an onboard airport database. Various scenarios involving runway traffic incursions (taxiing aircraft and parked fuel trucks) and navigational system position errors (both static and dynamic) were used to assess the pilots' ability to manage the approach task with the display concepts. The two SVS sensor insert concepts contrasted the simple overlay of sensor imagery on the CGI scene without additional image processing (the SV display) to the complex integration (the AV display) of the CGI scene with pilot-decision aiding using both object and edge detection techniques for detection of obstacle conflicts and runway alignment errors.
Cant, Jonathan S; Xu, Yaoda
2017-02-01
Our visual system can extract summary statistics from large collections of objects without forming detailed representations of the individual objects in the ensemble. In a region in ventral visual cortex encompassing the collateral sulcus and the parahippocampal gyrus and overlapping extensively with the scene-selective parahippocampal place area (PPA), we have previously reported fMRI adaptation to object ensembles when ensemble statistics repeated, even when local image features differed across images (e.g., two different images of the same strawberry pile). We additionally showed that this ensemble representation is similar to (but still distinct from) how visual texture patterns are processed in this region and is not explained by appealing to differences in the color of the elements that make up the ensemble. To further explore the nature of ensemble representation in this brain region, here we used PPA as our ROI and investigated in detail how the shape and surface properties (i.e., both texture and color) of the individual objects constituting an ensemble affect the ensemble representation in anterior-medial ventral visual cortex. We photographed object ensembles of stone beads that varied in shape and surface properties. A given ensemble always contained beads of the same shape and surface properties (e.g., an ensemble of star-shaped rose quartz beads). A change to the shape and/or surface properties of all the beads in an ensemble resulted in a significant release from adaptation in PPA compared with conditions in which no ensemble feature changed. In contrast, in the object-sensitive lateral occipital area (LO), we only observed a significant release from adaptation when the shape of the ensemble elements varied, and found no significant results in additional scene-sensitive regions, namely, the retrosplenial complex and occipital place area. 
Together, these results demonstrate that the shape and surface properties of the individual objects comprising an ensemble both contribute significantly to object ensemble representation in anterior-medial ventral visual cortex and further demonstrate a functional dissociation between object- (LO) and scene-selective (PPA) visual cortical regions and within the broader scene-processing network itself.
Region grouping in natural foliage scenes: image statistics and human performance.
Ing, Almon D; Wilson, J Anthony; Geisler, Wilson S
2010-04-27
This study investigated the mechanisms of grouping and segregation in natural scenes of close-up foliage, an important class of scenes for human and non-human primates. Close-up foliage images were collected with a digital camera calibrated to match the responses of human L, M, and S cones at each pixel. The images were used to construct a database of hand-segmented leaves and branches that correctly localizes the image region subtended by each object. We considered a task where a visual system is presented with two image patches and is asked to assign a category label (either same or different) depending on whether the patches appear to lie on the same surface or different surfaces. We estimated several approximately ideal classifiers for the task, each of which used a unique set of image properties. Of the image properties considered, we found that ideal classifiers rely primarily on the difference in average intensity and color between patches, and secondarily on the differences in the contrasts between patches. In psychophysical experiments, human performance mirrored the trends predicted by the ideal classifiers. In an initial phase without corrective feedback, human accuracy was slightly below ideal. After practice with feedback, human accuracy was approximately ideal.
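The "approximately ideal" same/different classification can be illustrated with a toy Gaussian classifier over two of the cues the study found most useful: the difference in mean intensity and in contrast between two patches. This sketch runs on synthetic patches, not the calibrated foliage database, and is not the authors' estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

def patch(mu, sigma):
    """Synthetic 8x8 luminance patch with mean mu and contrast sigma."""
    return mu + sigma * rng.standard_normal((8, 8))

def features(p1, p2):
    """Primary cues: difference in mean intensity and in RMS contrast."""
    return [abs(p1.mean() - p2.mean()), abs(p1.std() - p2.std())]

same, diff = [], []
for _ in range(300):
    mu, sg = rng.uniform(0.2, 0.8), rng.uniform(0.05, 0.25)
    same.append(features(patch(mu, sg), patch(mu, sg)))     # one surface
    diff.append(features(patch(rng.uniform(0.2, 0.8), rng.uniform(0.05, 0.25)),
                         patch(rng.uniform(0.2, 0.8), rng.uniform(0.05, 0.25))))
same, diff = np.array(same), np.array(diff)

# naive-Bayes Gaussian classifier on the two cues
ms, ss = same.mean(0), same.std(0) + 1e-9
md, sd = diff.mean(0), diff.std(0) + 1e-9

def loglik(x, m, s):
    return (-0.5 * ((x - m) / s) ** 2 - np.log(s)).sum()

def is_same(f):
    return loglik(f, ms, ss) > loglik(f, md, sd)

acc = np.mean([is_same(f) for f in same] + [not is_same(f) for f in diff])
```

Patches drawn from one surface differ only by sampling noise in mean and contrast, so the likelihood-ratio rule separates the classes well above chance.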
Referenceless perceptual fog density prediction model
NASA Astrophysics Data System (ADS)
Choi, Lark Kwon; You, Jaehee; Bovik, Alan C.
2014-02-01
We propose a perceptual fog density prediction model based on natural scene statistics (NSS) and "fog aware" statistical features, which can predict the visibility in a foggy scene from a single image without reference to a corresponding fogless image, without side geographical camera information, without training on human-rated judgments, and without dependency on salient objects such as lane markings or traffic signs. The proposed fog density predictor only makes use of measurable deviations from statistical regularities observed in natural foggy and fog-free images. A fog aware collection of statistical features is derived from a corpus of foggy and fog-free images by using a space domain NSS model and observed characteristics of foggy images such as low contrast, faint color, and shifted intensity. The proposed model not only predicts perceptual fog density for the entire image but also provides a local fog density index for each patch. The predicted fog density of the model correlates well with the measured visibility in a foggy scene as measured by judgments taken in a human subjective study on a large foggy image database. As one application, the proposed model accurately evaluates the performance of defog algorithms designed to enhance the visibility of foggy images.
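The space-domain NSS features referenced here build on mean-subtracted, contrast-normalized (MSCN) coefficients. Below is a minimal sketch using box-filter local statistics and a Koschmieder-style atmospheric veil to simulate fog; the constants are illustrative and do not reproduce the paper's trained feature set.

```python
import numpy as np

def box(img, k=7):
    """k x k box-filtered local mean."""
    pad = k // 2
    p = np.pad(img, pad, mode='reflect')
    h, w = img.shape
    return sum(p[i:i + h, j:j + w] for i in range(k) for j in range(k)) / k ** 2

def fog_features(img, c=0.01):
    mu = box(img)
    sigma = np.sqrt(np.clip(box(img ** 2) - mu ** 2, 0.0, None))
    mscn = (img - mu) / (sigma + c)       # mean-subtracted contrast-normalized
    return sigma.mean(), mscn.std()       # local contrast and MSCN spread

rng = np.random.default_rng(0)
clear = rng.random((64, 64))
fog = 0.4 * clear + 0.6 * 0.9             # veil: t * I + (1 - t) * A

contrast_clear, spread_clear = fog_features(clear)
contrast_fog, spread_fog = fog_features(fog)
# both statistics drop under fog, the deviation a fog-aware NSS model measures
```

No fog-free reference image is needed: the features are compared against the statistical regularities of a fog-free corpus rather than a paired image.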
36 CFR 28.12 - Development standards.
Code of Federal Regulations, 2011 CFR
2011-07-01
... significant harm to the natural resources of the Seashore. (c) Minimum lot size is 4,000 square feet. A.../FEMA shown on Flood Insurance Rate Maps for Fire Island communities. (g) A swimming pool is an... hazards and/or detract from the natural or cultural scene. (j) A zoning authority shall have in place...
Plenoptic layer-based modeling for image based rendering.
Pearson, James; Brookes, Mike; Dragotti, Pier Luigi
2013-09-01
Image based rendering is an attractive alternative to model based rendering for generating novel views because of its lower complexity and potential for photo-realistic results. To reduce the number of images necessary for alias-free rendering, some geometric information for the 3D scene is normally necessary. In this paper, we present a fast automatic layer-based method for synthesizing an arbitrary new view of a scene from a set of existing views. Our algorithm takes advantage of the knowledge of the typical structure of multiview data to perform occlusion-aware layer extraction. In addition, the number of depth layers used to approximate the geometry of the scene is chosen based on plenoptic sampling theory with the layers placed non-uniformly to account for the scene distribution. The rendering is achieved using a probabilistic interpolation approach and by extracting the depth layer information on a small number of key images. Numerical results demonstrate that the algorithm is fast and yet is only 0.25 dB away from the ideal performance achieved with the ground-truth knowledge of the 3D geometry of the scene of interest. This indicates that there are measurable benefits from following the predictions of plenoptic theory and that they remain true when translated into a practical system for real world data.
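Plenoptic sampling theory suggests spacing depth layers uniformly in disparity (inverse depth) rather than in depth, so the nearest scene regions, where parallax changes fastest, get the finest quantization. A minimal sketch follows; the per-scene non-uniform placement the paper describes would further adapt these positions to the actual depth distribution.

```python
import numpy as np

def depth_layers(z_min, z_max, n_layers):
    """Layer depths uniformly spaced in disparity 1/z."""
    disparities = np.linspace(1.0 / z_min, 1.0 / z_max, n_layers)
    return 1.0 / disparities

layers = depth_layers(2.0, 20.0, 5)
# gaps between consecutive layers grow with depth: fine near, coarse far
gaps = np.diff(layers)
```

With layers placed this way, the maximum disparity error per layer is constant, which is what keeps rendering alias-free with a small number of layers.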
Use of LANDSAT images of vegetation cover to estimate effective hydraulic properties of soils
NASA Technical Reports Server (NTRS)
Eagleson, Peter S.; Jasinski, Michael F.
1988-01-01
This work focuses on the characterization of natural, spatially variable, semivegetated landscapes using a linear, stochastic, canopy-soil reflectance model. A first application of the model was the investigation of the effects of subpixel and regional variability of scenes on the shape and structure of red-infrared scattergrams. Additionally, the model was used to investigate the inverse problem, the estimation of subpixel vegetation cover, given only the scattergrams of simulated satellite scale multispectral scenes. The major aspects of that work, including recent field investigations, are summarized.
NASA Astrophysics Data System (ADS)
Meitzler, Thomas J.
The field of computer vision interacts with fields such as psychology, vision research, machine vision, psychophysics, mathematics, physics, and computer science. The focus of this thesis is new algorithms and methods for the computation of the probability of detection (Pd) of a target in a cluttered scene. The scene can be either a natural visual scene such as one sees with the naked eye (visual), or a scene displayed on a monitor with the help of infrared sensors. The relative clutter and the temperature difference between the target and background (DeltaT) are defined and then used to calculate a relative signal-to-clutter ratio (SCR), from which the Pd is calculated for a target in a cluttered scene. It is shown how this definition can include many previous definitions of clutter and DeltaT. Next, fuzzy and neural-fuzzy techniques are used to calculate the Pd, and it is shown how these methods can give results that correlate well with experiment. The experimental design for actually measuring the Pd of a target by observers is described. Finally, wavelets are applied to the calculation of clutter, and it is shown how this new definition of clutter based on wavelets can be used to compute the Pd of a target.
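A common way to make "relative clutter" concrete is the Schmieder-Weathersby block metric (RMS of per-block standard deviations), with Pd obtained from the SCR through a saturating psychometric curve. The sketch below uses that standard metric plus an invented TTPF-style mapping; the `scr50` and `slope` values are illustrative, not the thesis's fitted model.

```python
import numpy as np

def clutter_sw(bg, block=8):
    """Schmieder-Weathersby clutter: RMS of per-block standard deviations."""
    h, w = bg.shape
    stds = [bg[i:i + block, j:j + block].std()
            for i in range(0, h - block + 1, block)
            for j in range(0, w - block + 1, block)]
    return float(np.sqrt(np.mean(np.square(stds))))

def pd_from_scr(delta_t, clut, scr50=2.0, slope=2.5):
    """Hypothetical mapping from signal-to-clutter ratio to Pd."""
    scr = abs(delta_t) / (clut + 1e-9)
    return (scr / scr50) ** slope / (1.0 + (scr / scr50) ** slope)

rng = np.random.default_rng(1)
bg = rng.normal(0.0, 2.0, (64, 64))   # synthetic background, 2-degree std
c = clutter_sw(bg)
pd_hot = pd_from_scr(8.0, c)          # strong target-background contrast
pd_dim = pd_from_scr(2.0, c)          # weak contrast, lower Pd
```

The mapping saturates at 1 for large SCR and falls toward 0 as clutter overwhelms the target contrast, matching the qualitative behavior a Pd model must have.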
Coping with Work-related Traumatic Situations among Crime Scene Technicians.
Pavšič Mrevlje, Tinkara
2016-10-01
Crime scene technicians collect evidence related to crime and are therefore exposed to many traumatic situations. The coping strategies they use are thus very important in the process of facing the psychological consequences of such work. The available literature shows that crime scene technicians are an understudied subgroup of police workers. Our study is therefore the first to unfold insights into technicians' coping strategies, post-traumatic symptomatology, and somatic health, based on a sample of 64 male crime scene technicians (85% of all Slovene technicians). Crime scene technicians mainly use avoidance coping strategies. Approach strategies, which are more effective in the long term (i.e., lead to greater buffering of the effects of traumatic stress), are used more frequently if technicians are familiar with the nature of the task, when they have time to prepare for it, and if they feel that past situations have been positively resolved. Behavioural avoidance strategies were found to be least effective when dealing with traumatic experiences and are also related to more frequent physical health problems. The results indicate that appropriate training for future technicians would facilitate the use of more effective coping strategies and consequently lead to more effective and satisfied workers. Copyright © 2014 John Wiley & Sons, Ltd.
A new approach to modeling the influence of image features on fixation selection in scenes
Nuthmann, Antje; Einhäuser, Wolfgang
2015-01-01
Which image characteristics predict where people fixate when memorizing natural images? To answer this question, we introduce a new analysis approach that combines a novel scene-patch analysis with generalized linear mixed models (GLMMs). Our method allows for (1) directly describing the relationship between continuous feature value and fixation probability, and (2) assessing each feature's unique contribution to fixation selection. To demonstrate this method, we estimated the relative contribution of various image features to fixation selection: luminance and luminance contrast (low-level features); edge density (a mid-level feature); and visual clutter and image segmentation to approximate local object density in the scene (higher-level features). An additional predictor captured the central bias of fixation. The GLMM results revealed that edge density, clutter, and the number of homogeneous segments in a patch can independently predict whether image patches are fixated or not. Importantly, neither luminance nor contrast had an independent effect above and beyond what could be accounted for by the other predictors. Since the parcellation of the scene and the selection of features can be tailored to the specific research question, our approach allows for assessing the interplay of various factors relevant for fixation selection in scenes in a powerful and flexible manner. PMID:25752239
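The core of the approach, modeling a binary fixated/not-fixated outcome as a function of continuous patch features, can be sketched as plain fixed-effects logistic regression. A full GLMM additionally includes random effects (e.g., per-subject or per-scene intercepts), omitted here; the data and coefficients below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic z-scored patch features: luminance, contrast, edge density
X = rng.standard_normal((2000, 3))
# ground truth: only edge density raises fixation probability;
# the negative intercept plays the role of a baseline fixation rate
true_logit = -1.0 + 1.5 * X[:, 2]
y = rng.random(2000) < 1.0 / (1.0 + np.exp(-true_logit))

# fixed-effects logistic regression fitted by gradient ascent
Xd = np.hstack([np.ones((2000, 1)), X])
w = np.zeros(4)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-Xd @ w))
    w += 0.1 * Xd.T @ (y - p) / len(y)

# w recovers each feature's unique contribution:
# edge density (w[3]) is large, luminance and contrast (w[1], w[2]) near zero
```

Because all features enter one model, a feature only gets a large coefficient if it predicts fixation above and beyond the others, which is exactly the "unique contribution" logic of the paper.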
Age-related changes in visual exploratory behavior in a natural scene setting
Hamel, Johanna; De Beukelaer, Sophie; Kraft, Antje; Ohl, Sven; Audebert, Heinrich J.; Brandt, Stephan A.
2013-01-01
Diverse cognitive functions decline with increasing age, including the ability to process central and peripheral visual information in a laboratory testing situation (useful visual field of view). To investigate whether and how this influences activities of daily life, we studied age-related changes in visual exploratory behavior in a natural scene setting: a driving simulator paradigm of variable complexity was tested in subjects of varying ages with simultaneous eye- and head-movement recordings via a head-mounted camera. Detection and reaction times were also measured by visual fixation and manual reaction. We considered video computer game experience as a possible influence on performance. Data of 73 participants of varying ages were analyzed, driving two different courses. We analyzed the influence of route difficulty level, age, and eccentricity of test stimuli on oculomotor and driving behavior parameters. No significant age effects were found regarding saccadic parameters. In the older subjects, head movements increasingly contributed to gaze amplitude. More demanding courses and more peripheral stimulus locations induced longer reaction times in all age groups. Deterioration of the functionally useful visual field of view with increasing age was not suggested in our study group. However, video game-experienced subjects revealed larger saccade amplitudes and a broader distribution of fixations on the screen. They reacted faster to peripheral objects, suggesting that they treated driving as a general detection task rather than as a central task. As the video game-experienced population consisted of younger subjects, our study indicates that effects due to video game experience can easily be misinterpreted as age effects if not accounted for. We therefore view it as essential to consider video game experience in all testing methods using virtual media. PMID:23801970
Programmable personality interface for the dynamic infrared scene generator (IRSG2)
NASA Astrophysics Data System (ADS)
Buford, James A., Jr.; Mobley, Scott B.; Mayhall, Anthony J.; Braselton, William J.
1998-07-01
As scene generator platforms begin to rely specifically on commercial off-the-shelf (COTS) hardware and software components, the need for high speed programmable personality interfaces (PPIs) are required for interfacing to Infrared (IR) flight computer/processors and complex IR projectors in the hardware-in-the-loop (HWIL) simulation facilities. Recent technological advances and innovative applications of established technologies are beginning to allow development of cost effective PPIs to interface to COTS scene generators. At the U.S. Army Aviation and Missile Command (AMCOM) Missile Research, Development, and Engineering Center (MRDEC) researchers have developed such a PPI to reside between the AMCOM MRDEC IR Scene Generator (IRSG) and either a missile flight computer or the dynamic Laser Diode Array Projector (LDAP). AMCOM MRDEC has developed several PPIs for the first and second generation IRSGs (IRSG1 and IRSG2), which are based on Silicon Graphics Incorporated (SGI) Onyx and Onyx2 computers with Reality Engine 2 (RE2) and Infinite Reality (IR/IR2) graphics engines. This paper provides an overview of PPIs designed, integrated, tested, and verified at AMCOM MRDEC, specifically the IRSG2's PPI.
Recent advances in exploring the neural underpinnings of auditory scene perception
Snyder, Joel S.; Elhilali, Mounya
2017-01-01
Studies of auditory scene analysis have traditionally relied on paradigms using artificial sounds—and conventional behavioral techniques—to elucidate how we perceptually segregate auditory objects or streams from each other. In the past few decades, however, there has been growing interest in uncovering the neural underpinnings of auditory segregation using human and animal neuroscience techniques, as well as computational modeling. This largely reflects the growth in the fields of cognitive neuroscience and computational neuroscience and has led to new theories of how the auditory system segregates sounds in complex arrays. The current review focuses on neural and computational studies of auditory scene perception published in the past few years. Following the progress that has been made in these studies, we describe (1) theoretical advances in our understanding of the most well-studied aspects of auditory scene perception, namely segregation of sequential patterns of sounds and concurrently presented sounds; (2) the diversification of topics and paradigms that have been investigated; and (3) how new neuroscience techniques (including invasive neurophysiology in awake humans, genotyping, and brain stimulation) have been used in this field. PMID:28199022
Richard, Christian M; Wright, Richard D; Ee, Cheryl; Prime, Steven L; Shimizu, Yujiro; Vavrik, John
2002-01-01
The effect of a concurrent auditory task on visual search was investigated using an image-flicker technique. Participants were undergraduate university students with normal or corrected-to-normal vision who searched for changes in images of driving scenes that involved either driving-related (e.g., traffic light) or driving-unrelated (e.g., mailbox) scene elements. The results indicated that response times were significantly slower if the search was accompanied by a concurrent auditory task. In addition, slower overall responses to scenes involving driving-unrelated changes suggest that the underlying process affected by the concurrent auditory task is strategic in nature. These results were interpreted in terms of their implications for using a cellular telephone while driving. Actual or potential applications of this research include the development of safer in-vehicle communication devices.
Filippi, Massimo; Riccitelli, Gianna; Falini, Andrea; Di Salle, Francesco; Vuilleumier, Patrik; Comi, Giancarlo; Rocca, Maria A.
2010-01-01
Empathy and affective appraisals for conspecifics are among the hallmarks of social interaction. Using functional MRI, we hypothesized that vegetarians and vegans, who made their feeding choice for ethical reasons, might show brain responses to conditions of suffering involving humans or animals different from those of omnivores. We recruited 20 omnivore subjects, 19 vegetarians, and 21 vegans. The groups were matched for sex and age. Brain activation was investigated using fMRI and an event-related design during observation of negative affective pictures of human beings and animals (showing mutilations, murdered people, human/animal threat, tortures, wounds, etc.). Participants saw negative-valence scenes related to humans and animals, alternating with natural landscapes. During human negative-valence scenes, compared with omnivores, vegetarians and vegans had an increased recruitment of the anterior cingulate cortex (ACC) and inferior frontal gyrus (IFG). More critically, during animal negative-valence scenes, they had decreased amygdala activation and increased activation of the lingual gyri, the left cuneus, the posterior cingulate cortex, and several areas mainly located in the frontal lobes, including the ACC, the IFG, and the middle frontal gyrus. Nonetheless, substantial differences between vegetarians and vegans were also found in their responses to negative scenes. Vegetarians showed a selective recruitment of the right inferior parietal lobule during human negative scenes and a prevailing activation of the ACC during animal negative scenes. Conversely, during animal negative scenes an increased activation of the inferior prefrontal cortex was observed in vegans. These results suggest that empathy toward nonconspecifics has a different neural representation among individuals with different feeding habits, perhaps reflecting different motivational factors and beliefs. PMID:20520767
Text Extraction from Scene Images by Character Appearance and Structure Modeling
Yi, Chucai; Tian, Yingli
2012-01-01
In this paper, we propose a novel algorithm to detect text information from natural scene images. Scene text classification and detection are still open research topics. Our proposed algorithm is able to model both character appearance and structure to generate representative and discriminative text descriptors. The contributions of this paper include three aspects: 1) a new character appearance model by a structure correlation algorithm which extracts discriminative appearance features from detected interest points of character samples; 2) a new text descriptor based on structons and correlatons, which model character structure by structure differences among character samples and structure component co-occurrence; and 3) a new text region localization method by combining color decomposition, character contour refinement, and string line alignment to localize character candidates and refine detected text regions. We perform three groups of experiments to evaluate the effectiveness of our proposed algorithm, including text classification, text detection, and character identification. The evaluation results on benchmark datasets demonstrate that our algorithm achieves the state-of-the-art performance on scene text classification and detection, and significantly outperforms the existing algorithms for character identification. PMID:23316111
A multiple-feature and multiple-kernel scene segmentation algorithm for humanoid robot.
Liu, Zhi; Xu, Shuqiong; Zhang, Yun; Chen, Chun Lung Philip
2014-11-01
This technical correspondence presents a multiple-feature and multiple-kernel support vector machine (MFMK-SVM) methodology to achieve more reliable and robust segmentation performance for humanoid robots. The pixel-wise intensity, gradient, and C1 SMF features are extracted via the local homogeneity model and Gabor filters and used as inputs to the MFMK-SVM model, providing multiple sample features for easier implementation and efficient computation. A new clustering method, called the feature validity-interval type-2 fuzzy C-means (FV-IT2FCM) clustering algorithm, is proposed by integrating a type-2 fuzzy criterion into the clustering optimization process to improve the robustness and reliability of the clustering results through iterative optimization. Furthermore, clustering validity is employed to select the training samples for learning the MFMK-SVM model. The MFMK-SVM scene segmentation method is able to take full advantage of the multiple features of the scene image and the ability of multiple kernels. Experiments on the BSDS dataset and real natural scene images demonstrate the superior performance of our proposed method.
A model of proto-object based saliency
Russell, Alexander F.; Mihalaş, Stefan; von der Heydt, Rudiger; Niebur, Ernst; Etienne-Cummings, Ralph
2013-01-01
Organisms use the process of selective attention to optimally allocate their computational resources to the instantaneously most relevant subsets of a visual scene, ensuring that they can parse the scene in real time. Many models of bottom-up attentional selection assume that elementary image features, like intensity, color and orientation, attract attention. Gestalt psychologists, however, argue that humans perceive whole objects before they analyze individual features. This is supported by recent psychophysical studies that show that objects predict eye-fixations better than features. In this report we present a neurally inspired algorithm of object based, bottom-up attention. The model rivals the performance of state of the art non-biologically plausible feature based algorithms (and outperforms biologically plausible feature based algorithms) in its ability to predict perceptual saliency (eye fixations and subjective interest points) in natural scenes. The model achieves this by computing saliency as a function of proto-objects that establish the perceptual organization of the scene. All computational mechanisms of the algorithm have direct neural correlates, and our results provide evidence for the interface theory of attention. PMID:24184601
NASA Astrophysics Data System (ADS)
den Hollander, Richard J. M.; Bouma, Henri; van Rest, Jeroen H. C.; ten Hove, Johan-Martijn; ter Haar, Frank B.; Burghouts, Gertjan J.
2017-10-01
Video analytics is essential for managing the large quantities of raw data produced by video surveillance systems (VSS) for the prevention, repression, and investigation of crime and terrorism. Analytics is highly sensitive to changes in the scene and in the optical chain, so a VSS with analytics needs careful configuration and prompt maintenance to avoid false alarms. However, there is a trend from static VSS consisting of fixed CCTV cameras towards more dynamic VSS deployments over public/private multi-organization networks, consisting of a wider variety of visual sensors, including pan-tilt-zoom (PTZ) cameras, body-worn cameras and cameras on moving platforms. This trend will lead to more dynamic scenes and more frequent changes in the optical chain, creating structural problems for analytics. If these problems are not adequately addressed, analytics will not be able to continue to meet end users' developing needs. In this paper, we present a three-part solution for managing the performance of complex analytics deployments. The first part is a register containing metadata describing relevant properties of the optical chain, such as intrinsic and extrinsic calibration, and parameters of the scene such as lighting conditions or measures of scene complexity (e.g. number of people). A second part frequently assesses these parameters in the deployed VSS, stores changes in the register, and signals relevant changes in the setup to the VSS administrator. A third part uses the information in the register to dynamically configure analytics tasks based on VSS operator input. In order to support the feasibility of this solution, we give an overview of related state-of-the-art technologies for autocalibration (self-calibration), scene recognition and lighting estimation in relation to person detection. The presented solution allows for rapid and robust deployment of Video Content Analysis (VCA) tasks in large-scale ad-hoc networks.
A Parallel Rendering Algorithm for MIMD Architectures
NASA Technical Reports Server (NTRS)
Crockett, Thomas W.; Orloff, Tobias
1991-01-01
Applications such as animation and scientific visualization demand high performance rendering of complex three dimensional scenes. To deliver the necessary rendering rates, highly parallel hardware architectures are required. The challenge is then to design algorithms and software which effectively use the hardware parallelism. A rendering algorithm targeted to distributed memory MIMD architectures is described. For maximum performance, the algorithm exploits both object-level and pixel-level parallelism. The behavior of the algorithm is examined both analytically and experimentally. Its performance for large numbers of processors is found to be limited primarily by communication overheads. An experimental implementation for the Intel iPSC/860 shows increasing performance from 1 to 128 processors across a wide range of scene complexities. It is shown that minimal modifications to the algorithm will adapt it for use on shared memory architectures as well.
Saliency Detection on Light Field.
Li, Nianyi; Ye, Jinwei; Ji, Yu; Ling, Haibin; Yu, Jingyi
2017-08-01
Existing saliency detection approaches use images as inputs and are sensitive to foreground/background similarities, complex background textures, and occlusions. We explore the problem of using light fields as input for saliency detection. Our technique is enabled by the availability of commercial plenoptic cameras that capture the light field of a scene in a single shot. We show that the unique refocusing capability of light fields provides useful focusness, depths, and objectness cues. We further develop a new saliency detection algorithm tailored for light fields. To validate our approach, we acquire a light field database of a range of indoor and outdoor scenes and generate the ground truth saliency map. Experiments show that our saliency detection scheme can robustly handle challenging scenarios such as similar foreground and background, cluttered background, complex occlusions, etc., and achieve high accuracy and robustness.
Three-dimensional tracking for efficient fire fighting in complex situations
NASA Astrophysics Data System (ADS)
Akhloufi, Moulay; Rossi, Lucile
2009-05-01
Each year, hundreds of millions of hectares of forest burn, causing human and economic losses. For efficient fire fighting, personnel on the ground need tools that predict fire front propagation. In this work, we present a new technique for automatically tracking fire spread in three-dimensional space. The proposed approach uses a stereo system to extract a 3D shape from fire images. A new segmentation technique is proposed that permits the extraction of fire regions in complex unstructured scenes. It works in the visible spectrum and combines information extracted from the YUV and RGB color spaces. Unlike other techniques, our algorithm does not require prior knowledge about the scene. The resulting fire regions are classified into homogeneous zones using clustering techniques. Contours are then extracted, and a feature detection algorithm is used to detect interest points such as local maxima and corners. Points extracted from the stereo images are then used to compute the 3D shape of the fire front. The resulting data permit reconstruction of the fire volume. The final model is used to compute important spatial and temporal fire characteristics such as spread dynamics, local orientation, and heading direction. Tests conducted on the ground show the efficiency of the proposed scheme, which is being integrated with a mathematical fire spread model in order to predict and anticipate fire behaviour during fire fighting. Also of interest to fire-fighters is the proposed automatic segmentation technique, which can be used for early detection of fire in complex scenes.
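A fire-pixel rule combining RGB and YUV cues, as the abstract describes, might look like the following sketch. The thresholds and the exact decision rule are illustrative assumptions; the paper's actual segmentation criteria are not given here.

```python
import numpy as np

def rgb_to_yuv(img):
    """Convert an RGB float image (H, W, 3) in [0, 1] to YUV (BT.601)."""
    m = np.array([[ 0.299,  0.587,  0.114],
                  [-0.147, -0.289,  0.436],
                  [ 0.615, -0.515, -0.100]])
    return img @ m.T

def fire_mask(img, r_min=0.5, v_min=0.15):
    """Hypothetical fire-pixel rule combining RGB ordering and YUV chroma.

    Flame pixels tend to satisfy R >= G >= B with a strong red component,
    and a large V (red chrominance) in YUV. Threshold values are
    illustrative only, not the authors' published parameters.
    """
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    v = rgb_to_yuv(img)[..., 2]                    # V is large for reddish pixels
    rule_rgb = (r > r_min) & (r >= g) & (g >= b)   # flame colour ordering
    rule_yuv = v > v_min                           # strong red chrominance
    return rule_rgb & rule_yuv
```

The resulting binary mask would feed the clustering and contour-extraction stages described in the abstract.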
Navigating the auditory scene: an expert role for the hippocampus.
Teki, Sundeep; Kumar, Sukhbinder; von Kriegstein, Katharina; Stewart, Lauren; Lyness, C Rebecca; Moore, Brian C J; Capleton, Brian; Griffiths, Timothy D
2012-08-29
Over a typical career piano tuners spend tens of thousands of hours exploring a specialized acoustic environment. Tuning requires accurate perception and adjustment of beats in two-note chords that serve as a navigational device to move between points in previously learned acoustic scenes. It is a two-stage process that depends on the following: first, selective listening to beats within frequency windows, and, second, the subsequent use of those beats to navigate through a complex soundscape. The neuroanatomical substrates underlying brain specialization for such fundamental organization of sound scenes are unknown. Here, we demonstrate that professional piano tuners are significantly better than controls matched for age and musical ability on a psychophysical task simulating active listening to beats within frequency windows that is based on amplitude modulation rate discrimination. Tuners show a categorical increase in gray matter volume in the right frontal operculum and right superior temporal lobe. Tuners also show a striking enhancement of gray matter volume in the anterior hippocampus, parahippocampal gyrus, and superior temporal gyrus, and an increase in white matter volume in the posterior hippocampus as a function of years of tuning experience. The relationship with gray matter volume is sensitive to years of tuning experience and starting age but not actual age or level of musicality. Our findings support a role for a core set of regions in the hippocampus and superior temporal cortex in skilled exploration of complex sound scenes in which precise sound "templates" are encoded and consolidated into memory over time in an experience-dependent manner.
Visual wetness perception based on image color statistics.
Sawayama, Masataka; Adelson, Edward H; Nishida, Shin'ya
2017-05-01
Color vision provides humans and animals with the abilities to discriminate colors based on the wavelength composition of light and to determine the location and identity of objects of interest in cluttered scenes (e.g., ripe fruit among foliage). However, we argue that color vision can inform us about much more than color alone. Since a trichromatic image carries more information about the optical properties of a scene than a monochromatic image does, color can help us recognize complex material qualities. Here we show that human vision uses color statistics of an image for the perception of an ecologically important surface condition (i.e., wetness). Psychophysical experiments showed that overall enhancement of chromatic saturation, combined with a luminance tone change that increases the darkness and glossiness of the image, tended to make dry scenes look wetter. Theoretical analysis along with image analysis of real objects indicated that our image transformation, which we call the wetness enhancing transformation, is consistent with actual optical changes produced by surface wetting. Furthermore, we found that the wetness enhancing transformation operator was more effective for the images with many colors (large hue entropy) than for those with few colors (small hue entropy). The hue entropy may be used to separate surface wetness from other surface states having similar optical properties. While surface wetness and surface color might seem to be independent, there are higher order color statistics that can influence wetness judgments, in accord with the ecological statistics. The present findings indicate that the visual system uses color image statistics in an elegant way to help estimate the complex physical status of a scene.
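The two operations the abstract reports as making dry scenes look wetter, enhanced chromatic saturation plus a darkening luminance tone change, can be sketched as below. The particular saturation operator, gamma curve, and parameter values are assumptions for illustration, not the authors' published transformation.

```python
import numpy as np

def wet_transform(img, sat_gain=1.6, gamma=1.8):
    """Sketch of a wetness-enhancing transformation (assumed form).

    Pushes each pixel's channels away from its achromatic level to boost
    saturation, then applies a gamma > 1 tone curve to darken the image.
    """
    img = np.clip(img, 0.0, 1.0)
    mean = img.mean(axis=-1, keepdims=True)     # per-pixel achromatic level
    saturated = mean + sat_gain * (img - mean)  # spread channels apart
    saturated = np.clip(saturated, 0.0, 1.0)
    return saturated ** gamma                   # gamma > 1 darkens
```

On a colourful (high hue entropy) input this jointly raises saturation and darkness, the combination the psychophysical experiments associate with perceived wetness.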
Learning Long-Range Vision for an Offroad Robot
2008-09-01
Teaching a robot to perceive and navigate in an unstructured natural world is a difficult task. Without learning, navigation systems are short-range and extremely... unsupervised or weakly supervised learning methods are necessary for training general feature representations for natural scenes. The process was...
Virkler, Kelly; Lednev, Igor K
2009-07-01
Body fluid traces recovered at crime scenes are among the most important types of evidence to forensic investigators. They contain valuable DNA evidence which can identify a suspect or victim as well as exonerate an innocent individual. The first step of identifying a particular body fluid is highly important since the nature of the fluid is itself very informative to the investigation, and the destructive nature of a screening test must be considered when only a small amount of material is available. The ability to characterize an unknown stain at the scene of the crime without having to wait for results from a laboratory is another very critical step in the development of forensic body fluid analysis. Driven by the importance for forensic applications, body fluid identification methods have been extensively developed in recent years. The systematic analysis of these new developments is vital for forensic investigators to be continuously educated on possible superior techniques. Significant advances in laser technology and the development of novel light detectors have dramatically improved spectroscopic methods for molecular characterization over the last decade. The application of this novel biospectroscopy for forensic purposes opens new and exciting opportunities for the development of on-field, non-destructive, confirmatory methods for body fluid identification at a crime scene. In addition, the biospectroscopy methods are universally applicable to all body fluids unlike the majority of current techniques which are valid for individual fluids only. This article analyzes the current methods being used to identify body fluid stains including blood, semen, saliva, vaginal fluid, urine, and sweat, and also focuses on new techniques that have been developed in the last 5-6 years. 
In addition, the potential of new biospectroscopic techniques based on Raman and fluorescence spectroscopy is evaluated for rapid, confirmatory, non-destructive identification of a body fluid at a crime scene.
NASA Astrophysics Data System (ADS)
Huang, Xin; Chen, Huijun; Gong, Jianya
2018-01-01
Spaceborne multi-angle images with a high-resolution are capable of simultaneously providing spatial details and three-dimensional (3D) information to support detailed and accurate classification of complex urban scenes. In recent years, satellite-derived digital surface models (DSMs) have been increasingly utilized to provide height information to complement spectral properties for urban classification. However, in such a way, the multi-angle information is not effectively exploited, which is mainly due to the errors and difficulties of the multi-view image matching and the inaccuracy of the generated DSM over complex and dense urban scenes. Therefore, it is still a challenging task to effectively exploit the available angular information from high-resolution multi-angle images. In this paper, we investigate the potential for classifying urban scenes based on local angular properties characterized from high-resolution ZY-3 multi-view images. Specifically, three categories of angular difference features (ADFs) are proposed to describe the angular information at three levels (i.e., pixel, feature, and label levels): (1) ADF-pixel: the angular information is directly extrapolated by pixel comparison between the multi-angle images; (2) ADF-feature: the angular differences are described in the feature domains by comparing the differences between the multi-angle spatial features (e.g., morphological attribute profiles (APs)). (3) ADF-label: label-level angular features are proposed based on a group of urban primitives (e.g., buildings and shadows), in order to describe the specific angular information related to the types of primitive classes. In addition, we utilize spatial-contextual information to refine the multi-level ADF features using superpixel segmentation, for the purpose of alleviating the effects of salt-and-pepper noise and representing the main angular characteristics within a local area. 
The experiments on ZY-3 multi-angle images confirm that the proposed ADF features can effectively improve the accuracy of urban scene classification, with a significant increase in overall accuracy (3.8-11.7%) compared to using the spectral bands alone. Furthermore, the results indicate the superiority of the proposed ADFs in distinguishing between spectrally similar and complex man-made classes, including roads and various types of buildings (e.g., high buildings, urban villages, and residential apartments).
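The simplest of the three feature levels, ADF-pixel, is described as direct pixel comparison between the co-registered multi-angle images. A minimal sketch of such an operator, with the exact form being an assumption, is:

```python
import numpy as np

def adf_pixel(nadir, forward, backward):
    """Pixel-level angular difference features (ADF-pixel), sketched here as
    band-wise absolute differences between co-registered views.

    The paper defines ADF-pixel as pixel comparison across the multi-angle
    images; this particular operator is illustrative, not the authors' exact one.
    """
    return np.stack([np.abs(nadir - forward),
                     np.abs(nadir - backward),
                     np.abs(forward - backward)], axis=-1)
```

Intuitively, tall structures produce large angular differences through parallax and occlusion, while flat ground changes little between views, which is why such features help separate buildings from roads.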
Natural scene logo recognition by joint boosting feature selection in salient regions
NASA Astrophysics Data System (ADS)
Fan, Wei; Sun, Jun; Naoi, Satoshi; Minagawa, Akihiro; Hotta, Yoshinobu
2011-01-01
Logos are considered valuable intellectual property and a key component of the goodwill of a business. In this paper, we propose a natural scene logo recognition method which is segmentation-free, capable of processing images extremely rapidly, and achieves high recognition rates. The classifiers for each logo are trained jointly, rather than independently, so that common features can be shared across multiple classes for better generalization. To deal with the large range of aspect ratios across different logos, a set of salient regions of interest (ROIs) is extracted to describe each class. We ensure that the selected ROIs are both individually informative and pairwise weakly dependent via a Class Conditional Entropy Maximization criterion. Experimental results on a large logo database demonstrate the effectiveness and efficiency of our proposed method.
How to Read the Natural Landscape in Forests and Fields.
ERIC Educational Resources Information Center
Davis, Millard C.
This booklet is one in a series of instructional aids designed for use by elementary and secondary school science teachers. First, the original natural scene of the eastern portion of the United States is described, from the Atlantic Coast to the Mississippi River, as it probably was before European man invaded in great numbers. How local…
NASA Astrophysics Data System (ADS)
O'Byrne, Michael; Ghosh, Bidisha; Schoefs, Franck; O'Donnell, Deirdre; Wright, Robert; Pakrashi, Vikram
2015-07-01
Video based tracking is capable of analysing bridge vibrations that are characterised by large amplitudes and low frequencies. This paper presents the use of video images and associated image processing techniques to obtain the dynamic response of a pedestrian suspension bridge in Cork, Ireland. This historic structure is one of the four suspension bridges in Ireland and is notable for its dynamic nature. A video camera is mounted on the river-bank and the dynamic responses of the bridge have been measured from the video images. The dynamic response is assessed without the need of a reflector on the bridge and in the presence of various forms of luminous complexities in the video image scenes. Vertical deformations of the bridge were measured in this regard. The video image tracking for the measurement of dynamic responses of the bridge was based on correlating patches in time-lagged scenes in video images and utilising a zero mean normalised cross correlation (ZNCC) metric. The bridge was excited by designed pedestrian movement and by individual cyclists traversing the bridge. The time series data of dynamic displacement responses of the bridge were analysed to obtain the frequency domain response. Frequencies obtained from video analysis were checked against accelerometer data from the bridge obtained while carrying out the same set of experiments used for video image based recognition.
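The ZNCC patch matching at the core of this kind of tracking can be sketched as follows. This is a minimal illustration of the standard metric and an exhaustive search; the authors' pipeline adds time-lagged frame handling and refinements not shown here.

```python
import numpy as np

def zncc(patch_a, patch_b):
    """Zero-mean normalised cross-correlation between two equal-size patches.

    Returns a score in [-1, 1]; 1 means identical up to brightness/contrast.
    """
    a = patch_a.astype(float) - patch_a.mean()
    b = patch_b.astype(float) - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom == 0:          # flat patch: correlation undefined
        return 0.0
    return float((a * b).sum() / denom)

def track_patch(template, frame, stride=1):
    """Locate `template` in `frame` by exhaustively maximising ZNCC."""
    th, tw = template.shape
    best, best_yx = -2.0, (0, 0)
    for y in range(0, frame.shape[0] - th + 1, stride):
        for x in range(0, frame.shape[1] - tw + 1, stride):
            s = zncc(template, frame[y:y + th, x:x + tw])
            if s > best:
                best, best_yx = s, (y, x)
    return best_yx, best
```

Tracking the located patch position frame by frame yields the displacement time series from which the frequency-domain response is obtained. Because ZNCC is invariant to affine brightness changes, it tolerates the luminous complexities mentioned in the abstract.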
Robust selectivity to two-object images in human visual cortex
Agam, Yigal; Liu, Hesheng; Papanastassiou, Alexander; Buia, Calin; Golby, Alexandra J.; Madsen, Joseph R.; Kreiman, Gabriel
2010-01-01
SUMMARY We can recognize objects in a fraction of a second in spite of the presence of other objects [1–3]. The responses in macaque areas V4 and inferior temporal cortex [4–15] to a neuron’s preferred stimuli are typically suppressed by the addition of a second object within the receptive field (see however [16, 17]). How can this suppression be reconciled with rapid visual recognition in complex scenes? One option is that certain “special categories” are unaffected by other objects [18] but this leaves the problem unsolved for other categories. Another possibility is that serial attentional shifts help ameliorate the problem of distractor objects [19–21]. Yet, psychophysical studies [1–3], scalp recordings [1] and neurophysiological recordings [14, 16, 22–24], suggest that the initial sweep of visual processing contains a significant amount of information. We recorded intracranial field potentials in human visual cortex during presentation of flashes of two-object images. Visual selectivity from temporal cortex during the initial ~200 ms was largely robust to the presence of other objects. We could train linear decoders on the responses to isolated objects and decode information in two-object images. These observations are compatible with parallel, hierarchical and feed-forward theories of rapid visual recognition [25] and may provide a neural substrate to begin to unravel rapid recognition in natural scenes. PMID:20417105
Brain bases for auditory stimulus-driven figure-ground segregation.
Teki, Sundeep; Chait, Maria; Kumar, Sukhbinder; von Kriegstein, Katharina; Griffiths, Timothy D
2011-01-05
Auditory figure-ground segregation, listeners' ability to selectively hear out a sound of interest from a background of competing sounds, is a fundamental aspect of scene analysis. In contrast to the disordered acoustic environment we experience during everyday listening, most studies of auditory segregation have used relatively simple, temporally regular signals. We developed a new figure-ground stimulus that incorporates stochastic variation of the figure and background that captures the rich spectrotemporal complexity of natural acoustic scenes. Figure and background signals overlap in spectrotemporal space, but vary in the statistics of fluctuation, such that the only way to extract the figure is by integrating the patterns over time and frequency. Our behavioral results demonstrate that human listeners are remarkably sensitive to the appearance of such figures. In a functional magnetic resonance imaging experiment, aimed at investigating preattentive, stimulus-driven, auditory segregation mechanisms, naive subjects listened to these stimuli while performing an irrelevant task. Results demonstrate significant activations in the intraparietal sulcus (IPS) and the superior temporal sulcus related to bottom-up, stimulus-driven figure-ground decomposition. We did not observe any significant activation in the primary auditory cortex. Our results support a role for automatic, bottom-up mechanisms in the IPS in mediating stimulus-driven, auditory figure-ground segregation, which is consistent with accumulating evidence implicating the IPS in structuring sensory input and perceptual organization.
Taya, Shuichiro; Windridge, David; Osman, Magda
2012-01-01
Several studies have reported that task instructions influence eye-movement behavior during static image observation. In contrast, during dynamic scene observation we show that while the specificity of the goal of a task influences observers’ beliefs about where they look, the goal does not in turn influence eye-movement patterns. In our study observers watched short video clips of a single tennis match and were asked to make subjective judgments about the allocation of visual attention to the items presented in the clip (e.g., ball, players, court lines, and umpire). However, before attending to the clips, observers were either told to simply watch clips (non-specific goal), or they were told to watch the clips with a view to judging which of the two tennis players was awarded the point (specific goal). The results of subjective reports suggest that observers believed that they allocated their attention more to goal-related items (e.g. court lines) if they performed the goal-specific task. However, we did not find the effect of goal specificity on major eye-movement parameters (i.e., saccadic amplitudes, inter-saccadic intervals, and gaze coherence). We conclude that the specificity of a task goal can alter observer’s beliefs about their attention allocation strategy, but such task-driven meta-attentional modulation does not necessarily correlate with eye-movement behavior. PMID:22768058
Scene Recognition for Indoor Localization Using a Multi-Sensor Fusion Approach.
Liu, Mengyun; Chen, Ruizhi; Li, Deren; Chen, Yujin; Guo, Guangyi; Cao, Zhipeng; Pan, Yuanjin
2017-12-08
After decades of research, there is still no solution for indoor localization like the GNSS (Global Navigation Satellite System) solution for outdoor environments. The major reasons for this phenomenon are the complex spatial topology and RF transmission environment. To deal with these problems, an indoor scene constrained method for localization is proposed in this paper, which is inspired by the visual cognition ability of the human brain and the progress in the computer vision field regarding high-level image understanding. Furthermore, a multi-sensor fusion method is implemented on a commercial smartphone including cameras, WiFi and inertial sensors. Compared to former research, the camera on a smartphone is used to "see" which scene the user is in. With this information, a particle filter algorithm constrained by scene information is adopted to determine the final location. For indoor scene recognition, we take advantage of deep learning that has been proven to be highly effective in the computer vision community. For particle filter, both WiFi and magnetic field signals are used to update the weights of particles. Similar to other fingerprinting localization methods, there are two stages in the proposed system, offline training and online localization. In the offline stage, an indoor scene model is trained by Caffe (one of the most popular open source frameworks for deep learning) and a fingerprint database is constructed by user trajectories in different scenes. To reduce the volume requirement of training data for deep learning, a fine-tuned method is adopted for model training. In the online stage, a camera in a smartphone is used to recognize the initial scene. Then a particle filter algorithm is used to fuse the sensor data and determine the final location. To prove the effectiveness of the proposed method, an Android client and a web server are implemented. The Android client is used to collect data and locate a user. 
The web server is developed for indoor scene model training and communication with an Android client. To evaluate the performance, comparison experiments are conducted and the results demonstrate that a positioning accuracy of 1.32 m at 95% is achievable with the proposed solution. Both positioning accuracy and robustness are enhanced compared to approaches without scene constraint including commercial products such as IndoorAtlas.
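One weight-update step of the scene-constrained particle filter described above might look like the sketch below. The Gaussian likelihood model for WiFi and magnetic observations, the sigma values, and the grid-cell fingerprint layout are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def update_weights(particles, weights, wifi_obs, mag_obs, fingerprint,
                   sigma_wifi=6.0, sigma_mag=2.0, scene_mask=None):
    """One weight update of a scene-constrained particle filter (sketch).

    `particles` are (N, 2) grid positions; `fingerprint` maps each grid cell
    to expected WiFi RSS and magnetic-field values. Particles outside the
    scene recognised by the camera can be suppressed via `scene_mask`.
    """
    cells = particles.astype(int)
    exp_wifi = fingerprint["wifi"][cells[:, 0], cells[:, 1]]
    exp_mag = fingerprint["mag"][cells[:, 0], cells[:, 1]]
    lik = (np.exp(-0.5 * ((wifi_obs - exp_wifi) / sigma_wifi) ** 2)
           * np.exp(-0.5 * ((mag_obs - exp_mag) / sigma_mag) ** 2))
    if scene_mask is not None:   # zero out particles outside the current scene
        lik *= scene_mask[cells[:, 0], cells[:, 1]]
    w = weights * lik
    return w / w.sum()
```

Particles whose cells match both observations keep their weight; the rest collapse toward zero, which is how the fused WiFi and magnetic signals localise the user within the recognised scene.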
Scene-based nonuniformity correction algorithm based on interframe registration.
Zuo, Chao; Chen, Qian; Gu, Guohua; Sui, Xiubao
2011-06-01
In this paper, we present a simple and effective scene-based nonuniformity correction (NUC) method for infrared focal plane arrays based on interframe registration. This method estimates the global translation between two adjacent frames and minimizes the mean square error between the two properly registered images to make any two detectors with the same scene produce the same output value. In this way, the accumulation of the registration error can be avoided and the NUC can be achieved. The advantages of the proposed algorithm lie in its low computational complexity and storage requirements and its ability to capture temporal drifts in the nonuniformity parameters. The performance of the proposed technique is thoroughly studied with infrared image sequences with simulated nonuniformity and infrared imagery with real nonuniformity. It shows a significantly fast and reliable fixed-pattern noise reduction and obtains an effective frame-by-frame adaptive estimation of each detector's gain and offset.
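The core idea, pulling detectors that imaged the same scene point toward agreement once the global shift is known, can be sketched for the offset-only case. The symmetric LMS-style update and learning rate are illustrative assumptions; the paper also estimates gains and derives its exact minimisation differently.

```python
import numpy as np

def nuc_offset_update(frame1, frame2, shift, offset, lr=0.5):
    """One offset-only NUC step based on interframe registration (sketch).

    Assumes the global integer translation `shift` = (dy, dx) between the
    frames has already been estimated: frame1[y, x] and frame2[y-dy, x-dx]
    saw the same scene point, so their corrected outputs y + o should agree.
    """
    dy, dx = shift
    h, w = frame1.shape
    y0, y1 = max(0, dy), min(h, h + dy)
    x0, x1 = max(0, dx), min(w, w + dx)
    sl1 = (slice(y0, y1), slice(x0, x1))
    sl2 = (slice(y0 - dy, y1 - dy), slice(x0 - dx, x1 - dx))
    err = (frame1[sl1] + offset[sl1]) - (frame2[sl2] + offset[sl2])
    offset[sl1] -= 0.5 * lr * err   # symmetric update keeps the mean offset fixed
    offset[sl2] += 0.5 * lr * err
    return offset
```

Iterating this over a panning sequence chains every detector to its neighbours, so the fixed-pattern offsets equalise (up to a global constant) without the registration error accumulating.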
How high is visual short-term memory capacity for object layout?
Sanocki, Thomas; Sellers, Eric; Mittelstadt, Jeff; Sulman, Noah
2010-05-01
Previous research measuring visual short-term memory (VSTM) suggests that the capacity for representing the layout of objects is fairly high. In four experiments, we further explored the capacity of VSTM for layout of objects, using the change detection method. In Experiment 1, participants retained most of the elements in displays of 4 to 8 elements. In Experiments 2 and 3, with up to 20 elements, participants retained many of them, reaching a capacity of 13.4 stimulus elements. In Experiment 4, participants retained much of a complex naturalistic scene. In most cases, increasing display size caused only modest reductions in performance, consistent with the idea of configural, variable-resolution grouping. The results indicate that participants can retain a substantial amount of scene layout information (objects and locations) in short-term memory. We propose that this is a case of remote visual understanding, where observers' ability to integrate information from a scene is paramount.
Coordinate references for the indoor/outdoor seamless positioning
NASA Astrophysics Data System (ADS)
Ruan, Ling; Zhang, Ling; Long, Yi; Cheng, Fei
2018-05-01
Indoor positioning technologies are developing rapidly, and seamless positioning that connects indoor and outdoor space is a new trend. Indoor and outdoor positioning do not use the same coordinate system, and different indoor positioning scenes use different local coordinate reference systems. A specific and unified coordinate reference frame is needed as the spatial basis and premise of seamless positioning applications. Trajectory analysis integrating indoor and outdoor movement also requires a uniform coordinate reference. However, a coordinate reference frame for seamless positioning that can be applied to various complex scenarios has long been lacking. In this paper, we propose a universal coordinate reference frame for indoor/outdoor seamless positioning. The research focuses on analysing and classifying indoor positioning scenes, and puts forward methods for establishing the coordinate reference system and for coordinate transformation in each scene. The feasibility of the calibration method was verified through experiments.
The functional consequences of social distraction: Attention and memory for complex scenes.
Doherty, Brianna Ruth; Patai, Eva Zita; Duta, Mihaela; Nobre, Anna Christina; Scerif, Gaia
2017-01-01
Cognitive scientists have long proposed that social stimuli attract visual attention even when task irrelevant, but the consequences of this privileged status for memory are unknown. To address this, we combined computational approaches, eye-tracking methodology, and individual-differences measures. Participants searched for targets in scenes containing social or non-social distractors equated for low-level visual salience. Subsequent memory precision for target locations was tested. Individual differences in autistic traits and social anxiety were also measured. Eye-tracking revealed significantly more attentional capture to social compared to non-social distractors. Critically, memory precision for target locations was poorer for social scenes. This effect was moderated by social anxiety, with anxious individuals remembering target locations better under conditions of social distraction. These findings shed further light onto the privileged attentional status of social stimuli and its functional consequences on memory across individuals. Copyright © 2016. Published by Elsevier B.V.
Hannula, Deborah E.; Tranel, Daniel; Allen, John S.; Kirchhoff, Brenda A.; Nickel, Allison E.; Cohen, Neal J.
2014-01-01
Objective The objective of this study was to examine the dependence of item memory and relational memory on medial temporal lobe (MTL) structures. Patients with amnesia, who either had extensive MTL damage or damage that was relatively restricted to the hippocampus, were tested, as was a matched comparison group. Disproportionate relational memory impairments were predicted for both patient groups, and those with extensive MTL damage were also expected to have impaired item memory. Method Participants studied scenes, and were tested with interleaved two-alternative forced-choice probe trials. Probe trials were either presented immediately after the corresponding study trial (lag 1), five trials later (lag 5), or nine trials later (lag 9) and consisted of the studied scene along with a manipulated version of that scene in which one item was replaced with a different exemplar (item memory test) or was moved to a new location (relational memory test). Participants were to identify the exact match of the studied scene. Results As predicted, patients were disproportionately impaired on the test of relational memory. Item memory performance was marginally poorer among patients with extensive MTL damage, but both groups were impaired relative to matched comparison participants. Impaired performance was evident at all lags, including the shortest possible lag (lag 1). Conclusions The results are consistent with the proposed role of the hippocampus in relational memory binding and representation, even at short delays, and suggest that the hippocampus may also contribute to successful item memory when items are embedded in complex scenes. PMID:25068665
An optical systems analysis approach to image resampling
NASA Technical Reports Server (NTRS)
Lyon, Richard G.
1997-01-01
All types of image registration require some type of resampling, either during the registration or as a final step in the registration process. Thus the image(s) must be regridded into a spatially uniform, or angularly uniform, coordinate system with some pre-defined resolution. Frequently the final resolution is not the resolution at which the data were observed. The registration algorithm designer and end-product user are presented with a multitude of possible resampling methods, each of which modifies the spatial frequency content of the data in some way. The purpose of this paper is threefold: (1) to show how an imaging system modifies the scene, from an end-to-end optical systems analysis approach; (2) to develop a generalized resampling model; and (3) to apply the model empirically to simulated radiometric scene data and tabulate the results. A Hanning-windowed sinc interpolator method will be developed based upon the optical characterization of the system. It will be discussed in terms of the effects and limitations of sampling, aliasing, spectral leakage, and computational complexity. Simulated radiometric scene data will be used to demonstrate each of the algorithms. A high-resolution scene will be "grown" using a fractal growth algorithm based on mid-point recursion techniques. The resulting scene data will be convolved with a point spread function representing the optical response. The resultant scene will be convolved with the detection system's response and subsampled to the desired resolution. The resultant data product will subsequently be resampled to the correct grid using the Hanning-windowed sinc interpolator, and the results and errors tabulated and discussed.
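The windowed-sinc approach described in this abstract can be sketched in a few lines. The following is a minimal illustration for a 1-D uniformly sampled signal, not the paper's implementation; the function names and the default half-width are assumptions.

```python
import numpy as np

def hann_sinc_kernel(t, half_width):
    """Sinc interpolation kernel tapered to zero by a Hann window over +/- half_width samples."""
    taper = np.where(np.abs(t) < half_width,
                     0.5 * (1.0 + np.cos(np.pi * t / half_width)),
                     0.0)
    return np.sinc(t) * taper  # np.sinc(t) = sin(pi*t) / (pi*t)

def resample(samples, positions, half_width=4):
    """Evaluate a uniformly sampled 1-D signal at arbitrary positions (in sample units)."""
    n = np.arange(len(samples))
    return np.array([np.dot(hann_sinc_kernel(x - n, half_width), samples)
                     for x in positions])
```

Because the sinc is 1 at zero and 0 at the other integers, resampling at the original sample positions returns the original values exactly; the Hann taper limits the kernel's support, trading a little spectral leakage for bounded computational cost.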
Brain mechanisms underlying cue-based memorizing during free viewing of movie Memento.
Kauttonen, Janne; Hlushchuk, Yevhen; Jääskeläinen, Iiro P; Tikka, Pia
2018-05-15
How does the human brain recall and connect relevant memories with unfolding events? To study this, we presented 25 healthy subjects, during functional magnetic resonance imaging, the movie 'Memento' (director C. Nolan). In this movie, scenes are presented in chronologically reverse order with certain scenes briefly overlapping previously presented scenes. Such overlapping "key-frames" serve as effective memory cues for the viewers, prompting recall of relevant memories of the previously seen scene and connecting them with the concurrent scene. We hypothesized that these repeating key-frames serve as immediate recall cues and would facilitate reconstruction of the story piece-by-piece. The chronological version of Memento, shown in a separate experiment for another group of subjects, served as a control condition. Using multivariate event-related pattern analysis method and representational similarity analysis, focal fingerprint patterns of hemodynamic activity were found to emerge during presentation of key-frame scenes. This effect was present in higher-order cortical network with regions including precuneus, angular gyrus, cingulate gyrus, as well as lateral, superior, and middle frontal gyri within frontal poles. This network was right hemispheric dominant. These distributed patterns of brain activity appear to underlie ability to recall relevant memories and connect them with ongoing events, i.e., "what goes with what" in a complex story. Given the real-life likeness of cinematic experience, these results provide new insight into how the human brain recalls, given proper cues, relevant memories to facilitate understanding and prediction of everyday life events. Copyright © 2018 Elsevier Inc. All rights reserved.
See Josephus: viewing first-century sexual drama with Victorian eyes.
Goldhill, Simon
2009-01-01
This article looks first at how the art of J.W. Waterhouse responds to the classical world: how complex the scene of reception is, triangulated between artist, the ancient past, and his audiences, and extended over time. Second, it looks at how this scene of reception engages with a specific Victorian problematic about male sexuality and self-control. This is not just a question of Waterhouse using classics as an alibi for thinking about desire, but also of the interference of different models of desire and different knowledges of the classical world in the reception of the painting's narrative semantics.
Hyperspectral imaging simulation of object under sea-sky background
NASA Astrophysics Data System (ADS)
Wang, Biao; Lin, Jia-xuan; Gao, Wei; Yue, Hui
2016-10-01
Remote sensing image simulation plays an important role in spaceborne/airborne load demonstration and algorithm development. Hyperspectral imaging is valuable in marine monitoring and search and rescue. To meet the demand for spectral imaging of objects in complex sea scenes, a physics-based method for simulating the spectral image of an object under a sea scene is proposed. By developing an imaging simulation model that accounts for the object, background, atmospheric conditions, and sensor, it is possible to examine how changes in wind speed, atmospheric conditions, and other environmental factors affect spectral image quality under complex sea scenes. First, the sea scattering model is established based on the Phillips sea spectrum model, rough-surface scattering theory, and the volume scattering characteristics of water. Measured bidirectional reflectance distribution function (BRDF) data of objects are fit to a statistical model. MODTRAN software is used to obtain the solar illumination on the sea, the sky brightness, the atmospheric transmittance from sea to sensor, and the atmospheric backscattered radiance, and a Monte Carlo ray-tracing method is used to calculate the composite scattering of the sea-surface object and its spectral image. Finally, the object spectrum is obtained by spatial transformation, radiometric degradation, and the addition of noise. The model connects the spectral image with the environmental parameters, the object parameters, and the sensor parameters, providing a tool for load demonstration and algorithm development.
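The Phillips spectrum that the sea model builds on has a standard omnidirectional form widely used in ocean-surface synthesis. A minimal sketch follows; the constant `A` and the choice `L = U^2/g` for the largest wind-driven wave scale are conventional assumptions, not parameters taken from this paper.

```python
import numpy as np

def phillips_spectrum(k, wind_speed, A=0.0081, g=9.81):
    """Omnidirectional Phillips-type spectrum over wavenumber k (rad/m).

    L = U^2/g sets the largest wind-driven wave scale; the exponential
    suppresses waves longer than L, and the 1/k^4 tail suppresses short waves.
    """
    k = np.asarray(k, dtype=float)
    L = wind_speed ** 2 / g
    return A * np.exp(-1.0 / (k * L) ** 2) / k ** 4
```

In a full simulation such a spectrum is sampled on a wavenumber grid to generate random sea-surface height fields whose statistics depend on the wind speed, which is how wind enters the image-quality analysis described above.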
Real-time visual tracking of less textured three-dimensional objects on mobile platforms
NASA Astrophysics Data System (ADS)
Seo, Byung-Kuk; Park, Jungsik; Park, Hanhoon; Park, Jong-Il
2012-12-01
Natural feature-based approaches are still challenging for mobile applications (e.g., mobile augmented reality), because they are feasible only in limited environments such as highly textured and planar scenes/objects, and they need powerful mobile hardware for fast and reliable tracking. In many cases where conventional approaches are not effective, three-dimensional (3-D) knowledge of target scenes would be beneficial. We present a well-established framework for real-time visual tracking of less textured 3-D objects on mobile platforms. Our framework is based on model-based tracking that efficiently exploits partially known 3-D scene knowledge such as object models and a background's distinctive geometric or photometric knowledge. Moreover, we elaborate on implementation in order to make it suitable for real-time vision processing on mobile hardware. The performance of the framework is tested and evaluated on recent commercially available smartphones, and its feasibility is shown by real-time demonstrations.
Nature and place of crime scene management within forensic sciences.
Crispino, Frank
2008-03-01
This short paper presents the preliminary results of a recent study aimed at identifying, through an epistemological analysis, the parameters required to qualify forensic science as a science. The reader is invited to reflect upon references within a historical and logical framework which assert that forensic science is based upon two fundamental principles (those of Locard and Kirk). The assertion that forensic science is indeed a science should be appreciated not only against a single epistemological criterion (such as Popper's falsification, raised by the Daubert hearing), but also against the logical frameworks used by the individuals involved (investigator, expert witness and trier of fact) from the crime scene examination to the final interpretation of the evidence. Hence, it can be argued that the management of the crime scene should be integrated into the scientific way of thinking rather than remain a technical discipline, as recently suggested by Harrison.
NASA Technical Reports Server (NTRS)
Strahler, Alan H.; Li, Xiao-Wen; Jupp, David L. B.
1991-01-01
The bidirectional radiance or reflectance of a forest or woodland can be modeled using principles of geometric optics and Boolean models for random sets in three-dimensional space. This model may be defined at two levels. At the tree level, the scene includes four components: sunlit and shadowed canopy, and sunlit and shadowed background. The reflectance of the scene is modeled as the sum of the reflectances of the individual components, weighted by their areal proportions in the field of view. At the leaf level, the canopy envelope is an assemblage of leaves, and thus the reflectance is a function of the areal proportions of sunlit and shadowed leaf, and sunlit and shadowed background. Because the proportions of scene components depend upon the directions of irradiance and exitance, the model accounts for the hotspot that is well known in leaf and tree canopies.
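The component mixture in the preceding abstract reduces to an areal-proportion-weighted sum. A minimal sketch with hypothetical proportions and reflectances (illustrative values only, not data from the model):

```python
def scene_reflectance(proportions, reflectances):
    """Scene reflectance as the areal-proportion-weighted sum of component
    reflectances (e.g., sunlit canopy, shadowed canopy, sunlit background,
    shadowed background)."""
    assert abs(sum(proportions) - 1.0) < 1e-9, "areal proportions must sum to 1"
    return sum(p * r for p, r in zip(proportions, reflectances))
```

The directional behavior of the model comes entirely from how the four proportions change with the illumination and viewing directions; at the hotspot the shadowed proportions shrink toward zero, so the weighted sum is pulled toward the brighter sunlit reflectances.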
NASA Astrophysics Data System (ADS)
Appel, Marius; Lahn, Florian; Buytaert, Wouter; Pebesma, Edzer
2018-04-01
Earth observation (EO) datasets are commonly provided as collections of scenes, where individual scenes represent a temporal snapshot and cover a particular region on the Earth's surface. Using these data in complex spatiotemporal modeling becomes difficult as soon as data volumes exceed a certain capacity or analyses include many scenes, which may spatially overlap and may have been recorded at different dates. In order to facilitate analytics on large EO datasets, we combine and extend the geospatial data abstraction library (GDAL) and the array-based data management and analytics system SciDB. We present an approach to automatically convert collections of scenes to multidimensional arrays and use SciDB to scale computationally intensive analytics. We evaluate the approach in three study cases on national scale land use change monitoring with Landsat imagery, global empirical orthogonal function analysis of daily precipitation, and combining historical climate model projections with satellite-based observations. Results indicate that the approach can be used to represent various EO datasets and that analyses in SciDB scale well with available computational resources. To simplify analyses of higher-dimensional datasets as from climate model output, however, a generalization of the GDAL data model might be needed. All parts of this work have been implemented as open-source software and we discuss how this may facilitate open and reproducible EO analyses.
How Beauty Determines Gaze! Facial Attractiveness and Gaze Duration in Images of Real World Scenes
Mitrovic, Aleksandra; Goller, Jürgen
2016-01-01
We showed that the time spent looking at faces is a valid covariate of beauty by testing the relation between facial attractiveness and gaze behavior. We presented natural scenes that always pictured two people, encompassing a wide range of facial attractiveness. Employing measurements of eye movements in a free-viewing paradigm, we found a linear relation between facial attractiveness and gaze behavior: the more attractive the face, the longer and the more often it was looked at. In line with evolutionary approaches, the positive relation was particularly pronounced when participants viewed other-sex faces. PMID:27698984
Brédart, Serge; Cornet, Alyssa; Rakic, Jean-Marie
2014-01-01
Color-deficient (dichromat) and normal observers' recognition memory for colored and black-and-white natural scenes was evaluated through several parameters: the rate of recognition, discrimination (A'), response bias (B"D), response confidence, and the proportion of conscious recollections (Remember responses) among hits. At the encoding phase, 36 images of natural scenes were each presented for 1 sec. Half of the images were shown in color and half in black-and-white. At the recognition phase, these 36 pictures were intermixed with 36 new images. The participants' task was to indicate whether or not an image had been presented at the encoding phase, to rate their level of confidence in this response, and, in the case of a positive response, to classify it as a Remember, a Know or a Guess response. Results indicated that accuracy, response discrimination, response bias and confidence ratings were higher for colored than for black-and-white images; this advantage for colored images was similar in both groups of participants. Rates of Remember responses were not higher for colored images than for black-and-white ones in either group. However, interestingly, Remember responses were significantly more often based on color information for colored than for black-and-white images in normal observers only, not in dichromats.
Einhäuser, Wolfgang; Nuthmann, Antje
2016-09-01
During natural scene viewing, humans typically attend and fixate selected locations for about 200-400 ms. Two variables characterize such "overt" attention: the probability of a location being fixated, and the fixation's duration. Both variables have been widely researched, but little is known about their relation. We use a two-step approach to investigate the relation between fixation probability and duration. In the first step, we use a large corpus of fixation data. We demonstrate that fixation probability (empirical salience) predicts fixation duration across different observers and tasks. Linear mixed-effects modeling shows that this relation is explained neither by joint dependencies on simple image features (luminance, contrast, edge density) nor by spatial biases (central bias). In the second step, we experimentally manipulate some of these features. We find that fixation probability from the corpus data still predicts fixation duration for this new set of experimental data. This holds even if stimuli are deprived of low-level images features, as long as higher level scene structure remains intact. Together, this shows a robust relation between fixation duration and probability, which does not depend on simple image features. Moreover, the study exemplifies the combination of empirical research on a large corpus of data with targeted experimental manipulations.
The natural statistics of blur
Sprague, William W.; Cooper, Emily A.; Reissier, Sylvain; Yellapragada, Baladitya; Banks, Martin S.
2016-01-01
Blur from defocus can be both useful and detrimental for visual perception: It can be useful as a source of depth information and detrimental because it degrades image quality. We examined these aspects of blur by measuring the natural statistics of defocus blur across the visual field. Participants wore an eye-and-scene tracker that measured gaze direction, pupil diameter, and scene distances as they performed everyday tasks. We found that blur magnitude increases with increasing eccentricity. There is a vertical gradient in the distances that generate defocus blur: Blur below the fovea is generally due to scene points nearer than fixation; blur above the fovea is mostly due to points farther than fixation. There is no systematic horizontal gradient. Large blurs are generally caused by points farther rather than nearer than fixation. Consistent with the statistics, participants in a perceptual experiment perceived vertical blur gradients as slanted top-back whereas horizontal gradients were perceived equally as left-back and right-back. The tendency for people to see sharp as near and blurred as far is also consistent with the observed statistics. We calculated how many observations will be perceived as unsharp and found that perceptible blur is rare. Finally, we found that eye shape in ground-dwelling animals conforms to that required to put likely distances in best focus. PMID:27580043
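The defocus blur measured in the study above is commonly approximated by a thin-lens, small-angle relation: the blur circle's angular diameter equals the pupil diameter times the dioptric difference between the fixated distance and the point's distance. A sketch under that assumption (not the authors' exact pipeline; the function name is ours):

```python
def defocus_blur_rad(pupil_diameter_m, fixation_dist_m, point_dist_m):
    """Angular diameter (radians) of the blur circle for a scene point at
    point_dist_m while the eye is focused at fixation_dist_m: pupil diameter
    times the dioptric difference (thin-lens, small-angle approximation)."""
    return pupil_diameter_m * abs(1.0 / fixation_dist_m - 1.0 / point_dist_m)
```

The reciprocal-distance form explains the vertical gradient reported above: for a fixed dioptric step, points nearer than fixation produce the same blur as much more distant points farther than fixation.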
Temporal dynamics of motor cortex excitability during perception of natural emotional scenes
Borgomaneri, Sara; Gazzola, Valeria
2014-01-01
Although it is widely assumed that emotions prime the body for action, the effects of visual perception of natural emotional scenes on the temporal dynamics of the human motor system have scarcely been investigated. Here, we used single-pulse transcranial magnetic stimulation (TMS) to assess motor excitability during observation and categorization of positive, neutral and negative pictures from the International Affective Picture System database. Motor-evoked potentials (MEPs) from TMS of the left motor cortex were recorded from hand muscles, at 150 and 300 ms after picture onset. In the early temporal condition we found an increase in hand motor excitability that was specific for the perception of negative pictures. This early negative bias was predicted by interindividual differences in the disposition to experience aversive feelings (personal distress) in interpersonal emotional contexts. In the later temporal condition, we found that MEPs were similarly increased for both positive and negative pictures, suggesting an increased reactivity to emotionally arousing scenes. By highlighting the temporal course of motor excitability during perception of emotional pictures, our study provides direct neurophysiological support for the evolutionary notions that emotion perception is closely linked to action systems and that emotionally negative events require motor reactions to be more urgently mobilized. PMID:23945998
Edge co-occurrences can account for rapid categorization of natural versus animal images
NASA Astrophysics Data System (ADS)
Perrinet, Laurent U.; Bednar, James A.
2015-06-01
Making a judgment about the semantic category of a visual scene, such as whether it contains an animal, is typically assumed to involve high-level associative brain areas. Previous explanations require progressively analyzing the scene hierarchically at increasing levels of abstraction, from edge extraction to mid-level object recognition and then object categorization. Here we show that the statistics of edge co-occurrences alone are sufficient to perform a rough yet robust (translation, scale, and rotation invariant) scene categorization. We first extracted the edges from images using a scale-space analysis coupled with a sparse coding algorithm. We then computed the “association field” for different categories (natural, man-made, or containing an animal) by computing the statistics of edge co-occurrences. These differed strongly, with animal images having more curved configurations. We show that this geometry alone is sufficient for categorization, and that the pattern of errors made by humans is consistent with this procedure. Because these statistics could be measured as early as the primary visual cortex, the results challenge widely held assumptions about the flow of computations in the visual system. The results also suggest new algorithms for image classification and signal processing that exploit correlations between low-level structure and the underlying semantic category.
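A simplified version of the co-occurrence statistic in the abstract above can be computed directly from a list of edge elements. The sketch below bins only the relative orientation of edge pairs; the full "association field" of the paper also bins relative position and scale, so this is an illustrative reduction, not the authors' method.

```python
import numpy as np

def relative_orientation_histogram(edges, n_bins=8):
    """Normalized histogram of pairwise edge-orientation differences (mod pi).

    `edges` is an (N, 3) array-like of (x, y, theta) edge elements. This is a
    simplified co-occurrence statistic: curved image content (as in animal
    images) shifts mass away from the zero-difference (collinear) bin.
    """
    theta = np.asarray(edges, dtype=float)[:, 2]
    i, j = np.triu_indices(len(theta), k=1)      # all unordered pairs
    dtheta = np.mod(theta[i] - theta[j], np.pi)  # orientation difference mod pi
    hist, _ = np.histogram(dtheta, bins=n_bins, range=(0.0, np.pi))
    return hist / hist.sum()
```

Category-level histograms of this kind can then be compared (e.g., by a likelihood ratio) to classify a new image as natural, man-made, or animal-containing.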
Using Persistent Scatterers Interferometry to create a subsidence map of the Nile Delta in Egypt
NASA Astrophysics Data System (ADS)
Bouali, E. Y.; Sultan, M.; Becker, R.; Cherif, O.
2013-12-01
Inhabitants of the Nile Delta in Egypt, especially those who live around the coast, are threatened by two perpetual hazards: (1) sea level rise and encroachment from the Mediterranean Sea and (2) land subsidence that is inherent in deltaic environments. With cities like Alexandria and Port Said currently only one meter above sea level, it is important to understand the nature of the sea level rise and land subsidence, both spatially and temporally, and to be able to quantify these hazards. The magnitude of sea level rise has been actively monitored in stations across the Mediterranean Sea; the subsidence of the Nile Delta as a whole system, however, has not been adequately quantified. We have employed the Differential Synthetic Aperture Radar Interferometry (DInSAR) technique known as Persistent Scatterers Interferometry (PSI) across the entire northern parts of the Nile Delta. A dataset of 106 ENVISAT single look complex (SLC) scenes (four descending tracks: 164, 207, 436, and 479) acquired throughout the time period 2003 to 2010 was obtained from the European Space Agency and utilized for radar interferometric purposes. Multiple combinations of these scenes - used for output optimization and validation - were processed. Due to the nature of the PSI technique, subsidence rates calculated using this technique are values measured from cities and urban areas - where PSI works well. The methodology of choice is to calculate the subsidence rates on a city-by-city basis by: (1) choosing an urban area and cutting the SLC scene stack down to a small area (25-200 km2); (2) processing this area multiple times using different scene and parameter combinations in order to best optimize persistent scatterer (PS) abundance and ground displacement measurements; and (3) calibrating the relative ground motion measured by PSI to known locations of minimal subsidence rates. The final result is a spatial representation of the subsidence rates across the Nile Delta in Egypt.
Measured subsidence rates vary widely across the Nile Delta, with the highest rates occurring in cities near the mouth of the Damietta branch of the Nile River and around the Mansala Lagoon, such as Ras El Bar (up to 15 mm/year), Damietta (up to 10 mm/year), and Port Said (up to 7 mm/year). The spatial complexity of these subsidence rates is evident: many cities display a wide range of rates. In Port Said, for example, where the majority of the city is undergoing minimal to no subsidence (< 1 mm/year), there are two regions - near the Mediterranean coast and near the Mansala Lagoon - where subsidence rates are quite high (5-7 mm/year). There are also a few overall trends observed across the delta: (1) subsidence rates are greater in the northeast region of the delta (average: > 5 mm/year) than anywhere else (e.g., average western subsidence: 1-4 mm/year), and (2) cities more proximal to the Mediterranean coast generally exhibit greater subsidence rates (average subsidence rates: Ras El Bar: 8 mm/year, Port Said: 5 mm/year, and Damietta: 6 mm/year) than cities in the middle (e.g., Mansoura and Al Mahallah: 4 mm/year) or southern regions (e.g., Tanta: < 4 mm/year) of the delta.
Uncomfortable images in art and nature.
Fernandez, Dominic; Wilkins, Arnold J
2008-01-01
The ratings of discomfort from a wide variety of images can be predicted from the energy at different spatial scales in the image, as measured by the Fourier amplitude spectrum of the luminance. Whereas comfortable images show the regression of Fourier amplitude against spatial frequency common in natural scenes, uncomfortable images show a regression with disproportionately greater amplitude at spatial frequencies within two octaves of 3 cycles deg(-1). In six studies, the amplitude in this spatial frequency range relative to that elsewhere in the spectrum explains variance in judgments of discomfort from art, from images constructed from filtered noise, and from art in which the phase or amplitude spectra have been altered. Striped patterns with spatial frequency within the above range are known to be uncomfortable and capable of provoking headaches and seizures in susceptible persons. The present findings show for the first time that, even in more complex images, the energy in this spatial-frequency range is associated with aversion. We propose a simple measurement that can predict aversion to those works of art that have reached the national media because of negative public reaction.
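The predictor described above (Fourier amplitude within two octaves of 3 cycles/deg, relative to the rest of the spectrum) can be approximated as follows. This is a rough sketch, not the authors' measurement; the `pixels_per_degree` mapping from pixel frequencies to cycles/degree is an assumed viewing-geometry parameter.

```python
import numpy as np

def band_amplitude_ratio(img, pixels_per_degree, center_cpd=3.0, octaves=2.0):
    """Fraction of total Fourier amplitude (DC excluded) lying within
    +/- `octaves` octaves of `center_cpd` cycles/degree: a rough proxy
    for the discomfort predictor described above."""
    amp = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    h, w = img.shape
    fy = np.fft.fftshift(np.fft.fftfreq(h))   # vertical frequency, cycles/pixel
    fx = np.fft.fftshift(np.fft.fftfreq(w))   # horizontal frequency, cycles/pixel
    FY, FX = np.meshgrid(fy, fx, indexing="ij")
    r = np.hypot(FY, FX) * pixels_per_degree  # radial frequency, cycles/degree
    band = (r >= center_cpd / 2 ** octaves) & (r <= center_cpd * 2 ** octaves)
    return amp[band].sum() / amp[r > 0].sum()
```

Under the paper's account, images with disproportionately high values of this ratio (relative to the 1/f fall-off typical of natural scenes) should be judged more uncomfortable.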
Development of Moire machine vision
NASA Technical Reports Server (NTRS)
Harding, Kevin G.
1987-01-01
Three-dimensional perception is essential to the development of versatile robotics systems, both for handling complex manufacturing tasks in future factories and for providing the high-accuracy measurements needed in flexible manufacturing and quality control. A program is described which will develop the potential of Moire techniques to provide this capability in vision systems and automated measurements, and demonstrate artificial intelligence (AI) techniques that take advantage of the strengths of Moire sensing. Moire techniques provide a means of optically manipulating the complex visual data in a three-dimensional scene into a form which can be easily and quickly analyzed by computers. This type of optical data manipulation provides high productivity through integrated automation, producing a high-quality product while reducing computer and mechanical manipulation requirements and thereby the cost and time of production. This nondestructive evaluation technique is developed to perform full-field range measurement and three-dimensional scene analysis.
The Effects of Similarity on High-Level Visual Working Memory Processing.
Yang, Li; Mo, Lei
2017-01-01
Similarity has been observed to have opposite effects on visual working memory (VWM) for complex images. How can these discrepant results be reconciled? To answer this question, we used a change-detection paradigm to test visual working memory performance for multiple real-world objects. We found that working memory for moderate-similarity items was worse than that for either high- or low-similarity items. This pattern was unaffected by manipulations of stimulus type (faces vs. scenes), encoding duration (limited vs. self-paced), and presentation format (simultaneous vs. sequential). We also found that the similarity effects differed in strength across categories (scenes vs. faces). These results suggest that complex real-world objects are represented using a centre-surround inhibition organization. These results support the category-specific cortical resource theory and further suggest that centre-surround inhibition organization may differ by category.
Robust position estimation of a mobile vehicle
NASA Astrophysics Data System (ADS)
Conan, Vania; Boulanger, Pierre; Elgazzar, Shadia
1994-11-01
The ability to estimate the position of a mobile vehicle is a key task for navigation over large distances in complex indoor environments such as nuclear power plants. Schematics of the plants are available, but they are incomplete, as real settings contain many objects, such as pipes, cables or furniture, that mask part of the model. The position estimation method described in this paper matches 3-D data with a simple schematic of a plant. It is basically independent of odometry information and viewpoint, robust to noisy data and spurious points and largely insensitive to occlusions. The method is based on a hypothesis/verification paradigm and its complexity is polynomial: it runs in O(m^4 n^4), where m represents the number of model patches and n the number of scene patches. Heuristics are presented to speed up the algorithm. Results on real 3-D data show good behavior even when the scene is very occluded.
3-D Scene Reconstruction from Aerial Imagery
2012-03-01
educational, and recreational uses. Professionals can quickly determine optimal sites for natural resource exploration and potential development... sites. Educators can explore natural landmarks and experience foreign countries with their students, while recreational users can be immediately... sites can forgo an intensive process of erecting a grid and documenting the placement of all finds, and medical surgeons can gauge depth while
ERIC Educational Resources Information Center
Cakmakci, Gultekin
2017-01-01
This study used video vignettes of historical episodes from documentary films as a context and instructional tool to promote pre-service science teachers' (PSTs) conceptions of the nature of science (NOS). The participants received explicit-reflective NOS instruction, and were introduced to techniques to be able to use scenes from documentary…
Theory-based Bayesian Models of Inductive Inference
2010-07-19
Subjective randomness and natural scene statistics. Psychonomic Bulletin & Review. http://cocosci.berkeley.edu/tom/papers/randscenes.pdf ... (in press). Exemplar models as a mechanism for performing Bayesian inference. Psychonomic Bulletin & Review. http://cocosci.berkeley.edu/tom
A gaze-contingent display to study contrast sensitivity under natural viewing conditions
NASA Astrophysics Data System (ADS)
Dorr, Michael; Bex, Peter J.
2011-03-01
Contrast sensitivity has been extensively studied over the last decades and there are well-established models of early vision that were derived by presenting the visual system with synthetic stimuli such as sine-wave gratings near threshold contrasts. Natural scenes, however, contain a much wider distribution of orientations, spatial frequencies, and both luminance and contrast values. Furthermore, humans typically move their eyes two to three times per second under natural viewing conditions, but most laboratory experiments require subjects to maintain central fixation. We here describe a gaze-contingent display capable of performing real-time contrast modulations of video in retinal coordinates, thus allowing us to study contrast sensitivity when dynamically viewing dynamic scenes. Our system is based on a Laplacian pyramid for each frame that efficiently represents individual frequency bands. Each output pixel is then computed as a locally weighted sum of pyramid levels to introduce local contrast changes as a function of gaze. Our GPU implementation achieves real-time performance with more than 100 fps on high-resolution video (1920 by 1080 pixels) and a synthesis latency of only 1.5ms. Psychophysical data show that contrast sensitivity is greatly decreased in natural videos and under dynamic viewing conditions. Synthetic stimuli therefore only poorly characterize natural vision.
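The per-frame synthesis described above can be illustrated with a minimal Laplacian pyramid built by 2x2 block-mean downsampling. With unit weights the reconstruction is exact; a gaze-contingent system would instead make the weights vary per pixel as a function of gaze. This is a sketch under those assumptions, not the authors' GPU implementation.

```python
import numpy as np

def build_laplacian_pyramid(img, levels):
    """Laplacian pyramid via 2x2 block-mean downsampling; each level stores
    the detail lost by downsampling, so unit-weight synthesis is exact."""
    pyr = []
    g = img.astype(float)
    for _ in range(levels):
        h, w = g.shape
        down = g.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        up = np.repeat(np.repeat(down, 2, axis=0), 2, axis=1)
        pyr.append(g - up)   # band-pass detail at this scale
        g = down
    pyr.append(g)            # low-pass residual
    return pyr

def synthesize(pyr, weights):
    """Recombine the pyramid with per-band gains; a gaze-contingent display
    would make `weights` spatially varying around the point of gaze."""
    g = pyr[-1] * weights[-1]
    for lap, w in zip(reversed(pyr[:-1]), reversed(weights[:-1])):
        g = np.repeat(np.repeat(g, 2, axis=0), 2, axis=1) + w * lap
    return g
```

Attenuating a single band's weight reduces contrast only in that spatial-frequency range, which is the manipulation the psychophysical experiments rely on.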
Kremkow, Jens; Perrinet, Laurent U.; Monier, Cyril; Alonso, Jose-Manuel; Aertsen, Ad; Frégnac, Yves; Masson, Guillaume S.
2016-01-01
Neurons in the primary visual cortex are known for responding vigorously but with high variability to classical stimuli such as drifting bars or gratings. By contrast, natural scenes are encoded more efficiently by sparse and temporally precise spiking responses. We used a conductance-based model of the visual system in higher mammals to investigate how two specific features of the thalamo-cortical pathway, namely push-pull receptive field organization and fast synaptic depression, can contribute to this contextual reshaping of V1 responses. By comparing cortical dynamics evoked respectively by natural vs. artificial stimuli in a comprehensive parametric space analysis, we demonstrate that the reliability and sparseness of the spiking responses during natural vision is not a mere consequence of the increased bandwidth in the sensory input spectrum. Rather, it results from the combined impacts of fast synaptic depression and push-pull inhibition, the latter acting for natural scenes as a form of “effective” feed-forward inhibition as demonstrated in other sensory systems. Thus, the combination of feedforward-like inhibition with fast thalamo-cortical synaptic depression by simple cells receiving a direct structured input from thalamus composes a generic computational mechanism for generating a sparse and reliable encoding of natural sensory events. PMID:27242445
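The "fast synaptic depression" invoked here is commonly captured by the Tsodyks-Markram phenomenological model, in which each presynaptic spike consumes a fraction U of a synaptic resource that recovers with time constant tau_rec. A minimal sketch under that assumption (parameter values are illustrative, not taken from the paper):

```python
import numpy as np

def depressing_synapse(spike_times, U=0.5, tau_rec=0.2):
    """Tsodyks-Markram-style depression: returns the effective
    efficacy U*x at each spike. The resource x recovers toward 1
    with time constant tau_rec (seconds) between spikes."""
    x, last_t, efficacies = 1.0, None, []
    for t in spike_times:
        if last_t is not None:
            dt = t - last_t
            x = 1.0 - (1.0 - x) * np.exp(-dt / tau_rec)  # recovery
        efficacies.append(U * x)
        x -= U * x          # a fraction U of the resource is consumed
        last_t = t
    return efficacies
```

A high-frequency spike train yields steadily decreasing efficacies, while widely spaced spikes see a fully recovered synapse: the frequency-dependent gain control that the abstract argues reshapes responses to natural input.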
Modeling Images of Natural 3D Surfaces: Overview and Potential Applications
NASA Technical Reports Server (NTRS)
Jalobeanu, Andre; Kuehnel, Frank; Stutz, John
2004-01-01
Generative models of natural images have long been used in computer vision. However, since they only describe the appearance of 2D scenes, they fail to capture all the properties of the underlying 3D world. Even though such models are sufficient for many vision tasks, a 3D scene model is needed when it comes to inferring a 3D object or its characteristics. In this paper, we present such a generative model, incorporating both a multiscale surface prior model for surface geometry and reflectance, and an image formation process model based on realistic rendering. We focus on the computation of the posterior model parameter densities and on the critical aspects of the rendering. We also show how to efficiently invert the model within a Bayesian framework. We present a few potential applications, such as asteroid modeling and planetary topography recovery, illustrated by promising results on real images.
NASA Astrophysics Data System (ADS)
White, Brian J.; Berg, David J.; Kan, Janis Y.; Marino, Robert A.; Itti, Laurent; Munoz, Douglas P.
2017-01-01
Models of visual attention postulate the existence of a saliency map whose function is to guide attention and gaze to the most conspicuous regions in a visual scene. Although cortical representations of saliency have been reported, there is mounting evidence for a subcortical saliency mechanism, which pre-dates the evolution of neocortex. Here, we conduct a strong test of the saliency hypothesis by comparing the output of a well-established computational saliency model with the activation of neurons in the primate superior colliculus (SC), a midbrain structure associated with attention and gaze, while monkeys watched video of natural scenes. We find that the activity of SC superficial visual-layer neurons (SCs), specifically, is well-predicted by the model. This saliency representation is unlikely to be inherited from fronto-parietal cortices, which do not project to SCs, but may be computed in SCs and relayed to other areas via tectothalamic pathways.
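The computational saliency model referred to is of the Itti-Koch family, whose core operation is centre-surround contrast. A minimal single-channel sketch, assuming an intensity channel only and box filters in place of the model's Gaussian pyramids (function names and parameters are illustrative):

```python
import numpy as np

def box_blur(img, r):
    # Box filter of radius r via an integral image (edge-clamped borders).
    pad = np.pad(img, r, mode="edge")
    c = np.cumsum(np.cumsum(pad, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))       # zero row/col for window sums
    n = 2 * r + 1
    H, W = img.shape
    return (c[n:n+H, n:n+W] - c[:H, n:n+W]
            - c[n:n+H, :W] + c[:H, :W]) / n**2

def center_surround_saliency(img, r_center=2, r_surround=8):
    # Saliency as |fine-scale local mean - coarse-scale local mean|,
    # normalized to [0, 1].
    sal = np.abs(box_blur(img, r_center) - box_blur(img, r_surround))
    return sal / sal.max() if sal.max() > 0 else sal
```

An isolated bright element on a uniform background, the textbook pop-out case, produces the strongest response at its location.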
Terrain detection and classification using single polarization SAR
Chow, James G.; Koch, Mark W.
2016-01-19
The various technologies presented herein relate to identifying manmade and/or natural features in a radar image. Two radar images (e.g., single polarization SAR images) can be captured for a common scene. The first image is captured at a first instance and the second image is captured at a second instance, whereby the interval between the captures is long enough that temporal decorrelation occurs for natural surfaces in the scene, and only manmade surfaces, e.g., a road, produce correlated pixels. A LCCD image comprising the correlated and decorrelated pixels can be generated from the two radar images. A median image can be generated from a plurality of radar images, whereby any features in the median image can be identified. A superpixel operation can be performed on the LCCD image and the median image, thereby enabling a feature(s) in the LCCD image to be classified.
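The correlation step can be sketched as a windowed sample correlation between the two co-registered images. This is a deliberate simplification: it treats the images as real-valued and ignores the complex SAR phase and the LCCD/superpixel machinery described in the patent.

```python
import numpy as np

def local_correlation(a, b, r=2):
    """Windowed Pearson correlation between two co-registered images.
    High values indicate temporally stable (correlated) surfaces;
    low values indicate decorrelation, e.g. natural surfaces."""
    H, W = a.shape
    out = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            y0, y1 = max(0, y - r), min(H, y + r + 1)
            x0, x1 = max(0, x - r), min(W, x + r + 1)
            wa = a[y0:y1, x0:x1].ravel()
            wb = b[y0:y1, x0:x1].ravel()
            wa, wb = wa - wa.mean(), wb - wb.mean()
            denom = np.sqrt((wa**2).sum() * (wb**2).sum())
            out[y, x] = (wa @ wb) / denom if denom > 0 else 0.0
    return out
```

Regions where the second image repeats the first score near 1; regions that changed between acquisitions score near 0, which is the contrast the patent exploits to separate manmade from natural surfaces.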
Frontal–Occipital Connectivity During Visual Search
Pantazatos, Spiro P.; Yanagihara, Ted K.; Zhang, Xian; Meitzler, Thomas
2012-01-01
Abstract Although expectation- and attention-related interactions between ventral and medial prefrontal cortex and stimulus category-selective visual regions have been identified during visual detection and discrimination, it is not known if similar neural mechanisms apply to other tasks such as visual search. The current work tested the hypothesis that high-level frontal regions, previously implicated in expectation and visual imagery of object categories, interact with visual regions associated with object recognition during visual search. Using functional magnetic resonance imaging, subjects searched for a specific object that varied in size and location within a complex natural scene. A model-free, spatial-independent component analysis isolated multiple task-related components, one of which included visual cortex, as well as a cluster within ventromedial prefrontal cortex (vmPFC), consistent with the engagement of both top-down and bottom-up processes. Analyses of psychophysiological interactions showed increased functional connectivity between vmPFC and object-sensitive lateral occipital cortex (LOC), and results from dynamic causal modeling and Bayesian Model Selection suggested bidirectional connections between vmPFC and LOC that were positively modulated by the task. Using image-guided diffusion-tensor imaging, functionally seeded, probabilistic white-matter tracts between vmPFC and LOC, which presumably underlie this effective interconnectivity, were also observed. These connectivity findings extend previous models of visual search processes to include specific frontal–occipital neuronal interactions during a natural and complex search task. PMID:22708993
SAR backscatter from coniferous forest gaps
NASA Technical Reports Server (NTRS)
Day, John L.; Davis, Frank W.
1992-01-01
A study is in progress comparing Airborne Synthetic Aperture Radar (AIRSAR) backscatter from coniferous forest plots containing gaps to backscatter from adjacent gap-free plots. The issues discussed are how gaps in the range of 400 to 1600 sq m (approximately 4-14 pixels at intermediate incidence angles) affect forest backscatter statistics, and which incidence angles, wavelengths, and polarizations are most sensitive to forest gaps. In order to visualize the slant-range imaging of forest and gaps, a simple conceptual model is used. This strictly qualitative model has led us to hypothesize that forest radar returns at short wavelengths (e.g., C-band) and large incidence angles (e.g., 50 deg) should be most affected by the presence of gaps, whereas returns at long wavelengths and small angles should be least affected. Preliminary analysis of 1989 AIRSAR data from forest near Mt. Shasta supports the hypothesis. Current forest backscatter models such as MIMICS and the Santa Barbara Discontinuous Canopy Backscatter Model have in several cases correctly predicted backscatter from forest stands based on inputs of measured or estimated forest parameters. These models do not, however, predict within-stand SAR scene texture, or 'intrinsic scene variability' as Ulaby et al. have referred to it. For instance, the Santa Barbara model, which may be the most spatially coupled of the existing models, is not truly spatial. Tree locations within a simulated pixel are distributed according to a Poisson process, as they are in many natural forests, but tree size is unrelated to location, which is not the case in nature. Furthermore, since pixels of a simulated stand are generated independently in the Santa Barbara model, spatial processes larger than one pixel are not modeled. Using a different approach, Oliver modeled scene texture based on a hypothetical forest geometry. His simulated scenes do not agree well with SAR data, perhaps due to the simple geometric model used.
Insofar as texture is the expression of biological forest processes, such as succession and disease, and physical ones, such as fire and wind-throw, it contains useful information about the forest, and has value in image interpretation and classification. Forest gaps are undoubtedly important contributors to scene variance. By studying the localized effects of gaps on forest backscatter, guided by our qualitative model, we hope to understand more clearly the manner in which spatial heterogeneities in forests produce variations in backscatter, which collectively give rise to scene texture.
Zhao, Hui-Jie; Jiang, Cheng; Jia, Guo-Rui
2014-01-01
Adjacency effects may introduce errors into quantitative applications of hyperspectral remote sensing; a significant contributor is the earth-atmosphere coupling radiance. Moreover, surrounding relief and shadow induce strong changes in hyperspectral images acquired over rugged terrain, so the spectral characteristics are not described accurately, and the radiative coupling process between the earth and the atmosphere is more complex over rugged scenes. To meet the requirements of real-time processing in data simulation, an equivalent background reflectance was developed that takes into account the topography and the geometry between surroundings and targets, based on the radiative transfer process. The contributions of the coupling to the signal at sensor level were then evaluated. This approach was integrated into a sensor-level radiance simulation model and validated by simulating a set of actual radiance data. The results show that the visual effect of the simulated images is consistent with that of the observed images, and that spectral similarity is improved over rugged scenes. In addition, model precision is maintained at the same level over flat scenes.
Irsik, Vanessa C; Vanden Bosch der Nederlanden, Christina M; Snyder, Joel S
2016-11-01
Attention and other processing constraints limit the perception of objects in complex scenes, which has been studied extensively in the visual sense. We used a change deafness paradigm to examine how attention to particular objects helps and hurts the ability to notice changes within complex auditory scenes. In a counterbalanced design, we examined how cueing attention to particular objects affected performance in an auditory change-detection task through the use of valid or invalid cues and trials without cues (Experiment 1). We further examined how successful encoding predicted change-detection performance using an object-encoding task, and we addressed whether performing the object-encoding task along with the change-detection task affected performance overall (Experiment 2). Participants made more errors on invalid than on valid and uncued trials, but this effect was reduced in Experiment 2 compared to Experiment 1. When the object-encoding task was present, listeners who completed the uncued condition first made fewer errors overall than those who completed the cued condition first. All participants showed less change deafness when they successfully encoded change-relevant compared to irrelevant objects during valid and uncued trials. However, only participants who completed the uncued condition first also showed this effect during invalid cue trials, suggesting a broader scope of attention. These findings provide converging evidence that attention to change-relevant objects is crucial for successful detection of acoustic changes and that encouraging broad attention to multiple objects is the best way to reduce change deafness.
Active polarization descattering.
Treibitz, Tali; Schechner, Yoav Y
2009-03-01
Vision in scattering media is important but challenging. Images suffer from poor visibility due to backscattering and attenuation. Most prior methods for scene recovery use active illumination scanners (structured and gated), which can be slow and cumbersome, while natural illumination is inapplicable to dark environments. The current paper addresses the need for a non-scanning recovery method that uses active scene irradiance. We study the formation of images under widefield artificial illumination. Based on the formation model, the paper presents an approach for recovering the object signal. It also yields rough information about the 3D scene structure. The approach can work with compact, simple hardware, having active widefield, polychromatic polarized illumination. The camera is fitted with a polarization analyzer. Two frames of the scene are taken, with different states of the analyzer or polarizer. A recovery algorithm follows the acquisition. It allows both the backscatter and the object reflection to be partially polarized. It thus unifies and generalizes prior polarization-based methods, which had assumed exclusive polarization of either of these components. The approach is limited to an effective range, due to image noise and illumination falloff. Thus, the limits and noise sensitivity are analyzed. We demonstrate the approach in underwater field experiments.
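The two-frame recovery can be sketched for the classic special case in which only the backscatter is polarized and the object reflection is unpolarized; the paper's contribution is precisely to relax this assumption, which is not modelled here. Names and the default degree of polarization are illustrative.

```python
import numpy as np

def descatter(i_max, i_min, p_scat=0.8):
    """Recover the object signal from two analyzer states.
    i_max / i_min: frames at the best/worst polarizer orientations.
    Assumes backscatter with degree of polarization p_scat and an
    unpolarized object signal (the pre-existing special case that
    the paper generalizes)."""
    total = i_max + i_min
    # Polarized component of the measured light is attributed to backscatter.
    backscatter = np.clip((i_max - i_min) / p_scat, 0, None)
    signal = np.clip(total - backscatter, 0, None)
    return signal, backscatter
```

Simulating the forward model (unpolarized signal split evenly between frames, backscatter split according to its degree of polarization) and inverting it recovers both components exactly in the noise-free case.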
Galli, Giulia; Griffiths, Victoria A; Otten, Leun J
2014-03-01
It has been shown that the effectiveness with which unpleasant events are encoded into memory is related to brain activity set in train before the events. Here, we assessed whether encoding-related activity before an aversive event can be modulated by emotion regulation. Electrical brain activity was recorded from the scalps of healthy women while they performed an incidental encoding task on randomly intermixed unpleasant and neutral visual scenes. A cue presented 1.5 s before each picture indicated the upcoming valence. In half of the blocks of trials, the instructions emphasized to let emotions arise in a natural way. In the other half, participants were asked to decrease their emotional response by adopting the perspective of a detached observer. Memory for the scenes was probed 1 day later with a recognition memory test. Brain activity before unpleasant scenes predicted later memory of the scenes, but only when participants felt their emotions and did not detach from them. The findings indicate that emotion regulation can eliminate the influence of anticipatory brain activity on memory encoding. This may be relevant for the understanding and treatment of psychiatric diseases with a memory component.
[Application of virtual reality in surgical treatment of complex head and neck carcinoma].
Zhou, Y Q; Li, C; Shui, C Y; Cai, Y C; Sun, R H; Zeng, D F; Wang, W; Li, Q L; Huang, L; Tu, J; Jiang, J
2018-01-07
Objective: To investigate the application of virtual reality technology in the preoperative evaluation of complex head and neck carcinoma and the value of virtual reality technology in the surgical treatment of head and neck carcinoma. Methods: The image data of eight patients with complex head and neck carcinoma treated from December 2016 to May 2017 were acquired. The data were put into a virtual reality system to build three-dimensional anatomical models of the carcinomas and to create the surgical scenes. The process of surgery was simulated by recognizing the relationship between the tumor and surrounding important structures. Finally, all patients were treated with surgery, and two typical cases are reported. Results: With the help of virtual reality, surgeons could adequately assess the condition of the carcinoma and the security of the operation, ensuring the safety of the operations. Conclusions: Virtual reality can provide surgeons with a sensory experience of virtual surgical scenes and achieve man-computer cooperation and stereoscopic assessment, which will ensure the safety of surgery. Virtual reality has great potential to guide the traditional surgical procedures of head and neck carcinoma.
NASA Astrophysics Data System (ADS)
Selj, G. K.; Søderblom, M.
2015-10-01
Detection of a camouflaged object in natural scenery requires the target to be distinguishable from its local background. The development of any new camouflage pattern therefore has to rely on a well-founded test methodology, correlated with the final purpose of the pattern, as well as an evaluation procedure containing suitable criteria for (i) discriminating between the targets and (ii) eventually producing a final ranking of the targets. In this study we present results from a recent camouflage assessment trial where human observers were used in a search-by-photo methodology to assess generic test camouflage patterns. We conducted a study to investigate possible improvements in camouflage patterns for battle dress uniforms. The aim was to do a comparative study of potential, generic patterns intended for use in arid areas (sparsely vegetated, semi-desert). We developed a test methodology that was intended to be simple, reliable and realistic with respect to the operational benefit of camouflage. Therefore we chose to conduct a human-based observer trial founded on imagery of realistic targets in natural backgrounds. Inspired by a recent and similar trial in the UK, we developed new, purpose-built software to be able to conduct the observer trial. Our preferred assessment methodology, the observer trial, was based on target recordings in 12 different but operationally relevant scenes, collected in a dry and sparsely vegetated area (Rhodes). The scenes were chosen with the intention to span as broadly as possible. The targets were human-shaped mannequins and were situated identically in each of the scenes to allow for a relative comparison of camouflage effectiveness in each scene. Tests of significance among the targets' performances were carried out with non-parametric methods, as the corresponding detection-time distributions were overall found to be difficult to parameterize.
From the trial, containing 12 different scenes from sparsely vegetated areas, we collected detection-time distributions for 6 generic targets through visual search by 148 observers. We found that the different targets performed differently, as given by their corresponding detection-time distributions, within a single scene. Furthermore, we obtained an overall ranking across all 12 scenes by computing a weighted sum over scenes, intended to retain as much of the vital information on the targets' signature effectiveness as possible. Our results show that it was possible to measure the targets' performance relative to one another even when summing over all scenes. We also compared the ranking based on our preferred criterion (detection time) with a secondary one (probability of detection) to assess the sensitivity of a final ranking to the test set-up and evaluation criterion. We found our observer-based approach to be well suited with regard to its ability to discriminate between similar targets and to assign numeric values to the observed differences in performance. We believe our approach will be well suited as a tool whenever different aspects of camouflage are to be evaluated and understood further.
NASA Astrophysics Data System (ADS)
Packard, Corey D.; Viola, Timothy S.; Klein, Mark D.
2017-10-01
The ability to predict spectral electro-optical (EO) signatures for various targets against realistic, cluttered backgrounds is paramount for rigorous signature evaluation. Knowledge of background and target signatures, including plumes, is essential for a variety of scientific and defense-related applications including contrast analysis, camouflage development, automatic target recognition (ATR) algorithm development and scene material classification. The capability to simulate any desired mission scenario with forecast or historical weather is a tremendous asset for defense agencies, serving as a complement to (or substitute for) target and background signature measurement campaigns. In this paper, a systematic process for the physical temperature and visible-through-infrared radiance prediction of several diverse targets in a cluttered natural environment scene is presented. The ability of a virtual airborne sensor platform to detect and differentiate targets from a cluttered background, from a variety of sensor perspectives and across numerous wavelengths in differing atmospheric conditions, is considered. The process described utilizes the thermal and radiance simulation software MuSES and provides a repeatable, accurate approach for analyzing wavelength-dependent background and target (including plume) signatures in multiple band-integrated wavebands (multispectral) or hyperspectrally. The engineering workflow required to combine 3D geometric descriptions, thermal material properties, natural weather boundary conditions, all modes of heat transfer and spectral surface properties is summarized. This procedure includes geometric scene creation, material and optical property attribution, and transient physical temperature prediction. Radiance renderings, based on ray-tracing and the Sandford-Robertson BRDF model, are coupled with MODTRAN for the inclusion of atmospheric effects. 
This virtual hyperspectral/multispectral radiance prediction methodology has been extensively validated and provides a flexible process for signature evaluation and algorithm development.
NASA Astrophysics Data System (ADS)
Hawbaker, T. J.; Vanderhoof, M.; Beal, Y. J. G.; Takacs, J. D.; Schmidt, G.; Falgout, J.; Brunner, N. M.; Caldwell, M. K.; Picotte, J. J.; Howard, S. M.; Stitt, S.; Dwyer, J. L.
2016-12-01
Complete and accurate burned area data are needed to document patterns of fires, to quantify relationships between the patterns and drivers of fire occurrence, and to assess the impacts of fires on human and natural systems. Unfortunately, many existing fire datasets in the United States are known to be incomplete, which complicates efforts to understand burned area patterns and introduces a large amount of uncertainty into efforts to identify their driving processes and impacts. Because of this, the need to systematically collect burned area information has been recognized by the United Nations Framework Convention on Climate Change and the Intergovernmental Panel on Climate Change, which have both called for the production of essential climate variables. To help meet this need, we developed a novel algorithm that automatically identifies burned areas in temporally dense time series of Landsat image stacks to produce Landsat Burned Area Essential Climate Variable (BAECV) products. The algorithm makes use of predictors derived from individual Landsat scenes, lagged reference conditions, and change metrics between the scene and reference predictors. Outputs of the BAECV algorithm, generated for the conterminous United States for 1984 through 2015, consist of burn probabilities for each Landsat scene, in addition to annual composites including the maximum burn probability, a burn classification, and the Julian date of the first Landsat scene in which a burn was observed. The BAECV products document patterns of fire occurrence that are not well characterized by existing fire datasets in the United States. We anticipate that these data could help to better understand past patterns of fire occurrence, the drivers that created them, and the impacts fires had on natural and human systems.
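The abstract does not specify the change metrics used; a representative Landsat burned-area change metric is the differenced Normalized Burn Ratio (dNBR), sketched below assuming surface-reflectance NIR and SWIR2 bands. This is illustrative, not necessarily the BAECV predictor set.

```python
import numpy as np

def nbr(nir, swir2):
    # Normalized Burn Ratio from NIR and SWIR2 reflectance
    # (e.g., Landsat OLI bands 5 and 7).
    return (nir - swir2) / (nir + swir2)

def dnbr(nir_pre, swir2_pre, nir_post, swir2_post):
    # Pre-fire minus post-fire NBR: positive values indicate
    # vegetation loss consistent with burning.
    return nbr(nir_pre, swir2_pre) - nbr(nir_post, swir2_post)
```

Healthy vegetation (high NIR, low SWIR2) that becomes char and exposed soil (low NIR, high SWIR2) yields a strongly positive dNBR, which a classifier can threshold or combine with other predictors.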
Azizi, Elham; Abel, Larry A; Stainer, Matthew J
2017-02-01
Action game playing has been associated with several improvements in visual attention tasks. However, it is not clear how such changes might influence the way we overtly select information from our visual world (i.e. eye movements). We examined whether action-video-game training changed eye movement behaviour in a series of visual search tasks including conjunctive search (relatively abstracted from natural behaviour), game-related search, and more naturalistic scene search. Forty nongamers were trained in either an action first-person shooter game or a card game (control) for 10 hours. As a further control, we recorded eye movements of 20 experienced action gamers on the same tasks. The results did not show any change in duration of fixations or saccade amplitude either from before to after the training or between all nongamers (pretraining) and experienced action gamers. However, we observed a change in search strategy, reflected by a reduction in the vertical distribution of fixations for the game-related search task in the action-game-trained group. This might suggest learning of the likely distribution of targets. In other words, game training only taught participants to search game images for targets important to the game, with no indication of transfer to the more natural scene search. Taken together, these results suggest no modification in the overt allocation of attention. Either the skills that can be trained with action gaming are not powerful enough to influence information selection through eye movements, or action-game-learned skills are not used when deciding where to move the eyes.
Using the structure of natural scenes and sounds to predict neural response properties in the brain
NASA Astrophysics Data System (ADS)
Deweese, Michael
2014-03-01
The natural scenes and sounds we encounter in the world are highly structured. The fact that animals and humans are so efficient at processing these sensory signals compared with the latest algorithms running on the fastest modern computers suggests that our brains can exploit this structure. We have developed a sparse mathematical representation of speech that minimizes the number of active model neurons needed to represent typical speech sounds. The model learns several well-known acoustic features of speech such as harmonic stacks, formants, onsets and terminations, but we also find more exotic structures in the spectrogram representation of sound such as localized checkerboard patterns and frequency-modulated excitatory subregions flanked by suppressive sidebands. Moreover, several of these novel features resemble neuronal receptive fields reported in the Inferior Colliculus (IC), as well as auditory thalamus (MGBv) and primary auditory cortex (A1), and our model neurons exhibit the same tradeoff in spectrotemporal resolution as has been observed in IC. To our knowledge, this is the first demonstration that receptive fields of neurons in the ascending mammalian auditory pathway beyond the auditory nerve can be predicted based on coding principles and the statistical properties of recorded sounds. We have also developed a biologically-inspired neural network model of primary visual cortex (V1) that can learn a sparse representation of natural scenes using spiking neurons and strictly local plasticity rules. The representation learned by our model is in good agreement with measured receptive fields in V1, demonstrating that sparse sensory coding can be achieved in a realistic biological setting.
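The sparse representation described (minimizing the number of active model neurons) is conventionally posed as minimizing 0.5*||x - Da||^2 + lam*||a||_1 over coefficients a for a dictionary D. A minimal sketch of the inference step using iterative soft-thresholding (ISTA); dictionary learning itself, and the paper's spiking implementation, are omitted.

```python
import numpy as np

def ista(x, D, lam=0.1, n_iter=200):
    """Sparse-code a signal x in dictionary D by minimizing
    0.5*||x - D a||^2 + lam*||a||_1 with iterative soft-thresholding."""
    L = np.linalg.norm(D, ord=2) ** 2      # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)           # gradient of the quadratic term
        z = a - grad / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return a
```

The L1 penalty drives most coefficients exactly to zero, so a typical input is represented by a small number of active units, which is the sense in which such codes are "sparse".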
Vasserman, Genadiy; Schneidman, Elad; Segev, Ronen
2013-01-01
The visual system continually adjusts its sensitivity to the statistical properties of the environment through an adaptation process that starts in the retina. Colour perception and processing is commonly thought to occur mainly in high visual areas, and indeed most evidence for chromatic colour contrast adaptation comes from cortical studies. We show that colour contrast adaptation starts in the retina where ganglion cells adjust their responses to the spectral properties of the environment. We demonstrate that the ganglion cells match their responses to red-blue stimulus combinations according to the relative contrast of each of the input channels by rotating their functional response properties in colour space. Using measurements of the chromatic statistics of natural environments, we show that the retina balances inputs from the two (red and blue) stimulated colour channels, as would be expected from theoretical optimal behaviour. Our results suggest that colour is encoded in the retina based on the efficient processing of spectral information that matches spectral combinations in natural scenes on the colour processing level. PMID:24205373
[Comparison of bite marks and teeth features using 2D and 3D methods].
Lorkiewicz-Muszyńska, Dorota; Glapiński, Mariusz; Zaba, Czesław; Łabecka, Marzena
2011-01-01
The nature of bite marks is complex. They are found at the scene of crime on different materials and surfaces - not only on the human body and corpse, but also on food products and material objects. Human bites on skin are sometimes difficult to interpret and to analyze because of the specific character of skin (elastic and distortable) and because different areas of the human body have different surfaces and curvatures. A bite mark left at the scene of crime can be highly helpful in leading investigators to criminals. The study was performed to establish: 1) whether bite marks exhibit variations in the accuracy of impressions on different materials, 2) whether it is possible to use the 3D method in the process of identifying an individual based on the comparison of bite marks revealed at the scene and 3D scans of dental casts, 3) whether application of the 3D method allows for elimination of secondary photographic distortion of bite marks. The authors carried out experiments on simulated cases. Five volunteers bit various materials with different surfaces. Experimental bite marks were collected with emphasis on differentiation of materials. Subsequently, dental impressions were taken from the five volunteers in order to prepare five sets of dental casts (the maxilla and mandible). The biting edges of the teeth were impressed in wax to create an imprint. The samples of dental casts, corresponding wax bite impressions and bite marks from different materials were scanned with 2D and 3D scanners and photographs were taken. All of these were examined in detail and then compared using different methods (2D and 3D). 1) Bite marks exhibit variations in accuracy of impression on different materials. The most legible reproduction of bite marks was seen on cheese. 2) In comparison of bite marks, the 3D method and 3D scans of dental casts are highly accurate. 3) The 3D method helps to eliminate secondary photographic distortion of bite marks.
Reduced modulation of scanpaths in response to task demands in posterior cortical atrophy.
Shakespeare, Timothy J; Pertzov, Yoni; Yong, Keir X X; Nicholas, Jennifer; Crutch, Sebastian J
2015-02-01
A difficulty in perceiving visual scenes is one of the most striking impairments experienced by patients with the clinico-radiological syndrome posterior cortical atrophy (PCA). However, whilst a number of studies have investigated perception of relatively simple experimental stimuli in these individuals, little is known about multiple-object and complex scene perception and the role of eye movements in posterior cortical atrophy. We embrace the distinction between high-level (top-down) and low-level (bottom-up) influences upon scanning eye movements when looking at scenes. This distinction was inspired by Yarbus (1967), who demonstrated how the location of our fixations is affected by task instructions and not only the stimulus' low-level properties. We therefore examined how scanning patterns are influenced by task instructions and low-level visual properties in 7 patients with posterior cortical atrophy, 8 patients with typical Alzheimer's disease, and 19 healthy age-matched controls. Each participant viewed 10 scenes under four task conditions (encoding, recognition, search and description) whilst eye movements were recorded. The results reveal significant differences between groups in the impact of test instructions upon scanpaths. Across tasks without a search component, posterior cortical atrophy patients were significantly less consistent than typical Alzheimer's disease patients and controls in where they were looking. By contrast, when comparing search and non-search tasks, it was controls who exhibited the lowest between-task similarity ratings, suggesting they were better able than posterior cortical atrophy or typical Alzheimer's disease patients to respond appropriately to high-level needs by looking at task-relevant regions of a scene. Posterior cortical atrophy patients had a significant tendency to fixate upon more low-level salient parts of the scenes than controls, irrespective of the viewing task.
The study provides a detailed characterisation of scene perception abilities in posterior cortical atrophy and offers insights into the mechanisms by which high-level cognitive schemes interact with low-level perception. Copyright © 2015 Elsevier Ltd. All rights reserved.
Young People and Popular Music in East Germany: Focus on a Scene.
ERIC Educational Resources Information Center
Wicke, Peter
1985-01-01
Points out a great preference for Western-style rock music among young people (ages 14-25) in East Germany despite social, political, and cultural differences, suggesting the global nature of music as communication. (PD)
Bekhtereva, Valeria; Müller, Matthias M
2017-10-01
Is color a critical feature in emotional content extraction and involuntary attentional orienting toward affective stimuli? Here we used briefly presented emotional distractors to investigate the extent to which color information can influence the time course of attentional bias in early visual cortex. While participants performed a demanding visual foreground task, complex unpleasant and neutral background images were displayed in color or grayscale format for a short period of 133 ms and were immediately masked. Such a short presentation poses a challenge for visual processing. In the visual detection task, participants attended to flickering squares that elicited the steady-state visual evoked potential (SSVEP), allowing us to analyze the temporal dynamics of the competition for processing resources in early visual cortex. Concurrently we measured the visual event-related potentials (ERPs) evoked by the unpleasant and neutral background scenes. The results showed (a) that the distraction effect was greater with color than with grayscale images and (b) that it lasted longer with colored unpleasant distractor images. Furthermore, classical and mass-univariate ERP analyses indicated that, when presented in color, emotional scenes elicited more pronounced early negativities (N1-EPN) relative to neutral scenes, than when the scenes were presented in grayscale. Consistent with neural data, unpleasant scenes were rated as being more emotionally negative and received slightly higher arousal values when they were shown in color than when they were presented in grayscale. Taken together, these findings provide evidence for the modulatory role of picture color on a cascade of coordinated perceptual processes: by facilitating the higher-level extraction of emotional content, color influences the duration of the attentional bias to briefly presented affective scenes in lower-tier visual areas.
Hayes, Scott M; Nadel, Lynn; Ryan, Lee
2007-01-01
Previous research has investigated intentional retrieval of contextual information and contextual influences on object identification and word recognition, yet few studies have investigated context effects in episodic memory for objects. To address this issue, unique objects embedded in a visually rich scene or on a white background were presented to participants. At test, objects were presented either in the original scene or on a white background. A series of behavioral studies with young adults demonstrated a context shift decrement (CSD): decreased recognition performance when context is changed between encoding and retrieval. The CSD was not attenuated by encoding or retrieval manipulations, suggesting that binding of object and context may be automatic. A final experiment explored the neural correlates of the CSD using functional magnetic resonance imaging. Parahippocampal cortex (PHC) activation (right greater than left) during incidental encoding was associated with subsequent memory of objects in the context shift condition. Greater activity in right PHC was also observed during successful recognition of objects previously presented in a scene. Finally, a subset of regions activated during scene encoding, such as bilateral PHC, was reactivated when the object was presented on a white background at retrieval. Although participants were not required to intentionally retrieve contextual information, the results suggest that PHC may reinstate visual context to mediate successful episodic memory retrieval. The CSD is attributed to automatic and obligatory binding of object and context. The results suggest that PHC is important not only for processing of scene information, but also plays a role in successful episodic memory encoding and retrieval. These findings are consistent with the view that spatial information is stored in the hippocampal complex, one of the central tenets of Multiple Trace Theory. (c) 2007 Wiley-Liss, Inc.