Sample records for visual object recognition

  1. A new selective developmental deficit: Impaired object recognition with normal face recognition.

    PubMed

    Germine, Laura; Cashdollar, Nathan; Düzel, Emrah; Duchaine, Bradley

    2011-05-01

    Studies of developmental deficits in face recognition, or developmental prosopagnosia, have shown that individuals who have not suffered brain damage can show face recognition impairments coupled with normal object recognition (Duchaine and Nakayama, 2005; Duchaine et al., 2006; Nunn et al., 2001). However, no developmental cases with the opposite dissociation - normal face recognition with impaired object recognition - have been reported. The existence of a case of non-face developmental visual agnosia would indicate that the development of normal face recognition mechanisms does not rely on the development of normal object recognition mechanisms. To see whether a developmental variant of non-face visual object agnosia exists, we conducted a series of web-based object and face recognition tests to screen for individuals showing object recognition memory impairments but not face recognition impairments. Through this screening process, we identified AW, an otherwise normal 19-year-old female, who was then tested in the lab on face and object recognition tests. AW's performance was impaired in within-class visual recognition memory across six different visual categories (guns, horses, scenes, tools, doors, and cars). In contrast, she scored normally on seven tests of face recognition, tests of memory for two other object categories (houses and glasses), and tests of recall memory for visual shapes. Testing confirmed that her impairment was not related to a general deficit in lower-level perception, object perception, basic-level recognition, or memory. AW's results provide the first neuropsychological evidence that recognition memory for non-face visual object categories can be selectively impaired in individuals without brain damage or other memory impairment. These results indicate that the development of recognition memory for faces does not depend on intact object recognition memory and provide further evidence for category-specific dissociations in visual recognition. Copyright © 2010 Elsevier Srl. All rights reserved.

  2. Infant Visual Attention and Object Recognition

    PubMed Central

    Reynolds, Greg D.

    2015-01-01

    This paper explores the role visual attention plays in the recognition of objects in infancy. Research and theory on the development of infant attention and recognition memory are reviewed in three major sections. The first section reviews some of the major findings and theory emerging from a rich tradition of behavioral research utilizing preferential looking tasks to examine visual attention and recognition memory in infancy. The second section examines research utilizing neural measures of attention and object recognition in infancy as well as research on brain-behavior relations in the early development of attention and recognition memory. The third section addresses potential areas of the brain involved in infant object recognition and visual attention. An integrated synthesis of some of the existing models of the development of visual attention is presented which may account for the observed changes in behavioral and neural measures of visual attention and object recognition that occur across infancy. PMID:25596333

  3. Infant visual attention and object recognition.

    PubMed

    Reynolds, Greg D

    2015-05-15

    This paper explores the role visual attention plays in the recognition of objects in infancy. Research and theory on the development of infant attention and recognition memory are reviewed in three major sections. The first section reviews some of the major findings and theory emerging from a rich tradition of behavioral research utilizing preferential looking tasks to examine visual attention and recognition memory in infancy. The second section examines research utilizing neural measures of attention and object recognition in infancy as well as research on brain-behavior relations in the early development of attention and recognition memory. The third section addresses potential areas of the brain involved in infant object recognition and visual attention. An integrated synthesis of some of the existing models of the development of visual attention is presented which may account for the observed changes in behavioral and neural measures of visual attention and object recognition that occur across infancy. Copyright © 2015 Elsevier B.V. All rights reserved.

  4. Fast neuromimetic object recognition using FPGA outperforms GPU implementations.

    PubMed

    Orchard, Garrick; Martin, Jacob G; Vogelstein, R Jacob; Etienne-Cummings, Ralph

    2013-08-01

    Recognition of objects in still images has traditionally been regarded as a difficult computational problem. Although modern automated methods for visual object recognition have achieved steadily increasing recognition accuracy, even the most advanced computational vision approaches are unable to obtain performance equal to that of humans. This has led to the creation of many biologically inspired models of visual object recognition, among them the hierarchical model and X (HMAX) model. HMAX is traditionally known to achieve high accuracy in visual object recognition tasks at the expense of significant computational complexity. Increasing complexity, in turn, increases computation time, reducing the number of images that can be processed per unit time. In this paper we describe how the computationally intensive and biologically inspired HMAX model for visual object recognition can be modified for implementation on a commercial field-programmable gate array, specifically the Xilinx Virtex 6 ML605 evaluation board with an XC6VLX240T FPGA. We show that with minor modifications to the traditional HMAX model we can perform recognition on images of size 128 × 128 pixels at a rate of 190 images per second with a less than 1% loss in recognition accuracy in both binary and multiclass visual object recognition tasks.
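The HMAX architecture this record refers to alternates template-matching and max-pooling stages. As a rough illustration (this is a minimal numpy sketch of the first two stages, S1 Gabor filtering followed by C1 local max pooling, not the paper's FPGA implementation; all filter sizes and parameters are illustrative):

```python
import numpy as np

def gabor_kernel(size=11, wavelength=5.0, orientation=0.0, sigma=3.0, gamma=0.5):
    """Build one Gabor filter, the S1 unit used in HMAX-style models."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(orientation) + y * np.sin(orientation)
    yr = -x * np.sin(orientation) + y * np.cos(orientation)
    g = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / wavelength)
    return g - g.mean()  # zero-mean so uniform regions give no response

def s1_layer(image, orientations=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """S1: convolve the image with Gabor filters at several orientations."""
    h, w = image.shape
    maps = []
    for theta in orientations:
        k = gabor_kernel(orientation=theta)
        pad = k.shape[0] // 2
        padded = np.pad(image, pad, mode='edge')
        out = np.zeros((h, w))
        for i in range(h):
            for j in range(w):
                out[i, j] = np.abs(np.sum(padded[i:i + k.shape[0], j:j + k.shape[1]] * k))
        maps.append(out)
    return np.stack(maps)  # (n_orientations, h, w)

def c1_layer(s1_maps, pool=4):
    """C1: local max pooling over space, giving tolerance to position."""
    n, h, w = s1_maps.shape
    out = np.zeros((n, h // pool, w // pool))
    for i in range(h // pool):
        for j in range(w // pool):
            out[:, i, j] = s1_maps[:, i * pool:(i + 1) * pool, j * pool:(j + 1) * pool].max(axis=(1, 2))
    return out

rng = np.random.default_rng(0)
img = rng.random((32, 32))
c1 = c1_layer(s1_layer(img))
print(c1.shape)  # (4, 8, 8)
```

The alternation of filtering (selectivity) and max pooling (invariance) is what makes the model both accurate and expensive; the FPGA work parallelizes exactly these per-pixel, per-orientation operations.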

  5. The development of newborn object recognition in fast and slow visual worlds

    PubMed Central

    Wood, Justin N.; Wood, Samantha M. W.

    2016-01-01

    Object recognition is central to perception and cognition. Yet relatively little is known about the environmental factors that cause invariant object recognition to emerge in the newborn brain. Is this ability a hardwired property of vision? Or does the development of invariant object recognition require experience with a particular kind of visual environment? Here, we used a high-throughput controlled-rearing method to examine whether newborn chicks (Gallus gallus) require visual experience with slowly changing objects to develop invariant object recognition abilities. When newborn chicks were raised with a slowly rotating virtual object, the chicks built invariant object representations that generalized across novel viewpoints and rotation speeds. In contrast, when newborn chicks were raised with a virtual object that rotated more quickly, the chicks built viewpoint-specific object representations that failed to generalize to novel viewpoints and rotation speeds. Moreover, there was a direct relationship between the speed of the object and the amount of invariance in the chick's object representation. Thus, visual experience with slowly changing objects plays a critical role in the development of invariant object recognition. These results indicate that invariant object recognition is not a hardwired property of vision, but is learned rapidly when newborns encounter a slowly changing visual world. PMID:27097925

  6. Changes in Visual Object Recognition Precede the Shape Bias in Early Noun Learning

    PubMed Central

    Yee, Meagan; Jones, Susan S.; Smith, Linda B.

    2012-01-01

    Two of the most formidable skills that characterize human beings are language and our prowess in visual object recognition. They may also be developmentally intertwined. Two experiments, a large sample cross-sectional study and a smaller sample 6-month longitudinal study of 18- to 24-month-olds, tested a hypothesized developmental link between changes in visual object representation and noun learning. Previous findings in visual object recognition indicate that children’s ability to recognize common basic level categories from sparse structural shape representations of object shape emerges between the ages of 18 and 24 months, is related to noun vocabulary size, and is lacking in children with language delay. Other research shows in artificial noun learning tasks that during this same developmental period, young children systematically generalize object names by shape, that this shape bias predicts future noun learning, and is lacking in children with language delay. The two experiments examine the developmental relation between visual object recognition and the shape bias for the first time. The results show that developmental changes in visual object recognition systematically precede the emergence of the shape bias. The results suggest a developmental pathway in which early changes in visual object recognition that are themselves linked to category learning enable the discovery of higher-order regularities in category structure and thus the shape bias in novel noun learning tasks. The proposed developmental pathway has implications for understanding the role of specific experience in the development of both visual object recognition and the shape bias in early noun learning. PMID:23227015

  7. Short temporal asynchrony disrupts visual object recognition

    PubMed Central

    Singer, Jedediah M.; Kreiman, Gabriel

    2014-01-01

    Humans can recognize objects and scenes in a small fraction of a second. The cascade of signals underlying rapid recognition might be disrupted by temporally jittering different parts of complex objects. Here we investigated the time course over which shape information can be integrated to allow for recognition of complex objects. We presented fragments of object images in an asynchronous fashion and behaviorally evaluated categorization performance. We observed that visual recognition was significantly disrupted by asynchronies of approximately 30 ms, suggesting that spatiotemporal integration begins to break down with even small deviations from simultaneity. However, moderate temporal asynchrony did not completely obliterate recognition; in fact, integration of visual shape information persisted even with an asynchrony of 100 ms. We describe the data with a concise model based on the dynamic reduction of uncertainty about what image was presented. These results emphasize the importance of timing in visual processing and provide strong constraints for the development of dynamical models of visual shape recognition. PMID:24819738

  8. Eye movements during object recognition in visual agnosia.

    PubMed

    Charles Leek, E; Patterson, Candy; Paul, Matthew A; Rafal, Robert; Cristino, Filipe

    2012-07-01

    This paper reports the first detailed study of eye movement patterns during single object recognition in visual agnosia. Eye movements were recorded in a patient with an integrative agnosic deficit during two recognition tasks: common object naming and novel object recognition memory. The patient showed normal directional biases in saccades and fixation dwell times in both tasks and was as likely as controls to fixate within the object's bounding contour regardless of recognition accuracy. In contrast, following initial saccades of similar amplitude to those of controls, the patient showed a bias for short saccades. In object naming, but not in recognition memory, the similarity of the spatial distributions of patient and control fixations was modulated by recognition accuracy. The study provides new evidence about how eye movements can be used to elucidate the functional impairments underlying object recognition deficits. We argue that the results reflect a breakdown in the normal functional processes involved in the integration of shape information across object structure during the visual perception of shape. Copyright © 2012 Elsevier Ltd. All rights reserved.

  9. Decoding the time-course of object recognition in the human brain: From visual features to categorical decisions.

    PubMed

    Contini, Erika W; Wardle, Susan G; Carlson, Thomas A

    2017-10-01

    Visual object recognition is a complex, dynamic process. Multivariate pattern analysis methods, such as decoding, have begun to reveal how the brain processes complex visual information. Recently, temporal decoding methods for EEG and MEG have offered the potential to evaluate the temporal dynamics of object recognition. Here we review the contribution of M/EEG time-series decoding methods to understanding visual object recognition in the human brain. Consistent with the current understanding of the visual processing hierarchy, low-level visual features dominate decodable object representations early in the time-course, with more abstract representations related to object category emerging later. A key finding is that the time-course of object processing is highly dynamic and rapidly evolving, with limited temporal generalisation of decodable information. Several studies have examined the emergence of object category structure, and we consider to what degree category decoding can be explained by sensitivity to low-level visual features. Finally, we evaluate recent work attempting to link human behaviour to the neural time-course of object processing. Copyright © 2017 Elsevier Ltd. All rights reserved.
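The time-series decoding this review covers trains and tests a classifier independently at each time point of the M/EEG epoch, yielding an accuracy time course. A minimal sketch on synthetic data (a leave-one-out nearest-centroid classifier stands in here for the linear classifiers typically used; all dimensions and the injected signal are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials, n_sensors, n_times = 40, 16, 20
labels = np.repeat([0, 1], n_trials // 2)

# Synthetic "M/EEG" trials: a class-specific signal appears only from time point 8 on
X = rng.normal(size=(n_trials, n_sensors, n_times))
signal = rng.normal(size=n_sensors)
X[labels == 1, :, 8:] += signal[:, None]

def decode_timecourse(X, y):
    """Leave-one-out nearest-centroid decoding at each time point."""
    n = len(y)
    acc = np.zeros(X.shape[2])
    for t in range(X.shape[2]):
        correct = 0
        for i in range(n):
            train = np.delete(np.arange(n), i)
            c0 = X[train][y[train] == 0, :, t].mean(axis=0)  # class-0 centroid
            c1 = X[train][y[train] == 1, :, t].mean(axis=0)  # class-1 centroid
            pred = int(np.linalg.norm(X[i, :, t] - c1) < np.linalg.norm(X[i, :, t] - c0))
            correct += int(pred == y[i])
        acc[t] = correct / n
    return acc

acc = decode_timecourse(X, labels)
```

On this toy data, accuracy hovers at chance before the signal onset and rises afterwards; real studies apply the same per-time-point logic (and temporal generalization matrices, where the classifier is trained at one time and tested at another) to sensor-space MEG or EEG data.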

  10. The Role of Sensory-Motor Information in Object Recognition: Evidence from Category-Specific Visual Agnosia

    ERIC Educational Resources Information Center

    Wolk, D.A.; Coslett, H.B.; Glosser, G.

    2005-01-01

    The role of sensory-motor representations in object recognition was investigated in experiments involving AD, a patient with mild visual agnosia who was impaired in the recognition of visually presented living as compared to non-living entities. AD named visually presented items for which sensory-motor information was available significantly more…

  11. Measuring the Speed of Newborn Object Recognition in Controlled Visual Worlds

    ERIC Educational Resources Information Center

    Wood, Justin N.; Wood, Samantha M. W.

    2017-01-01

    How long does it take for a newborn to recognize an object? Adults can recognize objects rapidly, but measuring object recognition speed in newborns has not previously been possible. Here we introduce an automated controlled-rearing method for measuring the speed of newborn object recognition in controlled visual worlds. We raised newborn chicks…

  12. Affective and contextual values modulate spatial frequency use in object recognition

    PubMed Central

    Caplette, Laurent; West, Gregory; Gomot, Marie; Gosselin, Frédéric; Wicker, Bruno

    2014-01-01

    Visual object recognition is of fundamental importance in our everyday interaction with the environment. Recent models of visual perception emphasize the role of top-down predictions facilitating object recognition via initial guesses that limit the number of object representations that need to be considered. Several results suggest that this rapid and efficient object processing relies on the early extraction and processing of low spatial frequencies (LSF). The present study aimed to investigate the SF content of visual object representations and its modulation by contextual and affective values of the perceived object during a picture-name verification task. Stimuli consisted of pictures of objects equalized in SF content and categorized as having low or high affective and contextual values. To access the SF content of stored visual representations of objects, SFs of each image were then randomly sampled on a trial-by-trial basis. Results reveal that intermediate SFs between 14 and 24 cycles per object (2.3–4 cycles per degree) are correlated with fast and accurate identification for all categories of objects. Moreover, there was a significant interaction between affective and contextual values over the SFs correlating with fast recognition. These results suggest that affective and contextual values of a visual object modulate the SF content of its internal representation, thus highlighting the flexibility of the visual recognition system. PMID:24904514
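The trial-by-trial SF sampling described in this record can be approximated by randomly weighting radial frequency bands of an image in the Fourier domain; the random weights from many trials can then be regressed against response speed and accuracy to estimate which bands are diagnostic. A hedged sketch (the band count and flat radial bands are illustrative simplifications, not the authors' exact procedure):

```python
import numpy as np

def sf_sample(image, n_bands=8, rng=None):
    """Randomly weight radial spatial-frequency bands of an image.

    Returns the filtered image plus the per-band weights drawn for this trial.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fy**2 + fx**2)                      # frequency magnitude, cycles/pixel
    edges = np.linspace(0.0, radius.max() + 1e-9, n_bands + 1)
    weights = rng.random(n_bands)                        # one random weight per band
    mask = np.zeros_like(radius)
    for b in range(n_bands):
        band = (radius >= edges[b]) & (radius < edges[b + 1])
        mask[band] = weights[b]
    filtered = np.real(np.fft.ifft2(np.fft.fft2(image) * mask))
    return filtered, weights

rng = np.random.default_rng(3)
img = rng.random((64, 64))
filt, w = sf_sample(img, rng=rng)
```

Because the mask depends only on frequency magnitude, it is conjugate-symmetric and the filtered image stays real-valued; converting cycles per pixel to cycles per object or per degree requires the stimulus size and viewing distance.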

  13. Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence

    PubMed Central

    Cichy, Radoslaw Martin; Khosla, Aditya; Pantazis, Dimitrios; Torralba, Antonio; Oliva, Aude

    2016-01-01

    The complex multi-stage architecture of cortical visual pathways provides the neural basis for efficient visual object recognition in humans. However, the stage-wise computations therein remain poorly understood. Here, we compared temporal (magnetoencephalography) and spatial (functional MRI) visual brain representations with representations in an artificial deep neural network (DNN) tuned to the statistics of real-world visual recognition. We showed that the DNN captured the stages of human visual processing in both time and space from early visual areas towards the dorsal and ventral streams. Further investigation of crucial DNN parameters revealed that while model architecture was important, training on real-world categorization was necessary to enforce spatio-temporal hierarchical relationships with the brain. Together our results provide an algorithmically informed view on the spatio-temporal dynamics of visual object recognition in the human visual brain. PMID:27282108

  14. Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence.

    PubMed

    Cichy, Radoslaw Martin; Khosla, Aditya; Pantazis, Dimitrios; Torralba, Antonio; Oliva, Aude

    2016-06-10

    The complex multi-stage architecture of cortical visual pathways provides the neural basis for efficient visual object recognition in humans. However, the stage-wise computations therein remain poorly understood. Here, we compared temporal (magnetoencephalography) and spatial (functional MRI) visual brain representations with representations in an artificial deep neural network (DNN) tuned to the statistics of real-world visual recognition. We showed that the DNN captured the stages of human visual processing in both time and space from early visual areas towards the dorsal and ventral streams. Further investigation of crucial DNN parameters revealed that while model architecture was important, training on real-world categorization was necessary to enforce spatio-temporal hierarchical relationships with the brain. Together our results provide an algorithmically informed view on the spatio-temporal dynamics of visual object recognition in the human visual brain.
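DNN-to-brain comparisons of this kind are commonly made with representational similarity analysis (RSA): each system yields a representational dissimilarity matrix (RDM) over the same stimulus conditions, and the two RDMs are correlated. A small sketch on synthetic data (correlation distance and a Spearman comparison are standard RSA choices, not necessarily the paper's exact ones):

```python
import numpy as np

def rdm(features):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between the response patterns of each pair of conditions (rows)."""
    return 1.0 - np.corrcoef(features)

def spearman_rdm_similarity(rdm_a, rdm_b):
    """Spearman correlation between the upper triangles of two RDMs."""
    iu = np.triu_indices_from(rdm_a, k=1)
    ra = np.argsort(np.argsort(rdm_a[iu])).astype(float)  # ranks of dissimilarities
    rb = np.argsort(np.argsort(rdm_b[iu])).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))

rng = np.random.default_rng(2)
base = rng.normal(size=(10, 50))                        # shared condition geometry
dnn_layer = base + 0.1 * rng.normal(size=(10, 50))      # "DNN layer" features
brain = base @ rng.normal(size=(50, 30)) + 0.5 * rng.normal(size=(10, 30))  # "MEG/fMRI" patterns
r = spearman_rdm_similarity(rdm(dnn_layer), rdm(brain))
```

Because RDMs abstract away from the measurement space (units vs. sensors vs. voxels), the same comparison works layer-by-layer against fMRI regions and time-point-by-time-point against MEG, which is how a spatio-temporal hierarchical correspondence can be mapped.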

  15. Development of visuo-haptic transfer for object recognition in typical preschool and school-aged children.

    PubMed

    Purpura, Giulia; Cioni, Giovanni; Tinelli, Francesca

    2018-07-01

    Object recognition is a long and complex adaptive process, and its full maturation requires the combination of many different sensory experiences as well as cognitive abilities to manipulate previous experiences in order to develop new percepts and subsequently to learn from the environment. It is well recognized that the transfer of visual and haptic information facilitates object recognition in adults, but less is known about the development of this ability. In this study, we explored the developmental course of object recognition capacity using unimodal visual information, unimodal haptic information, and visuo-haptic information transfer in children from 4 years to 10 years and 11 months of age. Participants were tested through a clinical protocol involving visual exploration of black-and-white photographs of common objects, haptic exploration of real objects, and visuo-haptic transfer of these two types of information. Results show an age-dependent development of object recognition abilities for visual, haptic, and visuo-haptic modalities. A significant effect of time on the development of unimodal and crossmodal recognition skills was found. Moreover, our data suggest that multisensory processes for common object recognition are active at 4 years of age. They facilitate recognition of common objects and, although not fully mature, are significant in adaptive behavior from the first years of life. The study of the typical development of visuo-haptic processes in childhood is a starting point for future studies regarding object recognition in impaired populations.

  16. Using Prosopagnosia to Test and Modify Visual Recognition Theory.

    PubMed

    O'Brien, Alexander M

    2018-02-01

    Biederman's contemporary theory of basic visual object recognition (Recognition-by-Components) is based on structural descriptions of objects and presumes 36 visual primitives (geons) people can discriminate, but there has been no empirical test of the actual use of these 36 geons to visually distinguish objects. In this study, we tested for the actual use of these geons in basic visual discrimination by comparing object discrimination performance patterns (when distinguishing varied stimuli) of an acquired prosopagnosia patient (LB) and healthy control participants. LB's prosopagnosia left her heavily reliant on structural descriptions or categorical object differences in visual discrimination tasks versus the control participants' additional ability to use face recognition or coordinate systems (Coordinate Relations Hypothesis). Thus, when LB performed comparably to control participants with a given stimulus, her restricted reliance on basic or categorical discriminations meant that the stimuli must be distinguishable on the basis of a geon feature. By varying stimuli in eight separate experiments and presenting all 36 geons, we discerned that LB coded only 12 (vs. 36) distinct visual primitives (geons), apparently reflective of human visual systems generally.

  17. Superior voice recognition in a patient with acquired prosopagnosia and object agnosia.

    PubMed

    Hoover, Adria E N; Démonet, Jean-François; Steeves, Jennifer K E

    2010-11-01

    Anecdotally, it has been reported that individuals with acquired prosopagnosia compensate for their inability to recognize faces by using other person identity cues such as hair, gait, or the voice. Are they therefore superior at using non-face cues, specifically voices, for person identity? Here, we empirically measure person and object identity recognition in a patient with acquired prosopagnosia and object agnosia. We quantify person identity (face and voice) and object identity (car and horn) recognition for visual, auditory, and bimodal (visual and auditory) stimuli. The patient is unable to recognize faces or cars, consistent with his prosopagnosia and object agnosia, respectively. He is perfectly able to recognize people's voices, car horns, and bimodal stimuli. These data show a reverse shift in the typical weighting of visual over auditory information for audiovisual stimuli in a compromised visual recognition system. Moreover, the patient shows selectively superior voice recognition compared to the controls, revealing that two different stimulus domains, persons and objects, may not be equally affected by sensory adaptation effects. This also implies that person and object identity recognition are processed in separate pathways. These data demonstrate that an individual with acquired prosopagnosia and object agnosia can compensate for the visual impairment and become quite skilled at using spared aspects of sensory processing. In the case of acquired prosopagnosia, it is advantageous to develop a superior use of voices for person identity recognition in everyday life. Copyright © 2010 Elsevier Ltd. All rights reserved.

  18. Visual Object Detection, Categorization, and Identification Tasks Are Associated with Different Time Courses and Sensitivities

    ERIC Educational Resources Information Center

    de la Rosa, Stephan; Choudhery, Rabia N.; Chatziastros, Astros

    2011-01-01

    Recent evidence suggests that the recognition of an object's presence and its explicit recognition are temporally closely related. Here we re-examined the time course (using a fine and a coarse temporal resolution) and the sensitivity of three possible component processes of visual object recognition. In particular, participants saw briefly…

  19. Size-Sensitive Perceptual Representations Underlie Visual and Haptic Object Recognition

    PubMed Central

    Craddock, Matt; Lawson, Rebecca

    2009-01-01

    A variety of similarities between visual and haptic object recognition suggests that the two modalities may share common representations. However, it is unclear whether such common representations preserve low-level perceptual features or whether transfer between vision and haptics is mediated by high-level, abstract representations. Two experiments used a sequential shape-matching task to examine the effects of size changes on unimodal and crossmodal visual and haptic object recognition. Participants felt or saw 3D plastic models of familiar objects. The two objects presented on a trial were either the same size or different sizes and were the same shape or different but similar shapes. Participants were told to ignore size changes and to match on shape alone. In Experiment 1, size changes on same-shape trials impaired performance similarly for both visual-to-visual and haptic-to-haptic shape matching. In Experiment 2, size changes impaired performance on both visual-to-haptic and haptic-to-visual shape matching and there was no interaction between the cost of size changes and direction of transfer. Together the unimodal and crossmodal matching results suggest that the same, size-specific perceptual representations underlie both visual and haptic object recognition, and indicate that crossmodal memory for objects must be at least partly based on common perceptual representations. PMID:19956685

  20. Semantic and visual determinants of face recognition in a prosopagnosic patient.

    PubMed

    Dixon, M J; Bub, D N; Arguin, M

    1998-05-01

    Prosopagnosia is the neuropathological inability to recognize familiar people by their faces. It can occur in isolation or can coincide with recognition deficits for other nonface objects. Often, patients whose prosopagnosia is accompanied by object recognition difficulties have more trouble identifying certain categories of objects relative to others. In previous research, we demonstrated that objects that shared multiple visual features and were semantically close posed severe recognition difficulties for a patient with temporal lobe damage. We now demonstrate that this patient's face recognition is constrained by these same parameters. The prosopagnosic patient ELM had difficulties pairing faces to names when the faces shared visual features and the names were semantically related (e.g., Tonya Harding, Nancy Kerrigan, and Josee Chouinard - three ice skaters). He made tenfold fewer errors when the exact same faces were associated with semantically unrelated people (e.g., singer Celine Dion, actress Betty Grable, and First Lady Hillary Clinton). We conclude that prosopagnosia and co-occurring category-specific recognition problems both stem from difficulties disambiguating the stored representations of objects that share multiple visual features and refer to semantically close identities or concepts.

  21. Trajectory Recognition as the Basis for Object Individuation: A Functional Model of Object File Instantiation and Object-Token Encoding

    PubMed Central

    Fields, Chris

    2011-01-01

    The perception of persisting visual objects is mediated by transient intermediate representations, object files, that are instantiated in response to some, but not all, visual trajectories. The standard object file concept does not, however, provide a mechanism sufficient to account for all experimental data on visual object persistence, object tracking, and the ability to perceive spatially disconnected stimuli as continuously existing objects. Based on relevant anatomical, functional, and developmental data, a functional model is constructed that bases visual object individuation on the recognition of temporal sequences of apparent center-of-mass positions that are specifically identified as trajectories by dedicated “trajectory recognition networks” downstream of the medial–temporal motion-detection area. This model is shown to account for a wide range of data, and to generate a variety of testable predictions. Individual differences in the recognition, abstraction, and encoding of trajectory information are expected to generate distinct object persistence judgments and object recognition abilities. Dominance of trajectory information over feature information in stored object tokens during early infancy, in particular, is expected to disrupt the ability to re-identify human and other individuals across perceptual episodes, and lead to developmental outcomes with characteristics of autism spectrum disorders. PMID:21716599

  22. Non-accidental properties, metric invariance, and encoding by neurons in a model of ventral stream visual object recognition, VisNet.

    PubMed

    Rolls, Edmund T; Mills, W Patrick C

    2018-05-01

    When objects transform into different views, some properties are maintained, such as whether the edges are convex or concave, and these non-accidental properties are likely to be important in view-invariant object recognition. The metric properties, such as the degree of curvature, may change with different views and are less likely to be useful in object recognition. It is shown that in a model of invariant visual object recognition in the ventral visual stream, VisNet, non-accidental properties are encoded much more than metric properties by neurons. Moreover, it is shown how, with temporal trace rule training in VisNet, non-accidental properties of objects become encoded by neurons and how metric properties are treated invariantly. We also show how VisNet can generalize between different objects if they have the same non-accidental property, because the metric properties are likely to overlap. VisNet is a 4-layer unsupervised model of visual object recognition trained by competitive learning that utilizes a temporal trace learning rule to implement the learning of invariance using views that occur close together in time. A second crucial property of this model concerns whether, when neurons in the level corresponding to the inferior temporal visual cortex respond selectively to objects, neurons in the intermediate layers can respond to combinations of features that may be parts of two or more objects. In an investigation using the four sides of a square presented in every possible combination, it was shown that even though different layer 4 neurons are tuned to encode each feature or feature combination orthogonally, neurons in the intermediate layers can respond to features or feature combinations present in several objects. This property is an important part of the way in which high capacity can be achieved in the four-layer ventral visual cortical pathway. These findings concerning non-accidental properties and the use of neurons in intermediate layers of the hierarchy help to emphasise fundamental underlying principles of the computations that may be implemented in the ventral cortical visual stream used in object recognition. Copyright © 2018 Elsevier Inc. All rights reserved.
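The temporal trace rule at the heart of VisNet can be sketched compactly: the post-synaptic term in the Hebbian update is a decaying trace of recent activity, so views shown close together in time tend to strengthen onto the same output unit. A toy sketch (the parameter values, single layer, and hard winner-take-all competition are illustrative simplifications of the full four-layer model):

```python
import numpy as np

def trace_rule_train(seqs, n_out=4, eta=0.8, alpha=0.05, epochs=20, seed=0):
    """Competitive learning with a temporal trace (VisNet-style sketch).

    Each inner list in `seqs` holds one object's views in temporal order;
    the trace `ybar` carries a unit's recent activity across successive
    views, so the same unit keeps learning as the object's view changes.
    """
    rng = np.random.default_rng(seed)
    n_in = len(seqs[0][0])
    W = rng.random((n_out, n_in))
    W /= np.linalg.norm(W, axis=1, keepdims=True)        # unit-norm weight rows
    for _ in range(epochs):
        for seq in seqs:
            ybar = np.zeros(n_out)                        # trace resets between objects
            for view in seq:
                x = np.asarray(view, dtype=float)
                y = np.zeros(n_out)
                y[int(np.argmax(W @ x))] = 1.0            # winner-take-all competition
                ybar = (1 - eta) * y + eta * ybar         # temporal trace of activity
                W += alpha * ybar[:, None] * x[None, :]   # Hebbian update on the trace
                W /= np.linalg.norm(W, axis=1, keepdims=True)
    return W

# Two toy "objects", each seen as two overlapping binary views
obj_a = [[1, 1, 0, 0, 0, 0], [0, 1, 1, 0, 0, 0]]
obj_b = [[0, 0, 0, 1, 1, 0], [0, 0, 0, 0, 1, 1]]
W = trace_rule_train([obj_a, obj_b])
```

Setting `eta = 0` recovers plain competitive learning; with `eta > 0` the update credits units that were active on recent views, which is the mechanism the abstract credits with making non-accidental, view-stable properties dominate the learned encoding.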

  3. Newborn chickens generate invariant object representations at the onset of visual object experience

    PubMed Central

    Wood, Justin N.

    2013-01-01

    To recognize objects quickly and accurately, mature visual systems build invariant object representations that generalize across a range of novel viewing conditions (e.g., changes in viewpoint). To date, however, the origins of this core cognitive ability have not yet been established. To examine how invariant object recognition develops in a newborn visual system, I raised chickens from birth for 2 weeks within controlled-rearing chambers. These chambers provided complete control over all visual object experiences. In the first week of life, subjects’ visual object experience was limited to a single virtual object rotating through a 60° viewpoint range. In the second week of life, I examined whether subjects could recognize that virtual object from novel viewpoints. Newborn chickens were able to generate viewpoint-invariant representations that supported object recognition across large, novel, and complex changes in the object’s appearance. Thus, newborn visual systems can begin building invariant object representations at the onset of visual object experience. These abstract representations can be generated from sparse data, in this case from a visual world containing a single virtual object seen from a limited range of viewpoints. This study shows that powerful, robust, and invariant object recognition machinery is an inherent feature of the newborn brain. PMID:23918372

  4. Change blindness and visual memory: visual representations get rich and act poor.

    PubMed

    Varakin, D Alexander; Levin, Daniel T

    2006-02-01

    Change blindness is often taken as evidence that visual representations are impoverished, while successful recognition of specific objects is taken as evidence that they are richly detailed. In the current experiments, participants performed cover tasks that required each object in a display to be attended. Change detection trials were unexpectedly introduced and surprise recognition tests were given for nonchanging displays. For both change detection and recognition, participants had to distinguish objects from the same basic-level category, making it likely that specific visual information had to be used for successful performance. Although recognition was above chance, incidental change detection usually remained at floor. These results help reconcile demonstrations of poor change detection with demonstrations of good memory because they suggest that the capability to store visual information in memory is not reflected by the visual system's tendency to utilize these representations for purposes of detecting unexpected changes.

  5. Image processing strategies based on saliency segmentation for object recognition under simulated prosthetic vision.

    PubMed

    Li, Heng; Su, Xiaofan; Wang, Jing; Kan, Han; Han, Tingting; Zeng, Yajie; Chai, Xinyu

    2018-01-01

    Current retinal prostheses can only generate low-resolution visual percepts composed of a limited number of phosphenes, elicited by an electrode array with uncontrollable color and restricted grayscale. With this level of visual perception, prosthetic recipients can complete only simple visual tasks; more complex tasks such as face identification and object recognition are extremely difficult. It is therefore necessary to investigate and apply image processing strategies that optimize the visual perception of recipients. This study focuses on recognition of the object of interest under simulated prosthetic vision. We used a saliency segmentation method based on a biologically plausible graph-based visual saliency model and a GrabCut-based self-adaptive iterative optimization framework to automatically extract foreground objects. Two image processing strategies, Addition of Separate Pixelization and Background Pixel Shrink, were then applied to enhance the extracted foreground objects. i) Psychophysical experiments verified that, under simulated prosthetic vision, both strategies had marked advantages over Direct Pixelization in terms of recognition accuracy and efficiency. ii) We also found that recognition performance under the two strategies depended on the segmentation results and was positively affected by paired, interrelated objects in the scene. The use of the saliency segmentation method and image processing strategies can automatically extract and enhance foreground objects, and significantly improve object recognition performance for recipients implanted with a high-density implant. Copyright © 2017 Elsevier B.V. All rights reserved.
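
    The Direct Pixelization baseline that this record compares against can be sketched as a simple block-averaging reduction of an image to a phosphene grid. This is a hedged illustration only; the grid size and the block-averaging choice are assumptions, not the authors' exact rendering pipeline.

```python
import numpy as np

def direct_pixelization(img, grid=(32, 32)):
    """Map a 2-D grayscale image onto a low-resolution phosphene array
    by averaging pixel intensities within each grid cell."""
    h, w = img.shape
    gh, gw = grid
    out = np.zeros(grid)
    for i in range(gh):
        for j in range(gw):
            block = img[i * h // gh:(i + 1) * h // gh,
                        j * w // gw:(j + 1) * w // gw]
            out[i, j] = block.mean()
    return out

# Example: a 64x64 image collapses to a 32x32 phosphene representation.
phosphenes = direct_pixelization(np.ones((64, 64)), grid=(32, 32))
```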

  6. Impaired recognition of faces and objects in dyslexia: Evidence for ventral stream dysfunction?

    PubMed

    Sigurdardottir, Heida Maria; Ívarsson, Eysteinn; Kristinsdóttir, Kristjana; Kristjánsson, Árni

    2015-09-01

    The objective of this study was to establish whether or not dyslexics are impaired at the recognition of faces and other complex nonword visual objects. This would be expected based on a meta-analysis revealing that children and adult dyslexics show functional abnormalities within the left fusiform gyrus, a brain region high up in the ventral visual stream, which is thought to support the recognition of words, faces, and other objects. 20 adult dyslexics (M = 29 years) and 20 matched typical readers (M = 29 years) participated in the study. One dyslexic-typical reader pair was excluded based on Adult Reading History Questionnaire scores and IS-FORM reading scores. Performance was measured on 3 high-level visual processing tasks: the Cambridge Face Memory Test, the Vanderbilt Holistic Face Processing Test, and the Vanderbilt Expertise Test. People with dyslexia are impaired in their recognition of faces and other visually complex objects. Their holistic processing of faces appears to be intact, suggesting that dyslexics may instead be specifically impaired at part-based processing of visual objects. The difficulty that people with dyslexia experience with reading might be the most salient manifestation of a more general high-level visual deficit. (c) 2015 APA, all rights reserved.

  7. Beyond sensory images: Object-based representation in the human ventral pathway

    PubMed Central

    Pietrini, Pietro; Furey, Maura L.; Ricciardi, Emiliano; Gobbini, M. Ida; Wu, W.-H. Carolyn; Cohen, Leonardo; Guazzelli, Mario; Haxby, James V.

    2004-01-01

    We investigated whether the topographically organized, category-related patterns of neural response in the ventral visual pathway are a representation of sensory images or a more abstract representation of object form that is not dependent on sensory modality. We used functional MRI to measure patterns of response evoked during visual and tactile recognition of faces and manmade objects in sighted subjects and during tactile recognition in blind subjects. Results showed that visual and tactile recognition evoked category-related patterns of response in a ventral extrastriate visual area in the inferior temporal gyrus that were correlated across modality for manmade objects. Blind subjects also demonstrated category-related patterns of response in this “visual” area, and in more ventral cortical regions in the fusiform gyrus, indicating that these patterns are not due to visual imagery and, furthermore, that visual experience is not necessary for category-related representations to develop in these cortices. These results demonstrate that the representation of objects in the ventral visual pathway is not simply a representation of visual images but, rather, is a representation of more abstract features of object form. PMID:15064396

  8. Priming Contour-Deleted Images: Evidence for Intermediate Representations in Visual Object Recognition.

    ERIC Educational Resources Information Center

    Biederman, Irving; Cooper, Eric E.

    1991-01-01

    Speed and accuracy of identification of pictures of objects are facilitated by prior viewing. Contributions of image features, convex or concave components, and object models in a repetition priming task were explored in 2 studies involving 96 college students. Results provide evidence of intermediate representations in visual object recognition.…

  9. Spatiotemporal dynamics underlying object completion in human ventral visual cortex.

    PubMed

    Tang, Hanlin; Buia, Calin; Madhavan, Radhika; Crone, Nathan E; Madsen, Joseph R; Anderson, William S; Kreiman, Gabriel

    2014-08-06

    Natural vision often involves recognizing objects from partial information. Recognition of objects from parts presents a significant challenge for theories of vision because it requires spatial integration and extrapolation from prior knowledge. Here we recorded intracranial field potentials of 113 visually selective electrodes from epilepsy patients in response to whole and partial objects. Responses along the ventral visual stream, particularly the inferior occipital and fusiform gyri, remained selective even when only 9%-25% of the object area was shown. However, these visually selective signals emerged ∼100 ms later for partial versus whole objects. These processing delays were particularly pronounced in higher visual areas within the ventral stream. This latency difference persisted when controlling for changes in contrast, signal amplitude, and the strength of selectivity. These results argue against a purely feedforward explanation of recognition from partial information, and provide spatiotemporal constraints on theories of object recognition that involve recurrent processing. Copyright © 2014 Elsevier Inc. All rights reserved.

  10. Optimization of Visual Information Presentation for Visual Prosthesis.

    PubMed

    Guo, Fei; Yang, Yuan; Gao, Yong

    2018-01-01

    Visual prostheses, which apply electrical stimulation to restore visual function to the blind, have promising prospects. However, because of the low resolution, limited visual field, and low dynamic range of the elicited visual perception, a large amount of information is lost when daily scenes are presented. The ability of prosthetic users to recognize objects in real-life scenarios is therefore severely restricted. To overcome these limitations, optimizing the visual information in simulated prosthetic vision has been a focus of research. This paper proposes two image processing strategies based on a salient object detection technique, which enable prosthetic implants to focus on the object of interest and suppress background clutter. Psychophysical experiments show that foreground zooming with background clutter removal and foreground edge detection with background reduction both have positive impacts on object recognition in simulated prosthetic vision. By using edge detection and zooming techniques, the two processing strategies significantly improve object recognition accuracy. We conclude that a visual prosthesis using our proposed strategies can help the blind improve their ability to recognize objects. The results provide effective solutions for the further development of visual prostheses.

  11. Optimization of Visual Information Presentation for Visual Prosthesis

    PubMed Central

    Gao, Yong

    2018-01-01

    Visual prostheses, which apply electrical stimulation to restore visual function to the blind, have promising prospects. However, because of the low resolution, limited visual field, and low dynamic range of the elicited visual perception, a large amount of information is lost when daily scenes are presented. The ability of prosthetic users to recognize objects in real-life scenarios is therefore severely restricted. To overcome these limitations, optimizing the visual information in simulated prosthetic vision has been a focus of research. This paper proposes two image processing strategies based on a salient object detection technique, which enable prosthetic implants to focus on the object of interest and suppress background clutter. Psychophysical experiments show that foreground zooming with background clutter removal and foreground edge detection with background reduction both have positive impacts on object recognition in simulated prosthetic vision. By using edge detection and zooming techniques, the two processing strategies significantly improve object recognition accuracy. We conclude that a visual prosthesis using our proposed strategies can help the blind improve their ability to recognize objects. The results provide effective solutions for the further development of visual prostheses. PMID:29731769
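
    The foreground edge detection step mentioned in this record can be sketched with a standard Sobel gradient-magnitude operator. This is a hedged illustration only; the actual prosthesis pipeline, kernel choice, and threshold are not specified in the abstract and are assumptions here.

```python
import numpy as np

def sobel_edges(img, thresh=0.5):
    """Return a boolean edge map: pixels whose Sobel gradient magnitude
    exceeds thresh * (maximum magnitude)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = (patch * kx).sum()  # horizontal gradient
            gy[i, j] = (patch * ky).sum()  # vertical gradient
    mag = np.hypot(gx, gy)
    if mag.max() == 0:
        return np.zeros_like(mag, dtype=bool)  # flat image: no edges
    return mag > thresh * mag.max()

# Example: a vertical luminance step yields edges only near the boundary.
img = np.zeros((10, 10))
img[:, 5:] = 1.0
edges = sobel_edges(img)
```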

  12. View Combination: A Generalization Mechanism for Visual Recognition

    ERIC Educational Resources Information Center

    Friedman, Alinda; Waller, David; Thrash, Tyler; Greenauer, Nathan; Hodgson, Eric

    2011-01-01

    We examined whether view combination mechanisms shown to underlie object and scene recognition can integrate visual information across views that have little or no three-dimensional information at either the object or scene level. In three experiments, people learned four "views" of a two-dimensional visual array derived from a three-dimensional…

  13. Contributions of Low and High Spatial Frequency Processing to Impaired Object Recognition Circuitry in Schizophrenia

    PubMed Central

    Calderone, Daniel J.; Hoptman, Matthew J.; Martínez, Antígona; Nair-Collins, Sangeeta; Mauro, Cristina J.; Bar, Moshe; Javitt, Daniel C.; Butler, Pamela D.

    2013-01-01

    Patients with schizophrenia exhibit cognitive and sensory impairment, and object recognition deficits have been linked to sensory deficits. The “frame and fill” model of object recognition posits that low spatial frequency (LSF) information rapidly reaches the prefrontal cortex (PFC) and creates a general shape of an object that feeds back to the ventral temporal cortex to assist object recognition. Visual dysfunction findings in schizophrenia suggest a preferential loss of LSF information. This study used functional magnetic resonance imaging (fMRI) and resting state functional connectivity (RSFC) to investigate the contribution of visual deficits to impaired object “framing” circuitry in schizophrenia. Participants were shown object stimuli that were intact or contained only LSF or high spatial frequency (HSF) information. For controls, fMRI revealed preferential activation to LSF information in precuneus, superior temporal, and medial and dorsolateral PFC areas, whereas patients showed a preference for HSF information or no preference. RSFC revealed a lack of connectivity between early visual areas and PFC for patients. These results demonstrate impaired processing of LSF information during object recognition in schizophrenia, with patients instead displaying increased processing of HSF information. This is consistent with findings of a preference for local over global visual information in schizophrenia. PMID:22735157

  14. The Last Meter: Blind Visual Guidance to a Target.

    PubMed

    Manduchi, Roberto; Coughlan, James M

    2014-01-01

    Smartphone apps can use object recognition software to provide information to blind or low vision users about objects in the visual environment. A crucial challenge for these users is aiming the camera properly to take a well-framed picture of the desired target object. We investigate the effects of two fundamental constraints of object recognition - frame rate and camera field of view - on a blind person's ability to use an object recognition smartphone app. The app was used by 18 blind participants to find visual targets beyond arm's reach and approach them to within 30 cm. While we expected that a faster frame rate or wider camera field of view should always improve search performance, our experimental results show that in many cases increasing the field of view does not help, and may even hurt, performance. These results have important implications for the design of object recognition systems for blind users.

  15. It's all connected: Pathways in visual object recognition and early noun learning.

    PubMed

    Smith, Linda B

    2013-11-01

    A developmental pathway may be defined as the route, or chain of events, through which a new structure or function forms. For many human behaviors, including object name learning and visual object recognition, these pathways are often complex and multicausal and include unexpected dependencies. This article presents three principles of development that suggest the value of a developmental psychology that explicitly seeks to trace these pathways and uses empirical evidence on developmental dependencies among motor development, action on objects, visual object recognition, and object name learning in 12- to 24-month-old infants to make the case. The article concludes with a consideration of the theoretical implications of this approach. (PsycINFO Database Record (c) 2013 APA, all rights reserved).

  16. Recognition-induced forgetting of faces in visual long-term memory.

    PubMed

    Rugo, Kelsi F; Tamler, Kendall N; Woodman, Geoffrey F; Maxcey, Ashleigh M

    2017-10-01

    Despite more than a century of evidence that long-term memory for pictures and words are different, much of what we know about memory comes from studies using words. Recent research examining visual long-term memory has demonstrated that recognizing an object induces the forgetting of objects from the same category. This recognition-induced forgetting has been shown with a variety of everyday objects. However, unlike everyday objects, faces are objects of expertise. As a result, faces may be immune to recognition-induced forgetting. However, despite excellent memory for such stimuli, we found that faces were susceptible to recognition-induced forgetting. Our findings have implications for how models of human memory account for recognition-induced forgetting as well as represent objects of expertise and consequences for eyewitness testimony and the justice system.

  17. A Novel Locally Linear KNN Method With Applications to Visual Recognition.

    PubMed

    Liu, Qingfeng; Liu, Chengjun

    2017-09-01

    A locally linear K Nearest Neighbor (LLK) method is presented in this paper with applications to robust visual recognition. Specifically, the concept of an ideal representation is first presented, which improves upon the traditional sparse representation in many ways. The objective function based on a host of criteria for sparsity, locality, and reconstruction is then optimized to derive a novel representation, which is an approximation to the ideal representation. The novel representation is further processed by two classifiers, namely, an LLK-based classifier and a locally linear nearest mean-based classifier, for visual recognition. The proposed classifiers are shown to connect to the Bayes decision rule for minimum error. Additional new theoretical analysis is presented, such as the nonnegative constraint, the group regularization, and the computational efficiency of the proposed LLK method. New methods such as a shifted power transformation for improving reliability, a coefficients' truncating method for enhancing generalization, and an improved marginal Fisher analysis method for feature extraction are proposed to further improve visual recognition performance. Extensive experiments are implemented to evaluate the proposed LLK method for robust visual recognition. In particular, eight representative data sets are applied for assessing the performance of the LLK method for various visual recognition applications, such as action recognition, scene recognition, object recognition, and face recognition.
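
    The locally linear nearest mean-based classifier mentioned in this record can be illustrated with a simplified sketch: for each class, take the K nearest training samples to the query and classify by distance to their local mean. This is a hedged toy version only; the paper's full method adds the sparsity, locality, and reconstruction objective, which this sketch omits.

```python
import numpy as np

def ll_nearest_mean(X_train, y_train, x, k=3):
    """Classify query x by the class whose K nearest training samples
    have a local mean closest to x."""
    best, best_dist = None, np.inf
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]
        d = np.linalg.norm(Xc - x, axis=1)          # distances to class-c samples
        local_mean = Xc[np.argsort(d)[:k]].mean(axis=0)  # mean of K nearest
        dist = np.linalg.norm(local_mean - x)
        if dist < best_dist:
            best, best_dist = c, dist
    return best

# Example: two well-separated 2-D clusters.
X = np.array([[0, 0], [0.1, 0], [0, 0.1], [5, 5], [5.1, 5], [5, 5.1]], float)
y = np.array([0, 0, 0, 1, 1, 1])
label = ll_nearest_mean(X, y, np.array([0.05, 0.05]))  # -> 0
```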

  18. Recognition-induced forgetting is not due to category-based set size.

    PubMed

    Maxcey, Ashleigh M

    2016-01-01

    What are the consequences of accessing a visual long-term memory representation? Previous work has shown that accessing a long-term memory representation via retrieval improves memory for the targeted item and hurts memory for related items, a phenomenon called retrieval-induced forgetting. Recently we found a similar forgetting phenomenon with recognition of visual objects. Recognition-induced forgetting occurs when practice recognizing an object during a two-alternative forced-choice task, from a group of objects learned at the same time, leads to worse memory for objects from that group that were not practiced. An alternative explanation of this effect is that forgetting is induced by category-based set size rather than by recognition practice. This alternative explanation is possible because during recognition practice subjects make old-new judgments in a two-alternative forced-choice task, and are thus exposed to more objects from practiced categories, potentially inducing forgetting due to set size. Herein I pitted the category-based set size hypothesis against the recognition-induced forgetting hypothesis. To this end, I parametrically manipulated the amount of practice objects received in the recognition-induced forgetting paradigm. If forgetting is due to category-based set size, then the magnitude of forgetting of related objects will increase as the number of practice trials increases. If forgetting is recognition-induced, the set size of exemplars from any given category should not be predictive of memory for practiced objects. Consistent with this latter hypothesis, additional practice systematically improved memory for practiced objects, but did not systematically affect forgetting of related objects. These results firmly establish that recognition practice induces forgetting of related memories. 
Future directions and important real-world applications of using recognition to access our visual memories of previously encountered objects are discussed.

  19. Visual working memory is more tolerant than visual long-term memory.

    PubMed

    Schurgin, Mark W; Flombaum, Jonathan I

    2018-05-07

    Human visual memory is tolerant, meaning that it supports object recognition despite variability across encounters at the image level. Tolerant object recognition remains one capacity in which artificial intelligence trails humans. Typically, tolerance is described as a property of human visual long-term memory (VLTM). In contrast, visual working memory (VWM) is not usually ascribed a role in tolerant recognition, with tests of that system usually demanding discriminatory power: identifying changes, not sameness. There are good reasons to expect that VLTM is more tolerant; functionally, recognition over the long-term must accommodate the fact that objects will not be viewed under identical conditions; and practically, the passive and massive nature of VLTM may impose relatively permissive criteria for thinking that two inputs are the same. But empirically, tolerance has never been compared across working and long-term visual memory. We therefore developed a novel paradigm for equating encoding and test across different memory types. In each experiment trial, participants saw two objects, memory for one tested immediately (VWM) and later for the other (VLTM). VWM performance was better than VLTM and remained robust despite the introduction of image and object variability. In contrast, VLTM performance suffered linearly as more variability was introduced into test stimuli. Additional experiments excluded interference effects as causes for the observed differences. These results suggest the possibility of a previously unidentified role for VWM in the acquisition of tolerant representations for object recognition. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  20. A validated set of tool pictures with matched objects and non-objects for laterality research.

    PubMed

    Verma, Ark; Brysbaert, Marc

    2015-01-01

    Neuropsychological and neuroimaging research has established that knowledge related to tool use and tool recognition is lateralized to the left cerebral hemisphere. Recently, behavioural studies with the visual half-field technique have confirmed the lateralization. A limitation of this research was that different sets of stimuli had to be used for the comparison of tools to other objects and objects to non-objects. Therefore, we developed a new set of stimuli containing matched triplets of tools, other objects and non-objects. With the new stimulus set, we successfully replicated the findings of no visual field advantage for objects in an object recognition task combined with a significant right visual field advantage for tools in a tool recognition task. The set of stimuli is available as supplemental data to this article.

  1. A rodent model for the study of invariant visual object recognition

    PubMed Central

    Zoccolan, Davide; Oertelt, Nadja; DiCarlo, James J.; Cox, David D.

    2009-01-01

    The human visual system is able to recognize objects despite tremendous variation in their appearance on the retina resulting from variation in view, size, lighting, etc. This ability—known as “invariant” object recognition—is central to visual perception, yet its computational underpinnings are poorly understood. Traditionally, nonhuman primates have been the animal model-of-choice for investigating the neuronal substrates of invariant recognition, because their visual systems closely mirror our own. Meanwhile, simpler and more accessible animal models such as rodents have been largely overlooked as possible models of higher-level visual functions, because their brains are often assumed to lack advanced visual processing machinery. As a result, little is known about rodents' ability to process complex visual stimuli in the face of real-world image variation. In the present work, we show that rats possess more advanced visual abilities than previously appreciated. Specifically, we trained pigmented rats to perform a visual task that required them to recognize objects despite substantial variation in their appearance, due to changes in size, view, and lighting. Critically, rats were able to spontaneously generalize to previously unseen transformations of learned objects. These results provide the first systematic evidence for invariant object recognition in rats and argue for an increased focus on rodents as models for studying high-level visual processing. PMID:19429704

  2. Exogenous temporal cues enhance recognition memory in an object-based manner.

    PubMed

    Ohyama, Junji; Watanabe, Katsumi

    2010-11-01

    Exogenous attention enhances the perception of attended items in both a space-based and an object-based manner. Exogenous attention also improves recognition memory for attended items in the space-based mode. However, it has not been examined whether object-based exogenous attention enhances recognition memory. To address this issue, we examined whether a sudden visual change in a task-irrelevant stimulus (an exogenous cue) would affect participants' recognition memory for items that were serially presented around a cued time. The results showed that recognition accuracy for an item was strongly enhanced when the visual cue occurred at the same location and time as the item (Experiments 1 and 2). The memory enhancement effect occurred when the exogenous visual cue and an item belonged to the same object (Experiments 3 and 4) and even when the cue was counterpredictive of the timing of an item to be asked about (Experiment 5). The present study suggests that an exogenous temporal cue automatically enhances the recognition accuracy for an item that is presented at close temporal proximity to the cue and that recognition memory enhancement occurs in an object-based manner.

  3. A Comparison of the Effects of Depth Rotation on Visual and Haptic Three-Dimensional Object Recognition

    ERIC Educational Resources Information Center

    Lawson, Rebecca

    2009-01-01

    A sequential matching task was used to compare how the difficulty of shape discrimination influences the achievement of object constancy for depth rotations across haptic and visual object recognition. Stimuli were nameable, 3-dimensional plastic models of familiar objects (e.g., bed, chair) and morphs midway between these endpoint shapes (e.g., a…

  4. Agnosic vision is like peripheral vision, which is limited by crowding.

    PubMed

    Strappini, Francesca; Pelli, Denis G; Di Pace, Enrico; Martelli, Marialuisa

    2017-04-01

    Visual agnosia is a neuropsychological impairment of visual object recognition despite near-normal acuity and visual fields. A century of research has provided only a rudimentary account of the functional damage underlying this deficit. We find that the object-recognition ability of agnosic patients viewing an object directly is like that of normally-sighted observers viewing it indirectly, with peripheral vision. Thus, agnosic vision is like peripheral vision. We obtained 14 visual-object-recognition tests that are commonly used for diagnosis of visual agnosia. Our "standard" normal observer took these tests at various eccentricities in his periphery. Analyzing the published data of 32 apperceptive agnosia patients and a group of 14 posterior cortical atrophy (PCA) patients on these tests, we find that each patient's pattern of object recognition deficits is well characterized by one number, the equivalent eccentricity at which our standard observer's peripheral vision is like the central vision of the agnosic patient. In other words, each agnosic patient's equivalent eccentricity is conserved across tests. Across patients, equivalent eccentricity ranges from 4 to 40 deg, which rates severity of the visual deficit. In normal peripheral vision, the required size to perceive a simple image (e.g., an isolated letter) is limited by acuity, and that for a complex image (e.g., a face or a word) is limited by crowding. In crowding, adjacent simple objects appear unrecognizably jumbled unless their spacing exceeds the crowding distance, which grows linearly with eccentricity. Besides conservation of equivalent eccentricity across object-recognition tests, we also find conservation, from eccentricity to agnosia, of the relative susceptibility of recognition of ten visual tests. These findings show that agnosic vision is like eccentric vision. Whence crowding? 
Peripheral vision, strabismic amblyopia, and possibly apperceptive agnosia are all limited by crowding, making it urgent to know what drives crowding. Acuity does not (Song et al., 2014), but neural density might: neurons per deg² in the crowding-relevant cortical area. Copyright © 2017 Elsevier Ltd. All rights reserved.
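
    The linear growth of crowding distance with eccentricity noted in this record is commonly approximated in the crowding literature by Bouma's rule, which sets the critical spacing at roughly half the eccentricity. That rule and its proportionality constant are not stated in this abstract; the sketch below is an illustration under that assumption.

```python
def critical_spacing_deg(eccentricity_deg, bouma=0.5):
    """Approximate minimum center-to-center spacing (in degrees of visual
    angle) for adjacent objects to escape crowding, assuming the spacing
    grows linearly with eccentricity (Bouma's rule, constant ~0.5)."""
    return bouma * eccentricity_deg

# Example: at 10 deg eccentricity, objects closer than ~5 deg apart
# are expected to crowd one another.
spacing = critical_spacing_deg(10)  # -> 5.0
```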

  5. When apperceptive agnosia is explained by a deficit of primary visual processing.

    PubMed

    Serino, Andrea; Cecere, Roberto; Dundon, Neil; Bertini, Caterina; Sanchez-Castaneda, Cristina; Làdavas, Elisabetta

    2014-03-01

    Visual agnosia is a deficit in shape perception, affecting figure, object, face and letter recognition. Agnosia is usually attributed to lesions to high-order modules of the visual system, which combine visual cues to represent the shape of objects. However, most previously reported agnosia cases presented visual field (VF) defects and poor primary visual processing. The present case study aims to verify whether form agnosia could be explained by a deficit in basic visual functions, rather than by a deficit in high-order shape recognition. Patient SDV suffered a bilateral lesion of the occipital cortex due to anoxia. When tested, he could navigate, interact with others, and was autonomous in daily life activities. However, he could not recognize objects from drawings and figures, read, or recognize familiar faces. He was able to recognize objects by touch and people from their voice. Assessments of visual functions showed blindness at the centre of the VF, up to almost 5°, bilaterally, with better stimulus detection in the periphery. Colour and motion perception was preserved. Psychophysical experiments showed that SDV's visual recognition deficits were not explained by poor spatial acuity or by the crowding effect. Rather, a severe deficit in line orientation processing might be a key mechanism explaining SDV's agnosia. Line orientation processing is a basic function of primary visual cortex neurons, necessary for detecting "edges" of visual stimuli to build up a "primal sketch" for object recognition. We propose, therefore, that some forms of visual agnosia may be explained by deficits in basic visual functions due to widespread lesions of the primary visual areas, affecting primary levels of visual processing. Copyright © 2013 Elsevier Ltd. All rights reserved.

  6. Breaking object correspondence across saccades impairs object recognition: The role of color and luminance.

    PubMed

    Poth, Christian H; Schneider, Werner X

    2016-09-01

    Rapid saccadic eye movements bring the foveal region of the eye's retina onto objects for high-acuity vision. Saccades change the location and resolution of objects' retinal images. To perceive objects as visually stable across saccades, correspondence between the objects before and after the saccade must be established. We have previously shown that breaking object correspondence across the saccade causes a decrement in object recognition (Poth, Herwig, & Schneider, 2015). Color and luminance can establish object correspondence, but it is unknown how these surface features contribute to transsaccadic visual processing. Here, we investigated whether changing color and luminance together, or color alone, across saccades impairs postsaccadic object recognition. Participants made saccades to peripheral objects, which either maintained or changed their surface features across the saccade. After the saccade, participants briefly viewed a letter within the saccade target object (terminated by a pattern mask). Postsaccadic object recognition was assessed as participants' accuracy in reporting the letter. Experiment A used the colors green and red with different luminances as surface features; Experiment B used blue and yellow with approximately the same luminances. Changing the surface features across the saccade deteriorated postsaccadic object recognition in both experiments. These findings reveal a link between object recognition and object correspondence relying on the surface features color and luminance, which is currently not addressed in theories of transsaccadic perception. We interpret the findings within a recent theory ascribing this link to visual attention (Schneider, 2013).

  7. Preserved Haptic Shape Processing after Bilateral LOC Lesions.

    PubMed

    Snow, Jacqueline C; Goodale, Melvyn A; Culham, Jody C

    2015-10-07

    The visual and haptic perceptual systems are understood to share a common neural representation of object shape. A region thought to be critical for recognizing visual and haptic shape information is the lateral occipital complex (LOC). We investigated whether LOC is essential for haptic shape recognition in humans by studying behavioral responses and brain activation for haptically explored objects in a patient (M.C.) with bilateral lesions of the occipitotemporal cortex, including LOC. Despite severe deficits in recognizing objects using vision, M.C. was able to accurately recognize objects via touch. M.C.'s psychophysical response profile to haptically explored shapes was also indistinguishable from controls. Using fMRI, M.C. showed no object-selective visual or haptic responses in LOC, but her pattern of haptic activation in other brain regions was remarkably similar to healthy controls. Although LOC is routinely active during visual and haptic shape recognition tasks, it is not essential for haptic recognition of object shape. The lateral occipital complex (LOC) is a brain region regarded as critical for recognizing object shape, both in vision and in touch. However, causal evidence linking LOC with haptic shape processing is lacking. We studied recognition performance, psychophysical sensitivity, and brain response to touched objects, in a patient (M.C.) with extensive lesions involving LOC bilaterally. Despite being severely impaired in visual shape recognition, M.C. was able to identify objects via touch and she showed normal sensitivity to a haptic shape illusion. M.C.'s brain response to touched objects in areas of undamaged cortex was also very similar to that observed in neurologically healthy controls. These results demonstrate that LOC is not necessary for recognizing objects via touch. Copyright © 2015 the authors 0270-6474/15/3513745-16$15.00/0.

  8. Augmented reality three-dimensional object visualization and recognition with axially distributed sensing.

    PubMed

    Markman, Adam; Shen, Xin; Hua, Hong; Javidi, Bahram

    2016-01-15

    An augmented reality (AR) smartglass display combines real-world scenes with digital information, enabling the rapid growth of AR-based applications. We present an augmented reality-based approach for three-dimensional (3D) optical visualization and object recognition using axially distributed sensing (ADS). For object recognition, the 3D scene is reconstructed, and feature extraction is performed by calculating the histogram of oriented gradients (HOG) of a sliding window. A support vector machine (SVM) is then used for classification. Once an object has been identified, the 3D reconstructed scene with the detected object is optically displayed in the smartglasses allowing the user to see the object, remove partial occlusions of the object, and provide critical information about the object such as 3D coordinates, which are not possible with conventional AR devices. To the best of our knowledge, this is the first report on combining axially distributed sensing with 3D object visualization and recognition for applications to augmented reality. The proposed approach can have benefits for many applications, including medical, military, transportation, and manufacturing.
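
The recognition stage described in this record (HOG descriptors from a sliding window, classified by an SVM) can be sketched as follows. This is an illustrative toy using scikit-image and scikit-learn, not the authors' implementation: the 64x64 window size, the HOG parameters, and the synthetic striped "object" texture are all placeholder assumptions.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

def make_window(label):
    """64x64 grayscale patch: vertical stripes for the 'object', noise otherwise."""
    if label == 1:
        stripes = np.tile(np.repeat([0.0, 1.0], 4), 8)        # length-64 stripe row
        return np.tile(stripes, (64, 1)) + 0.05 * rng.random((64, 64))
    return rng.random((64, 64))

def hog_descriptor(window):
    """Histogram of oriented gradients for one window."""
    return hog(window, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))

# Train a linear SVM on HOG descriptors of object vs. background windows.
X = [hog_descriptor(make_window(label)) for label in [0, 1] * 20]
y = [0, 1] * 20
clf = LinearSVC(C=1.0).fit(np.array(X), y)

def detect(scene, step=32):
    """Slide a 64x64 window over the scene; return the top-left corners of
    windows the SVM classifies as containing the object."""
    hits = []
    for r in range(0, scene.shape[0] - 63, step):
        for c in range(0, scene.shape[1] - 63, step):
            if clf.predict([hog_descriptor(scene[r:r + 64, c:c + 64])])[0] == 1:
                hits.append((r, c))
    return hits
```

Pasting a striped patch into a noise scene and calling `detect` returns the patch location among the hits; the ADS reconstruction and smartglass display stages of the paper are not modeled here.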

  9. [Visual Texture Agnosia in Humans].

    PubMed

    Suzuki, Kyoko

    2015-06-01

    Visual object recognition requires the processing of both geometric and surface properties. Patients with occipital lesions may have visual agnosia, which is impairment in the recognition and identification of visually presented objects primarily through their geometric features. An analogous condition involving the failure to recognize an object by its texture may exist, which can be called visual texture agnosia. Here we present two cases with visual texture agnosia. Case 1 had left homonymous hemianopia and right upper quadrantanopia, along with achromatopsia, prosopagnosia, and texture agnosia, because of damage to his left ventromedial occipitotemporal cortex and right lateral occipito-temporo-parietal cortex due to multiple cerebral embolisms. Although he showed difficulty matching and naming textures of real materials, he could readily name visually presented objects by their contours. Case 2 had right lower quadrantanopia, along with impairment in stereopsis and recognition of texture in 2D images, because of subcortical hemorrhage in the left occipitotemporal region. He failed to recognize shapes based on texture information, whereas shape recognition based on contours was well preserved. Our findings, along with those of three reported cases with texture agnosia, indicate that there are separate channels for processing texture, color, and geometric features, and that the regions around the left collateral sulcus are crucial for texture processing.

  10. Visual agnosia and focal brain injury.

    PubMed

    Martinaud, O

    Visual agnosia encompasses all disorders of visual recognition within a selective visual modality not due to an impairment of elementary visual processing or other cognitive deficit. Based on a sequential dichotomy between the perceptual and memory systems, two different categories of visual object agnosia are usually considered: 'apperceptive agnosia' and 'associative agnosia'. Impaired visual recognition within a single category of stimuli is also reported in: (i) visual object agnosia of the ventral pathway, such as prosopagnosia (for faces), pure alexia (for words), or topographagnosia (for landmarks); (ii) visual spatial agnosia of the dorsal pathway, such as cerebral akinetopsia (for movement), or orientation agnosia (for the placement of objects in space). Focal brain injuries provide a unique opportunity to better understand regional brain function, particularly with the use of effective statistical approaches such as voxel-based lesion-symptom mapping (VLSM). The aim of the present work was twofold: (i) to review the various agnosia categories according to the traditional visual dual-pathway model; and (ii) to better assess the anatomical network underlying visual recognition through lesion-mapping studies correlating neuroanatomical and clinical outcomes. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  11. Aging and solid shape recognition: Vision and haptics.

    PubMed

    Norman, J Farley; Cheeseman, Jacob R; Adkins, Olivia C; Cox, Andrea G; Rogers, Connor E; Dowell, Catherine J; Baxter, Michael W; Norman, Hideko F; Reyes, Cecia M

    2015-10-01

    The ability of 114 younger and older adults to recognize naturally-shaped objects was evaluated in three experiments. The participants viewed or haptically explored six randomly-chosen bell peppers (Capsicum annuum) in a study session and were later required to judge whether each of twelve bell peppers was "old" (previously presented during the study session) or "new" (not presented during the study session). When recognition memory was tested immediately after study (Experiment 1), the younger adults' performance for vision and haptics was identical when the individual study objects were presented once. Vision became superior to haptics, however, when the individual study objects were presented multiple times. When 10- and 20-min delays (Experiment 2) were inserted in between study and test sessions, no significant differences occurred between vision and haptics: recognition performance in both modalities was comparable. When the recognition performance of older adults was evaluated (Experiment 3), a negative effect of age was found for visual shape recognition (younger adults' overall recognition performance was 60% higher). There was no age effect, however, for haptic shape recognition. The results of the present experiments indicate that the visual recognition of natural object shape is different from haptic recognition in multiple ways: visual shape recognition can be superior to that of haptics and is affected by aging, while haptic shape recognition is less accurate and unaffected by aging. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. Comparison of Object Recognition Behavior in Human and Monkey

    PubMed Central

    Rajalingham, Rishi; Schmidt, Kailyn

    2015-01-01

    Although the rhesus monkey is used widely as an animal model of human visual processing, it is not known whether invariant visual object recognition behavior is quantitatively comparable across monkeys and humans. To address this question, we systematically compared the core object recognition behavior of two monkeys with that of human subjects. To test true object recognition behavior (rather than image matching), we generated several thousand naturalistic synthetic images of 24 basic-level objects with high variation in viewing parameters and image background. Monkeys were trained to perform binary object recognition tasks on a match-to-sample paradigm. Data from 605 human subjects performing the same tasks on Mechanical Turk were aggregated to characterize “pooled human” object recognition behavior, as well as 33 separate Mechanical Turk subjects to characterize individual human subject behavior. Our results show that monkeys learn each new object in a few days, after which they not only match mean human performance but show a pattern of object confusion that is highly correlated with pooled human confusion patterns and is statistically indistinguishable from individual human subjects. Importantly, this shared human and monkey pattern of 3D object confusion is not shared with low-level visual representations (pixels, V1+; models of the retina and primary visual cortex) but is shared with a state-of-the-art computer vision feature representation. Together, these results are consistent with the hypothesis that rhesus monkeys and humans share a common neural shape representation that directly supports object perception. SIGNIFICANCE STATEMENT To date, several mammalian species have shown promise as animal models for studying the neural mechanisms underlying high-level visual processing in humans. 
In light of this diversity, making tight comparisons between nonhuman and human primates is particularly critical in determining the best use of nonhuman primates to further the goal of the field of translating knowledge gained from animal models to humans. To the best of our knowledge, this study is the first systematic attempt at comparing a high-level visual behavior of humans and macaque monkeys. PMID:26338324

  13. Feedforward object-vision models only tolerate small image variations compared to human

    PubMed Central

    Ghodrati, Masoud; Farzmahdi, Amirhossein; Rajaei, Karim; Ebrahimpour, Reza; Khaligh-Razavi, Seyed-Mahdi

    2014-01-01

    Invariant object recognition is a remarkable ability of the primate visual system whose underlying mechanism has long been under intense investigation. Computational modeling is a valuable tool toward understanding the processes involved in invariant object recognition. Although recent computational models have shown outstanding performance on challenging image databases, they fail to perform well in image categorization under more complex image variations. Studies have shown that making sparse representations of objects by extracting more informative visual features through a feedforward sweep can lead to higher recognition performance. Here, however, we show that when the complexity of image variations is high, even this approach results in poor performance compared to humans. To assess the performance of models and humans in invariant object recognition tasks, we built a parametrically controlled image database consisting of several object categories varied in different dimensions and levels, rendered from 3D planes. Comparing the performance of several object recognition models with human observers shows that only under low-level image variations do the models perform similarly to humans in categorization tasks. Furthermore, the results of our behavioral experiments demonstrate that, even under difficult experimental conditions (i.e., briefly presented masked stimuli with complex image variations), human observers performed outstandingly well, suggesting that the models are still far from resembling humans in invariant object recognition. Taken together, we suggest that learning sparse informative visual features, although desirable, is not a complete solution for future progress in object-vision modeling. We show that this approach is not of significant help in solving the computational crux of object recognition (i.e., invariant object recognition) when the identity-preserving image variations become more complex. PMID:25100986

  14. Invariant visual object recognition and shape processing in rats

    PubMed Central

    Zoccolan, Davide

    2015-01-01

    Invariant visual object recognition is the ability to recognize visual objects despite the vastly different images that each object can project onto the retina during natural vision, depending on its position and size within the visual field, its orientation relative to the viewer, etc. Achieving invariant recognition represents such a formidable computational challenge that it is often assumed to be a unique hallmark of primate vision. Historically, this has limited the invasive investigation of its neuronal underpinnings to monkey studies, in spite of the narrow range of experimental approaches that these animal models allow. Meanwhile, rodents have been largely neglected as models of object vision, because of the widespread belief that they are incapable of advanced visual processing. However, the powerful array of experimental tools that have been developed to dissect neuronal circuits in rodents has made these species very attractive to vision scientists too, promoting a new tide of studies that have started to systematically explore visual functions in rats and mice. Rats, in particular, have been the subjects of several behavioral studies, aimed at assessing how advanced object recognition and shape processing are in this species. Here, I review these recent investigations, as well as earlier studies of rat pattern vision, to provide an historical overview and a critical summary of the state of knowledge about rat object vision. The picture emerging from this survey is very encouraging with regard to the possibility of using rats as complementary models to monkeys in the study of higher-level vision. PMID:25561421

  15. HD-MTL: Hierarchical Deep Multi-Task Learning for Large-Scale Visual Recognition.

    PubMed

    Fan, Jianping; Zhao, Tianyi; Kuang, Zhenzhong; Zheng, Yu; Zhang, Ji; Yu, Jun; Peng, Jinye

    2017-02-09

    In this paper, a hierarchical deep multi-task learning (HD-MTL) algorithm is developed to support large-scale visual recognition (e.g., recognizing thousands or even tens of thousands of atomic object classes automatically). First, multiple sets of multi-level deep features are extracted from different layers of deep convolutional neural networks (deep CNNs), and they are used to accomplish the coarse-to-fine tasks of hierarchical visual recognition more effectively. A visual tree is then learned by assigning the visually-similar atomic object classes with similar learning complexities into the same group, which can provide a good environment for determining the interrelated learning tasks automatically. By leveraging the inter-task relatedness (inter-class similarities) to learn more discriminative group-specific deep representations, our deep multi-task learning algorithm can train more discriminative node classifiers for distinguishing the visually-similar atomic object classes effectively. Our hierarchical deep multi-task learning (HD-MTL) algorithm can integrate two discriminative regularization terms to control the inter-level error propagation effectively, and it provides an end-to-end approach for jointly learning more representative deep CNNs (for image representation) and a more discriminative tree classifier (for large-scale visual recognition) and updating them simultaneously. Our incremental deep learning algorithms can effectively adapt both the deep CNNs and the tree classifier to new training images and new object classes. Our experimental results demonstrate that the HD-MTL algorithm achieves very competitive accuracy rates for large-scale visual recognition.
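
The visual-tree step this record describes (grouping visually similar classes so that related classes share a node classifier) can be sketched by clustering per-class mean feature vectors. The random features and plain k-means grouping below are illustrative stand-ins, not the paper's learned deep features or its actual tree-building algorithm.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)
n_classes, feat_dim, n_groups = 40, 128, 5

# Placeholder "deep features": one mean feature vector per atomic object class.
# In the paper these would come from deep CNN layers, not random draws.
class_means = rng.normal(size=(n_classes, feat_dim))

# One level of a visual tree: cluster visually similar classes into groups,
# so each group can get its own node classifier in the hierarchy.
kmeans = KMeans(n_clusters=n_groups, n_init=10, random_state=0).fit(class_means)
groups = {g: np.flatnonzero(kmeans.labels_ == g).tolist()
          for g in range(n_groups)}
```

A full HD-MTL tree would repeat this grouping recursively and train a shared multi-task classifier per node; only the single grouping step is shown here.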

  16. Timing, timing, timing: Fast decoding of object information from intracranial field potentials in human visual cortex

    PubMed Central

    Liu, Hesheng; Agam, Yigal; Madsen, Joseph R.; Kreiman, Gabriel

    2010-01-01

    The difficulty of visual recognition stems from the need to achieve high selectivity while maintaining robustness to object transformations within hundreds of milliseconds. Theories of visual recognition differ in whether the neuronal circuits invoke recurrent feedback connections or not. The timing of neurophysiological responses in visual cortex plays a key role in distinguishing between bottom-up and top-down theories. Here we quantified at millisecond resolution the amount of visual information conveyed by intracranial field potentials from 912 electrodes in 11 human subjects. We could decode object category information from human visual cortex in single trials as early as 100 ms post-stimulus. Decoding performance was robust to depth rotation and scale changes. The results suggest that physiological activity in the temporal lobe can account for key properties of visual recognition. The fast decoding in single trials is compatible with feed-forward theories and provides strong constraints for computational models of human vision. PMID:19409272
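
The time-resolved decoding approach this record describes can be sketched on synthetic data: train a classifier on field-potential amplitudes in a sliding time window and ask when object category first becomes decodable. The electrode count, bin width, and the injected ~100 ms latency below are illustrative assumptions, not the study's recordings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_trials, n_electrodes, n_bins = 200, 16, 60      # 5 ms bins -> 0-300 ms
labels = rng.integers(0, 2, n_trials)             # two object categories

# Simulated field potentials: category information appears from bin 20 (~100 ms).
data = rng.normal(size=(n_trials, n_electrodes, n_bins))
data[:, :, 20:] += labels[:, None, None] * 1.5

def decode_at(t, width=2):
    """Cross-validated decoding accuracy from the window of bins ending at t."""
    X = data[:, :, max(0, t - width):t + 1].reshape(n_trials, -1)
    return cross_val_score(LogisticRegression(max_iter=1000), X, labels, cv=5).mean()

accuracy = np.array([decode_at(t) for t in range(n_bins)])
onset_bin = int(np.argmax(accuracy > 0.75))       # earliest reliably decodable bin
```

With this synthetic latency, `onset_bin` recovers the bin where category information was injected; on real recordings the same sweep yields the decoding-latency curve the study reports.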

  17. Emergence of transformation-tolerant representations of visual objects in rat lateral extrastriate cortex

    PubMed Central

    Tafazoli, Sina; Safaai, Houman; De Franceschi, Gioia; Rosselli, Federica Bianca; Vanzella, Walter; Riggi, Margherita; Buffolo, Federica; Panzeri, Stefano; Zoccolan, Davide

    2017-01-01

    Rodents are emerging as increasingly popular models of visual functions. Yet, evidence that rodent visual cortex is capable of advanced visual processing, such as object recognition, is limited. Here we investigate how neurons located along the progression of extrastriate areas that, in the rat brain, run laterally to primary visual cortex, encode object information. We found a progressive functional specialization of neural responses along these areas, with: (1) a sharp reduction of the amount of low-level, energy-related visual information encoded by neuronal firing; and (2) a substantial increase in the ability of both single neurons and neuronal populations to support discrimination of visual objects under identity-preserving transformations (e.g., position and size changes). These findings strongly argue for the existence of a rat object-processing pathway, and point to rodents as promising models to dissect the neuronal circuitry underlying transformation-tolerant recognition of visual objects. DOI: http://dx.doi.org/10.7554/eLife.22794.001 PMID:28395730

  18. Multivariate fMRI and Eye Tracking Reveal Differential Effects of Visual Interference on Recognition Memory Judgments for Objects and Scenes.

    PubMed

    O'Neil, Edward B; Watson, Hilary C; Dhillon, Sonya; Lobaugh, Nancy J; Lee, Andy C H

    2015-09-01

    Recent work has demonstrated that the perirhinal cortex (PRC) supports conjunctive object representations that aid object recognition memory following visual object interference. It is unclear, however, how these representations interact with other brain regions implicated in mnemonic retrieval and how congruent and incongruent interference influences the processing of targets and foils during object recognition. To address this, multivariate partial least squares was applied to fMRI data acquired during an interference match-to-sample task, in which participants made object or scene recognition judgments after object or scene interference. This revealed a pattern of activity sensitive to object recognition following congruent (i.e., object) interference that included PRC, prefrontal, and parietal regions. Moreover, functional connectivity analysis revealed a common pattern of PRC connectivity across interference and recognition conditions. Examination of eye movements during the same task in a separate study revealed that participants gazed more at targets than foils during correct object recognition decisions, regardless of interference congruency. By contrast, participants viewed foils more than targets for incorrect object memory judgments, but only after congruent interference. Our findings suggest that congruent interference makes object foils appear familiar and that a network of regions, including PRC, is recruited to overcome the effects of interference.

  19. Robust selectivity to two-object images in human visual cortex

    PubMed Central

    Agam, Yigal; Liu, Hesheng; Papanastassiou, Alexander; Buia, Calin; Golby, Alexandra J.; Madsen, Joseph R.; Kreiman, Gabriel

    2010-01-01

    We can recognize objects in a fraction of a second in spite of the presence of other objects [1–3]. The responses in macaque areas V4 and inferior temporal cortex [4–15] to a neuron’s preferred stimuli are typically suppressed by the addition of a second object within the receptive field (see however [16, 17]). How can this suppression be reconciled with rapid visual recognition in complex scenes? One option is that certain “special categories” are unaffected by other objects [18] but this leaves the problem unsolved for other categories. Another possibility is that serial attentional shifts help ameliorate the problem of distractor objects [19–21]. Yet, psychophysical studies [1–3], scalp recordings [1] and neurophysiological recordings [14, 16, 22–24] suggest that the initial sweep of visual processing contains a significant amount of information. We recorded intracranial field potentials in human visual cortex during presentation of flashes of two-object images. Visual selectivity from temporal cortex during the initial ~200 ms was largely robust to the presence of other objects. We could train linear decoders on the responses to isolated objects and decode information in two-object images. These observations are compatible with parallel, hierarchical and feed-forward theories of rapid visual recognition [25] and may provide a neural substrate to begin to unravel rapid recognition in natural scenes. PMID:20417105
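
The decoding logic this record describes, training a linear decoder on responses to isolated objects and then testing it on responses to two-object images, can be sketched on simulated data. Modeling the two-object response as the average of the two single-object responses is a simplifying assumption for illustration only, not the study's measured physiology.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n_channels, n_trials = 50, 100
templates = rng.normal(size=(3, n_channels))      # objects A, B and a distractor C

def isolated(obj, n):
    """Simulated population responses to an isolated object."""
    return templates[obj] + 0.5 * rng.normal(size=(n, n_channels))

def paired(obj, n):
    """Two-object response, modeled (simplistically) as the average of the
    object's response and the distractor's response."""
    return (isolated(obj, n) + isolated(2, n)) / 2

# Train a linear decoder (A vs. B) on isolated-object responses only.
X = np.vstack([isolated(0, n_trials), isolated(1, n_trials)])
y = np.repeat([0, 1], n_trials)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Test the same decoder on two-object images containing the distractor.
X_pair = np.vstack([paired(0, n_trials), paired(1, n_trials)])
pair_accuracy = (clf.predict(X_pair) == np.repeat([0, 1], n_trials)).mean()
```

Because the averaged response still carries half of the target's template, the decoder trained only on isolated objects transfers to the two-object case, which is the qualitative point of the study's analysis.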

  20. Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition

    PubMed Central

    Cadieu, Charles F.; Hong, Ha; Yamins, Daniel L. K.; Pinto, Nicolas; Ardila, Diego; Solomon, Ethan A.; Majaj, Najib J.; DiCarlo, James J.

    2014-01-01

    The primate visual system achieves remarkable visual object recognition performance even in brief presentations, and under changes to object exemplar, geometric transformations, and background variation (a.k.a. core visual object recognition). This remarkable performance is mediated by the representation formed in inferior temporal (IT) cortex. In parallel, recent advances in machine learning have led to ever higher performing models of object recognition using artificial deep neural networks (DNNs). It remains unclear, however, whether the representational performance of DNNs rivals that of the brain. A major difficulty in producing such a comparison accurately has been the lack of a unifying metric that accounts for experimental limitations, such as the amount of noise, the number of neural recording sites, and the number of trials, and computational limitations, such as the complexity of the decoding classifier and the number of classifier training examples. In this work, we perform a direct comparison that corrects for these experimental limitations and computational considerations. As part of our methodology, we propose an extension of “kernel analysis” that measures the generalization accuracy as a function of representational complexity. Our evaluations show that, unlike previous bio-inspired models, the latest DNNs rival the representational performance of IT cortex on this visual object recognition task. Furthermore, we show that models that perform well on measures of representational performance also perform well on measures of representational similarity to IT, and on measures of predicting individual IT multi-unit responses. Whether these DNNs rely on computational mechanisms similar to the primate visual system is yet to be determined, but, unlike all previous bio-inspired models, that possibility cannot be ruled out merely on representational performance grounds. PMID:25521294

  1. Evidence for perceptual deficits in associative visual (prosop)agnosia: a single-case study.

    PubMed

    Delvenne, Jean François; Seron, Xavier; Coyette, Françoise; Rossion, Bruno

    2004-01-01

    Associative visual agnosia is classically defined as normal visual perception stripped of its meaning [Archiv für Psychiatrie und Nervenkrankheiten 21 (1890) 22/English translation: Cognitive Neuropsychol. 5 (1988) 155]: these patients cannot access their stored visual memories to categorize the objects they nonetheless perceive correctly. However, according to an influential theory of visual agnosia [Farah, Visual Agnosia: Disorders of Object Recognition and What They Tell Us about Normal Vision, MIT Press, Cambridge, MA, 1990], visual associative agnosics necessarily present perceptual deficits that are the cause of their impairment at object recognition. Here we report a detailed investigation of a patient, NS, with bilateral occipito-temporal lesions who is strongly impaired at object and face recognition. NS presents normal copying of drawings, and normal performance at object and face matching tasks as used in classical neuropsychological tests. However, when tested with several computer tasks using carefully controlled visual stimuli and taking both his accuracy rate and response times into account, NS was found to have abnormal performance at high-level visual processing of objects and faces. Albeit presenting a different pattern of deficits than previously described in integrative agnosic patients such as HJA and LH, his deficits were characterized by an inability to integrate individual parts into a whole percept, as suggested by his failure at processing structurally impossible three-dimensional (3D) objects, an absence of face inversion effects, and an advantage at detecting and matching single parts. Taken together, these observations question the idea of separate visual representations for object/face perception and object/face knowledge derived from investigations of visual associative (prosop)agnosia, and they raise some methodological issues in the analysis of single-case studies of (prosop)agnosic patients.

  2. Computing with Connections in Visual Recognition of Origami Objects.

    ERIC Educational Resources Information Center

    Sabbah, Daniel

    1985-01-01

    Summarizes an initial foray into tackling artificial intelligence problems using a connectionist approach. The task chosen is visual recognition of Origami objects, and the questions answered are how to construct a connectionist network to represent and recognize projected Origami line drawings and the advantages such an approach would have. (30…

  3. Developmental Changes in Visual Object Recognition between 18 and 24 Months of Age

    ERIC Educational Resources Information Center

    Pereira, Alfredo F.; Smith, Linda B.

    2009-01-01

    Two experiments examined developmental changes in children's visual recognition of common objects during the period of 18 to 24 months. Experiment 1 examined children's ability to recognize common category instances that presented three different kinds of information: (1) richly detailed and prototypical instances that presented both local and…

  4. Simple Learned Weighted Sums of Inferior Temporal Neuronal Firing Rates Accurately Predict Human Core Object Recognition Performance

    PubMed Central

    Hong, Ha; Solomon, Ethan A.; DiCarlo, James J.

    2015-01-01

    To go beyond qualitative models of the biological substrate of object recognition, we ask: can a single ventral stream neuronal linking hypothesis quantitatively account for core object recognition performance over a broad range of tasks? We measured human performance in 64 object recognition tests using thousands of challenging images that explore shape similarity and identity preserving object variation. We then used multielectrode arrays to measure neuronal population responses to those same images in visual areas V4 and inferior temporal (IT) cortex of monkeys and simulated V1 population responses. We tested leading candidate linking hypotheses and control hypotheses, each postulating how ventral stream neuronal responses underlie object recognition behavior. Specifically, for each hypothesis, we computed the predicted performance on the 64 tests and compared it with the measured pattern of human performance. All tested hypotheses based on low- and mid-level visually evoked activity (pixels, V1, and V4) were very poor predictors of the human behavioral pattern. However, simple learned weighted sums of distributed average IT firing rates exactly predicted the behavioral pattern. More elaborate linking hypotheses relying on IT trial-by-trial correlational structure, finer IT temporal codes, or ones that strictly respect the known spatial substructures of IT (“face patches”) did not improve predictive power. Although these results do not reject those more elaborate hypotheses, they suggest a simple, sufficient quantitative model: each object recognition task is learned from the spatially distributed mean firing rates (100 ms) of ∼60,000 IT neurons and is executed as a simple weighted sum of those firing rates. SIGNIFICANCE STATEMENT We sought to go beyond qualitative models of visual object recognition and determine whether a single neuronal linking hypothesis can quantitatively account for core object recognition behavior. 
To achieve this, we designed a database of images for evaluating object recognition performance. We used multielectrode arrays to characterize hundreds of neurons in the visual ventral stream of nonhuman primates and measured the object recognition performance of >100 human observers. Remarkably, we found that simple learned weighted sums of firing rates of neurons in monkey inferior temporal (IT) cortex accurately predicted human performance. Although previous work led us to expect that IT would outperform V4, we were surprised by the quantitative precision with which simple IT-based linking hypotheses accounted for human behavior. PMID:26424887
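
The "simple learned weighted sums of IT firing rates" linking hypothesis described above can be illustrated with a toy numpy sketch. Everything here is synthetic and illustrative (the population size, signal structure, and least-squares fit are stand-ins, not the authors' data or method): a linear readout is learned from mean firing rates on training images and applied as a weighted sum to held-out images.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for distributed IT mean firing rates (100 ms window):
# responses of n_neurons to n_images, with a weak category signal embedded.
n_images, n_neurons = 200, 500
labels = rng.integers(0, 2, n_images)            # two object categories
signal = rng.normal(size=n_neurons)              # category-selective axis
rates = rng.normal(size=(n_images, n_neurons)) + 0.5 * np.outer(labels - 0.5, signal)

# "Simple learned weighted sum": fit linear weights by least squares on a
# training split, then classify held-out images by thresholding the sum.
train, test = np.arange(100), np.arange(100, 200)
w, *_ = np.linalg.lstsq(rates[train], labels[train] - 0.5, rcond=None)
pred = (rates[test] @ w > 0).astype(int)
accuracy = (pred == labels[test]).mean()
print(f"held-out accuracy: {accuracy:.2f}")
```

The point of the sketch is the shape of the hypothesis, not the numbers: each task is solved by one learned weight vector over the population, with no use of trial-by-trial correlations or temporal codes.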

  5. Modeling guidance and recognition in categorical search: bridging human and computer object detection.

    PubMed

    Zelinsky, Gregory J; Peng, Yifan; Berg, Alexander C; Samaras, Dimitris

    2013-10-08

    Search is commonly described as a repeating cycle of guidance to target-like objects, followed by the recognition of these objects as targets or distractors. Are these indeed separate processes using different visual features? We addressed this question by comparing observer behavior to that of support vector machine (SVM) models trained on guidance and recognition tasks. Observers searched for a categorically defined teddy bear target in four-object arrays. Target-absent trials consisted of random category distractors rated in their visual similarity to teddy bears. Guidance, quantified as first-fixated objects during search, was strongest for targets, followed by target-similar, medium-similarity, and target-dissimilar distractors. False positive errors to first-fixated distractors also decreased with increasing dissimilarity to the target category. To model guidance, nine teddy bear detectors, using features ranging in biological plausibility, were trained on unblurred bears then tested on blurred versions of the same objects appearing in each search display. Guidance estimates were based on target probabilities obtained from these detectors. To model recognition, nine bear/nonbear classifiers, trained and tested on unblurred objects, were used to classify the object that would be fixated first (based on the detector estimates) as a teddy bear or a distractor. Patterns of categorical guidance and recognition accuracy were modeled almost perfectly by an HMAX model in combination with a color histogram feature. We conclude that guidance and recognition in the context of search are not separate processes mediated by different features, and that what the literature knows as guidance is really recognition performed on blurred objects viewed in the visual periphery.
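
The paper's central manipulation, training classifiers on clear objects and testing them on blurred ones as a proxy for peripheral viewing, can be sketched with a toy example. This is not the authors' pipeline: the 1-D "images", `make_images` helper, and class structure are invented for illustration; only the train-on-unblurred / test-on-blurred logic follows the abstract.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)

# Toy stand-ins for object images: 1-D "images" whose two classes differ
# in the spatial frequency of a dominant pattern (bear vs. non-bear).
def make_images(n, freq):
    t = np.linspace(0, 1, 64)
    return np.sin(2 * np.pi * freq * t) + 0.3 * rng.normal(size=(n, 64))

bears, others = make_images(100, 3), make_images(100, 9)
X = np.vstack([bears, others])
y = np.repeat([1, 0], 100)

# "Recognition" model: SVM trained and tested on unblurred objects.
clf = LinearSVC(C=1.0).fit(X[::2], y[::2])
rec_acc = clf.score(X[1::2], y[1::2])

# "Guidance" model: same training, but tested on blurred versions,
# standing in for objects viewed in the low-acuity periphery.
X_blur = gaussian_filter1d(X, sigma=2.0, axis=1)
guid_acc = clf.score(X_blur[1::2], y[1::2])
print(f"recognition acc: {rec_acc:.2f}, guidance acc: {guid_acc:.2f}")
```

Under the paper's conclusion, the two accuracies should track each other: guidance is the same recognition computation applied to a degraded input, not a separate process with different features.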

  6. Modeling guidance and recognition in categorical search: Bridging human and computer object detection

    PubMed Central

    Zelinsky, Gregory J.; Peng, Yifan; Berg, Alexander C.; Samaras, Dimitris

    2013-01-01

    Search is commonly described as a repeating cycle of guidance to target-like objects, followed by the recognition of these objects as targets or distractors. Are these indeed separate processes using different visual features? We addressed this question by comparing observer behavior to that of support vector machine (SVM) models trained on guidance and recognition tasks. Observers searched for a categorically defined teddy bear target in four-object arrays. Target-absent trials consisted of random category distractors rated in their visual similarity to teddy bears. Guidance, quantified as first-fixated objects during search, was strongest for targets, followed by target-similar, medium-similarity, and target-dissimilar distractors. False positive errors to first-fixated distractors also decreased with increasing dissimilarity to the target category. To model guidance, nine teddy bear detectors, using features ranging in biological plausibility, were trained on unblurred bears then tested on blurred versions of the same objects appearing in each search display. Guidance estimates were based on target probabilities obtained from these detectors. To model recognition, nine bear/nonbear classifiers, trained and tested on unblurred objects, were used to classify the object that would be fixated first (based on the detector estimates) as a teddy bear or a distractor. Patterns of categorical guidance and recognition accuracy were modeled almost perfectly by an HMAX model in combination with a color histogram feature. We conclude that guidance and recognition in the context of search are not separate processes mediated by different features, and that what the literature knows as guidance is really recognition performed on blurred objects viewed in the visual periphery. PMID:24105460

  7. An optimized content-aware image retargeting method: toward expanding the perceived visual field of the high-density retinal prosthesis recipients

    NASA Astrophysics Data System (ADS)

    Li, Heng; Zeng, Yajie; Lu, Zhuofan; Cao, Xiaofei; Su, Xiaofan; Sui, Xiaohong; Wang, Jing; Chai, Xinyu

    2018-04-01

Objective. Retinal prosthesis devices have shown great value in restoring some sight for individuals with profoundly impaired vision, but the visual acuity and visual field provided by prostheses greatly limit recipients’ visual experience. In this paper, we employ computer vision approaches to expand the perceptible visual field in patients potentially implanted with a high-density retinal prosthesis, while maintaining visual acuity as much as possible. Approach. We propose an optimized content-aware image retargeting method, introducing salient object detection based on color and intensity-difference contrast, which aims to remap the important information of a scene into a small visual field while preserving its original scale as much as possible. It may improve prosthetic recipients’ perceived visual field and aid in performing some visual tasks (e.g. object detection and object recognition). To verify our method, psychophysical experiments on detecting object number and recognizing objects were conducted under simulated prosthetic vision. As controls, we used three other image retargeting techniques: Cropping, Scaling, and seam-assisted shrinkability. Main results. Results show that our method preserves more key features and yields significantly higher recognition accuracy than the other three image retargeting methods under conditions of small visual field and low resolution. Significance. The proposed method can expand the perceived visual field of prosthesis recipients and improve their object detection and recognition performance. This suggests that our method may provide an effective option for the image processing module in future high-density retinal implants.
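
A minimal sketch of the idea of saliency-driven retargeting can be given in numpy. This is a crude stand-in for the paper's method, not a reimplementation: the scene is synthetic, the center-surround intensity-difference map is a simplification of the color-and-intensity contrast detection described, and the fixed-size window search stands in for the prosthesis's small visual field.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy grayscale scene with one high-contrast "object" off-centre.
scene = 0.1 * rng.random((64, 96))
scene[20:36, 60:80] += 1.0                      # salient object

# Centre-surround intensity-difference contrast: a crude saliency map.
def box_mean(img, r):
    pad = np.pad(img, r, mode="edge")
    out = np.zeros_like(img)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out += pad[r + dy : r + dy + img.shape[0],
                       r + dx : r + dx + img.shape[1]]
    return out / (2 * r + 1) ** 2

saliency = np.abs(box_mean(scene, 2) - box_mean(scene, 8))

# Retarget by choosing the fixed-size window (the prosthesis "visual
# field") that captures the most saliency, rather than a blind centre crop.
h, w = 32, 32
best, best_pos = -1.0, (0, 0)
for y in range(scene.shape[0] - h + 1):
    for x in range(scene.shape[1] - w + 1):
        s = saliency[y : y + h, x : x + w].sum()
        if s > best:
            best, best_pos = s, (y, x)
print("best window top-left:", best_pos)
```

A blind center crop of this scene would miss the object entirely; the saliency-weighted window follows it, which is the behavior the retargeting method is designed to produce.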

  8. Episodic Short-Term Recognition Requires Encoding into Visual Working Memory: Evidence from Probe Recognition after Letter Report

    PubMed Central

    Poth, Christian H.; Schneider, Werner X.

    2016-01-01

    Human vision is organized in discrete processing episodes (e.g., eye fixations or task-steps). Object information must be transmitted across episodes to enable episodic short-term recognition: recognizing whether a current object has been seen in a previous episode. We ask whether episodic short-term recognition presupposes that objects have been encoded into capacity-limited visual working memory (VWM), which retains visual information for report. Alternatively, it could rely on the activation of visual features or categories that occurs before encoding into VWM. We assessed the dependence of episodic short-term recognition on VWM by a new paradigm combining letter report and probe recognition. Participants viewed displays of 10 letters and reported as many as possible after a retention interval (whole report). Next, participants viewed a probe letter and indicated whether it had been one of the 10 letters (probe recognition). In Experiment 1, probe recognition was more accurate for letters that had been encoded into VWM (reported letters) compared with non-encoded letters (non-reported letters). Interestingly, those letters that participants reported in their whole report had been near to one another within the letter displays. This suggests that the encoding into VWM proceeded in a spatially clustered manner. In Experiment 2, participants reported only one of 10 letters (partial report) and probes either referred to this letter, to letters that had been near to it, or far from it. Probe recognition was more accurate for near than for far letters, although none of these letters had to be reported. These findings indicate that episodic short-term recognition is constrained to a small number of simultaneously presented objects that have been encoded into VWM. PMID:27713722

  9. Episodic Short-Term Recognition Requires Encoding into Visual Working Memory: Evidence from Probe Recognition after Letter Report.

    PubMed

    Poth, Christian H; Schneider, Werner X

    2016-01-01

    Human vision is organized in discrete processing episodes (e.g., eye fixations or task-steps). Object information must be transmitted across episodes to enable episodic short-term recognition: recognizing whether a current object has been seen in a previous episode. We ask whether episodic short-term recognition presupposes that objects have been encoded into capacity-limited visual working memory (VWM), which retains visual information for report. Alternatively, it could rely on the activation of visual features or categories that occurs before encoding into VWM. We assessed the dependence of episodic short-term recognition on VWM by a new paradigm combining letter report and probe recognition. Participants viewed displays of 10 letters and reported as many as possible after a retention interval (whole report). Next, participants viewed a probe letter and indicated whether it had been one of the 10 letters (probe recognition). In Experiment 1, probe recognition was more accurate for letters that had been encoded into VWM (reported letters) compared with non-encoded letters (non-reported letters). Interestingly, those letters that participants reported in their whole report had been near to one another within the letter displays. This suggests that the encoding into VWM proceeded in a spatially clustered manner. In Experiment 2, participants reported only one of 10 letters (partial report) and probes either referred to this letter, to letters that had been near to it, or far from it. Probe recognition was more accurate for near than for far letters, although none of these letters had to be reported. These findings indicate that episodic short-term recognition is constrained to a small number of simultaneously presented objects that have been encoded into VWM.

  10. The role of color information on object recognition: a review and meta-analysis.

    PubMed

    Bramão, Inês; Reis, Alexandra; Petersson, Karl Magnus; Faísca, Luís

    2011-09-01

    In this study, we systematically review the scientific literature on the effect of color on object recognition. Thirty-five independent experiments, comprising 1535 participants, were included in a meta-analysis. We found a moderate effect of color on object recognition (d=0.28). Specific effects of moderator variables were analyzed and we found that color diagnosticity is the factor with the greatest moderator effect on the influence of color in object recognition; studies using color diagnostic objects showed a significant color effect (d=0.43), whereas a marginal color effect was found in studies that used non-color diagnostic objects (d=0.18). The present study did not permit the drawing of specific conclusions about the moderator effect of the object recognition task; while the meta-analytic review showed that color information improves object recognition mainly in studies using naming tasks (d=0.36), the literature review revealed a large body of evidence showing positive effects of color information on object recognition in studies using a large variety of visual recognition tasks. We also found that color is important for the ability to recognize artifacts and natural objects, to recognize objects presented as types (line-drawings) or as tokens (photographs), and to recognize objects that are presented without surface details, such as texture or shadow. Taken together, the results of the meta-analysis strongly support the contention that color plays a role in object recognition. This suggests that the role of color should be taken into account in models of visual object recognition. Copyright © 2011 Elsevier B.V. All rights reserved.
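
The effect-size machinery behind such a meta-analysis is straightforward to sketch. The per-study numbers below are hypothetical placeholders (the real analysis pooled 35 experiments with 1535 participants); the sketch shows Cohen's d and a standard fixed-effect inverse-variance pooling, not the authors' exact computation.

```python
import numpy as np

# Hypothetical per-study summaries; values are illustrative only.
studies = [
    # (mean_color, mean_gray, pooled_sd, n_per_group)
    (0.82, 0.74, 0.20, 24),
    (0.65, 0.61, 0.15, 40),
    (0.90, 0.78, 0.25, 16),
]

def cohens_d(m1, m2, sd):
    # Standardized mean difference between color and grayscale conditions.
    return (m1 - m2) / sd

ds = np.array([cohens_d(m1, m2, sd) for m1, m2, sd, _ in studies])
ns = np.array([n for *_, n in studies])

# Fixed-effect pooling: weight each d by its inverse sampling variance.
# For two groups of size n each, var(d) ~= 2/n + d**2 / (4 * n).
var = 2.0 / ns + ds**2 / (4 * ns)
w = 1.0 / var
d_pooled = float((w * ds).sum() / w.sum())
print(f"per-study d: {np.round(ds, 2)}, pooled d: {d_pooled:.2f}")
```

Moderator analyses like the color-diagnosticity split reported above amount to running this same pooling separately within each subgroup of studies and comparing the pooled estimates.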

  11. Two speed factors of visual recognition independently correlated with fluid intelligence.

    PubMed

    Tachibana, Ryosuke; Namba, Yuri; Noguchi, Yasuki

    2014-01-01

Growing evidence indicates a moderate but significant relationship between processing speed in visuo-cognitive tasks and general intelligence. Findings from neuroscience, on the other hand, suggest that the primate visual system consists of two major pathways: the ventral pathway for object recognition and the dorsal pathway for spatial processing and attentive analysis. Previous studies seeking visuo-cognitive factors of human intelligence reported a significant correlation between fluid intelligence and inspection time (IT), an index of the speed of object recognition performed in the ventral pathway. We therefore examined the possibility that neural processing speed in the dorsal pathway also represents a factor of intelligence. Specifically, we used the mental rotation (MR) task, a popular psychometric measure of the mental speed of spatial processing in the dorsal pathway. We found that the speed of MR was significantly correlated with intelligence scores, while it had no correlation with IT (the recognition speed of visual objects). Our results support the new possibility that intelligence can be explained by two types of mental speed, one related to object recognition (IT) and another to the manipulation of mental images (MR).

  12. Coordinate Transformations in Object Recognition

    ERIC Educational Resources Information Center

    Graf, Markus

    2006-01-01

    A basic problem of visual perception is how human beings recognize objects after spatial transformations. Three central classes of findings have to be accounted for: (a) Recognition performance varies systematically with orientation, size, and position; (b) recognition latencies are sequentially additive, suggesting analogue transformation…

  13. Do we understand high-level vision?

    PubMed

    Cox, David Daniel

    2014-04-01

'High-level' vision lacks a single, agreed-upon definition, but it might usefully be defined as those stages of visual processing that transition from analyzing local image structure to analyzing the structure of the external world that produced those images. Much work in the last several decades has focused on object recognition as a framing problem for the study of high-level visual cortex, and much progress has been made in this direction. This approach presumes that the operational goal of the visual system is to read out the identity of an object (or objects) in a scene, in spite of variation in position, size, lighting, and the presence of other nearby objects. However, while object recognition is intuitively appealing as an operational framing of high-level vision, it is by no means the only task that visual cortex might perform, and the study of object recognition is beset by challenges in building stimulus sets that adequately sample the infinite space of possible stimuli. Here I review the successes and limitations of this work and ask whether we should reframe our approaches to understanding high-level vision. Copyright © 2014. Published by Elsevier Ltd.

  14. Crowding by a single bar: probing pattern recognition mechanisms in the visual periphery.

    PubMed

    Põder, Endel

    2014-11-06

    Whereas visual crowding does not greatly affect the detection of the presence of simple visual features, it heavily inhibits combining them into recognizable objects. Still, crowding effects have rarely been directly related to general pattern recognition mechanisms. In this study, pattern recognition mechanisms in visual periphery were probed using a single crowding feature. Observers had to identify the orientation of a rotated T presented briefly in a peripheral location. Adjacent to the target, a single bar was presented. The bar was either horizontal or vertical and located in a random direction from the target. It appears that such a crowding bar has very strong and regular effects on the identification of the target orientation. The observer's responses are determined by approximate relative positions of basic visual features; exact image-based similarity to the target is not important. A version of the "standard model" of object recognition with second-order features explains the main regularities of the data. © 2014 ARVO.

  15. Simple Learned Weighted Sums of Inferior Temporal Neuronal Firing Rates Accurately Predict Human Core Object Recognition Performance.

    PubMed

    Majaj, Najib J; Hong, Ha; Solomon, Ethan A; DiCarlo, James J

    2015-09-30

    To go beyond qualitative models of the biological substrate of object recognition, we ask: can a single ventral stream neuronal linking hypothesis quantitatively account for core object recognition performance over a broad range of tasks? We measured human performance in 64 object recognition tests using thousands of challenging images that explore shape similarity and identity preserving object variation. We then used multielectrode arrays to measure neuronal population responses to those same images in visual areas V4 and inferior temporal (IT) cortex of monkeys and simulated V1 population responses. We tested leading candidate linking hypotheses and control hypotheses, each postulating how ventral stream neuronal responses underlie object recognition behavior. Specifically, for each hypothesis, we computed the predicted performance on the 64 tests and compared it with the measured pattern of human performance. All tested hypotheses based on low- and mid-level visually evoked activity (pixels, V1, and V4) were very poor predictors of the human behavioral pattern. However, simple learned weighted sums of distributed average IT firing rates exactly predicted the behavioral pattern. More elaborate linking hypotheses relying on IT trial-by-trial correlational structure, finer IT temporal codes, or ones that strictly respect the known spatial substructures of IT ("face patches") did not improve predictive power. Although these results do not reject those more elaborate hypotheses, they suggest a simple, sufficient quantitative model: each object recognition task is learned from the spatially distributed mean firing rates (100 ms) of ∼60,000 IT neurons and is executed as a simple weighted sum of those firing rates. Significance statement: We sought to go beyond qualitative models of visual object recognition and determine whether a single neuronal linking hypothesis can quantitatively account for core object recognition behavior. 
To achieve this, we designed a database of images for evaluating object recognition performance. We used multielectrode arrays to characterize hundreds of neurons in the visual ventral stream of nonhuman primates and measured the object recognition performance of >100 human observers. Remarkably, we found that simple learned weighted sums of firing rates of neurons in monkey inferior temporal (IT) cortex accurately predicted human performance. Although previous work led us to expect that IT would outperform V4, we were surprised by the quantitative precision with which simple IT-based linking hypotheses accounted for human behavior. Copyright © 2015 the authors 0270-6474/15/3513402-17$15.00/0.

  16. Figure-ground organization and object recognition processes: an interactive account.

    PubMed

    Vecera, S P; O'Reilly, R C

    1998-04-01

    Traditional bottom-up models of visual processing assume that figure-ground organization precedes object recognition. This assumption seems logically necessary: How can object recognition occur before a region is labeled as figure? However, some behavioral studies find that familiar regions are more likely to be labeled figure than less familiar regions, a problematic finding for bottom-up models. An interactive account is proposed in which figure-ground processes receive top-down input from object representations in a hierarchical system. A graded, interactive computational model is presented that accounts for behavioral results in which familiarity effects are found. The interactive model offers an alternative conception of visual processing to bottom-up models.

  17. Integration trumps selection in object recognition.

    PubMed

    Saarela, Toni P; Landy, Michael S

    2015-03-30

    Finding and recognizing objects is a fundamental task of vision. Objects can be defined by several "cues" (color, luminance, texture, etc.), and humans can integrate sensory cues to improve detection and recognition [1-3]. Cortical mechanisms fuse information from multiple cues [4], and shape-selective neural mechanisms can display cue invariance by responding to a given shape independent of the visual cue defining it [5-8]. Selective attention, in contrast, improves recognition by isolating a subset of the visual information [9]. Humans can select single features (red or vertical) within a perceptual dimension (color or orientation), giving faster and more accurate responses to items having the attended feature [10, 11]. Attention elevates neural responses and sharpens neural tuning to the attended feature, as shown by studies in psychophysics and modeling [11, 12], imaging [13-16], and single-cell and neural population recordings [17, 18]. Besides single features, attention can select whole objects [19-21]. Objects are among the suggested "units" of attention because attention to a single feature of an object causes the selection of all of its features [19-21]. Here, we pit integration against attentional selection in object recognition. We find, first, that humans can integrate information near optimally from several perceptual dimensions (color, texture, luminance) to improve recognition. They cannot, however, isolate a single dimension even when the other dimensions provide task-irrelevant, potentially conflicting information. For object recognition, it appears that there is mandatory integration of information from multiple dimensions of visual experience. The advantage afforded by this integration, however, comes at the expense of attentional selection. Copyright © 2015 Elsevier Ltd. All rights reserved.

  18. Integration trumps selection in object recognition

    PubMed Central

    Saarela, Toni P.; Landy, Michael S.

    2015-01-01

    Summary Finding and recognizing objects is a fundamental task of vision. Objects can be defined by several “cues” (color, luminance, texture etc.), and humans can integrate sensory cues to improve detection and recognition [1–3]. Cortical mechanisms fuse information from multiple cues [4], and shape-selective neural mechanisms can display cue-invariance by responding to a given shape independent of the visual cue defining it [5–8]. Selective attention, in contrast, improves recognition by isolating a subset of the visual information [9]. Humans can select single features (red or vertical) within a perceptual dimension (color or orientation), giving faster and more accurate responses to items having the attended feature [10,11]. Attention elevates neural responses and sharpens neural tuning to the attended feature, as shown by studies in psychophysics and modeling [11,12], imaging [13–16], and single-cell and neural population recordings [17,18]. Besides single features, attention can select whole objects [19–21]. Objects are among the suggested “units” of attention because attention to a single feature of an object causes the selection of all of its features [19–21]. Here, we pit integration against attentional selection in object recognition. We find, first, that humans can integrate information near-optimally from several perceptual dimensions (color, texture, luminance) to improve recognition. They cannot, however, isolate a single dimension even when the other dimensions provide task-irrelevant, potentially conflicting information. For object recognition, it appears that there is mandatory integration of information from multiple dimensions of visual experience. The advantage afforded by this integration, however, comes at the expense of attentional selection. PMID:25802154

  19. Two Speed Factors of Visual Recognition Independently Correlated with Fluid Intelligence

    PubMed Central

    Tachibana, Ryosuke; Namba, Yuri; Noguchi, Yasuki

    2014-01-01

Growing evidence indicates a moderate but significant relationship between processing speed in visuo-cognitive tasks and general intelligence. Findings from neuroscience, on the other hand, suggest that the primate visual system consists of two major pathways: the ventral pathway for object recognition and the dorsal pathway for spatial processing and attentive analysis. Previous studies seeking visuo-cognitive factors of human intelligence reported a significant correlation between fluid intelligence and inspection time (IT), an index of the speed of object recognition performed in the ventral pathway. We therefore examined the possibility that neural processing speed in the dorsal pathway also represents a factor of intelligence. Specifically, we used the mental rotation (MR) task, a popular psychometric measure of the mental speed of spatial processing in the dorsal pathway. We found that the speed of MR was significantly correlated with intelligence scores, while it had no correlation with IT (the recognition speed of visual objects). Our results support the new possibility that intelligence can be explained by two types of mental speed, one related to object recognition (IT) and another to the manipulation of mental images (MR). PMID:24825574

  20. Selective verbal recognition memory impairments are associated with atrophy of the language network in non-semantic variants of primary progressive aphasia.

    PubMed

    Nilakantan, Aneesha S; Voss, Joel L; Weintraub, Sandra; Mesulam, M-Marsel; Rogalski, Emily J

    2017-06-01

Primary progressive aphasia (PPA) is clinically defined by an initial loss of language function and preservation of other cognitive abilities, including episodic memory. While PPA primarily affects the left-lateralized perisylvian language network, some clinical neuropsychological tests suggest concurrent initial memory loss. The goal of this study was to test recognition memory of objects and words in the visual and auditory modalities to separate language-processing impairments from retentive memory in PPA. Individuals with non-semantic PPA had longer reaction times and higher false alarms for auditory word stimuli compared to visual object stimuli. Moreover, false alarms in auditory word recognition memory were related to cortical thickness within the left inferior frontal gyrus and left temporal pole, while false alarms in visual object recognition memory were related to cortical thickness within the right temporal pole. This pattern of results suggests that a specific vulnerability in processing verbal stimuli can hinder episodic memory in PPA, and provides evidence for differential contributions of the left and right temporal poles to word and object recognition memory. Copyright © 2017 Elsevier Ltd. All rights reserved.

  1. Comparing visual representations across human fMRI and computational vision

    PubMed Central

    Leeds, Daniel D.; Seibert, Darren A.; Pyles, John A.; Tarr, Michael J.

    2013-01-01

    Feedforward visual object perception recruits a cortical network that is assumed to be hierarchical, progressing from basic visual features to complete object representations. However, the nature of the intermediate features related to this transformation remains poorly understood. Here, we explore how well different computer vision recognition models account for neural object encoding across the human cortical visual pathway as measured using fMRI. These neural data, collected during the viewing of 60 images of real-world objects, were analyzed with a searchlight procedure as in Kriegeskorte, Goebel, and Bandettini (2006): Within each searchlight sphere, the obtained patterns of neural activity for all 60 objects were compared to model responses for each computer recognition algorithm using representational dissimilarity analysis (Kriegeskorte et al., 2008). Although each of the computer vision methods significantly accounted for some of the neural data, among the different models, the scale invariant feature transform (Lowe, 2004), encoding local visual properties gathered from “interest points,” was best able to accurately and consistently account for stimulus representations within the ventral pathway. More generally, when present, significance was observed in regions of the ventral-temporal cortex associated with intermediate-level object perception. Differences in model effectiveness and the neural location of significant matches may be attributable to the fact that each model implements a different featural basis for representing objects (e.g., more holistic or more parts-based). Overall, we conclude that well-known computer vision recognition systems may serve as viable proxies for theories of intermediate visual object representation. PMID:24273227
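
The representational dissimilarity analysis at the heart of this study can be sketched compactly. The data below are synthetic (random projections of a shared latent structure standing in for "neural" searchlight patterns and "model" features for 60 objects); the two real steps, building an RDM of pairwise correlation distances per system and comparing RDMs by rank correlation, follow Kriegeskorte et al. (2008) as cited in the abstract.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(3)

# Toy stand-ins: "neural" activity patterns (one searchlight sphere) and
# "model" feature vectors for the same 60 object images, sharing a latent
# object structure; plus an unrelated control model.
n_objects = 60
latent = rng.normal(size=(n_objects, 5))
neural = latent @ rng.normal(size=(5, 80)) + 0.5 * rng.normal(size=(n_objects, 80))
model = latent @ rng.normal(size=(5, 30)) + 0.5 * rng.normal(size=(n_objects, 30))
unrelated = rng.normal(size=(n_objects, 30))

# Representational dissimilarity matrix: pairwise correlation distance
# between object patterns (condensed upper triangle).
def rdm(patterns):
    return pdist(patterns, metric="correlation")

# Compare systems by rank-correlating their RDMs.
rho_model, _ = spearmanr(rdm(neural), rdm(model))
rho_null, _ = spearmanr(rdm(neural), rdm(unrelated))
print(f"model RDM match: {rho_model:.2f}, control: {rho_null:.2f}")
```

Repeating this comparison for each searchlight sphere and each computer vision model yields the kind of map reported above, showing where in cortex a given model's representational geometry matches the neural one.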

  2. Mechanisms of object recognition: what we have learned from pigeons

    PubMed Central

    Soto, Fabian A.; Wasserman, Edward A.

    2014-01-01

    Behavioral studies of object recognition in pigeons have been conducted for 50 years, yielding a large body of data. Recent work has been directed toward synthesizing this evidence and understanding the visual, associative, and cognitive mechanisms that are involved. The outcome is that pigeons are likely to be the non-primate species for which the computational mechanisms of object recognition are best understood. Here, we review this research and suggest that a core set of mechanisms for object recognition might be present in all vertebrates, including pigeons and people, making pigeons an excellent candidate model to study the neural mechanisms of object recognition. Behavioral and computational evidence suggests that error-driven learning participates in object category learning by pigeons and people, and recent neuroscientific research suggests that the basal ganglia, which are homologous in these species, may implement error-driven learning of stimulus-response associations. Furthermore, learning of abstract category representations can be observed in pigeons and other vertebrates. Finally, there is evidence that feedforward visual processing, a central mechanism in models of object recognition in the primate ventral stream, plays a role in object recognition by pigeons. We also highlight differences between pigeons and people in object recognition abilities, and propose candidate adaptive specializations which may explain them, such as holistic face processing and rule-based category learning in primates. From a modern comparative perspective, such specializations are to be expected regardless of the model species under study. The fact that we have a good idea of which aspects of object recognition differ in people and pigeons should be seen as an advantage over other animal models. From this perspective, we suggest that there is much to learn about human object recognition from studying the “simple” brains of pigeons. PMID:25352784

  3. Toward a unified model of face and object recognition in the human visual system

    PubMed Central

    Wallis, Guy

    2013-01-01

    Our understanding of the mechanisms and neural substrates underlying visual recognition has made considerable progress over the past 30 years. During this period, accumulating evidence has led many scientists to conclude that objects and faces are recognised in fundamentally distinct ways, and in fundamentally distinct cortical areas. In the psychological literature, in particular, this dissociation has led to a palpable disconnect between theories of how we process and represent the two classes of object. This paper follows a trend in part of the recognition literature to try to reconcile what we know about these two forms of recognition by considering the effects of learning. Taking a widely accepted, self-organizing model of object recognition, this paper explains how such a system is affected by repeated exposure to specific stimulus classes. In so doing, it explains how many aspects of recognition generally regarded as unusual to faces (holistic processing, configural processing, sensitivity to inversion, the other-race effect, the prototype effect, etc.) are emergent properties of category-specific learning within such a system. Overall, the paper describes how a single model of recognition learning can and does produce the seemingly very different types of representation associated with faces and objects. PMID:23966963

  4. The functional neuroanatomy of object agnosia: a case study.

    PubMed

    Konen, Christina S; Behrmann, Marlene; Nishimura, Mayu; Kastner, Sabine

    2011-07-14

    Cortical reorganization of visual and object representations following neural injury was examined using fMRI and behavioral investigations. We probed the visual responsivity of the ventral visual cortex of an agnosic patient who was impaired at object recognition following a lesion to the right lateral fusiform gyrus. In both hemispheres, retinotopic mapping revealed typical topographic organization and visual activation of early visual cortex. However, visual, object-related, and object-selective responses were reduced in regions immediately surrounding the lesion in the right hemisphere and also, surprisingly, in corresponding locations in the structurally intact left hemisphere. In contrast, hV4 of the right hemisphere showed expanded response properties. These findings indicate that the right lateral fusiform gyrus is critically involved in object recognition and that an impairment to this region has widespread consequences for remote parts of cortex. Finally, functional neural plasticity is possible even when a cortical lesion is sustained in adulthood. Copyright © 2011 Elsevier Inc. All rights reserved.

  5. Self-Recognition in Autistic Children.

    ERIC Educational Resources Information Center

    Dawson, Geraldine; McKissick, Fawn Celeste

    1984-01-01

    Fifteen autistic children (four to six years old) were assessed for visual self-recognition ability, as well as for object permanence and gestural imitation. It was found that 13 of the 15 autistic children showed evidence of self-recognition. Consistent relationships were suggested between self-recognition and object permanence but not between…

  6. Comparing object recognition from binary and bipolar edge images for visual prostheses.

    PubMed

    Jung, Jae-Hyun; Pu, Tian; Peli, Eli

    2016-11-01

    Visual prostheses require an effective representation method because their display conditions are limited to only 2 or 3 levels of grayscale at low resolution. Edges derived from abrupt luminance changes in images carry essential information for object recognition. Typical binary (black and white) edge images have been used to represent features and convey this essential information. However, in scenes with a complex, cluttered background, the recognition rate of binary edge images by human observers is limited and additional information is required. The polarity of edges and cusps (black or white features on a gray background) carries important additional information; polarity may provide shape-from-shading information that is missing in the binary edge image. This depth information may be restored by using bipolar edges. We compared object recognition rates for 16 binary edge images and bipolar edge images by 26 subjects to determine the possible impact of bipolar filtering in visual prostheses with 3 or more levels of grayscale. Recognition rates were higher with bipolar edge images, and the improvement was significant in scenes with complex backgrounds. The results also suggest that erroneous shape-from-shading interpretation of bipolar edges arising from pigment rather than shape boundaries may confound recognition.
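    The binary/bipolar contrast in the record above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' exact filter: it uses a Laplacian-of-Gaussian operator, and the `sigma` and `thresh` parameters are assumptions chosen for the example.

```python
# Illustrative sketch: binary vs. bipolar edge images via a
# Laplacian-of-Gaussian (LoG) operator. A binary edge image keeps only
# thresholded edge locations (2 gray levels); a bipolar edge image keeps
# the sign of the LoG response as well, i.e. black or white features on a
# gray background (3 gray levels).
import numpy as np
from scipy.ndimage import gaussian_laplace

def edge_images(gray, sigma=2.0, thresh=0.02):
    """gray: 2-D float array in [0, 1]. Returns (binary, bipolar) images."""
    log = gaussian_laplace(gray, sigma=sigma)
    mag = np.abs(log)
    edges = mag > thresh * mag.max()        # edge locations
    binary = np.where(edges, 1.0, 0.0)      # 2 levels: white edges on black
    bipolar = np.full_like(gray, 0.5)       # 3 levels: gray background...
    bipolar[edges & (log > 0)] = 1.0        # ...plus light-side polarity
    bipolar[edges & (log < 0)] = 0.0        # ...and dark-side polarity
    return binary, bipolar
```

    A binary edge image thus needs 2 display levels, while a bipolar one needs 3: the gray background plus the two edge polarities, which is why the comparison targets prostheses with 3 or more levels of grayscale.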

  7. 'What' and 'Where' in Visual Attention: Evidence from the Neglect Syndrome

    DTIC Science & Technology

    1992-01-01

    representations of the visual world, visual attention, and object representations. Bauer, R. M., & Rubens, A. B. (1985). Agnosia. In K. M. Heilman, & E…visual information. Journal of Experimental Psychology: General, 501-517. Farah, M. J. (1990). Visual Agnosia: Disorders of Object Recognition and

  8. Automaticity of Basic-Level Categorization Accounts for Labeling Effects in Visual Recognition Memory

    ERIC Educational Resources Information Center

    Richler, Jennifer J.; Gauthier, Isabel; Palmeri, Thomas J.

    2011-01-01

    Are there consequences of calling objects by their names? Lupyan (2008) suggested that overtly labeling objects impairs subsequent recognition memory because labeling shifts stored memory representations of objects toward the category prototype (representational shift hypothesis). In Experiment 1, we show that processing objects at the basic…

  9. Auditory-visual object recognition time suggests specific processing for animal sounds.

    PubMed

    Suied, Clara; Viaud-Delmon, Isabelle

    2009-01-01

    Recognizing an object requires binding together several cues, which may be distributed across different sensory modalities, and ignoring competing information originating from other objects. In addition, knowledge of the semantic category of an object is fundamental to determine how we should react to it. Here we investigate the role of semantic categories in the processing of auditory-visual objects. We used an auditory-visual object-recognition task (go/no-go paradigm). We compared recognition times for two categories: a biologically relevant one (animals) and a non-biologically relevant one (means of transport). Participants were asked to react as fast as possible to target objects, presented in the visual and/or the auditory modality, and to withhold their response for distractor objects. A first main finding was that, when participants were presented with unimodal or bimodal congruent stimuli (an image and a sound from the same object), similar reaction times were observed for all object categories. Thus, there was no advantage in the speed of recognition for biologically relevant compared to non-biologically relevant objects. A second finding was that, in the presence of a biologically relevant auditory distractor, the processing of a target object was slowed down, whether or not it was itself biologically relevant. It seems impossible to effectively ignore an animal sound, even when it is irrelevant to the task. These results suggest a specific and mandatory processing of animal sounds, possibly due to phylogenetic memory and consistent with the idea that hearing is particularly efficient as an alerting sense. They also highlight the importance of taking into account the auditory modality when investigating the way object concepts of biologically relevant categories are stored and retrieved.

  10. Object Recognition in Mental Representations: Directions for Exploring Diagnostic Features through Visual Mental Imagery.

    PubMed

    Roldan, Stephanie M

    2017-01-01

    One of the fundamental goals of object recognition research is to understand how a cognitive representation produced from the output of filtered and transformed sensory information facilitates efficient viewer behavior. Given that mental imagery strongly resembles perceptual processes in both cortical regions and subjective visual qualities, it is reasonable to question whether mental imagery facilitates cognition in a manner similar to that of perceptual viewing: via the detection and recognition of distinguishing features. Categorizing the feature content of mental imagery holds potential as a reverse pathway by which to identify the components of a visual stimulus which are most critical for the creation and retrieval of a visual representation. This review will examine the likelihood that the information represented in visual mental imagery reflects distinctive object features thought to facilitate efficient object categorization and recognition during perceptual viewing. If it is the case that these representational features resemble their sensory counterparts in both spatial and semantic qualities, they may well be accessible through mental imagery as evaluated through current investigative techniques. In this review, methods applied to mental imagery research and their findings are reviewed and evaluated for their efficiency in accessing internal representations, and implications for identifying diagnostic features are discussed. An argument is made for the benefits of combining mental imagery assessment methods with diagnostic feature research to advance the understanding of visual perceptive processes, with suggestions for avenues of future investigation.

  11. Object Recognition in Mental Representations: Directions for Exploring Diagnostic Features through Visual Mental Imagery

    PubMed Central

    Roldan, Stephanie M.

    2017-01-01

    One of the fundamental goals of object recognition research is to understand how a cognitive representation produced from the output of filtered and transformed sensory information facilitates efficient viewer behavior. Given that mental imagery strongly resembles perceptual processes in both cortical regions and subjective visual qualities, it is reasonable to question whether mental imagery facilitates cognition in a manner similar to that of perceptual viewing: via the detection and recognition of distinguishing features. Categorizing the feature content of mental imagery holds potential as a reverse pathway by which to identify the components of a visual stimulus which are most critical for the creation and retrieval of a visual representation. This review will examine the likelihood that the information represented in visual mental imagery reflects distinctive object features thought to facilitate efficient object categorization and recognition during perceptual viewing. If it is the case that these representational features resemble their sensory counterparts in both spatial and semantic qualities, they may well be accessible through mental imagery as evaluated through current investigative techniques. In this review, methods applied to mental imagery research and their findings are reviewed and evaluated for their efficiency in accessing internal representations, and implications for identifying diagnostic features are discussed. An argument is made for the benefits of combining mental imagery assessment methods with diagnostic feature research to advance the understanding of visual perceptive processes, with suggestions for avenues of future investigation. PMID:28588538

  12. The evolution of meaning: spatio-temporal dynamics of visual object recognition.

    PubMed

    Clarke, Alex; Taylor, Kirsten I; Tyler, Lorraine K

    2011-08-01

    Research on the spatio-temporal dynamics of visual object recognition suggests a recurrent, interactive model whereby an initial feedforward sweep through the ventral stream to prefrontal cortex is followed by recurrent interactions. However, critical questions remain regarding the factors that mediate the degree of recurrent interactions necessary for meaningful object recognition. The novel prediction we test here is that recurrent interactivity is driven by increasing semantic integration demands as defined by the complexity of semantic information required by the task and driven by the stimuli. To test this prediction, we recorded magnetoencephalography data while participants named living and nonliving objects during two naming tasks. We found that the spatio-temporal dynamics of neural activity were modulated by the level of semantic integration required. Specifically, source reconstructed time courses and phase synchronization measures showed increased recurrent interactions as a function of semantic integration demands. These findings demonstrate that the cortical dynamics of object processing are modulated by the complexity of semantic information required from the visual input.

  13. Real-time unconstrained object recognition: a processing pipeline based on the mammalian visual system.

    PubMed

    Aguilar, Mario; Peot, Mark A; Zhou, Jiangying; Simons, Stephen; Liao, Yuwei; Metwalli, Nader; Anderson, Mark B

    2012-03-01

    The mammalian visual system is still the gold standard for recognition accuracy, flexibility, efficiency, and speed. Ongoing advances in our understanding of function and mechanisms in the visual system can now be leveraged to pursue the design of computer vision architectures that will revolutionize the state of the art in computer vision.

  14. The uncrowded window of object recognition

    PubMed Central

    Pelli, Denis G; Tillman, Katharine A

    2009-01-01

    It is now emerging that vision is usually limited by object spacing rather than size. The visual system recognizes an object by detecting and then combining its features. ‘Crowding’ occurs when objects are too close together and features from several objects are combined into a jumbled percept. Here, we review the explosion of studies on crowding—in grating discrimination, letter and face recognition, visual search, selective attention, and reading—and find a universal principle, the Bouma law. The critical spacing required to prevent crowding is equal for all objects, although the effect is weaker between dissimilar objects. Furthermore, critical spacing at the cortex is independent of object position, and critical spacing at the visual field is proportional to object distance from fixation. The region where object spacing exceeds critical spacing is the ‘uncrowded window’. Observers cannot recognize objects outside of this window and its size limits the speed of reading and search. PMID:18828191
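    The Bouma law reviewed above can be written down directly: the critical spacing needed to escape crowding grows linearly with distance from fixation. A minimal sketch, assuming the conventional proportionality constant b ≈ 0.5 (Bouma's factor; the exact value varies across observers and tasks and is not taken from this record):

```python
# A worked sketch of the Bouma law: critical center-to-center spacing is
# proportional to eccentricity (distance from fixation, in degrees of
# visual angle). The factor 0.5 is a conventional estimate, assumed here.
def critical_spacing_deg(eccentricity_deg, bouma_factor=0.5):
    """Critical spacing (deg) below which crowding occurs."""
    return bouma_factor * eccentricity_deg

def is_crowded(spacing_deg, eccentricity_deg, bouma_factor=0.5):
    """True if neighboring objects are too close to be recognized."""
    return spacing_deg < critical_spacing_deg(eccentricity_deg, bouma_factor)
```

    At 10 deg of eccentricity, for example, objects spaced closer than about 5 deg fall outside the uncrowded window and cannot be individually recognized.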

  15. A Pilot Study of a Test for Visual Recognition Memory in Adults with Moderate to Severe Intellectual Disability

    ERIC Educational Resources Information Center

    Pyo, Geunyeong; Ala, Tom; Kyrouac, Gregory A.; Verhulst, Steven J.

    2010-01-01

    Objective assessment of memory functioning is an important part of evaluation for Dementia of Alzheimer Type (DAT). The revised Picture Recognition Memory Test (r-PRMT) is a test for visual recognition memory to assess memory functioning of persons with intellectual disabilities (ID), specifically targeting moderate to severe ID. A pilot study was…

  16. Behavioral model of visual perception and recognition

    NASA Astrophysics Data System (ADS)

    Rybak, Ilya A.; Golovan, Alexander V.; Gusakova, Valentina I.

    1993-09-01

    In the processes of visual perception and recognition, human eyes actively select essential information by means of successive fixations at the most informative points of the image. A behavioral program defining a scanpath of the image is formed at the stage of learning (object memorizing) and consists of sequential motor actions, which are shifts of attention from one point of fixation to another, and sensory signals expected to arrive in response to each shift of attention. In the modern view of the problem, invariant object recognition is provided by the following: (1) separate processing of 'what' (object features) and 'where' (spatial features) information at high levels of the visual system; (2) mechanisms of visual attention using 'where' information; (3) representation of 'what' information in an object-based frame of reference (OFR). However, most recent models of vision based on the OFR have demonstrated invariant recognition of only simple objects such as letters or binary objects without background, i.e. objects to which a frame of reference is easily attached. In contrast, we use not an OFR but a feature-based frame of reference (FFR), connected with the basic feature (edge) at the fixation point. This gives our model the ability to represent complex objects in gray-level images invariantly, but demands realization of the behavioral aspects of vision described above. The developed model contains a neural-network subsystem of low-level vision, which extracts a set of primary features (edges) in each fixation, and a high-level subsystem consisting of 'what' (Sensory Memory) and 'where' (Motor Memory) modules. The resolution of primary feature extraction decreases with distance from the point of fixation. The FFR provides both the invariant representation of object features in Sensory Memory and the shifts of attention in Motor Memory. Object recognition consists of successive recall (from Motor Memory) and execution of shifts of attention, and successive verification of the expected sets of features (stored in Sensory Memory). The model demonstrates recognition of complex objects (such as faces) in gray-level images, invariant with respect to shift, rotation, and scale.

  17. Evidence for the Activation of Sensorimotor Information during Visual Word Recognition: The Body-Object Interaction Effect

    ERIC Educational Resources Information Center

    Siakaluk, Paul D.; Pexman, Penny M.; Aguilera, Laura; Owen, William J.; Sears, Christopher R.

    2008-01-01

    We examined the effects of sensorimotor experience in two visual word recognition tasks. Body-object interaction (BOI) ratings were collected for a large set of words. These ratings assess perceptions of the ease with which a human body can physically interact with a word's referent. A set of high BOI words (e.g., "mask") and a set of low BOI…

  18. Cultural differences in visual object recognition in 3-year-old children

    PubMed Central

    Kuwabara, Megumi; Smith, Linda B.

    2016-01-01

    Recent research indicates that culture penetrates fundamental processes of perception and cognition (e.g. Nisbett & Miyamoto, 2005). Here, we provide evidence that these influences begin early and shape how preschool children recognize common objects. The three tasks (n=128) examined the degree to which nonface object recognition by 3-year-olds was based on individual diagnostic features versus more configural and holistic processing. Task 1 used a 6-alternative forced choice task in which children were asked to find a named category in arrays of masked objects in which only 3 diagnostic features were visible for each object. U.S. children outperformed age-matched Japanese children. Task 2 presented pictures of objects to children piece by piece. U.S. children recognized the objects given fewer pieces than Japanese children, and the likelihood of recognition increased for U.S. children, but not Japanese children, when the piece added was rated by both U.S. and Japanese adults as highly defining. Task 3 used a standard measure of configural processing, asking the degree to which recognition of matching pictures was disrupted by the rotation of one picture. Japanese children’s recognition was more disrupted by inversion than was that of U.S. children, indicating more configural processing by Japanese than U.S. children. The pattern suggests early cross-cultural differences in visual processing; these findings raise important questions about how visual experiences differ across cultures and about universal patterns of cognitive development. PMID:26985576

  19. Cultural differences in visual object recognition in 3-year-old children.

    PubMed

    Kuwabara, Megumi; Smith, Linda B

    2016-07-01

    Recent research indicates that culture penetrates fundamental processes of perception and cognition. Here, we provide evidence that these influences begin early and influence how preschool children recognize common objects. The three tasks (N=128) examined the degree to which nonface object recognition by 3-year-olds was based on individual diagnostic features versus more configural and holistic processing. Task 1 used a 6-alternative forced choice task in which children were asked to find a named category in arrays of masked objects where only three diagnostic features were visible for each object. U.S. children outperformed age-matched Japanese children. Task 2 presented pictures of objects to children piece by piece. U.S. children recognized the objects given fewer pieces than Japanese children, and the likelihood of recognition increased for U.S. children, but not Japanese children, when the piece added was rated by both U.S. and Japanese adults as highly defining. Task 3 used a standard measure of configural processing, asking the degree to which recognition of matching pictures was disrupted by the rotation of one picture. Japanese children's recognition was more disrupted by inversion than was that of U.S. children, indicating more configural processing by Japanese than U.S. children. The pattern suggests early cross-cultural differences in visual processing; these findings raise important questions about how visual experiences differ across cultures and about universal patterns of cognitive development. Copyright © 2016 Elsevier Inc. All rights reserved.

  20. Recognition vs Reverse Engineering in Boolean Concepts Learning

    ERIC Educational Resources Information Center

    Shafat, Gabriel; Levin, Ilya

    2012-01-01

    This paper deals with two types of logical problems--recognition problems and reverse engineering problems, and with the interrelations between these types of problems. The recognition problems are modeled in the form of a visual representation of various objects in a common pattern, with a composition of represented objects in the pattern.…

  1. Guidance of visual attention by semantic information in real-world scenes

    PubMed Central

    Wu, Chia-Chien; Wick, Farahnaz Ahmed; Pomplun, Marc

    2014-01-01

    Recent research on attentional guidance in real-world scenes has focused on object recognition within the context of a scene. This approach has been valuable for determining some factors that drive the allocation of visual attention and determine visual selection. This article provides a review of experimental work on how different components of context, especially semantic information, affect attentional deployment. We review work from the areas of object recognition, scene perception, and visual search, highlighting recent studies examining semantic structure in real-world scenes. A better understanding on how humans parse scene representations will not only improve current models of visual attention but also advance next-generation computer vision systems and human-computer interfaces. PMID:24567724

  2. Complementary Hemispheric Asymmetries in Object Naming and Recognition: A Voxel-Based Correlational Study

    ERIC Educational Resources Information Center

    Acres, K.; Taylor, K. I.; Moss, H. E.; Stamatakis, E. A.; Tyler, L. K.

    2009-01-01

    Cognitive neuroscientific research proposes complementary hemispheric asymmetries in naming and recognising visual objects, with a left temporal lobe advantage for object naming and a right temporal lobe advantage for object recognition. Specifically, it has been proposed that the left inferior temporal lobe plays a mediational role linking…

  3. Coding of visual object features and feature conjunctions in the human brain.

    PubMed

    Martinovic, Jasna; Gruber, Thomas; Müller, Matthias M

    2008-01-01

    Object recognition is achieved through neural mechanisms reliant on the activity of distributed coordinated neural assemblies. In the initial steps of this process, an object's features are thought to be coded very rapidly in distinct neural assemblies. These features play different functional roles in the recognition process: while colour facilitates recognition, additional contours and edges delay it. Here, we selectively varied the amount and role of object features in an entry-level categorization paradigm and related them to the electrical activity of the human brain. We found that early synchronizations (approx. 100 ms) increased quantitatively when more image features had to be coded, without reflecting their qualitative contribution to the recognition process. Later activity (approx. 200-400 ms) was modulated by the representational role of object features. These findings demonstrate that although early synchronizations may be sufficient for relatively crude discrimination of objects in visual scenes, they cannot support entry-level categorization. This was subserved by later processes of object model selection, which utilized the representational value of object features such as colour or edges to select the appropriate model and achieve identification.

  4. Shape and texture fused recognition of flying targets

    NASA Astrophysics Data System (ADS)

    Kovács, Levente; Utasi, Ákos; Kovács, Andrea; Szirányi, Tamás

    2011-06-01

    This paper presents visual detection and recognition of flying targets (e.g. planes, missiles) based on automatically extracted shape and object texture information, for application areas like alerting, recognition and tracking. Targets are extracted based on robust background modeling and a novel contour extraction approach, and object recognition is done by comparisons to shape and texture based query results on a previously gathered real life object dataset. Application areas involve passive defense scenarios, including automatic object detection and tracking with cheap commodity hardware components (CPU, camera and GPS).

  5. Visual memory in unilateral spatial neglect: immediate recall versus delayed recognition.

    PubMed

    Moreh, Elior; Malkinson, Tal Seidel; Zohary, Ehud; Soroker, Nachum

    2014-09-01

    Patients with unilateral spatial neglect (USN) often show impaired performance in spatial working memory tasks, apart from the difficulty retrieving "left-sided" spatial data from long-term memory, shown in the "piazza effect" by Bisiach and colleagues. This study's aim was to compare the effect of the spatial position of a visual object on immediate and delayed memory performance in USN patients. Specifically, immediate verbal recall performance, tested using a simultaneous presentation of four visual objects in four quadrants, was compared with memory in a later-provided recognition task, in which objects were individually shown at the screen center. Unlike healthy controls, USN patients showed a left-side disadvantage and a vertical bias in the immediate free recall task (69% vs. 42% recall for right- and left-sided objects, respectively). In the recognition task, the patients correctly recognized half of "old" items, and their correct rejection rate was 95.5%. Importantly, when the analysis focused on previously recalled items (in the immediate task), no statistically significant difference was found in the delayed recognition of objects according to their original quadrant of presentation. Furthermore, USN patients were able to recollect the correct original location of the recognized objects in 60% of the cases, well beyond chance level. This suggests that the memory trace formed in these cases was not only semantic but also contained a visuospatial tag. Finally, successful recognition of objects missed in recall trials points to formation of memory traces for neglected contralesional objects, which may become accessible to retrieval processes in explicit memory.

  6. Complex scenes and situations visualization in hierarchical learning algorithm with dynamic 3D NeoAxis engine

    NASA Astrophysics Data System (ADS)

    Graham, James; Ternovskiy, Igor V.

    2013-06-01

    We applied a two-stage unsupervised hierarchical learning system to model complex dynamic surveillance and cyberspace monitoring systems using a non-commercial version of the NeoAxis visualization software. The hierarchical scene learning and recognition approach is based on hierarchical expectation maximization and was linked to a 3D graphics engine for validating learning and classification results and for understanding the relationship between the human operator and the autonomous system. Scene recognition is performed by taking synthetically generated data and feeding it to a dynamic logic algorithm. The algorithm performs hierarchical recognition of the scene by first examining the features of the objects to determine which objects are present, and then determining the scene based on the objects present. This paper presents a framework within which low-level data linked to higher-level visualization can support a human operator and be evaluated in a detailed and systematic way.
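    The two-stage feature-to-object-to-scene hierarchy described in the record above can be caricatured in a few lines. The feature, object, and scene vocabularies below are hypothetical placeholders; the paper's dynamic logic algorithm is probabilistic and far more elaborate than this deterministic sketch.

```python
# Minimal sketch of two-stage hierarchical scene recognition (assumed toy
# vocabularies, for illustration only). Stage 1 maps detected features to
# objects; stage 2 maps the set of recognized objects to a scene label.
OBJECT_FEATURES = {
    "car": {"wheels", "windshield"},
    "person": {"head", "torso"},
}
SCENE_OBJECTS = {
    "street": {"car", "person"},
    "parking_lot": {"car"},
}

def recognize_objects(features):
    """Stage 1: an object is present if all of its features were detected."""
    return {obj for obj, req in OBJECT_FEATURES.items() if req <= features}

def recognize_scene(objects):
    """Stage 2: pick the scene whose expected objects best match."""
    return max(SCENE_OBJECTS,
               key=lambda s: len(SCENE_OBJECTS[s] & objects)
                             - len(SCENE_OBJECTS[s] - objects))

detected = {"wheels", "windshield", "head", "torso"}
objs = recognize_objects(detected)   # {"car", "person"}
scene = recognize_scene(objs)        # "street"
```

    The point of the hierarchy is that the scene decision never touches raw features directly; it is conditioned only on the stage-1 object hypotheses, which is what makes the recognition hierarchical.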

  7. Exploring the association between visual perception abilities and reading of musical notation.

    PubMed

    Lee, Horng-Yih

    2012-06-01

    In the reading of music, the acquisition of pitch information depends primarily upon the spatial position of notes, as well as upon an individual's spatial processing ability. This study investigated the relationship between the ability to read single notes and visual-spatial ability. Participants with high and low single-note reading abilities were differentiated based upon differences in musical notation-reading ability, and their spatial processing and object recognition abilities were then assessed. It was found that the group with lower note-reading abilities made more errors than did the group with higher note-reading abilities in the mental rotation task. In contrast, there was no significant difference between the two groups in the object recognition task. These results suggest that note-reading may be related to visual-spatial processing abilities, and not to an individual's object recognition ability.

  8. Comparing object recognition from binary and bipolar edge images for visual prostheses

    PubMed Central

    Jung, Jae-Hyun; Pu, Tian; Peli, Eli

    2017-01-01

    Visual prostheses require an effective representation method because their display conditions are limited to only 2 or 3 levels of grayscale at low resolution. Edges derived from abrupt luminance changes in images carry essential information for object recognition. Typical binary (black and white) edge images have been used to represent features and convey this essential information. However, in scenes with a complex, cluttered background, the recognition rate of binary edge images by human observers is limited and additional information is required. The polarity of edges and cusps (black or white features on a gray background) carries important additional information; polarity may provide shape-from-shading information that is missing in the binary edge image. This depth information may be restored by using bipolar edges. We compared object recognition rates for 16 binary edge images and bipolar edge images by 26 subjects to determine the possible impact of bipolar filtering in visual prostheses with 3 or more levels of grayscale. Recognition rates were higher with bipolar edge images, and the improvement was significant in scenes with complex backgrounds. The results also suggest that erroneous shape-from-shading interpretation of bipolar edges arising from pigment rather than shape boundaries may confound recognition. PMID:28458481

  9. Visual Object Pattern Separation Varies in Older Adults

    ERIC Educational Resources Information Center

    Holden, Heather M.; Toner, Chelsea; Pirogovsky, Eva; Kirwan, C. Brock; Gilbert, Paul E.

    2013-01-01

    Young and nondemented older adults completed a visual object continuous recognition memory task in which some stimuli (lures) were similar but not identical to previously presented objects. The lures were hypothesized to result in increased interference and increased pattern separation demand. To examine variability in object pattern separation…

  10. Object similarity affects the perceptual strategy underlying invariant visual object recognition in rats

    PubMed Central

    Rosselli, Federica B.; Alemi, Alireza; Ansuini, Alessio; Zoccolan, Davide

    2015-01-01

    In recent years, a number of studies have explored the possible use of rats as models of high-level visual functions. One central question at the root of such an investigation is to understand whether rat object vision relies on the processing of visual shape features or, rather, on lower-order image properties (e.g., overall brightness). In a recent study, we have shown that rats are capable of extracting multiple features of an object that are diagnostic of its identity, at least when those features are, structure-wise, distinct enough to be parsed by the rat visual system. In the present study, we have assessed the impact of object structure on rat perceptual strategy. We trained rats to discriminate between two structurally similar objects, and compared their recognition strategies with those reported in our previous study. We found that, under conditions of lower stimulus discriminability, rat visual discrimination strategy becomes more view-dependent and subject-dependent. Rats were still able to recognize the target objects, in a way that was largely tolerant (i.e., invariant) to object transformation; however, the larger structural and pixel-wise similarity affected the way objects were processed. Compared to the findings of our previous study, the patterns of diagnostic features were: (i) smaller and more scattered; (ii) only partially preserved across object views; and (iii) only partially reproducible across rats. On the other hand, rats were still found to adopt a multi-featural processing strategy and to make use of part of the optimal discriminatory information afforded by the two objects. Our findings suggest that, as in humans, rat invariant recognition can flexibly rely on either view-invariant representations of distinctive object features or view-specific object representations, acquired through learning. PMID:25814936

  11. Separability of Abstract-Category and Specific-Exemplar Visual Object Subsystems: Evidence from fMRI Pattern Analysis

    PubMed Central

    McMenamin, Brenton W.; Deason, Rebecca G.; Steele, Vaughn R.; Koutstaal, Wilma; Marsolek, Chad J.

    2014-01-01

    Previous research indicates that dissociable neural subsystems underlie abstract-category (AC) recognition and priming of objects (e.g., cat, piano) and specific-exemplar (SE) recognition and priming of objects (e.g., a calico cat, a different calico cat, a grand piano, etc.). However, the degree of separability between these subsystems is not known, despite the importance of this issue for assessing relevant theories. Visual object representations are widely distributed in visual cortex; thus a multivariate pattern analysis (MVPA) approach to analyzing functional magnetic resonance imaging (fMRI) data may be critical for assessing the separability of different kinds of visual object processing. Here we examined the neural representations of visual object categories and visual object exemplars using multi-voxel pattern analyses of brain activity elicited in visual object processing areas during a repetition-priming task. In the encoding phase, participants viewed visual objects and the printed names of other objects. In the subsequent test phase, participants identified objects that were either same-exemplar primed, different-exemplar primed, word-primed, or unprimed. In visual object processing areas, classifiers were trained to distinguish same-exemplar primed objects from word-primed objects. Then, the abilities of these classifiers to discriminate different-exemplar primed objects and word-primed objects (reflecting AC priming) and to discriminate same-exemplar primed objects and different-exemplar primed objects (reflecting SE priming) were assessed. Results indicated that (a) repetition priming in occipital-temporal regions is organized asymmetrically, such that AC priming is more prevalent in the left hemisphere and SE priming is more prevalent in the right hemisphere, and (b) AC and SE subsystems are weakly modular, not strongly modular or unified. PMID:25528436

  12. Separability of abstract-category and specific-exemplar visual object subsystems: evidence from fMRI pattern analysis.

    PubMed

    McMenamin, Brenton W; Deason, Rebecca G; Steele, Vaughn R; Koutstaal, Wilma; Marsolek, Chad J

    2015-02-01

    Previous research indicates that dissociable neural subsystems underlie abstract-category (AC) recognition and priming of objects (e.g., cat, piano) and specific-exemplar (SE) recognition and priming of objects (e.g., a calico cat, a different calico cat, a grand piano, etc.). However, the degree of separability between these subsystems is not known, despite the importance of this issue for assessing relevant theories. Visual object representations are widely distributed in visual cortex; thus a multivariate pattern analysis (MVPA) approach to analyzing functional magnetic resonance imaging (fMRI) data may be critical for assessing the separability of different kinds of visual object processing. Here we examined the neural representations of visual object categories and visual object exemplars using multi-voxel pattern analyses of brain activity elicited in visual object processing areas during a repetition-priming task. In the encoding phase, participants viewed visual objects and the printed names of other objects. In the subsequent test phase, participants identified objects that were either same-exemplar primed, different-exemplar primed, word-primed, or unprimed. In visual object processing areas, classifiers were trained to distinguish same-exemplar primed objects from word-primed objects. Then, the abilities of these classifiers to discriminate different-exemplar primed objects and word-primed objects (reflecting AC priming) and to discriminate same-exemplar primed objects and different-exemplar primed objects (reflecting SE priming) were assessed. Results indicated that (a) repetition priming in occipital-temporal regions is organized asymmetrically, such that AC priming is more prevalent in the left hemisphere and SE priming is more prevalent in the right hemisphere, and (b) AC and SE subsystems are weakly modular, not strongly modular or unified. Copyright © 2014 Elsevier Inc. All rights reserved.

  13. Mechanisms and neural basis of object and pattern recognition: a study with chess experts.

    PubMed

    Bilalić, Merim; Langner, Robert; Erb, Michael; Grodd, Wolfgang

    2010-11-01

    Comparing experts with novices offers unique insights into the functioning of cognition, based on the maximization of individual differences. Here we used this expertise approach to disentangle the mechanisms and neural basis behind two processes that contribute to everyday expertise: object and pattern recognition. We compared chess experts and novices performing chess-related and -unrelated (visual) search tasks. As expected, the superiority of experts was limited to the chess-specific task, as there were no differences in a control task that used the same chess stimuli but did not require chess-specific recognition. The analysis of eye movements showed that experts immediately and exclusively focused on the relevant aspects in the chess task, whereas novices also examined irrelevant aspects. With random chess positions, when pattern knowledge could not be used to guide perception, experts nevertheless maintained an advantage. Experts' superior domain-specific parafoveal vision, a consequence of their knowledge about individual domain-specific symbols, enabled improved object recognition. Functional magnetic resonance imaging corroborated this differentiation between object and pattern recognition and showed that chess-specific object recognition was accompanied by bilateral activation of the occipitotemporal junction, whereas chess-specific pattern recognition was related to bilateral activations in the middle part of the collateral sulci. Using the expertise approach together with carefully chosen controls and multiple dependent measures, we identified object and pattern recognition as two essential cognitive processes in expert visual cognition, which may also help to explain the mechanisms of everyday perception.

  14. [Symptoms and lesion localization in visual agnosia].

    PubMed

    Suzuki, Kyoko

    2004-11-01

    There are two cortical visual processing streams: the ventral and the dorsal stream. The ventral visual stream plays the major role in constructing our perceptual representation of the visual world and the objects within it. Disturbance of visual processing at any stage of the ventral stream can result in impairment of visual recognition, so systematic investigation is needed to diagnose visual agnosia and its type. Two types of category-selective visual agnosia, prosopagnosia and landmark agnosia, differ from others in that patients can recognize a face as a face and buildings as buildings, but cannot identify an individual person or building. The neuronal bases of prosopagnosia and landmark agnosia are distinct. The importance of the right fusiform gyrus for face recognition has been confirmed by both clinical and neuroimaging studies, whereas landmark agnosia is related to lesions in the right parahippocampal gyrus. Enlarged lesions including both the right fusiform and parahippocampal gyri can result in prosopagnosia and landmark agnosia at the same time. Category non-selective visual agnosia is related to bilateral occipito-temporal lesions, in agreement with neuroimaging studies that revealed activation of bilateral occipito-temporal cortex during object recognition tasks.

  15. Invariant recognition drives neural representations of action sequences

    PubMed Central

    Poggio, Tomaso

    2017-01-01

    Recognizing the actions of others from visual stimuli is a crucial aspect of human perception that allows individuals to respond to social cues. Humans are able to discriminate between similar actions despite transformations, like changes in viewpoint or actor, that substantially alter the visual appearance of a scene. This ability to generalize across complex transformations is a hallmark of human visual intelligence. Advances in understanding action recognition at the neural level have not always translated into precise accounts of the computational principles underlying what representations of action sequences are constructed by human visual cortex. Here we test the hypothesis that invariant action discrimination might fill this gap. Recently, the study of artificial systems for static object perception has produced models, Convolutional Neural Networks (CNNs), that achieve human level performance in complex discriminative tasks. Within this class, architectures that better support invariant object recognition also produce image representations that better match those implied by human and primate neural data. However, whether these models produce representations of action sequences that support recognition across complex transformations and closely follow neural representations of actions remains unknown. Here we show that spatiotemporal CNNs accurately categorize video stimuli into action classes, and that deliberate model modifications that improve performance on an invariant action recognition task lead to data representations that better match human neural recordings. Our results support our hypothesis that performance on invariant discrimination dictates the neural representations of actions computed in the brain. These results broaden the scope of the invariant recognition framework for understanding visual intelligence from perception of inanimate objects and faces in static images to the study of human perception of action sequences. PMID:29253864

  16. Category-Specificity in Visual Object Recognition

    ERIC Educational Resources Information Center

    Gerlach, Christian

    2009-01-01

    Are all categories of objects recognized in the same manner visually? Evidence from neuropsychology suggests they are not: some brain damaged patients are more impaired in recognizing natural objects than artefacts whereas others show the opposite impairment. Category-effects have also been demonstrated in neurologically intact subjects, but the…

  17. Biologically Inspired Visual Model With Preliminary Cognition and Active Attention Adjustment.

    PubMed

    Qiao, Hong; Xi, Xuanyang; Li, Yinlin; Wu, Wei; Li, Fengfu

    2015-11-01

    Recently, many computational models have been proposed to simulate visual cognition process. For example, the hierarchical Max-Pooling (HMAX) model was proposed according to the hierarchical and bottom-up structure of V1 to V4 in the ventral pathway of primate visual cortex, which could achieve position- and scale-tolerant recognition. In our previous work, we have introduced memory and association into the HMAX model to simulate visual cognition process. In this paper, we improve our theoretical framework by mimicking a more elaborate structure and function of the primate visual cortex. We will mainly focus on the new formation of memory and association in visual processing under different circumstances as well as preliminary cognition and active adjustment in the inferior temporal cortex, which are absent in the HMAX model. The main contributions of this paper are: 1) in the memory and association part, we apply deep convolutional neural networks to extract various episodic features of the objects since people use different features for object recognition. Moreover, to achieve a fast and robust recognition in the retrieval and association process, different types of features are stored in separated clusters and the feature binding of the same object is stimulated in a loop discharge manner and 2) in the preliminary cognition and active adjustment part, we introduce preliminary cognition to classify different types of objects since distinct neural circuits in a human brain are used for identification of various types of objects. Furthermore, active cognition adjustment of occlusion and orientation is implemented to the model to mimic the top-down effect in human cognition process. Finally, our model is evaluated on two face databases CAS-PEAL-R1 and AR. The results demonstrate that our model exhibits its efficiency on visual recognition process with much lower memory storage requirement and a better performance compared with the traditional purely computational methods.
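    The alternating template-matching and max-pooling stages that HMAX builds on can be illustrated with a minimal sketch. This shows only the generic S-layer/C-layer idea, not the model used in this record; the template and pooling size are arbitrary assumptions:

```python
import numpy as np

def s_layer(img, template):
    # Simple (S) layer: template matching at every valid position.
    th, tw = template.shape
    h, w = img.shape
    out = np.zeros((h - th + 1, w - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + th, j:j + tw] * template)
    return out

def c_layer(resp, pool=2):
    # Complex (C) layer: max pooling over local neighborhoods,
    # giving tolerance to small shifts in position (and, when pooled
    # across scales, to scale).
    h, w = resp.shape
    h2, w2 = h - h % pool, w - w % pool
    blocks = resp[:h2, :w2].reshape(h2 // pool, pool, w2 // pool, pool)
    return blocks.max(axis=(1, 3))
```

Stacking such S/C pairs is what yields the position- and scale-tolerant recognition the record attributes to HMAX.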

  18. Symbolic Play Connects to Language through Visual Object Recognition

    ERIC Educational Resources Information Center

    Smith, Linda B.; Jones, Susan S.

    2011-01-01

    Object substitutions in play (e.g. using a box as a car) are strongly linked to language learning and their absence is a diagnostic marker of language delay. Classic accounts posit a symbolic function that underlies both words and object substitutions. Here we show that object substitutions depend on developmental changes in visual object…

  19. Image Processing Strategies Based on a Visual Saliency Model for Object Recognition Under Simulated Prosthetic Vision.

    PubMed

    Wang, Jing; Li, Heng; Fu, Weizhen; Chen, Yao; Li, Liming; Lyu, Qing; Han, Tingting; Chai, Xinyu

    2016-01-01

    Retinal prostheses have the potential to restore partial vision. Object recognition in scenes of daily life is one of the essential tasks for implant wearers. Because wearers are still limited by the low-resolution visual percepts retinal prostheses provide, it is important to investigate and apply image processing methods that convey more useful visual information. We proposed two image processing strategies based on Itti's visual saliency map, region of interest (ROI) extraction, and image segmentation. Itti's saliency model generated a saliency map from the original image, in which salient regions were grouped into an ROI by fuzzy c-means clustering. GrabCut then generated a proto-object from the ROI-labeled image, which was recombined with the background and enhanced in two ways: 8-4 separated pixelization (8-4 SP) and background edge extraction (BEE). Results showed that both 8-4 SP and BEE had significantly higher recognition accuracy than direct pixelization (DP). Each saliency-based image processing strategy was subject to the performance of image segmentation. Under good and perfect segmentation conditions, BEE and 8-4 SP obtained noticeably higher recognition accuracy than DP, and under the bad segmentation condition, only BEE boosted performance. The application of saliency-based image processing strategies was verified to be beneficial to object recognition in daily scenes under simulated prosthetic vision. These strategies may inform the design of image processing modules for future retinal prostheses and thus provide more benefit to patients. Copyright © 2015 International Center for Artificial Organs and Transplantation and Wiley Periodicals, Inc.
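    Direct pixelization (DP), the baseline against which the two strategies are compared, amounts to block averaging down to the electrode grid followed by quantization to the few available gray levels. A minimal sketch; the grid size and level count are illustrative assumptions, not the study's parameters:

```python
import numpy as np

def direct_pixelization(img, grid=(32, 32), levels=3):
    # Downsample a [0, 1] grayscale image to the prosthesis grid by
    # block averaging, then quantize to the available gray levels.
    h, w = img.shape
    gh, gw = grid
    cropped = img[:h - h % gh, :w - w % gw]
    blocks = cropped.reshape(gh, cropped.shape[0] // gh,
                             gw, cropped.shape[1] // gw)
    low = blocks.mean(axis=(1, 3))            # one value per phosphene
    return np.round(low * (levels - 1)) / (levels - 1)
```

With `levels=3` every phosphene takes one of three values (0, 0.5, 1), mirroring the coarse grayscale budget that motivates the saliency-based enhancements.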

  20. The Functional Architecture of Visual Object Recognition

    DTIC Science & Technology

    1991-07-01

    different forms of agnosia can provide clues to the representations underlying normal object recognition (Farah, 1990). For example, the pair-wise...patterns of deficit and sparing occur. In a review of 99 published cases of agnosia, the observed patterns of co-occurrence implicated two underlying

  1. Associated impairment of the categories of conspecifics and biological entities: cognitive and neuroanatomical aspects of a new case.

    PubMed

    Capitani, Erminio; Chieppa, Francesca; Laiacona, Marcella

    2010-05-01

    Case A.C.A. presented an associated impairment of visual recognition and semantic knowledge for celebrities and biological objects. This case was relevant for (a) the neuroanatomical correlations, and (b) the relationship between visual recognition and semantics within the biological domain and the conspecifics domain. A.C.A. was not affected by anterior temporal damage. Her bilateral vascular lesions were localized on the medial and inferior temporal gyrus on the right and on the intermediate fusiform gyrus on the left, without concomitant lesions of the parahippocampal gyrus or posterior fusiform. Data analysis was based on a novel methodology developed to estimate the rate of stored items in the visual structural description system (SDS) or in the face recognition unit. For each biological object, no particular correlation was found between the visual information accessed through the semantic system and that tapped by the picture reality judgement. Findings are discussed with reference to whether a putative resource commonality is likely between biological objects and conspecifics, and whether or not either category may depend on an exclusive neural substrate.

  2. Manipulating Color and Other Visual Information Influences Picture Naming at Different Levels of Processing: Evidence from Alzheimer Subjects and Normal Controls

    ERIC Educational Resources Information Center

    Zannino, Gian Daniele; Perri, Roberta; Salamone, Giovanna; Di Lorenzo, Concetta; Caltagirone, Carlo; Carlesimo, Giovanni A.

    2010-01-01

    There is now a large body of evidence suggesting that color and photographic detail exert an effect on recognition of visually presented familiar objects. However, an unresolved issue is whether these factors act at the visual, the semantic or lexical level of the recognition process. In the present study, we investigated this issue by having…

  3. Visual body recognition in a prosopagnosic patient.

    PubMed

    Moro, V; Pernigo, S; Avesani, R; Bulgarelli, C; Urgesi, C; Candidi, M; Aglioti, S M

    2012-01-01

    Conspicuous deficits in face recognition characterize prosopagnosia. Information on whether agnosic deficits may extend to non-facial body parts is lacking. Here we report the neuropsychological description of FM, a patient affected by a complete deficit in face recognition in the presence of mild clinical signs of visual object agnosia. His deficit involves both overt and covert recognition of faces (i.e. recognition of familiar faces, but also categorization of faces for gender or age) as well as the visual mental imagery of faces. By means of a series of matching-to-sample tasks we investigated: (i) a possible association between prosopagnosia and disorders in visual body perception; (ii) the effect of the emotional content of stimuli on the visual discrimination of faces, bodies and objects; (iii) the existence of a dissociation between identity recognition and the emotional discrimination of faces and bodies. Our results document, for the first time, the co-occurrence of body agnosia, i.e. the visual inability to discriminate body forms and body actions, and prosopagnosia. Moreover, the results show better performance in the discrimination of emotional face and body expressions with respect to body identity and neutral actions. Since FM's lesions involve bilateral fusiform areas, it is unlikely that the amygdala-temporal projections explain the relative sparing of emotion discrimination performance. Indeed, the emotional content of the stimuli did not improve the discrimination of their identity. The results hint at the existence of two segregated brain networks involved in identity and emotional discrimination that are at least partially shared by face and body processing. Copyright © 2011 Elsevier Ltd. All rights reserved.

  4. Atoms of recognition in human and computer vision.

    PubMed

    Ullman, Shimon; Assif, Liav; Fetaya, Ethan; Harari, Daniel

    2016-03-08

    Discovering the visual features and representations used by the brain to recognize objects is a central problem in the study of vision. Recently, neural network models of visual object recognition, including biological and deep network models, have shown remarkable progress and have begun to rival human performance in some challenging tasks. These models are trained on image examples and learn to extract features and representations and to use them for categorization. It remains unclear, however, whether the representations and learning processes discovered by current models are similar to those used by the human visual system. Here we show, by introducing and using minimal recognizable images, that the human visual system uses features and processes that are not used by current models and that are critical for recognition. We found by psychophysical studies that at the level of minimal recognizable images a minute change in the image can have a drastic effect on recognition, thus identifying features that are critical for the task. Simulations then showed that current models cannot explain this sensitivity to precise feature configurations and, more generally, do not learn to recognize minimal images at a human level. The role of the features shown here is revealed uniquely at the minimal level, where the contribution of each feature is essential. A full understanding of the learning and use of such features will extend our understanding of visual recognition and its cortical mechanisms and will enhance the capacity of computational models to learn from visual experience and to deal with recognition and detailed image interpretation.

  5. Viewpoint dependence in the recognition of non-elongated familiar objects: testing the effects of symmetry, front-back axis, and familiarity.

    PubMed

    Niimi, Ryosuke; Yokosawa, Kazuhiko

    2009-01-01

    Visual recognition of three-dimensional (3-D) objects is relatively impaired for some particular views, called accidental views. For most familiar objects, the front and top views are considered to be accidental views. Previous studies have shown that foreshortening of the axes of elongation of objects in these views impairs recognition, but the influence of other possible factors is largely unknown. Using familiar objects without a salient axis of elongation, we found that a foreshortened symmetry plane of the object and low familiarity of the viewpoint accounted for the relatively worse recognition for front views and top views, independently of the effect of a foreshortened axis of elongation. We found no evidence that foreshortened front-back axes impaired recognition in front views. These results suggest that the viewpoint dependence of familiar object recognition is not a unitary phenomenon. The possible role of symmetry (either 2-D or 3-D) in familiar object recognition is also discussed.

  6. Face-specific and domain-general visual processing deficits in children with developmental prosopagnosia.

    PubMed

    Dalrymple, Kirsten A; Elison, Jed T; Duchaine, Brad

    2017-02-01

    Evidence suggests that face and object recognition depend on distinct neural circuitry within the visual system. Work with adults with developmental prosopagnosia (DP) demonstrates that some individuals have preserved object recognition despite severe face recognition deficits. This face selectivity in adults with DP indicates that face- and object-processing systems can develop independently, but it is unclear at what point in development these mechanisms are separable. Determining when individuals with DP first show dissociations between faces and objects is one means to address this question. In the current study, we investigated face and object processing in six children with DP (5-12-years-old). Each child was assessed with one face perception test, two different face memory tests, and two object memory tests that were matched to the face memory tests in format and difficulty. Scores from the DP children on the matched face and object tasks were compared to within-subject data from age-matched controls. Four of the six DP children, including the 5-year-old, showed evidence of face-specific deficits, while one child appeared to have more general visual-processing deficits. The remaining child had inconsistent results. The presence of face-specific deficits in children with DP suggests that face and object perception depend on dissociable processes in childhood.

  7. A Biologically Plausible Transform for Visual Recognition that is Invariant to Translation, Scale, and Rotation.

    PubMed

    Sountsov, Pavel; Santucci, David M; Lisman, John E

    2011-01-01

    Visual object recognition occurs easily despite differences in position, size, and rotation of the object, but the neural mechanisms responsible for this invariance are not known. We have found a set of transforms that achieve invariance in a neurally plausible way. We find that a transform based on local spatial frequency analysis of oriented segments and on logarithmic mapping, when applied twice in an iterative fashion, produces an output image that is unique to the object and that remains constant as the input image is shifted, scaled, or rotated.

  8. A Biologically Plausible Transform for Visual Recognition that is Invariant to Translation, Scale, and Rotation

    PubMed Central

    Sountsov, Pavel; Santucci, David M.; Lisman, John E.

    2011-01-01

    Visual object recognition occurs easily despite differences in position, size, and rotation of the object, but the neural mechanisms responsible for this invariance are not known. We have found a set of transforms that achieve invariance in a neurally plausible way. We find that a transform based on local spatial frequency analysis of oriented segments and on logarithmic mapping, when applied twice in an iterative fashion, produces an output image that is unique to the object and that remains constant as the input image is shifted, scaled, or rotated. PMID:22125522
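    A closely related classic construction, the Fourier-Mellin transform, shows how taking a spectral magnitude twice around a logarithmic remapping yields signatures tolerant to translation, scale, and rotation. This sketches that textbook idea, not the authors' oriented-segment transform; the nearest-neighbor sampling and grid sizes are illustrative:

```python
import numpy as np

def logpolar(mag, n_r=64, n_t=64):
    # Resample a centered magnitude spectrum onto a log-polar grid:
    # rotation becomes a shift along theta, scaling a shift along log-radius.
    h, w = mag.shape
    cy, cx = h / 2, w / 2
    rs = np.exp(np.linspace(0, np.log(min(cy, cx)), n_r))
    ts = np.linspace(0, 2 * np.pi, n_t, endpoint=False)
    ys = np.clip((cy + rs[:, None] * np.sin(ts)).astype(int), 0, h - 1)
    xs = np.clip((cx + rs[:, None] * np.cos(ts)).astype(int), 0, w - 1)
    return mag[ys, xs]

def invariant_signature(img):
    # 1st |FFT| discards translation; the log-polar remap converts
    # rotation/scale to shifts; the 2nd |FFT| discards those shifts.
    mag = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    return np.abs(np.fft.fft2(logpolar(mag)))
```

Because the Fourier magnitude is exactly invariant to circular shifts, translated copies of an image produce identical signatures; scale and rotation tolerance is approximate at this sampling resolution.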

  9. Exploiting Attribute Correlations: A Novel Trace Lasso-Based Weakly Supervised Dictionary Learning Method.

    PubMed

    Wu, Lin; Wang, Yang; Pan, Shirui

    2017-12-01

    It is now well established that sparse representation models work effectively for many visual recognition tasks and have driven the success of dictionary learning in this area. Recent studies of dictionary learning focus on learning discriminative atoms instead of purely reconstructive ones. However, the existence of intraclass diversities (i.e., data objects within the same category that exhibit large visual dissimilarities) and interclass similarities (i.e., data objects from distinct classes that share strong visual similarities) makes it challenging to learn effective recognition models. A large number of labeled data objects would be required to learn models that can effectively characterize these subtle differences; however, labeled data objects are often difficult to obtain, making it hard to learn a monolithic dictionary that is sufficiently discriminative. To address these limitations, in this paper we propose a weakly supervised dictionary learning method that automatically learns a discriminative dictionary by fully exploiting visual attribute correlations rather than label priors. In particular, the intrinsic attribute correlations are deployed as a critical cue to guide the process of object categorization, and a set of subdictionaries are jointly learned with respect to each category. The resulting dictionary is highly discriminative and leads to intraclass-diversity-aware sparse representations. Extensive experiments on image classification and object recognition show the effectiveness of our approach.
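    The sparse-coding step that such dictionary models build on can be sketched with a minimal ISTA solver for the standard lasso objective. This is a generic sketch of sparse coding, not the record's trace-lasso or weakly supervised formulation; `lam` and the iteration count are illustrative:

```python
import numpy as np

def ista_sparse_code(D, x, lam=0.1, n_iter=200):
    # Solve min_a 0.5 * ||x - D a||^2 + lam * ||a||_1 by iterative
    # shrinkage-thresholding: gradient step, then soft threshold.
    L = np.linalg.norm(D, 2) ** 2      # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = D.T @ (D @ a - x)          # gradient of the quadratic term
        a = a - g / L
        a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)
    return a
```

Dictionary learning alternates this coding step with updates to the atoms of `D`; the record's contribution lies in how attribute correlations shape which subdictionary codes each object.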

  10. Direction of Magnetoencephalography Sources Associated with Feedback and Feedforward Contributions in a Visual Object Recognition Task

    PubMed Central

    Ahlfors, Seppo P.; Jones, Stephanie R.; Ahveninen, Jyrki; Hämäläinen, Matti S.; Belliveau, John W.; Bar, Moshe

    2014-01-01

    Identifying inter-area communication in terms of the hierarchical organization of functional brain areas is of considerable interest in human neuroimaging. Previous studies have suggested that the direction of magneto- and electroencephalography (MEG, EEG) source currents depends on the layer-specific input patterns into a cortical area. We examined the direction in MEG source currents in a visual object recognition experiment in which there were specific expectations of activation in the fusiform region being driven by either feedforward or feedback inputs. The source for the early non-specific visual evoked response, presumably corresponding to feedforward driven activity, pointed outward, i.e., away from the white matter. In contrast, the source for the later, object-recognition related signals, expected to be driven by feedback inputs, pointed inward, toward the white matter. Associating specific features of the MEG/EEG source waveforms to feedforward and feedback inputs could provide unique information about the activation patterns within hierarchically organized cortical areas. PMID:25445356

  11. Are face representations depth cue invariant?

    PubMed

    Dehmoobadsharifabadi, Armita; Farivar, Reza

    2016-06-01

    The visual system can process three-dimensional depth cues defining surfaces of objects, but it is unclear whether such information contributes to complex object recognition, including face recognition. The processing of different depth cues involves both dorsal and ventral visual pathways. We investigated whether facial surfaces defined by individual depth cues resulted in meaningful face representations-representations that maintain the relationship between the population of faces as defined in a multidimensional face space. We measured face identity aftereffects for facial surfaces defined by individual depth cues (Experiments 1 and 2) and tested whether the aftereffect transfers across depth cues (Experiments 3 and 4). Facial surfaces and their morphs to the average face were defined purely by one of shading, texture, motion, or binocular disparity. We obtained identification thresholds for matched (matched identity between adapting and test stimuli), non-matched (non-matched identity between adapting and test stimuli), and no-adaptation (showing only the test stimuli) conditions for each cue and across different depth cues. We found robust face identity aftereffects in both sets of experiments. Our results suggest that depth cues do contribute to forming meaningful face representations that are depth cue invariant. Depth cue invariance would require integration of information across different areas and different pathways for object recognition, and this in turn has important implications for cortical models of visual object recognition.

  12. Get rich quick: the signal to respond procedure reveals the time course of semantic richness effects during visual word recognition.

    PubMed

    Hargreaves, Ian S; Pexman, Penny M

    2014-05-01

    According to several current frameworks, semantic processing involves an early influence of language-based information followed by later influences of object-based information (e.g., situated simulations; Santos, Chaigneau, Simmons, & Barsalou, 2011). In the present study we examined whether these predictions extend to the influence of semantic variables in visual word recognition. We investigated the time course of semantic richness effects in visual word recognition using a signal-to-respond (STR) paradigm fitted to a lexical decision task (LDT) and a semantic categorization task (SCT). We used linear mixed-effects models to examine the relative contributions of language-based (number of senses, ARC) and object-based (imageability, number of features, body-object interaction ratings) descriptions of semantic richness at four STR durations (75, 100, 200, and 400 ms). Results showed an early influence of number of senses and ARC in the SCT. In both the LDT and SCT, object-based effects were the last to influence participants' decision latencies. We interpret our results within a framework in which semantic processes are available to influence word recognition as a function of their availability over time and of their relevance to task-specific demands. Copyright © 2014 Elsevier B.V. All rights reserved.

  13. Model-Driven Study of Visual Memory

    DTIC Science & Technology

    2004-12-01

    …dimensional stimuli (synthetic human faces) afford important insights into episodic recognition memory. The results were well accommodated by a summed… the unusual properties of the z-transformed ROCs. SUBJECT TERMS: Memory, visual memory, computational model, human memory, faces, identity. … Accomplishments/New Findings; Work on Objective One: Recognition Memory for Synthetic Faces; Experiment 1…

  14. Functional specialization and convergence in the occipito-temporal cortex supporting haptic and visual identification of human faces and body parts: an fMRI study.

    PubMed

    Kitada, Ryo; Johnsrude, Ingrid S; Kochiyama, Takanori; Lederman, Susan J

    2009-10-01

    Humans can recognize common objects by touch extremely well whenever vision is unavailable. Despite its importance to a thorough understanding of human object recognition, the neuroscientific study of this topic has been relatively neglected. To date, the few published studies have addressed the haptic recognition of nonbiological objects. We now focus on haptic recognition of the human body, a particularly salient object category for touch. Neuroimaging studies demonstrate that regions of the occipito-temporal cortex are specialized for visual perception of faces (fusiform face area, FFA) and other body parts (extrastriate body area, EBA). Are the same category-sensitive regions activated when these components of the body are recognized haptically? Here, we use fMRI to compare brain organization for haptic and visual recognition of human body parts. Sixteen subjects identified exemplars of faces, hands, feet, and nonbiological control objects using vision and haptics separately. We identified two discrete regions within the fusiform gyrus (FFA and the haptic face region) that were each sensitive to both haptically and visually presented faces; however, these two regions differed significantly in their response patterns. Similarly, two regions within the lateral occipito-temporal area (EBA and the haptic body region) were each sensitive to body parts in both modalities, although the response patterns differed. Thus, although the fusiform gyrus and the lateral occipito-temporal cortex appear to exhibit modality-independent, category-sensitive activity, our results also indicate a degree of functional specialization related to sensory modality within these structures.

  15. Object memory and change detection: dissociation as a function of visual and conceptual similarity.

    PubMed

    Yeh, Yei-Yu; Yang, Cheng-Ta

    2008-01-01

    People often fail to detect a change between two visual scenes, a phenomenon referred to as change blindness. This study investigates how a post-change object's similarity to the pre-change object influences memory of the pre-change object and affects change detection. The results of Experiment 1 showed that similarity lowered detection sensitivity but did not affect the speed of identifying the pre-change object, suggesting that similarity between the pre- and post-change objects does not degrade the pre-change representation. Identification speed for the pre-change object was faster than naming the new object regardless of detection accuracy. Similarity also decreased detection sensitivity in Experiment 2 but improved the recognition of the pre-change object under both correct detection and detection failure. The similarity effect on recognition was greatly reduced when 20% of each pre-change stimulus was masked by random dots in Experiment 3. Together the results suggest that the level of pre-change representation under detection failure is equivalent to the level under correct detection and that the pre-change representation is almost complete. Similarity lowers detection sensitivity but improves explicit access in recognition. Dissociation arises between recognition and change detection as the two judgments rely on the match-to-mismatch signal and mismatch-to-match signal, respectively.

  16. Representations of Shape in Object Recognition and Long-Term Visual Memory

    DTIC Science & Technology

    1993-02-11

    …in anything other than linguistic terms (Biederman, 1987, for example). STATUS: 1. Viewpoint-Dependent Features in Object Representation. Tarr and… is object-based orientation-independent representations sufficient for "basic-level" categorization (Biederman, 1987; Corballis, 1988). Alternatively… space. REFERENCES: Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94, 115-147. Cooper, L…

  17. Repetition priming of face recognition in a serial choice reaction-time task.

    PubMed

    Roberts, T; Bruce, V

    1989-05-01

    Marshall & Walker (1987) found that pictorial stimuli yield visual priming that is disrupted by an unpredictable visual event in the response-stimulus interval. They argue that visual stimuli are represented in memory in the form of distinct visual and object codes. Bruce & Young (1986) propose similar pictorial, structural and semantic codes which mediate the recognition of faces, yet repetition priming results obtained with faces as stimuli (Bruce & Valentine, 1985), and with objects (Warren & Morton, 1982) are quite different from those of Marshall & Walker (1987), in the sense that recognition is facilitated by pictures presented 20 minutes earlier. The experiment reported here used different views of familiar and unfamiliar faces as stimuli in a serial choice reaction-time task and found that, with identical pictures, repetition priming survives an intervening item requiring a response, with both familiar and unfamiliar faces. Furthermore, with familiar faces such priming was present even when the view of the prime was different from the target. The theoretical implications of these results are discussed.

  18. Multispectral image analysis for object recognition and classification

    NASA Astrophysics Data System (ADS)

    Viau, C. R.; Payeur, P.; Cretu, A.-M.

    2016-05-01

    Computer and machine vision applications are used in numerous fields to analyze static and dynamic imagery in order to assist or automate decision-making processes. Advancements in sensor technologies now make it possible to capture and visualize imagery at various wavelengths (or bands) of the electromagnetic spectrum. Multispectral imaging has countless applications in various fields including (but not limited to) security, defense, space, medical, manufacturing and archeology. The development of advanced algorithms to process and extract salient information from the imagery is a critical component of the overall system performance. The fundamental objective of this research project was to investigate the benefits of combining imagery from the visual and thermal bands of the electromagnetic spectrum to improve the recognition rates and accuracy of commonly found objects in an office setting. A multispectral dataset (visual and thermal) was captured and features from the visual and thermal images were extracted and used to train support vector machine (SVM) classifiers. The SVM's class prediction ability was evaluated separately on the visual, thermal and multispectral testing datasets.
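
    The fusion-and-classification step described above lends itself to a small numerical sketch. The snippet below is an illustrative stand-in, not the authors' pipeline: feature values are synthetic, and a plain linear SVM trained by sub-gradient descent on the hinge loss replaces their SVM implementation (all names and dimensions are assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Synthetic stand-ins for per-object descriptors: shape/texture features
# from the visual band and intensity statistics from the thermal band.
visual = rng.normal(size=(n, 8))
labels = np.arange(n) % 2                             # two object classes
thermal = rng.normal(size=(n, 4)) + labels[:, None]   # thermal is class-informative

# Early fusion: concatenate band-specific features into one vector.
fused = np.hstack([visual, thermal])
y = 2 * labels - 1                                    # SVM targets in {-1, +1}

train, test = slice(0, 150), slice(150, n)

# Linear SVM trained by sub-gradient descent on the regularized hinge loss.
w, b, lr, lam = np.zeros(fused.shape[1]), 0.0, 0.01, 1e-3
for _ in range(200):
    for i in rng.permutation(150):
        if y[i] * (fused[i] @ w + b) < 1:             # margin violation
            w += lr * (y[i] * fused[i] - lam * w)
            b += lr * y[i]
        else:
            w -= lr * lam * w

acc = np.mean(np.sign(fused[test] @ w + b) == y[test])
print(f"multispectral (visual+thermal) test accuracy: {acc:.2f}")
```

    In this toy setup the thermal band carries the class signal, so the fused classifier recovers most of the separability; in the study, real visual and thermal image descriptors play these roles.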

  19. [A case of carbon monoxide poisoning by explosion of coal mine presenting as visual agnosia: re-evaluation after 40 years].

    PubMed

    Takaiwa, Akiko; Yamashita, Kenichiro; Nomura, Takuo; Shida, Kenshiro; Taniwaki, Takayuki

    2005-11-01

    We re-evaluated a patient with carbon monoxide poisoning presenting as visual agnosia, who had been injured in the explosion of the Miike-Mikawa coal mine 40 years ago. In the early stage, his main neuropsychological symptoms were visual agnosia, severe anterograde amnesia, alexia, agraphia, constructional apraxia, left hemispatial neglect, and psychic paralysis of gaze, in addition to pyramidal and extrapyramidal signs. At re-evaluation after 40 years, he still showed visual agnosia associated with agraphia and constructional apraxia. Regarding the visual agnosia, recognition of real objects was preserved, while recognition of object photographs and pictures was impaired. This case was therefore considered to have picture agnosia, as he could not recognize objects from pictorial cues presented in two-dimensional space. MRI examination revealed low-signal-intensity lesions and cortical atrophy in the bilateral parieto-occipital lobes on T1-weighted images. The bilateral parieto-occipital lesions are therefore likely to be responsible for his picture agnosia.

  20. Global precedence effects account for individual differences in both face and object recognition performance.

    PubMed

    Gerlach, Christian; Starrfelt, Randi

    2018-03-20

    There has been an increase in studies adopting an individual difference approach to examine visual cognition and in particular in studies trying to relate face recognition performance with measures of holistic processing (the face composite effect and the part-whole effect). In the present study we examine whether global precedence effects, measured by means of non-face stimuli in Navon's paradigm, can also account for individual differences in face recognition and, if so, whether the effect is of similar magnitude for faces and objects. We find evidence that global precedence effects facilitate both face and object recognition, and to a similar extent. Our results suggest that both face and object recognition are characterized by a coarse-to-fine temporal dynamic, where global shape information is derived prior to local shape information, and that the efficiency of face and object recognition is related to the magnitude of the global precedence effect.

  1. Evidence for the activation of sensorimotor information during visual word recognition: the body-object interaction effect.

    PubMed

    Siakaluk, Paul D; Pexman, Penny M; Aguilera, Laura; Owen, William J; Sears, Christopher R

    2008-01-01

    We examined the effects of sensorimotor experience in two visual word recognition tasks. Body-object interaction (BOI) ratings were collected for a large set of words. These ratings assess perceptions of the ease with which a human body can physically interact with a word's referent. A set of high BOI words (e.g., mask) and a set of low BOI words (e.g., ship) were created, matched on imageability and concreteness. Facilitatory BOI effects were observed in lexical decision and phonological lexical decision tasks: responses were faster for high BOI words than for low BOI words. We discuss how our findings may be accounted for by (a) semantic feedback within the visual word recognition system, and (b) an embodied view of cognition (e.g., Barsalou's perceptual symbol systems theory), which proposes that semantic knowledge is grounded in sensorimotor interactions with the environment.

  2. A neurophysiologically plausible population code model for feature integration explains visual crowding.

    PubMed

    van den Berg, Ronald; Roerdink, Jos B T M; Cornelissen, Frans W

    2010-01-22

    An object in the peripheral visual field is more difficult to recognize when surrounded by other objects. This phenomenon is called "crowding". Crowding places a fundamental constraint on human vision that limits performance on numerous tasks. It has been suggested that crowding results from spatial feature integration necessary for object recognition. However, in the absence of convincing models, this theory has remained controversial. Here, we present a quantitative and physiologically plausible model for spatial integration of orientation signals, based on the principles of population coding. Using simulations, we demonstrate that this model coherently accounts for fundamental properties of crowding, including critical spacing, "compulsory averaging", and a foveal-peripheral anisotropy. Moreover, we show that the model predicts increased responses to correlated visual stimuli. Altogether, these results suggest that crowding has little immediate bearing on object recognition but is a by-product of a general, elementary integration mechanism in early vision aimed at improving signal quality.
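
    The proposed integration mechanism can be illustrated with a minimal population-coding sketch (not the authors' model; the Gaussian tuning width, the orientation grid, and the centroid readout are all assumptions): pooling the responses of orientation-tuned units to a target and a nearby flanker yields a decoded orientation near the average of the two, the "compulsory averaging" signature of crowding.

```python
import numpy as np

prefs = np.linspace(-90, 90, 181)   # preferred orientations (deg), 1-deg grid
sigma = 20.0                        # tuning width (assumed)

def response(theta):
    """Population response of Gaussian-tuned units to orientation theta."""
    return np.exp(-0.5 * ((prefs - theta) / sigma) ** 2)

def decode(pop):
    """Read out orientation as the population centroid estimate."""
    return float(np.sum(prefs * pop) / np.sum(pop))

target, flanker = -20.0, 20.0
pooled = response(target) + response(flanker)   # spatial feature integration
print(decode(response(target)))  # isolated target: close to -20
print(decode(pooled))            # crowded: near the average of -20 and 20
```

    The isolated target decodes veridically, while the pooled (crowded) response decodes to roughly the mean of target and flanker orientations, mirroring the compulsory-averaging behavior the model accounts for.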

  3. Does object view influence the scene consistency effect?

    PubMed

    Sastyin, Gergo; Niimi, Ryosuke; Yokosawa, Kazuhiko

    2015-04-01

    Traditional research on the scene consistency effect only used clearly recognizable object stimuli to show mutually interactive context effects for both the object and background components on scene perception (Davenport & Potter in Psychological Science, 15, 559-564, 2004). However, in real environments, objects are viewed from multiple viewpoints, including an accidental, hard-to-recognize one. When the observers named target objects in scenes (Experiments 1a and 1b, object recognition task), we replicated the scene consistency effect (i.e., there was higher accuracy for the objects with consistent backgrounds). However, there was a significant interaction effect between consistency and object viewpoint, which indicated that the scene consistency effect was more important for identifying objects in the accidental view condition than in the canonical view condition. Therefore, the object recognition system may rely more on the scene context when the object is difficult to recognize. In Experiment 2, the observers identified the background (background recognition task) while the scene consistency and object views were manipulated. The results showed that object viewpoint had no effect, while the scene consistency effect was observed. More specifically, the canonical and accidental views both equally provided contextual information for scene perception. These findings suggested that the mechanism for conscious recognition of objects could be dissociated from the mechanism for visual analysis of object images that were part of a scene. The "context" that the object images provided may have been derived from its view-invariant, relatively low-level visual features (e.g., color), rather than its semantic information.

  4. Cotinine improves visual recognition memory and decreases cortical Tau phosphorylation in the Tg6799 mice.

    PubMed

    Grizzell, J Alex; Patel, Sagar; Barreto, George E; Echeverria, Valentina

    2017-08-01

    Alzheimer's disease (AD) is associated with the progressive aggregation of hyperphosphorylated forms of the microtubule-associated protein Tau in the central nervous system. Cotinine, the main metabolite of nicotine, reduced working memory deficits, synaptic loss, and amyloid β peptide aggregation into oligomers and plaques, as well as inhibited the cerebral Tau kinase glycogen synthase kinase 3β (GSK3β), in the transgenic (Tg)6799 (5XFAD) mice. In this study, the effects of cotinine on visual recognition memory and on cortical Tau phosphorylation at the GSK3β sites Serine (Ser)-396/Ser-404 and phospho-CREB were investigated in the Tg6799 and non-transgenic (NT) littermate mice. Tg mice showed short-term visual recognition memory impairment in the novel object recognition test, and higher levels of Tau phosphorylation when compared to NT mice. Cotinine significantly improved visual recognition memory performance, increased CREB phosphorylation, and reduced cortical Tau phosphorylation. Potential mechanisms underlying these beneficial effects are discussed. Copyright © 2017. Published by Elsevier Inc.

  5. A Computational Model of Semantic Memory Impairment: Modality- Specificity and Emergent Category-Specificity

    DTIC Science & Technology

    1991-09-01

    …just one modality (e.g., visual or auditory agnosia) or impaired manipulation of objects with specific uses, despite intact recognition of them (apraxia… Neurosurgery, and Psychiatry, 51, 1201-1207. Farah, M. J. (1991). Patterns of co-occurrence among the associative agnosias: Implications for visual object…

  6. Recognition memory is modulated by visual similarity.

    PubMed

    Yago, Elena; Ishai, Alumit

    2006-06-01

    We used event-related fMRI to test whether recognition memory depends on visual similarity between familiar prototypes and novel exemplars. Subjects memorized portraits, landscapes, and abstract compositions by six painters with a unique style, and later performed a memory recognition task. The prototypes were presented with new exemplars that were either visually similar or dissimilar. Behaviorally, novel, dissimilar items were detected faster and more accurately. We found activation in a distributed cortical network that included face- and object-selective regions in the visual cortex, where familiar prototypes evoked stronger responses than new exemplars; attention-related regions in parietal cortex, where responses elicited by new exemplars were reduced with decreased similarity to the prototypes; and the hippocampus and memory-related regions in parietal and prefrontal cortices, where stronger responses were evoked by the dissimilar exemplars. Our findings suggest that recognition memory is mediated by classification of novel exemplars as a match or a mismatch, based on their visual similarity to familiar prototypes.

  7. Illusory conjunctions in visual short-term memory: Individual differences in corpus callosum connectivity and splitting attention between the two hemifields.

    PubMed

    Qin, Shuo; Ray, Nicholas R; Ramakrishnan, Nithya; Nashiro, Kaoru; O'Connell, Margaret A; Basak, Chandramallika

    2016-11-01

    Overloading the capacity of visual attention can result in mistakenly combining the various features of an object, that is, illusory conjunctions. We hypothesize that if the two hemispheres separately process visual information by splitting attention, connectivity of the corpus callosum, a brain structure integrating the two hemispheres, would predict the degree of illusory conjunctions. In the current study, we assessed two types of illusory conjunctions using a memory-scanning paradigm; the features were either presented across the two opposite hemifields or within the same hemifield. Four objects, each with two visual features, were briefly presented together, followed by a probe recognition and a confidence rating for the recognition accuracy. MRI scans were also obtained. Results indicated that successful recollection during probe recognition was better for across-hemifield conjunctions compared to within-hemifield conjunctions, lending support to the bilateral advantage of the two hemispheres in visual short-term memory. Age-related differences regarding the underlying mechanisms of the bilateral advantage indicated greater reliance on recollection-based processing in younger adults and on familiarity-based processing in older adults. Moreover, the integrity of the posterior corpus callosum was more predictive of opposite-hemifield illusory conjunctions compared to within-hemifield illusory conjunctions, even after controlling for age. That is, individuals with lower posterior corpus callosum connectivity had better recognition for objects when their features were recombined from the opposite hemifields than from the same hemifield. This study is the first to investigate the role of the corpus callosum in splitting attention between versus within hemifields. © 2016 Society for Psychophysiological Research.

  8. Perirhinal Cortex Resolves Feature Ambiguity in Configural Object Recognition and Perceptual Oddity Tasks

    ERIC Educational Resources Information Center

    Bartko, Susan J.; Winters, Boyer D.; Cowell, Rosemary A.; Saksida, Lisa M.; Bussey, Timothy J.

    2007-01-01

    The perirhinal cortex (PRh) has a well-established role in object recognition memory. More recent studies suggest that PRh is also important for two-choice visual discrimination tasks. Specifically, it has been suggested that PRh contains conjunctive representations that help resolve feature ambiguity, which occurs when a task cannot easily be…

  9. The roles of scene priming and location priming in object-scene consistency effects

    PubMed Central

    Heise, Nils; Ansorge, Ulrich

    2014-01-01

    Presenting consistent objects in scenes facilitates object recognition as compared to inconsistent objects. Yet the mechanisms by which scenes influence object recognition are still not understood. According to one theory, consistent scenes facilitate visual search for objects at expected places. Here, we investigated two predictions following from this theory: If visual search is responsible for consistency effects, consistency effects could be weaker (1) with better-primed than less-primed object locations, and (2) with less-primed than better-primed scenes. In Experiments 1 and 2, locations of objects were varied within a scene to a different degree (one, two, or four possible locations). In addition, object-scene consistency was studied as a function of progressive numbers of repetitions of the backgrounds. Because repeating locations and backgrounds could facilitate visual search for objects, these repetitions might alter the object-scene consistency effect by lowering location uncertainty. Although we find evidence for a significant consistency effect, we find no clear support for impacts of scene priming or location priming on the size of the consistency effect. Additionally, we find evidence that the consistency effect is dependent on the eccentricity of the target objects. These results point to only small influences of priming on object-scene consistency effects, but, all in all, the findings can be reconciled with a visual-search explanation of the consistency effect. PMID:24910628

  10. Exploiting range imagery: techniques and applications

    NASA Astrophysics Data System (ADS)

    Armbruster, Walter

    2009-07-01

    Practically no applications exist for which automatic processing of 2D intensity imagery can equal human visual perception. This is not the case for range imagery. The paper gives examples of 3D laser radar applications, for which automatic data processing can exceed human visual cognition capabilities and describes basic processing techniques for attaining these results. The examples are drawn from the fields of helicopter obstacle avoidance, object detection in surveillance applications, object recognition at high range, multi-object-tracking, and object re-identification in range image sequences. Processing times and recognition performances are summarized. The techniques used exploit the bijective continuity of the imaging process as well as its independence of object reflectivity, emissivity and illumination. This allows precise formulations of the probability distributions involved in figure-ground segmentation, feature-based object classification and model based object recognition. The probabilistic approach guarantees optimal solutions for single images and enables Bayesian learning in range image sequences. Finally, due to recent results in 3D-surface completion, no prior model libraries are required for recognizing and re-identifying objects of quite general object categories, opening the way to unsupervised learning and fully autonomous cognitive systems.

  11. Object-related activity revealed by functional magnetic resonance imaging in human occipital cortex.

    PubMed Central

    Malach, R; Reppas, J B; Benson, R R; Kwong, K K; Jiang, H; Kennedy, W A; Ledden, P J; Brady, T J; Rosen, B R; Tootell, R B

    1995-01-01

    The stages of integration leading from local feature analysis to object recognition were explored in human visual cortex by using the technique of functional magnetic resonance imaging. Here we report evidence for object-related activation. Such activation was located at the lateral-posterior aspect of the occipital lobe, just abutting the posterior aspect of the motion-sensitive area MT/V5, in a region termed the lateral occipital complex (LO). LO showed preferential activation to images of objects, compared to a wide range of texture patterns. This activation was not caused by a global difference in the Fourier spatial frequency content of objects versus texture images, since object images produced enhanced LO activation compared to textures matched in power spectra but randomized in phase. The preferential activation to objects also could not be explained by different patterns of eye movements: similar levels of activation were observed when subjects fixated on the objects and when they scanned the objects with their eyes. Additional manipulations such as spatial frequency filtering and a 4-fold change in visual size did not affect LO activation. These results suggest that the enhanced responses to objects were not a manifestation of low-level visual processing. A striking demonstration that activity in LO is uniquely correlated to object detectability was produced by the "Lincoln" illusion, in which blurring of objects digitized into large blocks paradoxically increases their recognizability. Such blurring led to significant enhancement of LO activation. Despite the preferential activation to objects, LO did not seem to be involved in the final, "semantic," stages of the recognition process. Thus, objects varying widely in their recognizability (e.g., famous faces, common objects, and unfamiliar three-dimensional abstract sculptures) activated it to a similar degree. 
These results are thus evidence for an intermediate link in the chain of processing stages leading to object recognition in human visual cortex. PMID:7667258

  12. Experience moderates overlap between object and face recognition, suggesting a common ability

    PubMed Central

    Gauthier, Isabel; McGugin, Rankin W.; Richler, Jennifer J.; Herzmann, Grit; Speegle, Magen; Van Gulick, Ana E.

    2014-01-01

    Some research finds that face recognition is largely independent from the recognition of other objects; a specialized and innate ability to recognize faces could therefore have little or nothing to do with our ability to recognize objects. We propose a new framework in which recognition performance for any category is the product of domain-general ability and category-specific experience. In Experiment 1, we show that the overlap between face and object recognition depends on experience with objects. In 256 subjects we measured face recognition, object recognition for eight categories, and self-reported experience with these categories. Experience predicted neither face recognition nor object recognition but moderated their relationship: Face recognition performance is increasingly similar to object recognition performance with increasing object experience. If a subject has a lot of experience with objects and is found to perform poorly, they also prove to have a low ability with faces. In a follow-up survey, we explored the dimensions of experience with objects that may have contributed to self-reported experience in Experiment 1. Different dimensions of experience appear to be more salient for different categories, with general self-reports of expertise reflecting judgments of verbal knowledge about a category more than judgments of visual performance. The complexity of experience and current limitations in its measurement support the importance of aggregating across multiple categories. Our findings imply that both face and object recognition are supported by a common, domain-general ability expressed through experience with a category and best measured when accounting for experience. PMID:24993021

  13. Experience moderates overlap between object and face recognition, suggesting a common ability.

    PubMed

    Gauthier, Isabel; McGugin, Rankin W; Richler, Jennifer J; Herzmann, Grit; Speegle, Magen; Van Gulick, Ana E

    2014-07-03

    Some research finds that face recognition is largely independent from the recognition of other objects; a specialized and innate ability to recognize faces could therefore have little or nothing to do with our ability to recognize objects. We propose a new framework in which recognition performance for any category is the product of domain-general ability and category-specific experience. In Experiment 1, we show that the overlap between face and object recognition depends on experience with objects. In 256 subjects we measured face recognition, object recognition for eight categories, and self-reported experience with these categories. Experience predicted neither face recognition nor object recognition but moderated their relationship: Face recognition performance is increasingly similar to object recognition performance with increasing object experience. If a subject has a lot of experience with objects and is found to perform poorly, they also prove to have a low ability with faces. In a follow-up survey, we explored the dimensions of experience with objects that may have contributed to self-reported experience in Experiment 1. Different dimensions of experience appear to be more salient for different categories, with general self-reports of expertise reflecting judgments of verbal knowledge about a category more than judgments of visual performance. The complexity of experience and current limitations in its measurement support the importance of aggregating across multiple categories. Our findings imply that both face and object recognition are supported by a common, domain-general ability expressed through experience with a category and best measured when accounting for experience. © 2014 ARVO.

  14. SEMI-SUPERVISED OBJECT RECOGNITION USING STRUCTURE KERNEL

    PubMed Central

    Wang, Botao; Xiong, Hongkai; Jiang, Xiaoqian; Ling, Fan

    2013-01-01

    Object recognition is a fundamental problem in computer vision. Part-based models offer a sparse, flexible representation of objects, but suffer from difficulties in training and often use standard kernels. In this paper, we propose a positive definite kernel called "structure kernel", which measures the similarity of two part-based represented objects. The structure kernel has three terms: 1) the global term that measures the global visual similarity of two objects; 2) the part term that measures the visual similarity of corresponding parts; 3) the spatial term that measures the spatial similarity of the geometric configuration of parts. The contribution of this paper is to generalize the discriminant capability of local kernels to complex part-based object models. Experimental results show that the proposed kernel exhibits higher accuracy than state-of-the-art approaches using standard kernels. PMID:23666108
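
    The three-term construction can be sketched as a weighted sum of base kernels; a sum of positive definite kernels is itself positive definite, which is what licenses the combination. Everything below (the RBF base kernel, the feature layout, equal weights, and the mean over corresponding parts) is an illustrative assumption rather than the paper's exact formulation.

```python
import numpy as np

def rbf(x, y, gamma=0.5):
    """Gaussian RBF base kernel between two feature vectors."""
    return float(np.exp(-gamma * np.sum((np.asarray(x) - np.asarray(y)) ** 2)))

def structure_kernel(a, b, w=(1.0, 1.0, 1.0)):
    """Similarity of two part-based objects: each is a dict with a 'global'
    descriptor, per-part 'parts' descriptors, and per-part 'pos' coordinates
    (same number of corresponding parts in both objects)."""
    k_global = rbf(a["global"], b["global"])                                # global term
    k_parts = np.mean([rbf(p, q) for p, q in zip(a["parts"], b["parts"])])  # part term
    k_spatial = np.mean([rbf(p, q) for p, q in zip(a["pos"], b["pos"])])    # spatial term
    # A non-negatively weighted sum of positive definite kernels is positive definite.
    return w[0] * k_global + w[1] * k_parts + w[2] * k_spatial

obj1 = {"global": [1.0, 0.0], "parts": [[0.2, 0.1], [0.9, 0.4]],
        "pos": [[0.0, 0.0], [1.0, 0.5]]}
obj2 = {"global": [0.9, 0.1], "parts": [[0.25, 0.1], [0.8, 0.5]],
        "pos": [[0.1, 0.0], [1.0, 0.6]]}
print(structure_kernel(obj1, obj1))  # self-similarity: 3.0 with unit weights
print(structure_kernel(obj1, obj2))  # strictly smaller for non-identical objects
```

    A kernel of this shape can be plugged into any kernel machine (e.g., an SVM) in place of a standard whole-image kernel, which is the generalization the abstract describes.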

  15. The Invariance Hypothesis Implies Domain-Specific Regions in Visual Cortex

    PubMed Central

    Leibo, Joel Z.; Liao, Qianli; Anselmi, Fabio; Poggio, Tomaso

    2015-01-01

    Is visual cortex made up of general-purpose information processing machinery, or does it consist of a collection of specialized modules? If prior knowledge, acquired from learning a set of objects, is only transferable to new objects that share properties with the old, then the recognition system’s optimal organization must be one containing specialized modules for different object classes. Our analysis starts from a premise we call the invariance hypothesis: that the computational goal of the ventral stream is to compute an invariant-to-transformations and discriminative signature for recognition. The key condition enabling approximate transfer of invariance without sacrificing discriminability turns out to be that the learned and novel objects transform similarly. This implies that the optimal recognition system must contain subsystems trained only with data from similarly-transforming objects and suggests a novel interpretation of domain-specific regions like the fusiform face area (FFA). Furthermore, we can define an index of transformation-compatibility, computable from videos, that can be combined with information about the statistics of natural vision to yield predictions for which object categories ought to have domain-specific regions in agreement with the available data. The result is a unifying account linking the large literature on view-based recognition with the wealth of experimental evidence concerning domain-specific regions. PMID:26496457

  16. An expanded framework for biomolecular visualization in the classroom: Learning goals and competencies.

    PubMed

    Dries, Daniel R; Dean, Diane M; Listenberger, Laura L; Novak, Walter R P; Franzen, Margaret A; Craig, Paul A

    2017-01-02

    A thorough understanding of the molecular biosciences requires the ability to visualize and manipulate molecules in order to interpret results or to generate hypotheses. While many instructors in biochemistry and molecular biology use visual representations, few indicate that they explicitly teach visual literacy. One reason is the need for a list of core content and competencies to guide a more deliberate instruction in visual literacy. We offer here the second stage in the development of one such resource for biomolecular three-dimensional visual literacy. We present this work with the goal of building a community for online resource development and use. In the first stage, overarching themes were identified and submitted to the biosciences community for comment: atomic geometry; alternate renderings; construction/annotation; het group recognition; molecular dynamics; molecular interactions; monomer recognition; symmetry/asymmetry recognition; structure-function relationships; structural model skepticism; and topology and connectivity. Herein, the overarching themes have been expanded to include a 12th theme (macromolecular assemblies), 27 learning goals, and more than 200 corresponding objectives, many of which cut across multiple overarching themes. The learning goals and objectives offered here provide educators with a framework on which to map the use of molecular visualization in their classrooms. In addition, the framework may also be used by biochemistry and molecular biology educators to identify gaps in coverage and drive the creation of new activities to improve visual literacy. This work represents the first attempt, to our knowledge, to catalog a comprehensive list of explicit learning goals and objectives in visual literacy. © 2016 by The International Union of Biochemistry and Molecular Biology, 45(1):69-75, 2017. © 2016 The Authors Biochemistry and Molecular Biology Education published by Wiley Periodicals, Inc. on behalf of the International Union of Biochemistry and Molecular Biology.

  17. An expanded framework for biomolecular visualization in the classroom: Learning goals and competencies

    PubMed Central

    Dries, Daniel R.; Dean, Diane M.; Listenberger, Laura L.; Novak, Walter R.P.

    2016-01-01

    Abstract A thorough understanding of the molecular biosciences requires the ability to visualize and manipulate molecules in order to interpret results or to generate hypotheses. While many instructors in biochemistry and molecular biology use visual representations, few indicate that they explicitly teach visual literacy. One reason is the need for a list of core content and competencies to guide a more deliberate instruction in visual literacy. We offer here the second stage in the development of one such resource for biomolecular three‐dimensional visual literacy. We present this work with the goal of building a community for online resource development and use. In the first stage, overarching themes were identified and submitted to the biosciences community for comment: atomic geometry; alternate renderings; construction/annotation; het group recognition; molecular dynamics; molecular interactions; monomer recognition; symmetry/asymmetry recognition; structure‐function relationships; structural model skepticism; and topology and connectivity. Herein, the overarching themes have been expanded to include a 12th theme (macromolecular assemblies), 27 learning goals, and more than 200 corresponding objectives, many of which cut across multiple overarching themes. The learning goals and objectives offered here provide educators with a framework on which to map the use of molecular visualization in their classrooms. In addition, the framework may also be used by biochemistry and molecular biology educators to identify gaps in coverage and drive the creation of new activities to improve visual literacy. This work represents the first attempt, to our knowledge, to catalog a comprehensive list of explicit learning goals and objectives in visual literacy. © 2016 by The International Union of Biochemistry and Molecular Biology, 45(1):69–75, 2017. PMID:27486685

  18. Bilateral Theta-Burst TMS to Influence Global Gestalt Perception

    PubMed Central

    Ritzinger, Bernd; Huberle, Elisabeth; Karnath, Hans-Otto

    2012-01-01

    While early and higher visual areas along the ventral visual pathway in the inferotemporal cortex are critical for the recognition of individual objects, the neural representation of human perception of complex global visual scenes remains under debate. Stroke patients with a selective deficit in the perception of a complex global Gestalt with intact recognition of individual objects – a deficit termed simultanagnosia – greatly helped to study this question. Interestingly, simultanagnosia typically results from bilateral lesions of the temporo-parietal junction (TPJ). The present study aimed to verify the relevance of this area for human global Gestalt perception. We applied continuous theta-burst TMS either unilaterally (left or right) or bilaterally and simultaneously over the TPJ. Healthy subjects were presented with hierarchically organized visual stimuli that allowed parametric degradation of the object at the global level. Identification of the global Gestalt was significantly modulated only for the bilateral TPJ stimulation condition. Our results strengthen the view that global Gestalt perception in the human brain involves TPJ and is co-dependent on both hemispheres. PMID:23110106

  19. Bilateral theta-burst TMS to influence global gestalt perception.

    PubMed

    Ritzinger, Bernd; Huberle, Elisabeth; Karnath, Hans-Otto

    2012-01-01

    While early and higher visual areas along the ventral visual pathway in the inferotemporal cortex are critical for the recognition of individual objects, the neural representation of human perception of complex global visual scenes remains under debate. Stroke patients with a selective deficit in the perception of a complex global Gestalt with intact recognition of individual objects - a deficit termed simultanagnosia - greatly helped to study this question. Interestingly, simultanagnosia typically results from bilateral lesions of the temporo-parietal junction (TPJ). The present study aimed to verify the relevance of this area for human global Gestalt perception. We applied continuous theta-burst TMS either unilaterally (left or right) or bilaterally and simultaneously over the TPJ. Healthy subjects were presented with hierarchically organized visual stimuli that allowed parametric degradation of the object at the global level. Identification of the global Gestalt was significantly modulated only for the bilateral TPJ stimulation condition. Our results strengthen the view that global Gestalt perception in the human brain involves TPJ and is co-dependent on both hemispheres.

  20. Target recognition and scene interpretation in image/video understanding systems based on network-symbolic models

    NASA Astrophysics Data System (ADS)

    Kuvich, Gary

    2004-08-01

    Vision is only a part of a system that converts visual information into knowledge structures. These structures drive the vision process, resolving ambiguity and uncertainty via feedback, and provide image understanding, which is an interpretation of visual information in terms of these knowledge models. These mechanisms provide reliable recognition even if the object is occluded or cannot be recognized as a whole. It is hard to split the entire system apart, and reliable solutions to target recognition problems are possible only within the solution of the more generic Image Understanding Problem. The brain reduces informational and computational complexity using implicit symbolic coding of features, hierarchical compression, and selective processing of visual information. A biologically inspired Network-Symbolic representation, in which both systematic structural/logical methods and neural/statistical methods are parts of a single mechanism, is the most feasible basis for such models. It converts visual information into relational Network-Symbolic structures, avoiding artificial precise computations of 3-dimensional models. Network-Symbolic Transformations derive abstract structures, which allows for invariant recognition of an object as an exemplar of a class. Active vision helps create consistent models. Attention, separation of figure from ground, and perceptual grouping are special kinds of network-symbolic transformations. Such Image/Video Understanding Systems will recognize targets reliably.

  1. Image jitter enhances visual performance when spatial resolution is impaired.

    PubMed

    Watson, Lynne M; Strang, Niall C; Scobie, Fraser; Love, Gordon D; Seidel, Dirk; Manahilov, Velitchko

    2012-09-06

    Visibility of low-spatial-frequency stimuli improves when their contrast is modulated at 5 to 10 Hz, compared with stationary stimuli. Temporal modulation of visual objects could therefore enhance the performance of low vision patients, who primarily perceive images of low-spatial-frequency content. We investigated the effect of retinal-image jitter on word recognition speed and facial emotion recognition in subjects with central visual impairment. Word recognition speed and accuracy of facial emotion discrimination were measured in volunteers with age-related macular degeneration (AMD) under stationary and jittering conditions. Computer-driven and optoelectronic approaches were used to induce retinal-image jitter with durations of 100 or 166 ms and amplitudes within the range of 0.5 to 2.6° visual angle. Word recognition speed was also measured for participants with simulated (Bangerter filters) visual impairment. Text jittering markedly enhanced word recognition speed for people with severe visual loss (101 ± 25%), while for those with moderate visual impairment this effect was weaker (19 ± 9%). The ability of low vision patients to discriminate the facial emotions of jittering images improved by a factor of 2. A prototype of optoelectronic jitter goggles produced a similar improvement in facial emotion discrimination. Word recognition speed in participants with simulated visual impairment was enhanced for interjitter intervals over 100 ms and reduced for shorter intervals. The results suggest that retinal-image jitter with optimal frequency and amplitude is an effective strategy for enhancing visual information processing in the absence of spatial detail. These findings will enable the development of novel tools to improve the quality of life of low vision patients.

  2. The effect of colour congruency on shape discriminations of novel objects.

    PubMed

    Nicholson, Karen G; Humphrey, G Keith

    2004-01-01

    Although visual object recognition is primarily shape driven, colour assists the recognition of some objects. It is unclear, however, just how colour information is coded with respect to shape in long-term memory and how the availability of colour in the visual image facilitates object recognition. We examined the role of colour in the recognition of novel, 3-D objects by manipulating the congruency of object colour across the study and test phases, using an old/new shape-identification task. In experiment 1, we found that participants were faster at correctly identifying old objects on the basis of shape information when these objects were presented in their original colour, rather than in a different colour. In experiments 2 and 3, we found that participants were faster at correctly identifying old objects on the basis of shape information when these objects were presented with their original part-colour conjunctions, rather than in different or in reversed part-colour conjunctions. In experiment 4, we found that participants were quite poor at the verbal recall of part-colour conjunctions for correctly identified old objects, presented as grey-scale images at test. In experiment 5, we found that participants were significantly slower at correctly identifying old objects when object colour was incongruent across study and test, than when background colour was incongruent across study and test. The results of these experiments suggest that both shape and colour information are stored as part of the long-term representation of these novel objects. Results are discussed in terms of how colour might be coded with respect to shape in stored object representations.

  3. Implicit and Explicit Contributions to Object Recognition: Evidence from Rapid Perceptual Learning

    PubMed Central

    Hassler, Uwe; Friese, Uwe; Gruber, Thomas

    2012-01-01

    The present study investigated implicit and explicit recognition processes for rapidly perceptually learned objects by means of steady-state visual evoked potentials (SSVEPs). Participants were initially exposed to object pictures within an incidental learning task (living/non-living categorization). Subsequently, degraded versions of some of these learned pictures were presented together with degraded versions of unlearned pictures, and participants had to judge whether they recognized an object or not. During this test phase, stimuli were presented at 15 Hz, eliciting an SSVEP at the same frequency. Source localizations of SSVEP effects revealed overlapping activations for implicit and explicit processes in orbito-frontal and temporal regions. Correlates of explicit object recognition were additionally found in the superior parietal lobe. These findings are discussed as reflecting facilitation of object-specific processing areas within the temporal lobe by an orbito-frontal top-down signal, as proposed by bi-directional accounts of object recognition. PMID:23056558

  4. Acquired prosopagnosia without word recognition deficits.

    PubMed

    Susilo, Tirta; Wright, Victoria; Tree, Jeremy J; Duchaine, Bradley

    2015-01-01

    It has long been suggested that face recognition relies on specialized mechanisms that are not involved in visual recognition of other object categories, including those that require expert, fine-grained discrimination at the exemplar level such as written words. But according to the recently proposed many-to-many theory of object recognition (MTMT), visual recognition of faces and words is carried out by common mechanisms [Behrmann, M., & Plaut, D. C. (2013). Distributed circuits, not circumscribed centers, mediate visual recognition. Trends in Cognitive Sciences, 17, 210-219]. MTMT acknowledges that face and word recognition are lateralized, but posits that the mechanisms that predominantly carry out face recognition still contribute to word recognition and vice versa. MTMT makes a key prediction, namely that acquired prosopagnosics should exhibit some measure of word recognition deficits. We tested this prediction by assessing written word recognition in five acquired prosopagnosic patients. Four patients had lesions limited to the right hemisphere while one had bilateral lesions with more pronounced lesions in the right hemisphere. The patients completed a total of seven word recognition tasks: two lexical decision tasks and five reading aloud tasks totalling more than 1200 trials. The performances of the four older patients (3 female, age range 50-64 years) were compared to those of 12 older controls (8 female, age range 56-66 years), while the performances of the younger prosopagnosic (male, 31 years) were compared to those of 14 younger controls (9 female, age range 20-33 years). We analysed all results at the single-patient level using Crawford's t-test. Across seven tasks, four prosopagnosics performed as quickly and accurately as controls. Our results demonstrate that acquired prosopagnosia can exist without word recognition deficits. These findings are inconsistent with a key prediction of MTMT. They instead support the hypothesis that face recognition is carried out by specialized mechanisms that do not contribute to recognition of written words.

  5. Age-related impairments in active learning and strategic visual exploration.

    PubMed

    Brandstatt, Kelly L; Voss, Joel L

    2014-01-01

    Old age could impair memory by disrupting learning strategies used by younger individuals. We tested this possibility by manipulating the ability to use visual-exploration strategies during learning. Subjects controlled visual exploration during active learning, thus permitting the use of strategies, whereas strategies were limited during passive learning via predetermined exploration patterns. Performance on tests of object recognition and object-location recall was matched for younger and older subjects for objects studied passively, when learning strategies were restricted. Active learning improved object recognition similarly for younger and older subjects. However, active learning improved object-location recall for younger subjects, but not older subjects. Exploration patterns were used to identify a learning strategy involving repeat viewing. Older subjects used this strategy less frequently and it provided less memory benefit compared to younger subjects. In previous experiments, we linked hippocampal-prefrontal co-activation to improvements in object-location recall from active learning and to the exploration strategy. Collectively, these findings suggest that age-related memory problems result partly from impaired strategies during learning, potentially due to reduced hippocampal-prefrontal co-engagement.

  6. Unsupervised and self-mapping category formation and semantic object recognition for mobile robot vision used in an actual environment

    NASA Astrophysics Data System (ADS)

    Madokoro, H.; Tsukada, M.; Sato, K.

    2013-07-01

    This paper presents an unsupervised learning-based object category formation and recognition method for mobile robot vision. Our method has the following features: detection of feature points and description of features using a scale-invariant feature transform (SIFT), selection of target feature points using one-class support vector machines (OC-SVMs), generation of visual words using self-organizing maps (SOMs), formation of labels using adaptive resonance theory 2 (ART-2), and creation and classification of categories on a category map of counter propagation networks (CPNs) for visualizing spatial relations between categories. Classification results for dynamic time-series images, obtained using two robots of different sizes and under different movements, demonstrate that our method can visualize spatial relations between categories while maintaining time-series characteristics. Moreover, we emphasize the effectiveness of our method for category formation under appearance changes of objects.
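Of the stages listed above, the visual-word step is the easiest to illustrate in isolation. The sketch below is a minimal 1-D self-organizing map over descriptor vectors; the unit count, learning rate, and neighborhood width are made-up hyperparameters, and the paper's surrounding pipeline (SIFT, OC-SVMs, ART-2, CPNs) is not reproduced here.

```python
import numpy as np

def train_som(descriptors, n_units=8, epochs=20, lr=0.5, sigma=2.0, seed=0):
    """Minimal 1-D self-organizing map: clusters local descriptors into
    'visual words'. Units live on a 1-D grid; neighbors of the winning
    unit are also pulled toward each sample."""
    rng = np.random.default_rng(seed)
    X = np.asarray(descriptors, dtype=float)
    # initialize units from randomly chosen descriptors
    units = X[rng.choice(len(X), size=n_units, replace=False)].copy()
    for t in range(epochs):
        decay = np.exp(-t / epochs)  # shrink learning rate and neighborhood
        for x in X[rng.permutation(len(X))]:
            bmu = np.argmin(np.sum((units - x) ** 2, axis=1))  # best-matching unit
            grid_dist = np.abs(np.arange(n_units) - bmu)       # distance on the grid
            h = np.exp(-(grid_dist ** 2) / (2 * (sigma * decay) ** 2))
            units += (lr * decay) * h[:, None] * (x - units)
    return units

def quantize(descriptors, units):
    """Assign each descriptor to its nearest visual word (unit index)."""
    X = np.asarray(descriptors, dtype=float)
    return np.argmin(((X[:, None, :] - units[None]) ** 2).sum(-1), axis=1)
```

In a full bag-of-visual-words setup, the histogram of `quantize` outputs over an image would then serve as its category-formation feature.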

  7. A new method for text detection and recognition in indoor scene for assisting blind people

    NASA Astrophysics Data System (ADS)

    Jabnoun, Hanen; Benzarti, Faouzi; Amiri, Hamid

    2017-03-01

    Developing assistive systems for handicapped persons has become a challenging task in research projects. Recently, a variety of tools have been designed to help visually impaired or blind people as visual substitution systems. The majority of these tools are based on the conversion of input information into auditory or tactile sensory information. Furthermore, object recognition and text retrieval are exploited in visual substitution systems. Text detection and recognition provide a description of the surrounding environment, so that a blind person can readily recognize the scene. In this work, we introduce a method for detecting and recognizing text in indoor scenes. The process consists of detecting the regions of interest that should contain text using connected components. Text detection is then performed using image correlation. This component of an assistive system for blind people should be simple, so that users are able to obtain the most informative feedback within the shortest time.
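The correlation step mentioned above can be illustrated with a small normalized cross-correlation matcher. This is a simplified sketch, not the authors' implementation: it assumes candidate regions have already been cropped and rescaled to the template size, and the template set is hypothetical.

```python
import numpy as np

def ncc(patch, template):
    """Normalized cross-correlation between a candidate text region and a
    stored character template (2-D arrays of the same shape). Returns a
    value in [-1, 1]; 1 means a perfect match up to brightness/contrast."""
    p = np.asarray(patch, float) - np.mean(patch)
    t = np.asarray(template, float) - np.mean(template)
    denom = np.sqrt((p ** 2).sum() * (t ** 2).sum())
    return float((p * t).sum() / denom) if denom else 0.0

def match_character(region, templates):
    """Return the label of the template most correlated with the region."""
    return max(templates, key=lambda label: ncc(region, templates[label]))
```

For full images rather than pre-cropped regions, a sliding-window version of the same score (e.g. scikit-image's `match_template`) would be used over each connected component.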

  8. Face recognition increases during saccade preparation.

    PubMed

    Lin, Hai; Rizak, Joshua D; Ma, Yuan-ye; Yang, Shang-chuan; Chen, Lin; Hu, Xin-tian

    2014-01-01

    Face perception is integral to the human perceptual system, as it underlies social interactions. Saccadic eye movements are frequently made to bring interesting visual information, such as faces, onto the fovea for detailed processing. Just before eye movement onset, the processing of some basic features of an object, such as its orientation, improves at the saccade landing point. Interestingly, there is also evidence indicating that faces are processed in early visual processing stages similarly to basic features. However, it is not known whether this early enhancement of processing extends to face recognition. In this study, three experiments were performed that mapped the timing of face presentation to the beginning of the eye movement in order to evaluate pre-saccadic face recognition. Faces were found to be processed similarly to simple objects immediately prior to saccadic movements. Starting ∼120 ms before a saccade to a target face, independent of whether or not the face was surrounded by other faces, face recognition gradually improved and the critical spacing of the crowding decreased as saccade onset approached. These results suggest that an upcoming saccade prepares the visual system for new information about faces at the saccade landing site and may reduce the background in a crowd to target the intended face. This indicates an important role of pre-saccadic eye movement signals in human face recognition.

  9. Presentations of Shape in Object Recognition and Long-Term Visual Memory

    DTIC Science & Technology

    1994-04-05

    theory of human image understanding. Psychological Review, 94, 115-147. Biederman, I., & Gerhardstein, P. C. (1993). Recognizing depth-rotated... Kybernetik. Submitted to Journal of Experimental Psychology: Human Perception and Performance. REFERENCES Biederman, I. (1987). Recognition-by-components: A

  10. Picture object recognition in an American black bear (Ursus americanus).

    PubMed

    Johnson-Ulrich, Zoe; Vonk, Jennifer; Humbyrd, Mary; Crowley, Marilyn; Wojtkowski, Ela; Yates, Florence; Allard, Stephanie

    2016-11-01

    Many animals have been tested for conceptual discriminations using two-dimensional images as stimuli, and many of these species appear to transfer knowledge from 2D images to analogous real-life objects. We tested an American black bear for picture-object recognition using a two-alternative forced-choice task. She was presented with four unique sets of objects and corresponding pictures. The bear showed generalization both from objects to pictures and from pictures to objects; however, her transfer was superior from real objects to pictures, suggesting that bears can recognize visual features of real objects within photographic images during discriminations.

  11. Simulated Prosthetic Vision: The Benefits of Computer-Based Object Recognition and Localization.

    PubMed

    Macé, Marc J-M; Guivarch, Valérian; Denis, Grégoire; Jouffrais, Christophe

    2015-07-01

    Clinical trials with blind patients implanted with a visual neuroprosthesis showed that even the simplest tasks were difficult to perform with the limited vision restored by current implants. Simulated prosthetic vision (SPV) is a powerful tool to investigate the putative functions of upcoming generations of visual neuroprostheses. Recent studies based on SPV showed that several generations of implants will be required before usable vision is restored. However, none of these studies relied on advanced image processing. High-level image processing could significantly reduce the amount of information required to perform visual tasks and help restore visuomotor behaviors, even with current low-resolution implants. In this study, we simulated a prosthetic vision device based on object localization in the scene. We evaluated the usability of this device for object recognition, localization, and reaching. We showed that a very low number of electrodes (e.g., nine) is sufficient to restore visually guided reaching movements with fair timing (10 s) and high accuracy. In addition, performance, in terms of both accuracy and speed, was comparable with 9 and 100 electrodes. Extraction of high-level information (object recognition and localization) from video images could drastically enhance the usability of current visual neuroprostheses. We suggest that this method, that is, localization of targets of interest in the scene, may restore various visuomotor behaviors. This method could prove functional on current low-resolution implants. The main limitation resides in the reliability of the vision algorithms, which are improving rapidly. Copyright © 2015 International Center for Artificial Organs and Transplantation and Wiley Periodicals, Inc.

  12. A Neurophysiologically Plausible Population Code Model for Feature Integration Explains Visual Crowding

    PubMed Central

    van den Berg, Ronald; Roerdink, Jos B. T. M.; Cornelissen, Frans W.

    2010-01-01

    An object in the peripheral visual field is more difficult to recognize when surrounded by other objects. This phenomenon is called “crowding”. Crowding places a fundamental constraint on human vision that limits performance on numerous tasks. It has been suggested that crowding results from spatial feature integration necessary for object recognition. However, in the absence of convincing models, this theory has remained controversial. Here, we present a quantitative and physiologically plausible model for spatial integration of orientation signals, based on the principles of population coding. Using simulations, we demonstrate that this model coherently accounts for fundamental properties of crowding, including critical spacing, “compulsory averaging”, and a foveal-peripheral anisotropy. Moreover, we show that the model predicts increased responses to correlated visual stimuli. Altogether, these results suggest that crowding has little immediate bearing on object recognition but is a by-product of a general, elementary integration mechanism in early vision aimed at improving signal quality. PMID:20098499
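The paper's model is more elaborate, but its core idea, orientation signals pooled over a region and read out as a population average, can be sketched as follows. The 64-unit grid, von Mises tuning width, and population-vector decoder below are illustrative assumptions, not the authors' parameters.

```python
import numpy as np

def population_response(orientations, pref=None, kappa=2.0):
    """Pooled response of orientation-tuned units to one or more oriented
    stimuli. Tuning curves are von Mises bumps over the 180-degree
    orientation domain (hence the doubled angle); responses to multiple
    stimuli are summed, mimicking spatial pooling over a crowded region."""
    if pref is None:
        pref = np.linspace(0, np.pi, 64, endpoint=False)  # preferred orientations
    r = np.zeros_like(pref)
    for theta in np.deg2rad(orientations):
        r += np.exp(kappa * np.cos(2 * (pref - theta)))
    return pref, r

def decode(pref, r):
    """Population-vector readout: decoded orientation in degrees."""
    angle = np.angle(np.sum(r * np.exp(2j * pref)))  # circular mean of doubled angles
    return np.rad2deg(angle / 2) % 180
```

Decoding the summed response to a 20° and a 40° stimulus yields roughly 30°, the "compulsory averaging" signature of crowding that the abstract describes.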

  13. [Recognition of visual objects under forward masking: effects of categorical similarity of test and masking stimuli].

    PubMed

    Gerasimenko, N Iu; Slavutskaia, A V; Kalinin, S A; Kulikov, M A; Mikhaĭlova, E S

    2013-01-01

    In 38 healthy subjects, accuracy and response time were examined during recognition of two categories of images, animals and nonliving objects, under forward masking. We obtained new data showing that masking effects depended on the categorical similarity of target and masking stimuli. Recognition accuracy was lowest and response time slowest when the target and masking stimuli belonged to the same category, which was accompanied by high dispersion of response times. These effects were clearer in the animal recognition task than in the recognition of nonliving objects. We suppose that the effects reflect interference between cortical representations of the target and masking stimuli, and we discuss our results in the context of cortical interference and negative priming.

  14. When a Picasso is a "Picasso": the entry point in the identification of visual art.

    PubMed

    Belke, B; Leder, H; Harsanyi, G; Carbon, C C

    2010-02-01

    We investigated whether art is distinguished from other real-world objects in human cognition, in that art allows for a special memorial representation and identification based on artists' specific stylistic appearances. Testing art-experienced viewers, converging empirical evidence from three experiments, each sensitive to the question of initial object recognition, suggests that identification of visual art occurs at the subordinate level of the producing artist. Specifically, in a free naming task we found that art objects, as opposed to non-art objects, were most frequently named with subordinate-level categories, with the artist's name as the most frequent category (Experiment 1). In a category-verification task (Experiment 2), art objects were recognized faster than non-art objects at the subordinate level with the artist's name. In a conceptual priming task, subordinate primes of artists' names facilitated matching responses to art objects but not to non-art objects (Experiment 3). Collectively, these results suggest that the artist's name has a special status in the memorial representation of visual art and serves as a predominant entry point for recognition in art perception. Copyright 2009 Elsevier B.V. All rights reserved.

  15. Object recognition contributions to figure-ground organization: operations on outlines and subjective contours.

    PubMed

    Peterson, M A; Gibson, B S

    1994-11-01

    In previous research, replicated here, we found that some object recognition processes influence figure-ground organization. We have proposed that these object recognition processes operate on edges (or contours) detected early in visual processing, rather than on regions. Consistent with this proposal, influences from object recognition on figure-ground organization were previously observed in both pictures and stereograms depicting regions of different luminance, but not in random-dot stereograms, where edges arise late in processing (Peterson & Gibson, 1993). In the present experiments, we examined whether or not two other types of contours--outlines and subjective contours--enable object recognition influences on figure-ground organization. For both types of contours we observed a pattern of effects similar to that originally obtained with luminance edges. The results of these experiments are valuable for distinguishing between alternative views of the mechanisms mediating object recognition influences on figure-ground organization. In addition, in both Experiments 1 and 2, fixated regions were seen as figure longer than nonfixated regions, suggesting that fixation location must be included among the variables relevant to figure-ground organization.

  16. Dopamine D1 receptor stimulation modulates the formation and retrieval of novel object recognition memory: Role of the prelimbic cortex

    PubMed Central

    Pezze, Marie A.; Marshall, Hayley J.; Fone, Kevin C.F.; Cassaday, Helen J.

    2015-01-01

    Previous studies have shown that dopamine D1 receptor antagonists impair novel object recognition memory but the effects of dopamine D1 receptor stimulation remain to be determined. This study investigated the effects of the selective dopamine D1 receptor agonist SKF81297 on acquisition and retrieval in the novel object recognition task in male Wistar rats. SKF81297 (0.4 and 0.8 mg/kg s.c.) given 15 min before the sampling phase impaired novel object recognition evaluated 10 min or 24 h later. The same treatments also reduced novel object recognition memory tested 24 h after the sampling phase and when given 15 min before the choice session. These data indicate that D1 receptor stimulation modulates both the encoding and retrieval of object recognition memory. Microinfusion of SKF81297 (0.025 or 0.05 μg/side) into the prelimbic sub-region of the medial prefrontal cortex (mPFC) in this case 10 min before the sampling phase also impaired novel object recognition memory, suggesting that the mPFC is one important site mediating the effects of D1 receptor stimulation on visual recognition memory. PMID:26277743

  17. Developmental Commonalities between Object and Face Recognition in Adolescence

    PubMed Central

    Jüttner, Martin; Wakui, Elley; Petters, Dean; Davidoff, Jules

    2016-01-01

    In the visual perception literature, the recognition of faces has often been contrasted with that of non-face objects, in terms of differences with regard to the role of parts, part relations and holistic processing. However, recent evidence from developmental studies has begun to blur this sharp distinction. We review evidence for a protracted development of object recognition that is reminiscent of the well-documented slow maturation observed for faces. The prolonged development manifests itself in a retarded processing of metric part relations as opposed to that of individual parts and offers surprising parallels to developmental accounts of face recognition, even though the interpretation of the data is less clear with regard to holistic processing. We conclude that such results might indicate functional commonalities between the mechanisms underlying the recognition of faces and non-face objects, which are modulated by different task requirements in the two stimulus domains. PMID:27014176

  18. Perceptual expertise and top-down expectation of musical notation engages the primary visual cortex.

    PubMed

    Wong, Yetta Kwailing; Peng, Cynthia; Fratus, Kristyn N; Woodman, Geoffrey F; Gauthier, Isabel

    2014-08-01

    Most theories of visual processing propose that object recognition is achieved in higher visual cortex. However, we show that category selectivity for musical notation can be observed in the first ERP component called the C1 (measured 40-60 msec after stimulus onset) with music-reading expertise. Moreover, the C1 note selectivity was observed only when the stimulus category was blocked but not when the stimulus category was randomized. Under blocking, the C1 activity for notes predicted individual music-reading ability, and behavioral judgments of musical stimuli reflected music-reading skill. Our results challenge current theories of object recognition, indicating that the primary visual cortex can be selective for musical notation within the initial feedforward sweep of activity with perceptual expertise and with a testing context that is consistent with the expertise training, such as blocking the stimulus category for music reading.

  19. Neural Correlates of Individual Differences in Infant Visual Attention and Recognition Memory

    PubMed Central

    Reynolds, Greg D.; Guy, Maggie W.; Zhang, Dantong

    2010-01-01

    Past studies have identified individual differences in infant visual attention based upon peak look duration during initial exposure to a stimulus. Colombo and colleagues (e.g., Colombo & Mitchell, 1990) found that infants that demonstrate brief visual fixations (i.e., short lookers) during familiarization are more likely to demonstrate evidence of recognition memory during subsequent stimulus exposure than infants that demonstrate long visual fixations (i.e., long lookers). The current study utilized event-related potentials to examine possible neural mechanisms associated with individual differences in visual attention and recognition memory for 6- and 7.5-month-old infants. Short- and long-looking infants viewed images of familiar and novel objects during ERP testing. There was a stimulus type by looker type interaction at temporal and frontal electrodes on the late slow wave (LSW). Short lookers demonstrated a LSW that was significantly greater in amplitude in response to novel stimulus presentations. No significant differences in LSW amplitude were found based on stimulus type for long lookers. These results indicate deeper processing and recognition memory of the familiar stimulus for short lookers. PMID:21666833

  20. Experience improves feature extraction in Drosophila.

    PubMed

    Peng, Yueqing; Xi, Wang; Zhang, Wei; Zhang, Ke; Guo, Aike

    2007-05-09

    Previous exposure to a pattern in the visual scene can enhance subsequent recognition of that pattern in many species from honeybees to humans. However, whether previous experience with a visual feature of an object, such as color or shape, can also facilitate later recognition of that particular feature from multiple visual features is largely unknown. Visual feature extraction is the ability to select the key component from multiple visual features. Using a visual flight simulator, we designed a novel protocol for visual feature extraction to investigate the effects of previous experience on visual reinforcement learning in Drosophila. We found that, after conditioning with a visual feature of objects among combinatorial shape-color features, wild-type flies exhibited poor ability to extract the correct visual feature. However, the ability for visual feature extraction was greatly enhanced in flies trained previously with that visual feature alone. Moreover, we demonstrated that flies might possess the ability to extract the abstract category of "shape" but not a particular shape. Finally, this experience-dependent feature extraction is absent in flies with defective mushroom bodies (MBs), central brain structures in Drosophila. Our results indicate that previous experience can enhance visual feature extraction in Drosophila and that MBs are required for this experience-dependent visual cognition.

  1. Introducing memory and association mechanism into a biologically inspired visual model.

    PubMed

    Qiao, Hong; Li, Yinlin; Tang, Tang; Wang, Peng

    2014-09-01

    A well-known biologically inspired hierarchical model (the HMAX model), which corresponds to V1 to V4 of the ventral pathway in primate visual cortex, has been successfully applied to multiple visual recognition tasks. The model achieves position- and scale-tolerant recognition, which addresses a central problem in pattern recognition. In this paper, based on further biological experimental evidence, we introduce a memory and association mechanism into the HMAX model. The main contributions of this work are: 1) mimicking the active memory and association mechanism and adding top-down adjustment to the HMAX model, the first attempt to add such active adjustment to this well-known model; and 2) from the perspective of information processing, algorithms based on the new model reduce computation and storage while achieving good recognition performance. The new model is also applied to object recognition processes. Preliminary experimental results show that our method is efficient with a much lower memory requirement.
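    The position- and scale-tolerant recognition attributed to HMAX comes from alternating template-matching (S) and max-pooling (C) layers. Below is a minimal sketch of that S1/C1 recipe only; the filter bank and pool size are toy placeholders, and the memory/association extension described in the record is not sketched:

    ```python
    import numpy as np

    def s1_responses(image, filters):
        """S1: correlate the image with each oriented filter (valid mode)."""
        h, w = image.shape
        out = []
        for f in filters:
            fh, fw = f.shape
            resp = np.zeros((h - fh + 1, w - fw + 1))
            for i in range(resp.shape[0]):
                for j in range(resp.shape[1]):
                    resp[i, j] = np.sum(image[i:i + fh, j:j + fw] * f)
            out.append(resp)
        return out

    def c1_pool(resp, pool=4):
        """C1: max-pool over local spatial neighborhoods, which is what
        gives the model its tolerance to small position shifts."""
        h, w = resp.shape
        return np.array([[resp[i:i + pool, j:j + pool].max()
                          for j in range(0, w - pool + 1, pool)]
                         for i in range(0, h - pool + 1, pool)])
    ```

    A stimulus shifted by a few pixels lands in the same C1 pooling window and so produces a nearly identical C1 response, which is the tolerance property the abstract refers to.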

  2. Hybrid simulated annealing and its application to optimization of hidden Markov models for visual speech recognition.

    PubMed

    Lee, Jong-Seok; Park, Cheol Hoon

    2010-08-01

    We propose a novel stochastic optimization algorithm, hybrid simulated annealing (SA), to train hidden Markov models (HMMs) for visual speech recognition. In our algorithm, SA is combined with a local optimization operator that substitutes a better solution for the current one to improve the convergence speed and the quality of solutions. We mathematically prove that the sequence of the objective values converges in probability to the global optimum in the algorithm. The algorithm is applied to train HMMs that are used as visual speech recognizers. While the popular training method of HMMs, the expectation-maximization algorithm, achieves only local optima in the parameter space, the proposed method can perform global optimization of the parameters of HMMs and thereby obtain solutions yielding improved recognition performance. The superiority of the proposed algorithm to the conventional ones is demonstrated via isolated word recognition experiments.
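    The hybrid scheme described here, simulated annealing with a local optimization operator applied to each candidate before the Metropolis acceptance test, can be sketched generically. The toy quadratic objective stands in for the HMM likelihood and a single gradient step stands in for the paper's local operator; both are illustrative assumptions, not the authors' implementation:

    ```python
    import math
    import random

    def hybrid_sa(objective, x0, neighbor, local_opt,
                  t0=1.0, cooling=0.95, steps=300, seed=0):
        """Simulated annealing (minimization) in which every proposed
        neighbor is first improved by a local optimizer, then accepted
        with the usual Metropolis criterion."""
        rng = random.Random(seed)
        x, fx = x0, objective(x0)
        best, fbest = x, fx
        t = t0
        for _ in range(steps):
            cand = local_opt(neighbor(x, rng))   # the "hybrid" step
            fc = objective(cand)
            if fc < fx or rng.random() < math.exp((fx - fc) / t):
                x, fx = cand, fc
                if fx < fbest:
                    best, fbest = x, fx
            t *= cooling                          # geometric cooling schedule
        return best, fbest

    # Toy stand-in for the HMM training objective: minimize (x - 3)^2.
    f = lambda x: (x - 3.0) ** 2
    nb = lambda x, rng: x + rng.uniform(-1.0, 1.0)
    loc = lambda x: x - 0.2 * (x - 3.0)           # one gradient step as local optimizer
    ```

    The local step speeds convergence, while the temperature-controlled acceptance of worse candidates preserves SA's ability to escape local optima, which is the combination the abstract credits for the global-optimization guarantee.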

  3. Neural correlates of auditory recognition memory in the primate dorsal temporal pole

    PubMed Central

    Ng, Chi-Wing; Plakke, Bethany

    2013-01-01

    Temporal pole (TP) cortex is associated with higher-order sensory perception and/or recognition memory, as human patients with damage in this region show impaired performance during some tasks requiring recognition memory (Olson et al. 2007). The underlying mechanisms of TP processing are largely based on examination of the visual nervous system in humans and monkeys, while little is known about neuronal activity patterns in the auditory portion of this region, dorsal TP (dTP; Poremba et al. 2003). The present study examines single-unit activity of dTP in rhesus monkeys performing a delayed matching-to-sample task utilizing auditory stimuli, wherein two sounds are determined to be the same or different. Neurons of dTP encode several task-relevant events during the delayed matching-to-sample task, and encoding of auditory cues in this region is associated with accurate recognition performance. Population activity in dTP shows a match suppression mechanism to identical, repeated sound stimuli similar to that observed in the visual object identification pathway located ventral to dTP (Desimone 1996; Nakamura and Kubota 1996). However, in contrast to sustained visual delay-related activity in nearby analogous regions, auditory delay-related activity in dTP is transient and limited. Neurons in dTP respond selectively to different sound stimuli and often change their sound response preferences between experimental contexts. Current findings suggest a significant role for dTP in auditory recognition memory similar in many respects to the visual nervous system, while delay memory firing patterns are not prominent, which may relate to monkeys' shorter forgetting thresholds for auditory vs. visual objects. PMID:24198324

  4. Lateralized effects of categorical and coordinate spatial processing of component parts on the recognition of 3D non-nameable objects.

    PubMed

    Saneyoshi, Ayako; Michimata, Chikashi

    2009-12-01

    Participants performed two object-matching tasks for novel, non-nameable objects consisting of geons. For each original stimulus, two transformations were applied to create comparison stimuli. In the categorical transformation, a geon connected to geon A was moved to geon B. In the coordinate transformation, a geon connected to geon A was moved to a different position on geon A. The Categorical task consisted of the original and the categorically transformed objects. The Coordinate task consisted of the original and the coordinately transformed objects. The original object was presented to the central visual field, followed by a comparison object presented to the right or left visual half-fields (RVF and LVF). The results showed an RVF advantage for the Categorical task and an LVF advantage for the Coordinate task. The possibility that categorical and coordinate spatial processing subsystems would be basic computational elements for between- and within-category object recognition was discussed.

  5. The Role of Anterior Nuclei of the Thalamus: A Subcortical Gate in Memory Processing: An Intracerebral Recording Study

    PubMed Central

    Štillová, Klára; Jurák, Pavel; Chládek, Jan; Chrastina, Jan; Halámek, Josef; Bočková, Martina; Goldemundová, Sabina; Říha, Ivo; Rektor, Ivan

    2015-01-01

    Objective: To study the involvement of the anterior nuclei of the thalamus (ANT), as compared to that of the hippocampus, in the processes of encoding and recognition during visual and verbal memory tasks. Methods: We studied intracerebral recordings in patients with pharmacoresistant epilepsy who underwent deep brain stimulation (DBS) of the ANT with depth electrodes implanted bilaterally in the ANT, and compared the results with epilepsy surgery candidates with depth electrodes implanted bilaterally in the hippocampus. We recorded the event-related potentials (ERPs) elicited by visual and verbal memory encoding and recognition tasks. Results: P300-like potentials were recorded in the hippocampus during the visual and verbal memory encoding and recognition tasks, and in the ANT during the visual encoding and the visual and verbal recognition tasks. No significant ERPs were recorded during the verbal encoding task in the ANT. In the visual and verbal recognition tasks, the P300-like potentials in the ANT preceded those in the hippocampus. Conclusions: The ANT is a structure in the memory pathway that processes memory information before the hippocampus. We suggest that the ANT has a specific role in memory processes, especially memory recognition, and that memory disturbance should be considered in patients with ANT-DBS and in patients with ANT lesions. The ANT is well positioned to serve as a subcortical gate for memory processing in cortical structures. PMID:26529407

  6. Dynamic information processing states revealed through neurocognitive models of object semantics

    PubMed Central

    Clarke, Alex

    2015-01-01

    Recognising objects relies on highly dynamic, interactive brain networks to process multiple aspects of object information. To fully understand how different forms of information about objects are represented and processed in the brain requires a neurocognitive account of visual object recognition that combines a detailed cognitive model of semantic knowledge with a neurobiological model of visual object processing. Here we ask how specific cognitive factors are instantiated in our mental processes and how they dynamically evolve over time. We suggest that coarse semantic information, based on generic shared semantic knowledge, is rapidly extracted from visual inputs and is sufficient to drive rapid category decisions. Subsequent recurrent neural activity between the anterior temporal lobe and posterior fusiform supports the formation of object-specific semantic representations, a conjunctive process primarily driven by the perirhinal cortex. These object-specific representations require the integration of shared and distinguishing object properties and support the unique recognition of objects. We conclude that a valuable way of understanding the cognitive activity of the brain is through testing the relationship between specific cognitive measures and dynamic neural activity. This kind of approach allows us to move towards uncovering the information processing states of the brain and how they evolve over time. PMID:25745632

  7. The effect of visual and interaction fidelity on spatial cognition in immersive virtual environments.

    PubMed

    Mania, Katerina; Wooldridge, Dave; Coxon, Matthew; Robinson, Andrew

    2006-01-01

    Accuracy of memory performance per se is an imperfect reflection of the cognitive activity (awareness states) that underlies performance in memory tasks. The aim of this research is to investigate the effect of varied visual and interaction fidelity of immersive virtual environments on memory awareness states. A between-groups experiment was carried out to explore the effect of rendering quality on location-based recognition memory for objects and associated states of awareness. The experimental space, consisting of two interconnected rooms, was rendered either flat-shaded or using radiosity rendering. The computer graphics simulations were displayed on a stereo head-tracked Head Mounted Display. Participants completed a recognition memory task after exposure to the experimental space and reported one of four states of awareness following object recognition. These reflected the level of visual mental imagery involved during retrieval, the familiarity of the recollection, and also included guesses. Experimental results revealed variations in the distribution of participants' awareness states across conditions, while memory performance failed to reveal any. Interestingly, results revealed a higher proportion of recollections associated with mental imagery in the flat-shaded condition. These findings are consistent with similar effects revealed in two earlier studies summarized here, which demonstrated that the less "naturalistic" interaction interface, or interface of low interaction fidelity, provoked a higher proportion of recognitions based on visual mental images.

  8. Teaching Object Permanence: An Action Research Study

    ERIC Educational Resources Information Center

    Bruce, Susan M.; Vargas, Claudia

    2013-01-01

    "Object permanence," also known as "object concept" in the field of visual impairment, is one of the most important early developmental milestones. The achievement of object permanence is associated with the onset of representational thought and language. Object permanence is important to orientation, including the recognition of landmarks.…

  9. Computational modeling of the neural representation of object shape in the primate ventral visual system

    PubMed Central

    Eguchi, Akihiro; Mender, Bedeho M. W.; Evans, Benjamin D.; Humphreys, Glyn W.; Stringer, Simon M.

    2015-01-01

    Neurons in successive stages of the primate ventral visual pathway encode the spatial structure of visual objects. In this paper, we investigate through computer simulation how these cell firing properties may develop through unsupervised visually-guided learning. Individual neurons in the model are shown to exploit statistical regularity and temporal continuity of the visual inputs during training to learn firing properties that are similar to neurons in V4 and TEO. Neurons in V4 encode the conformation of boundary contour elements at a particular position within an object regardless of the location of the object on the retina, while neurons in TEO integrate information from multiple boundary contour elements. This representation goes beyond mere object recognition, in which neurons simply respond to the presence of a whole object, and provides an essential foundation from which the brain is subsequently able to recognize the whole object. PMID:26300766

  10. Webly-Supervised Fine-Grained Visual Categorization via Deep Domain Adaptation.

    PubMed

    Xu, Zhe; Huang, Shaoli; Zhang, Ya; Tao, Dacheng

    2018-05-01

    Learning visual representations from web data has recently attracted attention for object recognition. Previous studies have mainly focused on overcoming label noise and data bias and have shown promising results by learning directly from web data. However, we argue that it might be better to transfer knowledge from existing human labeling resources to improve performance at nearly no additional cost. In this paper, we propose a new semi-supervised method for learning via web data. Our method has the unique design of exploiting strong supervision, i.e., in addition to standard image-level labels, our method also utilizes detailed annotations including object bounding boxes and part landmarks. By transferring as much knowledge as possible from existing strongly supervised datasets to weakly supervised web images, our method can benefit from sophisticated object recognition algorithms and overcome several typical problems found in webly-supervised learning. We consider the problem of fine-grained visual categorization, in which existing training resources are scarce, as our main research objective. Comprehensive experimentation and extensive analysis demonstrate encouraging performance of the proposed approach, which, at the same time, delivers a new pipeline for fine-grained visual categorization that is likely to be highly effective for real-world applications.

  11. A stable biologically motivated learning mechanism for visual feature extraction to handle facial categorization.

    PubMed

    Rajaei, Karim; Khaligh-Razavi, Seyed-Mahdi; Ghodrati, Masoud; Ebrahimpour, Reza; Shiri Ahmad Abadi, Mohammad Ebrahim

    2012-01-01

    The brain mechanism of extracting visual features for recognizing various objects has consistently been a controversial issue in computational models of object recognition. To extract visual features, we introduce a new, biologically motivated model for facial categorization, which is an extension of the Hubel and Wiesel simple-to-complex cell hierarchy. To address the synaptic stability versus plasticity dilemma, we apply the Adaptive Resonance Theory (ART) for extracting informative intermediate level visual features during the learning process, which also makes this model stable against the destruction of previously learned information while learning new information. Such a mechanism has been suggested to be embedded within known laminar microcircuits of the cerebral cortex. To reveal the strength of the proposed visual feature learning mechanism, we show that when we use this mechanism in the training process of a well-known biologically motivated object recognition model (the HMAX model), it performs better than the HMAX model in face/non-face classification tasks. Furthermore, we demonstrate that our proposed mechanism is capable of following similar trends in performance as humans in a psychophysical experiment using a face versus non-face rapid categorization task.

  12. Learning and disrupting invariance in visual recognition with a temporal association rule

    PubMed Central

    Isik, Leyla; Leibo, Joel Z.; Poggio, Tomaso

    2012-01-01

    Learning by temporal association rules such as Foldiak's trace rule is an attractive hypothesis that explains the development of invariance in visual recognition. Consistent with these rules, several recent experiments have shown that invariance can be broken at both the psychophysical and single cell levels. We show (1) that temporal association learning provides appropriate invariance in models of object recognition inspired by the visual cortex, (2) that we can replicate the “invariance disruption” experiments using these models with a temporal association learning rule to develop and maintain invariance, and (3) that despite dramatic single cell effects, a population of cells is very robust to these disruptions. We argue that these models account for the stability of perceptual invariance despite the underlying plasticity of the system, the variability of the visual world and expected noise in the biological mechanisms. PMID:22754523
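    Foldiak's trace rule, the temporal association rule this record builds on, replaces the post-synaptic term of a Hebbian update with a slowly decaying activity trace, so inputs that occur close together in time (successive views of one object) come to drive the same unit. A minimal single-unit sketch; the learning rate, trace decay, and normalization are illustrative choices, not the paper's parameters:

    ```python
    import numpy as np

    def trace_rule_train(seqs, n_in, eta=0.05, delta=0.8):
        """Train one output unit with a Foldiak-style trace rule.
        Each element of `seqs` is a temporal sequence of input vectors
        (e.g. successive views of a single object)."""
        w = np.full(n_in, 0.1)                      # small uniform initial weights
        for seq in seqs:
            ybar = 0.0                              # trace resets between objects
            for x in seq:
                y = float(w @ x)                    # post-synaptic response
                ybar = (1 - delta) * y + delta * ybar   # temporal trace of activity
                w = w + eta * ybar * x              # trace-rule weight update
            w = w / np.linalg.norm(w)               # normalization keeps weights bounded
        return w
    ```

    Because the trace carries activity from one view into the update for the next, the unit strengthens its weights onto every input feature active within a sequence, which is the mechanism proposed to yield view-invariant responses.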

  13. The Effect of Inversion on 3- to 5-Year-Olds' Recognition of Face and Nonface Visual Objects

    ERIC Educational Resources Information Center

    Picozzi, Marta; Cassia, Viola Macchi; Turati, Chiara; Vescovo, Elena

    2009-01-01

    This study compared the effect of stimulus inversion on 3- to 5-year-olds' recognition of faces and two nonface object categories matched with faces for a number of attributes: shoes (Experiment 1) and frontal images of cars (Experiments 2 and 3). The inversion effect was present for faces but not shoes at 3 years of age (Experiment 1). Analogous…

  14. Metric invariance in object recognition: a review and further evidence.

    PubMed

    Cooper, E E; Biederman, I; Hummel, J E

    1992-06-01

    Phenomenologically, human shape recognition appears to be invariant with changes of orientation in depth (up to parts occlusion), position in the visual field, and size. Recent versions of template theories (e.g., Ullman, 1989; Lowe, 1987) assume that these invariances are achieved through the application of transformations such as rotation, translation, and scaling of the image so that it can be matched metrically to a stored template. Presumably, such transformations would require time for their execution. We describe recent priming experiments in which the effects of a prior brief presentation of an image on its subsequent recognition are assessed. The results of these experiments indicate that the invariance is complete: The magnitude of visual priming (as distinct from name or basic level concept priming) is not affected by a change in position, size, orientation in depth, or the particular lines and vertices present in the image, as long as representations of the same components can be activated. An implemented seven layer neural network model (Hummel & Biederman, 1992) that captures these fundamental properties of human object recognition is described. Given a line drawing of an object, the model activates a viewpoint-invariant structural description of the object, specifying its parts and their interrelations. Visual priming is interpreted as a change in the connection weights for the activation of: a) cells, termed geon feature assemblies (GFAs), that conjoin the output of units that represent invariant, independent properties of a single geon and its relations (such as its type, aspect ratio, relations to other geons), or b) a change in the connection weights by which several GFAs activate a cell representing an object.

  15. Remembering the Specific Visual Details of Presented Objects: Neuroimaging Evidence for Effects of Emotion

    ERIC Educational Resources Information Center

    Kensinger, Elizabeth A.; Schacter, Daniel L.

    2007-01-01

    Memories can be retrieved with varied amounts of visual detail, and the emotional content of information can influence the likelihood that visual detail is remembered. In the present fMRI experiment (conducted with 19 adults scanned using a 3T magnet), we examined the neural processes that correspond with recognition of the visual details of…

  16. Threat as a feature in visual semantic object memory.

    PubMed

    Calley, Clifford S; Motes, Michael A; Chiang, H-Sheng; Buhl, Virginia; Spence, Jeffrey S; Abdi, Hervé; Anand, Raksha; Maguire, Mandy; Estevez, Leonardo; Briggs, Richard; Freeman, Thomas; Kraut, Michael A; Hart, John

    2013-08-01

    Threatening stimuli have been found to modulate visual processes related to perception and attention. The present functional magnetic resonance imaging (fMRI) study investigated whether threat modulates visual object recognition of man-made and naturally occurring categories of stimuli. Compared with nonthreatening pictures, threatening pictures of real items elicited larger fMRI BOLD signal changes in medial visual cortices extending inferiorly into the temporo-occipital (TO) "what" pathways. This region elicited greater signal changes for threatening items compared to nonthreatening items from both the naturally occurring and man-made superordinate stimulus categories, demonstrating a featural component to these visual processing areas. Two additional loci of signal changes within more lateral inferior TO areas (bilateral BA18 and 19 as well as the right ventral temporal lobe) were detected for a category-feature interaction, with stronger responses to man-made (category) threatening (feature) stimuli than to natural threats. The findings are discussed in terms of efficient or rapid visual recognition of groups of items that confer an advantage for survival. Copyright © 2012 Wiley Periodicals, Inc.

  17. The Color “Fruit”: Object Memories Defined by Color

    PubMed Central

    Lewis, David E.; Pearson, Joel; Khuu, Sieu K.

    2013-01-01

    Most fruits and other highly color-diagnostic objects have color as a central aspect of their identity, which can facilitate detection and visual recognition. It has been theorized that there may be a large amount of overlap between the neural representations of these objects and processing involved in color perception. In accordance with this theory we sought to determine if the recognition of highly color diagnostic fruit objects could be facilitated by the visual presentation of their known color associates. In two experiments we show that color associate priming is possible, but contingent upon multiple factors. Color priming was found to be maximally effective for the most highly color diagnostic fruits, when low spatial-frequency information was present in the image, and when determination of the object's specific identity, not merely its category, was required. These data illustrate the importance of color for determining the identity of certain objects, and support the theory that object knowledge involves sensory specific systems. PMID:23717677

  18. Object recognition with hierarchical discriminant saliency networks.

    PubMed

    Han, Sunhyoung; Vasconcelos, Nuno

    2014-01-01

    The benefits of integrating attention and object recognition are investigated. While attention is frequently modeled as a pre-processor for recognition, we investigate the hypothesis that attention is an intrinsic component of recognition and vice-versa. This hypothesis is tested with a recognition model, the hierarchical discriminant saliency network (HDSN), whose layers are top-down saliency detectors, tuned for a visual class according to the principles of discriminant saliency. As a model of neural computation, the HDSN has two possible implementations. In a biologically plausible implementation, all layers comply with the standard neurophysiological model of visual cortex, with sub-layers of simple and complex units that implement a combination of filtering, divisive normalization, pooling, and non-linearities. In a convolutional neural network implementation, all layers are convolutional and implement a combination of filtering, rectification, and pooling. The rectification is performed with a parametric extension of the now popular rectified linear units (ReLUs), whose parameters can be tuned for the detection of target object classes. This enables a number of functional enhancements over neural network models that lack a connection to saliency, including optimal feature denoising mechanisms for recognition, modulation of saliency responses by the discriminant power of the underlying features, and the ability to detect both feature presence and absence. In either implementation, each layer has a precise statistical interpretation, and all parameters are tuned by statistical learning. Each saliency detection layer learns more discriminant saliency templates than its predecessors and higher layers have larger pooling fields. This enables the HDSN to simultaneously achieve high selectivity to target object classes and invariance. The performance of the network in saliency and object recognition tasks is compared to those of models from the biological and computer vision literatures. This demonstrates benefits for all the functional enhancements of the HDSN, the class tuning inherent to discriminant saliency, and saliency layers based on templates of increasing target selectivity and invariance. Altogether, these experiments suggest that there are non-trivial benefits in integrating attention and recognition.
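    The per-layer recipe described for the convolutional implementation (filtering, parametric rectification, pooling) can be illustrated with a shifted ReLU whose threshold and polarity are tunable, letting a unit signal either feature presence or feature absence. This is an illustrative reading of "a parametric extension of ReLUs", not the HDSN's actual parameterization:

    ```python
    import numpy as np

    def parametric_relu(x, theta=0.0, sign=1.0):
        """Shifted ReLU with a tunable threshold `theta` and polarity
        `sign`: sign=+1 fires when the feature response exceeds theta
        (presence), sign=-1 fires when it falls below theta (absence)."""
        return np.maximum(sign * (x - theta), 0.0)

    def saliency_layer(responses, theta, sign, pool=2):
        """One layer in the sketched recipe: rectify the filter-response
        map, then max-pool over local neighborhoods."""
        r = parametric_relu(responses, theta, sign)
        h, w = r.shape
        return np.array([[r[i:i + pool, j:j + pool].max()
                          for j in range(0, w - pool + 1, pool)]
                         for i in range(0, h - pool + 1, pool)])
    ```

    Tuning `theta` and `sign` per class is one way a rectifier can detect both feature presence and feature absence, the enhancement the abstract highlights over plain ReLU networks.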

  19. Not All Attention Orienting is Created Equal: Recognition Memory is Enhanced When Attention Orienting Involves Distractor Suppression

    PubMed Central

    Markant, Julie; Worden, Michael S.; Amso, Dima

    2015-01-01

    Learning through visual exploration often requires orienting of attention to meaningful information in a cluttered world. Previous work has shown that attention modulates visual cortex activity, with enhanced activity for attended targets and suppressed activity for competing inputs, thus enhancing the visual experience. Here we examined the idea that learning may be engaged differentially with variations in the attention orienting mechanisms that drive eye movements during visual search and exploration. We hypothesized that attention orienting mechanisms that engage suppression of a previously attended location will boost memory encoding of the currently attended target objects to a greater extent than those that involve target enhancement alone. To test this hypothesis, we capitalized on the classic spatial cueing task and the inhibition of return (IOR) mechanism (Posner, Rafal, & Choate, 1985; Posner, 1980) to demonstrate that object images encoded in the context of concurrent suppression at a previously attended location were encoded more effectively and remembered better than those encoded without concurrent suppression. Furthermore, fMRI analyses revealed that this memory benefit was driven by attention modulation of visual cortex activity, as increased suppression of the previously attended location in visual cortex during target object encoding predicted better subsequent recognition memory performance. These results suggest that not all attention orienting impacts learning and memory equally. PMID:25701278

  20. Recruitment of Foveal Retinotopic Cortex During Haptic Exploration of Shapes and Actions in the Dark.

    PubMed

    Monaco, Simona; Gallivan, Jason P; Figley, Teresa D; Singhal, Anthony; Culham, Jody C

    2017-11-29

    The role of the early visual cortex and higher-order occipitotemporal cortex has been studied extensively for visual recognition and to a lesser degree for haptic recognition and visually guided actions. Using a slow event-related fMRI experiment, we investigated whether tactile and visual exploration of objects recruit the same "visual" areas (and in the case of visual cortex, the same retinotopic zones) and if these areas show reactivation during delayed actions in the dark toward haptically explored objects (and if so, whether this reactivation might be due to imagery). We examined activation during visual or haptic exploration of objects and action execution (grasping or reaching) separated by an 18 s delay. Twenty-nine human volunteers (13 females) participated in this study. Participants had their eyes open and fixated on a point in the dark. The objects were placed below the fixation point and accordingly visual exploration activated the cuneus, which processes retinotopic locations in the lower visual field. Strikingly, the occipital pole (OP), representing foveal locations, showed higher activation for tactile than visual exploration, although the stimulus was unseen and location in the visual field was peripheral. Moreover, the lateral occipital tactile-visual area (LOtv) showed comparable activation for tactile and visual exploration. Psychophysiological interaction analysis indicated that the OP showed stronger functional connectivity with anterior intraparietal sulcus and LOtv during the haptic than visual exploration of shapes in the dark. After the delay, the cuneus, OP, and LOtv showed reactivation that was independent of the sensory modality used to explore the object. These results show that haptic actions not only activate "visual" areas during object touch, but also that this information appears to be used in guiding grasping actions toward targets after a delay. 
SIGNIFICANCE STATEMENT Visual presentation of an object activates shape-processing areas and retinotopic locations in early visual areas. Moreover, if the object is grasped in the dark after a delay, these areas show "reactivation." Here, we show that these areas are also activated and reactivated for haptic object exploration and haptically guided grasping. Touch-related activity occurs not only in the retinotopic location of the visual stimulus, but also at the occipital pole (OP), corresponding to the foveal representation, even though the stimulus was unseen and located peripherally. That is, the same "visual" regions are implicated in both visual and haptic exploration; however, touch also recruits high-acuity central representation within early visual areas during both haptic exploration of objects and subsequent actions toward them. Functional connectivity analysis shows that the OP is more strongly connected with ventral and dorsal stream areas when participants explore an object in the dark than when they view it. Copyright © 2017 the authors 0270-6474/17/3711572-20$15.00/0.

  1. Dynamic representation of partially occluded objects in primate prefrontal and visual cortex

    PubMed Central

    Choi, Hannah; Shea-Brown, Eric

    2017-01-01

    Successful recognition of partially occluded objects is presumed to involve dynamic interactions between brain areas responsible for vision and cognition, but neurophysiological evidence for the involvement of feedback signals is lacking. Here, we demonstrate that neurons in the ventrolateral prefrontal cortex (vlPFC) of monkeys performing a shape discrimination task respond more strongly to occluded than unoccluded stimuli. In contrast, neurons in visual area V4 respond more strongly to unoccluded stimuli. Analyses of V4 response dynamics reveal that many neurons exhibit two transient response peaks, the second of which emerges after vlPFC response onset and displays stronger selectivity for occluded shapes. We replicate these findings using a model of V4/vlPFC interactions in which occlusion-sensitive vlPFC neurons feed back to shape-selective V4 neurons, thereby enhancing V4 responses and selectivity to occluded shapes. These results reveal how signals from frontal and visual cortex could interact to facilitate object recognition under occlusion. PMID:28925354

  2. The Development of Invariant Object Recognition Requires Visual Experience with Temporally Smooth Objects

    ERIC Educational Resources Information Center

    Wood, Justin N.; Wood, Samantha M. W.

    2018-01-01

    How do newborns learn to recognize objects? According to temporal learning models in computational neuroscience, the brain constructs object representations by extracting smoothly changing features from the environment. To date, however, it is unknown whether newborns depend on smoothly changing features to build invariant object representations.…

  3. Biologically Inspired Model for Visual Cognition Achieving Unsupervised Episodic and Semantic Feature Learning.

    PubMed

    Qiao, Hong; Li, Yinlin; Li, Fengfu; Xi, Xuanyang; Wu, Wei

    2016-10-01

    Recently, many biologically inspired visual computational models have been proposed. The design of these models follows the related biological mechanisms and structures, and they provide new solutions for visual recognition tasks. In this paper, based on recent biological evidence, we propose a framework that mimics the active and dynamic learning and recognition process of the primate visual cortex. From a theoretical point of view, the main contribution is that the framework achieves unsupervised learning of episodic features (including key components and their spatial relations) and semantic features (semantic descriptions of the key components), which support higher-level cognition of an object. From a performance point of view, the advantages of the framework are as follows: 1) learning episodic features without supervision: for a class of objects without prior knowledge, the key components, their spatial relations, and cover regions can be learned automatically through a deep neural network (DNN); 2) learning semantic features based on episodic features: within the cover regions of the key components, the semantic geometrical values of these components can be computed based on contour detection; 3) forming the general knowledge of a class of objects: this knowledge, mainly comprising the key components, their spatial relations, and average semantic values, constitutes a concise description of the class; and 4) achieving higher-level cognition and dynamic updating: for a test image, the model can produce a classification and subclass semantic descriptions, and test samples with high confidence are selected to dynamically update the whole model. Experiments are conducted on face images, and good performance is achieved in each layer of the DNN and in the semantic description learning process. Furthermore, the model can be generalized to recognition tasks of other objects with learning ability.

  4. The effect of Wi-Fi electromagnetic waves in unimodal and multimodal object recognition tasks in male rats.

    PubMed

    Hassanshahi, Amin; Shafeie, Seyed Ali; Fatemi, Iman; Hassanshahi, Elham; Allahtavakoli, Mohammad; Shabani, Mohammad; Roohbakhsh, Ali; Shamsizadeh, Ali

    2017-06-01

    Wireless internet (Wi-Fi) electromagnetic waves (2.45 GHz) are in widespread use almost everywhere, especially in our homes. Considering recent reports about some hazardous effects of Wi-Fi signals on the nervous system, this study aimed to investigate the effect of 2.4 GHz Wi-Fi radiation on multisensory integration in rats. This experimental study was done on 80 male Wistar rats that were allocated into exposure and sham groups. Exposure to 2.4 GHz Wi-Fi microwaves [in Service Set Identifier mode (23.6 dBm and 3% for power and duty cycle, respectively)] was applied for 30 days (12 h/day). The cross-modal visual-tactile object recognition (CMOR) task was assessed with four variations of the spontaneous object recognition (SOR) test: standard SOR, tactile SOR, visual SOR, and CMOR. A discrimination ratio was calculated to assess the animals' preference for the novel object. The expression levels of M1 and GAT1 mRNA in the hippocampus were assessed by quantitative real-time RT-PCR. Results demonstrated that rats in the Wi-Fi exposure groups could not discriminate significantly between the novel and familiar objects in any of the standard SOR, tactile SOR, visual SOR, and CMOR tests. The expression of M1 receptors increased following Wi-Fi exposure. In conclusion, the results of this study showed that chronic exposure to Wi-Fi electromagnetic waves might impair both unimodal and cross-modal encoding of information.
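    The discrimination ratio used in such novel-object tasks is typically computed from exploration times. The exact definition varies between labs, so the form below (difference over total exploration time) is an assumption for illustration.

```python
def discrimination_ratio(t_novel, t_familiar):
    # One common definition: novel-minus-familiar exploration time,
    # normalized by total exploration time. Ranges from -1 to 1;
    # 0 indicates no preference for the novel object.
    total = t_novel + t_familiar
    if total == 0:
        raise ValueError("no exploration time recorded")
    return (t_novel - t_familiar) / total
```

    A ratio reliably above 0 indicates that the animal discriminates the novel from the familiar object; the exposure groups in this study failed to show that.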

  5. Perceptual Integration Deficits in Autism Spectrum Disorders Are Associated with Reduced Interhemispheric Gamma-Band Coherence.

    PubMed

    Peiker, Ina; David, Nicole; Schneider, Till R; Nolte, Guido; Schöttle, Daniel; Engel, Andreas K

    2015-12-16

    The integration of visual details into a holistic percept is essential for object recognition. This integration has been reported as a key deficit in patients with autism spectrum disorders (ASDs). The weak central coherence account posits an altered disposition to integrate features into a coherent whole in ASD. Here, we test the hypothesis that such weak perceptual coherence may be reflected in weak neural coherence across different cortical sites. We recorded magnetoencephalography from 20 adult human participants with ASD and 20 matched controls, who performed a slit-viewing paradigm, in which objects gradually passed behind a vertical or horizontal slit so that only fragments of the object were visible at any given moment. Object recognition thus required perceptual integration over time and, in case of the horizontal slit, also across visual hemifields. ASD participants were selectively impaired in the horizontal slit condition, indicating specific difficulties in long-range synchronization between the hemispheres. Specifically, the ASD group failed to show condition-related enhancement of imaginary coherence between the posterior superior temporal sulci in both hemispheres during horizontal slit-viewing in contrast to controls. Moreover, local synchronization reflected in occipitocerebellar beta-band power was selectively reduced for horizontal compared with vertical slit-viewing in ASD. Furthermore, we found disturbed connectivity between right posterior superior temporal sulcus and left cerebellum. Together, our results suggest that perceptual integration deficits co-occur with specific patterns of abnormal global and local synchronization in ASD. The weak central coherence account proposes a tendency of individuals with autism spectrum disorders (ASDs) to focus on details at the cost of an integrated coherent whole. 
Here, we provide evidence, at the behavioral and the neural level, that visual integration in object recognition is impaired in ASD, when details had to be integrated across both visual hemifields. We found enhanced interhemispheric gamma-band coherence in typically developed participants when communication between cortical hemispheres was required by the task. Importantly, participants with ASD failed to show this enhanced coherence between bilateral posterior superior temporal sulci. The findings suggest that visual integration is disturbed at the local and global synchronization scale, which might bear implications for object recognition in ASD. Copyright © 2015 the authors 0270-6474/15/3516352-10$15.00/0.

  6. On the road to invariant recognition: explaining tradeoff and morph properties of cells in inferotemporal cortex using multiple-scale task-sensitive attentive learning.

    PubMed

    Grossberg, Stephen; Markowitz, Jeffrey; Cao, Yongqiang

    2011-12-01

    Visual object recognition is an essential accomplishment of advanced brains. Object recognition needs to be tolerant, or invariant, with respect to changes in object position, size, and view. In monkeys and humans, a key area for recognition is the anterior inferotemporal cortex (ITa). Recent neurophysiological data show that ITa cells with high object selectivity often have low position tolerance. We propose a neural model whose cells learn to simulate this tradeoff, as well as ITa responses to image morphs, while explaining how invariant recognition properties may arise in stages due to processes across multiple cortical areas. These processes include the cortical magnification factor, multiple receptive field sizes, and top-down attentive matching and learning properties that may be tuned by task requirements to attend to either concrete or abstract visual features with different levels of vigilance. The model predicts that data from the tradeoff and image morph tasks emerge from different levels of vigilance in the animals performing them. This result illustrates how different vigilance requirements of a task may change the course of category learning, notably the critical features that are attended and incorporated into learned category prototypes. The model outlines a path for developing an animal model of how defective vigilance control can lead to symptoms of various mental disorders, such as autism and amnesia. Copyright © 2011 Elsevier Ltd. All rights reserved.

  7. Efficient visual object and word recognition relies on high spatial frequency coding in the left posterior fusiform gyrus: evidence from a case-series of patients with ventral occipito-temporal cortex damage.

    PubMed

    Roberts, Daniel J; Woollams, Anna M; Kim, Esther; Beeson, Pelagie M; Rapcsak, Steven Z; Lambon Ralph, Matthew A

    2013-11-01

    Recent visual neuroscience investigations suggest that ventral occipito-temporal cortex is retinotopically organized, with high acuity foveal input projecting primarily to the posterior fusiform gyrus (pFG), making this region crucial for coding high spatial frequency information. Because high spatial frequencies are critical for fine-grained visual discrimination, we hypothesized that damage to the left pFG should have an adverse effect not only on efficient reading, as observed in pure alexia, but also on the processing of complex non-orthographic visual stimuli. Consistent with this hypothesis, we obtained evidence that a large case series (n = 20) of patients with lesions centered on left pFG: 1) exhibited reduced sensitivity to high spatial frequencies; 2) demonstrated prolonged response latencies both in reading (pure alexia) and object naming; and 3) were especially sensitive to visual complexity and similarity when discriminating between novel visual patterns. These results suggest that the patients' dual reading and non-orthographic recognition impairments have a common underlying mechanism and reflect the loss of high spatial frequency visual information normally coded in the left pFG.

  8. Generic decoding of seen and imagined objects using hierarchical visual features.

    PubMed

    Horikawa, Tomoyasu; Kamitani, Yukiyasu

    2017-05-22

    Object recognition is a key function in both human and machine vision. While brain decoding of seen and imagined objects has been achieved, the prediction is limited to training examples. We present a decoding approach for arbitrary objects using the machine vision principle that an object category is represented by a set of features rendered invariant through hierarchical processing. We show that visual features, including those derived from a deep convolutional neural network, can be predicted from fMRI patterns, and that greater accuracy is achieved for low-/high-level features with lower-/higher-level visual areas, respectively. Predicted features are used to identify seen/imagined object categories (extending beyond decoder training) from a set of computed features for numerous object images. Furthermore, decoding of imagined objects reveals progressive recruitment of higher-to-lower visual representations. Our results demonstrate a homology between human and machine vision and its utility for brain-based information retrieval.
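    The identification step described above, matching feature vectors decoded from fMRI against computed features for candidate categories, can be sketched with a simple correlation-based scheme. The use of Pearson correlation and per-category average feature vectors here is an assumption for illustration, not the authors' exact procedure.

```python
import numpy as np

def identify_category(decoded, category_features):
    # Compare a feature vector decoded from brain activity with
    # precomputed (e.g., CNN-derived) feature vectors for candidate
    # categories, and return the best-correlated category. Because the
    # comparison set can include categories never seen during decoder
    # training, identification extends beyond the training examples.
    corrs = {name: np.corrcoef(decoded, feats)[0, 1]
             for name, feats in category_features.items()}
    return max(corrs, key=corrs.get)
```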

  9. Fusion of Multiple Sensing Modalities for Machine Vision

    DTIC Science & Technology

    1994-05-31

    "Modeling of Non-Homogeneous 3-D Objects for Thermal and Visual Image Synthesis," Pattern Recognition, in press. [11] Nair, Dinesh, and J. K. Aggarwal... 20th AIPR Workshop: Computer Vision--Meeting the Challenges, McLean, Virginia, October 1991. Nair, Dinesh, and J. K. Aggarwal, "An Object Recognition... Computer Engineering, August 1992. Sunil Gupta, Ph.D. Student; Mohan Kumar, M.S. Student; Sandeep Kumar, M.S. Student; Xavier Lebegue, Ph.D., Computer

  10. Information Processing at 1 Year: Relation to Birth Status and Developmental Outcome during the First 5 Years.

    ERIC Educational Resources Information Center

    Rose, Susan A.; And Others

    1991-01-01

    Measures of visual and tactual recognition memory, tactual-visual transfer, and object permanence were obtained for preterm and full-term infants. Measures of tactual-visual transfer were correlated with later intelligence measures up to the age of five years. These correlations were independent of socioeconomic status, medical risk, and early…

  11. The effect of scene context on episodic object recognition: parahippocampal cortex mediates memory encoding and retrieval success.

    PubMed

    Hayes, Scott M; Nadel, Lynn; Ryan, Lee

    2007-01-01

    Previous research has investigated intentional retrieval of contextual information and contextual influences on object identification and word recognition, yet few studies have investigated context effects in episodic memory for objects. To address this issue, unique objects embedded in a visually rich scene or on a white background were presented to participants. At test, objects were presented either in the original scene or on a white background. A series of behavioral studies with young adults demonstrated a context shift decrement (CSD)-decreased recognition performance when context is changed between encoding and retrieval. The CSD was not attenuated by encoding or retrieval manipulations, suggesting that binding of object and context may be automatic. A final experiment explored the neural correlates of the CSD, using functional Magnetic Resonance Imaging. Parahippocampal cortex (PHC) activation (right greater than left) during incidental encoding was associated with subsequent memory of objects in the context shift condition. Greater activity in right PHC was also observed during successful recognition of objects previously presented in a scene. Finally, a subset of regions activated during scene encoding, such as bilateral PHC, was reactivated when the object was presented on a white background at retrieval. Although participants were not required to intentionally retrieve contextual information, the results suggest that PHC may reinstate visual context to mediate successful episodic memory retrieval. The CSD is attributed to automatic and obligatory binding of object and context. The results suggest that PHC is important not only for processing of scene information, but also plays a role in successful episodic memory encoding and retrieval. These findings are consistent with the view that spatial information is stored in the hippocampal complex, one of the central tenets of Multiple Trace Theory. (c) 2007 Wiley-Liss, Inc.

  12. Salience of the lambs: a test of the saliency map hypothesis with pictures of emotive objects.

    PubMed

    Humphrey, Katherine; Underwood, Geoffrey; Lambert, Tony

    2012-01-25

    Humans have an ability to rapidly detect emotive stimuli. However, many emotional objects in a scene are also highly visually salient, which raises the question of how dependent the effects of emotionality are on visual saliency and whether the presence of an emotional object changes the power of a more visually salient object in attracting attention. Participants were shown a set of positive, negative, and neutral pictures and completed recall and recognition memory tests. Eye movement data revealed that visual saliency does influence eye movements, but the effect is reliably reduced when an emotional object is present. Pictures containing negative objects were recognized more accurately and recalled in greater detail, and participants fixated more on negative objects than positive or neutral ones. Initial fixations were more likely to be on emotional objects than more visually salient neutral ones, suggesting that the processing of emotional features occurs at a very early stage of perception.

  13. Coding the presence of visual objects in a recurrent neural network of visual cortex.

    PubMed

    Zwickel, Timm; Wachtler, Thomas; Eckhorn, Reinhard

    2007-01-01

    Before we can recognize a visual object, our visual system has to segregate it from its background. This requires a fast mechanism for establishing the presence and location of objects independently of their identity. Recently, border-ownership neurons were recorded in monkey visual cortex which might be involved in this task [Zhou, H., Friedmann, H., von der Heydt, R., 2000. Coding of border ownership in monkey visual cortex. J. Neurosci. 20 (17), 6594-6611]. In order to explain the basic mechanisms required for fast coding of object presence, we have developed a neural network model of visual cortex consisting of three stages. Feed-forward and lateral connections support coding of Gestalt properties, including similarity, good continuation, and convexity. Neurons of the highest area respond to the presence of an object and encode its position, invariant of its form. Feedback connections to the lowest area facilitate orientation detectors activated by contours belonging to potential objects, and thus generate the experimentally observed border-ownership property. This feedback control acts fast and significantly improves the figure-ground segregation required for the consecutive task of object recognition.

  14. Visual and cross-modal cues increase the identification of overlapping visual stimuli in Balint's syndrome.

    PubMed

    D'Imperio, Daniela; Scandola, Michele; Gobbetto, Valeria; Bulgarelli, Cristina; Salgarello, Matteo; Avesani, Renato; Moro, Valentina

    2017-10-01

    Cross-modal interactions improve the processing of external stimuli, particularly when an isolated sensory modality is impaired. When information from different modalities is integrated, object recognition is facilitated probably as a result of bottom-up and top-down processes. The aim of this study was to investigate the potential effects of cross-modal stimulation in a case of simultanagnosia. We report a detailed analysis of clinical symptoms and an 18F-fluorodeoxyglucose (FDG) brain positron emission tomography/computed tomography (PET/CT) study of a patient affected by Balint's syndrome, a rare and invasive visual-spatial disorder following bilateral parieto-occipital lesions. An experiment was conducted to investigate the effects of visual and nonvisual cues on performance in tasks involving the recognition of overlapping pictures. Four modalities of sensory cues were used: visual, tactile, olfactory, and auditory. Data from neuropsychological tests showed the presence of ocular apraxia, optic ataxia, and simultanagnosia. The results of the experiment indicate a positive effect of the cues on the recognition of overlapping pictures, not only in the identification of the congruent valid-cued stimulus (target) but also in the identification of the other, noncued stimuli. All the sensory modalities analyzed (except the auditory stimulus) were efficacious in terms of increasing visual recognition. Cross-modal integration improved the patient's ability to recognize overlapping figures. However, while in the visual unimodal modality both bottom-up (priming, familiarity effect, disengagement of attention) and top-down processes (mental representation and short-term memory, the endogenous orientation of attention) are involved, in the cross-modal integration it is semantic representations that mainly activate visual recognition processes. These results are potentially useful for the design of rehabilitation training for attentional and visual-perceptual deficits.

  15. Con-Text: Text Detection for Fine-grained Object Classification.

    PubMed

    Karaoglu, Sezer; Tao, Ran; van Gemert, Jan C; Gevers, Theo

    2017-05-24

    This work focuses on fine-grained object classification using recognized scene text in natural images. While the state-of-the-art relies on visual cues only, this paper is the first work to propose combining textual and visual cues. Another novelty is the textual cue extraction: unlike state-of-the-art text detection methods, we focus more on the background than on text regions. Once text regions are detected, they are further processed by two methods to perform text recognition, i.e., the commercial ABBYY OCR engine and a state-of-the-art character recognition algorithm. Then, to perform textual cue encoding, bi- and trigrams are formed between the recognized characters by considering the proposed spatial pairwise constraints. Finally, extracted visual and textual cues are combined for fine-grained classification. The proposed method is validated on four publicly available datasets: ICDAR03, ICDAR13, Con-Text, and Flickr-logo. We improve state-of-the-art end-to-end character recognition by a large margin of 15% on ICDAR03. We show that textual cues are useful in addition to visual cues for fine-grained classification, and that they are also useful for logo retrieval. Adding textual cues outperforms visual-only and textual-only approaches in fine-grained classification (70.7% vs. 60.3%) and logo retrieval (57.4% vs. 54.8%).
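    The textual cue encoding via n-grams over recognized characters can be sketched as follows. The paper combines only character pairs satisfying spatial pairwise constraints; in this simplified sketch, adjacency in the recognized sequence stands in for those constraints.

```python
def char_ngrams(chars, n):
    # Form n-grams (bigrams for n=2, trigrams for n=3) over a sequence
    # of recognized characters. The resulting n-gram counts can serve
    # as a textual feature vector for fine-grained classification.
    return [''.join(chars[i:i + n]) for i in range(len(chars) - n + 1)]
```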

  16. Visual recognition and inference using dynamic overcomplete sparse learning.

    PubMed

    Murray, Joseph F; Kreutz-Delgado, Kenneth

    2007-09-01

    We present a hierarchical architecture and learning algorithm for visual recognition and other visual inference tasks such as imagination, reconstruction of occluded images, and expectation-driven segmentation. Using properties of biological vision for guidance, we posit a stochastic generative world model and from it develop a simplified world model (SWM) based on a tractable variational approximation that is designed to enforce sparse coding. Recent developments in computational methods for learning overcomplete representations (Lewicki & Sejnowski, 2000; Teh, Welling, Osindero, & Hinton, 2003) suggest that overcompleteness can be useful for visual tasks, and we use an overcomplete dictionary learning algorithm (Kreutz-Delgado, et al., 2003) as a preprocessing stage to produce accurate, sparse codings of images. Inference is performed by constructing a dynamic multilayer network with feedforward, feedback, and lateral connections, which is trained to approximate the SWM. Learning is done with a variant of the back-propagation-through-time algorithm, which encourages convergence to desired states within a fixed number of iterations. Vision tasks require large networks, and to make learning efficient, we take advantage of the sparsity of each layer to update only a small subset of elements in a large weight matrix at each iteration. Experiments on a set of rotated objects demonstrate various types of visual inference and show that increasing the degree of overcompleteness improves recognition performance in difficult scenes with occluded objects in clutter.
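    A sparse coding step of the kind used as preprocessing above can be sketched with a single-pass greedy atom selection. This is a deliberate simplification for illustration, not the overcomplete dictionary-learning algorithm of Kreutz-Delgado et al. (2003) that the paper actually uses.

```python
import numpy as np

def sparse_code(x, D, k):
    # Encode signal x with at most k atoms of an (overcomplete)
    # dictionary D: pick the k atoms with the largest inner product
    # with x, then least-squares fit their coefficients; all other
    # coefficients stay exactly zero, giving a sparse code.
    scores = np.abs(D.T @ x)
    support = np.argsort(scores)[-k:]
    coeffs = np.zeros(D.shape[1])
    coeffs[support] = np.linalg.lstsq(D[:, support], x, rcond=None)[0]
    return coeffs
```

    With more columns in `D` than dimensions in `x`, the representation is overcomplete, and increasing the dictionary size is the kind of manipulation the paper reports as improving recognition in cluttered scenes.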

  17. The effects of perceptual priming on 4-year-olds' haptic-to-visual cross-modal transfer.

    PubMed

    Kalagher, Hilary

    2013-01-01

    Four-year-old children often have difficulty visually recognizing objects that were previously experienced only haptically. This experiment attempts to improve their performance in these haptic-to-visual transfer tasks. Sixty-two 4-year-old children participated in priming trials in which they explored eight unfamiliar objects visually, haptically, or visually and haptically together. Subsequently, all children participated in the same haptic-to-visual cross-modal transfer task. In this task, children haptically explored the objects that were presented in the priming phase and then visually identified a match from among three test objects, each matching the object on only one dimension (shape, texture, or color). Children in all priming conditions predominantly made shape-based matches; however, the most shape-based matches were made in the Visual and Haptic condition. All kinds of priming provided the necessary memory traces upon which subsequent haptic exploration could build a strong enough representation to enable subsequent visual recognition. Haptic exploration patterns during the cross-modal transfer task are discussed and the detailed analyses provide a unique contribution to our understanding of the development of haptic exploratory procedures.

  18. Neural Dynamics of Object-Based Multifocal Visual Spatial Attention and Priming: Object Cueing, Useful-Field-of-View, and Crowding

    ERIC Educational Resources Information Center

    Foley, Nicholas C.; Grossberg, Stephen; Mingolla, Ennio

    2012-01-01

    How are spatial and object attention coordinated to achieve rapid object learning and recognition during eye movement search? How do prefrontal priming and parietal spatial mechanisms interact to determine the reaction time costs of intra-object attention shifts, inter-object attention shifts, and shifts between visible objects and covertly cued…

  19. Neurophysiological indices of perceptual object priming in the absence of explicit recognition memory.

    PubMed

    Harris, Jill D; Cutmore, Tim R H; O'Gorman, John; Finnigan, Simon; Shum, David

    2009-02-01

    The aim of this study was to identify ERP correlates of perceptual object priming that are insensitive to factors affecting explicit, episodic memory. EEG was recorded from 21 participants while they performed a visual object recognition test on a combination of unstudied items and old items that were previously encountered during either a 'deep' or 'shallow' levels-of-processing (LOP) study task. The results demonstrated a midline P150 old/new effect which was sensitive only to objects' old/new status and not to the accuracy of recognition responses to old items, or to the LOP manipulation. Similar outcomes were observed for the subsequent P200 and N400 effects, the former of which had a parietal scalp maximum and the latter, a broadly distributed topography. In addition an LPC old/new effect typical of those reported in past ERP recognition studies was observed. These outcomes support the proposal that the P150 effect is reflective of perceptual object priming and moreover, provide novel evidence that this and the P200 effect are independent of explicit recognition memory process(es).

  20. Multilevel depth and image fusion for human activity detection.

    PubMed

    Ni, Bingbing; Pei, Yong; Moulin, Pierre; Yan, Shuicheng

    2013-10-01

    Recognizing complex human activities usually requires the detection and modeling of individual visual features and the interactions between them. Current methods rely only on visual features extracted from 2-D images, and therefore often suffer from unreliable salient visual feature detection and inaccurate modeling of the interaction context between individual features. In this paper, we show that these problems can be addressed by combining data from a conventional camera and a depth sensor (e.g., Microsoft Kinect). We propose a novel complex activity recognition and localization framework that effectively fuses information from both grayscale and depth image channels at multiple levels of the video processing pipeline. At the individual visual feature detection level, depth-based filters are applied to the detected human/object rectangles to remove false detections. At the next level, interaction modeling, 3-D spatial and temporal contexts among human subjects or objects are extracted by integrating information from both grayscale and depth images. Depth information is also utilized to distinguish different types of indoor scenes. Finally, a latent structural model is developed to integrate the information from multiple levels of video processing for activity detection. Extensive experiments on two activity recognition benchmarks (one with depth information) and a challenging grayscale + depth human activity database containing complex human-human, human-object, and human-surroundings interactions demonstrate the effectiveness of the proposed multilevel grayscale + depth fusion scheme. Higher recognition and localization accuracies are obtained relative to previous methods.
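
    The depth-based filtering step at the feature detection level can be sketched as follows; using the spread of depth values inside a detected rectangle as the rejection criterion, and the 0.3 m threshold, are assumptions of this sketch rather than the paper's exact rule:

```python
from statistics import pstdev

def depth_filter(detections, max_spread=0.3):
    """Keep only detected human/object rectangles whose depth samples
    look like a single coherent surface; a large spread suggests the
    2-D detector fired on background clutter. Each detection is a
    (box_id, depth_samples_in_metres) pair."""
    return [box_id for box_id, depths in detections
            if pstdev(depths) <= max_spread]

detections = [("person_1", [2.0, 2.1, 2.05, 1.95]),   # coherent surface
              ("false_pos", [1.0, 3.5, 0.8, 4.2])]    # background bleed-through
print(depth_filter(detections))  # ['person_1']
```

    In the paper's pipeline a rule of this kind would run on the Kinect depth channel before the interaction-modeling stage sees the surviving detections.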

  1. How Does Using Object Names Influence Visual Recognition Memory?

    ERIC Educational Resources Information Center

    Richler, Jennifer J.; Palmeri, Thomas J.; Gauthier, Isabel

    2013-01-01

    Two recent lines of research suggest that explicitly naming objects at study influences subsequent memory for those objects at test. Lupyan (2008) suggested that naming "impairs" memory by a representational shift of stored representations of named objects toward the prototype (labeling effect). MacLeod, Gopie, Hourihan, Neary, and Ozubko (2010)…

  2. Weighted feature selection criteria for visual servoing of a telerobot

    NASA Technical Reports Server (NTRS)

    Feddema, John T.; Lee, C. S. G.; Mitchell, O. R.

    1989-01-01

    Because of the continually changing environment of a space station, visual feedback is a vital element of a telerobotic system. A real-time visual servoing system would allow a telerobot to track and manipulate randomly moving objects. Methodologies are devised for the automatic selection of image features to be used to visually control the relative position between an eye-in-hand telerobot and a known object. A weighted criteria function with both image recognition and control components is used to select the combination of image features that provides the best control. Simulation and experimental results of a PUMA robot arm visually tracking a randomly moving carburetor gasket with a visual update time of 70 milliseconds are discussed.
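
    A selection rule of this kind can be sketched as a brute-force search over feature combinations scored by a weighted criteria function; the two-term form of J, the weights, and all feature names below are illustrative assumptions, not the paper's exact formulation:

```python
import itertools

def select_features(features, weights=(0.5, 0.5), k=3):
    """Pick the k-feature combination maximizing a weighted criteria
    function J = w1 * recognition_score + w2 * control_score.
    `features` maps a feature name to hypothetical (recognition,
    control) scores in [0, 1]."""
    w1, w2 = weights

    def criteria(combo):
        rec = sum(features[f][0] for f in combo) / k
        ctl = sum(features[f][1] for f in combo) / k
        return w1 * rec + w2 * ctl

    return max(itertools.combinations(sorted(features), k), key=criteria)

# Hypothetical candidate features on the tracked object:
scores = {"corner_a": (0.9, 0.4), "corner_b": (0.8, 0.9),
          "hole_1": (0.7, 0.8), "edge_3": (0.3, 0.9), "blob_7": (0.2, 0.3)}
print(select_features(scores))  # the three highest-scoring features
```

    In practice the two scores would come from image processing (feature distinctiveness for recognition; sensitivity of the feature's image position to robot motion for control), and the exhaustive search would be replaced by something cheaper for large feature sets.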

  3. Brief Report: Face-Specific Recognition Deficits in Young Children with Autism Spectrum Disorders

    ERIC Educational Resources Information Center

    Bradshaw, Jessica; Shic, Frederick; Chawarska, Katarzyna

    2011-01-01

    This study used eyetracking to investigate the ability of young children with autism spectrum disorders (ASD) to recognize social (faces) and nonsocial (simple objects and complex block patterns) stimuli using the visual paired comparison (VPC) paradigm. Typically developing (TD) children showed evidence for recognition of faces and simple…

  4. Perception, memory and aesthetics of indeterminate art.

    PubMed

    Ishai, Alumit; Fairhall, Scott L; Pepperell, Robert

    2007-07-12

    Indeterminate art, in which familiar objects are only suggestive, invokes a perceptual conundrum as apparently detailed and vivid images resist identification. We hypothesized that compared with paintings that depict meaningful content, object recognition in indeterminate images would be delayed, and tested whether aesthetic affect depends on meaningful content. Subjects performed object recognition and judgment of aesthetic affect tasks. Response latencies were significantly longer for indeterminate images, and subjects perceived recognizable objects in 24% of these paintings. Although the aesthetic affect ratings of all paintings were similar, judgment latencies for the indeterminate paintings were significantly longer. A surprise memory test revealed that more representational than indeterminate paintings were remembered and that affective strength increased the probability of subsequent recall. Our results suggest that perception and memory of art depend on semantic aspects, whereas aesthetic affect depends on formal visual features. The longer latencies associated with indeterminate paintings reflect the underlying cognitive processes that mediate object resolution. Indeterminate artworks therefore comprise a rich set of stimuli with which the neural correlates of visual perception can be investigated.

  5. The Role of Fixation and Visual Attention in Object Recognition.

    DTIC Science & Technology

    1995-01-01

    (Abstract not available; the record contains only OCR fragments of the report's reference list.) Fragments include: "…computers", Technical Report, Artificial Intelligence Lab, M.I.T., AI-Memo-915, June 1986; [29] D.P. Huttenlocher and S. Ullman, "Object Recognition Using…"; "…attention", Technical Report, Artificial Intelligence Lab, M.I.T., AI-Memo-770, Jan 1984; [35] E. Krotkov, K. Henriksen and R. Kories, "Stereo…". MIT Artificial Intelligence Laboratory. Distribution statement: approved for public release; distribution unlimited.

  6. Ventral-stream-like shape representation: from pixel intensity values to trainable object-selective COSFIRE models

    PubMed Central

    Azzopardi, George; Petkov, Nicolai

    2014-01-01

    The remarkable abilities of the primate visual system have inspired the construction of computational models of some visual neurons. We propose a trainable hierarchical object recognition model, which we call S-COSFIRE (S stands for Shape and COSFIRE stands for Combination Of Shifted FIlter REsponses) and use it to localize and recognize objects of interest embedded in complex scenes. It is inspired by the visual processing in the ventral stream (V1/V2 → V4 → TEO). Recognition and localization of objects embedded in complex scenes is important for many computer vision applications. Most existing methods require prior segmentation of the objects from the background, which in turn requires recognition. An S-COSFIRE filter is automatically configured to be selective for an arrangement of contour-based features that belong to a prototype shape specified by an example. The configuration comprises selecting relevant vertex detectors and determining certain blur and shift parameters. The response is computed as the weighted geometric mean of the blurred and shifted responses of the selected vertex detectors. S-COSFIRE filters share similar properties with some neurons in inferotemporal cortex, which provided inspiration for this work. We demonstrate the effectiveness of S-COSFIRE filters in two applications: letter and keyword spotting in handwritten manuscripts and object spotting in complex scenes for the computer vision system of a domestic robot. S-COSFIRE filters effectively recognize and localize (deformable) objects in images of complex scenes without requiring prior segmentation. They are versatile trainable shape detectors, conceptually simple and easy to implement. The presented hierarchical shape representation contributes to a better understanding of the brain and to more robust computer vision algorithms. PMID:25126068
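
    The response computation (a weighted geometric mean of blurred, shifted vertex-detector responses) can be sketched as follows; the Gaussian weighting by a part's distance rho from the filter center follows the published COSFIRE formulation, but the parameter values and data are arbitrary:

```python
import math

def s_cosfire_response(parts, sigma=0.5):
    """Weighted geometric mean of vertex-detector responses.
    `parts` is a list of (response, rho) pairs, where `response` is the
    blurred, shifted response of one selected vertex detector and `rho`
    is that part's distance from the filter center. A zero response from
    any part suppresses the whole filter (AND-like combination)."""
    if any(r <= 0 for r, _ in parts):
        return 0.0
    weights = [math.exp(-rho ** 2 / (2 * sigma ** 2)) for _, rho in parts]
    log_mean = sum(w * math.log(r)
                   for (r, _), w in zip(parts, weights)) / sum(weights)
    return math.exp(log_mean)

print(s_cosfire_response([(0.9, 0.0), (0.8, 1.0), (0.7, 1.5)]))
print(s_cosfire_response([(0.9, 0.0), (0.0, 1.0)]))  # 0.0: a missing part vetoes the match
```

    The geometric (rather than arithmetic) mean is what gives the filter its conjunctive, shape-selective behavior: every contour part of the prototype must respond for the whole filter to respond.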

  7. The posterior parietal cortex in recognition memory: a neuropsychological study.

    PubMed

    Haramati, Sharon; Soroker, Nachum; Dudai, Yadin; Levy, Daniel A

    2008-01-01

    Several recent functional neuroimaging studies have reported robust bilateral activation (L>R) in lateral posterior parietal cortex and precuneus during recognition memory retrieval tasks. It has not yet been determined what cognitive processes are represented by those activations. In order to examine whether parietal lobe-based processes are necessary for basic episodic recognition abilities, we tested a group of 17 first-incident CVA patients whose cortical damage included (but was not limited to) extensive unilateral posterior parietal lesions. These patients performed a series of tasks that yielded parietal activations in previous fMRI studies: yes/no recognition judgments on visual words and on colored object pictures and identifiable environmental sounds. We found that patients with left hemisphere lesions were not impaired compared to controls in any of the tasks. Patients with right hemisphere lesions were not significantly impaired in memory for visual words, but were impaired in recognition of object pictures and sounds. Two lesion-behavior analyses, area-based correlations and voxel-based lesion symptom mapping (VLSM), indicate that these impairments resulted from extra-parietal damage, specifically to frontal and lateral temporal areas. These findings suggest that extensive parietal damage does not impair recognition performance. We suggest that parietal activations recorded during recognition memory tasks might reflect peri-retrieval processes, such as the storage of retrieved memoranda in a working memory buffer for further cognitive processing.

  8. Recurrent Convolutional Neural Networks: A Better Model of Biological Object Recognition.

    PubMed

    Spoerer, Courtney J; McClure, Patrick; Kriegeskorte, Nikolaus

    2017-01-01

    Feedforward neural networks provide the dominant model of how the brain performs visual object recognition. However, these networks lack the lateral and feedback connections, and the resulting recurrent neuronal dynamics, of the ventral visual pathway in the human and non-human primate brain. Here we investigate recurrent convolutional neural networks with bottom-up (B), lateral (L), and top-down (T) connections. Combining these types of connections yields four architectures (B, BT, BL, and BLT), which we systematically test and compare. We hypothesized that recurrent dynamics might improve recognition performance in the challenging scenario of partial occlusion. We introduce two novel occluded object recognition tasks to test the efficacy of the models, digit clutter (where multiple target digits occlude one another) and digit debris (where target digits are occluded by digit fragments). We find that recurrent neural networks outperform feedforward control models (approximately matched in parametric complexity) at recognizing objects, both in the absence of occlusion and in all occlusion conditions. Recurrent networks were also found to be more robust to the inclusion of additive Gaussian noise. Recurrent neural networks are better in two respects: (1) they are more neurobiologically realistic than their feedforward counterparts; (2) they are better in terms of their ability to recognize objects, especially under challenging conditions. This work shows that computer vision can benefit from using recurrent convolutional architectures and suggests that the ubiquitous recurrent connections in biological brains are essential for task performance.
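
    The effect of adding recurrence can be illustrated with a deliberately tiny, scalar stand-in for one unit of a bottom-up-plus-lateral (BL) network, unrolled over discrete time steps; this is a sketch of the recurrence scheme only, with arbitrary weights, not the authors' convolutional architecture:

```python
def relu(x):
    return max(0.0, x)

def unroll_bl(x, steps, w_bottom_up=1.0, w_lateral=0.5):
    """Unroll h[t] = relu(w_b * x + w_l * h[t-1]) for `steps` time steps.
    A single step with h[0] = 0 is the feedforward (B) case; with
    0 < w_l < 1 the activation converges toward w_b * x / (1 - w_l),
    i.e. recurrence keeps integrating the same input over time."""
    h = 0.0
    for _ in range(steps):
        h = relu(w_bottom_up * x + w_lateral * h)
    return h

print(unroll_bl(1.0, steps=1))   # 1.0: single feedforward pass
print(unroll_bl(1.0, steps=8))   # approaches 2.0 as evidence accumulates
```

    In the full models the scalars become convolutional feature maps and a top-down (T) term from the layer above is added, but the unrolling over time steps works the same way.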

  9. Effect of tDCS on task relevant and irrelevant perceptual learning of complex objects.

    PubMed

    Van Meel, Chayenne; Daniels, Nicky; de Beeck, Hans Op; Baeck, Annelies

    2016-01-01

    During perceptual learning the visual representations in the brain are altered, but the causal role of these changes has not yet been fully characterized. We used transcranial direct current stimulation (tDCS) to investigate the role of higher visual regions in lateral occipital cortex (LO) in perceptual learning with complex objects. We also investigated whether object learning depends on the relevance of the objects for the learning task. Participants were trained in two tasks: object recognition using a backward masking paradigm and an orientation judgment task. During both tasks, an object with a red line on top of it was presented in each trial. The crucial difference between the two tasks was the relevance of the object: the object was relevant for the object recognition task, but not for the orientation judgment task. During training, half of the participants received anodal tDCS stimulation targeted at LO. Afterwards, participants were tested on how well they recognized the trained objects, the irrelevant objects presented during the orientation judgment task, and a set of completely new objects. Participants stimulated with tDCS during training showed larger performance improvements than participants in the sham condition. No learning effect was found for the objects presented during the orientation judgment task. To conclude, this study suggests a causal role of LO in relevant object learning, but given the rather low spatial resolution of tDCS, more research on the specificity of this effect is needed. Further, mere exposure is not sufficient to train object recognition in our paradigm.

  10. Not all attention orienting is created equal: recognition memory is enhanced when attention orienting involves distractor suppression.

    PubMed

    Markant, Julie; Worden, Michael S; Amso, Dima

    2015-04-01

    Learning through visual exploration often requires orienting of attention to meaningful information in a cluttered world. Previous work has shown that attention modulates visual cortex activity, with enhanced activity for attended targets and suppressed activity for competing inputs, thus enhancing the visual experience. Here we examined the idea that learning may be engaged differentially with variations in attention orienting mechanisms that drive eye movements during visual search and exploration. We hypothesized that attention orienting mechanisms that engaged suppression of a previously attended location would boost memory encoding of the currently attended target objects to a greater extent than those that involve target enhancement alone. To test this hypothesis we capitalized on the classic spatial cueing task and the inhibition of return (IOR) mechanism (Posner, 1980; Posner, Rafal, & Choate, 1985) to demonstrate that object images encoded in the context of concurrent suppression at a previously attended location were encoded more effectively and remembered better than those encoded without concurrent suppression. Furthermore, fMRI analyses revealed that this memory benefit was driven by attention modulation of visual cortex activity, as increased suppression of the previously attended location in visual cortex during target object encoding predicted better subsequent recognition memory performance. These results suggest that not all attention orienting impacts learning and memory equally. Copyright © 2015 Elsevier Inc. All rights reserved.

  11. Learned Non-Rigid Object Motion is a View-Invariant Cue to Recognizing Novel Objects

    PubMed Central

    Chuang, Lewis L.; Vuong, Quoc C.; Bülthoff, Heinrich H.

    2012-01-01

    There is evidence that observers use learned object motion to recognize objects. For instance, studies have shown that reversing the learned direction in which a rigid object rotated in depth impaired recognition accuracy. This motion reversal can be achieved by playing animation sequences of moving objects in reverse frame order. In the current study, we used this sequence-reversal manipulation to investigate whether observers encode the motion of dynamic objects in visual memory, and whether such dynamic representations are encoded in a way that is dependent on the viewing conditions. Participants first learned dynamic novel objects, presented as animation sequences. Following learning, they were then tested on their ability to recognize these learned objects when their animation sequence was shown in the same sequence order as during learning or in the reverse sequence order. In Experiment 1, we found that non-rigid motion contributed to recognition performance; that is, sequence-reversal decreased sensitivity across different tasks. In subsequent experiments, we tested the recognition of non-rigidly deforming (Experiment 2) and rigidly rotating (Experiment 3) objects across novel viewpoints. Recognition performance was affected by viewpoint changes for both experiments. Learned non-rigid motion continued to contribute to recognition performance and this benefit was the same across all viewpoint changes. By comparison, learned rigid motion did not contribute to recognition performance. These results suggest that non-rigid motion provides a source of information for recognizing dynamic objects, which is not affected by changes to viewpoint. PMID:22661939

  12. Invariant visual object recognition: a model, with lighting invariance.

    PubMed

    Rolls, Edmund T; Stringer, Simon M

    2006-01-01

    How are invariant representations of objects formed in the visual cortex? We describe a neurophysiological and computational approach which focusses on a feature hierarchy model in which invariant representations can be built by self-organizing learning based on the statistics of the visual input. The model can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or it can use spatial continuity in Continuous Transformation learning. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, and size, and, as we show in this paper, also lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects, such as looming, rotation, and object-based movement. It has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in, for example, spatial and object search tasks. It has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene.

  13. Selective involvement of superior frontal cortex during working memory for shapes.

    PubMed

    Yee, Lydia T S; Roe, Katherine; Courtney, Susan M

    2010-01-01

    A spatial/nonspatial functional dissociation between the dorsal and ventral visual pathways is well established and has formed the basis of domain-specific theories of prefrontal cortex (PFC). Inconsistencies in the literature regarding prefrontal organization, however, have led to questions regarding whether the nature of the dissociations observed in PFC during working memory are equivalent to those observed in the visual pathways for perception. In particular, the dissociation between dorsal and ventral PFC during working memory for locations versus object identities has been clearly present in some studies but not in others, seemingly in part due to the type of objects used. The current study compared functional MRI activation during delayed-recognition tasks for shape or color, two object features considered to be processed by the ventral pathway for perceptual recognition. Activation for the shape-delayed recognition task was greater than that for the color task in the lateral occipital cortex, in agreement with studies of visual perception. Greater memory-delay activity was also observed, however, in the parietal and superior frontal cortices for the shape than for the color task. Activity in superior frontal cortex was associated with better performance on the shape task. Conversely, greater delay activity for color than for shape was observed in the left anterior insula and this activity was associated with better performance on the color task. These results suggest that superior frontal cortex contributes to performance on tasks requiring working memory for object identities, but it represents different information about those objects than does the ventral frontal cortex.

  14. A bio-inspired system for spatio-temporal recognition in static and video imagery

    NASA Astrophysics Data System (ADS)

    Khosla, Deepak; Moore, Christopher K.; Chelian, Suhas

    2007-04-01

    This paper presents a bio-inspired method for spatio-temporal recognition in static and video imagery. It builds upon and extends our previous work on a bio-inspired Visual Attention and object Recognition System (VARS). The VARS approach locates and recognizes objects in a single frame. This work presents two extensions of VARS. The first extension is a Scene Recognition Engine (SCE) that learns to recognize spatial relationships between objects that compose a particular scene category in static imagery. This could be used for recognizing the category of a scene, e.g., office vs. kitchen. The second extension is the Event Recognition Engine (ERE), which recognizes spatio-temporal sequences or events in video imagery. This extension uses a working memory model to recognize events and behaviors by maintaining and recognizing ordered spatio-temporal sequences. The working memory model is based on an ARTSTORE neural network that combines an ART-based neural network with a cascade of sustained temporal order recurrent (STORE) neural networks. A series of Default ARTMAP classifiers ascribes event labels to these sequences. Our preliminary studies have shown that this extension is robust to variations in an object's motion profile. We evaluated the performance of the SCE and ERE on real datasets. The SCE module was tested on a visual scene classification task using the LabelMe dataset. The ERE was tested on real-world video footage of vehicles and pedestrians in a street scene. Our system is able to recognize the events in this footage involving vehicles and pedestrians.

  15. The Doors and People Test: The Effect of Frontal Lobe Lesions on Recall and Recognition Memory Performance

    PubMed Central

    2016-01-01

    Objective: Memory deficits in patients with frontal lobe lesions are most apparent on free recall tasks that require the selection, initiation, and implementation of retrieval strategies. The effect of frontal lesions on recognition memory performance is less clear, with some studies reporting recognition memory impairments and others not. The majority of these studies do not directly compare recall and recognition within the same group of frontal patients, assessing only recall or only recognition memory performance. Other studies that do compare recall and recognition in the same frontal group do not use recall and recognition tests that are comparable in difficulty. Recognition memory impairments may go unreported because recognition memory tasks are less demanding. Method: This study aimed to investigate recall and recognition impairments in the same group of 47 frontal patients and 78 healthy controls. The Doors and People Test was administered as a neuropsychological test of memory, as it assesses both verbal and visual recall and recognition using subtests that are matched for difficulty. Results: Significant verbal and visual recall and recognition impairments were found in the frontal patients. Conclusion: These results demonstrate that when frontal patients are assessed on recall and recognition memory tests of comparable difficulty, memory impairments are found on both types of episodic memory test. PMID:26752123

  16. A systematic review of visual processing and associated treatments in body dysmorphic disorder.

    PubMed

    Beilharz, F; Castle, D J; Grace, S; Rossell, S L

    2017-07-01

    Recent advances in body dysmorphic disorder (BDD) have explored abnormal visual processing, yet it is unclear how this relates to treatment. The aim of this study was to summarize our current understanding of visual processing in BDD and review associated treatments. The literature was collected through PsycInfo and PubMed. Visual processing articles were included if written in English after 1970, had a specific BDD group compared to healthy controls and were not case studies. Due to the lack of research regarding treatments associated with visual processing, case studies were included. A number of visual processing abnormalities are present in BDD, including face recognition, emotion identification, aesthetics, object recognition and gestalt processing. Differences to healthy controls include a dominance of detailed local processing over global processing and associated changes in brain activation in visual regions. Perceptual mirror retraining and some forms of self-exposure have demonstrated improved treatment outcomes, but have not been examined in isolation from broader treatments. Despite these abnormalities in perception, particularly concerning face and emotion recognition, few BDD treatments attempt to specifically remediate this. The development of a novel visual training programme which addresses these widespread abnormalities may provide an effective treatment modality. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  17. Neural Dynamics Underlying Target Detection in the Human Brain

    PubMed Central

    Bansal, Arjun K.; Madhavan, Radhika; Agam, Yigal; Golby, Alexandra; Madsen, Joseph R.

    2014-01-01

    Sensory signals must be interpreted in the context of goals and tasks. To detect a target in an image, the brain compares input signals and goals to elicit the correct behavior. We examined how target detection modulates visual recognition signals by recording intracranial field potential responses from 776 electrodes in 10 epileptic human subjects. We observed reliable differences in the physiological responses to stimuli when a cued target was present versus absent. Goal-related modulation was particularly strong in the inferior temporal and fusiform gyri, two areas important for object recognition. Target modulation started approximately 250 ms after stimulus onset, considerably after the onset of visual recognition signals. While broadband signals exhibited increased or decreased power, gamma frequency power showed predominantly increases during target presence. These observations support models where task goals interact with sensory inputs via top-down signals that influence the highest echelons of visual processing after the onset of selective responses. PMID:24553944

  18. The neural basis of body form and body action agnosia.

    PubMed

    Moro, Valentina; Urgesi, Cosimo; Pernigo, Simone; Lanteri, Paola; Pazzaglia, Mariella; Aglioti, Salvatore Maria

    2008-10-23

    Visual analysis of faces and nonfacial body stimuli brings about neural activity in different cortical areas. Moreover, processing body form and body action relies on distinct neural substrates. Although brain lesion studies show specific face processing deficits, neuropsychological evidence for defective recognition of nonfacial body parts is lacking. By combining psychophysics studies with lesion-mapping techniques, we found that lesions of ventromedial, occipitotemporal areas induce face and body recognition deficits while lesions involving extrastriate body area seem causatively associated with impaired recognition of body but not of face and object stimuli. We also found that body form and body action recognition deficits can be double dissociated and are causatively associated with lesions to extrastriate body area and ventral premotor cortex, respectively. Our study reports two category-specific visual deficits, called body form and body action agnosia, and highlights their neural underpinnings.

  19. Emotion and Object Processing in Parkinson's Disease

    ERIC Educational Resources Information Center

    Cohen, Henri; Gagne, Marie-Helene; Hess, Ursula; Pourcher, Emmanuelle

    2010-01-01

    The neuropsychological literature on the processing of emotions in Parkinson's disease (PD) reveals conflicting evidence about the role of the basal ganglia in the recognition of facial emotions. Hence, the present study had two objectives. One was to determine the extent to which the visual processing of emotions and objects differs in PD. The…

  20. Beyond perceptual expertise: revisiting the neural substrates of expert object recognition

    PubMed Central

    Harel, Assaf; Kravitz, Dwight; Baker, Chris I.

    2013-01-01

    Real-world expertise provides a valuable opportunity to understand how experience shapes human behavior and neural function. In the visual domain, the study of expert object recognition, such as in car enthusiasts or bird watchers, has produced a large, growing, and often controversial literature. Here, we synthesize this literature, focusing primarily on results from functional brain imaging, and propose an interactive framework that incorporates the impact of high-level factors, such as attention and conceptual knowledge, in supporting expertise. This framework contrasts with the perceptual view of object expertise, which has concentrated largely on stimulus-driven processing in visual cortex. One prominent version of this perceptual account has almost exclusively focused on the relation of expertise to face processing and, in terms of the neural substrates, has centered on face-selective cortical regions such as the Fusiform Face Area (FFA). We discuss the limitations of this face-centric approach as well as the more general perceptual view, and highlight that expertise-related activity is: (i) found throughout visual cortex, not just FFA, with a strong relationship between neural response and behavioral expertise even in the earliest stages of visual processing; (ii) found outside visual cortex, in areas such as parietal and prefrontal cortices; and (iii) modulated by the attentional engagement of the observer, suggesting that it is neither automatic nor driven solely by stimulus properties. These findings strongly support a framework in which object expertise emerges from extensive interactions within and between the visual system and other cognitive systems, resulting in widespread, distributed patterns of expertise-related activity across the entire cortex. PMID:24409134

  1. Visual feature extraction and establishment of visual tags in the intelligent visual internet of things

    NASA Astrophysics Data System (ADS)

    Zhao, Yiqun; Wang, Zhihui

    2015-12-01

    The Internet of things (IOT) is a kind of intelligent network that can be used to locate, track, identify and supervise people and objects. One of the important core technologies of the intelligent visual internet of things (IVIOT) is the intelligent visual tag system. In this paper, we investigate visual feature extraction and the establishment of visual tags for human faces based on the ORL face database. First, we use the principal component analysis (PCA) algorithm for face feature extraction; then we adopt a support vector machine (SVM) for classification and face recognition; finally, we establish a visual tag for each classified face. We conducted an experiment on a group of face images, and the results show that the proposed algorithm performs well and can display the visual tags of objects conveniently.
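
    The PCA stage of such a pipeline can be sketched with a power iteration for the leading principal component; this is a minimal stand-in for the feature-extraction step only (in practice one would keep many components and pass the projections to an SVM from a library), and the toy data below are random points, not ORL faces:

```python
import random

def pca_top_component(data, iters=100):
    """Estimate the first principal component of `data` (a list of
    equal-length feature vectors, e.g. flattened face images) by power
    iteration, computing C v as X^T (X v) so the covariance matrix is
    never formed explicitly."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    x = [[row[j] - mean[j] for j in range(d)] for row in data]
    v = [1.0] * d
    for _ in range(iters):
        proj = [sum(r[j] * v[j] for j in range(d)) for r in x]          # X v
        w = [sum(proj[i] * x[i][j] for i in range(n)) for j in range(d)]  # X^T (X v)
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    return v

# Toy 2-D data whose variance is dominated by the first coordinate:
random.seed(0)
pts = [[random.gauss(0, 5), random.gauss(0, 0.1)] for _ in range(200)]
pc1 = pca_top_component(pts)
print([round(abs(c), 2) for c in pc1])  # close to [1.0, 0.0]
```

    Projecting each image onto the top components yields the low-dimensional feature vectors that the SVM classifier would then be trained on.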

  2. Visual Agnosia for Line Drawings and Silhouettes without Apparent Impairment of Real-Object Recognition: A Case Report

    PubMed Central

    Hiraoka, Kotaro; Suzuki, Kyoko; Hirayama, Kazumi; Mori, Etsuro

    2009-01-01

    We report on a patient with visual agnosia for line drawings and silhouette pictures following cerebral infarction in the region of the right posterior cerebral artery. The patient retained the ability to recognize real objects and their photographs, and could precisely copy line drawings of objects that she could not name. This case report highlights the importance of clinicians and researchers paying special attention to avoid overlooking agnosia in such cases. The factors that lead to problems in the identification of stimuli other than real objects in agnosic cases are discussed. PMID:19996516

  3. Visual agnosia for line drawings and silhouettes without apparent impairment of real-object recognition: a case report.

    PubMed

    Hiraoka, Kotaro; Suzuki, Kyoko; Hirayama, Kazumi; Mori, Etsuro

    2009-01-01

    We report on a patient with visual agnosia for line drawings and silhouette pictures following cerebral infarction in the region of the right posterior cerebral artery. The patient retained the ability to recognize real objects and their photographs, and could precisely copy line drawings of objects that she could not name. This case report highlights the importance of clinicians and researchers paying special attention to avoid overlooking agnosia in such cases. The factors that lead to problems in the identification of stimuli other than real objects in agnosic cases are discussed.

  4. Evaluating structural pattern recognition for handwritten math via primitive label graphs

    NASA Astrophysics Data System (ADS)

    Zanibbi, Richard; Mouchère, Harold; Viard-Gaudin, Christian

    2013-01-01

    Currently, structural pattern recognizer evaluations compare graphs of detected structure to target structures (i.e. ground truth) using recognition rates, recall and precision for object segmentation, classification and relationships. In document recognition, these target objects (e.g. symbols) are frequently comprised of multiple primitives (e.g. connected components, or strokes for online handwritten data), but current metrics do not characterize errors at the primitive level, from which object-level structure is obtained. Primitive label graphs are directed graphs defined over primitives and primitive pairs. We define new metrics obtained by Hamming distances over label graphs, which allow classification, segmentation and parsing errors to be characterized separately, or using a single measure. Recall and precision for detected objects may also be computed directly from label graphs. We illustrate the new metrics by comparing a new primitive-level evaluation to the symbol-level evaluation performed for the CROHME 2012 handwritten math recognition competition. A Python-based set of utilities for evaluating, visualizing and translating label graphs is publicly available.
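
    The primitive-level metric described above can be illustrated with a toy encoding. The dict-based representation and the label names (SUP, RIGHT) below are assumptions for illustration, not the format used by the CROHME evaluation tools; the key point is that node-label and edge-label disagreements are counted separately, so classification and relationship errors can be reported apart or summed into a single Hamming distance.

```python
# Toy primitive label graph: node labels keyed by primitive (stroke)
# id, edge labels keyed by ordered primitive pairs. Representation
# and label names are illustrative only.

def label_graph_hamming(g1, g2):
    """Count node-label and edge-label disagreements separately.

    Assumes both graphs are defined over the same primitive set;
    a missing edge is treated as the 'no relation' label "_".
    """
    node_errors = sum(1 for p in g1["nodes"]
                      if g1["nodes"][p] != g2["nodes"].get(p))
    pairs = set(g1["edges"]) | set(g2["edges"])
    edge_errors = sum(1 for e in pairs
                      if g1["edges"].get(e, "_") != g2["edges"].get(e, "_"))
    return node_errors, edge_errors

# Ground truth vs. recognizer output: stroke s3 is misclassified
# and one superscript relation is missed in the detected graph.
truth = {"nodes": {"s1": "x", "s2": "2", "s3": "+"},
         "edges": {("s1", "s2"): "SUP", ("s2", "s3"): "RIGHT"}}
detected = {"nodes": {"s1": "x", "s2": "2", "s3": "t"},
            "edges": {("s1", "s2"): "RIGHT", ("s2", "s3"): "RIGHT"}}
node_err, edge_err = label_graph_hamming(truth, detected)
# node_err counts classification errors, edge_err relationship errors
```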

  5. View-Based Models of 3D Object Recognition and Class-Specific Invariance

    DTIC Science & Technology

    1994-04-01

    [OCR fragments from a scanned report. Recoverable content: a discussion of features that underlie recognition of geon-like components (see Edelman, 1991, and Biederman, 1987); Equation (3), a weighted distance for view matching, ||x - t_a||_W^2 = (x - t_a)^T W^T W (x - t_a); and reference-list entries including I. Biederman, "Recognition-by-components: a theory of human image understanding," Psychol. Review, 94:115-147, 1987, and B. Olshausen, C. Anderson, and D. Van Essen, "A neural model of visual attention and invariant pattern recognition."]

  6. The Limits of Shape Recognition following Late Emergence from Blindness.

    PubMed

    McKyton, Ayelet; Ben-Zion, Itay; Doron, Ravid; Zohary, Ehud

    2015-09-21

    Visual object recognition develops during the first years of life. But what if one is deprived of vision during early post-natal development? Shape information is extracted using both low-level cues (e.g., intensity- or color-based contours) and more complex algorithms that are largely based on inference assumptions (e.g., illumination is from above, objects are often partially occluded). Previous studies, testing visual acuity using a 2D shape-identification task (Lea symbols), indicate that contour-based shape recognition can improve with visual experience, even after years of visual deprivation from birth. We hypothesized that this may generalize to other low-level cues (shape, size, and color), but not to mid-level functions (e.g., 3D shape from shading) that might require prior visual knowledge. To that end, we studied a unique group of subjects in Ethiopia that suffered from an early manifestation of dense bilateral cataracts and were surgically treated only years later. Our results suggest that the newly sighted rapidly acquire the ability to recognize an odd element within an array, on the basis of color, size, or shape differences. However, they are generally unable to find the odd shape on the basis of illusory contours, shading, or occlusion relationships. Little recovery of these mid-level functions is seen within 1 year post-operation. We find that visual performance using low-level cues is relatively robust to prolonged deprivation from birth. However, the use of pictorial depth cues to infer 3D structure from the 2D retinal image is highly susceptible to early and prolonged visual deprivation. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. Shape Recognition in Infancy: Visual Integration of Sequential Information.

    ERIC Educational Resources Information Center

    Rose, Susan A

    1988-01-01

    Investigated infants' integration of visual information across space and time. In four experiments, infants aged 12 months and 6 months viewed objects after watching light trace similar and dissimilar shapes. Infants looked longer at novel shapes, although six-month-olds did not recognize figures taking more than 10 seconds to trace. One-year-old…

  8. The Role of Visual Experience on the Representation and Updating of Novel Haptic Scenes

    ERIC Educational Resources Information Center

    Pasqualotto, Achille; Newell, Fiona N.

    2007-01-01

    We investigated the role of visual experience on the spatial representation and updating of haptic scenes by comparing recognition performance across sighted, congenitally and late blind participants. We first established that spatial updating occurs in sighted individuals to haptic scenes of novel objects. All participants were required to…

  9. 3D visual mechanism by neural networkings

    NASA Astrophysics Data System (ADS)

    Sugiyama, Shigeki

    2007-04-01

    Some computer vision systems are commercially available, but they remain far from everyday real-world use, whether for security monitoring or for recognizing a target object's behaviour. Sensing such surroundings requires recognizing detailed descriptions of an object, such as the distance to the object, its detailed figure, and its edges, for which present recognition systems offer no clear mechanism. To address this, this paper studies how a pair of human eyes recognizes distance, object edges, and objects themselves, in order to extract the basic essences of vision mechanisms. These basic mechanisms of object recognition are then simplified and extended logically for application to a computer vision system. Some of the results of these studies are presented in this paper.

  10. An ERP study of recognition memory for concrete and abstract pictures in school-aged children

    PubMed Central

    Boucher, Olivier; Chouinard-Leclaire, Christine; Muckle, Gina; Westerlund, Alissa; Burden, Matthew J.; Jacobson, Sandra W.; Jacobson, Joseph L.

    2016-01-01

    Recognition memory for concrete, nameable pictures is typically faster and more accurate than for abstract pictures. A dual-coding account for these findings suggests that concrete pictures are processed into verbal and image codes, whereas abstract pictures are encoded in image codes only. Recognition memory relies on two successive and distinct processes, namely familiarity and recollection. Whether these two processes are similarly or differently affected by stimulus concreteness remains unknown. This study examined the effect of picture concreteness on visual recognition memory processes using event-related potentials (ERPs). In a sample of children involved in a longitudinal study, participants (N = 96; mean age = 11.3 years) were assessed on a continuous visual recognition memory task in which half the pictures were easily nameable, everyday concrete objects, and the other half were three-dimensional abstract, sculpture-like objects. Behavioral performance and ERP correlates of familiarity and recollection (respectively, the FN400 and P600 repetition effects) were measured. Behavioral results indicated faster and more accurate identification of concrete pictures as “new” or “old” (i.e., previously displayed) compared to abstract pictures. ERPs were characterised by a larger repetition effect, on the P600 amplitude, for concrete than for abstract images, suggesting a graded recollection process dependent on the type of material to be recollected. Topographic differences were observed within the FN400 latency interval, especially over anterior-inferior electrodes, with the repetition effect more pronounced and localized over the left hemisphere for concrete stimuli, potentially reflecting different neural processes underlying early processing of verbal/semantic and visual material in memory. PMID:27329352

  11. Object recognition with severe spatial deficits in Williams syndrome: sparing and breakdown.

    PubMed

    Landau, Barbara; Hoffman, James E; Kurz, Nicole

    2006-07-01

    Williams syndrome (WS) is a rare genetic disorder that results in severe visual-spatial cognitive deficits coupled with relative sparing in language, face recognition, and certain aspects of motion processing. Here, we look for evidence for sparing or impairment in another cognitive system: object recognition. Children with WS, normal mental-age (MA) and chronological age-matched (CA) children, and normal adults viewed pictures of a large range of objects briefly presented under various conditions of degradation, including canonical and unusual orientations, and clear or blurred contours. Objects were shown as either full-color views (Experiment 1) or line drawings (Experiment 2). Across both experiments, WS and MA children performed similarly in all conditions, while CA children performed better than both the WS and MA groups with unusual views. This advantage, however, was eliminated when images were also blurred. The error types and relative difficulty of different objects were similar across all participant groups. The results indicate selective sparing of basic mechanisms of object recognition in WS, together with developmental delay or arrest in recognition of objects from unusual viewpoints. These findings are consistent with the growing literature on brain abnormalities in WS, which points to selective impairment in the parietal areas of the brain. As a whole, the results lend further support to the growing literature on the functional separability of object recognition mechanisms from other spatial functions, and raise intriguing questions about the link between genetic deficits and cognition.

  12. Visual skills in airport-security screening.

    PubMed

    McCarley, Jason S; Kramer, Arthur F; Wickens, Christopher D; Vidoni, Eric D; Boot, Walter R

    2004-05-01

    An experiment examined visual performance in a simulated luggage-screening task. Observers participated in five sessions of a task requiring them to search for knives hidden in x-ray images of cluttered bags. Sensitivity and response times improved reliably as a result of practice. Eye movement data revealed that sensitivity increases were produced entirely by changes in observers' ability to recognize target objects, and not by changes in the effectiveness of visual scanning. Moreover, recognition skills were in part stimulus-specific, such that performance was degraded by the introduction of unfamiliar target objects. Implications for screener training are discussed.

  13. Recognition memory strength is predicted by pupillary responses at encoding while fixation patterns distinguish recollection from familiarity.

    PubMed

    Kafkas, Alexandros; Montaldi, Daniela

    2011-10-01

    Thirty-five healthy participants incidentally encoded a set of man-made and natural object pictures, while their pupil response and eye movements were recorded. At retrieval, studied and new stimuli were rated as novel, familiar (strong, moderate, or weak), or recollected. We found that both pupil response and fixation patterns at encoding predict later recognition memory strength. The extent of pupillary response accompanying incidental encoding was found to be predictive of subsequent memory. In addition, the number of fixations was also predictive of later recognition memory strength, suggesting that the accumulation of greater visual detail, even for single objects, is critical for the creation of a strong memory. Moreover, fixation patterns at encoding distinguished between recollection and familiarity at retrieval, with more dispersed fixations predicting familiarity and more clustered fixations predicting recollection. These data reveal close links between the autonomic control of pupil responses and eye movement patterns on the one hand and memory encoding on the other. Moreover, the data illustrate quantitative as well as qualitative differences in the incidental visual processing of stimuli, which are differentially predictive of the strength and the kind of memory experienced at recognition.

  14. Similarity-Based Fusion of MEG and fMRI Reveals Spatio-Temporal Dynamics in Human Cortex During Visual Object Recognition

    PubMed Central

    Cichy, Radoslaw Martin; Pantazis, Dimitrios; Oliva, Aude

    2016-01-01

    Every human cognitive function, such as visual object recognition, is realized in a complex spatio-temporal activity pattern in the brain. Current brain imaging techniques in isolation cannot resolve the brain's spatio-temporal dynamics, because they provide either high spatial or temporal resolution but not both. To overcome this limitation, we developed an integration approach that uses representational similarities to combine measurements of magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI) to yield a spatially and temporally integrated characterization of neuronal activation. Applying this approach to 2 independent MEG–fMRI data sets, we observed that neural activity first emerged in the occipital pole at 50–80 ms, before spreading rapidly and progressively in the anterior direction along the ventral and dorsal visual streams. Further region-of-interest analyses established that dorsal and ventral regions showed MEG–fMRI correspondence in representations later than early visual cortex. Together, these results provide a novel and comprehensive, spatio-temporally resolved view of the rapid neural dynamics during the first few hundred milliseconds of object vision. They further demonstrate the feasibility of spatially unbiased representational similarity-based fusion of MEG and fMRI, promising new insights into how the brain computes complex cognitive functions. PMID:27235099
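
    The fusion idea lends itself to a compact sketch: for each MEG time point, correlate the MEG representational dissimilarity matrix (RDM) with the RDM of an fMRI region of interest, yielding a time course of when that region's representation emerges. The 3x3 RDMs and time points below are toy values, and plain Pearson correlation stands in for whatever similarity statistic the authors actually used.

```python
# Toy similarity-based MEG-fMRI fusion: correlate each MEG RDM
# (one per time point, in ms) with one fMRI region's RDM, using
# the RDM upper triangles (the matrices are symmetric).

def upper_triangle(rdm):
    n = len(rdm)
    return [rdm[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(xs, ys):
    """Pearson correlation of two equal-length vectors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Toy RDMs over three stimulus conditions.
fmri_rdm = [[0, 1.0, 0.2], [1.0, 0, 0.9], [0.2, 0.9, 0]]
meg_rdms = {50: [[0, 0.9, 0.3], [0.9, 0, 0.8], [0.3, 0.8, 0]],
            200: [[0, 0.1, 0.9], [0.1, 0, 0.2], [0.9, 0.2, 0]]}
fusion = {t: pearson(upper_triangle(m), upper_triangle(fmri_rdm))
          for t, m in meg_rdms.items()}
# fusion[t] traces how well this region matches the MEG signal over time
```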

  15. Evidence for a Limited-Cascading Account of Written Word Naming

    ERIC Educational Resources Information Center

    Bonin, Patrick; Roux, Sebastien; Barry, Christopher; Canell, Laura

    2012-01-01

    We address the issue of how information flows within the written word production system by examining written object-naming latencies. We report 4 experiments in which we manipulate variables assumed to have their primary impact at the level of object recognition (e.g., quality of visual presentation of pictured objects), at the level of semantic…

  16. Enhanced learning of natural visual sequences in newborn chicks.

    PubMed

    Wood, Justin N; Prasad, Aditya; Goldman, Jason G; Wood, Samantha M W

    2016-07-01

    To what extent are newborn brains designed to operate over natural visual input? To address this question, we used a high-throughput controlled-rearing method to examine whether newborn chicks (Gallus gallus) show enhanced learning of natural visual sequences at the onset of vision. We took the same set of images and grouped them into either natural sequences (i.e., sequences showing different viewpoints of the same real-world object) or unnatural sequences (i.e., sequences showing different images of different real-world objects). When raised in virtual worlds containing natural sequences, newborn chicks developed the ability to recognize familiar images of objects. Conversely, when raised in virtual worlds containing unnatural sequences, newborn chicks' object recognition abilities were severely impaired. In fact, the majority of the chicks raised with the unnatural sequences failed to recognize familiar images of objects despite acquiring over 100 h of visual experience with those images. Thus, newborn chicks show enhanced learning of natural visual sequences at the onset of vision. These results indicate that newborn brains are designed to operate over natural visual input.

  17. On the three-quarter view advantage of familiar object recognition.

    PubMed

    Nonose, Kohei; Niimi, Ryosuke; Yokosawa, Kazuhiko

    2016-11-01

    A three-quarter view, i.e., an oblique view, of familiar objects often leads to a higher subjective goodness rating when compared with other orientations. What is the source of the high goodness for oblique views? First, we confirmed that object recognition performance was also best for oblique views around 30°, even when the foreshortening disadvantage of front- and side-views was minimized (Experiments 1 and 2). In Experiment 3, we measured subjective ratings of view goodness and two possible determinants of view goodness: familiarity of view, and subjective impression of three-dimensionality. Three-dimensionality was measured as the subjective saliency of visual depth information. The oblique views were rated best, most familiar, and as approximating greatest three-dimensionality on average; however, the cluster analyses showed that the "best" orientation systematically varied among objects. We found three clusters of objects: front-preferred objects, oblique-preferred objects, and side-preferred objects. Interestingly, recognition performance and the three-dimensionality rating were higher for oblique views irrespective of the clusters. It appears that recognition efficiency is not the major source of the three-quarter view advantage. There are multiple determinants and variability among objects. This study suggests that the classical idea that a canonical view has a unique advantage in object perception requires further discussion.

  18. A depictive neural model for the representation of motion verbs.

    PubMed

    Rao, Sunil; Aleksander, Igor

    2011-11-01

    In this paper, we present a depictive neural model for the representation of motion verb semantics in neural models of visual awareness. The problem of modelling motion verb representation is shown to be one of function application, mapping a set of given input variables defining the moving object and the path of motion to a defined output outcome in the motion recognition context. The particular function-applicative implementation and consequent recognition model design presented are seen as arising from a noun-adjective recognition model enabling the recognition of colour adjectives as applied to a set of shapes representing objects to be recognised. The presence of such a function application scheme and a separately implemented position identification and path labelling scheme are accordingly shown to be the primitives required to enable the design and construction of a composite depictive motion verb recognition scheme. Extensions to the presented design to enable the representation of transitive verbs are also discussed.

  19. Combining heterogenous features for 3D hand-held object recognition

    NASA Astrophysics Data System (ADS)

    Lv, Xiong; Wang, Shuang; Li, Xiangyang; Jiang, Shuqiang

    2014-10-01

    Object recognition has wide applications in human-machine interaction and multimedia retrieval. However, due to visual polysemy and concept polymorphism, obtaining reliable recognition results from 2D images remains a great challenge. Recently, with the emergence and easy availability of RGB-D equipment such as the Kinect, this challenge can be relieved because the depth channel brings more information. A special and important case of object recognition is hand-held object recognition, as the hand is a direct and natural means of both human-human and human-machine interaction. In this paper, we study 3D object recognition by combining heterogeneous features with different modalities and extraction techniques. Hand-crafted features preserve low-level information such as shape and color, but are weaker at representing high-level semantic information than automatically learned features, especially deep features. Deep features have shown great advantages in large-scale recognition but are not always as robust to rotation or scale variance as hand-crafted features. We propose a method to combine hand-crafted point-cloud features with deep features learned from the RGB and depth channels. First, the hand-held object is segmented using depth cues and human skeleton information. Second, the extracted heterogeneous 3D features are combined at different stages using linear concatenation and multiple kernel learning (MKL). A trained model is then used to recognize 3D hand-held objects. Experimental results validate the effectiveness and generalization ability of the proposed method.
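
    The two fusion strategies named in this abstract can be sketched as follows. The feature vectors are toy values, and the kernel weights are fixed by hand; real multiple kernel learning would learn the weights from training data.

```python
# Sketch of two feature-fusion strategies: early fusion by linear
# concatenation, and a weighted sum of per-modality kernels in the
# spirit of multiple kernel learning (weights fixed here, not learned).

def concat(*feature_vectors):
    """Early fusion: join per-modality vectors into one vector."""
    fused = []
    for v in feature_vectors:
        fused.extend(v)
    return fused

def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))

def combined_kernel(xs, ys, weights):
    """Weighted sum of kernels, one per feature modality."""
    return sum(w * linear_kernel(x, y) for w, x, y in zip(weights, xs, ys))

shape_feat = [0.2, 0.8]        # hand-crafted point-cloud feature (toy)
deep_feat = [1.0, 0.0, 0.5]    # learned RGB/depth feature (toy)
fused = concat(shape_feat, deep_feat)   # 5-D early-fusion vector
k = combined_kernel([shape_feat, [1.0, 0.0]],
                    [[0.2, 0.8], [0.0, 1.0]],
                    weights=[0.5, 0.5])
```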

  20. Neural Encoding of Relative Position

    ERIC Educational Resources Information Center

    Hayworth, Kenneth J.; Lescroart, Mark D.; Biederman, Irving

    2011-01-01

    Late ventral visual areas generally consist of cells having a significant degree of translation invariance. Such a "bag of features" representation is useful for the recognition of individual objects; however, it seems unable to explain our ability to parse a scene into multiple objects and to understand their spatial relationships. We…

  1. Object Recognition and Random Image Structure Evolution

    ERIC Educational Resources Information Center

    Sadr, Javid; Sinha, Pawan

    2004-01-01

    We present a technique called Random Image Structure Evolution (RISE) for use in experimental investigations of high-level visual perception. Potential applications of RISE include the quantitative measurement of perceptual hysteresis and priming, the study of the neural substrates of object perception, and the assessment and detection of subtle…

  2. Graded effects in hierarchical figure-ground organization: reply to Peterson (1999).

    PubMed

    Vecera, S P; O'Reilly, R C

    2000-06-01

    An important issue in vision research concerns the order of visual processing. S. P. Vecera and R. C. O'Reilly (1998) presented an interactive, hierarchical model that placed figure-ground segregation prior to object recognition. M. A. Peterson (1999) critiqued this model, arguing that because it used ambiguous stimulus displays, figure-ground processing did not precede object processing. In the current article, the authors respond to Peterson's (1999) interpretation of ambiguity in the model and her interpretation of what it means for figure-ground processing to come before object recognition. The authors argue that complete stimulus ambiguity is not critical to the model and that figure-ground precedes object recognition architecturally in the model. The arguments are supported with additional simulation results and an experiment, demonstrating that top-down inputs can influence figure-ground organization in displays that contain stimulus cues.

  3. The Precategorical Nature of Visual Short-Term Memory

    ERIC Educational Resources Information Center

    Quinlan, Philip T.; Cohen, Dale J.

    2016-01-01

    We conducted a series of recognition experiments that assessed whether visual short-term memory (VSTM) is sensitive to shared category membership of to-be-remembered (tbr) images of common objects. In Experiment 1 some of the tbr items shared the same basic level category (e.g., hand axe): Such items were no better retained than others. In the…

  4. Toward a Unified Theory of Visual Area V4

    PubMed Central

    Roe, Anna W.; Chelazzi, Leonardo; Connor, Charles E.; Conway, Bevil R.; Fujita, Ichiro; Gallant, Jack L.; Lu, Haidong; Vanduffel, Wim

    2016-01-01

    Visual area V4 is a midtier cortical area in the ventral visual pathway. It is crucial for visual object recognition and has been a focus of many studies on visual attention. However, there is no unifying view of V4’s role in visual processing. Neither is there an understanding of how its role in feature processing interfaces with its role in visual attention. This review captures our current knowledge of V4, largely derived from electrophysiological and imaging studies in the macaque monkey. Based on recent discovery of functionally specific domains in V4, we propose that the unifying function of V4 circuitry is to enable selective extraction of specific functional domain-based networks, whether it be by bottom-up specification of object features or by top-down attentionally driven selection. PMID:22500626

  5. Forms Of Memory For Representation Of Visual Objects

    DTIC Science & Technology

    1991-02-14

    [OCR fragments from a scanned report. Recoverable content: a description system functions independently of the episodic memory system that is damaged in amnesia and supports explicit remembering; semantic and functional information about an object is preserved in the episodic system; a section on priming and recognition of depth-cued 3D objects predicts that depth cues enhance an object's distinctiveness in episodic memory, and that priming is robust for symmetric objects.]

  6. Reading laterally: the cerebral hemispheric use of spatial frequencies in visual word recognition.

    PubMed

    Tadros, Karine; Dupuis-Roy, Nicolas; Fiset, Daniel; Arguin, Martin; Gosselin, Frédéric

    2013-01-04

    It is generally accepted that the left hemisphere (LH) is more capable for reading than the right hemisphere (RH). Left hemifield presentations (initially processed by the RH) lead to a globally higher error rate, slower word identification, and a significantly stronger word length effect (i.e., slower reaction times for longer words). Because the visuo-perceptual mechanisms of the brain for word recognition are primarily localized in the LH (Cohen et al., 2003), it is possible that this part of the brain possesses better spatial frequency (SF) tuning for processing the visual properties of words than the RH. The main objective of this study is to determine the SF tuning functions of the LH and RH for word recognition. Each word image was randomly sampled in the SF domain using the SF bubbles method (Willenbockel et al., 2010) and was presented laterally to the left or right visual hemifield. As expected, the LH requires less visual information than the RH to reach the same level of performance, illustrating the well-known LH advantage for word recognition. Globally, the SF tuning of both hemispheres is similar. However, these seemingly identical tuning functions hide important differences. Most importantly, we argue that the RH requires higher SFs to identify longer words because of crowding.

  7. Enhanced recognition memory in grapheme-color synaesthesia for different categories of visual stimuli

    PubMed Central

    Ward, Jamie; Hovard, Peter; Jones, Alicia; Rothen, Nicolas

    2013-01-01

    Memory has been shown to be enhanced in grapheme-color synaesthesia, and this enhancement extends to certain visual stimuli (that don't induce synaesthesia) as well as stimuli comprised of graphemes (which do). Previous studies have used a variety of testing procedures to assess memory in synaesthesia (e.g., free recall, recognition, associative learning) making it hard to know the extent to which memory benefits are attributable to the stimulus properties themselves, the testing method, participant strategies, or some combination of these factors. In the first experiment, we use the same testing procedure (recognition memory) for a variety of stimuli (written words, non-words, scenes, and fractals) and also check which memorization strategies were used. We demonstrate that grapheme-color synaesthetes show enhanced memory across all these stimuli, but this is not found for a non-visual type of synaesthesia (lexical-gustatory). In the second experiment, the memory advantage for scenes is explored further by manipulating the properties of the old and new images (changing color, orientation, or object presence). Again, grapheme-color synaesthetes show a memory advantage for scenes across all manipulations. Although recognition memory is generally enhanced in this study, the largest effects were found for abstract visual images (fractals) and scenes for which color can be used to discriminate old/new status. PMID:24187542

  8. Enhanced recognition memory in grapheme-color synaesthesia for different categories of visual stimuli.

    PubMed

    Ward, Jamie; Hovard, Peter; Jones, Alicia; Rothen, Nicolas

    2013-01-01

    Memory has been shown to be enhanced in grapheme-color synaesthesia, and this enhancement extends to certain visual stimuli (that don't induce synaesthesia) as well as stimuli comprised of graphemes (which do). Previous studies have used a variety of testing procedures to assess memory in synaesthesia (e.g., free recall, recognition, associative learning) making it hard to know the extent to which memory benefits are attributable to the stimulus properties themselves, the testing method, participant strategies, or some combination of these factors. In the first experiment, we use the same testing procedure (recognition memory) for a variety of stimuli (written words, non-words, scenes, and fractals) and also check which memorization strategies were used. We demonstrate that grapheme-color synaesthetes show enhanced memory across all these stimuli, but this is not found for a non-visual type of synaesthesia (lexical-gustatory). In the second experiment, the memory advantage for scenes is explored further by manipulating the properties of the old and new images (changing color, orientation, or object presence). Again, grapheme-color synaesthetes show a memory advantage for scenes across all manipulations. Although recognition memory is generally enhanced in this study, the largest effects were found for abstract visual images (fractals) and scenes for which color can be used to discriminate old/new status.

  9. A steady state visually evoked potential investigation of memory and ageing.

    PubMed

    Macpherson, Helen; Pipingas, Andrew; Silberstein, Richard

    2009-04-01

    Old age is generally accompanied by a decline in memory performance. Specifically, neuroimaging and electrophysiological studies have revealed that there are age-related changes in the neural correlates of episodic and working memory. This study investigated age-associated changes in the steady state visually evoked potential (SSVEP) amplitude and latency associated with memory performance. Participants were 15 older (59-67 years) and 14 younger (20-30 years) adults who performed an object working memory (OWM) task and a contextual recognition memory (CRM) task, whilst the SSVEP was recorded from 64 electrode sites. Retention of a single object in the low demand OWM task was characterised by smaller frontal SSVEP amplitude and latency differences in older adults than in younger adults, indicative of an age-associated reduction in neural processes. Recognition of visual images in the more difficult CRM task was accompanied by larger, more sustained SSVEP amplitude and latency decreases over temporal parietal regions in older adults. In contrast, the more transient, frontally mediated pattern of activity demonstrated by younger adults suggests that younger and older adults utilize different neural resources to perform recognition judgements. The results provide support for compensatory processes in the aging brain; at lower task demands, older adults demonstrate reduced neural activity, whereas at greater task demands neural activity is increased.

  10. How does the brain solve visual object recognition?

    PubMed Central

    Zoccolan, Davide; Rust, Nicole C.

    2012-01-01

    Mounting evidence suggests that “core object recognition,” the ability to rapidly recognize objects despite substantial appearance variation, is solved in the brain via a cascade of reflexive, largely feedforward computations that culminate in a powerful neuronal representation in the inferior temporal cortex. However, the algorithm that produces this solution remains little-understood. Here we review evidence ranging from individual neurons, to neuronal populations, to behavior, to computational models. We propose that understanding this algorithm will require using neuronal and psychophysical data to sift through many computational models, each based on building blocks of small, canonical sub-networks with a common functional goal. PMID:22325196

  11. Role of fusiform and anterior temporal cortical areas in facial recognition.

    PubMed

    Nasr, Shahin; Tootell, Roger B H

    2012-11-15

    Recent fMRI studies suggest that cortical face processing extends well beyond the fusiform face area (FFA), including unspecified portions of the anterior temporal lobe. However, the exact location of such anterior temporal region(s), and their role during active face recognition, remain unclear. Here we demonstrate that (in addition to FFA) a small bilateral site in the anterior tip of the collateral sulcus ('AT'; the anterior temporal face patch) is selectively activated during recognition of faces but not houses (a non-face object). In contrast to the psychophysical prediction that inverted and contrast-reversed faces are processed like other non-face objects, both FFA and AT (but not other visual areas) were also activated during recognition of inverted and contrast-reversed faces. However, response accuracy was better correlated with recognition-driven activity in AT, compared to FFA. These data support a segregated, hierarchical model of face recognition processing, extending to the anterior temporal cortex. Copyright © 2012 Elsevier Inc. All rights reserved.

  12. Role of Fusiform and Anterior Temporal Cortical Areas in Facial Recognition

    PubMed Central

    Nasr, Shahin; Tootell, Roger BH

    2012-01-01

    Recent fMRI studies suggest that cortical face processing extends well beyond the fusiform face area (FFA), including unspecified portions of the anterior temporal lobe. However, the exact location of such anterior temporal region(s), and their role during active face recognition, remain unclear. Here we demonstrate that (in addition to FFA) a small bilateral site in the anterior tip of the collateral sulcus (‘AT’; the anterior temporal face patch) is selectively activated during recognition of faces but not houses (a non-face object). In contrast to the psychophysical prediction that inverted and contrast-reversed faces are processed like other non-face objects, both FFA and AT (but not other visual areas) were also activated during recognition of inverted and contrast-reversed faces. However, response accuracy was better correlated with recognition-driven activity in AT, compared to FFA. These data support a segregated, hierarchical model of face recognition processing, extending to the anterior temporal cortex. PMID:23034518

  13. Modafinil improves methamphetamine-induced object recognition deficits and restores prefrontal cortex ERK signaling in mice.

    PubMed

    González, Betina; Raineri, Mariana; Cadet, Jean Lud; García-Rill, Edgar; Urbano, Francisco J; Bisagno, Veronica

    2014-12-01

    Chronic use of methamphetamine (METH) leads to long-lasting cognitive dysfunction in humans and in animal models. Modafinil is a wake-promoting compound approved for the treatment of sleeping disorders. It is also prescribed off label to treat METH dependence. In the present study, we investigated whether modafinil could improve cognitive deficits induced by sub-chronic METH treatment in mice by measuring visual retention in a Novel Object Recognition (NOR) task. After sub-chronic METH treatment (1 mg/kg, once a day for 7 days), mice performed the NOR task, which consisted of habituation to the object recognition arena (5 min a day, 3 consecutive days), training session (2 equal objects, 10 min, day 4), and a retention session (1 novel object, 5 min, day 5). One hour before the training session, mice were given a single dose of modafinil (30 or 90 mg/kg). METH-treated mice showed impairments in visual memory retention, evidenced by equal preference of familiar and novel objects during the retention session. The lower dose of modafinil (30 mg/kg) had no effect on visual retention scores in METH-treated mice, while the higher dose (90 mg/kg) rescued visual memory retention to control values. We also measured extracellular signal-regulated kinase (ERK) phosphorylation in medial prefrontal cortex (mPFC), hippocampus, and nucleus accumbens (NAc) of METH- and vehicle-treated mice that received modafinil 1 h before exposure to novel objects in the training session, compared to mice placed in the arena without objects. Elevated ERK phosphorylation was found in the mPFC of vehicle-treated mice, but not in METH-treated mice, exposed to objects. The lower dose of modafinil had no effect on ERK phosphorylation in METH-treated mice, while 90 mg/kg modafinil treatment restored the ERK phosphorylation induced by novelty in METH-treated mice to values comparable to controls. 
We found neither a novelty nor treatment effect on ERK phosphorylation in hippocampus or NAc of vehicle- and METH-treated mice receiving acute 90 mg/kg modafinil treatment. Our results showed a palliative role of modafinil against METH-induced visual cognitive impairments, possibly by normalizing ERK signaling pathways in mPFC. Modafinil may be a valuable pharmacological tool for the treatment of cognitive deficits observed in human METH abusers as well as in other neuropsychiatric conditions. This article is part of the Special Issue entitled 'CNS Stimulants'. Copyright © 2014 Elsevier Ltd. All rights reserved.

  14. The Timing of Visual Object Categorization

    PubMed Central

    Mack, Michael L.; Palmeri, Thomas J.

    2011-01-01

    An object can be categorized at different levels of abstraction: as natural or man-made, animal or plant, bird or dog, or as a Northern Cardinal or Pyrrhuloxia. There has been growing interest in understanding how quickly categorizations at different levels are made and how the timing of those perceptual decisions changes with experience. We specifically contrast two perspectives on the timing of object categorization at different levels of abstraction. By one account, the relative timing implies a relative timing of stages of visual processing that are tied to particular levels of object categorization: Fast categorizations are fast because they precede other categorizations within the visual processing hierarchy. By another account, the relative timing reflects when perceptual features are available over time and the quality of perceptual evidence used to drive a perceptual decision process: Fast simply means fast, it does not mean first. Understanding the short-term and long-term temporal dynamics of object categorizations is key to developing computational models of visual object recognition. We briefly review a number of models of object categorization and outline how they explain the timing of visual object categorization at different levels of abstraction. PMID:21811480
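    The second account contrasted above ("fast simply means fast, it does not mean first") can be illustrated with a toy evidence-accumulation simulation. This is a sketch, not code from the reviewed models: the drift values, threshold, and noise level are illustrative assumptions. Under a diffusion-style decision process, a categorization supported by stronger perceptual evidence (higher drift) finishes sooner without occupying an earlier stage of visual processing.

```python
import random

def simulate_rt(drift, threshold=1.0, noise=0.3, dt=0.01, seed=0, n_trials=500):
    """Mean decision time of a one-dimensional diffusion process that
    accumulates noisy evidence until it reaches +threshold."""
    rng = random.Random(seed)
    times = []
    for _ in range(n_trials):
        x, t = 0.0, 0.0
        while x < threshold:
            x += drift * dt + noise * (dt ** 0.5) * rng.gauss(0.0, 1.0)
            x = max(x, 0.0)  # reflecting lower bound keeps every trial finite
            t += dt
        times.append(t)
    return sum(times) / len(times)

# Stronger perceptual evidence (e.g., a basic-level category such as "bird")
# means a higher drift rate and therefore a faster decision; weaker evidence
# (e.g., a subordinate category such as "Northern Cardinal") is slower.
rt_strong_evidence = simulate_rt(drift=1.5)
rt_weak_evidence = simulate_rt(drift=0.5)
```

In this sketch, both categorizations use the same single decision mechanism; only the quality of evidence differs, which is the point of the "fast means fast" account.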

  15. Explaining seeing? Disentangling qualia from perceptual organization.

    PubMed

    Ibáñez, Agustin; Bekinschtein, Tristan

    2010-09-01

    Visual perception and integration seem to play an essential role in our conscious phenomenology. Relatively local neural processing of a reentrant nature may explain several visual integration processes (feature binding or figure-ground segregation, object recognition, inference, competition), even without attention or cognitive control. Based on the above statements, should the neural signatures of visual integration (via reentrant process) be non-reportable phenomenological qualia? We argue that qualia are not required to understand this perceptual organization.

  16. Semantic congruence affects hippocampal response to repetition of visual associations.

    PubMed

    McAndrews, Mary Pat; Girard, Todd A; Wilkins, Leanne K; McCormick, Cornelia

    2016-09-01

    Recent research has shown complementary engagement of the hippocampus and medial prefrontal cortex (mPFC) in encoding and retrieving associations based on pre-existing or experimentally-induced schemas, such that the latter supports schema-congruent information whereas the former is more engaged for incongruent or novel associations. Here, we attempted to explore some of the boundary conditions in the relative involvement of those structures in short-term memory for visual associations. The current literature is based primarily on intentional evaluation of schema-target congruence and on study-test paradigms with relatively long delays between learning and retrieval. We used a continuous recognition paradigm to investigate hippocampal and mPFC activation to first and second presentations of scene-object pairs as a function of semantic congruence between the elements (e.g., beach-seashell versus schoolyard-lamp). All items were identical at first and second presentation, and the context scene, which was presented 500 ms prior to the appearance of the target object, was incidental to the task, which required a recognition response to the central target only. Very short lags (2-8 intervening stimuli) occurred between presentations. Encoding the targets with congruent contexts was associated with increased activation in visual cortical regions at initial presentation and faster response time at repetition, but we did not find enhanced activation in mPFC relative to incongruent stimuli at either presentation. We did observe enhanced activation in the right anterior hippocampus, as well as regions in visual and lateral temporal and frontal cortical regions, for the repetition of incongruent scene-object pairs. This pattern demonstrates rapid and incidental effects of schema processing in hippocampal, but not mPFC, engagement during continuous recognition. Copyright © 2016 Elsevier Ltd. All rights reserved.

  17. A Longitudinal Investigation of Visual Event-Related Potentials in the First Year of Life

    ERIC Educational Resources Information Center

    Webb, Sara J.; Long, Jeffrey D.; Nelson, Charles A.

    2005-01-01

    The goal of the current study was to assess general maturational changes in the ERP in the same sample of infants from 4 to 12 months of age. All participants were tested in two experimental manipulations at each age: a test of facial recognition and one of object recognition. Two sets of analyses were undertaken. First, growth curve modeling with…

  18. Temporal properties of material categorization and material rating: visual vs non-visual material features.

    PubMed

    Nagai, Takehiro; Matsushima, Toshiki; Koida, Kowa; Tani, Yusuke; Kitazaki, Michiteru; Nakauchi, Shigeki

    2015-10-01

    Humans can easily recognize the material categories of objects, such as glass, stone, and plastic, by sight. However, little is known about the kinds of surface quality features that contribute to such material class recognition. In this paper, we examine the relationship between perceptual surface features and material category discrimination performance for pictures of materials, focusing on temporal aspects, including reaction time and effects of stimulus duration. The stimuli were pictures of objects with an identical shape but made of different materials that could be categorized into seven classes (glass, plastic, metal, stone, wood, leather, and fabric). In a pre-experiment, observers rated the pictures on nine surface features, including visual (e.g., glossiness and transparency) and non-visual features (e.g., heaviness and warmness), on a 7-point scale. In the main experiments, observers judged whether two simultaneously presented pictures were classified as the same or different material category. Reaction times and effects of stimulus duration were measured. The results showed that visual feature ratings were correlated with material discrimination performance for short reaction times or short stimulus durations, while non-visual feature ratings were correlated only with performance for long reaction times or long stimulus durations. These results suggest that the mechanisms underlying visual and non-visual feature processing may differ in terms of processing time, although the cause is unclear. Visual surface features may mainly contribute to material recognition in daily life, while non-visual features may contribute only weakly, if at all. Copyright © 2014 Elsevier Ltd. All rights reserved.

  19. The roles of perceptual and conceptual information in face recognition.

    PubMed

    Schwartz, Linoy; Yovel, Galit

    2016-11-01

    The representation of familiar objects comprises perceptual information about their visual properties as well as the conceptual knowledge that we have about them. What is the relative contribution of perceptual and conceptual information to object recognition? Here, we examined this question by designing a face familiarization protocol during which participants were exposed either to rich perceptual information (viewing each face in different angles and illuminations) or to conceptual information (associating each face with a different name). Both conditions were compared with single-view faces presented with no labels. Recognition was tested on new images of the same identities to assess whether learning generated a view-invariant representation. Results showed better recognition of novel images of the learned identities following association of a face with a name label, but no enhancement following exposure to multiple face views. Whereas these findings may be consistent with the role of category learning in object recognition, face recognition was better for labeled faces only when faces were associated with person-related labels (name, occupation), but not with person-unrelated labels (object names or symbols). These findings suggest that association of meaningful conceptual information with an image shifts its representation from an image-based percept to a view-invariant concept. They further indicate that the role of conceptual information should be considered to account for the superior recognition that we have for familiar faces and objects. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  20. Learning object-to-class kernels for scene classification.

    PubMed

    Zhang, Lei; Zhen, Xiantong; Shao, Ling

    2014-08-01

    High-level image representations have drawn increasing attention in visual recognition, e.g., scene classification, since the invention of the object bank. The object bank represents an image as a response map of a large number of pretrained object detectors and has achieved superior performance for visual recognition. In this paper, based on the object bank representation, we propose the object-to-class (O2C) distances to model scene images. In particular, four variants of O2C distances are presented, and with the O2C distances, we can represent the images using the object bank by lower-dimensional but more discriminative spaces, called distance spaces, which are spanned by the O2C distances. Due to the explicit computation of O2C distances based on the object bank, the obtained representations can possess more semantic meanings. To combine the discriminant ability of the O2C distances to all scene classes, we further propose to kernelize the distance representation for the final classification. We have conducted extensive experiments on four benchmark data sets, UIUC-Sports, Scene-15, MIT Indoor, and Caltech-101, which demonstrate that the proposed approaches can significantly improve the original object bank approach and achieve the state-of-the-art performance.
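    The abstract does not spell out its four O2C variants, so the following is an illustration only: one plausible variant defines an image-to-class distance as the minimum distance from the image's object-bank response vector to a class's training vectors, stacks one such distance per class into the lower-dimensional "distance space," and kernelizes it (here with an assumed RBF kernel) for the final classifier. All names and numbers below are made up for the sketch.

```python
import math

def o2c_distance(x, class_reps):
    """One simple object-to-class (O2C) variant: the distance from an
    image's object-bank response vector x to a class is the minimum
    Euclidean distance to that class's training response vectors."""
    def euclid(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    return min(euclid(x, r) for r in class_reps)

def distance_representation(x, classes):
    """Project an image into the 'distance space': one O2C distance per
    scene class, a much lower dimension than the raw object-bank map."""
    return [o2c_distance(x, reps) for reps in classes]

def rbf_kernel(d1, d2, gamma=1.0):
    """Kernelize the distance representation for the final classification."""
    sq = sum((a - b) ** 2 for a, b in zip(d1, d2))
    return math.exp(-gamma * sq)

# Toy 2-D object-bank responses for two scene classes.
class_a = [[0.0, 0.0], [0.1, 0.0]]
class_b = [[1.0, 1.0], [0.9, 1.0]]
d = distance_representation([0.05, 0.0], [class_a, class_b])
```

An image near class A gets a small first coordinate and a large second one, so the distance space is discriminative even before kernelization.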

  1. Higher Level Visual Cortex Represents Retinotopic, Not Spatiotopic, Object Location

    PubMed Central

    Kanwisher, Nancy

    2012-01-01

    The crux of vision is to identify objects and determine their locations in the environment. Although initial visual representations are necessarily retinotopic (eye centered), interaction with the real world requires spatiotopic (absolute) location information. We asked whether higher level human visual cortex—important for stable object recognition and action—contains information about retinotopic and/or spatiotopic object position. Using functional magnetic resonance imaging multivariate pattern analysis techniques, we found information about both object category and object location in each of the ventral, dorsal, and early visual regions tested, replicating previous reports. By manipulating fixation position and stimulus position, we then tested whether these location representations were retinotopic or spatiotopic. Crucially, all location information was purely retinotopic. This pattern persisted when location information was irrelevant to the task, and even when spatiotopic (not retinotopic) stimulus position was explicitly emphasized. We also conducted a “searchlight” analysis across our entire scanned volume to explore additional cortex but again found predominantly retinotopic representations. The lack of explicit spatiotopic representations suggests that spatiotopic object position may instead be computed indirectly and continually reconstructed with each eye movement. Thus, despite our subjective impression that visual information is spatiotopic, even in higher level visual cortex, object location continues to be represented in retinotopic coordinates. PMID:22190434

  2. Pure associative tactile agnosia for the left hand: clinical and anatomo-functional correlations.

    PubMed

    Veronelli, Laura; Ginex, Valeria; Dinacci, Daria; Cappa, Stefano F; Corbo, Massimo

    2014-09-01

    Associative tactile agnosia (TA) is defined as the inability to associate information about object sensory properties derived through tactile modality with previously acquired knowledge about object identity. The impairment is often described after a lesion involving the parietal cortex (Caselli, 1997; Platz, 1996). We report the case of SA, a right-handed 61-year-old man affected by a first-ever right hemispheric hemorrhagic stroke. The neurological examination was normal, excluding major somaesthetic and motor impairment; brain magnetic resonance imaging (MRI) confirmed the presence of a right subacute hemorrhagic lesion limited to the post-central and supra-marginal gyri. A comprehensive neuropsychological evaluation detected a selective inability to name objects when handled with the left hand in the absence of other cognitive deficits. A series of experiments was conducted in order to assess each stage of tactile recognition processing using the same stimulus sets: materials, 3D geometrical shapes, real objects and letters. SA and seven matched controls underwent the same experimental tasks during four sessions on consecutive days. Tactile discrimination, recognition, pantomime, drawing after haptic exploration out of vision and tactile-visual matching abilities were assessed. In addition, we looked for the presence of a supra-modal impairment of spatial perception and of specific difficulties in programming exploratory movements during recognition. Tactile discrimination was intact for all the stimuli tested. In contrast, SA was able neither to recognize nor to pantomime real objects manipulated with the left hand out of vision, while he identified them with the right hand without hesitations. Tactile-visual matching was intact. Furthermore, SA was able to grossly reproduce the global shape in drawings but failed to extract details of objects after left-hand manipulation, and he could not identify objects after looking at his own drawings. 
This case confirms the existence of selective associative TA as a left hand-specific deficit in recognizing objects. This deficit is not related to spatial perception or to the programming of exploratory movements. The cross-modal transfer of information via visual perception permits the activation of a partially degraded image, which alone does not allow the proper recognition of the initial tactile stimulus. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. Reader error, object recognition, and visual search

    NASA Astrophysics Data System (ADS)

    Kundel, Harold L.

    2004-05-01

    Small abnormalities such as hairline fractures, lung nodules and breast tumors are missed by competent radiologists with sufficient frequency to make them a matter of concern to the medical community, not only because they lead to litigation but also because they delay patient care. It is very easy to attribute misses to incompetence or inattention. To do so may place an unjustified stigma on the radiologists involved and may allow other radiologists to continue a false optimism that it can never happen to them. This review presents some of the fundamentals of visual system function that are relevant to understanding the search for and the recognition of small targets embedded in complicated but meaningful backgrounds like chests and mammograms. It presents a model for visual search that postulates a pre-attentive global analysis of the retinal image followed by foveal checking fixations and eventually discovery scanning. The model is used to differentiate errors of search, recognition and decision making. The implications for computer aided diagnosis and for functional workstation design are discussed.

  4. Learning to Be (In)Variant: Combining Prior Knowledge and Experience to Infer Orientation Invariance in Object Recognition

    ERIC Educational Resources Information Center

    Austerweil, Joseph L.; Griffiths, Thomas L.; Palmer, Stephen E.

    2017-01-01

    How does the visual system recognize images of a novel object after a single observation despite possible variations in the viewpoint of that object relative to the observer? One possibility is comparing the image with a prototype for invariance over a relevant transformation set (e.g., translations and dilations). However, invariance over…

  5. Odors as effective retrieval cues for stressful episodes.

    PubMed

    Wiemers, Uta S; Sauvage, Magdalena M; Wolf, Oliver T

    2014-07-01

    Olfactory information seems to play a special role in memory due to the fast and direct processing of olfactory information in limbic areas like the amygdala and the hippocampus. This has led to the assumption that odors can serve as effective retrieval cues for autobiographic memories, especially emotional memories. The current study sought to investigate whether an olfactory cue can serve as an effective retrieval cue for memories of a stressful episode. A total of 95 participants were exposed to a psychosocial stressor or a well-matched but not stressful control condition. Visual objects were present during both conditions, either bound to the situation (central objects) or not (peripheral objects). Additionally, an ambient odor was present during both conditions. The next day, participants engaged in an unexpected object recognition task either under the influence of the same odor that was present during encoding (congruent odor) or of another odor (non-congruent odor). Results showed that stressed participants had better memory for all objects, and especially for central visual objects, if recognition took place under the influence of the congruent odor. An olfactory cue thus indeed seems to be an effective retrieval cue for stressful memories. Copyright © 2013 Elsevier Inc. All rights reserved.

  6. An ERP study of recognition memory for concrete and abstract pictures in school-aged children.

    PubMed

    Boucher, Olivier; Chouinard-Leclaire, Christine; Muckle, Gina; Westerlund, Alissa; Burden, Matthew J; Jacobson, Sandra W; Jacobson, Joseph L

    2016-08-01

    Recognition memory for concrete, nameable pictures is typically faster and more accurate than for abstract pictures. A dual-coding account for these findings suggests that concrete pictures are processed into verbal and image codes, whereas abstract pictures are encoded in image codes only. Recognition memory relies on two successive and distinct processes, namely familiarity and recollection. Whether these two processes are similarly or differently affected by stimulus concreteness remains unknown. This study examined the effect of picture concreteness on visual recognition memory processes using event-related potentials (ERPs). In a sample of children involved in a longitudinal study, participants (N = 96; mean age = 11.3 years) were assessed on a continuous visual recognition memory task in which half the pictures were easily nameable, everyday concrete objects, and the other half were three-dimensional abstract, sculpture-like objects. Behavioral performance and ERP correlates of familiarity and recollection (respectively, the FN400 and P600 repetition effects) were measured. Behavioral results indicated faster and more accurate identification of concrete pictures as "new" or "old" (i.e., previously displayed) compared to abstract pictures. ERPs were characterized by a larger repetition effect, on the P600 amplitude, for concrete than for abstract images, suggesting a graded recollection process dependent on the type of material to be recollected. Topographic differences were observed within the FN400 latency interval, especially over anterior-inferior electrodes, with the repetition effect more pronounced and localized over the left hemisphere for concrete stimuli, potentially reflecting different neural processes underlying early processing of verbal/semantic and visual material in memory. Copyright © 2016 Elsevier B.V. All rights reserved.

  7. Further evidence that amygdala and hippocampus contribute equally to recognition memory.

    PubMed

    Saunders, R C; Murray, E A; Mishkin, M

    1984-01-01

    The medial temporal neuropathology found in an amnesic neurosurgical patient [17] was simulated in monkeys in an attempt to determine whether the patient's mnemonic disorder, which had been ascribed to bilateral hippocampal destruction, may have also been due in part to unilateral amygdaloid removal. For this purpose, monkeys were prepared with bilateral hippocampectomy combined with unilateral amygdalectomy, and (as a control) bilateral amygdalectomy combined with unilateral hippocampectomy. The animals were trained both before and after surgery on a one-trial visual recognition task requiring memory of single objects for 10 sec each and then given a postoperative performance test in which their one-trial recognition ability was taxed with longer delays (up to 2 min) and longer lists (up to 10 objects). The two groups, which did not differ reliably at any stage, obtained average scores of 75% and 80%, respectively, on the performance test. Comparison with the results of an earlier experiment [8] indicates that this performance level lies approximately midway between that of monkeys with amygdaloid or hippocampal removals alone (91%) and that of monkeys with combined amygdalo-hippocampal removals (60%). The results point to a direct quantitative relationship between degree of recognition impairment and amount of conjoint damage to the amygdala and hippocampus irrespective of the specific structure involved. Evidence from neurosurgical cases tested in visual recognition [21] indicates that the same conclusion may apply to man.

  8. A study of perceptual analysis in a high-level autistic subject with exceptional graphic abilities.

    PubMed

    Mottron, L; Belleville, S

    1993-11-01

    We report here the case study of a patient (E.C.) with Asperger syndrome, or autism with quasinormal intelligence, who shows an outstanding ability for three-dimensional drawing of inanimate objects (savant syndrome). An assessment of the subsystems proposed in recent models of object recognition evidenced intact perceptual analysis and identification. The initial (primal sketch), viewer-centered (2-1/2-D), and object-centered (3-D) representations, as well as the recognition and naming levels, were all functional. In contrast, E.C.'s pattern of performance across three different types of tasks converges to suggest an anomaly in the hierarchical organization of the local and global parts of a figure: a local interference effect in incongruent hierarchical visual stimuli, a deficit in relating local parts to global form information in impossible figures, and an absence of feature-grouping in graphic recall. The results are discussed in relation to normal visual perception and to current accounts of the savant syndrome in autism.

  9. Visual and visuomotor processing of hands and tools as a case study of cross talk between the dorsal and ventral streams.

    PubMed

    Almeida, Jorge; Amaral, Lénia; Garcea, Frank E; Aguiar de Sousa, Diana; Xu, Shan; Mahon, Bradford Z; Martins, Isabel Pavão

    2018-05-24

    A major principle of organization of the visual system is between a dorsal stream that processes visuomotor information and a ventral stream that supports object recognition. Most research has focused on dissociating processing across these two streams. Here we focus on how the two streams interact. We tested neurologically intact and impaired participants in an object categorization task over two classes of objects whose processing depends on both streams: hands and tools. We measured how unconscious processing of images from one of these categories (e.g., tools) affects the recognition of images from the other category (i.e., hands). Our findings with neurologically intact participants demonstrated that processing an image of a hand hampers the subsequent processing of an image of a tool, and vice versa. These results were not present in apraxic patients (N = 3). These findings suggest local and global inhibitory processes working in tandem to co-register information across the two streams.

  10. Coding visual features extracted from video sequences.

    PubMed

    Baroffio, Luca; Cesana, Matteo; Redondi, Alessandro; Tagliasacchi, Marco; Tubaro, Stefano

    2014-05-01

    Visual features are successfully exploited in several applications (e.g., visual search, object recognition and tracking, etc.) due to their ability to efficiently represent image content. Several visual analysis tasks require features to be transmitted over a bandwidth-limited network, thus calling for coding techniques to reduce the required bit budget, while attaining a target level of efficiency. In this paper, we propose, for the first time, a coding architecture designed for local features (e.g., SIFT, SURF) extracted from video sequences. To achieve high coding efficiency, we exploit both spatial and temporal redundancy by means of intraframe and interframe coding modes. In addition, we propose a coding mode decision based on rate-distortion optimization. The proposed coding scheme can be conveniently adopted to implement the analyze-then-compress (ATC) paradigm in the context of visual sensor networks. That is, sets of visual features are extracted from video frames, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast to the traditional compress-then-analyze (CTA) paradigm, in which video sequences acquired at a node are compressed and then sent to a central unit for further processing. In this paper, we compare these coding paradigms using metrics that are routinely adopted to evaluate the suitability of visual features in the context of content-based retrieval, object recognition, and tracking. Experimental results demonstrate that, thanks to the significant coding gains achieved by the proposed coding scheme, ATC outperforms CTA with respect to all evaluation metrics.
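    The coding-mode decision described above is based on rate-distortion optimization, which can be sketched as a standard Lagrangian choice. This is a generic illustration, not the paper's exact cost function; the mode names and numbers below are assumptions. Each candidate mode yields a distortion D and a rate R, and the encoder picks the mode minimizing J = D + λR.

```python
def choose_coding_mode(modes, lam):
    """Lagrangian rate-distortion mode decision: pick the coding mode
    minimizing J = D + lambda * R, where `modes` maps each mode name
    to its (distortion, rate) pair."""
    return min(modes, key=lambda m: modes[m][0] + lam * modes[m][1])

# Toy numbers: intraframe coding of a feature distorts it less but costs
# more bits than interframe (temporally predicted) coding.
modes = {"intra": (2.0, 100.0), "inter": (3.0, 40.0)}
low_rate_pressure = choose_coding_mode(modes, 0.001)  # distortion dominates
high_rate_pressure = choose_coding_mode(modes, 1.0)   # bit budget dominates
```

As λ grows, the bit budget dominates the cost and the decision flips from intra to inter coding, which is how such an encoder trades feature fidelity against the bandwidth limit of the sensor network.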

  11. Intrinsic and contextual features in object recognition.

    PubMed

    Schlangen, Derrick; Barenholtz, Elan

    2015-01-28

    The context in which an object is found can facilitate its recognition. Yet, it is not known how effective this contextual information is relative to the object's intrinsic visual features, such as color and shape. To address this, we performed four experiments using rendered scenes with novel objects. In each experiment, participants first performed a visual search task, searching for a uniquely shaped target object whose color and location within the scene were experimentally manipulated. We then tested participants' tendency to use their knowledge of the location and color information in an identification task when the objects' images were degraded by blurring, thus eliminating shape information. In Experiment 1, we found that, in the absence of any diagnostic intrinsic features, participants identified objects based purely on their locations within the scene. In Experiment 2, we found that participants combined an intrinsic feature, color, with contextual location in order to uniquely specify an object. In Experiment 3, we found that when an object's color and location information were in conflict, participants identified the object using both sources of information equally. Finally, in Experiment 4, we found that participants used whichever source of information (color or location) was more statistically reliable to identify the target object. Overall, these experiments show that the context in which objects are found can play as important a role as intrinsic features in identifying the objects. © 2015 ARVO.

  12. Finding and recognizing objects in natural scenes: complementary computations in the dorsal and ventral visual systems

    PubMed Central

    Rolls, Edmund T.; Webb, Tristan J.

    2014-01-01

    Searching for and recognizing objects in complex natural scenes is implemented by multiple saccades until the eyes reach within the reduced receptive field sizes of inferior temporal cortex (IT) neurons. We analyze and model how the dorsal and ventral visual streams both contribute to this. Saliency detection in the dorsal visual system including area LIP is modeled by graph-based visual saliency, and allows the eyes to fixate potential objects within several degrees. Visual information at the fixated location, subtending approximately 9° corresponding to the receptive fields of IT neurons, is then passed through a four-layer hierarchical model of the ventral cortical visual system, VisNet. We show that VisNet can be trained using a synaptic modification rule with a short-term memory trace of recent neuronal activity to capture both the required view and translation invariances, allowing approximately 90% correct object recognition in the model for 4 objects shown in any view across a range of 135° anywhere in a scene. The model was able to generalize correctly within the four trained views and the 25 trained translations. This approach analyzes the principles by which complementary computations in the dorsal and ventral visual cortical streams enable objects to be located and recognized in complex natural scenes. PMID:25161619
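    The short-term memory trace rule used to train VisNet can be sketched roughly as follows (a minimal illustration under simplifying assumptions; the exponential-trace form and the parameter names `eta` and `alpha` are ours, not taken from the paper):

```python
import numpy as np

def trace_rule_update(w, inputs, outputs, eta=0.8, alpha=0.01):
    """One pass of a trace learning rule of the kind used in VisNet.

    A short-term memory trace of recent post-synaptic activity enters the
    weight update, so transforms (views, translations) of the same object
    seen in close temporal succession become bound to the same cells.
    """
    trace = np.zeros_like(outputs[0])
    for x, y in zip(inputs, outputs):
        trace = (1.0 - eta) * y + eta * trace   # exponentially decaying trace
        w += alpha * np.outer(trace, x)          # Hebbian update using the trace
    return w
```

    Because the trace carries over between successive presentations, a cell active for one view of an object strengthens its connections to the inputs of the next view as well, which is how the rule yields transform invariance.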

  13. Modeling global scene factors in attention

    NASA Astrophysics Data System (ADS)

    Torralba, Antonio

    2003-07-01

    Models of visual attention have focused predominantly on bottom-up approaches that ignore structured contextual and scene information. I propose a model of contextual cueing for attention guidance based on the global scene configuration. It is shown that the statistics of low-level features across the whole image can be used to prime the presence or absence of objects in the scene and to predict their location, scale, and appearance before exploring the image. In this scheme, visual context information can become available early in the visual processing chain, which allows modulation of the saliency of image regions and provides an efficient shortcut for object detection and recognition. © 2003 Optical Society of America
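    A toy version of such global low-level statistics (in the spirit of a "gist" descriptor, though far simpler than the actual model; the filter choice and pooling grid are our assumptions) might pool gradient energy over a coarse spatial grid:

```python
import numpy as np

def gist_descriptor(image, grid=4):
    """Very rough global-scene descriptor: mean gradient energy at two
    orientations, pooled over a coarse grid of image blocks. Statistics
    like these can prime likely object presence, location, and scale
    before any detailed image exploration."""
    gy, gx = np.gradient(image.astype(float))
    h, w = image.shape
    feats = []
    for channel in (np.abs(gx), np.abs(gy)):        # two orientation bands
        for i in range(grid):
            for j in range(grid):
                block = channel[i * h // grid:(i + 1) * h // grid,
                                j * w // grid:(j + 1) * w // grid]
                feats.append(block.mean())
    return np.array(feats)   # grid * grid * 2 numbers summarizing the scene
```

    A scene classifier or location prior trained on such descriptors would capture regularities like "streets have strong vertical energy in the lower half", which is the kind of shortcut the model exploits.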

  14. Evidence for Holistic Representations of Ignored Images and Analytic Representations of Attended Images

    ERIC Educational Resources Information Center

    Thoma, Volker; Hummel, John E.; Davidoff, Jules

    2004-01-01

    According to the hybrid theory of object recognition (J. E. Hummel, 2001), ignored object images are represented holistically, and attended images are represented both holistically and analytically. This account correctly predicts patterns of visual priming as a function of translation, scale (B. J. Stankiewicz & J. E. Hummel, 2002), and…

  15. Surface versus Edge-Based Determinants of Visual Recognition.

    ERIC Educational Resources Information Center

    Biederman, Irving; Ju, Ginny

    1988-01-01

    The latency at which objects could be identified by 126 subjects was compared for line drawings (edge-based depiction) and color photographs (surface depiction). Line drawings were identified about as quickly as photographs, suggesting that primal access to a mental representation of an object can be modeled from an edge-based description. (SLD)

  16. Visual object recognition and tracking

    NASA Technical Reports Server (NTRS)

    Chang, Chu-Yin (Inventor); English, James D. (Inventor); Tardella, Neil M. (Inventor)

    2010-01-01

    This invention describes a method for identifying and tracking an object from two-dimensional data pictorially representing said object by an object-tracking system through processing said two-dimensional data using at least one tracker-identifier belonging to the object-tracking system for providing an output signal containing: a) a type of the object, and/or b) a position or an orientation of the object in three-dimensions, and/or c) an articulation or a shape change of said object in said three dimensions.

  17. Short- and long-term effects of nicotine and the histone deacetylase inhibitor phenylbutyrate on novel object recognition in zebrafish.

    PubMed

    Faillace, M P; Pisera-Fuster, A; Medrano, M P; Bejarano, A C; Bernabeu, R O

    2017-03-01

    Zebrafish have a sophisticated color- and shape-sensitive visual system, so we examined color cue-based novel object recognition in zebrafish. We evaluated preference in the absence or presence of drugs that affect attention and memory retention in rodents: nicotine and the histone deacetylase inhibitor (HDACi) phenylbutyrate (PhB). The objective of this study was to evaluate whether nicotine and PhB affect innate preferences of zebrafish for familiar and novel objects after short and long retention intervals. We developed modified object recognition (OR) tasks using neutral novel and familiar objects in different colors. We also tested objects which differed with respect to the exploratory behavior they elicited from naïve zebrafish. Zebrafish showed an innate preference for exploring red or green objects rather than yellow or blue objects. Zebrafish were better at discriminating color changes than changes in object shape or size. Nicotine significantly enhanced or changed short-term innate novel object preference, whereas PhB had similar effects when preference was assessed 24 h after training. Analysis of other zebrafish behaviors corroborated these results. Zebrafish were innately reluctant or prone to explore colored novel objects, so drug effects on innate preference can be evaluated by changing the color of objects with a simple geometry. Zebrafish exhibited recognition memory for novel objects with similar innate significance. Interestingly, nicotine and PhB significantly modified innate object preference.

  18. Integrating visual learning within a model-based ATR system

    NASA Astrophysics Data System (ADS)

    Carlotto, Mark; Nebrich, Mark

    2017-05-01

    Automatic target recognition (ATR) systems, like human photo-interpreters, rely on a variety of visual information for detecting, classifying, and identifying manmade objects in aerial imagery. We describe the integration of a visual learning component into the Image Data Conditioner (IDC) for target/clutter and other visual classification tasks. The component is based on an implementation of a model of the visual cortex developed by Serre, Wolf, and Poggio. Visual learning in an ATR context requires the ability to recognize objects independent of location, scale, and rotation. Our method uses IDC to extract, rotate, and scale image chips at candidate target locations. A bootstrap learning method effectively extends the operation of the classifier beyond the training set and provides a measure of confidence. We show how the classifier can be used to learn other features that are difficult to compute from imagery such as target direction, and to assess the performance of the visual learning process itself.

  19. New technologies lead to a new frontier: cognitive multiple data representation

    NASA Astrophysics Data System (ADS)

    Buffat, S.; Liege, F.; Plantier, J.; Roumes, C.

    2005-05-01

    The increasing number and complexity of operational sensors (radar, infrared, hyperspectral...) and the availability of huge amounts of data lead to more and more sophisticated information presentations. But one key element of the IMINT line cannot be improved beyond initial system specification: the operator. In order to overcome this issue, we have to better understand human visual object representation. Object recognition theories in human vision balance between matching 2D template representations carrying viewpoint-dependent information and a viewpoint-invariant system based on structural descriptions. Spatial frequency content is relevant due to early vision filtering. Orientation in depth is an important variable to challenge object constancy. Three objects, seen from three different points of view in a natural environment, made up the original images in this study. Test images were a combination of spatial-frequency-filtered original images and an additive contrast level of white noise. In the first experiment, the observer's task was a same-versus-different forced choice with spatial alternative. Test images had the same noise level in a presentation row. Discrimination threshold was determined by modifying the white noise contrast level by means of an adaptive method. In the second experiment, a repetition blindness paradigm was used to further investigate the viewpoint effect on object recognition. The results shed some light on how the human visual system processes objects displayed under different physical descriptions. This is important operationally, because targets that do not match the physical properties of usual visual stimuli can increase operational workload.

  20. Emerging Object Representations in the Visual System Predict Reaction Times for Categorization

    PubMed Central

    Ritchie, J. Brendan; Tovar, David A.; Carlson, Thomas A.

    2015-01-01

    Recognizing an object takes just a fraction of a second, less than the blink of an eye. Applying multivariate pattern analysis, or “brain decoding”, methods to magnetoencephalography (MEG) data has allowed researchers to characterize, in high temporal resolution, the emerging representation of object categories that underlie our capacity for rapid recognition. Shortly after stimulus onset, object exemplars cluster by category in a high-dimensional activation space in the brain. In this emerging activation space, the decodability of exemplar category varies over time, reflecting the brain’s transformation of visual inputs into coherent category representations. How do these emerging representations relate to categorization behavior? Recently it has been proposed that the distance of an exemplar representation from a categorical boundary in an activation space is critical for perceptual decision-making, and that reaction times should therefore correlate with distance from the boundary. The predictions of this distance hypothesis have been borne out in human inferior temporal cortex (IT), an area of the brain crucial for the representation of object categories. When viewed in the context of a time-varying neural signal, the optimal time to “read out” category information is when category representations in the brain are most decodable. Here, we show that the distance from a decision boundary through activation space, as measured using MEG decoding methods, correlates with reaction times for visual categorization during the period of peak decodability. Our results suggest that the brain begins to read out information about exemplar category at the optimal time for use in choice behavior, and support the hypothesis that the structure of the representation for objects in the visual system is partially constitutive of the decision process in recognition. PMID:26107634

  1. Mid-level perceptual features contain early cues to animacy.

    PubMed

    Long, Bria; Störmer, Viola S; Alvarez, George A

    2017-06-01

    While substantial work has focused on how the visual system achieves basic-level recognition, less work has asked about how it supports large-scale distinctions between objects, such as animacy and real-world size. Previous work has shown that these dimensions are reflected in our neural object representations (Konkle & Caramazza, 2013), and that objects of different real-world sizes have different mid-level perceptual features (Long, Konkle, Cohen, & Alvarez, 2016). Here, we test the hypothesis that animate and manmade objects also differ in mid-level perceptual features. To do so, we generated synthetic images of animals and objects that preserve some texture and form information ("texforms"), but are not identifiable at the basic level. We used visual search efficiency as an index of perceptual similarity, as search is slower when targets are perceptually similar to distractors. Across three experiments, we find that observers can find animals faster among objects than among other animals, and vice versa, and that these results hold when stimuli are reduced to unrecognizable texforms. Electrophysiological evidence revealed that this mixed-animacy search advantage emerges during early stages of target individuation, and not during later stages associated with semantic processing. Lastly, we find that perceived curvature explains part of the mixed-animacy search advantage and that observers use perceived curvature to classify texforms as animate/inanimate. Taken together, these findings suggest that mid-level perceptual features, including curvature, contain cues to whether an object may be animate versus manmade. We propose that the visual system capitalizes on these early cues to facilitate object detection, recognition, and classification.
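    Visual search efficiency of the kind used here is conventionally summarized as the slope of reaction time against display set size; a minimal sketch (the data values below are hypothetical, not from the study):

```python
import numpy as np

def search_slope(set_sizes, mean_rts):
    """Slope (ms per item) of mean reaction time vs. display set size.
    Shallow slopes indicate efficient search (target perceptually
    dissimilar from distractors); steep slopes indicate similarity."""
    slope, intercept = np.polyfit(set_sizes, mean_rts, 1)
    return slope

# Hypothetical data: an animal target among object distractors.
print(search_slope([4, 8, 12], [520, 560, 600]))  # ~10 ms/item
```

    Comparing such slopes between mixed-animacy displays (animal among objects) and same-animacy displays (animal among animals) is what licenses the inference that the two categories differ perceptually.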

  2. Learning to distinguish similar objects

    NASA Astrophysics Data System (ADS)

    Seibert, Michael; Waxman, Allen M.; Gove, Alan N.

    1995-04-01

    This paper describes how the similarities and differences among similar objects can be discovered during learning to facilitate recognition. The application domain is single views of flying model aircraft captured in silhouette by a CCD camera. The approach was motivated by human psychovisual and monkey neurophysiological data. The implementation uses neural net processing mechanisms to build a hierarchy that relates similar objects to superordinate classes, while simultaneously discovering the salient differences between objects within a class. Learning and recognition experiments both with and without the class similarity and difference learning show the effectiveness of the approach on this visual data. To test the approach, it was compared with a non-hierarchical alternative and was found to improve the average percentage of correctly classified views from 77% to 84%.

  3. A role for the CAMKK pathway in visual object recognition memory.

    PubMed

    Tinsley, Chris J; Narduzzo, Katherine E; Brown, Malcolm W; Warburton, E Clea

    2012-03-01

    The role of the CAMKK pathway in object recognition memory was investigated. Rats' performance in a preferential object recognition test was examined after local infusion into the perirhinal cortex of the CAMKK inhibitor STO-609. STO-609 infused either before or immediately after acquisition impaired memory tested after a 24 h but not a 20-min delay. Memory was not impaired when STO-609 was infused 20 min after acquisition. The expression of a downstream reaction product of CAMKK was measured by immunohistochemical staining for phospho-CAMKI(Thr177) at 10, 40, 70, and 100 min following the viewing of novel and familiar images of objects. Processing familiar images resulted in more pCAMKI stained neurons in the perirhinal cortex than processing novel images at the 10- and 40-min delays. Prior infusion of STO-609 caused a reduction in pCAMKI stained neurons in response to viewing either novel or familiar images, consistent with its role as an inhibitor of CAMKK. The results establish that the CAMKK pathway within the perirhinal cortex is important for the consolidation of object recognition memory. The activation of pCAMKI after acquisition is earlier than previously reported for pCAMKII. Copyright © 2011 Wiley Periodicals, Inc.

  4. Global shape recognition is modulated by the spatial distance of local elements--evidence from simultanagnosia.

    PubMed

    Huberle, Elisabeth; Karnath, Hans-Otto

    2006-01-01

    Simultanagnosia is a rare deficit that impairs the ability to perceive several objects at the same time. It is usually observed following bilateral parieto-occipital brain damage. Despite the restrictions in perceiving the global aspect of a scene, processing of individual objects remains unaffected. The mechanisms underlying simultanagnosia are not well understood. Previous findings indicated that the integration of multiple objects into a holistic representation of the environment is not impossible per se, but might depend on the spatial relationship between individual objects. The present study examined the influence of inter-element distances between individual objects on the recognition of global shapes in two patients with simultanagnosia. We presented Navon hierarchical letter stimuli with different inter-element distances between letters at the local scale. Improved recognition at the global scale was observed in both patients when the inter-element distance was reduced. Global shape recognition in simultanagnosia thus seems to be modulated by the spatial distance of local elements and does not appear to be an all-or-nothing phenomenon depending on spatial continuity. The findings seem to argue against a deficit in visual working memory capacity as the primary deficit in simultanagnosia. However, further research is necessary to investigate alternative interpretations.

  5. Enhanced multisensory integration and motor reactivation after active motor learning of audiovisual associations.

    PubMed

    Butler, Andrew J; James, Thomas W; James, Karin Harman

    2011-11-01

    Everyday experience affords us many opportunities to learn about objects through multiple senses using physical interaction. Previous work has shown that active motor learning of unisensory items enhances memory and leads to the involvement of motor systems during subsequent perception. However, the impact of active motor learning on subsequent perception and recognition of associations among multiple senses has not been investigated. Twenty participants were included in an fMRI study that explored the impact of active motor learning on subsequent processing of unisensory and multisensory stimuli. Participants were exposed to visuo-motor associations between novel objects and novel sounds either through self-generated actions on the objects or by observing an experimenter produce the actions. Immediately after exposure, accuracy, RT, and BOLD fMRI measures were collected with unisensory and multisensory stimuli in associative perception and recognition tasks. Response times during audiovisual associative and unisensory recognition were enhanced by active learning, as was accuracy during audiovisual associative recognition. The difference in motor cortex activation between old and new associations was greater for the active than the passive group. Furthermore, functional connectivity between visual and motor cortices was stronger after active learning than passive learning. Active learning also led to greater activation of the fusiform gyrus during subsequent unisensory visual perception. Finally, brain regions implicated in audiovisual integration (e.g., STS) showed greater multisensory gain after active learning than after passive learning. Overall, the results show that active motor learning modulates the processing of multisensory associations.

  6. Position estimation and driving of an autonomous vehicle by monocular vision

    NASA Astrophysics Data System (ADS)

    Hanan, Jay C.; Kayathi, Pavan; Hughlett, Casey L.

    2007-04-01

    Real-time automatic adaptive tracking for target recognition provided autonomous control of a scale-model electric truck. The two-wheel-drive truck was modified as an autonomous rover test-bed for vision-based guidance and navigation. Methods were implemented to monitor tracking error and ensure a safe, accurate arrival at the intended science target. Some methods are situation independent, relying only on the confidence error of the target recognition algorithm. Other methods take advantage of the scenario of combined motion and tracking to filter out anomalies. In either case, only a single calibrated camera was needed for position estimation. Results from real-time autonomous driving tests on the JPL simulated Mars yard are presented. Recognition error was often situation dependent. For the rover case, the background was in motion and may be characterized to provide visual cues on rover travel such as rate, pitch, roll, and distance to objects of interest or hazards. Objects in the scene may be used as landmarks, or waypoints, for such estimations. As objects are approached, their scale increases and their orientation may change. In addition, particularly on rough terrain, these orientation and scale changes may be unpredictable. Feature extraction combined with the neural network algorithm was successful in providing visual odometry in the simulated Mars environment.

  7. Visual adaptation dominates bimodal visual-motor action adaptation

    PubMed Central

    de la Rosa, Stephan; Ferstl, Ylva; Bülthoff, Heinrich H.

    2016-01-01

    A long-standing debate revolves around the question of whether visual action recognition primarily relies on visual or motor action information. Previous studies mainly examined the contribution of either visual or motor information to action recognition. Yet, the interaction of visual and motor action information is particularly important for understanding action recognition in social interactions, where humans often observe and execute actions at the same time. Here, we behaviourally examined the interaction of visual and motor action recognition processes when participants simultaneously observe and execute actions. We took advantage of behavioural action adaptation effects to investigate behavioural correlates of neural action recognition mechanisms. In line with previous results, we find that prolonged visual exposure (visual adaptation) and prolonged execution of the same action with closed eyes (non-visual motor adaptation) influence action recognition. However, when participants simultaneously adapted visually and motorically, akin to simultaneous execution and observation of actions in social interactions, adaptation effects were only modulated by visual but not motor adaptation. Action recognition, therefore, relies primarily on vision-based mechanisms in situations that require simultaneous action observation and execution, such as social interactions. The results suggest caution when associating social behaviour in social interactions with motor-based information. PMID:27029781

  8. The influence of print exposure on the body-object interaction effect in visual word recognition.

    PubMed

    Hansen, Dana; Siakaluk, Paul D; Pexman, Penny M

    2012-01-01

    We examined the influence of print exposure on the body-object interaction (BOI) effect in visual word recognition. High print exposure readers and low print exposure readers either made semantic categorizations ("Is the word easily imageable?"; Experiment 1) or phonological lexical decisions ("Does the item sound like a real English word?"; Experiment 2). The results from Experiment 1 showed that there was a larger BOI effect for the low print exposure readers than for the high print exposure readers in semantic categorization, though an effect was observed for both print exposure groups. However, the results from Experiment 2 showed that the BOI effect was observed only for the high print exposure readers in phonological lexical decision. The results of the present study suggest that print exposure does influence the BOI effect, and that this influence varies as a function of task demands.

  9. The Dynamic Multisensory Engram: Neural Circuitry Underlying Crossmodal Object Recognition in Rats Changes with the Nature of Object Experience.

    PubMed

    Jacklin, Derek L; Cloke, Jacob M; Potvin, Alphonse; Garrett, Inara; Winters, Boyer D

    2016-01-27

    Rats, humans, and monkeys demonstrate robust crossmodal object recognition (CMOR), identifying objects across sensory modalities. We have shown that rats' performance of a spontaneous tactile-to-visual CMOR task requires functional integration of perirhinal (PRh) and posterior parietal (PPC) cortices, which seemingly provide visual and tactile object feature processing, respectively. However, research with primates has suggested that PRh is sufficient for multisensory object representation. We tested this hypothesis in rats using a modification of the CMOR task in which multimodal preexposure to the to-be-remembered objects significantly facilitates performance. In the original CMOR task, with no preexposure, reversible lesions of PRh or PPC produced patterns of impairment consistent with modality-specific contributions. Conversely, in the CMOR task with preexposure, PPC lesions had no effect, whereas PRh involvement was robust, proving necessary for phases of the task that did not require PRh activity when rats did not have preexposure; this pattern was supported by results from c-fos imaging. We suggest that multimodal preexposure alters the circuitry responsible for object recognition, in this case obviating the need for PPC contributions and expanding PRh involvement, consistent with the polymodal nature of PRh connections and results from primates indicating a key role for PRh in multisensory object representation. These findings have significant implications for our understanding of multisensory information processing, suggesting that the nature of an individual's past experience with an object strongly determines the brain circuitry involved in representing that object's multisensory features in memory. The ability to integrate information from multiple sensory modalities is crucial to the survival of organisms living in complex environments. 
Appropriate responses to behaviorally relevant objects are informed by integration of multisensory object features. We used crossmodal object recognition tasks in rats to study the neurobiological basis of multisensory object representation. When rats had no prior exposure to the to-be-remembered objects, the spontaneous ability to recognize objects across sensory modalities relied on functional interaction between multiple cortical regions. However, prior multisensory exploration of the task-relevant objects remapped cortical contributions, negating the involvement of one region and significantly expanding the role of another. This finding emphasizes the dynamic nature of cortical representation of objects in relation to past experience. Copyright © 2016 the authors.

  10. Measuring listening effort: driving simulator vs. simple dual-task paradigm

    PubMed Central

    Wu, Yu-Hsiang; Aksan, Nazan; Rizzo, Matthew; Stangl, Elizabeth; Zhang, Xuyang; Bentler, Ruth

    2014-01-01

    Objectives The dual-task paradigm has been widely used to measure listening effort. The primary objectives of the study were to (1) investigate the effect of hearing aid amplification and a hearing aid directional technology on listening effort measured by a complicated, more real-world dual-task paradigm, and (2) compare the results obtained with this paradigm to a simpler laboratory-style dual-task paradigm. Design The listening effort of adults with hearing impairment was measured using two dual-task paradigms, wherein participants performed a speech recognition task simultaneously with either a driving task in a simulator or a visual reaction-time task in a sound-treated booth. The speech materials and road noises for the speech recognition task were recorded in a van traveling on the highway in three hearing aid conditions: unaided, aided with omnidirectional processing (OMNI), and aided with directional processing (DIR). The change in the driving task or the visual reaction-time task performance across the conditions quantified the change in listening effort. Results Compared to the driving-only condition, driving performance declined significantly with the addition of the speech recognition task. Although the speech recognition score was higher in the OMNI and DIR conditions than in the unaided condition, driving performance was similar across these three conditions, suggesting that listening effort was not affected by amplification and directional processing. Results from the simple dual-task paradigm showed a similar trend: hearing aid technologies improved speech recognition performance, but did not affect performance in the visual reaction-time task (i.e., reduce listening effort). The correlation between listening effort measured using the driving paradigm and the visual reaction-time task paradigm was significant.
The finding that better speech recognition performance did not reduce listening effort in our older participants (56 to 85 years old) was inconsistent with literature evaluating younger (approximately 20 years old), normal-hearing adults. Because of this, a follow-up study was conducted. In the follow-up study, the visual reaction-time dual-task experiment using the same speech materials and road noises was repeated with younger adults with normal hearing. Contrary to the findings with older participants, the results indicated that the directional technology significantly improved performance in both the speech recognition and visual reaction-time tasks. Conclusions Adding a speech listening task to driving undermined driving performance. Hearing aid technologies significantly improved speech recognition while driving, but did not significantly reduce listening effort. Listening effort measured by dual-task experiments using a simulated real-world driving task and a conventional laboratory-style task was generally consistent. For a given listening environment, the benefit of hearing aid technologies on listening effort measured from younger adults with normal hearing may not fully translate to older listeners with hearing impairment. PMID:25083599
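    The dual-task logic can be reduced to a simple effort index, the proportional slowing of the secondary task when the primary task is added (an illustrative sketch; the formula and the numbers are ours, not from the study):

```python
def dual_task_cost(baseline_rt, dual_rt):
    """Listening-effort index from a dual-task paradigm: proportional
    slowing of the secondary task (e.g., visual reaction time) when the
    primary speech task is added. Larger values indicate more effort."""
    return (dual_rt - baseline_rt) / baseline_rt

# Hypothetical secondary-task RTs (ms): alone vs. with speech task added.
print(dual_task_cost(400.0, 460.0))  # 0.15 -> 15% slowing
```

    Comparing this index across hearing aid conditions (unaided, OMNI, DIR) is how one would test whether a technology reduces effort rather than merely improving speech scores.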

  11. Sensitivity to timing and order in human visual cortex

    PubMed Central

    Singer, Jedediah M.; Madsen, Joseph R.; Anderson, William S.

    2014-01-01

    Visual recognition takes a small fraction of a second and relies on the cascade of signals along the ventral visual stream. Given the rapid path through multiple processing steps between photoreceptors and higher visual areas, information must progress from stage to stage very quickly. This rapid progression of information suggests that fine temporal details of the neural response may be important to the brain's encoding of visual signals. We investigated how changes in the relative timing of incoming visual stimulation affect the representation of object information by recording intracranial field potentials along the human ventral visual stream while subjects recognized objects whose parts were presented with varying asynchrony. Visual responses along the ventral stream were sensitive to timing differences as small as 17 ms between parts. In particular, there was a strong dependency on the temporal order of stimulus presentation, even at short asynchronies. From these observations we infer that the neural representation of complex information in visual cortex can be modulated by rapid dynamics on scales of tens of milliseconds. PMID:25429116

  12. Should visual speech cues (speechreading) be considered when fitting hearing aids?

    NASA Astrophysics Data System (ADS)

    Grant, Ken

    2002-05-01

    When talker and listener are face-to-face, visual speech cues become an important part of the communication environment, and yet, these cues are seldom considered when designing hearing aids. Models of auditory-visual speech recognition highlight the importance of complementary versus redundant speech information for predicting auditory-visual recognition performance. Thus, for hearing aids to work optimally when visual speech cues are present, it is important to know whether the cues provided by amplification and the cues provided by speechreading complement each other. In this talk, data will be reviewed that show nonmonotonicity between auditory-alone speech recognition and auditory-visual speech recognition, suggesting that efforts designed solely to improve auditory-alone recognition may not always result in improved auditory-visual recognition. Data will also be presented showing that one of the most important speech cues for enhancing auditory-visual speech recognition performance, voicing, is often the cue that benefits least from amplification.

  13. Deep hierarchies in the primate visual cortex: what can we learn for computer vision?

    PubMed

    Krüger, Norbert; Janssen, Peter; Kalkan, Sinan; Lappe, Markus; Leonardis, Ales; Piater, Justus; Rodríguez-Sánchez, Antonio J; Wiskott, Laurenz

    2013-08-01

    Computational modeling of the primate visual system yields insights of potential relevance to some of the challenges that computer vision is facing, such as object recognition and categorization, motion detection and activity recognition, or vision-based navigation and manipulation. This paper reviews some functional principles and structures that are generally thought to underlie the primate visual cortex, and attempts to extract biological principles that could further advance computer vision research. Organized for a computer vision audience, we present functional principles of the processing hierarchies present in the primate visual system considering recent discoveries in neurophysiology. The hierarchical processing in the primate visual system is characterized by a sequence of different levels of processing (on the order of 10) that constitute a deep hierarchy in contrast to the flat vision architectures predominantly used in today's mainstream computer vision. We hope that the functional description of the deep hierarchies realized in the primate visual system provides valuable insights for the design of computer vision algorithms, fostering increasingly productive interaction between biological and computer vision research.

  14. The role of lateral occipitotemporal junction and area MT/V5 in the visual analysis of upper-limb postures.

    PubMed

    Peigneux, P; Salmon, E; van der Linden, M; Garraux, G; Aerts, J; Delfiore, G; Degueldre, C; Luxen, A; Orban, G; Franck, G

    2000-06-01

Humans, like numerous other species, strongly rely on the observation of gestures of other individuals in their everyday life. It is hypothesized that the visual processing of human gestures is sustained by a specific functional architecture, even at an early prelexical cognitive stage, different from that required for the processing of other visual entities. In the present PET study, the neural basis of visual gesture analysis was investigated with functional neuroimaging of brain activity during naming and orientation tasks performed on pictures of either static gestures (upper-limb postures) or tridimensional objects. To prevent automatic object-related cerebral activation during the visual processing of postures, only intransitive postures were selected, i.e., symbolic or meaningless postures which do not imply the handling of objects. Conversely, only intransitive objects which cannot be handled were selected to prevent gesture-related activation during their visual processing. Results clearly demonstrate a significant functional segregation between the processing of static intransitive postures and the processing of intransitive tridimensional objects. Visual processing of objects elicited mainly occipital and fusiform gyrus activity, while visual processing of postures strongly activated the lateral occipitotemporal junction, encroaching upon area MT/V5, involved in motion analysis. These findings suggest that the lateral occipitotemporal junction, working in association with area MT/V5, plays a prominent role in the high-level perceptual analysis of gesture, namely the construction of its visual representation, available for subsequent recognition or imitation. Copyright 2000 Academic Press.

  15. A PDP model of the simultaneous perception of multiple objects

    NASA Astrophysics Data System (ADS)

    Henderson, Cynthia M.; McClelland, James L.

    2011-06-01

    Illusory conjunctions in normal and simultanagnosic subjects are two instances where the visual features of multiple objects are incorrectly 'bound' together. A connectionist model explores how multiple objects could be perceived correctly in normal subjects given sufficient time, but could give rise to illusory conjunctions with damage or time pressure. In this model, perception of two objects benefits from lateral connections between hidden layers modelling aspects of the ventral and dorsal visual pathways. As with simultanagnosia, simulations of dorsal lesions impair multi-object recognition. In contrast, a large ventral lesion has minimal effect on dorsal functioning, akin to dissociations between simple object manipulation (retained in visual form agnosia and semantic dementia) and object discrimination (impaired in these disorders) [Hodges, J.R., Bozeat, S., Lambon Ralph, M.A., Patterson, K., and Spatt, J. (2000), 'The Role of Conceptual Knowledge: Evidence from Semantic Dementia', Brain, 123, 1913-1925; Milner, A.D., and Goodale, M.A. (2006), The Visual Brain in Action (2nd ed.), New York: Oxford]. It is hoped that the functioning of this model might suggest potential processes underlying dorsal and ventral contributions to the correct perception of multiple objects.

  16. A Low-Cost EEG System-Based Hybrid Brain-Computer Interface for Humanoid Robot Navigation and Recognition

    PubMed Central

    Choi, Bongjae; Jo, Sungho

    2013-01-01

This paper describes a hybrid brain-computer interface (BCI) technique that combines the P300 potential, the steady state visually evoked potential (SSVEP), and event related de-synchronization (ERD) to solve a complicated multi-task problem consisting of humanoid robot navigation and control along with object recognition using a low-cost BCI system. Our approach enables subjects to control the navigation and exploration of a humanoid robot and recognize a desired object among candidates. This study aims to demonstrate the possibility of a hybrid BCI based on a low-cost system for a realistic and complex task. It also shows that the use of a simple image processing technique, combined with BCI, can further aid in making these complex tasks simpler. An experimental scenario is proposed in which a subject remotely controls a humanoid robot in a properly sized maze. The subject sees what the surrogate robot sees through visual feedback and can navigate the surrogate robot. While navigating, the robot encounters objects located in the maze. It then recognizes if the encountered object is of interest to the subject. The subject communicates with the robot through SSVEP- and ERD-based BCIs to navigate and explore with the robot, and a P300-based BCI to allow the surrogate robot to recognize their favorite objects. Using several evaluation metrics, the performances of five subjects navigating the robot were quite comparable to manual keyboard control. During object recognition mode, favorite objects were successfully selected from two to four choices. Subjects conducted humanoid navigation and recognition tasks as if they embodied the robot. Analysis of the data supports the potential usefulness of the proposed hybrid BCI system for extended applications. This work presents an important implication for future work: a hybridization of simple BCI protocols can provide extended controllability to carry out complicated tasks even with a low-cost system. PMID:24023953

  18. Generating descriptive visual words and visual phrases for large-scale image applications.

    PubMed

    Zhang, Shiliang; Tian, Qi; Hua, Gang; Huang, Qingming; Gao, Wen

    2011-09-01

The Bag-of-Visual-Words (BoW) representation has been applied to various problems in the fields of multimedia and computer vision. The basic idea is to represent images as visual documents composed of repeatable and distinctive visual elements, comparable to words in text. Notwithstanding its great success and wide adoption, the visual vocabulary created from single-image local descriptors is often not as effective as desired. In this paper, descriptive visual words (DVWs) and descriptive visual phrases (DVPs) are proposed as the visual correspondences to text words and phrases, where visual phrases refer to frequently co-occurring visual word pairs. Since images are the carriers of visual objects and scenes, a descriptive visual element set can be composed of the visual words and their combinations which are effective in representing certain visual objects or scenes. Based on this idea, a general framework is proposed for generating DVWs and DVPs for image applications. In a large-scale image database containing 1506 object and scene categories, the visual words and visual word pairs descriptive of certain objects or scenes are identified and collected as the DVWs and DVPs. Experiments show that the DVWs and DVPs are informative and descriptive and, thus, are more comparable with text words than the classic visual words. We apply the identified DVWs and DVPs in several applications including large-scale near-duplicated image retrieval, image search re-ranking, and object recognition. The combination of DVW and DVP performs better than the state of the art in large-scale near-duplicated image retrieval in terms of accuracy, efficiency and memory consumption. The proposed image search re-ranking algorithm, DWPRank, outperforms the state-of-the-art algorithm by 12.4% in mean average precision and is about 11 times faster.
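
    The visual-word/visual-phrase idea above can be sketched in a few lines: local descriptors are quantized to their nearest vocabulary centroid (visual words), and word pairs that frequently co-occur within images are kept as candidate visual phrases. This is a minimal numpy illustration on hypothetical toy data, not the authors' DVW/DVP selection procedure, which additionally scores how descriptive each word is for particular object and scene categories.

```python
import numpy as np
from collections import Counter
from itertools import combinations

def assign_visual_words(descriptors, vocabulary):
    """Quantize each local descriptor to its nearest visual word (centroid)."""
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def frequent_word_pairs(images_words, min_count=2):
    """Count visual-word pairs that co-occur within an image; pairs seen in
    at least min_count images serve as candidate 'visual phrases'."""
    pairs = Counter()
    for words in images_words:
        for a, b in combinations(sorted(set(words)), 2):
            pairs[(a, b)] += 1
    return {p: c for p, c in pairs.items() if c >= min_count}

# toy 2-D "descriptors" and a 3-word vocabulary (both hypothetical)
vocabulary = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
descriptors = np.array([[0.1, 0.0], [0.9, 1.1], [4.8, 5.2]])
words = assign_visual_words(descriptors, vocabulary)      # -> [0, 1, 2]
phrases = frequent_word_pairs([[0, 1, 2], [0, 1], [2]])   # -> {(0, 1): 2}
```

    In a realistic system the vocabulary would come from clustering (e.g., k-means over many images) and co-occurrence would be restricted to spatially nearby words; both details are omitted here for brevity.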

  19. Effects of joint attention on long-term memory in 9-month-old infants: an event-related potentials study.

    PubMed

    Kopp, Franziska; Lindenberger, Ulman

    2011-07-01

    Joint attention develops during the first year of life but little is known about its effects on long-term memory. We investigated whether joint attention modulates long-term memory in 9-month-old infants. Infants were familiarized with visually presented objects in either of two conditions that differed in the degree of joint attention (high versus low). EEG indicators in response to old and novel objects were probed directly after the familiarization phase (immediate recognition), and following a 1-week delay (delayed recognition). In immediate recognition, the amplitude of positive slow-wave activity was modulated by joint attention. In the delayed recognition, the amplitude of the Pb component differentiated between high and low joint attention. In addition, the positive slow-wave amplitude during immediate and delayed recognition correlated with the frequency of infants' looks to the experimenter during familiarization. Under both high- and low-joint-attention conditions, the processing of unfamiliar objects was associated with an enhanced Nc component. Our results show that the degree of joint attention modulates EEG during immediate and delayed recognition. We conclude that joint attention affects long-term memory processing in 9-month-old infants by enhancing the relevance of attended items. © 2010 Blackwell Publishing Ltd.

  20. Grouping in object recognition: the role of a Gestalt law in letter identification.

    PubMed

    Pelli, Denis G; Majaj, Najib J; Raizman, Noah; Christian, Christopher J; Kim, Edward; Palomares, Melanie C

    2009-02-01

The Gestalt psychologists reported a set of laws describing how vision groups elements to recognize objects. The Gestalt laws "prescribe for us what we are to recognize 'as one thing'" (Köhler, 1920). Were they right? Does object recognition involve grouping? Tests of the laws of grouping have been favourable, but mostly assessed only detection, not identification, of the compound object. The grouping of elements seen in the detection experiments with lattices and "snakes in the grass" is compelling, but falls far short of the vivid everyday experience of recognizing a familiar, meaningful, named thing, which mediates the ordinary identification of an object. Thus, after nearly a century, there is hardly any evidence that grouping plays a role in ordinary object recognition. To assess grouping in object recognition, we made letters out of grating patches and measured threshold contrast for identifying these letters in visual noise as a function of perturbation of grating orientation, phase, and offset. We define a new measure, "wiggle", to characterize the degree to which these various perturbations violate the Gestalt law of good continuation. We find that efficiency for letter identification is inversely proportional to wiggle and is wholly determined by wiggle, independent of how the wiggle was produced. Thus the effects of three different kinds of shape perturbation on letter identifiability are predicted by a single measure of goodness of continuation. This shows that letter identification obeys the Gestalt law of good continuation and may be the first confirmation of the original Gestalt claim that object recognition involves grouping.

  2. Functional dissociation between action and perception of object shape in developmental visual object agnosia.

    PubMed

    Freud, Erez; Ganel, Tzvi; Avidan, Galia; Gilaie-Dotan, Sharon

    2016-03-01

    According to the two visual systems model, the cortical visual system is segregated into a ventral pathway mediating object recognition, and a dorsal pathway mediating visuomotor control. In the present study we examined whether the visual control of action could develop normally even when visual perceptual abilities are compromised from early childhood onward. Using his fingers, LG, an individual with a rare developmental visual object agnosia, manually estimated (perceptual condition) the width of blocks that varied in width and length (but not in overall size), or simply picked them up across their width (grasping condition). LG's perceptual sensitivity to target width was profoundly impaired in the manual estimation task compared to matched controls. In contrast, the sensitivity to object shape during grasping, as measured by maximum grip aperture (MGA), the time to reach the MGA, the reaction time and the total movement time were all normal in LG. Further analysis, however, revealed that LG's sensitivity to object shape during grasping emerged at a later time stage during the movement compared to controls. Taken together, these results demonstrate a dissociation between action and perception of object shape, and also point to a distinction between different stages of the grasping movement, namely planning versus online control. Moreover, the present study implies that visuomotor abilities can develop normally even when perceptual abilities developed in a profoundly impaired fashion. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. Linear and Non-Linear Visual Feature Learning in Rat and Humans

    PubMed Central

    Bossens, Christophe; Op de Beeck, Hans P.

    2016-01-01

    The visual system processes visual input in a hierarchical manner in order to extract relevant features that can be used in tasks such as invariant object recognition. Although typically investigated in primates, recent work has shown that rats can be trained in a variety of visual object and shape recognition tasks. These studies did not pinpoint the complexity of the features used by these animals. Many tasks might be solved by using a combination of relatively simple features which tend to be correlated. Alternatively, rats might extract complex features or feature combinations which are nonlinear with respect to those simple features. In the present study, we address this question by starting from a small stimulus set for which one stimulus-response mapping involves a simple linear feature to solve the task while another mapping needs a well-defined nonlinear combination of simpler features related to shape symmetry. We verified computationally that the nonlinear task cannot be trivially solved by a simple V1-model. We show how rats are able to solve the linear feature task but are unable to acquire the nonlinear feature. In contrast, humans are able to use the nonlinear feature and are even faster in uncovering this solution as compared to the linear feature. The implications for the computational capabilities of the rat visual system are discussed. PMID:28066201

  4. Touch influences perceived gloss

    PubMed Central

    Adams, Wendy J.; Kerrigan, Iona S.; Graf, Erich W.

    2016-01-01

Identifying an object's material properties supports recognition and action planning: we grasp objects according to how heavy, hard or slippery we expect them to be. Visual cues to material qualities such as gloss have recently received attention, but how they interact with haptic (touch) information has been largely overlooked. Here, we show that touch modulates gloss perception: objects that feel slippery are perceived as glossier (more shiny). Participants explored virtual objects that varied in look and feel. A discrimination paradigm (Experiment 1) revealed that observers integrate visual gloss with haptic information. Observers could easily detect an increase in glossiness when it was paired with a decrease in friction. In contrast, increased glossiness coupled with decreased slipperiness produced a small perceptual change: the visual and haptic changes counteracted each other. Subjective ratings (Experiment 2) reflected a similar interaction – slippery objects were rated as glossier and vice versa. The sensory system treats visual gloss and haptic friction as correlated cues to surface material. Although friction is not a perfect predictor of gloss, the visual system appears to know and use a probabilistic relationship between these variables to bias perception – a sensible strategy given the ambiguity of visual cues to gloss. PMID:26915492

  5. Impaired integration of object knowledge and visual input in a case of ventral simultanagnosia with bilateral damage to area V4.

    PubMed

    Leek, E Charles; d'Avossa, Giovanni; Tainturier, Marie-Josèphe; Roberts, Daniel J; Yuen, Sung Lai; Hu, Mo; Rafal, Robert

    2012-01-01

    This study examines how brain damage can affect the cognitive processes that support the integration of sensory input and prior knowledge during shape perception. It is based on the first detailed study of acquired ventral simultanagnosia, which was found in a patient (M.T.) with posterior occipitotemporal lesions encompassing V4 bilaterally. Despite showing normal object recognition for single items in both accuracy and response times (RTs), and intact low-level vision assessed across an extensive battery of tests, M.T. was impaired in object identification with overlapping figures displays. Task performance was modulated by familiarity: Unlike controls, M.T. was faster with overlapping displays of abstract shapes than with overlapping displays of common objects. His performance with overlapping common object displays was also influenced by both the semantic relatedness and visual similarity of the display items. These findings challenge claims that visual perception is driven solely by feedforward mechanisms and show how brain damage can selectively impair high-level perceptual processes supporting the integration of stored knowledge and visual sensory input.

  6. Social cues at encoding affect memory in 4-month-old infants.

    PubMed

    Kopp, Franziska; Lindenberger, Ulman

    2012-01-01

    Available evidence suggests that infants use adults' social cues for learning by the second half of the first year of life. However, little is known about the short-term or long-term effects of joint attention interactions on learning and memory in younger infants. In the present study, 4-month-old infants were familiarized with visually presented objects in either of two conditions that differed in the degree of joint attention (high vs. low). Brain activity in response to familiar and novel objects was assessed immediately after the familiarization phase (immediate recognition), and following a 1-week delay (delayed recognition). The latency of the Nc component differentiated between recognition of old versus new objects. Pb amplitude and latency were affected by joint attention in delayed recognition. Moreover, the frequency of infant gaze to the experimenter during familiarization differed between the two experimental groups and modulated the Pb response. Results show that joint attention affects the mechanisms of long-term retention in 4-month-old infants. We conclude that joint attention helps children at this young age to recognize the relevance of learned items.

  7. [Pattern recognition of decorative papers with different visual characteristics using visible spectroscopy coupled with principal component analysis (PCA)].

    PubMed

    Zhang, Mao-mao; Yang, Zhong; Lu, Bin; Liu, Ya-na; Sun, Xue-dong

    2015-02-01

As one of the most important decorative materials for modern household products, decorative papers impregnated with melamine not only have better decorative performance but can also greatly improve the surface properties of materials. However, the appearance quality (such as color-difference evaluation and control) of decorative papers, an important index of their surface quality, has been a puzzle for manufacturers and consumers. At present, the human eye is used in factories to judge whether a color difference exists, which is not only inefficient but also prone to subjective error. Thus, it is of great significance to find an effective method to realize fast recognition and classification of decorative papers. In the present study, visible spectroscopy coupled with principal component analysis (PCA) was used for the pattern recognition of decorative papers with different visual characteristics, to investigate the feasibility of visible spectroscopy for rapidly recognizing the types of decorative papers. The results showed that the correlation between the visible spectra and the visual characteristics (L*, a* and b*) was significant, with correlation coefficients up to 0.85 and in some cases more than 0.99, suggesting that the visible spectra carry information about the visual characteristics of the decorative paper surface. When visible spectroscopy coupled with PCA was used to recognize the types of decorative papers, the accuracy reached 94%-100%, suggesting that visible spectroscopy is a promising method for the rapid, objective and accurate recognition of decorative papers with different visual characteristics.
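
    The spectroscopy-plus-PCA recognition scheme above can be sketched as follows: mean-centered spectra are projected onto their leading principal components, and samples are classified by the nearest class centroid in that subspace. A minimal numpy sketch on synthetic "spectra"; the function names and the nearest-centroid classifier are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def pca_fit(X, n_components=2):
    """Fit PCA via SVD of the mean-centered data (rows = spectra)."""
    mu = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:n_components]

def pca_transform(X, mu, components):
    """Project spectra onto the principal components (PCA scores)."""
    return (X - mu) @ components.T

def nearest_centroid_predict(scores, centroids):
    """Assign each score vector to the nearest class centroid."""
    d = ((scores[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

# synthetic 3-band "spectra": two paper types with different reflectance
X = np.array([[1.0, 1.0, 1.0], [1.1, 1.0, 0.9],
              [3.0, 3.0, 3.0], [3.1, 3.0, 2.9]])
mu, comps = pca_fit(X, n_components=1)
scores = pca_transform(X, mu, comps)
centroids = np.stack([scores[:2].mean(axis=0), scores[2:].mean(axis=0)])
labels = nearest_centroid_predict(scores, centroids)      # -> [0, 0, 1, 1]
```

    With real spectra the number of retained components would be chosen from the explained variance, and classification could use any standard classifier in the PCA score space.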

  8. The role of line junctions in object recognition: The case of reading musical notation.

    PubMed

    Wong, Yetta Kwailing; Wong, Alan C-N

    2018-04-30

    Previous work has shown that line junctions are informative features for visual perception of objects, letters, and words. However, the sources of such sensitivity and their generalizability to other object categories are largely unclear. We addressed these questions by studying perceptual expertise in reading musical notation, a domain in which individuals with different levels of expertise are readily available. We observed that removing line junctions created by the contact between musical notes and staff lines selectively impaired recognition performance in experts and intermediate readers, but not in novices. The degree of performance impairment was predicted by individual fluency in reading musical notation. Our findings suggest that line junctions provide diagnostic information about object identity across various categories, including musical notation. However, human sensitivity to line junctions does not readily transfer from familiar to unfamiliar object categories, and has to be acquired through perceptual experience with the specific objects.

  9. MEDIASSIST: medical assistance for intraoperative skill transfer in minimally invasive surgery using augmented reality

    NASA Astrophysics Data System (ADS)

    Sudra, Gunther; Speidel, Stefanie; Fritz, Dominik; Müller-Stich, Beat Peter; Gutt, Carsten; Dillmann, Rüdiger

    2007-03-01

Minimally invasive surgery is a highly complex medical discipline with various risks for surgeon and patient, but it also has numerous advantages for the patient. The surgeon has to adopt special operating techniques and deal with difficulties such as complex hand-eye coordination, a limited field of view and restricted mobility. To alleviate these problems, we propose to support the surgeon's spatial cognition by using augmented reality (AR) techniques to directly visualize virtual objects in the surgical site. In order to generate intelligent support, it is necessary to have an intraoperative assistance system that recognizes the surgical skills during the intervention and provides context-aware assistance to the surgeon using AR techniques. With MEDIASSIST we bundle our research activities in the field of intraoperative intelligent support and visualization. Our experimental setup consists of a stereo endoscope, an optical tracking system and a head-mounted display for 3D visualization. The framework will be used as a platform for the development and evaluation of our research in the fields of skill recognition and context-aware assistance generation. This includes methods for surgical skill analysis, skill classification and context interpretation, as well as assistive visualization and interaction techniques. In this paper we present the objectives of MEDIASSIST and first results in the fields of skill analysis, visualization and multi-modal interaction. In detail, we present markerless instrument tracking for surgical skill analysis as well as visualization techniques and recognition of interaction gestures in an AR environment.

  10. Electrophysiological evidence that top-down knowledge controls working memory processing for subsequent visual search.

    PubMed

    Kawashima, Tomoya; Matsumoto, Eriko

    2016-03-23

Items in working memory guide visual attention toward a memory-matching object. Recent studies have shown that, when searching for an object, this attentional guidance can be modulated by knowing the probability that the target will match an item in working memory. Here, we recorded the P3 and contralateral delay activity to investigate how top-down knowledge controls the processing of working memory items. Participants performed a memory task (recognition only) and a memory-or-search task (recognition or visual search) in which they were asked to maintain two colored oriented bars in working memory. For visual search, we manipulated the probability that the target had the same color as the memorized items (0, 50, or 100%). Participants knew the probabilities before the task. Target detection in the 100% match condition was faster than in the 50% match condition, indicating that participants used their knowledge of the probabilities. We found that the P3 amplitude in the 100% condition was larger than in the other conditions and that the contralateral delay activity amplitude did not vary across conditions. These results suggest that more attention was allocated to the memory items when observers knew in advance that their color would likely match a target. This led to better search performance despite qualitatively equal working memory representations.

  11. Towards discrete wavelet transform-based human activity recognition

    NASA Astrophysics Data System (ADS)

    Khare, Manish; Jeon, Moongu

    2017-06-01

    Providing accurate recognition of human activities is a challenging problem for visual surveillance applications. In this paper, we present a simple and efficient algorithm for human activity recognition based on the wavelet transform. We adopt discrete wavelet transform (DWT) coefficients as features of human objects in order to exploit the wavelet's multiresolution analysis. The proposed method is tested at multiple levels of the DWT. Experiments are carried out on standard action datasets, including KTH and i3DPost. The proposed method is compared with other state-of-the-art methods in terms of different quantitative performance measures and is found to have better recognition accuracy than the state-of-the-art methods.
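
    The paper's implementation is not published; the following is a rough sketch of the core idea only, using a hypothetical `dwt_feature` built on a single-level 1-D Haar transform (the actual method applies a 2-D DWT to video frames, typically via a wavelet library):

```python
def haar_dwt(signal):
    """Single-level 1-D Haar DWT: returns (approximation, detail) coefficients."""
    s = 2 ** 0.5
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def dwt_feature(signal, levels=2):
    """Concatenate detail coefficients across DWT levels as a feature vector.

    Coarser levels are obtained by re-transforming the approximation band,
    which is what gives the multiresolution view of the input.
    """
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        features.extend(detail)
    return features

row = [1.0, 3.0, 2.0, 2.0, 5.0, 7.0, 6.0, 6.0]
feat = dwt_feature(row, levels=2)  # 4 level-1 details + 2 level-2 details
```

    In the real method such coefficient vectors, computed per frame region, would feed a classifier; the Haar wavelet and the 1-D signal here are illustrative simplifications.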

  12. Quantifying the effect of colorization enhancement on mammogram images

    NASA Astrophysics Data System (ADS)

    Wojnicki, Paul J.; Uyeda, Elizabeth; Micheli-Tzanakou, Evangelia

    2002-04-01

    Current radiological displays provide only grayscale images of mammograms. Limiting the image space to grayscale leaves only luminance differences and textures as cues for object recognition within the image. However, color can be an important and significant cue in the detection of shapes and objects. Increasing detection ability allows the radiologist to interpret the images in more detail, improving object recognition and diagnostic accuracy. Color detection experiments using our stimulus system have demonstrated that an observer can only detect an average of 140 levels of grayscale. An optimally colorized image can allow a user to distinguish 250-1000 different levels, hence increasing potential image feature detection by 2-7 times. By implementing a colorization map that follows the luminance map of the original grayscale image, the luminance profile is preserved and color is isolated as the enhancement mechanism. The effect of this enhancement mechanism on the shape, frequency composition and statistical characteristics of the Visual Evoked Potential (VEP) is analyzed and presented; the effectiveness of the image colorization is thus measured quantitatively using the VEP.
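
    The idea of a luminance-preserving colorization map can be sketched as follows. This is our own minimal illustration, not the authors' map: the hue schedule is invented, and each gray level is assigned an RGB triple whose Rec. 601 luma is pinned back to the original gray value (valid as long as no channel clips):

```python
import math

def luma(r, g, b):
    """Rec. 601 luma of an RGB triple on a 0-255 scale."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def colorize(gray):
    """Map a gray level (0-255) to an RGB color with the same luma.

    Hue varies with intensity (a hypothetical schedule); a final scaling
    step restores the original luma, so color is the only added cue.
    """
    angle = 2 * math.pi * gray / 255.0
    r = 0.5 + 0.5 * math.cos(angle)
    g = 0.5 + 0.5 * math.cos(angle + 2.094)  # ~120 degrees apart
    b = 0.5 + 0.5 * math.cos(angle + 4.189)
    base = luma(r, g, b)  # luma of the unit-scale color
    scale = gray / base if base > 0 else 0.0
    # Clipping a channel at 255 would break luma preservation at extreme levels.
    return tuple(min(255.0, c * scale) for c in (r, g, b))
```

    Applying `colorize` to every pixel of a grayscale mammogram yields a false-color image whose luminance map matches the original, in the spirit of the enhancement described above.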

  13. Tactical decisions for changeable cuttlefish camouflage: visual cues for choosing masquerade are relevant from a greater distance than visual cues used for background matching.

    PubMed

    Buresch, Kendra C; Ulmer, Kimberly M; Cramer, Corinne; McAnulty, Sarah; Davison, William; Mäthger, Lydia M; Hanlon, Roger T

    2015-10-01

    Cuttlefish use multiple camouflage tactics to evade their predators. Two common tactics are background matching (resembling the background to hinder detection) and masquerade (resembling an uninteresting or inanimate object to impede detection or recognition). We investigated how the distance and orientation of visual stimuli affected the choice of these two camouflage tactics. In the current experiments, cuttlefish were presented with three visual cues: 2D horizontal floor, 2D vertical wall, and 3D object. Each was placed at several distances: directly beneath (in a circle whose diameter was one body length (BL)); at zero BL (0BL; i.e., directly beside, but not beneath, the cuttlefish); at 1BL; and at 2BL. Cuttlefish continued to respond to 3D visual cues from a greater distance than to a horizontal or vertical stimulus. It appears that background matching is chosen when visual cues are relevant only in the immediate benthic surroundings. However, for masquerade, objects located multiple body lengths away remained relevant for choice of camouflage. © 2015 Marine Biological Laboratory.

  14. CAVIAR: a 45k neuron, 5M synapse, 12G connects/s AER hardware sensory-processing- learning-actuating system for high-speed visual object recognition and tracking.

    PubMed

    Serrano-Gotarredona, Rafael; Oster, Matthias; Lichtsteiner, Patrick; Linares-Barranco, Alejandro; Paz-Vicente, Rafael; Gomez-Rodriguez, Francisco; Camunas-Mesa, Luis; Berner, Raphael; Rivas-Perez, Manuel; Delbruck, Tobi; Liu, Shih-Chii; Douglas, Rodney; Hafliger, Philipp; Jimenez-Moreno, Gabriel; Civit Ballcels, Anton; Serrano-Gotarredona, Teresa; Acosta-Jimenez, Antonio J; Linares-Barranco, Bernabé

    2009-09-01

    This paper describes CAVIAR, a massively parallel hardware implementation of a spike-based sensing-processing-learning-actuating system inspired by the physiology of the nervous system. CAVIAR uses the asynchronous address-event representation (AER) communication framework and was developed in the context of a European Union funded project. It has four custom mixed-signal AER chips, five custom digital AER interface components, 45k neurons (spiking cells), up to 5M synapses, performs 12G synaptic operations per second, and achieves millisecond object recognition and tracking latencies.

  15. Invariant Visual Object and Face Recognition: Neural and Computational Bases, and a Model, VisNet

    PubMed Central

    Rolls, Edmund T.

    2012-01-01

    Neurophysiological evidence for invariant representations of objects and faces in the primate inferior temporal visual cortex is described. Then a computational approach to how invariant representations are formed in the brain is described that builds on the neurophysiology. A feature hierarchy model in which invariant representations can be built by self-organizing learning based on the temporal and spatial statistics of the visual input produced by objects as they transform in the world is described. VisNet can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or it can use spatial continuity in continuous spatial transformation learning which does not require a temporal trace. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size, and also lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects such as looming, rotation, and object-based movement. The model has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in, for example, spatial and object search tasks. The approach has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene. The approach has also been extended to provide, with an additional layer, for the development of representations of spatial scenes of the type found in the hippocampus. PMID:22723777

  16. Invariant Visual Object and Face Recognition: Neural and Computational Bases, and a Model, VisNet.

    PubMed

    Rolls, Edmund T

    2012-01-01

    Neurophysiological evidence for invariant representations of objects and faces in the primate inferior temporal visual cortex is described. Then a computational approach to how invariant representations are formed in the brain is described that builds on the neurophysiology. A feature hierarchy model in which invariant representations can be built by self-organizing learning based on the temporal and spatial statistics of the visual input produced by objects as they transform in the world is described. VisNet can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or it can use spatial continuity in continuous spatial transformation learning which does not require a temporal trace. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size, and also lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects such as looming, rotation, and object-based movement. The model has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in, for example, spatial and object search tasks. The approach has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene. The approach has also been extended to provide, with an additional layer, for the development of representations of spatial scenes of the type found in the hippocampus.

  17. Frontal–Occipital Connectivity During Visual Search

    PubMed Central

    Pantazatos, Spiro P.; Yanagihara, Ted K.; Zhang, Xian; Meitzler, Thomas

    2012-01-01

    Although expectation- and attention-related interactions between ventral and medial prefrontal cortex and stimulus category-selective visual regions have been identified during visual detection and discrimination, it is not known if similar neural mechanisms apply to other tasks such as visual search. The current work tested the hypothesis that high-level frontal regions, previously implicated in expectation and visual imagery of object categories, interact with visual regions associated with object recognition during visual search. Using functional magnetic resonance imaging, subjects searched for a specific object that varied in size and location within a complex natural scene. A model-free, spatial-independent component analysis isolated multiple task-related components, one of which included visual cortex, as well as a cluster within ventromedial prefrontal cortex (vmPFC), consistent with the engagement of both top-down and bottom-up processes. Analyses of psychophysiological interactions showed increased functional connectivity between vmPFC and object-sensitive lateral occipital cortex (LOC), and results from dynamic causal modeling and Bayesian Model Selection suggested bidirectional connections between vmPFC and LOC that were positively modulated by the task. Using image-guided diffusion-tensor imaging, functionally seeded, probabilistic white-matter tracts between vmPFC and LOC, which presumably underlie this effective interconnectivity, were also observed. These connectivity findings extend previous models of visual search processes to include specific frontal–occipital neuronal interactions during a natural and complex search task. PMID:22708993

  18. A cultural side effect: learning to read interferes with identity processing of familiar objects

    PubMed Central

    Kolinsky, Régine; Fernandes, Tânia

    2014-01-01

    Based on the neuronal recycling hypothesis (Dehaene and Cohen, 2007), we examined whether reading acquisition has a cost for the recognition of non-linguistic visual materials. More specifically, we checked whether the ability to discriminate between mirror images, which develops through literacy acquisition, interferes with object identity judgments, and whether interference strength varies as a function of the nature of the non-linguistic material. To these aims we presented illiterate, late literate (who learned to read at adult age), and early literate adults with an orientation-independent, identity-based same-different comparison task in which they had to respond “same” to both physically identical and mirrored or plane-rotated images of pictures of familiar objects (Experiment 1) or of geometric shapes (Experiment 2). Interference from irrelevant orientation variations was stronger with plane rotations than with mirror images, and stronger with geometric shapes than with objects. Illiterates were the only participants almost immune to mirror variations, but only for familiar objects. Thus, the process of unlearning mirror-image generalization, necessary to acquire literacy in the Latin alphabet, has a cost for a basic function of the visual ventral object recognition stream, i.e., identification of familiar objects. This demonstrates that neural recycling is not just an adaptation to multi-use but a process of at least partial exaptation. PMID:25400605

  19. A multistream model of visual word recognition.

    PubMed

    Allen, Philip A; Smith, Albert F; Lien, Mei-Ching; Kaut, Kevin P; Canfield, Angie

    2009-02-01

    Four experiments are reported that test a multistream model of visual word recognition, which associates letter-level and word-level processing channels with three known visual processing streams isolated in macaque monkeys: the magno-dominated (MD) stream, the interblob-dominated (ID) stream, and the blob-dominated (BD) stream (Van Essen & Anderson, 1995). We show that mixing the color of adjacent letters of words does not result in facilitation of response times or error rates when the spatial-frequency pattern of a whole word is familiar. However, facilitation does occur when the spatial-frequency pattern of a whole word is not familiar. This pattern of results is not due to different luminance levels across the different-colored stimuli and the background because isoluminant displays were used. Also, the mixed-case, mixed-hue facilitation occurred when different display distances were used (Experiments 2 and 3), so this suggests that image normalization can adjust independently of object size differences. Finally, we show that this effect persists in both spaced and unspaced conditions (Experiment 4)--suggesting that inappropriate letter grouping by hue cannot account for these results. These data support a model of visual word recognition in which lower spatial frequencies are processed first in the more rapid MD stream. The slower ID and BD streams may process some lower spatial frequency information in addition to processing higher spatial frequency information, but these channels tend to lose the processing race to recognition unless the letter string is unfamiliar to the MD stream--as with mixed-case presentation.

  20. Do object refixations during scene viewing indicate rehearsal in visual working memory?

    PubMed

    Zelinsky, Gregory J; Loschky, Lester C; Dickinson, Christopher A

    2011-05-01

    Do refixations serve a rehearsal function in visual working memory (VWM)? We analyzed refixations from observers freely viewing multiobject scenes. An eyetracker was used to limit the viewing of a scene to a specified number of objects fixated after the target (intervening objects), followed by a four-alternative forced choice recognition test. Results showed that the probability of target refixation increased with the number of fixated intervening objects, and these refixations produced a 16% accuracy benefit over the first five intervening-object conditions. Additionally, refixations most frequently occurred after fixations on only one to two other objects, regardless of the intervening-object condition. These behaviors could not be explained by random or minimally constrained computational models; a VWM component was required to completely describe these data. We explain these findings in terms of a monitor-refixate rehearsal system: The activations of object representations in VWM are monitored, with refixations occurring when these activations decrease suddenly.

  1. Getting the Gist of Events: Recognition of Two-Participant Actions from Brief Displays

    PubMed Central

    Hafri, Alon; Papafragou, Anna; Trueswell, John C.

    2013-01-01

    Unlike rapid scene and object recognition from brief displays, little is known about recognition of event categories and event roles from minimal visual information. In three experiments, we displayed naturalistic photographs of a wide range of two-participant event scenes for 37 ms and 73 ms followed by a mask, and found that event categories (the event gist, e.g., ‘kicking’, ‘pushing’, etc.) and event roles (i.e., Agent and Patient) can be recognized rapidly, even with various actor pairs and backgrounds. Norming ratings from a subsequent experiment revealed that certain physical features (e.g., outstretched extremities) that correlate with Agent-hood could have contributed to rapid role recognition. In a final experiment, using identical twin actors, we then varied these features in two sets of stimuli, in which Patients had Agent-like features or not. Subjects recognized the roles of event participants less accurately when Patients possessed Agent-like features, with this difference being eliminated with two-second durations. Thus, given minimal visual input, typical Agent-like physical features are used in role recognition but, with sufficient input from multiple fixations, people categorically determine the relationship between event participants. PMID:22984951

  2. Semantic attributes are encoded in human electrocorticographic signals during visual object recognition.

    PubMed

    Rupp, Kyle; Roos, Matthew; Milsap, Griffin; Caceres, Carlos; Ratto, Christopher; Chevillet, Mark; Crone, Nathan E; Wolmetz, Michael

    2017-03-01

    Non-invasive neuroimaging studies have shown that semantic category and attribute information are encoded in neural population activity. Electrocorticography (ECoG) offers several advantages over non-invasive approaches, but the degree to which semantic attribute information is encoded in ECoG responses is not known. We recorded ECoG while patients named objects from 12 semantic categories and then trained high-dimensional encoding models to map semantic attributes to spectral-temporal features of the task-related neural responses. Using these semantic attribute encoding models, untrained objects were decoded with accuracies comparable to whole-brain functional Magnetic Resonance Imaging (fMRI), and we observed that high-gamma activity (70-110Hz) at basal occipitotemporal electrodes was associated with specific semantic dimensions (manmade-animate, canonically large-small, and places-tools). Individual patient results were in close agreement with reports from other imaging modalities on the time course and functional organization of semantic processing along the ventral visual pathway during object recognition. The semantic attribute encoding model approach is critical for decoding objects absent from a training set, as well as for studying complex semantic encodings without artificially restricting stimuli to a small number of semantic categories. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  3. Visual-spatial abilities relate to mathematics achievement in children with heavy prenatal alcohol exposure

    PubMed Central

    Crocker, N.; Riley, E.P.; Mattson, S.N.

    2014-01-01

    Objective The current study examined the relationship between mathematics and attention, working memory, and visual memory in children with heavy prenatal alcohol exposure and controls. Method Fifty-six children (29 AE, 27 CON) were administered measures of global mathematics achievement (WRAT-3 Arithmetic & WISC-III Written Arithmetic), attention (WISC-III Digit Span forward and Spatial Span forward), working memory (WISC-III Digit Span backward and Spatial Span backward), and visual memory (CANTAB Spatial Recognition Memory and Pattern Recognition Memory). The contribution of cognitive domains to mathematics achievement was analyzed using linear regression techniques. Attention, working memory and visual memory data were entered together on step 1, followed by group on step 2 and the interaction terms on step 3. Results Model 1 accounted for a significant amount of variance in both mathematics achievement measures; however, model fit improved with the addition of group on step 2. Significant predictors of mathematics achievement were Spatial Span forward and backward and Spatial Recognition Memory. Conclusions These findings suggest that deficits in spatial processing may be related to the math impairments seen in FASD. In addition, prenatal alcohol exposure was associated with deficits in mathematics achievement above and beyond the contribution of general cognitive abilities. PMID:25000323

  4. Salient man-made structure detection in infrared images

    NASA Astrophysics Data System (ADS)

    Li, Dong-jie; Zhou, Fu-gen; Jin, Ting

    2013-09-01

    Target detection, segmentation and recognition are hot research topics in the field of image processing and pattern recognition, among which salient area or object detection is one of the core technologies of precision-guided weapons. In this paper, we detect salient objects in a series of input infrared images by using the classical feature integration theory and Itti's visual attention system. In order to find the salient object in an image accurately, we present a new method that solves the edge blur problem by calculating and using an edge mask. We also greatly improve the computing speed by improving the center-surround differences method: unlike the traditional algorithm, we calculate the center-surround differences through rows and columns separately. Experimental results show that our method is effective in detecting salient objects accurately and rapidly.
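
    The paper's code is not available; the following is a minimal sketch in the spirit of the separable row/column computation it describes, using box filters applied to rows and then columns (all names and parameter choices are ours, not the authors'):

```python
def box_blur_1d(vals, radius):
    """Mean filter over a 1-D list with clamped borders."""
    n = len(vals)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(vals[lo:hi]) / (hi - lo))
    return out

def box_blur(img, radius):
    """Separable 2-D box blur: one 1-D pass over rows, then one over columns."""
    rows = [box_blur_1d(row, radius) for row in img]
    cols = [box_blur_1d(list(col), radius) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def center_surround(img, center_r=1, surround_r=3):
    """Saliency cue: |fine-scale (center) - coarse-scale (surround)| response."""
    c = box_blur(img, center_r)
    s = box_blur(img, surround_r)
    return [[abs(cv - sv) for cv, sv in zip(cr, sr)] for cr, sr in zip(c, s)]
```

    Because each 2-D blur is two 1-D passes, the cost per scale is linear in the kernel radius rather than quadratic, which is the kind of speedup the row/column decomposition buys.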

  5. [Several mechanisms of visual gnosis disorders in local brain lesions].

    PubMed

    Meerson, Ia A

    1981-01-01

    The object of the studies were the peculiarities of recognition of visual images by patients with local cerebral lesions under conditions of incomplete sets of image features, disjunction of those features, distortion of their spatial arrangement, and unusual spatial orientation of the image as a whole. It was found that elimination of even one essential feature sharply hampered recognition of the image both by healthy individuals (controls) and by patients with extra-occipital lesions, whereas elimination of several nonessential features only slowed down the process. In contrast, the difficulty that patients with occipital lesions had in recognizing incomplete images was directly proportional to the number of eliminated features, irrespective of their significance; i.e., these patients were unable to evaluate the hierarchy of the features. The recognition process in these patients proceeded by scanning individual features and accumulating and summing them. Recognition of fragmented, spatially distorted and unusually oriented images was found to be selectively affected in patients with parietal lobe lesions. Patients with occipital lesions recognized such images practically as well as ordinary ones.

  6. Sensitivity to timing and order in human visual cortex.

    PubMed

    Singer, Jedediah M; Madsen, Joseph R; Anderson, William S; Kreiman, Gabriel

    2015-03-01

    Visual recognition takes a small fraction of a second and relies on the cascade of signals along the ventral visual stream. Given the rapid path through multiple processing steps between photoreceptors and higher visual areas, information must progress from stage to stage very quickly. This rapid progression of information suggests that fine temporal details of the neural response may be important to the brain's encoding of visual signals. We investigated how changes in the relative timing of incoming visual stimulation affect the representation of object information by recording intracranial field potentials along the human ventral visual stream while subjects recognized objects whose parts were presented with varying asynchrony. Visual responses along the ventral stream were sensitive to timing differences as small as 17 ms between parts. In particular, there was a strong dependency on the temporal order of stimulus presentation, even at short asynchronies. From these observations we infer that the neural representation of complex information in visual cortex can be modulated by rapid dynamics on scales of tens of milliseconds. Copyright © 2015 the American Physiological Society.

  7. Visual shape perception as Bayesian inference of 3D object-centered shape representations.

    PubMed

    Erdogan, Goker; Jacobs, Robert A

    2017-11-01

    Despite decades of research, little is known about how people visually perceive object shape. We hypothesize that a promising approach to shape perception is provided by a "visual perception as Bayesian inference" framework which augments an emphasis on visual representation with an emphasis on the idea that shape perception is a form of statistical inference. Our hypothesis claims that shape perception of unfamiliar objects can be characterized as statistical inference of 3D shape in an object-centered coordinate system. We describe a computational model based on our theoretical framework, and provide evidence for the model along two lines. First, we show that, counterintuitively, the model accounts for viewpoint-dependency of object recognition, traditionally regarded as evidence against people's use of 3D object-centered shape representations. Second, we report the results of an experiment using a shape similarity task, and present an extensive evaluation of existing models' abilities to account for the experimental data. We find that our shape inference model captures subjects' behaviors better than competing models. Taken as a whole, our experimental and computational results illustrate the promise of our approach and suggest that people's shape representations of unfamiliar objects are probabilistic, 3D, and object-centered. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  8. Acquiring Semantically Meaningful Models for Robotic Localization, Mapping and Target Recognition

    DTIC Science & Technology

    2014-12-21

    Topics covered include point-feature tracking; recovery of relative motion and visual odometry; loop closure; environment models built as sparse point clouds; and object-level segmentation of objects from backgrounds that co-occur with the object of interest (e.g., chair vs. background, table vs. background), evaluated with the Jaccard index.

  9. Perceptual Learning of Object Shape

    PubMed Central

    Golcu, Doruk; Gilbert, Charles D.

    2009-01-01

    Recognition of objects is accomplished through the use of cues that depend on internal representations of familiar shapes. We used a paradigm of perceptual learning during visual search to explore what features human observers use to identify objects. Human subjects were trained to search for a target object embedded in an array of distractors, until their performance improved from near-chance levels to over 80% of trials in an object specific manner. We determined the role of specific object components in the recognition of the object as a whole by measuring the transfer of learning from the trained object to other objects sharing components with it. Depending on the geometric relationship of the trained object with untrained objects, transfer to untrained objects was observed. Novel objects that shared a component with the trained object were identified at much higher levels than those that did not, and this could be used as an indicator of which features of the object were important for recognition. Training on an object also transferred to the components of the object when these components were embedded in an array of distractors of similar complexity. These results suggest that objects are not represented in a holistic manner during learning, but that their individual components are encoded. Transfer between objects was not complete, and occurred for more than one component, regardless of how well they distinguish the object from distractors. This suggests that a joint involvement of multiple components was necessary for full performance. PMID:19864574

  10. Multi-channel feature dictionaries for RGB-D object recognition

    NASA Astrophysics Data System (ADS)

    Lan, Xiaodong; Li, Qiming; Chong, Mina; Song, Jian; Li, Jun

    2018-04-01

    Hierarchical matching pursuit (HMP) is a popular feature learning method for RGB-D object recognition. However, the feature representation with only one dictionary for the RGB channels in HMP does not capture sufficient visual information. In this paper, we propose a feature learning method for RGB-D object recognition based on multi-channel feature dictionaries. Feature extraction in the proposed method consists of two layers, and the K-SVD algorithm is used to learn the dictionaries for sparse coding in both layers. In the first layer, we obtain features by max pooling over the sparse codes of the pixels in a cell, and the features of the cells in a patch are concatenated to generate joint patch features. The joint patch features of the first layer are then used to learn the dictionary and sparse codes of the second layer. Finally, spatial pyramid pooling can be applied to the joint patch features of either layer to generate the final object features. Experimental results show that our method, with first- or second-layer features, obtains comparable or better performance than some published state-of-the-art methods.
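
    The pooling-and-concatenation step described above can be illustrated with a toy sketch (the sparse codes below are made up for illustration, not the authors' pipeline, and a real system would pool over many cells per patch):

```python
def max_pool(codes):
    """Component-wise max over a list of sparse-code vectors (one per pixel)."""
    return [max(col) for col in zip(*codes)]

# Hypothetical example: a cell containing three pixels, each with a 4-atom
# sparse code produced by matching pursuit against a learned dictionary.
cell_codes = [
    [0.0, 0.9, 0.0, 0.1],
    [0.2, 0.0, 0.0, 0.4],
    [0.0, 0.3, 0.7, 0.0],
]
cell_feature = max_pool(cell_codes)  # one pooled vector per cell

# Joint patch feature: concatenate the pooled features of the cells in a
# patch (here the same cell twice, purely to show the concatenation).
patch_feature = cell_feature + cell_feature
```

    Max pooling keeps, for each dictionary atom, its strongest response anywhere in the cell, so the pooled vector is invariant to where inside the cell the response occurred; concatenation then re-introduces coarse spatial layout at the patch level.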

  11. Visual Predictions in the Orbitofrontal Cortex Rely on Associative Content

    PubMed Central

    Chaumon, Maximilien; Kveraga, Kestutis; Barrett, Lisa Feldman; Bar, Moshe

    2014-01-01

    Predicting upcoming events from incomplete information is an essential brain function. The orbitofrontal cortex (OFC) plays a critical role in this process by facilitating recognition of sensory inputs via predictive feedback to sensory cortices. In the visual domain, the OFC is engaged by low spatial frequency (LSF) and magnocellular-biased inputs, but beyond this, we know little about the information content required to activate it. Is the OFC automatically engaged to analyze any LSF information for meaning? Or is it engaged only when LSF information matches preexisting memory associations? We tested these hypotheses and show that only LSF information that could be linked to memory associations engages the OFC. Specifically, LSF stimuli activated the OFC in 2 distinct medial and lateral regions only if they resembled known visual objects. More identifiable objects increased activity in the medial OFC, known for its function in affective responses. Furthermore, these objects also increased the connectivity of the lateral OFC with the ventral visual cortex, a crucial region for object identification. At the interface between sensory, memory, and affective processing, the OFC thus appears to be attuned to the associative content of visual information and to play a central role in visuo-affective prediction. PMID:23771980

  12. [Visual representation of natural scenes in flicker changes].

    PubMed

    Nakashima, Ryoichi; Yokosawa, Kazuhiko

    2010-08-01

    Coherence theory in scene perception (Rensink, 2002) assumes that only volatile representations are retained for objects on which attention is not focused. On the other hand, visual memory theory in scene perception (Hollingworth & Henderson, 2002) assumes that robust object representations are retained. In this study, we hypothesized that the difference between these two theories derives from the different experimental tasks on which they are based. To verify this hypothesis, we examined the properties of visual representations by using a change detection and memory task in a flicker paradigm. We measured the representations formed when participants were instructed to search for a change in a scene, and compared them with intentional memory representations. The visual representations were retained in visual long-term memory even in the flicker paradigm, and were as robust as the intentional memory representations. However, the results indicate that the representations are unavailable for explicitly localizing a scene change, but are available for answering a recognition test. This suggests that coherence theory and visual memory theory are compatible.

  13. Component-based target recognition inspired by human vision

    NASA Astrophysics Data System (ADS)

    Zheng, Yufeng; Agyepong, Kwabena

    2009-05-01

    In contrast with machine vision, humans can recognize an object against a complex background with great flexibility. For example, given the task of finding and circling all cars in a picture (with no further information), you may build a virtual image in your mind from the task (or target) description before looking at the picture. Specifically, the virtual car image may be composed of key components such as the driver cabin and wheels. In this paper, we propose a component-based target recognition method that simulates this human recognition process. The component templates (equivalent to the virtual image in mind) of the target (a car) are manually decomposed from the target feature image. Meanwhile, the edges of the test image are extracted using a difference of Gaussians (DOG) model that simulates the spatiotemporal response of the visual process. A phase correlation matching algorithm is then applied to match the templates with the test edge image. If all key component templates match the object under examination, that object is recognized as the target. Besides recognition accuracy, we also investigate whether the method works with partial targets (half cars). In our experiments, several natural pictures taken on streets were used to test the proposed method. The preliminary results show that the component-based recognition method is very promising.
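    The matching step named in this record, phase correlation, can be sketched with a generic FFT-based implementation. The toy scene and component patch below are hypothetical stand-ins, not the authors' data or code.

```python
import numpy as np

def phase_correlation(template, image):
    """Estimate the (row, col) shift of `template` within `image` by phase correlation."""
    cross = np.fft.fft2(image) * np.conj(np.fft.fft2(template))
    # Normalising by the magnitude keeps only phase information, which gives
    # a sharp correlation peak that is robust to uniform brightness changes.
    corr = np.real(np.fft.ifft2(cross / (np.abs(cross) + 1e-12)))
    return np.unravel_index(np.argmax(corr), corr.shape)

# Hypothetical data: a small component "template" embedded in a larger edge image.
rng = np.random.default_rng(0)
scene = rng.random((64, 64)) * 0.01          # weak background clutter
patch = rng.random((8, 8))
scene[20:28, 40:48] += patch                 # component located at (20, 40)
template = np.zeros_like(scene)
template[:8, :8] = patch
shift = phase_correlation(template, scene)
```

    The correlation surface peaks at the displacement of the component, so each component template can be localised independently before the joint decision is made.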

  14. A defense of the subordinate-level expertise account for the N170 component.

    PubMed

    Rossion, Bruno; Curran, Tim; Gauthier, Isabel

    2002-09-01

    A recent paper in this journal reports two event-related potential (ERP) experiments interpreted as supporting the domain specificity of the visual mechanisms implicated in processing faces (Cognition 83 (2002) 1). The authors argue that because a large neurophysiological response to faces (N170) is less influenced by the task than the response to objects, and because the response for human faces extends to ape faces (for which we are not expert), we should reject the hypothesis that the face-sensitivity reflected by the N170 can be accounted for by the subordinate-level expertise model of object recognition (Nature Neuroscience 3 (2000) 764). In this commentary, we question this conclusion based on some of our own ERP work on expert object recognition as well as the work of others.

  15. Visual Communications and Image Processing

    NASA Astrophysics Data System (ADS)

    Hsing, T. Russell

    1987-07-01

    This special issue of Optical Engineering is concerned with visual communications and image processing. The increase in communication of visual information over the past several decades has resulted in many new image processing and visual communication systems being put into service. The growth of this field has been rapid in both commercial and military applications. The objective of this special issue is to combine recent technology in visual communications and image processing with ideas generated by industry, universities, and users, through both invited and contributed papers. The 15 papers of this issue are organized into four categories: image compression and transmission, image enhancement, image analysis and pattern recognition, and image processing in medical applications.

  16. Gender differences in the recognition of spatially transformed figures: behavioral data and event-related potentials (ERPs).

    PubMed

    Mikhailova, E S; Slavutskaya, A V; Gerasimenko, N Yu

    2012-08-30

    Gender differences in accuracy, reaction time (RT), and the amplitude of the early P1 and N1 components of ERPs were examined during recognition of previously memorized objects after their spatial transformation. We used three levels of spatial transformation: a displacement of object details in a radial direction, and a displacement combined with rotation of the details by ±0° to 45° or ±45° to 90°. The accuracy and RT data showed similar task performance in males and females. The effect of rotation was significantly greater than the effect of simple displacement; accuracy decreased, and RT increased, with rotation angle in both genders. At the same time, we found significant sex differences at the early stage of visual processing. In males, the P1 peak amplitude at the P3/P4 sites increased significantly during recognition of spatially transformed objects, and the wider the angle of rotation, the greater the P1 peak amplitude. In contrast, in females the P1 peak amplitude did not depend on rotation of the figure details. The N1 amplitude revealed no gender differences, although the object transformation evoked somewhat greater changes in the N1 at the O1/O2 sites in females than in males. The finding that only males showed sensitivity of the early perceptual stage to object transformation adds information about the neurobiological basis of the different visual processing strategies used by each gender. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  17. Deep learning

    NASA Astrophysics Data System (ADS)

    Lecun, Yann; Bengio, Yoshua; Hinton, Geoffrey

    2015-05-01

    Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
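    The core mechanism this record names, backpropagation sending an error gradient through multiple processing layers to adjust their parameters, can be sketched on a toy problem. Below is a minimal two-layer network learning XOR in plain NumPy; the architecture, seed, and learning rate are illustrative choices, not anything from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)      # XOR: needs a hidden layer

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)      # hidden layer parameters
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)      # output layer parameters
sigmoid = lambda z: 1 / (1 + np.exp(-z))

for _ in range(10000):
    # Forward pass: each layer transforms the representation of the previous one.
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    # Backward pass: propagate the error gradient back through the layers.
    dp = p - y                                       # sigmoid + cross-entropy gradient
    dW2 = h.T @ dp; db2 = dp.sum(0)
    dh = (dp @ W2.T) * (1 - h ** 2)                  # chain rule through tanh
    dW1 = X.T @ dh; db1 = dh.sum(0)
    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= 0.1 * grad                          # gradient-descent update
```

    After training, the output `p` approximates the XOR targets: the hidden layer has learned an intermediate representation that makes the problem linearly separable, which is the "multiple levels of abstraction" idea in miniature.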

  18. Deep learning.

    PubMed

    LeCun, Yann; Bengio, Yoshua; Hinton, Geoffrey

    2015-05-28

    Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.

  19. Resolving human object recognition in space and time

    PubMed Central

    Cichy, Radoslaw Martin; Pantazis, Dimitrios; Oliva, Aude

    2014-01-01

    A comprehensive picture of object processing in the human brain requires combining both spatial and temporal information about brain activity. Here, we acquired human magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI) responses to 92 object images. Multivariate pattern classification applied to MEG revealed the time course of object processing: whereas individual images were discriminated by visual representations early, ordinate and superordinate category levels emerged relatively later. Using representational similarity analysis, we combined human fMRI and MEG to show content-specific correspondence between early MEG responses and primary visual cortex (V1), and between later MEG responses and inferior temporal (IT) cortex. We identified transient and persistent neural activities during object processing, with sources in V1 and IT. Finally, human MEG signals were correlated with single-unit responses in monkey IT. Together, our findings provide an integrated space- and time-resolved view of human object categorization during the first few hundred milliseconds of vision. PMID:24464044
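    The time-resolved multivariate pattern classification described in this record can be sketched on synthetic sensor data. A nearest-centroid classifier (a simple stand-in for the authors' classifier) is trained at each time point; decoding accuracy rises above chance only once a class-specific pattern is present. All dimensions, the seed, and the injected pattern are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_trials, n_sensors, n_times = 40, 30, 50
labels = np.repeat([0, 1], n_trials // 2)
data = rng.normal(0, 1, (n_trials, n_sensors, n_times))
# Inject a class-specific sensor pattern only from time point 25 onward,
# mimicking a condition difference that emerges after stimulus onset.
pattern = rng.normal(0, 1, n_sensors)
data[labels == 1, :, 25:] += pattern[:, None]

def decode_timepoint(X, y, t, n_folds=4):
    """Cross-validated nearest-centroid decoding accuracy at time point t."""
    acc = []
    folds = np.arange(len(y)) % n_folds
    for k in range(n_folds):
        train, test = folds != k, folds == k
        c0 = X[train & (y == 0), :, t].mean(0)
        c1 = X[train & (y == 1), :, t].mean(0)
        pred = (np.linalg.norm(X[test, :, t] - c1, axis=1) <
                np.linalg.norm(X[test, :, t] - c0, axis=1)).astype(int)
        acc.append((pred == y[test]).mean())
    return float(np.mean(acc))

accuracy = np.array([decode_timepoint(data, labels, t) for t in range(n_times)])
```

    Plotting `accuracy` against time would show the chance-level baseline before the pattern appears and near-perfect decoding afterwards, which is how the latency of a neural representation is read off.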

  20. Transfer learning for visual categorization: a survey.

    PubMed

    Shao, Ling; Zhu, Fan; Li, Xuelong

    2015-05-01

    Regular machine learning and data mining techniques study the training data for future inference under the major assumption that the future data lie in the same feature space, or have the same distribution, as the training data. However, because human-labeled training data are limited, training data that share the feature space or distribution of the future data cannot be guaranteed to be sufficient to avoid over-fitting. In real-world applications, apart from data in the target domain, related data from a different domain can also be included to expand our prior knowledge about the target data. Transfer learning addresses such cross-domain learning problems by extracting useful information from data in a related domain and transferring it for use in the target tasks. In recent years, with transfer learning applied to visual categorization, some typical problems, e.g., view divergence in action recognition tasks and concept drift in image classification tasks, can be solved efficiently. In this paper, we survey state-of-the-art transfer learning algorithms in visual categorization applications, such as object recognition, image classification, and human action recognition.
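    The core idea surveyed here, extracting knowledge from a label-rich source domain and transferring it to a related target task with scarce labels, can be sketched with plain logistic regression. The domains, rotation angle, seed, and step budgets below are illustrative assumptions, not any specific algorithm from the survey.

```python
import numpy as np

rng = np.random.default_rng(4)

def make_domain(angle, n):
    """2-D binary classification data whose decision-boundary normal is at `angle`."""
    X = rng.normal(0, 1, (n, 2))
    w = np.array([np.cos(angle), np.sin(angle)])
    return X, (X @ w > 0).astype(float)

def train(X, y, w, steps, lr=0.5):
    """Logistic-regression gradient descent starting from initial weights `w`."""
    for _ in range(steps):
        p = 1 / (1 + np.exp(-np.clip(X @ w, -30, 30)))
        w = w - lr * X.T @ (p - y) / len(y)
    return w

Xs, ys = make_domain(0.0, 500)       # source domain: plentiful labels
Xt, yt = make_domain(0.3, 8)         # related target domain: only 8 labels
Xtest, ytest = make_domain(0.3, 500)

w_src = train(Xs, ys, np.zeros(2), steps=200)    # learn on the source domain
w_transfer = train(Xt, yt, w_src, steps=5)       # fine-tune on the target
w_scratch = train(Xt, yt, np.zeros(2), steps=5)  # same budget, no prior knowledge

accuracy = lambda w: float(((Xtest @ w > 0) == (ytest > 0)).mean())
```

    With the same tiny training budget, the transferred model starts from a boundary that is already close to the target's, while the from-scratch model must estimate it from eight points; this is the cross-domain knowledge reuse the survey formalizes.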

  1. Fixation and saliency during search of natural scenes: the case of visual agnosia.

    PubMed

    Foulsham, Tom; Barton, Jason J S; Kingstone, Alan; Dewhurst, Richard; Underwood, Geoffrey

    2009-07-01

    Models of eye movement control in natural scenes often distinguish between stimulus-driven processes (which guide the eyes to visually salient regions) and those based on task and object knowledge (which depend on expectations or identification of objects and scene gist). In the present investigation, the eye movements of a patient with visual agnosia were recorded while she searched for objects within photographs of natural scenes and compared to those made by students and age-matched controls. Agnosia is assumed to disrupt the top-down knowledge available in this task, and so may increase the reliance on bottom-up cues. The patient's deficit in object recognition was seen in poor search performance and inefficient scanning. The low-level saliency of target objects had an effect on responses in visual agnosia, and the most salient region in the scene was more likely to be fixated by the patient than by controls. An analysis of model-predicted saliency at fixation locations indicated a closer match between fixations and low-level saliency in agnosia than in controls. These findings are discussed in relation to saliency-map models and the balance between high and low-level factors in eye guidance.

  2. STDP in lateral connections creates category-based perceptual cycles for invariance learning with multiple stimuli.

    PubMed

    Evans, Benjamin D; Stringer, Simon M

    2015-04-01

    Learning to recognise objects and faces is an important and challenging problem tackled by the primate ventral visual system. One major difficulty lies in recognising an object despite profound differences in the retinal images it projects, due to changes in view, scale, position and other identity-preserving transformations. Several models of the ventral visual system have been successful in coping with these issues, but have typically been privileged by exposure to only one object at a time. In natural scenes, however, the challenges of object recognition are typically further compounded by the presence of several objects which should be perceived as distinct entities. In the present work, we explore one possible mechanism by which the visual system may overcome these two difficulties simultaneously, through segmenting unseen (artificial) stimuli using information about their category encoded in plastic lateral connections. We demonstrate that these experience-guided lateral interactions robustly organise input representations into perceptual cycles, allowing feed-forward connections trained with spike-timing-dependent plasticity to form independent, translation-invariant output representations. We present these simulations as a functional explanation for the role of plasticity in the lateral connectivity of visual cortex.

  3. Additional Remarks on Designing Category-Level Attributes for Discriminative Visual Recognition

    DTIC Science & Technology

    2013-01-01

    Felix X. Yu, Shih-Fu Chang (Columbia University); Liangliang Cao, Rogerio S. Feris, John R. Smith (IBM T. J. Watson Research Center). This 2013 report provides additional remarks on the paper "Designing Category-Level Attributes for Discriminative Visual Recognition" [3], beginning with an overview of the proposed approach.

  4. Neuronal integration in visual cortex elevates face category tuning to conscious face perception

    PubMed Central

    Fahrenfort, Johannes J.; Snijders, Tineke M.; Heinen, Klaartje; van Gaal, Simon; Scholte, H. Steven; Lamme, Victor A. F.

    2012-01-01

    The human brain has the extraordinary capability to transform cluttered sensory input into distinct object representations. For example, it is able to rapidly and seemingly without effort detect object categories in complex natural scenes. Surprisingly, category tuning is not sufficient to achieve conscious recognition of objects. What neural process beyond category extraction might elevate neural representations to the level where objects are consciously perceived? Here we show that visible and invisible faces produce similar category-selective responses in the ventral visual cortex. The pattern of neural activity evoked by visible faces could be used to decode the presence of invisible faces and vice versa. However, only visible faces caused extensive response enhancements and changes in neural oscillatory synchronization, as well as increased functional connectivity between higher and lower visual areas. We conclude that conscious face perception is more tightly linked to neural processes of sustained information integration and binding than to processes accommodating face category tuning. PMID:23236162

  5. Velocity and Structure Estimation of a Moving Object Using a Moving Monocular Camera

    DTIC Science & Technology

    2006-01-01

    Fragmentary indexed excerpt: "…map the Euclidean position of static landmarks or visual features in the environment. Recent applications of this technique include aerial…" References cited in the excerpt include "Structure From Motion in a Piecewise Planar Environment," International Journal of Pattern Recognition and Artificial Intelligence, Vol. 2, No. 3, pp. 485-508, 1988, and J. M. Ferryman, S. J. Maybank, and A. D. Worrall, "Visual Surveillance for Moving Vehicles," International Journal of Computer Vision, Vol. 37.

  6. Cross-Modal Correspondences Enhance Performance on a Colour-to-Sound Sensory Substitution Device.

    PubMed

    Hamilton-Fletcher, Giles; Wright, Thomas D; Ward, Jamie

    Visual sensory substitution devices (SSDs) can represent visual characteristics through distinct patterns of sound, giving a visually impaired user access to visual information. Previous SSDs have either avoided colour or assigned sounds to colour in a largely unprincipled way. This study introduces a new tablet-based SSD termed the ‘Creole’ (so called because it combines tactile scanning with image sonification) and a new algorithm for converting colour to sound that is based on established cross-modal correspondences (intuitive mappings between different sensory dimensions). To test the utility of correspondences, we examined the colour–sound associative memory and object recognition abilities of sighted users whose device was coded either in line with or opposite to sound–colour correspondences. Users given the correspondence-based mappings showed improved colour memory and made fewer colour errors. Interestingly, the colour–sound mappings that produced the largest improvements in the associative memory task also produced the largest gains in recognising realistic objects featuring those colours, indicating a transfer of abilities from memory to recognition. These users were also marginally better at matching sounds to images varying in luminance, even though luminance was coded identically across the different versions of the device. These findings are discussed in terms of the relevance of both colour and cross-modal correspondences for sensory substitution.

  7. Automatic face recognition in HDR imaging

    NASA Astrophysics Data System (ADS)

    Pereira, Manuela; Moreno, Juan-Carlos; Proença, Hugo; Pinheiro, António M. G.

    2014-05-01

    The growing popularity of new High Dynamic Range (HDR) imaging systems is raising new privacy issues caused by the methods used for visualization. HDR images require tone mapping for appropriate visualization on conventional, inexpensive LDR displays. Different tone mapping methods can produce markedly different visualizations, raising several privacy-intrusion issues: some methods leave individuals perceptually recognizable, while others obscure identity entirely. Although perceptual recognition might be possible, a natural question arises: how will computer-based recognition perform on images generated by tone mapping? In this paper, we present a study in which automatic face recognition based on sparse representation is tested on images produced by applying common tone mapping operators to HDR images, and we describe its ability to recognize face identity. Typical LDR images are used for training the face recognition.
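    One common global tone mapping operator of the kind such studies apply, the Reinhard operator (key scaling followed by L/(1+L) compression), can be sketched as follows. This illustrates tone mapping generically and is not the specific set of operators tested in the paper.

```python
import numpy as np

def reinhard_tonemap(luminance, key=0.18):
    """Global tone mapping: compress HDR luminance into the displayable range [0, 1).

    `key` maps the scene's log-average luminance to mid-grey before compression.
    """
    eps = 1e-6
    log_avg = np.exp(np.mean(np.log(luminance + eps)))
    scaled = key * luminance / log_avg
    return scaled / (1.0 + scaled)   # simple Reinhard global operator

# Synthetic HDR luminance spanning five orders of magnitude.
rng = np.random.default_rng(3)
hdr = np.exp(rng.uniform(np.log(0.01), np.log(1000.0), (32, 32)))
ldr = reinhard_tonemap(hdr)
```

    The mapping is strictly monotonic, so relative luminance ordering is preserved even though absolute contrasts, and hence the appearance a recognizer sees, are heavily compressed.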

  8. Biased figure-ground assignment affects conscious object recognition in spatial neglect.

    PubMed

    Eramudugolla, Ranmalee; Driver, Jon; Mattingley, Jason B

    2010-09-01

    Unilateral spatial neglect is a disorder of attention and spatial representation, in which early visual processes such as figure-ground segmentation have been assumed to be largely intact. There is evidence, however, that the spatial attention bias underlying neglect can bias the segmentation of a figural region from its background. Relatively few studies have explicitly examined the effect of spatial neglect on processing the figures that result from such scene segmentation. Here, we show that a neglect patient's bias in figure-ground segmentation directly influences his conscious recognition of these figures. By varying the relative salience of figural and background regions in static, two-dimensional displays, we show that competition between elements in such displays can modulate a neglect patient's ability to recognise parsed figures in a scene. The findings provide insight into the interaction between scene segmentation, explicit object recognition, and attention.

  9. Comparative Study on Interaction of Form and Motion Processing Streams by Applying Two Different Classifiers in Mechanism for Recognition of Biological Movement

    PubMed Central

    2014-01-01

    Research in psychophysics, neurophysiology, and functional imaging indicates a particular representation of biological movement that involves two pathways. The visual perception of biological movement is formed through the visual system's dorsal and ventral processing streams: the ventral stream extracts form information, while the dorsal stream provides motion information. The active basic model (ABM), a hierarchical representation of the human figure, introduced novelty into the form pathway by applying a Gabor-based supervised object recognition method, increasing biological plausibility while remaining similar to the original model. A fuzzy inference system is used for motion-pattern information in the motion pathway, making the recognition process more robust. The interaction of these pathways is intriguing and has been considered by many studies in various fields; here, we investigate how the pathways can interact to obtain better results. An extreme learning machine (ELM) is used as the classification unit of the model, because it retains the main properties of artificial neural networks while substantially reducing training time. We compare two configurations, interaction via a synergetic neural network and via an ELM, in terms of accuracy and compatibility. PMID:25276860

  10. Implicit recognition based on lateralized perceptual fluency.

    PubMed

    Vargas, Iliana M; Voss, Joel L; Paller, Ken A

    2012-02-06

    In some circumstances, accurate recognition of repeated images in an explicit memory test is driven by implicit memory. We propose that this "implicit recognition" results from perceptual fluency that influences responding without awareness of memory retrieval. Here we examined whether recognition would vary if images appeared in the same or different visual hemifield during learning and testing. Kaleidoscope images were briefly presented left or right of fixation during divided-attention encoding. Presentation in the same visual hemifield at test produced higher recognition accuracy than presentation in the opposite visual hemifield, but only for guess responses. These correct guesses likely reflect a contribution from implicit recognition, given that when the stimulated visual hemifield was the same at study and test, recognition accuracy was higher for guess responses than for responses with any level of confidence. The dramatic difference in guessing accuracy as a function of lateralized perceptual overlap between study and test suggests that implicit recognition arises from memory storage in visual cortical networks that mediate repetition-induced fluency increments.

  11. What Types of Visual Recognition Tasks Are Mediated by the Neural Subsystem that Subserves Face Recognition?

    ERIC Educational Resources Information Center

    Brooks, Brian E.; Cooper, Eric E.

    2006-01-01

    Three divided visual field experiments tested current hypotheses about the types of visual shape representation tasks that recruit the cognitive and neural mechanisms underlying face recognition. Experiment 1 found a right hemisphere advantage for subordinate but not basic-level face recognition. Experiment 2 found a right hemisphere advantage for…

  12. Visual search in scenes involves selective and non-selective pathways

    PubMed Central

    Wolfe, Jeremy M; Vo, Melissa L-H; Evans, Karla K; Greene, Michelle R

    2010-01-01

    How do we find objects in scenes? For decades, visual search models have been built on experiments in which observers search for targets, presented among distractor items, isolated and randomly arranged on blank backgrounds. Are these models relevant to search in continuous scenes? This paper argues that the mechanisms that govern artificial, laboratory search tasks do play a role in visual search in scenes. However, scene-based information is used to guide search in ways that had no place in earlier models. Search in scenes may be best explained by a dual-path model: A “selective” path in which candidate objects must be individually selected for recognition and a “non-selective” path in which information can be extracted from global / statistical information. PMID:21227734

  13. Contour Curvature As an Invariant Code for Objects in Visual Area V4

    PubMed Central

    Pasupathy, Anitha

    2016-01-01

    Size-invariant object recognition—the ability to recognize objects across transformations of scale—is a fundamental feature of biological and artificial vision. To investigate its basis in the primate cerebral cortex, we measured single neuron responses to stimuli of varying size in visual area V4, a cornerstone of the object-processing pathway, in rhesus monkeys (Macaca mulatta). Leveraging two competing models for how neuronal selectivity for the bounding contours of objects may depend on stimulus size, we show that most V4 neurons (∼70%) encode objects in a size-invariant manner, consistent with selectivity for a size-independent parameter of boundary form: for these neurons, “normalized” curvature, rather than “absolute” curvature, provided a better account of responses. Our results demonstrate the suitability of contour curvature as a basis for size-invariant object representation in the visual cortex, and posit V4 as a foundation for behaviorally relevant object codes. SIGNIFICANCE STATEMENT Size-invariant object recognition is a bedrock for many perceptual and cognitive functions. Despite growing neurophysiological evidence for invariant object representations in the primate cortex, we still lack a basic understanding of the encoding rules that govern them. Classic work in the field of visual shape theory has long postulated that a representation of objects based on information about their bounding contours is well suited to mediate such an invariant code. In this study, we provide the first empirical support for this hypothesis, and its instantiation in single neurons of visual area V4. PMID:27194333
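    The distinction probed in this record, "absolute" versus "normalized" curvature, can be illustrated numerically: the absolute curvature of a bounding contour falls as an object is scaled up, whereas curvature normalized by contour length is size-invariant. Below is a sketch on an assumed circular contour; the discrete-curvature estimator is a generic one, not the authors' analysis code.

```python
import numpy as np

def contour_curvature(points):
    """Discrete signed curvature along a contour given as an (N, 2) point array."""
    d1 = np.gradient(points, axis=0)          # first derivative (tangent)
    d2 = np.gradient(d1, axis=0)              # second derivative
    num = d1[:, 0] * d2[:, 1] - d1[:, 1] * d2[:, 0]
    return num / (d1[:, 0] ** 2 + d1[:, 1] ** 2) ** 1.5

def perimeter(points):
    """Total length of the closed contour."""
    return np.linalg.norm(np.diff(points, axis=0, append=points[:1]), axis=1).sum()

theta = np.linspace(0, 2 * np.pi, 400, endpoint=False)
small = np.c_[np.cos(theta), np.sin(theta)]        # radius 1
large = 3 * small                                  # same shape, 3x the size

abs_small = np.median(contour_curvature(small))    # absolute curvature = 1/r
abs_large = np.median(contour_curvature(large))    # drops to 1/(3r)
norm_small = abs_small * perimeter(small)          # length-normalized: scale-free
norm_large = abs_large * perimeter(large)          # same value at both sizes
```

    A neuron tuned to normalized curvature would respond identically to `small` and `large`, which is the size-invariant boundary code the recordings support for most V4 neurons.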

  14. Cortical Networks for Visual Self-Recognition

    NASA Astrophysics Data System (ADS)

    Sugiura, Motoaki

    This paper briefly reviews recent developments regarding the brain mechanisms of visual self-recognition. A special cognitive mechanism for visual self-recognition has been postulated based on behavioral and neuropsychological evidence, but its neural substrate remains controversial. Recent functional imaging studies suggest that multiple cortical mechanisms play self-specific roles during visual self-recognition, reconciling the existing controversy. Respective roles for the left occipitotemporal, right parietal, and frontal cortices in symbolic, visuospatial, and conceptual aspects of self-representation have been proposed.

  15. The Inversion Effect for Chinese Characters Is Modulated by Radical Organization

    ERIC Educational Resources Information Center

    Luo, Canhuang; Chen, Wei; Zhang, Ye

    2017-01-01

    In studies of visual object recognition, strong inversion effects accompany the acquisition of expertise and imply the involvement of configural processing. Chinese literacy results in sensitivity to the orthography of Chinese characters. While there is some evidence that this orthographic sensitivity results in an inversion effect, and thus…

  16. Latency of modality-specific reactivation of auditory and visual information during episodic memory retrieval.

    PubMed

    Ueno, Daisuke; Masumoto, Kouhei; Sutani, Kouichi; Iwaki, Sunao

    2015-04-15

    This study used magnetoencephalography (MEG) to examine the latency of modality-specific reactivation in the visual and auditory cortices during a recognition task, in order to determine the effects of reactivation on episodic memory retrieval. Nine right-handed healthy young adults participated in the experiment, which consisted of a word-encoding phase and two recognition phases. There were three encoding conditions: encoding words alone (word-only) and encoding words presented with either related pictures (visual) or related sounds (auditory). The recognition task was conducted in the MEG scanner 15 min after completion of the encoding phase. After the recognition test, a source-recognition task was given, in which participants were required to indicate whether each word had been presented and, if so, with which type of information it had been paired during encoding. Word recognition in the auditory condition was higher than in the word-only condition. Confidence-of-recognition scores (d') and the source-recognition test showed superior performance in both the visual and the auditory conditions compared with the word-only condition. An equivalent current dipole analysis of the MEG data indicated higher equivalent current dipole amplitudes in the right fusiform gyrus during the visual condition and in the superior temporal auditory cortices during the auditory condition, both 450-550 ms after onset of the recognition stimuli. The results suggest that reactivation of visual and auditory brain regions during recognition binds language with modality-specific information and that reactivation enhances confidence in one's recognition performance.

  17. The 4-D approach to visual control of autonomous systems

    NASA Technical Reports Server (NTRS)

    Dickmanns, Ernst D.

    1994-01-01

    Development of a 4-D approach to dynamic machine vision is described. Core elements of this method are spatio-temporal models oriented towards objects and the laws of perspective projection in a forward mode. Integration of multi-sensory measurement data was achieved through spatio-temporal models serving as invariants for object recognition. Situation assessment and long-term predictions were enabled by maintaining a symbolic 4-D image of processes involving objects. Behavioral capabilities were easily realized by state feedback and feed-forward control.

  18. Detecting objects in radiographs for homeland security

    NASA Astrophysics Data System (ADS)

    Prasad, Lakshman; Snyder, Hans

    2005-05-01

    We present a general scheme for segmenting a radiographic image into polygons that correspond to visual features. This decomposition provides a vectorized representation that is a high-level description of the image. The polygons correspond to objects or object parts present in the image. This characterization of radiographs allows the direct application of several shape recognition algorithms to identify objects. In this paper we describe the use of constrained Delaunay triangulations as a uniform foundational tool to achieve multiple visual tasks, namely image segmentation, shape decomposition, and parts-based shape matching. Shape decomposition yields parts that serve as tokens representing local shape characteristics. Parts-based shape matching enables the recognition of objects in the presence of occlusions, which commonly occur in radiographs. The polygonal representation of image features affords the efficient design and application of sophisticated geometric filtering methods to detect large-scale structural properties of objects in images. Finally, the representation of radiographs via polygons results in significant reduction of image file sizes and permits the scalable graphical representation of images, along with annotations of detected objects, in the SVG (scalable vector graphics) format that is proposed by the world wide web consortium (W3C). This is a textual representation that can be compressed and encrypted for efficient and secure transmission of information over wireless channels and on the Internet. In particular, our methods described here provide an algorithmic framework for developing image analysis tools for screening cargo at ports of entry for homeland security.
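    The final step described in this record, exporting detected feature polygons in the W3C SVG format, can be sketched in a few lines. The element names follow the SVG specification; the function name and example polygons are hypothetical.

```python
def polygons_to_svg(polygons, width, height):
    """Serialise detected feature polygons as a minimal SVG document.

    Each polygon is a list of (x, y) vertex tuples; annotations of detected
    objects could be attached via `id` attributes or <title> children.
    """
    body = "\n".join(
        '  <polygon points="{}" fill="none" stroke="black"/>'.format(
            " ".join(f"{x},{y}" for x, y in poly))
        for poly in polygons
    )
    return (f'<svg xmlns="http://www.w3.org/2000/svg" '
            f'width="{width}" height="{height}">\n{body}\n</svg>')

# Two hypothetical object-part polygons from a segmented radiograph.
doc = polygons_to_svg([[(0, 0), (40, 0), (40, 30)],
                       [(50, 50), (90, 50), (70, 90)]], 100, 100)
```

    Because the result is plain text, it compresses and encrypts well, which is the transmission advantage the abstract points to over raster image files.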

  19. RecceMan: an interactive recognition assistance for image-based reconnaissance: synergistic effects of human perception and computational methods for object recognition, identification, and infrastructure analysis

    NASA Astrophysics Data System (ADS)

    El Bekri, Nadia; Angele, Susanne; Ruckhäberle, Martin; Peinsipp-Byma, Elisabeth; Haelke, Bruno

    2015-10-01

    This paper introduces an interactive recognition assistance system for imaging reconnaissance. The system supports aerial image analysts on missions during two main tasks: object recognition and infrastructure analysis. Object recognition concentrates on the classification of a single object. Infrastructure analysis deals with describing the components of an infrastructure and recognizing the infrastructure type (e.g. military airfield). Based on satellite or aerial images, aerial image analysts are able to extract single object features and thereby recognize different object types; this is one of the most challenging tasks in imaging reconnaissance. Currently, no high-potential ATR (automatic target recognition) applications are available; as a consequence, the human observer cannot be replaced entirely. State-of-the-art ATR applications cannot match human perception and interpretation. Why is this still such a critical issue? First, cluttered and noisy images make it difficult to automatically extract, classify and identify object types. Second, due to changed warfare and the rise of asymmetric threats, it is nearly impossible to create an underlying data set containing all features, objects or infrastructure types. Many other factors, such as environmental parameters or aspect angles, further complicate the application of ATR. Due to the lack of suitable ATR procedures, the human factor remains important and so far irreplaceable. In order to exploit the potential benefits of human perception and computational methods in a synergistic way, both are unified in an interactive assistance system. RecceMan® (Reconnaissance Manual) offers two modes for aerial image analysts on missions: the object recognition mode and the infrastructure analysis mode. The aim of the object recognition mode is to recognize a certain object type based on the object features that originate from the image signatures.
The infrastructure analysis mode pursues the goal of analyzing the function of the infrastructure. The image analyst visually extracts certain target object signatures, assigns them to corresponding object features and is finally able to recognize the object type. The system offers the possibility to assign the image signatures to features given by sample images. The underlying data set contains a wide range of object features and object types for different domains, such as ships or land vehicles. Each domain has its own feature tree developed by expert aerial image analysts. By selecting the corresponding features, the possible solution set of objects is automatically reduced to only those objects that contain the selected features. Moreover, we give an outlook on current research in the field of ground target analysis, in which we deal with partly automated methods to extract image signatures and assign them to the corresponding features. This research includes methods for automatically determining the orientation of an object and geometric features such as the width and length of the object. This step makes it possible to automatically reduce the set of possible object types offered to the image analyst by the interactive recognition assistance system.
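
The progressive narrowing described above, where each selected feature restricts the candidate set to objects that carry it, is essentially a set-containment query over a feature catalog. A minimal sketch with a hypothetical catalog layout, since the RecceMan® data model is not described in detail:

```python
def filter_candidates(catalog, selected_features):
    """Reduce the candidate set to the object types whose feature set
    contains every feature the analyst has selected so far."""
    selected = set(selected_features)
    return sorted(name for name, features in catalog.items()
                  if selected <= set(features))
```

Each additional selected feature can only shrink the returned list, mirroring the system's progressive reduction of the solution set.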

  20. Visual face-movement sensitive cortex is relevant for auditory-only speech recognition.

    PubMed

    Riedel, Philipp; Ragert, Patrick; Schelinski, Stefanie; Kiebel, Stefan J; von Kriegstein, Katharina

    2015-07-01

    It is commonly assumed that the recruitment of visual areas during audition is not relevant for performing auditory tasks ('auditory-only view'). According to an alternative view, however, the recruitment of visual cortices is thought to optimize auditory-only task performance ('auditory-visual view'). This alternative view is based on functional magnetic resonance imaging (fMRI) studies. These studies have shown, for example, that even if there is only auditory input available, face-movement sensitive areas within the posterior superior temporal sulcus (pSTS) are involved in understanding what is said (auditory-only speech recognition). This is particularly the case when speakers are known audio-visually, that is, after brief voice-face learning. Here we tested whether the left pSTS involvement is causally related to performance in auditory-only speech recognition when speakers are known by face. To test this hypothesis, we applied cathodal transcranial direct current stimulation (tDCS) to the pSTS during (i) visual-only speech recognition of a speaker known only visually to participants and (ii) auditory-only speech recognition of speakers they learned by voice and face. We defined the cathode as the active electrode to down-regulate cortical excitability by hyperpolarization of neurons. tDCS to the pSTS interfered with visual-only speech recognition performance compared to a control group without pSTS stimulation (tDCS to BA6/44 or sham). Critically, compared to controls, pSTS stimulation additionally decreased auditory-only speech recognition performance selectively for voice-face learned speakers. These results are important in two ways. First, they provide direct evidence that the pSTS is causally involved in visual-only speech recognition; this confirms a long-standing prediction of current face-processing models. Second, they show that visual face-sensitive pSTS is causally involved in optimizing auditory-only speech recognition.
These results are in line with the 'auditory-visual view' of auditory speech perception, which assumes that auditory speech recognition is optimized by using predictions from previously encoded speaker-specific audio-visual internal models. Copyright © 2015 Elsevier Ltd. All rights reserved.

  1. Task-dependent modulation of the visual sensory thalamus assists visual-speech recognition.

    PubMed

    Díaz, Begoña; Blank, Helen; von Kriegstein, Katharina

    2018-05-14

    The cerebral cortex modulates early sensory processing via feedback connections to sensory pathway nuclei. The functions of this top-down modulation for human behavior are poorly understood. Here, we show that top-down modulation of the visual sensory thalamus (the lateral geniculate body, LGN) is involved in visual-speech recognition. In two independent functional magnetic resonance imaging (fMRI) studies, LGN response increased when participants processed fast-varying features of articulatory movements required for visual-speech recognition, as compared to temporally more stable features required for face identification with the same stimulus material. The LGN response during the visual-speech task correlated positively with the visual-speech recognition scores across participants. In addition, the task-dependent modulation was present for speech movements and did not occur for control conditions involving non-speech biological movements. In face-to-face communication, visual speech recognition is used to enhance or even enable understanding what is said. Speech recognition is commonly explained in frameworks focusing on cerebral cortex areas. Our findings suggest that task-dependent modulation at subcortical sensory stages plays an important role in communication: together with similar findings in the auditory modality, they imply that task-dependent modulation of the sensory thalami is a general mechanism to optimize speech recognition. Copyright © 2018. Published by Elsevier Inc.

  2. Visual Scanning Patterns and Executive Function in Relation to Facial Emotion Recognition in Aging

    PubMed Central

    Circelli, Karishma S.; Clark, Uraina S.; Cronin-Golomb, Alice

    2012-01-01

    Objective The ability to perceive facial emotion varies with age. Relative to younger adults (YA), older adults (OA) are less accurate at identifying fear, anger, and sadness, and more accurate at identifying disgust. Because different emotions are conveyed by different parts of the face, changes in visual scanning patterns may account for age-related variability. We investigated the relation between scanning patterns and recognition of facial emotions. Additionally, as frontal-lobe changes with age may affect scanning patterns and emotion recognition, we examined correlations between scanning parameters and performance on executive function tests. Methods We recorded eye movements from 16 OA (mean age 68.9) and 16 YA (mean age 19.2) while they categorized facial expressions and non-face control images (landscapes), and administered standard tests of executive function. Results OA were less accurate than YA at identifying fear (p<.05, r=.44) and more accurate at identifying disgust (p<.05, r=.39). OA fixated less than YA on the top half of the face for disgust, fearful, happy, neutral, and sad faces (p’s<.05, r’s≥.38), whereas there was no group difference for landscapes. For OA, executive function was correlated with recognition of sad expressions and with scanning patterns for fearful, sad, and surprised expressions. Conclusion We report significant age-related differences in visual scanning that are specific to faces. The observed relation between scanning patterns and executive function supports the hypothesis that frontal-lobe changes with age may underlie some changes in emotion recognition. PMID:22616800

  3. Learning and recognition of on-premise signs from weakly labeled street view images.

    PubMed

    Tsai, Tsung-Hung; Cheng, Wen-Huang; You, Chuang-Wen; Hu, Min-Chun; Tsui, Arvin Wen; Chi, Heng-Yu

    2014-03-01

    Camera-enabled mobile devices are commonly used as interaction platforms for linking the user's virtual and physical worlds in numerous research and commercial applications, such as serving as an augmented reality interface for mobile information retrieval. These application scenarios give rise to a key technique: visual object recognition in daily life. On-premise signs (OPSs), a popular form of commercial advertising, are widely used in everyday life. OPSs often exhibit great visual diversity (e.g., appearing in arbitrary sizes), accompanied by complex environmental conditions (e.g., foreground and background clutter). Observing that such real-world characteristics are lacking in most existing image data sets, in this paper we first proposed an OPS data set, namely OPS-62, in which a total of 4649 OPS images of 62 different businesses were collected from Google's Street View. Further, to address the problem of real-world OPS learning and recognition, we developed a probabilistic framework based on distributional clustering, in which we proposed to exploit the distributional information of each visual feature (the distribution of its associated OPS labels) as a reliable selection criterion for building discriminative OPS models. Experiments on the OPS-62 data set demonstrated that our approach outperforms state-of-the-art probabilistic latent semantic analysis models, yielding more accurate recognition and fewer false alarms, with a significant 151.28% relative improvement in the average recognition rate. Meanwhile, our approach is simple, linear, and can be executed in a parallel fashion, making it practical and scalable for large-scale multimedia applications.
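
The selection criterion can be illustrated with a simple entropy-based sketch: a visual feature whose associated OPS-label distribution is sharply peaked is likely discriminative, while one spread evenly across labels is not. This is a simplification of the paper's distributional clustering framework, and all names are illustrative:

```python
import numpy as np

def label_distributions(feature_ids, labels, n_features, n_classes):
    """For each visual feature (e.g., a quantized descriptor), count how
    often it co-occurs with each OPS label and normalize to a distribution."""
    counts = np.zeros((n_features, n_classes))
    for f, y in zip(feature_ids, labels):
        counts[f, y] += 1
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1  # avoid division by zero for unseen features
    return counts / totals

def select_discriminative(distributions, top_k):
    """Rank features by the entropy of their label distribution:
    a peaked (low-entropy) distribution suggests a discriminative feature."""
    p = np.clip(distributions, 1e-12, 1.0)
    entropy = -(p * np.log(p)).sum(axis=1)
    return np.argsort(entropy)[:top_k]
```

A feature that always co-occurs with one business (entropy near zero) ranks ahead of a feature spread uniformly across businesses.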

  4. Animacy and real-world size shape object representations in the human medial temporal lobes.

    PubMed

    Blumenthal, Anna; Stojanoski, Bobby; Martin, Chris B; Cusack, Rhodri; Köhler, Stefan

    2018-06-26

    Identifying what an object is, and whether an object has been encountered before, is a crucial aspect of human behavior. Despite this importance, we do not yet have a complete understanding of the neural basis of these abilities. Investigations into the neural organization of human object representations have revealed category specific organization in the ventral visual stream in perceptual tasks. Interestingly, these categories fall within broader domains of organization, with reported distinctions between animate, inanimate large, and inanimate small objects. While there is some evidence for category specific effects in the medial temporal lobe (MTL), in particular in perirhinal and parahippocampal cortex, it is currently unclear whether domain level organization is also present across these structures. To this end, we used fMRI with a continuous recognition memory task. Stimuli were images of objects from several different categories, which were either animate or inanimate, or large or small within the inanimate domain. We employed representational similarity analysis (RSA) to test the hypothesis that object-evoked responses in MTL structures during recognition-memory judgments also show evidence for domain-level organization along both dimensions. Our data support this hypothesis. Specifically, object representations were shaped by either animacy, real-world size, or both, in perirhinal and parahippocampal cortex, and the hippocampus. While sensitivity to these dimensions differed across structures when probed individually, hinting at interesting links to functional differentiation, similarities in organization across MTL structures were more prominent overall. These results argue for continuity in the organization of object representations in the ventral visual stream and the MTL. © 2018 Wiley Periodicals, Inc.
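
The representational similarity analysis (RSA) step can be sketched generically: build a representational dissimilarity matrix (RDM) of pairwise correlation distances between stimulus-evoked voxel patterns, then compare it to a model RDM (e.g., one predicted by animacy) with a Spearman rank correlation. A generic sketch, not the authors' exact pipeline:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(patterns):
    """Condensed representational dissimilarity matrix: 1 - Pearson
    correlation between the response patterns of each pair of stimuli
    (rows = stimuli, columns = voxels)."""
    return pdist(patterns, metric="correlation")

def compare_rdms(rdm_a, rdm_b):
    """Spearman rank correlation between two RDMs, the usual
    second-order comparison in RSA."""
    rho, _ = spearmanr(rdm_a, rdm_b)
    return rho
```

A high correlation with an animacy-model RDM (0 within domain, 1 across) would indicate that the region's representations are shaped by animacy.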

  5. Mirror-image discrimination in the literate brain: a causal role for the left occipitotemporal cortex.

    PubMed

    Nakamura, Kimihiro; Makuuchi, Michiru; Nakajima, Yasoichi

    2014-01-01

    Previous studies show that the primate and human visual system automatically generates a common and invariant representation from a visual object image and its mirror reflection. For humans, however, this mirror-image generalization seems to be partially suppressed through literacy acquisition, since literate adults have greater difficulty in recognizing mirror images of letters than those of other visual objects. At the neural level, such a category-specific effect on mirror-image processing has been associated with the left occipitotemporal cortex (L-OTC), but it remains unclear whether the apparent "inhibition" of mirror letters is mediated by suppressing mirror-image representations covertly generated from normal letter stimuli. Using transcranial magnetic stimulation (TMS), we examined how transient disruption of the L-OTC affects mirror-image recognition during a same-different judgment task, while varying the semantic category (letters and non-letter objects), identity (same or different), and orientation (same or mirror-reversed) of the first and second stimuli. We found that magnetic stimulation of the L-OTC produced a significant delay in mirror-image recognition for letter-strings but not for other objects. By contrast, this category-specific impact was not observed when TMS was applied to other control sites, including the right homologous area and the vertex. These results thus demonstrate a causal link between the L-OTC and mirror-image discrimination in literate people. We further suggest that left-right sensitivity for letters is not achieved by a local inhibitory mechanism in the L-OTC but probably relies on inter-regional coupling with other orientation-sensitive occipito-parietal regions.

  6. Object-based spatial attention when objects have sufficient depth cues.

    PubMed

    Takeya, Ryuji; Kasai, Tetsuko

    2015-01-01

    Attention directed to a part of an object tends to obligatorily spread over all of the spatial regions that belong to the object, which may be critical for rapid object-recognition in cluttered visual scenes. Previous studies have generally used simple rectangles as objects and have shown that attention spreading is reflected by amplitude modulation in the posterior N1 component (150-200 ms poststimulus) of event-related potentials, while other interpretations (i.e., rectangular holes) may arise implicitly in early visual processing stages. By using modified Kanizsa-type stimuli that provided less ambiguity of depth ordering, the present study examined early event-related potential spatial-attention effects for connected and separated objects, both of which were perceived in front of (Experiment 1) and in back of (Experiment 2) the surroundings. Typical P1 (100-140 ms) and N1 (150-220 ms) attention effects of ERP in response to unilateral probes were observed in both experiments. Importantly, the P1 attention effect was decreased for connected objects compared to separated objects only in Experiment 1, and the typical object-based modulations of N1 were not observed in either experiment. These results suggest that spatial attention spreads over a figural object at earlier stages of processing than previously indicated, in three-dimensional visual scenes with multiple depth cues.

  7. Temporal and peripheral extraction of contextual cues from scenes during visual search.

    PubMed

    Koehler, Kathryn; Eckstein, Miguel P

    2017-02-01

    Scene context is known to facilitate object recognition and guide visual search, but little work has focused on isolating image-based cues and evaluating their contributions to eye movement guidance and search performance. Here, we explore three types of contextual cues (a co-occurring object, the configuration of other objects, and the superordinate category of background elements) and assess their joint contributions to search performance in the framework of cue-combination and the temporal unfolding of their extraction. We also assess whether observers' ability to extract each contextual cue in the visual periphery is a bottleneck that determines the utilization and contribution of each cue to search guidance and decision accuracy. We find that during the first four fixations of a visual search task observers first utilize the configuration of objects for coarse eye movement guidance and later use co-occurring object information for finer guidance. In the absence of contextual cues, observers were suboptimally biased to report the target object as being absent. The presence of the co-occurring object was the only contextual cue that had a significant effect in reducing decision bias. The early influence of object-based cues on eye movements is corroborated by a clear demonstration of observers' ability to extract object cues up to 16° into the visual periphery. The joint contributions of the cues to decision search accuracy approximates that expected from the combination of statistically independent cues and optimal cue combination. Finally, the lack of utilization and contribution of the background-based contextual cue to search guidance cannot be explained by the availability of the contextual cue in the visual periphery; instead it is related to background cues providing the least inherent information about the precise location of the target in the scene.
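
The benchmark of "combination of statistically independent cues" has a compact form: for a target-present vs. target-absent decision, independent cues combine by adding their log likelihood ratios to the prior log odds. A minimal sketch of that ideal-observer baseline, not the authors' fitted model:

```python
import numpy as np

def combine_independent_cues(log_likelihood_ratios, log_prior_odds=0.0):
    """Optimal combination of statistically independent cues for a
    target-present vs. target-absent decision: the posterior log odds
    are the prior log odds plus the sum of each cue's log likelihood
    ratio. Returns the posterior probability that the target is present."""
    log_odds = log_prior_odds + np.sum(log_likelihood_ratios)
    return 1.0 / (1.0 + np.exp(-log_odds))
```

Two cues that each favor "present" by 3:1 combine to 9:1 odds, i.e., a posterior of 0.9; uninformative cues (log ratio 0) leave the prior unchanged.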

  8. Utterance independent bimodal emotion recognition in spontaneous communication

    NASA Astrophysics Data System (ADS)

    Tao, Jianhua; Pan, Shifeng; Yang, Minghao; Li, Ya; Mu, Kaihui; Che, Jianfeng

    2011-12-01

    Emotion expressions are sometimes mixed with utterance expression in spontaneous face-to-face communication, which makes emotion recognition difficult. This article introduces methods for reducing utterance influences in visual parameters for audio-visual-based emotion recognition. The audio and visual channels are first combined under a Multistream Hidden Markov Model (MHMM). Then, the utterance reduction is accomplished by computing the residual between the real visual parameters and the outputs of the utterance-related visual parameters. This article introduces the Fused Hidden Markov Model Inversion method, trained on a neutrally expressed audio-visual corpus, to solve the problem. To reduce computational complexity, the inversion model is further simplified to a Gaussian Mixture Model (GMM) mapping. Compared with traditional bimodal emotion recognition methods (e.g., SVM, CART, Boosting), the utterance reduction method gives better emotion recognition results. The experiments also show the effectiveness of our emotion recognition system when used in a live environment.
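
The GMM mapping stage can be sketched as a conditional expectation under a joint audio-visual GMM: the audio marginal assigns responsibilities to components, each component contributes a linear conditional mean, and the predicted utterance-related visual parameters are subtracted from the observed ones to leave the emotion-bearing residual. A numpy-only sketch that assumes the joint GMM parameters are already trained; all names are illustrative:

```python
import numpy as np

def gmm_conditional_mean(audio, weights, means_a, means_v, cov_aa, cov_va):
    """E[visual | audio] under a joint GMM: responsibilities come from the
    audio marginal of each component; each component contributes its
    conditional mean mu_v + Sigma_va Sigma_aa^{-1} (audio - mu_a)."""
    n_components = len(weights)
    log_resp = np.empty(n_components)
    cond = np.empty((n_components, means_v.shape[1]))
    for k in range(n_components):
        diff = audio - means_a[k]
        inv = np.linalg.inv(cov_aa[k])
        _, logdet = np.linalg.slogdet(cov_aa[k])
        # unnormalized log responsibility of component k given the audio
        log_resp[k] = np.log(weights[k]) - 0.5 * (logdet + diff @ inv @ diff)
        cond[k] = means_v[k] + cov_va[k] @ inv @ diff
    resp = np.exp(log_resp - log_resp.max())
    resp /= resp.sum()
    return resp @ cond

def emotion_residual(real_visual, audio, gmm_params):
    """Residual between observed visual parameters and the utterance-related
    part predicted from audio: the signal kept for emotion classification."""
    return real_visual - gmm_conditional_mean(audio, *gmm_params)
```

With one component the mapping reduces to ordinary linear regression of visual on audio parameters.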

  9. Two Visual Pathways in Primates Based on Sampling of Space: Exploitation and Exploration of Visual Information

    PubMed Central

    Sheth, Bhavin R.; Young, Ryan

    2016-01-01

    Evidence is strong that the visual pathway is segregated into two distinct streams—ventral and dorsal. Two proposals theorize that the pathways are segregated in function: The ventral stream processes information about object identity, whereas the dorsal stream, according to one model, processes information about object location, and according to another, is responsible for executing movements under visual control. The models are influential; however recent experimental evidence challenges them, e.g., the ventral stream is not solely responsible for object recognition; conversely, its function is not strictly limited to object vision; the dorsal stream is not responsible by itself for spatial vision or visuomotor control; conversely, its function extends beyond vision or visuomotor control. In their place, we suggest a robust dichotomy consisting of a ventral stream selectively sampling high-resolution/focal spaces, and a dorsal stream sampling nearly all of space with reduced foveal bias. The proposal hews closely to the theme of embodied cognition: Function arises as a consequence of an extant sensory underpinning. A continuous, not sharp, segregation based on function emerges, and carries with it an undercurrent of an exploitation-exploration dichotomy. Under this interpretation, cells of the ventral stream, which individually have more punctate receptive fields that generally include the fovea or parafovea, provide detailed information about object shapes and features and lead to the systematic exploitation of said information; cells of the dorsal stream, which individually have large receptive fields, contribute to visuospatial perception, provide information about the presence/absence of salient objects and their locations for novel exploration and subsequent exploitation by the ventral stream or, under certain conditions, the dorsal stream.
We leverage the dichotomy to unify neuropsychological cases under a common umbrella, account for the increased prevalence of multisensory integration in the dorsal stream under a Bayesian framework, predict conditions under which object recognition utilizes the ventral or dorsal stream, and explain why cells of the dorsal stream drive sensorimotor control and motion processing and have poorer feature selectivity. Finally, the model speculates on a dynamic interaction between the two streams that underscores a unified, seamless perception. Existing theories are subsumed under our proposal. PMID:27920670

  11. Robot Command Interface Using an Audio-Visual Speech Recognition System

    NASA Astrophysics Data System (ADS)

    Ceballos, Alexánder; Gómez, Juan; Prieto, Flavio; Redarce, Tanneguy

    In recent years, audio-visual speech recognition has emerged as an active field of research thanks to advances in pattern recognition, signal processing and machine vision. Its ultimate goal is to allow human-computer communication using voice, taking into account the visual information contained in the audio-visual speech signal. This document presents an automatic command-recognition system using audio-visual information. The system is intended to control the laparoscopic robot da Vinci. The audio signal is processed using the Mel Frequency Cepstral Coefficients parametrization method. In addition, features based on the points that define the mouth's outer contour according to the MPEG-4 standard are used to extract the visual speech information.
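
The final step of the MFCC parametrization mentioned above takes a DCT-II of the log mel filterbank energies to decorrelate them into cepstral coefficients. A minimal sketch of just that step, assuming the filterbank energies have already been computed from the framed audio:

```python
import numpy as np

def mfcc_from_energies(mel_energies, n_coeffs=13):
    """Last step of the MFCC pipeline: a DCT-II of the log mel filterbank
    energies yields the cepstral coefficients."""
    log_e = np.log(np.maximum(mel_energies, 1e-10))  # floor to avoid log(0)
    n = len(log_e)
    k = np.arange(n)
    # DCT-II basis: row j is cos(pi * j * (k + 0.5) / n)
    basis = np.cos(np.pi * np.outer(np.arange(n_coeffs), k + 0.5) / n)
    return basis @ log_e
```

A flat filterbank (constant energies) produces energy only in the zeroth coefficient, which is why higher coefficients capture spectral shape rather than overall loudness.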

  12. ASERA: A Spectrum Eye Recognition Assistant

    NASA Astrophysics Data System (ADS)

    Yuan, Hailong; Zhang, Haotong; Zhang, Yanxia; Lei, Yajuan; Dong, Yiqiao; Zhao, Yongheng

    2018-04-01

    ASERA, A Spectrum Eye Recognition Assistant, aids in quasar spectral recognition and redshift measurement and can also be used to recognize various types of spectra of stars, galaxies and AGNs (Active Galactic Nuclei). This interactive software allows users to visualize observed spectra, superimpose template spectra from the Sloan Digital Sky Survey (SDSS), and interactively access related spectral line information. ASERA is an efficient and user-friendly semi-automated toolkit for the accurate classification of spectra observed by LAMOST (the Large Sky Area Multi-object Fiber Spectroscopic Telescope) and is available as a standalone Java application and as a Java applet. The software offers several functions, including wavelength and flux scale settings, zoom in and out, redshift estimation, and spectral line identification.
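
The redshift estimation underlying such template superposition reduces to matching observed spectral features against their rest-frame wavelengths, since a line at rest wavelength lambda_rest appears at lambda_obs = (1 + z) * lambda_rest. A minimal sketch; ASERA's actual interactive implementation is not described here:

```python
def redshift_from_line(observed_wavelength, rest_wavelength):
    """Redshift implied by matching one observed spectral line to its
    rest-frame counterpart: z = lambda_obs / lambda_rest - 1."""
    return observed_wavelength / rest_wavelength - 1.0

def shift_template(rest_wavelengths, z):
    """Wavelengths at which a rest-frame template's features would appear
    when the template is superimposed at trial redshift z."""
    return [(1.0 + z) * w for w in rest_wavelengths]
```

For example, Lyman-alpha (1215.67 Å at rest) observed at 4862.68 Å implies z = 3.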

  13. 3-D vision and figure-ground separation by visual cortex.

    PubMed

    Grossberg, S

    1994-01-01

    A neural network theory of three-dimensional (3-D) vision, called FACADE theory, is described. The theory proposes a solution of the classical figure-ground problem for biological vision. It does so by suggesting how boundary representations and surface representations are formed within a boundary contour system (BCS) and a feature contour system (FCS). The BCS and FCS interact reciprocally to form 3-D boundary and surface representations that are mutually consistent. Their interactions generate 3-D percepts wherein occluding and occluded object parts are separated, completed, and grouped. The theory clarifies how preattentive processes of 3-D perception and figure-ground separation interact reciprocally with attentive processes of spatial localization, object recognition, and visual search. A new theory of stereopsis is proposed that predicts how cells sensitive to multiple spatial frequencies, disparities, and orientations are combined by context-sensitive filtering, competition, and cooperation to form coherent BCS boundary segmentations. Several factors contribute to figure-ground pop-out, including: boundary contrast between spatially contiguous boundaries, whether due to scenic differences in luminance, color, spatial frequency, or disparity; partially ordered interactions from larger spatial scales and disparities to smaller scales and disparities; and surface filling-in restricted to regions surrounded by a connected boundary. Phenomena such as 3-D pop-out from a 2-D picture, Da Vinci stereopsis, 3-D neon color spreading, completion of partially occluded objects, and figure-ground reversals are analyzed. The BCS and FCS subsystems model aspects of how the two parvocellular cortical processing streams that join the lateral geniculate nucleus to prestriate cortical area V4 interact to generate a multiplexed representation of Form-And-Color-And-DEpth, or FACADE, within area V4. 
Area V4 is suggested to support figure-ground separation and to interact with cortical mechanisms of spatial attention, attentive object learning, and visual search. Adaptive resonance theory (ART) mechanisms model aspects of how prestriate visual cortex interacts reciprocally with a visual object recognition system in inferotemporal (IT) cortex for purposes of attentive object learning and categorization. Object attention mechanisms of the What cortical processing stream through IT cortex are distinguished from spatial attention mechanisms of the Where cortical processing stream through parietal cortex. Parvocellular BCS and FCS signals interact with the model What stream. Parvocellular FCS and magnocellular motion BCS signals interact with the model Where stream.(ABSTRACT TRUNCATED AT 400 WORDS)

  14. Sparse aperture 3D passive image sensing and recognition

    NASA Astrophysics Data System (ADS)

    Daneshpanah, Mehdi

    The way we perceive, capture, store, communicate and visualize the world has greatly changed in the past century. Novel three-dimensional (3D) imaging and display systems are being pursued in both academic and industrial settings. In many cases, these systems have revolutionized traditional approaches and/or enabled new technologies in other disciplines, including medical imaging and diagnostics, industrial metrology, entertainment, robotics, and defense and security. In this dissertation, we focus on novel aspects of sparse aperture multi-view imaging systems and their application in quantum-limited object recognition in two separate parts. In the first part, two concepts are proposed. First, a solution is presented that involves a generalized framework for 3D imaging using randomly distributed sparse apertures. Second, a method is suggested to extract the profile of objects in the scene through statistical properties of the reconstructed light field. In both cases, experimental results are presented that demonstrate the feasibility of the techniques. In the second part, the application of 3D imaging systems to the sensing and recognition of objects is addressed. In particular, we focus on the scenario in which only tens of photons reach the sensor from the object of interest, as opposed to hundreds of billions of photons under normal imaging conditions. At this level, the quantum-limited behavior of light dominates and traditional object recognition practices may fail. We suggest a likelihood-based object recognition framework that incorporates the physics of sensing at quantum-limited conditions. Sensor dark noise has been modeled and taken into account. This framework is applied to 3D sensing of thermal objects using visible-spectrum detectors. Thermal objects as cold as 250 K are shown to provide enough signature photons to be sensed and recognized within background and dark noise with mature, visible-band, image-forming optics and detector arrays.
The results suggest that one might not need to venture into exotic and expensive detector arrays and associated optics for sensing room-temperature thermal objects in complete darkness.
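
The likelihood framework described above can be sketched with a Poisson photon-count model: each pixel's count is Poisson-distributed with mean equal to the hypothesized object signature plus a dark-noise rate, and recognition picks the hypothesis with maximum log-likelihood. A simplified sketch; the dissertation's full model is richer, and all names are illustrative:

```python
import numpy as np

def log_likelihood_poisson(photon_counts, expected_counts, dark_rate):
    """Log-likelihood of observed per-pixel photon counts under an object
    hypothesis: counts are Poisson with mean = signal + dark-noise rate.
    The log(n!) term is omitted because it is identical across hypotheses
    and cancels when comparing them."""
    lam = expected_counts + dark_rate
    return np.sum(photon_counts * np.log(lam) - lam)

def recognize(photon_counts, hypotheses, dark_rate=0.1):
    """Return the index of the maximum-likelihood object hypothesis."""
    scores = [log_likelihood_poisson(photon_counts, h, dark_rate)
              for h in hypotheses]
    return int(np.argmax(scores))
```

Even with only a handful of photons, the spatial pattern of counts can still discriminate hypotheses, which is the regime the dissertation targets.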

  15. The role of visual imagery in the retention of information from sentences.

    PubMed

    Drose, G S; Allen, G L

    1994-01-01

    We conducted two experiments to evaluate a multiple-code model for sentence memory that posits both propositional and visual representational systems. Both experiments involved recognition memory. The results of Experiment 1 indicated that subjects' recognition memory for concrete sentences was superior to their recognition memory for abstract sentences. Instructions to use visual imagery to enhance recognition performance yielded no effects. Experiment 2 tested the prediction that interference by a visual task would differentially affect recognition memory for concrete sentences. Results showed the interference task to have had a detrimental effect on recognition memory for both concrete and abstract sentences. Overall, the evidence provided partial support for both a multiple-code model and a semantic integration model of sentence memory.

  16. Autonomous facial recognition system inspired by human visual system based logarithmical image visualization technique

    NASA Astrophysics Data System (ADS)

    Wan, Qianwen; Panetta, Karen; Agaian, Sos

    2017-05-01

    Autonomous facial recognition systems are widely used in real-life applications, such as homeland border security, law enforcement identification and authentication, and video-based surveillance analysis. Issues such as low image quality, non-uniform illumination, and variations in pose and facial expression can impair the performance of recognition systems. To address the non-uniform illumination challenge, we present a novel robust autonomous facial recognition system inspired by the human visual system, based on the so-called logarithmical image visualization technique. In this paper, the proposed method, for the first time, couples the logarithmical image visualization technique with the local binary pattern to perform discriminative feature extraction for facial recognition. The Yale database, the Yale-B database and the ATT database are used for computer simulations testing accuracy and efficiency. The extensive computer simulations demonstrate the method's efficiency, accuracy, and robustness to illumination variation for facial recognition.
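
The local binary pattern (LBP) component of the pipeline is standard and easy to sketch: each interior pixel receives an 8-bit code from thresholding its 3x3 neighborhood against the center value, and the histogram of codes serves as a texture descriptor. A minimal numpy sketch of basic radius-1 LBP, not necessarily the paper's exact variant:

```python
import numpy as np

def lbp_codes(image):
    """Basic 3x3 local binary pattern: each interior pixel gets an 8-bit
    code, with bit i set if the i-th neighbor >= the center value."""
    img = np.asarray(image, dtype=float)
    center = img[1:-1, 1:-1]
    # 8 neighbors, clockwise starting from the top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(center, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = img[1 + dy:img.shape[0] - 1 + dy,
                       1 + dx:img.shape[1] - 1 + dx]
        code |= (neighbor >= center).astype(np.uint8) << bit
    return code

def lbp_histogram(image, bins=256):
    """Normalized histogram of LBP codes: a common face texture descriptor."""
    hist, _ = np.histogram(lbp_codes(image), bins=bins, range=(0, bins))
    return hist / hist.sum()
```

Because each code depends only on sign comparisons against the center pixel, the descriptor is invariant to monotonic illumination changes, which is why LBP pairs naturally with an illumination-normalization front end.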

  17. Objects predict fixations better than early saliency.

    PubMed

    Einhäuser, Wolfgang; Spain, Merrielle; Perona, Pietro

    2008-11-20

    Humans move their eyes while looking at scenes and pictures. Eye movements correlate with shifts in attention and are thought to be a consequence of optimal resource allocation for high-level tasks such as visual recognition. Models of attention, such as "saliency maps," are often built on the assumption that "early" features (color, contrast, orientation, motion, and so forth) drive attention directly. We explore an alternative hypothesis: Observers attend to "interesting" objects. To test this hypothesis, we measure the eye position of human observers while they inspect photographs of common natural scenes. Our observers perform different tasks: artistic evaluation, analysis of content, and search. Immediately after each presentation, our observers are asked to name objects they saw. Weighted with recall frequency, these objects predict fixations in individual images better than early saliency, irrespective of task. Also, saliency combined with object positions predicts which objects are frequently named. This suggests that early saliency has only an indirect effect on attention, acting through recognized objects. Consequently, rather than treating attention as a mere preprocessing step for object recognition, models of the two need to be integrated.
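    The core prediction scheme described above — a fixation map built from object regions, each weighted by how often observers recalled that object — can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the mask representation and normalization are assumptions.

    ```python
    import numpy as np

    def object_prediction_map(shape, object_masks, recall_freq):
        """Combine binary object masks into a fixation-prediction map,
        weighting each object by its recall frequency across observers.

        shape        : (H, W) of the image
        object_masks : list of (H, W) boolean arrays, one per named object
        recall_freq  : list of recall counts/frequencies, one per object
        """
        pred = np.zeros(shape, dtype=np.float64)
        for mask, freq in zip(object_masks, recall_freq):
            pred += freq * mask.astype(np.float64)
        total = pred.sum()
        # Normalize to a probability map so it can be scored against
        # measured fixation densities (e.g., via ROC area).
        return pred / total if total > 0 else pred
    ```

    A map like this would then be compared against recorded fixation positions using the same metric applied to the early-saliency map, so that the two predictors compete on equal footing.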

  18. Detailed 3D representations for object recognition and modeling.

    PubMed

    Zia, M Zeeshan; Stark, Michael; Schiele, Bernt; Schindler, Konrad

    2013-11-01

    Geometric 3D reasoning at the level of objects has received renewed attention recently in the context of visual scene understanding. The level of geometric detail, however, is typically limited to qualitative representations or coarse boxes. This is linked to the fact that today's object class detectors are tuned toward robust 2D matching rather than accurate 3D geometry, encouraged by bounding-box-based benchmarks such as Pascal VOC. In this paper, we revisit ideas from the early days of computer vision, namely, detailed, 3D geometric object class representations for recognition. These representations can recover geometrically far more accurate object hypotheses than just bounding boxes, including continuous estimates of object pose and 3D wireframes with relative 3D positions of object parts. In combination with robust techniques for shape description and inference, we outperform state-of-the-art results in monocular 3D pose estimation. In a series of experiments, we analyze our approach in detail and demonstrate novel applications enabled by such an object class representation, such as fine-grained categorization of cars and bicycles, according to their 3D geometry, and ultrawide baseline matching.

  19. Neural Representations of Natural and Scrambled Movies Progressively Change from Rat Striate to Temporal Cortex

    PubMed Central

    Vinken, Kasper; Van den Bergh, Gert; Vermaercke, Ben; Op de Beeck, Hans P.

    2016-01-01

    In recent years, the rodent has come forward as a candidate model for investigating higher level visual abilities such as object vision. This view has been backed up substantially by evidence from behavioral studies that show rats can be trained to express visual object recognition and categorization capabilities. However, almost no studies have investigated the functional properties of rodent extrastriate visual cortex using stimuli that target object vision, leaving a gap compared with the primate literature. Therefore, we recorded single-neuron responses along a proposed ventral pathway in rat visual cortex to investigate hallmarks of primate neural object representations such as preference for intact versus scrambled stimuli and category-selectivity. We presented natural movies containing a rat or no rat as well as their phase-scrambled versions. Population analyses showed increased dissociation in representations of natural versus scrambled stimuli along the targeted stream, but without a clear preference for natural stimuli. Along the measured cortical hierarchy the neural response seemed to be driven increasingly by features that are not V1-like and destroyed by phase-scrambling. However, there was no evidence for category selectivity for the rat versus nonrat distinction. Together, these findings provide insights about differences and commonalities between rodent and primate visual cortex. PMID:27146315

  20. Training-induced recovery of low-level vision followed by mid-level perceptual improvements in developmental object and face agnosia

    PubMed Central

    Lev, Maria; Gilaie-Dotan, Sharon; Gotthilf-Nezri, Dana; Yehezkel, Oren; Brooks, Joseph L; Perry, Anat; Bentin, Shlomo; Bonneh, Yoram; Polat, Uri

    2015-01-01

    Long-term deprivation of normal visual inputs can cause perceptual impairments at various levels of visual function, from basic visual acuity deficits, through mid-level deficits such as contour integration and motion coherence, to high-level face and object agnosia. Yet it is unclear whether training during adulthood, at a post-developmental stage of the adult visual system, can overcome such developmental impairments. Here, we visually trained LG, a developmental object and face agnosic individual. Prior to training, at the age of 20, LG's basic and mid-level visual functions such as visual acuity, crowding effects, and contour integration were underdeveloped relative to normal adult vision, corresponding to or poorer than those of 5–6 year olds (Gilaie-Dotan, Perry, Bonneh, Malach & Bentin, 2009). Intensive visual training, based on lateral interactions, was applied for a period of 9 months. LG's directly trained but also untrained visual functions such as visual acuity, crowding, binocular stereopsis and also mid-level contour integration improved significantly and reached near-age-level performance, with long-term (over 4 years) persistence. Moreover, mid-level functions that were tested post-training were found to be normal in LG. Some possible subtle improvement was observed in LG's higher-order visual functions such as object recognition and part integration, while LG's face perception skills have not improved thus far. These results suggest that corrective training at a post-developmental stage, even in the adult visual system, can prove effective, and its enduring effects are the basis for a revival of a developmental cascade that can lead to reduced perceptual impairments. PMID:24698161

  2. Infant Visual Recognition Memory

    ERIC Educational Resources Information Center

    Rose, Susan A.; Feldman, Judith F.; Jankowski, Jeffery J.

    2004-01-01

    Visual recognition memory is a robust form of memory that is evident from early infancy, shows pronounced developmental change, and is influenced by many of the same factors that affect adult memory; it is surprisingly resistant to decay and interference. Infant visual recognition memory shows (a) modest reliability, (b) good discriminant…

  3. Attention and perceptual implicit memory: effects of selective versus divided attention and number of visual objects.

    PubMed

    Mulligan, Neil W

    2002-08-01

    Extant research presents conflicting results on whether manipulations of attention during encoding affect perceptual priming. Two suggested mediating factors are type of manipulation (selective vs divided) and whether attention is manipulated across multiple objects or within a single object. Words printed in different colors (Experiment 1) or flanked by colored blocks (Experiment 2) were presented at encoding. In the full-attention condition, participants always read the word; in the unattended condition, they always identified the color; and in the divided-attention conditions, participants attended to both word identity and color. Perceptual priming was assessed with perceptual identification and explicit memory with recognition. Relative to the full-attention condition, attending to color always reduced priming. Dividing attention between word identity and color, however, only disrupted priming when these attributes were presented as multiple objects (Experiment 2) but not when they were dimensions of a common object (Experiment 1). On the explicit test, manipulations of attention always affected recognition accuracy.

  4. The gender congruency effect during bilingual spoken-word recognition

    PubMed Central

    Morales, Luis; Paolieri, Daniela; Dussias, Paola E.; Valdés Kroff, Jorge R.; Gerfen, Chip; Bajo, María Teresa

    2016-01-01

    We investigate the ‘gender-congruency’ effect during a spoken-word recognition task using the visual world paradigm. Eye movements of Italian–Spanish bilinguals and Spanish monolinguals were monitored while they viewed a pair of objects on a computer screen. Participants listened to instructions in Spanish (encuentra la bufanda / ‘find the scarf’) and clicked on the object named in the instruction. Grammatical gender of the objects’ name was manipulated so that pairs of objects had the same (congruent) or different (incongruent) gender in Italian, but gender in Spanish was always congruent. Results showed that bilinguals, but not monolinguals, looked at target objects less when they were incongruent in gender, suggesting a between-language gender competition effect. In addition, bilinguals looked at target objects more when the definite article in the spoken instructions provided a valid cue to anticipate its selection (different-gender condition). The temporal dynamics of gender processing and cross-language activation in bilinguals are discussed. PMID:28018132

  5. Visual object recognition for mobile tourist information systems

    NASA Astrophysics Data System (ADS)

    Paletta, Lucas; Fritz, Gerald; Seifert, Christin; Luley, Patrick; Almer, Alexander

    2005-03-01

    We describe a mobile vision system that is capable of automated object identification using images captured from a PDA or a camera phone. We present a solution for the enabling technology of outdoor vision-based object recognition that will extend state-of-the-art location- and context-aware services toward object-based awareness in urban environments. In the proposed application scenario, tourist pedestrians are equipped with GPS, W-LAN, and a camera attached to a PDA or a camera phone. They are interested in whether their field of view contains tourist sights for which more detailed information is available. Multimedia data about related history, architecture, or other cultural context of historic or artistic relevance can be explored by a mobile user who intends to learn within the urban environment. Learning from ambient cues is achieved by pointing the device at the urban sight, capturing an image, and then receiving information about the object on site and within the focus of attention, i.e., the user's current field of view.

  6. Face recognition in newly hatched chicks at the onset of vision.

    PubMed

    Wood, Samantha M W; Wood, Justin N

    2015-04-01

    How does face recognition emerge in the newborn brain? To address this question, we used an automated controlled-rearing method with a newborn animal model: the domestic chick (Gallus gallus). This automated method allowed us to examine chicks' face recognition abilities at the onset of both face experience and object experience. In the first week of life, newly hatched chicks were raised in controlled-rearing chambers that contained no objects other than a single virtual human face. In the second week of life, we used an automated forced-choice testing procedure to examine whether chicks could distinguish that familiar face from a variety of unfamiliar faces. Chicks successfully distinguished the familiar face from most of the unfamiliar faces; for example, chicks were sensitive to changes in the face's age, gender, and orientation (upright vs. inverted). Thus, chicks can build an accurate representation of the first face they see in their life. These results show that the initial state of face recognition is surprisingly powerful: Newborn visual systems can begin encoding and recognizing faces at the onset of vision. (c) 2015 APA, all rights reserved.

  7. Differences between Dyslexic and Non-Dyslexic Children in the Performance of Phonological Visual-Auditory Recognition Tasks: An Eye-Tracking Study

    PubMed Central

    Tiadi, Aimé; Seassau, Magali; Gerard, Christophe-Loïc; Bucci, Maria Pia

    2016-01-01

    The aim of this study was to further explore phonological visual-auditory recognition tasks in a group of fifty-six healthy children (mean age: 9.9 ± 0.3) and to compare these data to those recorded in twenty-six age-matched dyslexic children (mean age: 9.8 ± 0.2). Eye movements from both eyes were recorded using an infrared video-oculography system (MobileEBT® e(y)e BRAIN). The recognition task was performed under four conditions in which the target object was displayed either with phonologically unrelated objects (baseline condition), or with cohort or rhyme objects (cohort and rhyme conditions, respectively), or both together (rhyme + cohort condition). The percentage of the total time spent on the targets and the latency of the first saccade on the target were measured. Results in healthy children showed that the percentage of the total time spent in the baseline condition was significantly longer than in the other conditions, and that the latency of the first saccade in the cohort condition was significantly longer than in the other conditions; interestingly, the latency decreased significantly with the increasing age of the children. The developmental trend of phonological awareness was also observed in healthy children only. In contrast, we observed that for dyslexic children the total time spent on the target was similar in all four conditions tested, and also that they had similar latency values in both cohort and rhyme conditions. These findings suggest a different sensitivity to the phonological competitors between dyslexic and non-dyslexic children. Also, the eye-tracking technique provides online information about phonological awareness capabilities in children. PMID:27438352

  8. Continuous recognition of spatial and nonspatial stimuli in hippocampal-lesioned rats.

    PubMed

    Jackson-Smith, P; Kesner, R P; Chiba, A A

    1993-03-01

    The present experiments compared the performance of hippocampal-lesioned rats to control rats on a spatial continuous recognition task and an analogous nonspatial task with similar processing demands. Daily sessions for Experiment 1 involved sequential presentation of individual arms on a 12-arm radial maze. Each arm contained a Froot Loop reinforcement the first time it was presented, and latency to traverse the arm was measured. A subset of the arms were repeated, but did not contain reinforcement. Repeated arms were presented with lags ranging from 0 to 6 (0 to 6 different arm presentations occurred between the first and the repeated presentation). Difference scores were computed by subtracting the latency on first presentations from the latency on repeated presentations, and these scores were high in all rats prior to surgery, with a decreasing function across lag. There were no differences in performance following cortical control or sham surgery. However, there was a total deficit in performance following large electrolytic lesions of the hippocampus. The second experiment employed the same continuous recognition memory procedure, but used three-dimensional visual objects (toys, junk items, etc., in various shapes, sizes, and textures) as stimuli on a flat runway. As in Experiment 1, the stimuli were presented successively and latency to run to and move the object was measured. Objects were repeated with lags ranging from 0 to 4. Performance on this task following surgery did not differ from performance prior to surgery for either the control group or the hippocampal lesion group. These results provide support for Kesner's attribute model of hippocampal function in that the hippocampus is assumed to mediate data-based memory for spatial locations, but not three-dimensional visual objects.

  9. Human brain regions involved in recognizing environmental sounds.

    PubMed

    Lewis, James W; Wightman, Frederic L; Brefczynski, Julie A; Phinney, Raymond E; Binder, Jeffrey R; DeYoe, Edgar A

    2004-09-01

    To identify the brain regions preferentially involved in environmental sound recognition (comprising portions of a putative auditory 'what' pathway), we collected functional imaging data while listeners attended to a wide range of sounds, including those produced by tools, animals, liquids and dropped objects. These recognizable sounds, in contrast to unrecognizable, temporally reversed control sounds, evoked activity in a distributed network of brain regions previously associated with semantic processing, located predominantly in the left hemisphere, but also included strong bilateral activity in posterior portions of the middle temporal gyri (pMTG). Comparisons with earlier studies suggest that these bilateral pMTG foci partially overlap cortex implicated in high-level visual processing of complex biological motion and recognition of tools and other artifacts. We propose that the pMTG foci process multimodal (or supramodal) information about objects and object-associated motion, and that this may represent 'action' knowledge that can be recruited for purposes of recognition of familiar environmental sound-sources. These data also provide a functional and anatomical explanation for the symptoms of pure auditory agnosia for environmental sounds reported in human lesion studies.

  10. The Influence of Reading Expertise in Mirror-Letter Perception: Evidence from Beginning and Expert Readers

    ERIC Educational Resources Information Center

    Dunabeitia, Jon Andoni; Dimitropoulou, María; Estevez, Adelina; Carreiras, Manuel

    2013-01-01

    The visual word recognition system recruits neuronal systems originally developed for object perception which are characterized by orientation insensitivity to mirror reversals. It has been proposed that during reading acquisition beginning readers have to "unlearn" this natural tolerance to mirror reversals in order to efficiently…

  11. Infant Information Processing in Relation to Six-Year Cognitive Outcomes.

    ERIC Educational Resources Information Center

    Rose, Susan A.; And Others

    1992-01-01

    Infants' visual recognition memory (VRM) at seven months was associated with their general intelligence, language proficiency, reading and quantitative skills, and perceptual organization at six years. Infants' VRM, object permanence, and cross-modal transfer of perceptions at one year were related to their IQ and several outcomes at six years.…

  12. Application of Visual Attention in Seismic Attribute Analysis

    NASA Astrophysics Data System (ADS)

    He, M.; Gu, H.; Wang, F.

    2016-12-01

    It has been shown that seismic attributes can be used to predict reservoir properties. Combining multi-attribute analysis with geostatistics, data mining, and artificial intelligence has further advanced seismic attribute analysis. However, existing methods tend to yield multiple solutions and generalize poorly, mainly because of the complex relationship between seismic data and geological information, and partly because of the methods applied. Visual attention is a model of the human visual system's ability to concentrate rapidly on a few significant visual objects, even in a cluttered scene, and it offers good target detection and recognition capability. In our study, the targets to be predicted are treated as visual objects, and an object representation based on well data is constructed in the attribute dimensions. In the same attribute space, this representation then serves as a criterion for searching for potential targets away from the wells. The method does not predict properties by building a complicated relation between attributes and reservoir properties; instead, it matches against the criterion established beforehand. It therefore generalizes well, and the problem of multiple solutions is mitigated by defining a similarity threshold.
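    The search procedure this abstract describes — build a criterion from the attribute vectors observed at wells, then flag locations whose attribute vectors fall within a similarity threshold of that criterion — might look roughly like the sketch below. All names, the standardization step, and the distance choice are illustrative assumptions, not the authors' method.

    ```python
    import numpy as np

    def build_criterion(well_attributes):
        """Mean attribute vector (and per-attribute spread) of the known
        targets observed at the wells."""
        wells = np.asarray(well_attributes, dtype=np.float64)
        return wells.mean(axis=0), wells.std(axis=0) + 1e-12

    def find_candidates(attribute_samples, criterion, threshold=2.0):
        """Flag samples whose standardized distance to the criterion is small.

        attribute_samples : (n_samples, n_attributes) seismic attribute vectors
        criterion         : (mean, std) pair from build_criterion
        threshold         : similarity cutoff in standardized distance units
        """
        mean, std = criterion
        z = (np.asarray(attribute_samples, dtype=np.float64) - mean) / std
        dist = np.sqrt((z ** 2).mean(axis=1))
        return dist <= threshold
    ```

    Raising or lowering `threshold` directly controls how permissive the search is, which is the sense in which a similarity threshold weakens the multiple-solutions problem: only samples sufficiently close to the well-derived criterion survive.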

  13. Posterior Parietal Cortex Drives Inferotemporal Activations During Three-Dimensional Object Vision.

    PubMed

    Van Dromme, Ilse C; Premereur, Elsie; Verhoef, Bram-Ernst; Vanduffel, Wim; Janssen, Peter

    2016-04-01

    The primate visual system consists of a ventral stream, specialized for object recognition, and a dorsal visual stream, which is crucial for spatial vision and actions. However, little is known about the interactions and information flow between these two streams. We investigated these interactions within the network processing three-dimensional (3D) object information, comprising both the dorsal and ventral stream. Reversible inactivation of the macaque caudal intraparietal area (CIP) during functional magnetic resonance imaging (fMRI) reduced fMRI activations in posterior parietal cortex in the dorsal stream and, surprisingly, also in the inferotemporal cortex (ITC) in the ventral visual stream. Moreover, CIP inactivation caused a perceptual deficit in a depth-structure categorization task. CIP-microstimulation during fMRI further suggests that CIP projects via posterior parietal areas to the ITC in the ventral stream. To our knowledge, these results provide the first causal evidence for the flow of visual 3D information from the dorsal stream to the ventral stream, and identify CIP as a key area for depth-structure processing. Thus, combining reversible inactivation and electrical microstimulation during fMRI provides a detailed view of the functional interactions between the two visual processing streams.

  15. Preserved local but disrupted contextual figure-ground influences in an individual with abnormal function of intermediate visual areas

    PubMed Central

    Brooks, Joseph L.; Gilaie-Dotan, Sharon; Rees, Geraint; Bentin, Shlomo; Driver, Jon

    2012-01-01

    Visual perception depends not only on local stimulus features but also on their relationship to the surrounding stimulus context, as evident in both local and contextual influences on figure-ground segmentation. Intermediate visual areas may play a role in such contextual influences, as we tested here by examining LG, a rare case of developmental visual agnosia. LG has no evident abnormality of brain structure and functional neuroimaging showed relatively normal V1 function, but his intermediate visual areas (V2/V3) function abnormally. We found that contextual influences on figure-ground organization were selectively disrupted in LG, while local sources of figure-ground influences were preserved. Effects of object knowledge and familiarity on figure-ground organization were also significantly diminished. Our results suggest that the mechanisms mediating contextual and familiarity influences on figure-ground organization are dissociable from those mediating local influences on figure-ground assignment. The disruption of contextual processing in intermediate visual areas may play a role in the substantial object recognition difficulties experienced by LG. PMID:22947116

  16. SAVA 3: A testbed for integration and control of visual processes

    NASA Technical Reports Server (NTRS)

    Crowley, James L.; Christensen, Henrik

    1994-01-01

    The development of an experimental test-bed to investigate the integration and control of perception in a continuously operating vision system is described. The test-bed integrates a 12 axis robotic stereo camera head mounted on a mobile robot, dedicated computer boards for real-time image acquisition and processing, and a distributed system for image description. The architecture was designed to: (1) operate continuously; (2) integrate software contributions from geographically dispersed laboratories; (3) integrate description of the environment with 2D measurements, 3D models, and recognition of objects; (4) support diverse experiments in gaze control, visual servoing, navigation, and object surveillance; and (5) be dynamically reconfigurable.

  17. Recognition of visual stimuli and memory for spatial context in schizophrenic patients and healthy volunteers.

    PubMed

    Brébion, Gildas; David, Anthony S; Pilowsky, Lyn S; Jones, Hugh

    2004-11-01

    Verbal and visual recognition tasks were administered to 40 patients with schizophrenia and 40 healthy comparison subjects. The verbal recognition task consisted of discriminating between 16 target words and 16 new words. The visual recognition task consisted of discriminating between 16 target pictures (8 black-and-white and 8 color) and 16 new pictures (8 black-and-white and 8 color). Visual recognition was followed by a spatial context discrimination task in which subjects were required to remember the spatial location of the target pictures at encoding. Results showed that the recognition deficit in patients was similar for verbal and visual material. In both schizophrenic and healthy groups, men, but not women, obtained better recognition scores for the colored than for the black-and-white pictures. However, men and women similarly benefited from color to reduce spatial context discrimination errors. Patients showed a significant deficit in remembering the spatial location of the pictures, independently of accuracy in remembering the pictures themselves. These data suggest that patients are impaired in the amount of visual information that they can encode. With regard to the perceptual attributes of the stimuli, memory for spatial information appears to be affected, but not processing of color information.

  18. Matching Heard and Seen Speech: An ERP Study of Audiovisual Word Recognition

    PubMed Central

    Kaganovich, Natalya; Schumaker, Jennifer; Rowland, Courtney

    2016-01-01

    Seeing articulatory gestures while listening to speech-in-noise (SIN) significantly improves speech understanding. However, the degree of this improvement varies greatly among individuals. We examined a relationship between two distinct stages of visual articulatory processing and the SIN accuracy by combining a cross-modal repetition priming task with ERP recordings. Participants first heard a word referring to a common object (e.g., pumpkin) and then decided whether the subsequently presented visual silent articulation matched the word they had just heard. Incongruent articulations elicited a significantly enhanced N400, indicative of a mismatch detection at the pre-lexical level. Congruent articulations elicited a significantly larger LPC, indexing articulatory word recognition. Only the N400 difference between incongruent and congruent trials was significantly correlated with individuals’ SIN accuracy improvement in the presence of the talker’s face. PMID:27155219

  19. The unique role of the visual word form area in reading.

    PubMed

    Dehaene, Stanislas; Cohen, Laurent

    2011-06-01

    Reading systematically activates the left lateral occipitotemporal sulcus, at a site known as the visual word form area (VWFA). This site is reproducible across individuals/scripts, attuned to reading-specific processes, and partially selective for written strings relative to other categories such as line drawings. Lesions affecting the VWFA cause pure alexia, a selective deficit in word recognition. These findings must be reconciled with the fact that human genome evolution cannot have been influenced by such a recent and culturally variable activity as reading. Capitalizing on recent functional magnetic resonance imaging experiments, we provide strong corroborating evidence for the hypothesis that reading acquisition partially recycles a cortical territory evolved for object and face recognition, the prior properties of which influenced the form of writing systems. Copyright © 2011 Elsevier Ltd. All rights reserved.

  20. Effects of dividing attention during encoding on perceptual priming of unfamiliar visual objects.

    PubMed

    Soldan, Anja; Mangels, Jennifer A; Cooper, Lynn A

    2008-11-01

    According to the distractor-selection hypothesis (Mulligan, 2003), dividing attention during encoding reduces perceptual priming when responses to non-critical (i.e., distractor) stimuli are selected frequently and simultaneously with critical stimulus encoding. Because direct support for this hypothesis comes exclusively from studies using familiar word stimuli, the present study tested whether the predictions of the distractor-selection hypothesis extend to perceptual priming of unfamiliar visual objects using the possible/impossible object decision test. Consistent with the distractor-selection hypothesis, Experiments 1 and 2 found no reduction in priming when the non-critical stimuli were presented infrequently and non-synchronously with the critical target stimuli, even though explicit recognition memory was reduced. In Experiment 3, non-critical stimuli were presented frequently and simultaneously during encoding of critical stimuli; however, no decrement in priming was detected, even when encoding time was reduced. These results suggest that priming in the possible/impossible object decision test is relatively immune to reductions in central attention and that not all aspects of the distractor-selection hypothesis generalise to priming of unfamiliar visual objects. Implications for theoretical models of object decision priming are discussed.

  2. Recognition intent and visual word recognition.

    PubMed

    Wang, Man-Ying; Ching, Chi-Le

    2009-03-01

    This study adopted a change detection task to investigate whether and how recognition intent affects the construction of orthographic representations in visual word recognition. Chinese readers (Experiment 1-1) and nonreaders (Experiment 1-2) detected color changes in radical components of Chinese characters. An explicit recognition demand was imposed in Experiment 2 by an additional recognition task. When recognition was implicit, a bias favoring the radical location informative of character identity was found in Chinese readers (Experiment 1-1) but not in nonreaders (Experiment 1-2). With explicit recognition demands, the effect of radical location interacted with radical function and word frequency (Experiment 2). An estimate of identification performance under implicit recognition was derived in Experiment 3. These findings reflect the joint influence of recognition intent and orthographic regularity in shaping readers' orthographic representations. Implications for the role of visual attention in word recognition are also discussed.

  3. On Assisting a Visual-Facial Affect Recognition System with Keyboard-Stroke Pattern Information

    NASA Astrophysics Data System (ADS)

    Stathopoulou, I.-O.; Alepis, E.; Tsihrintzis, G. A.; Virvou, M.

    Towards realizing a multimodal affect recognition system, we are considering the advantages of assisting a visual-facial expression recognition system with keyboard-stroke pattern information. Our work is based on the assumption that the visual-facial and keyboard modalities are complementary to each other and that their combination can significantly improve the accuracy in affective user models. Specifically, we present and discuss the development and evaluation process of two corresponding affect recognition subsystems, with emphasis on the recognition of 6 basic emotional states, namely happiness, sadness, surprise, anger and disgust as well as the emotion-less state which we refer to as neutral. We find that emotion recognition by the visual-facial modality can be aided greatly by keyboard-stroke pattern information and the combination of the two modalities can lead to better results towards building a multimodal affect recognition system.

  4. Bidirectional Modulation of Recognition Memory

    PubMed Central

    Ho, Jonathan W.; Poeta, Devon L.; Jacobson, Tara K.; Zolnik, Timothy A.; Neske, Garrett T.; Connors, Barry W.

    2015-01-01

    Perirhinal cortex (PER) has a well established role in the familiarity-based recognition of individual items and objects. For example, animals and humans with perirhinal damage are unable to distinguish familiar from novel objects in recognition memory tasks. In the normal brain, perirhinal neurons respond to novelty and familiarity by increasing or decreasing firing rates. Recent work also implicates oscillatory activity in the low-beta and low-gamma frequency bands in sensory detection, perception, and recognition. Using optogenetic methods in a spontaneous object exploration (SOR) task, we altered recognition memory performance in rats. In the SOR task, normal rats preferentially explore novel images over familiar ones. We modulated exploratory behavior in this task by optically stimulating channelrhodopsin-expressing perirhinal neurons at various frequencies while rats looked at novel or familiar 2D images. Stimulation at 30–40 Hz during looking caused rats to treat a familiar image as if it were novel by increasing time looking at the image. Stimulation at 30–40 Hz was not effective in increasing exploration of novel images. Stimulation at 10–15 Hz caused animals to treat a novel image as familiar by decreasing time looking at the image, but did not affect looking times for images that were already familiar. We conclude that optical stimulation of PER at different frequencies can alter visual recognition memory bidirectionally. SIGNIFICANCE STATEMENT Recognition of novelty and familiarity are important for learning, memory, and decision making. Perirhinal cortex (PER) has a well established role in the familiarity-based recognition of individual items and objects, but how novelty and familiarity are encoded and transmitted in the brain is not known. Perirhinal neurons respond to novelty and familiarity by changing firing rates, but recent work suggests that brain oscillations may also be important for recognition. 
In this study, we showed that stimulation of the PER could increase or decrease exploration of novel and familiar images depending on the frequency of stimulation. Our findings suggest that optical stimulation of PER at specific frequencies can predictably alter recognition memory. PMID:26424881

  5. A self-teaching image processing and voice-recognition-based, intelligent and interactive system to educate visually impaired children

    NASA Astrophysics Data System (ADS)

    Iqbal, Asim; Farooq, Umar; Mahmood, Hassan; Asad, Muhammad Usman; Khan, Akrama; Atiq, Hafiz Muhammad

    2010-02-01

    A self-teaching, image-processing and voice-recognition-based system is developed to educate visually impaired children, chiefly in their primary education. The system comprises a computer, a vision camera, an ear speaker, and a microphone. The camera, attached to the computer, is mounted on the ceiling opposite (at the required angle to) the desk on which the book is placed. Sample images and voices, in the form of instructions and commands for English and Urdu alphabets, numeric digits, operators, and shapes, are stored in the database. A blind child first reads an embossed character (object) with his or her fingers, then speaks the answer (the name of the character, shape, etc.) into the microphone. On the child's voice command, received by the microphone, an image is captured by the camera and processed by a MATLAB® program developed with the Image Acquisition and Image Processing toolboxes, which generates a response or the required set of instructions for the child via the ear speaker, resulting in self-education of a visually impaired child. The speech recognition program is also developed in MATLAB®, with the Data Acquisition and Signal Processing toolboxes; it records and processes the child's spoken commands.

  6. Association of impaired facial affect recognition with basic facial and visual processing deficits in schizophrenia.

    PubMed

    Norton, Daniel; McBain, Ryan; Holt, Daphne J; Ongur, Dost; Chen, Yue

    2009-06-15

    Impaired emotion recognition has been reported in schizophrenia, yet the nature of this impairment is not completely understood. Recognition of facial emotion depends on processing affective and nonaffective facial signals, as well as basic visual attributes. We examined whether and how poor facial emotion recognition in schizophrenia is related to basic visual processing and nonaffective face recognition. Schizophrenia patients (n = 32) and healthy control subjects (n = 29) performed emotion discrimination, identity discrimination, and visual contrast detection tasks, where the emotionality, distinctiveness of identity, or visual contrast was systematically manipulated. Subjects determined which of two presentations in a trial contained the target: the emotional face for emotion discrimination, a specific individual for identity discrimination, and a sinusoidal grating for contrast detection. Patients had significantly higher thresholds (worse performance) than control subjects for discriminating both fearful and happy faces. Furthermore, patients' poor performance in fear discrimination was predicted by performance in visual detection and face identity discrimination. Schizophrenia patients require greater emotional signal strength to discriminate fearful or happy face images from neutral ones. Deficient emotion recognition in schizophrenia does not appear to be determined solely by affective processing but is also linked to the processing of basic visual and facial information.

  7. Identification and detection of simple 3D objects with severely blurred vision.

    PubMed

    Kallie, Christopher S; Legge, Gordon E; Yu, Deyue

    2012-12-05

    Detecting and recognizing three-dimensional (3D) objects is an important component of the visual accessibility of public spaces for people with impaired vision. The present study investigated the impact of environmental factors and object properties on the recognition of objects by subjects who viewed physical objects with severely reduced acuity. The experiment was conducted in an indoor testing space. We examined detection and identification of simple convex objects by normally sighted subjects wearing diffusing goggles that reduced effective acuity to 20/900. We used psychophysical methods to examine the effect on performance of important environmental variables: viewing distance (from 10-24 feet, or 3.05-7.32 m) and illumination (overhead fluorescent and artificial window), and object variables: shape (boxes and cylinders), size (heights from 2-6 feet, or 0.61-1.83 m), and color (gray and white). Object identification was significantly affected by distance, color, height, and shape, as well as interactions between illumination, color, and shape. A stepwise regression analysis showed that 64% of the variability in identification could be explained by object contrast values (58%) and object visual angle (6%). When acuity is severely limited, illumination, distance, color, height, and shape influence the identification and detection of simple 3D objects. These effects can be explained in large part by the impact of these variables on object contrast and visual angle. Basic design principles for improving object visibility are discussed.
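    The regression finding above, that object contrast and visual angle explain most of the identification variability, rests on the standard geometric relation between an object's physical size and the angle it subtends at the eye. Below is a minimal sketch of that relation, using heights and distances drawn from the study's stated ranges; the specific pairings are illustrative, not the authors' stimuli.

```python
import math

def visual_angle_deg(object_height_m: float, distance_m: float) -> float:
    """Angle subtended at the eye by an object of the given height and distance."""
    return math.degrees(2 * math.atan(object_height_m / (2 * distance_m)))

# Heights (0.61-1.83 m) and distances (3.05-7.32 m) span the study's ranges;
# the largest, nearest object subtends a far greater angle than the smallest,
# farthest one.
angle_near = visual_angle_deg(1.83, 3.05)
angle_far = visual_angle_deg(0.61, 7.32)
```

    At an effective acuity of 20/900, such differences in subtended angle plausibly dominate detectability, consistent with the regression result.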

  8. Emotion Recognition and Visual-Scan Paths in Fragile X Syndrome

    ERIC Educational Resources Information Center

    Shaw, Tracey A.; Porter, Melanie A.

    2013-01-01

    This study investigated emotion recognition abilities and visual scanning of emotional faces in 16 Fragile X syndrome (FXS) individuals compared to 16 chronological-age and 16 mental-age matched controls. The relationships between emotion recognition, visual scan-paths and symptoms of social anxiety, schizotypy and autism were also explored.…

  9. Comparing the visual spans for faces and letters

    PubMed Central

    He, Yingchen; Scholz, Jennifer M.; Gage, Rachel; Kallie, Christopher S.; Liu, Tingting; Legge, Gordon E.

    2015-01-01

    The visual span—the number of adjacent text letters that can be reliably recognized on one fixation—has been proposed as a sensory bottleneck that limits reading speed (Legge, Mansfield, & Chung, 2001). Like reading, searching for a face is an important daily task that involves pattern recognition. Is there a similar limitation on the number of faces that can be recognized in a single fixation? Here we report on a study in which we measured and compared the visual-span profiles for letter and face recognition. A serial two-stage model for pattern recognition was developed to interpret the data. The first stage is characterized by factors limiting recognition of isolated letters or faces, and the second stage represents the interfering effect of nearby stimuli on recognition. Our findings show that the visual span for faces is smaller than that for letters. Surprisingly, however, when differences in first-stage processing for letters and faces are accounted for, the two visual spans become nearly identical. These results suggest that the concept of visual span may describe a common sensory bottleneck that underlies different types of pattern recognition. PMID:26129858

  10. Exploiting core knowledge for visual object recognition.

    PubMed

    Schurgin, Mark W; Flombaum, Jonathan I

    2017-03-01

    Humans recognize thousands of objects, and with relative tolerance to variable retinal inputs. The acquisition of this ability is not fully understood, and it remains an area in which artificial systems have yet to surpass people. We sought to investigate the memory process that supports object recognition. Specifically, we investigated the association of inputs that co-occur over short periods of time. We tested the hypothesis that human perception exploits expectations about object kinematics to limit the scope of association to inputs that are likely to have the same token as a source. In several experiments we exposed participants to images of objects, and we then tested recognition sensitivity. Using motion, we manipulated whether successive encounters with an image took place through kinematics that implied the same or a different token as the source of those encounters. Images were injected with noise, or shown at varying orientations, and we included 2 manipulations of motion kinematics. Across all experiments, memory performance was better for images that had been previously encountered with kinematics that implied a single token. A model-based analysis similarly showed greater memory strength when images were shown via kinematics that implied a single token. These results suggest that constraints from physics are built into the mechanisms that support memory about objects. Such constraints-often characterized as 'Core Knowledge'-are known to support perception and cognition broadly, even in young infants. But they have never been considered as a mechanism for memory with respect to recognition. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  11. Multi-Touch Tabletop System Using Infrared Image Recognition for User Position Identification.

    PubMed

    Suto, Shota; Watanabe, Toshiya; Shibusawa, Susumu; Kamada, Masaru

    2018-05-14

    A tabletop system can facilitate multi-user collaboration in a variety of settings, including small meetings, group work, and education and training exercises. The ability to identify the users touching the table and their positions can promote collaborative work among participants, so methods have been studied that involve attaching sensors to the table, chairs, or to the users themselves. An effective method of recognizing user actions without placing a burden on the user would be some type of visual process, so the development of a method that processes multi-touch gestures by visual means is desired. This paper describes the development of a multi-touch tabletop system using infrared image recognition for user position identification and presents the results of touch-gesture recognition experiments and a system-usability evaluation. Using an inexpensive FTIR touch panel and infrared light, this system picks up the touch areas and the shadow area of the user's hand by an infrared camera to establish an association between the hand and table touch points and estimate the position of the user touching the table. The multi-touch gestures prepared for this system include an operation to change the direction of an object to face the user and a copy operation in which two users generate duplicates of an object. The system-usability evaluation revealed that prior learning was easy and that system operations could be easily performed.
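    The association between table touch points and the hand shadow entering from a user's edge can be illustrated with a nearest-centroid assignment. This is only a sketch of the general idea, with made-up coordinates; the paper's actual infrared image-processing pipeline is not reproduced here.

```python
import math

def assign_touches(touch_points, shadow_centroids):
    """Assign each touch point to the nearest hand-shadow centroid,
    which stands in for the user whose hand cast that shadow."""
    return {i: min(range(len(shadow_centroids)),
                   key=lambda j: math.dist(p, shadow_centroids[j]))
            for i, p in enumerate(touch_points)}

# Two users at opposite table edges (normalized table coordinates, hypothetical)
touches = [(0.2, 0.1), (0.8, 0.9)]
shadows = [(0.1, 0.0), (0.9, 1.0)]
owners = assign_touches(touches, shadows)
```

    Once each touch is attributed to a shadow, the table edge that the shadow enters from gives the user's position, enabling user-specific gestures such as rotating an object to face its owner.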

  13. Age-related differences in listening effort during degraded speech recognition

    PubMed Central

    Ward, Kristina M.; Shen, Jing; Souza, Pamela E.; Grieco-Calub, Tina M.

    2016-01-01

    Objectives The purpose of the current study was to quantify age-related differences in executive control as it relates to dual-task performance, which is thought to represent listening effort, during degraded speech recognition. Design Twenty-five younger adults (18–24 years) and twenty-one older adults (56–82 years) completed a dual-task paradigm that consisted of a primary speech recognition task and a secondary visual monitoring task. Sentence material in the primary task was either unprocessed or spectrally degraded into 8, 6, or 4 spectral channels using noise-band vocoding. Performance on the visual monitoring task was assessed by the accuracy and reaction time of participants’ responses. Performance on the primary and secondary task was quantified in isolation (i.e., single task) and during the dual-task paradigm. Participants also completed a standardized psychometric measure of executive control, including attention and inhibition. Statistical analyses were implemented to evaluate changes in listeners’ performance on the primary and secondary tasks (1) per condition (unprocessed vs. vocoded conditions); (2) per task (baseline vs. dual task); and (3) per group (younger vs. older adults). Results Speech recognition declined with increasing spectral degradation for both younger and older adults when they performed the task in isolation or concurrently with the visual monitoring task. Older adults were slower and less accurate than younger adults on the visual monitoring task when performed in isolation, which paralleled age-related differences in standardized scores of executive control. When compared to single-task performance, older adults experienced greater declines in secondary-task accuracy, but not reaction time, than younger adults. Furthermore, results revealed that age-related differences in executive control significantly contributed to age-related differences on the visual monitoring task during the dual-task paradigm. 
Conclusions Older adults experienced significantly greater declines in secondary-task accuracy during degraded speech recognition than younger adults. These findings are interpreted as suggesting that older listeners expended greater listening effort than younger listeners, and may be partially attributed to age-related differences in executive control. PMID:27556526
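    One common way to quantify the dual-task declines described above is the proportional dual-task cost: the drop in secondary-task accuracy under load, relative to single-task baseline. A sketch with hypothetical accuracies, not the study's data:

```python
def dual_task_cost(single_task_acc: float, dual_task_acc: float) -> float:
    """Proportional cost: fraction of baseline accuracy lost under dual-task load."""
    return (single_task_acc - dual_task_acc) / single_task_acc

# Hypothetical visual-monitoring accuracies, not values from the study
cost_younger = dual_task_cost(0.95, 0.90)
cost_older = dual_task_cost(0.95, 0.80)
```

    Under this interpretation, a larger proportional cost for the older group is read as greater listening effort expended on the degraded speech.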

  14. Emmert's Law and the moon illusion.

    PubMed

    Gregory, Richard L

    2008-01-01

    A cognitive account is offered of puzzling though well-known phenomena, including the increased size of afterimages with greater distance (Emmert's Law) and the increased size of the moon near the horizon (the Moon Illusion). Various classical distortion illusions are explained by size scaling when it is applied inappropriately to distance, with 'flipping' depth ambiguity used to separate bottom-up and top-down visual scaling. Helmholtz's general principle is discussed with simpler wording - that retinal images are attributed to objects - for object recognition and spatial vision.
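    Emmert's Law can be stated quantitatively: for a fixed retinal angle, perceived linear size grows in proportion to perceived distance. A small numeric illustration (the angle and distances are arbitrary values):

```python
import math

def perceived_size(retinal_angle_deg: float, perceived_distance_m: float) -> float:
    """Emmert's Law: a fixed retinal angle implies perceived size
    proportional to perceived distance."""
    theta = math.radians(retinal_angle_deg)
    return 2 * perceived_distance_m * math.tan(theta / 2)

# The same 5-degree afterimage looks twice as large on a surface twice as far away.
near = perceived_size(5.0, 1.0)
far = perceived_size(5.0, 2.0)
```

    Misapplied size scaling of this kind, triggered by distance cues along the horizon, is one account of the Moon Illusion.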

  15. A role for calcium-calmodulin-dependent protein kinase II in the consolidation of visual object recognition memory.

    PubMed

    Tinsley, C J; Narduzzo, K E; Ho, J W; Barker, G R; Brown, M W; Warburton, E C

    2009-09-01

    The aim was to investigate the role of calcium-calmodulin-dependent protein kinase (CAMK)II in object recognition memory. The performance of rats in a preferential object recognition test was examined after local infusion of the CAMKII inhibitors KN-62 or autocamtide-2-related inhibitory peptide (AIP) into the perirhinal cortex. KN-62 or AIP infused after acquisition impaired memory tested at 24 h, indicating an involvement of CAMKII in the consolidation of recognition memory. Memory was impaired when KN-62 was infused at 20 min after acquisition or when AIP was infused at 20, 40, 60 or 100 min after acquisition. The time-course of CAMKII activation in rats was further examined by immunohistochemical staining for phospho-CAMKII(Thr286)alpha at 10, 40, 70 and 100 min following the viewing of novel and familiar images. At 70 min, processing novel images resulted in more phospho-CAMKII(Thr286)alpha-stained neurons in the perirhinal cortex than did the processing of familiar images, consistent with the viewing of novel images increasing the activity of CAMKII at this time. This difference was eliminated by prior infusion of AIP. These findings establish that CAMKII is active within the perirhinal region between approximately 20 and 100 min following learning and then returns to baseline. Thus, increased CAMKII activity is essential for the consolidation of long-term object recognition memory, but continuation of that increased activity throughout the 24 h memory delay is not necessary for maintenance of the memory.

  16. Basic level scene understanding: categories, attributes and structures

    PubMed Central

    Xiao, Jianxiong; Hays, James; Russell, Bryan C.; Patterson, Genevieve; Ehinger, Krista A.; Torralba, Antonio; Oliva, Aude

    2013-01-01

    A longstanding goal of computer vision is to build a system that can automatically understand a 3D scene from a single image. This requires extracting semantic concepts and 3D information from 2D images which can depict an enormous variety of environments that comprise our visual world. This paper summarizes our recent efforts toward these goals. First, we describe the richly annotated SUN database which is a collection of annotated images spanning 908 different scene categories with object, attribute, and geometric labels for many scenes. This database allows us to systematically study the space of scenes and to establish a benchmark for scene and object recognition. We augment the categorical SUN database with 102 scene attributes for every image and explore attribute recognition. Finally, we present an integrated system to extract the 3D structure of the scene and objects depicted in an image. PMID:24009590

  17. Pose estimation of industrial objects towards robot operation

    NASA Astrophysics Data System (ADS)

    Niu, Jie; Zhou, Fuqiang; Tan, Haishu; Cao, Yu

    2017-10-01

    With the advantages of wide range, non-contact operation, and high flexibility, visual estimation of target pose has been widely applied in modern industry, robot guidance, and other engineering practice. However, owing to complicated industrial environments, outside interference, a lack of distinctive object features, camera restrictions, and other limitations, visual pose estimation still faces many challenges. Addressing these problems, a pose estimation method for industrial objects is developed based on 3D models of the targets. By matching the extracted shape features of objects against a prior 3D model database of targets, the method recognizes the target; the pose of the object can then be determined from a monocular vision measuring model. The experimental results show that the method can estimate the position of rigid objects from poor image information, and it provides a guiding basis for the operation of industrial robots.
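    The monocular measuring model mentioned above ultimately reduces to the pinhole relation between a known model dimension and its measured image size. A minimal sketch; the focal length and part dimensions are assumed values, not the paper's calibration.

```python
def distance_from_known_height(focal_px: float,
                               real_height_m: float,
                               image_height_px: float) -> float:
    """Pinhole camera relation Z = f * H / h, linking a known 3D model
    dimension (H) and its measured image size (h) to object distance (Z)."""
    return focal_px * real_height_m / image_height_px

# A 0.10 m part imaged at 200 px by a camera with an 800 px focal length
z = distance_from_known_height(800.0, 0.10, 200.0)  # 0.4 m
```

    A full 6-DOF pose would additionally recover orientation, for example by matching several model points to their image projections, which this sketch omits.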

  18. What can neuromorphic event-driven precise timing add to spike-based pattern recognition?

    PubMed

    Akolkar, Himanshu; Meyer, Cedric; Clady, Xavier; Marre, Olivier; Bartolozzi, Chiara; Panzeri, Stefano; Benosman, Ryad

    2015-03-01

    This letter introduces a study to precisely measure what an increase in spike timing precision can add to spike-driven pattern recognition algorithms. The concept of generating spikes from images by converting gray levels into spike timings is currently at the basis of almost every spike-based model of biological visual systems. The use of images naturally leads to generating artificial, incorrect, and redundant spike timings and, more important, contradicts biological findings indicating that visual processing is massively parallel and asynchronous with high temporal resolution. A new concept for acquiring visual information through pixel-individual asynchronous level-crossing sampling has been proposed in a recent generation of asynchronous neuromorphic visual sensors. Unlike conventional cameras, these sensors acquire data not at fixed points in time for the entire array but at fixed amplitude changes of their input, yielding output that is optimally sparse in space and time: each pixel is sampled individually and precisely timed only when new (previously unknown) information is available (event based). This letter uses the high temporal resolution spiking output of neuromorphic event-based visual sensors to show that lowering time precision degrades performance on several recognition tasks, specifically when reaching the conventional range of machine vision acquisition frequencies (30-60 Hz). The use of information theory to characterize separability between classes for each temporal resolution shows that high temporal acquisition provides up to 70% more information than conventional spikes generated from frame-based acquisition as used in standard artificial vision, thus drastically increasing the separability between classes of objects. Experiments on real data show that the amount of information loss is correlated with temporal precision. 
Our information-theoretic study highlights the potentials of neuromorphic asynchronous visual sensors for both practical applications and theoretical investigations. Moreover, it suggests that representing visual information as a precise sequence of spike times as reported in the retina offers considerable advantages for neuro-inspired visual computations.
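    The image-to-spike conversion criticized above is typically a latency code: brighter pixels fire earlier within a fixed window. A minimal sketch of that convention (the 50 ms window is an arbitrary choice, not a value from the letter):

```python
def latency_code(gray_levels, t_max_ms=50.0):
    """Map 8-bit gray levels to spike times: brighter pixels spike earlier."""
    return [t_max_ms * (1.0 - g / 255.0) for g in gray_levels]

# White fires immediately; black fires last, at the end of the window.
times = latency_code([255, 128, 0])
```

    Event-based sensors avoid this construction entirely by timestamping actual per-pixel luminance changes, which is the source of the precision advantage the letter quantifies.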

  19. Spatiotemporal proximity effects in visual short-term memory examined by target-nontarget analysis.

    PubMed

    Sapkota, Raju P; Pardhan, Shahina; van der Linde, Ian

    2016-08-01

    Visual short-term memory (VSTM) is a limited-capacity system that holds a small number of objects online simultaneously, implying that competition for limited storage resources occurs (Phillips, 1974). How the spatial and temporal proximity of stimuli affects this competition is unclear. In this 2-experiment study, we examined the effect of the spatial and temporal separation of real-world memory targets and erroneously selected nontarget items examined during location-recognition and object-recall tasks. In Experiment 1 (the location-recognition task), our test display comprised either the picture or name of 1 previously examined memory stimulus (rendered above as the stimulus-display area), together with numbered square boxes at each of the memory-stimulus locations used in that trial. Participants were asked to report the number inside the square box corresponding to the location at which the cued object was originally presented. In Experiment 2 (the object-recall task), the test display comprised a single empty square box presented at 1 memory-stimulus location. Participants were asked to report the name of the object presented at that location. In both experiments, nontarget objects that were spatially and temporally proximal to the memory target were confused more often than nontarget objects that were spatially and temporally distant (i.e., a spatiotemporal proximity effect); this effect generalized across memory tasks, and the object feature (picture or name) that cued the test-display memory target. Our findings are discussed in terms of spatial and temporal confusion "fields" in VSTM, wherein objects occupy diffuse loci in a spatiotemporal coordinate system, wherein neighboring locations are more susceptible to confusion. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  1. Role of Temporal Processing Stages by Inferior Temporal Neurons in Facial Recognition

    PubMed Central

    Sugase-Miyamoto, Yasuko; Matsumoto, Narihisa; Kawano, Kenji

    2011-01-01

    In this review, we focus on the role of temporal stages of encoded facial information in the visual system, which might enable the efficient determination of species, identity, and expression. Facial recognition is an important function of our brain and is known to be processed in the ventral visual pathway, where visual signals are processed through areas V1, V2, V4, and the inferior temporal (IT) cortex. In the IT cortex, neurons show selective responses to complex visual images such as faces, and at each stage along the pathway the stimulus selectivity of the neural responses becomes sharper, particularly in the later portion of the responses. In the IT cortex of the monkey, facial information is represented by different temporal stages of neural responses, as shown in our previous study: the initial transient response of face-responsive neurons represents information about global categories, i.e., human vs. monkey vs. simple shapes, whilst the later portion of these responses represents information about detailed facial categories, i.e., expression and/or identity. This suggests that the temporal stages of the neuronal firing pattern play an important role in the coding of visual stimuli, including faces. This type of coding may be a plausible mechanism underlying the temporal dynamics of recognition, including the process of detection/categorization followed by the identification of objects. Recent single-unit studies in monkeys have also provided evidence consistent with the important role of the temporal stages of encoded facial information. For example, view-invariant facial identity information is represented in the response at a later period within a region of face-selective neurons. Consistent with these findings, temporally modulated neural activity has also been observed in human studies. 
These results suggest a close correlation between the temporal processing stages of facial information by IT neurons and the temporal dynamics of face recognition. PMID:21734904

  2. TU-C-17A-03: An Integrated Contour Evaluation Software Tool Using Supervised Pattern Recognition for Radiotherapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, H; Tan, J; Kavanaugh, J

    Purpose: Radiotherapy (RT) contours delineated either manually or semiautomatically require verification before clinical usage. Manual evaluation is very time consuming. A new integrated software tool using supervised pattern recognition of contours was thus developed to facilitate this process. Methods: The contouring tool was developed using an object-oriented programming language, C#, and application programming interfaces, e.g. the visualization toolkit (VTK). The C# language served as the tool design basis. The Accord.Net scientific computing libraries were utilized for the required statistical data processing and pattern recognition, while VTK was used to build and render 3-D mesh models from critical RT structures in real-time and 360° visualization. Principal component analysis (PCA) was used for system self-updating of geometry variations of normal structures, based on physician-approved RT contours as a training dataset. An in-house supervised PCA-based contour recognition method was used for automatically evaluating contour normality/abnormality. The function for reporting the contour evaluation results was implemented using C# and Windows Form Designer. Results: The software input was RT simulation images and RT structures from commercial clinical treatment planning systems. Several abilities were demonstrated: automatic assessment of RT contours, file loading/saving of various modality medical images and RT contours, and generation/visualization of 3-D images and anatomical models. Moreover, it supported the 360° rendering of the RT structures in a multi-slice view, which allows physicians to visually check and edit abnormally contoured structures. Conclusion: This new software integrates the supervised learning framework with image processing and graphical visualization modules for RT contour verification.
This tool has great potential to facilitate treatment planning: its automatic contour evaluation module can spare physicians and dosimetrists unnecessary manual verification. In addition, its nature as a compact and stand-alone tool allows for future extensibility to include additional functions for physicians’ clinical needs.
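
    The PCA-based normality check described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' C#/Accord.Net implementation: physician-approved contours (here, toy feature vectors) define a low-dimensional PCA subspace, and a new contour is flagged as abnormal when its reconstruction error exceeds what was seen in training.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA on rows of X (one flattened contour feature vector per row)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def reconstruction_error(x, mean, components):
    """Distance between a contour vector and its projection onto the PCA subspace."""
    xc = x - mean
    return np.linalg.norm(xc - components.T @ (components @ xc))

# Toy training data standing in for approved contours: 16-D feature vectors
rng = np.random.default_rng(0)
base = np.sin(np.linspace(0, np.pi, 16))
train = base + 0.05 * rng.standard_normal((50, 16))

mean, comps = fit_pca(train, n_components=3)
threshold = max(reconstruction_error(x, mean, comps) for x in train)

normal = base + 0.05 * rng.standard_normal(16)       # resembles training contours
abnormal = base + 0.8 * rng.standard_normal(16)      # badly deformed contour
print(f"normal error:   {reconstruction_error(normal, mean, comps):.3f}")
print(f"abnormal error: {reconstruction_error(abnormal, mean, comps):.3f} "
      f"(threshold {threshold:.3f})")
```

    The deformed contour's reconstruction error lands far outside the subspace spanned by approved contours, which is the signal such a tool can surface for manual review.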

  3. A Joint Gaussian Process Model for Active Visual Recognition with Expertise Estimation in Crowdsourcing

    PubMed Central

    Long, Chengjiang; Hua, Gang; Kapoor, Ashish

    2015-01-01

    We present a noise-resilient probabilistic model for active learning of a Gaussian process classifier from crowds, i.e., a set of noisy labelers. It explicitly models both the overall label noise and the expertise level of each individual labeler with two levels of flip models. Expectation propagation is adopted for efficient approximate Bayesian inference of our probabilistic model for classification, based on which a generalized EM algorithm is derived to estimate both the global label noise and the expertise of each individual labeler. The probabilistic nature of our model immediately allows the adoption of prediction entropy for active selection of data samples to be labeled, and active selection of high-quality labelers, based on their estimated expertise, to label the data. We apply the proposed model to four visual recognition tasks, i.e., object category recognition, multi-modal activity recognition, gender recognition, and fine-grained classification, on four datasets with real crowd-sourced labels from Amazon Mechanical Turk. The experiments clearly demonstrate the efficacy of the proposed model. In addition, we extend the proposed model with the Predictive Active Set Selection Method to speed up the active learning system; its efficacy is verified by experiments on the first three datasets. The results show that the extended model not only preserves higher accuracy but also achieves higher efficiency. PMID:26924892
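
    The entropy-driven active selection the abstract describes can be illustrated with a small sketch. This is a hedged stand-in, not the paper's code: the function names and the simplified two-level flip model below are assumptions made for illustration.

```python
import numpy as np

def predictive_entropy(p):
    """Binary predictive entropy (in nats) of positive-class probabilities p."""
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def select_most_uncertain(probs):
    """Active selection: index of the unlabeled sample with maximal entropy."""
    return int(np.argmax(predictive_entropy(probs)))

def flip(p, eps_global, eps_labeler):
    """Two-level flip noise: the latent label passes a global flip, then a
    labeler-specific flip, mirroring the paper's two noise levels."""
    q = (1 - eps_global) * p + eps_global * (1 - p)
    return (1 - eps_labeler) * q + eps_labeler * (1 - q)

probs = [0.95, 0.40, 0.52, 0.10]          # classifier's current predictions
print(select_most_uncertain(probs))       # 2: p=0.52 is closest to chance
print(round(flip(1.0, 0.1, 0.2), 3))      # 0.74
```

    Samples near p = 0.5 carry maximal entropy and are queried first; the flip model shows how even a certain latent label (p = 1.0) reaches the crowd degraded by both noise levels.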

  4. View-tolerant face recognition and Hebbian learning imply mirror-symmetric neural tuning to head orientation

    PubMed Central

    Leibo, Joel Z.; Liao, Qianli; Freiwald, Winrich A.; Anselmi, Fabio; Poggio, Tomaso

    2017-01-01

    SUMMARY The primate brain contains a hierarchy of visual areas, dubbed the ventral stream, which rapidly computes object representations that are both specific for object identity and robust against identity-preserving transformations like depth-rotations [1, 2]. Current computational models of object recognition, including recent deep learning networks, generate these properties through a hierarchy of alternating selectivity-increasing filtering and tolerance-increasing pooling operations, similar to simple-complex cells operations [3, 4, 5, 6]. Here we prove that a class of hierarchical architectures and a broad set of biologically plausible learning rules generate approximate invariance to identity-preserving transformations at the top level of the processing hierarchy. However, all past models tested failed to reproduce the most salient property of an intermediate representation of a three-level face-processing hierarchy in the brain: mirror-symmetric tuning to head orientation [7]. Here we demonstrate that one specific biologically-plausible Hebb-type learning rule generates mirror-symmetric tuning to bilaterally symmetric stimuli like faces at intermediate levels of the architecture and show why it does so. Thus the tuning properties of individual cells inside the visual stream appear to result from group properties of the stimuli they encode and to reflect the learning rules that sculpted the information-processing system within which they reside. PMID:27916522

  5. Preschoolers Benefit From Visually Salient Speech Cues

    PubMed Central

    Holt, Rachael Frush

    2015-01-01

    Purpose This study explored visual speech influence in preschoolers using 3 developmentally appropriate tasks that vary in perceptual difficulty and task demands. The authors also examined developmental differences in the ability to use visually salient speech cues and visual phonological knowledge. Method Twelve adults and 27 typically developing 3- and 4-year-old children completed 3 audiovisual (AV) speech integration tasks: matching, discrimination, and recognition. The authors compared AV benefit for visually salient and less visually salient speech discrimination contrasts and assessed the visual saliency of consonant confusions in auditory-only and AV word recognition. Results Four-year-olds and adults demonstrated visual influence on all measures. Three-year-olds demonstrated visual influence on speech discrimination and recognition measures. All groups demonstrated greater AV benefit for the visually salient discrimination contrasts. AV recognition benefit in 4-year-olds and adults depended on the visual saliency of speech sounds. Conclusions Preschoolers can demonstrate AV speech integration. Their AV benefit results from efficient use of visually salient speech cues. Four-year-olds, but not 3-year-olds, used visual phonological knowledge to take advantage of visually salient speech cues, suggesting possible developmental differences in the mechanisms of AV benefit. PMID:25322336

  6. Top-down contextual knowledge guides visual attention in infancy.

    PubMed

    Tummeltshammer, Kristen; Amso, Dima

    2017-10-26

    The visual context in which an object or face resides can provide useful top-down information for guiding attention orienting, object recognition, and visual search. Although infants have demonstrated sensitivity to covariation in spatial arrays, it is presently unclear whether they can use rapidly acquired contextual knowledge to guide attention during visual search. In this eye-tracking experiment, 6- and 10-month-old infants searched for a target face hidden among colorful distracter shapes. Targets appeared in Old or New visual contexts, depending on whether the visual search arrays (defined by the spatial configuration, shape and color of component items in the search display) were repeated or newly generated throughout the experiment. Targets in Old contexts appeared in the same location within the same configuration, such that context covaried with target location. Both 6- and 10-month-olds successfully distinguished between Old and New contexts, exhibiting faster search times, fewer looks at distracters, and more anticipation of targets when contexts repeated. This initial demonstration of contextual cueing effects in infants indicates that they can use top-down information to facilitate orienting during memory-guided visual search. © 2017 John Wiley & Sons Ltd.

  7. Investigation into the visual perceptive ability of anaesthetists during ultrasound-guided interscalene and femoral blocks conducted on soft embalmed cadavers: a randomised single-blind study.

    PubMed

    Mustafa, A; Seeley, J; Munirama, S; Columb, M; McKendrick, M; Schwab, A; Corner, G; Eisma, R; Mcleod, G

    2018-04-01

    Errors may occur during regional anaesthesia whilst searching for nerves, needle tips, and test doses. Poor visual search impacts on decision making, clinical intervention, and patient safety. We conducted a randomised single-blind study in a single university hospital. Twenty trainees and two consultants examined the paired B-mode and fused B-mode and elastography video recordings of 24 interscalene and 24 femoral blocks conducted on two soft embalmed cadavers. Perineural injection was randomised equally to 0.25, 0.5, and 1.0 ml volumes. Tissue displacement perceived on both imaging modalities was defined as 'target' or 'distractor'. Our primary objective was to test the anaesthetists' perception of the number and proportion of targets and distractors on B-mode and fused elastography videos collected during femoral and sciatic nerve block on soft embalmed cadavers. Our secondary objectives were to determine the differences between novices and experts, and between test-dose volumes, and to measure the area and brightness of spread and strain patterns. All anaesthetists recognised perineural spread using 0.25 ml volumes. Distractor patterns were recognised in 133 (12%) of B-mode and in 403 (38%) of fused B-mode and elastography patterns; P<0.001. With elastography, novice recognition improved from 12 to 37% (P<0.001), and consultant recognition increased from 24 to 53%; P<0.001. Distractor recognition improved from 8 to 31% using 0.25 ml volumes (P<0.001), and from 15 to 45% using 1 ml volumes (P<0.001). Visual search improved with fusion elastography, with larger test-dose volumes, and with consultant-level experience. A need exists to investigate image search strategies. Copyright © 2018 British Journal of Anaesthesia. Published by Elsevier Ltd. All rights reserved.

  8. The impact of privacy protection filters on gender recognition

    NASA Astrophysics Data System (ADS)

    Ruchaud, Natacha; Antipov, Grigory; Korshunov, Pavel; Dugelay, Jean-Luc; Ebrahimi, Touradj; Berrani, Sid-Ahmed

    2015-09-01

    Deep learning-based algorithms have become increasingly efficient in recognition and detection tasks, especially when they are trained on large-scale datasets. Such recent success has led to speculation that deep learning methods are comparable to, or even outperform, the human visual system in its ability to detect and recognize objects and their features. In this paper, we focus on the specific task of gender recognition in images when they have been processed by privacy protection filters (e.g., blurring, masking, and pixelization) applied at different strengths. Assuming a privacy protection scenario, we compare the performance of state-of-the-art deep learning algorithms with a subjective evaluation obtained via crowdsourcing to understand how privacy protection filters affect both machine and human vision.
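
    One of the privacy filters named above, pixelization, is easy to sketch; the block size plays the role of filter strength. This is an illustrative implementation under that assumption, not the filter code used in the paper.

```python
import numpy as np

def pixelate(img, block):
    """Pixelization privacy filter: replace each block x block tile by its mean."""
    out = img.astype(float)
    h, w = out.shape[:2]
    for y in range(0, h, block):
        for x in range(0, w, block):
            tile = out[y:y + block, x:x + block]
            tile[...] = tile.mean(axis=(0, 1))   # in-place: tile is a view
    return out.astype(img.dtype)

img = np.arange(64, dtype=np.uint8).reshape(8, 8)
coarse = pixelate(img, 4)         # block=4 is the filter strength
print(coarse[0])                  # [13 13 13 13 17 17 17 17]
```

    Larger blocks destroy more identity information while keeping coarse structure, which is exactly the trade-off the paper probes for both machine and human observers.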

  9. Discrimination of holograms and real objects by pigeons (Columba livia) and humans (Homo sapiens).

    PubMed

    Stephan, Claudia; Steurer, Michael M; Aust, Ulrike

    2014-08-01

    The type of stimulus material employed in visual tasks is crucial to all comparative cognition research that involves object recognition. There is considerable controversy about the use of 2-dimensional stimuli and the impact that the lack of the 3rd dimension (i.e., depth) may have on animals' performance in tests for their visual and cognitive abilities. We report evidence of discrimination learning using a completely novel type of stimuli, namely, holograms. Like real objects, holograms provide full 3-dimensional shape information but they also offer many possibilities for systematically modifying the appearance of a stimulus. Hence, they provide a promising means for investigating visual perception and cognition of different species in a comparative way. We trained pigeons and humans to discriminate either between 2 real objects or between holograms of the same 2 objects, and we subsequently tested both species for the transfer of discrimination to the other presentation mode. The lack of any decrements in accuracy suggests that real objects and holograms were perceived as equivalent in both species and shows the general appropriateness of holograms as stimuli in visual tasks. A follow-up experiment involving the presentation of novel views of the training objects and holograms revealed some interspecies differences in rotational invariance, thereby confirming and extending the results of previous studies. Taken together, these results suggest that holograms may not only provide a promising tool for investigating yet unexplored issues, but their use may also lead to novel insights into some crucial aspects of comparative visual perception and categorization.

  10. Effects of cholinergic deafferentation of the rhinal cortex on visual recognition memory in monkeys.

    PubMed

    Turchi, Janita; Saunders, Richard C; Mishkin, Mortimer

    2005-02-08

    Excitotoxic lesion studies have confirmed that the rhinal cortex is essential for visual recognition ability in monkeys. To evaluate the mnemonic role of cholinergic inputs to this cortical region, we compared the visual recognition performance of monkeys given rhinal cortex infusions of a selective cholinergic immunotoxin, ME20.4-SAP, with the performance of monkeys given control infusions into this same tissue. The immunotoxin, which leads to selective cholinergic deafferentation of the infused cortex, yielded recognition deficits of the same magnitude as those produced by excitotoxic lesions of this region, providing the most direct demonstration to date that cholinergic activation of the rhinal cortex is essential for storing the representations of new visual stimuli and thereby enabling their later recognition.

  11. Feature and Region Selection for Visual Learning.

    PubMed

    Zhao, Ji; Wang, Liantao; Cabral, Ricardo; De la Torre, Fernando

    2016-03-01

    Visual learning problems, such as object classification and action recognition, are typically approached using extensions of the popular bag-of-words (BoW) model. Despite its great success, it is unclear what visual features the BoW model is learning. Which regions in the image or video are used to discriminate among classes? Which are the most discriminative visual words? Answering these questions is fundamental for understanding existing BoW models and inspiring better models for visual recognition. To answer these questions, this paper presents a method for feature selection and region selection in the visual BoW model. This allows for an intermediate visualization of the features and regions that are important for visual learning. The main idea is to assign latent weights to the features or regions, and jointly optimize these latent variables with the parameters of a classifier (e.g., support vector machine). There are four main benefits of our approach: 1) our approach accommodates non-linear additive kernels, such as the popular χ² and intersection kernels; 2) our approach is able to handle both regions in images and spatio-temporal regions in videos in a unified way; 3) the feature selection problem is convex, and both problems can be solved using a scalable reduced gradient method; and 4) we point out strong connections with multiple kernel learning and multiple instance learning approaches. Experimental results on PASCAL VOC 2007, MSR Action Dataset II, and YouTube illustrate the benefits of our approach.

  13. Toward a Computer Vision-based Wayfinding Aid for Blind Persons to Access Unfamiliar Indoor Environments

    PubMed Central

    Tian, YingLi; Yang, Xiaodong; Yi, Chucai; Arditi, Aries

    2012-01-01

    Independent travel is a well known challenge for blind and visually impaired persons. In this paper, we propose a proof-of-concept computer vision-based wayfinding aid for blind people to independently access unfamiliar indoor environments. In order to find different rooms (e.g. an office, a lab, or a bathroom) and other building amenities (e.g. an exit or an elevator), we incorporate object detection with text recognition. First we develop a robust and efficient algorithm to detect doors, elevators, and cabinets based on their general geometric shape, by combining edges and corners. The algorithm is general enough to handle large intra-class variations of objects with different appearances among different indoor environments, as well as small inter-class differences between different objects such as doors and door-like cabinets. Next, in order to distinguish intra-class objects (e.g. an office door from a bathroom door), we extract and recognize text information associated with the detected objects. For text recognition, we first extract text regions from signs with multiple colors and possibly complex backgrounds, and then apply character localization and topological analysis to filter out background interference. The extracted text is recognized using off-the-shelf optical character recognition (OCR) software products. The object type, orientation, location, and text information are presented to the blind traveler as speech. PMID:23630409
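
    The geometric door-detection step can be caricatured as a shape test on a candidate region: doors are roughly upright rectangles, so a detector can check how well a region fills its bounding box and whether the box is taller than wide. This is a toy sketch under those assumptions; the real system combines edge and corner detection on camera images, and the function name and thresholds here are illustrative.

```python
import numpy as np

def detect_door_like(mask, min_aspect=1.5, min_fill=0.9):
    """Toy geometric test: door candidates are upright, well-filled rectangles."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return False
    h = ys.max() - ys.min() + 1
    w = xs.max() - xs.min() + 1
    fill = ys.size / (h * w)          # fraction of the bounding box covered
    return bool(fill >= min_fill and h / w >= min_aspect)

scene = np.zeros((20, 20), dtype=bool)
scene[3:15, 8:12] = True              # a 12x4 upright block: door-like
print(detect_door_like(scene))        # True

window = np.zeros((20, 20), dtype=bool)
window[5:8, 2:15] = True              # wider than tall: rejected
print(detect_door_like(window))       # False
```

    In the full pipeline, a region passing such a geometric test would then have its associated signage passed to OCR, e.g. to distinguish an office door from a bathroom door.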

  14. Memorable objects are more susceptible to forgetting: Evidence for the inhibitory account of retrieval-induced forgetting.

    PubMed

    Reppa, I; Williams, K E; Worth, E R; Greville, W J; Saunders, J

    2017-11-01

    Retrieval of target information can cause forgetting for related, but non-retrieved, information - retrieval-induced forgetting (RIF). The aim of the current studies was to examine a key prediction of the inhibitory account of RIF - interference dependence - whereby 'strong' non-retrieved items are more likely to interfere during retrieval and are therefore more susceptible to RIF. Using visual objects allowed us to examine one index of item strength: object typicality, that is, how typical of its category an object is. Experiment 1 provided proof of concept for our variant of the recognition practice paradigm. Experiment 2 tested the prediction of the inhibitory account that the magnitude of RIF for natural visual objects would be dependent on item strength. Non-typical objects were more memorable overall than typical objects. We found that object memorability (as determined by typicality) influenced RIF, with significant forgetting occurring for the memorable (non-typical), but not non-memorable (typical), objects. The current findings strongly support an inhibitory account of retrieval-induced forgetting. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Stimulus-driven changes in the direction of neural priming during visual word recognition.

    PubMed

    Pas, Maciej; Nakamura, Kimihiro; Sawamoto, Nobukatsu; Aso, Toshihiko; Fukuyama, Hidenao

    2016-01-15

    Visual object recognition is generally known to be facilitated when targets are preceded by the same or relevant stimuli. For written words, however, the beneficial effect of priming can be reversed when primes and targets share initial syllables (e.g., "boca" and "bono"). Using fMRI, the present study explored neuroanatomical correlates of this negative syllabic priming. In each trial, participants made semantic judgment about a centrally presented target, which was preceded by a masked prime flashed either to the left or right visual field. We observed that the inhibitory priming during reading was associated with a left-lateralized effect of repetition enhancement in the inferior frontal gyrus (IFG), rather than repetition suppression in the ventral visual region previously associated with facilitatory behavioral priming. We further performed a second fMRI experiment using a classical whole-word repetition priming paradigm with the same hemifield procedure and task instruction, and obtained well-known effects of repetition suppression in the left occipito-temporal cortex. These results therefore suggest that the left IFG constitutes a fast word processing system distinct from the posterior visual word-form system and that the directions of repetition effects can change with intrinsic properties of stimuli even when participants' cognitive and attentional states are kept constant. Copyright © 2015 Elsevier Inc. All rights reserved.

  16. Familiarity and Recollection Produce Distinct Eye Movement, Pupil and Medial Temporal Lobe Responses when Memory Strength Is Matched

    ERIC Educational Resources Information Center

    Kafkas, Alexandros; Montaldi, Daniela

    2012-01-01

    Two experiments explored eye measures (fixations and pupil response patterns) and brain responses (BOLD) accompanying the recognition of visual object stimuli based on familiarity and recollection. In both experiments, the use of a modified remember/know procedure led to high confidence and matched accuracy levels characterising strong familiarity…

  17. Structure from Motion

    DTIC Science & Technology

    1988-11-17

    [Abstract garbled in extraction from a report documentation page. Recoverable fragments: ambiguity in the recognition of partially occluded objects; the constraints involved in the structure-from-motion problem, with more information in [1]; and motion-based segmentation using edge detection algorithms based on visual motion.]

  18. Computational Modeling of Morphological Effects in Bangla Visual Word Recognition

    ERIC Educational Resources Information Center

    Dasgupta, Tirthankar; Sinha, Manjira; Basu, Anupam

    2015-01-01

    In this paper we aim to model the organization and processing of Bangla polymorphemic words in the mental lexicon. Our objective is to determine whether the mental lexicon accesses a polymorphemic word as a whole or decomposes the word into its constituent morphemes and then recognize them accordingly. To address this issue, we adopted two…

  19. Better Object Recognition and Naming Outcome With MRI-Guided Stereotactic Laser Amygdalohippocampotomy for Temporal Lobe Epilepsy

    PubMed Central

    Drane, Daniel L.; Loring, David W.; Voets, Natalie L.; Price, Michele; Ojemann, Jeffrey G.; Willie, Jon T.; Saindane, Amit M.; Phatak, Vaishali; Ivanisevic, Mirjana; Millis, Scott; Helmers, Sandra L.; Miller, John W.; Meador, Kimford J.; Gross, Robert E.

    2015-01-01

    SUMMARY OBJECTIVES Temporal lobe epilepsy (TLE) patients experience significant deficits in category-related object recognition and naming following standard surgical approaches. These deficits may result from a decoupling of core processing modules (e.g., language, visual processing, semantic memory), due to “collateral damage” to temporal regions outside the hippocampus following open surgical approaches. We predicted stereotactic laser amygdalohippocampotomy (SLAH) would minimize such deficits because it preserves white matter pathways and neocortical regions critical for these cognitive processes. METHODS Tests of naming and recognition of common nouns (Boston Naming Test) and famous persons were compared with nonparametric analyses using exact tests between a group of nineteen patients with medically-intractable mesial TLE undergoing SLAH (10 dominant, 9 nondominant), and a comparable series of TLE patients undergoing standard surgical approaches (n=39) using a prospective, non-randomized, non-blinded, parallel group design. RESULTS Performance declines were significantly greater for the dominant TLE patients undergoing open resection versus SLAH for naming famous faces and common nouns (F=24.3, p<.0001, η2=.57, & F=11.2, p<.001, η2=.39, respectively), and for the nondominant TLE patients undergoing open resection versus SLAH for recognizing famous faces (F=3.9, p<.02, η2=.19). When examined on an individual subject basis, no SLAH patients experienced any performance declines on these measures. In contrast, 32 of the 39 undergoing standard surgical approaches declined on one or more measures for both object types (p<.001, Fisher’s exact test). Twenty-one of 22 left (dominant) TLE patients declined on one or both naming tasks after open resection, while 11 of 17 right (non-dominant) TLE patients declined on face recognition. 
SIGNIFICANCE Preliminary results suggest 1) naming and recognition functions can be spared in TLE patients undergoing SLAH, and 2) the hippocampus does not appear to be an essential component of neural networks underlying name retrieval or recognition of common objects or famous faces. PMID:25489630

  20. Two Ways to Facial Expression Recognition? Motor and Visual Information Have Different Effects on Facial Expression Recognition.

    PubMed

    de la Rosa, Stephan; Fademrecht, Laura; Bülthoff, Heinrich H; Giese, Martin A; Curio, Cristóbal

    2018-06-01

    Motor-based theories of facial expression recognition propose that the visual perception of facial expression is aided by sensorimotor processes that are also used for the production of the same expression. Accordingly, sensorimotor and visual processes should provide congruent emotional information about a facial expression. Here, we report evidence that challenges this view. Specifically, the repeated execution of facial expressions has an effect on the recognition of a subsequent facial expression opposite to that of the repeated viewing of facial expressions. Moreover, the findings of the motor condition, but not of the visual condition, were correlated with a nonsensory condition in which participants imagined an emotional situation. These results can be well accounted for by the idea that facial expression recognition is not always mediated by motor processes but can also rely on visual information alone.

  1. A Neural-Dynamic Architecture for Concurrent Estimation of Object Pose and Identity

    PubMed Central

    Lomp, Oliver; Faubel, Christian; Schöner, Gregor

    2017-01-01

    Handling objects or interacting with a human user about objects on a shared tabletop requires that objects be identified after learning from a small number of views and that object pose be estimated. We present a neurally inspired architecture that learns object instances by storing features extracted from a single view of each object. Input features are color and edge histograms from a localized area that is updated during processing. The system finds the best-matching view for the object in a novel input image while concurrently estimating the object’s pose, aligning the learned view with current input. The system is based on neural dynamics, computationally operating in real time, and can handle dynamic scenes directly off live video input. In a scenario with 30 everyday objects, the system achieves recognition rates of 87.2% from a single training view for each object, while also estimating pose quite precisely. We further demonstrate that the system can track moving objects, and that it can segment the visual array, selecting and recognizing one object while suppressing input from another known object in the immediate vicinity. Evaluation on the COIL-100 dataset, in which objects are depicted from different viewing angles, revealed recognition rates of 91.1% on the first 30 objects, each learned from four training views. PMID:28503145
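    The matching step in this record (features are color and edge histograms, compared against stored single-view histograms) can be illustrated with a minimal sketch. All names here (`color_histogram`, `best_matching_view`) are hypothetical, and the real system uses continuous neural dynamics rather than a discrete argmax; this only shows the underlying histogram-comparison idea.

```python
def color_histogram(pixels, bins=8):
    """Quantize RGB pixels (values 0-255) into a normalized joint histogram."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def best_matching_view(query_pixels, learned_views):
    """Return the label of the stored view whose histogram best matches the query."""
    qh = color_histogram(query_pixels)
    return max(learned_views,
               key=lambda label: histogram_intersection(
                   qh, color_histogram(learned_views[label])))
```

    A reddish query patch would, for example, match a stored view of a red object rather than a blue one, since their quantized color distributions overlap far more.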

  2. Seeing without knowing: task relevance dissociates between visual awareness and recognition.

    PubMed

    Eitam, Baruch; Shoval, Roy; Yeshurun, Yaffa

    2015-03-01

    We demonstrate that task relevance dissociates between visual awareness and knowledge activation to create a state of seeing without knowing: visual awareness of familiar stimuli without recognizing them. We rely on the fact that in order to experience a Kanizsa illusion, participants must be aware of its inducers. While people can indicate the orientation of the illusory rectangle with great ease (signifying that they have consciously experienced the illusion's inducers), almost 30% of them could not report the inducers' color. Thus, people can see, in the sense of phenomenally experiencing, but not know, in the sense of recognizing what the object is or activating appropriate knowledge about it. Experiment 2 tests whether relevance-based selection operates within objects and shows that, contrary to the pattern of results found with features of different objects in our previous studies and replicated in Experiment 1, selection does not occur when both relevant and irrelevant features belong to the same object. We discuss these findings in relation to the existing theories of consciousness and to attention and inattentional blindness, and the role of cognitive load, object-based attention, and the use of self-reports as measures of awareness. © 2015 New York Academy of Sciences.

  3. Influence of cognitive style and interstimulus interval on the hemispheric processing of tactile stimuli.

    PubMed

    Minagawa, N; Kashu, K

    1989-06-01

    16 adult subjects performed a tactile recognition task. According to our 1984 study, half of the subjects were classified as having a left hemispheric preference for the processing of visual stimuli, while the other half were classified as having a right hemispheric preference for the processing of visual stimuli. The present task was conducted according to the S1-S2 matching paradigm. The standard stimulus was a readily recognizable object and was presented tactually to either the left or right hand of each subject. The comparison stimulus was an object-picture and was presented visually by slide in a tachistoscope. The interstimulus interval was .05 sec. or 2.5 sec. Analysis indicated that the left-preference group showed right-hand superiority, and the right-preference group showed left-hand superiority. The notion of individual hemisphericity was supported in tactile processing.

  4. Application of the SP theory of intelligence to the understanding of natural vision and the development of computer vision.

    PubMed

    Wolff, J Gerard

    2014-01-01

    The SP theory of intelligence aims to simplify and integrate concepts in computing and cognition, with information compression as a unifying theme. This article is about how the SP theory may, with advantage, be applied to the understanding of natural vision and the development of computer vision. Potential benefits include an overall simplification of concepts in a universal framework for knowledge and seamless integration of vision with other sensory modalities and other aspects of intelligence. Low-level perceptual features such as edges or corners may be identified by the extraction of redundancy in uniform areas in the manner of the run-length encoding technique for information compression. The concept of multiple alignment in the SP theory may be applied to the recognition of objects, and to scene analysis, with a hierarchy of parts and sub-parts, at multiple levels of abstraction, and with family-resemblance or polythetic categories. The theory has potential for the unsupervised learning of visual objects and classes of objects, and suggests how coherent concepts may be derived from fragments. As in natural vision, both recognition and learning in the SP system are robust in the face of errors of omission, commission and substitution. The theory suggests how, via vision, we may piece together knowledge of the three-dimensional structure of objects and of our environment; it provides an account of how we may see things that are not objectively present in an image, how we may recognise something despite variations in the size of its retinal image, and how raster graphics and vector graphics may be unified. And it has things to say about the phenomena of lightness constancy and colour constancy, the role of context in recognition, ambiguities in visual perception, and the integration of vision with other senses and other aspects of intelligence.
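    The run-length encoding technique that the record invokes for compressing uniform areas is simple to state concretely. A minimal sketch (hypothetical function names; the SP theory itself works with multiple alignment, not this literal code):

```python
def run_length_encode(seq):
    """Compress runs of repeated symbols into (symbol, count) pairs."""
    encoded = []
    for sym in seq:
        if encoded and encoded[-1][0] == sym:
            encoded[-1] = (sym, encoded[-1][1] + 1)
        else:
            encoded.append((sym, 1))
    return encoded

def run_length_decode(encoded):
    """Invert run_length_encode, restoring the original sequence."""
    return [sym for sym, n in encoded for _ in range(n)]
```

    On an image row of mostly uniform intensity, the run boundaries are exactly where the encoding starts a new pair, which is the sense in which edges fall out of redundancy extraction.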

  5. Modulation of microsaccades by spatial frequency during object categorization.

    PubMed

    Craddock, Matt; Oppermann, Frank; Müller, Matthias M; Martinovic, Jasna

    2017-01-01

    The coarse-to-fine organization of visual processing, based on the spatial frequency properties of the input, forms an important facet of the object recognition process. During visual object categorization tasks, microsaccades occur frequently. One potential functional role of these eye movements is to resolve high spatial frequency information. To assess this hypothesis, we examined the rate, amplitude and speed of microsaccades in an object categorization task in which participants viewed object and non-object images and classified them as showing either natural objects, man-made objects or non-objects. Images were presented unfiltered (broadband; BB) or filtered to contain only low (LSF) or high spatial frequency (HSF) information. This allowed us to examine whether microsaccades were modulated independently by the presence of a high-level feature - the presence of an object - and by low-level stimulus characteristics - spatial frequency. We found a bimodal distribution of saccades based on their amplitude, with a split between smaller and larger microsaccades at 0.4° of visual angle. The rate of larger saccades (⩾0.4°) was higher for objects than non-objects, and higher for objects with high spatial frequency content (HSF and BB objects) than for LSF objects. No effects were observed for smaller microsaccades (<0.4°). This is consistent with a role for larger microsaccades in resolving HSF information for object identification, and with previous evidence that more microsaccades are directed towards informative image regions. Copyright © 2016 Elsevier Ltd. All rights reserved.
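    The amplitude split reported here (smaller versus larger microsaccades, divided at 0.4° of visual angle) amounts to thresholding saccade amplitudes and computing a rate per class. A hedged sketch with hypothetical names and made-up sample values:

```python
def amplitude(dx, dy):
    """Euclidean amplitude of a saccade vector, in degrees of visual angle."""
    return (dx * dx + dy * dy) ** 0.5

def split_saccades(saccades, threshold=0.4):
    """Partition (dx, dy) saccades into smaller (<threshold) and larger (>=threshold)."""
    small = [s for s in saccades if amplitude(*s) < threshold]
    large = [s for s in saccades if amplitude(*s) >= threshold]
    return small, large

def rate_per_second(events, duration_s):
    """Event rate over a trial of known duration (seconds)."""
    return len(events) / duration_s
```

    Rates for the two classes can then be compared across the BB, LSF, and HSF conditions.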

  6. Muscarinic Receptor-Dependent Long Term Depression in the Perirhinal Cortex and Recognition Memory are Impaired in the rTg4510 Mouse Model of Tauopathy.

    PubMed

    Scullion, Sarah E; Barker, Gareth R I; Warburton, E Clea; Randall, Andrew D; Brown, Jonathan T

    2018-02-26

    Neurodegenerative diseases that cause cognitive dysfunction, such as Alzheimer's disease and fronto-temporal dementia, are often associated with impairments in the visual recognition memory system. Recent evidence suggests that synaptic plasticity, in particular long term depression (LTD), in the perirhinal cortex (PRh) is a critical cellular mechanism underlying recognition memory. In this study, we have examined novel object recognition and PRh LTD in rTg4510 mice, which transgenically overexpress tau P301L. We found that 8-9-month-old rTg4510 mice had significant deficits in long- but not short-term novel object recognition memory. Furthermore, we also established that PRh slices prepared from rTg4510 mice, unlike those prepared from wildtype littermates, could not support a muscarinic acetylcholine receptor-dependent form of LTD induced by a 5 Hz stimulation protocol. In contrast, bath application of the muscarinic agonist carbachol induced a form of chemical LTD in both WT and rTg4510 slices. Finally, when rTg4510 slices were preincubated with the acetylcholinesterase inhibitor donepezil, the 5 Hz stimulation protocol was capable of inducing significant levels of LTD. These data suggest that dysfunctional cholinergic innervation of the PRh of rTg4510 mice results in deficits in synaptic LTD, which may contribute to aberrant recognition memory in this rodent model of tauopathy.

  7. Neural network face recognition using wavelets

    NASA Astrophysics Data System (ADS)

    Karunaratne, Passant V.; Jouny, Ismail I.

    1997-04-01

    The recognition of human faces is a phenomenon that has been mastered by the human visual system and that has been researched extensively in the domain of computer neural networks and image processing. This research investigates neural networks and wavelet image processing techniques in the application of human face recognition. The objective of the system is to acquire a digitized still image of a human face, carry out pre-processing on the image as required, and then, given a prior database of images of possible individuals, be able to recognize the individual in the image. The pre-processing segment of the system includes several procedures, namely image compression, denoising, and feature extraction. The image processing is carried out using Daubechies wavelets. Once the images have been passed through the wavelet-based image processor they can be efficiently analyzed by means of a neural network. A back-propagation neural network is used for the recognition segment of the system. The main constraint of the system concerns the characteristics of the images being processed. The system should be able to carry out effective recognition of human faces irrespective of the individual's facial expression, presence of extraneous objects such as head-gear or spectacles, and face/head orientation. A potential application of this face recognition system would be as a secondary verification method in an automated teller machine.
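    The wavelet pre-processing stage can be illustrated with the Haar wavelet (db1, the simplest member of the Daubechies family; the record does not specify which Daubechies filter was used). A minimal, hypothetical sketch of one 1-D decomposition level and threshold-based denoising:

```python
def haar_step(signal):
    """One level of the 1-D Haar transform: pairwise averages (approximation)
    and pairwise half-differences (detail). Assumes even-length input."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail

def denoise(signal, threshold):
    """Crude wavelet denoising: zero small detail coefficients, then reconstruct."""
    approx, detail = haar_step(signal)
    detail = [d if abs(d) > threshold else 0.0 for d in detail]
    rec = []
    for a, d in zip(approx, detail):
        rec += [a + d, a - d]  # exact inverse of haar_step when detail is kept
    return rec
```

    Compression follows the same pattern: small detail coefficients are discarded and only the remaining coefficients are stored, before the reduced representation is fed to the back-propagation network.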

  8. Feature integration and object representations along the dorsal stream visual hierarchy

    PubMed Central

    Perry, Carolyn Jeane; Fallah, Mazyar

    2014-01-01

    The visual system is split into two processing streams: a ventral stream that receives color and form information and a dorsal stream that receives motion information. Each stream processes that information hierarchically, with each stage building upon the previous. In the ventral stream this leads to the formation of object representations that ultimately allow for object recognition regardless of changes in the surrounding environment. In the dorsal stream, this hierarchical processing has classically been thought to lead to the computation of complex motion in three dimensions. However, there is evidence to suggest that there is integration of both dorsal and ventral stream information into motion computation processes, giving rise to intermediate object representations, which facilitate object selection and decision making mechanisms in the dorsal stream. First we review the hierarchical processing of motion along the dorsal stream and the building up of object representations along the ventral stream. Then we discuss recent work on the integration of ventral and dorsal stream features that lead to intermediate object representations in the dorsal stream. Finally we propose a framework describing how and at what stage different features are integrated into dorsal visual stream object representations. Determining the integration of features along the dorsal stream is necessary to understand not only how the dorsal stream builds up an object representation but also which computations are performed on object representations instead of local features. PMID:25140147

  9. The role of eye fixation in memory enhancement under stress - An eye tracking study.

    PubMed

    Herten, Nadja; Otto, Tobias; Wolf, Oliver T

    2017-04-01

    In a stressful situation, attention is shifted to potentially relevant stimuli. Recent studies from our laboratory revealed that stressed participants perform better in a recognition task involving objects of the stressful episode. In order to characterize the role of a stress-induced alteration in visual exploration, the present study investigated whether participants experiencing a laboratory social stress situation differ in their fixation behaviour from participants of a control group. Further, we aimed at shedding light on the relation between fixation behaviour and the obtained memory measures. We randomly assigned 32 male and 31 female participants to a control or a stress condition consisting of the Trier Social Stress Test (TSST), a public speaking paradigm causing social evaluative threat. In an established 'friendly' control condition (f-TSST) participants talk to a friendly committee. During both conditions, the committee members used ten office items (central objects) while another ten objects were present without being used (peripheral objects). Participants wore eye tracking glasses recording their fixations. On the next day, participants performed free recall and recognition tasks involving the objects present the day before. Stressed participants showed enhanced memory for central objects, accompanied by longer and more frequent fixations on these objects. Contrasting this, fixation towards the committee faces showed the reverse pattern; here, control participants exhibited longer fixations. Fixation indices and memory measures were, however, not correlated with each other. Psychosocial stress is associated with altered fixation behaviour. Longer fixation on objects related to the stressful situation may reflect enhanced encoding, whereas diminished face fixation suggests gaze avoidance of aversive, socially threatening stimuli. 
Modified visual exploration should be considered in future stress research, in particular when focussing on memory for a stressful episode. Copyright © 2017 Elsevier Inc. All rights reserved.

  10. A bio-inspired method and system for visual object-based attention and segmentation

    NASA Astrophysics Data System (ADS)

    Huber, David J.; Khosla, Deepak

    2010-04-01

    This paper describes a method and system of human-like attention and object segmentation in visual scenes that (1) attends to regions in a scene in their rank of saliency in the image, (2) extracts the boundary of an attended proto-object based on feature contours, and (3) can be biased to boost the attention paid to specific features in a scene, such as those of a desired target object in static and video imagery. The purpose of the system is to identify regions of a scene of potential importance and extract the region data for processing by an object recognition and classification algorithm. The attention process can be performed in a default, bottom-up manner or a directed, top-down manner which will assign a preference to certain features over others. One can apply this system to any static scene, whether that is a still photograph or imagery captured from video. We employ algorithms that are motivated by findings in neuroscience, psychology, and cognitive science to construct a system that is novel in its modular and stepwise approach to the problems of attention and region extraction, its application of a flooding algorithm to break apart an image into smaller proto-objects based on feature density, and its ability to join smaller regions of similar features into larger proto-objects. This approach allows many complicated operations to be carried out by the system in a very short time, approaching real-time. A researcher can use this system as a robust front-end to a larger system that includes object recognition and scene understanding modules; it is engineered to function over a broad range of situations and can be applied to any scene with minimal tuning from the user.
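    The flooding step described above (breaking an image apart into proto-objects based on feature density) resembles classic region growing. A simplified, hypothetical sketch on a 2-D feature map, not the authors' implementation:

```python
def flood_region(grid, seed, tol):
    """Grow a proto-object region from a seed pixel: include 4-connected
    neighbours whose feature value is within tol of the seed's value."""
    rows, cols = len(grid), len(grid[0])
    sr, sc = seed
    target = grid[sr][sc]
    region, frontier = set(), [seed]
    while frontier:
        r, c = frontier.pop()
        if (r, c) in region:
            continue
        if 0 <= r < rows and 0 <= c < cols and abs(grid[r][c] - target) <= tol:
            region.add((r, c))
            frontier += [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
    return region
```

    Joining smaller regions of similar features into larger proto-objects, as the paper describes, would then amount to merging regions whose mean feature values fall within a similar tolerance.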

  11. [Associative visual agnosia. The less visible consequences of a cerebral infarction].

    PubMed

    Diesfeldt, H F A

    2011-02-01

    After a cerebral infarction, some patients acutely demonstrate contralateral hemiplegia or aphasia. Those are the obvious symptoms of a cerebral infarction. However, less visible but burdensome consequences may go unnoticed without closer investigation. The importance of a thorough clinical examination is exemplified by a single case study of a 72-year-old, right-handed male. Two years earlier he had suffered an ischemic stroke in the territory of the left posterior cerebral artery, with right homonymous hemianopia and global alexia (i.e., impairment in letter recognition and profound impairment of reading) without agraphia. Naming was impaired on visual presentation (20%-39% correct), but improved significantly after tactile presentation (87% correct) or verbal definition (89%). Pre-semantic visual processing was normal (correct matching of different views of the same object), as was his access to structural knowledge from vision (he reliably distinguished real objects from non-objects). On a colour decision task he reliably indicated which of two items was coloured correctly. Though he was unable to mime how visually presented objects were used, he could nevertheless reliably match pictures of objects with pictures of a mime artist gesturing the use of the object. He obtained normal scores on word definition (WAIS-III), synonym judgment and word-picture matching tasks with perceptual and semantic distractors. He failed, however, when he had to match physically dissimilar specimens of the same object or when he had to decide which two of five objects were related associatively (Pyramids and Palm Trees Test). The patient thus showed a striking contrast between his intact ability to access knowledge of object shape or colour from vision and his impaired functional and associative knowledge. As a result, he could not access a complete semantic representation, required for activating phonological representations to name visually presented objects. 
The pattern of impairments and preserved abilities is considered to be a specific difficulty to access a full semantic representation from an intact structural representation of visually presented objects, i.e., a form of visual object agnosia.

  12. Visual dysfunction in Parkinson’s disease

    PubMed Central

    Weil, Rimona S.; Schrag, Anette E.; Warren, Jason D.; Crutch, Sebastian J.; Lees, Andrew J.; Morris, Huw R.

    2016-01-01

    Patients with Parkinson’s disease have a number of specific visual disturbances. These include changes in colour vision and contrast sensitivity and difficulties with complex visual tasks such as mental rotation and emotion recognition. We review changes in visual function at each stage of visual processing from retinal deficits, including contrast sensitivity and colour vision deficits to higher cortical processing impairments such as object and motion processing and neglect. We consider changes in visual function in patients with common Parkinson’s disease-associated genetic mutations including GBA and LRRK2. We discuss the association between visual deficits and clinical features of Parkinson’s disease such as rapid eye movement sleep behavioural disorder and the postural instability and gait disorder phenotype. We review the link between abnormal visual function and visual hallucinations, considering current models for mechanisms of visual hallucinations. Finally, we discuss the role of visuo-perceptual testing as a biomarker of disease and predictor of dementia in Parkinson’s disease. PMID:27412389

  13. The Anatomy of Non-conscious Recognition Memory.

    PubMed

    Rosenthal, Clive R; Soto, David

    2016-11-01

    Cortical regions as early as primary visual cortex have been implicated in recognition memory. Here, we outline the challenges that this presents for neurobiological accounts of recognition memory. We conclude that understanding the role of early visual cortex (EVC) in this process will require the use of protocols that mask stimuli from visual awareness. Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. Regional Principal Color Based Saliency Detection

    PubMed Central

    Lou, Jing; Ren, Mingwu; Wang, Huan

    2014-01-01

    Saliency detection is widely used in many visual applications such as image segmentation, object recognition and classification. In this paper, we introduce a new method to detect salient objects in natural images. The approach is based on a regional principal color contrast model, which incorporates low-level and medium-level visual cues. The method combines simply computed color features with two categories of spatial relationships into a saliency map, achieving higher F-measure rates. At the same time, we present an interpolation approach to evaluate resulting curves, and analyze parameter selection. Our method handles images of arbitrary resolution effectively. Experimental results on a saliency database show that our approach produces high-quality saliency maps and performs favorably against ten saliency detection algorithms. PMID:25379960
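    The paper's exact model is not reproduced here, but regional color-contrast saliency is commonly computed as each region's color distance to every other region, weighted by the other regions' sizes. A hedged sketch under that generic global-contrast assumption, with hypothetical names:

```python
def region_saliency(regions):
    """Saliency of each region = its color contrast against all other regions,
    weighted by the other region's size (a common global-contrast heuristic).
    regions maps a name to ((r, g, b) principal color, pixel count)."""
    def color_dist(c1, c2):
        return sum((a - b) ** 2 for a, b in zip(c1, c2)) ** 0.5

    saliency = {}
    for name, (color, _size) in regions.items():
        saliency[name] = sum(
            other_size * color_dist(color, other_color)
            for other, (other_color, other_size) in regions.items()
            if other != name)
    return saliency
```

    A small, distinctly colored region surrounded by a large uniform background scores highest, which matches the intuition that salient objects stand out against their surroundings.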

  15. Path similarity skeleton graph matching.

    PubMed

    Bai, Xiang; Latecki, Longin Jan

    2008-07-01

    This paper presents a novel framework for shape recognition based on object silhouettes. The main idea is to match skeleton graphs by comparing the shortest paths between skeleton endpoints. In contrast to typical tree or graph matching methods, we completely ignore the topological graph structure. Our approach is motivated by the fact that visually similar skeleton graphs may have completely different topological structures. The proposed comparison of shortest paths between endpoints of skeleton graphs yields correct matching results in such cases. The skeletons are pruned by contour partitioning with Discrete Curve Evolution, which implies that the endpoints of skeleton branches correspond to visual parts of the objects. The experimental results demonstrate that our method is able to produce correct results in the presence of articulations, stretching, and occlusion.
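    The core idea of comparing shortest paths between skeleton endpoints can be sketched with plain breadth-first search. The actual method compares richer path-based shape descriptors; this hypothetical version reduces each skeleton graph to just its sorted endpoint-to-endpoint path lengths:

```python
from collections import deque

def shortest_path_length(adj, start, goal):
    """BFS over an unweighted skeleton graph given as an adjacency dict."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # disconnected

def endpoint_path_signature(adj):
    """Sorted pairwise path lengths between degree-1 nodes (skeleton endpoints)."""
    endpoints = [n for n, nbrs in adj.items() if len(nbrs) == 1]
    sig = [shortest_path_length(adj, a, b)
           for i, a in enumerate(endpoints) for b in endpoints[i + 1:]]
    return sorted(sig)
```

    Two silhouettes whose skeletons differ topologically but have similar endpoint-to-endpoint paths would then produce similar signatures, which is the intuition the abstract appeals to.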

  16. Development of robust behaviour recognition for an at-home biomonitoring robot with assistance of subject localization and enhanced visual tracking.

    PubMed

    Imamoglu, Nevrez; Dorronzoro, Enrique; Wei, Zhixuan; Shi, Huangjun; Sekine, Masashi; González, José; Gu, Dongyun; Chen, Weidong; Yu, Wenwei

    2014-01-01

    Our research is focused on the development of an at-home health care biomonitoring mobile robot for people in need. The main task of the robot is to detect and track a designated subject while recognizing his/her activity for analysis and to provide warning in an emergency. In order to push the system forward towards real application, in this study we tested the robustness of the robot system under several major environment changes, control parameter changes, and subject variations. First, an improved color tracker was analyzed to find the limitations and constraints of the robot's visual tracking, considering suitable illumination values and tracking distance intervals. Then, regarding subject safety and continuous robot-based subject tracking, various control parameters were tested on different layouts in a room. Finally, the main objective of the system is to identify different walking patterns for further analysis. Therefore, we proposed a fast, simple, and person-specific new activity recognition model that makes full use of localization information and is robust to partial occlusion. The proposed activity recognition algorithm was tested on different walking patterns with different subjects, and the results showed high recognition accuracy.

  17. Development of Robust Behaviour Recognition for an at-Home Biomonitoring Robot with Assistance of Subject Localization and Enhanced Visual Tracking

    PubMed Central

    Imamoglu, Nevrez; Dorronzoro, Enrique; Wei, Zhixuan; Shi, Huangjun; González, José; Gu, Dongyun; Yu, Wenwei

    2014-01-01

    Our research is focused on the development of an at-home health care biomonitoring mobile robot for people in need. The main task of the robot is to detect and track a designated subject while recognizing his/her activity for analysis and to provide warning in an emergency. In order to push the system forward towards real application, in this study we tested the robustness of the robot system under several major environment changes, control parameter changes, and subject variations. First, an improved color tracker was analyzed to find the limitations and constraints of the robot's visual tracking, considering suitable illumination values and tracking distance intervals. Then, regarding subject safety and continuous robot-based subject tracking, various control parameters were tested on different layouts in a room. Finally, the main objective of the system is to identify different walking patterns for further analysis. Therefore, we proposed a fast, simple, and person-specific new activity recognition model that makes full use of localization information and is robust to partial occlusion. The proposed activity recognition algorithm was tested on different walking patterns with different subjects, and the results showed high recognition accuracy. PMID:25587560

  18. Association of auditory-verbal and visual hallucinations with impaired and improved recognition of colored pictures.

    PubMed

    Brébion, Gildas; Stephan-Otto, Christian; Usall, Judith; Huerta-Ramos, Elena; Perez del Olmo, Mireia; Cuevas-Esteban, Jorge; Haro, Josep Maria; Ochoa, Susana

    2015-09-01

    A number of cognitive underpinnings of auditory hallucinations have been established in schizophrenia patients, but few have, as yet, been uncovered for visual hallucinations. In previous research, we unexpectedly observed that auditory hallucinations were associated with poor recognition of color, but not black-and-white (b/w), pictures. In this study, we attempted to replicate and explain this finding. Potential associations with visual hallucinations were explored. B/w and color pictures were presented to 50 schizophrenia patients and 45 healthy individuals under 2 conditions of visual context presentation corresponding to 2 levels of visual encoding complexity. Then, participants had to recognize the target pictures among distractors. Auditory-verbal hallucinations were inversely associated with the recognition of the color pictures presented under the most effortful encoding condition. This association was fully mediated by working-memory span. Visual hallucinations were associated with improved recognition of the color pictures presented under the less effortful condition. Patients suffering from visual hallucinations were not impaired, relative to the healthy participants, in the recognition of these pictures. Decreased working-memory span in patients with auditory-verbal hallucinations might impede the effortful encoding of stimuli. Visual hallucinations might be associated with facilitation in the visual encoding of natural scenes, or with enhanced color perception abilities. (c) 2015 APA, all rights reserved.

  19. Hand shape selection in pantomimed grasping: Interaction between the dorsal and the ventral visual streams and convergence on the ventral premotor area

    PubMed Central

    Makuuchi, Michiru; Someya, Yoshiaki; Ogawa, Seiji; Takayama, Yoshihiro

    2011-01-01

    In visually guided grasping, possible hand shapes are computed from the geometrical features of the object, while prior knowledge about the object and the goal of the action influence both the computation and the selection of the hand shape. We investigated the system dynamics of the human brain for the pantomiming of grasping with two aspects accentuated. One is object recognition, with the use of objects for daily use. The subjects mimed grasping movements appropriate for an object presented in a photograph, using either a precision or a power grip. The other is the selection of grip hand shape. We manipulated the selection demands for the grip hand shape by having the subjects use the same or a different grip type in the second presentation of the identical object. Effective connectivity analysis revealed that increased selection demands enhance the interaction between the anterior intraparietal sulcus (AIP) and the posterior inferior temporal gyrus (pITG), and drive converging causal influences from the AIP, pITG, and dorsolateral prefrontal cortex to the ventral premotor area (PMv). These results suggest that the dorsal and ventral visual areas interact in the pantomiming of grasping, while the PMv integrates the neural information of different regions to select the hand posture. The present study proposes system dynamics in visually guided movement toward meaningful objects, but further research is needed to examine whether the same dynamics also hold in real grasping. PMID:21739528

  20. The effect of mood-context on visual recognition and recall memory.

    PubMed

    Robinson, Sarita J; Rollings, Lucy J L

    2011-01-01

    Although it is widely known that memory is enhanced when encoding and retrieval occur in the same state, the impact of elevated stress/arousal is less understood. This study explores mood-dependent memory's effects on visual recognition and recall of material memorized either in a neutral mood or under higher stress/arousal levels. Participants' (N = 60) recognition and recall were assessed while they experienced either the same or a mismatched mood at retrieval. The results suggested that both visual recognition and recall memory were higher when participants experienced the same mood at encoding and retrieval compared with those who experienced a mismatch in mood context between encoding and retrieval. These findings offer support for a mood-dependency effect on both the recognition and recall of visual information.

  1. Spatiotemporal dynamics in understanding hand—object interactions

    PubMed Central

    Avanzini, Pietro; Fabbri-Destro, Maddalena; Campi, Cristina; Pascarella, Annalisa; Barchiesi, Guido; Cattaneo, Luigi; Rizzolatti, Giacomo

    2013-01-01

    It is generally accepted that visual perception results from the activation of a feed-forward hierarchy of areas, leading to increasingly complex representations. Here we present evidence for a fundamental role of backward projections to the occipito-temporal region for understanding conceptual object properties. The evidence is based on two studies. In the first study, using high-density EEG, we showed that during the observation of how objects are used there is an early activation of occipital and temporal areas, subsequently reaching the pole of the temporal lobe, and a late reactivation of the visual areas. In the second study, using transcranial magnetic stimulation over the occipital lobe, we showed a clear impairment in the accuracy of recognition of how objects are used during both early activation and, most importantly, late occipital reactivation. These findings represent strong neurophysiological evidence that a top-down mechanism is fundamental for understanding conceptual object properties, and suggest that a similar mechanism might be also present for other higher-order cognitive functions. PMID:24043805

  2. Separate processing of texture and form in the ventral stream: evidence from FMRI and visual agnosia.

    PubMed

    Cavina-Pratesi, C; Kentridge, R W; Heywood, C A; Milner, A D

    2010-02-01

    Real-life visual object recognition requires the processing of more than just geometric (shape, size, and orientation) properties. Surface properties such as color and texture are equally important, particularly for providing information about the material properties of objects. Recent neuroimaging research suggests that geometric and surface properties are dealt with separately within the lateral occipital cortex (LOC) and the collateral sulcus (CoS), respectively. Here we compared objects that differed either in aspect ratio or in surface texture only, keeping all other visual properties constant. Results on brain-intact participants confirmed that surface texture activates an area in the posterior CoS, quite distinct from the area activated by shape within LOC. We also tested 2 patients with visual object agnosia, one of whom (DF) performed well on the texture task but at chance on the shape task, whereas the other (MS) showed the converse pattern. This behavioral double dissociation was matched by a parallel neuroimaging dissociation, with activation in CoS but not LOC in patient DF and activation in LOC but not CoS in patient MS. These data provide presumptive evidence that the areas respectively activated by shape and texture play a causally necessary role in the perceptual discrimination of these features.

  3. Are Face and Object Recognition Independent? A Neurocomputational Modeling Exploration.

    PubMed

    Wang, Panqu; Gauthier, Isabel; Cottrell, Garrison

    2016-04-01

    Are face and object recognition abilities independent? Although it is commonly believed that they are, Gauthier et al. [Gauthier, I., McGugin, R. W., Richler, J. J., Herzmann, G., Speegle, M., & VanGulick, A. E. Experience moderates overlap between object and face recognition, suggesting a common ability. Journal of Vision, 14, 7, 2014] recently showed that these abilities become more correlated as experience with nonface categories increases. They argued that there is a single underlying visual ability, v, that is expressed in performance with both face and nonface categories as experience grows. Using the Cambridge Face Memory Test and the Vanderbilt Expertise Test, they showed that the shared variance between Cambridge Face Memory Test and Vanderbilt Expertise Test performance increases monotonically as experience increases. Here, we address an apparent conundrum: why does a shared resource across different visual domains not lead to competition and an inverse correlation in abilities? We explain this conundrum using our neurocomputational model of face and object processing ["The Model", TM, Cottrell, G. W., & Hsiao, J. H. Neurocomputational models of face processing. In A. J. Calder, G. Rhodes, M. Johnson, & J. Haxby (Eds.), The Oxford handbook of face perception. Oxford, UK: Oxford University Press, 2011]. We model the domain-general ability v as the available computational resources (number of hidden units) in the mapping from input to label, and experience as the frequency of individual exemplars in an object category appearing during network training. Our results show that, as in the behavioral data, the correlation between subordinate level face and object recognition accuracy increases as experience grows. We suggest that different domains do not compete for resources because the relevant features are shared between faces and objects.
The essential power of experience is to generate a "spreading transform" for faces (separating them in representational space) that generalizes to objects that must be individuated. Interestingly, when the task of the network is basic level categorization, no increase in the correlation between domains is observed. Hence, our model predicts that it is the type of experience that matters and that the source of the correlation is in the fusiform face area, rather than in cortical areas that subserve basic level categorization. This result is consistent with our previous modeling elucidating why the FFA is recruited for novel domains of expertise [Tong, M. H., Joyce, C. A., & Cottrell, G. W. Why is the fusiform face area recruited for novel categories of expertise? A neurocomputational investigation. Brain Research, 1202, 14-24, 2008].
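    The shared-ability account in this record can be illustrated with a toy simulation (this is an illustrative sketch, not the authors' actual network): each simulated subject has a shared visual ability v and an unrelated ability u; face skill always draws on v, while object skill draws on v only in proportion to the subject's individuation experience. All names and parameter values below are assumptions for illustration.

    ```python
    import random

    def pearson(xs, ys):
        """Pearson correlation coefficient between two equal-length lists."""
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        vx = sum((x - mx) ** 2 for x in xs)
        vy = sum((y - my) ** 2 for y in ys)
        return cov / (vx * vy) ** 0.5

    def simulate_correlation(experience, n_subjects=2000, seed=1):
        """Face-object skill correlation at a given experience level (0 to 1)."""
        rng = random.Random(seed)
        faces, objects = [], []
        for _ in range(n_subjects):
            v = rng.gauss(0, 1)  # shared visual ability
            u = rng.gauss(0, 1)  # object-specific, unshared ability
            faces.append(v + rng.gauss(0, 0.3))
            # With more individuation experience, object skill loads more on v.
            objects.append(experience * v + (1 - experience) * u + rng.gauss(0, 0.3))
        return pearson(faces, objects)

    print(f"r at low experience:  {simulate_correlation(0.2):.2f}")
    print(f"r at high experience: {simulate_correlation(0.9):.2f}")
    ```

    Under this toy model the face-object correlation rises with experience, mirroring the behavioral pattern the record describes, without any competition between domains: both draw on the same resource rather than dividing it.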

  4. Visual Equivalence and Amodal Completion in Cuttlefish

    PubMed Central

    Lin, I-Rong; Chiao, Chuan-Chin

    2017-01-01

    Modern cephalopods are notably the most intelligent invertebrates and this is accompanied by keen vision. Despite extensive studies investigating the visual systems of cephalopods, little is known about their visual perception and object recognition. In the present study, we investigated the visual processing of the cuttlefish Sepia pharaonis, including visual equivalence and amodal completion. Cuttlefish were trained to discriminate images of shrimp and fish using an operant conditioning paradigm. After cuttlefish reached the learning criteria, a series of discrimination tasks were conducted. In the visual equivalence experiment, several transformed versions of the training images, such as images reduced in size, images reduced in contrast, sketches of the images, the contours of the images, and silhouettes of the images, were used. In the amodal completion experiment, partially occluded views of the original images were used. The results showed that cuttlefish were able to treat reduced-size versions and sketches of the training images as visually equivalent to them. Cuttlefish were also capable of recognizing partially occluded versions of the training image. Furthermore, individual differences in performance suggest that some cuttlefish may be able to recognize objects even when visual information is partly removed. These findings support the hypothesis that the visual perception of cuttlefish involves both visual equivalence and amodal completion. The results from this research also provide insights into the visual processing mechanisms used by cephalopods. PMID:28220075

  5. How Chinese Semantics Capability Improves Interpretation in Visual Communication

    ERIC Educational Resources Information Center

    Cheng, Chu-Yu; Ou, Yang-Kun; Kin, Ching-Lung

    2017-01-01

    A visual representation involves delivering messages through visually communicated images. The study assumed that semantic recognition can affect visual interpretation ability, and the results showed that students graduating from a general high school achieved better results in semantic recognition and image interpretation tasks than students…

  6. Emotion recognition abilities across stimulus modalities in schizophrenia and the role of visual attention.

    PubMed

    Simpson, Claire; Pinkham, Amy E; Kelsven, Skylar; Sasson, Noah J

    2013-12-01

    Emotion can be expressed by both the voice and face, and previous work suggests that presentation modality may impact emotion recognition performance in individuals with schizophrenia. We investigated the effect of stimulus modality on emotion recognition accuracy and the potential role of visual attention to faces in emotion recognition abilities. Thirty-one patients who met DSM-IV criteria for schizophrenia (n=8) or schizoaffective disorder (n=23) and 30 non-clinical control individuals participated. Both groups identified emotional expressions in three conditions: audio only, visual only, and combined audiovisual. In the visual-only and combined conditions, the time spent visually fixating salient features of the face was recorded. Patients were significantly less accurate than controls in emotion recognition in both the audio-only and visual-only conditions but did not differ from controls in the combined condition. Analysis of visual scanning behaviors demonstrated that patients attended to the mouth less than healthy individuals did in the visual-only condition but did not differ in visual attention to salient facial features in the combined condition, which may in part explain the absence of a deficit for patients in this condition. Collectively, these findings demonstrate that patients benefit from multimodal stimulus presentations of emotion and support hypotheses that visual attention to salient facial features may serve as a mechanism for accurate emotion identification. © 2013.

  7. Short-term testosterone manipulations modulate visual recognition memory and some aspects of emotional reactivity in male rhesus monkeys.

    PubMed

    Lacreuse, Agnès; Gore, Heather E; Chang, Jeemin; Kaplan, Emily R

    2012-05-15

    The role of testosterone (T) in modulating cognitive function and emotion in men remains unclear. The paucity of animal studies has likely contributed to the slow progress in this area. In particular, studies in nonhuman primates have been lacking. Our laboratory has begun to address this issue by pharmacologically manipulating T levels in intact male rhesus monkeys, using blind, placebo-controlled, crossover designs. We previously found that T-suppressed monkeys receiving supraphysiological T for 4 weeks had lower visual recognition memory for long delays and enhanced attention to videos of negative social stimuli (Lacreuse et al., 2009, 2010) compared to when treated with oil. To further delineate the conditions under which T affects cognition and emotion, the present study focused on the short-term effects of physiological T. Six intact males were treated with the gonadotropin-releasing hormone antagonist degarelix (3 mg/kg) for 7 days and received one injection of T enanthate (5 mg/kg) followed by one injection of oil vehicle 7 days later (n=3), or the reverse treatment (n=3). Performance on two computerized tasks (the delayed non-matching-to-sample (DNMS) task with random delays and the object delayed recognition span test (object-DRST)) and on one task of emotional reactivity (an approach/avoidance task with negative, familiar, and novel objects) was examined at baseline and 3-5 days after treatment. DNMS performance was significantly better when monkeys were treated with T compared to oil, independently of the delay duration or the nature (emotional or neutral) of the stimuli. Performance on the object-DRST was unaffected. Interestingly, subtle changes in emotional reactivity were also observed: T administration was associated with fewer object contacts, especially on negative objects, without overt changes in anxious behaviors. These results may reflect increased vigilance and alertness with high T.
Altogether, the data suggest that changes in general arousal may underlie the beneficial effects of T on DNMS performance. This hypothesis will require further study with objective measures of physiological arousal. Copyright © 2012 Elsevier Inc. All rights reserved.

  8. Contextual effects on perceived contrast: figure-ground assignment and orientation contrast.

    PubMed

    Self, Matthew W; Mookhoek, Aart; Tjalma, Nienke; Roelfsema, Pieter R

    2015-02-02

    Figure-ground segregation is an important step in the path leading to object recognition. The visual system segregates objects ('figures') in the visual scene from their backgrounds ('ground'). Electrophysiological studies in awake-behaving monkeys have demonstrated that neurons in early visual areas increase their firing rate when responding to a figure compared to responding to the background. We hypothesized that similar changes in neural firing would take place in early visual areas of the human visual system, leading to changes in the perception of low-level visual features. In this study, we investigated whether contrast perception is affected by figure-ground assignment using stimuli similar to those in the electrophysiological studies in monkeys. We measured contrast discrimination thresholds and perceived contrast for Gabor probes placed on figures or the background and found that the perceived contrast of the probe was increased when it was placed on a figure. Furthermore, we tested how this effect compared with the well-known effect of orientation contrast on perceived contrast. We found that figure-ground assignment and orientation contrast produced changes in perceived contrast of a similar magnitude, and that they interacted. Our results demonstrate that figure-ground assignment influences perceived contrast, consistent with an effect of figure-ground assignment on activity in early visual areas of the human visual system. © 2015 ARVO.

  9. Experience and information loss in auditory and visual memory.

    PubMed

    Gloede, Michele E; Paulauskas, Emily E; Gregg, Melissa K

    2017-07-01

    Recent studies show that recognition memory for sounds is inferior to memory for pictures. Four experiments were conducted to examine the nature of auditory and visual memory. Experiments 1-3 were conducted to evaluate the role of experience in auditory and visual memory. Participants received a study phase with pictures/sounds, followed by a recognition memory test. Participants then completed auditory training with each of the sounds, followed by a second memory test. Despite auditory training in Experiments 1 and 2, visual memory was superior to auditory memory. In Experiment 3, we found that it is possible to improve auditory memory, but only after 3 days of specific auditory training and 3 days of visual memory decay. We examined the time course of information loss in auditory and visual memory in Experiment 4 and found a trade-off between visual and auditory recognition memory: Visual memory appears to have a larger capacity, while auditory memory is more enduring. Our results indicate that visual and auditory memory are inherently different memory systems and that differences in visual and auditory recognition memory performance may be due to the different amounts of experience with visual and auditory information, as well as structurally different neural circuitry specialized for information retention.

  10. Stereo Viewing Modulates Three-Dimensional Shape Processing During Object Recognition: A High-Density ERP Study

    PubMed Central

    2017-01-01

    The role of stereo disparity in the recognition of 3-dimensional (3D) object shape remains an unresolved issue for theoretical models of the human visual system. We examined this issue using high-density (128 channel) recordings of event-related potentials (ERPs). A recognition memory task was used in which observers were trained to recognize a subset of complex, multipart, 3D novel objects under conditions of either (bi-) monocular or stereo viewing. In a subsequent test phase they discriminated previously trained targets from untrained distractor objects that shared either local parts, 3D spatial configuration, or neither dimension, across both previously seen and novel viewpoints. The behavioral data showed a stereo advantage for target recognition at untrained viewpoints. ERPs showed early differential amplitude modulations to shape similarity defined by local part structure and global 3D spatial configuration. This occurred initially during an N1 component around 145–190 ms poststimulus onset, and then subsequently during an N2/P3 component around 260–385 ms poststimulus onset. For mono viewing, amplitude modulation during the N1 was greatest between targets and distractors with different local parts for trained views only. For stereo viewing, amplitude modulation during the N2/P3 was greatest between targets and distractors with different global 3D spatial configurations and generalized across trained and untrained views. The results show that image classification is modulated by stereo information about the local part structure and global 3D spatial configuration of object shape. The findings challenge current theoretical models that do not attribute functional significance to stereo input during the computation of 3D object shape. PMID:29022728

  11. Connecting Art and the Brain: An Artist's Perspective on Visual Indeterminacy

    PubMed Central

    Pepperell, Robert

    2011-01-01

    In this article I will discuss the intersection between art and neuroscience from the perspective of a practicing artist. I have collaborated on several scientific studies into the effects of art on the brain and behavior, looking in particular at the phenomenon of “visual indeterminacy.” This is a perceptual state in which subjects fail to recognize objects from visual cues. I will look at the background to this phenomenon, and show how various artists have exploited its effect through the history of art. My own attempts to create indeterminate images will be discussed, including some of the technical problems I faced in trying to manipulate the viewer's perceptual state through paintings. Visual indeterminacy is not widely studied in neuroscience, although references to it can be found in the literature on visual agnosia and object recognition. I will briefly review some of this work and show how my attempts to understand the science behind visual indeterminacy led me to collaborate with psychophysicists and neuroscientists. After reviewing this work, I will discuss the conclusions I have drawn from its findings and consider the problem of how best to integrate neuroscientific methods with artistic knowledge to create a truly interdisciplinary approach. PMID:21887141

  12. Neural Dissociation of Number from Letter Recognition and Its Relationship to Parietal Numerical Processing

    ERIC Educational Resources Information Center

    Park, Joonkoo; Hebrank, Andrew; Polk, Thad A.; Park, Denise C.

    2012-01-01

    The visual recognition of letters dissociates from the recognition of numbers at both the behavioral and neural level. In this article, using fMRI, we investigate whether the visual recognition of numbers dissociates from letters, thereby establishing a double dissociation. In Experiment 1, participants viewed strings of consonants and Arabic…

  13. Individual Differences in Visual Self-Recognition as a Function of Mother-Infant Attachment Relationship.

    ERIC Educational Resources Information Center

    Lewis, Michael; And Others

    1985-01-01

    Compares attachment relationships of infants at 12 months to their visual self-recognition at both 18 and 24 months. Individual differences in early attachment relations were related to later self-recognition. In particular, insecurely attached infants showed a trend toward earlier self-recognition than did securely attached infants. (Author/NH)

  14. Facial recognition using enhanced pixelized image for simulated visual prosthesis.

    PubMed

    Li, Ruonan; Zhang, Xudong; Zhang, Hui; Hu, Guanshu

    2005-01-01

    A simulated face recognition experiment using enhanced pixelized images is designed and performed for the artificial visual prosthesis. The results of the simulation reveal new characteristics of visual performance in an enhanced pixelization condition, and then new suggestions on the future design of visual prosthesis are provided.

  15. Visual discrimination predicts naming and semantic association accuracy in Alzheimer disease.

    PubMed

    Harnish, Stacy M; Neils-Strunjas, Jean; Eliassen, James; Reilly, Jamie; Meinzer, Marcus; Clark, John Greer; Joseph, Jane

    2010-12-01

    Language impairment is a common symptom of Alzheimer disease (AD), and is thought to be related to semantic processing. This study examines the contribution of another process, namely visual perception, to measures of confrontation naming and semantic association abilities in persons with probable AD. Twenty individuals with probable mild-moderate Alzheimer disease and 20 age-matched controls completed a battery of neuropsychologic measures assessing visual perception, naming, and semantic association ability. Visual discrimination tasks that varied in the degree to which they likely accessed stored structural representations were used to gauge whether structural processing deficits could account for deficits in naming and in semantic association in AD. Visual discrimination abilities of nameable objects in AD strongly predicted performance on both picture naming and semantic association ability, but lacked the same predictive value for controls. Although impaired, performance on visual discrimination tests of abstract shapes and novel faces showed no significant relationship with picture naming and semantic association. These results provide additional evidence to support the view that structural processing deficits exist in AD, and may contribute to object recognition and naming deficits. Our findings suggest that there is a common deficit in discrimination of pictures using nameable objects, picture naming, and semantic association of pictures in AD. Disturbances in structural processing of pictured items may be associated with lexical-semantic impairment in AD, owing to degraded internal storage of structural knowledge.

  16. Size matters: bigger is faster.

    PubMed

    Sereno, Sara C; O'Donnell, Patrick J; Sereno, Margaret E

    2009-06-01

    A largely unexplored aspect of lexical access in visual word recognition is "semantic size"--namely, the real-world size of an object to which a word refers. A total of 42 participants performed a lexical decision task on concrete nouns denoting either big or small objects (e.g., bookcase or teaspoon). Items were matched pairwise on relevant lexical dimensions. Participants' reaction times were reliably faster to semantically "big" versus "small" words. The results are discussed in terms of possible mechanisms, including more active representations for "big" words, due to the ecological importance attributed to large objects in the environment and the relative speed of neural responses to large objects.

  17. Effect of Visual Experience on Face Processing: A Developmental Study of Inversion and Non-Native Effects

    ERIC Educational Resources Information Center

    Sangrigoli, Sandy; de Schonen, Scania

    2004-01-01

    In adults, three phenomena are taken to demonstrate an experience effect on face recognition: an inversion effect, a non-native face effect (so-called "other-race" effect) and their interaction. It is crucial for our understanding of the developmental perception mechanisms of object processing to discover when these effects are present in…

  18. Effects of visual and verbal interference tasks on olfactory memory: the role of task complexity.

    PubMed

    Annett, J M; Leslie, J C

    1996-08-01

    Recent studies have demonstrated that visual and verbal suppression tasks interfere with olfactory memory in a manner which is partially consistent with a dual coding interpretation. However, it has been suggested that total task complexity rather than modality specificity of the suppression tasks might account for the observed pattern of results. This study addressed the issue of whether or not the level of difficulty and complexity of suppression tasks could explain the apparent modality effects noted in earlier experiments. A total of 608 participants were each allocated to one of 19 experimental conditions involving interference tasks which varied suppression type (visual or verbal), nature of complexity (single, double or mixed) and level of difficulty (easy, optimal or difficult) and presented with 13 target odours. Either recognition of the odours or free recall of the odour names was tested on one occasion, either within 15 minutes of presentation or one week later. Both recognition and recall performance showed an overall effect for suppression nature, suppression level and time of testing with no effect for suppression type. The results lend only limited support to Paivio's (1986) dual coding theory, but have a number of characteristics which suggest that an adequate account of olfactory memory may be broadly similar to current theories of face and object recognition. All of these phenomena might be dealt with by an appropriately modified version of dual coding theory.

  19. Common constraints limit Korean and English character recognition in peripheral vision.

    PubMed

    He, Yingchen; Kwon, MiYoung; Legge, Gordon E

    2018-01-01

    The visual span refers to the number of adjacent characters that can be recognized in a single glance. It is viewed as a sensory bottleneck in reading for both normal and clinical populations. In peripheral vision, the visual span for English characters can be enlarged after training with a letter-recognition task. Here, we examined the transfer of training from Korean to English characters for a group of bilingual Korean native speakers. In the pre- and posttests, we measured visual spans for Korean characters and English letters. Training (1.5 hours × 4 days) consisted of repetitive visual-span measurements for Korean trigrams (strings of three characters). Our training enlarged the visual spans for Korean single characters and trigrams, and the benefit transferred to untrained English symbols. The improvement was largely due to a reduction of within-character and between-character crowding in Korean recognition, as well as between-letter crowding in English recognition. We also found a negative correlation between the size of the visual span and the average pattern complexity of the symbol set. Together, our results showed that the visual span is limited by common sensory (crowding) and physical (pattern complexity) factors regardless of the language script, providing evidence that the visual span reflects a universal bottleneck for text recognition.

