Storage of features, conjunctions and objects in visual working memory.
Vogel, E K; Woodman, G F; Luck, S J
2001-02-01
Working memory can be divided into separate subsystems for verbal and visual information. Although the verbal system has been well characterized, the storage capacity of visual working memory has not yet been established for simple features or for conjunctions of features. The authors demonstrate that it is possible to retain information about only 3-4 colors or orientations in visual working memory at one time. Observers are also able to retain both the color and the orientation of 3-4 objects, indicating that visual working memory stores integrated objects rather than individual features. Indeed, objects defined by a conjunction of four features can be retained in working memory just as well as single-feature objects, allowing many individual features to be retained when distributed across a small number of objects. Thus, the capacity of visual working memory must be understood in terms of integrated objects rather than individual features.
Visual Prediction Error Spreads Across Object Features in Human Visual Cortex
Summerfield, Christopher; Egner, Tobias
2016-01-01
Visual cognition is thought to rely heavily on contextual expectations. Accordingly, previous studies have revealed distinct neural signatures for expected versus unexpected stimuli in visual cortex. However, it is presently unknown how the brain combines multiple concurrent stimulus expectations such as those we have for different features of a familiar object. To understand how an unexpected object feature affects the simultaneous processing of other expected feature(s), we combined human fMRI with a task that independently manipulated expectations for color and motion features of moving-dot stimuli. Behavioral data and neural signals from visual cortex were then interrogated to adjudicate between three possible ways in which prediction error (surprise) in the processing of one feature might affect the concurrent processing of another, expected feature: (1) feature processing may be independent; (2) surprise might “spread” from the unexpected to the expected feature, rendering the entire object unexpected; or (3) pairing a surprising feature with an expected feature might promote the inference that the two features are not in fact part of the same object. To formalize these rival hypotheses, we implemented them in a simple computational model of multifeature expectations. Across a range of analyses, behavior and visual neural signals consistently supported a model that assumes a mixing of prediction error signals across features: surprise in one object feature spreads to its other feature(s), thus rendering the entire object unexpected. These results reveal neurocomputational principles of multifeature expectations and indicate that objects are the unit of selection for predictive vision.
Significance statement: We address a key question in predictive visual cognition: how does the brain combine multiple concurrent expectations for different features of a single object such as its color and motion trajectory? By combining a behavioral protocol that independently varies expectation of (and attention to) multiple object features with computational modeling and fMRI, we demonstrate that behavior and fMRI activity patterns in visual cortex are best accounted for by a model in which prediction error in one object feature spreads to other object features. These results demonstrate how predictive vision forms object-level expectations out of multiple independent features. PMID:27810936
Object-based attention underlies the rehearsal of feature binding in visual working memory.
Shen, Mowei; Huang, Xiang; Gao, Zaifeng
2015-04-01
Feature binding is a core concept in many research fields, including the study of working memory (WM). Over the past decade, it has been debated whether keeping the feature binding in visual WM consumes more visual attention than the constituent single features. Previous studies have only explored the contribution of domain-general attention or space-based attention in the binding process; no study so far has explored the role of object-based attention in retaining binding in visual WM. We hypothesized that object-based attention underlay the mechanism of rehearsing feature binding in visual WM. Therefore, during the maintenance phase of a visual WM task, we inserted a secondary mental rotation (Experiments 1-3), transparent motion (Experiment 4), or an object-based feature report task (Experiment 5) to consume the object-based attention available for binding. In line with the prediction of the object-based attention hypothesis, Experiments 1-5 revealed a more significant impairment for binding than for constituent single features. However, this selective binding impairment was not observed when inserting a space-based visual search task (Experiment 6). We conclude that object-based attention underlies the rehearsal of binding representation in visual WM. (c) 2015 APA, all rights reserved.
Douglas, Danielle; Newsome, Rachel N; Man, Louisa LY
2018-01-01
A significant body of research in cognitive neuroscience is aimed at understanding how object concepts are represented in the human brain. However, it remains unknown whether and where the visual and abstract conceptual features that define an object concept are integrated. We addressed this issue by comparing the neural pattern similarities among object-evoked fMRI responses with behavior-based models that independently captured the visual and conceptual similarities among these stimuli. Our results revealed evidence for distinctive coding of visual features in lateral occipital cortex, and conceptual features in the temporal pole and parahippocampal cortex. By contrast, we found evidence for integrative coding of visual and conceptual object features in perirhinal cortex. The neuroanatomical specificity of this effect was highlighted by results from a searchlight analysis. Taken together, our findings suggest that perirhinal cortex uniquely supports the representation of fully specified object concepts through the integration of their visual and conceptual features. PMID:29393853
Internal attention to features in visual short-term memory guides object learning
Fan, Judith E.; Turk-Browne, Nicholas B.
2013-01-01
Attending to objects in the world affects how we perceive and remember them. What are the consequences of attending to an object in mind? In particular, how does reporting the features of a recently seen object guide visual learning? In three experiments, observers were presented with abstract shapes in a particular color, orientation, and location. After viewing each object, observers were cued to report one feature from visual short-term memory (VSTM). In a subsequent test, observers were cued to report features of the same objects from visual long-term memory (VLTM). We tested whether reporting a feature from VSTM: (1) enhances VLTM for just that feature (practice-benefit hypothesis), (2) enhances VLTM for all features (object-based hypothesis), or (3) simultaneously enhances VLTM for that feature and suppresses VLTM for unreported features (feature-competition hypothesis). The results provided support for the feature-competition hypothesis, whereby the representation of an object in VLTM was biased towards features reported from VSTM and away from unreported features (Experiment 1). This bias could not be explained by the amount of sensory exposure or response learning (Experiment 2) and was amplified by the reporting of multiple features (Experiment 3). Taken together, these results suggest that selective internal attention induces competitive dynamics among features during visual learning, flexibly tuning object representations to align with prior mnemonic goals. PMID:23954925
Impact of feature saliency on visual category learning
Hammer, Rubi
2015-01-01
People have to sort numerous objects into a large number of meaningful categories while operating in varying contexts. This requires identifying the visual features that best predict the ‘essence’ of objects (e.g., edibility), rather than categorizing objects based on the most salient features in a given context. To gain this capacity, visual category learning (VCL) relies on multiple cognitive processes. These may include unsupervised statistical learning, which requires observing multiple objects in order to learn the statistics of their features. Other learning processes enable incorporating different sources of supervisory information, alongside the visual features of the categorized objects, from which the categorical relations between a few objects can be deduced. These deductions enable inferring that objects from the same category may differ from one another in some high-saliency feature dimensions, whereas lower-saliency feature dimensions can best differentiate objects from distinct categories. Here I illustrate how feature saliency affects VCL and discuss the kinds of supervisory information that enable reflective categorization. Arguably, the principles debated here are often ignored in categorization studies. PMID:25954220
Generic decoding of seen and imagined objects using hierarchical visual features.
Horikawa, Tomoyasu; Kamitani, Yukiyasu
2017-05-22
Object recognition is a key function in both human and machine vision. While brain decoding of seen and imagined objects has been achieved, the prediction is limited to training examples. We present a decoding approach for arbitrary objects using the machine vision principle that an object category is represented by a set of features rendered invariant through hierarchical processing. We show that visual features, including those derived from a deep convolutional neural network, can be predicted from fMRI patterns, and that greater accuracy is achieved for low-/high-level features with lower-/higher-level visual areas, respectively. Predicted features are used to identify seen/imagined object categories (extending beyond decoder training) from a set of computed features for numerous object images. Furthermore, decoding of imagined objects reveals progressive recruitment of higher-to-lower visual representations. Our results demonstrate a homology between human and machine vision and its utility for brain-based information retrieval.
Wen, Haiguang; Shi, Junxing; Chen, Wei; Liu, Zhongming
2018-02-28
The brain represents visual objects with topographic cortical patterns. To address how distributed visual representations enable object categorization, we established predictive encoding models based on a deep residual network, and trained them to predict cortical responses to natural movies. Using this predictive model, we mapped human cortical representations to 64,000 visual objects from 80 categories with high throughput and accuracy. Such representations covered both the ventral and dorsal pathways, reflected multiple levels of object features, and preserved semantic relationships between categories. Across the entire visual cortex, object representations were organized into three clusters of categories: biological objects, non-biological objects, and background scenes. At a finer scale specific to each cluster, object representations revealed sub-clusters for further categorization. Such hierarchical clustering of category representations was driven mostly by cortical representations of object features from middle to high levels. In summary, this study demonstrates a useful computational strategy to characterize the cortical organization and representations of visual features for rapid categorization.
Humphreys, Glyn W
2016-10-01
The Treisman Bartlett lecture, reported in the Quarterly Journal of Experimental Psychology in 1988, provided a major overview of the feature integration theory of attention. This has continued to be a dominant account of human visual attention to this day. The current paper provides a summary of the work reported in the lecture and an update on critical aspects of the theory as applied to visual object perception. The paper highlights the emergence of findings that pose significant challenges to the theory and which suggest that revisions are required that allow for (a) several rather than a single form of feature integration, (b) some forms of feature integration to operate preattentively, (c) stored knowledge about single objects and interactions between objects to modulate perceptual integration, (d) the application of feature-based inhibition to object files where visual features are specified, which generates feature-based spreading suppression and scene segmentation, and (e) a role for attention in feature confirmation rather than feature integration in visual selection. A feature confirmation account of attention in object perception is outlined.
Shape and color conjunction stimuli are represented as bound objects in visual working memory.
Luria, Roy; Vogel, Edward K
2011-05-01
The integrated object view of visual working memory (WM) argues that objects (rather than features) are the building blocks of visual WM, so that adding an extra feature to an object does not result in any extra cost to WM capacity. Alternative views have shown that complex objects consume additional WM storage capacity so that they may not be represented as bound objects. Additionally, it was argued that two features from the same dimension (i.e., color-color) do not form an integrated object in visual WM. This led some to argue for a "weak" object view of visual WM. We used the contralateral delay activity (the CDA) as an electrophysiological marker of WM capacity, to test those alternative hypotheses to the integrated object account. In two experiments we presented complex stimuli and color-color conjunction stimuli, and compared performance in displays that had one object but varying degrees of feature complexity. The results supported the integrated object account by showing that the CDA amplitude corresponded to the number of objects regardless of the number of features within each object, even for complex objects or color-color conjunction stimuli. Copyright © 2010 Elsevier Ltd. All rights reserved.
Hardman, Kyle; Cowan, Nelson
2014-01-01
Visual working memory stores stimuli from our environment as representations that can be accessed by high-level control processes. This study addresses a longstanding debate in the literature about whether storage limits in visual working memory include a limit to the complexity of discrete items. We examined the issue with a number of change-detection experiments that used complex stimuli which possessed multiple features per stimulus item. We manipulated the number of relevant features of the stimulus objects in order to vary feature load. In all of our experiments, we found that increased feature load led to a reduction in change-detection accuracy. However, we found that feature load alone could not account for the results, but that a consideration of the number of relevant objects was also required. This study supports capacity limits for both feature and object storage in visual working memory. PMID:25089739
Experience improves feature extraction in Drosophila.
Peng, Yueqing; Xi, Wang; Zhang, Wei; Zhang, Ke; Guo, Aike
2007-05-09
Previous exposure to a pattern in the visual scene can enhance subsequent recognition of that pattern in many species from honeybees to humans. However, whether previous experience with a visual feature of an object, such as color or shape, can also facilitate later recognition of that particular feature from multiple visual features is largely unknown. Visual feature extraction is the ability to select the key component from multiple visual features. Using a visual flight simulator, we designed a novel protocol for visual feature extraction to investigate the effects of previous experience on visual reinforcement learning in Drosophila. We found that, after conditioning with a visual feature of objects among combinatorial shape-color features, wild-type flies exhibited poor ability to extract the correct visual feature. However, the ability for visual feature extraction was greatly enhanced in flies trained previously with that visual feature alone. Moreover, we demonstrated that flies might possess the ability to extract the abstract category of "shape" but not a particular shape. Finally, this experience-dependent feature extraction is absent in flies with defective mushroom bodies (MBs), one of the central brain structures in Drosophila. Our results indicate that previous experience can enhance visual feature extraction in Drosophila and that MBs are required for this experience-dependent visual cognition.
Conjunctive Coding of Complex Object Features
Erez, Jonathan; Cusack, Rhodri; Kendall, William; Barense, Morgan D.
2016-01-01
Critical to perceiving an object is the ability to bind its constituent features into a cohesive representation, yet the manner by which the visual system integrates object features to yield a unified percept remains unknown. Here, we present a novel application of multivoxel pattern analysis of neuroimaging data that allows a direct investigation of whether neural representations integrate object features into a whole that is different from the sum of its parts. We found that patterns of activity throughout the ventral visual stream (VVS), extending anteriorly into the perirhinal cortex (PRC), discriminated between the same features combined into different objects. Despite this sensitivity to the unique conjunctions of features comprising objects, activity in regions of the VVS, again extending into the PRC, was invariant to the viewpoints from which the conjunctions were presented. These results suggest that the manner in which our visual system processes complex objects depends on the explicit coding of the conjunctions of features comprising them. PMID:25921583
Task-relevant perceptual features can define categories in visual memory too.
Antonelli, Karla B; Williams, Carrick C
2017-11-01
Although Konkle, Brady, Alvarez, and Oliva (2010, Journal of Experimental Psychology: General, 139(3), 558) claim that visual long-term memory (VLTM) is organized on underlying conceptual, not perceptual, information, visual memory results from visual search tasks are not well explained by this theory. We hypothesized that when viewing an object, any task-relevant visual information is critical to the organizational structure of VLTM. In two experiments, we examined the organization of VLTM by measuring the amount of retroactive interference created by objects possessing different combinations of task-relevant features. Based on task instructions, only the conceptual category was task relevant or both the conceptual category and a perceptual object feature were task relevant. Findings indicated that when made task relevant, perceptual object feature information, along with conceptual category information, could affect memory organization for objects in VLTM. However, when perceptual object feature information was task irrelevant, it did not contribute to memory organization; instead, memory defaulted to being organized around conceptual category information. These findings support the theory that a task-defined organizational structure is created in VLTM based on the relevance of particular object features and information.
Contini, Erika W; Wardle, Susan G; Carlson, Thomas A
2017-10-01
Visual object recognition is a complex, dynamic process. Multivariate pattern analysis methods, such as decoding, have begun to reveal how the brain processes complex visual information. Recently, temporal decoding methods for EEG and MEG have offered the potential to evaluate the temporal dynamics of object recognition. Here we review the contribution of M/EEG time-series decoding methods to understanding visual object recognition in the human brain. Consistent with the current understanding of the visual processing hierarchy, low-level visual features dominate decodable object representations early in the time-course, with more abstract representations related to object category emerging later. A key finding is that the time-course of object processing is highly dynamic and rapidly evolving, with limited temporal generalisation of decodable information. Several studies have examined the emergence of object category structure, and we consider to what degree category decoding can be explained by sensitivity to low-level visual features. Finally, we evaluate recent work attempting to link human behaviour to the neural time-course of object processing. Copyright © 2017 Elsevier Ltd. All rights reserved.
Ip, Ifan Betina; Bridge, Holly; Parker, Andrew J.
2014-01-01
An important advance in the study of visual attention has been the identification of a non-spatial component of attention that enhances the response to similar features or objects across the visual field. Here we test whether this non-spatial component can co-select individual features that are perceptually bound into a coherent object. We combined human psychophysics and functional magnetic resonance imaging (fMRI) to demonstrate the ability to co-select individual features from perceptually coherent objects. Our study used binocular disparity and visual motion to define disparity structure-from-motion (dSFM) stimuli. Although the spatial attention system induced strong modulations of the fMRI response in visual regions, the non-spatial system’s ability to co-select features of the dSFM stimulus was less pronounced and variable across subjects. Our results demonstrate that feature and global feature attention effects are variable across participants, suggesting that the feature attention system may be limited in its ability to automatically select features within the attended object. Careful comparison of the task design suggests that even minor differences in the perceptual task may be critical in revealing the presence of global feature attention. PMID:24936974
Graewe, Britta; De Weerd, Peter; Farivar, Reza; Castelo-Branco, Miguel
2012-01-01
Many studies have linked the processing of different object categories to specific event-related potentials (ERPs) such as the face-specific N170. Despite reports showing that object-related ERPs are influenced by visual stimulus features, there is consensus that these components primarily reflect categorical aspects of the stimuli. Here, we re-investigated this idea by systematically measuring the effects of visual feature manipulations on ERP responses elicited by both structure-from-motion (SFM)-defined and luminance-defined object stimuli. SFM objects elicited a novel component at 200–250 ms (N250) over parietal and posterior temporal sites. We found, however, that the N250 amplitude was unaffected by restructuring SFM stimuli into meaningless objects based on identical visual cues. This suggests that this N250 peak was not uniquely linked to categorical aspects of the objects, but is strongly determined by visual stimulus features. We provide strong support for this hypothesis by parametrically manipulating the depth range of both SFM- and luminance-defined object stimuli and showing that the N250 evoked by SFM stimuli as well as the well-known N170 to static faces were sensitive to this manipulation. Importantly, this effect could not be attributed to compromised object categorization in low depth stimuli, confirming a strong impact of visual stimulus features on object-related ERP signals. As ERP components linked with visual categorical object perception are likely determined by multiple stimulus features, this creates an interesting inverse problem when deriving specific perceptual processes from variations in ERP components. PMID:22363479
Exploration of complex visual feature spaces for object perception
Leeds, Daniel D.; Pyles, John A.; Tarr, Michael J.
2014-01-01
The mid- and high-level visual properties supporting object perception in the ventral visual pathway are poorly understood. In the absence of well-specified theory, many groups have adopted a data-driven approach in which they progressively interrogate neural units to establish each unit's selectivity. Such methods are challenging in that they require search through a wide space of feature models and stimuli using a limited number of samples. To more rapidly identify higher-level features underlying human cortical object perception, we implemented a novel functional magnetic resonance imaging method in which visual stimuli are selected in real-time based on BOLD responses to recently shown stimuli. This work was inspired by earlier primate physiology work, in which neural selectivity for mid-level features in IT was characterized using a simple parametric approach (Hung et al., 2012). To extend such work to human neuroimaging, we used natural and synthetic object stimuli embedded in feature spaces constructed on the basis of the complex visual properties of the objects themselves. During fMRI scanning, we employed a real-time search method to control continuous stimulus selection within each image space. This search was designed to maximize neural responses across a pre-determined 1 cm³ brain region within ventral cortex. To assess the value of this method for understanding object encoding, we examined both the behavior of the method itself and the complex visual properties the method identified as reliably activating selected brain regions. We observed: (1) regions selective for both holistic and component object features and for a variety of surface properties; and (2) object stimulus pairs near one another in feature space that produce responses at the opposite extremes of the measured activity range.
Together, these results suggest that real-time fMRI methods may yield more widely informative measures of selectivity within the broad classes of visual features associated with cortical object representation. PMID:25309408
Feature Binding in Visual Working Memory Evaluated by Type Identification Paradigm
Saiki, Jun; Miyatsuji, Hirofumi
2007-01-01
Memory for feature binding comprises a key ingredient in coherent object representations. Previous studies have been equivocal about human capacity for objects in the visual working memory. To evaluate memory for feature binding, a type identification paradigm was devised and used with a multiple-object permanence tracking task. Using objects…
Combining local and global limitations of visual search.
Põder, Endel
2017-04-01
There are different opinions about the roles of local interactions and central processing capacity in visual search. This study attempts to clarify the problem using a new version of relevant set cueing. A central precue indicates two symmetrical segments (that may contain a target object) within a circular array of objects presented briefly around the fixation point. The number of objects in the relevant segments, and density of objects in the array were varied independently. Three types of search experiments were run: (a) search for a simple visual feature (color, size, and orientation); (b) conjunctions of simple features; and (c) spatial configuration of simple features (rotated Ts). For spatial configuration stimuli, the results were consistent with a fixed global processing capacity and standard crowding zones. For simple features and their conjunctions, the results were different, dependent on the features involved. While color search exhibits virtually no capacity limits or crowding, search for an orientation target was limited by both. Results for conjunctions of features can be partly explained by the results from the respective features. This study shows that visual search is limited by both local interference and global capacity, and the limitations are different for different visual features.
The impact of attentional, linguistic, and visual features during object naming
Clarke, Alasdair D. F.; Coco, Moreno I.; Keller, Frank
2013-01-01
Object detection and identification are fundamental to human vision, and there is mounting evidence that objects guide the allocation of visual attention. However, the role of objects in tasks involving multiple modalities is less clear. To address this question, we investigate object naming, a task in which participants have to verbally identify objects they see in photorealistic scenes. We report an eye-tracking study that investigates which features (attentional, visual, and linguistic) influence object naming. We find that the amount of visual attention directed toward an object, its position and saliency, along with linguistic factors such as word frequency, animacy, and semantic proximity, significantly influence whether the object will be named or not. We then ask how features from different modalities are combined during naming, and find significant interactions between saliency and position, saliency and linguistic features, and attention and position. We conclude that when the cognitive system performs tasks such as object naming, it uses input from one modality to constrain or enhance the processing of other modalities, rather than processing each input modality independently. PMID:24379792
How Object-Specific Are Object Files? Evidence for Integration by Location
ERIC Educational Resources Information Center
van Dam, Wessel O.; Hommel, Bernhard
2010-01-01
Given the distributed representation of visual features in the human brain, binding mechanisms are necessary to integrate visual information about the same perceptual event. It has been assumed that feature codes are bound into object files--pointers to the neural codes of the features of a given event. The present study investigated the…
Spatial resolution in visual memory.
Ben-Shalom, Asaf; Ganel, Tzvi
2015-04-01
Representations in visual short-term memory are considered to contain relatively elaborated information on object structure. Conversely, representations in earlier stages of the visual hierarchy are thought to be dominated by a sensory-based, feed-forward buildup of information. In four experiments, we compared the spatial resolution of different object properties between two points in time along the processing hierarchy in visual short-term memory. Subjects were asked either to estimate the distance between objects or to estimate the size of one of the objects' features under two experimental conditions, of either a short or a long delay period between the presentation of the target stimulus and the probe. When different objects were referred to, similar spatial resolution was found for the two delay periods, suggesting that initial processing stages are sensitive to object-based properties. Conversely, superior resolution was found for the short, as compared with the long, delay when features were referred to. These findings suggest that initial representations in visual memory are hybrid in that they allow fine-grained resolution for object features alongside normal visual sensitivity to the segregation between objects. The findings are also discussed in reference to the distinction made in earlier studies between visual short-term memory and iconic memory.
Rosselli, Federica B.; Alemi, Alireza; Ansuini, Alessio; Zoccolan, Davide
2015-01-01
In recent years, a number of studies have explored the possible use of rats as models of high-level visual functions. One central question at the root of such an investigation is to understand whether rat object vision relies on the processing of visual shape features or, rather, on lower-order image properties (e.g., overall brightness). In a recent study, we have shown that rats are capable of extracting multiple features of an object that are diagnostic of its identity, at least when those features are, structure-wise, distinct enough to be parsed by the rat visual system. In the present study, we have assessed the impact of object structure on rat perceptual strategy. We trained rats to discriminate between two structurally similar objects, and compared their recognition strategies with those reported in our previous study. We found that, under conditions of lower stimulus discriminability, rat visual discrimination strategy becomes more view-dependent and subject-dependent. Rats were still able to recognize the target objects, in a way that was largely tolerant (i.e., invariant) to object transformation; however, the larger structural and pixel-wise similarity affected the way objects were processed. Compared to the findings of our previous study, the patterns of diagnostic features were: (i) smaller and more scattered; (ii) only partially preserved across object views; and (iii) only partially reproducible across rats. On the other hand, rats were still found to adopt a multi-featural processing strategy and to make use of part of the optimal discriminatory information afforded by the two objects. Our findings suggest that, as in humans, rat invariant recognition can flexibly rely on either view-invariant representations of distinctive object features or view-specific object representations, acquired through learning. PMID:25814936
The Benefit of Surface Uniformity for Encoding Boundary Features in Visual Working Memory
ERIC Educational Resources Information Center
Kim, Sung-Ho; Kim, Jung-Oh
2011-01-01
Using a change detection paradigm, the present study examined an object-based encoding benefit in visual working memory (VWM) for two boundary features (two orientations in Experiments 1-2 and two shapes in Experiments 3-4) assigned to a single object. Participants remembered more boundary features when they were conjoined into a single object of…
Attentive Tracking Disrupts Feature Binding in Visual Working Memory
Fougnie, Daryl; Marois, René
2009-01-01
One of the most influential theories in visual cognition proposes that attention is necessary to bind different visual features into coherent object percepts (Treisman & Gelade, 1980). While considerable evidence supports a role for attention in perceptual feature binding, whether attention plays a similar function in visual working memory (VWM) remains controversial. To test the attentional requirements of VWM feature binding, here we gave participants an attention-demanding multiple object tracking task during the retention interval of a VWM task. Results show that the tracking task disrupted memory for color-shape conjunctions above and beyond any impairment to working memory for object features, and that this impairment was larger when the VWM stimuli were presented at different spatial locations. These results demonstrate that the role of visuospatial attention in feature binding is not unique to perception, but extends to the working memory of these perceptual representations as well. PMID:19609460
THE ROLE OF THE HIPPOCAMPUS IN OBJECT DISCRIMINATION BASED ON VISUAL FEATURES.
Levcik, David; Nekovarova, Tereza; Antosova, Eliska; Stuchlik, Ales; Klement, Daniel
2018-06-07
The role of rodent hippocampus has been intensively studied in different cognitive tasks. However, its role in discrimination of objects remains controversial due to conflicting findings. We tested whether the number and type of features available for the identification of objects might affect the strategy (hippocampal-independent vs. hippocampal-dependent) that rats adopt to solve object discrimination tasks. We trained rats to discriminate 2D visual objects presented on a computer screen. The objects were defined either by their shape only or by multiple features (a combination of filling pattern and brightness in addition to the shape). Our data showed that objects displayed as simple geometric shapes are not discriminated by trained rats after their hippocampi had been bilaterally inactivated by the GABA-A agonist muscimol. On the other hand, objects containing a specific combination of non-geometric features in addition to the shape are discriminated even without the hippocampus. Our results suggest that the involvement of the hippocampus in visual object discrimination depends on the abundance of an object's features. Copyright © 2018. Published by Elsevier Inc.
Horikawa, Tomoyasu; Kamitani, Yukiyasu
2017-01-01
Dreaming is generally thought to be generated by spontaneous brain activity during sleep with patterns common to waking experience. This view is supported by a recent study demonstrating that dreamed objects can be predicted from brain activity during sleep using statistical decoders trained with stimulus-induced brain activity. However, it remains unclear whether and how visual image features associated with dreamed objects are represented in the brain. In this study, we used a deep neural network (DNN) model for object recognition as a proxy for hierarchical visual feature representation, and DNN features for dreamed objects were analyzed with brain decoding of fMRI data collected during dreaming. The decoders were first trained with stimulus-induced brain activity labeled with the feature values of the stimulus image from multiple DNN layers. The decoders were then used to decode DNN features from the dream fMRI data, and the decoded features were compared with the averaged features of each object category calculated from a large-scale image database. We found that the feature values decoded from the dream fMRI data positively correlated with those associated with dreamed object categories at mid- to high-level DNN layers. Using the decoded features, the dreamed object category could be identified at above-chance levels by matching them to the averaged features for candidate categories. The results suggest that dreaming recruits hierarchical visual feature representations associated with objects, which may support phenomenal aspects of dream experience.
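The final identification step described above (matching decoded features to the averaged features of candidate categories) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the use of Pearson correlation as the matching score, the function names `pearson` and `identify_category`, and the four-dimensional toy vectors are all assumptions made for the example.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def identify_category(decoded, category_means):
    """Pick the candidate whose averaged feature vector correlates best
    with the decoded feature vector (assumed matching score)."""
    return max(category_means,
               key=lambda name: pearson(decoded, category_means[name]))


# Hypothetical toy data: averaged feature vectors for two candidate
# categories and one decoded feature vector.
category_means = {
    "person": [0.9, 0.1, 0.4, 0.8],
    "car": [0.1, 0.9, 0.7, 0.2],
}
decoded = [0.7, 0.3, 0.5, 0.6]

best = identify_category(decoded, category_means)
print(best)
```

With two candidates, chance level is 50%, so "above-chance" identification in this setting means that the correct category wins this comparison more often than half the time across samples.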
Störmer, Viola S; Li, Shu-Chen; Heekeren, Hauke R; Lindenberger, Ulman
2011-02-01
The ability to attend to multiple objects that move in the visual field is important for many aspects of daily functioning. The attentional capacity for such dynamic tracking, however, is highly limited and undergoes age-related decline. Several aspects of the tracking process can influence performance. Here, we investigated effects of feature-based interference from distractor objects that appear in unattended regions of the visual field with a hemifield-tracking task. Younger and older participants performed an attentional tracking task in one hemifield while distractor objects were concurrently presented in the unattended hemifield. Feature similarity between objects in the attended and unattended hemifields as well as motion speed and the number of to-be-tracked objects were parametrically manipulated. The results show that increasing feature overlap leads to greater interference from the unattended visual field. This effect of feature-based interference was only present in the slow speed condition, indicating that the interference is mainly modulated by perceptual demands. High-performing older adults showed a similar interference effect as younger adults, whereas low-performing adults showed poor tracking performance overall.
Objects Classification by Learning-Based Visual Saliency Model and Convolutional Neural Network.
Li, Na; Zhao, Xinbo; Yang, Yongjia; Zou, Xiaochun
2016-01-01
Humans can easily classify different kinds of objects, whereas this remains quite difficult for computers. Object classification is a popular and difficult problem that has attracted extensive interest and has broad prospects. Deep learning was proposed under the inspiration of neuroscience. The convolutional neural network (CNN), one deep learning method, can be used to solve classification problems. However, most deep learning methods, including CNNs, ignore the human visual information processing mechanism at work when a person classifies objects. Therefore, in this paper, inspired by the complete process by which humans classify different kinds of objects, we put forward a new classification method that combines a visual attention model with a CNN. First, we use the visual attention model to simulate the human visual selection mechanism. Second, we use the CNN to simulate how humans select features, and extract the local features of the selected areas. Finally, our classification method not only depends on those local features but also adds human semantic features to classify objects. Our classification method has apparent advantages from a biological standpoint. Experimental results demonstrated that our method significantly improved classification efficiency.
Olivers, Christian N L; Meijer, Frank; Theeuwes, Jan
2006-10-01
In 7 experiments, the authors explored whether visual attention (the ability to select relevant visual information) and visual working memory (the ability to retain relevant visual information) share the same content representations. The presence of singleton distractors interfered more strongly with a visual search task when it was accompanied by an additional memory task. Singleton distractors interfered even more when they were identical or related to the object held in memory, but only when it was difficult to verbalize the memory content. Furthermore, this content-specific interaction occurred for features that were relevant to the memory task but not for irrelevant features of the same object or for once-remembered objects that could be forgotten. Finally, memory-related distractors attracted more eye movements but did not result in longer fixations. The results demonstrate memory-driven attentional capture on the basis of content-specific representations. Copyright 2006 APA.
Helo, Andrea; van Ommen, Sandrien; Pannasch, Sebastian; Danteny-Dordoigne, Lucile; Rämä, Pia
2017-11-01
Conceptual representations of everyday scenes are built in interaction with the visual environment, and these representations guide our visual attention. Perceptual features and object-scene semantic consistency have been found to attract our attention during scene exploration. The present study examined how visual attention in 24-month-old toddlers is attracted by semantic violations and how perceptual features (i.e., saliency, centre distance, clutter and object size) and linguistic properties (i.e., object label frequency and label length) affect gaze distribution. We compared eye movements of 24-month-old toddlers and adults while exploring everyday scenes which either contained an inconsistent (e.g., soap on a breakfast table) or consistent (e.g., soap in a bathroom) object. Perceptual features such as saliency, centre distance and clutter of the scene affected looking times in the toddler group during the whole viewing time, whereas looking times in adults were affected only by centre distance during the early viewing time. Adults looked longer at inconsistent than consistent objects whether the objects had high or low saliency. In contrast, toddlers showed a semantic consistency effect only when objects were highly salient. Additionally, toddlers with lower vocabulary skills looked longer at inconsistent objects, while toddlers with higher vocabulary skills looked equally long at both consistent and inconsistent objects. Our results indicate that 24-month-old children use scene context to guide visual attention when exploring the visual environment. However, perceptual features have a stronger influence on eye movement guidance in toddlers than in adults. Our results also indicate that language skills influence cognitive but not perceptual guidance of eye movements during scene perception in toddlers. Copyright © 2017 Elsevier Inc. All rights reserved.
Weighted feature selection criteria for visual servoing of a telerobot
NASA Technical Reports Server (NTRS)
Feddema, John T.; Lee, C. S. G.; Mitchell, O. R.
1989-01-01
Because of the continually changing environment of a space station, visual feedback is a vital element of a telerobotic system. A real time visual servoing system would allow a telerobot to track and manipulate randomly moving objects. Methodologies for the automatic selection of image features to be used to visually control the relative position between an eye-in-hand telerobot and a known object are devised. A weighted criteria function with both image recognition and control components is used to select the combination of image features which provides the best control. Simulation and experimental results of a PUMA robot arm visually tracking a randomly moving carburetor gasket with a visual update time of 70 milliseconds are discussed.
Comparing visual representations across human fMRI and computational vision
Leeds, Daniel D.; Seibert, Darren A.; Pyles, John A.; Tarr, Michael J.
2013-01-01
Feedforward visual object perception recruits a cortical network that is assumed to be hierarchical, progressing from basic visual features to complete object representations. However, the nature of the intermediate features related to this transformation remains poorly understood. Here, we explore how well different computer vision recognition models account for neural object encoding across the human cortical visual pathway as measured using fMRI. These neural data, collected during the viewing of 60 images of real-world objects, were analyzed with a searchlight procedure as in Kriegeskorte, Goebel, and Bandettini (2006): Within each searchlight sphere, the obtained patterns of neural activity for all 60 objects were compared to model responses for each computer recognition algorithm using representational dissimilarity analysis (Kriegeskorte et al., 2008). Although each of the computer vision methods significantly accounted for some of the neural data, among the different models, the scale invariant feature transform (Lowe, 2004), encoding local visual properties gathered from “interest points,” was best able to accurately and consistently account for stimulus representations within the ventral pathway. More generally, when present, significance was observed in regions of the ventral-temporal cortex associated with intermediate-level object perception. Differences in model effectiveness and the neural location of significant matches may be attributable to the fact that each model implements a different featural basis for representing objects (e.g., more holistic or more parts-based). Overall, we conclude that well-known computer vision recognition systems may serve as viable proxies for theories of intermediate visual object representation. PMID:24273227
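The comparison at the heart of representational dissimilarity analysis can be sketched in a few lines. This is a minimal sketch under stated assumptions: dissimilarity is taken as 1 minus the Pearson correlation between response patterns, the neural and model dissimilarity matrices are compared with a plain Pearson correlation over their upper triangles (rank correlations are also common), and the helper names `rdm_upper` and `rdm_match` and all toy values are hypothetical. The abstract does not specify these details.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def rdm_upper(patterns):
    """Upper triangle of the dissimilarity matrix: 1 - r for each stimulus pair."""
    return [1.0 - pearson(patterns[i], patterns[j])
            for i, j in combinations(range(len(patterns)), 2)]


def rdm_match(neural_patterns, model_patterns):
    """Agreement between the neural and model dissimilarity structures."""
    return pearson(rdm_upper(neural_patterns), rdm_upper(model_patterns))


# Hypothetical toy data: 4 stimuli, 3 response channels each.
neural = [[1, 2, 3], [1, 3, 2], [3, 2, 1], [2, 1, 3]]
# Model A rescales every pattern, so its dissimilarity structure is unchanged.
model_a = [[2 * v + 1 for v in row] for row in neural]
# Model B assigns the second and third patterns to the wrong stimuli.
model_b = [neural[0], neural[2], neural[1], neural[3]]

match_a = rdm_match(neural, model_a)
match_b = rdm_match(neural, model_b)
print(match_a, match_b)
```

In a searchlight analysis, a score like `rdm_match` would be computed once per sphere, with `neural` holding the voxel responses inside that sphere for each of the 60 stimuli.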
An integrative view of storage of low- and high-level visual dimensions in visual short-term memory.
Magen, Hagit
2017-03-01
Efficient performance in an environment filled with complex objects is often achieved through the temporal maintenance of conjunctions of features from multiple dimensions. The most striking finding in the study of binding in visual short-term memory (VSTM) is equal memory performance for single features and for integrated multi-feature objects, a finding that has been central to several theories of VSTM. Nevertheless, research on binding in VSTM focused almost exclusively on low-level features, and little is known about how items from low- and high-level visual dimensions (e.g., colored manmade objects) are maintained simultaneously in VSTM. The present study tested memory for combinations of low-level features and high-level representations. In agreement with previous findings, Experiments 1 and 2 showed decrements in memory performance when non-integrated low- and high-level stimuli were maintained simultaneously compared to maintaining each dimension in isolation. However, contrary to previous findings the results of Experiments 3 and 4 showed decrements in memory performance even when integrated objects of low- and high-level stimuli were maintained in memory, compared to maintaining single-dimension objects. Overall, the results demonstrate that low- and high-level visual dimensions compete for the same limited memory capacity, and offer a more comprehensive view of VSTM.
Roldan, Stephanie M.
2017-01-01
One of the fundamental goals of object recognition research is to understand how a cognitive representation produced from the output of filtered and transformed sensory information facilitates efficient viewer behavior. Given that mental imagery strongly resembles perceptual processes in both cortical regions and subjective visual qualities, it is reasonable to question whether mental imagery facilitates cognition in a manner similar to that of perceptual viewing: via the detection and recognition of distinguishing features. Categorizing the feature content of mental imagery holds potential as a reverse pathway by which to identify the components of a visual stimulus which are most critical for the creation and retrieval of a visual representation. This review will examine the likelihood that the information represented in visual mental imagery reflects distinctive object features thought to facilitate efficient object categorization and recognition during perceptual viewing. If it is the case that these representational features resemble their sensory counterparts in both spatial and semantic qualities, they may well be accessible through mental imagery as evaluated through current investigative techniques. In this review, methods applied to mental imagery research and their findings are reviewed and evaluated for their efficiency in accessing internal representations, and implications for identifying diagnostic features are discussed. An argument is made for the benefits of combining mental imagery assessment methods with diagnostic feature research to advance the understanding of visual perceptive processes, with suggestions for avenues of future investigation. PMID:28588538
A recurrent neural model for proto-object based contour integration and figure-ground segregation.
Hu, Brian; Niebur, Ernst
2017-12-01
Visual processing of objects makes use of both feedforward and feedback streams of information. However, the nature of feedback signals is largely unknown, as is the identity of the neuronal populations in lower visual areas that receive them. Here, we develop a recurrent neural model to address these questions in the context of contour integration and figure-ground segregation. A key feature of our model is the use of grouping neurons whose activity represents tentative objects ("proto-objects") based on the integration of local feature information. Grouping neurons receive input from an organized set of local feature neurons, and project modulatory feedback to those same neurons. Additionally, inhibition at both the local feature level and the object representation level biases the interpretation of the visual scene in agreement with principles from Gestalt psychology. Our model explains several sets of neurophysiological results (Zhou et al. Journal of Neuroscience, 20(17), 6594-6611 2000; Qiu et al. Nature Neuroscience, 10(11), 1492-1499 2007; Chen et al. Neuron, 82(3), 682-694 2014), and makes testable predictions about the influence of neuronal feedback and attentional selection on neural responses across different visual areas. Our model also provides a framework for understanding how object-based attention is able to select both objects and the features associated with them.
Visual short-term memory always requires general attention.
Morey, Candice C; Bieler, Malte
2013-02-01
The role of attention in visual memory remains controversial; while some evidence has suggested that memory for binding between features demands no more attention than does memory for the same features, other evidence has indicated cognitive costs or mnemonic benefits for explicitly attending to bindings. We attempted to reconcile these findings by examining how memory for binding, for features, and for features during binding is affected by a concurrent attention-demanding task. We demonstrated that performing a concurrent task impairs memory for as few as two visual objects, regardless of whether each object includes one or more features. We argue that this pattern of results reflects an essential role for domain-general attention in visual memory, regardless of the simplicity of the to-be-remembered stimuli. We then discuss the implications of these findings for theories of visual working memory.
Rolls, Edmund T; Mills, W Patrick C
2018-05-01
When objects transform into different views, some properties are maintained, such as whether the edges are convex or concave, and these non-accidental properties are likely to be important in view-invariant object recognition. The metric properties, such as the degree of curvature, may change with different views, and are less likely to be useful in object recognition. It is shown that in a model of invariant visual object recognition in the ventral visual stream, VisNet, non-accidental properties are encoded much more than metric properties by neurons. Moreover, it is shown how with the temporal trace rule training in VisNet, non-accidental properties of objects become encoded by neurons, and how metric properties are treated invariantly. We also show how VisNet can generalize between different objects if they have the same non-accidental property, because the metric properties are likely to overlap. VisNet is a 4-layer unsupervised model of visual object recognition trained by competitive learning that utilizes a temporal trace learning rule to implement the learning of invariance using views that occur close together in time. A second crucial property of this model of object recognition is, when neurons in the level corresponding to the inferior temporal visual cortex respond selectively to objects, whether neurons in the intermediate layers can respond to combinations of features that may be parts of two or more objects. In an investigation using the four sides of a square presented in every possible combination, it was shown that even though different layer 4 neurons are tuned to encode each feature or feature combination orthogonally, neurons in the intermediate layers can respond to features or feature combinations present in several objects. This property is an important part of the way in which high capacity can be achieved in the four-layer ventral visual cortical pathway. 
These findings concerning non-accidental properties and the use of neurons in intermediate layers of the hierarchy help to emphasise fundamental underlying principles of the computations that may be implemented in the ventral cortical visual stream used in object recognition. Copyright © 2018 Elsevier Inc. All rights reserved.
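The idea behind a temporal trace learning rule can be illustrated with a single model neuron. The sketch below uses one commonly cited form of the rule (a running trace of the neuron's activity, trace = (1 - eta) * y + eta * previous trace, driving a Hebbian weight update); the abstract does not state which variant was used, so the exact form, the parameter names `alpha` and `eta`, and the toy inputs are assumptions.

```python
from math import sqrt


def trace_rule_update(weights, views, alpha=0.1, eta=0.8):
    """Apply a trace-style Hebbian update over a temporal sequence of views.

    weights: initial synaptic weights of one neuron
    views:   input vectors presented close together in time
    alpha:   learning rate (assumed value)
    eta:     how much of the previous trace is carried over (assumed value)
    """
    w = list(weights)
    trace = 0.0
    for x in views:
        y = sum(wi * xi for wi, xi in zip(w, x))   # instantaneous activation
        trace = (1.0 - eta) * y + eta * trace      # short-term memory of activity
        w = [wi + alpha * trace * xi for wi, xi in zip(w, x)]
    norm = sqrt(sum(wi * wi for wi in w))          # keep the weight vector bounded
    return [wi / norm for wi in w]


# Hypothetical example: two different views of one object shown in succession.
# The neuron initially responds only to the first view.
views = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
initial = [1.0, 0.0, 0.0]

with_trace = trace_rule_update(initial, views, eta=0.8)
pure_hebb = trace_rule_update(initial, views, eta=0.0)
print(with_trace, pure_hebb)
```

With `eta=0.0` the update reduces to plain Hebbian learning, and the second view, which does not activate the neuron on its own, gains no weight. With a non-zero trace, activity left over from the first view strengthens the connection from the second view, which is how views that occur close together in time can come to drive the same neuron.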
NASA Astrophysics Data System (ADS)
Zhao, Yiqun; Wang, Zhihui
2015-12-01
The Internet of Things (IOT) is a kind of intelligent network that can be used to locate, track, identify, and supervise people and objects. One of the important core technologies of the intelligent visual Internet of Things (IVIOT) is the intelligent visual tag system. This paper investigates visual feature extraction and the establishment of visual tags for human faces based on the ORL face database. First, we use the principal component analysis (PCA) algorithm for face feature extraction; we then adopt a support vector machine (SVM) for classification and face recognition; finally, we establish a visual tag for each face that has been classified. We conducted an experiment on a group of face images. The results show that the proposed algorithm performs well and can display the visual tags of objects conveniently.
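The PCA-then-SVM pipeline described above can be sketched without external libraries. This is a toy illustration under several assumptions, not the authors' implementation: only the first principal component is extracted (by power iteration), the SVM is a linear soft-margin classifier trained by simple subgradient descent on the hinge loss, and the four-dimensional "face" vectors are synthetic stand-ins rather than ORL images. All function names are hypothetical.

```python
from math import sqrt


def center(rows):
    """Subtract the per-column mean; return centered rows and the mean."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - mean[j] for j in range(d)] for r in rows], mean


def first_component(centered, iters=100):
    """Leading principal axis via power iteration on X^T X."""
    d = len(centered[0])
    v = [1.0 / sqrt(d)] * d
    for _ in range(iters):
        scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]
        v = [sum(s * r[j] for s, r in zip(scores, centered)) for j in range(d)]
        norm = sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return v


def train_linear_svm(zs, ys, lr=0.01, lam=0.01, epochs=100):
    """Subgradient descent on the regularized hinge loss for 1-D inputs."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for z, y in zip(zs, ys):
            if y * (w * z + b) < 1.0:      # inside the margin: hinge term is active
                w += lr * (y * z - lam * w)
                b += lr * y
            else:                          # outside the margin: only shrink w
                w -= lr * lam * w
    return w, b


# Synthetic stand-in data: two well-separated "identities", labels +1 / -1.
faces = [
    [5.0, 5.2, 0.1, 0.0], [4.8, 5.0, 0.0, 0.2], [5.1, 4.9, -0.1, 0.1],
    [-5.0, -5.1, 0.0, 0.1], [-4.9, -5.0, 0.2, 0.0], [-5.2, -4.8, 0.1, -0.1],
]
labels = [1, 1, 1, -1, -1, -1]

centered, mean = center(faces)
axis = first_component(centered)
projected = [sum(r[j] * axis[j] for j in range(len(axis))) for r in centered]
w, b = train_linear_svm(projected, labels)
predictions = [1 if w * z + b >= 0 else -1 for z in projected]
print(predictions)
```

A real face-recognition setup would keep many principal components (the "eigenfaces") and handle more than two identities, for example with a one-vs-rest arrangement of classifiers; a library implementation of PCA and SVM would normally replace the hand-written routines here.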
Objects and categories: feature statistics and object processing in the ventral stream.
Tyler, Lorraine K; Chiu, Shannon; Zhuang, Jie; Randall, Billi; Devereux, Barry J; Wright, Paul; Clarke, Alex; Taylor, Kirsten I
2013-10-01
Recognizing an object involves more than just visual analyses; its meaning must also be decoded. Extensive research has shown that processing the visual properties of objects relies on a hierarchically organized stream in ventral occipitotemporal cortex, with increasingly more complex visual features being coded from posterior to anterior sites culminating in the perirhinal cortex (PRC) in the anteromedial temporal lobe (aMTL). The neurobiological principles of the conceptual analysis of objects remain more controversial. Much research has focused on two neural regions, the fusiform gyrus and aMTL, both of which show semantic category differences, but of different types. fMRI studies show category differentiation in the fusiform gyrus, based on clusters of semantically similar objects, whereas category-specific deficits, specifically for living things, are associated with damage to the aMTL. These category-specific deficits for living things have been attributed to problems in differentiating between highly similar objects, a process that involves the PRC. To determine whether the PRC and the fusiform gyri contribute to different aspects of an object's meaning, with differentiation between confusable objects in the PRC and categorization based on object similarity in the fusiform, we carried out an fMRI study of object processing based on a feature-based model that characterizes the degree of semantic similarity and difference between objects and object categories. Participants saw 388 objects for which feature statistic information was available and named the objects at the basic level while undergoing fMRI scanning. 
After controlling for the effects of visual information, we found that feature statistics that capture similarity between objects formed category clusters in fusiform gyri, such that objects with many shared features (typical of living things) were associated with activity in the lateral fusiform gyri whereas objects with fewer shared features (typical of nonliving things) were associated with activity in the medial fusiform gyri. Significantly, a feature statistic reflecting differentiation between highly similar objects, enabling object-specific representations, was associated with bilateral PRC activity. These results confirm that the statistical characteristics of conceptual object features are coded in the ventral stream, supporting a conceptual feature-based hierarchy, and integrating disparate findings of category responses in fusiform gyri and category deficits in aMTL into a unifying neurocognitive framework.
An object-mediated updating account of insensitivity to transsaccadic change
Tas, A. Caglar; Moore, Cathleen M.; Hollingworth, Andrew
2012-01-01
Recent evidence has suggested that relatively precise information about the location and visual form of a saccade target object is retained across a saccade. However, this information appears to be available for report only when the target is removed briefly, so that the display is blank when the eyes land. We hypothesized that the availability of precise target information is dependent on whether a post-saccade object is mapped to the same object representation established for the presaccade target. If so, then the post-saccade features of the target overwrite the presaccade features, a process of object-mediated updating in which visual masking is governed by object continuity. In two experiments, participants' sensitivity to the spatial displacement of a saccade target was improved when that object changed surface feature properties across the saccade, consistent with the prediction of the object-mediated updating account. Transsaccadic perception appears to depend on a mechanism of object-based masking that is observed across multiple domains of vision. In addition, the results demonstrate that surface-feature continuity contributes to visual stability across saccades. PMID:23092946
Nagai, Takehiro; Matsushima, Toshiki; Koida, Kowa; Tani, Yusuke; Kitazaki, Michiteru; Nakauchi, Shigeki
2015-10-01
Humans can visually recognize material categories of objects, such as glass, stone, and plastic, easily. However, little is known about the kinds of surface quality features that contribute to such material class recognition. In this paper, we examine the relationship between perceptual surface features and material category discrimination performance for pictures of materials, focusing on temporal aspects, including reaction time and effects of stimulus duration. The stimuli were pictures of objects with an identical shape but made of different materials that could be categorized into seven classes (glass, plastic, metal, stone, wood, leather, and fabric). In a pre-experiment, observers rated the pictures on nine surface features, including visual (e.g., glossiness and transparency) and non-visual features (e.g., heaviness and warmness), on a 7-point scale. In the main experiments, observers judged whether two simultaneously presented pictures were classified as the same or different material category. Reaction times and effects of stimulus duration were measured. The results showed that visual feature ratings were correlated with material discrimination performance for short reaction times or short stimulus durations, while non-visual feature ratings were correlated only with performance for long reaction times or long stimulus durations. These results suggest that the mechanisms underlying visual and non-visual feature processing may differ in terms of processing time, although the cause is unclear. Visual surface features may mainly contribute to material recognition in daily life, while non-visual features may contribute only weakly, if at all. Copyright © 2014 Elsevier Ltd. All rights reserved.
Object-based Encoding in Visual Working Memory: Evidence from Memory-driven Attentional Capture.
Gao, Zaifeng; Yu, Shixian; Zhu, Chengfeng; Shui, Rende; Weng, Xuchu; Li, Peng; Shen, Mowei
2016-03-09
Visual working memory (VWM) adopts a specific manner of object-based encoding (OBE) to extract perceptual information: Whenever one feature-dimension is selected for entry into VWM, the others are also extracted. Currently most studies revealing OBE probed an 'irrelevant-change distracting effect', where changes of irrelevant-features dramatically affected the performance of the target feature. However, the existence of irrelevant-feature change may affect participants' processing manner, leading to a false-positive result. The current study conducted a strict examination of OBE in VWM, by probing whether irrelevant-features guided the deployment of attention in visual search. The participants memorized an object's colour yet ignored shape and concurrently performed a visual-search task. They searched for a target line among distractor lines, each embedded within a different object. One object in the search display could match the shape, colour, or both dimensions of the memory item, but this object never contained the target line. Relative to a neutral baseline, where there was no match between the memory and search displays, search time was significantly prolonged in all match conditions, regardless of whether the memory item was displayed for 100 or 1000 ms. These results suggest that task-irrelevant shape was extracted into VWM, supporting OBE in VWM.
Poth, Christian H; Schneider, Werner X
2016-09-01
Rapid saccadic eye movements bring the foveal region of the eye's retina onto objects for high-acuity vision. Saccades change the location and resolution of objects' retinal images. To perceive objects as visually stable across saccades, correspondence between the objects before and after the saccade must be established. We have previously shown that breaking object correspondence across the saccade causes a decrement in object recognition (Poth, Herwig, & Schneider, 2015). Color and luminance can establish object correspondence, but it is unknown how these surface features contribute to transsaccadic visual processing. Here, we investigated whether changing the surface features color-and-luminance and color alone across saccades impairs postsaccadic object recognition. Participants made saccades to peripheral objects, which either maintained or changed their surface features across the saccade. After the saccade, participants briefly viewed a letter within the saccade target object (terminated by a pattern mask). Postsaccadic object recognition was assessed as participants' accuracy in reporting the letter. Experiment A used the colors green and red with different luminances as surface features, Experiment B blue and yellow with approximately the same luminances. Changing the surface features across the saccade deteriorated postsaccadic object recognition in both experiments. These findings reveal a link between object recognition and object correspondence relying on the surface features colors and luminance, which is currently not addressed in theories of transsaccadic perception. We interpret the findings within a recent theory ascribing this link to visual attention (Schneider, 2013).
Dynamic binding of visual features by neuronal/stimulus synchrony.
Iwabuchi, A
1998-05-01
When people see a visual scene, certain parts of the scene are treated as belonging together and are regarded as a perceptual unit, which is called a "figure". People focus on figures, and the remaining parts of the scene are disregarded as "ground". In Gestalt psychology this process is called "figure-ground segregation". According to current perceptual psychology, a figure is formed by binding various visual features in a scene, and developments in neuroscience have revealed that there are many feature-encoding neurons, which respond specifically to such features. It is not known, however, how the brain binds different features of an object into a coherent visual object representation. Recently, the theory of binding by neuronal synchrony, which argues that feature binding is dynamically mediated by neuronal synchrony of feature-encoding neurons, has been proposed. This review article portrays the problem of figure-ground segregation and feature binding, summarizes neurophysiological and psychophysical experiments and theory relevant to feature binding by neuronal/stimulus synchrony, and suggests possible directions for future research on this topic.
A novel visual saliency analysis model based on dynamic multiple feature combination strategy
NASA Astrophysics Data System (ADS)
Lv, Jing; Ye, Qi; Lv, Wen; Zhang, Libao
2017-06-01
The human visual system can quickly focus on a small number of salient objects. This process is known as visual saliency analysis, and these salient objects are called the focus of attention (FOA). The visual saliency analysis mechanism can be used to extract salient regions and analyze the saliency of objects in an image, which saves time and avoids unnecessary use of computing resources. In this paper, a novel visual saliency analysis model based on a dynamic multiple feature combination strategy is introduced. In the proposed model, we first generate multi-scale feature maps of intensity, color and orientation features using Gaussian pyramids and the center-surround difference. Then, we evaluate the contribution of all feature maps to the saliency map according to the area of salient regions and their average intensity, and attach different weights to different features according to their importance. Finally, we choose the largest salient region generated by the region growing method to perform the evaluation. Experimental results show that the proposed model can not only achieve higher accuracy in saliency map computation than other traditional saliency analysis models, but also extract salient regions with arbitrary shapes, which is of great value for image analysis and understanding.
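The center-surround and weighting steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: box averages stand in for the Gaussian pyramid, and `map_weight` is a hypothetical rule (mean strength of the salient pixels divided by the fraction of the map they cover), since the abstract does not give the actual weighting formula.

```python
def box_mean(img, r, y, x):
    """Mean of img over the (2r+1) x (2r+1) window centred on (y, x), clipped at the borders."""
    h, w = len(img), len(img[0])
    vals = [img[j][i]
            for j in range(max(0, y - r), min(h, y + r + 1))
            for i in range(max(0, x - r), min(w, x + r + 1))]
    return sum(vals) / len(vals)


def center_surround(img, r_center=0, r_surround=2):
    """Absolute difference between a fine (center) and a coarse (surround) local average."""
    h, w = len(img), len(img[0])
    return [[abs(box_mean(img, r_center, y, x) - box_mean(img, r_surround, y, x))
             for x in range(w)] for y in range(h)]


def map_weight(fmap, thresh_ratio=0.5):
    """Hypothetical weight: favours maps whose salient region is small and strong."""
    peak = max(max(row) for row in fmap)
    if peak == 0:
        return 0.0  # a flat map contributes nothing
    salient = [v for row in fmap for v in row if v >= thresh_ratio * peak]
    area_fraction = len(salient) / (len(fmap) * len(fmap[0]))
    return (sum(salient) / len(salient)) / area_fraction


def combine_maps(maps):
    """Weighted average of feature maps, with weights from map_weight."""
    weights = [map_weight(m) for m in maps]
    total = sum(weights)
    h, w = len(maps[0]), len(maps[0][0])
    if total == 0:
        return [[0.0] * w for _ in range(h)]
    return [[sum(wt * m[y][x] for wt, m in zip(weights, maps)) / total
             for x in range(w)] for y in range(h)]


# Toy input: one bright pixel in a dark 5 x 5 image, plus a flat map.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 1.0
flat = [[0.0] * 5 for _ in range(5)]
cs_map = center_surround(image)
saliency = combine_maps([cs_map, center_surround(flat)])
```

In this toy case the flat map receives zero weight, so the combined saliency map peaks at the bright pixel.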
Peel, Hayden J.; Sperandio, Irene; Laycock, Robin; Chouinard, Philippe A.
2018-01-01
Our understanding of how form, orientation and size are processed within and outside of awareness is limited and requires further investigation. Therefore, we investigated whether or not the visual discrimination of basic object features can be influenced by subliminal processing of stimuli presented beforehand. Visual masking was used to render stimuli perceptually invisible. Three experiments examined if visible and invisible primes could facilitate the subsequent feature discrimination of visible targets. The experiments differed in the kind of perceptual discrimination that participants had to make. Namely, participants were asked to discriminate visual stimuli on the basis of their form, orientation, or size. In all three experiments, we demonstrated reliable priming effects when the primes were visible but not when the primes were made invisible. Our findings underscore the importance of conscious awareness in facilitating the perceptual discrimination of basic object features. PMID:29725292
Object-based target templates guide attention during visual search.
Berggren, Nick; Eimer, Martin
2018-05-03
During visual search, attention is believed to be controlled in a strictly feature-based fashion, without any guidance by object-based target representations. To challenge this received view, we measured electrophysiological markers of attentional selection (N2pc component) and working memory (sustained posterior contralateral negativity; SPCN) in search tasks where two possible targets were defined by feature conjunctions (e.g., blue circles and green squares). Critically, some search displays also contained nontargets with two target features (incorrect conjunction objects, e.g., blue squares). Because feature-based guidance cannot distinguish these objects from targets, any selective bias for targets will reflect object-based attentional control. In Experiment 1, where search displays always contained only one object with target-matching features, targets and incorrect conjunction objects elicited identical N2pc and SPCN components, demonstrating that attentional guidance was entirely feature-based. In Experiment 2, where targets and incorrect conjunction objects could appear in the same display, clear evidence for object-based attentional control was found. The target N2pc became larger than the N2pc to incorrect conjunction objects from 250 ms poststimulus, and only targets elicited SPCN components. This demonstrates that after an initial feature-based guidance phase, object-based templates are activated when they are required to distinguish target and nontarget objects. These templates modulate visual processing and control access to working memory, and their activation may coincide with the start of feature integration processes. Results also suggest that while multiple feature templates can be activated concurrently, only a single object-based target template can guide attention at any given time. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Role of early visual cortex in trans-saccadic memory of object features.
Malik, Pankhuri; Dessing, Joost C; Crawford, J Douglas
2015-08-01
Early visual cortex (EVC) participates in visual feature memory and the updating of remembered locations across saccades, but its role in the trans-saccadic integration of object features is unknown. We hypothesized that if EVC is involved in updating object features relative to gaze, feature memory should be disrupted when saccades remap an object representation into a simultaneously perturbed EVC site. To test this, we applied transcranial magnetic stimulation (TMS) over functional magnetic resonance imaging-localized EVC clusters corresponding to the bottom left/right visual quadrants (VQs). During experiments, these VQs were probed psychophysically by briefly presenting a central object (Gabor patch) while subjects fixated gaze to the right or left (and above). After a short memory interval, participants were required to detect the relative change in orientation of a re-presented test object at the same spatial location. Participants either sustained fixation during the memory interval (fixation task) or made a horizontal saccade that either maintained or reversed the VQ of the object (saccade task). Three TMS pulses (coinciding with the pre-, peri-, and postsaccade intervals) were applied to the left or right EVC. This had no effect when (a) fixation was maintained, (b) saccades kept the object in the same VQ, or (c) the EVC quadrant corresponding to the first object was stimulated. However, as predicted, TMS reduced performance when saccades (especially larger saccades) crossed the remembered object location and brought it into the VQ corresponding to the TMS site. This suppression effect was statistically significant for leftward saccades and followed a weaker trend for rightward saccades. These causal results are consistent with the idea that EVC is involved in the gaze-centered updating of object features for trans-saccadic memory and perception.
Learning to rank using user clicks and visual features for image retrieval.
Yu, Jun; Tao, Dacheng; Wang, Meng; Rui, Yong
2015-04-01
The inconsistency between textual features and visual contents can cause poor image search results. To solve this problem, click features, which are more reliable than textual information in justifying the relevance between a query and clicked images, are adopted in the image ranking model. However, the existing ranking model cannot integrate visual features, which are efficient in refining the click-based search results. In this paper, we propose a novel ranking model based on the learning to rank framework. Visual features and click features are simultaneously utilized to obtain the ranking model. Specifically, the proposed approach is based on large margin structured output learning, and the visual consistency is integrated with the click features through a hypergraph regularizer term. In accordance with the fast alternating linearization method, we design a novel algorithm to optimize the objective function. This algorithm alternately minimizes two different approximations of the original objective function by keeping one function unchanged and linearizing the other. We conduct experiments on a large-scale dataset collected from the Microsoft Bing image search engine, and the results demonstrate that the proposed learning to rank model based on visual features and user clicks outperforms state-of-the-art algorithms.
Decoding visual object categories from temporal correlations of ECoG signals.
Majima, Kei; Matsuo, Takeshi; Kawasaki, Keisuke; Kawai, Kensuke; Saito, Nobuhito; Hasegawa, Isao; Kamitani, Yukiyasu
2014-04-15
How visual object categories are represented in the brain is one of the key questions in neuroscience. Studies on low-level visual features have shown that relative timings or phases of neural activity between multiple brain locations encode information. However, whether such temporal patterns of neural activity are used in the representation of visual objects is unknown. Here, we examined whether and how visual object categories could be predicted (or decoded) from temporal patterns of electrocorticographic (ECoG) signals from the temporal cortex in five patients with epilepsy. We used temporal correlations between electrodes as input features, and compared the decoding performance with features defined by spectral power and phase from individual electrodes. While using power or phase alone, the decoding accuracy was significantly better than chance, correlations alone or those combined with power outperformed other features. Decoding performance with correlations was degraded by shuffling the order of trials of the same category in each electrode, indicating that the relative time series between electrodes in each trial is critical. Analysis using a sliding time window revealed that decoding performance with correlations began to rise earlier than that with power. This earlier increase in performance was replicated by a model using phase differences to encode categories. These results suggest that activity patterns arising from interactions between multiple neuronal units carry additional information on visual object categories. Copyright © 2013 Elsevier Inc. All rights reserved.
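To make the idea of "temporal correlations between electrodes as input features" concrete, the sketch below turns one trial (a list of per-electrode time series) into a feature vector of pairwise Pearson correlations. The data layout and the function names are assumptions for illustration; the abstract does not specify preprocessing, time windows, or the decoder.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_features(trial):
    """One correlation per electrode pair (upper triangle of the correlation matrix)."""
    return [pearson(trial[i], trial[j])
            for i, j in combinations(range(len(trial)), 2)]


# Hypothetical trial with three electrodes and four time samples each.
trial = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
]
features = correlation_features(trial)
```

Because each feature depends on the joint time course of two electrodes within a trial, shuffling trials independently per electrode would destroy it, which is the control the abstract describes.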
Spatial and temporal coherence in perceptual binding
Blake, Randolph; Yang, Yuede
1997-01-01
Component visual features of objects are registered by distributed patterns of activity among neurons comprising multiple pathways and visual areas. How these distributed patterns of activity give rise to unified representations of objects remains unresolved, although one recent, controversial view posits temporal coherence of neural activity as a binding agent. Motivated by the possible role of temporal coherence in feature binding, we devised a novel psychophysical task that requires the detection of temporal coherence among features comprising complex visual images. Results show that human observers can more easily detect synchronized patterns of temporal contrast modulation within hybrid visual images composed of two components when those components are drawn from the same original picture. Evidently, time-varying changes within spatially coherent features produce more salient neural signals. PMID:9192701
A foreground object features-based stereoscopic image visual comfort assessment model
NASA Astrophysics Data System (ADS)
Jin, Xin; Jiang, G.; Ying, H.; Yu, M.; Ding, S.; Peng, Z.; Shao, F.
2014-11-01
Since stereoscopic images provide observers with viewing experiences that are realistic but can also be uncomfortable, it is necessary to investigate the determinants of visual discomfort. Considering that the foreground object draws most attention when humans observe stereoscopic images, this paper proposes a new foreground-object-based visual comfort assessment (VCA) metric. First, a suitable segmentation method is applied to the disparity map, and the foreground object is identified as the one with the largest average disparity. Second, three visual features of the foreground object (average disparity, average width and spatial complexity) are computed from the perspective of visual attention. However, compared with disparity, an object's width and complexity do not consistently influence the perception of visual comfort. In accordance with this psychological phenomenon, we third divide the images into four categories on the basis of disparity and width, and apply four different models to predict visual comfort more precisely. Experimental results show that the proposed VCA metric outperforms other existing metrics and achieves high consistency between objective and subjective visual comfort scores. The Pearson Linear Correlation Coefficient (PLCC) and Spearman Rank Order Correlation Coefficient (SROCC) are over 0.84 and 0.82, respectively.
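For readers unfamiliar with the two evaluation measures quoted above, the following sketch computes PLCC and SROCC between objective and subjective scores. SROCC is computed as the Pearson correlation of average ranks. The score lists are made-up illustration data, not results from the paper.

```python
from math import sqrt


def plcc(x, y):
    """Pearson Linear Correlation Coefficient of two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def srocc(x, y):
    """Spearman Rank Order Correlation Coefficient: PLCC applied to the ranks."""
    return plcc(average_ranks(x), average_ranks(y))


# Made-up scores: the objective metric is monotonic in the subjective score
# but not linear, so SROCC is 1 while PLCC is slightly below 1.
objective = [1.0, 2.0, 3.0, 4.0, 5.0]
subjective = [1.0, 4.0, 9.0, 16.0, 25.0]
linear_agreement = plcc(objective, subjective)
rank_agreement = srocc(objective, subjective)
```

PLCC rewards a linear relationship between predicted and subjective scores, whereas SROCC only asks that the ordering agree, which is why comfort metrics are usually reported with both.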
Jozwik, Kamila M.; Kriegeskorte, Nikolaus; Mur, Marieke
2016-01-01
Object similarity, in brain representations and conscious perception, must reflect a combination of the visual appearance of the objects on the one hand and the categories the objects belong to on the other. Indeed, visual object features and category membership have each been shown to contribute to the object representation in human inferior temporal (IT) cortex, as well as to object-similarity judgments. However, the explanatory power of features and categories has not been directly compared. Here, we investigate whether the IT object representation and similarity judgments are best explained by a categorical or a feature-based model. We use rich models (>100 dimensions) generated by human observers for a set of 96 real-world object images. The categorical model consists of a hierarchically nested set of category labels (such as “human”, “mammal”, and “animal”). The feature-based model includes both object parts (such as “eye”, “tail”, and “handle”) and other descriptive features (such as “circular”, “green”, and “stubbly”). We used non-negative least squares to fit the models to the brain representations (estimated from functional magnetic resonance imaging data) and to similarity judgments. Model performance was estimated on held-out images not used in fitting. Both models explained significant variance in IT and the amounts explained were not significantly different. The combined model did not explain significant additional IT variance, suggesting that it is the shared model variance (features correlated with categories, categories correlated with features) that best explains IT. The similarity judgments were almost fully explained by the categorical model, which explained significantly more variance than the feature-based model. The combined model did not explain significant additional variance in the similarity judgments. Our findings suggest that IT uses features that help to distinguish categories as stepping stones toward a semantic representation. 
Similarity judgments contain additional categorical variance that is not explained by visual features, reflecting a higher-level more purely semantic representation. PMID:26493748
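The abstract names non-negative least squares as the fitting method but not the solver. As a minimal sketch, the code below solves a tiny non-negative least squares problem with projected gradient descent, a simple technique swapped in here for illustration; the design matrix, targets, step size and iteration count are all hypothetical.

```python
def nnls_projected_gradient(A, b, step=0.1, iters=500):
    """Minimise ||A x - b||^2 subject to x >= 0 by projected gradient descent.

    Assumption: `step` is small enough for the given A (below 1 / lambda_max(A^T A)).
    """
    n_rows, n_cols = len(A), len(A[0])
    x = [0.0] * n_cols
    for _ in range(iters):
        residual = [sum(A[i][j] * x[j] for j in range(n_cols)) - b[i]
                    for i in range(n_rows)]
        grad = [2.0 * sum(A[i][j] * residual[i] for i in range(n_rows))
                for j in range(n_cols)]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[j] - step * grad[j]) for j in range(n_cols)]
    return x


# Hypothetical toy problem: two model predictors (columns) and three
# dissimilarity values to explain (rows). The second predictor would need a
# negative weight in an unconstrained fit, so it is clamped to zero.
predictors = [[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]]
targets = [1.0, -1.0, 0.0]
weights = nnls_projected_gradient(predictors, targets)
```

The non-negativity constraint means a feature or category dimension can only add dissimilarity between two objects, never subtract it, which keeps the fitted model interpretable as a weighted sum of model dimensions.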
Jozwik, Kamila M; Kriegeskorte, Nikolaus; Mur, Marieke
2016-03-01
Object similarity, in brain representations and conscious perception, must reflect a combination of the visual appearance of the objects on the one hand and the categories the objects belong to on the other. Indeed, visual object features and category membership have each been shown to contribute to the object representation in human inferior temporal (IT) cortex, as well as to object-similarity judgments. However, the explanatory power of features and categories has not been directly compared. Here, we investigate whether the IT object representation and similarity judgments are best explained by a categorical or a feature-based model. We use rich models (>100 dimensions) generated by human observers for a set of 96 real-world object images. The categorical model consists of a hierarchically nested set of category labels (such as "human", "mammal", and "animal"). The feature-based model includes both object parts (such as "eye", "tail", and "handle") and other descriptive features (such as "circular", "green", and "stubbly"). We used non-negative least squares to fit the models to the brain representations (estimated from functional magnetic resonance imaging data) and to similarity judgments. Model performance was estimated on held-out images not used in fitting. Both models explained significant variance in IT and the amounts explained were not significantly different. The combined model did not explain significant additional IT variance, suggesting that it is the shared model variance (features correlated with categories, categories correlated with features) that best explains IT. The similarity judgments were almost fully explained by the categorical model, which explained significantly more variance than the feature-based model. The combined model did not explain significant additional variance in the similarity judgments. Our findings suggest that IT uses features that help to distinguish categories as stepping stones toward a semantic representation. 
Similarity judgments contain additional categorical variance that is not explained by visual features, reflecting a higher-level more purely semantic representation. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
Miconi, Thomas; Groomes, Laura; Kreiman, Gabriel
2016-01-01
When searching for an object in a scene, how does the brain decide where to look next? Visual search theories suggest the existence of a global “priority map” that integrates bottom-up visual information with top-down, target-specific signals. We propose a mechanistic model of visual search that is consistent with recent neurophysiological evidence, can localize targets in cluttered images, and predicts single-trial behavior in a search task. This model posits that a high-level retinotopic area selective for shape features receives global, target-specific modulation and implements local normalization through divisive inhibition. The normalization step is critical to prevent highly salient bottom-up features from monopolizing attention. The resulting activity pattern constitutes a priority map that tracks the correlation between local input and target features. The maximum of this priority map is selected as the locus of attention. The visual input is then spatially enhanced around the selected location, allowing object-selective visual areas to determine whether the target is present at this location. This model can localize objects both in array images and when objects are pasted in natural scenes. The model can also predict single-trial human fixations, including those in error and target-absent trials, in a search task involving complex objects. PMID:26092221
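The role of divisive normalization in such a priority map can be shown with a toy sketch. This is a simplification under stated assumptions, not the published model: each location holds a short feature vector, the target-specific modulation is a dot product with a hypothetical `target` vector, and the normalization pool is just the summed activity at that same location plus a constant `sigma`.

```python
def priority_map(features, target, sigma=0.1):
    """Target-weighted drive at each location, divided by local total activity."""
    pmap = []
    for row in features:
        prow = []
        for feat in row:
            drive = sum(t * f for t, f in zip(target, feat))  # top-down modulation
            pool = sigma + sum(feat)                          # divisive inhibition
            prow.append(drive / pool)
        pmap.append(prow)
    return pmap


def attended_location(pmap):
    """(row, column) of the maximum of the priority map."""
    cells = [(y, x) for y in range(len(pmap)) for x in range(len(pmap[0]))]
    return max(cells, key=lambda p: pmap[p[0]][p[1]])


# Hypothetical 2 x 2 scene with two feature channels per location.
# Location (0, 1) is a highly salient distractor; location (1, 0) matches the target.
scene = [
    [[0.1, 0.1], [1.0, 5.0]],
    [[0.5, 0.1], [0.0, 0.2]],
]
target = [1.0, 0.0]
locus = attended_location(priority_map(scene, target))
```

Without the division, the distractor at (0, 1) would win because its raw target-weighted drive (1.0) exceeds that of the matching location (0.5). With normalization, its strong non-target activity suppresses it and (1, 0) is selected.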
Coding of visual object features and feature conjunctions in the human brain.
Martinovic, Jasna; Gruber, Thomas; Müller, Matthias M
2008-01-01
Object recognition is achieved through neural mechanisms reliant on the activity of distributed coordinated neural assemblies. In the initial steps of this process, an object's features are thought to be coded very rapidly in distinct neural assemblies. These features play different functional roles in the recognition process--while colour facilitates recognition, additional contours and edges delay it. Here, we selectively varied the amount and role of object features in an entry-level categorization paradigm and related them to the electrical activity of the human brain. We found that early synchronizations (approx. 100 ms) increased quantitatively when more image features had to be coded, without reflecting their qualitative contribution to the recognition process. Later activity (approx. 200-400 ms) was modulated by the representational role of object features. These findings demonstrate that although early synchronizations may be sufficient for relatively crude discrimination of objects in visual scenes, they cannot support entry-level categorization. This was subserved by later processes of object model selection, which utilized the representational value of object features such as colour or edges to select the appropriate model and achieve identification.
An object-based visual attention model for robotic applications.
Yu, Yuanlong; Mann, George K I; Gosine, Raymond G
2010-10-01
By extending the integrated competition hypothesis, this paper presents an object-based visual attention model that selects one object of interest using low-dimensional features, so that visual perception starts from a fast attentional selection procedure. The proposed attention model involves seven modules: learning of object representations stored in a long-term memory (LTM), preattentive processing, top-down biasing, bottom-up competition, mediation between top-down and bottom-up ways, generation of saliency maps, and perceptual completion processing. It works in two phases: a learning phase and an attending phase. In the learning phase, the corresponding object representation is trained statistically when one object is attended. A dual-coding object representation consisting of local and global codings is proposed. Intensity, color, and orientation features are used to build the local coding, and a contour feature is employed to constitute the global coding. In the attending phase, the model first preattentively segments the visual field into discrete proto-objects using Gestalt rules. If a task-specific object is given, the model recalls the corresponding representation from LTM and deduces the task-relevant feature(s) to evaluate top-down biases. The mediation between automatic bottom-up competition and conscious top-down biasing is then performed to yield a location-based saliency map. By combining location-based saliency within each proto-object, the proto-object-based saliency is evaluated. The most salient proto-object is selected for attention, and it is finally put into the perceptual completion processing module to yield a complete object region. This model has been applied to distinct robotic tasks: detection of task-specific stationary and moving objects. Experimental results under different conditions are shown to validate this model.
NASA Astrophysics Data System (ADS)
Zhang, Wenlan; Luo, Ting; Jiang, Gangyi; Jiang, Qiuping; Ying, Hongwei; Lu, Jing
2016-06-01
Visual comfort assessment (VCA) for stereoscopic images is a particularly significant yet challenging task in the field of 3D quality of experience research. Although the subjective assessment given by human observers is known as the most reliable way to evaluate the experienced visual discomfort, it is time-consuming and non-systematic. Therefore, it is of great importance to develop objective VCA approaches that can faithfully predict the degree of visual discomfort as human beings do. In this paper, a novel two-stage objective VCA framework is proposed. The main contribution of this study is that the important visual attention mechanism of the human visual system is incorporated for visual comfort-aware feature extraction. Specifically, in the first stage, we construct an adaptive 3D visual saliency detection model to derive the saliency map of a stereoscopic image, and then compute a set of saliency-weighted disparity statistics and combine them into a single feature vector that represents the stereoscopic image in terms of visual comfort. In the second stage, the high-dimensional feature vector is fused into a single visual comfort score using a random forest algorithm. Experimental results on two benchmark databases confirm the superior performance of the proposed approach.
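A rough sketch of what "saliency-weighted disparity statistics" could look like is given below. The abstract does not list the statistics used, so the three chosen here (weighted mean, weighted variance, and maximum absolute disparity over salient pixels) are assumptions, as are the function name and the toy maps. The second-stage random forest regression is not reproduced.

```python
def saliency_weighted_stats(disparity, saliency):
    """Feature vector of disparity statistics, with each pixel weighted by its saliency."""
    d = [v for row in disparity for v in row]
    s = [v for row in saliency for v in row]
    total = sum(s)
    if total == 0:
        raise ValueError("saliency map must contain at least one non-zero weight")
    mean = sum(w * v for w, v in zip(s, d)) / total
    variance = sum(w * (v - mean) ** 2 for w, v in zip(s, d)) / total
    max_abs = max(abs(v) for w, v in zip(s, d) if w > 0)
    return [mean, variance, max_abs]


# Toy maps: the top row has small disparity, the bottom row large disparity.
disparity_map = [[1.0, 1.0],
                 [5.0, 5.0]]
salient_bottom = [[0.0, 0.0],
                  [1.0, 1.0]]
uniform = [[1.0, 1.0],
           [1.0, 1.0]]
attended_features = saliency_weighted_stats(disparity_map, salient_bottom)
unweighted_features = saliency_weighted_stats(disparity_map, uniform)
```

With saliency concentrated on the bottom row, the weighted mean disparity is 5.0 rather than the unweighted 3.0, illustrating how attention-aware weighting shifts the features toward the regions a viewer is likely to fixate.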
Implicit integration in a case of integrative visual agnosia.
Aviezer, Hillel; Landau, Ayelet N; Robertson, Lynn C; Peterson, Mary A; Soroker, Nachum; Sacher, Yaron; Bonneh, Yoram; Bentin, Shlomo
2007-05-15
We present a case (SE) with integrative visual agnosia following ischemic stroke affecting the right dorsal and the left ventral pathways of the visual system. Despite his inability to identify global hierarchical letters [Navon, D. (1977). Forest before trees: The precedence of global features in visual perception. Cognitive Psychology, 9, 353-383], and his dense object agnosia, SE showed normal global-to-local interference when responding to local letters in Navon hierarchical stimuli and significant picture-word identity priming in a semantic decision task for words. Since priming was absent if these features were scrambled, it stands to reason that these effects were not due to priming by distinctive features. The contrast between priming effects induced by coherent and scrambled stimuli is consistent with implicit but not explicit integration of features into a unified whole. We went on to show that possible/impossible object decisions were facilitated by words in a word-picture priming task, suggesting that prompts could activate perceptually integrated images in a backward fashion. We conclude that the absence of SE's ability to identify visual objects except through tedious serial construction reflects a deficit in accessing an integrated visual representation through bottom-up visual processing alone. However, top-down generated images can help activate these visual representations through semantic links.
A formal theory of feature binding in object perception.
Ashby, F G; Prinzmetal, W; Ivry, R; Maddox, W T
1996-01-01
Visual objects are perceived correctly only if their features are identified and then bound together. Illusory conjunctions result when feature identification is correct but an error occurs during feature binding. A new model is proposed that assumes feature binding errors occur because of uncertainty about the location of visual features. This model accounted for data from 2 new experiments better than a model derived from A. M. Treisman and H. Schmidt's (1982) feature integration theory. The traditional method for detecting the occurrence of true illusory conjunctions is shown to be fundamentally flawed. A reexamination of 2 previous studies provided new insights into the role of attention and location information in object perception and a reinterpretation of the deficits in patients who exhibit attentional disorders.
Perceived Average Orientation Reflects Effective Gist of the Surface.
Cha, Oakyoon; Chong, Sang Chul
2018-03-01
The human ability to represent ensemble visual information, such as average orientation and size, has been suggested as the foundation of gist perception. To effectively summarize different groups of objects into the gist of a scene, observers should form ensembles separately for different groups, even when objects have similar visual features across groups. We hypothesized that the visual system utilizes perceptual groups characterized by spatial configuration and represents separate ensembles for different groups. Therefore, participants could not integrate ensembles of different perceptual groups on a task basis. We asked participants to determine the average orientation of visual elements comprising a surface with a contour situated inside. Although participants were asked to estimate the average orientation of all the elements, they ignored orientation signals embedded in the contour. This constraint may help the visual system to keep the visual features of occluding objects separate from those of the occluded objects.
Potts, Geoffrey F; Wood, Susan M; Kothmann, Delia; Martin, Laura E
2008-10-21
Attention directs limited-capacity information processing resources to a subset of available perceptual representations. The mechanisms by which attention selects task-relevant representations for preferential processing are not fully known. Treisman and Gelade's [Treisman, A., Gelade, G., 1980. A feature integration theory of attention. Cognit. Psychol. 12, 97-136.] influential attention model posits that simple features are processed preattentively, in parallel, but that attention is required to serially conjoin multiple features into an object representation. Event-related potentials have provided evidence for this model showing parallel processing of perceptual features in the posterior Selection Negativity (SN) and serial, hierarchic processing of feature conjunctions in the Frontal Selection Positivity (FSP). Most prior studies have been done on conjunctions within one sensory modality while many real-world objects have multimodal features. It is not known if the same neural systems of posterior parallel processing of simple features and frontal serial processing of feature conjunctions seen within a sensory modality also operate on conjunctions between modalities. The current study used ERPs and simultaneously presented auditory and visual stimuli in three task conditions: Attend Auditory (auditory feature determines the target, visual features are irrelevant), Attend Visual (visual features relevant, auditory irrelevant), and Attend Conjunction (target defined by the co-occurrence of an auditory and a visual feature). In the Attend Conjunction condition when the auditory but not the visual feature was a target there was an SN over auditory cortex, when the visual but not auditory stimulus was a target there was an SN over visual cortex, and when both auditory and visual stimuli were targets (i.e. conjunction target) there were SNs over both auditory and visual cortex, indicating parallel processing of the simple features within each modality. 
In contrast, an FSP was present when either the visual only or both auditory and visual features were targets, but not when only the auditory stimulus was a target, indicating that the conjunction target determination was evaluated serially and hierarchically with visual information taking precedence. This indicates that the detection of a target defined by audio-visual conjunction is achieved via the same mechanism as within a single perceptual modality, through separate, parallel processing of the auditory and visual features and serial processing of the feature conjunction elements, rather than by evaluation of a fused multimodal percept.
Feature bindings endure without attention: evidence from an explicit recall task.
Gajewski, Daniel A; Brockmole, James R
2006-08-01
Are integrated objects the unit of capacity of visual working memory, or is continued attention needed to maintain bindings between independently stored features? In a delayed recall task, participants reported the color and shape of a probed item from a memory array. During the delay, attention was manipulated with an exogenous cue. Recall was elevated at validly cued positions, indicating that the cue affected item memory. On invalid trials, participants most frequently recalled either both features (perfect object memory) or neither of the two features (no object memory); the frequency with which only one feature was recalled was significantly lower than predicted by feature independence as determined in a single-feature recall task. These data do not support the view that features are remembered independently when attention is withdrawn. Instead, integrated objects are stored in visual working memory without need for continued attention.
NASA Astrophysics Data System (ADS)
Madokoro, H.; Tsukada, M.; Sato, K.
2013-07-01
This paper presents an unsupervised learning-based object category formation and recognition method for mobile robot vision. Our method has the following features: detection of feature points and description of features using a scale-invariant feature transform (SIFT), selection of target feature points using one class support vector machines (OC-SVMs), generation of visual words using self-organizing maps (SOMs), formation of labels using adaptive resonance theory 2 (ART-2), and creation and classification of categories on a category map of counter propagation networks (CPNs) for visualizing spatial relations between categories. Classification results for time-series images obtained with two robots of different sizes, each following its own movements, demonstrate that our method can visualize spatial relations of categories while maintaining time-series characteristics. Moreover, we emphasize the effectiveness of our method for category formation under changes in the appearance of objects.
Rajaei, Karim; Khaligh-Razavi, Seyed-Mahdi; Ghodrati, Masoud; Ebrahimpour, Reza; Shiri Ahmad Abadi, Mohammad Ebrahim
2012-01-01
The brain mechanism of extracting visual features for recognizing various objects has consistently been a controversial issue in computational models of object recognition. To extract visual features, we introduce a new, biologically motivated model for facial categorization, which is an extension of the Hubel and Wiesel simple-to-complex cell hierarchy. To address the synaptic stability versus plasticity dilemma, we apply the Adaptive Resonance Theory (ART) for extracting informative intermediate level visual features during the learning process, which also makes this model stable against the destruction of previously learned information while learning new information. Such a mechanism has been suggested to be embedded within known laminar microcircuits of the cerebral cortex. To reveal the strength of the proposed visual feature learning mechanism, we show that when we use this mechanism in the training process of a well-known biologically motivated object recognition model (the HMAX model), it performs better than the HMAX model in face/non-face classification tasks. Furthermore, we demonstrate that our proposed mechanism is capable of following similar trends in performance as humans in a psychophysical experiment using a face versus non-face rapid categorization task.
Picture Detection in Rapid Serial Visual Presentation: Features or Identity?
ERIC Educational Resources Information Center
Potter, Mary C.; Wyble, Brad; Pandav, Rijuta; Olejarczyk, Jennifer
2010-01-01
A pictured object can be readily detected in a rapid serial visual presentation sequence when the target is specified by a superordinate category name such as "animal" or "vehicle". Are category features the initial basis for detection, with identification of the specific object occurring in a second stage (Evans &…
Separate Capacities for Storing Different Features in Visual Working Memory
ERIC Educational Resources Information Center
Wang, Benchi; Cao, Xiaohua; Theeuwes, Jan; Olivers, Christian N. L.; Wang, Zhiguo
2017-01-01
Recent empirical and theoretical work suggests that visual features such as color and orientation can be stored or retrieved independently in visual working memory (VWM), even in cases when they belong to the same object. Yet it remains unclear whether different feature dimensions have their own capacity limits, or whether they compete for shared…
An insect-inspired model for visual binding II: functional analysis and visual attention.
Northcutt, Brandon D; Higgins, Charles M
2017-04-01
We have developed a neural network model capable of performing visual binding inspired by neuronal circuitry in the optic glomeruli of flies: a brain area that lies just downstream of the optic lobes where early visual processing is performed. This visual binding model is able to detect objects in dynamic image sequences and bind together their respective characteristic visual features (such as color, motion, and orientation) by taking advantage of their common temporal fluctuations. Visual binding is represented in the form of an inhibitory weight matrix which learns over time which features originate from a given visual object. In the present work, we show that information represented implicitly in this weight matrix can be used to explicitly count the number of objects present in the visual image, to enumerate their specific visual characteristics, and even to create an enhanced image in which one particular object is emphasized over others, thus implementing a simple form of visual attention. Further, we present a detailed analysis which reveals the function and theoretical limitations of the visual binding network and in this context describe a novel network learning rule which is optimized for visual binding.
Wang, Changming; Xiong, Shi; Hu, Xiaoping; Yao, Li; Zhang, Jiacai
2012-10-01
Images containing visual objects can be successfully categorized using single-trial electroencephalography (EEG) measured while subjects view the images. Previous studies have shown that task-related information contained in event-related potential (ERP) components could discriminate two or three categories of object images. In this study, we investigated whether four categories of objects (human faces, buildings, cats and cars) could be mutually discriminated using single-trial EEG data. Here, the EEG waveforms acquired while subjects were viewing four categories of object images were segmented into several ERP components (P1, N1, P2a and P2b), and then Fisher linear discriminant analysis (Fisher-LDA) was used to classify EEG features extracted from ERP components. Firstly, we compared the classification results using features from single ERP components, and identified that the N1 component achieved the highest classification accuracies. Secondly, we discriminated four categories of objects by combining features from multiple ERP components, and showed that the combination of ERP components improved four-category classification accuracies by utilizing the complementarity of discriminative information in ERP components. These findings confirmed that four categories of object images could be discriminated with single-trial EEG and could direct us to select effective EEG features for classifying visual objects.
Feature integration and object representations along the dorsal stream visual hierarchy
Perry, Carolyn Jeane; Fallah, Mazyar
2014-01-01
The visual system is split into two processing streams: a ventral stream that receives color and form information and a dorsal stream that receives motion information. Each stream processes that information hierarchically, with each stage building upon the previous. In the ventral stream this leads to the formation of object representations that ultimately allow for object recognition regardless of changes in the surrounding environment. In the dorsal stream, this hierarchical processing has classically been thought to lead to the computation of complex motion in three dimensions. However, there is evidence to suggest that there is integration of both dorsal and ventral stream information into motion computation processes, giving rise to intermediate object representations, which facilitate object selection and decision making mechanisms in the dorsal stream. First we review the hierarchical processing of motion along the dorsal stream and the building up of object representations along the ventral stream. Then we discuss recent work on the integration of ventral and dorsal stream features that lead to intermediate object representations in the dorsal stream. Finally we propose a framework describing how and at what stage different features are integrated into dorsal visual stream object representations. Determining the integration of features along the dorsal stream is necessary to understand not only how the dorsal stream builds up an object representation but also which computations are performed on object representations instead of local features. PMID:25140147
Inter-area correlations in the ventral visual pathway reflect feature integration
Freeman, Jeremy; Donner, Tobias H.; Heeger, David J.
2011-01-01
During object perception, the brain integrates simple features into representations of complex objects. A perceptual phenomenon known as visual crowding selectively interferes with this process. Here, we use crowding to characterize a neural correlate of feature integration. Cortical activity was measured with functional magnetic resonance imaging, simultaneously in multiple areas of the ventral visual pathway (V1–V4 and the visual word form area, VWFA, which responds preferentially to familiar letters), while human subjects viewed crowded and uncrowded letters. Temporal correlations between cortical areas were lower for crowded letters than for uncrowded letters, especially between V1 and VWFA. These differences in correlation were retinotopically specific, and persisted when attention was diverted from the letters. But correlation differences were not evident when we substituted the letters with grating patches that were not crowded under our stimulus conditions. We conclude that inter-area correlations reflect feature integration and are disrupted by crowding. We propose that crowding may perturb the transformations between neural representations along the ventral pathway that underlie the integration of features into objects. PMID:21521832
The spread of attention across features of a surface
Ernst, Zachary Raymond; Jazayeri, Mehrdad
2013-01-01
Contrasting theories of visual attention have emphasized selection by spatial location, individual features, and whole objects. We used functional magnetic resonance imaging to ask whether and how attention to one feature of an object spreads to other features of the same object. Subjects viewed two spatially superimposed surfaces of random dots that were segregated by distinct color-motion conjunctions. The color and direction of motion of each surface changed smoothly and in a cyclical fashion. Subjects were required to track one feature (e.g., color) of one of the two surfaces and detect brief moments when the attended feature diverged from its smooth trajectory. To tease apart the effect of attention to individual features on the hemodynamic response, we used a frequency-tagging scheme. In this scheme, the stimulus features (color and direction of motion) are modulated periodically at distinct frequencies so that the contribution of each feature to the hemodynamics can be inferred from the harmonic response at the corresponding frequency. We found that attention to one feature (e.g., color) of one surface increased the response modulation not only to the attended feature but also to the other feature (e.g., motion) of the same surface. This attentional modulation was evident in multiple visual areas and was present as early as V1. The spread of attention to the behaviorally irrelevant features of a surface suggests that attention may automatically select all features of a single object. Thus object-based attention may be supported by an enhancement of feature-specific sensory signals in the visual cortex. PMID:23883860
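The frequency-tagging idea described above can be sketched as follows: each feature is modulated at its own frequency, and the response amplitude at that frequency is read out with a single-bin discrete Fourier transform. The tagging frequencies, sampling rate, and synthetic signal below are assumptions chosen for illustration and are not taken from the study.

```python
import cmath
import math


def amplitude_at(signal, freq_hz, sample_rate_hz):
    """Amplitude of the sinusoidal component of `signal` at `freq_hz`.

    Single-bin discrete Fourier transform. The estimate is exact when the
    recording spans a whole number of cycles of `freq_hz`.
    """
    n = len(signal)
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
        for k, x in enumerate(signal)
    )
    return 2 * abs(acc) / n


# Hypothetical tagging frequencies and sampling rate (not from the study).
SAMPLE_RATE = 2.0    # samples per second
COLOR_FREQ = 0.05    # Hz, modulation frequency assigned to color
MOTION_FREQ = 0.125  # Hz, modulation frequency assigned to motion direction

# Synthetic response over 80 s: a color-driven and a motion-driven
# modulation mixed into one time course.
times = [k / SAMPLE_RATE for k in range(160)]
response = [
    1.5 * math.sin(2 * math.pi * COLOR_FREQ * t)
    + 0.5 * math.sin(2 * math.pi * MOTION_FREQ * t)
    for t in times
]

color_amp = amplitude_at(response, COLOR_FREQ, SAMPLE_RATE)
motion_amp = amplitude_at(response, MOTION_FREQ, SAMPLE_RATE)
```

Comparing such amplitudes between attention conditions is one way to ask whether attending to one feature also changes the response modulation tied to the other feature.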
Object-based attentional selection modulates anticipatory alpha oscillations
Knakker, Balázs; Weiss, Béla; Vidnyánszky, Zoltán
2015-01-01
Visual cortical alpha oscillations are involved in attentional gating of incoming visual information. It has been shown that spatial and feature-based attentional selection result in increased alpha oscillations over the cortical regions representing sensory input originating from the unattended visual field and task-irrelevant visual features, respectively. However, whether attentional gating in the case of object-based selection is also associated with alpha oscillations has not been investigated before. Here we measured anticipatory electroencephalography (EEG) alpha oscillations while participants were cued to attend to foveal face or word stimuli, the processing of which is known to have right and left hemispheric lateralization, respectively. The results revealed that in the case of simultaneously displayed, overlapping face and word stimuli, attending to the words led to increased power of parieto-occipital alpha oscillations over the right hemisphere as compared to when faces were attended. This object category-specific modulation of the hemispheric lateralization of anticipatory alpha oscillations was maintained during sustained attentional selection of sequentially presented face and word stimuli. These results imply that in the case of object-based attentional selection, as with spatial and feature-based attention, gating of visual information processing might involve visual cortical alpha oscillations. PMID:25628554
Geigerman, Shriradha; Verhaeghen, Paul; Cerella, John
2016-06-01
In three experiments, we investigated whether features and whole-objects can be represented simultaneously in visual short-term memory (VSTM). Participants were presented with a memory set of colored shapes; we probed either for the constituent features or for the whole object, and analyzed retrieval dynamics (cumulative response time distributions). In our first experiment, we used whole-object probes that recombined features from the memory display; we found that subjects' data conformed to a kitchen-line model, showing that they used whole-object representations for the matching process. In the second experiment, we encouraged independent-feature representations by using probes that used features not present in the memory display; subjects' data conformed to the race-model inequality, showing that they used independent-feature representations for the matching process. In a final experiment, we used both types of probes; subjects now used both types of representations, depending on the nature of the probe. Combined, our three experiments suggest that both feature and whole-object representations can coexist in VSTM. Copyright © 2016 Elsevier B.V. All rights reserved.
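For readers unfamiliar with the race-model inequality mentioned above, it is commonly stated as F_both(t) <= F_a(t) + F_b(t) for every time t, where each F is a cumulative response time distribution. The sketch below checks that bound on empirical distributions. The response times are invented for illustration, and the mapping of conditions to probe types is an assumption, since the abstract does not describe it.

```python
def ecdf(rts, t):
    """Empirical cumulative probability that a response time is <= t."""
    return sum(rt <= t for rt in rts) / len(rts)


def race_inequality_violations(rt_a, rt_b, rt_both, time_points):
    """Return the time points at which F_both(t) > F_a(t) + F_b(t)."""
    return [
        t for t in time_points
        if ecdf(rt_both, t) > min(1.0, ecdf(rt_a, t) + ecdf(rt_b, t))
    ]


# Made-up response times in milliseconds, for illustration only.
rt_color = [520, 560, 600, 640, 700]
rt_shape = [540, 580, 610, 660, 720]
rt_object = [530, 560, 590, 620, 670]

violations = race_inequality_violations(
    rt_color, rt_shape, rt_object, range(450, 751, 50)
)
```

An empty list means the data stay within the bound at every tested time point, which is the pattern expected when independent feature representations race to produce the response.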
A neural theory of visual attention and short-term memory (NTVA).
Bundesen, Claus; Habekost, Thomas; Kyllingsbæk, Søren
2011-05-01
The neural theory of visual attention and short-term memory (NTVA) proposed by Bundesen, Habekost, and Kyllingsbæk (2005) is reviewed. In NTVA, filtering (selection of objects) changes the number of cortical neurons in which an object is represented so that this number increases with the behavioural importance of the object. Another mechanism of selection, pigeonholing (selection of features), scales the level of activation in neurons coding for a particular feature. By these mechanisms, behaviourally important objects and features are likely to win the competition to become encoded into visual short-term memory (VSTM). The VSTM system is conceived as a feedback mechanism that sustains activity in the neurons that have won the attentional competition. NTVA accounts both for a wide range of attentional effects in human performance (reaction times and error rates) and a wide range of effects observed in firing rates of single cells in the primate visual system. Copyright © 2010 Elsevier Ltd. All rights reserved.
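The two selection mechanisms can be made concrete with the rate equation from the TVA literature, v(x, i) = eta(x, i) * beta_i * w_x / sum_z w_z, with attentional weights w_x = sum_j eta(x, j) * pi_j. The sketch below is a minimal illustration under that assumption: the equation is not stated in this abstract, and the object names, feature names, and parameter values are hypothetical.

```python
def attentional_weights(eta, pertinence):
    """w_x = sum_j eta[x][j] * pi_j for every object x (filtering)."""
    return {
        x: sum(eta_x[j] * pertinence.get(j, 0.0) for j in eta_x)
        for x, eta_x in eta.items()
    }


def processing_rates(eta, bias, pertinence):
    """v(x, i) = eta(x, i) * beta_i * w_x / sum_z w_z.

    `bias` holds the beta values (pigeonholing); `pertinence` holds the pi
    values that determine the attentional weights (filtering).
    """
    w = attentional_weights(eta, pertinence)
    total = sum(w.values())
    if total <= 0:
        raise ValueError("at least one object needs a positive weight")
    return {
        (x, i): eta_x[i] * bias.get(i, 0.0) * w[x] / total
        for x, eta_x in eta.items()
        for i in eta_x
    }


# Hypothetical sensory evidence that each object has each feature.
eta = {
    "target": {"red": 0.9, "green": 0.1},
    "distractor": {"red": 0.1, "green": 0.9},
}
# Red objects are behaviourally important; both colours are equally reportable.
rates = processing_rates(
    eta, bias={"red": 0.5, "green": 0.5}, pertinence={"red": 1.0}
)
```

In this toy case the target receives nine times the attentional weight of the distractor, so its categorizations are processed faster and are more likely to win the competition for encoding.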
Behavioral model of visual perception and recognition
NASA Astrophysics Data System (ADS)
Rybak, Ilya A.; Golovan, Alexander V.; Gusakova, Valentina I.
1993-09-01
In the processes of visual perception and recognition human eyes actively select essential information by way of successive fixations at the most informative points of the image. A behavioral program defining a scanpath of the image is formed at the stage of learning (object memorizing) and consists of sequential motor actions, which are shifts of attention from one to another point of fixation, and sensory signals expected to arrive in response to each shift of attention. In the modern view of the problem, invariant object recognition is provided by the following: (1) separated processing of 'what' (object features) and 'where' (spatial features) information at high levels of the visual system; (2) mechanisms of visual attention using 'where' information; (3) representation of 'what' information in an object-based frame of reference (OFR). However, most recent models of vision based on OFR have demonstrated the ability of invariant recognition of only simple objects like letters or binary objects without background, i.e. objects to which a frame of reference is easily attached. In contrast, we use not OFR, but a feature-based frame of reference (FFR), connected with the basic feature (edge) at the fixation point. This gives our model the ability for invariant representation of complex objects in gray-level images, but demands realization of the behavioral aspects of vision described above. The developed model contains a neural network subsystem of low-level vision which extracts a set of primary features (edges) in each fixation, and a high-level subsystem consisting of 'what' (Sensory Memory) and 'where' (Motor Memory) modules. The resolution of primary feature extraction decreases with distance from the point of fixation. FFR provides both the invariant representation of object features in Sensory Memory and shifts of attention in Motor Memory. 
Object recognition consists in successive recall (from Motor Memory) and execution of shifts of attention and successive verification of the expected sets of features (stored in Sensory Memory). The model shows the ability of recognition of complex objects (such as faces) in gray-level images invariant with respect to shift, rotation, and scale.
The uncrowded window of object recognition
Pelli, Denis G; Tillman, Katharine A
2009-01-01
It is now emerging that vision is usually limited by object spacing rather than size. The visual system recognizes an object by detecting and then combining its features. ‘Crowding’ occurs when objects are too close together and features from several objects are combined into a jumbled percept. Here, we review the explosion of studies on crowding—in grating discrimination, letter and face recognition, visual search, selective attention, and reading—and find a universal principle, the Bouma law. The critical spacing required to prevent crowding is equal for all objects, although the effect is weaker between dissimilar objects. Furthermore, critical spacing at the cortex is independent of object position, and critical spacing at the visual field is proportional to object distance from fixation. The region where object spacing exceeds critical spacing is the ‘uncrowded window’. Observers cannot recognize objects outside of this window and its size limits the speed of reading and search. PMID:18828191
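The proportionality between critical spacing and eccentricity can be written as a one-line rule. The sketch below assumes a proportionality constant of 0.5, a commonly quoted rule of thumb that is not given in this abstract; the function names are hypothetical.

```python
BOUMA_FACTOR = 0.5  # assumed proportionality constant; the abstract gives no value


def critical_spacing(eccentricity_deg, bouma_factor=BOUMA_FACTOR):
    """Critical spacing (deg) grows in proportion to distance from fixation."""
    return bouma_factor * abs(eccentricity_deg)


def is_crowded(spacing_deg, eccentricity_deg, bouma_factor=BOUMA_FACTOR):
    """True when neighbouring objects are closer than the critical spacing."""
    return spacing_deg < critical_spacing(eccentricity_deg, bouma_factor)


# Objects 2 deg apart are crowded at 10 deg eccentricity but not at 2 deg.
far_crowded = is_crowded(2.0, 10.0)
near_crowded = is_crowded(2.0, 2.0)
```

Under this rule, the uncrowded window is the set of eccentricities at which the actual object spacing exceeds the critical spacing.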
Rapid Processing of a Global Feature in the ON Visual Pathways of Behaving Monkeys.
Huang, Jun; Yang, Yan; Zhou, Ke; Zhao, Xudong; Zhou, Quan; Zhu, Hong; Yang, Yingshan; Zhang, Chunming; Zhou, Yifeng; Zhou, Wu
2017-01-01
Visual objects are recognized by their features. Whereas some features are based on simple components (i.e., local features, such as orientation of line segments), some features are based on the whole object (i.e., global features, such as an object having a hole in it). Over the past five decades, behavioral, physiological, anatomical, and computational studies have established a general model of vision, which starts from extracting local features in the lower visual pathways followed by a feature integration process that extracts global features in the higher visual pathways. This local-to-global model is successful in providing a unified account for a vast set of perception experiments, but it fails to account for a set of experiments showing human visual systems' superior sensitivity to global features. Understanding the neural mechanisms underlying the "global-first" process will offer critical insights into new models of vision. The goal of the present study was to establish a non-human primate model of rapid processing of global features for elucidating the neural mechanisms underlying differential processing of global and local features. Monkeys were trained to make a saccade to a target on a black background, which was different from the distractors (white circle) in color (e.g., red circle target), local features (e.g., white square target), a global feature (e.g., white ring with a hole target) or their combinations (e.g., red square target). Contrary to the predictions of the prevailing local-to-global model, we found that (1) detecting a distinction or a change in the global feature was faster than detecting a distinction or a change in color or local features; (2) detecting a distinction in color was facilitated by a distinction in the global feature, but not in the local features; and (3) detecting the hole was interfered with by the local features of the hole (e.g., white ring with a squared hole). 
These results suggest that monkey ON visual systems have a subsystem that is more sensitive to distinctions in the global feature than local features. They also provide the behavioral constraints for identifying the underlying neural substrates.
Chiang, Hsueh-Sheng; Eroh, Justin; Spence, Jeffrey S; Motes, Michael A; Maguire, Mandy J; Krawczyk, Daniel C; Brier, Matthew R; Hart, John; Kraut, Michael A
2016-08-01
How the brain combines the neural representations of features that comprise an object in order to activate a coherent object memory is poorly understood, especially when the features are presented in different modalities (visual vs. auditory) and domains (verbal vs. nonverbal). We examined this question using three versions of a modified Semantic Object Retrieval Test, where object memory was probed by a feature presented as a written word, a spoken word, or a picture, followed by a second feature always presented as a visual word. Participants indicated whether each feature pair elicited retrieval of the memory of a particular object. Sixteen subjects completed each of the three versions (N=48 in total) while their EEG was recorded simultaneously. We analyzed EEG data in four separate frequency bands (delta: 1-4Hz, theta: 4-7Hz, alpha: 8-12Hz, beta: 13-19Hz) using a multivariate data-driven approach. We found that alpha power time-locked to response was modulated by both cross-modality (visual vs. auditory) and cross-domain (verbal vs. nonverbal) probing of semantic object memory. In addition, retrieval trials showed greater changes in all frequency bands compared to non-retrieval trials across all stimulus types in both response-locked and stimulus-locked analyses, suggesting dissociable neural subcomponents involved in binding object features to retrieve a memory. We conclude that these findings support both modality/domain-dependent and modality/domain-independent mechanisms during semantic object memory retrieval. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Technical Reports Server (NTRS)
Lewis, Steven J.; Palacios, David M.
2013-01-01
This software can track multiple moving objects within a video stream simultaneously, use visual features to aid in the tracking, and initiate tracks based on object detection in a subregion. A simple programmatic interface allows plugging into larger image chain modeling suites. It extracts unique visual features to aid in tracking and later analysis, and includes sub-functionality for extracting visual features about an object identified within an image frame. Tracker Toolkit utilizes a feature extraction algorithm to tag each object with metadata features about its size, shape, color, and movement. Its functionality is independent of the scale of objects within a scene. The only assumption made about the tracked objects is that they move. There are no constraints on size within the scene, shape, or type of movement. The Tracker Toolkit is also capable of following an arbitrary number of objects in the same scene, identifying and propagating the track of each object from frame to frame. Target objects may be specified for tracking beforehand, or may be dynamically discovered within a tripwire region. Initialization of the Tracker Toolkit algorithm includes two steps: initializing the data structures for tracked target objects, including targets preselected for tracking; and initializing the tripwire region. If no tripwire region is desired, the second step is skipped. The tripwire region is an area within the frames that is always checked for new objects, and all new objects discovered within the region will be tracked until lost (by leaving the frame, stopping, or blending into the background).
Linear and Non-Linear Visual Feature Learning in Rat and Humans
Bossens, Christophe; Op de Beeck, Hans P.
2016-01-01
The visual system processes visual input in a hierarchical manner in order to extract relevant features that can be used in tasks such as invariant object recognition. Although typically investigated in primates, recent work has shown that rats can be trained in a variety of visual object and shape recognition tasks. These studies did not pinpoint the complexity of the features used by these animals. Many tasks might be solved by using a combination of relatively simple features which tend to be correlated. Alternatively, rats might extract complex features or feature combinations which are nonlinear with respect to those simple features. In the present study, we address this question by starting from a small stimulus set for which one stimulus-response mapping involves a simple linear feature to solve the task while another mapping needs a well-defined nonlinear combination of simpler features related to shape symmetry. We verified computationally that the nonlinear task cannot be trivially solved by a simple V1-model. We show how rats are able to solve the linear feature task but are unable to acquire the nonlinear feature. In contrast, humans are able to use the nonlinear feature and are even faster in uncovering this solution as compared to the linear feature. The implications for the computational capabilities of the rat visual system are discussed. PMID:28066201
Semantic and visual determinants of face recognition in a prosopagnosic patient.
Dixon, M J; Bub, D N; Arguin, M
1998-05-01
Prosopagnosia is the neuropathological inability to recognize familiar people by their faces. It can occur in isolation or can coincide with recognition deficits for other nonface objects. Often, patients whose prosopagnosia is accompanied by object recognition difficulties have more trouble identifying certain categories of objects relative to others. In previous research, we demonstrated that objects that shared multiple visual features and were semantically close posed severe recognition difficulties for a patient with temporal lobe damage. We now demonstrate that this patient's face recognition is constrained by these same parameters. The prosopagnosic patient ELM had difficulties pairing faces to names when the faces shared visual features and the names were semantically related (e.g., Tonya Harding, Nancy Kerrigan, and Josee Chouinard, three ice skaters). He made tenfold fewer errors when the exact same faces were associated with semantically unrelated people (e.g., singer Celine Dion, actress Betty Grable, and First Lady Hillary Clinton). We conclude that prosopagnosia and co-occurring category-specific recognition problems both stem from difficulties disambiguating the stored representations of objects that share multiple visual features and refer to semantically close identities or concepts.
Classification of visual and linguistic tasks using eye-movement features.
Coco, Moreno I; Keller, Frank
2014-03-07
The role of the task has received special attention in visual-cognition research because it can provide causal explanations of goal-directed eye-movement responses. The dependency between visual attention and task suggests that eye movements can be used to classify the task being performed. A recent study by Greene, Liu, and Wolfe (2012), however, fails to achieve accurate classification of visual tasks based on eye-movement features. In the present study, we hypothesize that tasks can be successfully classified when they differ with respect to the involvement of other cognitive domains, such as language processing. We extract the eye-movement features used by Greene et al. as well as additional features from the data of three different tasks: visual search, object naming, and scene description. First, we demonstrated that eye-movement responses make it possible to characterize the goals of these tasks. Then, we trained three different types of classifiers and predicted the task participants performed with an accuracy well above chance (a maximum of 88% for visual search). An analysis of the relative importance of features for classification accuracy reveals that just one feature, i.e., initiation time, is sufficient for above-chance performance (a maximum of 79% accuracy in object naming). Crucially, this feature is independent of task duration, which differs systematically across the three tasks we investigated. Overall, the best task classification performance was obtained with a set of seven features that included both spatial information (e.g., entropy of attention allocation) and temporal components (e.g., total fixation on objects) of the eye-movement record. This result confirms the task-dependent allocation of visual attention and extends previous work by showing that task classification is possible when tasks differ in the cognitive processes involved (purely visual tasks such as search vs. communicative tasks such as scene description).
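One of the spatial features named in this abstract, the entropy of attention allocation, can be illustrated with a short Python sketch. The abstract does not give the authors' exact definition, so the version below is an assumption: fixations are binned into a square grid and the Shannon entropy of the resulting counts is reported in bits. The function name `attention_entropy` and the `cell_size` parameter are hypothetical.

```python
from collections import Counter
from math import log2


def attention_entropy(fixations, cell_size=100):
    """Shannon entropy (in bits) of fixation counts over a square grid.

    fixations: iterable of (x, y) screen coordinates.
    cell_size: side length of one grid cell, in the same units (assumed value).
    """
    fixations = list(fixations)
    if not fixations:
        raise ValueError("need at least one fixation")
    # Count how many fixations fall into each grid cell.
    cells = Counter((int(x // cell_size), int(y // cell_size)) for x, y in fixations)
    total = sum(cells.values())
    # Low entropy means focused attention; high entropy means spread-out attention.
    return -sum((n / total) * log2(n / total) for n in cells.values())
```

Under this definition, fixations concentrated in one cell give an entropy of zero, and fixations spread evenly over four cells give two bits.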
Binding Objects to Locations: The Relationship between Object Files and Visual Working Memory
ERIC Educational Resources Information Center
Hollingworth, Andrew; Rasmussen, Ian P.
2010-01-01
The relationship between object files and visual working memory (VWM) was investigated in a new paradigm combining features of traditional VWM experiments (color change detection) and object-file experiments (memory for the properties of moving objects). Object-file theory was found to account for a key component of object-position binding in VWM:…
Conceptual Distinctiveness Supports Detailed Visual Long-Term Memory for Real-World Objects
ERIC Educational Resources Information Center
Konkle, Talia; Brady, Timothy F.; Alvarez, George A.; Oliva, Aude
2010-01-01
Humans have a massive capacity to store detailed information in visual long-term memory. The present studies explored the fidelity of these visual long-term memory representations and examined how conceptual and perceptual features of object categories support this capacity. Observers viewed 2,800 object images with a different number of exemplars…
Visual perception system and method for a humanoid robot
NASA Technical Reports Server (NTRS)
Chelian, Suhas E. (Inventor); Linn, Douglas Martin (Inventor); Wampler, II, Charles W. (Inventor); Bridgwater, Lyndon (Inventor); Wells, James W. (Inventor); Mc Kay, Neil David (Inventor)
2012-01-01
A robotic system includes a humanoid robot with robotic joints each moveable using an actuator(s), and a distributed controller for controlling the movement of each of the robotic joints. The controller includes a visual perception module (VPM) for visually identifying and tracking an object in the field of view of the robot under threshold lighting conditions. The VPM includes optical devices for collecting an image of the object, a positional extraction device, and a host machine having an algorithm for processing the image and positional information. The algorithm visually identifies and tracks the object, and automatically adapts an exposure time of the optical devices to prevent feature data loss of the image under the threshold lighting conditions. A method of identifying and tracking the object includes collecting the image, extracting positional information of the object, and automatically adapting the exposure time to thereby prevent feature data loss of the image.
The Role of Attention in the Maintenance of Feature Bindings in Visual Short-term Memory
ERIC Educational Resources Information Center
Johnson, Jeffrey S.; Hollingworth, Andrew; Luck, Steven J.
2008-01-01
This study examined the role of attention in maintaining feature bindings in visual short-term memory. In a change-detection paradigm, participants attempted to detect changes in the colors and orientations of multiple objects; the changes consisted of new feature values in a feature-memory condition and changes in how existing feature values were…
Toward semantic-based retrieval of visual information: a model-based approach
NASA Astrophysics Data System (ADS)
Park, Youngchoon; Golshani, Forouzan; Panchanathan, Sethuraman
2002-07-01
This paper centers on the problem of automated visual content classification. To enable classification-based image or visual object retrieval, we propose a new image representation scheme called the visual context descriptor (VCD), a multidimensional vector in which each element represents the frequency of a unique visual property of an image or a region. VCD utilizes predetermined quality dimensions (i.e., types of features and quantization level) and semantic model templates mined a priori. Not only observed visual cues, but also contextually relevant visual features are proportionally incorporated in VCD. Contextual relevance of a visual cue to a semantic class is determined by using correlation analysis of ground truth samples. Such co-occurrence analysis of visual cues requires transformation of a real-valued visual feature vector (e.g., color histogram, Gabor texture, etc.) into a discrete event (e.g., terms in text). Good features to track, the rule of thirds, iterative k-means clustering, and TSVQ are involved in the transformation of feature vectors into unified symbolic representations called visual terms. Similarity-based visual cue frequency estimation is also proposed and used to ensure the correctness of model learning and matching, since sparseness of sample data causes unstable frequency estimates for visual cues. The proposed method naturally allows integration of heterogeneous visual, temporal, or spatial cues in a single classification or matching framework, and can be easily integrated into a semantic knowledge base such as a thesaurus or an ontology. Robust semantic visual model template creation and object-based image retrieval are demonstrated based on the proposed content description scheme.
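The core idea of turning real-valued feature vectors into discrete visual terms and counting their frequencies can be sketched in Python as follows. This is only an illustration under stated assumptions: the codebook is taken as given (the paper builds it with k-means clustering and TSVQ), the helper names `nearest_term` and `term_frequencies` are hypothetical, and the contextual-relevance weighting and similarity-based frequency estimation described in the abstract are omitted.

```python
def nearest_term(vector, codebook):
    """Index of the codeword closest to vector (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((v - c) ** 2 for v, c in zip(vector, codebook[i])),
    )


def term_frequencies(feature_vectors, codebook):
    """Relative frequency of each visual term among the given feature vectors."""
    counts = [0] * len(codebook)
    for vec in feature_vectors:
        counts[nearest_term(vec, codebook)] += 1
    total = len(feature_vectors)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]
```

Each element of the returned list plays the role of one descriptor dimension: the share of local features in an image or region that were quantized to that visual term.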
Feature-based attentional weighting and spreading in visual working memory
Niklaus, Marcel; Nobre, Anna C.; van Ede, Freek
2017-01-01
Attention can be directed at features and feature dimensions to facilitate perception. Here, we investigated whether feature-based-attention (FBA) can also dynamically weight feature-specific representations within multi-feature objects held in visual working memory (VWM). Across three experiments, participants retained coloured arrows in working memory and, during the delay, were cued to either the colour or the orientation dimension. We show that directing attention towards a feature dimension (1) improves the performance in the cued feature dimension at the expense of the uncued dimension, (2) is more efficient if directed to the same rather than to different dimensions for different objects, and (3) at least for colour, automatically spreads to the colour representation of non-attended objects in VWM. We conclude that FBA also continues to operate on VWM representations (with similar principles that govern FBA in the perceptual domain) and challenge the classical view that VWM representations are stored solely as integrated objects. PMID:28233830
Human listening studies reveal insights into object features extracted by echolocating dolphins
NASA Astrophysics Data System (ADS)
Delong, Caroline M.; Au, Whitlow W. L.; Roitblat, Herbert L.
2004-05-01
Echolocating dolphins extract object feature information from the acoustic parameters of object echoes. However, little is known about which object features are salient to dolphins or how they extract those features. To gain insight into how dolphins might be extracting feature information, human listeners were presented with echoes from objects used in a dolphin echoic-visual cross-modal matching task. Human participants performed a task similar to the one the dolphin had performed; however, echoic samples consisting of 23-echo trains were presented via headphones. The participants listened to the echoic sample and then visually selected the correct object from among three alternatives. The participants performed as well as or better than the dolphin (M=88.0% correct), and reported using a combination of acoustic cues to extract object features (e.g., loudness, pitch, timbre). Participants frequently reported using the pattern of aural changes in the echoes across the echo train to identify the shape and structure of the objects (e.g., peaks in loudness or pitch). It is likely that dolphins also attend to the pattern of changes across echoes as objects are echolocated from different angles.
Crowding by a single bar: probing pattern recognition mechanisms in the visual periphery.
Põder, Endel
2014-11-06
Whereas visual crowding does not greatly affect the detection of the presence of simple visual features, it heavily inhibits combining them into recognizable objects. Still, crowding effects have rarely been directly related to general pattern recognition mechanisms. In this study, pattern recognition mechanisms in visual periphery were probed using a single crowding feature. Observers had to identify the orientation of a rotated T presented briefly in a peripheral location. Adjacent to the target, a single bar was presented. The bar was either horizontal or vertical and located in a random direction from the target. It appears that such a crowding bar has very strong and regular effects on the identification of the target orientation. The observer's responses are determined by approximate relative positions of basic visual features; exact image-based similarity to the target is not important. A version of the "standard model" of object recognition with second-order features explains the main regularities of the data. © 2014 ARVO.
Feature-based attentional modulations in the absence of direct visual stimulation.
Serences, John T; Boynton, Geoffrey M
2007-07-19
When faced with a crowded visual scene, observers must selectively attend to behaviorally relevant objects to avoid sensory overload. Often this selection process is guided by prior knowledge of a target-defining feature (e.g., the color red when looking for an apple), which enhances the firing rate of visual neurons that are selective for the attended feature. Here, we used functional magnetic resonance imaging and a pattern classification algorithm to predict the attentional state of human observers as they monitored a visual feature (one of two directions of motion). We find that feature-specific attention effects spread across the visual field, even to regions of the scene that do not contain a stimulus. This spread of feature-based attention to empty regions of space may facilitate the perception of behaviorally relevant stimuli by increasing sensitivity to attended features at all locations in the visual field.
Binding in visual working memory: the role of the episodic buffer.
Baddeley, Alan D; Allen, Richard J; Hitch, Graham J
2011-05-01
The episodic buffer component of working memory is assumed to play a central role in the binding of features into objects, a process that was initially assumed to depend upon executive resources. Here, we review a program of work in which we specifically tested this assumption by studying the effects of a range of attentionally demanding concurrent tasks on the capacity to encode and retain both individual features and bound objects. We found no differential effect of concurrent load, even when the process of binding was made more demanding by separating the shape and color features spatially, temporally or across visual and auditory modalities. Bound features were however more readily disrupted by subsequent stimuli, a process we studied using a suffix paradigm. This suggested a need to assume a feature-based attentional filter followed by an object based storage process. Our results are interpreted within a modified version of the multicomponent working memory model. We also discuss work examining the role of the hippocampus in visual feature binding. Copyright © 2011 Elsevier Ltd. All rights reserved.
Threat as a feature in visual semantic object memory.
Calley, Clifford S; Motes, Michael A; Chiang, H-Sheng; Buhl, Virginia; Spence, Jeffrey S; Abdi, Hervé; Anand, Raksha; Maguire, Mandy; Estevez, Leonardo; Briggs, Richard; Freeman, Thomas; Kraut, Michael A; Hart, John
2013-08-01
Threatening stimuli have been found to modulate visual processes related to perception and attention. The present functional magnetic resonance imaging (fMRI) study investigated whether threat modulates visual object recognition of man-made and naturally occurring categories of stimuli. Compared with nonthreatening pictures, threatening pictures of real items elicited larger fMRI BOLD signal changes in medial visual cortices extending inferiorly into the temporo-occipital (TO) "what" pathways. This region elicited greater signal changes for threatening items compared to nonthreatening items from both the naturally occurring and man-made stimulus supraordinate categories, demonstrating a featural component to these visual processing areas. Two additional loci of signal changes within more lateral inferior TO areas (bilateral BA18 and 19 as well as the right ventral temporal lobe) were detected for a category-feature interaction, with stronger responses to man-made (category) threatening (feature) stimuli than to natural threats. The findings are discussed in terms of the efficient or rapid visual recognition of groups of items that confer an advantage for survival. Copyright © 2012 Wiley Periodicals, Inc.
Finlayson, Nonie J.; Golomb, Julie D.
2016-01-01
A fundamental aspect of human visual perception is the ability to recognize and locate objects in the environment. Importantly, our environment is predominantly three-dimensional (3D), but while there is considerable research exploring the binding of object features and location, it is unknown how depth information interacts with features in the object binding process. A recent paradigm called the spatial congruency bias demonstrated that 2D location is fundamentally bound to object features (Golomb, Kupitz, & Thiemann, 2014), such that irrelevant location information biases judgments of object features, but irrelevant feature information does not bias judgments of location or other features. Here, using the spatial congruency bias paradigm, we asked whether depth is processed as another type of location, or more like other features. We initially found that depth cued by binocular disparity biased judgments of object color. However, this result seemed to be driven more by the disparity differences than the depth percept: Depth cued by occlusion and size did not bias color judgments, whereas vertical disparity information (with no depth percept) did bias color judgments. Our results suggest that despite the 3D nature of our visual environment, only 2D location information – not position-in-depth – seems to be automatically bound to object features, with depth information processed more similarly to other features than to 2D location. PMID:27468654
Finlayson, Nonie J; Golomb, Julie D
2016-10-01
A fundamental aspect of human visual perception is the ability to recognize and locate objects in the environment. Importantly, our environment is predominantly three-dimensional (3D), but while there is considerable research exploring the binding of object features and location, it is unknown how depth information interacts with features in the object binding process. A recent paradigm called the spatial congruency bias demonstrated that 2D location is fundamentally bound to object features, such that irrelevant location information biases judgments of object features, but irrelevant feature information does not bias judgments of location or other features. Here, using the spatial congruency bias paradigm, we asked whether depth is processed as another type of location, or more like other features. We initially found that depth cued by binocular disparity biased judgments of object color. However, this result seemed to be driven more by the disparity differences than the depth percept: Depth cued by occlusion and size did not bias color judgments, whereas vertical disparity information (with no depth percept) did bias color judgments. Our results suggest that despite the 3D nature of our visual environment, only 2D location information - not position-in-depth - seems to be automatically bound to object features, with depth information processed more similarly to other features than to 2D location. Copyright © 2016 Elsevier Ltd. All rights reserved.
Priming Contour-Deleted Images: Evidence for Intermediate Representations in Visual Object Recognition.
ERIC Educational Resources Information Center
Biederman, Irving; Cooper, Eric E.
1991-01-01
Speed and accuracy of identification of pictures of objects are facilitated by prior viewing. Contributions of image features, convex or concave components, and object models in a repetition priming task were explored in 2 studies involving 96 college students. Results provide evidence of intermediate representations in visual object recognition.…
Integration trumps selection in object recognition.
Saarela, Toni P; Landy, Michael S
2015-03-30
Finding and recognizing objects is a fundamental task of vision. Objects can be defined by several "cues" (color, luminance, texture, etc.), and humans can integrate sensory cues to improve detection and recognition [1-3]. Cortical mechanisms fuse information from multiple cues [4], and shape-selective neural mechanisms can display cue invariance by responding to a given shape independent of the visual cue defining it [5-8]. Selective attention, in contrast, improves recognition by isolating a subset of the visual information [9]. Humans can select single features (red or vertical) within a perceptual dimension (color or orientation), giving faster and more accurate responses to items having the attended feature [10, 11]. Attention elevates neural responses and sharpens neural tuning to the attended feature, as shown by studies in psychophysics and modeling [11, 12], imaging [13-16], and single-cell and neural population recordings [17, 18]. Besides single features, attention can select whole objects [19-21]. Objects are among the suggested "units" of attention because attention to a single feature of an object causes the selection of all of its features [19-21]. Here, we pit integration against attentional selection in object recognition. We find, first, that humans can integrate information near optimally from several perceptual dimensions (color, texture, luminance) to improve recognition. They cannot, however, isolate a single dimension even when the other dimensions provide task-irrelevant, potentially conflicting information. For object recognition, it appears that there is mandatory integration of information from multiple dimensions of visual experience. The advantage afforded by this integration, however, comes at the expense of attentional selection. Copyright © 2015 Elsevier Ltd. All rights reserved.
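A common benchmark for "near optimal" integration is the ideal-observer prediction for independent cues, in which the combined sensitivity is the square root of the sum of the squared single-cue sensitivities. The abstract does not state which model the authors used, so the Python sketch below is an assumption, and the function name `optimal_dprime` is hypothetical.

```python
from math import sqrt


def optimal_dprime(dprimes):
    """Ideal-observer sensitivity for independent cues: sqrt of the summed squares."""
    return sqrt(sum(d * d for d in dprimes))
```

Comparing an observer's measured multi-cue sensitivity with this prediction gives a rough index of how close integration is to optimal.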
Integration trumps selection in object recognition
Saarela, Toni P.; Landy, Michael S.
2015-01-01
Finding and recognizing objects is a fundamental task of vision. Objects can be defined by several “cues” (color, luminance, texture etc.), and humans can integrate sensory cues to improve detection and recognition [1–3]. Cortical mechanisms fuse information from multiple cues [4], and shape-selective neural mechanisms can display cue-invariance by responding to a given shape independent of the visual cue defining it [5–8]. Selective attention, in contrast, improves recognition by isolating a subset of the visual information [9]. Humans can select single features (red or vertical) within a perceptual dimension (color or orientation), giving faster and more accurate responses to items having the attended feature [10,11]. Attention elevates neural responses and sharpens neural tuning to the attended feature, as shown by studies in psychophysics and modeling [11,12], imaging [13–16], and single-cell and neural population recordings [17,18]. Besides single features, attention can select whole objects [19–21]. Objects are among the suggested “units” of attention because attention to a single feature of an object causes the selection of all of its features [19–21]. Here, we pit integration against attentional selection in object recognition. We find, first, that humans can integrate information near-optimally from several perceptual dimensions (color, texture, luminance) to improve recognition. They cannot, however, isolate a single dimension even when the other dimensions provide task-irrelevant, potentially conflicting information. For object recognition, it appears that there is mandatory integration of information from multiple dimensions of visual experience. The advantage afforded by this integration, however, comes at the expense of attentional selection. PMID:25802154
Coding visual features extracted from video sequences.
Baroffio, Luca; Cesana, Matteo; Redondi, Alessandro; Tagliasacchi, Marco; Tubaro, Stefano
2014-05-01
Visual features are successfully exploited in several applications (e.g., visual search, object recognition and tracking, etc.) due to their ability to efficiently represent image content. Several visual analysis tasks require features to be transmitted over a bandwidth-limited network, thus calling for coding techniques to reduce the required bit budget, while attaining a target level of efficiency. In this paper, we propose, for the first time, a coding architecture designed for local features (e.g., SIFT, SURF) extracted from video sequences. To achieve high coding efficiency, we exploit both spatial and temporal redundancy by means of intraframe and interframe coding modes. In addition, we propose a coding mode decision based on rate-distortion optimization. The proposed coding scheme can be conveniently adopted to implement the analyze-then-compress (ATC) paradigm in the context of visual sensor networks. That is, sets of visual features are extracted from video frames, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast to the traditional compress-then-analyze (CTA) paradigm, in which video sequences acquired at a node are compressed and then sent to a central unit for further processing. In this paper, we compare these coding paradigms using metrics that are routinely adopted to evaluate the suitability of visual features in the context of content-based retrieval, object recognition, and tracking. Experimental results demonstrate that, thanks to the significant coding gains achieved by the proposed coding scheme, ATC outperforms CTA with respect to all evaluation metrics.
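The coding mode decision based on rate-distortion optimization can be illustrated with the usual Lagrangian cost J = D + λR, where the mode with the lowest cost is selected. The Python sketch below shows that general technique, not the authors' implementation: the function name `choose_mode`, the mode labels, and the example numbers are hypothetical.

```python
def choose_mode(candidates, lam):
    """Pick the coding mode with the lowest Lagrangian cost D + lam * R.

    candidates: dict mapping a mode name to (distortion, rate_in_bits).
    lam: Lagrange multiplier trading distortion against rate (assumed given).
    """
    return min(candidates, key=lambda m: candidates[m][0] + lam * candidates[m][1])


# Hypothetical costs for coding one local feature descriptor.
modes = {"intra": (1.0, 100), "inter": (2.0, 40)}
print(choose_mode(modes, lam=0.01))  # bits are cheap, so the low-distortion mode wins
print(choose_mode(modes, lam=0.05))  # bits are expensive, so the low-rate mode wins
```

A larger λ pushes the decision toward the mode that spends fewer bits, which is the trade-off a bandwidth-limited sensor network has to tune.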
Estimated capacity of object files in visual short-term memory is not improved by retrieval cueing.
Saiki, Jun; Miyatsuji, Hirofumi
2009-03-23
Visual short-term memory (VSTM) has been claimed to maintain three to five feature-bound object representations. Some results showing smaller capacity estimates for feature binding memory have been interpreted as the effects of interference in memory retrieval. However, change-detection tasks may not properly evaluate complex feature-bound representations such as triple conjunctions in VSTM. To understand the general type of feature-bound object representation, evaluation of triple conjunctions is critical. To test whether interference occurs in memory retrieval for complete object file representations in a VSTM task, we cued retrieval in novel paradigms that directly evaluate the memory for triple conjunctions, in comparison with a simple change-detection task. In our multiple object permanence tracking displays, observers monitored for a switch in feature combination between objects during an occlusion period, and we found that a retrieval cue provided no benefit with the triple conjunction tasks, but significant facilitation with the change-detection task, suggesting that low capacity estimates of object file memory in VSTM reflect a limit on maintenance, not retrieval.
ERIC Educational Resources Information Center
Wood, Justin N.; Wood, Samantha M. W.
2018-01-01
How do newborns learn to recognize objects? According to temporal learning models in computational neuroscience, the brain constructs object representations by extracting smoothly changing features from the environment. To date, however, it is unknown whether newborns depend on smoothly changing features to build invariant object representations.…
ERIC Educational Resources Information Center
Vergauwe, Evie; Cowan, Nelson
2015-01-01
We compared two contrasting hypotheses of how multifeatured objects are stored in visual working memory (vWM); as integrated objects or as independent features. A new procedure was devised to examine vWM representations of several concurrently held objects and their features and our main measure was reaction time (RT), allowing an examination of…
Guidance of attention by information held in working memory.
Calleja, Marissa Ortiz; Rich, Anina N
2013-05-01
Information held in working memory (WM) can guide attention during visual search. The authors of recent studies have interpreted the effect of holding verbal labels in WM as guidance of visual attention by semantic information. In a series of experiments, we tested how attention is influenced by visual features versus category-level information about complex objects held in WM. Participants either memorized an object's image or its category. While holding this information in memory, they searched for a target in a four-object search display. On exact-match trials, the memorized item reappeared as a distractor in the search display. On category-match trials, another exemplar of the memorized item appeared as a distractor. On neutral trials, none of the distractors were related to the memorized object. We found attentional guidance in visual search on both exact-match and category-match trials in Experiment 1, in which the exemplars were visually similar. When we controlled for visual similarity among the exemplars by using four possible exemplars (Exp. 2) or by using two exemplars rated as being visually dissimilar (Exp. 3), we found attentional guidance only on exact-match trials when participants memorized the object's image. The same pattern of results held when the target was invariant (Exps. 2-3) and when the target was defined semantically and varied in visual features (Exp. 4). The findings of these experiments suggest that attentional guidance by WM requires active visual information.
Toward a Unified Theory of Visual Area V4
Roe, Anna W.; Chelazzi, Leonardo; Connor, Charles E.; Conway, Bevil R.; Fujita, Ichiro; Gallant, Jack L.; Lu, Haidong; Vanduffel, Wim
2016-01-01
Visual area V4 is a midtier cortical area in the ventral visual pathway. It is crucial for visual object recognition and has been a focus of many studies on visual attention. However, there is no unifying view of V4’s role in visual processing. Neither is there an understanding of how its role in feature processing interfaces with its role in visual attention. This review captures our current knowledge of V4, largely derived from electrophysiological and imaging studies in the macaque monkey. Based on recent discovery of functionally specific domains in V4, we propose that the unifying function of V4 circuitry is to enable selective extraction of specific functional domain-based networks, whether it be by bottom-up specification of object features or by top-down attentionally driven selection. PMID:22500626
Location-Unbound Color-Shape Binding Representations in Visual Working Memory.
Saiki, Jun
2016-02-01
The mechanism by which nonspatial features, such as color and shape, are bound in visual working memory, and the role of those features' location in their binding, remains unknown. In the current study, I modified a redundancy-gain paradigm to investigate these issues. A set of features was presented in a two-object memory display, followed by a single object probe. Participants judged whether the probe contained any features of the memory display, regardless of its location. Response time distributions revealed feature coactivation only when both features of a single object in the memory display appeared together in the probe, regardless of the response time benefit from the probe and memory objects sharing the same location. This finding suggests that a shared location is necessary in the formation of bound representations but unnecessary in their maintenance. Electroencephalography data showed that amplitude modulations reflecting location-unbound feature coactivation were different from those reflecting the location-sharing benefit, consistent with the behavioral finding that feature-location binding is unnecessary in the maintenance of color-shape binding. © The Author(s) 2015.
Gilchrist, Amanda L; Duarte, Audrey; Verhaeghen, Paul
2016-01-01
Research with younger adults has shown that retrospective cues can be used to orient top-down attention toward relevant items in working memory. We examined whether older adults could take advantage of these cues to improve memory performance. Younger and older adults were presented with visual arrays of five colored shapes; during maintenance, participants were presented either with an informative cue based on an object feature (here, object shape or color) that would be probed, or with an uninformative, neutral cue. Although older adults were less accurate overall, both age groups benefited from the presentation of an informative, feature-based cue relative to a neutral cue. Surprisingly, we also observed differences in the effectiveness of shape versus color cues and their effects upon post-cue memory load. These results suggest that older adults can use top-down attention to remove irrelevant items from visual working memory, provided that task-relevant features function as cues.
Scene analysis for effective visual search in rough three-dimensional-modeling scenes
NASA Astrophysics Data System (ADS)
Wang, Qi; Hu, Xiaopeng
2016-11-01
Visual search is a fundamental technology in the computer vision community. It is difficult to find an object in complex scenes when similar distractors are present in the background. We propose a target search method for rough three-dimensional-modeling scenes based on a visual salience theory and a camera imaging model. We define the salience of objects (or features) and explain how salience measurements of objects are calculated. We also present one type of search path that leads to the target through salient objects. Along the search path, once the previous objects are localized, the search region of each subsequent object shrinks; this region is calculated through the imaging model and an optimization method. The experimental results indicate that the proposed method is capable of resolving the ambiguities caused by distractors that share visual features with the target, improving search speed by over 50%.
Visual search, visual streams, and visual architectures.
Green, M
1991-10-01
Most psychological, physiological, and computational models of early vision suggest that retinal information is divided into a parallel set of feature modules. The dominant theories of visual search assume that these modules form a "blackboard" architecture: a set of independent representations that communicate only through a central processor. A review of research shows that blackboard-based theories, such as feature-integration theory, cannot easily explain the existing data. The experimental evidence is more consistent with a "network" architecture, which stresses that: (1) feature modules are directly connected to one another, (2) features and their locations are represented together, (3) feature detection and integration are not distinct processing stages, and (4) no executive control process, such as focal attention, is needed to integrate features. Attention is not a spotlight that synthesizes objects from raw features. Instead, it is better to conceptualize attention as an aperture which masks irrelevant visual information.
Category-based guidance of spatial attention during visual search for feature conjunctions.
Nako, Rebecca; Grubert, Anna; Eimer, Martin
2016-10-01
The question of whether alphanumerical category is involved in the control of attentional target selection during visual search remains a contentious issue. We tested whether category-based attentional mechanisms would guide the allocation of attention under conditions where targets were defined by a combination of alphanumerical category and a basic visual feature, and search displays could contain both targets and partially matching distractor objects. The N2pc component was used as an electrophysiological marker of attentional object selection in tasks where target objects were defined by a conjunction of color and category (Experiment 1) or shape and category (Experiment 2). Some search displays contained the target or a nontarget object that matched either the target color/shape or its category among 3 nonmatching distractors. In other displays, the target and a partially matching nontarget object appeared together. N2pc components were elicited not only by targets and by color- or shape-matching nontargets, but also by category-matching nontarget objects, even on trials where a target was present in the same display. On these trials, the summed N2pc components to the 2 types of partially matching nontargets were initially equal in size to the target N2pc, suggesting that attention was allocated simultaneously and independently to all objects with target-matching features during the early phase of attentional processing. Results demonstrate that alphanumerical category is a genuine guiding feature that can operate in parallel with color or shape information to control the deployment of attention during visual search.
Evidence for negative feature guidance in visual search is explained by spatial recoding.
Beck, Valerie M; Hollingworth, Andrew
2015-10-01
Theories of attention and visual search explain how attention is guided toward objects with known target features. But can attention be directed away from objects with a feature known to be associated only with distractors? Most studies have found that the demand to maintain the to-be-avoided feature in visual working memory biases attention toward matching objects rather than away from them. In contrast, Arita, Carlisle, and Woodman (2012) claimed that attention can be configured to selectively avoid objects that match a cued distractor color, and they reported evidence that this type of negative cue generates search benefits. However, the colors of the search array items in Arita et al. (2012) were segregated by hemifield (e.g., blue items on the left, red on the right), which allowed for a strategy of translating the feature-cue information into a simple spatial template (e.g., avoid right, or attend left). In the present study, we replicated the negative cue benefit using the Arita et al. (2012) method (albeit within a subset of participants who reliably used the color cues to guide attention). Then, we eliminated the benefit by using search arrays that could not be grouped by hemifield. Our results suggest that feature-guided avoidance is implemented only indirectly, in this case by translating feature-cue information into a spatial template.
Qiao, Hong; Li, Yinlin; Li, Fengfu; Xi, Xuanyang; Wu, Wei
2016-10-01
Recently, many biologically inspired visual computational models have been proposed. The design of these models follows the related biological mechanisms and structures, and these models provide new solutions for visual recognition tasks. In this paper, based on recent biological evidence, we propose a framework to mimic the active and dynamic learning and recognition process of the primate visual cortex. In terms of principle, the main contribution is that the framework can achieve unsupervised learning of episodic features (including key components and their spatial relations) and semantic features (semantic descriptions of the key components), which support higher level cognition of an object. In terms of performance, the advantages of the framework are as follows: 1) learning episodic features without supervision: for a class of objects without prior knowledge, the key components, their spatial relations and cover regions can be learned automatically through a deep neural network (DNN); 2) learning semantic features based on episodic features: within the cover regions of the key components, the semantic geometrical values of these components can be computed based on contour detection; 3) forming the general knowledge of a class of objects: the general knowledge of a class of objects can be formed, mainly including the key components, their spatial relations and average semantic values, which is a concise description of the class; and 4) achieving higher level cognition and dynamic updating: for a test image, the model can achieve classification and subclass semantic descriptions. Test samples with high confidence are then selected to dynamically update the whole model. Experiments are conducted on face images, and a good performance is achieved in each layer of the DNN and the semantic description learning process. Furthermore, the model can be generalized to recognition tasks of other objects with learning ability.
Huynh, Duong L; Tripathy, Srimant P; Bedell, Harold E; Ögmen, Haluk
2015-01-01
Human memory is content addressable, i.e., contents of the memory can be accessed using partial information about the bound features of a stored item. In this study, we used a cross-feature cuing technique to examine how the human visual system encodes, binds, and retains information about multiple stimulus features within a set of moving objects. We sought to characterize the roles of three different features (position, color, and direction of motion, the latter two of which are processed preferentially within the ventral and dorsal visual streams, respectively) in the construction and maintenance of object representations. We investigated the extent to which these features are bound together across the following processing stages: during stimulus encoding, sensory (iconic) memory, and visual short-term memory. Whereas all features examined here can serve as cues for addressing content, their effectiveness shows asymmetries and varies according to cue-report pairings and the stage of information processing and storage. Position-based indexing theories predict that position should be more effective as a cue compared to other features. While we found a privileged role for position as a cue at the stimulus-encoding stage, position was not the privileged cue at the sensory and visual short-term memory stages. Instead, the pattern that emerged from our findings is one that mirrors the parallel processing streams in the visual system. This stream-specific binding and cuing effectiveness manifests itself in all three stages of information processing examined here. Finally, we find that the Leaky Flask model proposed in our previous study is applicable to all three features.
Peripersonal space representation develops independently from visual experience.
Ricciardi, Emiliano; Menicagli, Dario; Leo, Andrea; Costantini, Marcello; Pietrini, Pietro; Sinigaglia, Corrado
2017-12-15
Our daily-life actions are typically driven by vision. When acting upon an object, we need to represent its visual features (e.g. shape, orientation, etc.) and to map them into our own peripersonal space. But what happens with people who have never had any visual experience? How can they map object features into their own peripersonal space? Do they do it differently from sighted agents? To tackle these questions, we carried out a series of behavioral experiments in sighted and congenitally blind subjects. We took advantage of a spatial alignment effect paradigm, which typically refers to a decrease of reaction times when subjects perform an action (e.g., a reach-to-grasp pantomime) congruent with that afforded by a presented object. To systematically examine peripersonal space mapping, we presented visual or auditory affording objects both within and outside subjects' reach. The results showed that sighted and congenitally blind subjects did not differ in mapping objects into their own peripersonal space. Strikingly, this mapping occurred also when objects were presented outside subjects' reach, but within the peripersonal space of another agent. This suggests that (the lack of) visual experience does not significantly affect the development of both one's own and others' peripersonal space representation.
Size matters: large objects capture attention in visual search.
Proulx, Michael J
2010-12-23
Can objects or events ever capture one's attention in a purely stimulus-driven manner? A recent review of the literature set out the criteria required to find stimulus-driven attentional capture independent of goal-directed influences, and concluded that no published study has satisfied those criteria. Here visual search experiments assessed whether an irrelevantly large object can capture attention. Capture of attention by this static visual feature was found. The results suggest that a large object can indeed capture attention in a stimulus-driven manner and independent of displaywide features of the task that might encourage a goal-directed bias for large items. It is concluded that these results are either consistent with the stimulus-driven criteria published previously or alternatively consistent with a flexible, goal-directed mechanism of saliency detection.
Selective Audiovisual Semantic Integration Enabled by Feature-Selective Attention.
Li, Yuanqing; Long, Jinyi; Huang, Biao; Yu, Tianyou; Wu, Wei; Li, Peijun; Fang, Fang; Sun, Pei
2016-01-13
An audiovisual object may contain multiple semantic features, such as the gender and emotional features of the speaker. Feature-selective attention and audiovisual semantic integration are two brain functions involved in the recognition of audiovisual objects. Humans often selectively attend to one or several features while ignoring the other features of an audiovisual object. Meanwhile, the human brain integrates semantic information from the visual and auditory modalities. However, how these two brain functions correlate with each other remains to be elucidated. In this functional magnetic resonance imaging (fMRI) study, we explored the neural mechanism by which feature-selective attention modulates audiovisual semantic integration. During the fMRI experiment, the subjects were presented with visual-only, auditory-only, or audiovisual dynamical facial stimuli and performed several feature-selective attention tasks. Our results revealed that a distribution of areas, including heteromodal areas and brain areas encoding attended features, may be involved in audiovisual semantic integration. Through feature-selective attention, the human brain may selectively integrate audiovisual semantic information from attended features by enhancing functional connectivity and thus regulating information flows from heteromodal areas to brain areas encoding the attended features.
Crowding with conjunctions of simple features.
Põder, Endel; Wagemans, Johan
2007-11-20
Several recent studies have related crowding with the feature integration stage in visual processing. In order to understand the mechanisms involved in this stage, it is important to use stimuli that have several features to integrate, and these features should be clearly defined and measurable. In this study, Gabor patches were used as target and distractor stimuli. The stimuli differed in three dimensions: spatial frequency, orientation, and color. A group of 3, 5, or 7 objects was presented briefly at 4 deg eccentricity of the visual field. The observers' task was to identify the object located in the center of the group. A strong effect of the number of distractors was observed, consistent with various spatial pooling models. The analysis of incorrect responses revealed that these were a mix of feature errors and mislocalizations of the target object. Feature errors were not purely random, but biased by the features of distractors. We propose a simple feature integration model that predicts most of the observed regularities.
Biologically Inspired Visual Model With Preliminary Cognition and Active Attention Adjustment.
Qiao, Hong; Xi, Xuanyang; Li, Yinlin; Wu, Wei; Li, Fengfu
2015-11-01
Recently, many computational models have been proposed to simulate the visual cognition process. For example, the hierarchical Max-Pooling (HMAX) model was proposed according to the hierarchical and bottom-up structure of V1 to V4 in the ventral pathway of primate visual cortex, which could achieve position- and scale-tolerant recognition. In our previous work, we introduced memory and association into the HMAX model to simulate the visual cognition process. In this paper, we improve our theoretical framework by mimicking a more elaborate structure and function of the primate visual cortex. We mainly focus on the new formation of memory and association in visual processing under different circumstances as well as preliminary cognition and active adjustment in the inferior temporal cortex, which are absent in the HMAX model. The main contributions of this paper are: 1) in the memory and association part, we apply deep convolutional neural networks to extract various episodic features of the objects since people use different features for object recognition. Moreover, to achieve a fast and robust recognition in the retrieval and association process, different types of features are stored in separated clusters and the feature binding of the same object is stimulated in a loop discharge manner and 2) in the preliminary cognition and active adjustment part, we introduce preliminary cognition to classify different types of objects since distinct neural circuits in a human brain are used for identification of various types of objects. Furthermore, active cognition adjustment of occlusion and orientation is implemented in the model to mimic the top-down effect in the human cognition process. Finally, our model is evaluated on two face databases, CAS-PEAL-R1 and AR. The results demonstrate that our model performs visual recognition efficiently, with a much lower memory storage requirement and better performance than the traditional purely computational methods.
ERIC Educational Resources Information Center
Hommuk, Karita; Bachmann, Talis
2009-01-01
The problem of feature binding has been examined under conditions of distributed attention or with spatially dispersed stimuli. We studied binding by asking whether selective attention to a feature of a masked object enables perceptual access to the other features of that object using conditions in which spatial attention was directed at a single…
Onboard Robust Visual Tracking for UAVs Using a Reliable Global-Local Object Model
Fu, Changhong; Duan, Ran; Kircali, Dogan; Kayacan, Erdal
2016-01-01
In this paper, we present a novel onboard robust visual algorithm for long-term arbitrary 2D and 3D object tracking using a reliable global-local object model for unmanned aerial vehicle (UAV) applications, e.g., autonomous tracking and chasing a moving target. The first main approach in this novel algorithm is the use of a global matching and local tracking approach. In other words, the algorithm initially finds feature correspondences in a way that an improved binary descriptor is developed for global feature matching and an iterative Lucas–Kanade optical flow algorithm is employed for local feature tracking. The second main module is the use of an efficient local geometric filter (LGF), which handles outlier feature correspondences based on a new forward-backward pairwise dissimilarity measure, thereby maintaining pairwise geometric consistency. In the proposed LGF module, a hierarchical agglomerative clustering, i.e., bottom-up aggregation, is applied using an effective single-link method. The third proposed module is a heuristic local outlier factor (to the best of our knowledge, it is utilized for the first time to deal with outlier features in a visual tracking application), which further maximizes the representation of the target object in which we formulate outlier feature detection as a binary classification problem with the output features of the LGF module. Extensive UAV flight experiments show that the proposed visual tracker achieves real-time frame rates of more than thirty-five frames per second on an i7 processor with 640 × 512 image resolution and outperforms the most popular state-of-the-art trackers favorably in terms of robustness, efficiency and accuracy. PMID:27589769
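As a rough illustration of the single-link clustering step mentioned in the abstract above, the sketch below groups 2D points by cutting a single-link hierarchy at a fixed distance. It is not the authors' LGF module: the function name `single_link_clusters`, the Euclidean distance, and the fixed `threshold` are all assumptions made for this example, and the paper's forward-backward pairwise dissimilarity measure is not reproduced here.

```python
import math

def single_link_clusters(points, threshold):
    """Group points so that pairs within `threshold` of each other share a cluster.

    Cutting a single-link dendrogram at `threshold` yields exactly the connected
    components of the graph linking pairs with distance <= threshold, so a
    union-find structure is enough for this illustration.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    # Largest cluster first; lists hold indices into `points`.
    return sorted(clusters.values(), key=len, reverse=True)
```

In an outlier-filtering setting, one plausible use (again an assumption, not a claim about the paper) would be to keep the largest cluster as the set of geometrically consistent correspondences and discard the rest.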
Neural Architecture for Feature Binding in Visual Working Memory.
Schneegans, Sebastian; Bays, Paul M
2017-04-05
Binding refers to the operation that groups different features together into objects. We propose a neural architecture for feature binding in visual working memory that employs populations of neurons with conjunction responses. We tested this model using cued recall tasks, in which subjects had to memorize object arrays composed of simple visual features (color, orientation, and location). After a brief delay, one feature of one item was given as a cue, and the observer had to report, on a continuous scale, one or two other features of the cued item. Binding failure in this task is associated with swap errors, in which observers report an item other than the one indicated by the cue. We observed that the probability of swapping two items strongly correlated with the items' similarity in the cue feature dimension, and found a strong correlation between swap errors occurring in spatial and nonspatial report. The neural model explains both swap errors and response variability as results of decoding noisy neural activity, and can account for the behavioral results in quantitative detail. We then used the model to compare alternative mechanisms for binding nonspatial features. We found the behavioral results fully consistent with a model in which nonspatial features are bound exclusively via their shared location, with no indication of direct binding between color and orientation. These results provide evidence for a special role of location in feature binding, and the model explains how this special role could be realized in the neural system. SIGNIFICANCE STATEMENT The problem of feature binding is of central importance in understanding the mechanisms of working memory. How do we remember not only that we saw a red and a round object, but that these features belong together to a single object rather than to different objects in our environment? Here we present evidence for a neural mechanism for feature binding in working memory, based on encoding of visual information by neurons that respond to the conjunction of features. We find clear evidence that nonspatial features are bound via space: we memorize directly where a color or an orientation appeared, but we memorize which color belonged with which orientation only indirectly by virtue of their shared location.
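To make the notion of a swap error in continuous report concrete, here is a minimal sketch that flags a response as a candidate swap when it lies closer, on a circular feature scale, to a non-cued item than to the cued one. This nearest-item rule is only a crude heuristic chosen for illustration; it is not the population-coding model described in the abstract, and the function names and the use of radians are assumptions.

```python
import math

def circular_error(a, b):
    """Signed angular difference a - b, wrapped to the interval [-pi, pi)."""
    return (a - b + math.pi) % (2 * math.pi) - math.pi

def nearest_item(response, item_values):
    """Index of the memorized item whose feature value is closest to the response."""
    return min(range(len(item_values)),
               key=lambda i: abs(circular_error(response, item_values[i])))

def is_candidate_swap(response, item_values, target_index):
    """True if the response is closer to a non-target item than to the cued target."""
    return nearest_item(response, item_values) != target_index
```

A rule like this cannot separate genuine swaps from noisy target responses that happen to land near another item, which is one reason model-based analyses are generally preferred over per-trial labeling.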
Atoms of recognition in human and computer vision.
Ullman, Shimon; Assif, Liav; Fetaya, Ethan; Harari, Daniel
2016-03-08
Discovering the visual features and representations used by the brain to recognize objects is a central problem in the study of vision. Recently, neural network models of visual object recognition, including biological and deep network models, have shown remarkable progress and have begun to rival human performance in some challenging tasks. These models are trained on image examples and learn to extract features and representations and to use them for categorization. It remains unclear, however, whether the representations and learning processes discovered by current models are similar to those used by the human visual system. Here we show, by introducing and using minimal recognizable images, that the human visual system uses features and processes that are not used by current models and that are critical for recognition. We found by psychophysical studies that at the level of minimal recognizable images a minute change in the image can have a drastic effect on recognition, thus identifying features that are critical for the task. Simulations then showed that current models cannot explain this sensitivity to precise feature configurations and, more generally, do not learn to recognize minimal images at a human level. The role of the features shown here is revealed uniquely at the minimal level, where the contribution of each feature is essential. A full understanding of the learning and use of such features will extend our understanding of visual recognition and its cortical mechanisms and will enhance the capacity of computational models to learn from visual experience and to deal with recognition and detailed image interpretation.
Qin, Lei; Snoussi, Hichem; Abdallah, Fahed
2014-01-01
We propose a novel approach for tracking an arbitrary object in video sequences for visual surveillance. The first contribution of this work is an automatic feature extraction method that is able to extract compact discriminative features from a feature pool before computing the region covariance descriptor. As the feature extraction method is adaptive to a specific object of interest, we refer to the region covariance descriptor computed using the extracted features as the adaptive covariance descriptor. The second contribution is a weakly supervised method for updating the object appearance model during tracking. The method performs a mean-shift clustering procedure among the tracking result samples accumulated during a period of time and selects a group of reliable samples for updating the object appearance model. As such, the object appearance model is kept up-to-date and is prevented from contamination even in case of tracking mistakes. We conducted comparative experiments on real-world video sequences, which confirmed the effectiveness of the proposed approaches. The tracking system that integrates the adaptive covariance descriptor and the clustering-based model updating method accomplished stable object tracking on challenging video sequences. PMID:24865883
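For readers unfamiliar with region covariance descriptors, the sketch below computes one for a small grayscale patch: each pixel contributes a feature vector, and the descriptor is the sample covariance matrix of those vectors. The feature set used here, (x, y, intensity), and the helper names are assumptions for illustration only; the adaptive feature selection described in the abstract is not shown.

```python
def pixel_features(patch):
    """Hypothetical feature choice: one [x, y, intensity] vector per pixel."""
    return [[x, y, patch[y][x]]
            for y in range(len(patch))
            for x in range(len(patch[0]))]

def region_covariance(features):
    """Sample covariance matrix (d x d) of a list of d-dimensional feature vectors.

    Requires at least two feature vectors because of the n - 1 normalization.
    """
    n = len(features)
    d = len(features[0])
    mean = [sum(f[k] for f in features) / n for k in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for f in features:
        for a in range(d):
            for b in range(d):
                cov[a][b] += (f[a] - mean[a]) * (f[b] - mean[b])
    return [[c / (n - 1) for c in row] for row in cov]
```

The resulting matrix has a fixed size regardless of the region's dimensions, which is what makes this kind of descriptor convenient for comparing candidate regions of different sizes.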
Good Features to Correlate for Visual Tracking
NASA Astrophysics Data System (ADS)
Gundogdu, Erhan; Alatan, A. Aydin
2018-05-01
In recent years, correlation filters have shown dominant and spectacular results for visual object tracking. The types of features employed in this family of trackers significantly affect the performance of visual tracking. The ultimate goal is to utilize robust features invariant to any kind of appearance change of the object, while predicting the object location as accurately as in the case of no appearance change. As deep learning based methods have emerged, the study of learning features for specific tasks has accelerated. For instance, discriminative visual tracking methods based on deep architectures have been studied with promising performance. Nevertheless, correlation filter based (CFB) trackers confine themselves to using pre-trained networks that were trained for the object classification problem. To this end, in this manuscript the problem of learning deep fully convolutional features for CFB visual tracking is formulated. In order to learn the proposed model, a novel and efficient backpropagation algorithm is presented based on the loss function of the network. The proposed learning framework enables the network model to be flexible for a custom design. Moreover, it alleviates the dependency on the network trained for classification. Extensive performance analysis shows the efficacy of the proposed custom design in the CFB tracking framework. By fine-tuning the convolutional parts of a state-of-the-art network and integrating this model into a CFB tracker, which is the top performing one of VOT2016, an 18% increase is achieved in terms of expected average overlap, and tracking failures are decreased by 25%, while maintaining the superiority over the state-of-the-art methods in the OTB-2013 and OTB-2015 tracking datasets.
Activity in human visual and parietal cortex reveals object-based attention in working memory.
Peters, Benjamin; Kaiser, Jochen; Rahm, Benjamin; Bledowski, Christoph
2015-02-25
Visual attention enables observers to select behaviorally relevant information based on spatial locations, features, or objects. Attentional selection is not limited to physically present visual information, but can also operate on internal representations maintained in working memory (WM) in service of higher-order cognition. However, little is known about whether attention to WM contents follows the same principles as attention to sensory stimuli. To address this question, we investigated in humans whether the typically observed effects of object-based attention in perception are also evident for object-based attentional selection of internal object representations in WM. In full accordance with effects in visual perception, the key behavioral and neuronal characteristics of object-based attention were observed in WM. Specifically, we found that reaction times were shorter when shifting attention to memory positions located on the currently attended object compared with equidistant positions on a different object. Furthermore, functional magnetic resonance imaging and multivariate pattern analysis of visuotopic activity in visual (areas V1-V4) and parietal cortex revealed that directing attention to one position of an object held in WM also enhanced brain activation for other positions on the same object, suggesting that attentional selection in WM activates the entire object. This study demonstrated that all characteristic features of object-based attention are present in WM and that attention in WM thus follows the same principles as in perception.
Neural Correlates of the Perception for Novel Objects
Zhang, Hao; Liu, Jia; Zhang, Qinglin
2013-01-01
Perception of novel objects is of enormous importance in our lives. People have to perceive or understand novel objects when seeing an original painting, admiring an unconventional construction, and using an inventive device. However, very little is known about neural mechanisms underlying the perception for novel objects. Perception of novel objects relies on the integration of unusual features of novel objects in order to identify what such objects are. In the present study, functional Magnetic Resonance Imaging (MRI) was employed to investigate neural correlates of perception of novel objects. The neuroimaging data on participants engaged in novel object viewing versus ordinary object viewing revealed that perception of novel objects involves significant activation in the left precuneus (Brodmann area 7) and the right visual cortex. The results suggest that the left precuneus is associated with the integration of unusual features of novel objects, while the right visual cortex is sensitive to the detection of such features. Our findings highlight the left precuneus as a crucial component of the neural circuitry underlying perception of novel objects. PMID:23646167
Intrinsic and contextual features in object recognition.
Schlangen, Derrick; Barenholtz, Elan
2015-01-28
The context in which an object is found can facilitate its recognition. Yet, it is not known how effective this contextual information is relative to the object's intrinsic visual features, such as color and shape. To address this, we performed four experiments using rendered scenes with novel objects. In each experiment, participants first performed a visual search task, searching for a uniquely shaped target object whose color and location within the scene was experimentally manipulated. We then tested participants' tendency to use their knowledge of the location and color information in an identification task when the objects' images were degraded due to blurring, thus eliminating the shape information. In Experiment 1, we found that, in the absence of any diagnostic intrinsic features, participants identified objects based purely on their locations within the scene. In Experiment 2, we found that participants combined an intrinsic feature, color, with contextual location in order to uniquely specify an object. In Experiment 3, we found that when an object's color and location information were in conflict, participants identified the object using both sources of information equally. Finally, in Experiment 4, we found that participants used whichever source of information (either color or location) was more statistically reliable in order to identify the target object. Overall, these experiments show that the context in which objects are found can play as important a role as intrinsic features in identifying the objects.
Object based implicit contextual learning: a study of eye movements.
van Asselen, Marieke; Sampaio, Joana; Pina, Ana; Castelo-Branco, Miguel
2011-02-01
Implicit contextual cueing refers to a top-down mechanism in which visual search is facilitated by learned contextual features. In the current study we aimed to investigate the mechanism underlying implicit contextual learning using object information as a contextual cue. Therefore, we measured eye movements during an object-based contextual cueing task. We demonstrated that visual search is facilitated by repeated object information and that this reduction in response times is associated with shorter fixation durations. This indicates that by memorizing associations between objects in our environment we can recognize objects faster, thereby facilitating visual search.
Cross-Modal Retrieval With CNN Visual Features: A New Baseline.
Wei, Yunchao; Zhao, Yao; Lu, Canyi; Wei, Shikui; Liu, Luoqi; Zhu, Zhenfeng; Yan, Shuicheng
2017-02-01
Recently, convolutional neural network (CNN) visual features have demonstrated their powerful ability as a universal representation for various recognition tasks. In this paper, cross-modal retrieval with CNN visual features is implemented with several classic methods. Specifically, off-the-shelf CNN visual features are extracted from the CNN model, which is pretrained on ImageNet with more than one million images from 1000 object categories, as a generic image representation to tackle cross-modal retrieval. To further enhance the representational ability of CNN visual features, based on the pretrained CNN model on ImageNet, a fine-tuning step is performed by using the open source Caffe CNN library for each target data set. Besides, we propose a deep semantic matching method to address the cross-modal retrieval problem with respect to samples which are annotated with one or multiple labels. Extensive experiments on five popular publicly available data sets well demonstrate the superiority of CNN visual features for cross-modal retrieval.
Stojanoski, Bobby Boge; Niemeier, Matthias
2015-10-01
It is well known that visual expectation and attention modulate object perception. Yet, the mechanisms underlying these top-down influences are not completely understood. Event-related potentials (ERPs) indicate late contributions of expectations to object processing around the P2 or N2. This is true independent of whether people expect objects (vs. no objects) or specific shapes, hence when expectations pertain to complex visual features. However, object perception can also benefit from expecting colour information, which can facilitate figure/ground segregation. Studies on attention to colour show attention-sensitive modulations of the P1, but are limited to simple transient detection paradigms. The aim of the current study was to examine whether expecting simple features (colour information) during challenging object perception tasks produces early or late ERP modulations. We told participants to expect an object defined by predominantly black or white lines that were embedded in random arrays of distractor lines and then asked them to report the object's shape. Performance was better when colour expectations were met. ERPs revealed early and late phases of modulation. An early modulation at the P1/N1 transition arguably reflected earlier stages of object processing. Later modulations, at the P3, could be consistent with decisional processes. These results provide novel insights into feature-specific contributions of visual expectations to object perception.
The Comparison of Visual Working Memory Representations with Perceptual Inputs
Hyun, Joo-seok; Woodman, Geoffrey F.; Vogel, Edward K.; Hollingworth, Andrew
2008-01-01
The human visual system can notice differences between memories of previous visual inputs and perceptions of new visual inputs, but the comparison process that detects these differences has not been well characterized. This study tests the hypothesis that differences between the memory of a stimulus array and the perception of a new array are detected in a manner that is analogous to the detection of simple features in visual search tasks. That is, just as the presence of a task-relevant feature in visual search can be detected in parallel, triggering a rapid shift of attention to the object containing the feature, the presence of a memory-percept difference along a task-relevant dimension can be detected in parallel, triggering a rapid shift of attention to the changed object. Supporting evidence was obtained in a series of experiments that examined manual reaction times, saccadic reaction times, and event-related potential latencies. However, these experiments also demonstrated that a slow, limited-capacity process must occur before the observer can make a manual change-detection response. PMID:19653755
Emerging category representation in the visual forebrain hierarchy of pigeons (Columba livia).
Azizi, Amir Hossein; Pusch, Roland; Koenen, Charlotte; Klatt, Sebastian; Bröcker, Franziska; Thiele, Samuel; Kellermann, Janosch; Güntürkün, Onur; Cheng, Sen
2018-06-06
Recognizing and categorizing visual stimuli are cognitive functions vital for survival, and an important feature of visual systems in primates as well as in birds. Visual stimuli are processed along the ventral visual pathway. At every stage in the hierarchy, neurons respond selectively to more complex features, transforming the population representation of the stimuli. It is therefore easier to read-out category information in higher visual areas. While explicit category representations have been observed in the primate brain, less is known on equivalent processes in the avian brain. Even though their brain anatomies are radically different, it has been hypothesized that visual object representations are comparable across mammals and birds. In the present study, we investigated category representations in the pigeon visual forebrain using recordings from single cells responding to photographs of real-world objects. Using a linear classifier, we found that the population activity in the visual associative area mesopallium ventrolaterale (MVL) distinguishes between animate and inanimate objects, although this distinction is not required by the task. By contrast, a population of cells in the entopallium, a region that is lower in the hierarchy of visual areas and that is related to the primate extrastriate cortex, lacked this information. A model that pools responses of simple cells, which function as edge detectors, can account for the animate vs. inanimate categorization in the MVL, but performance in the model is based on different features than in MVL. Therefore, processing in MVL cells is very likely more abstract than simple computations on the output of edge detectors. Copyright © 2018. Published by Elsevier B.V.
ERIC Educational Resources Information Center
Nordfang, Maria; Dyrholm, Mads; Bundesen, Claus
2013-01-01
The attentional weight of a visual object depends on the contrast of the features of the object to its local surroundings (feature contrast) and the relevance of the features to one's goals (feature relevance). We investigated the dependency in partial report experiments with briefly presented stimuli but unspeeded responses. The task was to…
Wang, Xin; Deng, Zhongliang
2017-01-01
In order to recognize indoor scenarios, we extract image features for detecting objects; however, computers can make some unexpected mistakes. After visualizing the histogram of oriented gradient (HOG) features, we find that the world through the eyes of a computer is indeed different from human eyes, which helps researchers see the reasons that cause a computer to make errors. Additionally, according to the visualization, we notice that the HOG features can obtain rich texture information. However, a large amount of background interference is also introduced. In order to enhance the robustness of the HOG feature, we propose an improved method for suppressing the background interference. On the basis of the original HOG feature, we introduce a principal component analysis (PCA) to extract the principal components of the image colour information. Then, a new hybrid feature descriptor, which is named HOG–PCA (HOGP), is made by deeply fusing these two features. Finally, the HOGP is compared to the state-of-the-art HOG feature descriptor in four scenes under different illumination. In the simulation and experimental tests, the qualitative and quantitative assessments indicate that the visualized images of the HOGP feature are close to the observation results obtained by human eyes, which is better than the original HOG feature for object detection. Furthermore, the runtime of our proposed algorithm is hardly increased in comparison to the classic HOG feature. PMID:28677635
If it's not there, where is it? Locating illusory conjunctions.
Hazeltine, R E; Prinzmetal, W; Elliott, W
1997-02-01
There is evidence that complex objects are decomposed by the visual system into features, such as shape and color. Consistent with this theory is the phenomenon of illusory conjunctions, which occur when features are incorrectly combined to form an illusory object. We analyzed the perceived location of illusory conjunctions to study the roles of color and shape in the location of visual objects. In Experiments 1 and 2, participants located illusory conjunctions about halfway between the veridical locations of the component features. Experiment 3 showed that the distribution of perceived locations was not the mixture of two distributions centered at the 2 feature locations. Experiment 4 replicated these results with an identification task rather than a detection task. We concluded that the locations of illusory conjunctions were not arbitrary but were determined by both constituent shape and color.
[Visual Texture Agnosia in Humans].
Suzuki, Kyoko
2015-06-01
Visual object recognition requires the processing of both geometric and surface properties. Patients with occipital lesions may have visual agnosia, which is impairment in the recognition and identification of visually presented objects primarily through their geometric features. An analogous condition involving the failure to recognize an object by its texture may exist, which can be called visual texture agnosia. Here we present two cases with visual texture agnosia. Case 1 had left homonymous hemianopia and right upper quadrantanopia, along with achromatopsia, prosopagnosia, and texture agnosia, because of damage to his left ventromedial occipitotemporal cortex and right lateral occipito-temporo-parietal cortex due to multiple cerebral embolisms. Although he showed difficulty matching and naming textures of real materials, he could readily name visually presented objects by their contours. Case 2 had right lower quadrantanopia, along with impairment in stereopsis and recognition of texture in 2D images, because of subcortical hemorrhage in the left occipitotemporal region. He failed to recognize shapes based on texture information, whereas shape recognition based on contours was well preserved. Our findings, along with those of three reported cases with texture agnosia, indicate that there are separate channels for processing texture, color, and geometric features, and that the regions around the left collateral sulcus are crucial for texture processing.
Modulation of neuronal responses during covert search for visual feature conjunctions
Buracas, Giedrius T.; Albright, Thomas D.
2009-01-01
While searching for an object in a visual scene, an observer's attentional focus and eye movements are often guided by information about object features and spatial locations. Both spatial and feature-specific attention are known to modulate neuronal responses in visual cortex, but little is known of the dynamics and interplay of these mechanisms as visual search progresses. To address this issue, we recorded from directionally selective cells in visual area MT of monkeys trained to covertly search for targets defined by a unique conjunction of color and motion features and to signal target detection with an eye movement to the putative target. Two patterns of response modulation were observed. One pattern consisted of enhanced responses to targets presented in the receptive field (RF). These modulations occurred at the end-stage of search and were more potent during correct target identification than during erroneous saccades to a distractor in RF, thus suggesting that this modulation is not a mere presaccadic enhancement. A second pattern of modulation was observed when RF stimuli were nontargets that shared a feature with the target. The latter effect was observed during early stages of search and is consistent with a global feature-specific mechanism. This effect often terminated before target identification, thus suggesting that it interacts with spatial attention. This modulation was exhibited not only for motion but also for color cue, although MT neurons are known to be insensitive to color. Such cue-invariant attentional effects may contribute to a feature binding mechanism acting across visual dimensions. PMID:19805385
Selective Audiovisual Semantic Integration Enabled by Feature-Selective Attention
Li, Yuanqing; Long, Jinyi; Huang, Biao; Yu, Tianyou; Wu, Wei; Li, Peijun; Fang, Fang; Sun, Pei
2016-01-01
An audiovisual object may contain multiple semantic features, such as the gender and emotional features of the speaker. Feature-selective attention and audiovisual semantic integration are two brain functions involved in the recognition of audiovisual objects. Humans often selectively attend to one or several features while ignoring the other features of an audiovisual object. Meanwhile, the human brain integrates semantic information from the visual and auditory modalities. However, how these two brain functions correlate with each other remains to be elucidated. In this functional magnetic resonance imaging (fMRI) study, we explored the neural mechanism by which feature-selective attention modulates audiovisual semantic integration. During the fMRI experiment, the subjects were presented with visual-only, auditory-only, or audiovisual dynamical facial stimuli and performed several feature-selective attention tasks. Our results revealed that a distribution of areas, including heteromodal areas and brain areas encoding attended features, may be involved in audiovisual semantic integration. Through feature-selective attention, the human brain may selectively integrate audiovisual semantic information from attended features by enhancing functional connectivity and thus regulating information flows from heteromodal areas to brain areas encoding the attended features. PMID:26759193
Obligatory encoding of task-irrelevant features depletes working memory resources.
Marshall, Louise; Bays, Paul M
2013-02-18
Selective attention is often considered the "gateway" to visual working memory (VWM). However, the extent to which we can voluntarily control which of an object's features enter memory remains subject to debate. Recent research has converged on the concept of VWM as a limited commodity distributed between elements of a visual scene. Consequently, as memory load increases, the fidelity with which each visual feature is stored decreases. Here we used changes in recall precision to probe whether task-irrelevant features were encoded into VWM when individuals were asked to store specific feature dimensions. Recall precision for both color and orientation was significantly enhanced when task-irrelevant features were removed, but knowledge of which features would be probed provided no advantage over having to memorize both features of all items. Next, we assessed the effect an interpolated orientation-or color-matching task had on the resolution with which orientations in a memory array were stored. We found that the presence of orientation information in the second array disrupted memory of the first array. The cost to recall precision was identical whether the interfering features had to be remembered, attended to, or could be ignored. Therefore, it appears that storing, or merely attending to, one feature of an object is sufficient to promote automatic encoding of all its features, depleting VWM resources. However, the precision cost was abolished when the match task preceded the memory array. So, while encoding is automatic, maintenance is voluntary, allowing resources to be reallocated to store new visual information.
Real-time reliability measure-driven multi-hypothesis tracking using 2D and 3D features
NASA Astrophysics Data System (ADS)
Zúñiga, Marcos D.; Brémond, François; Thonnat, Monique
2011-12-01
We propose a new multi-target tracking approach, which is able to reliably track multiple objects even with poor segmentation results due to noisy environments. The approach takes advantage of a new dual object model combining 2D and 3D features through reliability measures. In order to obtain these 3D features, a new classifier associates an object class label to each moving region (e.g. person, vehicle), a parallelepiped model and visual reliability measures of its attributes. These reliability measures make it possible to properly weight the contribution of noisy, erroneous or false data in order to better maintain the integrity of the object dynamics model. Then, a new multi-target tracking algorithm uses these object descriptions to generate tracking hypotheses about the objects moving in the scene. This tracking approach is able to manage many-to-many visual target correspondences. To achieve this, the algorithm takes advantage of 3D models for merging dissociated visual evidence (moving regions) potentially corresponding to the same real object, according to previously obtained information. The tracking approach has been validated using publicly accessible video surveillance benchmarks. The obtained performance is real time and the results are competitive compared with other tracking algorithms, with minimal (or null) reconfiguration effort between different videos.
Mishra, Jyoti; Zanto, Theodore; Nilakantan, Aneesha; Gazzaley, Adam
2013-01-01
Intrasensory interference during visual working memory (WM) maintenance by object stimuli (such as faces and scenes) has been shown to negatively impact WM performance, with greater detrimental impacts of interference observed in aging. Here we assessed age-related impacts by intrasensory WM interference from lower-level stimulus features such as visual and auditory motion stimuli. We consistently found that interference in the form of ignored distractions and secondary task interruptions presented during a WM maintenance period degraded memory accuracy in both the visual and auditory domain. However, in contrast to prior studies assessing WM for visual object stimuli, feature-based interference effects were not observed to be significantly greater in older adults. Analyses of neural oscillations in the alpha frequency band further revealed preserved mechanisms of interference processing in terms of post-stimulus alpha suppression, which was observed maximally for secondary task interruptions in visual and auditory modalities in both younger and older adults. These results suggest that age-related sensitivity of WM to interference may be limited to complex object stimuli, at least at low WM loads. PMID:23791629
NASA Astrophysics Data System (ADS)
Khaustova, Dar'ya; Fournier, Jérôme; Wyckens, Emmanuel; Le Meur, Olivier
2014-02-01
The aim of this research is to understand the difference in visual attention to 2D and 3D content depending on texture and amount of depth. Two experiments were conducted using an eye-tracker and a 3DTV display. Collected fixation data were used to build saliency maps and to analyze the differences between 2D and 3D conditions. In the first experiment 51 observers participated in the test. Using scenes that contained objects with crossed disparity, it was discovered that such objects are the most salient, even if observers experience discomfort due to the high level of disparity. The goal of the second experiment is to decide whether depth is a determinative factor for visual attention. During the experiment, 28 observers watched the scenes that contained objects with crossed and uncrossed disparities. We evaluated features influencing the saliency of the objects in stereoscopic conditions by using contents with low-level visual features. With univariate tests of significance (MANOVA), it was detected that texture is more important than depth for selection of objects. Objects with crossed disparity are significantly more important for selection processes when compared to 2D. However, objects with uncrossed disparity have the same influence on visual attention as 2D objects. Analysis of eye movements indicated that there is no difference in saccade length. Fixation durations were significantly higher in stereoscopic conditions for low-level stimuli than in 2D. We believe that these experiments can help to refine existing models of visual attention for 3D content.
Recovery of a crowded object by masking the flankers: Determining the locus of feature integration
Chakravarthi, Ramakrishna; Cavanagh, Patrick
2009-01-01
Object recognition is a central function of the visual system. As a first step, the features of an object are registered; these independently encoded features are then bound together to form a single representation. Here we investigate the locus of this “feature integration” by examining crowding, a striking breakdown of this process. Crowding, an inability to identify a peripheral target surrounded by flankers, results from “excessive integration” of target and flanker features. We presented a standard crowding display with a target C flanked by four flanker C's in the periphery. We then masked only the flankers (but not the target) with one of three kinds of masks—noise, metacontrast, and object substitution—each of which interferes at progressively higher levels of visual processing. With noise and metacontrast masks (low-level masking), the crowded target was recovered, whereas with object substitution masks (high-level masking), it was not. This places a clear upper bound on the locus of interference in crowding suggesting that crowding is not a low-level phenomenon. We conclude that feature integration, which underlies crowding, occurs prior to the locus of object substitution masking. Further, our results indicate that the integrity of the flankers, but not their identification, is crucial for crowding to occur. PMID:19810785
Zelinsky, Gregory J; Peng, Yifan; Berg, Alexander C; Samaras, Dimitris
2013-10-08
Search is commonly described as a repeating cycle of guidance to target-like objects, followed by the recognition of these objects as targets or distractors. Are these indeed separate processes using different visual features? We addressed this question by comparing observer behavior to that of support vector machine (SVM) models trained on guidance and recognition tasks. Observers searched for a categorically defined teddy bear target in four-object arrays. Target-absent trials consisted of random category distractors rated in their visual similarity to teddy bears. Guidance, quantified as first-fixated objects during search, was strongest for targets, followed by target-similar, medium-similarity, and target-dissimilar distractors. False positive errors to first-fixated distractors also decreased with increasing dissimilarity to the target category. To model guidance, nine teddy bear detectors, using features ranging in biological plausibility, were trained on unblurred bears then tested on blurred versions of the same objects appearing in each search display. Guidance estimates were based on target probabilities obtained from these detectors. To model recognition, nine bear/nonbear classifiers, trained and tested on unblurred objects, were used to classify the object that would be fixated first (based on the detector estimates) as a teddy bear or a distractor. Patterns of categorical guidance and recognition accuracy were modeled almost perfectly by an HMAX model in combination with a color histogram feature. We conclude that guidance and recognition in the context of search are not separate processes mediated by different features, and that what the literature knows as guidance is really recognition performed on blurred objects viewed in the visual periphery.
Newborn chickens generate invariant object representations at the onset of visual object experience
Wood, Justin N.
2013-01-01
To recognize objects quickly and accurately, mature visual systems build invariant object representations that generalize across a range of novel viewing conditions (e.g., changes in viewpoint). To date, however, the origins of this core cognitive ability have not yet been established. To examine how invariant object recognition develops in a newborn visual system, I raised chickens from birth for 2 weeks within controlled-rearing chambers. These chambers provided complete control over all visual object experiences. In the first week of life, subjects’ visual object experience was limited to a single virtual object rotating through a 60° viewpoint range. In the second week of life, I examined whether subjects could recognize that virtual object from novel viewpoints. Newborn chickens were able to generate viewpoint-invariant representations that supported object recognition across large, novel, and complex changes in the object’s appearance. Thus, newborn visual systems can begin building invariant object representations at the onset of visual object experience. These abstract representations can be generated from sparse data, in this case from a visual world containing a single virtual object seen from a limited range of viewpoints. This study shows that powerful, robust, and invariant object recognition machinery is an inherent feature of the newborn brain. PMID:23918372
Spike synchrony reveals emergence of proto-objects in visual cortex.
Martin, Anne B; von der Heydt, Rüdiger
2015-04-29
Neurons at early stages of the visual cortex signal elemental features, such as pieces of contour, but how these signals are organized into perceptual objects is unclear. Theories have proposed that spiking synchrony between these neurons encodes how features are grouped (binding-by-synchrony), but recent studies did not find the predicted increase in synchrony with binding. Here we propose that features are grouped to "proto-objects" by intrinsic feedback circuits that enhance the responses of the participating feature neurons. This hypothesis predicts synchrony exclusively between feature neurons that receive feedback from the same grouping circuit. We recorded from neurons in macaque visual cortex and used border-ownership selectivity, an intrinsic property of the neurons, to infer whether or not two neurons are part of the same grouping circuit. We found that binding produced synchrony between same-circuit neurons, but not between other pairs of neurons, as predicted by the grouping hypothesis. In a selective attention task, synchrony emerged with ignored as well as attended objects, and higher synchrony was associated with faster behavioral responses, as would be expected from early grouping mechanisms that provide the structure for object-based processing. Thus, synchrony could be produced by automatic activation of intrinsic grouping circuits. However, the binding-related elevation of synchrony was weak compared with its random fluctuations, arguing against synchrony as a code for binding. In contrast, feedback grouping circuits encode binding by modulating the response strength of related feature neurons. Thus, our results suggest a novel coding mechanism that might underlie the proto-objects of perception. Copyright © 2015 the authors.
van den Berg, Ronald; Roerdink, Jos B T M; Cornelissen, Frans W
2010-01-22
An object in the peripheral visual field is more difficult to recognize when surrounded by other objects. This phenomenon is called "crowding". Crowding places a fundamental constraint on human vision that limits performance on numerous tasks. It has been suggested that crowding results from spatial feature integration necessary for object recognition. However, in the absence of convincing models, this theory has remained controversial. Here, we present a quantitative and physiologically plausible model for spatial integration of orientation signals, based on the principles of population coding. Using simulations, we demonstrate that this model coherently accounts for fundamental properties of crowding, including critical spacing, "compulsory averaging", and a foveal-peripheral anisotropy. Moreover, we show that the model predicts increased responses to correlated visual stimuli. Altogether, these results suggest that crowding has little immediate bearing on object recognition but is a by-product of a general, elementary integration mechanism in early vision aimed at improving signal quality.
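The idea of "compulsory averaging" through pooled orientation signals can be illustrated with a generic population-vector sketch. This is not the authors' model: the von Mises tuning curve, the number of neurons, the tuning width `KAPPA`, and the function names `population_response`, `integrate`, and `decode` are all assumptions made for illustration. It only shows that when the population responses to a target and a flanker are summed, the decoded orientation lies between the two.

```python
import numpy as np

# Hypothetical parameters, chosen for illustration only.
N_NEURONS = 180
KAPPA = 2.0  # tuning sharpness (assumed)
PREFERRED = np.arange(N_NEURONS) * 180.0 / N_NEURONS  # degrees in [0, 180)


def population_response(theta_deg):
    """Responses of all neurons to one oriented stimulus (von Mises tuning)."""
    # Orientation has a period of 180 degrees, so angles are doubled.
    delta = np.deg2rad(2.0 * (theta_deg - PREFERRED))
    return np.exp(KAPPA * (np.cos(delta) - 1.0))


def integrate(thetas_deg, weights):
    """Weighted sum of the population responses to several stimuli."""
    return sum(w * population_response(t) for t, w in zip(thetas_deg, weights))


def decode(response):
    """Population-vector readout, returned in degrees in [0, 180)."""
    angles = np.deg2rad(2.0 * PREFERRED)
    x = np.sum(response * np.cos(angles))
    y = np.sum(response * np.sin(angles))
    return float(np.mod(np.rad2deg(np.arctan2(y, x)) / 2.0, 180.0))


# Target at 80 degrees, flanker at 100 degrees, pooled with equal weight.
pooled_estimate = decode(integrate([80.0, 100.0], [1.0, 1.0]))
```

With equal weights the readout falls at the midpoint of the two orientations; giving the flanker a smaller weight pulls the estimate back toward the target, which is one simple way to mimic a dependence on target-flanker spacing.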
The Semiotic Structure of Geometry Diagrams: How Textbook Diagrams Convey Meaning
ERIC Educational Resources Information Center
Dimmel, Justin K.; Herbst, Patricio G.
2015-01-01
Geometry diagrams use the visual features of specific drawn objects to convey meaning about generic mathematical entities. We examine the semiotic structure of these visual features in two parts. One, we conduct a semiotic inquiry to conceptualize geometry diagrams as mathematical texts that comprise choices from different semiotic systems. Two,…
Parallel Distractor Rejection as a Binding Mechanism in Search
Dent, Kevin; Allen, Harriet A.; Braithwaite, Jason J.; Humphreys, Glyn W.
2012-01-01
The relatively common experimental visual search task of finding a red X amongst red O’s and green X’s (conjunction search) presents the visual system with a binding problem. Illusory conjunctions (ICs) of features across objects must be avoided and only features present in the same object bound together. Correct binding into unique objects by the visual system may be promoted, and ICs minimized, by inhibiting the locations of distractors possessing non-target features (e.g., Treisman and Sato, 1990). Such parallel rejection of interfering distractors leaves the target as the only item competing for selection; thus solving the binding problem. In the present article we explore the theoretical and empirical basis of this process of active distractor inhibition in search. Specific experiments that provide strong evidence for a process of active distractor inhibition in search are highlighted. In the final part of the article we consider how distractor inhibition, as defined here, may be realized at a neurophysiological level (Treisman and Sato, 1990). PMID:22908002
Illusory conjunctions in simultanagnosia: coarse coding of visual feature location?
McCrea, Simon M; Buxbaum, Laurel J; Coslett, H Branch
2006-01-01
Simultanagnosia is a disorder characterized by an inability to see more than one object at a time. We report a simultanagnosic patient (ED) with bilateral posterior infarctions who produced frequent illusory conjunctions on tasks involving form and surface features (e.g., a red T) and form alone. ED also produced "blend" errors in which features of one familiar perceptual unit appeared to migrate to another familiar perceptual unit (e.g., "RO" read as "PQ"). ED often misread scrambled letter strings as a familiar word (e.g., "hmoe" read as "home"). Finally, ED's success in reporting two letters in an array was inversely related to the distance between the letters. These findings are consistent with the hypothesis that ED's illusory conjunctions reflect coarse coding of visual feature location that is ameliorated in part by top-down information from object and word recognition systems; the findings are also consistent, however, with Treisman's Feature Integration Theory. Finally, the data provide additional support for the claim that the dorsal parieto-occipital cortex is implicated in the binding of visual feature information.
Object-based benefits without object-based representations.
Fougnie, Daryl; Cormiea, Sarah M; Alvarez, George A
2013-08-01
Influential theories of visual working memory have proposed that the basic units of memory are integrated object representations. Key support for this proposal is provided by the same object benefit: It is easier to remember multiple features of a single object than the same set of features distributed across multiple objects. Here, we replicate the object benefit but demonstrate that features are not stored as single, integrated representations. Specifically, participants could remember 10 features better when arranged in 5 objects compared to 10 objects, yet memory for one object feature was largely independent of memory for the other object feature. These results rule out the possibility that integrated representations drive the object benefit and require a revision of the concept of object-based memory representations. We propose that working memory is object-based in regard to the factors that enhance performance but feature based in regard to the level of representational failure.
Deterministic object tracking using Gaussian ringlet and directional edge features
NASA Astrophysics Data System (ADS)
Krieger, Evan W.; Sidike, Paheding; Aspiras, Theus; Asari, Vijayan K.
2017-10-01
Challenges currently existing for intensity-based histogram feature tracking methods in wide area motion imagery (WAMI) data include object structural information distortions, background variations, and object scale change. These issues are caused by different pavement or ground types and from changing the sensor or altitude. All of these challenges need to be overcome in order to have a robust object tracker, while attaining a computation time appropriate for real-time processing. To achieve this, we present a novel method, Directional Ringlet Intensity Feature Transform (DRIFT), which employs Kirsch kernel filtering for edge features and a ringlet feature mapping for rotational invariance. The method also includes an automatic scale change component to obtain accurate object boundaries and improvements for lowering computation times. We evaluated the DRIFT algorithm on two challenging WAMI datasets, namely Columbus Large Image Format (CLIF) and Large Area Image Recorder (LAIR), to assess its robustness and efficiency. Additional evaluations on general tracking video sequences are performed using the Visual Tracker Benchmark and Visual Object Tracking 2014 databases to demonstrate the algorithm's ability with additional challenges in long complex sequences including scale change. Experimental results show that the proposed approach yields competitive results compared to state-of-the-art object tracking methods on the testing datasets.
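The Kirsch edge-filtering step named in the abstract above can be illustrated with a minimal sketch. It builds the eight standard 3x3 Kirsch compass kernels and keeps the strongest directional response at each pixel. This is not the DRIFT implementation: the function names (`kirsch_kernels`, `kirsch_edge_magnitude`) and the border-replication padding are assumptions made only for illustration, and the ringlet mapping and scale-change components are not shown.

```python
import numpy as np

def kirsch_kernels():
    """Return the eight 3x3 Kirsch compass kernels as an (8, 3, 3) array."""
    # Border positions of a 3x3 window, listed clockwise from the top-left.
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    # Three consecutive 5s followed by five -3s; the centre weight stays 0.
    base = [5, 5, 5, -3, -3, -3, -3, -3]
    kernels = np.zeros((8, 3, 3))
    for shift in range(8):
        values = np.roll(base, shift)  # rotate the weights around the ring
        for (r, c), v in zip(ring, values):
            kernels[shift, r, c] = v
    return kernels

def kirsch_edge_magnitude(image):
    """Maximum Kirsch response over the eight directions at every pixel."""
    image = np.asarray(image, dtype=float)
    padded = np.pad(image, 1, mode="edge")  # assumption: replicate borders
    h, w = image.shape
    responses = np.zeros((8, h, w))
    for k, kernel in enumerate(kirsch_kernels()):
        for dr in range(3):
            for dc in range(3):
                # Correlate by accumulating weighted, shifted copies of the image.
                responses[k] += kernel[dr, dc] * padded[dr:dr + h, dc:dc + w]
    return responses.max(axis=0)
```

Because each kernel's weights sum to zero, a uniform region produces no response, while a step in intensity produces a strong response in the kernel aligned with it.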
Mid-level perceptual features contain early cues to animacy.
Long, Bria; Störmer, Viola S; Alvarez, George A
2017-06-01
While substantial work has focused on how the visual system achieves basic-level recognition, less work has asked about how it supports large-scale distinctions between objects, such as animacy and real-world size. Previous work has shown that these dimensions are reflected in our neural object representations (Konkle & Caramazza, 2013), and that objects of different real-world sizes have different mid-level perceptual features (Long, Konkle, Cohen, & Alvarez, 2016). Here, we test the hypothesis that animates and manmade objects also differ in mid-level perceptual features. To do so, we generated synthetic images of animals and objects that preserve some texture and form information ("texforms"), but are not identifiable at the basic level. We used visual search efficiency as an index of perceptual similarity, as search is slower when targets are perceptually similar to distractors. Across three experiments, we find that observers can find animals faster among objects than among other animals, and vice versa, and that these results hold when stimuli are reduced to unrecognizable texforms. Electrophysiological evidence revealed that this mixed-animacy search advantage emerges during early stages of target individuation, and not during later stages associated with semantic processing. Lastly, we find that perceived curvature explains part of the mixed-animacy search advantage and that observers use perceived curvature to classify texforms as animate/inanimate. Taken together, these findings suggest that mid-level perceptual features, including curvature, contain cues to whether an object may be animate versus manmade. We propose that the visual system capitalizes on these early cues to facilitate object detection, recognition, and classification.
A Scalable Distributed Approach to Mobile Robot Vision
NASA Technical Reports Server (NTRS)
Kuipers, Benjamin; Browning, Robert L.; Gribble, William S.
1997-01-01
This paper documents our progress during the first year of work on our original proposal entitled 'A Scalable Distributed Approach to Mobile Robot Vision'. We are pursuing a strategy for real-time visual identification and tracking of complex objects which does not rely on specialized image-processing hardware. In this system perceptual schemas represent objects as a graph of primitive features. Distributed software agents identify and track these features, using variable-geometry image subwindows of limited size. Active control of imaging parameters and selective processing makes simultaneous real-time tracking of many primitive features tractable. Perceptual schemas operate independently from the tracking of primitive features, so that real-time tracking of a set of image features is not hurt by latency in recognition of the object that those features make up. The architecture allows semantically significant features to be tracked with limited expenditure of computational resources, and allows the visual computation to be distributed across a network of processors. Early experiments are described which demonstrate the usefulness of this formulation, followed by a brief overview of our more recent progress (after the first year).
Kim, Bumhwi; Ban, Sang-Woo; Lee, Minho
2013-10-01
Humans can efficiently perceive arbitrary visual objects based on an incremental learning mechanism with selective attention. This paper proposes a new task specific top-down attention model to locate a target object based on its form and color representation along with a bottom-up saliency based on relativity of primitive visual features and some memory modules. In the proposed model top-down bias signals corresponding to the target form and color features are generated, which draw the preferential attention to the desired object by the proposed selective attention model in concomitance with the bottom-up saliency process. The object form and color representation and memory modules have an incremental learning mechanism together with a proper object feature representation scheme. The proposed model includes a Growing Fuzzy Topology Adaptive Resonance Theory (GFTART) network which plays two important roles in object color and form biased attention; one is to incrementally learn and memorize color and form features of various objects, and the other is to generate a top-down bias signal to localize a target object by focusing on the candidate local areas. Moreover, the GFTART network can be utilized for knowledge inference which enables the perception of new unknown objects on the basis of the object form and color features stored in the memory during training. Experimental results show that the proposed model is successful in focusing on the specified target objects, in addition to the incremental representation and memorization of various objects in natural scenes. In addition, the proposed model properly infers new unknown objects based on the form and color features of previously trained objects.
Object activation in semantic memory from visual multimodal feature input.
Kraut, Michael A; Kremen, Sarah; Moo, Lauren R; Segal, Jessica B; Calhoun, Vincent; Hart, John
2002-01-01
The human brain's representation of objects has been proposed to exist as a network of coactivated neural regions present in multiple cognitive systems. However, it is not known if there is a region specific to the process of activating an integrated object representation in semantic memory from multimodal feature stimuli (e.g., picture-word). A previous study using word-word feature pairs as stimulus input showed that the left thalamus is integrally involved in object activation (Kraut, Kremen, Segal, et al., this issue). In the present study, participants were presented picture-word pairs that are features of objects, with the task being to decide if together they "activated" an object not explicitly presented (e.g., picture of a candle and the word "icing" activate the internal representation of a "cake"). For picture-word pairs that combine to elicit an object, signal change was detected in the ventral temporo-occipital regions, pre-SMA, left primary somatomotor cortex, both caudate nuclei, and the dorsal thalami bilaterally. These findings suggest that the left thalamus is engaged for either picture or word stimuli, but the right thalamus appears to be involved when picture stimuli are also presented with words in semantic object activation tasks. The somatomotor signal changes are likely secondary to activation of the semantic object representations from multimodal visual stimuli.
Malek, Salim; Melgani, Farid; Mekhalfi, Mohamed Lamine; Bazi, Yakoub
2017-11-16
This paper describes three coarse image description strategies, which are meant to promote a rough perception of surrounding objects for visually impaired individuals, with application to indoor spaces. The described algorithms operate on images (grabbed by the user, by means of a chest-mounted camera), and output a list of objects that are likely to be present in the user's surroundings within the indoor scene. In this regard, first, different colour, texture, and shape-based feature extractors are generated, followed by a feature learning step by means of AutoEncoder (AE) models. Second, the produced features are fused and fed into a multilabel classifier in order to list the potential objects. The conducted experiments point out that fusing a set of AE-learned features yields higher classification rates than using the features individually. Furthermore, with respect to reference works, our method: (i) yields higher classification accuracies, and (ii) runs (at least four times) faster, which enables a potential full real-time application.
Beyond sensory images: Object-based representation in the human ventral pathway
Pietrini, Pietro; Furey, Maura L.; Ricciardi, Emiliano; Gobbini, M. Ida; Wu, W.-H. Carolyn; Cohen, Leonardo; Guazzelli, Mario; Haxby, James V.
2004-01-01
We investigated whether the topographically organized, category-related patterns of neural response in the ventral visual pathway are a representation of sensory images or a more abstract representation of object form that is not dependent on sensory modality. We used functional MRI to measure patterns of response evoked during visual and tactile recognition of faces and manmade objects in sighted subjects and during tactile recognition in blind subjects. Results showed that visual and tactile recognition evoked category-related patterns of response in a ventral extrastriate visual area in the inferior temporal gyrus that were correlated across modality for manmade objects. Blind subjects also demonstrated category-related patterns of response in this “visual” area, and in more ventral cortical regions in the fusiform gyrus, indicating that these patterns are not due to visual imagery and, furthermore, that visual experience is not necessary for category-related representations to develop in these cortices. These results demonstrate that the representation of objects in the ventral visual pathway is not simply a representation of visual images but, rather, is a representation of more abstract features of object form. PMID:15064396
Multilevel depth and image fusion for human activity detection.
Ni, Bingbing; Pei, Yong; Moulin, Pierre; Yan, Shuicheng
2013-10-01
Recognizing complex human activities usually requires the detection and modeling of individual visual features and the interactions between them. Current methods only rely on the visual features extracted from 2-D images, and therefore often lead to unreliable salient visual feature detection and inaccurate modeling of the interaction context between individual features. In this paper, we show that these problems can be addressed by combining data from a conventional camera and a depth sensor (e.g., Microsoft Kinect). We propose a novel complex activity recognition and localization framework that effectively fuses information from both grayscale and depth image channels at multiple levels of the video processing pipeline. In the individual visual feature detection level, depth-based filters are applied to the detected human/object rectangles to remove false detections. In the next level of interaction modeling, 3-D spatial and temporal contexts among human subjects or objects are extracted by integrating information from both grayscale and depth images. Depth information is also utilized to distinguish different types of indoor scenes. Finally, a latent structural model is developed to integrate the information from multiple levels of video processing for activity detection. Extensive experiments on two activity recognition benchmarks (one with depth information) and a challenging grayscale + depth human activity database that contains complex interactions between human-human, human-object, and human-surroundings demonstrate the effectiveness of the proposed multilevel grayscale + depth fusion scheme. Higher recognition and localization accuracies are obtained relative to the previous methods.
Size Constancy in Bat Biosonar? Perceptual Interaction of Object Aperture and Distance
Heinrich, Melina; Wiegrebe, Lutz
2013-01-01
Perception and encoding of object size is an important feature of sensory systems. In the visual system object size is encoded by the visual angle (visual aperture) on the retina, but the aperture depends on the distance of the object. As object distance is not unambiguously encoded in the visual system, higher computational mechanisms are needed. This phenomenon is termed “size constancy”. It is assumed to reflect an automatic re-scaling of visual aperture with perceived object distance. Recently, it was found that in echolocating bats, the ‘sonar aperture’, i.e., the range of angles from which sound is reflected from an object back to the bat, is unambiguously perceived and neurally encoded. Moreover, it is well known that object distance is accurately perceived and explicitly encoded in bat sonar. Here, we addressed size constancy in bat biosonar, recruiting virtual-object techniques. Bats of the species Phyllostomus discolor learned to discriminate two simple virtual objects that only differed in sonar aperture. Upon successful discrimination, test trials were randomly interspersed using virtual objects that differed in both aperture and distance. It was tested whether the bats spontaneously assigned absolute width information to these objects by combining distance and aperture. The results showed that while the isolated perceptual cues encoding object width, aperture, and distance were all perceptually well resolved by the bats, the animals did not assign absolute width information to the test objects. This lack of sonar size constancy may result from the bats relying on different modalities to extract size information at different distances. Alternatively, it is conceivable that familiarity with a behaviorally relevant, conspicuous object is required for sonar size constancy, as it has been argued for visual size constancy. Based on the current data, it appears that size constancy is not necessarily an essential feature of sonar perception in bats. 
PMID:23630598
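The dependence of visual aperture on object distance described in the abstract above follows from basic geometry, and a minimal sketch makes the ambiguity concrete. The function names (`visual_angle_deg`, `width_from_angle`) and the example values are illustrative assumptions, not taken from the study; the sketch treats the object as a flat extent viewed head-on.

```python
import math

def visual_angle_deg(width, distance):
    """Angle (degrees) subtended by an object of the given width at the given distance."""
    return math.degrees(2.0 * math.atan(width / (2.0 * distance)))

def width_from_angle(angle_deg, distance):
    """Recover physical width from the subtended angle and a distance estimate."""
    return 2.0 * distance * math.tan(math.radians(angle_deg) / 2.0)

# The same 0.2 m wide object subtends a smaller angle when it is farther away,
# so the angle alone cannot specify width without a distance estimate.
near = visual_angle_deg(0.2, 1.0)
far = visual_angle_deg(0.2, 2.0)
```

Combining the angle with distance, as in `width_from_angle`, is the kind of re-scaling that size constancy is assumed to reflect.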
A Cortical Network for the Encoding of Object Change
Hindy, Nicholas C.; Solomon, Sarah H.; Altmann, Gerry T.M.; Thompson-Schill, Sharon L.
2015-01-01
Understanding events often requires recognizing unique stimuli as alternative, mutually exclusive states of the same persisting object. Using fMRI, we examined the neural mechanisms underlying the representation of object states and object-state changes. We found that subjective ratings of visual dissimilarity between a depicted object and an unseen alternative state of that object predicted the corresponding multivoxel pattern dissimilarity in early visual cortex during an imagery task, while late visual cortex patterns tracked dissimilarity among distinct objects. Early visual cortex pattern dissimilarity for object states in turn predicted the level of activation in an area of left posterior ventrolateral prefrontal cortex (pVLPFC) most responsive to conflict in a separate Stroop color-word interference task, and an area of left ventral posterior parietal cortex (vPPC) implicated in the relational binding of semantic features. We suggest that when visualizing object states, representational content instantiated across early and late visual cortex is modulated by processes in left pVLPFC and left vPPC that support selection and binding, and ultimately event comprehension. PMID:24127425
Figure-ground organization and the emergence of proto-objects in the visual cortex.
von der Heydt, Rüdiger
2015-01-01
A long history of studies of perception has shown that the visual system organizes the incoming information early on, interpreting the 2D image in terms of a 3D world and producing a structure that provides perceptual continuity and enables object-based attention. Recordings from monkey visual cortex show that many neurons, especially in area V2, are selective for border ownership. These neurons are edge selective and have ordinary classical receptive fields (CRF), but in addition their responses are modulated (enhanced or suppressed) depending on the location of a 'figure' relative to the edge in their receptive field. Each neuron has a fixed preference for location on one side or the other. This selectivity is derived from the image context far beyond the CRF. This paper reviews evidence indicating that border ownership selectivity reflects the formation of early object representations ('proto-objects'). The evidence includes experiments showing (1) reversal of border ownership signals with change of perceived object structure, (2) border ownership specific enhancement of responses in object-based selective attention, (3) persistence of border ownership signals in accordance with continuity of object perception, and (4) remapping of border ownership signals across saccades and object movements. Findings 1 and 2 can be explained by hypothetical grouping circuits that sum contour feature signals in search of objectness, and, via recurrent projections, enhance the corresponding low-level feature signals. Findings 3 and 4 might be explained by assuming that the activity of grouping circuits persists and can be remapped. Grouping, persistence, and remapping are fundamental operations of vision. Finding these operations manifest in low-level visual areas challenges traditional views of visual processing. New computational models need to be developed for a comprehensive understanding of the function of the visual cortex.
Attentional Resources in Visual Tracking through Occlusion: The High-Beams Effect
ERIC Educational Resources Information Center
Flombaum, Jonathan I.; Scholl, Brian J.; Pylyshyn, Zenon W.
2008-01-01
A considerable amount of research has uncovered heuristics that the visual system employs to keep track of objects through periods of occlusion. Relatively little work, by comparison, has investigated the online resources that support this processing. We explored how attention is distributed when featurally identical objects become occluded during…
Bindings in working memory: The role of object-based attention.
Gao, Zaifeng; Wu, Fan; Qiu, Fangfang; He, Kaifeng; Yang, Yue; Shen, Mowei
2017-02-01
Over the past decade, it has been debated whether retaining bindings in working memory (WM) requires more attention than retaining constituent features, focusing on domain-general attention and space-based attention. Recently, we proposed that retaining bindings in WM needs more object-based attention than retaining constituent features (Shen, Huang, & Gao, 2015, Journal of Experimental Psychology: Human Perception and Performance, doi: 10.1037/xhp0000018). However, only unitized visual bindings were examined; to establish the role of object-based attention in retaining bindings in WM, more empirical evidence is required. We tested 4 new bindings that had been suggested to require no more attention than the constituent features in the WM maintenance phase: The two constituent features of binding were stored in different WM modules (cross-module binding, Experiment 1), from auditory and visual modalities (cross-modal binding, Experiment 2), or temporally (cross-time binding, Experiment 3) or spatially (cross-space binding, Experiments 4-6) separated. In the critical condition, we added a secondary object feature-report task during the delay interval of the change-detection task, such that the secondary task competed for object-based attention with the to-be-memorized stimuli. If more object-based attention is required for retaining bindings than for retaining constituent features, the secondary task should impair the binding performance to a larger degree relative to the performance of constituent features. Indeed, Experiments 1-6 consistently revealed a significantly larger impairment for bindings than for the constituent features, suggesting that object-based attention plays a pivotal role in retaining bindings in WM.
Salience of the lambs: a test of the saliency map hypothesis with pictures of emotive objects.
Humphrey, Katherine; Underwood, Geoffrey; Lambert, Tony
2012-01-25
Humans have an ability to rapidly detect emotive stimuli. However, many emotional objects in a scene are also highly visually salient, which raises the question of how dependent the effects of emotionality are on visual saliency and whether the presence of an emotional object changes the power of a more visually salient object in attracting attention. Participants were shown a set of positive, negative, and neutral pictures and completed recall and recognition memory tests. Eye movement data revealed that visual saliency does influence eye movements, but the effect is reliably reduced when an emotional object is present. Pictures containing negative objects were recognized more accurately and recalled in greater detail, and participants fixated more on negative objects than positive or neutral ones. Initial fixations were more likely to be on emotional objects than more visually salient neutral ones, suggesting that the processing of emotional features occurs at a very early stage of perception.
Ageing and feature binding in visual working memory: The role of presentation time.
Rhodes, Stephen; Parra, Mario A; Logie, Robert H
2016-01-01
A large body of research has clearly demonstrated that healthy ageing is accompanied by an associative memory deficit. Older adults exhibit disproportionately poor performance on memory tasks requiring the retention of associations between items (e.g., pairs of unrelated words). In contrast to this robust deficit, older adults' ability to form and temporarily hold bound representations of an object's surface features, such as colour and shape, appears to be relatively well preserved. However, the findings of one set of experiments suggest that older adults may struggle to form temporary bound representations in visual working memory when given more time to study objects. These findings, though, were based on between-participant comparisons across experimental paradigms. The present study directly assesses the role of presentation time in the ability of younger and older adults to bind shape and colour in visual working memory using a within-participant design. We report new evidence that giving older adults longer to study memory objects does not differentially affect their immediate memory for feature combinations relative to individual features. This is in line with a growing body of research suggesting that there is no age-related impairment in immediate memory for colour-shape binding.
Reilly, Jamie; Garcia, Amanda; Binney, Richard J.
2016-01-01
Much remains to be learned about the neural architecture underlying word meaning. Fully distributed models of semantic memory predict that the sound of a barking dog will conjointly engage a network of distributed sensorimotor spokes. An alternative framework holds that modality-specific features additionally converge within transmodal hubs. Participants underwent functional MRI while covertly naming familiar objects versus newly learned novel objects from only one of their constituent semantic features (visual form, characteristic sound, or point-light motion representation). Relative to the novel object baseline, familiar concepts elicited greater activation within association regions specific to that presentation modality. Furthermore, visual form elicited activation within high-level auditory association cortex. Conversely, environmental sounds elicited activation in regions proximal to visual association cortex. Both conditions commonly engaged a putative hub region within lateral anterior temporal cortex. These results support hybrid semantic models in which local hubs and distributed spokes are dually engaged in service of semantic memory. PMID:27289210
Repetition blindness and illusory conjunctions: errors in binding visual types with visual tokens.
Kanwisher, N
1991-05-01
Repetition blindness (Kanwisher, 1986, 1987) has been defined as the failure to detect or recall repetitions of words presented in rapid serial visual presentation (RSVP). The experiments presented here suggest that repetition blindness (RB) is a more general visual phenomenon, and examine its relationship to feature integration theory (Treisman & Gelade, 1980). Experiment 1 shows RB for letters distributed through space, time, or both. Experiment 2 demonstrates RB for repeated colors in RSVP lists. In Experiments 3 and 4, RB was found for repeated letters and colors in spatial arrays. Experiment 5 provides evidence that the mental representations of discrete objects (called "visual tokens" here) that are necessary to detect visual repetitions (Kanwisher, 1987) are the same as the "object files" (Kahneman & Treisman, 1984) in which visual features are conjoined. In Experiment 6, repetition blindness for the second occurrence of a repeated letter resulted only when the first occurrence was attended to. The overall results suggest that a general dissociation between types and tokens in visual information processing can account for both repetition blindness and illusory conjunctions.
What are the underlying units of perceived animacy? Chasing detection is intrinsically object-based.
van Buren, Benjamin; Gao, Tao; Scholl, Brian J
2017-10-01
One of the most foundational questions that can be asked about any visual process is the nature of the underlying 'units' over which it operates (e.g., features, objects, or spatial regions). Here we address this question, for the first time to our knowledge, in the context of the perception of animacy. Even simple geometric shapes appear animate when they move in certain ways. Do such percepts arise whenever any visual feature moves appropriately, or do they require that the relevant features first be individuated as discrete objects? Observers viewed displays in which one disc (the "wolf") chased another (the "sheep") among several moving distractor discs. Critically, two pairs of discs were also connected by visible lines. In the Unconnected condition, both lines connected pairs of distractors; but in the Connected condition, one connected the wolf to a distractor, and the other connected the sheep to a different distractor. Observers in the Connected condition were much less likely to describe such displays using mental state terms. Furthermore, signal detection analyses were used to explore the objective ability to discriminate chasing displays from inanimate control displays in which the wolf moved toward the sheep's mirror-image. Chasing detection was severely impaired on Connected trials: observers could readily detect an object chasing another object, but not a line-end chasing another line-end, a line-end chasing an object, or an object chasing a line-end. We conclude that the underlying units of perceived animacy are discrete visual objects.
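The signal detection analysis mentioned in this abstract can be illustrated with a minimal sketch of the sensitivity index d'. The abstract does not say how the index was computed, so the log-linear correction and the trial counts below are assumptions chosen for illustration only; `d_prime` is a hypothetical helper name.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Assumption: a log-linear correction (add 0.5 to each cell) is applied
    so that rates of exactly 0 or 1 do not produce infinite z-scores.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts: chasing trials (signal) vs. mirror-image control trials (noise).
unconnected = d_prime(hits=80, misses=20, false_alarms=20, correct_rejections=80)
chance = d_prime(hits=50, misses=50, false_alarms=50, correct_rejections=50)
print(round(unconnected, 2), round(chance, 2))
```

A d' near zero means chasing displays cannot be told apart from control displays; larger values indicate better discrimination.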
Jackson, Jade; Rich, Anina N; Williams, Mark A; Woolgar, Alexandra
2017-02-01
Human cognition is characterized by astounding flexibility, enabling us to select appropriate information according to the objectives of our current task. A circuit of frontal and parietal brain regions, often referred to as the frontoparietal attention network or multiple-demand (MD) regions, is believed to play a fundamental role in this flexibility. There is evidence that these regions dynamically adjust their responses to selectively process information that is currently relevant for behavior, as proposed by the "adaptive coding hypothesis" [Duncan, J. An adaptive coding model of neural function in prefrontal cortex. Nature Reviews Neuroscience, 2, 820-829, 2001]. Could this provide a neural mechanism for feature-selective attention, the process by which we preferentially process one feature of a stimulus over another? We used multivariate pattern analysis of fMRI data during a perceptually challenging categorization task to investigate whether the representation of visual object features in the MD regions flexibly adjusts according to task relevance. Participants were trained to categorize visually similar novel objects along two orthogonal stimulus dimensions (length/orientation) and performed short alternating blocks in which only one of these dimensions was relevant. We found that multivoxel patterns of activation in the MD regions encoded the task-relevant distinctions more strongly than the task-irrelevant distinctions: The MD regions discriminated between stimuli of different lengths when length was relevant and between the same objects according to orientation when orientation was relevant. The data suggest a flexible neural system that adjusts its representation of visual objects to preferentially encode stimulus features that are currently relevant for behavior, providing a neural mechanism for feature-selective attention.
Attention in the processing of complex visual displays: detecting features and their combinations.
Farell, B
1984-02-01
The distinction between operations in visual processing that are parallel and preattentive and those that are serial and attentional receives both theoretical and empirical support. According to Treisman's feature-integration theory, independent features are available preattentively, but attention is required to veridically combine features into objects. Certain evidence supporting this theory is consistent with a different interpretation, which was tested in four experiments. The first experiment compared the detection of features and feature combinations while eliminating a factor that confounded earlier comparisons. The resulting priority of access to combinatorial information suggests that features and nonlocal combinations of features are not connected solely by a bottom-up hierarchical convergence. Causes of the disparity between the results of Experiment 1 and the results of previous research were investigated in three subsequent experiments. The results showed that of the two confounded factors, it was the difference in the mapping of alternatives onto responses, not the differing attentional demands of features and objects, that underlay the results of the previous research. The present results are thus counterexamples to the feature-integration theory. Aspects of this theory are shown to be subsumed by more general principles, which are discussed in terms of attentional processes in the detection of features, objects, and stimulus alternatives.
2012-01-01
Background: There is growing empirical evidence from different lines of ERP research that, contrary to earlier observations, the earliest sensory visual response, known as the C1 component or P/N80 and generated within the striate cortex, might be modulated by selective attention to visual stimulus features. Up to now, evidence of this modulation has been related to spatial location and to simple features such as spatial frequency, luminance, and texture. Additionally, neurophysiological conditions, such as emotion, vigilance, the reflexive or voluntary nature of input attentional selection, and workload, have also been related to C1 modulations, although the evidence for workload, at least, is controversial. No information is available at present, however, on the attentional selection of objects. Methods: In this study object- and space-based attention mechanisms were conjointly investigated by presenting complex, familiar shapes of artefacts and animals, intermixed with distracters, in different tasks requiring the selection of a relevant target category within a relevant spatial location, while ignoring the other shape categories within this location and all the categories at an irrelevant location. EEG was recorded from 30 scalp electrode sites in 21 right-handed participants. Results and Conclusions: ERP findings showed that visual processing was modulated by both shape and location relevance per se, beginning separately at the latency of the early phase of a precocious negativity (60-80 ms) at mesial scalp sites consistent with the C1 component, and a positivity at more lateral sites. The data also showed that the attentional modulation progressed conjointly at the latency of the subsequent P1 (100-120 ms) and N1 (120-180 ms), as well as later-latency components.
These findings support the views that (1) V1 may be precociously modulated by direct top-down influences and participates in the attentional selection of objects as well as of simple features; (2) the selection of an object's spatial and non-spatial features might begin with an early, parallel detection of a target object in the visual field, followed by the progressive focusing of spatial attention onto the location of an actual target for its identification, broadly in line with neural mechanisms reported in the literature as "object-based space selection", or with those proposed for visual search. PMID:22300540
Modeling visual clutter perception using proto-object segmentation
Yu, Chen-Ping; Samaras, Dimitris; Zelinsky, Gregory J.
2014-01-01
We introduce the proto-object model of visual clutter perception. This unsupervised model segments an image into superpixels, then merges neighboring superpixels that share a common color cluster to obtain proto-objects—defined here as spatially extended regions of coherent features. Clutter is estimated by simply counting the number of proto-objects. We tested this model using 90 images of realistic scenes that were ranked by observers from least to most cluttered. Comparing this behaviorally obtained ranking to a ranking based on the model clutter estimates, we found a significant correlation between the two (Spearman's ρ = 0.814, p < 0.001). We also found that the proto-object model was highly robust to changes in its parameters and was generalizable to unseen images. We compared the proto-object model to six other models of clutter perception and demonstrated that it outperformed each, in some cases dramatically. Importantly, we also showed that the proto-object model was a better predictor of clutter perception than an actual count of the number of objects in the scenes, suggesting that the set size of a scene may be better described by proto-objects than objects. We conclude that the success of the proto-object model is due in part to its use of an intermediate level of visual representation—one between features and objects—and that this is evidence for the potential importance of a proto-object representation in many common visual percepts and tasks. PMID:24904121
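The merge-and-count step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes the superpixel segmentation and color clustering have already been done, and the function names, input layout (one color-cluster id per superpixel plus a list of adjacent pairs), and the tie-free Spearman formula are all hypothetical choices made here.

```python
def count_proto_objects(cluster_of, neighbors):
    """Count connected groups of adjacent superpixels sharing a color cluster.

    cluster_of: list where cluster_of[i] is the color-cluster id of superpixel i.
    neighbors:  iterable of (i, j) pairs of spatially adjacent superpixels.
    """
    parent = list(range(len(cluster_of)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in neighbors:
        # Merge only neighbors that fall in the same color cluster.
        if cluster_of[i] == cluster_of[j]:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_i] = root_j

    return len({find(i) for i in range(len(cluster_of))})


def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two rankings without ties."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Five superpixels in a row; the middle two share a different color cluster.
print(count_proto_objects([0, 0, 1, 1, 0], [(0, 1), (1, 2), (2, 3), (3, 4)]))  # 3
```

The clutter estimate is simply the returned count, and `spearman_rho` shows how a model ranking of images could be compared against a behavioral ranking.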
Figure–ground organization and the emergence of proto-objects in the visual cortex
von der Heydt, Rüdiger
2015-01-01
A long history of studies of perception has shown that the visual system organizes the incoming information early on, interpreting the 2D image in terms of a 3D world and producing a structure that provides perceptual continuity and enables object-based attention. Recordings from monkey visual cortex show that many neurons, especially in area V2, are selective for border ownership. These neurons are edge selective and have ordinary classical receptive fields (CRF), but in addition their responses are modulated (enhanced or suppressed) depending on the location of a ‘figure’ relative to the edge in their receptive field. Each neuron has a fixed preference for location on one side or the other. This selectivity is derived from the image context far beyond the CRF. This paper reviews evidence indicating that border ownership selectivity reflects the formation of early object representations (‘proto-objects’). The evidence includes experiments showing (1) reversal of border ownership signals with change of perceived object structure, (2) border ownership specific enhancement of responses in object-based selective attention, (3) persistence of border ownership signals in accordance with continuity of object perception, and (4) remapping of border ownership signals across saccades and object movements. Findings 1 and 2 can be explained by hypothetical grouping circuits that sum contour feature signals in search of objectness, and, via recurrent projections, enhance the corresponding low-level feature signals. Findings 3 and 4 might be explained by assuming that the activity of grouping circuits persists and can be remapped. Grouping, persistence, and remapping are fundamental operations of vision. Finding these operations manifest in low-level visual areas challenges traditional views of visual processing. New computational models need to be developed for a comprehensive understanding of the function of the visual cortex. PMID:26579062
Image Feature Types and Their Predictions of Aesthetic Preference and Naturalness
Ibarra, Frank F.; Kardan, Omid; Hunter, MaryCarol R.; Kotabe, Hiroki P.; Meyer, Francisco A. C.; Berman, Marc G.
2017-01-01
Previous research has investigated ways to quantify visual information of a scene in terms of a visual processing hierarchy, i.e., making sense of the visual environment by segmentation and integration of elementary sensory input. Guided by this research, studies have developed categories for low-level visual features (e.g., edges, colors), high-level visual features (scene-level entities that convey semantic information such as objects), and how models of those features predict aesthetic preference and naturalness. For example, in Kardan et al. (2015a), 52 participants provided aesthetic preference and naturalness ratings, which are used in the current study, for 307 images of mixed natural and urban content. Kardan et al. (2015a) then developed a model using low-level features to predict aesthetic preference and naturalness and could do so with high accuracy. What has yet to be explored is the ability of higher-level visual features (e.g., horizon line position relative to viewer, geometry of building distribution relative to visual access) to predict aesthetic preference and naturalness of scenes, and whether higher-level features mediate some of the association between the low-level features and aesthetic preference or naturalness. In this study we investigated these relationships and found that low- and high-level features explain 68.4% of the variance in aesthetic preference ratings and 88.7% of the variance in naturalness ratings. Additionally, several high-level features mediated the relationship between the low-level visual features and aesthetic preference. In a multiple mediation analysis, the high-level feature mediators accounted for over 50% of the variance in predicting aesthetic preference. These results show that high-level visual features play a prominent role in predicting aesthetic preference, but do not completely eliminate the predictive power of the low-level visual features. 
These strong predictors provide powerful insights for future research relating to landscape and urban design with the aim of maximizing subjective well-being, which could lead to improved health outcomes on a larger scale. PMID:28503158
2017-01-01
Recent studies have challenged the ventral/“what” and dorsal/“where” two-visual-processing-pathway view by showing the existence of “what” and “where” information in both pathways. Is the two-pathway distinction still valid? Here, we examined how goal-directed visual information processing may differentially impact visual representations in these two pathways. Using fMRI and multivariate pattern analysis, in three experiments on human participants (57% females), by manipulating whether color or shape was task-relevant and how they were conjoined, we examined shape-based object category decoding in occipitotemporal and parietal regions. We found that object category representations in all the regions examined were influenced by whether or not object shape was task-relevant. This task effect, however, tended to decrease as task-relevant and irrelevant features were more integrated, reflecting the well-known object-based feature encoding. Interestingly, task relevance played a relatively minor role in driving the representational structures of early visual and ventral object regions. They were driven predominantly by variations in object shapes. In contrast, the effect of task was much greater in dorsal than ventral regions, with object category and task relevance both contributing significantly to the representational structures of the dorsal regions. These results showed that, whereas visual representations in the ventral pathway are more invariant and reflect “what an object is,” those in the dorsal pathway are more adaptive and reflect “what we do with it.” Thus, despite the existence of “what” and “where” information in both visual processing pathways, the two pathways may still differ fundamentally in their roles in visual information representation. SIGNIFICANCE STATEMENT Visual information is thought to be processed in two distinctive pathways: the ventral pathway that processes “what” an object is and the dorsal pathway that processes “where” it is located. 
This view has been challenged by recent studies revealing the existence of “what” and “where” information in both pathways. Here, we found that goal-directed visual information processing differentially modulates shape-based object category representations in the two pathways. Whereas ventral representations are more invariant to the demand of the task, reflecting what an object is, dorsal representations are more adaptive, reflecting what we do with the object. Thus, despite the existence of “what” and “where” information in both pathways, visual representations may still differ fundamentally in the two pathways. PMID:28821655
Invariant visual object recognition: a model, with lighting invariance.
Rolls, Edmund T; Stringer, Simon M
2006-01-01
How are invariant representations of objects formed in the visual cortex? We describe a neurophysiological and computational approach which focusses on a feature hierarchy model in which invariant representations can be built by self-organizing learning based on the statistics of the visual input. The model can use temporal continuity in an associative synaptic learning rule with a short term memory trace, and/or it can use spatial continuity in Continuous Transformation learning. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size, and in this paper we show also lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects such as looming, rotation, and object-based movement. The model has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in for example spatial and object search tasks. The model has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene.
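The idea of an associative learning rule with a short-term memory trace can be illustrated with a small sketch. The abstract does not give the equations, so the specific form used here (an exponentially decaying trace of the neuron's activity multiplied by the current input), the single linear neuron, and the parameter names `eta` and `alpha` are assumptions for illustration; weight normalization, which such models typically include, is omitted.

```python
def trace_rule_update(weights, input_sequence, eta=0.8, alpha=0.1):
    """Apply a trace-style Hebbian update across a temporal sequence of inputs.

    weights:        list of synaptic weights of one linear neuron.
    input_sequence: list of input vectors presented one after another
                    (e.g., successive views of the same object).
    eta:            how much of the previous trace is carried forward
                    (eta = 0 reduces to a plain Hebbian rule).
    alpha:          learning rate.
    """
    w = list(weights)
    trace = 0.0
    for x in input_sequence:
        y = sum(wi * xi for wi, xi in zip(w, x))   # current activation
        trace = (1.0 - eta) * y + eta * trace      # short-term memory trace
        w = [wi + alpha * trace * xi for wi, xi in zip(w, x)]
    return w


# Two successive "views": the neuron initially responds only to the first one.
print(trace_rule_update([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
print(trace_rule_update([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], eta=0.0))
```

With a nonzero trace, the weight for the second view grows even though that view alone evokes no response, because activity from the first view persists in the trace; with `eta=0.0` the second weight stays at zero. This is how temporal continuity can bind different views of one object onto the same neuron.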
Self-Regulation of Visual Attention and Facial Expression of Emotions in ADHD Children
ERIC Educational Resources Information Center
Kuhle, Hans J.; Kinkelbur, Jorg; Andes, Kerstin; Heidorn, Fridjof M.; Zeyer, Solveigh; Rautzenberg, Petra; Jansen, Fritz
2007-01-01
Objective: To test if visual focusing and mimic display as features of self-regulation in ADHD children show a curvilinear relation to rising methylphenidate (MPH) doses. To test if small dose steps of 2.5mg MPH cause significant changes in behavior. And to test the relation of these features to intellectual performance, parents' ratings, and…
A method for real-time visual stimulus selection in the study of cortical object perception
Leeds, Daniel D.; Tarr, Michael J.
2016-01-01
The properties utilized by visual object perception in the mid- and high-level ventral visual pathway are poorly understood. To better establish and explore possible models of these properties, we adopt a data-driven approach in which we repeatedly interrogate neural units using functional Magnetic Resonance Imaging (fMRI) to establish each unit’s image selectivity. This approach to imaging necessitates a search through a broad space of stimulus properties using a limited number of samples. To more quickly identify the complex visual features underlying human cortical object perception, we implemented a new functional magnetic resonance imaging protocol in which visual stimuli are selected in real-time based on BOLD responses to recently shown images. Two variations of this protocol were developed, one relying on natural object stimuli and a second based on synthetic object stimuli, both embedded in feature spaces based on the complex visual properties of the objects. During fMRI scanning, we continuously controlled stimulus selection in the context of a real-time search through these image spaces in order to maximize neural responses across predetermined 1 cm³ brain regions. Elsewhere we have reported the patterns of cortical selectivity revealed by this approach (Leeds 2014). In contrast, here our objective is to present more detailed methods and explore the technical and biological factors influencing the behavior of our real-time stimulus search. We observe that: 1) Searches converged more reliably when exploring a more precisely parameterized space of synthetic objects; 2) Real-time estimation of cortical responses to stimuli is reasonably consistent; 3) Search behavior was acceptably robust to delays in stimulus displays and subject motion effects. Overall, our results indicate that real-time fMRI methods may provide a valuable platform for continuing study of localized neural selectivity, both for visual object representation and beyond. PMID:26973168
Lescroart, Mark D.; Stansbury, Dustin E.; Gallant, Jack L.
2015-01-01
Perception of natural visual scenes activates several functional areas in the human brain, including the Parahippocampal Place Area (PPA), Retrosplenial Complex (RSC), and the Occipital Place Area (OPA). It is currently unclear what specific scene-related features are represented in these areas. Previous studies have suggested that PPA, RSC, and/or OPA might represent at least three qualitatively different classes of features: (1) 2D features related to Fourier power; (2) 3D spatial features such as the distance to objects in a scene; or (3) abstract features such as the categories of objects in a scene. To determine which of these hypotheses best describes the visual representation in scene-selective areas, we applied voxel-wise modeling (VM) to BOLD fMRI responses elicited by a set of 1386 images of natural scenes. VM provides an efficient method for testing competing hypotheses by comparing predictions of brain activity based on encoding models that instantiate each hypothesis. Here we evaluated three different encoding models that instantiate each of the three hypotheses listed above. We used linear regression to fit each encoding model to the fMRI data recorded from each voxel, and we evaluated each fit model by estimating the amount of variance it predicted in a withheld portion of the data set. We found that voxel-wise models based on Fourier power or the subjective distance to objects in each scene predicted much of the variance predicted by a model based on object categories. Furthermore, the response variance explained by these three models is largely shared, and the individual models explain little unique variance in responses. Based on an evaluation of previous studies and the data we present here, we conclude that there is currently no good basis to favor any one of the three alternative hypotheses about visual representation in scene-selective areas. We offer suggestions for further studies that may help resolve this issue. PMID:26594164
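The fit-then-predict logic of voxel-wise modeling described in this abstract can be sketched as below. This is a simplified illustration, not the authors' pipeline: it assumes ordinary least squares with an intercept (the abstract says only "linear regression"), uses synthetic noise-free data, and the function names and array shapes are hypothetical.

```python
import numpy as np


def fit_encoding_model(X_train, Y_train):
    """Fit one linear model per voxel by ordinary least squares.

    X_train: (n_images, n_features) feature matrix for one hypothesis.
    Y_train: (n_images, n_voxels) responses; each column is one voxel.
    Returns weights of shape (n_features + 1, n_voxels); last row is the intercept.
    """
    design = np.column_stack([X_train, np.ones(len(X_train))])
    weights, *_ = np.linalg.lstsq(design, Y_train, rcond=None)
    return weights


def heldout_r2(weights, X_test, Y_test):
    """Fraction of held-out response variance predicted, per voxel."""
    design = np.column_stack([X_test, np.ones(len(X_test))])
    residual = Y_test - design @ weights
    ss_res = (residual ** 2).sum(axis=0)
    ss_tot = ((Y_test - Y_test.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot


# Synthetic example: 60 "images", 3 features, 5 "voxels"; last 20 images withheld.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
Y = X @ rng.normal(size=(3, 5))
W = fit_encoding_model(X[:40], Y[:40])
print(heldout_r2(W, X[40:], Y[40:]))
```

Competing hypotheses would each supply their own feature matrix, and the per-voxel held-out scores would then be compared across models.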
Tschechne, Stephan; Neumann, Heiko
2014-01-01
Visual structures in the environment are segmented into image regions and those combined to a representation of surfaces and prototypical objects. Such a perceptual organization is performed by complex neural mechanisms in the visual cortex of primates. Multiple mutually connected areas in the ventral cortical pathway receive visual input and extract local form features that are subsequently grouped into increasingly complex, more meaningful image elements. Such a distributed network of processing must be capable to make accessible highly articulated changes in shape boundary as well as very subtle curvature changes that contribute to the perception of an object. We propose a recurrent computational network architecture that utilizes hierarchical distributed representations of shape features to encode surface and object boundary over different scales of resolution. Our model makes use of neural mechanisms that model the processing capabilities of early and intermediate stages in visual cortex, namely areas V1–V4 and IT. We suggest that multiple specialized component representations interact by feedforward hierarchical processing that is combined with feedback signals driven by representations generated at higher stages. Based on this, global configurational as well as local information is made available to distinguish changes in the object's contour. Once the outline of a shape has been established, contextual contour configurations are used to assign border ownership directions and thus achieve segregation of figure and ground. The model, thus, proposes how separate mechanisms contribute to distributed hierarchical cortical shape representation and combine with processes of figure-ground segregation. Our model is probed with a selection of stimuli to illustrate processing results at different processing stages. We especially highlight how modulatory feedback connections contribute to the processing of visual input at various stages in the processing hierarchy. 
PMID:25157228
Mid-level perceptual features distinguish objects of different real-world sizes.
Long, Bria; Konkle, Talia; Cohen, Michael A; Alvarez, George A
2016-01-01
Understanding how perceptual and conceptual representations are connected is a fundamental goal of cognitive science. Here, we focus on a broad conceptual distinction that constrains how we interact with objects--real-world size. Although there appear to be clear perceptual correlates for basic-level categories (apples look like other apples, oranges look like other oranges), the perceptual correlates of broader categorical distinctions are largely unexplored, i.e., do small objects look like other small objects? Because there are many kinds of small objects (e.g., cups, keys), there may be no reliable perceptual features that distinguish them from big objects (e.g., cars, tables). Contrary to this intuition, we demonstrated that big and small objects have reliable perceptual differences that can be extracted by early stages of visual processing. In a series of visual search studies, participants found target objects faster when the distractor objects differed in real-world size. These results held when we broadly sampled big and small objects, when we controlled for low-level features and image statistics, and when we reduced objects to texforms--unrecognizable textures that loosely preserve an object's form. However, this effect was absent when we used more basic textures. These results demonstrate that big and small objects have reliably different mid-level perceptual features, and suggest that early perceptual information about broad-category membership may influence downstream object perception, recognition, and categorization processes.
Late electrophysiological modulations of feature-based attention to object shapes.
Stojanoski, Bobby Boge; Niemeier, Matthias
2014-03-01
Feature-based attention has been shown to aid object perception. Our previous ERP effects revealed temporally late feature-based modulation in response to objects relative to motion. The aim of the current study was to confirm the timing of feature-based influences on object perception while cueing within the feature dimension of shape. Participants were told to expect either "pillow" or "flower" objects embedded among random white and black lines. Participants more accurately reported the object's main color for valid compared to invalid shapes. ERPs revealed modulation from 252-502 ms, from occipital to frontal electrodes. Our results are consistent with previous findings examining the time course for processing similar stimuli (illusory contours). Our results provide novel insights into how attending to features of higher complexity aids object perception presumably via feed-forward and feedback mechanisms along the visual hierarchy.
Qin, Shuo; Ray, Nicholas R; Ramakrishnan, Nithya; Nashiro, Kaoru; O'Connell, Margaret A; Basak, Chandramallika
2016-11-01
Overloading the capacity of visual attention can result in mistakenly combining the various features of an object, that is, illusory conjunctions. We hypothesize that if the two hemispheres separately process visual information by splitting attention, connectivity of the corpus callosum, a brain structure integrating the two hemispheres, would predict the degree of illusory conjunctions. In the current study, we assessed two types of illusory conjunctions using a memory-scanning paradigm; the features were either presented across the two opposite hemifields or within the same hemifield. Four objects, each with two visual features, were briefly presented together followed by a probe-recognition and a confidence rating for the recognition accuracy. MRI scans were also obtained. Results indicated that successful recollection during probe recognition was better for across hemifields conjunctions compared to within hemifield conjunctions, lending support to the bilateral advantage of the two hemispheres in visual short-term memory. Age-related differences regarding the underlying mechanisms of the bilateral advantage indicated greater reliance on recollection-based processing in young and on familiarity-based processing in old. Moreover, the integrity of the posterior corpus callosum was more predictive of opposite hemifield illusory conjunctions compared to within hemifield illusory conjunctions, even after controlling for age. That is, individuals with lesser posterior corpus callosum connectivity had better recognition for objects when their features were recombined from the opposite hemifields than from the same hemifield. This study is the first to investigate the role of the corpus callosum in splitting attention between versus within hemifields.
The fate of task-irrelevant visual motion: perceptual load versus feature-based attention.
Taya, Shuichiro; Adams, Wendy J; Graf, Erich W; Lavie, Nilli
2009-11-18
We tested contrasting predictions derived from perceptual load theory and from recent feature-based selection accounts. Observers viewed moving, colored stimuli and performed low or high load tasks associated with one stimulus feature, either color or motion. The resultant motion aftereffect (MAE) was used to evaluate attentional allocation. We found that task-irrelevant visual features received less attention than co-localized task-relevant features of the same objects. Moreover, when color and motion features were co-localized yet perceived to belong to two distinct surfaces, feature-based selection was further increased at the expense of object-based co-selection. Load theory predicts that the MAE for task-irrelevant motion would be reduced with a higher load color task. However, this was not seen for co-localized features; perceptual load only modulated the MAE for task-irrelevant motion when this was spatially separated from the attended color location. Our results suggest that perceptual load effects are mediated by spatial selection and do not generalize to the feature domain. Feature-based selection operates to suppress processing of task-irrelevant, co-localized features, irrespective of perceptual load.
Using Prosopagnosia to Test and Modify Visual Recognition Theory.
O'Brien, Alexander M
2018-02-01
Biederman's contemporary theory of basic visual object recognition (Recognition-by-Components) is based on structural descriptions of objects and presumes 36 visual primitives (geons) people can discriminate, but there has been no empirical test of the actual use of these 36 geons to visually distinguish objects. In this study, we tested for the actual use of these geons in basic visual discrimination by comparing object discrimination performance patterns (when distinguishing varied stimuli) of an acquired prosopagnosia patient (LB) and healthy control participants. LB's prosopagnosia left her heavily reliant on structural descriptions or categorical object differences in visual discrimination tasks versus the control participants' additional ability to use face recognition or coordinate systems (Coordinate Relations Hypothesis). Thus, when LB performed comparably to control participants with a given stimulus, her restricted reliance on basic or categorical discriminations meant that the stimuli must be distinguishable on the basis of a geon feature. By varying stimuli in eight separate experiments and presenting all 36 geons, we discerned that LB coded only 12 (vs. 36) distinct visual primitives (geons), apparently reflective of human visual systems generally.
van den Berg, Ronald; Roerdink, Jos B. T. M.; Cornelissen, Frans W.
2010-01-01
An object in the peripheral visual field is more difficult to recognize when surrounded by other objects. This phenomenon is called “crowding”. Crowding places a fundamental constraint on human vision that limits performance on numerous tasks. It has been suggested that crowding results from spatial feature integration necessary for object recognition. However, in the absence of convincing models, this theory has remained controversial. Here, we present a quantitative and physiologically plausible model for spatial integration of orientation signals, based on the principles of population coding. Using simulations, we demonstrate that this model coherently accounts for fundamental properties of crowding, including critical spacing, “compulsory averaging”, and a foveal-peripheral anisotropy. Moreover, we show that the model predicts increased responses to correlated visual stimuli. Altogether, these results suggest that crowding has little immediate bearing on object recognition but is a by-product of a general, elementary integration mechanism in early vision aimed at improving signal quality. PMID:20098499
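As a rough illustration of the population-coding idea named in the abstract above, the following toy sketch pools the responses of orientation-tuned units to two nearby orientations and decodes the pooled activity. It is not the authors' model: the tuning function, the tuning width `kappa`, the number of units, and the population-vector decoder are all assumptions chosen for simplicity. It only shows how summing population responses can yield an averaged orientation estimate.

```python
import math

def population_response(theta_deg, preferred_deg, kappa=2.0):
    """Responses of orientation-tuned units to a stimulus at theta_deg.

    Orientation repeats every 180 degrees, so angles are doubled before
    applying a von Mises-shaped tuning curve (an assumed tuning shape).
    """
    return [
        math.exp(kappa * (math.cos(math.radians(2 * (theta_deg - p))) - 1))
        for p in preferred_deg
    ]

def decode_orientation(responses, preferred_deg):
    """Population-vector decoding on the doubled-angle circle."""
    x = sum(r * math.cos(math.radians(2 * p)) for r, p in zip(responses, preferred_deg))
    y = sum(r * math.sin(math.radians(2 * p)) for r, p in zip(responses, preferred_deg))
    return (math.degrees(math.atan2(y, x)) / 2) % 180

# 32 units with evenly spaced preferred orientations (hypothetical choice).
preferred = [i * 180 / 32 for i in range(32)]

target = population_response(20.0, preferred)    # hypothetical target orientation
flanker = population_response(40.0, preferred)   # hypothetical flanker orientation

# Spatial integration modeled as a simple sum of the two population responses.
pooled = [t + f for t, f in zip(target, flanker)]

decoded_single = decode_orientation(target, preferred)
decoded_pooled = decode_orientation(pooled, preferred)
print(round(decoded_single, 3), round(decoded_pooled, 3))
```

In this sketch the pooled estimate falls midway between the two input orientations, which is the qualitative signature the abstract calls compulsory averaging.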
Mihalas, Stefan; Dong, Yi; von der Heydt, Rüdiger; Niebur, Ernst
2011-01-01
Visual attention is often understood as a modulatory field acting at early stages of processing, but the mechanisms that direct and fit the field to the attended object are not known. We show that a purely spatial attention field propagating downward in the neuronal network responsible for perceptual organization will be reshaped, repositioned, and sharpened to match the object's shape and scale. Key features of the model are grouping neurons integrating local features into coherent tentative objects, excitatory feedback to the same local feature neurons that caused grouping neuron activation, and inhibition between incompatible interpretations both at the local feature level and at the object representation level. PMID:21502489
Threat captures attention but does not affect learning of contextual regularities.
Yamaguchi, Motonori; Harwood, Sarah L
2017-04-01
Some of the stimulus features that guide visual attention are abstract properties of objects such as potential threat to one's survival, whereas others are complex configurations such as visual contexts that are learned through past experiences. The present study investigated the two functions that guide visual attention, threat detection and learning of contextual regularities, in visual search. Search arrays contained images of threat and non-threat objects, and their locations were fixed on some trials but random on other trials. Although they were irrelevant to the visual search task, threat objects facilitated attention capture and impaired attention disengagement. Search time improved for fixed configurations more than for random configurations, reflecting learning of visual contexts. Nevertheless, threat detection had little influence on learning of the contextual regularities. The results suggest that factors guiding visual attention are different from factors that influence learning to guide visual attention.
Barrier Effects in Non-retinotopic Feature Attribution
Aydin, Murat; Herzog, Michael H.; Öğmen, Haluk
2011-01-01
When objects move in the environment, their retinal images can undergo drastic changes and features of different objects can be inter-mixed in the retinal image. Notwithstanding these changes and ambiguities, the visual system is capable of correctly establishing feature-object relationships as well as maintaining individual identities of objects through space and time. Recently, by using a Ternus-Pikler display, we have shown that perceived motion correspondences serve as the medium for non-retinotopic attribution of features to objects. The purpose of the work reported in this manuscript was to assess whether perceived motion correspondences provide a sufficient condition for feature attribution. Our results show that the introduction of a static “barrier” stimulus can interfere with the feature attribution process. Our results also indicate that the barrier stops feature attribution based on interferences related to the feature attribution process itself rather than on mechanisms related to perceived motion. PMID:21767561
The representation of semantic knowledge in a child with Williams syndrome.
Robinson, Sally J; Temple, Christine M
2009-05-01
This study investigated whether there are distinct types of semantic knowledge with distinct representational bases during development. The representation of semantic knowledge in a teenage child (S.T.) with Williams syndrome was explored for the categories of animals, fruit, and vegetables, manipulable objects, and nonmanipulable objects. S.T.'s lexical stores were of a normal size but the volume of "sensory feature" semantic knowledge she generated in oral descriptions was reduced. In visual recognition decisions, S.T. made more false positives to nonitems than did controls. Although overall naming of pictures was unimpaired, S.T. exhibited a category-specific anomia for nonmanipulable objects and impaired naming of visual-feature descriptions of animals. S.T.'s performance was interpreted as reflecting the impaired integration of distinctive features from perceptual input, which may impact upon nonmanipulable objects to a greater extent than the other knowledge categories. Performance was used to inform adult-based models of semantic representation, with category structure proposed to emerge due to differing degrees of dependency upon underlying knowledge types, feature correlations, and the acquisition of information from modality-specific processing modules.
Accessibility limits recall from visual working memory.
Rajsic, Jason; Swan, Garrett; Wilson, Daryl E; Pratt, Jay
2017-09-01
In this article, we demonstrate limitations of accessibility of information in visual working memory (VWM). Recently, cued-recall has been used to estimate the fidelity of information in VWM, where the feature of a cued object is reproduced from memory (Bays, Catalao, & Husain, 2009; Wilken & Ma, 2004; Zhang & Luck, 2008). Response error in these tasks has been largely studied with respect to failures of encoding and maintenance; however, the retrieval operations used in these tasks remain poorly understood. By varying the number and type of object features provided as a cue in a visual delayed-estimation paradigm, we directly assess the nature of retrieval errors in delayed estimation from VWM. Our results demonstrate that providing additional object features in a single cue reliably improves recall, largely by reducing swap, or misbinding, responses. In addition, performance simulations using the binding pool model (Swan & Wyble, 2014) were able to mimic this pattern of performance across a large span of parameter combinations, demonstrating that the binding pool provides a possible mechanism underlying this pattern of results that is not merely a symptom of one particular parametrization. We conclude that accessing visual working memory is a noisy process, and can lead to errors over and above those of encoding and maintenance limitations. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Creating Concepts from Converging Features in Human Cortex
Coutanche, Marc N.; Thompson-Schill, Sharon L.
2015-01-01
To make sense of the world around us, our brain must remember the overlapping features of millions of objects. Crucially, it must also represent each object's unique feature-convergence. Some theories propose that an integration area (or “convergence zone”) binds together separate features. We report an investigation of our knowledge of objects' features and identity, and the link between them. We used functional magnetic resonance imaging to record neural activity, as humans attempted to detect a cued fruit or vegetable in visual noise. Crucially, we analyzed brain activity before a fruit or vegetable was present, allowing us to interrogate top-down activity. We found that pattern-classification algorithms could be used to decode the detection target's identity in the left anterior temporal lobe (ATL), its shape in lateral occipital cortex, and its color in right V4. A novel decoding-dependency analysis revealed that identity information in left ATL was specifically predicted by the temporal convergence of shape and color codes in early visual regions. People with stronger feature-and-identity dependencies had more similar top-down and bottom-up activity patterns. These results fulfill three key requirements for a neural convergence zone: a convergence result (object identity), ingredients (color and shape), and the link between them. PMID:24692512
The effect of category learning on attentional modulation of visual cortex.
Folstein, Jonathan R; Fuller, Kelly; Howard, Dorothy; DePatie, Thomas
2017-09-01
Learning about visual object categories causes changes in the way we perceive those objects. One likely mechanism by which this occurs is the application of attention to potentially relevant objects. Here we test the hypothesis that category membership influences the allocation of attention, allowing attention to be applied not only to object features, but to entire categories. Participants briefly learned to categorize a set of novel cartoon animals after which EEG was recorded while participants distinguished between a target and non-target category. A second identical EEG session was conducted after two sessions of categorization practice. The category structure and task design allowed parametric manipulation of number of target features while holding feature frequency and category membership constant. We found no evidence that category membership influenced attentional selection: a postero-lateral negative component, labeled the selection negativity/N250, increased over time and was sensitive to number of target features, not target categories. In contrast, the right hemisphere N170 was not sensitive to target features. The P300 appeared sensitive to category in the first session, but showed a graded sensitivity to number of target features in the second session, possibly suggesting a transition from rule-based to similarity based categorization. Copyright © 2017. Published by Elsevier Ltd.
Cronly-Dillon, J; Persaud, K; Gregory, R P
1999-01-01
This study demonstrates the ability of blind (previously sighted) and blindfolded (sighted) subjects in reconstructing and identifying a number of visual targets transformed into equivalent musical representations. Visual images are deconstructed through a process which selectively segregates different features of the image into separate packages. These are then encoded in sound and presented as a polyphonic musical melody which resembles a Baroque fugue with many voices, allowing subjects to analyse the component voices selectively in combination, or separately in sequence, in a manner which allows a subject to patch together and bind the different features of the object mentally into a mental percept of a single recognizable entity. The visual targets used in this study included a variety of geometrical figures, simple high-contrast line drawings of man-made objects, natural and urban scenes, etc., translated into sound and presented to the subject in polyphonic musical form. PMID:10643086
Can responses to basic non-numerical visual features explain neural numerosity responses?
Harvey, Ben M; Dumoulin, Serge O
2017-04-01
Humans and many animals can distinguish between stimuli that differ in numerosity, the number of objects in a set. Human and macaque parietal lobes contain neurons that respond to changes in stimulus numerosity. However, basic non-numerical visual features can affect neural responses to and perception of numerosity, and visual features often co-vary with numerosity. Therefore, it is debated whether numerosity or co-varying low-level visual features underlie neural and behavioral responses to numerosity. To test the hypothesis that non-numerical visual features underlie neural numerosity responses in a human parietal numerosity map, we analyze responses to a group of numerosity stimulus configurations that have the same numerosity progression but vary considerably in their non-numerical visual features. Using ultra-high-field (7T) fMRI, we measure responses to these stimulus configurations in an area of posterior parietal cortex whose responses are believed to reflect numerosity-selective activity. We describe an fMRI analysis method to distinguish between alternative models of neural response functions, following a population receptive field (pRF) modeling approach. For each stimulus configuration, we first quantify the relationships between numerosity and several non-numerical visual features that have been proposed to underlie performance in numerosity discrimination tasks. We then determine how well responses to these non-numerical visual features predict the observed fMRI responses, and compare this to the predictions of responses to numerosity. We demonstrate that a numerosity response model predicts observed responses more accurately than models of responses to simple non-numerical visual features. As such, neural responses in cognitive processing need not reflect simpler properties of early sensory inputs. Copyright © 2017 Elsevier Inc. All rights reserved.
A visual tracking method based on deep learning without online model updating
NASA Astrophysics Data System (ADS)
Tang, Cong; Wang, Yicheng; Feng, Yunsong; Zheng, Chao; Jin, Wei
2018-02-01
The paper proposes a visual tracking method based on deep learning without online model updating. Given the advantages of deep learning in feature representation, the deep model SSD (Single Shot Multibox Detector) is used as the object extractor in the tracking model. At the same time, the color histogram feature and the HOG (Histogram of Oriented Gradients) feature are combined to select the tracking object. During tracking, a multi-scale object search map is built to improve both the detection performance of the deep detection model and the tracking efficiency. In experiments on eight tracking video sequences from the baseline dataset, compared with six state-of-the-art methods, the proposed method is more robust to challenging tracking factors such as deformation, scale variation, rotation variation, illumination variation, and background clutter. Its overall performance is also better than that of the other six tracking methods.
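To make the color histogram step in the abstract above concrete, here is a minimal sketch of choosing among candidate detections by comparing color histograms with a template. The abstract does not say how the histograms are binned or compared, or how they are combined with HOG, so the joint RGB binning, the histogram-intersection score, and all pixel data below are assumptions for illustration only. The HOG part is not shown.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram; pixels is a list of (r, g, b) in 0..255."""
    step = 256 // bins_per_channel
    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[idx] += 1.0
    total = sum(hist)
    return [h / total for h in hist]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Hypothetical template and candidate patches, given as flat pixel lists.
template = [(250, 10, 10)] * 8 + [(10, 10, 250)] * 2
candidates = {
    "a": [(240, 20, 20)] * 7 + [(20, 20, 240)] * 3,  # mostly red, like the template
    "b": [(10, 250, 10)] * 10,                       # all green
}

template_hist = color_histogram(template)
scores = {
    name: histogram_intersection(template_hist, color_histogram(px))
    for name, px in candidates.items()
}
best = max(scores, key=scores.get)
print(best, scores)
```

In a tracker of the kind described, the candidate patches would come from the detector's bounding boxes rather than from hand-written lists.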
Size-Sensitive Perceptual Representations Underlie Visual and Haptic Object Recognition
Craddock, Matt; Lawson, Rebecca
2009-01-01
A variety of similarities between visual and haptic object recognition suggests that the two modalities may share common representations. However, it is unclear whether such common representations preserve low-level perceptual features or whether transfer between vision and haptics is mediated by high-level, abstract representations. Two experiments used a sequential shape-matching task to examine the effects of size changes on unimodal and crossmodal visual and haptic object recognition. Participants felt or saw 3D plastic models of familiar objects. The two objects presented on a trial were either the same size or different sizes and were the same shape or different but similar shapes. Participants were told to ignore size changes and to match on shape alone. In Experiment 1, size changes on same-shape trials impaired performance similarly for both visual-to-visual and haptic-to-haptic shape matching. In Experiment 2, size changes impaired performance on both visual-to-haptic and haptic-to-visual shape matching and there was no interaction between the cost of size changes and direction of transfer. Together the unimodal and crossmodal matching results suggest that the same, size-specific perceptual representations underlie both visual and haptic object recognition, and indicate that crossmodal memory for objects must be at least partly based on common perceptual representations. PMID:19956685
Cant, Jonathan S; Xu, Yaoda
2017-02-01
Our visual system can extract summary statistics from large collections of objects without forming detailed representations of the individual objects in the ensemble. In a region in ventral visual cortex encompassing the collateral sulcus and the parahippocampal gyrus and overlapping extensively with the scene-selective parahippocampal place area (PPA), we have previously reported fMRI adaptation to object ensembles when ensemble statistics repeated, even when local image features differed across images (e.g., two different images of the same strawberry pile). We additionally showed that this ensemble representation is similar to (but still distinct from) how visual texture patterns are processed in this region and is not explained by appealing to differences in the color of the elements that make up the ensemble. To further explore the nature of ensemble representation in this brain region, here we used PPA as our ROI and investigated in detail how the shape and surface properties (i.e., both texture and color) of the individual objects constituting an ensemble affect the ensemble representation in anterior-medial ventral visual cortex. We photographed object ensembles of stone beads that varied in shape and surface properties. A given ensemble always contained beads of the same shape and surface properties (e.g., an ensemble of star-shaped rose quartz beads). A change to the shape and/or surface properties of all the beads in an ensemble resulted in a significant release from adaptation in PPA compared with conditions in which no ensemble feature changed. In contrast, in the object-sensitive lateral occipital area (LO), we only observed a significant release from adaptation when the shape of the ensemble elements varied, and found no significant results in additional scene-sensitive regions, namely, the retrosplenial complex and occipital place area. 
Together, these results demonstrate that the shape and surface properties of the individual objects comprising an ensemble both contribute significantly to object ensemble representation in anterior-medial ventral visual cortex and further demonstrate a functional dissociation between object- (LO) and scene-selective (PPA) visual cortical regions and within the broader scene-processing network itself.
ERIC Educational Resources Information Center
Bartko, Susan J.; Winters, Boyer D.; Cowell, Rosemary A.; Saksida, Lisa M.; Bussey, Timothy J.
2007-01-01
The perirhinal cortex (PRh) has a well-established role in object recognition memory. More recent studies suggest that PRh is also important for two-choice visual discrimination tasks. Specifically, it has been suggested that PRh contains conjunctive representations that help resolve feature ambiguity, which occurs when a task cannot easily be…
Interpretation of the function of the striate cortex
NASA Astrophysics Data System (ADS)
Garner, Bernardette M.; Paplinski, Andrew P.
2000-04-01
Biological neural networks do not require retraining every time objects move in the visual field. Conventional computer neural networks do not share this shift-invariance. The brain compensates for movements of the head, body, eyes and objects by allowing the sensory data to be tracked across the visual field. The neurons in the striate cortex respond to objects moving across the field of vision, as seen in many experiments. It is proposed that the neurons in the striate cortex allow the continuous angle changes needed to compensate for changes in orientation of the head and eyes and for the motion of objects in the field of vision. It is hypothesized that the neurons in the striate cortex form a system that allows for the translation, some rotation and scaling of objects and provides a continuity of objects as they move relative to other objects. The neurons in the striate cortex respond to features which are fundamental to sight, such as orientation of lines, direction of motion, color and contrast. The neurons that respond to these features are arranged on the cortex in a way that depends on the features they are responding to and on the area of the retina from which they receive their inputs.
Levichkina, Ekaterina; Saalmann, Yuri B; Vidyasagar, Trichur R
2017-03-01
Primate posterior parietal cortex (PPC) is known to be involved in controlling spatial attention. Neurons in one part of the PPC, the lateral intraparietal area (LIP), show enhanced responses to objects at attended locations. Although many are selective for object features, such as the orientation of a visual stimulus, it is not clear how LIP circuits integrate feature-selective information when providing attentional feedback about behaviorally relevant locations to the visual cortex. We studied the relationship between object feature and spatial attention properties of LIP cells in two macaques by measuring the cells' orientation selectivity and the degree of attentional enhancement while performing a delayed match-to-sample task. Monkeys had to match both the location and orientation of two visual gratings presented separately in time. We found a wide range in orientation selectivity and degree of attentional enhancement among LIP neurons. However, cells with significant attentional enhancement had much less orientation selectivity in their response than cells which showed no significant modulation by attention. Additionally, orientation-selective cells showed working memory activity for their preferred orientation, whereas cells showing attentional enhancement also synchronized with local neuronal activity. These results are consistent with models of selective attention incorporating two stages, where an initial feature-selective process guides a second stage of focal spatial attention. We suggest that LIP contributes to both stages, where the first stage involves orientation-selective LIP cells that support working memory of the relevant feature, and the second stage involves attention-enhanced LIP cells that synchronize to provide feedback on spatial priorities. © 2017 The Authors. Physiological Reports published by Wiley Periodicals, Inc. on behalf of The Physiological Society and the American Physiological Society.
Fox, Olivia M.; Harel, Assaf; Bennett, Kevin B.
2017-01-01
The perception of a visual stimulus is dependent not only upon local features, but also on the arrangement of those features. When stimulus features are perceptually well organized (e.g., symmetric or parallel), a global configuration with a high degree of salience emerges from the interactions between these features, often referred to as emergent features. Emergent features can be demonstrated in the Configural Superiority Effect (CSE): presenting a stimulus within an organized context relative to its presentation in a disarranged one results in better performance. Prior neuroimaging work on the perception of emergent features regards the CSE as an “all or none” phenomenon, focusing on the contrast between configural and non-configural stimuli. However, it is still not clear how emergent features are processed between these two endpoints. The current study examined the extent to which behavioral and neuroimaging markers of emergent features are responsive to the degree of configurality in visual displays. Subjects were tasked with reporting the anomalous quadrant in a visual search task while being scanned. Degree of configurality was manipulated by incrementally varying the rotational angle of low-level features within the stimulus arrays. Behaviorally, we observed faster response times with increasing levels of configurality. These behavioral changes were accompanied by increases in response magnitude across multiple visual areas in occipito-temporal cortex, primarily early visual cortex and object-selective cortex. Our findings suggest that the neural correlates of emergent features can be observed even in response to stimuli that are not fully configural, and demonstrate that configural information is already present at early stages of the visual hierarchy. PMID:28167924
Takegata, Rika; Brattico, Elvira; Tervaniemi, Mari; Varyagina, Olga; Näätänen, Risto; Winkler, István
2005-09-01
The role of attention in conjoining features of an object has been a topic of much debate. Studies using the mismatch negativity (MMN), an index of detecting acoustic deviance, suggested that the conjunctions of auditory features are preattentively represented in the brain. These studies, however, used sequentially presented sounds and thus are not directly comparable with visual studies of feature integration. Therefore, the current study presented an array of spatially distributed sounds to determine whether the auditory features of concurrent sounds are correctly conjoined without focal attention directed to the sounds. Two types of sounds differing from each other in timbre and pitch were repeatedly presented together while subjects were engaged in a visual n-back working-memory task and ignored the sounds. Occasional reversals of the frequent pitch-timbre combinations elicited MMNs of a very similar amplitude and latency irrespective of the task load. This result suggested preattentive integration of auditory features. However, performance in a subsequent target-search task with the same stimuli indicated the occurrence of illusory conjunctions. The discrepancy between the results obtained with and without focal attention suggests that illusory conjunctions may occur during voluntary access to the preattentively encoded object representations.
Mark Tracking: Position/orientation measurements using 4-circle mark and its tracking experiments
NASA Technical Reports Server (NTRS)
Kanda, Shinji; Okabayashi, Keijyu; Maruyama, Tsugito; Uchiyama, Takashi
1994-01-01
Future space robots require position and orientation tracking with visual feedback control to track and capture floating objects and satellites. We developed a four-circle mark that is useful for this purpose. With this mark, four geometric center positions as feature points can be extracted from the mark by simple image processing. We also developed a position and orientation measurement method that uses the four feature points in our mark. The mark gave good enough image measurement accuracy to let space robots approach and contact objects. A visual feedback control system using this mark enabled a robot arm to track a target object accurately. The control system was able to tolerate a time delay of 2 seconds.
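The abstract above says that four geometric centers can be extracted from the mark by simple image processing. The sketch below shows one way such centers could be obtained from a thresholded image: label connected foreground blobs and take each blob's centroid. The binary grid, the 4-connectivity rule, and the function name are hypothetical and do not come from the report, which does not describe its image-processing steps here.

```python
def blob_centroids(image):
    """Return (row, col) centroids of 4-connected foreground blobs in a binary grid."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not seen[r][c]:
                # Flood-fill one blob with an explicit stack.
                stack, pixels = [(r, c)], []
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    pixels.append((y, x))
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                n = len(pixels)
                centroids.append((sum(p[0] for p in pixels) / n,
                                  sum(p[1] for p in pixels) / n))
    return centroids

# Hypothetical 7x7 thresholded image with four small square "circles".
grid = [[0] * 7 for _ in range(7)]
for r0, c0 in ((1, 1), (1, 4), (4, 1), (4, 4)):
    for dr in (0, 1):
        for dc in (0, 1):
            grid[r0 + dr][c0 + dc] = 1

centers = blob_centroids(grid)
# Mean of the four feature points as a simple estimate of the mark's image position.
mark_center = (sum(p[0] for p in centers) / len(centers),
               sum(p[1] for p in centers) / len(centers))
print(centers, mark_center)
```

Recovering full 3D position and orientation from the four points would additionally need the mark geometry and a camera model, which this sketch does not attempt.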
Early Visual Cortex Dynamics during Top-Down Modulated Shifts of Feature-Selective Attention.
Müller, Matthias M; Trautmann, Mireille; Keitel, Christian
2016-04-01
Shifting attention from one color to another color or from color to another feature dimension such as shape or orientation is imperative when searching for a certain object in a cluttered scene. Most attention models that emphasize feature-based selection implicitly assume that all shifts in feature-selective attention underlie identical temporal dynamics. Here, we recorded time courses of behavioral data and steady-state visual evoked potentials (SSVEPs), an objective electrophysiological measure of neural dynamics in early visual cortex to investigate temporal dynamics when participants shifted attention from color or orientation toward color or orientation, respectively. SSVEPs were elicited by four random dot kinematograms that flickered at different frequencies. Each random dot kinematogram was composed of dashes that uniquely combined two features from the dimensions color (red or blue) and orientation (slash or backslash). Participants were cued to attend to one feature (such as color or orientation) and respond to coherent motion targets of the to-be-attended feature. We found that shifts toward color occurred earlier after the shifting cue compared with shifts toward orientation, regardless of the original feature (i.e., color or orientation). This was paralleled in SSVEP amplitude modulations as well as in the time course of behavioral data. Overall, our results suggest different neural dynamics during shifts of attention from color and orientation and the respective shifting destinations, namely, either toward color or toward orientation.
Oculomotor guidance and capture by irrelevant faces.
Devue, Christel; Belopolsky, Artem V; Theeuwes, Jan
2012-01-01
Even though it is generally agreed that face stimuli constitute a special class of stimuli, which are treated preferentially by our visual system, it remains unclear whether faces can capture attention in a stimulus-driven manner. Moreover, there is a long-standing debate regarding the mechanism underlying the preferential bias of selecting faces. Some claim that faces constitute a set of special low-level features to which our visual system is tuned; others claim that the visual system is capable of extracting the meaning of faces very rapidly, driving attentional selection. Those debates continue because many studies contain methodological peculiarities and manipulations that prevent a definitive conclusion. Here, we present a new visual search task in which observers had to make a saccade to a uniquely colored circle while completely irrelevant objects were also present in the visual field. The results indicate that faces capture and guide the eyes more than other animated objects and that our visual system is not only tuned to the low-level features that make up a face but also to its meaning.
Relationship between visual binding, reentry and awareness.
Koivisto, Mika; Silvanto, Juha
2011-12-01
Visual feature binding has been suggested to depend on reentrant processing. We addressed the relationship between binding, reentry, and visual awareness by asking the participants to discriminate the color and orientation of a colored bar (presented either alone or simultaneously with a white distractor bar) and to report their phenomenal awareness of the target features. The success of reentry was manipulated with object substitution masking and backward masking. The results showed that late reentrant processes are necessary for successful binding but not for phenomenal awareness of the bound features. Binding errors were accompanied by phenomenal awareness of the misbound feature conjunctions, demonstrating that they were experienced as real properties of the stimuli (i.e., illusory conjunctions). Our results suggest that early preattentive binding and local recurrent processing enable features to reach phenomenal awareness, while later attention-related reentrant iterations modulate the way in which the features are bound and experienced in awareness. Copyright © 2011 Elsevier Inc. All rights reserved.
Takahama, Sachiko; Saiki, Jun
2014-01-01
Information on an object's features bound to its location is very important for maintaining object representations in visual working memory. Interactions with dynamic multi-dimensional objects in an external environment require complex cognitive control, including the selective maintenance of feature-location binding. Here, we used event-related functional magnetic resonance imaging to investigate brain activity and functional connectivity related to the maintenance of complex feature-location binding. Participants were required to detect task-relevant changes in feature-location binding between objects defined by color, orientation, and location. We compared a complex binding task requiring complex feature-location binding (color-orientation-location) with a simple binding task in which simple feature-location binding, such as color-location, was task-relevant and the other feature was task-irrelevant. Univariate analyses showed that the dorsolateral prefrontal cortex (DLPFC), hippocampus, and frontoparietal network were activated during the maintenance of complex feature-location binding. Functional connectivity analyses indicated cooperation between the inferior precentral sulcus (infPreCS), DLPFC, and hippocampus during the maintenance of complex feature-location binding. In contrast, the connectivity for the spatial updating of simple feature-location binding determined by reanalyzing the data from Takahama et al. (2010) demonstrated that the superior parietal lobule (SPL) cooperated with the DLPFC and hippocampus. These results suggest that the connectivity for complex feature-location binding does not simply reflect general memory load and that the DLPFC and hippocampus flexibly modulate the dorsal frontoparietal network, depending on the task requirements, with the infPreCS involved in the maintenance of complex feature-location binding and the SPL involved in the spatial updating of simple feature-location binding. PMID:24917833
Conceptual Distinctiveness Supports Detailed Visual Long-Term Memory for Real-World Objects
Konkle, Talia; Brady, Timothy F.; Alvarez, George A.; Oliva, Aude
2012-01-01
Humans have a massive capacity to store detailed information in visual long-term memory. The present studies explored the fidelity of these visual long-term memory representations and examined how conceptual and perceptual features of object categories support this capacity. Observers viewed 2,800 object images with a different number of exemplars presented from each category. At test, observers indicated which of 2 exemplars they had previously studied. Memory performance was high and remained quite high (82% accuracy) with 16 exemplars from a category in memory, demonstrating a large memory capacity for object exemplars. However, memory performance decreased as more exemplars were held in memory, implying systematic categorical interference. Object categories with conceptually distinctive exemplars showed less interference in memory as the number of exemplars increased. Interference in memory was not predicted by the perceptual distinctiveness of exemplars from an object category, though these perceptual measures predicted visual search rates for an object target among exemplars. These data provide evidence that observers’ capacity to remember visual information in long-term memory depends more on conceptual structure than perceptual distinctiveness. PMID:20677899
Distinct cognitive mechanisms involved in the processing of single objects and object ensembles
Cant, Jonathan S.; Sun, Sol Z.; Xu, Yaoda
2015-01-01
Behavioral research has demonstrated that the shape and texture of single objects can be processed independently. Similarly, neuroimaging results have shown that an object's shape and texture are processed in distinct brain regions with shape in the lateral occipital area and texture in parahippocampal cortex. Meanwhile, objects are not always seen in isolation and are often grouped together as an ensemble. We recently showed that the processing of ensembles also involves parahippocampal cortex and that the shape and texture of ensemble elements are processed together within this region. These neural data suggest that the independence seen between shape and texture in single-object perception would not be observed in object-ensemble perception. Here we tested this prediction by examining whether observers could attend to the shape of ensemble elements while ignoring changes in an unattended texture feature and vice versa. Across six behavioral experiments, we replicated previous findings of independence between shape and texture in single-object perception. In contrast, we observed that changes in an unattended ensemble feature negatively impacted the processing of an attended ensemble feature only when ensemble features were attended globally. When they were attended locally, thereby making ensemble processing similar to single-object processing, interference was abolished. Overall, these findings confirm previous neuroimaging results and suggest that distinct cognitive mechanisms may be involved in single-object and object-ensemble perception. Additionally, they show that the scope of visual attention plays a critical role in determining which type of object processing (ensemble or single object) is engaged by the visual system. PMID:26360156
ERIC Educational Resources Information Center
Knutson, Ashley R.; Hopkins, Ramona O.; Squire, Larry R.
2013-01-01
We tested proposals that medial temporal lobe (MTL) structures support not just memory but certain kinds of visual perception as well. Patients with hippocampal lesions or larger MTL lesions attempted to identify the unique object among twin pairs of objects that had a high degree of feature overlap. Patients were markedly impaired under the more…
Ptak, Radek; Lazeyras, François; Di Pietro, Marie; Schnider, Armin; Simon, Stéphane R
2014-07-01
Patients with visual object agnosia fail to recognize the identity of visually presented objects despite preserved semantic knowledge. Object agnosia may result from damage to visual cortex lying close to or overlapping with the lateral occipital complex (LOC), a brain region that exhibits selectivity to the shape of visually presented objects. Despite this anatomical overlap the relationship between shape processing in the LOC and shape representations in object agnosia is unknown. We studied a patient with object agnosia following isolated damage to the left occipito-temporal cortex overlapping with the LOC. The patient showed intact processing of object structure, yet often made identification errors that were mainly based on the global visual similarity between objects. Using functional Magnetic Resonance Imaging (fMRI) we found that the damaged as well as the contralateral, structurally intact right LOC failed to show any object-selective fMRI activity, though the latter retained selectivity for faces. Thus, unilateral damage to the left LOC led to a bilateral breakdown of neural responses to a specific stimulus class (objects and artefacts) while preserving the response to a different stimulus class (faces). These findings indicate that representations of structure necessary for the identification of objects crucially rely on bilateral, distributed coding of shape features. Copyright © 2014 Elsevier Ltd. All rights reserved.
Determining the orientation of depth-rotated familiar objects.
Niimi, Ryosuke; Yokosawa, Kazuhiko
2008-02-01
How does the human visual system determine the depth-orientation of familiar objects? We examined reaction times and errors in the detection of 15° differences in the depth orientations of two simultaneously presented familiar objects, which were the same objects (Experiment 1) or different objects (Experiment 2). Detection of orientation differences was best for 0° (front) and 180° (back), while 45° and 135° yielded poorer results, and 90° (side) showed intermediate results, suggesting that the visual system is tuned for front, side and back orientations. We further found that those advantages are due to orientation-specific features such as horizontal linear contours and symmetry, since the 90° advantage was absent for objects with curvilinear contours, and asymmetric objects diminished the 0° and 180° advantages. We conclude that the efficiency of visually determining object orientation is highly orientation-dependent, and object orientation may be perceived in favor of front-back axes.
ERIC Educational Resources Information Center
Flombaum, Jonathan I.; Scholl, Brian J.
2006-01-01
Meaningful visual experience requires computations that identify objects as the same persisting individuals over time, motion, occlusion, and featural change. This article explores these computations in the tunnel effect: When an object moves behind an occluder, and then an object later emerges following a consistent trajectory, observers…
Nawroth, Christian; Prentice, Pamela M; McElligott, Alan G
2017-01-01
Variation in common personality traits, such as boldness or exploration, is often associated with risk-reward trade-offs and behavioural flexibility. To date, only a few studies have examined the effects of consistent behavioural traits on both learning and cognition. We investigated whether certain personality traits ('exploration' and 'sociability') of individuals were related to cognitive performance, learning flexibility and learning style in a social ungulate species, the goat (Capra hircus). We also investigated whether a preference for feature cues rather than impaired learning abilities can explain performance variation in a visual discrimination task. We found that personality scores were consistent across time and context. Less explorative goats performed better in a non-associative cognitive task, in which subjects had to follow the trajectory of a hidden object (i.e. testing their ability for object permanence). We also found that less sociable subjects performed better compared to more sociable goats in a visual discrimination task. Good visual learning performance was associated with a preference for feature cues, indicating personality-dependent learning strategies in goats. Our results suggest that personality traits predict the outcome in visual discrimination and non-associative cognitive tasks in goats and that impaired performance in a visual discrimination tasks does not necessarily imply impaired learning capacities, but rather can be explained by a varying preference for feature cues. Copyright © 2016 Elsevier B.V. All rights reserved.
Feature and Region Selection for Visual Learning.
Zhao, Ji; Wang, Liantao; Cabral, Ricardo; De la Torre, Fernando
2016-03-01
Visual learning problems, such as object classification and action recognition, are typically approached using extensions of the popular bag-of-words (BoWs) model. Despite its great success, it is unclear what visual features the BoW model is learning. Which regions in the image or video are used to discriminate among classes? Which are the most discriminative visual words? Answering these questions is fundamental for understanding existing BoW models and inspiring better models for visual recognition. To answer these questions, this paper presents a method for feature selection and region selection in the visual BoW model. This allows for an intermediate visualization of the features and regions that are important for visual learning. The main idea is to assign latent weights to the features or regions, and jointly optimize these latent variables with the parameters of a classifier (e.g., support vector machine). There are four main benefits of our approach: 1) our approach accommodates non-linear additive kernels, such as the popular χ² and intersection kernel; 2) our approach is able to handle both regions in images and spatio-temporal regions in videos in a unified way; 3) the feature selection problem is convex, and both problems can be solved using a scalable reduced gradient method; and 4) we point out strong connections with multiple kernel learning and multiple instance learning approaches. Experimental results in the PASCAL VOC 2007, MSR Action Dataset II and YouTube illustrate the benefits of our approach.
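The additive kernels named in the abstract above can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of the additive χ² kernel and the histogram intersection kernel on bag-of-words histograms. The function names, the toy histograms, and the per-word weighting in `weighted_chi2_kernel` are assumptions made for illustration only; the abstract does not specify how the authors parameterize or optimize their latent weights, so this should not be read as their method.

```python
def chi2_kernel(x, y):
    """Additive chi-squared kernel: sum_i 2 * x_i * y_i / (x_i + y_i)."""
    # Bins that are empty in both histograms contribute nothing.
    return sum(2.0 * a * b / (a + b) for a, b in zip(x, y) if a + b > 0)


def intersection_kernel(x, y):
    """Histogram intersection kernel: sum_i min(x_i, y_i)."""
    return sum(min(a, b) for a, b in zip(x, y))


def weighted_chi2_kernel(x, y, weights):
    """Hypothetical weighted variant: one non-negative weight per visual word."""
    return sum(
        w * 2.0 * a * b / (a + b)
        for a, b, w in zip(x, y, weights)
        if a + b > 0
    )


# Two made-up, L1-normalised bag-of-words histograms over three visual words.
h = [0.5, 0.25, 0.25]
g = [0.25, 0.25, 0.5]

print(chi2_kernel(h, g))
print(intersection_kernel(h, g))
# Setting a weight to zero removes that visual word from the comparison.
print(weighted_chi2_kernel(h, g, [1.0, 0.0, 1.0]))
```

Both kernels are sums over histogram bins, which is what makes per-feature weighting of the kind sketched here straightforward to express.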
Harris, Joseph A; Donohue, Sarah E; Schoenfeld, Mircea A; Hopf, Jens-Max; Heinze, Hans-Jochen; Woldorff, Marty G
2016-08-15
Reward-associated visual features have been shown to capture visual attention, evidenced in faster and more accurate behavioral performance, as well as in neural responses reflecting lateralized shifts of visual attention to those features. Specifically, the contralateral N2pc event-related-potential (ERP) component that reflects attentional shifting exhibits increased amplitude in response to task-relevant targets containing a reward-associated feature. In the present study, we examined the automaticity of such reward-association effects using object-substitution masking (OSM) in conjunction with MEG measures of visual attentional shifts. In OSM, a visual-search array is presented, with the target item to be detected indicated by a surrounding mask (here, four surrounding squares). Delaying the offset of the target-surrounding four-dot mask relative to the offset of the rest of the target/distracter array disrupts the viewer's awareness of the target (masked condition), whereas simultaneous offsets do not (unmasked condition). Here we manipulated whether the color of the OSM target was or was not of a previously reward-associated color. By tracking reward-associated enhancements of behavior and the N2pc in response to masked targets containing a previously rewarded or unrewarded feature, the automaticity of attentional capture by reward could be probed. We found an enhanced N2pc response to targets containing a previously reward-associated color feature. Moreover, this enhancement of the N2pc by reward did not differ between masking conditions, nor did it differ as a function of the apparent visibility of the target within the masked condition. Overall, these results underscore the automaticity of attentional capture by reward-associated features, and demonstrate the ability of feature-based reward associations to shape attentional capture and allocation outside of perceptual awareness. Copyright © 2016 Elsevier Inc. All rights reserved.
Attention is required for maintenance of feature binding in visual working memory
Heider, Maike; Husain, Masud
2013-01-01
Working memory and attention are intimately connected. However, understanding the relationship between the two is challenging. Currently, there is an important controversy about whether objects in working memory are maintained automatically or require resources that are also deployed for visual or auditory attention. Here we investigated the effects of loading attention resources on precision of visual working memory, specifically on correct maintenance of feature-bound objects, using a dual-task paradigm. Participants were presented with a memory array and were asked to remember either direction of motion of random dot kinematograms of different colour, or orientation of coloured bars. During the maintenance period, they performed a secondary visual or auditory task, with varying levels of load. Following a retention period, they adjusted a coloured probe to match either the motion direction or orientation of stimuli with the same colour in the memory array. This allowed us to examine the effects of an attention-demanding task performed during maintenance on precision of recall on the concurrent working memory task. Systematic increase in attention load during maintenance resulted in a significant decrease in overall working memory performance. Changes in overall performance were specifically accompanied by an increase in feature misbinding errors: erroneous reporting of nontarget motion or orientation. Thus in trials where attention resources were taxed, participants were more likely to respond with nontarget values rather than simply making random responses. Our findings suggest that resources used during attention-demanding visual or auditory tasks also contribute to maintaining feature-bound representations in visual working memory—but not necessarily other aspects of working memory. PMID:24266343
Attention is required for maintenance of feature binding in visual working memory.
Zokaei, Nahid; Heider, Maike; Husain, Masud
2014-01-01
Working memory and attention are intimately connected. However, understanding the relationship between the two is challenging. Currently, there is an important controversy about whether objects in working memory are maintained automatically or require resources that are also deployed for visual or auditory attention. Here we investigated the effects of loading attention resources on precision of visual working memory, specifically on correct maintenance of feature-bound objects, using a dual-task paradigm. Participants were presented with a memory array and were asked to remember either direction of motion of random dot kinematograms of different colour, or orientation of coloured bars. During the maintenance period, they performed a secondary visual or auditory task, with varying levels of load. Following a retention period, they adjusted a coloured probe to match either the motion direction or orientation of stimuli with the same colour in the memory array. This allowed us to examine the effects of an attention-demanding task performed during maintenance on precision of recall on the concurrent working memory task. Systematic increase in attention load during maintenance resulted in a significant decrease in overall working memory performance. Changes in overall performance were specifically accompanied by an increase in feature misbinding errors: erroneous reporting of nontarget motion or orientation. Thus in trials where attention resources were taxed, participants were more likely to respond with nontarget values rather than simply making random responses. Our findings suggest that resources used during attention-demanding visual or auditory tasks also contribute to maintaining feature-bound representations in visual working memory, but not necessarily other aspects of working memory.
Visual short-term memory for oriented, colored objects.
Shin, Hongsup; Ma, Wei Ji
2017-08-01
A central question in the study of visual short-term memory (VSTM) has been whether its basic units are objects or features. Most studies addressing this question have used change detection tasks in which the feature value before the change is highly discriminable from the feature value after the change. This approach assumes that memory noise is negligible, which recent work has shown not to be the case. Here, we investigate VSTM for orientation and color within a noisy-memory framework, using change localization with a variable magnitude of change. A specific consequence of the noise is that it is necessary to model the inference (decision) stage. We find that (a) orientation and color have independent pools of memory resource (consistent with classic results); (b) an irrelevant feature dimension is either encoded but ignored during decision-making, or encoded with low precision and taken into account during decision-making; and (c) total resource available in a given feature dimension is lower in the presence of task-relevant stimuli that are neutral in that feature dimension. We propose a framework in which feature resource comes both in packaged and in targeted form.
Unconscious Familiarity-based Color-Form Binding: Evidence from Visual Extinction.
Rappaport, Sarah J; Riddoch, M Jane; Chechlacz, Magda; Humphreys, Glyn W
2016-03-01
There is good evidence that early visual processing involves the coding of different features in independent brain regions. A major question, then, is how we see the world in an integrated manner, in which the different features are "bound" together. A standard account of this has been that feature binding depends on attention to the stimulus, which enables only the relevant features to be linked together [Treisman, A., & Gelade, G. A feature-integration theory of attention. Cognitive Psychology, 12, 97-136, 1980]. Here we test this influential idea by examining whether, in patients showing visual extinction, the processing of otherwise unconscious (extinguished) stimuli is modulated by presenting objects in their correct (familiar) color. Correctly colored objects showed reduced extinction when they had a learned color, and this color matched across the ipsi- and contralesional items (red strawberry + red tomato). In contrast, there was no reduction in extinction under the same conditions when the stimuli were colored incorrectly (blue strawberry + blue tomato; Experiment 1). The result was not due to the speeded identification of a correctly colored ipsilesional item, as there was no benefit from having correctly colored objects in different colors (red strawberry + yellow lemon; Experiment 2). There was also no benefit to extinction from presenting the correct colors in the background of each item (Experiment 3). The data suggest that learned color-form binding can reduce extinction even when color is irrelevant for the task. The result is consistent with preattentive binding of color and shape for familiar stimuli.
Bankson, B B; Hebart, M N; Groen, I I A; Baker, C I
2018-05-17
Visual object representations are commonly thought to emerge rapidly, yet it has remained unclear to what extent early brain responses reflect purely low-level visual features of these objects and how strongly those features contribute to later categorical or conceptual representations. Here, we aimed to estimate a lower temporal bound for the emergence of conceptual representations by defining two criteria that characterize such representations: 1) conceptual object representations should generalize across different exemplars of the same object, and 2) these representations should reflect high-level behavioral judgments. To test these criteria, we compared magnetoencephalography (MEG) recordings between two groups of participants (n = 16 per group) exposed to different exemplar images of the same object concepts. Further, we disentangled low-level from high-level MEG responses by estimating the unique and shared contribution of models of behavioral judgments, semantics, and different layers of deep neural networks of visual object processing. We find that 1) both generalization across exemplars as well as generalization of object-related signals across time increase after 150 ms, peaking around 230 ms; 2) representations specific to behavioral judgments emerged rapidly, peaking around 160 ms. Collectively, these results suggest a lower bound for the emergence of conceptual object representations around 150 ms following stimulus onset. Copyright © 2018 Elsevier Inc. All rights reserved.
Seeing without knowing: task relevance dissociates between visual awareness and recognition.
Eitam, Baruch; Shoval, Roy; Yeshurun, Yaffa
2015-03-01
We demonstrate that task relevance dissociates between visual awareness and knowledge activation to create a state of seeing without knowing: visual awareness of familiar stimuli without recognizing them. We rely on the fact that in order to experience a Kanizsa illusion, participants must be aware of its inducers. While people can indicate the orientation of the illusory rectangle with great ease (signifying that they have consciously experienced the illusion's inducers), almost 30% of them could not report the inducers' color. Thus, people can see, in the sense of phenomenally experiencing, but not know, in the sense of recognizing what the object is or activating appropriate knowledge about it. Experiment 2 tests whether relevance-based selection operates within objects and shows that, contrary to the pattern of results found with features of different objects in our previous studies and replicated in Experiment 1, selection does not occur when both relevant and irrelevant features belong to the same object. We discuss these findings in relation to the existing theories of consciousness and to attention and inattentional blindness, and the role of cognitive load, object-based attention, and the use of self-reports as measures of awareness. © 2015 New York Academy of Sciences.
Solid object visualization of 3D ultrasound data
NASA Astrophysics Data System (ADS)
Nelson, Thomas R.; Bailey, Michael J.
2000-04-01
Visualization of volumetric medical data is challenging. Rapid-prototyping (RP) equipment producing solid object prototype models of computer generated structures is directly applicable to visualization of medical anatomic data. The purpose of this study was to develop methods for transferring 3D Ultrasound (3DUS) data to RP equipment for visualization of patient anatomy. 3DUS data were acquired using research and clinical scanning systems. Scaling information was preserved and the data were segmented using threshold and local operators to extract features of interest, converted from voxel raster coordinate format to a set of polygons representing an iso-surface and transferred to the RP machine to create a solid 3D object. Fabrication required 30 to 60 minutes depending on object size and complexity. After creation the model could be touched and viewed. A '3D visualization hardcopy device' has advantages for conveying spatial relations compared to visualization using computer display systems. The hardcopy model may be used for teaching or therapy planning. Objects may be produced at the exact dimension of the original object or scaled up (or down) to better match the viewer's reference frame. RP models represent a useful means of communicating important information in a tangible fashion to patients and physicians.
Matsukura, Michi; Vecera, Shaun P
2011-02-01
Attention selects objects as well as locations. When attention selects an object's features, observers identify two features from a single object more accurately than two features from two different objects (object-based effect of attention; e.g., Duncan, Journal of Experimental Psychology: General, 113, 501-517, 1984). Several studies have demonstrated that object-based attention can operate at a late visual processing stage that is independent of objects' spatial information (Awh, Dhaliwal, Christensen, & Matsukura, Psychological Science, 12, 329-334, 2001; Matsukura & Vecera, Psychonomic Bulletin & Review, 16, 529-536, 2009; Vecera, Journal of Experimental Psychology: General, 126, 14-18, 1997; Vecera & Farah, Journal of Experimental Psychology: General, 123, 146-160, 1994). In the present study, we asked two questions regarding this late object-based selection mechanism. In Part I, we investigated how observers' foreknowledge of to-be-reported features allows attention to select objects, as opposed to individual features. Using a feature-report task, a significant object-based effect was observed when to-be-reported features were known in advance but not when this advance knowledge was absent. In Part II, we examined what drives attention to select objects rather than individual features in the absence of observers' foreknowledge of to-be-reported features. Results suggested that, when there was no opportunity for observers to direct their attention to objects that possess to-be-reported features at the time of stimulus presentation, these stimuli must retain strong perceptual cues to establish themselves as separate objects.
The Timing of Visual Object Categorization
Mack, Michael L.; Palmeri, Thomas J.
2011-01-01
An object can be categorized at different levels of abstraction: as natural or man-made, animal or plant, bird or dog, or as a Northern Cardinal or Pyrrhuloxia. There has been growing interest in understanding how quickly categorizations at different levels are made and how the timing of those perceptual decisions changes with experience. We specifically contrast two perspectives on the timing of object categorization at different levels of abstraction. By one account, the relative timing implies a relative timing of stages of visual processing that are tied to particular levels of object categorization: Fast categorizations are fast because they precede other categorizations within the visual processing hierarchy. By another account, the relative timing reflects when perceptual features are available over time and the quality of perceptual evidence used to drive a perceptual decision process: Fast simply means fast, it does not mean first. Understanding the short-term and long-term temporal dynamics of object categorizations is key to developing computational models of visual object recognition. We briefly review a number of models of object categorization and outline how they explain the timing of visual object categorization at different levels of abstraction. PMID:21811480
Visual saliency in MPEG-4 AVC video stream
NASA Astrophysics Data System (ADS)
Ammar, M.; Mitrea, M.; Hasnaoui, M.; Le Callet, P.
2015-03-01
Visual saliency maps have already proved their efficiency in a large variety of image/video communication application fields, ranging from selective compression and channel coding to watermarking. Such saliency maps are generally based on different visual characteristics (such as color, intensity, orientation, and motion) computed from the pixel representation of the visual content. This paper continues and extends our previous work devoted to the definition of a saliency map extracted solely from the MPEG-4 AVC stream syntax elements. The MPEG-4 AVC saliency map thus defined is a fusion of a static and a dynamic map. The static saliency map is in turn a combination of intensity, color and orientation feature maps. Regardless of the particular way in which all these elementary maps are computed, the fusion techniques allowing their combination play a critical role in the final result and are the object of the proposed study. A total of 48 fusion formulas (6 for combining static features and, for each of them, 8 for combining static and dynamic features) are investigated. The performance of the obtained maps is evaluated on a public database organized at IRCCyN, by computing two objective metrics: the Kullback-Leibler divergence and the area under curve.
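The two objective metrics named in this abstract can be illustrated with a minimal sketch. The abstract does not state which maps are compared or in which direction the divergence is taken, so the following assumes a flattened saliency map scored against a human fixation density map (for the Kullback-Leibler divergence) and against binary fixation labels (for the area under the ROC curve). All function names and toy values are hypothetical.

```python
import math


def normalize(values):
    """Scale non-negative values so they sum to 1 (a discrete distribution)."""
    total = sum(values)
    if total <= 0:
        raise ValueError("map must contain positive mass")
    return [v / total for v in values]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two flattened maps; eps guards against log(0)."""
    p, q = normalize(p), normalize(q)
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def auc(saliency, fixated):
    """Area under the ROC curve, computed with the rank (Mann-Whitney) form.

    saliency: flattened saliency values
    fixated:  parallel list of booleans (True where a fixation landed)
    """
    pos = [s for s, f in zip(saliency, fixated) if f]
    neg = [s for s, f in zip(saliency, fixated) if not f]
    if not pos or not neg:
        raise ValueError("need at least one fixated and one non-fixated pixel")
    # Count pairs where the fixated pixel outranks the non-fixated one;
    # ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy 4-pixel example (assumed values, not data from the paper).
saliency_map = [0.1, 0.4, 0.4, 0.1]
fixation_density = [0.0, 0.5, 0.5, 0.0]
fixation_labels = [False, True, True, False]

kl_value = kl_divergence(fixation_density, saliency_map)
auc_value = auc(saliency_map, fixation_labels)
```

A lower divergence and a higher AUC both indicate closer agreement between the saliency map and the fixation data under these assumptions.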
Capacity for Visual Features in Mental Rotation.
Xu, Yangqing; Franconeri, Steven L
2015-08-01
Although mental rotation is a core component of scientific reasoning, little is known about its underlying mechanisms. For instance, how much visual information can someone rotate at once? We asked participants to rotate a simple multipart shape, requiring them to maintain attachments between features and moving parts. The capacity of this aspect of mental rotation was strikingly low: Only one feature could remain attached to one part. Behavioral and eye-tracking data showed that this single feature remained "glued" via a singular focus of attention, typically on the object's top. We argue that the architecture of the human visual system is not suited for keeping multiple features attached to multiple parts during mental rotation. Such measurement of capacity limits may prove to be a critical step in dissecting the suite of visuospatial tools involved in mental rotation, leading to insights for improvement of pedagogy in science-education contexts. © The Author(s) 2015.
User-assisted video segmentation system for visual communication
NASA Astrophysics Data System (ADS)
Wu, Zhengping; Chen, Chun
2002-01-01
Video segmentation plays an important role in efficient storage and transmission in visual communication. In this paper, we introduce a novel video segmentation system using point tracking and contour formation techniques. Inspired by results from the study of the human visual system, we split the video segmentation problem into three separate phases: user-assisted feature point selection, automatic feature point tracking, and contour formation. This splitting relieves the computer of ill-posed automatic segmentation problems and allows a higher level of flexibility in the method. First, precise feature points are found using a combination of user assistance and an eigenvalue-based adjustment. Second, the feature points in the remaining frames are obtained using motion estimation and point refinement. Finally, contour formation is used to extract the object, together with a point insertion process that provides the feature points for tracking in the next frame.
NASA Astrophysics Data System (ADS)
Li, Heng; Zeng, Yajie; Lu, Zhuofan; Cao, Xiaofei; Su, Xiaofan; Sui, Xiaohong; Wang, Jing; Chai, Xinyu
2018-04-01
Objective. Retinal prosthesis devices have shown great value in restoring some sight for individuals with profoundly impaired vision, but the visual acuity and visual field provided by prostheses greatly limit recipients’ visual experience. In this paper, we employ computer vision approaches to expand the perceptible visual field in patients who may be implanted with a high-density retinal prosthesis while maintaining visual acuity as much as possible. Approach. We propose an optimized content-aware image retargeting method that introduces salient object detection based on color and intensity-difference contrast, aiming to remap the important information of a scene into a small visual field while preserving its original scale as much as possible. It may improve prosthetic recipients’ perceived visual field and aid in performing some visual tasks (e.g. object detection and object recognition). To verify our method, psychophysical experiments on detecting the number of objects and recognizing objects are conducted under simulated prosthetic vision. As controls, we use three other image retargeting techniques: Cropping, Scaling, and seam-assisted shrinkability. Main results. Results show that our method preserves more key features and has significantly higher recognition accuracy than the other three image retargeting methods under conditions of small visual field and low resolution. Significance. The proposed method helps expand the perceived visual field of prosthesis recipients and improve their object detection and recognition performance. It suggests that our method may provide an effective option for the image processing module in future high-density retinal implants.
Effects of verbal and nonverbal interference on spatial and object visual working memory.
Postle, Bradley R; Desposito, Mark; Corkin, Suzanne
2005-03-01
We tested the hypothesis that a verbal coding mechanism is necessarily engaged by object, but not spatial, visual working memory tasks. We employed a dual-task procedure that paired n-back working memory tasks with domain-specific distractor trials inserted into each interstimulus interval of the n-back tasks. In two experiments, object n-back performance demonstrated greater sensitivity to verbal distraction, whereas spatial n-back performance demonstrated greater sensitivity to motion distraction. Visual object and spatial working memory may differ fundamentally in that the mnemonic representation of featural characteristics of objects incorporates a verbal (perhaps semantic) code, whereas the mnemonic representation of the location of objects does not. Thus, the processes supporting working memory for these two types of information may differ in more ways than those dictated by the "what/where" organization of the visual system, a fact more easily reconciled with a component process than a memory systems account of working memory function.
Effects of verbal and nonverbal interference on spatial and object visual working memory
POSTLE, BRADLEY R.; D’ESPOSITO, MARK; CORKIN, SUZANNE
2005-01-01
We tested the hypothesis that a verbal coding mechanism is necessarily engaged by object, but not spatial, visual working memory tasks. We employed a dual-task procedure that paired n-back working memory tasks with domain-specific distractor trials inserted into each interstimulus interval of the n-back tasks. In two experiments, object n-back performance demonstrated greater sensitivity to verbal distraction, whereas spatial n-back performance demonstrated greater sensitivity to motion distraction. Visual object and spatial working memory may differ fundamentally in that the mnemonic representation of featural characteristics of objects incorporates a verbal (perhaps semantic) code, whereas the mnemonic representation of the location of objects does not. Thus, the processes supporting working memory for these two types of information may differ in more ways than those dictated by the “what/where” organization of the visual system, a fact more easily reconciled with a component process than a memory systems account of working memory function. PMID:16028575
Audio-video decision support for patients: the documentary genré as a basis for decision aids.
Volandes, Angelo E; Barry, Michael J; Wood, Fiona; Elwyn, Glyn
2013-09-01
Decision support tools are increasingly using audio-visual materials. However, disagreement exists about the use of audio-visual materials as they may be subjective and biased. This is a literature review of the major texts for documentary film studies to extrapolate issues of objectivity and bias from film to decision support tools. The key features of documentary films are that they attempt to portray real events and that the attempted reality is always filtered through the lens of the filmmaker. The same key features can be said of decision support tools that use audio-visual materials. Three concerns arising from documentary film studies as they apply to the use of audio-visual materials in decision support tools include whose perspective matters (stakeholder bias), how to choose among audio-visual materials (selection bias) and how to ensure objectivity (editorial bias). Decision science needs to start a debate about how audio-visual materials are to be used in decision support tools. Simply because audio-visual materials may be subjective and open to bias does not mean that we should not use them. Methods need to be found to ensure consensus around balance and editorial control, such that audio-visual materials can be used. © 2011 John Wiley & Sons Ltd.
Audio‐video decision support for patients: the documentary genré as a basis for decision aids
Volandes, Angelo E.; Barry, Michael J.; Wood, Fiona; Elwyn, Glyn
2011-01-01
Abstract Objective Decision support tools are increasingly using audio‐visual materials. However, disagreement exists about the use of audio‐visual materials as they may be subjective and biased. Methods This is a literature review of the major texts for documentary film studies to extrapolate issues of objectivity and bias from film to decision support tools. Results The key features of documentary films are that they attempt to portray real events and that the attempted reality is always filtered through the lens of the filmmaker. The same key features can be said of decision support tools that use audio‐visual materials. Three concerns arising from documentary film studies as they apply to the use of audio‐visual materials in decision support tools include whose perspective matters (stakeholder bias), how to choose among audio‐visual materials (selection bias) and how to ensure objectivity (editorial bias). Discussion Decision science needs to start a debate about how audio‐visual materials are to be used in decision support tools. Simply because audio‐visual materials may be subjective and open to bias does not mean that we should not use them. Conclusion Methods need to be found to ensure consensus around balance and editorial control, such that audio‐visual materials can be used. PMID:22032516
Semantic guidance of eye movements in real-world scenes
Hwang, Alex D.; Wang, Hsueh-Cheng; Pomplun, Marc
2011-01-01
The perception of objects in our visual world is influenced by not only their low-level visual features such as shape and color, but also their high-level features such as meaning and semantic relations among them. While it has been shown that low-level features in real-world scenes guide eye movements during scene inspection and search, the influence of semantic similarity among scene objects on eye movements in such situations has not been investigated. Here we study guidance of eye movements by semantic similarity among objects during real-world scene inspection and search. By selecting scenes from the LabelMe object-annotated image database and applying Latent Semantic Analysis (LSA) to the object labels, we generated semantic saliency maps of real-world scenes based on the semantic similarity of scene objects to the currently fixated object or the search target. An ROC analysis of these maps as predictors of subjects’ gaze transitions between objects during scene inspection revealed a preference for transitions to objects that were semantically similar to the currently inspected one. Furthermore, during the course of a scene search, subjects’ eye movements were progressively guided toward objects that were semantically similar to the search target. These findings demonstrate substantial semantic guidance of eye movements in real-world scenes and show its importance for understanding real-world attentional control. PMID:21426914
Semantic guidance of eye movements in real-world scenes.
Hwang, Alex D; Wang, Hsueh-Cheng; Pomplun, Marc
2011-05-25
The perception of objects in our visual world is influenced by not only their low-level visual features such as shape and color, but also their high-level features such as meaning and semantic relations among them. While it has been shown that low-level features in real-world scenes guide eye movements during scene inspection and search, the influence of semantic similarity among scene objects on eye movements in such situations has not been investigated. Here we study guidance of eye movements by semantic similarity among objects during real-world scene inspection and search. By selecting scenes from the LabelMe object-annotated image database and applying latent semantic analysis (LSA) to the object labels, we generated semantic saliency maps of real-world scenes based on the semantic similarity of scene objects to the currently fixated object or the search target. An ROC analysis of these maps as predictors of subjects' gaze transitions between objects during scene inspection revealed a preference for transitions to objects that were semantically similar to the currently inspected one. Furthermore, during the course of a scene search, subjects' eye movements were progressively guided toward objects that were semantically similar to the search target. These findings demonstrate substantial semantic guidance of eye movements in real-world scenes and show its importance for understanding real-world attentional control. Copyright © 2011 Elsevier Ltd. All rights reserved.
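The semantic saliency idea described in this abstract can be sketched in a few lines: each scene object is scored by the similarity of its label to a reference label (the currently fixated object or the search target). This is only an illustration under stated assumptions. The two-dimensional label vectors below are invented stand-ins for LSA vectors, cosine similarity is assumed as the similarity measure, and the result is a per-object score table rather than the image-sized map the authors describe.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def semantic_saliency(label_vectors, reference_label):
    """Score every other object by its semantic similarity to the reference."""
    ref = label_vectors[reference_label]
    return {
        label: cosine(vec, ref)
        for label, vec in label_vectors.items()
        if label != reference_label
    }


# Hypothetical label vectors; real LSA vectors would have many more dimensions.
toy_vectors = {
    "fork": [1.0, 0.0],
    "knife": [0.9, 0.1],
    "lamp": [0.0, 1.0],
}

scores = semantic_saliency(toy_vectors, "fork")
most_similar = max(scores, key=scores.get)
```

With these toy vectors, "knife" receives a higher score than "lamp" when "fork" is the reference, which mirrors the reported preference for gaze transitions to semantically similar objects.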
Schmidt, Joseph; MacNamara, Annmarie; Proudfit, Greg Hajcak; Zelinsky, Gregory J.
2014-01-01
The visual-search literature has assumed that the top-down target representation used to guide search resides in visual working memory (VWM). We directly tested this assumption using contralateral delay activity (CDA) to estimate the VWM load imposed by the target representation. In Experiment 1, observers previewed four photorealistic objects and were cued to remember the two objects appearing to the left or right of central fixation; Experiment 2 was identical except that observers previewed two photorealistic objects and were cued to remember one. CDA was measured during a delay following preview offset but before onset of a four-object search array. One of the targets was always present, and observers were asked to make an eye movement to it and press a button. We found lower magnitude CDA on trials when the initial search saccade was directed to the target (strong guidance) compared to when it was not (weak guidance). This difference also tended to be larger shortly before search-display onset and was largely unaffected by VWM item-capacity limits or number of previews. Moreover, the difference between mean strong- and weak-guidance CDA was proportional to the increase in search time between mean strong- and weak-guidance trials (as measured by time-to-target and reaction-time difference scores). Contrary to most search models, our data suggest that trials resulting in the maintenance of more target features result in poor search guidance to a target. We interpret these counterintuitive findings as evidence for strong search guidance using a small set of highly discriminative target features that remain after pruning from a larger set of features, with the load imposed on VWM varying with this feature-consolidation process. PMID:24599946
Schmidt, Joseph; MacNamara, Annmarie; Proudfit, Greg Hajcak; Zelinsky, Gregory J
2014-03-05
The visual-search literature has assumed that the top-down target representation used to guide search resides in visual working memory (VWM). We directly tested this assumption using contralateral delay activity (CDA) to estimate the VWM load imposed by the target representation. In Experiment 1, observers previewed four photorealistic objects and were cued to remember the two objects appearing to the left or right of central fixation; Experiment 2 was identical except that observers previewed two photorealistic objects and were cued to remember one. CDA was measured during a delay following preview offset but before onset of a four-object search array. One of the targets was always present, and observers were asked to make an eye movement to it and press a button. We found lower magnitude CDA on trials when the initial search saccade was directed to the target (strong guidance) compared to when it was not (weak guidance). This difference also tended to be larger shortly before search-display onset and was largely unaffected by VWM item-capacity limits or number of previews. Moreover, the difference between mean strong- and weak-guidance CDA was proportional to the increase in search time between mean strong- and weak-guidance trials (as measured by time-to-target and reaction-time difference scores). Contrary to most search models, our data suggest that trials resulting in the maintenance of more target features result in poor search guidance to a target. We interpret these counterintuitive findings as evidence for strong search guidance using a small set of highly discriminative target features that remain after pruning from a larger set of features, with the load imposed on VWM varying with this feature-consolidation process.
Models of Speed Discrimination
NASA Technical Reports Server (NTRS)
1997-01-01
The prime purpose of this project was to investigate various theoretical issues concerning the integration of information across visual space. To date, most of the research efforts in the study of the visual system seem to have been focused in two almost non-overlapping directions. One research focus has been low-level perception as studied by psychophysics. The other focus has been the study of high-level vision, exemplified by the study of object perception. Most of the effort in psychophysics has been devoted to the search for the fundamental "features" of perception. The general idea is that the most peripheral processes of the visual system decompose the input into features that are then used for classification and recognition. The experimental and theoretical focus has been on finding and describing the analyzers that decompose images into useful components. Various models are then compared to the physiological measurements performed on neurons in the sensory systems. In the study of higher-level perception, the work has been focused on the representation of objects and on the connections between various physical effects and object perception. In this category we find the perception of 3D from a variety of physical measurements including motion, shading and other physical phenomena. With few exceptions, there seems to be very limited development of theories describing how the visual system might combine the output of the analyzers to form the representation of visual objects. Therefore, the processes underlying the integration of information over space represent critical aspects of the visual system. The understanding of these processes will have implications for our expectations of the underlying physiological mechanisms, as well as for our models of the internal representation of visual percepts. In this project, we explored several mechanisms related to spatial summation, attention, and eye movements. The project comprised three components: 1. Modeling visual search for the detection of speed deviation. 2. Perception of moving objects. 3. Exploring the role of eye movements in various visual tasks.
SVGenes: a library for rendering genomic features in scalable vector graphic format.
Etherington, Graham J; MacLean, Daniel
2013-08-01
Drawing genomic features in attractive and informative ways is a key task in visualization of genomics data. Scalable Vector Graphics (SVG) format is a modern and flexible open standard that provides advanced features including modular graphic design, advanced web interactivity and animation within a suitable client. SVGs do not suffer from loss of image quality on re-scaling and provide the ability to edit individual elements of a graphic on the whole object level independent of the whole image. These features make SVG a potentially useful format for the preparation of publication quality figures including genomic objects such as genes or sequencing coverage and for web applications that require rich user-interaction with the graphical elements. SVGenes is a Ruby-language library that uses SVG primitives to render typical genomic glyphs through a simple and flexible Ruby interface. The library implements a simple Page object that spaces and contains horizontal Track objects that in turn style, colour and position features within them. Tracks are the level at which visual information is supplied, providing the full styling capability of the SVG standard. Genomic entities like genes, transcripts and histograms are modelled in Glyph objects that are attached to a track and take advantage of SVG primitives to render the genomic features in a track as any of a selection of defined glyphs. The feature model within SVGenes is simple but flexible and not dependent on particular existing gene feature formats, meaning graphics for any existing datasets can easily be created without need for conversion. The library is provided as a Ruby Gem from https://rubygems.org/gems/bio-svgenes under the MIT license, and open source code is available at https://github.com/danmaclean/bioruby-svgenes also under the MIT License. Contact: dan.maclean@tsl.ac.uk.
Multi-scale image segmentation method with visual saliency constraints and its application
NASA Astrophysics Data System (ADS)
Chen, Yan; Yu, Jie; Sun, Kaimin
2018-03-01
Object-based image analysis has many advantages over pixel-based methods, so it is one of the current research hotspots. Obtaining image objects by multi-scale image segmentation is an essential step in object-based image analysis. The currently popular image segmentation methods mainly share the bottom-up segmentation principle, which is simple to realize and yields accurate object boundaries. However, the macro statistical characteristics of image areas are difficult to take into account, and fragmented segmentation (or over-segmentation) results are difficult to avoid. In addition, when it comes to information extraction, target recognition and other applications, image targets are not equally important, i.e., some specific targets or target groups with particular features deserve more attention than others. To avoid the problem of over-segmentation and highlight the targets of interest, this paper proposes a multi-scale image segmentation method with visual saliency graph constraints. Visual saliency theory and a typical feature extraction method are adopted to obtain the visual saliency information, especially the macroscopic information to be analyzed. The visual saliency information is used as a distribution map of homogeneity weight, where each pixel is given a weight. This weight acts as one of the merging constraints in the multi-scale image segmentation. As a result, pixels that macroscopically belong to the same object but are locally different are more likely to be assigned to the same object. In addition, due to the constraint of the visual saliency model, the constraint ability over local-macroscopic characteristics can be well controlled during the segmentation process based on different objects. These controls improve the completeness of visually salient areas in the segmentation results while diluting the controlling effect for non-salient background areas. Experiments show that this method works better for texture image segmentation than traditional multi-scale image segmentation methods, and enables priority control over the salient objects of interest. The method has been used in image quality evaluation, scattered residential area extraction, sparse forest extraction and other applications to verify its validity. All applications showed good results.
Face features and face configurations both contribute to visual crowding.
Sun, Hsin-Mei; Balas, Benjamin
2015-02-01
Crowding refers to the inability to recognize an object in peripheral vision when other objects are presented nearby (Whitney & Levi Trends in Cognitive Sciences, 15, 160-168, 2011). A popular explanation of crowding is that features of the target and flankers are combined inappropriately when they are located within an integration field, thus impairing target recognition (Pelli, Palomares, & Majaj Journal of Vision, 4(12), 12:1136-1169, 2004). However, it remains unclear which features of the target and flankers are combined inappropriately to cause crowding (Levi Vision Research, 48, 635-654, 2008). For example, in a complex stimulus (e.g., a face), to what extent does crowding result from the integration of features at a part-based level or at the level of global processing of the configural appearance? In this study, we used a face categorization task and different types of flankers to examine how much the magnitude of visual crowding depends on the similarity of face parts or of global configurations. We created flankers with face-like features (e.g., the eyes, nose, and mouth) in typical and scrambled configurations to examine the impacts of part appearance and global configuration on the visual crowding of faces. Additionally, we used "electrical socket" flankers that mimicked first-order face configuration but had only schematic features, to examine the extent to which global face geometry impacted crowding. Our results indicated that both face parts and configurations contribute to visual crowding, suggesting that face similarity as realized under crowded conditions includes both aspects of facial appearance.
Selection-for-action in visual search.
Hannus, Aave; Cornelissen, Frans W; Lindemann, Oliver; Bekkering, Harold
2005-01-01
Grasping an object rather than pointing to it enhances processing of its orientation but not its color. Apparently, visual discrimination is selectively enhanced for a behaviorally relevant feature. In two experiments we investigated the limitations and targets of this bias. Specifically, in Experiment 1 we asked whether the effect is capacity demanding, and therefore manipulated the set size of the display. The results indicated a clear cognitive processing capacity requirement, i.e., the magnitude of the effect decreased for a larger set size. Consequently, in Experiment 2, we investigated whether the enhancement effect occurs only at the level of the behaviorally relevant feature or at a level common to different features. We therefore manipulated the discriminability of the behaviorally neutral feature (color). Again, results showed that this manipulation influenced the action enhancement of the behaviorally relevant feature. In particular, the effect of the color manipulation on the action enhancement suggests that the action effect is more likely to bias the competition between different visual features than to enhance the processing of the relevant feature. We offer a theoretical account that integrates the action-intention effect within the biased competition model of visual selective attention.
Working memory resources are shared across sensory modalities.
Salmela, V R; Moisala, M; Alho, K
2014-10-01
A common assumption in the working memory literature is that the visual and auditory modalities have separate and independent memory stores. Recent evidence on visual working memory has suggested that resources are shared between representations, and that the precision of representations sets the limit for memory performance. We tested whether memory resources are also shared across sensory modalities. Memory precision for two visual (spatial frequency and orientation) and two auditory (pitch and tone duration) features was measured separately for each feature and for all possible feature combinations. Thus, only the memory load was varied, from one to four features, while keeping the stimuli similar. In Experiment 1, two gratings and two tones-both containing two varying features-were presented simultaneously. In Experiment 2, two gratings and two tones-each containing only one varying feature-were presented sequentially. The memory precision (delayed discrimination threshold) for a single feature was close to the perceptual threshold. However, as the number of features to be remembered was increased, the discrimination thresholds increased more than twofold. Importantly, the decrease in memory precision did not depend on the modality of the other feature(s), or on whether the features were in the same or in separate objects. Hence, simultaneously storing one visual and one auditory feature had an effect on memory precision equal to those of simultaneously storing two visual or two auditory features. The results show that working memory is limited by the precision of the stored representations, and that working memory can be described as a resource pool that is shared across modalities.
The Role of Visual Working Memory in Attentive Tracking of Unique Objects
ERIC Educational Resources Information Center
Makovski, Tal; Jiang, Yuhong V.
2009-01-01
When tracking moving objects in space humans usually attend to the objects' spatial locations and update this information over time. To what extent do surface features assist attentive tracking? In this study we asked participants to track identical or uniquely colored objects. Tracking was enhanced when objects were unique in color. The benefit…
Spatial and Feature-Based Attention in a Layered Cortical Microcircuit Model
Wagatsuma, Nobuhiko; Potjans, Tobias C.; Diesmann, Markus; Sakai, Ko; Fukai, Tomoki
2013-01-01
Directing attention to the spatial location or the distinguishing feature of a visual object modulates neuronal responses in the visual cortex and the stimulus discriminability of subjects. However, the spatial and feature-based modes of attention differently influence visual processing by changing the tuning properties of neurons. Intriguingly, neurons' tuning curves are modulated similarly across different visual areas under both these modes of attention. Here, we explored the mechanism underlying the effects of these two modes of visual attention on the orientation selectivity of visual cortical neurons. To do this, we developed a layered microcircuit model. This model describes multiple orientation-specific microcircuits sharing their receptive fields and consisting of layers 2/3, 4, 5, and 6. These microcircuits represent a functional grouping of cortical neurons and mutually interact via lateral inhibition and excitatory connections between groups with similar selectivity. The individual microcircuits receive bottom-up visual stimuli and top-down attention in different layers. A crucial assumption of the model is that feature-based attention activates orientation-specific microcircuits for the relevant feature selectively, whereas spatial attention activates all microcircuits homogeneously, irrespective of their orientation selectivity. Consequently, our model simultaneously accounts for the multiplicative scaling of neuronal responses in spatial attention and the additive modulations of orientation tuning curves in feature-based attention, which have been observed widely in various visual cortical areas. Simulations of the model predict contrasting differences between excitatory and inhibitory neurons in the two modes of attentional modulations. Furthermore, the model replicates the modulation of the psychophysical discriminability of visual stimuli in the presence of external noise. 
Our layered model with a biologically suggested laminar structure describes the basic circuit mechanism underlying the attention-mode specific modulations of neuronal responses and visual perception. PMID:24324628
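The contrast drawn in this abstract between multiplicative scaling (spatial attention) and additive modulation (feature-based attention) of a tuning curve can be illustrated with a minimal descriptive sketch. It is not the layered microcircuit model itself: the Gaussian tuning shape, the gain of 1.3, the offset of 4 spikes/s, and all function names are assumptions chosen only for illustration.

```python
import math

def tuning_curve(preferred_deg, stimulus_deg,
                 baseline=5.0, amplitude=20.0, width_deg=20.0):
    """Hypothetical Gaussian orientation tuning curve (spikes/s)."""
    # Orientation is periodic over 180 degrees; wrap the difference to [-90, 90).
    diff = (stimulus_deg - preferred_deg + 90.0) % 180.0 - 90.0
    return baseline + amplitude * math.exp(-0.5 * (diff / width_deg) ** 2)

def spatial_attention(rate, gain=1.3):
    """Multiplicative scaling: every response is multiplied by the same gain."""
    return gain * rate

def feature_attention(rate, offset=4.0):
    """Additive modulation: the whole curve is shifted by a constant."""
    return rate + offset

orientations = list(range(0, 180, 15))
unattended = [tuning_curve(90.0, s) for s in orientations]
spatial = [spatial_attention(r) for r in unattended]
feature = [feature_attention(r) for r in unattended]

for s, u, sp, fe in zip(orientations, unattended, spatial, feature):
    print(f"{s:3d} deg  unattended={u:6.2f}  spatial={sp:6.2f}  feature={fe:6.2f}")
```

Under these assumptions the ratio spatial/unattended is constant across orientations, whereas the difference feature - unattended is constant, which is one simple way to tell the two modulation types apart in measured tuning curves.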
Learned filters for object detection in multi-object visual tracking
NASA Astrophysics Data System (ADS)
Stamatescu, Victor; Wong, Sebastien; McDonnell, Mark D.; Kearney, David
2016-05-01
We investigate the application of learned convolutional filters in multi-object visual tracking. The filters were learned in both a supervised and unsupervised manner from image data using artificial neural networks. This work follows recent results in the field of machine learning that demonstrate the use of learned filters for enhanced object detection and classification. Here we employ a track-before-detect approach to multi-object tracking, where tracking guides the detection process. The object detection provides a probabilistic input image calculated by selecting from features obtained using banks of generative or discriminative learned filters. We present a systematic evaluation of these convolutional filters using a real-world data set that examines their performance as generic object detectors.
Multi-channel feature dictionaries for RGB-D object recognition
NASA Astrophysics Data System (ADS)
Lan, Xiaodong; Li, Qiming; Chong, Mina; Song, Jian; Li, Jun
2018-04-01
Hierarchical matching pursuit (HMP) is a popular feature learning method for RGB-D object recognition. However, the feature representation with only one dictionary for RGB channels in HMP does not capture sufficient visual information. In this paper, we propose a feature learning method for RGB-D object recognition based on multi-channel feature dictionaries. Feature extraction in the proposed method consists of two layers. The K-SVD algorithm is used to learn the dictionaries for sparse coding in both layers. In the first layer, we obtain features by performing max pooling on the sparse codes of the pixels in a cell. The features of the cells in a patch are then concatenated to generate joint patch features. The joint patch features from the first layer are then used to learn the dictionary and sparse codes in the second layer. Finally, spatial pyramid pooling can be applied to the joint patch features of either layer to generate the final object features. Experimental results show that our method, with first- or second-layer features, achieves performance comparable to or better than some published state-of-the-art methods.
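The first-layer step described above (max pooling over the sparse codes of the pixels in a cell, then concatenating the cell features of a patch) can be sketched as follows. This is a minimal illustration, not the authors' implementation: dictionary learning with K-SVD and the sparse coding itself are not shown, the sparse codes are invented, pooling over absolute values is an assumption, and the function names are hypothetical.

```python
def max_pool_cell(sparse_codes):
    """Pool the sparse codes of all pixels in one cell.

    sparse_codes: list of per-pixel code vectors (one entry per dictionary atom).
    Returns the component-wise maximum of absolute values (an assumed convention).
    """
    n_atoms = len(sparse_codes[0])
    return [max(abs(code[k]) for code in sparse_codes) for k in range(n_atoms)]

def patch_feature(cells):
    """Concatenate the pooled features of every cell in a patch."""
    feature = []
    for cell in cells:
        feature.extend(max_pool_cell(cell))
    return feature

# Two cells, two pixels each, three dictionary atoms (made-up sparse codes).
cell_a = [[0.0, 0.5, 0.0], [-0.9, 0.0, 0.0]]
cell_b = [[0.0, 0.0, 0.2], [0.0, -0.1, 0.0]]

joint = patch_feature([cell_a, cell_b])
print(joint)
```

The resulting patch-level vector has length (number of cells) × (number of atoms); in the method described, vectors of this kind would then serve as the input to the second layer.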
Visual learning in Drosophila: application on a roving robot and comparisons
NASA Astrophysics Data System (ADS)
Arena, P.; De Fiore, S.; Patané, L.; Termini, P. S.; Strauss, R.
2011-05-01
Visual learning is an important aspect of fly life. Flies are able to extract visual cues from objects, such as colors, vertical and horizontal distributedness, and others, that can be used for learning to associate a meaning with specific features (e.g., a reward or a punishment). Biological experiments show that trained flies in stationary flight avoid flying towards specific visual objects appearing in the surrounding environment. Wild-type flies effectively learn to avoid those objects, but this is not the case for the learning mutant rutabaga, which is defective in the cyclic AMP-dependent pathway for plasticity. A bio-inspired architecture has been proposed to model the fly behavior, and experiments on roving robots were performed. Statistical comparisons were made, and a mutant-like effect on the model was also investigated.
NASA Astrophysics Data System (ADS)
Kyrkou, Christos; Theocharides, Theocharis
2016-07-01
Object detection is a major step in several computer vision applications and a requirement for most smart camera systems. Recent advances in hardware acceleration for real-time object detection feature extensive use of reconfigurable hardware [field programmable gate arrays (FPGAs)], and relevant research has produced quite fascinating results, both in the accuracy of the detection algorithms and in the performance in terms of frames per second (fps) for use in embedded smart camera systems. Detecting objects in images, however, is a daunting task and often involves hardware-inefficient steps, both in terms of the datapath design and in terms of input/output and memory access patterns. We present how a visual-feature-directed search cascade composed of motion detection, depth computation, and edge detection can have a significant impact in reducing the data that needs to be examined by the classification engine for the presence of an object of interest. Experimental results on a Spartan 6 FPGA platform for face detection indicate data search reduction of up to 95%, which results in the system being able to process up to 50 images of 1024×768 pixels per second with a significantly reduced number of false positives.
Balaban, Halely; Luria, Roy
2016-05-01
What makes an integrated object in visual working memory (WM)? Past evidence suggested that WM holds all features of multidimensional objects together, but struggles to integrate color-color conjunctions. This difficulty was previously attributed to a challenge in same-dimension integration, but here we argue that it arises from the integration of 2 distinct objects. To test this, we examined the integration of distinct different-dimension features (a colored square and a tilted bar). We monitored the contralateral delay activity, an event-related potential component sensitive to the number of objects in WM. The results indicated that color and orientation belonging to distinct objects in a shared location were not integrated in WM (Experiment 1), even following a common fate Gestalt cue (Experiment 2). These conjunctions were better integrated in a less demanding task (Experiment 3), and in the original WM task, but with a less individuating version of the original stimuli (Experiment 4). Our results identify the critical factor in WM integration at same- versus separate-objects, rather than at same- versus different-dimensions. Compared with the perfect integration of an object's features, the integration of several objects is demanding, and depends on an interaction between the grouping cues and task demands, among other factors.
A bio-inspired method and system for visual object-based attention and segmentation
NASA Astrophysics Data System (ADS)
Huber, David J.; Khosla, Deepak
2010-04-01
This paper describes a method and system of human-like attention and object segmentation in visual scenes that (1) attends to regions in a scene in their rank of saliency in the image, (2) extracts the boundary of an attended proto-object based on feature contours, and (3) can be biased to boost the attention paid to specific features in a scene, such as those of a desired target object in static and video imagery. The purpose of the system is to identify regions of a scene of potential importance and extract the region data for processing by an object recognition and classification algorithm. The attention process can be performed in a default, bottom-up manner or a directed, top-down manner which will assign a preference to certain features over others. One can apply this system to any static scene, whether that is a still photograph or imagery captured from video. We employ algorithms that are motivated by findings in neuroscience, psychology, and cognitive science to construct a system that is novel in its modular and stepwise approach to the problems of attention and region extraction, its application of a flooding algorithm to break apart an image into smaller proto-objects based on feature density, and its ability to join smaller regions of similar features into larger proto-objects. This approach allows many complicated operations to be carried out by the system in a very short time, approaching real-time. A researcher can use this system as a robust front-end to a larger system that includes object recognition and scene understanding modules; it is engineered to function over a broad range of situations and can be applied to any scene with minimal tuning from the user.
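The abstract mentions a flooding algorithm that breaks an image into proto-objects based on feature density. A generic flood fill over a thresholded density map, sketched below, shows the basic idea; it is not the authors' algorithm. The 4-connectivity, the threshold of 0.5, the toy density values, and the function name are all assumptions made for illustration.

```python
from collections import deque

def flood_regions(density, threshold):
    """Label 4-connected regions whose feature density is >= threshold.

    Returns (labels, count); label 0 marks pixels below the threshold.
    """
    rows, cols = len(density), len(density[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if density[r][c] < threshold or labels[r][c]:
                continue
            # Start a new region and flood outward from this seed pixel.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not labels[ny][nx]
                            and density[ny][nx] >= threshold):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count

# Toy feature-density map (hypothetical values in [0, 1]).
density = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.1, 0.0, 0.6],
    [0.0, 0.0, 0.7, 0.9],
]
labels, count = flood_regions(density, threshold=0.5)
for row in labels:
    print(row)
print("regions:", count)
```

Each labeled region plays the role of a candidate proto-object here; merging small regions with similar features into larger ones, as the abstract describes, would be a separate step that this sketch does not attempt.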
Core, Cynthia; Brown, Janean W; Larsen, Michael D; Mahshie, James
2014-01-01
The objectives of this research were to determine whether an adapted version of a Hybrid Visual Habituation procedure could be used to assess speech perception of phonetic and prosodic features of speech (vowel height, lexical stress, and intonation) in individual pre-school-age children who use cochlear implants. Nine children ranging in age from 3;4 to 5;5 participated in this study. Children were prelingually deaf, used cochlear implants, and had no other known disabilities. Children received two speech feature tests using an adaptation of a Hybrid Visual Habituation procedure. Based on results from a Bayesian linear regression analysis, seven of the nine children demonstrated perception of at least one speech feature using this procedure. At least one child demonstrated perception of each speech feature using this assessment procedure. An adapted version of the Hybrid Visual Habituation Procedure with an appropriate statistical analysis provides a way to assess phonetic and prosodic aspects of speech in pre-school-age children who use cochlear implants.
Classifying four-category visual objects using multiple ERP components in single-trial ERP.
Qin, Yu; Zhan, Yu; Wang, Changming; Zhang, Jiacai; Yao, Li; Guo, Xiaojuan; Wu, Xia; Hu, Bin
2016-08-01
Object categorization using single-trial electroencephalography (EEG) data measured while participants view images has been studied intensively. In previous studies, multiple event-related potential (ERP) components (e.g., P1, N1, P2, and P3) were used to improve the performance of object categorization of visual stimuli. In this study, we introduce a novel method that uses multiple-kernel support vector machine to fuse multiple ERP component features. We investigate whether fusing the potential complementary information of different ERP components (e.g., P1, N1, P2a, and P2b) can improve the performance of four-category visual object classification in single-trial EEGs. We also compare the classification accuracy of different ERP component fusion methods. Our experimental results indicate that the classification accuracy increases through multiple ERP fusion. Additional comparative analyses indicate that the multiple-kernel fusion method can achieve a mean classification accuracy higher than 72 %, which is substantially better than that achieved with any single ERP component feature (55.07 % for the best single ERP component, N1). We compare the classification results with those of other fusion methods and determine that the accuracy of the multiple-kernel fusion method is 5.47, 4.06, and 16.90 % higher than those of feature concatenation, feature extraction, and decision fusion, respectively. Our study shows that our multiple-kernel fusion method outperforms other fusion methods and thus provides a means to improve the classification performance of single-trial ERPs in brain-computer interface research.
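The idea of fusing several ERP components through multiple kernels can be sketched as a weighted sum of one kernel per component. The example below is only an illustration under stated assumptions: the weights are fixed by hand (in multiple-kernel learning they are normally learned together with the classifier), the RBF kernel and its width are assumed, the feature values are invented, and the function names are hypothetical. No SVM is trained here.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def fused_kernel(trial_a, trial_b, weights):
    """Weighted sum of per-component kernels (one kernel per ERP component)."""
    return sum(
        weight * rbf_kernel(trial_a[name], trial_b[name])
        for name, weight in weights.items()
    )

# Hand-picked, non-negative weights that sum to 1 (assumed, not learned).
weights = {"P1": 0.2, "N1": 0.4, "P2a": 0.2, "P2b": 0.2}

# Invented per-component feature vectors for two single trials.
trial_1 = {"P1": [0.1, 0.3], "N1": [-1.2, -0.8], "P2a": [0.5, 0.4], "P2b": [0.2, 0.1]}
trial_2 = {"P1": [0.2, 0.1], "N1": [-0.9, -1.0], "P2a": [0.3, 0.6], "P2b": [0.0, 0.3]}

print(fused_kernel(trial_1, trial_2, weights))
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, the fused similarity could be passed to any kernel classifier as a precomputed kernel matrix; how the weights are chosen is where the approaches compared in the abstract would differ.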
Neural basis for dynamic updating of object representation in visual working memory.
Takahama, Sachiko; Miyauchi, Satoru; Saiki, Jun
2010-02-15
In the real world, objects have multiple features and change dynamically. Thus, object representations must satisfy dynamic updating and feature binding. Previous studies have investigated the neural activity of dynamic updating or feature binding alone, but not both simultaneously. We investigated the neural basis of feature-bound object representation in a dynamically updating situation by conducting a multiple object permanence tracking task, which required observers to simultaneously process both the maintenance and dynamic updating of feature-bound objects. Using an event-related design, we separated activities during memory maintenance and change detection. In the search for regions showing selective activation in dynamic updating of feature-bound objects, we identified a network during memory maintenance that was composed of the inferior precentral sulcus, superior parietal lobule, and middle frontal gyrus. In the change detection period, various prefrontal regions, including the anterior prefrontal cortex, were activated. In updating object representation of dynamically moving objects, the inferior precentral sulcus closely cooperates with a so-called "frontoparietal network", and subregions of the frontoparietal network can be decomposed into those sensitive to spatial updating and feature binding. The anterior prefrontal cortex identifies changes in object representation by comparing memory and perceptual representations rather than maintaining object representations per se, as previously suggested.
Importing perceived features into false memories.
Lyle, Keith B; Johnson, Marcia K
2006-02-01
False memories sometimes contain specific details, such as location or colour, about events that never occurred. Based on the source-monitoring framework, we investigated one process by which false memories acquire details: the reactivation and misattribution of feature information from memories of similar perceived events. In Experiments 1A and 1B, when imagined objects were falsely remembered as seen, participants often reported that the objects had appeared in locations where visually or conceptually similar objects, respectively, had actually appeared. Experiment 2 indicated that colour and shape features of seen objects were misattributed to false memories of imagined objects. Experiment 3 showed that perceived details were misattributed to false memories of objects that had not been explicitly imagined. False memories that imported perceived features, compared to those that presumably did not, were subjectively more like memories for perceived events. Thus, perception may be even more pernicious than imagination in contributing to false memories.
Feature-based attention elicits surround suppression in feature space.
Störmer, Viola S; Alvarez, George A
2014-09-08
It is known that focusing attention on a particular feature (e.g., the color red) facilitates the processing of all objects in the visual field containing that feature [1-7]. Here, we show that such feature-based attention not only facilitates processing but also actively inhibits processing of similar, but not identical, features globally across the visual field. We combined behavior and electrophysiological recordings of frequency-tagged potentials in human observers to measure this inhibitory surround in feature space. We found that sensory signals of an attended color (e.g., red) were enhanced, whereas sensory signals of colors similar to the target color (e.g., orange) were suppressed relative to colors more distinct from the target color (e.g., yellow). Importantly, this inhibitory effect spreads globally across the visual field, thus operating independently of location. These findings suggest that feature-based attention comprises an excitatory peak surrounded by a narrow inhibitory zone in color space to attenuate the most distracting and potentially confusable stimuli during visual perception. This selection profile is akin to what has been reported for location-based attention [8-10] and thus suggests that such center-surround mechanisms are an overarching principle of attention across different domains in the human brain.
Automatic Spiral Analysis for Objective Assessment of Motor Symptoms in Parkinson's Disease.
Memedi, Mevludin; Sadikov, Aleksander; Groznik, Vida; Žabkar, Jure; Možina, Martin; Bergquist, Filip; Johansson, Anders; Haubenberger, Dietrich; Nyholm, Dag
2015-09-17
A challenge for the clinical management of advanced Parkinson's disease (PD) patients is the emergence of fluctuations in motor performance, which represents a significant source of disability during activities of daily living of the patients. There is a lack of objective measurement of treatment effects for in-clinic and at-home use that can provide an overview of the treatment response. The objective of this paper was to develop a method for objective quantification of advanced PD motor symptoms related to off episodes and peak dose dyskinesia, using spiral data gathered by a touch screen telemetry device. More specifically, the aim was to objectively characterize motor symptoms (bradykinesia and dyskinesia), to help in automating the process of visual interpretation of movement anomalies in spirals as rated by movement disorder specialists. Digitized upper limb movement data of 65 advanced PD patients and 10 healthy (HE) subjects were recorded as they performed spiral drawing tasks on a touch screen device in their home environment settings. Several spatiotemporal features were extracted from the time series and used as inputs to machine learning methods. The methods were validated against ratings on animated spirals scored by four movement disorder specialists who visually assessed a set of kinematic features and the motor symptom. The ability of the method to discriminate between PD patients and HE subjects and the test-retest reliability of the computed scores were also evaluated. Computed scores correlated well with mean visual ratings of individual kinematic features. The best performing classifier (Multilayer Perceptron) classified the motor symptom (bradykinesia or dyskinesia) with an accuracy of 84% and area under the receiver operating characteristics curve of 0.86 in relation to visual classifications of the raters. 
In addition, the method discriminated well between PD patients and HE subjects and had good test-retest reliability. This study demonstrated the potential of using digital spiral analysis for objective quantification of PD-specific and/or treatment-induced motor symptoms.
An Object-Oriented Approach for Analyzing CALIPSO's Profile Observations
NASA Astrophysics Data System (ADS)
Trepte, C. R.
2016-12-01
The CALIPSO satellite mission is a pioneering international partnership between NASA and the French Space Agency, CNES. Since launch on 28 April 2006, CALIPSO has been acquiring near-continuous lidar profile observations of clouds and aerosols in the Earth's atmosphere. Many studies have profitably used these observations to advance our understanding of climate, weather and air quality. For the most part, however, these studies have considered CALIPSO profile measurements independent from one another and have not related each to neighboring or family observations within a cloud element or aerosol feature. In this presentation we describe an alternative approach that groups measurements into objects visually identified from CALIPSO browse images. The approach makes use of the Visualization of CALIPSO (VOCAL) software tool that enables a user to outline a region of interest and save coordinates into a database. The selected features or objects can then be analyzed to explore spatial correlations over the feature's domain and construct bulk statistical properties for each structure. This presentation will show examples that examine cirrus and dust layers and will describe how this object-oriented approach can provide added insight into physical processes beyond conventional statistical treatments. It will further show results with combined measurements from other A-Train sensors to highlight advantages of viewing features in this manner.
Task set induces dynamic reallocation of resources in visual short-term memory.
Sheremata, Summer L; Shomstein, Sarah
2017-08-01
Successful interaction with the environment requires the ability to flexibly allocate resources to different locations in the visual field. Recent evidence suggests that visual short-term memory (VSTM) resources are distributed asymmetrically across the visual field based upon task demands. Here, we propose that context, rather than the stimulus itself, determines asymmetrical distribution of VSTM resources. To test whether context modulates the reallocation of resources to the right visual field, task set, defined by memory load, was manipulated to influence visual short-term memory performance. Performance was measured for single-feature objects embedded within predominantly single- or two-feature memory blocks. Therefore, context was varied to determine whether task set directly predicts changes in visual field biases. In accord with the dynamic reallocation of resources hypothesis, task set, rather than aspects of the physical stimulus, drove improvements in performance in the right visual field. Our results show, for the first time, that preparation for upcoming memory demands directly determines how resources are allocated across the visual field.
When a Dog Has a Pen for a Tail: The Time Course of Creative Object Processing
ERIC Educational Resources Information Center
Wang, Botao; Duan, Haijun; Qi, Senqing; Hu, Weiping; Zhang, Huan
2017-01-01
Creative objects differ from ordinary objects in that they are created by human beings to contain novel, creative information. Previous research has demonstrated that ordinary object processing involves both a perceptual process for analyzing different features of the visual input and a higher-order process for evaluating the relevance of this…
Accessibility Limits Recall from Visual Working Memory
ERIC Educational Resources Information Center
Rajsic, Jason; Swan, Garrett; Wilson, Daryl E.; Pratt, Jay
2017-01-01
In this article, we demonstrate limitations of accessibility of information in visual working memory (VWM). Recently, cued-recall has been used to estimate the fidelity of information in VWM, where the feature of a cued object is reproduced from memory (Bays, Catalao, & Husain, 2009; Wilken & Ma, 2004; Zhang & Luck, 2008). Response…
Numerical modeling of eastern Connecticut's visual resources
Daniel L. Civco
1979-01-01
A numerical model capable of accurately predicting the preference for landscape photographs of selected points in eastern Connecticut is presented. A function of the social attitudes expressed toward thirty-two salient visual landscape features serves as the independent variable in predicting preferences. A technique for objectively assigning adjectives to landscape...
JVIEW Visualization for Virtual Airspace Modeling and Simulation
2009-04-01
4.2.2 Translucency ... 4.3 ... Translucency Used to Display Multiple Visualization Elements ... Figure 26 - Textual Labels Feature ... been done by Jason Moore and other AFRL/RISF staff and support personnel developing the JView API. JView relies on concrete Object Oriented Design
Fort, Alexandra; Delpuech, Claude; Pernier, Jacques; Giard, Marie-Hélène
2002-10-01
Very recently, a number of neuroimaging studies in humans have begun to investigate the question of how the brain integrates information from different sensory modalities to form unified percepts. Already, intermodal neural processing appears to depend on the modalities of inputs or the nature (speech/non-speech) of information to be combined. Yet, the variety of paradigms, stimuli and techniques used makes it difficult to understand the relationships between the factors operating at the perceptual level and the underlying physiological processes. In a previous experiment, we used event-related potentials to describe the spatio-temporal organization of audio-visual interactions during a bimodal object recognition task. Here we examined the network of cross-modal interactions involved in simple detection of the same objects. The objects were defined either by unimodal auditory or visual features alone, or by the combination of the two features. As expected, subjects detected bimodal stimuli more rapidly than either unimodal stimulus. Combined analysis of potentials, scalp current densities and dipole modeling revealed several interaction patterns within the first 200 ms post-stimulus: in occipito-parietal visual areas (45-85 ms), in deep brain structures, possibly the superior colliculus (105-140 ms), and in right temporo-frontal regions (170-185 ms). These interactions differed from those found during object identification in sensory-specific areas and possibly in the superior colliculus, indicating that the neural operations governing multisensory integration depend crucially on the nature of the perceptual processes involved.
Visual search deficits in amblyopia.
Tsirlin, Inna; Colpa, Linda; Goltz, Herbert C; Wong, Agnes M F
2018-04-01
Amblyopia is a neurodevelopmental disorder defined as a reduction in visual acuity that cannot be corrected by optical means. It has been associated with low-level deficits. However, research has demonstrated a link between amblyopia and visual attention deficits in counting, tracking, and identifying objects. Visual search is a useful tool for assessing visual attention but has not been well studied in amblyopia. Here, we assessed the extent of visual search deficits in amblyopia using feature and conjunction search tasks. We compared the performance of participants with amblyopia (n = 10) to those of controls (n = 12) on both feature and conjunction search tasks using Gabor patch stimuli, varying spatial bandwidth and orientation. To account for the low-level deficits inherent in amblyopia, we measured individual contrast and crowding thresholds and monitored eye movements. The display elements were then presented at suprathreshold levels to ensure that visibility was equalized across groups. There was no performance difference between groups on feature search, indicating that our experimental design controlled successfully for low-level amblyopia deficits. In contrast, during conjunction search, median reaction times and reaction time slopes were significantly larger in participants with amblyopia compared with controls. Amblyopia differentially affects performance on conjunction visual search, a more difficult task that requires feature binding and possibly the involvement of higher-level attention processes. Deficits in visual search may affect day-to-day functioning in people with amblyopia.
Chen, Yuantao; Xu, Weihong; Kuang, Fangjun; Gao, Shangbing
2013-01-01
Research on efficient target tracking algorithms has become a current focus in intelligent robotics. The main problem in target tracking for mobile robots is environmental uncertainty: the target state is difficult to estimate, and illumination changes, target shape changes, complex backgrounds, occlusion, and other factors all affect tracking robustness. To further improve the accuracy and reliability of target tracking, we present a novel target tracking algorithm that uses visual saliency and an adaptive support vector machine (ASVM). The algorithm is based on the mixture saliency of image features, including color, brightness, and motion. During execution, these visual saliency features and their common characteristics are expressed as the target's saliency. Numerous experiments demonstrate the effectiveness and timeliness of the proposed target tracking algorithm in video sequences where the target objects undergo large changes in pose, scale, and illumination.
Vertical visual features have a strong influence on cuttlefish camouflage.
Ulmer, K M; Buresch, K C; Kossodo, M M; Mäthger, L M; Siemann, L A; Hanlon, R T
2013-04-01
Cuttlefish and other cephalopods use visual cues from their surroundings to adaptively change their body pattern for camouflage. Numerous previous experiments have demonstrated the influence of two-dimensional (2D) substrates (e.g., sand and gravel habitats) on camouflage, yet many marine habitats have varied three-dimensional (3D) structures among which cuttlefish camouflage from predators, including benthic predators that view cuttlefish horizontally against such 3D backgrounds. We conducted laboratory experiments, using Sepia officinalis, to test the relative influence of horizontal versus vertical visual cues on cuttlefish camouflage: 2D patterns on benthic substrates were tested versus 2D wall patterns and 3D objects with patterns. Specifically, we investigated the influence of (i) quantity and (ii) placement of high-contrast elements on a 3D object or a 2D wall, as well as (iii) the diameter and (iv) number of 3D objects with high-contrast elements on cuttlefish body pattern expression. Additionally, we tested the influence of high-contrast visual stimuli covering the entire 2D benthic substrate versus the entire 2D wall. In all experiments, visual cues presented in the vertical plane evoked the strongest body pattern response in cuttlefish. These experiments support field observations that, in some marine habitats, cuttlefish will respond to vertically oriented background features even when the preponderance of visual information in their field of view seems to be from the 2D surrounding substrate. Such choices highlight the selective decision-making that occurs in cephalopods with their adaptive camouflage capability.
Fields, Chris
2011-01-01
The perception of persisting visual objects is mediated by transient intermediate representations, object files, that are instantiated in response to some, but not all, visual trajectories. The standard object file concept does not, however, provide a mechanism sufficient to account for all experimental data on visual object persistence, object tracking, and the ability to perceive spatially disconnected stimuli as continuously existing objects. Based on relevant anatomical, functional, and developmental data, a functional model is constructed that bases visual object individuation on the recognition of temporal sequences of apparent center-of-mass positions that are specifically identified as trajectories by dedicated “trajectory recognition networks” downstream of the medial–temporal motion-detection area. This model is shown to account for a wide range of data, and to generate a variety of testable predictions. Individual differences in the recognition, abstraction, and encoding of trajectory information are expected to generate distinct object persistence judgments and object recognition abilities. Dominance of trajectory information over feature information in stored object tokens during early infancy, in particular, is expected to disrupt the ability to re-identify human and other individuals across perceptual episodes, and lead to developmental outcomes with characteristics of autism spectrum disorders. PMID:21716599
Viewing the dynamics and control of visual attention through the lens of electrophysiology
Woodman, Geoffrey F.
2013-01-01
How we find what we are looking for in complex visual scenes is a seemingly simple ability that has taken half a century to unravel. The first study to use the term visual search showed that as the number of objects in a complex scene increases, observers’ reaction times increase proportionally (Green and Anderson, 1956). This observation suggests that our ability to process the objects in the scenes is limited in capacity. However, if it is known that the target will have a certain feature attribute, for example, that it will be red, then only an increase in the number of red items increases reaction time. This observation suggests that we can control which visual inputs receive the benefit of our limited capacity to recognize the objects, such as those defined by the color red, as the items we seek. The nature of the mechanisms that underlie these basic phenomena in the literature on visual search has been more difficult to determine definitively. In this paper, I discuss how electrophysiological methods have provided us with the necessary tools to understand the nature of the mechanisms that give rise to the effects observed in the first visual search paper. I begin by describing how recordings of event-related potentials from humans and nonhuman primates have shown us how attention is deployed to possible target items in complex visual scenes. Then, I will discuss how event-related potential experiments have allowed us to directly measure the memory representations that are used to guide these deployments of attention to items with target-defining features. PMID:23357579
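The two behavioral observations summarized in this abstract (reaction time grows in proportion to the number of objects, but only with the number of red items when the target is known to be red) can be written as a toy linear model. The intercept of 400 ms, the slope of 25 ms per item, and the function name are hypothetical values chosen for illustration, not estimates from the literature.

```python
def predicted_rt_ms(items, target_color=None,
                    base_ms=400.0, slope_ms_per_item=25.0):
    """Toy linear set-size model of visual search reaction time.

    items: list of item colors in the display.
    target_color: if given, only items sharing this color are searched.
    """
    if target_color is None:
        relevant = len(items)
    else:
        relevant = sum(1 for color in items if color == target_color)
    return base_ms + slope_ms_per_item * relevant

display = ["red"] * 4 + ["green"] * 8
more_green = display + ["green"] * 4

print(predicted_rt_ms(display))            # no color knowledge: all 12 items count
print(predicted_rt_ms(display, "red"))     # target known to be red: 4 items count
print(predicted_rt_ms(more_green, "red"))  # extra green items do not add time
```

In this sketch, adding green items raises the predicted reaction time only when the target color is unknown, which mirrors the pattern the abstract attributes to attentional control over which inputs receive limited-capacity processing.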
Visual context modulates potentiation of grasp types during semantic object categorization.
Kalénine, Solène; Shapiro, Allison D; Flumini, Andrea; Borghi, Anna M; Buxbaum, Laurel J
2014-06-01
Substantial evidence suggests that conceptual processing of manipulable objects is associated with potentiation of action. Such data have been viewed as evidence that objects are recognized via access to action features. Many objects, however, are associated with multiple actions. For example, a kitchen timer may be clenched with a power grip to move it but pinched with a precision grip to use it. The present study tested the hypothesis that action evocation during conceptual object processing is responsive to the visual scene in which objects are presented. Twenty-five healthy adults were asked to categorize object pictures presented in different naturalistic visual contexts that evoke either move- or use-related actions. Categorization judgments (natural vs. artifact) were performed by executing a move- or use-related action (clench vs. pinch) on a response device, and response times were assessed as a function of contextual congruence. Although the actions performed were irrelevant to the categorization judgment, responses were significantly faster when actions were compatible with the visual context. This compatibility effect was largely driven by faster pinch responses when objects were presented in use-compatible, as compared with move-compatible, contexts. The present study is the first to highlight the influence of visual scene on stimulus-response compatibility effects during semantic object processing. These data support the hypothesis that action evocation during conceptual object processing is biased toward context-relevant actions.
Visual context modulates potentiation of grasp types during semantic object categorization
Kalénine, Solène; Shapiro, Allison D.; Flumini, Andrea; Borghi, Anna M.; Buxbaum, Laurel J.
2013-01-01
Substantial evidence suggests that conceptual processing of manipulable objects is associated with potentiation of action. Such data have been viewed as evidence that objects are recognized via access to action features. Many objects, however, are associated with multiple actions. For example, a kitchen timer may be clenched with a power grip to move it, but pinched with a precision grip to use it. The present study tested the hypothesis that action evocation during conceptual object processing is responsive to the visual scene in which objects are presented. Twenty-five healthy adults were asked to categorize object pictures presented in different naturalistic visual contexts that evoke either move- or use-related actions. Categorization judgments (natural vs. artifact) were performed by executing a move- or use-related action (clench vs. pinch) on a response device, and response times were assessed as a function of contextual congruence. Although the actions performed were irrelevant to the categorization judgment, responses were significantly faster when actions were compatible with the visual context. This compatibility effect was largely driven by faster pinch responses when objects were presented in use- compared to move-compatible contexts. The present study is the first to highlight the influence of visual scene on stimulus-response compatibility effects during semantic object processing. These data support the hypothesis that action evocation during conceptual object processing is biased toward context-relevant actions. PMID:24186270
Integrating visual learning within a model-based ATR system
NASA Astrophysics Data System (ADS)
Carlotto, Mark; Nebrich, Mark
2017-05-01
Automatic target recognition (ATR) systems, like human photo-interpreters, rely on a variety of visual information for detecting, classifying, and identifying manmade objects in aerial imagery. We describe the integration of a visual learning component into the Image Data Conditioner (IDC) for target/clutter and other visual classification tasks. The component is based on an implementation of a model of the visual cortex developed by Serre, Wolf, and Poggio. Visual learning in an ATR context requires the ability to recognize objects independent of location, scale, and rotation. Our method uses IDC to extract, rotate, and scale image chips at candidate target locations. A bootstrap learning method effectively extends the operation of the classifier beyond the training set and provides a measure of confidence. We show how the classifier can be used to learn other features that are difficult to compute from imagery such as target direction, and to assess the performance of the visual learning process itself.
Face Pareidolia in the Rhesus Monkey.
Taubert, Jessica; Wardle, Susan G; Flessert, Molly; Leopold, David A; Ungerleider, Leslie G
2017-08-21
Face perception in humans and nonhuman primates is rapid and accurate [1-4]. In the human brain, a network of visual-processing regions is specialized for faces [5-7]. Although face processing is a priority of the primate visual system, face detection is not infallible. Face pareidolia is the compelling illusion of perceiving facial features on inanimate objects, such as the illusory face on the surface of the moon. Although face pareidolia is commonly experienced by humans, its presence in other species is unknown. Here we provide evidence for face pareidolia in a species known to possess a complex face-processing system [8-10]: the rhesus monkey (Macaca mulatta). In a visual preference task [11, 12], monkeys looked longer at photographs of objects that elicited face pareidolia in human observers than at photographs of similar objects that did not elicit illusory faces. Examination of eye movements revealed that monkeys fixated the illusory internal facial features in a pattern consistent with how they view photographs of faces [13]. Although the specialized response to faces observed in humans [1, 3, 5-7, 14] is often argued to be continuous across primates [4, 15], it was previously unclear whether face pareidolia arose from a uniquely human capacity. For example, pareidolia could be a product of the human aptitude for perceptual abstraction or result from frequent exposure to cartoons and illustrations that anthropomorphize inanimate objects. Instead, our results indicate that the perception of illusory facial features on inanimate objects is driven by a broadly tuned face-detection mechanism that we share with other species. Published by Elsevier Ltd.
Basic level category structure emerges gradually across human ventral visual cortex.
Iordan, Marius Cătălin; Greene, Michelle R; Beck, Diane M; Fei-Fei, Li
2015-07-01
Objects can be simultaneously categorized at multiple levels of specificity ranging from very broad ("natural object") to very distinct ("Mr. Woof"), with a mid-level of generality (basic level: "dog") often providing the most cognitively useful distinction between categories. It is unknown, however, how this hierarchical representation is achieved in the brain. Using multivoxel pattern analyses, we examined how well each taxonomic level (superordinate, basic, and subordinate) of real-world object categories is represented across occipitotemporal cortex. We found that, although in early visual cortex objects are best represented at the subordinate level (an effect mostly driven by low-level feature overlap between objects in the same category), this advantage diminishes compared to the basic level as we move up the visual hierarchy, disappearing in object-selective regions of occipitotemporal cortex. This pattern stems from a combined increase in within-category similarity (category cohesion) and between-category dissimilarity (category distinctiveness) of neural activity patterns at the basic level, relative to both subordinate and superordinate levels, suggesting that successive visual areas may be optimizing basic level representations.
Invariant Visual Object and Face Recognition: Neural and Computational Bases, and a Model, VisNet
Rolls, Edmund T.
2012-01-01
Neurophysiological evidence for invariant representations of objects and faces in the primate inferior temporal visual cortex is described. Then a computational approach to how invariant representations are formed in the brain is described that builds on the neurophysiology. A feature hierarchy model in which invariant representations can be built by self-organizing learning based on the temporal and spatial statistics of the visual input produced by objects as they transform in the world is described. VisNet can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or it can use spatial continuity in continuous spatial transformation learning which does not require a temporal trace. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size, and also lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects such as looming, rotation, and object-based movement. The model has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in, for example, spatial and object search tasks. The approach has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene. The approach has also been extended to provide, with an additional layer, for the development of representations of spatial scenes of the type found in the hippocampus. PMID:22723777
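The associative learning rule with a short-term memory trace mentioned in this abstract can be sketched for a single linear output neuron. This is a minimal illustration, not the VisNet implementation: the function name `trace_rule_update`, the linear activation, the weight normalization, and the values of `eta` and `alpha` are all assumptions made for the example.

```python
import numpy as np

def trace_rule_update(w, x_seq, eta=0.8, alpha=0.1):
    """Apply a trace-based Hebbian update over a sequence of inputs.

    w     : weight vector of one output neuron (1-D array)
    x_seq : iterable of input vectors, e.g. successive views of one object
    eta   : trace persistence (hypothetical value); 0 gives plain Hebbian learning
    alpha : learning rate (hypothetical value)
    """
    w = np.asarray(w, dtype=float).copy()
    y_trace = 0.0
    for x in x_seq:
        x = np.asarray(x, dtype=float)
        y = float(w @ x)                           # linear activation (an assumption)
        y_trace = (1.0 - eta) * y + eta * y_trace  # short-term memory trace of activity
        w += alpha * y_trace * x                   # associative update driven by the trace
        w /= np.linalg.norm(w)                     # keep the weight vector bounded
    return w
```

Because the trace carries activity forward from the previous input, an input that follows closely in time can be associated with the same neuron even when it does not activate the neuron on its own, which is the intuition behind learning from temporal continuity.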
Feature binding, attention and object perception.
Treisman, A
1998-01-01
The seemingly effortless ability to perceive meaningful objects in an integrated scene actually depends on complex visual processes. The 'binding problem' concerns the way in which we select and integrate the separate features of objects in the correct combinations. Experiments suggest that attention plays a central role in solving this problem. Some neurological patients show a dramatic breakdown in the ability to see several objects; their deficits suggest a role for the parietal cortex in the binding process. However, indirect measures of priming and interference suggest that more information may be implicitly available than we can consciously access. PMID:9770223
Selections from the ABC 2012 Annual Convention, Honolulu, Hawaii
ERIC Educational Resources Information Center
Whalen, D. Joel
2013-01-01
The 13 Favorite Assignments featured here were presented at the 2012 Association for Business Communication (ABC) Annual Convention, Honolulu, Hawaii. A variety of learning objectives are featured, including the following: enhancing resume's visual impact, interpersonal skills, social media, team building, web design, community service projects,…
The Identity-Location Binding Problem.
Howe, Piers D L; Ferguson, Adam
2015-09-01
The binding problem is fundamental to visual perception. It is the problem of associating an object's visual properties with itself and not with some other object. The problem is made particularly difficult because different properties of an object, such as its color, shape, size, and motion, are often processed independently, sometimes in different cortical areas. The results of these separate analyses have to be combined before the object can be seen as a single coherent entity as opposed to a collection of unconnected features. Visual bindings are typically initiated and updated in a serial fashion, one object at a time. Here, we show that one type of binding, location-identity bindings, can be updated in parallel. We do this by using two complementary techniques, the simultaneous-sequential paradigm and systems factorial technology. These techniques make different assumptions and rely on different behavioral measures, yet both came to the same conclusion. Copyright © 2014 Cognitive Science Society, Inc.
A PDP model of the simultaneous perception of multiple objects
NASA Astrophysics Data System (ADS)
Henderson, Cynthia M.; McClelland, James L.
2011-06-01
Illusory conjunctions in normal and simultanagnosic subjects are two instances where the visual features of multiple objects are incorrectly 'bound' together. A connectionist model explores how multiple objects could be perceived correctly in normal subjects given sufficient time, but could give rise to illusory conjunctions with damage or time pressure. In this model, perception of two objects benefits from lateral connections between hidden layers modelling aspects of the ventral and dorsal visual pathways. As with simultanagnosia, simulations of dorsal lesions impair multi-object recognition. In contrast, a large ventral lesion has minimal effect on dorsal functioning, akin to dissociations between simple object manipulation (retained in visual form agnosia and semantic dementia) and object discrimination (impaired in these disorders) [Hodges, J.R., Bozeat, S., Lambon Ralph, M.A., Patterson, K., and Spatt, J. (2000), 'The Role of Conceptual Knowledge: Evidence from Semantic Dementia', Brain, 123, 1913-1925; Milner, A.D., and Goodale, M.A. (2006), The Visual Brain in Action (2nd ed.), New York: Oxford]. It is hoped that the functioning of this model might suggest potential processes underlying dorsal and ventral contributions to the correct perception of multiple objects.
Category-based attentional guidance can operate in parallel for multiple target objects.
Jenkins, Michael; Grubert, Anna; Eimer, Martin
2018-05-01
The question whether the control of attention during visual search is always feature-based or can also be based on the category of objects remains unresolved. Here, we employed the N2pc component as an on-line marker for target selection processes to compare the efficiency of feature-based and category-based attentional guidance. Two successive displays containing pairs of real-world objects (line drawings of kitchen or clothing items) were separated by a 10 ms SOA. In Experiment 1, target objects were defined by their category. In Experiment 2, one specific visual object served as target (exemplar-based search). On different trials, targets appeared either in one or in both displays, and participants had to report the number of targets (one or two). Target N2pc components were larger and emerged earlier during exemplar-based search than during category-based search, demonstrating the superior efficiency of feature-based attentional guidance. On trials where target objects appeared in both displays, both targets elicited N2pc components that overlapped in time, suggesting that attention was allocated in parallel to these target objects. Critically, this was the case not only in the exemplar-based task, but also when targets were defined by their category. These results demonstrate that attention can be guided by object categories, and that this type of category-based attentional control can operate concurrently for multiple target objects. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Dostal, P.; Krasula, L.; Klima, M.
2012-06-01
Various image processing techniques in multimedia technology are optimized using the visual attention feature of the human visual system (HVS). Because of spatial non-uniformity, different locations in an image differ in their importance for perception of the image. In other words, the perceived image quality depends mainly on the quality of important locations known as regions of interest (ROI). The performance of such techniques is measured by subjective evaluation or by objective image quality criteria. Many state-of-the-art objective metrics are based on HVS properties: SSIM and MS-SSIM are based on image structural information, VIF is based on the information that the human brain can ideally gain from the reference image, and FSIM uses low-level features to assign a different importance to each location in the image. Still, none of these objective metrics uses an analysis of regions of interest. We address the question of whether these objective metrics can be used for effective evaluation of images reconstructed by processing techniques based on ROI analysis using high-level features. In this paper, the authors show that the state-of-the-art objective metrics do not correlate well with subjective evaluation when demosaicing based on ROI analysis is used for reconstruction. The ROI were computed from "ground truth" visual attention data. An algorithm combining two known demosaicing techniques on the basis of ROI location is proposed to reconstruct the ROI in fine quality while the rest of the image is reconstructed in low quality. The color image reconstructed by this ROI approach was compared with selected demosaicing techniques by objective criteria and subjective testing. The qualitative comparison of the objective and subjective results indicates that the state-of-the-art objective metrics are still not suitable for evaluating image processing techniques based on ROI analysis and that new criteria are needed.
Real-Time Visual Tracking through Fusion Features
Ruan, Yang; Wei, Zhenzhong
2016-01-01
Due to their high speed, correlation filters for object tracking have begun to receive increasing attention. Traditional object trackers based on correlation filters typically use a single type of feature. In this paper, we attempt to integrate multiple feature types to improve the performance, and we propose a new DD-HOG fusion feature that consists of discriminative descriptors (DDs) and histograms of oriented gradients (HOG). However, fusion features as multi-vector descriptors cannot be directly used in prior correlation filters. To overcome this difficulty, we propose a multi-vector correlation filter (MVCF) that can directly convolve with a multi-vector descriptor to obtain a single-channel response that indicates the location of an object. Experiments on the CVPR2013 tracking benchmark with the evaluation of state-of-the-art trackers show the effectiveness and speed of the proposed method. Moreover, we show that our MVCF tracker, which uses the DD-HOG descriptor, outperforms the structure-preserving object tracker (SPOT) in multi-object tracking because of its high speed and ability to address heavy occlusion. PMID:27347951
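The idea of collapsing a multi-channel descriptor into a single-channel response map can be sketched with a generic Fourier-domain cross-correlation summed over channels. This is not the authors' MVCF and involves no filter training; the function names `multichannel_response` and `peak_location` are hypothetical, and the filter here is simply a stored template.

```python
import numpy as np

def multichannel_response(filters, features):
    """Correlate each feature channel with its filter and sum over channels.

    filters, features : arrays of shape (channels, height, width)
    Returns one (height, width) response map; its peak marks the estimated
    circular shift of the features relative to the filter template.
    """
    F = np.fft.fft2(features, axes=(-2, -1))
    H = np.fft.fft2(filters, axes=(-2, -1))
    # Circular cross-correlation per channel in the Fourier domain,
    # summed across channels into a single-channel spectrum.
    summed = np.sum(np.conj(H) * F, axis=0)
    return np.fft.ifft2(summed).real

def peak_location(response):
    """Return (row, col) of the maximum of the response map."""
    return np.unravel_index(np.argmax(response), response.shape)
```

Shifting a template with `np.roll` and passing the original as `filters` and the shifted copy as `features` should place the response peak at the applied shift, which is how a tracker of this general family would read off the new object location.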
van Lamsweerde, Amanda E; Beck, Melissa R
2015-12-01
In this study, we investigated whether the ability to learn probability information is affected by the type of representation held in visual working memory. Across 4 experiments, participants detected changes to displays of coloured shapes. While participants detected changes in 1 dimension (e.g., colour), a feature from a second, nonchanging dimension (e.g., shape) predicted which object was most likely to change. In Experiments 1 and 3, items could be grouped by similarity in the changing dimension across items (e.g., colours and shapes were repeated in the display), while in Experiments 2 and 4 items could not be grouped by similarity (all features were unique). Probability information from the predictive dimension was learned and used to increase performance, but only when all of the features within a display were unique (Experiments 2 and 4). When it was possible to group by feature similarity in the changing dimension (e.g., 2 blue objects appeared within an array), participants were unable to learn probability information and use it to improve performance (Experiments 1 and 3). The results suggest that probability information can be learned in a dimension that is not explicitly task-relevant, but only when the probability information is represented with the changing dimension in visual working memory. (c) 2015 APA, all rights reserved.
Hiding and finding: the relationship between visual concealment and visual search.
Smilek, Daniel; Weinheimer, Laura; Kwan, Donna; Reynolds, Mike; Kingstone, Alan
2009-11-01
As an initial step toward developing a theory of visual concealment, we assessed whether people would use factors known to influence visual search difficulty when the degree of concealment of objects among distractors was varied. In Experiment 1, participants arranged search objects (shapes, emotional faces, and graphemes) to create displays in which the targets were in plain sight but were either easy or hard to find. Analyses of easy and hard displays created during Experiment 1 revealed that the participants reliably used factors known to influence search difficulty (e.g., eccentricity, target-distractor similarity, presence/absence of a feature) to vary the difficulty of search across displays. In Experiment 2, a new participant group searched for the targets in the displays created by the participants in Experiment 1. Results indicated that search was more difficult in the hard than in the easy condition. In Experiments 3 and 4, participants used presence versus absence of a feature to vary search difficulty with several novel stimulus sets. Taken together, the results reveal a close link between the factors that govern concealment and the factors known to influence search difficulty, suggesting that a visual search theory can be extended to form the basis of a theory of visual concealment.
Audiovisual Perception of Congruent and Incongruent Dutch Front Vowels
ERIC Educational Resources Information Center
Valkenier, Bea; Duyne, Jurriaan Y.; Andringa, Tjeerd C.; Baskent, Deniz
2012-01-01
Purpose: Auditory perception of vowels in background noise is enhanced when combined with visually perceived speech features. The objective of this study was to investigate whether the influence of visual cues on vowel perception extends to incongruent vowels, in a manner similar to the McGurk effect observed with consonants. Method:…
The role of attention in figure-ground segregation in areas V1 and V4 of the visual cortex.
Poort, Jasper; Raudies, Florian; Wannig, Aurel; Lamme, Victor A F; Neumann, Heiko; Roelfsema, Pieter R
2012-07-12
Our visual system segments images into objects and background. Figure-ground segregation relies on the detection of feature discontinuities that signal boundaries between the figures and the background and on a complementary region-filling process that groups together image regions with similar features. The neuronal mechanisms for these processes are not well understood and it is unknown how they depend on visual attention. We measured neuronal activity in V1 and V4 in a task where monkeys either made an eye movement to texture-defined figures or ignored them. V1 activity predicted the timing and the direction of the saccade if the figures were task relevant. We found that boundary detection is an early process that depends little on attention, whereas region filling occurs later and is facilitated by visual attention, which acts in an object-based manner. Our findings are explained by a model with local, bottom-up computations for boundary detection and feedback processing for region filling. Copyright © 2012 Elsevier Inc. All rights reserved.
Feature Integration in the Mapping of Multi-Attribute Visual Stimuli to Responses
Ishizaki, Takuya; Morita, Hiromi; Morita, Masahiko
2015-01-01
In the human visual system, different attributes of an object, such as shape and color, are separately processed in different modules and then integrated to elicit a specific response. In this process, different attributes are thought to be temporarily “bound” together by focusing attention on the object; however, how such binding contributes to stimulus-response mapping remains unclear. Here we report that learning and performance of stimulus-response tasks was more difficult when three attributes of the stimulus determined the correct response than when two attributes did. We also found that spatially separated presentations of attributes considerably complicated the task, although they did not markedly affect target detection. These results are consistent with a paired-attribute model in which bound feature pairs, rather than object representations, are associated with responses by learning. This suggests that attention does not bind three or more attributes into a unitary object representation, and long-term learning is required for their integration. PMID:25762010
The guidance of visual search by shape features and shape configurations.
McCants, Cody W; Berggren, Nick; Eimer, Martin
2018-03-01
Representations of target features (attentional templates) guide attentional object selection during visual search. In many search tasks, target objects are defined not by a single feature but by the spatial configuration of their component shapes. We used electrophysiological markers of attentional selection processes to determine whether the guidance of shape configuration search is entirely part-based or sensitive to the spatial relationship between shape features. Participants searched for targets defined by the spatial arrangement of two shape components (e.g., hourglass above circle). N2pc components were triggered not only by targets but also by partially matching distractors with one target shape (e.g., hourglass above hexagon) and by distractors that contained both target shapes in the reverse arrangement (e.g., circle above hourglass), in line with part-based attentional control. Target N2pc components were delayed when a reverse distractor was present on the opposite side of the same display, suggesting that early shape-specific attentional guidance processes could not distinguish between targets and reverse distractors. The control of attention then became sensitive to spatial configuration, which resulted in a stronger attentional bias for target objects relative to reverse and partially matching distractors. Results demonstrate that search for target objects defined by the spatial arrangement of their component shapes is initially controlled in a feature-based fashion but can later be guided by templates for spatial configurations. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Similarity relations in visual search predict rapid visual categorization
Mohan, Krithika; Arun, S. P.
2012-01-01
How do we perform rapid visual categorization? It is widely thought that categorization involves evaluating the similarity of an object to other category items, but the underlying features and similarity relations remain unknown. Here, we hypothesized that categorization performance is based on perceived similarity relations between items within and outside the category. To this end, we measured the categorization performance of human subjects on three diverse visual categories (animals, vehicles, and tools) and across three hierarchical levels (superordinate, basic, and subordinate levels among animals). For the same subjects, we measured their perceived pair-wise similarities between objects using a visual search task. Regardless of category and hierarchical level, we found that the time taken to categorize an object could be predicted using its similarity to members within and outside its category. We were able to account for several classic categorization phenomena, such as (a) the longer times required to reject category membership; (b) the longer times to categorize atypical objects; and (c) differences in performance across tasks and across hierarchical levels. These categorization times were also accounted for by a model that extracts coarse structure from an image. The striking agreement observed between categorization and visual search suggests that these two disparate tasks depend on a shared coarse object representation. PMID:23092947
Cultural differences in visual object recognition in 3-year-old children
Kuwabara, Megumi; Smith, Linda B.
2016-01-01
Recent research indicates that culture penetrates fundamental processes of perception and cognition (e.g. Nisbett & Miyamoto, 2005). Here, we provide evidence that these influences begin early and influence how preschool children recognize common objects. The three tasks (n=128) examined the degree to which nonface object recognition by 3-year-olds was based on individual diagnostic features versus more configural and holistic processing. Task 1 used a 6-alternative forced choice task in which children were asked to find a named category in arrays of masked objects in which only 3 diagnostic features were visible for each object. U.S. children outperformed age-matched Japanese children. Task 2 presented pictures of objects to children piece by piece. U.S. children recognized the objects given fewer pieces than Japanese children, and the likelihood of recognition increased for U.S. children, but not Japanese children, when the piece added was rated by both U.S. and Japanese adults as highly defining. Task 3 used a standard measure of configural processing, asking the degree to which recognition of matching pictures was disrupted by the rotation of one picture. Japanese children’s recognition was more disrupted by inversion than was that of U.S. children, indicating more configural processing by Japanese than U.S. children. The pattern suggests early cross-cultural differences in visual processing, findings that raise important questions about how visual experiences differ across cultures and about universal patterns of cognitive development. PMID:26985576
Cultural differences in visual object recognition in 3-year-old children.
Kuwabara, Megumi; Smith, Linda B
2016-07-01
Recent research indicates that culture penetrates fundamental processes of perception and cognition. Here, we provide evidence that these influences begin early and influence how preschool children recognize common objects. The three tasks (N=128) examined the degree to which nonface object recognition by 3-year-olds was based on individual diagnostic features versus more configural and holistic processing. Task 1 used a 6-alternative forced choice task in which children were asked to find a named category in arrays of masked objects where only three diagnostic features were visible for each object. U.S. children outperformed age-matched Japanese children. Task 2 presented pictures of objects to children piece by piece. U.S. children recognized the objects given fewer pieces than Japanese children, and the likelihood of recognition increased for U.S. children, but not Japanese children, when the piece added was rated by both U.S. and Japanese adults as highly defining. Task 3 used a standard measure of configural processing, asking the degree to which recognition of matching pictures was disrupted by the rotation of one picture. Japanese children's recognition was more disrupted by inversion than was that of U.S. children, indicating more configural processing by Japanese than U.S. children. The pattern suggests early cross-cultural differences in visual processing, findings that raise important questions about how visual experiences differ across cultures and about universal patterns of cognitive development. Copyright © 2016 Elsevier Inc. All rights reserved.
Modeling global scene factors in attention
NASA Astrophysics Data System (ADS)
Torralba, Antonio
2003-07-01
Models of visual attention have focused predominantly on bottom-up approaches that ignored structured contextual and scene information. I propose a model of contextual cueing for attention guidance based on the global scene configuration. It is shown that the statistics of low-level features across the whole image can be used to prime the presence or absence of objects in the scene and to predict their location, scale, and appearance before exploring the image. In this scheme, visual context information can become available early in the visual processing chain, which allows modulation of the saliency of image regions and provides an efficient shortcut for object detection and recognition. 2003 Optical Society of America
Preattentive binding of auditory and visual stimulus features.
Winkler, István; Czigler, István; Sussman, Elyse; Horváth, János; Balázs, Lászlo
2005-02-01
We investigated the role of attention in feature binding in the auditory and the visual modality. One auditory and one visual experiment used the mismatch negativity (MMN and vMMN, respectively) event-related potential to index the memory representations created from stimulus sequences, which were either task-relevant and, therefore, attended or task-irrelevant and ignored. In the latter case, the primary task was a continuous demanding within-modality task. The test sequences were composed of two frequently occurring stimuli, which differed from each other in two stimulus features (standard stimuli) and two infrequently occurring stimuli (deviants), which combined one feature from one standard stimulus with the other feature of the other standard stimulus. Deviant stimuli elicited MMN responses of similar parameters across the different attentional conditions. These results suggest that the memory representations involved in the MMN deviance detection response encoded the frequently occurring feature combinations whether or not the test sequences were attended. A possible alternative to the memory-based interpretation of the visual results, the elicitation of the McCollough color-contingent aftereffect, was ruled out by the results of our third experiment. The current results are compared with those supporting the attentive feature integration theory. We conclude that (1) with comparable stimulus paradigms, similar results have been obtained in the two modalities, (2) there exist preattentive processes of feature binding, however, (3) conjoining features within rich arrays of objects under time pressure and/or longterm retention of the feature-conjoined memory representations may require attentive processes.
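The sequence design described above (two standards that differ in two features, plus deviants that recombine those features) can be illustrated with a minimal Python sketch. The feature values ("red", "vertical", and so on) and the function name build_stimulus_set are hypothetical and chosen only for illustration; the abstract does not state which features were used.

```python
from itertools import product

def build_stimulus_set(standard_a, standard_b):
    """Return (standards, deviants) for two standards that differ in two features.

    Each stimulus is a (feature_1, feature_2) tuple. A deviant combines one
    feature of one standard with the other feature of the other standard.
    """
    if standard_a[0] == standard_b[0] or standard_a[1] == standard_b[1]:
        raise ValueError("standards must differ in both features")
    standards = [standard_a, standard_b]
    # All four pairings of the two feature_1 values with the two feature_2 values.
    all_combinations = product((standard_a[0], standard_b[0]),
                               (standard_a[1], standard_b[1]))
    # The pairings that are not standards are the recombined deviants.
    deviants = [s for s in all_combinations if s not in standards]
    return standards, deviants

# Hypothetical feature values, for illustration only.
standards, deviants = build_stimulus_set(("red", "vertical"), ("green", "horizontal"))
print(deviants)
```

Because every deviant consists only of feature values that also occur in the frequent standards, a deviance response to it can only arise if the combination of features, and not the individual features, was encoded.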
Perceptual grouping enhances visual plasticity.
Mastropasqua, Tommaso; Turatto, Massimo
2013-01-01
Visual perceptual learning, a manifestation of neural plasticity, refers to improvements in performance on a visual task achieved by training. Attention is known to play an important role in perceptual learning, given that the observer's discriminative ability improves only for those stimulus features that are attended. However, the distribution of attention can be severely constrained by perceptual grouping, a process whereby the visual system organizes the initial retinal input into candidate objects. Taken together, these two pieces of evidence suggest the interesting possibility that perceptual grouping might also affect perceptual learning, either directly or via attentional mechanisms. To address this issue, we conducted two experiments. During the training phase, participants attended to the contrast of the task-relevant stimulus (oriented grating), while two similar task-irrelevant stimuli were presented in the adjacent positions. One of the two flanking stimuli was perceptually grouped with the attended stimulus as a consequence of its similar orientation (Experiment 1) or because it was part of the same perceptual object (Experiment 2). A test phase followed the training phase at each location. Compared to the task-irrelevant no-grouping stimulus, orientation discrimination improved at the attended location. Critically, a perceptual learning effect equivalent to the one observed for the attended location also emerged for the task-irrelevant grouping stimulus, indicating that perceptual grouping induced a transfer of learning to the stimulus (or feature) being perceptually grouped with the task-relevant one. Our findings indicate that no voluntary effort to direct attention to the grouping stimulus or feature is necessary to enhance visual plasticity.
Multispectral image analysis for object recognition and classification
NASA Astrophysics Data System (ADS)
Viau, C. R.; Payeur, P.; Cretu, A.-M.
2016-05-01
Computer and machine vision applications are used in numerous fields to analyze static and dynamic imagery in order to assist or automate decision-making processes. Advancements in sensor technologies now make it possible to capture and visualize imagery at various wavelengths (or bands) of the electromagnetic spectrum. Multispectral imaging has countless applications in various fields including (but not limited to) security, defense, space, medical, manufacturing and archeology. The development of advanced algorithms to process and extract salient information from the imagery is a critical component of the overall system performance. The fundamental objective of this research project was to investigate the benefits of combining imagery from the visual and thermal bands of the electromagnetic spectrum to improve the recognition rates and accuracy of commonly found objects in an office setting. A multispectral dataset (visual and thermal) was captured and features from the visual and thermal images were extracted and used to train support vector machine (SVM) classifiers. The SVM's class prediction ability was evaluated separately on the visual, thermal and multispectral testing datasets.
ERIC Educational Resources Information Center
Hollingworth, Andrew; Franconeri, Steven L.
2009-01-01
The "correspondence problem" is a classic issue in vision and cognition. Frequent perceptual disruptions, such as saccades and brief occlusion, create gaps in perceptual input. How does the visual system establish correspondence between objects visible before and after the disruption? Current theories hold that object correspondence is established…
Reward associations impact both iconic and visual working memory.
Infanti, Elisa; Hickey, Clayton; Turatto, Massimo
2015-02-01
Reward plays a fundamental role in human behavior. A growing number of studies have shown that stimuli associated with reward become salient and attract attention. The aim of the present study was to extend these results into the investigation of iconic memory and visual working memory. In two experiments we asked participants to perform a visual-search task where different colors of the target stimuli were paired with high or low reward. We then tested whether the pre-established feature-reward association affected performance on a subsequent visual memory task, in which no reward was provided. In this test phase participants viewed arrays of 8 objects, one of which had unique color that could match the color associated with reward during the previous visual-search task. A probe appeared at varying intervals after stimulus offset to identify the to-be-reported item. Our results suggest that reward biases the encoding of visual information such that items characterized by a reward-associated feature interfere with mnemonic representations of other items in the test display. These results extend current knowledge regarding the influence of reward on early cognitive processes, suggesting that feature-reward associations automatically interact with the encoding and storage of visual information, both in iconic memory and visual working memory. Copyright © 2014 Elsevier Ltd. All rights reserved.
Hemispheric asymmetry of liking for representational and abstract paintings.
Nadal, Marcos; Schiavi, Susanna; Cattaneo, Zaira
2017-10-13
Although the neural correlates of the appreciation of aesthetic qualities have been the target of much research in the past decade, few experiments have explored the hemispheric asymmetries in underlying processes. In this study, we used a divided visual field paradigm to test for hemispheric asymmetries in men and women's preference for abstract and representational artworks. Both male and female participants liked representational paintings more when presented in the right visual field, whereas preference for abstract paintings was unaffected by presentation hemifield. We hypothesize that this result reflects a facilitation of the sort of visual processes relevant to laypeople's liking for art (specifically, local processing of highly informative object features) when artworks are presented in the right visual field, given the left hemisphere's advantage in processing such features.
Time course of spatial and feature selective attention for partly-occluded objects.
Kasai, Tetsuko; Takeya, Ryuji
2012-07-01
Attention selects objects/groups as the most fundamental units, and this may be achieved by an attention-spreading mechanism. Previous event-related potential (ERP) studies have found that attention-spreading is reflected by a decrease in the N1 spatial attention effect. The present study tested whether the electrophysiological attention effect is associated with the perception of object unity or amodal completion through the use of partly-occluded objects. ERPs were recorded in 14 participants who were required to pay attention to their left or right visual field and to press a button for a target shape in the attended field. Bilateral stimuli were presented rapidly, and were separated, connected, or connected behind an occluder. Behavioral performance in the connected and occluded conditions was worse than that in the separated condition, indicating that attention spread over perceptual object representations after amodal completion. Consistently, the late N1 spatial attention effect (180-220 ms post-stimulus) and the early phase (230-280 ms) of feature selection effects (target N2) at contralateral sites decreased, equally for the occluded and connected conditions, while the attention effect in the early N1 latency (140-180 ms) shifted most positively for the occluded condition. These results suggest that perceptual organization processes for object recognition transiently modulate spatial and feature selection processes in the visual cortex. Copyright © 2012 Elsevier Ltd. All rights reserved.
Classification of CT examinations for COPD visual severity analysis
NASA Astrophysics Data System (ADS)
Tan, Jun; Zheng, Bin; Wang, Xingwei; Pu, Jiantao; Gur, David; Sciurba, Frank C.; Leader, J. Ken
2012-03-01
In this study we present a computational method for classifying CT examinations by visually assessed emphysema severity. The visual severity categories ranged from 0 to 5 and were rated by an experienced radiologist. The six categories were none, trace, mild, moderate, severe and very severe. Lung segmentation was performed for every input image and all image features were extracted from the segmented lung only. We adopted a two-level feature representation method for the classification. Five gray level distribution statistics, six gray level co-occurrence matrix (GLCM) features, and eleven gray level run-length (GLRL) features were computed for the segmented lung depicted in each CT image. Then we used wavelet decomposition to obtain the low- and high-frequency components of the input image, and again extracted six GLCM features and eleven GLRL features from the lung region. Therefore our feature vector length is 56. The CT examinations were classified using the support vector machine (SVM), k-nearest neighbors (KNN), and the traditional threshold (density mask) approach. The SVM classifier had the highest classification performance of all the methods, with an overall sensitivity of 54.4% and a 69.6% sensitivity to discriminate "no" and "trace" visually assessed emphysema. We believe this work may lead to an automated, objective method to categorically classify emphysema severity on CT examinations.
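The stated feature vector length of 56 can be checked with a short Python sketch. It assumes that the six GLCM and eleven GLRL features were computed once for each of the two wavelet components (low- and high-frequency); the abstract does not spell this out, but the assumption reproduces the reported total. The constant and function names are illustrative only.

```python
# Feature counts taken from the abstract.
N_HISTOGRAM = 5            # gray level distribution statistics
N_GLCM = 6                 # gray level co-occurrence matrix features
N_GLRL = 11                # gray level run-length features
N_WAVELET_COMPONENTS = 2   # assumed: one low- and one high-frequency component

def feature_vector_length():
    """Total number of features across both levels of the representation."""
    # Level 1: features computed on the segmented lung in the original image.
    first_level = N_HISTOGRAM + N_GLCM + N_GLRL
    # Level 2: GLCM and GLRL features recomputed on each wavelet component.
    second_level = N_WAVELET_COMPONENTS * (N_GLCM + N_GLRL)
    return first_level + second_level

print(feature_vector_length())  # 22 + 34 = 56
```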
Kasai, Tetsuko; Moriya, Hiroki; Hirano, Shingo
2011-07-05
It has been proposed that the most fundamental units of attentional selection are "objects" that are grouped according to Gestalt factors such as similarity or connectedness. Previous studies using event-related potentials (ERPs) have shown that object-based attention is associated with modulations of the visual-evoked N1 component, which reflects an early cortical mechanism that is shared with spatial attention. However, these studies only examined the case of perceptually continuous objects. The present study examined the case of separate objects that are grouped according to feature similarity (color, shape) by indexing lateralized potentials at posterior sites in a sustained-attention task that involved bilateral stimulus arrays. A behavioral object effect was found only for task-relevant shape similarity. Electrophysiological results indicated that attention was guided to the task-irrelevant side of the visual field due to achromatic-color similarity in N1 (155-205 ms post-stimulus) and early N2 (210-260 ms) and due to shape similarity in early N2 and late N2 (280-400 ms) latency ranges. These results are discussed in terms of selection mechanisms and object/group representations. Copyright © 2011 Elsevier B.V. All rights reserved.
Persichetti, Andrew S; Aguirre, Geoffrey K; Thompson-Schill, Sharon L
2015-05-01
A central concern in the study of learning and decision-making is the identification of neural signals associated with the values of choice alternatives. An important factor in understanding the neural correlates of value is the representation of the object itself, separate from the act of choosing. Is it the case that the representation of an object within visual areas will change if it is associated with a particular value? We used fMRI adaptation to measure the neural similarity of a set of novel objects before and after participants learned to associate monetary values with the objects. We used a range of both positive and negative values to allow us to distinguish effects of behavioral salience (i.e., large vs. small values) from effects of valence (i.e., positive vs. negative values). During the scanning session, participants made a perceptual judgment unrelated to value. Crucially, the similarity of the visual features of any pair of objects did not predict the similarity of their value, so we could distinguish adaptation effects due to each dimension of similarity. Within early visual areas, we found that value similarity modulated the neural response to the objects after training. These results show that an abstract dimension, in this case, monetary value, modulates neural response to an object in visual areas of the brain even when attention is diverted.
Markman, Adam; Shen, Xin; Hua, Hong; Javidi, Bahram
2016-01-15
An augmented reality (AR) smartglass display combines real-world scenes with digital information enabling the rapid growth of AR-based applications. We present an augmented reality-based approach for three-dimensional (3D) optical visualization and object recognition using axially distributed sensing (ADS). For object recognition, the 3D scene is reconstructed, and feature extraction is performed by calculating the histogram of oriented gradients (HOG) of a sliding window. A support vector machine (SVM) is then used for classification. Once an object has been identified, the 3D reconstructed scene with the detected object is optically displayed in the smartglasses allowing the user to see the object, remove partial occlusions of the object, and provide critical information about the object such as 3D coordinates, which are not possible with conventional AR devices. To the best of our knowledge, this is the first report on combining axially distributed sensing with 3D object visualization and recognition for applications to augmented reality. The proposed approach can have benefits for many applications, including medical, military, transportation, and manufacturing.
Saiki, Jun
2002-01-01
Research on change blindness and transsaccadic memory revealed that a limited amount of information is retained across visual disruptions in visual working memory. It has been proposed that visual working memory can hold four to five coherent object representations. To investigate their maintenance and transformation in dynamic situations, I devised an experimental paradigm called multiple-object permanence tracking (MOPT) that measures memory for multiple feature-location bindings in dynamic situations. Observers were asked to detect any color switch in the middle of a regular rotation of a pattern with multiple colored disks behind an occluder. The color-switch detection performance dramatically declined as the pattern rotation velocity increased, and this effect of object motion was independent of the number of targets. The MOPT task with various shapes and colors showed that color-shape conjunctions are not available in the MOPT task. These results suggest that even completely predictable motion severely reduces our capacity of object representations, from four to only one or two.
Explaining seeing? Disentangling qualia from perceptual organization.
Ibáñez, Agustin; Bekinschtein, Tristan
2010-09-01
Visual perception and integration seem to play an essential role in our conscious phenomenology. Relatively local neural processing of reentrant nature may explain several visual integration processes (feature binding or figure-ground segregation, object recognition, inference, competition), even without attention or cognitive control. Based on the above statements, should the neural signatures of visual integration (via reentrant process) be non-reportable phenomenological qualia? We argue that qualia are not required to understand this perceptual organization.
Feedforward object-vision models only tolerate small image variations compared to human
Ghodrati, Masoud; Farzmahdi, Amirhossein; Rajaei, Karim; Ebrahimpour, Reza; Khaligh-Razavi, Seyed-Mahdi
2014-01-01
Invariant object recognition is a remarkable ability of the primate visual system whose underlying mechanism has been under constant, intense investigation. Computational modeling is a valuable tool toward understanding the processes involved in invariant object recognition. Although recent computational models have shown outstanding performance on challenging image databases, they fail to perform well in image categorization under more complex image variations. Studies have shown that making sparse representations of objects by extracting more informative visual features through a feedforward sweep can lead to higher recognition performance. Here, however, we show that when the complexity of image variations is high, even this approach results in poor performance compared to humans. To assess the performance of models and humans in invariant object recognition tasks, we built a parametrically controlled image database consisting of several object categories varied in different dimensions and levels, rendered from 3D planes. Comparing the performance of several object recognition models with human observers shows that the models perform similarly to humans in categorization tasks only under low-level image variations. Furthermore, the results of our behavioral experiments demonstrate that, even under difficult experimental conditions (i.e., briefly presented masked stimuli with complex image variations), human observers performed outstandingly well, suggesting that the models are still far from resembling humans in invariant object recognition. Taken together, we suggest that learning sparse informative visual features, although desirable, is not a complete solution for future progress in object-vision modeling. We show that this approach is not of significant help in solving the computational crux of object recognition (i.e., invariant object recognition) when the identity-preserving image variations become more complex. PMID:25100986
Sakata, H; Taira, M; Kusunoki, M; Murata, A; Tanaka, Y
1997-08-01
Recent neurophysiological studies in alert monkeys have revealed that the parietal association cortex plays a crucial role in depth perception and visually guided hand movement. The following five classes of parietal neurons covering various aspects of these functions have been identified: (1) depth-selective visual-fixation (VF) neurons of the inferior parietal lobule (IPL), representing egocentric distance; (2) depth-movement sensitive (DMS) neurons of V5A and the ventral intraparietal (VIP) area representing direction of linear movement in 3-D space; (3) depth-rotation-sensitive (RS) neurons of V5A and the posterior parietal (PP) area representing direction of rotary movement in space; (4) visually responsive manipulation-related neurons (visual-dominant or visual-and-motor type) of the anterior intraparietal (AIP) area, representing 3-D shape or orientation (or both) of objects for manipulation; and (5) axis-orientation-selective (AOS) and surface-orientation-selective (SOS) neurons in the caudal intraparietal sulcus (cIPS) sensitive to binocular disparity and representing the 3-D orientation of the longitudinal axes and flat surfaces, respectively. Some AOS and SOS neurons are selective in both orientation and shape. Thus the dorsal visual pathway is divided into at least two subsystems, V5A, PP and VIP areas for motion vision and V6, LIP and cIPS areas for coding position and 3-D features. The cIPS sends the signals of 3-D features of objects to the AIP area, which is reciprocally connected to the ventral premotor (F5) area and plays an essential role in matching hand orientation and shaping with 3-D objects for manipulation.
Asymmetrical access to color and location in visual working memory.
Rajsic, Jason; Wilson, Daryl E
2014-10-01
Models of visual working memory (VWM) have benefitted greatly from the use of the delayed-matching paradigm. However, in this task, the ability to recall a probed feature is confounded with the ability to maintain the proper binding between the feature that is to be reported and the feature (typically location) that is used to cue a particular item for report. Given that location is typically used as a cue-feature, we used the delayed-estimation paradigm to compare memory for location to memory for color, rotating which feature was used as a cue and which was reported. Our results revealed several novel findings: 1) the likelihood of reporting a probed object's feature was superior when reporting location with a color cue than when reporting color with a location cue; 2) location report errors were composed entirely of swap errors, with little to no random location reports; and 3) both color and location reports greatly benefitted from the presence of nonprobed items at test. This last finding suggests that uncertainty over the bindings between locations and colors at memory retrieval, not at encoding, drives swap errors. We interpret our findings as consistent with a representational architecture that nests remembered object features within remembered locations.
Norman, J Farley; Phillips, Flip; Cheeseman, Jacob R; Thomason, Kelsey E; Ronning, Cecilia; Behari, Kriti; Kleinman, Kayla; Calloway, Autum B; Lamirande, Davora
2016-01-01
It is well known that motion facilitates the visual perception of solid object shape, particularly when surface texture or other identifiable features (e.g., corners) are present. Conventional models of structure-from-motion require the presence of texture or identifiable object features in order to recover 3-D structure. Is the facilitation in 3-D shape perception similar in magnitude when surface texture is absent? On any given trial in the current experiments, participants were presented with a single randomly-selected solid object (bell pepper or randomly-shaped "glaven") for 12 seconds and were required to indicate which of 12 (for bell peppers) or 8 (for glavens) simultaneously visible objects possessed the same shape. The initial single object's shape was defined either by boundary contours alone (i.e., presented as a silhouette), specular highlights alone, specular highlights combined with boundary contours, or texture. In addition, there was a haptic condition: in this condition, the participants haptically explored with both hands (but could not see) the initial single object for 12 seconds; they then performed the same shape-matching task used in the visual conditions. For both the visual and haptic conditions, motion (rotation in depth or active object manipulation) was present in half of the trials and was not present for the remaining trials. The effect of motion was quantitatively similar for all of the visual and haptic conditions; for example, the participants' performance in Experiment 1 was 93.5 percent higher in the motion or active haptic manipulation conditions (when compared to the static conditions). The current results demonstrate that deforming specular highlights or boundary contours facilitate 3-D shape perception as much as the motion of objects that possess texture. The current results also indicate that the improvement with motion that occurs for haptics is similar in magnitude to that which occurs for vision.
A visual horizon affects steering responses during flight in fruit flies.
Caballero, Jorge; Mazo, Chantell; Rodriguez-Pinto, Ivan; Theobald, Jamie C
2015-09-01
To navigate well through three-dimensional environments, animals must in some way gauge the distances to objects and features around them. Humans use a variety of visual cues to do this, but insects, with their small size and rigid eyes, are constrained to a more limited range of possible depth cues. For example, insects attend to relative image motion when they move, but cannot change the optical power of their eyes to estimate distance. On clear days, the horizon is one of the most salient visual features in nature, offering clues about orientation, altitude and, for humans, distance to objects. We set out to determine whether flying fruit flies treat moving features as farther off when they are near the horizon. Tethered flies respond strongly to moving images they perceive as close. We measured the strength of steering responses while independently varying the elevation of moving stimuli and the elevation of a virtual horizon. We found responses to vertical bars are increased by negative elevations of their bases relative to the horizon, closely correlated with the inverse of apparent distance. In other words, a bar that dips far below the horizon elicits a strong response, consistent with using the horizon as a depth cue. Wide-field motion also had an enhanced effect below the horizon, but this was only prevalent when flies were additionally motivated with hunger. These responses may help flies tune behaviors to nearby objects and features when they are too far off for motion parallax. © 2015. Published by The Company of Biologists Ltd.
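The link between elevation below the horizon and apparent distance can be made concrete with a small Python sketch. It assumes simple flat-ground geometry, in which a point seen at an angle below the horizon lies at a ground distance of eye height divided by the tangent of that angle. This geometry, the function name apparent_distance, and the example values are illustrative assumptions and are not taken from the study itself.

```python
import math

def apparent_distance(eye_height, depression_deg):
    """Ground distance to a point seen depression_deg below the horizon.

    Assumes a flat ground plane and a viewer at eye_height above it
    (an illustrative assumption, not the authors' stated model).
    """
    if not 0 < depression_deg < 90:
        raise ValueError("depression must be between 0 and 90 degrees")
    return eye_height / math.tan(math.radians(depression_deg))

# The further a base dips below the horizon, the nearer it appears,
# so the inverse of apparent distance grows with the depression angle.
for angle in (5, 15, 45):
    d = apparent_distance(1.0, angle)
    print(angle, round(d, 2), round(1 / d, 2))
```

Under this assumption, a bar whose base dips far below the horizon has a small apparent distance and therefore a large inverse distance, which matches the direction of the reported steering responses.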
Cross-domain latent space projection for person re-identification
NASA Astrophysics Data System (ADS)
Pu, Nan; Wu, Song; Qian, Li; Xiao, Guoqiang
2018-04-01
In this paper, we study the problem of person re-identification and propose a cross-domain latent space projection (CDLSP) method to address the absence or insufficiency of labeled data in the target domain. Under the assumption that the visual features in the source domain and target domain share a similar geometric structure, we transform the visual features from the source domain and target domain to a common latent space by optimizing the objective function defined in the manifold alignment method. Moreover, the proposed objective function takes into account knowledge specific to re-id, with the aim of improving re-id performance in complex situations. Extensive experiments conducted on four benchmark datasets show that the proposed CDLSP outperforms or is competitive with state-of-the-art methods for person re-identification.
Li, Yuanqing; Wang, Fangyi; Chen, Yongbin; Cichocki, Andrzej; Sejnowski, Terrence
2017-09-25
At cocktail parties, our brains often simultaneously receive visual and auditory information. Although the cocktail party problem has been widely investigated under auditory-only settings, the effects of audiovisual inputs have not. This study explored the effects of audiovisual inputs in a simulated cocktail party. In our fMRI experiment, each congruent audiovisual stimulus was a synthesis of 2 facial movie clips, each of which could be classified into 1 of 2 emotion categories (crying and laughing). Visual-only (faces) and auditory-only stimuli (voices) were created by extracting the visual and auditory contents from the synthesized audiovisual stimuli. Subjects were instructed to selectively attend to 1 of the 2 objects contained in each stimulus and to judge its emotion category in the visual-only, auditory-only, and audiovisual conditions. The neural representations of the emotion features were assessed by calculating decoding accuracy and brain pattern-related reproducibility index based on the fMRI data. We compared the audiovisual condition with the visual-only and auditory-only conditions and found that audiovisual inputs enhanced the neural representations of emotion features of the attended objects instead of the unattended objects. This enhancement might partially explain the benefits of audiovisual inputs for the brain to solve the cocktail party problem. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Neural Encoding of Relative Position
ERIC Educational Resources Information Center
Hayworth, Kenneth J.; Lescroart, Mark D.; Biederman, Irving
2011-01-01
Late ventral visual areas generally consist of cells having a significant degree of translation invariance. Such a "bag of features" representation is useful for the recognition of individual objects; however, it seems unable to explain our ability to parse a scene into multiple objects and to understand their spatial relationships. We…
Grasp Preparation Improves Change Detection for Congruent Objects
ERIC Educational Resources Information Center
Symes, Ed; Tucker, Mike; Ellis, Rob; Vainio, Lari; Ottoboni, Giovanni
2008-01-01
A series of experiments provided converging support for the hypothesis that action preparation biases selective attention to action-congruent object features. When visual transients are masked in so-called "change-blindness scenes," viewers are blind to substantial changes between 2 otherwise identical pictures that flick back and forth. The…
Visual Aggregate Analysis of Eligibility Features of Clinical Trials
He, Zhe; Carini, Simona; Sim, Ida; Weng, Chunhua
2015-01-01
Objective: To develop a method for profiling the collective populations targeted for recruitment by multiple clinical studies addressing the same medical condition using one eligibility feature each time. Methods: Using a previously published database COMPACT as the backend, we designed a scalable method for visual aggregate analysis of clinical trial eligibility features. This method consists of four modules for eligibility feature frequency analysis, query builder, distribution analysis, and visualization, respectively. This method is capable of analyzing (1) frequently used qualitative and quantitative features for recruiting subjects for a selected medical condition, (2) distribution of study enrollment on consecutive value points or value intervals of each quantitative feature, and (3) distribution of studies on the boundary values, permissible value ranges, and value range widths of each feature. All analysis results were visualized using Google Charts API. Five recruited potential users assessed the usefulness of this method for identifying common patterns in any selected eligibility feature for clinical trial participant selection. Results: We implemented this method as a Web-based analytical system called VITTA (Visual Analysis Tool of Clinical Study Target Populations). We illustrated the functionality of VITTA using two sample queries involving quantitative features BMI and HbA1c for conditions “hypertension” and “Type 2 diabetes”, respectively. The recruited potential users rated the user-perceived usefulness of VITTA with an average score of 86.4/100. Conclusions: We contributed a novel aggregate analysis method to enable the interrogation of common patterns in quantitative eligibility criteria and the collective target populations of multiple related clinical studies. A larger-scale study is warranted to formally assess the usefulness of VITTA among clinical investigators and sponsors in various therapeutic areas. PMID:25615940
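One kind of aggregate analysis described above, the distribution of studies over value intervals of a quantitative feature, can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the VITTA implementation: the function name studies_per_interval, the half-open interval convention, and the BMI ranges are all hypothetical, and the data are invented for the example.

```python
def studies_per_interval(ranges, edges):
    """Count studies whose permissible range overlaps each interval [lo, hi).

    ranges: list of (min_value, max_value) per study; None means unbounded.
    edges:  sorted interval boundaries, e.g. [15, 20, 25].
    """
    counts = []
    for lo, hi in zip(edges, edges[1:]):
        n = 0
        for r_min, r_max in ranges:
            reaches_interval = r_max is None or r_max >= lo
            starts_before_end = r_min is None or r_min < hi
            if reaches_interval and starts_before_end:
                n += 1
        counts.append(((lo, hi), n))
    return counts

# Hypothetical BMI eligibility ranges for four studies (not real data).
bmi_ranges = [(18.5, 30), (25, 40), (None, 35), (30, None)]
for interval, n in studies_per_interval(bmi_ranges, [15, 20, 25, 30, 35, 40]):
    print(interval, n)
```

Counting overlaps per interval, rather than per study, shows at a glance which value ranges are covered by many studies and which are rarely eligible.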
NASA Astrophysics Data System (ADS)
Herold, Julia; Abouna, Sylvie; Zhou, Luxian; Pelengaris, Stella; Epstein, David B. A.; Khan, Michael; Nattkemper, Tim W.
2009-02-01
In recent years, bioimaging has turned from qualitative measurements towards a high-throughput and high-content modality, providing multiple variables for each biological sample analyzed. We present a system which combines machine learning based semantic image annotation and visual data mining to analyze such new multivariate bioimage data. Machine learning is employed for automatic semantic annotation of regions of interest. The annotation is the prerequisite for a biological object-oriented exploration of the feature space derived from the image variables. With the aid of visual data mining, the obtained data can be explored simultaneously in the image as well as in the feature domain. Especially when little is known of the underlying data, for example in the case of exploring the effects of a drug treatment, visual data mining can greatly aid the process of data evaluation. We demonstrate how our system is used for image evaluation to obtain information relevant to diabetes study and screening of new anti-diabetes treatments. Cells of the Islet of Langerhans and whole pancreas in pancreas tissue samples are annotated and object-specific molecular features are extracted from aligned multichannel fluorescence images. These are interactively evaluated for cell type classification in order to determine the cell number and mass. Only a few parameters need to be specified, which makes the system usable by non-computer experts and allows for high-throughput analysis.
Foveal analysis and peripheral selection during active visual sampling
Ludwig, Casimir J. H.; Davies, J. Rhys; Eckstein, Miguel P.
2014-01-01
Human vision is an active process in which information is sampled during brief periods of stable fixation in between gaze shifts. Foveal analysis serves to identify the currently fixated object and has to be coordinated with a peripheral selection process of the next fixation location. Models of visual search and scene perception typically focus on the latter, without considering foveal processing requirements. We developed a dual-task noise classification technique that enables identification of the information uptake for foveal analysis and peripheral selection within a single fixation. Human observers had to use foveal vision to extract visual feature information (orientation) from different locations for a psychophysical comparison. The selection of to-be-fixated locations was guided by a different feature (luminance contrast). We inserted noise in both visual features and identified the uptake of information by looking at correlations between the noise at different points in time and behavior. Our data show that foveal analysis and peripheral selection proceeded completely in parallel. Peripheral processing stopped some time before the onset of an eye movement, but foveal analysis continued during this period. Variations in the difficulty of foveal processing did not influence the uptake of peripheral information and the efficacy of peripheral selection, suggesting that foveal analysis and peripheral selection operated independently. These results provide important theoretical constraints on how to model target selection in conjunction with foveal object identification: in parallel and independently. PMID:24385588
Learning to recognize objects on the fly: a neurally based dynamic field approach.
Faubel, Christian; Schöner, Gregor
2008-05-01
Autonomous robots interacting with human users need to build and continuously update scene representations. This entails the problem of rapidly learning to recognize new objects under user guidance. Based on analogies with human visual working memory, we propose a dynamical field architecture, in which localized peaks of activation represent objects over a small number of simple feature dimensions. Learning consists of laying down memory traces of such peaks. We implement the dynamical field model on a service robot and demonstrate how it learns 30 objects from a very small number of views (about 5 per object are sufficient). We also illustrate how properties of feature binding emerge from this framework.
The Rise and Fall of Priming: How Visual Exposure Shapes Cortical Representations of Objects
Zago, Laure; Fenske, Mark J.; Aminoff, Elissa; Bar, Moshe
2006-01-01
How does the amount of time for which we see an object influence the nature and content of its cortical representation? To address this question, we varied the duration of initial exposure to visual objects and then measured functional magnetic resonance imaging (fMRI) signal and behavioral performance during a subsequent repeated presentation of these objects. We report a novel ‘rise-and-fall’ pattern relating exposure duration and the corresponding magnitude of fMRI cortical signal. Compared with novel objects, repeated objects elicited maximal cortical response reduction when initially presented for 250 ms. Counter-intuitively, initially seeing an object for a longer duration significantly reduced the magnitude of this effect. This ‘rise-and-fall’ pattern was also evident for the corresponding behavioral priming. To account for these findings, we propose that the earlier interval of an exposure to a visual stimulus results in a fine-tuning of the cortical response, while additional exposure promotes selection of a subset of key features for continued representation. These two independent mechanisms complement each other in shaping object representations with experience. PMID:15716471
Modeling the Time Course of Feature Perception and Feature Information Retrieval
ERIC Educational Resources Information Center
Kent, Christopher; Lamberts, Koen
2006-01-01
Three experiments investigated whether retrieval of information about different dimensions of a visual object varies as a function of the perceptual properties of those dimensions. The experiments involved two perception-based matching tasks and two retrieval-based matching tasks. A signal-to-respond methodology was used in all tasks. A stochastic…
ERIC Educational Resources Information Center
Tapia, Evelina; Breitmeyer, Bruno G.; Jacob, Jane; Broyles, Elizabeth C.
2013-01-01
Flanker congruency effects were measured in a masked flanker task to assess the properties of spatial attention during conscious and nonconscious processing of form, color, and conjunctions of these features. We found that (1) consciously and nonconsciously processed colored shape distractors (i.e., flankers) produce flanker congruency effects;…
van Lamsweerde, Amanda E; Beck, Melissa R; Elliott, Emily M
2015-02-01
The ability to remember feature bindings is an important measure of the ability to maintain objects in working memory (WM). In this study, we investigated whether both object- and feature-based representations are maintained in WM. Specifically, we tested the hypotheses that retaining a greater number of feature representations (i.e., both as individual features and bound representations) results in a more robust representation of individual features than of feature bindings, and that retrieving information from long-term memory (LTM) into WM would cause a greater disruption to feature bindings. In four experiments, we examined the effects of retrieving a word from LTM on shape and color-shape binding change detection performance. We found that binding changes were more difficult to detect than individual-feature changes overall, but that the cost of retrieving a word from LTM was the same for both individual-feature and binding changes.
Vergauwe, Evie; Cowan, Nelson
2015-01-01
We compared two contrasting hypotheses of how multi-featured objects are stored in visual working memory (vWM): as integrated objects or as independent features. A new procedure was devised to examine vWM representations of several concurrently-held objects and their features and our main measure was reaction time (RT), allowing an examination of the real-time search through features and/or objects in an array in vWM. Response speeds to probes with color, shape or both were studied as a function of the number of memorized colored shapes. Four testing groups were created by varying the instructions and the way in which probes with both color and shape were presented. The instructions explicitly either encouraged or discouraged the use of binding information and the task-relevance of binding information was further suggested by presenting probes with both color and shapes as either integrated objects or independent features. Our results show that the unit used for retrieval from vWM depends on the testing situation. Search was fully object-based only when all factors support that basis of search, in which case retrieving two features took no longer than retrieving a single feature. Otherwise, retrieving two features took longer than retrieving a single feature. Additional analyses of change detection latency suggested that, even though different testing situations can result in a stronger emphasis on either the feature dimension or the object dimension, neither one disappears from the representation and both concurrently affect change detection performance. PMID:25705873
Incidental biasing of attention from visual long-term memory.
Fan, Judith E; Turk-Browne, Nicholas B
2016-06-01
Holding recently experienced information in mind can help us achieve our current goals. However, such immediate and direct forms of guidance from working memory are less helpful over extended delays or when other related information in long-term memory is useful for reaching these goals. Here we show that information that was encoded in the past but is no longer present or relevant to the task also guides attention. We examined this by associating multiple unique features with novel shapes in visual long-term memory (VLTM), and subsequently testing how memories for these objects biased the deployment of attention. In Experiment 1, VLTM for associated features guided visual search for the shapes, even when these features had never been task-relevant. In Experiment 2, associated features captured attention when presented in isolation during a secondary task that was completely unrelated to the shapes. These findings suggest that long-term memory enables a durable and automatic type of memory-based attentional control. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Beyond the search surface: visual search and attentional engagement.
Duncan, J; Humphreys, G
1992-05-01
Treisman (1991) described a series of visual search studies testing feature integration theory against an alternative (Duncan & Humphreys, 1989) in which feature and conjunction search are basically similar. Here the latter account is noted to have 2 distinct levels: (a) a summary of search findings in terms of stimulus similarities, and (b) a theory of how visual attention is brought to bear on relevant objects. Working at the 1st level, Treisman found that even when similarities were calibrated and controlled, conjunction search was much harder than feature search. The theory, however, can only really be tested at the 2nd level, because the 1st is an approximation. An account of the findings is developed at the 2nd level, based on the 2 processes of input-template matching and spreading suppression. New data show that, when both of these factors are controlled, feature and conjunction search are equally difficult. Possibilities for unification of the alternative views are considered.
Comparing object recognition from binary and bipolar edge images for visual prostheses.
Jung, Jae-Hyun; Pu, Tian; Peli, Eli
2016-11-01
Visual prostheses require an effective representation method due to the limited display condition which has only 2 or 3 levels of grayscale in low resolution. Edges derived from abrupt luminance changes in images carry essential information for object recognition. Typical binary (black and white) edge images have been used to represent features to convey essential information. However, in scenes with a complex cluttered background, the recognition rate of the binary edge images by human observers is limited and additional information is required. The polarity of edges and cusps (black or white features on a gray background) carries important additional information; the polarity may provide shape from shading information missing in the binary edge image. This depth information may be restored by using bipolar edges. We compared object recognition rates from 16 binary edge images and bipolar edge images by 26 subjects to determine the possible impact of bipolar filtering in visual prostheses with 3 or more levels of grayscale. Recognition rates were higher with bipolar edge images and the improvement was significant in scenes with complex backgrounds. The results also suggest that erroneous shape from shading interpretation of bipolar edges resulting from pigment rather than boundaries of shape may confound the recognition.
Memory for a single object has differently variable precisions for relevant and irrelevant features.
Swan, Garrett; Collins, John; Wyble, Brad
2016-01-01
Working memory is a limited resource. To further characterize its limitations, it is vital to understand exactly what is encoded about a visual object beyond the "relevant" features probed in a particular task. We measured the memory quality of a task-irrelevant feature of an attended object by coupling a delayed estimation task with a surprise test. Participants were presented with a single colored arrow and were asked to retrieve just its color for the first half of the experiment before unexpectedly being asked to report its direction. Mixture modeling of the data revealed that participants had highly variable precision on the surprise test, indicating a coarse-grained memory for the irrelevant feature. Following the surprise test, all participants could precisely recall the arrow's direction; however, this improvement in direction memory came at a cost in precision for color memory even though only a single object was being remembered. We attribute these findings to varying levels of attention to different features during memory encoding.
Representations of Shape in Object Recognition and Long-Term Visual Memory
1993-02-11
in anything other than linguistic terms (Biederman, 1987, for example). STATUS 1. Viewpoint-Dependent Features in Object Representation Tarr and...is object-based orientation-independent representations sufficient for "basic-level" categorization (Biederman, 1987; Corballis, 1988). Alternatively...space. REFERENCES Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94, 115-147. Cooper, L
Aging and feature search: the effect of search area.
Burton-Danner, K; Owsley, C; Jackson, G R
2001-01-01
The preattentive system involves the rapid parallel processing of visual information in the visual scene so that attention can be directed to meaningful objects and locations in the environment. This study used the feature search methodology to examine whether there are aging-related deficits in parallel-processing capabilities when older adults are required to visually search a large area of the visual field. Like young subjects, older subjects displayed flat, near-zero slopes for the Reaction Time x Set Size function when searching over a broad area (30 degrees radius) of the visual field, implying parallel processing of the visual display. These same older subjects exhibited impairment in another task, also dependent on parallel processing, performed over the same broad field area; this task, called the useful field of view test, has more complex task demands. Results imply that aging-related breakdowns of parallel processing over a large visual field area are not likely to emerge when required responses are simple, there is only one task to perform, and there is no limitation on visual inspection time.
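The "flat, near-zero slopes for the Reaction Time x Set Size function" mentioned above refer to the slope of a line fitted to mean reaction time as a function of the number of display items. A minimal sketch of that fit follows; the data are hypothetical and are not taken from the study.

```python
def search_slope(set_sizes, mean_rts_ms):
    """Least-squares slope (ms per item) of mean reaction time against set size."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(mean_rts_ms) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, mean_rts_ms))
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    return sxy / sxx


# Hypothetical mean reaction times (ms) at four set sizes.
set_sizes = [4, 8, 16, 32]
flat_rts = [520.0, 522.0, 519.0, 523.0]     # RT barely changes with set size
steep_rts = [540.0, 640.0, 840.0, 1240.0]   # RT grows by 25 ms per added item

flat_slope = search_slope(set_sizes, flat_rts)
steep_slope = search_slope(set_sizes, steep_rts)
```

A slope near zero is conventionally read as evidence of parallel processing, whereas a slope of tens of milliseconds per item suggests item-by-item search.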
Eye guidance during real-world scene search: The role color plays in central and peripheral vision.
Nuthmann, Antje; Malcolm, George L
2016-01-01
The visual system utilizes environmental features to direct gaze efficiently when locating objects. While previous research has isolated various features' contributions to gaze guidance, these studies generally used sparse displays and did not investigate how features facilitated search as a function of their location on the visual field. The current study investigated how features across the visual field--particularly color--facilitate gaze guidance during real-world search. A gaze-contingent window followed participants' eye movements, restricting color information to specified regions. Scene images were presented in full color, with color in the periphery and gray in central vision or gray in the periphery and color in central vision, or in grayscale. Color conditions were crossed with a search cue manipulation, with the target cued either with a word label or an exact picture. Search times increased as color information in the scene decreased. A gaze-data based decomposition of search time revealed color-mediated effects on specific subprocesses of search. Color in peripheral vision facilitated target localization, whereas color in central vision facilitated target verification. Picture cues facilitated search, with the effects of cue specificity and scene color combining additively. When available, the visual system utilizes the environment's color information to facilitate different real-world visual search behaviors based on the location within the visual field.
Object attributes combine additively in visual search.
Pramod, R T; Arun, S P
2016-01-01
We perceive objects as containing a variety of attributes: local features, relations between features, internal details, and global properties. But we know little about how they combine. Here, we report a remarkably simple additive rule that governs how these diverse object attributes combine in vision. The perceived dissimilarity between two objects was accurately explained as a sum of (a) spatially tuned local contour-matching processes modulated by part decomposition; (b) differences in internal details, such as texture; (c) differences in emergent attributes, such as symmetry; and (d) differences in global properties, such as orientation or overall configuration of parts. Our results elucidate an enduring question in object vision by showing that the whole object is not a sum of its parts but a sum of its many attributes.
Jellema, Tjeerd; Maassen, Gerard; Perrett, David I
2004-07-01
This study investigated the cellular mechanisms in the anterior part of the superior temporal sulcus (STSa) that underlie the integration of different features of the same visually perceived animate object. Three visual features were systematically manipulated: form, motion and location. In 58% of a population of cells selectively responsive to the sight of a walking agent, the location of the agent significantly influenced the cell's response. The influence of position was often evident in intricate two- and three-way interactions with the factors form and/or motion. For only one of the 31 cells tested, the response could be explained by just a single factor. For all other cells at least two factors, and for half of the cells (52%) all three factors, played a significant role in controlling responses. Our findings support a reformulation of the Ungerleider and Mishkin model, which envisages a subdivision of the visual processing into a ventral 'what' and a dorsal 'where' stream. We demonstrated that at least part of the temporal cortex ('what' stream) makes ample use of visual spatial information. Our findings open up the prospect of a much more elaborate integration of visual properties of animate objects at the single cell level. Such integration may support the comprehension of animals and their actions.
Cheeseman, Jacob R.; Thomason, Kelsey E.; Ronning, Cecilia; Behari, Kriti; Kleinman, Kayla; Calloway, Autum B.; Lamirande, Davora
2016-01-01
It is well known that motion facilitates the visual perception of solid object shape, particularly when surface texture or other identifiable features (e.g., corners) are present. Conventional models of structure-from-motion require the presence of texture or identifiable object features in order to recover 3-D structure. Is the facilitation in 3-D shape perception similar in magnitude when surface texture is absent? On any given trial in the current experiments, participants were presented with a single randomly-selected solid object (bell pepper or randomly-shaped “glaven”) for 12 seconds and were required to indicate which of 12 (for bell peppers) or 8 (for glavens) simultaneously visible objects possessed the same shape. The initial single object’s shape was defined either by boundary contours alone (i.e., presented as a silhouette), specular highlights alone, specular highlights combined with boundary contours, or texture. In addition, there was a haptic condition: in this condition, the participants haptically explored with both hands (but could not see) the initial single object for 12 seconds; they then performed the same shape-matching task used in the visual conditions. For both the visual and haptic conditions, motion (rotation in depth or active object manipulation) was present in half of the trials and was not present for the remaining trials. The effect of motion was quantitatively similar for all of the visual and haptic conditions–e.g., the participants’ performance in Experiment 1 was 93.5 percent higher in the motion or active haptic manipulation conditions (when compared to the static conditions). The current results demonstrate that deforming specular highlights or boundary contours facilitate 3-D shape perception as much as the motion of objects that possess texture. The current results also indicate that the improvement with motion that occurs for haptics is similar in magnitude to that which occurs for vision. PMID:26863531
Between-object and within-object saccade programming in a visual search task.
Vergilino-Perez, Dorine; Findlay, John M
2006-07-01
The role of the perceptual organization of the visual display on eye movement control was examined in two experiments using a task where a two-saccade sequence was directed toward either a single elongated object or three separate shorter objects. In the first experiment, we examined the consequences for the second saccade of a small displacement of the whole display during the first saccade. We found that between-object saccades compensated for the displacement to aim for a target position on the new object whereas within-object saccades did not show compensation but were coded as a fixed motor vector applied irrespective of wherever the preceding saccade landed. In the second experiment, we extended the paradigm to examine saccades performed in different directions. The results suggest that the within-object and between-object saccade distinction is an essential feature of saccadic planning.
Heuristics of reasoning and analogy in children's visual perspective taking.
Yaniv, I; Shatz, M
1990-10-01
We propose that children's reasoning about others' visual perspectives is guided by simple heuristics based on a perceiver's line of sight and salient features of the object met by that line. In 3 experiments employing a 2-perceiver analogy task, children aged 3-6 were generally better able to reproduce a perceiver's perspective if a visual cue in the perceiver's line of sight sufficed to distinguish it from alternatives. Children had greater difficulty when the task hinged on attending to configural cues. Availability of distinctive cues affixed on the objects' sides facilitated solution of the symmetrical orientations. These and several other related findings reported in the literature are traced to children's reliance on heuristics of reasoning.
A model of proto-object based saliency
Russell, Alexander F.; Mihalaş, Stefan; von der Heydt, Rudiger; Niebur, Ernst; Etienne-Cummings, Ralph
2013-01-01
Organisms use the process of selective attention to optimally allocate their computational resources to the instantaneously most relevant subsets of a visual scene, ensuring that they can parse the scene in real time. Many models of bottom-up attentional selection assume that elementary image features, like intensity, color and orientation, attract attention. Gestalt psychologists, however, argue that humans perceive whole objects before they analyze individual features. This is supported by recent psychophysical studies that show that objects predict eye-fixations better than features. In this report we present a neurally inspired algorithm of object based, bottom-up attention. The model rivals the performance of state of the art non-biologically plausible feature based algorithms (and outperforms biologically plausible feature based algorithms) in its ability to predict perceptual saliency (eye fixations and subjective interest points) in natural scenes. The model achieves this by computing saliency as a function of proto-objects that establish the perceptual organization of the scene. All computational mechanisms of the algorithm have direct neural correlates, and our results provide evidence for the interface theory of attention. PMID:24184601
An integration of minimum local feature representation methods to recognize large variation of foods
NASA Astrophysics Data System (ADS)
Razali, Mohd Norhisham bin; Manshor, Noridayu; Halin, Alfian Abdul; Mustapha, Norwati; Yaakob, Razali
2017-10-01
Local invariant features have been shown to be successful in describing object appearances for image classification tasks. Such features are robust towards occlusion and clutter and are also invariant against scale and orientation changes. This makes them suitable for classification tasks with little inter-class similarity and large intra-class difference. In this paper, we propose an integrated representation of the Speeded-Up Robust Feature (SURF) and Scale Invariant Feature Transform (SIFT) descriptors, using a late fusion strategy. The proposed representation is used for food recognition from a dataset of food images with complex appearance variations. The Bag of Features (BOF) approach is employed to enhance the discriminative ability of the local features. First, the individual local features are extracted to construct two kinds of visual vocabularies, representing SURF and SIFT. The visual vocabularies are then concatenated and fed into a Linear Support Vector Machine (SVM) to classify the respective food categories. Experimental results show an overall classification accuracy of 82.38% on the challenging UEC-Food100 dataset.
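The Bag of Features step described in this abstract can be sketched as follows: each local descriptor is assigned to its nearest visual word, the assignments are counted into a normalized histogram per descriptor type, and the two histograms are concatenated into one vector for a classifier. This is an illustrative sketch under stated assumptions, not the authors' pipeline: descriptor extraction, vocabulary learning, and the SVM are omitted, and the toy 2-D descriptors and vocabularies are hypothetical stand-ins for real SIFT and SURF data.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the visual word closest to the descriptor (squared Euclidean distance)."""
    return min(
        range(len(vocabulary)),
        key=lambda i: sum((d - v) ** 2 for d, v in zip(descriptor, vocabulary[i])),
    )


def bof_histogram(descriptors, vocabulary):
    """Normalized histogram of visual-word assignments for one image."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts) or 1  # avoid division by zero for images without descriptors
    return [c / total for c in counts]


def fused_representation(sift_desc, sift_vocab, surf_desc, surf_vocab):
    """Concatenate the per-descriptor-type histograms into one feature vector."""
    return bof_histogram(sift_desc, sift_vocab) + bof_histogram(surf_desc, surf_vocab)


# Hypothetical toy data: 2-D "descriptors" and small vocabularies.
sift_vocab = [[0.0, 0.0], [1.0, 1.0]]
surf_vocab = [[0.0, 0.0], [5.0, 5.0], [10.0, 10.0]]
sift_desc = [[0.1, 0.0], [0.9, 1.0], [1.0, 0.8]]
surf_desc = [[4.0, 6.0], [9.0, 9.0]]

features = fused_representation(sift_desc, sift_vocab, surf_desc, surf_vocab)
```

The resulting vector has one entry per visual word across both vocabularies, so its length is fixed regardless of how many descriptors an image yields.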
Contour Curvature As an Invariant Code for Objects in Visual Area V4
Pasupathy, Anitha
2016-01-01
Size-invariant object recognition—the ability to recognize objects across transformations of scale—is a fundamental feature of biological and artificial vision. To investigate its basis in the primate cerebral cortex, we measured single neuron responses to stimuli of varying size in visual area V4, a cornerstone of the object-processing pathway, in rhesus monkeys (Macaca mulatta). Leveraging two competing models for how neuronal selectivity for the bounding contours of objects may depend on stimulus size, we show that most V4 neurons (∼70%) encode objects in a size-invariant manner, consistent with selectivity for a size-independent parameter of boundary form: for these neurons, “normalized” curvature, rather than “absolute” curvature, provided a better account of responses. Our results demonstrate the suitability of contour curvature as a basis for size-invariant object representation in the visual cortex, and posit V4 as a foundation for behaviorally relevant object codes. SIGNIFICANCE STATEMENT Size-invariant object recognition is a bedrock for many perceptual and cognitive functions. Despite growing neurophysiological evidence for invariant object representations in the primate cortex, we still lack a basic understanding of the encoding rules that govern them. Classic work in the field of visual shape theory has long postulated that a representation of objects based on information about their bounding contours is well suited to mediate such an invariant code. In this study, we provide the first empirical support for this hypothesis, and its instantiation in single neurons of visual area V4. PMID:27194333
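The contrast between "absolute" and "normalized" curvature can be made concrete with a small numerical example. The abstract does not give the study's definition of normalized curvature, so the version below (absolute curvature multiplied by a size measure, here the square root of the product of the semi-axes) is an assumption made for illustration. An ellipse stands in for a stimulus contour.

```python
import math


def ellipse_curvature(a, b, t):
    """Absolute curvature of the ellipse (a*cos t, b*sin t) at parameter t."""
    return (a * b) / ((a * math.sin(t)) ** 2 + (b * math.cos(t)) ** 2) ** 1.5


def normalized_curvature(a, b, t):
    """Curvature rescaled by an assumed size measure, sqrt(a*b)."""
    return ellipse_curvature(a, b, t) * math.sqrt(a * b)


# The same shape at two sizes: the second ellipse is the first scaled by 2.
small_abs = ellipse_curvature(2.0, 1.0, 0.0)
large_abs = ellipse_curvature(4.0, 2.0, 0.0)
small_norm = normalized_curvature(2.0, 1.0, 0.0)
large_norm = normalized_curvature(4.0, 2.0, 0.0)
```

Doubling the size halves the absolute curvature at the corresponding contour point, while the normalized value is unchanged, which is the property a size-invariant code requires.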
Vergauwe, Evie; Cowan, Nelson
2015-09-01
We compared two contrasting hypotheses of how multifeatured objects are stored in visual working memory (vWM): as integrated objects or as independent features. A new procedure was devised to examine vWM representations of several concurrently held objects and their features and our main measure was reaction time (RT), allowing an examination of the real-time search through features and/or objects in an array in vWM. Response speeds to probes with color, shape, or both were studied as a function of the number of memorized colored shapes. Four testing groups were created by varying the instructions and the way in which probes with both color and shape were presented. The instructions explicitly either encouraged or discouraged the use of binding information and the task-relevance of binding information was further suggested by presenting probes with both color and shapes as either integrated objects or independent features. Our results show that the unit used for retrieval from vWM depends on the testing situation. Search was fully object-based only when all factors support that basis of search, in which case retrieving 2 features took no longer than retrieving a single feature. Otherwise, retrieving 2 features took longer than retrieving a single feature. Additional analyses of change detection latency suggested that, even though different testing situations can result in a stronger emphasis on either the feature dimension or the object dimension, neither one disappears from the representation and both concurrently affect change detection performance.
Implicit Object Naming in Visual Search: Evidence from Phonological Competition
Walenchok, Stephen C.; Hout, Michael C.; Goldinger, Stephen D.
2016-01-01
During visual search, people are distracted by objects that visually resemble search targets; search is impaired when targets and distractors share overlapping features. In this study, we examined whether a nonvisual form of similarity, overlapping object names, can also affect search performance. In three experiments, people searched for images of real-world objects (e.g., a beetle) among items whose names either all shared the same phonological onset (/bi/), or were phonologically varied. Participants either searched for one or three potential targets per trial, with search targets designated either visually or verbally. We examined standard visual search (Experiments 1 and 3) and a self-paced serial search task wherein participants manually rejected each distractor (Experiment 2). We hypothesized that people would maintain visual templates when searching for single targets, but would rely more on object names when searching for multiple items and when targets were verbally cued. This reliance on target names would make performance susceptible to interference from similar-sounding distractors. Experiments 1 and 2 showed the predicted interference effect in conditions with high memory load and verbal cues. In Experiment 3, eye-movement results showed that phonological interference resulted from small increases in dwell time to all distractors. The results suggest that distractor names are implicitly activated during search, slowing attention disengagement when targets and distractors share similar names. PMID:27531018
Feature-based and object-based attention orientation during short-term memory maintenance.
Ku, Yixuan
2015-12-01
Top-down attention biases the short-term memory (STM) processing at multiple stages. Orienting attention during the maintenance period of STM by a retrospective cue (retro-cue) strengthens the representation of the cued item and improves the subsequent STM performance. In a recent article, Backer et al. (Backer KC, Binns MA, Alain C. J Neurosci 35: 1307-1318, 2015) extended these findings from the visual to the auditory domain and combined electroencephalography to dissociate neural mechanisms underlying feature-based and object-based attention orientation. Both event-related potentials and neural oscillations explained the behavioral benefits of retro-cues and favored the theory that feature-based and object-based attention orientation were independent. Copyright © 2015 the American Physiological Society.
Visual Working Memory Is Independent of the Cortical Spacing Between Memoranda.
Harrison, William J; Bays, Paul M
2018-03-21
The sensory recruitment hypothesis states that visual short-term memory is maintained in the same visual cortical areas that initially encode a stimulus' features. Although it is well established that the distance between features in visual cortex determines their visibility, a limitation known as crowding, it is unknown whether short-term memory is similarly constrained by the cortical spacing of memory items. Here, we investigated whether the cortical spacing between sequentially presented memoranda affects the fidelity of memory in humans (of both sexes). In a first experiment, we varied cortical spacing by taking advantage of the log-scaling of visual cortex with eccentricity, presenting memoranda in peripheral vision sequentially along either the radial or tangential visual axis with respect to the fovea. In a second experiment, we presented memoranda sequentially either within or beyond the critical spacing of visual crowding, a distance within which visual features cannot be perceptually distinguished due to their nearby cortical representations. In both experiments and across multiple measures, we found strong evidence that the ability to maintain visual features in memory is unaffected by cortical spacing. These results indicate that the neural architecture underpinning working memory has properties inconsistent with the known behavior of sensory neurons in visual cortex. Instead, the dissociation between perceptual and memory representations supports a role of higher cortical areas such as posterior parietal or prefrontal regions or may involve an as yet unspecified mechanism in visual cortex in which stimulus features are bound to their temporal order. SIGNIFICANCE STATEMENT Although much is known about the resolution with which we can remember visual objects, the cortical representation of items held in short-term memory remains contentious. 
A popular hypothesis suggests that memory of visual features is maintained via the recruitment of the same neural architecture in sensory cortex that encodes stimuli. We investigated this claim by manipulating the spacing in visual cortex between sequentially presented memoranda such that some items shared cortical representations more than others while preventing perceptual interference between stimuli. We found clear evidence that short-term memory is independent of the intracortical spacing of memoranda, revealing a dissociation between perceptual and memory representations. Our data indicate that working memory relies on different neural mechanisms from sensory perception. Copyright © 2018 Harrison and Bays.
Oculomotor selection underlies feature retention in visual working memory.
Hanning, Nina M; Jonikaitis, Donatas; Deubel, Heiner; Szinte, Martin
2016-02-01
Oculomotor selection, spatial task relevance, and visual working memory (WM) are described as three processes highly intertwined and sustained by similar cortical structures. However, because task-relevant locations always constitute potential saccade targets, no study so far has been able to distinguish between oculomotor selection and spatial task relevance. We designed an experiment that allowed us to dissociate in humans the contribution of task relevance, oculomotor selection, and oculomotor execution to the retention of feature representations in WM. We report that task relevance and oculomotor selection lead to dissociable effects on feature WM maintenance. In a first task, in which an object's location was encoded as a saccade target, its feature representations were successfully maintained in WM, whereas they declined at nonsaccade target locations. Likewise, we observed a similar WM benefit at the target of saccades that were prepared but never executed. In a second task, when an object's location was marked as task relevant but constituted a nonsaccade target (a location to avoid), feature representations maintained at that location did not benefit. Combined, our results demonstrate that oculomotor selection is consistently associated with WM, whereas task relevance is not. This provides evidence for an overlapping circuitry serving saccade target selection and feature-based WM that can be dissociated from processes encoding task-relevant locations. Copyright © 2016 the American Physiological Society.
Life-Span Development of Visual Working Memory: When Is Feature Binding Difficult?
ERIC Educational Resources Information Center
Cowan, Nelson; Naveh-Benjamin, Moshe; Kilb, Angela; Saults, J. Scott
2006-01-01
We asked whether the ability to keep in working memory the binding between a visual object and its spatial location changes with development across the life span more than memory for item information. Paired arrays of colored squares were identical or differed in the color of one square, and in the latter case, the changed color was unique on…
Gestalt Effects in Visual Working Memory.
Kałamała, Patrycja; Sadowska, Aleksandra; Ordziniak, Wawrzyniec; Chuderski, Adam
2017-01-01
Four experiments investigated whether conforming to Gestalt principles, well known to drive visual perception, also facilitates the active maintenance of information in visual working memory (VWM). We used the change detection task, which required the memorization of visual patterns composed of several shapes. We observed no effects of symmetry of visual patterns on VWM performance. However, there was a moderate positive effect when a particular shape that was probed matched the shape of the whole pattern (the whole-part similarity effect). Data support the models assuming that VWM encodes not only particular objects of the perceptual scene but also the spatial relations between them (the ensemble representation). The ensemble representation may prime objects similar to its shape and thereby boost access to them. In contrast, the null effect of symmetry relates to the fact that this very feature of an ensemble does not yield any useful additional information for VWM.
Saliency predicts change detection in pictures of natural scenes.
Wright, Michael J
2005-01-01
It has been proposed that the visual system encodes the salience of objects in the visual field in an explicit two-dimensional map that guides visual selective attention. Experiments were conducted to determine whether salience measurements applied to regions of pictures of outdoor scenes could predict the detection of changes in those regions. To obtain a quantitative measure of change detection, observers located changes in pairs of colour pictures presented across an interstimulus interval (ISI). Salience measurements were then obtained from different observers for image change regions using three independent methods, and all were positively correlated with change detection. Factor analysis extracted a single saliency factor that accounted for 62% of the variance contained in the four measures. Finally, estimates of the magnitude of the image change in each picture pair were obtained, using nine separate visual filters representing low-level vision features (luminance, colour, spatial frequency, orientation, edge density). None of the feature outputs was significantly associated with change detection or saliency. On the other hand it was shown that high-level (structural) properties of the changed region were related to saliency and to change detection: objects were more salient than shadows and more detectable when changed.
Metacognitive Confidence Increases with, but Does Not Determine, Visual Perceptual Learning.
Zizlsperger, Leopold; Kümmel, Florian; Haarmeier, Thomas
2016-01-01
While perceptual learning increases objective sensitivity, the effects on the constant interaction of the process of perception and its metacognitive evaluation have been rarely investigated. Visual perception has been described as a process of probabilistic inference featuring metacognitive evaluations of choice certainty. For visual motion perception in healthy, naive human subjects, we show here that perceptual sensitivity and confidence in it increased with training. The metacognitive sensitivity (estimated from certainty ratings by a bias-free signal detection theoretic approach), in contrast, did not. Concomitant 3 Hz transcranial alternating current stimulation (tACS) was applied in compliance with previous findings on effective high-low cross-frequency coupling subserving signal detection. While perceptual accuracy and confidence in it improved with training, there were no statistically significant tACS effects. Neither metacognitive sensitivity in distinguishing between their own correct and incorrect stimulus classifications, nor decision confidence itself determined the subjects' visual perceptual learning. Improvements of objective performance and the metacognitive confidence in it were rather determined by the perceptual sensitivity at the outset of the experiment. Post-decision certainty in visual perceptual learning was neither independent of objective performance, nor requisite for changes in sensitivity, but rather covaried with objective performance. The exact functional role of metacognitive confidence in human visual perception has yet to be determined.
Perceptual Grouping Enhances Visual Plasticity
Mastropasqua, Tommaso; Turatto, Massimo
2013-01-01
Visual perceptual learning, a manifestation of neural plasticity, refers to improvements in performance on a visual task achieved by training. Attention is known to play an important role in perceptual learning, given that the observer's discriminative ability improves only for those stimulus features that are attended. However, the distribution of attention can be severely constrained by perceptual grouping, a process whereby the visual system organizes the initial retinal input into candidate objects. Taken together, these two pieces of evidence suggest the interesting possibility that perceptual grouping might also affect perceptual learning, either directly or via attentional mechanisms. To address this issue, we conducted two experiments. During the training phase, participants attended to the contrast of the task-relevant stimulus (oriented grating), while two similar task-irrelevant stimuli were presented in the adjacent positions. One of the two flanking stimuli was perceptually grouped with the attended stimulus as a consequence of its similar orientation (Experiment 1) or because it was part of the same perceptual object (Experiment 2). A test phase followed the training phase at each location. Compared to the task-irrelevant no-grouping stimulus, orientation discrimination improved at the attended location. Critically, a perceptual learning effect equivalent to the one observed for the attended location also emerged for the task-irrelevant grouping stimulus, indicating that perceptual grouping induced a transfer of learning to the stimulus (or feature) being perceptually grouped with the task-relevant one. Our findings indicate that no voluntary effort to direct attention to the grouping stimulus or feature is necessary to enhance visual plasticity. PMID:23301100
Wong, Yvonne J; Aldcroft, Adrian J; Large, Mary-Ellen; Culham, Jody C; Vilis, Tutis
2009-12-01
We examined the role of temporal synchrony (the simultaneous appearance of visual features) in the perceptual and neural processes underlying object persistence. When a binding cue (such as color or motion) momentarily exposes an object from a background of similar elements, viewers remain aware of the object for several seconds before it perceptually fades into the background, a phenomenon known as object persistence. We showed that persistence from temporal stimulus synchrony, like that arising from motion and color, is associated with activation in the lateral occipital (LO) area, as measured by functional magnetic resonance imaging. We also compared the distribution of occipital cortex activity related to persistence to that of iconic visual memory. Although activation related to iconic memory was largely confined to LO, activation related to object persistence was present across V1 to LO, peaking in V3 and V4, regardless of the binding cue (temporal synchrony, motion, or color). Although persistence from motion cues was not associated with higher activation in the MT+ motion complex, persistence from color cues was associated with increased activation in V4. Taken together, these results demonstrate that although persistence is a form of visual memory, it relies on neural mechanisms different from those of iconic memory. That is, persistence not only activates LO in a cue-independent manner, it also recruits visual areas that may be necessary to maintain binding between object elements.
Brief Report: Imitation of Object-Directed Acts in Young Children with Autism Spectrum Disorders
ERIC Educational Resources Information Center
Gonsiorowski, Anna; Williamson, Rebecca A.; Robins, Diana L.
2016-01-01
Children with autism spectrum disorders (ASD) imitate less than typically developing (TD) children; however, the specific features and causes of this deficit are still unclear. The current study investigates the role of joint engagement, specifically children's visual attention to demonstrations, in an object-directed imitation task. This sample…
Gillebert, Celine R; Petersen, Anders; Van Meel, Chayenne; Müller, Tanja; McIntyre, Alexandra; Wagemans, Johan; Humphreys, Glyn W
2016-06-01
Previous studies have shown that the perceptual organization of the visual scene constrains the deployment of attention. Here we investigated how the organization of multiple elements into larger configurations alters their attentional weight, depending on the "pertinence" or behavioral importance of the elements' features. We assessed object-based effects on distinct aspects of the attentional priority map: top-down control, reflecting the tendency to encode targets rather than distracters, and the spatial distribution of attention weights across the visual scene, reflecting the tendency to report elements belonging to the same rather than different objects. In 2 experiments participants had to report the letters in briefly presented displays containing 8 letters and digits, in which pairs of characters could be connected with a line. Quantitative estimates of top-down control were obtained using Bundesen's Theory of Visual Attention (1990). The spatial distribution of attention weights was assessed using the "paired response index" (PRI), indicating responses for within-object pairs of letters. In Experiment 1, grouping along the task-relevant dimension (targets with targets and distracters with distracters) increased top-down control and enhanced the PRI; in contrast, task-irrelevant grouping (targets with distracters) did not affect performance. In Experiment 2, we disentangled the effect of target-target and distracter-distracter grouping: Pairwise grouping of distracters enhanced top-down control whereas pairwise grouping of targets changed the PRI. We conclude that object-based perceptual representations interact with pertinence values (of the elements' features and location) in the computation of attention weights, thereby creating a widespread pattern of attentional facilitation across the visual scene. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Taylor, Kirsten I.; Devereux, Barry J.; Acres, Kadia; Randall, Billi; Tyler, Lorraine K.
2013-01-01
Conceptual representations are at the heart of our mental lives, involved in every aspect of cognitive functioning. Despite their centrality, a long-standing debate persists as to how the meanings of concepts are represented and processed. Many accounts agree that the meanings of concrete concepts are represented by their individual features, but disagree about the importance of different feature-based variables: some views stress the importance of the information carried by distinctive features in conceptual processing, others the features which are shared over many concepts, and still others the extent to which features co-occur. We suggest that previously disparate theoretical positions and experimental findings can be unified by an account which claims that task demands determine how concepts are processed in addition to the effects of feature distinctiveness and co-occurrence. We tested these predictions in a basic-level naming task which relies on distinctive feature information (Experiment 1) and a domain decision task which relies on shared feature information (Experiment 2). Both used large-scale regression designs with the same visual objects, and mixed-effects models incorporating participant, session, stimulus-related and feature statistic variables to model the performance. We found that concepts with relatively more distinctive and more highly correlated distinctive relative to shared features facilitated basic-level naming latencies, while concepts with relatively more shared and more highly correlated shared relative to distinctive features speeded domain decisions. These findings demonstrate that the feature statistics of distinctiveness (shared vs. distinctive) and correlational strength, as well as the task demands, determine how concept meaning is processed in the conceptual system. PMID:22137770
Visual object tracking by correlation filters and online learning
NASA Astrophysics Data System (ADS)
Zhang, Xin; Xia, Gui-Song; Lu, Qikai; Shen, Weiming; Zhang, Liangpei
2018-06-01
Due to the complexity of background scenarios and the variation of target appearance, it is difficult to achieve both high accuracy and fast speed in object tracking. Currently, correlation filter-based trackers (CFTs) show promising performance in object tracking. CFTs estimate the target's position by applying correlation filters to different kinds of features. However, most CFTs can hardly re-detect the target after long-term tracking drift. In this paper, a feature integration object tracker named correlation filters and online learning (CFOL) is proposed. CFOL estimates the target's position and its corresponding correlation score using the same discriminative correlation filter with multiple features. To reduce tracking drift, a new sampling and updating strategy for online learning is proposed. Experiments conducted on 51 image sequences demonstrate that the proposed algorithm is superior to the state-of-the-art approaches.
ERIC Educational Resources Information Center
Rule, Audrey C.
2011-01-01
New tactile curriculum materials for teaching Earth and planetary science lessons on rotation/revolution, silhouettes of objects from different views, contour maps, impact craters, asteroids, and topographic features of Mars to 11 elementary and middle school students with sight impairments at a week-long residential summer camp are presented…
Normal aging delays and compromises early multifocal visual attention during object tracking.
Störmer, Viola S; Li, Shu-Chen; Heekeren, Hauke R; Lindenberger, Ulman
2013-02-01
Declines in selective attention are one of the sources contributing to age-related impairments in a broad range of cognitive functions. Most previous research on mechanisms underlying older adults' selection deficits has studied the deployment of visual attention to static objects and features. Here we investigate neural correlates of age-related differences in spatial attention to multiple objects as they move. We used a multiple object tracking task, in which younger and older adults were asked to keep track of moving target objects that moved randomly in the visual field among irrelevant distractor objects. By recording the brain's electrophysiological responses during the tracking period, we were able to delineate neural processing for targets and distractors at early stages of visual processing (~100-300 msec). Older adults showed less selective attentional modulation in the early phase of the visual P1 component (100-125 msec) than younger adults, indicating that early selection is compromised in old age. However, with a 25-msec delay relative to younger adults, older adults showed distinct processing of targets (125-150 msec), that is, a delayed yet intact attentional modulation. The magnitude of this delayed attentional modulation was related to tracking performance in older adults. The amplitude of the N1 component (175-210 msec) was smaller in older adults than in younger adults, and the target amplification effect of this component was also smaller in older relative to younger adults. Overall, these results indicate that normal aging affects the efficiency and timing of early visual processing during multiple object tracking.
Greene, Michelle R; Baldassano, Christopher; Fei-Fei, Li; Beck, Diane M; Baker, Chris I
2018-01-01
Inherent correlations between visual and semantic features in real-world scenes make it difficult to determine how different scene properties contribute to neural representations. Here, we assessed the contributions of multiple properties to scene representation by partitioning the variance explained in human behavioral and brain measurements by three feature models whose inter-correlations were minimized a priori through stimulus preselection. Behavioral assessments of scene similarity reflected unique contributions from a functional feature model indicating potential actions in scenes as well as high-level visual features from a deep neural network (DNN). In contrast, similarity of cortical responses in scene-selective areas was uniquely explained by mid- and high-level DNN features only, while an object label model did not contribute uniquely to either domain. The striking dissociation between functional and DNN features in their contribution to behavioral and brain representations of scenes indicates that scene-selective cortex represents only a subset of behaviorally relevant scene information. PMID:29513219
Groen, Iris Ia; Greene, Michelle R; Baldassano, Christopher; Fei-Fei, Li; Beck, Diane M; Baker, Chris I
2018-03-07
Inherent correlations between visual and semantic features in real-world scenes make it difficult to determine how different scene properties contribute to neural representations. Here, we assessed the contributions of multiple properties to scene representation by partitioning the variance explained in human behavioral and brain measurements by three feature models whose inter-correlations were minimized a priori through stimulus preselection. Behavioral assessments of scene similarity reflected unique contributions from a functional feature model indicating potential actions in scenes as well as high-level visual features from a deep neural network (DNN). In contrast, similarity of cortical responses in scene-selective areas was uniquely explained by mid- and high-level DNN features only, while an object label model did not contribute uniquely to either domain. The striking dissociation between functional and DNN features in their contribution to behavioral and brain representations of scenes indicates that scene-selective cortex represents only a subset of behaviorally relevant scene information.
Temporal resolution for the perception of features and conjunctions.
Bodelón, Clara; Fallah, Mazyar; Reynolds, John H
2007-01-24
The visual system decomposes stimuli into their constituent features, represented by neurons with different feature selectivities. How the signals carried by these feature-selective neurons are integrated into coherent object representations is unknown. To constrain the set of possible integrative mechanisms, we quantified the temporal resolution of perception for color, orientation, and conjunctions of these two features. We find that temporal resolution is measurably higher for each feature than for their conjunction, indicating that time is required to integrate features into a perceptual whole. This finding places temporal limits on the mechanisms that could mediate this form of perceptual integration.
Carnaghi, Andrea; Mitrovic, Aleksandra; Leder, Helmut; Fantoni, Carlo; Silani, Giorgia
2018-01-01
A controversial hypothesis, named the Sexualized Body Inversion Hypothesis (SBIH), claims similar visual processing of sexually objectified women (i.e., with a focus on the sexual body parts) and inanimate objects, as indicated by an absence of the inversion effect for both types of stimuli. The current study aims at shedding light on the mechanisms behind the SBIH in a series of 4 experiments. Using a modified version of Bernard et al.'s (2012) visual-matching task, first we tested the core assumption of the SBIH, namely that a similar processing style occurs for sexualized human bodies and objects. In Experiments 1 and 2 a non-sexualized (personalized) condition plus two object-control conditions (mannequins and houses) were included in the experimental design. Results showed an inversion effect for images of personalized women and mannequins, but not for sexualized women and houses. Second, we explored whether this effect was driven by differences in stimulus asymmetry, by testing the mediating and moderating role of this visual feature. In Experiment 3, we provided the first evidence that not only the sexual attributes of the images but also additional perceptual features of the stimuli, such as their asymmetry, played a moderating role in shaping the inversion effect. Lastly, we investigated the strategy adopted in the visual-matching task by tracking eye movements of the participants. Results of Experiment 4 suggest an association between a specific pattern of visual exploration of the images and the presence of the inversion effect. Findings are discussed with respect to the literature on sexual objectification. PMID:29621249
Combining heterogenous features for 3D hand-held object recognition
NASA Astrophysics Data System (ADS)
Lv, Xiong; Wang, Shuang; Li, Xiangyang; Jiang, Shuqiang
2014-10-01
Object recognition has wide applications in the area of human-machine interaction and multimedia retrieval. However, due to the problems of visual polysemy and concept polymorphism, it is still a great challenge to obtain reliable recognition results for 2D images. Recently, with the emergence and easy availability of RGB-D equipment such as Kinect, this challenge could be relieved because the depth channel could bring more information. A very special and important case of object recognition is hand-held object recognition, as the hand is a direct and natural way for both human-human interaction and human-machine interaction. In this paper, we study the problem of 3D object recognition by combining heterogeneous features with different modalities and extraction techniques. Although hand-crafted features preserve low-level information such as shape and color, they have shown weakness in representing high-level semantic information compared with automatically learned features, especially deep features. Deep features have shown great advantages in large-scale dataset recognition but are not always robust to rotation or scale variance compared with hand-crafted features. In this paper, we propose a method to combine hand-crafted point cloud features and deep learned features in the RGB and depth channels. First, hand-held object segmentation is implemented by using depth cues and human skeleton information. Second, we combine the extracted heterogeneous 3D features in different stages using linear concatenation and multiple kernel learning (MKL). Then a trained model is used to recognize 3D hand-held objects. Experimental results validate the effectiveness and generalization ability of the proposed method.
Lee, Young-Sook; Chung, Wan-Young
2012-01-01
Vision-based abnormal event detection for home healthcare systems can be greatly improved using visual sensor-based techniques able to detect, track and recognize objects in the scene. However, in moving object detection and tracking processes, moving cast shadows can be misclassified as part of objects or as moving objects. Shadow removal is therefore an essential step in developing video surveillance systems. The primary goal is to design novel computer vision techniques that can extract objects more accurately and discriminate between abnormal and normal activities. To improve the accuracy of object detection and tracking, our proposed shadow removal algorithm is employed. Abnormal event detection based on a visual sensor, using shape feature variation and 3-D trajectory, is presented to overcome the low fall detection rate. The experimental results showed that the success rate of detecting abnormal events was 97% with a false positive rate of 2%. Our proposed algorithm can distinguish diverse fall activities such as forward falls, backward falls, and falling aside from normal activities. PMID:22368486
Object attributes combine additively in visual search
Pramod, R. T.; Arun, S. P.
2016-01-01
We perceive objects as containing a variety of attributes: local features, relations between features, internal details, and global properties. But we know little about how they combine. Here, we report a remarkably simple additive rule that governs how these diverse object attributes combine in vision. The perceived dissimilarity between two objects was accurately explained as a sum of (a) spatially tuned local contour-matching processes modulated by part decomposition; (b) differences in internal details, such as texture; (c) differences in emergent attributes, such as symmetry; and (d) differences in global properties, such as orientation or overall configuration of parts. Our results elucidate an enduring question in object vision by showing that the whole object is not a sum of its parts but a sum of its many attributes. PMID:26967014
Berggren, Nick; Eimer, Martin
2016-09-01
Representations of target-defining features (attentional templates) guide the selection of target objects in visual search. We used behavioral and electrophysiological measures to investigate how such search templates control the allocation of attention in search tasks where targets are defined by the combination of 2 colors or by a specific spatial configuration of these colors. Target displays were preceded by spatially uninformative cue displays that contained items in 1 or both target-defining colors. Experiments 1 and 2 demonstrated that, during search for color combinations, attention is initially allocated independently and in parallel to all objects with target-matching colors, but is then rapidly withdrawn from objects that only have 1 of the 2 target colors. In Experiment 3, targets were defined by a particular spatial configuration of 2 colors, and could be accompanied by nontarget objects with a different configuration of the same colors. Attentional guidance processes were unable to distinguish between these 2 types of objects. Both attracted attention equally when they appeared in a cue display, and both received parallel focal-attentional processing and were encoded into working memory when they were presented in the same target display. Results demonstrate that attention can be guided simultaneously by multiple features from the same dimension, but that these guidance processes have no access to the spatial-configural properties of target objects. They suggest that attentional templates do not represent target objects in an integrated pictorial fashion, but contain separate representations of target-defining features. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Ferber, Susanne; Emrich, Stephen M
2007-03-01
Segregation and feature binding are essential to the perception and awareness of objects in a visual scene. When a fragmented line-drawing of an object moves relative to a background of randomly oriented lines, the previously hidden object is segregated from the background and consequently enters awareness. Interestingly, in such shape-from-motion displays, the percept of the object persists briefly when the motion stops, suggesting that the segregated and bound representation of the object is maintained in awareness. Here, we tested whether this persistence effect is mediated by capacity-limited working-memory processes, or by the amount of object-related information available. The experiments demonstrate that persistence is affected mainly by the proportion of object information available and is independent of working-memory limits. We suggest that this persistence effect can be seen as evidence for an intermediate, form-based memory store mediating between sensory and working memory.
Aging memories: differential decay of episodic memory components.
Talamini, Lucia M; Gorree, Eva
2012-05-17
Some memories about events can persist for decades, even a lifetime. However, recent memories incorporate rich sensory information, including knowledge on the spatial and temporal ordering of event features, while old memories typically lack this "filmic" quality. We suggest that this apparent change in the nature of memories may reflect a preferential loss of hippocampus-dependent, configurational information over more cortically based memory components, including memory for individual objects. The current study systematically tests this hypothesis, using a new paradigm that allows the contemporaneous assessment of memory for objects, object pairings, and object-position conjunctions. Retention of each memory component was tested, at multiple intervals, up to 3 mo following encoding. The three memory subtasks adopted the same retrieval paradigm and were matched for initial difficulty. Results show differential decay of the tested episodic memory components, whereby memory for configurational aspects of a scene (objects' co-occurrence and object position) decays faster than memory for featured objects. Interestingly, memory requiring a visually detailed object representation decays at a similar rate as global object recognition, arguing against interpretations based on task difficulty and against the notion that (visual) detail is forgotten preferentially. These findings show that memories undergo qualitative changes as they age. More specifically, event memories become less configurational over time, preferentially losing some of the higher order associations that are dependent on the hippocampus for initial fast encoding. Implications for theories of long-term memory are discussed.
Hout, Michael C; Goldinger, Stephen D
2015-01-01
When people look for things in the environment, they use target templates (mental representations of the objects they are attempting to locate) to guide attention and to assess incoming visual input as potential targets. However, unlike laboratory participants, searchers in the real world rarely have perfect knowledge regarding the potential appearance of targets. In seven experiments, we examined how the precision of target templates affects the ability to conduct visual search. Specifically, we degraded template precision in two ways: 1) by contaminating searchers' templates with inaccurate features, and 2) by introducing extraneous features to the template that were unhelpful. We recorded eye movements to allow inferences regarding the relative extents to which attentional guidance and decision-making are hindered by template imprecision. Our findings support a dual-function theory of the target template and highlight the importance of examining template precision in visual search.
Hout, Michael C.; Goldinger, Stephen D.
2014-01-01
When people look for things in the environment, they use target templates—mental representations of the objects they are attempting to locate—to guide attention and to assess incoming visual input as potential targets. However, unlike laboratory participants, searchers in the real world rarely have perfect knowledge regarding the potential appearance of targets. In seven experiments, we examined how the precision of target templates affects the ability to conduct visual search. Specifically, we degraded template precision in two ways: 1) by contaminating searchers’ templates with inaccurate features, and 2) by introducing extraneous features to the template that were unhelpful. We recorded eye movements to allow inferences regarding the relative extents to which attentional guidance and decision-making are hindered by template imprecision. Our findings support a dual-function theory of the target template and highlight the importance of examining template precision in visual search. PMID:25214306
Semantically induced distortions of visual awareness in a patient with Balint's syndrome.
Soto, David; Humphreys, Glyn W
2009-02-01
We present data indicating that visual awareness for a basic perceptual feature (colour) can be influenced by the relation between the feature and the semantic properties of the stimulus. We examined semantic interference from the meaning of a colour word ("RED") on simple colour (ink related) detection responses in a patient with simultanagnosia due to bilateral parietal lesions. We found that colour detection was influenced by the congruency between the meaning of the word and the relevant ink colour, with impaired performance when the word and the colour mismatched (on incongruent trials). This result held even when remote associations between meaning and colour were used (i.e. the word "PEA" influenced detection of the ink colour red). The results are consistent with a late locus of conscious visual experience that is derived at post-semantic levels. The implications for the understanding of the role of parietal cortex in object binding and visual awareness are discussed.
ERIC Educational Resources Information Center
Brockmole, James R.; Boot, Walter R.
2009-01-01
Distinctive aspects of a scene can capture attention even when they are irrelevant to one's goals. The authors address whether visually unique, unexpected, but task-irrelevant features also tend to hold attention. Observers searched through displays in which the color of each item was irrelevant. At the start of search, all objects changed color.…
Desantis, Andrea; Haggard, Patrick
2016-01-01
To maintain a temporally-unified representation of audio and visual features of objects in our environment, the brain recalibrates audio-visual simultaneity. This process allows adjustment for both differences in time of transmission and time for processing of audio and visual signals. In four experiments, we show that the cognitive processes for controlling instrumental actions also have strong influence on audio-visual recalibration. Participants learned that right and left hand button-presses each produced a specific audio-visual stimulus. Following one action the audio preceded the visual stimulus, while for the other action audio lagged vision. In a subsequent test phase, left and right button-press generated either the same audio-visual stimulus as learned initially, or the pair associated with the other action. We observed recalibration of simultaneity only for previously-learned audio-visual outcomes. Thus, learning an action-outcome relation promotes temporal grouping of the audio and visual events within the outcome pair, contributing to the creation of a temporally unified multisensory object. This suggests that learning action-outcome relations and the prediction of perceptual outcomes can provide an integrative temporal structure for our experiences of external events. PMID:27982063
Desantis, Andrea; Haggard, Patrick
2016-12-16
To maintain a temporally-unified representation of audio and visual features of objects in our environment, the brain recalibrates audio-visual simultaneity. This process allows adjustment for both differences in time of transmission and time for processing of audio and visual signals. In four experiments, we show that the cognitive processes for controlling instrumental actions also have strong influence on audio-visual recalibration. Participants learned that right and left hand button-presses each produced a specific audio-visual stimulus. Following one action the audio preceded the visual stimulus, while for the other action audio lagged vision. In a subsequent test phase, left and right button-press generated either the same audio-visual stimulus as learned initially, or the pair associated with the other action. We observed recalibration of simultaneity only for previously-learned audio-visual outcomes. Thus, learning an action-outcome relation promotes temporal grouping of the audio and visual events within the outcome pair, contributing to the creation of a temporally unified multisensory object. This suggests that learning action-outcome relations and the prediction of perceptual outcomes can provide an integrative temporal structure for our experiences of external events.
A Biological Hierarchical Model Based Underwater Moving Object Detection
Shen, Jie; Fan, Tanghuai; Tang, Min; Zhang, Qian; Sun, Zhen; Huang, Fengchen
2014-01-01
Underwater moving object detection is key to many underwater computer vision tasks, such as object recognition, localization, and tracking. Considering their superior visual sensing ability in underwater habitats, the visual mechanisms of aquatic animals are generally regarded as the cue for establishing bionic models that are more adaptive to underwater environments. However, the low accuracy rate and the absence of prior-knowledge learning limit their adaptation in underwater applications. Aiming to solve the problems originating from inhomogeneous illumination and unstable backgrounds, the visual information sensing and processing pattern of the frog eye is imitated to produce a hierarchical background model for detecting underwater objects. First, the image is segmented into several subblocks. The intensity information is extracted to establish a background model that can roughly identify the object and background regions. The texture feature of each pixel in the rough object region is then analyzed to generate the object contour precisely. Experimental results demonstrate that the proposed method gives better performance. Compared to the traditional Gaussian background model, the completeness of the object detection is 97.92%, with only 0.94% of the background region included in the detection results. PMID:25140194
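The first stage described in the abstract above (splitting the image into subblocks and using intensity to roughly separate object from background) can be illustrated with a minimal sketch. This is not the authors' model: the function names, the per-block mean statistic, and the `tolerance` parameter are all assumptions made for illustration, and the texture-based contour refinement is not shown.

```python
def block_means(image, block):
    """Mean intensity of each block x block tile.

    Assumes the image dimensions are exact multiples of `block`.
    """
    rows, cols = len(image), len(image[0])
    means = []
    for top in range(0, rows, block):
        row_means = []
        for left in range(0, cols, block):
            tile = [image[r][c]
                    for r in range(top, top + block)
                    for c in range(left, left + block)]
            row_means.append(sum(tile) / len(tile))
        means.append(row_means)
    return means


def rough_object_blocks(frame, background, block, tolerance):
    """Flag blocks whose mean intensity departs from the background model.

    `tolerance` is a hypothetical threshold, not a value from the paper.
    """
    frame_means = block_means(frame, block)
    bg_means = block_means(background, block)
    return [[abs(f - b) > tolerance for f, b in zip(f_row, b_row)]
            for f_row, b_row in zip(frame_means, bg_means)]


# Toy 4x4 scene: a bright object appears in the top-right 2x2 block.
background = [[10] * 4 for _ in range(4)]
frame = [row[:] for row in background]
for r, c in [(0, 2), (0, 3), (1, 2), (1, 3)]:
    frame[r][c] = 90

flags = rough_object_blocks(frame, background, block=2, tolerance=20)
```

Only the flagged blocks would be passed on to a finer per-pixel analysis, which is the point of a hierarchical scheme: the costly step runs on a small part of the image.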
Igloo-Plot: a tool for visualization of multidimensional datasets.
Kuntal, Bhusan K; Ghosh, Tarini Shankar; Mande, Sharmila S
2014-01-01
Advances in science and technology have resulted in an exponential growth of multivariate (or multi-dimensional) datasets which are being generated from various research areas, especially in the domain of biological sciences. Visualization and analysis of such data (with the objective of uncovering the hidden patterns therein) is an important and challenging task. We present a tool, called Igloo-Plot, for efficient visualization of multidimensional datasets. The tool addresses some of the key limitations of contemporary multivariate visualization and analysis tools. The visualization layout facilitates easy identification not only of clusters of data-points having similar feature compositions, but also of the 'marker features' specific to each of these clusters. The applicability of the various functionalities implemented herein is demonstrated using several well studied multi-dimensional datasets. Igloo-Plot is expected to be a valuable resource for researchers working in multivariate data mining studies. Igloo-Plot is available for download from: http://metagenomics.atc.tcs.com/IglooPlot/.
Wu, Ming; Nern, Aljoscha; Williamson, W Ryan; Morimoto, Mai M; Reiser, Michael B; Card, Gwyneth M; Rubin, Gerald M
2016-01-01
Visual projection neurons (VPNs) provide an anatomical connection between early visual processing and higher brain regions. Here we characterize lobula columnar (LC) cells, a class of Drosophila VPNs that project to distinct central brain structures called optic glomeruli. We anatomically describe 22 different LC types and show that, for several types, optogenetic activation in freely moving flies evokes specific behaviors. The activation phenotypes of two LC types closely resemble natural avoidance behaviors triggered by a visual loom. In vivo two-photon calcium imaging reveals that these LC types respond to looming stimuli, while another type does not, but instead responds to the motion of a small object. Activation of LC neurons on only one side of the brain can result in attractive or aversive turning behaviors depending on the cell type. Our results indicate that LC neurons convey information on the presence and location of visual features relevant for specific behaviors. DOI: http://dx.doi.org/10.7554/eLife.21022.001 PMID:28029094
Feature bindings are maintained in visual short-term memory without sustained focused attention.
Delvenne, Jean-François; Cleeremans, Axel; Laloyaux, Cédric
2010-01-01
Does the maintenance of feature bindings in visual short-term memory (VSTM) require sustained focused attention? This issue was investigated in three experiments, in which memory for single features (i.e., colors or shapes) was compared with memory for feature bindings (i.e., the link between the color and shape of an object). Attention was manipulated during the memory retention interval with a retro-cue, which allows attention to be directed and focused on a subset of memory items. The retro-cue was presented 700 ms after the offset of the memory display and 700 ms before the onset of the test display. If the maintenance of feature bindings - but not of individual features - in memory requires sustained focused attention, the retro-cue should not affect memory performance. Contrary to this prediction, we found that both memory for feature bindings and memory for individual features were equally improved by the retro-cue. Therefore, this finding does not support the view that sustained focused attention is needed to properly maintain feature bindings in VSTM.
Dynamic interactions between visual working memory and saccade target selection
Schneegans, Sebastian; Spencer, John P.; Schöner, Gregor; Hwang, Seongmin; Hollingworth, Andrew
2014-01-01
Recent psychophysical experiments have shown that working memory for visual surface features interacts with saccadic motor planning, even in tasks where the saccade target is unambiguously specified by spatial cues. Specifically, a match between a memorized color and the color of either the designated target or a distractor stimulus influences saccade target selection, saccade amplitudes, and latencies in a systematic fashion. To elucidate these effects, we present a dynamic neural field model in combination with new experimental data. The model captures the neural processes underlying visual perception, working memory, and saccade planning relevant to the psychophysical experiment. It consists of a low-level visual sensory representation that interacts with two separate pathways: a spatial pathway implementing spatial attention and saccade generation, and a surface feature pathway implementing color working memory and feature attention. Due to bidirectional coupling between visual working memory and feature attention in the model, the working memory content can indirectly exert an effect on perceptual processing in the low-level sensory representation. This in turn biases saccadic movement planning in the spatial pathway, allowing the model to quantitatively reproduce the observed interaction effects. The continuous coupling between representations in the model also implies that modulation should be bidirectional, and model simulations provide specific predictions for complementary effects of saccade target selection on visual working memory. These predictions were empirically confirmed in a new experiment: Memory for a sample color was biased toward the color of a task-irrelevant saccade target object, demonstrating the bidirectional coupling between visual working memory and perceptual processing. PMID:25228628
Visual acuity of the honey bee retina and the limits for feature detection.
Rigosi, Elisa; Wiederman, Steven D; O'Carroll, David C
2017-04-06
Visual abilities of the honey bee have been studied for more than 100 years, recently revealing unexpectedly sophisticated cognitive skills rivalling those of vertebrates. However, the physiological limits of the honey bee eye have been largely unaddressed and only studied in an unnatural, dark state. Using a bright display and intracellular recordings, we here systematically investigated the angular sensitivity across the light adapted eye of honey bee foragers. Angular sensitivity is a measure of photoreceptor receptive field size and thus small values indicate higher visual acuity. Our recordings reveal a fronto-ventral acute zone in which angular sensitivity falls below 1.9°, some 30% smaller than previously reported. By measuring receptor noise and responses to moving dark objects, we also obtained direct measures of the smallest features detectable by the retina. In the frontal eye, single photoreceptors respond to objects as small as 0.6° × 0.6°, with >99% reliability. This indicates that honey bee foragers possess significantly better resolution than previously reported or estimated behaviourally, and commonly assumed in modelling of bee acuity.
Object oriented classification of high resolution data for inventory of horticultural crops
NASA Astrophysics Data System (ADS)
Hebbar, R.; Ravishankar, H. M.; Trivedi, S.; Subramoniam, S. R.; Uday, R.; Dadhwal, V. K.
2014-11-01
High resolution satellite images are associated with large variance, and per-pixel classifiers therefore often give poor accuracy, especially in delineation of horticultural crops. In this context, object oriented techniques are powerful and promising methods for classification. In the present study, a semi-automatic object oriented feature extraction model has been used for delineation of horticultural fruit and plantation crops using Erdas Objective Imagine. Multi-resolution data from Resourcesat LISS-IV and Cartosat-1 have been used as source data in the feature extraction model. Spectral and textural information along with NDVI were used as inputs for generation of Spectral Feature Probability (SFP) layers using sample training pixels. The SFP layers were then converted into raster objects using threshold and clump functions, resulting in a pixel probability layer. A set of raster and vector operators was employed in the subsequent steps for generating a thematic layer in vector format. This semi-automatic feature extraction model was employed for classification of major fruit and plantation crops, viz. mango, banana, citrus, coffee and coconut, grown under different agro-climatic conditions. In general, a classification accuracy of about 75-80 per cent was achieved for these crops using object based classification alone, and this was further improved using minimal visual editing of misclassified areas. A comparison of on-screen visual interpretation with the object oriented approach showed good agreement. It was observed that old and mature plantations were classified more accurately, while young and recently planted ones (3 years or less) showed poor classification accuracy due to mixed spectral signature, wider spacing and poor stands of plantations. The results indicated the potential use of the object oriented approach for classification of high resolution data for delineation of horticultural fruit and plantation crops.
The present methodology is applicable at local levels, and future development is focused on up-scaling the methodology for generation of fruit and plantation crop maps at regional and national levels, which is important for creation of a database for overall horticultural crop development.
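The abstract above names NDVI and a thresholding step but gives no formulas. As a hedged illustration only, the sketch below uses the standard definition NDVI = (NIR - Red) / (NIR + Red) and a plain per-pixel threshold. The band values, the `threshold` of 0.4, and the function names are hypothetical, and nothing here reproduces the SFP layers or the software workflow used in the study.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for a single pixel.

    Returns 0.0 when both bands are zero to avoid division by zero.
    """
    total = nir + red
    return (nir - red) / total if total else 0.0


def vegetation_mask(nir_band, red_band, threshold=0.4):
    """Mark pixels whose NDVI reaches the (hypothetical) threshold."""
    return [[ndvi(n, r) >= threshold for n, r in zip(nir_row, red_row)]
            for nir_row, red_row in zip(nir_band, red_band)]


# Toy 2x2 reflectance values: top row vegetated, bottom row bare or sparse.
nir_band = [[0.60, 0.55], [0.20, 0.50]]
red_band = [[0.10, 0.12], [0.18, 0.40]]

mask = vegetation_mask(nir_band, red_band)
```

In an object oriented workflow, a mask like this would be only one input; adjacent marked pixels would then be grouped ("clumped") into objects before classification.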
Nishino, Ken; Nakamura, Mutsuko; Matsumoto, Masayuki; Tanno, Osamu; Nakauchi, Shigeki
2011-03-28
Light reflected from an object's surface contains much information about its physical and chemical properties. Changes in the physical properties of an object are barely detectable in spectra. Conventional trichromatic systems, on the other hand, cannot detect most spectral features because spectral information is compressively represented as trichromatic signals forming a three-dimensional subspace. We propose a method for designing a filter that optically modulates a camera's spectral sensitivity to find an alternative subspace highlighting an object's spectral features more effectively than the original trichromatic space. We designed and developed a filter that detects cosmetic foundations on the human face. Results confirmed that the filter can visualize and nondestructively inspect the foundation distribution.
Average Orientation Is More Accessible through Object Boundaries than Surface Features
ERIC Educational Resources Information Center
Choo, Heeyoung; Levinthal, Brian R.; Franconeri, Steven L.
2012-01-01
In a glance, the visual system can provide a summary of some kinds of information about objects in a scene. We explore how summary information about "orientation" is extracted and find that some representations of orientation are privileged over others. Participants judged the average orientation of either a set of 6 bars or 6 circular…
ERIC Educational Resources Information Center
Elias, Lorin J.; Robinson, Brent; Saucier, Deborah M.
2005-01-01
Neurologically normal individuals exhibit strong leftward response biases during free-viewing perceptual judgments of brightness, quantity, and size. When participants view two mirror-reversed objects and they are forced to choose which object appears darker, more numerous, or larger, the stimulus with the relevant feature on the left side is…
Chromatic information and feature detection in fast visual analysis
Del Viva, Maria M.; Punzi, Giovanni; Shevell, Steven K.; ...
2016-08-01
The visual system is able to recognize a scene based on a sketch made of very simple features. This ability is likely crucial for survival, when fast image recognition is necessary, and it is believed that a primal sketch is extracted very early in the visual processing. Such highly simplified representations can be sufficient for accurate object discrimination, but an open question is the role played by color in this process. Rich color information is available in natural scenes, yet artists' sketches are usually monochromatic, and black-and-white movies provide compelling representations of real-world scenes. Also, the contrast sensitivity of color is low at fine spatial scales. We approach the question from the perspective of optimal information processing by a system endowed with limited computational resources. We show that when such limitations are taken into account, the intrinsic statistical properties of natural scenes imply that the most effective strategy is to ignore fine-scale color features and devote most of the bandwidth to gray-scale information. We find confirmation of these information-based predictions from psychophysics measurements of fast-viewing discrimination of natural scenes. As a result, we conclude that the lack of colored features in our visual representation, and our overall low sensitivity to high-frequency color components, are a consequence of an adaptation process, optimizing the size and power consumption of our brain for the visual world we live in.
Chromatic information and feature detection in fast visual analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Del Viva, Maria M.; Punzi, Giovanni; Shevell, Steven K.
The visual system is able to recognize a scene based on a sketch made of very simple features. This ability is likely crucial for survival, when fast image recognition is necessary, and it is believed that a primal sketch is extracted very early in the visual processing. Such highly simplified representations can be sufficient for accurate object discrimination, but an open question is the role played by color in this process. Rich color information is available in natural scenes, yet artists' sketches are usually monochromatic, and black-and-white movies provide compelling representations of real-world scenes. Also, the contrast sensitivity of color is low at fine spatial scales. We approach the question from the perspective of optimal information processing by a system endowed with limited computational resources. We show that when such limitations are taken into account, the intrinsic statistical properties of natural scenes imply that the most effective strategy is to ignore fine-scale color features and devote most of the bandwidth to gray-scale information. We find confirmation of these information-based predictions from psychophysics measurements of fast-viewing discrimination of natural scenes. As a result, we conclude that the lack of colored features in our visual representation, and our overall low sensitivity to high-frequency color components, are a consequence of an adaptation process, optimizing the size and power consumption of our brain for the visual world we live in.
Visual search for features and conjunctions following declines in the useful field of view.
Cosman, Joshua D; Lees, Monica N; Lee, John D; Rizzo, Matthew; Vecera, Shaun P
2012-01-01
BACKGROUND/STUDY CONTEXT: Typical measures for assessing the useful field of view (UFOV) involve many components of attention. The objective of the current experiment was to examine differences in visual search efficiency for older individuals with and without UFOV impairment. The authors used a computerized screening instrument to assess the useful field of view and to characterize participants as having an impaired or normal UFOV. Participants also performed two visual search tasks: a feature search (e.g., search for a green target among red distractors) and a conjunction search (e.g., a green target with a gap on its left or right side among red distractors with gaps on the left or right and green distractors with gaps on the top or bottom). Visual search performance did not differ between UFOV impaired and unimpaired individuals when searching for a basic feature. However, search efficiency was lower for impaired individuals than unimpaired individuals when searching for a conjunction of features. The results suggest that UFOV decline in normal aging is associated with less efficient conjunction search. This finding suggests that the underlying cause of UFOV decline may arise from an overall decline in attentional efficiency. Because the useful field of view is a reliable predictor of driving safety, the results suggest that decline in the everyday visual behavior of older adults might arise from attentional declines.
Mere Exposure Alters Category Learning of Novel Objects
Folstein, Jonathan R.; Gauthier, Isabel; Palmeri, Thomas J.
2010-01-01
We investigated how mere exposure to complex objects with correlated or uncorrelated object features affects later category learning of new objects not seen during exposure. Correlations among pre-exposed object dimensions influenced later category learning. Unlike other published studies, the collection of pre-exposed objects provided no information regarding the categories to be learned, ruling out unsupervised or incidental category learning during pre-exposure. Instead, results are interpreted with respect to statistical learning mechanisms, providing one of the first demonstrations of how statistical learning can influence visual object learning. PMID:21833209
NASA Astrophysics Data System (ADS)
Graham, James; Ternovskiy, Igor V.
2013-06-01
We applied a two-stage unsupervised hierarchical learning system to model complex dynamic surveillance and cyber space monitoring systems using a non-commercial version of the NeoAxis visualization software. The hierarchical scene learning and recognition approach is based on hierarchical expectation maximization, and was linked to a 3D graphics engine for validation of learning and classification results and for understanding the human-autonomous system relationship. Scene recognition is performed by taking synthetically generated data and feeding it to a dynamic logic algorithm. The algorithm performs hierarchical recognition of the scene by first examining the features of the objects to determine which objects are present, and then determining the scene based on the objects present. This paper presents a framework within which low level data linked to higher-level visualization can provide support to a human operator and be evaluated in a detailed and systematic way.
Brockmole, James R; Boot, Walter R
2009-06-01
Distinctive aspects of a scene can capture attention even when they are irrelevant to one's goals. The authors address whether visually unique, unexpected, but task-irrelevant features also tend to hold attention. Observers searched through displays in which the color of each item was irrelevant. At the start of search, all objects changed color. Critically, the foveated item changed to an unexpected color (it was novel), became a color singleton (it was unique), or both. Saccade latency revealed the time required to disengage overt attention from this object. Singletons resulted in longer latencies, but only if they were unexpected. Conversely, unexpected items only delayed disengagement if they were singletons. Thus, the time spent overtly attending to an object is determined, at least in part, by task-irrelevant stimulus properties, but this depends on the confluence of expectation and visual salience.
More than a memory: Confirmatory visual search is not caused by remembering a visual feature.
Rajsic, Jason; Pratt, Jay
2017-10-01
Previous research has demonstrated a preference for positive over negative information in visual search; asking whether a target object is green biases search towards green objects, even when this entails more perceptual processing than searching non-green objects. The present study investigated whether this confirmatory search bias is due to the presence of one particular (e.g., green) color in memory during search. Across two experiments, we show that this is not the critical factor in generating a confirmation bias in search. Search slowed proportionally to the number of stimuli whose color matched the color held in memory only when the color was remembered as part of the search instructions. These results suggest that biased search for information is due to a particular attentional selection strategy, and not to memory-driven attentional biases. Copyright © 2017 Elsevier B.V. All rights reserved.
Comparing object recognition from binary and bipolar edge images for visual prostheses
Jung, Jae-Hyun; Pu, Tian; Peli, Eli
2017-01-01
Visual prostheses require an effective representation method due to the limited display condition which has only 2 or 3 levels of grayscale in low resolution. Edges derived from abrupt luminance changes in images carry essential information for object recognition. Typical binary (black and white) edge images have been used to represent features to convey essential information. However, in scenes with a complex cluttered background, the recognition rate of the binary edge images by human observers is limited and additional information is required. The polarity of edges and cusps (black or white features on a gray background) carries important additional information; the polarity may provide shape from shading information missing in the binary edge image. This depth information may be restored by using bipolar edges. We compared object recognition rates from 16 binary edge images and bipolar edge images by 26 subjects to determine the possible impact of bipolar filtering in visual prostheses with 3 or more levels of grayscale. Recognition rates were higher with bipolar edge images and the improvement was significant in scenes with complex backgrounds. The results also suggest that erroneous shape from shading interpretation of bipolar edges resulting from pigment rather than boundaries of shape may confound the recognition. PMID:28458481
Perception Of "Features" And "Objects": Applications To The Design Of Instrument Panel Displays
NASA Astrophysics Data System (ADS)
Poynter, Douglas; Czarnomski, Alan J.
1988-10-01
An experiment was conducted to determine whether so-called feature displays allow for faster and more accurate processing compared to object displays. Previous psychological studies indicate that features can be processed in parallel across the visual field, whereas objects must be processed one at a time with the aid of attentional focus. Numbers and letters are examples of objects; line orientation and color are examples of features. In this experiment, subjects were asked to search displays composed of up to 16 elements for the presence of specific elements. The ability to detect, localize, and identify targets was influenced by display format. Digital errors increased with the number of elements, the number of targets, and the distance of the target from the fixation point. Line orientation errors increased only with the number of targets. Several other display types were evaluated, and each produced a pattern of errors similar to either digital or line orientation format. Results of the study were discussed in terms of Feature Integration Theory, which distinguishes between elements that are processed with parallel versus serial mechanisms.
Yazar, Yasemin; Bergström, Zara M; Simons, Jon S
Lesions of the angular gyrus (AnG) region of human parietal cortex do not cause amnesia, but appear to be associated with reduction in the ability to consciously experience the reliving of previous events. We used continuous theta burst stimulation to test the hypothesis that the cognitive mechanism implicated in this memory deficit might be the integration of retrieved sensory event features into a coherent multimodal memory representation. Healthy volunteers received stimulation to AnG or a vertex control site after studying stimuli that each comprised a visual object embedded in a scene, with the name of the object presented auditorily. Participants were then asked to make memory judgments about the studied stimuli that involved recollection of single event features (visual or auditory), or required integration of event features within the same modality, or across modalities. Participants' ability to retrieve context features from across multiple modalities was significantly reduced after AnG stimulation compared to stimulation of the vertex. This effect was observed only for the integration of cross-modal context features but not for integration of features within the same modality, and could not be accounted for by task difficulty as performance was matched across integration conditions following vertex stimulation. These results support the hypothesis that AnG is necessary for the multimodal integration of distributed cortical episodic features into a unified conscious representation that enables the experience of remembering. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Matheson, Heath E; White, Nicole C; McMullen, Patricia A
2014-07-01
Theories of embodied object representation predict a tight association between sensorimotor processes and visual processing of manipulable objects. Previous research has shown that object handles can 'potentiate' a manual response (i.e., button press) to a congruent location. This potentiation effect is taken as evidence that objects automatically evoke sensorimotor simulations in response to the visual presentation of manipulable objects. In the present series of experiments, we investigated a critical prediction of the theory of embodied object representations that potentiation effects should be observed with manipulable artifacts but not non-manipulable animals. In four experiments we show that (a) potentiation effects are observed with animals and artifacts; (b) potentiation effects depend on the absolute size of the objects and (c) task context influences the presence/absence of potentiation effects. We conclude that potentiation effects do not provide evidence for embodied object representations, but are suggestive of a more general stimulus-response compatibility effect that may depend on the distribution of attention to different object features.
The control of attentional target selection in a colour/colour conjunction task.
Berggren, Nick; Eimer, Martin
2016-11-01
To investigate the time course of attentional object selection processes in visual search tasks where targets are defined by a combination of features from the same dimension, we measured the N2pc component as an electrophysiological marker of attentional object selection during colour/colour conjunction search. In Experiment 1, participants searched for targets defined by a combination of two colours, while ignoring distractor objects that matched only one of these colours. Reliable N2pc components were triggered by targets and also by partially matching distractors, even when these distractors were accompanied by a target in the same display. The target N2pc was initially equal in size to the sum of the two N2pc components to the two different types of partially matching distractors and became superadditive from approximately 250 ms after search display onset. Experiment 2 demonstrated that the superadditivity of the target N2pc was not due to a selective disengagement of attention from task-irrelevant partially matching distractors. These results indicate that attention was initially deployed separately and in parallel to all target-matching colours, before attentional allocation processes became sensitive to the presence of both matching colours within the same object. They suggest that attention can be controlled simultaneously and independently by multiple features from the same dimension and that feature-guided attentional selection processes operate in parallel for different target-matching objects in the visual field.
Pulmonary nodule characterization, including computer analysis and quantitative features.
Bartholmai, Brian J; Koo, Chi Wan; Johnson, Geoffrey B; White, Darin B; Raghunath, Sushravya M; Rajagopalan, Srinivasan; Moynagh, Michael R; Lindell, Rebecca M; Hartman, Thomas E
2015-03-01
Pulmonary nodules are commonly detected in computed tomography (CT) chest screening of a high-risk population. The specific visual or quantitative features on CT or other modalities can be used to characterize the likelihood that a nodule is benign or malignant. Visual features on CT such as size, attenuation, location, morphology, edge characteristics, and other distinctive "signs" can be highly suggestive of a specific diagnosis and, in general, be used to determine the probability that a specific nodule is benign or malignant. Change in size, attenuation, and morphology on serial follow-up CT, or features on other modalities such as nuclear medicine studies or MRI, can also contribute to the characterization of lung nodules. Imaging analytics can objectively and reproducibly quantify nodule features on CT, nuclear medicine, and magnetic resonance imaging. Some quantitative techniques show great promise in helping to differentiate benign from malignant lesions or to stratify the risk of aggressive versus indolent neoplasm. In this article, we (1) summarize the visual characteristics, descriptors, and signs that may be helpful in management of nodules identified on screening CT, (2) discuss current quantitative and multimodality techniques that aid in the differentiation of nodules, and (3) highlight the power, pitfalls, and limitations of these various techniques.
Correspondence effects with torches: grasping affordance or visual feature asymmetry?
Song, Xiaolei; Chen, Jing; Proctor, Robert W
2014-01-01
Three experiments were conducted to determine whether an object-based correspondence effect for torch (flashlight) stimuli reported by Pellicano et al. [(2010). Simon-like and functional affordance effects with tools: The effects of object perceptual discrimination and object action state. Quarterly Journal of Experimental Psychology, 63, 2190-2201] is due to a grasping affordance provided by the handle or asymmetry of feature markings on the torch. In Experiment 1 the stimuli were the same as those from Pellicano et al.'s Experiment 2, whereas in Experiments 2 and 3 the stimuli were modified versions with the graspable handle removed. Participants in all experiments performed upright/inverted orientation judgements on the torch stimuli. The results of Experiment 1 replicated those of Pellicano et al.: A small but significant object-based correspondence effect was evident, mainly when the torch was in an active state. With the handle of the torch removed in Experiment 2, making the barrel markings more asymmetric in the display, the correspondence effect was larger. Experiment 3 directly demonstrated an effect of barrel-marking asymmetry on the correspondence effect: When only the half of the markings nearest the light end of the torch was included, the correspondence effect reversed to favour the light end. The results are in agreement with a visual feature-asymmetry account and are difficult to reconcile with a grasping-affordance account.
Perception of emotion in abstract artworks: a multidisciplinary approach.
Melcher, David; Bacci, Francesca
2013-01-01
There is a long-standing and fundamental debate regarding how emotion can be expressed by fine art. Some artists and theorists have claimed that certain features of paintings, such as color, line, form, and composition, can consistently express an "objective" emotion, while others have argued that emotion perception is subjective and depends more on expertise of the observer. Here, we discuss two studies in which we have found evidence for consistency in observer ratings of emotion for abstract artworks. We have developed a stimulus set of abstract art images to test emotional priming, both between different painting images and between paintings and faces. The ratings were also used in a computational vision analysis of the visual features underlying emotion expression. Overall, these findings suggest that there is a strong bottom-up and objective aspect to perception of emotion in abstract artworks that may tap into basic visual mechanisms. © 2013 Elsevier B.V. All rights reserved.
Real-Time Tracking by Double Templates Matching Based on Timed Motion History Image with HSV Feature
Li, Zhiyong; Li, Pengfei; Yu, Xiaoping; Hashem, Mervat
2014-01-01
Representing the target appearance model is a challenge for moving-object tracking in complex environments. This study presents a novel method whose appearance model is described by double templates based on a timed motion history image with an HSV color histogram feature (tMHI-HSV). The main components are offline and online template initialization, tMHI-HSV-based calculation of feature histograms for candidate patches, double templates matching (DTM) for object location, and template updating. First, we initialize the target object region and calculate its HSV color histogram feature as the offline template and the online template. Second, the tMHI-HSV is used to segment the motion region and calculate the color histograms of the candidate object patches to represent their appearance models. Finally, we use the DTM method to track the target and update the offline and online templates in real time. The experimental results show that the proposed method can efficiently handle scale variation and pose change of rigid and nonrigid objects, even under illumination change and occlusion. PMID:24592185
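The histogram-matching step described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions rather than the published tMHI-HSV method: motion segmentation with the timed motion history image is omitted, only the hue channel is binned, and the Bhattacharyya coefficient, the equal weighting of the two templates, and the blending update rule are all choices made here for illustration. Every function name and parameter is hypothetical.

```python
import colorsys
import math


def hue_histogram(pixels, bins=8):
    """Normalised hue histogram of (r, g, b) pixels with 0-255 channels."""
    counts = [0.0] * bins
    for r, g, b in pixels:
        h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        counts[min(int(h * bins), bins - 1)] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def bhattacharyya(p, q):
    """Bhattacharyya coefficient: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def match_double_templates(offline, online, candidates, weight=0.5):
    """Return (index, score) of the candidate patch most similar to both templates.

    The weighted sum of the two similarities is an assumed combination rule.
    """
    scores = []
    for patch in candidates:
        hist = hue_histogram(patch)
        scores.append(
            weight * bhattacharyya(offline, hist)
            + (1.0 - weight) * bhattacharyya(online, hist)
        )
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


def update_template(template, matched_hist, rate=0.1):
    """Blend the matched histogram into a template (assumed update rule)."""
    return [(1.0 - rate) * t + rate * m for t, m in zip(template, matched_hist)]
```

In use, both templates would be initialised from the histogram of the first target region, and `match_double_templates` would then be called on the candidate patches of each new frame before the templates are refreshed with `update_template`.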
Jozwik, Kamila M.; Kriegeskorte, Nikolaus; Storrs, Katherine R.; Mur, Marieke
2017-01-01
Recent advances in Deep convolutional Neural Networks (DNNs) have enabled unprecedentedly accurate computational models of brain representations, and present an exciting opportunity to model diverse cognitive functions. State-of-the-art DNNs achieve human-level performance on object categorisation, but it is unclear how well they capture human behavior on complex cognitive tasks. Recent reports suggest that DNNs can explain significant variance in one such task, judging object similarity. Here, we extend these findings by replicating them for a rich set of object images, comparing performance across layers within two DNNs of different depths, and examining how the DNNs’ performance compares to that of non-computational “conceptual” models. Human observers performed similarity judgments for a set of 92 images of real-world objects. Representations of the same images were obtained in each of the layers of two DNNs of different depths (8-layer AlexNet and 16-layer VGG-16). To create conceptual models, other human observers generated visual-feature labels (e.g., “eye”) and category labels (e.g., “animal”) for the same image set. Feature labels were divided into parts, colors, textures and contours, while category labels were divided into subordinate, basic, and superordinate categories. We fitted models derived from the features, categories, and from each layer of each DNN to the similarity judgments, using representational similarity analysis to evaluate model performance. In both DNNs, similarity within the last layer explains most of the explainable variance in human similarity judgments. The last layer outperforms almost all feature-based models. Late and mid-level layers outperform some but not all feature-based models. Importantly, categorical models predict similarity judgments significantly better than any DNN layer. Our results provide further evidence for commonalities between DNNs and brain representations. 
Models derived from visual features other than object parts perform relatively poorly, perhaps because DNNs more comprehensively capture the colors, textures and contours which matter to human object perception. However, categorical models outperform DNNs, suggesting that further work may be needed to bring high-level semantic representations in DNNs closer to those extracted by humans. Modern DNNs explain similarity judgments remarkably well considering they were not trained on this task, and are promising models for many aspects of human cognition. PMID:29062291
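Representational similarity analysis, as used in this study, compares the pattern of pairwise dissimilarities predicted by a model with the pattern observed in human judgments. The following is a minimal sketch of that comparison, not the study's actual pipeline: the correlation-distance measure, the Spearman comparison, and all function names are assumptions made for illustration, and the model-fitting step described in the abstract is not reproduced.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (undefined for constant input)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rdm_upper_triangle(features):
    """Pairwise dissimilarities (1 - Pearson r) for every item pair i < j."""
    n = len(features)
    return [
        1.0 - pearson(features[i], features[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]


def ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation computed as Pearson correlation of ranks."""
    return pearson(ranks(x), ranks(y))


def rsa_score(model_features, human_dissimilarities):
    """Rank-correlate a model's dissimilarities with human ones (same pair order)."""
    return spearman(rdm_upper_triangle(model_features), human_dissimilarities)
```

Here `model_features` would hold one activation vector per image (for instance from one network layer), and `human_dissimilarities` must list the judged dissimilarities in the same pair order (0,1), (0,2), ..., (1,2), and so on.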
Taylor, Kirsten I; Devereux, Barry J; Acres, Kadia; Randall, Billi; Tyler, Lorraine K
2012-03-01
Conceptual representations are at the heart of our mental lives, involved in every aspect of cognitive functioning. Despite their centrality, a long-standing debate persists as to how the meanings of concepts are represented and processed. Many accounts agree that the meanings of concrete concepts are represented by their individual features, but disagree about the importance of different feature-based variables: some views stress the importance of the information carried by distinctive features in conceptual processing, others the features which are shared over many concepts, and still others the extent to which features co-occur. We suggest that previously disparate theoretical positions and experimental findings can be unified by an account which claims that task demands determine how concepts are processed in addition to the effects of feature distinctiveness and co-occurrence. We tested these predictions in a basic-level naming task which relies on distinctive feature information (Experiment 1) and a domain decision task which relies on shared feature information (Experiment 2). Both used large-scale regression designs with the same visual objects, and mixed-effects models incorporating participant, session, stimulus-related and feature statistic variables to model the performance. We found that concepts with relatively more distinctive and more highly correlated distinctive relative to shared features facilitated basic-level naming latencies, while concepts with relatively more shared and more highly correlated shared relative to distinctive features speeded domain decisions. These findings demonstrate that the feature statistics of distinctiveness (shared vs. distinctive) and correlational strength, as well as the task demands, determine how concept meaning is processed in the conceptual system. Copyright © 2011 Elsevier B.V. All rights reserved.
Vinken, Kasper; Van den Bergh, Gert; Vermaercke, Ben; Op de Beeck, Hans P.
2016-01-01
In recent years, the rodent has come forward as a candidate model for investigating higher level visual abilities such as object vision. This view has been backed up substantially by evidence from behavioral studies that show rats can be trained to express visual object recognition and categorization capabilities. However, almost no studies have investigated the functional properties of rodent extrastriate visual cortex using stimuli that target object vision, leaving a gap compared with the primate literature. Therefore, we recorded single-neuron responses along a proposed ventral pathway in rat visual cortex to investigate hallmarks of primate neural object representations such as preference for intact versus scrambled stimuli and category-selectivity. We presented natural movies containing a rat or no rat as well as their phase-scrambled versions. Population analyses showed increased dissociation in representations of natural versus scrambled stimuli along the targeted stream, but without a clear preference for natural stimuli. Along the measured cortical hierarchy the neural response seemed to be driven increasingly by features that are not V1-like and destroyed by phase-scrambling. However, there was no evidence for category selectivity for the rat versus nonrat distinction. Together, these findings provide insights about differences and commonalities between rodent and primate visual cortex. PMID:27146315
The perception of naturalness correlates with low-level visual features of environmental scenes.
Berman, Marc G; Hout, Michael C; Kardan, Omid; Hunter, MaryCarol R; Yourganov, Grigori; Henderson, John M; Hanayik, Taylor; Karimi, Hossein; Jonides, John
2014-01-01
Previous research has shown that interacting with natural environments vs. more urban or built environments can have salubrious psychological effects, such as improvements in attention and memory. Even viewing pictures of nature vs. pictures of built environments can produce similar effects. A major question is: What is it about natural environments that produces these benefits? Problematically, there are many differing qualities between natural and urban environments, making it difficult to narrow down the dimensions of nature that may lead to these benefits. In this study, we set out to uncover visual features that related to individuals' perceptions of naturalness in images. We quantified naturalness in two ways: first, implicitly using a multidimensional scaling analysis and second, explicitly with direct naturalness ratings. Features that seemed most related to perceptions of naturalness were related to the density of contrast changes in the scene, the density of straight lines in the scene, the average color saturation in the scene and the average hue diversity in the scene. We then trained a machine-learning algorithm to predict whether a scene was perceived as being natural or not based on these low-level visual features and we could do so with 81% accuracy. As such we were able to reliably predict subjective perceptions of naturalness with objective low-level visual features. Our results can be used in future studies to determine if these features, which are related to naturalness, may also lead to the benefits attained from interacting with nature.
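Two of the low-level features named in this abstract, average color saturation and hue diversity, can be computed directly from pixel values. The sketch below is one plausible way to do so and is not the authors' feature-extraction code: in particular, measuring hue diversity as the Shannon entropy of a binned hue histogram is an assumption, as are the bin count and the function names.

```python
import colorsys
import math


def mean_saturation(pixels):
    """Average HSV saturation (0-1) over (r, g, b) pixels with 0-255 channels."""
    sats = [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)[1] for r, g, b in pixels]
    return sum(sats) / len(sats)


def hue_diversity(pixels, bins=12):
    """Shannon entropy in bits of the binned hue histogram (assumed diversity measure)."""
    counts = [0] * bins
    for r, g, b in pixels:
        h = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)[0]
        counts[min(int(h * bins), bins - 1)] += 1
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)
```

Feature values like these, one row per scene, could then be passed to any standard classifier to predict whether a scene was rated as natural.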
Cross-Domain Multi-View Object Retrieval via Multi-Scale Topic Models.
Hong, Richang; Hu, Zhenzhen; Wang, Ruxin; Wang, Meng; Tao, Dacheng
2016-09-27
The increasing number of 3D objects in various applications has increased the requirement for effective and efficient 3D object retrieval methods, which has attracted extensive research efforts in recent years. Existing works mainly focus on how to extract features and conduct object matching. With the increasing applications, 3D objects come from different areas. In such circumstances, how to conduct object retrieval becomes more important. To address this issue, we propose a multi-view object retrieval method using multi-scale topic models in this paper. In our method, multiple views are first extracted from each object, and then the dense visual features are extracted to represent each view. To represent the 3D object, multi-scale topic models are employed to extract the hidden relationship among these features with respect to varied topic numbers in the topic model. In this way, each object can be represented by a set of bags of topics. To compare the objects, we first conduct topic clustering for the basic topics from two datasets, and then generate the common topic dictionary for new representation. Then, the two objects can be aligned to the same common feature space for comparison. To evaluate the performance of the proposed method, experiments are conducted on two datasets. The 3D object retrieval experimental results and comparison with existing methods demonstrate the effectiveness of the proposed method.
Khaligh-Razavi, Seyed-Mahdi; Henriksson, Linda; Kay, Kendrick; Kriegeskorte, Nikolaus
2017-02-01
Studies of the primate visual system have begun to test a wide range of complex computational object-vision models. Realistic models have many parameters, which in practice cannot be fitted using the limited amounts of brain-activity data typically available. Task performance optimization (e.g. using backpropagation to train neural networks) provides major constraints for fitting parameters and discovering nonlinear representational features appropriate for the task (e.g. object classification). Model representations can be compared to brain representations in terms of the representational dissimilarities they predict for an image set. This method, called representational similarity analysis (RSA), enables us to test the representational feature space as is (fixed RSA) or to fit a linear transformation that mixes the nonlinear model features so as to best explain a cortical area's representational space (mixed RSA). Like voxel/population-receptive-field modelling, mixed RSA uses a training set (different stimuli) to fit one weight per model feature and response channel (voxels here), so as to best predict the response profile across images for each response channel. We analysed response patterns elicited by natural images, which were measured with functional magnetic resonance imaging (fMRI). We found that early visual areas were best accounted for by shallow models, such as a Gabor wavelet pyramid (GWP). The GWP model performed similarly with and without mixing, suggesting that the original features already approximated the representational space, obviating the need for mixing. However, a higher ventral-stream visual representation (lateral occipital region) was best explained by the higher layers of a deep convolutional network and mixing of its feature set was essential for this model to explain the representation. 
We suspect that mixing was essential because the convolutional network had been trained to discriminate a set of 1000 categories, whose frequencies in the training set did not match their frequencies in natural experience or their behavioural importance. The latter factors might determine the representational prominence of semantic dimensions in higher-level ventral-stream areas. Our results demonstrate the benefits of testing both the specific representational hypothesis expressed by a model's original feature space and the hypothesis space generated by linear transformations of that feature space.
Using LabView for real-time monitoring and tracking of multiple biological objects
NASA Astrophysics Data System (ADS)
Nikolskyy, Aleksandr I.; Krasilenko, Vladimir G.; Bilynsky, Yosyp Y.; Starovier, Anzhelika
2017-04-01
Real-time study and tracking of the movement dynamics of various biological objects is important and widely researched. Features of the objects, the conditions of their visualization, and model parameters strongly influence the choice of optimal methods and algorithms for a specific task. Therefore, to automate the adaptation of recognition and tracking algorithms, several LabView project trackers are considered in the article. The projects allow templates for training and retraining the system to be changed quickly. They adapt to the speed of objects and the statistical characteristics of noise in images. New functions for comparing images or their features, descriptors, and pre-processing methods will be discussed. The experiments carried out to test the trackers on real video files will be presented and analyzed.
HD-MTL: Hierarchical Deep Multi-Task Learning for Large-Scale Visual Recognition.
Fan, Jianping; Zhao, Tianyi; Kuang, Zhenzhong; Zheng, Yu; Zhang, Ji; Yu, Jun; Peng, Jinye
2017-02-09
In this paper, a hierarchical deep multi-task learning (HD-MTL) algorithm is developed to support large-scale visual recognition (e.g., recognizing thousands or even tens of thousands of atomic object classes automatically). First, multiple sets of multi-level deep features are extracted from different layers of deep convolutional neural networks (deep CNNs), and they are used to achieve more effective accomplishment of the coarse-to-fine tasks for hierarchical visual recognition. A visual tree is then learned by assigning the visually-similar atomic object classes with similar learning complexities into the same group, which can provide a good environment for determining the interrelated learning tasks automatically. By leveraging the inter-task relatedness (inter-class similarities) to learn more discriminative group-specific deep representations, our deep multi-task learning algorithm can train more discriminative node classifiers for distinguishing the visually-similar atomic object classes effectively. Our hierarchical deep multi-task learning (HD-MTL) algorithm can integrate two discriminative regularization terms to control the inter-level error propagation effectively, and it can provide an end-to-end approach for jointly learning more representative deep CNNs (for image representation) and a more discriminative tree classifier (for large-scale visual recognition) and updating them simultaneously. Our incremental deep learning algorithms can effectively adapt both the deep CNNs and the tree classifier to the new training images and the new object classes. Our experimental results have demonstrated that our HD-MTL algorithm can achieve very competitive results on improving the accuracy rates for large-scale visual recognition.
Solar System Visualization (SSV) Project
NASA Technical Reports Server (NTRS)
Todd, Jessida L.
2005-01-01
The Solar System Visualization (SSV) project aims at enhancing scientific and public understanding through visual representations and modeling procedures. The SSV project's objectives are to (1) create new visualization technologies, (2) organize science observations and models, and (3) visualize science results and mission plans. The SSV project currently supports the Mars Exploration Rovers (MER) mission, the Mars Reconnaissance Orbiter (MRO), and Cassini. In support of these missions, the SSV team has produced pan and zoom animations of large mosaics to reveal details of surface features and topography, created 3D animations of science instruments and procedures, formed 3D anaglyphs from left and right stereo pairs, and animated registered multi-resolution mosaics to provide context for microscopic images.
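Forming an anaglyph from a left and right stereo pair, as mentioned in this abstract, amounts to combining color channels from the two images. The sketch below assumes the common red-cyan scheme (red from the left image, green and blue from the right); the abstract does not say which scheme the SSV team used, and the function name and image representation (nested lists of RGB tuples) are hypothetical.

```python
def make_anaglyph(left, right):
    """Combine two same-sized RGB images (rows of (r, g, b) tuples) into a red-cyan anaglyph.

    The red channel comes from the left image; green and blue come from the right.
    """
    if len(left) != len(right) or any(len(a) != len(b) for a, b in zip(left, right)):
        raise ValueError("stereo images must have identical dimensions")
    return [
        [(lp[0], rp[1], rp[2]) for lp, rp in zip(left_row, right_row)]
        for left_row, right_row in zip(left, right)
    ]
```

The two input images are assumed to be already registered to each other; alignment of the stereo pair is outside the scope of this sketch.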
Picture object recognition in an American black bear (Ursus americanus).
Johnson-Ulrich, Zoe; Vonk, Jennifer; Humbyrd, Mary; Crowley, Marilyn; Wojtkowski, Ela; Yates, Florence; Allard, Stephanie
2016-11-01
Many animals have been tested for conceptual discriminations using two-dimensional images as stimuli, and many of these species appear to transfer knowledge from 2D images to analogous real life objects. We tested an American black bear for picture-object recognition using a two alternative forced choice task. She was presented with four unique sets of objects and corresponding pictures. The bear showed generalization from both objects to pictures and pictures to objects; however, her transfer was superior when transferring from real objects to pictures, suggesting that bears can recognize visual features from real objects within photographic images during discriminations.
Commonalities between Perception and Cognition
Tacca, Michela C.
2011-01-01
Perception and cognition are highly interrelated. Given the influence that these systems exert on one another, it is important to explain how perceptual representations and cognitive representations interact. In this paper, I analyze the similarities between visual perceptual representations and cognitive representations in terms of their structural properties and content. Specifically, I argue that the spatial structure underlying visual object representation displays systematicity – a property that is considered to be characteristic of propositional cognitive representations. To this end, I propose a logical characterization of visual feature binding as described by Treisman’s Feature Integration Theory and argue that systematicity is not only a property of language-like representations, but also of spatially organized visual representations. Furthermore, I argue that if systematicity is taken to be a criterion to distinguish between conceptual and non-conceptual representations, then visual representations, that display systematicity, might count as an early type of conceptual representations. Showing these analogies between visual perception and cognition is an important step toward understanding the interface between the two systems. The ideas here presented might also set the stage for new empirical studies that directly compare binding (and other relational operations) in visual perception and higher cognition. PMID:22144974
Design and evaluation of a kitchen for persons with visual impairments.
Kutintara, Benjamas; Somboon, Pornpun; Buasri, Virajada; Srettananurak, Metinee; Jedeeyod, Piyanooch; Pornpratoom, Kittikan; Iam-cham, Veraya
2013-03-01
Visually impaired people need daily living skills, such as cooking, and Ratchasuda College offers independent living training for them. To meet their needs, a suitable kitchen should be designed with their limitations in mind. The objective of this study was to design and evaluate a kitchen for persons with visual impairments. Before designing the kitchen, interviews and an observation were carried out to obtain information on the needs of blind and low vision persons. Consequently, a kitchen model was developed, and it was evaluated by 10 persons with visual impairments. After the design improvement, the kitchen was built and has been routinely used for training persons with visual impairments to prepare meals. Finally, a post-occupancy evaluation of the kitchen was conducted by observing and interviewing both trainers and those with visual impairments during the food preparation training. The results of the study indicated that kitchens for persons with visual impairments should have safety and usability features. The results of the post-occupancy evaluation showed that those who attended cooking courses were able to cook safely in the kitchen. However, the kitchen still had limitations in some features.
Mladinich, C.
2010-01-01
Human disturbance is a leading ecosystem stressor. Human-induced modifications include transportation networks, areal disturbances due to resource extraction, and recreation activities. High-resolution imagery and object-oriented classification rather than pixel-based techniques have successfully identified roads, buildings, and other anthropogenic features. Three commercial, automated feature-extraction software packages (Visual Learning Systems' Feature Analyst, ENVI Feature Extraction, and Definiens Developer) were evaluated by comparing their ability to effectively detect the disturbed surface patterns from motorized vehicle traffic. Each package achieved overall accuracies in the 70% range, demonstrating the potential to map the surface patterns. The Definiens classification was more consistent and statistically valid. Copyright © 2010 by Bellwether Publishing, Ltd. All rights reserved.
Robust image features: concentric contrasting circles and their image extraction
NASA Astrophysics Data System (ADS)
Gatrell, Lance B.; Hoff, William A.; Sklair, Cheryl W.
1992-03-01
Many computer vision tasks can be simplified if special image features are placed on the objects to be recognized. A review of special image features that have been used in the past is given and then a new image feature, the concentric contrasting circle, is presented. The concentric contrasting circle image feature has several advantages: it is easily manufactured, easily extracted from the image, and robustly extracted (true targets are found, while few false targets are found); it is a passive feature; and its centroid is completely invariant to the three translational and one rotational degrees of freedom and nearly invariant to the remaining two rotational degrees of freedom. There are several examples of existing parallel implementations which perform most of the extraction work. Extraction robustness was measured by recording the probability of correct detection and the false alarm rate in a set of images of scenes containing mockups of satellites, fluid couplings, and electrical components. A typical application of concentric contrasting circle features is to place them on modeled objects for monocular pose estimation or object identification. This feature is demonstrated on a visually challenging background of a specular but wrinkled surface similar to a multilayered insulation spacecraft thermal blanket.
Alexander, Gerianne M; Packard, Mark G; Peterson, Bradley S
2002-01-01
Memory for object location relative both to veridical center (left versus right visual hemispace) and to eccentricity (central versus peripheral objects) was measured in 26 males and 25 females using the Silverman and Eals Location Memory Task. A subset of participants (17 males and 13 females) also completed a measure of implicit learning, the mirror-tracing task. No sex differences were observed in memory for object identities. Further, in both sexes, memory for object locations was better for peripherally located objects than for centrally located objects. In contrast to these similarities in female and male task performance, females but not males showed better recovery of object locations in the right compared to the left visual hemispace. Moreover, memory for object locations in the right hemispace was associated with mirror-tracing performance in women but not in men. Together, these data suggest that the processing of object features and object identification in the left cerebral hemisphere may include processing of spatial information that may contribute to superior object location memory in females relative to males.
Dissociation of quantifiers and object nouns in speech in focal neurodegenerative disease.
Ash, Sharon; Ternes, Kylie; Bisbing, Teagan; Min, Nam Eun; Moran, Eileen; York, Collin; McMillan, Corey T; Irwin, David J; Grossman, Murray
2016-08-01
Quantifiers such as many and some are thought to depend in part on the conceptual representation of number knowledge, while object nouns such as cookie and boy appear to depend in part on visual feature knowledge associated with object concepts. Further, number knowledge is associated with a frontal-parietal network while object knowledge is related in part to anterior and ventral portions of the temporal lobe. We examined the cognitive and anatomic basis for the spontaneous speech production of quantifiers and object nouns in non-aphasic patients with focal neurodegenerative disease associated with corticobasal syndrome (CBS, n=33), behavioral variant frontotemporal degeneration (bvFTD, n=54), and semantic variant primary progressive aphasia (svPPA, n=19). We recorded a semi-structured speech sample elicited from patients and healthy seniors (n=27) during description of the Cookie Theft scene. We observed a dissociation: CBS and bvFTD were significantly impaired in the production of quantifiers but not object nouns, while svPPA were significantly impaired in the production of object nouns but not quantifiers. MRI analysis revealed that quantifier production deficits in CBS and bvFTD were associated with disease in a frontal-parietal network important for number knowledge, while impaired production of object nouns in all patient groups was related to disease in inferior temporal regions important for representations of visual feature knowledge of objects. These findings imply that partially dissociable representations in semantic memory may underlie different segments of the lexicon. Copyright © 2016 Elsevier Ltd. All rights reserved.
Line drawing extraction from gray level images by feature integration
NASA Astrophysics Data System (ADS)
Yoo, Hoi J.; Crevier, Daniel; Lepage, Richard; Myler, Harley R.
1994-10-01
We describe procedures that extract line drawings from digitized gray level images, without use of domain knowledge, by modeling preattentive and perceptual organization functions of the human visual system. First, edge points are identified by standard low-level processing, based on the Canny edge operator. Edge points are then linked into single-pixel thick straight-line segments and circular arcs: this operation serves to both filter out isolated and highly irregular segments, and to lump the remaining points into a smaller number of structures for manipulation by later stages of processing. The next stages consist in linking the segments into a set of closed boundaries, which is the system's definition of a line drawing. According to the principles of Gestalt psychology, closure allows us to organize the world by filling in the gaps in a visual stimulation so as to perceive whole objects instead of disjoint parts. To achieve such closure, the system selects particular features or combinations of features by methods akin to those of preattentive processing in humans: features include gaps, pairs of straight or curved parallel lines, L- and T-junctions, pairs of symmetrical lines, and the orientation and length of single lines. These preattentive features are grouped into higher-level structures according to the principles of proximity, similarity, closure, symmetry, and feature conjunction. Achieving closure may require supplying missing segments linking contour concavities. Choices are made between competing structures on the basis of their overall compliance with the principles of closure and symmetry. Results include clean line drawings of curvilinear manufactured objects. The procedures described are part of a system called VITREO (viewpoint-independent 3-D recognition and extraction of objects).
Subliminally presented and stored objects capture spatial attention.
Astle, Duncan E; Nobre, Anna C; Scerif, Gaia
2010-03-10
When objects disappear from view, we can still bring them to mind, at least for brief periods of time, because we can represent those objects in visual short-term memory (VSTM) (Sperling, 1960; Cowan, 2001). A defining characteristic of this representation is that it is topographic, that is, it preserves a spatial organization based on the original visual percept (Vogel and Machizawa, 2004; Astle et al., 2009; Kuo et al., 2009). Recent research has also shown that features or locations of visual items that match those being maintained in conscious VSTM automatically capture our attention (Awh and Jonides, 2001; Olivers et al., 2006; Soto et al., 2008). But do objects leave some trace that can guide spatial attention, even without participants intentionally remembering them? Furthermore, could subliminally presented objects leave a topographically arranged representation that can capture attention? We presented objects either supraliminally or subliminally and then 1 s later re-presented one of those objects in a new location, as a "probe" shape. As participants made an arbitrary perceptual judgment on the probe shape, their covert spatial attention was drawn to the original location of that shape, regardless of whether its initial presentation had been supraliminal or subliminal. We demonstrate this with neural and behavioral measures of memory-driven attentional capture. These findings reveal the existence of a topographically arranged store of "visual" objects, the content of which is beyond our explicit awareness but which nonetheless guides spatial attention.
Top-down influences on visual attention during listening are modulated by observer sex.
Shen, John; Itti, Laurent
2012-07-15
In conversation, women have a small advantage in decoding non-verbal communication compared to men. In light of these findings, we sought to determine whether sex differences also existed in visual attention during a related listening task, and if so, if the differences existed among attention to high-level aspects of the scene or to conspicuous visual features. Using eye-tracking and computational techniques, we present direct evidence that men and women orient attention differently during conversational listening. We tracked the eyes of 15 men and 19 women who watched and listened to 84 clips featuring 12 different speakers in various outdoor settings. At the fixation following each saccadic eye movement, we analyzed the type of object that was fixated. Men gazed more often at the mouth and women at the eyes of the speaker. Women more often exhibited "distracted" saccades directed away from the speaker and towards a background scene element. Examining the multi-scale center-surround variation in low-level visual features (static: color, intensity, orientation, and dynamic: motion energy), we found that men consistently selected regions which expressed more variation in dynamic features, which can be attributed to a male preference for motion and a female preference for areas that may contain nonverbal information about the speaker. In sum, significant differences were observed, which we speculate arise from different integration strategies of visual cues in selecting the final target of attention. Our findings have implications for studies of sex in nonverbal communication, as well as for more predictive models of visual attention. Published by Elsevier Ltd.
Optical Associative Processors For Visual Perception
NASA Astrophysics Data System (ADS)
Casasent, David; Telfer, Brian
1988-05-01
We consider various associative processor modifications required to allow these systems to be used for visual perception, scene analysis, and object recognition. For these applications, decisions on the class of the objects present in the input image are required and thus heteroassociative memories are necessary (rather than the autoassociative memories that have been given most attention). We analyze the performance of both associative processors and note that there is considerable difference between heteroassociative and autoassociative memories. We describe associative processors suitable for realizing functions such as: distortion invariance (using linear discriminant function memory synthesis techniques), noise and image processing performance (using autoassociative memories in cascade with a heteroassociative processor and with a finite number of autoassociative memory iterations employed), shift invariance (achieved through the use of associative processors operating on feature space data), and the analysis of multiple objects in high noise (which is achieved using associative processing of the output from symbolic correlators). We detail and provide initial demonstrations of the use of associative processors operating on iconic, feature space and symbolic data, as well as adaptive associative processors.
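The distinction between heteroassociative and autoassociative memories drawn in this abstract can be illustrated with a minimal sketch. It assumes a textbook outer-product (correlation-matrix) memory with a sign threshold; this is not the optical architecture or the linear discriminant synthesis described by the authors. The function names `outer_product_memory` and `recall`, the bipolar key vectors, and the two-element class codes are all hypothetical choices made for illustration.

```python
def outer_product_memory(keys, values):
    """Build W = sum_k value_k key_k^T (a linear correlation-matrix memory)."""
    n_out, n_in = len(values[0]), len(keys[0])
    W = [[0] * n_in for _ in range(n_out)]
    for key, value in zip(keys, values):
        for i in range(n_out):
            for j in range(n_in):
                W[i][j] += value[i] * key[j]
    return W


def recall(W, key):
    """Recall by a matrix-vector product followed by a sign threshold."""
    return [1 if sum(w * x for w, x in zip(row, key)) >= 0 else -1 for row in W]


# Hypothetical, mutually orthogonal bipolar feature vectors for two objects.
keys = [[1, 1, 1, 1], [1, -1, 1, -1]]
# Hypothetical class codes: one output unit per class.
labels = [[1, -1], [-1, 1]]

hetero = outer_product_memory(keys, labels)  # maps a key to its class code
auto = outer_product_memory(keys, keys)      # maps a key back to itself

print(recall(hetero, keys[0]))  # class code stored for the first object
print(recall(auto, keys[1]))    # the second key itself
```

In this sketch the heteroassociative memory returns a class decision, which is the behavior the abstract says recognition tasks require, while the autoassociative memory only reproduces the stored pattern. Exact recall here relies on the keys being orthogonal; with correlated keys an outer-product memory of this kind produces crosstalk.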
Semi supervised Learning of Feature Hierarchies for Object Detection in a Video (Open Access)
2013-10-03
dataset: PETS2009 Dataset, Oxford Town Center dataset [3], PNNL Parking Lot datasets [25] and CAVIAR cols1 dataset [1] for human detection. Besides, we...level features from TownCenter, ParkingLot, PETS09 and CAVIAR. As we can see that, the four set of features are visually very different from each other...information is more distinguished for detecting a person in the TownCenter than CAVIAR. Comparing figure 5(a) with 6(a), interestingly, the color
Rust, Nicole C.; DiCarlo, James J.
2012-01-01
While popular accounts suggest that neurons along the ventral visual processing stream become increasingly selective for particular objects, this appears at odds with the fact that inferior temporal cortical (IT) neurons are broadly tuned. To explore this apparent contradiction, we compared processing in two ventral stream stages (V4 and IT) in the rhesus macaque monkey. We confirmed that IT neurons are indeed more selective for conjunctions of visual features than V4 neurons, and that this increase in feature conjunction selectivity is accompanied by an increase in tolerance (“invariance”) to identity-preserving transformations (e.g. shifting, scaling) of those features. We report here that V4 and IT neurons are, on average, tightly matched in their tuning breadth for natural images (“sparseness”), and that the average V4 or IT neuron will produce a robust firing rate response (over 50% of its peak observed firing rate) to ~10% of all natural images. We also observed that sparseness was positively correlated with conjunction selectivity and negatively correlated with tolerance within both V4 and IT, consistent with selectivity-building and invariance-building computations that offset one another to produce sparseness. Our results imply that the conjunction-selectivity-building and invariance-building computations necessary to support object recognition are implemented in a balanced fashion to maintain sparseness at each stage of processing. PMID:22836252
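The response criterion quoted in this abstract (a response above 50% of the peak observed firing rate, evoked by about 10% of natural images) can be made concrete with a short sketch. It only implements that stated criterion; the sparseness metric actually used by the authors is not given in the abstract. The function name `fraction_above_half_peak` and the firing rates below are hypothetical.

```python
def fraction_above_half_peak(rates):
    """Fraction of stimuli whose response exceeds 50% of the peak observed rate."""
    if not rates:
        raise ValueError("rates must be non-empty")
    peak = max(rates)
    if peak <= 0:
        return 0.0  # a silent neuron responds robustly to nothing
    threshold = 0.5 * peak
    return sum(1 for r in rates if r > threshold) / len(rates)


# Hypothetical firing rates (spikes/s) of one neuron to ten images.
example_rates = [2.0, 1.0, 40.0, 3.0, 5.0, 0.0, 4.0, 2.0, 1.0, 6.0]
print(fraction_above_half_peak(example_rates))  # 0.1: one image in ten exceeds 20 spikes/s
```

A smaller fraction corresponds to a sparser (more narrowly tuned) neuron under this criterion.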
Content-based image retrieval by matching hierarchical attributed region adjacency graphs
NASA Astrophysics Data System (ADS)
Fischer, Benedikt; Thies, Christian J.; Guld, Mark O.; Lehmann, Thomas M.
2004-05-01
Content-based image retrieval requires a formal description of visual information. In medical applications, all relevant biological objects have to be represented by this description. Although color as the primary feature has proven successful in publicly available retrieval systems of general purpose, this description is not applicable to most medical images. Additionally, it has been shown that global features characterizing the whole image do not lead to acceptable results in the medical context or that they are only suitable for specific applications. For a general purpose content-based comparison of medical images, local, i.e. regional features that are collected on multiple scales must be used. A hierarchical attributed region adjacency graph (HARAG) provides such a representation and transfers image comparison to graph matching. However, building a HARAG from an image requires a restriction in size to be computationally feasible while at the same time all visually plausible information must be preserved. For this purpose, mechanisms for the reduction of the graph size are presented. Even with a reduced graph, the problem of graph matching remains NP-complete. In this paper, the Similarity Flooding approach and Hopfield-style neural networks are adapted from the graph matching community to the needs of HARAG comparison. Based on synthetic image material built from simple geometric objects, all visually similar regions were matched accordingly, showing the framework's general applicability to content-based image retrieval of medical images.
Effect of feature-selective attention on neuronal responses in macaque area MT
Chen, X.; Hoffmann, K.-P.; Albright, T. D.
2012-01-01
Attention influences visual processing in striate and extrastriate cortex, which has been extensively studied for spatial-, object-, and feature-based attention. Most studies exploring neural signatures of feature-based attention have trained animals to attend to an object identified by a certain feature and ignore objects/displays identified by a different feature. Little is known about the effects of feature-selective attention, where subjects attend to one stimulus feature domain (e.g., color) of an object while features from different domains (e.g., direction of motion) of the same object are ignored. To study this type of feature-selective attention in area MT in the middle temporal sulcus, we trained macaque monkeys to either attend to and report the direction of motion of a moving sine wave grating (a feature for which MT neurons display strong selectivity) or attend to and report its color (a feature for which MT neurons have very limited selectivity). We hypothesized that neurons would upregulate their firing rate during attend-direction conditions compared with attend-color conditions. We found that feature-selective attention significantly affected 22% of MT neurons. Contrary to our hypothesis, these neurons did not necessarily increase firing rate when animals attended to direction of motion but fell into one of two classes. In one class, attention to color increased the gain of stimulus-induced responses compared with attend-direction conditions. The other class displayed the opposite effects. Feature-selective activity modulations occurred earlier in neurons modulated by attention to color compared with neurons modulated by attention to motion direction. Thus feature-selective attention influences neuronal processing in macaque area MT but often exhibited a mismatch between the preferred stimulus dimension (direction of motion) and the preferred attention dimension (attention to color). PMID:22170961
Effect of feature-selective attention on neuronal responses in macaque area MT.
Chen, X; Hoffmann, K-P; Albright, T D; Thiele, A
2012-03-01
Attention influences visual processing in striate and extrastriate cortex, which has been extensively studied for spatial-, object-, and feature-based attention. Most studies exploring neural signatures of feature-based attention have trained animals to attend to an object identified by a certain feature and ignore objects/displays identified by a different feature. Little is known about the effects of feature-selective attention, where subjects attend to one stimulus feature domain (e.g., color) of an object while features from different domains (e.g., direction of motion) of the same object are ignored. To study this type of feature-selective attention in area MT in the middle temporal sulcus, we trained macaque monkeys to either attend to and report the direction of motion of a moving sine wave grating (a feature for which MT neurons display strong selectivity) or attend to and report its color (a feature for which MT neurons have very limited selectivity). We hypothesized that neurons would upregulate their firing rate during attend-direction conditions compared with attend-color conditions. We found that feature-selective attention significantly affected 22% of MT neurons. Contrary to our hypothesis, these neurons did not necessarily increase firing rate when animals attended to direction of motion but fell into one of two classes. In one class, attention to color increased the gain of stimulus-induced responses compared with attend-direction conditions. The other class displayed the opposite effects. Feature-selective activity modulations occurred earlier in neurons modulated by attention to color compared with neurons modulated by attention to motion direction. Thus feature-selective attention influences neuronal processing in macaque area MT but often exhibited a mismatch between the preferred stimulus dimension (direction of motion) and the preferred attention dimension (attention to color).
Does apparent size capture attention in visual search? Evidence from the Muller-Lyer illusion.
Proulx, Michael J; Green, Monique
2011-11-23
Is perceived size a crucial factor for the bottom-up guidance of attention? Here, a visual search experiment was used to examine whether an irrelevantly longer object can capture attention when participants were to detect a vertical target item. The longer object was created by an apparent size manipulation, the Müller-Lyer illusion; however, all objects contained the same number of pixels. The vertical target was detected more efficiently when it was also perceived as the longer item that was defined by apparent size. Further analysis revealed that the longer Müller-Lyer object received a greater degree of attentional priority than published results for other features such as retinal size, luminance contrast, and the abrupt onset of a new object. The present experiment has demonstrated for the first time that apparent size can capture attention and, thus, provide bottom-up guidance on the basis of perceived salience.
Brooks, Joseph L.; Gilaie-Dotan, Sharon; Rees, Geraint; Bentin, Shlomo; Driver, Jon
2012-01-01
Visual perception depends not only on local stimulus features but also on their relationship to the surrounding stimulus context, as evident in both local and contextual influences on figure-ground segmentation. Intermediate visual areas may play a role in such contextual influences, as we tested here by examining LG, a rare case of developmental visual agnosia. LG has no evident abnormality of brain structure and functional neuroimaging showed relatively normal V1 function, but his intermediate visual areas (V2/V3) function abnormally. We found that contextual influences on figure-ground organization were selectively disrupted in LG, while local sources of figure-ground influences were preserved. Effects of object knowledge and familiarity on figure-ground organization were also significantly diminished. Our results suggest that the mechanisms mediating contextual and familiarity influences on figure-ground organization are dissociable from those mediating local influences on figure-ground assignment. The disruption of contextual processing in intermediate visual areas may play a role in the substantial object recognition difficulties experienced by LG. PMID:22947116
Inferring the direction of implied motion depends on visual awareness
Faivre, Nathan; Koch, Christof
2014-01-01
Visual awareness of an event, object, or scene is, by essence, an integrated experience, whereby different visual features composing an object (e.g., orientation, color, shape) appear as a unified percept and are processed as a whole. Here, we tested in human observers whether perceptual integration of static motion cues depends on awareness by measuring the capacity to infer the direction of motion implied by a static visible or invisible image under continuous flash suppression. Using measures of directional adaptation, we found that visible but not invisible implied motion adaptors biased the perception of real motion probes. In a control experiment, we found that invisible adaptors implying motion primed the perception of subsequent probes when they were identical (i.e., repetition priming), but not when they only shared the same direction (i.e., direction priming). Furthermore, using a model of visual processing, we argue that repetition priming effects are likely to arise as early as in the primary visual cortex. We conclude that although invisible images implying motion undergo some form of nonconscious processing, visual awareness is necessary to make inferences about motion direction. PMID:24706951
What you say matters: exploring visual-verbal interactions in visual working memory.
Mate, Judit; Allen, Richard J; Baqués, Josep
2012-01-01
The aim of this study was to explore whether the content of a simple concurrent verbal load task determines the extent of its interference on memory for coloured shapes. The task consisted of remembering four visual items while repeating aloud a pair of words that varied in terms of imageability and relatedness to the task set. At test, a cue appeared that was either the colour or the shape of one of the previously seen objects, with participants required to select the object's other feature from a visual array. During encoding and retention, there were four verbal load conditions: (a) a related, shape-colour pair (from outside the experimental set, i.e., "pink square"); (b) a pair of unrelated but visually imageable, concrete, words (i.e., "big elephant"); (c) a pair of unrelated and abstract words (i.e., "critical event"); and (d) no verbal load. Results showed differential effects of these verbal load conditions. In particular, imageable words (concrete and related conditions) interfered to a greater degree than abstract words. Possible implications for how visual working memory interacts with verbal memory and long-term memory are discussed.
Inferring the direction of implied motion depends on visual awareness.
Faivre, Nathan; Koch, Christof
2014-04-04
Visual awareness of an event, object, or scene is, by essence, an integrated experience, whereby different visual features composing an object (e.g., orientation, color, shape) appear as a unified percept and are processed as a whole. Here, we tested in human observers whether perceptual integration of static motion cues depends on awareness by measuring the capacity to infer the direction of motion implied by a static visible or invisible image under continuous flash suppression. Using measures of directional adaptation, we found that visible but not invisible implied motion adaptors biased the perception of real motion probes. In a control experiment, we found that invisible adaptors implying motion primed the perception of subsequent probes when they were identical (i.e., repetition priming), but not when they only shared the same direction (i.e., direction priming). Furthermore, using a model of visual processing, we argue that repetition priming effects are likely to arise as early as in the primary visual cortex. We conclude that although invisible images implying motion undergo some form of nonconscious processing, visual awareness is necessary to make inferences about motion direction.
Feature integration across space, time, and orientation
Otto, Thomas U.; Öğmen, Haluk; Herzog, Michael H.
2012-01-01
The perception of a visual target can be strongly influenced by flanking stimuli. In static displays, performance on the target improves when the distance to the flanking elements increases, presumably because feature pooling and integration vanish with distance. Here, we studied feature integration with dynamic stimuli. We show that features of single elements presented within a continuous motion stream are integrated largely independent of spatial distance (and orientation). Hence, space-based models of feature integration cannot be extended to dynamic stimuli. We suggest that feature integration is guided by perceptual grouping operations that maintain the identity of perceptual objects over space and time. PMID:19968428
Method and apparatus for accurately manipulating an object during microelectrophoresis
Parvin, Bahram A.; Maestre, Marcos F.; Fish, Richard H.; Johnston, William E.
1997-01-01
An apparatus using electrophoresis provides accurate manipulation of an object on a microscope stage for further manipulations and reactions. The present invention also provides an inexpensive and easily accessible means to move an object without damage to the object. A plurality of electrodes are coupled to the stage in an array whereby the electrode array allows for distinct manipulations of the electric field for accurate manipulations of the object. There is an electrode array control coupled to the plurality of electrodes for manipulating the electric field. In an alternative embodiment, a chamber is provided on the stage to hold the object. The plurality of electrodes are positioned in the chamber, and the chamber is filled with fluid. The system can be automated using visual servoing, which manipulates the control parameters, i.e., x, y stage, applying the field, etc., after extracting the significant features directly from image data. Visual servoing includes an imaging device and computer system to determine the location of the object. A second stage having a plurality of tubes positioned on top of the second stage, can be accurately positioned by visual servoing so that one end of one of the plurality of tubes surrounds at least part of the object on the first stage.
Method and apparatus for accurately manipulating an object during microelectrophoresis
Parvin, B.A.; Maestre, M.F.; Fish, R.H.; Johnston, W.E.
1997-09-23
An apparatus using electrophoresis provides accurate manipulation of an object on a microscope stage for further manipulations and reactions. The present invention also provides an inexpensive and easily accessible means to move an object without damage to the object. A plurality of electrodes are coupled to the stage in an array whereby the electrode array allows for distinct manipulations of the electric field for accurate manipulations of the object. There is an electrode array control coupled to the plurality of electrodes for manipulating the electric field. In an alternative embodiment, a chamber is provided on the stage to hold the object. The plurality of electrodes are positioned in the chamber, and the chamber is filled with fluid. The system can be automated using visual servoing, which manipulates the control parameters, i.e., x, y stage, applying the field, etc., after extracting the significant features directly from image data. Visual servoing includes an imaging device and computer system to determine the location of the object. A second stage having a plurality of tubes positioned on top of the second stage, can be accurately positioned by visual servoing so that one end of one of the plurality of tubes surrounds at least part of the object on the first stage. 11 figs.
Muthukumaraswamy, Suresh D.; Hibbs, Carina S.; Shapiro, Kimron L.; Bracewell, R. Martyn; Singh, Krish D.; Linden, David E. J.
2011-01-01
The mechanism by which distinct subprocesses in the brain are coordinated is a central conundrum of systems neuroscience. The parietal lobe is thought to play a key role in visual feature integration, and oscillatory activity in the gamma frequency range has been associated with perception of coherent objects and other tasks requiring neural coordination. Here, we examined the neural correlates of integrating mental representations in working memory and hypothesized that parietal gamma activity would be related to the success of cognitive coordination. Working memory is a classic example of a cognitive operation that requires the coordinated processing of different types of information and the contribution of multiple cognitive domains. Using magnetoencephalography (MEG), we report parietal activity in the high gamma (80–100 Hz) range during manipulation of visual and spatial information (colors and angles) in working memory. This parietal gamma activity was significantly higher during manipulation of visual-spatial conjunctions compared with single features. Furthermore, gamma activity correlated with successful performance during the conjunction task but not during the component tasks. Cortical gamma activity in parietal cortex may therefore play a role in cognitive coordination. PMID:21940605
Modulation of microsaccades by spatial frequency during object categorization.
Craddock, Matt; Oppermann, Frank; Müller, Matthias M; Martinovic, Jasna
2017-01-01
The organization of visual processing into a coarse-to-fine information processing based on the spatial frequency properties of the input forms an important facet of the object recognition process. During visual object categorization tasks, microsaccades occur frequently. One potential functional role of these eye movements is to resolve high spatial frequency information. To assess this hypothesis, we examined the rate, amplitude and speed of microsaccades in an object categorization task in which participants viewed object and non-object images and classified them as showing either natural objects, man-made objects or non-objects. Images were presented unfiltered (broadband; BB) or filtered to contain only low (LSF) or high spatial frequency (HSF) information. This allowed us to examine whether microsaccades were modulated independently by the presence of a high-level feature - the presence of an object - and by low-level stimulus characteristics - spatial frequency. We found a bimodal distribution of saccades based on their amplitude, with a split between smaller and larger microsaccades at 0.4° of visual angle. The rate of larger saccades (⩾0.4°) was higher for objects than non-objects, and higher for objects with high spatial frequency content (HSF and BB objects) than for LSF objects. No effects were observed for smaller microsaccades (<0.4°). This is consistent with a role for larger microsaccades in resolving HSF information for object identification, and previous evidence that more microsaccades are directed towards informative image regions. Copyright © 2016 Elsevier Ltd. All rights reserved.
Monocular Vision-Based Underwater Object Detection
Zhang, Zhen; Dai, Fengzhao; Bu, Yang; Wang, Huibin
2017-01-01
In this paper, we propose an underwater object detection method using monocular vision sensors. In addition to commonly used visual features such as color and intensity, we investigate the potential of underwater object detection using light transmission information. The global contrast of various features is used to initially identify the region of interest (ROI), which is then filtered by the image segmentation method, producing the final underwater object detection results. We test the performance of our method with diverse underwater datasets. Samples of the datasets are acquired by a monocular camera with different qualities (such as resolution and focal length) and setups (viewing distance, viewing angle, and optical environment). It is demonstrated that our ROI detection method is necessary and can largely remove the background noise and significantly increase the accuracy of our underwater object detection method. PMID:28771194
Robot Acting on Moving Bodies (RAMBO): Interaction with tumbling objects
NASA Technical Reports Server (NTRS)
Davis, Larry S.; Dementhon, Daniel; Bestul, Thor; Ziavras, Sotirios; Srinivasan, H. V.; Siddalingaiah, Madhu; Harwood, David
1989-01-01
Interaction with tumbling objects will become more common as human activities in space expand. Attempting to interact with a large complex object translating and rotating in space, a human operator using only his visual and mental capacities may not be able to estimate the object motion, plan actions or control those actions. A robot system (RAMBO) equipped with a camera, which, given a sequence of simple tasks, can perform these tasks on a tumbling object, is being developed. RAMBO is given a complete geometric model of the object. A low level vision module extracts and groups characteristic features in images of the object. The positions of the object are determined in a sequence of images, and a motion estimate of the object is obtained. This motion estimate is used to plan trajectories of the robot tool to relative locations nearby the object sufficient for achieving the tasks. More specifically, low level vision uses parallel algorithms for image enhancement by symmetric nearest neighbor filtering, edge detection by local gradient operators, and corner extraction by sector filtering. The object pose estimation is a Hough transform method accumulating position hypotheses obtained by matching triples of image features (corners) to triples of model features. To maximize computing speed, the estimate of the position in space of a triple of features is obtained by decomposing its perspective view into a product of rotations and a scaled orthographic projection. This allows use of 2-D lookup tables at each stage of the decomposition. The position hypotheses for each possible match of model feature triples and image feature triples are calculated in parallel. Trajectory planning combines heuristic and dynamic programming techniques. Then trajectories are created using dynamic interpolations between initial and goal trajectories. All the parallel algorithms run on a Connection Machine CM-2 with 16K processors.
Progressive 3D shape abstraction via hierarchical CSG tree
NASA Astrophysics Data System (ADS)
Chen, Xingyou; Tang, Jin; Li, Chenglong
2017-06-01
A constructive solid geometry (CSG) tree model is proposed to progressively abstract the 3D geometric shape of a general object from a 2D image. Unlike conventional approaches, our method applies to general objects without the need for massive CAD models, and represents object shapes in a coarse-to-fine manner that allows users to view temporal shape representations at any time. It stands in a transitional position between 2D image features and CAD models: it benefits from state-of-the-art object detection approaches, better initializes the CAD model for finer fitting, and estimates the 3D shape and pose parameters of the object at different levels according to the visual perception objective, in a coarse-to-fine manner. The two main contributions are the application of the CSG building-up procedure to visual perception, and the ability to extend the object estimation result into a more flexible and expressive model than 2D/3D primitive shapes. Experimental results demonstrate the feasibility and effectiveness of the proposed approach.
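For readers unfamiliar with the data structure named in this abstract, the following is a minimal sketch of a CSG tree, not the authors' model. The class names (`Sphere`, `Box`, `CSGNode`), the point-membership interface, and the example shapes are all assumptions made for illustration; the sketch only shows how a coarse primitive can be refined by adding a Boolean node above it.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float, float]


@dataclass
class Sphere:
    """Leaf primitive (hypothetical): a solid sphere."""
    center: Point
    radius: float

    def contains(self, p: Point) -> bool:
        return sum((a - b) ** 2 for a, b in zip(p, self.center)) <= self.radius ** 2


@dataclass
class Box:
    """Leaf primitive (hypothetical): an axis-aligned solid box."""
    lo: Point
    hi: Point

    def contains(self, p: Point) -> bool:
        return all(l <= x <= h for l, x, h in zip(self.lo, p, self.hi))


@dataclass
class CSGNode:
    """Inner node: combines two subtrees with a Boolean set operation."""
    op: str  # "union", "intersection" or "difference"
    left: object
    right: object

    def contains(self, p: Point) -> bool:
        a, b = self.left.contains(p), self.right.contains(p)
        if self.op == "union":
            return a or b
        if self.op == "intersection":
            return a and b
        if self.op == "difference":
            return a and not b
        raise ValueError(f"unknown CSG operation: {self.op}")


# Coarse level: a single box. Finer level: the same box with a spherical cavity.
coarse = Box((0.0, 0.0, 0.0), (2.0, 2.0, 2.0))
fine = CSGNode("difference", coarse, Sphere((1.0, 1.0, 1.0), 0.5))
print(coarse.contains((1.0, 1.0, 1.0)), fine.contains((1.0, 1.0, 1.0)))  # True False
```

Because every refinement wraps the previous tree rather than replacing it, each intermediate tree remains a usable shape description, which is one plausible reading of the coarse-to-fine behaviour the abstract describes.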
Murphy-Baum, Benjamin L; Taylor, W Rowland
2015-09-30
Much of the computational power of the retina derives from the activity of amacrine cells, a large and diverse group of GABAergic and glycinergic inhibitory interneurons. Here, we identify an ON-type orientation-selective, wide-field, polyaxonal amacrine cell (PAC) in the rabbit retina and demonstrate how its orientation selectivity arises from the structure of the dendritic arbor and the pattern of excitatory and inhibitory inputs. Excitation from ON bipolar cells and inhibition arising from the OFF pathway converge to generate a quasi-linear integration of visual signals in the receptive field center. This serves to suppress responses to high spatial frequencies, thereby improving sensitivity to larger objects and enhancing orientation selectivity. Inhibition also regulates the magnitude and time course of excitatory inputs to this PAC through serial inhibitory connections onto the presynaptic terminals of ON bipolar cells. This presynaptic inhibition is driven by graded potentials within local microcircuits, similar in extent to the size of single bipolar cell receptive fields. Additional presynaptic inhibition is generated by spiking amacrine cells on a larger spatial scale covering several hundred microns. The orientation selectivity of this PAC may be a substrate for the inhibition that mediates orientation selectivity in some types of ganglion cells. Significance statement: The retina comprises numerous excitatory and inhibitory circuits that encode specific features in the visual scene, such as orientation, contrast, or motion. Here, we identify a wide-field inhibitory neuron that responds to visual stimuli of a particular orientation, a feature selectivity that is primarily due to the elongated shape of the dendritic arbor. Integration of convergent excitatory and inhibitory inputs from the ON and OFF visual pathways suppress responses to small objects and fine textures, thus enhancing selectivity for larger objects. 
Feedback inhibition regulates the strength and speed of excitation on both local and wide-field spatial scales. This study demonstrates how different synaptic inputs are regulated to tune a neuron to respond to specific features in the visual scene. Copyright © 2015 the authors.
Brockhoff, Alisa; Huff, Markus
2016-10-01
Multiple object tracking (MOT) plays a fundamental role in processing and interpreting dynamic environments. Regarding the type of information utilized by the observer, recent studies reported evidence for the use of object features in an automatic, low-level manner. By introducing a novel paradigm that allowed us to combine tracking with a noninterfering top-down task, we tested whether a voluntary component can regulate the deployment of attention to task-relevant features in a selective manner. In four experiments we found conclusive evidence for a task-driven selection mechanism that guides attention during tracking: The observers were able to ignore or prioritize distinct objects. They marked the distinct (cued) object (target/distractor) more or less often than other objects of the same type (targets/distractors), but only when they had received an identification task that required them to actively process object features (cues) during tracking. These effects are discussed with regard to existing theoretical approaches to attentive tracking, gaze-cue usability as well as attentional readiness, a term that originally stems from research on attention capture and visual search. Our findings indicate that existing theories of MOT need to be adjusted to allow for flexible top-down, voluntary processing during tracking.
Small numbers are sensed directly, high numbers constructed from size and density.
Zimmermann, Eckart
2018-04-01
Two theories compete to explain how we estimate the numerosity of visual object sets. The first suggests that the apparent numerosity is derived from an analysis of more low-level features like size and density of the set. The second theory suggests that numbers are sensed directly. Consistent with the latter claim is the existence of neurons in parietal cortex which are specialized for processing the numerosity of elements in the visual scene. However, recent evidence suggests that only low numbers can be sensed directly whereas the perception of high numbers is supported by the analysis of low-level features. Processing of low and high numbers, being located at different levels of the neural hierarchy, should involve different receptive field sizes. Here, I tested this idea with visual adaptation. I measured the spatial spread of number adaptation for low and high numerosities. A focused adaptation spread of high numerosities suggested the involvement of early neural levels where receptive fields are comparably small and the broad spread for low numerosities was consistent with processing of number neurons which have larger receptive fields. These results provide evidence for the claim that different mechanisms exist for generating the perception of visual numerosity. Whereas low numbers are sensed directly as a primary visual attribute, the estimation of high numbers likely depends on the area size over which the objects are spread. Copyright © 2017 Elsevier B.V. All rights reserved.
The Objective Identification and Quantification of Interstitial Lung Abnormalities in Smokers.
Ash, Samuel Y; Harmouche, Rola; Ross, James C; Diaz, Alejandro A; Hunninghake, Gary M; Putman, Rachel K; Onieva, Jorge; Martinez, Fernando J; Choi, Augustine M; Lynch, David A; Hatabu, Hiroto; Rosas, Ivan O; Estepar, Raul San Jose; Washko, George R
2017-08-01
Previous investigation suggests that visually detected interstitial changes in the lung parenchyma of smokers are highly clinically relevant and predict outcomes, including death. Visual subjective analysis to detect these changes is time-consuming, insensitive to subtle changes, and requires training to enhance reproducibility. Objective detection of such changes could provide a method of disease identification without these limitations. The goal of this study was to develop and test a fully automated image processing tool to objectively identify radiographic features associated with interstitial abnormalities in the computed tomography scans of a large cohort of smokers. An automated tool that uses local histogram analysis combined with distance from the pleural surface was used to detect radiographic features consistent with interstitial lung abnormalities in computed tomography scans from 2257 individuals from the Genetic Epidemiology of COPD study, a longitudinal observational study of smokers. The sensitivity and specificity of this tool were determined based on its ability to detect the visually identified presence of these abnormalities. The tool had a sensitivity of 87.8% and a specificity of 57.5% for the detection of interstitial lung abnormalities, with a c-statistic of 0.82, and was 100% sensitive and 56.7% specific for the detection of the visual subtype of interstitial abnormalities called fibrotic parenchymal abnormalities, with a c-statistic of 0.89. In smokers, a fully automated image processing tool is able to identify those individuals who have interstitial lung abnormalities with moderate sensitivity and specificity. Copyright © 2017 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
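The sensitivity and specificity figures quoted in this abstract follow the standard definitions based on a confusion matrix, with the visual read serving as the reference. The sketch below shows only that arithmetic; the function name and the counts are hypothetical and are not taken from the study, whose per-class counts are not given in the abstract.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: reference-positive cases the tool flagged / missed.
    tn/fp: reference-negative cases the tool cleared / flagged.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # share of reference-positive scans detected
    specificity = tn / (tn + fp)  # share of reference-negative scans cleared
    return sensitivity, specificity


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=60, fp=40)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
# sensitivity=90.0%, specificity=60.0%
```

A tool with this profile misses few true cases but flags many unaffected scans, which is the same pattern as the 87.8% / 57.5% result reported above.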
Genoviz Software Development Kit: Java tool kit for building genomics visualization applications.
Helt, Gregg A; Nicol, John W; Erwin, Ed; Blossom, Eric; Blanchard, Steven G; Chervitz, Stephen A; Harmon, Cyrus; Loraine, Ann E
2009-08-25
Visualization software can expose previously undiscovered patterns in genomic data and advance biological science. The Genoviz Software Development Kit (SDK) is an open source, Java-based framework designed for rapid assembly of visualization software applications for genomics. The Genoviz SDK framework provides a mechanism for incorporating adaptive, dynamic zooming into applications, a desirable feature of genome viewers. Visualization capabilities of the Genoviz SDK include automated layout of features along genetic or genomic axes; support for user interactions with graphical elements (Glyphs) in a map; a variety of Glyph sub-classes that promote experimentation with new ways of representing data in graphical formats; and support for adaptive, semantic zooming, whereby objects change their appearance depending on zoom level and zooming rate adapts to the current scale. Freely available demonstration and production quality applications, including the Integrated Genome Browser, illustrate Genoviz SDK capabilities. Separation between graphics components and genomic data models makes it easy for developers to add visualization capability to pre-existing applications or build new applications using third-party data models. Source code, documentation, sample applications, and tutorials are available at http://genoviz.sourceforge.net/.
Information based universal feature extraction
NASA Astrophysics Data System (ADS)
Amiri, Mohammad; Brause, Rüdiger
2015-02-01
In many real-world image-based pattern recognition tasks, the extraction and usage of task-relevant features are the most crucial part of the diagnosis. In the standard approach, they mostly remain task-specific, although humans who perform such a task always use the same image features, trained in early childhood. It seems that universal feature sets exist, but they are not yet systematically found. In our contribution, we tried to find those universal image feature sets that are valuable for most image related tasks. In our approach, we trained a neural network with natural and non-natural images of objects and background, using a Shannon information-based algorithm and learning constraints. The goal was to extract those features that give the most valuable information for classification of visual objects hand-written digits. This will give a good start and performance increase for all other image learning tasks, implementing a transfer learning approach. As a result, in our case we found that we could indeed extract features which are valid in all three kinds of tasks.
Sheth, Bhavin R.; Young, Ryan
2016-01-01
Evidence is strong that the visual pathway is segregated into two distinct streams—ventral and dorsal. Two proposals theorize that the pathways are segregated in function: The ventral stream processes information about object identity, whereas the dorsal stream, according to one model, processes information about either object location, and according to another, is responsible in executing movements under visual control. The models are influential; however recent experimental evidence challenges them, e.g., the ventral stream is not solely responsible for object recognition; conversely, its function is not strictly limited to object vision; the dorsal stream is not responsible by itself for spatial vision or visuomotor control; conversely, its function extends beyond vision or visuomotor control. In their place, we suggest a robust dichotomy consisting of a ventral stream selectively sampling high-resolution/focal spaces, and a dorsal stream sampling nearly all of space with reduced foveal bias. The proposal hews closely to the theme of embodied cognition: Function arises as a consequence of an extant sensory underpinning. A continuous, not sharp, segregation based on function emerges, and carries with it an undercurrent of an exploitation-exploration dichotomy. Under this interpretation, cells of the ventral stream, which individually have more punctate receptive fields that generally include the fovea or parafovea, provide detailed information about object shapes and features and lead to the systematic exploitation of said information; cells of the dorsal stream, which individually have large receptive fields, contribute to visuospatial perception, provide information about the presence/absence of salient objects and their locations for novel exploration and subsequent exploitation by the ventral stream or, under certain conditions, the dorsal stream. 
We leverage the dichotomy to unify neuropsychological cases under a common umbrella, account for the increased prevalence of multisensory integration in the dorsal stream under a Bayesian framework, predict conditions under which object recognition utilizes the ventral or dorsal stream, and explain why cells of the dorsal stream drive sensorimotor control and motion processing and have poorer feature selectivity. Finally, the model speculates on a dynamic interaction between the two streams that underscores a unified, seamless perception. Existing theories are subsumed under our proposal. PMID:27920670
3D GIS for Flood Modelling in River Valleys
NASA Astrophysics Data System (ADS)
Tymkow, P.; Karpina, M.; Borkowski, A.
2016-06-01
The objective of this study is the implementation of a system architecture for collecting and analysing data as well as visualizing results for hydrodynamic modelling of flood flows in river valleys using remote sensing methods, three-dimensional geometry of spatial objects and GPU multithread processing. The proposed solution includes: a spatial data acquisition segment, data processing and transformation, mathematical modelling of flow phenomena and results visualization. The data acquisition segment was based on aerial laser scanning supplemented by images in the visible range. Vector data creation was based on automatic and semiautomatic algorithms of DTM and 3D spatial features modelling. Algorithms for buildings and vegetation geometry modelling were proposed or adopted from literature. The implementation of the framework was designed as modular software using open specifications and partially reusing open source projects. The database structure for gathering and sharing vector data, including flood modelling results, was created using PostgreSQL. For the internal structure of feature classes of spatial objects in the database, the CityGML standard was used. For the hydrodynamic modelling, a solution of the Navier-Stokes equations in two-dimensional form was implemented. Visualization of geospatial data and flow model results was transferred to the client-side application, which provides independence from the server hardware platform. A real-world case in Poland, a part of the Widawa River valley near the city of Wroclaw, was selected to demonstrate the applicability of the proposed system.
De Sá Teixeira, Nuno
2016-01-01
Visual memory for the spatial location where a moving target vanishes has been found to be systematically displaced downward in the direction of gravity. Moreover, it was recently reported that the magnitude of the downward error increases steadily with increasing retention intervals imposed after object's offset and before observers are allowed to perform the spatial localization task, in a pattern where the remembered vanishing location drifts downward as if following a falling trajectory. This outcome was taken to reflect the dynamics of a representational model of earth's gravity. The present study aims to establish the spatial and temporal features of this downward drift by taking into account the dynamics of the motor response. The obtained results show that the memory for the last location of the target drifts downward with time, thus replicating previous results. Moreover, the time taken for completion of the behavioural localization movements seems to add to the imposed retention intervals in determining the temporal frame during which the visual memory is updated. Overall, it is reported that the representation of spatial location drifts downward by about 3 pixels for each two-fold increase of time until response. The outcomes are discussed in relation to a predictive internal model of gravity which outputs an on-line spatial update of remembered objects' location.
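The reported rate of about 3 pixels per two-fold increase of time until response corresponds to a drift that grows with the base-2 logarithm of elapsed time. The sketch below is only an arithmetic reading of that sentence, not the author's model: the function name, the reference time, and the example times are hypothetical, and the abstract does not state where the drift is anchored.

```python
import math


def predicted_drift(time_ms, reference_ms, pixels_per_doubling=3.0):
    """Downward drift (pixels) relative to the drift at `reference_ms`.

    Assumes the drift grows by a fixed number of pixels each time the
    time until response doubles; `reference_ms` is a hypothetical anchor.
    """
    if time_ms <= 0 or reference_ms <= 0:
        raise ValueError("times must be positive")
    return pixels_per_doubling * math.log2(time_ms / reference_ms)


# Each doubling of the elapsed time adds the same 3 pixels.
for t in (300, 600, 1200):
    print(t, predicted_drift(t, reference_ms=300))
# 300 0.0
# 600 3.0
# 1200 6.0
```

Under this reading the drift decelerates over time: the first doubling and a much later doubling each contribute the same 3 pixels, even though the later one spans far more milliseconds.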
Using CNN Features to Better Understand What Makes Visual Artworks Special
Brachmann, Anselm; Barth, Erhardt; Redies, Christoph
2017-01-01
One of the goals of computational aesthetics is to understand what is special about visual artworks. By analyzing image statistics, contemporary methods in computer vision enable researchers to identify properties that distinguish artworks from other (non-art) types of images. Such knowledge will eventually allow inferences with regard to the possible neural mechanisms that underlie aesthetic perception in the human visual system. In the present study, we define measures that capture variances of features of a well-established Convolutional Neural Network (CNN), which was trained on millions of images to recognize objects. Using an image dataset that represents traditional Western, Islamic and Chinese art, as well as various types of non-art images, we show that we need only two variance measures to distinguish between the artworks and non-art images with a high classification accuracy of 93.0%. Results for the first variance measure imply that, in the artworks, the subregions of an image tend to be filled with pictorial elements, to which many diverse CNN features respond (richness of feature responses). Results for the second measure imply that this diversity is tied to a relatively large variability of the responses of individual CNN features across the subregions of an image. We hypothesize that this combination of richness and variability of CNN feature responses is one of the properties that make traditional visual artworks special. We discuss the possible neural underpinnings of this perceptual quality of artworks and propose to study the same quality also in other types of aesthetic stimuli, such as music and literature. PMID:28588537
Gao, Dashan; Vasconcelos, Nuno
2009-01-01
A decision-theoretic formulation of visual saliency, first proposed for top-down processing (object recognition) (Gao & Vasconcelos, 2005a), is extended to the problem of bottom-up saliency. Under this formulation, optimality is defined in the minimum probability of error sense, under a constraint of computational parsimony. The saliency of the visual features at a given location of the visual field is defined as the power of those features to discriminate between the stimulus at the location and a null hypothesis. For bottom-up saliency, this is the set of visual features that surround the location under consideration. Discrimination is defined in an information-theoretic sense and the optimal saliency detector derived for a class of stimuli that complies with known statistical properties of natural images. It is shown that under the assumption that saliency is driven by linear filtering, the optimal detector consists of what is usually referred to as the standard architecture of V1: a cascade of linear filtering, divisive normalization, rectification, and spatial pooling. The optimal detector is also shown to replicate the fundamental properties of the psychophysics of saliency: stimulus pop-out, saliency asymmetries for stimulus presence versus absence, disregard of feature conjunctions, and Weber's law. Finally, it is shown that the optimal saliency architecture can be applied to the solution of generic inference problems. In particular, for the class of stimuli studied, it performs the three fundamental operations of statistical inference: assessment of probabilities, implementation of Bayes decision rule, and feature selection.
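The four-stage cascade named in this abstract (linear filtering, divisive normalization, rectification, spatial pooling) can be illustrated on a one-dimensional signal. The sketch below is not the authors' optimal detector: the kernels, the constant `sigma`, the particular normalization (dividing by `sigma` plus the summed absolute responses across channels), half-wave rectification, and the pooling window are all assumptions chosen to keep the example small.

```python
def linear_filter(signal, kernel):
    """Valid-mode correlation of a 1-D signal with a kernel."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def saliency_cascade(signal, kernels, sigma=1.0, pool=2):
    """Toy V1-style cascade; all kernels must have the same length."""
    # 1. Linear filtering: one response channel per kernel.
    channels = [linear_filter(signal, k) for k in kernels]
    n = len(channels[0])
    # 2. Divisive normalization by the pooled activity across channels.
    norm = [sigma + sum(abs(c[i]) for c in channels) for i in range(n)]
    normalized = [[c[i] / norm[i] for i in range(n)] for c in channels]
    # 3. Half-wave rectification: keep only positive responses.
    rectified = [[max(v, 0.0) for v in c] for c in normalized]
    # 4. Spatial pooling: sum channels, then average non-overlapping windows.
    summed = [sum(c[i] for c in rectified) for i in range(n)]
    return [sum(summed[i:i + pool]) / pool
            for i in range(0, n - pool + 1, pool)]


# A bar on a blank background, probed with a pair of opposite edge detectors.
signal = [0, 0, 0, 1, 1, 1, 0, 0, 0]
kernels = [[-1, 1], [1, -1]]
out = saliency_cascade(signal, kernels)
print([round(v, 3) for v in out])  # [0.0, 0.167, 0.167, 0.0]
```

The output is nonzero only in the pooled windows that contain the bar's two edges, which is the kind of local-contrast response that bottom-up saliency models build on.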
Liu, Jianli; Lughofer, Edwin; Zeng, Xianyi
2015-01-01
Modeling human aesthetic perception of visual textures is important and valuable in numerous industrial domains, such as product design, architectural design, and decoration. Based on results from a semantic differential rating experiment, we modeled the relationship between low-level basic texture features and aesthetic properties involved in human aesthetic texture perception. First, we compute basic texture features from textural images using four classical methods. These features are neutral, objective, and independent of the socio-cultural context of the visual textures. Then, we conduct a semantic differential rating experiment to collect from evaluators their aesthetic perceptions of selected textural stimuli. In the semantic differential rating experiment, eight pairs of aesthetic properties are chosen, which are strongly related to the socio-cultural context of the selected textures and to human emotions. They are easily understood and connected to everyday life. We propose a hierarchical feed-forward layer model of aesthetic texture perception and assign the eight pairs of aesthetic properties to different layers. Finally, we describe the generation of multiple linear and non-linear regression models for aesthetic prediction by taking dimensionality-reduced texture features and aesthetic properties of visual textures as dependent and independent variables, respectively. Our experimental results indicate that the relationships between each layer and its neighbors in the hierarchical feed-forward layer model of aesthetic texture perception can be fitted well by linear functions, and the models thus generated can successfully bridge the gap between computational texture features and aesthetic texture properties.
Helioviewer: A Web 2.0 Tool for Visualizing Heterogeneous Heliophysics Data
NASA Astrophysics Data System (ADS)
Hughitt, V. K.; Ireland, J.; Lynch, M. J.; Schmeidel, P.; Dimitoglou, G.; Müeller, D.; Fleck, B.
2008-12-01
Solar physics datasets are becoming larger, richer, more numerous and more distributed. Feature/event catalogs (describing objects of interest in the original data) are becoming important tools in navigating these data. In the wake of this increasing influx of data and catalogs there has been a growing need for highly sophisticated tools for accessing and visualizing this wealth of information. Helioviewer is a novel tool for integrating and visualizing disparate sources of solar and Heliophysics data. Taking advantage of the newly available power of modern web application frameworks, Helioviewer merges image and feature catalog data, and provides for Heliophysics data a familiar interface not unlike Google Maps or MapQuest. In addition to streamlining the process of combining heterogeneous Heliophysics datatypes such as full-disk images and coronagraphs, the inclusion of visual representations of automated and human-annotated features provides the user with an integrated and intuitive view of how different factors may be interacting on the Sun. Currently, Helioviewer offers images from The Extreme ultraviolet Imaging Telescope (EIT), The Large Angle and Spectrometric COronagraph experiment (LASCO) and the Michelson Doppler Imager (MDI) instruments onboard The Solar and Heliospheric Observatory (SOHO), as well as The Transition Region and Coronal Explorer (TRACE). Helioviewer also incorporates feature/event information from the LASCO CME List, NOAA Active Regions, CACTus CME and Type II Radio Bursts feature/event catalogs. The project is undergoing continuous development with many more data sources and additional functionality planned for the near future.
Hue distinctiveness overrides category in determining performance in multiple object tracking.
Sun, Mengdan; Zhang, Xuemin; Fan, Lingxia; Hu, Luming
2018-02-01
The visual distinctiveness between targets and distractors can significantly facilitate performance in multiple object tracking (MOT), in which color is a feature that has been commonly used. However, the processing of color can be more than "visual." Color is continuous in chromaticity, while it is commonly grouped into discrete categories (e.g., red, green). Evidence from color perception suggests that color categories may have a unique role in visual tasks independent of their chromatic appearance. Previous MOT studies have not examined the effect of chromatic and categorical distinctiveness on tracking separately. The current study aimed to reveal how chromatic (hue) and categorical distinctiveness of color between the targets and distractors affects tracking performance. With four experiments, we showed that tracking performance was largely facilitated by the increasing hue distance between the target set and the distractor set, suggesting that perceptual grouping was formed based on hue distinctiveness to aid tracking. However, we found no color categorical effect, because tracking performance was not significantly different when the targets and distractors were from the same or different categories. It was concluded that the chromatic distinctiveness of color overrides category in determining tracking performance, suggesting a dominant role of perceptual feature in MOT.
Acquiring Semantically Meaningful Models for Robotic Localization, Mapping and Target Recognition
2014-12-21
information, including suggestions for reducing this burden, to Washington Headquarters Services, Directorate for Information Operations and Reports, 1215...Representations • Point features tracking • Recovery of relative motion, visual odometry • Loop closure • Environment models, sparse clouds of points...that co-occur with the object of interest Chair-Background Table-Background Object Level Segmentation Jaccard Index Silber .[5] 15.12 RenFox[4
ERIC Educational Resources Information Center
Torralba, Antonio; Oliva, Aude; Castelhano, Monica S.; Henderson, John M.
2006-01-01
Many experiments have shown that the human visual system makes extensive use of contextual information for facilitating object search in natural scenes. However, the question of how to formally model contextual influences is still open. On the basis of a Bayesian framework, the authors present an original approach of attentional guidance by global…
Sounds of silence: How to animate virtual worlds with sound
NASA Technical Reports Server (NTRS)
Astheimer, Peter
1993-01-01
Sounds are an integral and sometimes annoying part of our daily life. Virtual worlds which imitate natural environments gain a lot of authenticity from fast, high quality visualization combined with sound effects. Sounds help to increase the degree of immersion for human dwellers in imaginary worlds significantly. The virtual reality toolkit of IGD (Institute for Computer Graphics) features a broad range of standard visual and advanced real-time audio components which interpret an object-oriented definition of the scene. The virtual reality system 'Virtual Design' realized with the toolkit enables the designer of virtual worlds to create a true audiovisual environment. Several examples on video demonstrate the usage of the audio features in Virtual Design.
BATSE Gamma-Ray Burst Line Search. IV. Line Candidates from the Visual Search
NASA Astrophysics Data System (ADS)
Band, D. L.; Ryder, S.; Ford, L. A.; Matteson, J. L.; Palmer, D. M.; Teegarden, B. J.; Briggs, M. S.; Paciesas, W. S.; Pendleton, G. N.; Preece, R. D.
1996-02-01
We evaluate the significance of the line candidates identified by a visual search of burst spectra from BATSE's Spectroscopy Detectors. None of the candidates satisfy our detection criteria: an F-test probability less than 10^-4 for a feature in one detector and consistency among the detectors that viewed the burst. Most of the candidates are not very significant and are likely to be fluctuations. Because of the expectation of finding absorption lines, the search was biased toward absorption features. We do not have a quantitative measure of the completeness of the search, which would enable a comparison with previous missions. Therefore, a more objective computerized search has begun.
Makuuchi, Michiru; Someya, Yoshiaki; Ogawa, Seiji; Takayama, Yoshihiro
2011-01-01
In visually guided grasping, possible hand shapes are computed from the geometrical features of the object, while prior knowledge about the object and the goal of the action influence both the computation and the selection of the hand shape. We investigated the system dynamics of the human brain for the pantomiming of grasping with two aspects accentuated. One is object recognition, with the use of objects for daily use. The subjects mimed grasping movements appropriate for an object presented in a photograph either by precision or power grip. The other is the selection of grip hand shape. We manipulated the selection demands for the grip hand shape by having the subjects use the same or different grip type in the second presentation of the identical object. Effective connectivity analysis revealed that the increased selection demands enhance the interaction between the anterior intraparietal sulcus (AIP) and posterior inferior temporal gyrus (pITG), and drive the converging causal influences from the AIP, pITG, and dorsolateral prefrontal cortex to the ventral premotor area (PMv). These results suggest that the dorsal and ventral visual areas interact in the pantomiming of grasping, while the PMv integrates the neural information of different regions to select the hand posture. The present study proposes system dynamics in visually guided movement toward meaningful objects, but further research is needed to examine if the same dynamics is found also in real grasping. PMID:21739528
Kovalenko, Lyudmyla Y; Chaumon, Maximilien; Busch, Niko A
2012-07-01
Semantic processing of verbal and visual stimuli has been investigated in semantic violation or semantic priming paradigms in which a stimulus is either related or unrelated to a previously established semantic context. A hallmark of semantic priming is the N400 event-related potential (ERP)--a deflection of the ERP that is more negative for semantically unrelated target stimuli. The majority of studies investigating the N400 and semantic integration have used verbal material (words or sentences), and standardized stimulus sets with norms for semantic relatedness have been published for verbal but not for visual material. However, semantic processing of visual objects (as opposed to words) is an important issue in research on visual cognition. In this study, we present a set of 800 pairs of semantically related and unrelated visual objects. The images were rated for semantic relatedness by a sample of 132 participants. Furthermore, we analyzed low-level image properties and matched the two semantic categories according to these features. An ERP study confirmed the suitability of this image set for evoking a robust N400 effect of semantic integration. Additionally, using a general linear modeling approach of single-trial data, we also demonstrate that low-level visual image properties and semantic relatedness are in fact only minimally overlapping. The image set is available for download from the authors' website. We expect that the image set will facilitate studies investigating mechanisms of semantic and contextual processing of visual stimuli.
No Effect of Featural Attention on Body Size Aftereffects
Stephen, Ian D.; Bickersteth, Chloe; Mond, Jonathan; Stevenson, Richard J.; Brooks, Kevin R.
2016-01-01
Prolonged exposure to images of narrow bodies has been shown to induce a perceptual aftereffect, such that observers’ point of subjective normality (PSN) for bodies shifts toward narrower bodies. The converse effect is shown for adaptation to wide bodies. In low-level stimuli, object attention (attention directed to the object) and spatial attention (attention directed to the location of the object) have been shown to increase the magnitude of visual aftereffects, while object-based attention enhances the adaptation effect in faces. It is not known whether featural attention (attention directed to a specific aspect of the object) affects the magnitude of adaptation effects in body stimuli. Here, we manipulate the attention of Caucasian observers to different featural information in body images, by asking them to rate the fatness or sex typicality of male and female bodies manipulated to appear fatter or thinner than average. PSNs for body fatness were taken at baseline and after adaptation, and a change in PSN (ΔPSN) was calculated. A body size adaptation effect was found, with observers who viewed fat bodies showing an increased PSN, and those exposed to thin bodies showing a reduced PSN. However, manipulations of featural attention to body fatness or sex typicality produced equivalent results, suggesting that featural attention may not affect the strength of the body size aftereffect. PMID:27597835
No Effect of Featural Attention on Body Size Aftereffects.
Stephen, Ian D; Bickersteth, Chloe; Mond, Jonathan; Stevenson, Richard J; Brooks, Kevin R
2016-01-01
Prolonged exposure to images of narrow bodies has been shown to induce a perceptual aftereffect, such that observers' point of subjective normality (PSN) for bodies shifts toward narrower bodies. The converse effect is shown for adaptation to wide bodies. In low-level stimuli, object attention (attention directed to the object) and spatial attention (attention directed to the location of the object) have been shown to increase the magnitude of visual aftereffects, while object-based attention enhances the adaptation effect in faces. It is not known whether featural attention (attention directed to a specific aspect of the object) affects the magnitude of adaptation effects in body stimuli. Here, we manipulate the attention of Caucasian observers to different featural information in body images, by asking them to rate the fatness or sex typicality of male and female bodies manipulated to appear fatter or thinner than average. PSNs for body fatness were taken at baseline and after adaptation, and a change in PSN (ΔPSN) was calculated. A body size adaptation effect was found, with observers who viewed fat bodies showing an increased PSN, and those exposed to thin bodies showing a reduced PSN. However, manipulations of featural attention to body fatness or sex typicality produced equivalent results, suggesting that featural attention may not affect the strength of the body size aftereffect.
Object segmentation controls image reconstruction from natural scenes
2017-01-01
The structure of the physical world projects images onto our eyes. However, those images are often poorly representative of environmental structure: well-defined boundaries within the eye may correspond to irrelevant features of the physical world, while critical features of the physical world may be nearly invisible at the retinal projection. The challenge for the visual cortex is to sort these two types of features according to their utility in ultimately reconstructing percepts and interpreting the constituents of the scene. We describe a novel paradigm that enabled us to selectively evaluate the relative role played by these two feature classes in signal reconstruction from corrupted images. Our measurements demonstrate that this process is quickly dominated by the inferred structure of the environment, and only minimally controlled by variations of raw image content. The inferential mechanism is spatially global and its impact on early visual cortex is fast. Furthermore, it retunes local visual processing for more efficient feature extraction without altering the intrinsic transduction noise. The basic properties of this process can be partially captured by a combination of small-scale circuit models and large-scale network architectures. Taken together, our results challenge compartmentalized notions of bottom-up/top-down perception and suggest instead that these two modes are best viewed as an integrated perceptual mechanism. PMID:28827801
Common Visual Preference for Curved Contours in Humans and Great Apes.
Munar, Enric; Gómez-Puerto, Gerardo; Call, Josep; Nadal, Marcos
2015-01-01
Among the visual preferences that guide many everyday activities and decisions, from consumer choices to social judgment, preference for curved over sharp-angled contours is commonly thought to have played an adaptive role throughout human evolution, favoring the avoidance of potentially harmful objects. However, because nonhuman primates also exhibit preferences for certain visual qualities, it is conceivable that humans' preference for curved contours is grounded on perceptual and cognitive mechanisms shared with extant nonhuman primate species. Here we aimed to determine whether nonhuman great apes and humans share a visual preference for curved over sharp-angled contours using a 2-alternative forced choice experimental paradigm under comparable conditions. Our results revealed that the human group and the great ape group indeed share a common preference for curved over sharp-angled contours, but that they differ in the manner and magnitude with which this preference is expressed behaviorally. These results suggest that humans' visual preference for curved objects evolved from earlier primate species' visual preferences, and that during this process it became stronger, but also more susceptible to the influence of higher cognitive processes and preference for other visual features.
Assessing clutter reduction in parallel coordinates using image processing techniques
NASA Astrophysics Data System (ADS)
Alhamaydh, Heba; Alzoubi, Hussein; Almasaeid, Hisham
2018-01-01
Information visualization has emerged in recent years as an important research field for multidimensional data and correlation analysis. Parallel coordinates (PCs) are one of the popular techniques for visualizing high-dimensional data. A problem with the PCs technique is that it suffers from crowding, a form of clutter that hides important data and obfuscates the information. Earlier research has been conducted to reduce clutter without loss in data content. We introduce the use of image processing techniques as an approach for assessing the performance of clutter reduction techniques in PC. We use histogram analysis as our first measure, where the mean feature of the color histograms of the possible alternative orderings of coordinates for the PC images is calculated and compared. The second measure is the extracted contrast feature from the texture of PC images based on gray-level co-occurrence matrices. The results show that the best PC image is the one that has the minimal mean value of the color histogram feature and the maximal contrast value of the texture feature. In addition to its simplicity, the proposed assessment method has the advantage of objectively assessing alternative ordering of PC visualization.
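The two measures named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a small grayscale image given as a list of rows of integer gray levels (the abstract works with color histograms), a single horizontal pixel offset for the co-occurrence matrix, and the hypothetical function names `histogram_mean` and `glcm_contrast`.

```python
from collections import Counter


def histogram_mean(image, levels):
    """Mean gray level computed from the image's intensity histogram."""
    counts = Counter(value for row in image for value in row)
    total = sum(counts.values())
    return sum(level * counts.get(level, 0) for level in range(levels)) / total


def glcm_contrast(image, dx=1, dy=0):
    """Contrast of the normalized gray-level co-occurrence matrix.

    Pixel pairs are taken at offset (dx, dy); contrast is the
    co-occurrence-weighted mean of the squared gray-level difference.
    """
    pairs = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                pairs[(image[r][c], image[r2][c2])] += 1
    total = sum(pairs.values())
    return sum((i - j) ** 2 * n for (i, j), n in pairs.items()) / total


# Illustrative 2x2 inputs (made up for this sketch, not data from the study).
flat = [[1, 1], [1, 1]]
checker = [[0, 3], [3, 0]]
```

Under the criterion stated in the abstract, candidate axis orderings would be ranked by a low histogram mean and a high contrast value; how the two are combined is not specified there, so the sketch leaves that step out.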
[Several mechanisms of visual gnosis disorders in local brain lesions].
Meerson, Ia A
1981-01-01
The studies examined how patients with local cerebral lesions recognize visual images when the set of image features is incomplete, when the features are disjoined, when their spatial arrangement is distorted, and when the image as a whole is in an unusual spatial orientation. It was found that elimination of even one essential feature sharply hampered recognition of the image both by healthy individuals (controls) and by patients with extraoccipital lesions, whereas elimination of several nonessential features only slowed down the process. In contrast, the difficulty that patients with occipital lesions had in recognizing incomplete images was directly proportional to the number of eliminated features, irrespective of the features' significance; i.e., these patients were unable to evaluate the hierarchy of the features. The recognition process in these patients proceeded by scanning individual features and then reaccumulating and summing them. The recognition of fragmented, spatially distorted, and unusually oriented images was found to be selectively affected in patients with parietal lobe lesions. The patients with occipital lesions recognized such images practically as well as ordinary ones.
Ahlfors, Seppo P.; Jones, Stephanie R.; Ahveninen, Jyrki; Hämäläinen, Matti S.; Belliveau, John W.; Bar, Moshe
2014-01-01
Identifying inter-area communication in terms of the hierarchical organization of functional brain areas is of considerable interest in human neuroimaging. Previous studies have suggested that the direction of magneto- and electroencephalography (MEG, EEG) source currents depends on the layer-specific input patterns into a cortical area. We examined the direction in MEG source currents in a visual object recognition experiment in which there were specific expectations of activation in the fusiform region being driven by either feedforward or feedback inputs. The source for the early non-specific visual evoked response, presumably corresponding to feedforward driven activity, pointed outward, i.e., away from the white matter. In contrast, the source for the later, object-recognition related signals, expected to be driven by feedback inputs, pointed inward, toward the white matter. Associating specific features of the MEG/EEG source waveforms to feedforward and feedback inputs could provide unique information about the activation patterns within hierarchically organized cortical areas. PMID:25445356
Slow feature analysis: unsupervised learning of invariances.
Wiskott, Laurenz; Sejnowski, Terrence J
2002-04-01
Invariant features of temporally varying signals are useful for analysis and classification. Slow feature analysis (SFA) is a new method for learning invariant or slowly varying features from a vectorial input signal. It is based on a nonlinear expansion of the input signal and application of principal component analysis to this expanded signal and its time derivative. It is guaranteed to find the optimal solution within a family of functions directly and can learn to extract a large number of decorrelated features, which are ordered by their degree of invariance. SFA can be applied hierarchically to process high-dimensional input signals and extract complex features. SFA is applied first to complex cell tuning properties based on simple cell output, including disparity and motion. Then more complicated input-output functions are learned by repeated application of SFA. Finally, a hierarchical network of SFA modules is presented as a simple model of the visual system. The same unstructured network can learn translation, size, rotation, contrast, or, to a lesser degree, illumination invariance for one-dimensional objects, depending on only the training stimulus. Surprisingly, only a few training objects suffice to achieve good generalization to new objects. The generated representation is suitable for object recognition. Performance degrades if the network is trained to learn multiple invariances simultaneously.
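The core step described above (whiten the signal, then apply principal component analysis to its time derivative and keep the slowest direction) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: it omits the nonlinear expansion, handles only two input channels so that the eigen-decompositions have a closed form, approximates the derivative by finite differences, and uses the hypothetical names `sym_eig_2x2`, `second_moments`, and `linear_sfa_2d`.

```python
import math


def sym_eig_2x2(a, b, c):
    """Eigen-decomposition of the symmetric matrix [[a, b], [b, c]].

    Returns (larger eigenvalue, its unit eigenvector,
             smaller eigenvalue, its unit eigenvector).
    """
    theta = 0.5 * math.atan2(2 * b, a - c)
    v_big = (math.cos(theta), math.sin(theta))
    v_small = (-math.sin(theta), math.cos(theta))
    mean, half = (a + c) / 2, math.hypot((a - c) / 2, b)
    return mean + half, v_big, mean - half, v_small


def second_moments(xs, ys):
    """Entries (xx, xy, yy) of the second-moment matrix of two series."""
    n = len(xs)
    return (sum(x * x for x in xs) / n,
            sum(x * y for x, y in zip(xs, ys)) / n,
            sum(y * y for y in ys) / n)


def linear_sfa_2d(x1, x2):
    """Return the slowest unit-variance linear feature of a 2-channel signal."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    # Step 1: whiten, i.e. rotate onto the principal axes and rescale
    # each axis to unit variance.
    l_big, v_big, l_small, v_small = sym_eig_2x2(*second_moments(c1, c2))
    z1 = [(v_big[0] * a + v_big[1] * b) / math.sqrt(l_big) for a, b in zip(c1, c2)]
    z2 = [(v_small[0] * a + v_small[1] * b) / math.sqrt(l_small) for a, b in zip(c1, c2)]
    # Step 2: finite-difference approximation of the time derivative.
    d1 = [z1[t + 1] - z1[t] for t in range(n - 1)]
    d2 = [z2[t + 1] - z2[t] for t in range(n - 1)]
    # Step 3: the direction with the smallest derivative variance is the
    # slowest feature.
    _, _, _, w = sym_eig_2x2(*second_moments(d1, d2))
    return [w[0] * a + w[1] * b for a, b in zip(z1, z2)]
```

Given two channels that linearly mix a slow and a fast sinusoid, the returned feature should track the slow source up to sign and scale. In the full method the same steps are applied after a nonlinear expansion of the input, which is what allows more complicated input-output functions to be learned.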
Visual conspicuity: a new simple standard, its reliability, validity and applicability.
Wertheim, A H
2010-03-01
A general standard for quantifying conspicuity is described. It derives from a simple and easy method to quantitatively measure the visual conspicuity of an object. The method stems from the theoretical view that the conspicuity of an object is not a property of that object, but describes the degree to which the object is perceptually embedded in, i.e. laterally masked by, its visual environment. First, three variations of a simple method to measure the strength of such lateral masking are described and empirical evidence for its reliability and its validity is presented, as are several tests of predictions concerning the effects of viewing distance and ambient light. It is then shown how this method yields a conspicuity standard, expressed as a number, which can be made part of a rule of law, and which can be used to test whether or not, and to what extent, the conspicuity of a particular object, e.g. a traffic sign, meets a predetermined criterion. An additional feature is that, when used under different ambient light conditions, the method may also yield an index of the amount of visual clutter in the environment. Taken together, the evidence illustrates the method's applicability both in the laboratory and in real-life situations. STATEMENT OF RELEVANCE: This paper concerns a proposal for a new method to measure visual conspicuity, yielding a numerical index that can be used in a rule of law. It is of importance to ergonomists and human factors specialists who are asked to measure the conspicuity of an object, such as a traffic or railroad sign, or any other object. The new method is simple and circumvents the need to perform elaborate (search) experiments and thus has great relevance as a simple tool for applied research.
Model-based analysis of pattern motion processing in mouse primary visual cortex
Muir, Dylan R.; Roth, Morgane M.; Helmchen, Fritjof; Kampa, Björn M.
2015-01-01
Neurons in sensory areas of neocortex exhibit responses tuned to specific features of the environment. In visual cortex, information about features such as edges or textures with particular orientations must be integrated to recognize a visual scene or object. Connectivity studies in rodent cortex have revealed that neurons make specific connections within sub-networks sharing common input tuning. In principle, this sub-network architecture enables local cortical circuits to integrate sensory information. However, whether feature integration indeed occurs locally in rodent primary sensory areas has not been examined directly. We studied local integration of sensory features in primary visual cortex (V1) of the mouse by presenting drifting grating and plaid stimuli, while recording the activity of neuronal populations with two-photon calcium imaging. Using a Bayesian model-based analysis framework, we classified single-cell responses as being selective for either individual grating components or for moving plaid patterns. Rather than relying on trial-averaged responses, our model-based framework takes into account single-trial responses and can easily be extended to consider any number of arbitrary predictive models. Our analysis method was able to successfully classify significantly more responses than traditional partial correlation (PC) analysis, and provides a rigorous statistical framework to rank any number of models and reject poorly performing models. We also found a large proportion of cells that respond strongly to only one stimulus class. In addition, a quarter of selectively responding neurons had more complex responses that could not be explained by any simple integration model. Our results show that a broad range of pattern integration processes already take place at the level of V1. This diversity of integration is consistent with processing of visual inputs by local sub-networks within V1 that are tuned to combinations of sensory features. 
PMID:26300738
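For context, the traditional partial correlation analysis that this abstract uses as its baseline can be sketched as below. This illustrates the standard partial-correlation formula only, not the authors' Bayesian framework; the function names and the idea of passing each prediction as a list of expected responses per stimulus condition are assumptions made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlations(response, pattern_pred, component_pred):
    """Partial correlation of a response with each prediction.

    Each value is the correlation with one prediction after removing the
    part explained by the other prediction. The formula is undefined when
    any of the underlying correlations equals +1 or -1 exactly.
    """
    r_p = pearson(response, pattern_pred)
    r_c = pearson(response, component_pred)
    r_pc = pearson(pattern_pred, component_pred)
    part_p = (r_p - r_c * r_pc) / math.sqrt((1 - r_c ** 2) * (1 - r_pc ** 2))
    part_c = (r_c - r_p * r_pc) / math.sqrt((1 - r_p ** 2) * (1 - r_pc ** 2))
    return part_p, part_c


# Made-up tuning curves for illustration only (not data from the study).
pattern_pred = [0, 1, 0, 2, 0, 1, 0, 3]
component_pred = [1, 0, 2, 0, 1, 0, 3, 0]
response = [0.1, 0.9, 0.2, 2.1, 0.0, 1.1, 0.1, 2.8]
part_p, part_c = partial_correlations(response, pattern_pred, component_pred)
```

A cell would be labeled pattern-selective when its pattern partial correlation clearly exceeds its component partial correlation, and component-selective in the opposite case. Because this approach works on trial-averaged responses, it is the point of contrast for the single-trial, model-based framework the abstract describes.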
Whatever you do, don't look at the...: Evaluating guidance by an exclusionary attentional template.
Beck, Valerie M; Luck, Steven J; Hollingworth, Andrew
2018-04-01
People can use a target template consisting of one or more features to guide attention and gaze to matching objects in a search array. But can we also use feature information to guide attention away from known irrelevant items? Some studies found a benefit from foreknowledge of a distractor feature, whereas others found a cost. Importantly, previous work has largely relied on end-of-trial manual responses; it is unclear how feature-guided avoidance might unfold as candidate objects are inspected. In the current experiments, participants were cued with a distractor feature to avoid, then performed a visual search task while eye movements were recorded. Participants initially fixated a to-be-avoided object more frequently than predicted by chance, but they also demonstrated avoidance of cue-matching objects later in the trial. When provided more time between cue stimulus and search array, participants continued to be initially captured by a cued-color item. Furthermore, avoidance of cue-matching objects later in the trial was not contingent on initial capture by a cue-matching object. These results suggest that the conflicting findings in previous negative-cue experiments may be explained by a mixture of two independent processes: initial attentional capture by memory-matching items and later avoidance of known irrelevant items. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Image fusion using sparse overcomplete feature dictionaries
Brumby, Steven P.; Bettencourt, Luis; Kenyon, Garrett T.; Chartrand, Rick; Wohlberg, Brendt
2015-10-06
Approaches for deciding what individuals in a population of visual system "neurons" are looking for using sparse overcomplete feature dictionaries are provided. A sparse overcomplete feature dictionary may be learned for an image dataset and a local sparse representation of the image dataset may be built using the learned feature dictionary. A local maximum pooling operation may be applied on the local sparse representation to produce a translation-tolerant representation of the image dataset. An object may then be classified and/or clustered within the translation-tolerant representation of the image dataset using a supervised classification algorithm and/or an unsupervised clustering algorithm.
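The local maximum pooling step mentioned above can be illustrated with a short sketch. It assumes a single 2-D activation map (a list of rows of coefficients for one dictionary element) and non-overlapping square windows; the name `local_max_pool` and the window layout are illustrative choices, not details taken from the source.

```python
def local_max_pool(activation, size):
    """Max-pool a 2-D activation map over non-overlapping size x size windows.

    Windows at the right and bottom edges are truncated if the map
    dimensions are not multiples of `size`.
    """
    rows, cols = len(activation), len(activation[0])
    return [
        [
            max(
                activation[r][c]
                for r in range(r0, min(r0 + size, rows))
                for c in range(c0, min(c0 + size, cols))
            )
            for c0 in range(0, cols, size)
        ]
        for r0 in range(0, rows, size)
    ]


# The same sparse response at two nearby positions inside one window.
original = [[1, 0, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 0]]
shifted = [[0, 0, 0, 0],
           [0, 1, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
```

Both maps pool to the same 2 x 2 output with a window size of 2, which is the sense in which the pooled representation tolerates small translations. A shift that crosses a window boundary would still change the output.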
Velocity and Structure Estimation of a Moving Object Using a Moving Monocular Camera
2006-01-01
map the Euclidean position of static landmarks or visual features in the environment. Recent applications of this technique include aerial...From Motion in a Piecewise Planar Environment,” International Journal of Pattern Recognition and Artificial Intelligence, Vol. 2, No. 3, pp. 485-508...1988. [9] J. M. Ferryman, S. J. Maybank, and A. D. Worrall, “Visual Surveillance for Moving Vehicles,” Intl. Journal of Computer Vision, Vol. 37, No
Comparison of Object Recognition Behavior in Human and Monkey
Rajalingham, Rishi; Schmidt, Kailyn
2015-01-01
Although the rhesus monkey is used widely as an animal model of human visual processing, it is not known whether invariant visual object recognition behavior is quantitatively comparable across monkeys and humans. To address this question, we systematically compared the core object recognition behavior of two monkeys with that of human subjects. To test true object recognition behavior (rather than image matching), we generated several thousand naturalistic synthetic images of 24 basic-level objects with high variation in viewing parameters and image background. Monkeys were trained to perform binary object recognition tasks on a match-to-sample paradigm. Data from 605 human subjects performing the same tasks on Mechanical Turk were aggregated to characterize “pooled human” object recognition behavior, as well as 33 separate Mechanical Turk subjects to characterize individual human subject behavior. Our results show that monkeys learn each new object in a few days, after which they not only match mean human performance but show a pattern of object confusion that is highly correlated with pooled human confusion patterns and is statistically indistinguishable from individual human subjects. Importantly, this shared human and monkey pattern of 3D object confusion is not shared with low-level visual representations (pixels, V1+; models of the retina and primary visual cortex) but is shared with a state-of-the-art computer vision feature representation. Together, these results are consistent with the hypothesis that rhesus monkeys and humans share a common neural shape representation that directly supports object perception. SIGNIFICANCE STATEMENT To date, several mammalian species have shown promise as animal models for studying the neural mechanisms underlying high-level visual processing in humans. 
In light of this diversity, making tight comparisons between nonhuman and human primates is particularly critical in determining the best use of nonhuman primates to further the goal of the field of translating knowledge gained from animal models to humans. To the best of our knowledge, this study is the first systematic attempt at comparing a high-level visual behavior of humans and macaque monkeys. PMID:26338324
Kawa, Rafał; Pisula, Ewa
2010-01-01
There have been ambiguous accounts of exploration in children with intellectual disabilities with respect to the course of that exploration, and in particular the relationship between the features of explored objects and exploratory behaviour. It is unclear whether reduced exploratory activity seen with object exploration but not with locomotor activity is autism-specific or if it is also present in children with other disabilities. The purpose of the present study was to compare preschool children with autism with their peers with Down syndrome and typical development in terms of locomotor activity and object exploration and to determine whether the complexity of explored objects affects the course of exploration activity in children with autism. In total there were 27 children in the study. The experimental room was divided into three zones equipped with experimental objects providing visual stimulation of varying levels of complexity. Our results indicate that children with autism and Down syndrome differ from children with typical development in terms of some measures of object exploration (i.e. looking at objects) and time spent in the zone with the most visually complex objects.
Puffe, Lydia; Dittrich, Kerstin; Klauer, Karl Christoph
2017-01-01
In a joint go/no-go Simon task, each of two participants is to respond to one of two non-spatial stimulus features by means of a spatially lateralized response. Stimulus position varies horizontally and responses are faster and more accurate when response side and stimulus position match (compatible trial) than when they mismatch (incompatible trial), defining the social Simon effect or joint spatial compatibility effect. This effect was originally explained in terms of action/task co-representation, assuming that the co-actor's action is automatically co-represented. Recent research by Dolk, Hommel, Prinz, and Liepelt (2013) challenged this account by demonstrating joint spatial compatibility effects in a task-setting in which non-social objects like a Japanese waving cat were present, but no real co-actor. They postulated that every sufficiently salient object induces joint spatial compatibility effects. However, what makes an object sufficiently salient is so far not well defined. To scrutinize this open question, the current study manipulated auditory and/or visual attention-attracting cues of a Japanese waving cat within an auditory (Experiment 1) and a visual joint go/no-go Simon task (Experiment 2). Results revealed that joint spatial compatibility effects only occurred in an auditory Simon task when the cat provided auditory cues while no joint spatial compatibility effects were found in a visual Simon task. This demonstrates that it is not the sufficiently salient object alone that leads to joint spatial compatibility effects but instead, a complex interaction between features of the object and the stimulus material of the joint go/no-go Simon task.
Familiarity enhances visual working memory for faces.
Jackson, Margaret C; Raymond, Jane E
2008-06-01
Although it is intuitive that familiarity with complex visual objects should aid their preservation in visual working memory (WM), empirical evidence for this is lacking. This study used a conventional change-detection procedure to assess visual WM for unfamiliar and famous faces in healthy adults. Across experiments, faces were upright or inverted and a low- or high-load concurrent verbal WM task was administered to suppress contribution from verbal WM. Even with a high verbal memory load, visual WM performance was significantly better and capacity estimated as significantly greater for famous versus unfamiliar faces. Face inversion abolished this effect. Thus, neither strategic, explicit support from verbal WM nor low-level feature processing easily accounts for the observed benefit of high familiarity for visual WM. These results demonstrate that storage of items in visual WM can be enhanced if robust visual representations of them already exist in long-term memory.
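The abstract above reports capacity estimates from a change-detection procedure but does not say which estimator was used. As a hedged illustration only, the sketch below uses Cowan's K, a common estimator for single-probe change detection, K = N × (hit rate − false-alarm rate). The function name and the example numbers are hypothetical and are not taken from the study.

```python
def cowan_k(set_size, hit_rate, false_alarm_rate):
    """Estimate visual WM capacity with Cowan's K = N * (H - FA).

    Assumption: a single-probe change-detection design. The study above
    does not state its estimator, so this is illustrative only.
    """
    if set_size <= 0:
        raise ValueError("set_size must be positive")
    for rate in (hit_rate, false_alarm_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must lie in [0, 1]")
    return set_size * (hit_rate - false_alarm_rate)


# Hypothetical values: 4 faces per display, 80% hits, 10% false alarms.
k_example = cowan_k(set_size=4, hit_rate=0.80, false_alarm_rate=0.10)
print(f"Estimated capacity: {k_example:.2f} items")
```

With these made-up inputs the estimate is roughly 2.8 items. Comparing such estimates between famous and unfamiliar faces is the kind of contrast the abstract describes.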
Detecting objects in radiographs for homeland security
NASA Astrophysics Data System (ADS)
Prasad, Lakshman; Snyder, Hans
2005-05-01
We present a general scheme for segmenting a radiographic image into polygons that correspond to visual features. This decomposition provides a vectorized representation that is a high-level description of the image. The polygons correspond to objects or object parts present in the image. This characterization of radiographs allows the direct application of several shape recognition algorithms to identify objects. In this paper we describe the use of constrained Delaunay triangulations as a uniform foundational tool to achieve multiple visual tasks, namely image segmentation, shape decomposition, and parts-based shape matching. Shape decomposition yields parts that serve as tokens representing local shape characteristics. Parts-based shape matching enables the recognition of objects in the presence of occlusions, which commonly occur in radiographs. The polygonal representation of image features affords the efficient design and application of sophisticated geometric filtering methods to detect large-scale structural properties of objects in images. Finally, the representation of radiographs via polygons results in a significant reduction of image file sizes and permits the scalable graphical representation of images, along with annotations of detected objects, in the SVG (Scalable Vector Graphics) format proposed by the World Wide Web Consortium (W3C). This is a textual representation that can be compressed and encrypted for efficient and secure transmission of information over wireless channels and on the Internet. In particular, the methods described here provide an algorithmic framework for developing image analysis tools for screening cargo at ports of entry for homeland security.
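To make the final step of the abstract above concrete, here is a minimal sketch of how labeled polygons might be written out as SVG text. It does not implement the constrained Delaunay segmentation. The polygons, labels, and the function name `polygons_to_svg` are hypothetical stand-ins for whatever the authors' pipeline produces.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def polygons_to_svg(polygons, width, height):
    """Serialize (label, vertices) pairs into an SVG document string.

    Each polygon carries its annotation in a <title> child element.
    Illustrative only; not the authors' actual output format.
    """
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
    ]
    for label, vertices in polygons:
        pts = " ".join(f"{x},{y}" for x, y in vertices)
        parts.append(
            f'  <polygon points="{pts}" fill="none" stroke="black">'
            f"<title>{escape(label)}</title></polygon>"
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Hypothetical detected objects with made-up coordinates.
detected = [
    ("container wall", [(0, 0), (100, 0), (100, 10), (0, 10)]),
    ("bag & case", [(20, 30), (60, 30), (40, 70)]),
]
svg = polygons_to_svg(detected, width=120, height=80)
root = ET.fromstring(svg)  # parsing succeeds only if the output is well-formed XML
print(svg)
```

Because the result is plain text, it can be compressed or encrypted with standard tools, which is the property the abstract highlights.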
Moriya, Jun
2017-01-01
According to cognitive theories, verbal processing attenuates emotional processing, whereas visual imagery enhances emotional processing and contributes to the maintenance of social anxiety. Individuals with social anxiety report negative mental images in social situations. However, the general ability of visual mental imagery of neutral scenes in individuals with social anxiety is still unclear. The present study investigated the general ability of non-emotional mental imagery (vividness, preferences for imagery vs. verbal processing, and object or spatial imagery) and the moderating role of effortful control in attenuating social anxiety. The participants (N = 231) completed five questionnaires. The results showed that social anxiety was not necessarily associated with all aspects of mental imagery. As suggested by theories, social anxiety was not associated with a preference for verbal processing. However, social anxiety was positively correlated with the visual imagery scale, especially the object imagery scale, which concerns the ability to construct pictorial images of individual objects. Further, it was negatively correlated with the spatial imagery scale, which concerns the ability to process information about spatial relations between objects. Although object imagery and spatial imagery positively and negatively predicted the degree of social anxiety, respectively, these effects were attenuated when socially anxious individuals had high effortful control. Specifically, in individuals with high effortful control, neither object nor spatial imagery was associated with social anxiety. Socially anxious individuals might prefer to construct pictorial images of individual objects in natural scenes through object imagery. However, even in individuals who exhibit these features of mental imagery, effortful control could inhibit the increase in social anxiety.
Cavina-Pratesi, C; Kentridge, R W; Heywood, C A; Milner, A D
2010-02-01
Real-life visual object recognition requires the processing of more than just geometric (shape, size, and orientation) properties. Surface properties such as color and texture are equally important, particularly for providing information about the material properties of objects. Recent neuroimaging research suggests that geometric and surface properties are dealt with separately within the lateral occipital cortex (LOC) and the collateral sulcus (CoS), respectively. Here we compared objects that differed either in aspect ratio or in surface texture only, keeping all other visual properties constant. Results on brain-intact participants confirmed that surface texture activates an area in the posterior CoS, quite distinct from the area activated by shape within LOC. We also tested 2 patients with visual object agnosia, one of whom (DF) performed well on the texture task but at chance on the shape task, whereas the other (MS) showed the converse pattern. This behavioral double dissociation was matched by a parallel neuroimaging dissociation, with activation in CoS but not LOC in patient DF and activation in LOC but not CoS in patient MS. These data provide presumptive evidence that the areas respectively activated by shape and texture play a causally necessary role in the perceptual discrimination of these features.
A GUI visualization system for airborne lidar image data to reconstruct 3D city model
NASA Astrophysics Data System (ADS)
Kawata, Yoshiyuki; Koizumi, Kohei
2015-10-01
A visualization toolbox system with graphical user interfaces (GUIs) was developed for the analysis of LiDAR point cloud data, as a compound object-oriented widget application in IDL (Interactive Data Language). The main features of our system include file input and output, conversion of ASCII-formatted LiDAR point cloud data to LiDAR image data whose pixel values correspond to the altitude measured by LiDAR, visualization of 2D/3D images at various processing steps, and automatic reconstruction of a 3D city model. The performance and advantages of our GUI visualization system for LiDAR data are demonstrated.
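The abstract above mentions converting ASCII point cloud records into an image whose pixel values encode altitude. The sketch below illustrates that idea in Python, not the authors' IDL. It rests on several assumptions: each record is a whitespace-separated `x y z` triple with no header line, cells are square, and the highest return in a cell becomes its pixel value. The function name and sample data are hypothetical.

```python
import math


def rasterize_points(text, cell_size=1.0):
    """Bin ASCII 'x y z' records into a grid of maximum altitude per cell.

    Row 0 corresponds to the smallest y value (no vertical flip).
    Cells with no LiDAR return are left as None.
    """
    points = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete records
        x, y, z = (float(f) for f in fields[:3])
        points.append((x, y, z))
    if not points:
        return []

    min_x = min(p[0] for p in points)
    min_y = min(p[1] for p in points)
    max_x = max(p[0] for p in points)
    max_y = max(p[1] for p in points)
    cols = int(math.floor((max_x - min_x) / cell_size)) + 1
    rows = int(math.floor((max_y - min_y) / cell_size)) + 1

    grid = [[None] * cols for _ in range(rows)]
    for x, y, z in points:
        col = int((x - min_x) / cell_size)
        row = int((y - min_y) / cell_size)
        if grid[row][col] is None or z > grid[row][col]:
            grid[row][col] = z  # keep the highest return in this cell
    return grid


# Made-up sample records: x, y, altitude.
sample = """0.0 0.0 10.0
0.4 0.2 12.5
1.2 0.1 3.0
0.3 1.7 8.0
"""
image = rasterize_points(sample, cell_size=1.0)
for row in image:
    print(row)
```

Keeping the maximum per cell is one design choice among several. A mean or a last-return rule would suit other purposes, and the abstract does not say which rule the original system applies.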
ERIC Educational Resources Information Center
de Villiers, Michael
2011-01-01
Symmetry has been found in the visual arts, architecture and the design of artefacts since the earliest times. Many natural objects, both organic and inorganic, display symmetry: from microscopic crystals and sub-atomic particles to macro-cosmic galaxies. Today it features strongly in higher mathematics such as Linear and Abstract Algebra, Projective and…
Visual memory performance for color depends on spatiotemporal context.
Olivers, Christian N L; Schreij, Daniel
2014-10-01
Performance on visual short-term memory for features has been known to depend on stimulus complexity, spatial layout, and feature context. However, with few exceptions, memory capacity has been measured for abruptly appearing, single-instance displays. In everyday life, objects often have a spatiotemporal history as they or the observer move around. In three experiments, we investigated the effect of spatiotemporal history on explicit memory for color. Observers saw a memory display emerge from behind a wall, after which it disappeared again. The test display then emerged from either the same side as the memory display or the opposite side. In the first two experiments, memory improved for intermediate set sizes when the test display emerged in the same way as the memory display. A third experiment then showed that the benefit was tied to the original motion trajectory and not to the display object per se. The results indicate that memory for color is embedded in a richer episodic context that includes the spatiotemporal history of the display.