Newborn chickens generate invariant object representations at the onset of visual object experience
Wood, Justin N.
2013-01-01
To recognize objects quickly and accurately, mature visual systems build invariant object representations that generalize across a range of novel viewing conditions (e.g., changes in viewpoint). To date, however, the origins of this core cognitive ability have not yet been established. To examine how invariant object recognition develops in a newborn visual system, I raised chickens from birth for 2 weeks within controlled-rearing chambers. These chambers provided complete control over all visual object experiences. In the first week of life, subjects’ visual object experience was limited to a single virtual object rotating through a 60° viewpoint range. In the second week of life, I examined whether subjects could recognize that virtual object from novel viewpoints. Newborn chickens were able to generate viewpoint-invariant representations that supported object recognition across large, novel, and complex changes in the object’s appearance. Thus, newborn visual systems can begin building invariant object representations at the onset of visual object experience. These abstract representations can be generated from sparse data, in this case from a visual world containing a single virtual object seen from a limited range of viewpoints. This study shows that powerful, robust, and invariant object recognition machinery is an inherent feature of the newborn brain. PMID:23918372
The development of newborn object recognition in fast and slow visual worlds
Wood, Justin N.; Wood, Samantha M. W.
2016-01-01
Object recognition is central to perception and cognition. Yet relatively little is known about the environmental factors that cause invariant object recognition to emerge in the newborn brain. Is this ability a hardwired property of vision? Or does the development of invariant object recognition require experience with a particular kind of visual environment? Here, we used a high-throughput controlled-rearing method to examine whether newborn chicks (Gallus gallus) require visual experience with slowly changing objects to develop invariant object recognition abilities. When newborn chicks were raised with a slowly rotating virtual object, the chicks built invariant object representations that generalized across novel viewpoints and rotation speeds. In contrast, when newborn chicks were raised with a virtual object that rotated more quickly, the chicks built viewpoint-specific object representations that failed to generalize to novel viewpoints and rotation speeds. Moreover, there was a direct relationship between the speed of the object and the amount of invariance in the chick's object representation. Thus, visual experience with slowly changing objects plays a critical role in the development of invariant object recognition. These results indicate that invariant object recognition is not a hardwired property of vision, but is learned rapidly when newborns encounter a slowly changing visual world. PMID:27097925
Invariant visual object recognition: a model, with lighting invariance.
Rolls, Edmund T; Stringer, Simon M
2006-01-01
How are invariant representations of objects formed in the visual cortex? We describe a neurophysiological and computational approach which focusses on a feature hierarchy model in which invariant representations can be built by self-organizing learning based on the statistics of the visual input. The model can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or it can use spatial continuity in Continuous Transformation learning. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size, and, as we show in this paper, also lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects such as looming, rotation, and object-based movement. The model has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in, for example, spatial and object search tasks. The model has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene.
Invariant Visual Object and Face Recognition: Neural and Computational Bases, and a Model, VisNet
Rolls, Edmund T.
2012-01-01
Neurophysiological evidence for invariant representations of objects and faces in the primate inferior temporal visual cortex is described. Then a computational approach to how invariant representations are formed in the brain is described that builds on the neurophysiology. A feature hierarchy model in which invariant representations can be built by self-organizing learning based on the temporal and spatial statistics of the visual input produced by objects as they transform in the world is described. VisNet can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or it can use spatial continuity in continuous spatial transformation learning which does not require a temporal trace. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size, and also lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects such as looming, rotation, and object-based movement. The model has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in, for example, spatial and object search tasks. The approach has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene. The approach has also been extended to provide, with an additional layer, for the development of representations of spatial scenes of the type found in the hippocampus. PMID:22723777
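The trace rule at the heart of VisNet is compact enough to sketch in code. The following minimal NumPy illustration assumes rate-coded activity; the layer sizes, constants, and normalization step are illustrative stand-ins, not VisNet's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pre, n_post = 100, 10
W = rng.normal(scale=0.1, size=(n_post, n_pre))   # feedforward weights
eta, alpha = 0.8, 0.05                            # trace persistence, learning rate
trace = np.zeros(n_post)

# successive views of one object: correlated inputs arriving close in time
base = rng.random(n_pre)
views = base + 0.1 * rng.normal(size=(20, n_pre))

for x in views:
    y = np.maximum(W @ x, 0.0)                     # rectified postsynaptic rates
    trace = eta * trace + (1 - eta) * y            # short-term memory trace
    W += alpha * np.outer(trace, x)                # Hebbian update against the trace
    W /= np.linalg.norm(W, axis=1, keepdims=True)  # normalization, standing in
                                                   # for competitive learning
```

Because the trace carries activity from earlier views into each update, the same output cells are strengthened across the whole sequence, which is how temporal continuity is converted into transform tolerance.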
Contour Curvature As an Invariant Code for Objects in Visual Area V4
Pasupathy, Anitha
2016-01-01
Size-invariant object recognition—the ability to recognize objects across transformations of scale—is a fundamental feature of biological and artificial vision. To investigate its basis in the primate cerebral cortex, we measured single neuron responses to stimuli of varying size in visual area V4, a cornerstone of the object-processing pathway, in rhesus monkeys (Macaca mulatta). Leveraging two competing models for how neuronal selectivity for the bounding contours of objects may depend on stimulus size, we show that most V4 neurons (∼70%) encode objects in a size-invariant manner, consistent with selectivity for a size-independent parameter of boundary form: for these neurons, “normalized” curvature, rather than “absolute” curvature, provided a better account of responses. Our results demonstrate the suitability of contour curvature as a basis for size-invariant object representation in the visual cortex, and posit V4 as a foundation for behaviorally relevant object codes. SIGNIFICANCE STATEMENT Size-invariant object recognition is a bedrock for many perceptual and cognitive functions. Despite growing neurophysiological evidence for invariant object representations in the primate cortex, we still lack a basic understanding of the encoding rules that govern them. Classic work in the field of visual shape theory has long postulated that a representation of objects based on information about their bounding contours is well suited to mediate such an invariant code. In this study, we provide the first empirical support for this hypothesis, and its instantiation in single neurons of visual area V4. PMID:27194333
A Balanced Comparison of Object Invariances in Monkey IT Neurons.
Ratan Murty, N Apurva; Arun, Sripati P
2017-01-01
Our ability to recognize objects across variations in size, position, or rotation is based on invariant object representations in higher visual cortex. However, we know little about how these invariances are related. Are some invariances harder than others? Do some invariances arise faster than others? These comparisons can be made only upon equating image changes across transformations. Here, we targeted invariant neural representations in the monkey inferotemporal (IT) cortex using object images with balanced changes in size, position, and rotation. Across the recorded population, IT neurons generalized across size and position both stronger and faster than to rotations in the image plane as well as in depth. We obtained a similar ordering of invariances in deep neural networks but not in low-level visual representations. Thus, invariant neural representations dynamically evolve in a temporal order reflective of their underlying computational complexity. PMID:28413827
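The balanced-comparison logic of this study can be made concrete with a toy decoding analysis: train a linear readout on population responses under one transformation condition and test it under another, taking cross-condition accuracy as an invariance index. The data below are simulated stand-ins, not the recorded IT responses.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)
proto = rng.normal(size=(2, 50))          # mean population response per object

def responses(obj, noise):
    """Simulated trials: object prototype plus condition-dependent variability."""
    return proto[obj] + noise * rng.normal(size=(100, 50))

# train on one transformation condition, test on a harder, novel one
X_tr = np.vstack([responses(0, 0.5), responses(1, 0.5)])
X_te = np.vstack([responses(0, 1.5), responses(1, 1.5)])
y = np.repeat([0, 1], 100)

clf = LinearSVC(max_iter=10_000).fit(X_tr, y)
invariance_index = clf.score(X_te, y)     # generalization across the transform
```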
A rodent model for the study of invariant visual object recognition
Zoccolan, Davide; Oertelt, Nadja; DiCarlo, James J.; Cox, David D.
2009-01-01
The human visual system is able to recognize objects despite tremendous variation in their appearance on the retina resulting from variation in view, size, lighting, etc. This ability—known as “invariant” object recognition—is central to visual perception, yet its computational underpinnings are poorly understood. Traditionally, nonhuman primates have been the animal model-of-choice for investigating the neuronal substrates of invariant recognition, because their visual systems closely mirror our own. Meanwhile, simpler and more accessible animal models such as rodents have been largely overlooked as possible models of higher-level visual functions, because their brains are often assumed to lack advanced visual processing machinery. As a result, little is known about rodents' ability to process complex visual stimuli in the face of real-world image variation. In the present work, we show that rats possess more advanced visual abilities than previously appreciated. Specifically, we trained pigmented rats to perform a visual task that required them to recognize objects despite substantial variation in their appearance, due to changes in size, view, and lighting. Critically, rats were able to spontaneously generalize to previously unseen transformations of learned objects. These results provide the first systematic evidence for invariant object recognition in rats and argue for an increased focus on rodents as models for studying high-level visual processing. PMID:19429704
Austerweil, Joseph L.; Griffiths, Thomas L.; Palmer, Stephen E.
2017-01-01
How does the visual system recognize images of a novel object after a single observation despite possible variations in the viewpoint of that object relative to the observer? One possibility is comparing the image with a prototype for invariance over a relevant transformation set (e.g., translations and dilations). However, invariance over…
Rolls, Edmund T; Mills, W Patrick C
2018-05-01
When objects transform into different views, some properties are maintained, such as whether the edges are convex or concave, and these non-accidental properties are likely to be important in view-invariant object recognition. The metric properties, such as the degree of curvature, may change with different views, and are less likely to be useful in object recognition. It is shown that in a model of invariant visual object recognition in the ventral visual stream, VisNet, non-accidental properties are encoded much more than metric properties by neurons. Moreover, it is shown how, with temporal trace rule training in VisNet, non-accidental properties of objects become encoded by neurons, and how metric properties are treated invariantly. We also show how VisNet can generalize between different objects if they have the same non-accidental property, because the metric properties are likely to overlap. VisNet is a 4-layer unsupervised model of visual object recognition trained by competitive learning that utilizes a temporal trace learning rule to implement the learning of invariance using views that occur close together in time. A second crucial property of this model of object recognition is whether, when neurons in the level corresponding to the inferior temporal visual cortex respond selectively to objects, neurons in the intermediate layers can respond to combinations of features that may be parts of two or more objects. In an investigation using the four sides of a square presented in every possible combination, it was shown that even though different layer 4 neurons are tuned to encode each feature or feature combination orthogonally, neurons in the intermediate layers can respond to features or feature combinations present in several objects. This property is an important part of the way in which high capacity can be achieved in the four-layer ventral visual cortical pathway. These findings concerning non-accidental properties and the use of neurons in intermediate layers of the hierarchy help to emphasise fundamental underlying principles of the computations that may be implemented in the ventral cortical visual stream used in object recognition.
Sountsov, Pavel; Santucci, David M.; Lisman, John E.
2011-01-01
Visual object recognition occurs easily despite differences in position, size, and rotation of the object, but the neural mechanisms responsible for this invariance are not known. We have found a set of transforms that achieve invariance in a neurally plausible way. We find that a transform based on local spatial frequency analysis of oriented segments and on logarithmic mapping, when applied twice in an iterative fashion, produces an output image that is unique to the object and that remains constant as the input image is shifted, scaled, or rotated. PMID:22125522
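The double-transform construction described here is in the same family as the classical log-polar/Fourier recipe sketched below: centering removes shift, log-polar resampling converts scaling and rotation into translations, and the Fourier magnitude discards those translations. This is the generic Fourier-Mellin-style analogue, not the authors' exact oriented-segment spatial-frequency transform.

```python
import numpy as np

def log_polar_descriptor(img, n_r=64, n_theta=64):
    """Approximately shift-, scale-, and rotation-invariant descriptor."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    m = img.sum()
    cy, cx = (ys * img).sum() / m, (xs * img).sum() / m   # centroid removes shift
    r_max = np.hypot(max(cy, h - cy), max(cx, w - cx))
    rs = np.exp(np.linspace(0.0, np.log(r_max), n_r))     # log-spaced radii
    thetas = np.linspace(0.0, 2 * np.pi, n_theta, endpoint=False)
    yy = np.clip((cy + rs[:, None] * np.sin(thetas)).astype(int), 0, h - 1)
    xx = np.clip((cx + rs[:, None] * np.cos(thetas)).astype(int), 0, w - 1)
    lp = img[yy, xx]                      # scaling/rotation are now translations
    return np.abs(np.fft.fft2(lp))        # FFT magnitude discards translations

img = np.zeros((128, 128)); img[40:80, 50:90] = 1.0
d = log_polar_descriptor(img)             # compare descriptors with, e.g., np.linalg.norm
```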
Rosselli, Federica B.; Alemi, Alireza; Ansuini, Alessio; Zoccolan, Davide
2015-01-01
In recent years, a number of studies have explored the possible use of rats as models of high-level visual functions. One central question at the root of such an investigation is to understand whether rat object vision relies on the processing of visual shape features or, rather, on lower-order image properties (e.g., overall brightness). In a recent study, we have shown that rats are capable of extracting multiple features of an object that are diagnostic of its identity, at least when those features are, structure-wise, distinct enough to be parsed by the rat visual system. In the present study, we have assessed the impact of object structure on rat perceptual strategy. We trained rats to discriminate between two structurally similar objects, and compared their recognition strategies with those reported in our previous study. We found that, under conditions of lower stimulus discriminability, rat visual discrimination strategy becomes more view-dependent and subject-dependent. Rats were still able to recognize the target objects, in a way that was largely tolerant (i.e., invariant) to object transformation; however, the larger structural and pixel-wise similarity affected the way objects were processed. Compared to the findings of our previous study, the patterns of diagnostic features were: (i) smaller and more scattered; (ii) only partially preserved across object views; and (iii) only partially reproducible across rats. On the other hand, rats were still found to adopt a multi-featural processing strategy and to make use of part of the optimal discriminatory information afforded by the two objects. Our findings suggest that, as in humans, rat invariant recognition can flexibly rely on either view-invariant representations of distinctive object features or view-specific object representations, acquired through learning. PMID:25814936
Learning and disrupting invariance in visual recognition with a temporal association rule
Isik, Leyla; Leibo, Joel Z.; Poggio, Tomaso
2012-01-01
Learning by temporal association rules such as Foldiak's trace rule is an attractive hypothesis that explains the development of invariance in visual recognition. Consistent with these rules, several recent experiments have shown that invariance can be broken at both the psychophysical and single cell levels. We show (1) that temporal association learning provides appropriate invariance in models of object recognition inspired by the visual cortex, (2) that we can replicate the “invariance disruption” experiments using these models with a temporal association learning rule to develop and maintain invariance, and (3) that despite dramatic single cell effects, a population of cells is very robust to these disruptions. We argue that these models account for the stability of perceptual invariance despite the underlying plasticity of the system, the variability of the visual world and expected noise in the biological mechanisms. PMID:22754523
Spoerer, Courtney J; Eguchi, Akihiro; Stringer, Simon M
2016-02-01
In order to develop transformation invariant representations of objects, the visual system must make use of constraints placed upon object transformation by the environment. For example, objects transform continuously from one point to another in both space and time. These two constraints have been exploited separately in order to develop translation and view invariance in a hierarchical multilayer model of the primate ventral visual pathway in the form of continuous transformation learning and temporal trace learning. We show for the first time that these two learning rules can work cooperatively in the model. Using these two learning rules together can support the development of invariance in cells and help maintain object selectivity when stimuli are presented over a large number of locations or when trained separately over a large number of viewing angles.
Wood, Justin N.; Wood, Samantha M. W.
2018-01-01
How do newborns learn to recognize objects? According to temporal learning models in computational neuroscience, the brain constructs object representations by extracting smoothly changing features from the environment. To date, however, it is unknown whether newborns depend on smoothly changing features to build invariant object representations.…
Feedforward object-vision models only tolerate small image variations compared to human
Ghodrati, Masoud; Farzmahdi, Amirhossein; Rajaei, Karim; Ebrahimpour, Reza; Khaligh-Razavi, Seyed-Mahdi
2014-01-01
Invariant object recognition is a remarkable ability of the primate visual system whose underlying mechanisms have been under intense investigation. Computational modeling is a valuable tool toward understanding the processes involved in invariant object recognition. Although recent computational models have shown outstanding performances on challenging image databases, they fail to perform well in image categorization under more complex image variations. Studies have shown that making sparse representation of objects by extracting more informative visual features through a feedforward sweep can lead to higher recognition performances. Here, however, we show that when the complexity of image variations is high, even this approach results in poor performance compared to humans. To assess the performance of models and humans in invariant object recognition tasks, we built a parametrically controlled image database consisting of several object categories varied in different dimensions and levels, rendered from 3D planes. Comparing the performance of several object recognition models with human observers shows that the models perform similarly to humans in categorization tasks only under low-level image variations. Furthermore, the results of our behavioral experiments demonstrate that, even under difficult experimental conditions (i.e., briefly presented masked stimuli with complex image variations), human observers performed outstandingly well, suggesting that the models are still far from resembling humans in invariant object recognition. Taken together, we suggest that learning sparse informative visual features, although desirable, is not a complete solution for future progress in object-vision modeling. We show that this approach is not of significant help in solving the computational crux of object recognition (i.e., invariant object recognition) when the identity-preserving image variations become more complex. PMID:25100986
Invariant visual object recognition and shape processing in rats
Zoccolan, Davide
2015-01-01
Invariant visual object recognition is the ability to recognize visual objects despite the vastly different images that each object can project onto the retina during natural vision, depending on its position and size within the visual field, its orientation relative to the viewer, etc. Achieving invariant recognition represents such a formidable computational challenge that it is often assumed to be a unique hallmark of primate vision. Historically, this has limited the invasive investigation of its neuronal underpinnings to monkey studies, in spite of the narrow range of experimental approaches that these animal models allow. Meanwhile, rodents have been largely neglected as models of object vision, because of the widespread belief that they are incapable of advanced visual processing. However, the powerful array of experimental tools that have been developed to dissect neuronal circuits in rodents has made these species very attractive to vision scientists too, promoting a new tide of studies that have started to systematically explore visual functions in rats and mice. Rats, in particular, have been the subjects of several behavioral studies, aimed at assessing how advanced object recognition and shape processing is in this species. Here, I review these recent investigations, as well as earlier studies of rat pattern vision, to provide an historical overview and a critical summary of the status of the knowledge about rat object vision. The picture emerging from this survey is very encouraging with regard to the possibility of using rats as complementary models to monkeys in the study of higher-level vision. PMID:25561421
Invariant recognition drives neural representations of action sequences
Poggio, Tomaso
2017-01-01
Recognizing the actions of others from visual stimuli is a crucial aspect of human perception that allows individuals to respond to social cues. Humans are able to discriminate between similar actions despite transformations, like changes in viewpoint or actor, that substantially alter the visual appearance of a scene. This ability to generalize across complex transformations is a hallmark of human visual intelligence. Advances in understanding action recognition at the neural level have not always translated into precise accounts of the computational principles underlying what representations of action sequences are constructed by human visual cortex. Here we test the hypothesis that invariant action discrimination might fill this gap. Recently, the study of artificial systems for static object perception has produced models, Convolutional Neural Networks (CNNs), that achieve human level performance in complex discriminative tasks. Within this class, architectures that better support invariant object recognition also produce image representations that better match those implied by human and primate neural data. However, whether these models produce representations of action sequences that support recognition across complex transformations and closely follow neural representations of actions remains unknown. Here we show that spatiotemporal CNNs accurately categorize video stimuli into action classes, and that deliberate model modifications that improve performance on an invariant action recognition task lead to data representations that better match human neural recordings. Our results support our hypothesis that performance on invariant discrimination dictates the neural representations of actions computed in the brain. These results broaden the scope of the invariant recognition framework for understanding visual intelligence from perception of inanimate objects and faces in static images to the study of human perception of action sequences. PMID:29253864
Learning invariance from natural images inspired by observations in the primary visual cortex.
Teichmann, Michael; Wiltschut, Jan; Hamker, Fred
2012-05-01
The human visual system has the remarkable ability to largely recognize objects invariant of their position, rotation, and scale. A good interpretation of neurobiological findings involves a computational model that simulates signal processing of the visual cortex. In part, this is likely achieved step by step from early to late areas of visual perception. While several algorithms have been proposed for learning feature detectors, only a few studies cover the issue of biologically plausible learning of such invariance. In this study, a set of Hebbian learning rules based on calcium dynamics and homeostatic regulation of single neurons is proposed. Their performance is verified within a simple model of the primary visual cortex to learn so-called complex cells, based on a sequence of static images. As a result, the learned complex-cell responses are largely invariant to phase and position.
Metric invariance in object recognition: a review and further evidence.
Cooper, E E; Biederman, I; Hummel, J E
1992-06-01
Phenomenologically, human shape recognition appears to be invariant with changes of orientation in depth (up to parts occlusion), position in the visual field, and size. Recent versions of template theories (e.g., Ullman, 1989; Lowe, 1987) assume that these invariances are achieved through the application of transformations such as rotation, translation, and scaling of the image so that it can be matched metrically to a stored template. Presumably, such transformations would require time for their execution. We describe recent priming experiments in which the effects of a prior brief presentation of an image on its subsequent recognition are assessed. The results of these experiments indicate that the invariance is complete: The magnitude of visual priming (as distinct from name or basic level concept priming) is not affected by a change in position, size, orientation in depth, or the particular lines and vertices present in the image, as long as representations of the same components can be activated. An implemented seven layer neural network model (Hummel & Biederman, 1992) that captures these fundamental properties of human object recognition is described. Given a line drawing of an object, the model activates a viewpoint-invariant structural description of the object, specifying its parts and their interrelations. Visual priming is interpreted as a change in the connection weights for the activation of: a) cells, termed geon feature assemblies (GFAs), that conjoin the output of units that represent invariant, independent properties of a single geon and its relations (such as its type, aspect ratio, relations to other geons), or b) a change in the connection weights by which several GFAs activate a cell representing an object.
Behavioral model of visual perception and recognition
Rybak, Ilya A.; Golovan, Alexander V.; Gusakova, Valentina I.
1993-09-01
In the processes of visual perception and recognition human eyes actively select essential information by way of successive fixations at the most informative points of the image. A behavioral program defining a scanpath of the image is formed at the stage of learning (object memorizing) and consists of sequential motor actions, which are shifts of attention from one point of fixation to another, and sensory signals expected to arrive in response to each shift of attention. In the modern view of the problem, invariant object recognition is provided by the following: (1) separated processing of 'what' (object features) and 'where' (spatial features) information at high levels of the visual system; (2) mechanisms of visual attention using 'where' information; (3) representation of 'what' information in an object-based frame of reference (OFR). However, most recent models of vision based on OFR have demonstrated the ability of invariant recognition of only simple objects like letters or binary objects without background, i.e. objects to which a frame of reference is easily attached. In contrast, we use not OFR, but a feature-based frame of reference (FFR), connected with the basic feature (edge) at the fixation point. This provides our model with the ability to represent complex objects invariantly in gray-level images, but demands realization of the behavioral aspects of vision described above. The developed model contains a neural network subsystem of low-level vision which extracts a set of primary features (edges) in each fixation, and a high-level subsystem consisting of 'what' (Sensory Memory) and 'where' (Motor Memory) modules. The resolution of primary feature extraction decreases with distance from the point of fixation. FFR provides both the invariant representation of object features in Sensory Memory and shifts of attention in Motor Memory. Object recognition consists of successive recall (from Motor Memory) and execution of shifts of attention and successive verification of the expected sets of features (stored in Sensory Memory). The model shows the ability of recognition of complex objects (such as faces) in gray-level images invariant with respect to shift, rotation, and scale.
Computation of pattern invariance in brain-like structures.
Ullman, S; Soloviev, S
1999-10-01
A fundamental capacity of the perceptual systems and the brain in general is to deal with the novel and the unexpected. In vision, we can effortlessly recognize a familiar object under novel viewing conditions, or recognize a new object as a member of a familiar class, such as a house, a face, or a car. This ability to generalize and deal efficiently with novel stimuli has long been considered a challenging example of brain-like computation that proved extremely difficult to replicate in artificial systems. In this paper we present an approach to generalization and invariant recognition. We focus our discussion on the problem of invariance to position in the visual field, but also sketch how similar principles could apply to other domains. The approach is based on the use of a large repertoire of partial generalizations that are built upon past experience. In the case of shift invariance, visual patterns are described as the conjunction of multiple overlapping image fragments. The invariance to the more primitive fragments is built into the system by past experience. Shift invariance of complex shapes is obtained from the invariance of their constituent fragments. We study by simulation aspects of this shift-invariance method and then consider its extensions to invariant perception and classification by brain-like structures.
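A minimal sketch of the fragment-conjunction idea: each fragment detector is made shift-invariant by taking its best template match over all positions, and a complex shape is recognized as the conjunction of its fragments, inheriting their invariance. The plain correlation score and threshold are illustrative simplifications for binary patterns.

```python
import numpy as np
from scipy.signal import correlate2d

def fragment_present(image, fragment, threshold=0.9):
    """Shift-invariant fragment detector: best template match over positions."""
    score = correlate2d(image, fragment, mode="valid")
    return score.max() >= threshold * fragment.sum()   # adequate for binary patterns

def recognize(image, fragments):
    """A pattern is recognized when all of its constituent fragments are found
    somewhere in the image: shift invariance of the whole shape is inherited
    from the built-in invariance of the parts."""
    return all(fragment_present(image, f) for f in fragments)

# toy example: an L-shape described by two overlapping bar fragments
img = np.zeros((20, 20)); img[5:15, 5] = 1; img[14, 5:12] = 1
bars = [np.ones((10, 1)), np.ones((1, 7))]
assert recognize(img, bars)
```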
Sight and sound converge to form modality-invariant representations in temporo-parietal cortex
Man, Kingson; Kaplan, Jonas T.; Damasio, Antonio; Meyer, Kaspar
2013-01-01
People can identify objects in the environment with remarkable accuracy, irrespective of the sensory modality they use to perceive them. This suggests that information from different sensory channels converges somewhere in the brain to form modality-invariant representations, i.e., representations that reflect an object independently of the modality through which it has been apprehended. In this functional magnetic resonance imaging study of human subjects, we first identified brain areas that responded to both visual and auditory stimuli and then used crossmodal multivariate pattern analysis to evaluate the neural representations in these regions for content-specificity (i.e., do different objects evoke different representations?) and modality-invariance (i.e., do the sight and the sound of the same object evoke a similar representation?). While several areas became activated in response to both auditory and visual stimulation, only the neural patterns recorded in a region around the posterior part of the superior temporal sulcus displayed both content-specificity and modality-invariance. This region thus appears to play an important role in our ability to recognize objects in our surroundings through multiple sensory channels and to process them at a supra-modal (i.e., conceptual) level. PMID:23175818
The Invariance Hypothesis Implies Domain-Specific Regions in Visual Cortex
Leibo, Joel Z.; Liao, Qianli; Anselmi, Fabio; Poggio, Tomaso
2015-01-01
Is visual cortex made up of general-purpose information processing machinery, or does it consist of a collection of specialized modules? If prior knowledge, acquired from learning a set of objects, is only transferable to new objects that share properties with the old, then the recognition system’s optimal organization must be one containing specialized modules for different object classes. Our analysis starts from a premise we call the invariance hypothesis: that the computational goal of the ventral stream is to compute an invariant-to-transformations and discriminative signature for recognition. The key condition enabling approximate transfer of invariance without sacrificing discriminability turns out to be that the learned and novel objects transform similarly. This implies that the optimal recognition system must contain subsystems trained only with data from similarly-transforming objects and suggests a novel interpretation of domain-specific regions like the fusiform face area (FFA). Furthermore, we can define an index of transformation-compatibility, computable from videos, that can be combined with information about the statistics of natural vision to yield predictions for which object categories ought to have domain-specific regions in agreement with the available data. The result is a unifying account linking the large literature on view-based recognition with the wealth of experimental evidence concerning domain-specific regions. PMID:26496457
Learning viewpoint invariant object representations using a temporal coherence principle.
Einhäuser, Wolfgang; Hipp, Jörg; Eggert, Julian; Körner, Edgar; König, Peter
2005-07-01
Invariant object recognition is arguably one of the major challenges for contemporary machine vision systems. In contrast, the mammalian visual system performs this task virtually effortlessly. How can we exploit our knowledge of the biological system to improve artificial systems? Our understanding of the mammalian early visual system has been augmented by the discovery that general coding principles could explain many aspects of neuronal response properties. How can such schemes be transferred to system-level performance? In the present study we train cells on a particular variant of the general principle of temporal coherence, the "stability" objective. These cells are trained on unlabeled real-world images without a teaching signal. We show that after training, the cells form a representation that is largely independent of the viewpoint from which the stimulus is viewed. This finding includes generalization to previously unseen viewpoints. The achieved representation is better suited for viewpoint-invariant object classification than the cells' input patterns. This ability to facilitate viewpoint-invariant classification is maintained even if training and classification take place in the presence of a distractor object, which is also unlabeled. In summary, we show here that unsupervised learning using a general coding principle facilitates the classification of real-world objects that are not segmented from the background and undergo complex, non-isomorphic transformations.
Cao, Yongqiang; Grossberg, Stephen; Markowitz, Jeffrey
2011-12-01
All primates depend for their survival on being able to rapidly learn about and recognize objects. Objects may be visually detected at multiple positions, sizes, and viewpoints. How does the brain rapidly learn and recognize objects while scanning a scene with eye movements, without causing a combinatorial explosion in the number of cells that are needed? How does the brain avoid the problem of erroneously classifying parts of different objects together at the same or different positions in a visual scene? In monkeys and humans, a key area for such invariant object category learning and recognition is the inferotemporal cortex (IT). A neural model is proposed to explain how spatial and object attention coordinate the ability of IT to learn invariant category representations of objects that are seen at multiple positions, sizes, and viewpoints. The model clarifies how interactions within a hierarchy of processing stages in the visual brain accomplish this. These stages include the retina, lateral geniculate nucleus, and cortical areas V1, V2, V4, and IT in the brain's What cortical stream, as they interact with spatial attention processes within the parietal cortex of the Where cortical stream. The model builds upon the ARTSCAN model, which proposed how view-invariant object representations are generated. The positional ARTSCAN (pARTSCAN) model proposes how the following additional processes in the What cortical processing stream also enable position-invariant object representations to be learned: IT cells with persistent activity, and a combination of normalizing object category competition and a view-to-object learning law which together ensure that unambiguous views have a larger effect on object recognition than ambiguous views. The model explains how such invariant learning can be fooled when monkeys, or other primates, are presented with an object that is swapped with another object during eye movements to foveate the original object. The swapping procedure is predicted to prevent the reset of spatial attention, which would otherwise keep the representations of multiple objects from being combined by learning. Li and DiCarlo (2008) have presented neurophysiological data from monkeys showing how unsupervised natural experience in a target swapping experiment can rapidly alter object representations in IT. The model quantitatively simulates the swapping data by showing how the swapping procedure fools the spatial attention mechanism. More generally, the model provides a unifying framework, and testable predictions in both monkeys and humans, for understanding object learning data using neurophysiological methods in monkeys, and spatial attention, episodic learning, and memory retrieval data using functional imaging methods in humans.
Invariance of visual operations at the level of receptive fields
Lindeberg, Tony
2013-01-01
The brain is able to maintain a stable perception although the visual stimuli vary substantially on the retina due to geometric transformations and lighting variations in the environment. This paper presents a theory for achieving basic invariance properties already at the level of receptive fields. Specifically, the presented framework comprises (i) local scaling transformations caused by objects of different size and at different distances to the observer, (ii) locally linearized image deformations caused by variations in the viewing direction in relation to the object, (iii) locally linearized relative motions between the object and the observer and (iv) local multiplicative intensity transformations caused by illumination variations. The receptive field model can be derived by necessity from symmetry properties of the environment and leads to predictions about receptive field profiles in good agreement with receptive field profiles measured by cell recordings in mammalian vision. Indeed, the receptive field profiles in the retina, LGN and V1 are close to the ideal profiles motivated by the idealized requirements. By complementing receptive field measurements with selection mechanisms over the parameters in the receptive field families, it is shown how true invariance of receptive field responses can be obtained under scaling transformations, affine transformations and Galilean transformations. Thereby, the framework provides a mathematically well-founded and biologically plausible model for how basic invariance properties can be achieved already at the level of receptive fields and support invariant recognition of objects and events under variations in viewpoint, retinal size, object motion and illumination. The theory can explain the different shapes of receptive field profiles found in biological vision, which are tuned to different sizes and orientations in the image domain as well as to different image velocities in space-time, from a requirement that the visual system should be invariant to the natural types of image transformations that occur in its environment. PMID:23894283
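One ingredient of this framework, invariance of receptive-field responses under scaling, can be illustrated with classical scale selection over a family of Laplacian-of-Gaussian receptive fields; the winning scale covaries with object size. A sketch assuming a single dominant blob, using SciPy's gaussian_laplace as the filter.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def selected_scale(img, sigmas):
    """Scale estimate from the scale-normalized Laplacian: the winning sigma
    covaries with object size, so responses can be matched across scalings."""
    responses = [(s ** 2) * np.abs(gaussian_laplace(img, s)).max() for s in sigmas]
    return sigmas[int(np.argmax(responses))]

def blob(size, radius):
    """Binary disc stimulus centered in a size x size image."""
    y, x = np.mgrid[:size, :size] - size // 2
    return (x ** 2 + y ** 2 < radius ** 2).astype(float)

sigmas = np.arange(1.0, 20.0, 0.5)
s1 = selected_scale(blob(128, 8), sigmas)
s2 = selected_scale(blob(128, 16), sigmas)   # expect s2 close to 2 * s1
```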
Learning viewpoint invariant perceptual representations from cluttered images.
Spratling, Michael W
2005-05-01
In order to perform object recognition, it is necessary to form perceptual representations that are sufficiently specific to distinguish between objects, but that are also sufficiently flexible to generalize across changes in location, rotation, and scale. A standard method for learning perceptual representations that are invariant to viewpoint is to form temporal associations across image sequences showing object transformations. However, this method requires that individual stimuli be presented in isolation and is therefore unlikely to succeed in real-world applications where multiple objects can co-occur in the visual input. This paper proposes a simple modification to the learning method that can overcome this limitation and results in more robust learning of invariant representations.
Zellin, Martina; Conci, Markus; von Mühlenen, Adrian; Müller, Hermann J
2011-10-01
Visual search for a target object is facilitated when the object is repeatedly presented within an invariant context of surrounding items ("contextual cueing"; Chun & Jiang, Cognitive Psychology, 36, 28-71, 1998). The present study investigated whether such invariant contexts can cue more than one target location. In a series of three experiments, we showed that contextual cueing is significantly reduced when invariant contexts are paired with two rather than one possible target location, whereas no contextual cueing occurs with three distinct target locations. Closer data inspection revealed that one "dominant" target always exhibited substantially more contextual cueing than did the other, "minor" target(s), which caused negative contextual-cueing effects. However, minor targets could benefit from the invariant context when they were spatially close to the dominant target. In sum, our experiments suggest that contextual cueing can guide visual attention to a spatially limited region of the display, only enhancing the detection of targets presented inside that region.
Are face representations depth cue invariant?
Dehmoobadsharifabadi, Armita; Farivar, Reza
2016-06-01
The visual system can process three-dimensional depth cues defining surfaces of objects, but it is unclear whether such information contributes to complex object recognition, including face recognition. The processing of different depth cues involves both dorsal and ventral visual pathways. We investigated whether facial surfaces defined by individual depth cues resulted in meaningful face representations, that is, representations that maintain the relationships among the population of faces as defined in a multidimensional face space. We measured face identity aftereffects for facial surfaces defined by individual depth cues (Experiments 1 and 2) and tested whether the aftereffect transfers across depth cues (Experiments 3 and 4). Facial surfaces and their morphs to the average face were defined purely by one of shading, texture, motion, or binocular disparity. We obtained identification thresholds for matched (matched identity between adapting and test stimuli), non-matched (non-matched identity between adapting and test stimuli), and no-adaptation (showing only the test stimuli) conditions for each cue and across different depth cues. We found a robust face identity aftereffect in both experiments. Our results suggest that depth cues do contribute to forming meaningful face representations that are depth cue invariant. Depth cue invariance would require integration of information across different areas and different pathways for object recognition, and this in turn has important implications for cortical models of visual object recognition.
Evans, Benjamin D; Stringer, Simon M
2015-04-01
Learning to recognise objects and faces is an important and challenging problem tackled by the primate ventral visual system. One major difficulty lies in recognising an object despite profound differences in the retinal images it projects, due to changes in view, scale, position and other identity-preserving transformations. Several models of the ventral visual system have been successful in coping with these issues, but have typically been privileged by exposure to only one object at a time. In natural scenes, however, the challenges of object recognition are typically further compounded by the presence of several objects which should be perceived as distinct entities. In the present work, we explore one possible mechanism by which the visual system may overcome these two difficulties simultaneously, through segmenting unseen (artificial) stimuli using information about their category encoded in plastic lateral connections. We demonstrate that these experience-guided lateral interactions robustly organise input representations into perceptual cycles, allowing feed-forward connections trained with spike-timing-dependent plasticity to form independent, translation-invariant output representations. We present these simulations as a functional explanation for the role of plasticity in the lateral connectivity of visual cortex.
Orlov, Tanya; Zohary, Ehud
2018-01-17
We typically recognize visual objects using the spatial layout of their parts, which are present simultaneously on the retina. Therefore, shape extraction is based on integration of the relevant retinal information over space. The lateral occipital complex (LOC) can represent shape faithfully in such conditions. However, integration over time is sometimes required to determine object shape. To study shape extraction through temporal integration of successive partial shape views, we presented human participants (both men and women) with artificial shapes that moved behind a narrow vertical or horizontal slit. Only a tiny fraction of the shape was visible at any instant at the same retinal location. However, observers perceived a coherent whole shape instead of a jumbled pattern. Using fMRI and multivoxel pattern analysis, we searched for brain regions that encode temporally integrated shape identity. We further required that the representation of shape should be invariant to changes in the slit orientation. We show that slit-invariant shape information is most accurate in the LOC. Importantly, the slit-invariant shape representations matched the conventional whole-shape representations assessed during full-image runs. Moreover, when the same slit-dependent shape slivers were shuffled, thereby preventing their spatiotemporal integration, slit-invariant shape information was reduced dramatically. The slit-invariant representation of the various shapes also mirrored the structure of shape perceptual space as assessed by perceptual similarity judgment tests. Therefore, the LOC is likely to mediate temporal integration of slit-dependent shape views, generating a slit-invariant whole-shape percept. These findings provide strong evidence for a global encoding of shape in the LOC regardless of integration processes required to generate the shape percept. SIGNIFICANCE STATEMENT Visual objects are recognized through spatial integration of features available simultaneously on the retina. The lateral occipital complex (LOC) represents shape faithfully in such conditions even if the object is partially occluded. However, shape must sometimes be reconstructed over both space and time. Such is the case in anorthoscopic perception, when an object is moving behind a narrow slit. In this scenario, spatial information is limited at any moment so the whole-shape percept can only be inferred by integration of successive shape views over time. We find that LOC carries shape-specific information recovered using such temporal integration processes. The shape representation is invariant to slit orientation and is similar to that evoked by a fully viewed image. Existing models of object recognition lack such capabilities.
View-Based Models of 3D Object Recognition and Class-Specific Invariance
1994-04-01
[Abstract not recovered: only extraction fragments of the paper's reference list survive. One reconstructable equation defines a weighted view-matching distance, $\|x - t_a\|_W^2 = (x - t_a)^T W^T W (x - t_a)$ (Eq. 3), in a discussion of view-invariant features and recognition of geon-like components (see Edelman, 1991, and Biederman, 1987).]
Eickenberg, Michael; Rowekamp, Ryan J.; Kouh, Minjoon; Sharpee, Tatyana O.
2012-01-01
Our visual system is capable of recognizing complex objects even when their appearances change drastically under various viewing conditions. Especially in the higher cortical areas, the sensory neurons reflect such functional capacity in their selectivity for complex visual features and invariance to certain object transformations, such as image translation. Due to the strong nonlinearities necessary to achieve both the selectivity and invariance, characterizing and predicting the response patterns of these neurons represents a formidable computational challenge. A related problem is that such neurons are poorly driven by randomized inputs, such as white noise, and respond strongly only to stimuli with complex high-order correlations, such as natural stimuli. Here we describe a novel two-step optimization technique that can characterize both the shape selectivity and the range and coarseness of position invariance from neural responses to natural stimuli. One step in the optimization involves finding the template as the maximally informative dimension given the estimated spatial location where the response could have been triggered within each image. The estimates of the locations that triggered the response are subsequently updated in the next step. Under the assumption of a monotonic relationship between the firing rate and stimulus projections on the template at a given position, the most likely location is the one that has the largest projection on the estimate of the template. The algorithm shows quick convergence during optimization, and the estimation results are reliable even in the regime of small signal-to-noise ratios. When we apply the algorithm to responses of complex cells in the primary visual cortex (V1) to natural movies, we find that responses of the majority of cells were significantly better described by translation invariant models based on one template compared with position-specific models with several relevant features. PMID:22734487
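The alternating structure of the two-step optimization can be sketched as follows, with the information-theoretic template objective simplified to a spike-weighted average of the best-aligned patches. Shapes, the brute-force position search, and all names are illustrative stand-ins rather than the authors' estimator.

```python
import numpy as np

def fit_invariant_template(stimuli, spikes, patch=(8, 8), n_iter=5):
    """Alternate between (1) picking, per image, the location whose patch has
    the largest projection on the current template, and (2) re-estimating the
    template from the spike-weighted aligned patches."""
    h, w = patch
    template = np.random.default_rng(0).normal(size=(h, w))
    for _ in range(n_iter):
        aligned = []
        for img in stimuli:
            H, W = img.shape
            best_patch, best_proj = None, -np.inf
            for i in range(H - h + 1):            # step 1: most likely position
                for j in range(W - w + 1):
                    p = float(np.tensordot(img[i:i + h, j:j + w], template))
                    if p > best_proj:
                        best_proj, best_patch = p, img[i:i + h, j:j + w]
            aligned.append(best_patch)
        stacked = np.stack(aligned)               # step 2: template update
        template = np.tensordot(spikes, stacked, axes=1)
        template /= np.linalg.norm(template)
    return template

rng = np.random.default_rng(1)
stimuli = [rng.normal(size=(16, 16)) for _ in range(50)]   # stand-in "natural" frames
spikes = rng.poisson(2.0, size=50).astype(float)           # stand-in responses
template = fit_invariant_template(stimuli, spikes)
```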
A model for size- and rotation-invariant pattern processing in the visual system.
Reitboeck, H J; Altmann, J
1984-01-01
The mapping of retinal space onto the striate cortex of some mammals can be approximated by a log-polar function. It has been proposed that this mapping is of functional importance for scale- and rotation-invariant pattern recognition in the visual system. An exact log-polar transform converts centered scaling and rotation into translations. A subsequent translation-invariant transform, such as the absolute value of the Fourier transform, thus generates overall size- and rotation-invariance. In our model, the translation-invariance is realized via the R-transform. This transform can be executed by simple neural networks, and it does not require the complex computations of the Fourier transform, used in Mellin-transform size-invariance models. The logarithmic space distortion and differentiation in the first processing stage of the model is realized via "Mexican hat" filters whose diameter increases linearly with eccentricity, similar to the characteristics of the receptive fields of retinal ganglion cells. Except for some special cases, the model can explain object recognition independent of size, orientation and position. Some general problems of Mellin-type size-invariance models, which also apply to our model, are discussed.
Slow feature analysis: unsupervised learning of invariances.
Wiskott, Laurenz; Sejnowski, Terrence J
2002-04-01
Invariant features of temporally varying signals are useful for analysis and classification. Slow feature analysis (SFA) is a new method for learning invariant or slowly varying features from a vectorial input signal. It is based on a nonlinear expansion of the input signal and application of principal component analysis to this expanded signal and its time derivative. It is guaranteed to find the optimal solution within a family of functions directly and can learn to extract a large number of decorrelated features, which are ordered by their degree of invariance. SFA can be applied hierarchically to process high-dimensional input signals and extract complex features. SFA is applied first to complex cell tuning properties based on simple cell output, including disparity and motion. Then more complicated input-output functions are learned by repeated application of SFA. Finally, a hierarchical network of SFA modules is presented as a simple model of the visual system. The same unstructured network can learn translation, size, rotation, contrast, or, to a lesser degree, illumination invariance for one-dimensional objects, depending on only the training stimulus. Surprisingly, only a few training objects suffice to achieve good generalization to new objects. The generated representation is suitable for object recognition. Performance degrades if the network is trained to learn multiple invariances simultaneously.
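The linear core of SFA fits in a few lines: expand the signal nonlinearly, whiten it, and keep the directions along which the time derivative varies least. A minimal sketch in which small eigenvalues of the whitening step are truncated for stability; the toy data and dimensions are illustrative.

```python
import numpy as np

def quadratic_expand(u):
    """All linear and quadratic monomials u_i, u_i*u_j (i <= j) of the input."""
    i, j = np.triu_indices(u.shape[1])
    return np.hstack([u, u[:, i] * u[:, j]])

def sfa(x, n_out=1):
    """Slow feature analysis: whiten x, then keep the directions in which the
    discrete time derivative varies least (the slowest decorrelated features)."""
    x = x - x.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(x, rowvar=False))
    keep = evals > 1e-9 * evals.max()                  # drop degenerate directions
    z = x @ (evecs[:, keep] / np.sqrt(evals[keep]))    # whitened signal
    dvals, dvecs = np.linalg.eigh(np.cov(np.diff(z, axis=0), rowvar=False))
    return z @ dvecs[:, :n_out]                        # eigh sorts ascending: slowest first

# toy input: a slow latent variable mixed with fast ones
t = np.linspace(0, 4 * np.pi, 2000)
u = np.column_stack([np.sin(t) + 0.1 * np.sin(37 * t), np.cos(23 * t)])
slow = sfa(quadratic_expand(u))   # should track the slow sin(t) component
```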
Goodhew, Stephanie C; Edwards, Mark
2016-12-01
When the human brain is confronted with complex and dynamic visual scenes, two pivotal processes are at play: visual attention (the process of selecting certain aspects of the scene for privileged processing) and object individuation (determining what information belongs to a continuing object over time versus what represents two or more distinct objects). Here we examined whether these processes are independent or whether they interact. Object-substitution masking (OSM) has been used as a tool to examine such questions; however, there is controversy surrounding whether OSM reflects object-individuation versus substitution processes. The object-individuation account is agnostic regarding the role of attention, whereas object-substitution theory stipulates a pivotal role for attention. There have been attempts to investigate the role of attention in OSM, but they have been subject to alternative explanations. Here, therefore, we manipulated the size of the attended region, a pure and uncontaminated attentional manipulation, and examined the impact on OSM. Across three experiments, there was no interaction. This refutes the object-substitution theory of OSM. This, in turn, tells us that object individuation is invariant to the distribution of attention. Copyright © 2016 Elsevier B.V. All rights reserved.
Trade-off between curvature tuning and position invariance in visual area V4
Sharpee, Tatyana O.; Kouh, Minjoon; Reynolds, John H.
2013-01-01
Humans can rapidly recognize a multitude of objects despite differences in their appearance. The neural mechanisms that endow high-level sensory neurons with both selectivity to complex stimulus features and “tolerance” or invariance to identity-preserving transformations, such as spatial translation, remain poorly understood. Previous studies have demonstrated that both tolerance and selectivity to conjunctions of features are increased at successive stages of the ventral visual stream that mediates visual recognition. Within a given area, such as visual area V4 or the inferotemporal cortex, tolerance has been found to be inversely related to the sparseness of neural responses, which in turn was positively correlated with conjunction selectivity. However, the direct relationship between tolerance and conjunction selectivity has been difficult to establish, with different studies reporting either an inverse or no significant relationship. To resolve this, we measured V4 responses to natural scenes, and using recently developed statistical techniques, we estimated both the relevant stimulus features and the range of translation invariance for each neuron. Focusing the analysis on tuning to curvature, a tractable example of conjunction selectivity, we found that neurons that were tuned to more curved contours had smaller ranges of position invariance and produced sparser responses to natural stimuli. These trade-offs provide empirical support for recent theories of how the visual system estimates 3D shapes from shading and texture flows, as well as the tiling hypothesis of the visual space for different curvature values. PMID:23798444
Generic decoding of seen and imagined objects using hierarchical visual features.
Horikawa, Tomoyasu; Kamitani, Yukiyasu
2017-05-22
Object recognition is a key function in both human and machine vision. While brain decoding of seen and imagined objects has been achieved, the prediction is limited to training examples. We present a decoding approach for arbitrary objects using the machine vision principle that an object category is represented by a set of features rendered invariant through hierarchical processing. We show that visual features, including those derived from a deep convolutional neural network, can be predicted from fMRI patterns, and that greater accuracy is achieved for low-/high-level features with lower-/higher-level visual areas, respectively. Predicted features are used to identify seen/imagined object categories (extending beyond decoder training) from a set of computed features for numerous object images. Furthermore, decoding of imagined objects reveals progressive recruitment of higher-to-lower visual representations. Our results demonstrate a homology between human and machine vision and its utility for brain-based information retrieval.
Numerosity as a topological invariant.
Kluth, Tobias; Zetzsche, Christoph
2016-01-01
The ability to quickly recognize the number of objects in our environment is a fundamental cognitive function. However, it is far from clear which computations and which actual neural processing mechanisms are used to provide us with such a skill. Here we try to provide a detailed and comprehensive analysis of this issue, which comprises both the basic mathematical foundations and the peculiarities imposed by the structure of the visual system and by the neural computations provided by the visual cortex. We suggest that numerosity should be considered as a mathematical invariant. Making use of concepts from mathematical topology, such as connectedness, Betti numbers, and the Gauss-Bonnet theorem, we derive the basic computations suited to extracting this invariant. We show that the computation of numerosity is possible in a neurophysiologically plausible fashion using only computational elements which are known to exist in the visual cortex. We further show that a fundamental feature of numerosity perception, its Weber property, arises naturally, assuming noise in the basic neural operations. The model is tested on an extended data set (made publicly available). It is hoped that our results can provide a general framework for future research on the invariance properties of the numerosity system.
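For the simple case of non-touching objects in a binary image, the invariant in question reduces to the zeroth Betti number, i.e. the number of connected components; the sketch below illustrates the mathematical invariant itself (not the paper's neurally plausible implementation) using `scipy.ndimage.label`.

```python
import numpy as np
from scipy import ndimage

def numerosity(binary_image):
    """Numerosity as a topological invariant: for non-touching objects, the
    count of connected components (zeroth Betti number) is unchanged by the
    objects' positions, sizes, and shapes."""
    _, n_components = ndimage.label(binary_image)
    return n_components

# Two blobs of different sizes at different positions still count as 2.
scene = np.zeros((8, 8), dtype=int)
scene[1:3, 1:3] = 1      # small square
scene[4:7, 5:8] = 1      # larger square elsewhere
assert numerosity(scene) == 2
```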
Coding the presence of visual objects in a recurrent neural network of visual cortex.
Zwickel, Timm; Wachtler, Thomas; Eckhorn, Reinhard
2007-01-01
Before we can recognize a visual object, our visual system has to segregate it from its background. This requires a fast mechanism for establishing the presence and location of objects independently of their identity. Recently, border-ownership neurons were recorded in monkey visual cortex which might be involved in this task [Zhou, H., Friedmann, H., von der Heydt, R., 2000. Coding of border ownership in monkey visual cortex. J. Neurosci. 20 (17), 6594-6611]. In order to explain the basic mechanisms required for fast coding of object presence, we have developed a neural network model of visual cortex consisting of three stages. Feed-forward and lateral connections support coding of Gestalt properties, including similarity, good continuation, and convexity. Neurons of the highest area respond to the presence of an object and encode its position, invariant of its form. Feedback connections to the lowest area facilitate orientation detectors activated by contours belonging to potential objects, and thus generate the experimentally observed border-ownership property. This feedback control acts fast and significantly improves the figure-ground segregation required for the consecutive task of object recognition.
2017-01-01
Recent studies have challenged the ventral/“what” and dorsal/“where” two-visual-processing-pathway view by showing the existence of “what” and “where” information in both pathways. Is the two-pathway distinction still valid? Here, we examined how goal-directed visual information processing may differentially impact visual representations in these two pathways. Using fMRI and multivariate pattern analysis, in three experiments on human participants (57% females), by manipulating whether color or shape was task-relevant and how they were conjoined, we examined shape-based object category decoding in occipitotemporal and parietal regions. We found that object category representations in all the regions examined were influenced by whether or not object shape was task-relevant. This task effect, however, tended to decrease as task-relevant and irrelevant features were more integrated, reflecting the well-known object-based feature encoding. Interestingly, task relevance played a relatively minor role in driving the representational structures of early visual and ventral object regions. They were driven predominantly by variations in object shapes. In contrast, the effect of task was much greater in dorsal than ventral regions, with object category and task relevance both contributing significantly to the representational structures of the dorsal regions. These results showed that, whereas visual representations in the ventral pathway are more invariant and reflect “what an object is,” those in the dorsal pathway are more adaptive and reflect “what we do with it.” Thus, despite the existence of “what” and “where” information in both visual processing pathways, the two pathways may still differ fundamentally in their roles in visual information representation. SIGNIFICANCE STATEMENT Visual information is thought to be processed in two distinctive pathways: the ventral pathway that processes “what” an object is and the dorsal pathway that processes “where” it is located. This view has been challenged by recent studies revealing the existence of “what” and “where” information in both pathways. Here, we found that goal-directed visual information processing differentially modulates shape-based object category representations in the two pathways. Whereas ventral representations are more invariant to the demand of the task, reflecting what an object is, dorsal representations are more adaptive, reflecting what we do with the object. Thus, despite the existence of “what” and “where” information in both pathways, visual representations may still differ fundamentally in the two pathways. PMID:28821655
Neural Encoding of Relative Position
ERIC Educational Resources Information Center
Hayworth, Kenneth J.; Lescroart, Mark D.; Biederman, Irving
2011-01-01
Late ventral visual areas generally consist of cells having a significant degree of translation invariance. Such a "bag of features" representation is useful for the recognition of individual objects; however, it seems unable to explain our ability to parse a scene into multiple objects and to understand their spatial relationships. We…
Conjunctive Coding of Complex Object Features
Erez, Jonathan; Cusack, Rhodri; Kendall, William; Barense, Morgan D.
2016-01-01
Critical to perceiving an object is the ability to bind its constituent features into a cohesive representation, yet the manner by which the visual system integrates object features to yield a unified percept remains unknown. Here, we present a novel application of multivoxel pattern analysis of neuroimaging data that allows a direct investigation of whether neural representations integrate object features into a whole that is different from the sum of its parts. We found that patterns of activity throughout the ventral visual stream (VVS), extending anteriorly into the perirhinal cortex (PRC), discriminated between the same features combined into different objects. Despite this sensitivity to the unique conjunctions of features comprising objects, activity in regions of the VVS, again extending into the PRC, was invariant to the viewpoints from which the conjunctions were presented. These results suggest that the manner in which our visual system processes complex objects depends on the explicit coding of the conjunctions of features comprising them. PMID:25921583
The Representation of Information about Faces in the Temporal and Frontal Lobes
ERIC Educational Resources Information Center
Rolls, Edmund T.
2007-01-01
Neurophysiological evidence is described showing that some neurons in the macaque inferior temporal visual cortex have responses that are invariant with respect to the position, size and view of faces and objects, and that these neurons show rapid processing and rapid learning. Which face or object is present is encoded using a distributed…
Grossberg, Stephen; Markowitz, Jeffrey; Cao, Yongqiang
2011-12-01
Visual object recognition is an essential accomplishment of advanced brains. Object recognition needs to be tolerant, or invariant, with respect to changes in object position, size, and view. In monkeys and humans, a key area for recognition is the anterior inferotemporal cortex (ITa). Recent neurophysiological data show that ITa cells with high object selectivity often have low position tolerance. We propose a neural model whose cells learn to simulate this tradeoff, as well as ITa responses to image morphs, while explaining how invariant recognition properties may arise in stages due to processes across multiple cortical areas. These processes include the cortical magnification factor, multiple receptive field sizes, and top-down attentive matching and learning properties that may be tuned by task requirements to attend to either concrete or abstract visual features with different levels of vigilance. The model predicts that data from the tradeoff and image morph tasks emerge from different levels of vigilance in the animals performing them. This result illustrates how different vigilance requirements of a task may change the course of category learning, notably the critical features that are attended and incorporated into learned category prototypes. The model outlines a path for developing an animal model of how defective vigilance control can lead to symptoms of various mental disorders, such as autism and amnesia. Copyright © 2011 Elsevier Ltd. All rights reserved.
Learning Complex Cell Invariance from Natural Videos: A Plausibility Proof
2007-12-26
Optical Associative Processors For Visual Perception
NASA Astrophysics Data System (ADS)
Casasent, David; Telfer, Brian
1988-05-01
We consider various associative processor modifications required to allow these systems to be used for visual perception, scene analysis, and object recognition. For these applications, decisions on the class of the objects present in the input image are required and thus heteroassociative memories are necessary (rather than the autoassociative memories that have been given most attention). We analyze the performance of both associative processors and note that there is considerable difference between heteroassociative and autoassociative memories. We describe associative processors suitable for realizing functions such as: distortion invariance (using linear discriminant function memory synthesis techniques), noise and image processing performance (using autoassociative memories in cascade with a heteroassociative processor and with a finite number of autoassociative memory iterations employed), shift invariance (achieved through the use of associative processors operating on feature space data), and the analysis of multiple objects in high noise (which is achieved using associative processing of the output from symbolic correlators). We detail and provide initial demonstrations of the use of associative processors operating on iconic, feature space and symbolic data, as well as adaptive associative processors.
An integration of minimum local feature representation methods to recognize large variation of foods
NASA Astrophysics Data System (ADS)
Razali, Mohd Norhisham bin; Manshor, Noridayu; Halin, Alfian Abdul; Mustapha, Norwati; Yaakob, Razali
2017-10-01
Local invariant features have been shown to be successful in describing object appearances for image classification tasks. Such features are robust towards occlusion and clutter and are also invariant against scale and orientation changes. This makes them suitable for classification tasks with little inter-class similarity and large intra-class difference. In this paper, we propose an integrated representation of the Speeded-Up Robust Feature (SURF) and Scale Invariant Feature Transform (SIFT) descriptors, using a late fusion strategy. The proposed representation is used for food recognition from a dataset of food images with complex appearance variations. The Bag of Features (BOF) approach is employed to enhance the discriminative ability of the local features. Firstly, the individual local features are extracted to construct two kinds of visual vocabularies, representing SURF and SIFT. The visual vocabularies are then concatenated and fed into a Linear Support Vector Machine (SVM) to classify the respective food categories. Experimental results demonstrate strong overall performance, with 82.38% classification accuracy on the challenging UEC-Food100 dataset.
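A sketch of the bag-of-features late-fusion pipeline is below, with two substitutions worth flagging: ORB stands in for the patented SURF descriptor (the paper pairs SURF with SIFT), and the vocabulary size and clustering setup are illustrative rather than the paper's settings.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def bof_histogram(descriptors, kmeans):
    """Quantize local descriptors against a visual vocabulary and return a
    normalized bag-of-features histogram."""
    k = kmeans.n_clusters
    if descriptors is None or len(descriptors) == 0:
        return np.zeros(k)
    words = kmeans.predict(descriptors.astype(np.float64))
    hist = np.bincount(words, minlength=k).astype(float)
    return hist / (hist.sum() + 1e-12)

def late_fusion_features(images, k=100):
    """One vocabulary per descriptor type; the per-image histograms are then
    concatenated (late fusion). ORB stands in for SURF here."""
    extractors = {"sift": cv2.SIFT_create(), "orb": cv2.ORB_create()}
    desc = {name: [ex.detectAndCompute(img, None)[1] for img in images]
            for name, ex in extractors.items()}
    feats = []
    for name in extractors:
        stacked = np.vstack([d for d in desc[name] if d is not None])
        km = KMeans(n_clusters=k, n_init=4).fit(stacked.astype(np.float64))
        feats.append(np.array([bof_histogram(d, km) for d in desc[name]]))
    return np.hstack(feats)

# Downstream: sklearn.svm.LinearSVC().fit(late_fusion_features(imgs), labels)
```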
Here Today, Gone Tomorrow – Adaptation to Change in Memory-Guided Visual Search
Zellin, Martina; Conci, Markus; von Mühlenen, Adrian; Müller, Hermann J.
2013-01-01
Visual search for a target object can be facilitated by the repeated presentation of an invariant configuration of nontargets (‘contextual cueing’). Here, we tested adaptation of learned contextual associations after a sudden, but permanent, relocation of the target. After an initial learning phase, targets were relocated within their invariant contexts and repeatedly presented at new locations, before they returned to the initial locations. Contextual cueing for relocated targets was observed neither after numerous presentations nor after insertion of an overnight break. Further experiments investigated whether learning of additional, previously unseen context-target configurations is comparable to adaptation of existing contextual associations to change. In contrast to the lack of adaptation to changed target locations, contextual cueing developed for additional invariant configurations under identical training conditions. Moreover, across all experiments, presenting relocated targets or additional contexts did not interfere with contextual cueing of initially learned invariant configurations. Overall, the adaptation of contextual memory to changed target locations was severely constrained and unsuccessful in comparison to learning of an additional set of contexts, which suggests that contextual cueing facilitates search for only one repeated target location. PMID:23555038
Rolls, Edmund T.; Webb, Tristan J.
2014-01-01
Searching for and recognizing objects in complex natural scenes is implemented by multiple saccades until the eyes reach within the reduced receptive field sizes of inferior temporal cortex (IT) neurons. We analyze and model how the dorsal and ventral visual streams both contribute to this. Saliency detection in the dorsal visual system including area LIP is modeled by graph-based visual saliency, and allows the eyes to fixate potential objects within several degrees. Visual information at the fixated location subtending approximately 9° corresponding to the receptive fields of IT neurons is then passed through a four layer hierarchical model of the ventral cortical visual system, VisNet. We show that VisNet can be trained using a synaptic modification rule with a short-term memory trace of recent neuronal activity to capture both the required view and translation invariances, allowing the model to achieve approximately 90% correct object recognition for 4 objects shown in any view across a range of 135° anywhere in a scene. The model was able to generalize correctly within the four trained views and the 25 trained translations. This approach analyses the principles by which complementary computations in the dorsal and ventral visual cortical streams enable objects to be located and recognized in complex natural scenes. PMID:25161619
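The trace rule at the heart of such VisNet training can be sketched in a few lines; the linear activation, learning rate, and trace decay below are illustrative simplifications of the model's actual dynamics.

```python
import numpy as np

def trace_rule_update(w, x_seq, eta=0.05, decay=0.8):
    """Hebbian learning with a short-term memory trace:
        ybar_t = decay * ybar_{t-1} + (1 - decay) * y_t
        dw     = eta * ybar_t * x_t
    Because successive inputs (e.g., transforms of the same object) share
    the trace, they become bound to the same output units. w: (n_out, n_in)."""
    ybar = np.zeros(w.shape[0])
    for x in x_seq:
        y = w @ x                                  # linear activation for simplicity
        ybar = decay * ybar + (1 - decay) * y      # short-term memory trace
        w = w + eta * np.outer(ybar, x)            # trace-modulated Hebbian step
        w /= np.linalg.norm(w, axis=1, keepdims=True) + 1e-12  # bound the weights
    return w
```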
Eye movement-invariant representations in the human visual system.
Nishimoto, Shinji; Huth, Alexander G; Bilenko, Natalia Y; Gallant, Jack L
2017-01-01
During natural vision, humans make frequent eye movements but perceive a stable visual world. It is therefore likely that the human visual system contains representations of the visual world that are invariant to eye movements. Here we present an experiment designed to identify visual areas that might contain eye-movement-invariant representations. We used functional MRI to record brain activity from four human subjects who watched natural movies. In one condition subjects were required to fixate steadily, and in the other they were allowed to freely make voluntary eye movements. The movies used in each condition were identical. We reasoned that the brain activity recorded in a visual area that is invariant to eye movement should be similar under fixation and free viewing conditions. In contrast, activity in a visual area that is sensitive to eye movement should differ between fixation and free viewing. We therefore measured the similarity of brain activity across repeated presentations of the same movie within the fixation condition, and separately between the fixation and free viewing conditions. The ratio of these measures was used to determine which brain areas are most likely to contain eye movement-invariant representations. We found that voxels located in early visual areas are strongly affected by eye movements, while voxels in ventral temporal areas are only weakly affected by eye movements. These results suggest that the ventral temporal visual areas contain a stable representation of the visual world that is invariant to eye movements made during natural vision.
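The comparison logic can be sketched directly; the per-voxel statistic below (repeat reliability within fixation versus fixation-to-free-viewing similarity) is an illustrative stand-in for the paper's exact measure, and the array names are assumptions.

```python
import numpy as np

def invariance_index(fix_run1, fix_run2, free_run):
    """Each argument is a (time, voxels) response matrix to the same movie.
    Values near 1 suggest a voxel's response is unaffected by eye movements;
    values well below 1 suggest sensitivity to them."""
    def corr(a, b):
        a = (a - a.mean(0)) / (a.std(0) + 1e-12)
        b = (b - b.mean(0)) / (b.std(0) + 1e-12)
        return (a * b).mean(0)                # per-voxel Pearson r over time
    within = corr(fix_run1, fix_run2)         # fixation repeat reliability
    across = corr(fix_run1, free_run)         # fixation vs free viewing
    return across / (within + 1e-12)
```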
Erdogan, Goker; Yildirim, Ilker; Jacobs, Robert A.
2015-01-01
People learn modality-independent, conceptual representations from modality-specific sensory signals. Here, we hypothesize that any system that accomplishes this feat will include three components: a representational language for characterizing modality-independent representations, a set of sensory-specific forward models for mapping from modality-independent representations to sensory signals, and an inference algorithm for inverting forward models—that is, an algorithm for using sensory signals to infer modality-independent representations. To evaluate this hypothesis, we instantiate it in the form of a computational model that learns object shape representations from visual and/or haptic signals. The model uses a probabilistic grammar to characterize modality-independent representations of object shape, uses a computer graphics toolkit and a human hand simulator to map from object representations to visual and haptic features, respectively, and uses a Bayesian inference algorithm to infer modality-independent object representations from visual and/or haptic signals. Simulation results show that the model infers identical object representations when an object is viewed, grasped, or both. That is, the model’s percepts are modality invariant. We also report the results of an experiment in which different subjects rated the similarity of pairs of objects in different sensory conditions, and show that the model provides a very accurate account of subjects’ ratings. Conceptually, this research significantly contributes to our understanding of modality invariance, an important type of perceptual constancy, by demonstrating how modality-independent representations can be acquired and used. Methodologically, it provides an important contribution to cognitive modeling, particularly an emerging probabilistic language-of-thought approach, by showing how symbolic and statistical approaches can be combined in order to understand aspects of human perception. PMID:26554704
Chang, Hung-Cheng; Grossberg, Stephen; Cao, Yongqiang
2014-01-01
The Where’s Waldo problem concerns how individuals can rapidly learn to search a scene to detect, attend, recognize, and look at a valued target object in it. This article develops the ARTSCAN Search neural model to clarify how brain mechanisms across the What and Where cortical streams are coordinated to solve the Where’s Waldo problem. The What stream learns positionally-invariant object representations, whereas the Where stream controls positionally-selective spatial and action representations. The model overcomes deficiencies of these computationally complementary properties through What and Where stream interactions. Where stream processes of spatial attention and predictive eye movement control modulate What stream processes whereby multiple view- and positionally-specific object categories are learned and associatively linked to view- and positionally-invariant object categories through bottom-up and attentive top-down interactions. Gain fields control the coordinate transformations that enable spatial attention and predictive eye movements to carry out this role. What stream cognitive-emotional learning processes enable the focusing of motivated attention upon the invariant object categories of desired objects. What stream cognitive names or motivational drives can prime a view- and positionally-invariant object category of a desired target object. A volitional signal can convert these primes into top-down activations that can, in turn, prime What stream view- and positionally-specific categories. When it also receives bottom-up activation from a target, such a positionally-specific category can cause an attentional shift in the Where stream to the positional representation of the target, and an eye movement can then be elicited to foveate it. These processes describe interactions among brain regions that include visual cortex, parietal cortex, inferotemporal cortex, prefrontal cortex (PFC), amygdala, basal ganglia (BG), and superior colliculus (SC). PMID:24987339
The representation of information about faces in the temporal and frontal lobes.
Rolls, Edmund T
2007-01-07
Neurophysiological evidence is described showing that some neurons in the macaque inferior temporal visual cortex have responses that are invariant with respect to the position, size and view of faces and objects, and that these neurons show rapid processing and rapid learning. Which face or object is present is encoded using a distributed representation in which each neuron conveys independent information in its firing rate, with little information evident in the relative time of firing of different neurons. This ensemble encoding has the advantages of maximising the information in the representation useful for discrimination between stimuli using a simple weighted sum of the neuronal firing by the receiving neurons, generalisation and graceful degradation. These invariant representations are ideally suited to provide the inputs to brain regions such as the orbitofrontal cortex and amygdala that learn the reinforcement associations of an individual's face, for then the learning, and the appropriate social and emotional responses, generalise to other views of the same face. A theory is described of how such invariant representations may be produced in a hierarchically organised set of visual cortical areas with convergent connectivity. The theory proposes that neurons in these visual areas use a modified Hebb synaptic modification rule with a short-term memory trace to capture whatever can be captured at each stage that is invariant about objects as the objects change in retinal view, position, size and rotation. Another population of neurons in the cortex in the superior temporal sulcus encodes other aspects of faces such as face expression, eye gaze, face view and whether the head is moving. These neurons thus provide important additional inputs to parts of the brain such as the orbitofrontal cortex and amygdala that are involved in social communication and emotional behaviour. Outputs of these systems reach the amygdala, in which face-selective neurons are found, and also the orbitofrontal cortex, in which some neurons are tuned to face identity and others to face expression. In humans, activation of the orbitofrontal cortex is found when a change of face expression acts as a social signal that behaviour should change; and damage to the orbitofrontal cortex can impair face and voice expression identification, and also the reversal of emotional behaviour that normally occurs when reinforcers are reversed.
Guidance of attention by information held in working memory.
Calleja, Marissa Ortiz; Rich, Anina N
2013-05-01
Information held in working memory (WM) can guide attention during visual search. The authors of recent studies have interpreted the effect of holding verbal labels in WM as guidance of visual attention by semantic information. In a series of experiments, we tested how attention is influenced by visual features versus category-level information about complex objects held in WM. Participants either memorized an object's image or its category. While holding this information in memory, they searched for a target in a four-object search display. On exact-match trials, the memorized item reappeared as a distractor in the search display. On category-match trials, another exemplar of the memorized item appeared as a distractor. On neutral trials, none of the distractors were related to the memorized object. We found attentional guidance in visual search on both exact-match and category-match trials in Experiment 1, in which the exemplars were visually similar. When we controlled for visual similarity among the exemplars by using four possible exemplars (Exp. 2) or by using two exemplars rated as being visually dissimilar (Exp. 3), we found attentional guidance only on exact-match trials when participants memorized the object's image. The same pattern of results held when the target was invariant (Exps. 2-3) and when the target was defined semantically and varied in visual features (Exp. 4). The findings of these experiments suggest that attentional guidance by WM requires active visual information.
The 4-D approach to visual control of autonomous systems
NASA Technical Reports Server (NTRS)
Dickmanns, Ernst D.
1994-01-01
Development of a 4-D approach to dynamic machine vision is described. Core elements of this method are spatio-temporal models oriented towards objects and laws of perspective projection in a forward mode. Integration of multi-sensory measurement data was achieved through spatio-temporal models as invariants for object recognition. Situation assessment and long-term predictions were allowed through maintenance of a symbolic 4-D image of processes involving objects. Behavioral capabilities were easily realized by state feedback and feed-forward control.
NASA Astrophysics Data System (ADS)
Madokoro, H.; Tsukada, M.; Sato, K.
2013-07-01
This paper presents an unsupervised learning-based object category formation and recognition method for mobile robot vision. Our method has the following features: detection of feature points and description of features using a scale-invariant feature transform (SIFT), selection of target feature points using one-class support vector machines (OC-SVMs), generation of visual words using self-organizing maps (SOMs), formation of labels using adaptive resonance theory 2 (ART-2), and creation and classification of categories on a category map of counter propagation networks (CPNs) for visualizing spatial relations between categories. Classification results for dynamic images, using time-series images obtained from two different-sized robots and their respective movements, demonstrate that our method can visualize spatial relations between categories while maintaining time-series characteristics. Moreover, we emphasize the effectiveness of our method for category formation of appearance changes of objects.
Real-time object tracking based on scale-invariant features employing bio-inspired hardware.
Yasukawa, Shinsuke; Okuno, Hirotsugu; Ishii, Kazuo; Yagi, Tetsuya
2016-09-01
We developed a vision sensor system that performs a scale-invariant feature transform (SIFT) in real time. To apply the SIFT algorithm efficiently, we focus on a two-fold process performed by the visual system: whole-image parallel filtering and frequency-band parallel processing. The vision sensor system comprises an active pixel sensor, a metal-oxide semiconductor (MOS)-based resistive network, a field-programmable gate array (FPGA), and a digital computer. We employed the MOS-based resistive network for instantaneous spatial filtering and a configurable filter size. The FPGA is used to pipeline process the frequency-band signals. The proposed system was evaluated by tracking the feature points detected on an object in a video. Copyright © 2016 Elsevier Ltd. All rights reserved.
Comparing visual representations across human fMRI and computational vision
Leeds, Daniel D.; Seibert, Darren A.; Pyles, John A.; Tarr, Michael J.
2013-01-01
Feedforward visual object perception recruits a cortical network that is assumed to be hierarchical, progressing from basic visual features to complete object representations. However, the nature of the intermediate features related to this transformation remains poorly understood. Here, we explore how well different computer vision recognition models account for neural object encoding across the human cortical visual pathway as measured using fMRI. These neural data, collected during the viewing of 60 images of real-world objects, were analyzed with a searchlight procedure as in Kriegeskorte, Goebel, and Bandettini (2006): Within each searchlight sphere, the obtained patterns of neural activity for all 60 objects were compared to model responses for each computer recognition algorithm using representational dissimilarity analysis (Kriegeskorte et al., 2008). Although each of the computer vision methods significantly accounted for some of the neural data, among the different models, the scale invariant feature transform (Lowe, 2004), encoding local visual properties gathered from “interest points,” was best able to accurately and consistently account for stimulus representations within the ventral pathway. More generally, when present, significance was observed in regions of the ventral-temporal cortex associated with intermediate-level object perception. Differences in model effectiveness and the neural location of significant matches may be attributable to the fact that each model implements a different featural basis for representing objects (e.g., more holistic or more parts-based). Overall, we conclude that well-known computer vision recognition systems may serve as viable proxies for theories of intermediate visual object representation. PMID:24273227
Contributions of Invariants, Heuristics, and Exemplars to the Visual Perception of Relative Mass
ERIC Educational Resources Information Center
Cohen, Andrew L.
2006-01-01
Some potential contributions of invariants, heuristics, and exemplars to the perception of dynamic properties in the colliding balls task were explored. On each trial, an observer is asked to determine the heavier of 2 colliding balls. The invariant approach assumes that people can learn to detect complex visual patterns that reliably specify…
Deep generative learning of location-invariant visual word recognition.
Di Bono, Maria Grazia; Zorzi, Marco
2013-01-01
It is widely believed that orthographic processing implies an approximate, flexible coding of letter position, as shown by relative-position and transposition priming effects in visual word recognition. These findings have inspired alternative proposals about the representation of letter position, ranging from noisy coding across the ordinal positions to relative position coding based on open bigrams. This debate can be cast within the broader problem of learning location-invariant representations of written words, that is, a coding scheme abstracting the identity and position of letters (and combinations of letters) from their eye-centered (i.e., retinal) locations. We asked whether location-invariance would emerge from deep unsupervised learning on letter strings and what type of intermediate coding would emerge in the resulting hierarchical generative model. We trained a deep network with three hidden layers on an artificial dataset of letter strings presented at five possible retinal locations. Though word-level information (i.e., word identity) was never provided to the network during training, linear decoding from the activity of the deepest hidden layer yielded near-perfect accuracy in location-invariant word recognition. Conversely, decoding from lower layers yielded a large number of transposition errors. Analyses of emergent internal representations showed that word selectivity and location invariance increased as a function of layer depth. Word-tuning and location-invariance were found at the level of single neurons, but there was no evidence for bigram coding. Finally, the distributed internal representation of words at the deepest layer showed higher similarity to the representation elicited by the two exterior letters than by other combinations of two contiguous letters, in agreement with the hypothesis that word edges have special status. These results reveal that the efficient coding of written words-which was the model's learning objective-is largely based on letter-level information.
Blind readers break mirror invariance as sighted do.
de Heering, Adélaïde; Collignon, Olivier; Kolinsky, Régine
2018-04-01
Mirror invariance refers to a predisposition of humans (including infants) and animals that urges them to consider mirrored images as corresponding to the same object. Yet in order to learn to read a writing system that incorporates mirrored letters (e.g., b vs. d), readers must break this invariance.
Terrestrial passage theory of the moon illusion.
Reed, C F
1984-12-01
Theories of the celestial, or moon, illusion have neglected geometric characteristics of movement along and above the surface of the earth. The illusion occurs because the characteristics of terrestrial passage are attributed to celestial passage. In terrestrial passage, the visual angle subtended by an object changes discriminably as an essentially invariant function of elevation above the horizon. In celestial passage, by contrast, change in visual angle is indiscriminable at all elevations. If a terrestrial object gains altitude, its angular subtense fails to follow the expansion projected for an orbital course: Angular diminution or constancy is equivalent to distancing. On the basis of terrestrial projections, a similar failure of celestial objects in successive elevations is also equivalent to distancing. The illusion occurs because of retinal image constancy, not, as traditionally stated, despite it.
The representation of object viewpoint in human visual cortex.
Andresen, David R; Vinberg, Joakim; Grill-Spector, Kalanit
2009-04-01
Understanding the nature of object representations in the human brain is critical for understanding the neural basis of invariant object recognition. However, the degree to which object representations are sensitive to object viewpoint is unknown. Using fMRI, we employed a parametric approach to examine the sensitivity to object view as a function of rotation (0°–180°), category (animal/vehicle) and fMRI-adaptation paradigm (short or long-lagged). For both categories and fMRI-adaptation paradigms, object-selective regions recovered from adaptation when a rotated view of an object was shown after adaptation to a specific view of that object, suggesting that representations are sensitive to object rotation. However, we found evidence for differential representations across categories and ventral stream regions. Rotation cross-adaptation was larger for animals than vehicles, suggesting higher sensitivity to vehicle than animal rotation, and was largest in the left fusiform/occipito-temporal sulcus (pFUS/OTS), suggesting that this region has low sensitivity to rotation. Moreover, right pFUS/OTS and FFA responded more strongly to front than back views of animals (without adaptation) and rotation cross-adaptation depended both on the level of rotation and the adapting view. This result suggests a prevalence of neurons that prefer frontal views of animals in fusiform regions. Using a computational model of view-tuned neurons, we demonstrate that differential neural view tuning widths and relative distributions of neural-tuned populations in fMRI voxels can explain the fMRI results. Overall, our findings underscore the utility of parametric approaches for studying the neural basis of object invariance and suggest that there is no complete invariance to object view in the human ventral stream.
Rust, Nicole C.; DiCarlo, James J.
2012-01-01
While popular accounts suggest that neurons along the ventral visual processing stream become increasingly selective for particular objects, this appears at odds with the fact that inferior temporal cortical (IT) neurons are broadly tuned. To explore this apparent contradiction, we compared processing in two ventral stream stages (V4 and IT) in the rhesus macaque monkey. We confirmed that IT neurons are indeed more selective for conjunctions of visual features than V4 neurons, and that this increase in feature conjunction selectivity is accompanied by an increase in tolerance (“invariance”) to identity-preserving transformations (e.g. shifting, scaling) of those features. We report here that V4 and IT neurons are, on average, tightly matched in their tuning breadth for natural images (“sparseness”), and that the average V4 or IT neuron will produce a robust firing rate response (over 50% of its peak observed firing rate) to ~10% of all natural images. We also observed that sparseness was positively correlated with conjunction selectivity and negatively correlated with tolerance within both V4 and IT, consistent with selectivity-building and invariance-building computations that offset one another to produce sparseness. Our results imply that the conjunction-selectivity-building and invariance-building computations necessary to support object recognition are implemented in a balanced fashion to maintain sparseness at each stage of processing. PMID:22836252
Multidimensional brain activity dictated by winner-take-all mechanisms.
Tozzi, Arturo; Peters, James F
2018-06-21
A novel demon-based architecture is introduced to elucidate brain functions such as pattern recognition during human perception and mental interpretation of visual scenes. Starting from the topological concepts of invariance and persistence, we introduce a Selfridge pandemonium variant of brain activity that takes into account a novel feature, namely, demons that recognize short straight-line segments, curved lines and scene shapes, such as shape interior, density and texture. Low-level representations of objects can be mapped to higher-level views (our mental interpretations): a series of transformations can be gradually applied to a pattern in a visual scene, without affecting its invariant properties. This makes it possible to construct a symbolic multi-dimensional representation of the environment. These representations can be projected continuously to an object that we have seen and continue to see, thanks to the mapping from shapes in our memory to shapes in Euclidean space. Although perceived shapes are 3-dimensional (plus time), the evaluation of shape features (volume, color, contour, closeness, texture, and so on) leads to n-dimensional brain landscapes. Here we discuss the advantages of our parallel, hierarchical model in pattern recognition, computer vision and biological nervous system's evolution. Copyright © 2018 Elsevier B.V. All rights reserved.
Did Ptolemy understand the moon illusion?
Ross, H E; Ross, G M
1976-01-01
Ptolemy is often wrongly credited with an explanation of the moon illusion based on the size-distance invariance principle. This paper elucidates the two Ptolemaic accounts: one in the Almagest, based on atmospheric refraction, and the other in the Optics, based on the difficulty of looking upwards. It is the latter passage which has been thought to refer to size-distance invariance, but it is more probable that it refers to the idea that the visual rays are diminished by the force of gravity (i.e. that the retinal image is reduced in size). Alhazen was probably the first author to explain the illusion by the size-distance invariance principle, and Roger Bacon the first to explain the enlarged apparent distance of the horizon by the presence of intervening objects. Della Porta was the first to credit Ptolemy with these explanations, and this mistake was repeated by many subsequent authors.
Grossberg, Stephen; Srinivasan, Karthik; Yazdanbakhsh, Arash
2015-01-01
How does the brain maintain stable fusion of 3D scenes when the eyes move? Every eye movement causes each retinal position to process a different set of scenic features, and thus the brain needs to binocularly fuse new combinations of features at each position after an eye movement. Despite these breaks in retinotopic fusion due to each movement, previously fused representations of a scene in depth often appear stable. The 3D ARTSCAN neural model proposes how the brain does this by unifying concepts about how multiple cortical areas in the What and Where cortical streams interact to coordinate processes of 3D boundary and surface perception, spatial attention, invariant object category learning, predictive remapping, eye movement control, and learned coordinate transformations. The model explains data from single neuron and psychophysical studies of covert visual attention shifts prior to eye movements. The model further clarifies how perceptual, attentional, and cognitive interactions among multiple brain regions (LGN, V1, V2, V3A, V4, MT, MST, PPC, LIP, ITp, ITa, SC) may accomplish predictive remapping as part of the process whereby view-invariant object categories are learned. These results build upon earlier neural models of 3D vision and figure-ground separation and the learning of invariant object categories as the eyes freely scan a scene. A key process concerns how an object's surface representation generates a form-fitting distribution of spatial attention, or attentional shroud, in parietal cortex that helps maintain the stability of multiple perceptual and cognitive processes. Predictive eye movement signals maintain the stability of the shroud, as well as of binocularly fused perceptual boundaries and surface representations. PMID:25642198
Krafnick, Anthony J; Tan, Li-Hai; Flowers, D Lynn; Luetje, Megan M; Napoliello, Eileen M; Siok, Wai-Ting; Perfetti, Charles; Eden, Guinevere F
2016-06-01
Learning to read is thought to involve the recruitment of left hemisphere ventral occipitotemporal cortex (OTC) by a process of "neuronal recycling", whereby object processing mechanisms are co-opted for reading. Under the same theoretical framework, it has been proposed that the visual word form area (VWFA) within OTC processes orthographic stimuli independent of culture and writing systems, suggesting that it is universally involved in written language. However, this "script invariance" has yet to be demonstrated in monolingual readers of two different writing systems studied under the same experimental conditions. Here, using functional magnetic resonance imaging (fMRI), we examined activity in response to English Words and Chinese Characters in 1st graders in the United States and China, respectively. We examined each group separately and found the readers of English as well as the readers of Chinese to activate the left ventral OTC for their respective native writing systems (using both a whole-brain and a bilateral OTC-restricted analysis). Critically, a conjunction analysis of the two groups revealed significant overlap between them for native writing system processing, located in the VWFA and therefore supporting the hypothesis of script invariance. In the second part of the study, we further examined the left OTC region responsive to each group's native writing system and found that it responded equally to Object stimuli (line drawings) in the Chinese-reading children. In English-reading children, the OTC responded much more to Objects than to English Words. Together, these results support the script invariant role of the VWFA and also support the idea that the areas recruited for character or word processing are rooted in object processing mechanisms of the left OTC. Copyright © 2016 Elsevier Inc. All rights reserved.
Magnocellular pathway for rotation invariant Neocognitron.
Ting, C H
1993-03-01
In the mammalian visual system, the magnocellular pathway and the parvocellular pathway cooperatively process visual information in parallel. The magnocellular pathway is more global and less particular about the details, while the parvocellular pathway recognizes objects based on the local features. In many aspects, Neocognitron may be regarded as the artificial analogue of the parvocellular pathway. It is interesting, then, to model the magnocellular pathway. In order to achieve "rotation invariance" for Neocognitron, we propose a neural network model fashioned after the magnocellular pathway and expand its roles to include surmising the orientation of the input pattern prior to recognition. With the incorporation of the magnocellular pathway, a basic shift in the original paradigm has taken place. A pattern is now said to be recognized when and only when one of the winners of the magnocellular pathway is validated by the parvocellular pathway. We have implemented the magnocellular pathway coupled with Neocognitron in parallel on transputers; our simulation programme is now able to recognize numerals in arbitrary orientation.
Improved medical image fusion based on cascaded PCA and shift invariant wavelet transforms.
Reena Benjamin, J; Jayasree, T
2018-02-01
In the medical field, radiologists need more informative and high-quality medical images to diagnose diseases. Image fusion plays a vital role in the field of biomedical image analysis. It aims to integrate the complementary information from multimodal images, producing a new composite image which is expected to be more informative for visual perception than any of the individual input images. The main objective of this paper is to increase the information content, preserve the edges, and enhance the quality of the fused image using cascaded principal component analysis (PCA) and shift-invariant wavelet transforms. A novel image fusion technique based on cascaded PCA and shift-invariant wavelet transforms is proposed in this paper. PCA in the spatial domain extracts relevant information from the large dataset based on eigenvalue decomposition, and the wavelet transform operating in the complex domain with shift-invariant properties brings out more directional and phase details of the image. The maximum fusion rule, applied in the dual-tree complex wavelet transform domain, enhances the average information and morphological details. The input images of the human brain of two different modalities (MRI and CT) are collected from the whole brain atlas data distributed by Harvard University. Both MRI and CT images are fused using the cascaded PCA and shift-invariant wavelet transform method. The proposed method is evaluated on three key factors, namely structure preservation, edge preservation, and contrast preservation. The experimental results and comparison with other existing fusion methods show the superior performance of the proposed image fusion framework in terms of visual and quantitative evaluations, enhancing directional features and fine edge details while reducing redundant details, artifacts, and distortions.
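The wavelet-fusion stage lends itself to a short sketch; the version below uses PyWavelets' stationary wavelet transform as the shift-invariant decomposition (the paper uses a dual-tree complex wavelet) and omits the cascaded PCA stage entirely.

```python
import numpy as np
import pywt

def fuse_swt(img_a, img_b, wavelet="db2", level=2):
    """Fuse two co-registered images: decompose both with the undecimated
    (hence shift-invariant) stationary wavelet transform, average the
    approximation coefficients, keep the maximum-magnitude detail
    coefficients, and invert. Image sides must be divisible by 2**level."""
    ca = pywt.swt2(img_a.astype(float), wavelet, level=level)
    cb = pywt.swt2(img_b.astype(float), wavelet, level=level)
    fused = []
    for (a_approx, a_details), (b_approx, b_details) in zip(ca, cb):
        approx = 0.5 * (a_approx + b_approx)            # average approximations
        details = tuple(np.where(np.abs(d1) >= np.abs(d2), d1, d2)
                        for d1, d2 in zip(a_details, b_details))  # max-abs rule
        fused.append((approx, details))
    return pywt.iswt2(fused, wavelet)
```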
The Emergence of Contrast-Invariant Orientation Tuning in Simple Cells of Cat Visual Cortex
Finn, Ian M.; Priebe, Nicholas J.; Ferster, David
2007-01-01
Simple cells in primary visual cortex exhibit contrast-invariant orientation tuning, in seeming contradiction to feed-forward models relying on lateral geniculate nucleus (LGN) input alone. Contrast invariance has therefore been thought to depend on the presence of intracortical lateral inhibition. In vivo intracellular recordings instead suggest that contrast invariance can be explained by three properties of the excitatory pathway. 1) Depolarizations evoked by orthogonal stimuli are determined by the amount of excitation a cell receives from the LGN, relative to the excitation it receives from other cortical cells. 2) Depolarizations evoked by preferred stimuli saturate at lower contrasts than the spike output of LGN relay cells. 3) Visual stimuli evoke contrast-dependent changes in trial-to-trial variability, which lead to contrast-dependent changes in the relationship between membrane potential and spike rate. Thus, high-contrast, orthogonally-oriented stimuli that evoke significant depolarizations evoke few spikes. Together these mechanisms, without lateral inhibition, can account for contrast-invariant stimulus selectivity. PMID:17408583
Viswanathan, Sivaram; Jayakumar, Jaikishan; Vidyasagar, Trichur R
2015-09-01
Responses of most neurons in the primary visual cortex of mammals are markedly selective for stimulus orientation and their orientation tuning does not vary with changes in stimulus contrast. The basis of such contrast invariance of orientation tuning has been shown to be the higher variability in the response for low-contrast stimuli. Neurons in the lateral geniculate nucleus (LGN), which provides the major visual input to the cortex, have also been shown to have higher variability in their response to low-contrast stimuli. Parallel studies have also long established mild degrees of orientation selectivity in LGN and retinal cells. In our study, we show that contrast invariance of orientation tuning is already present in the LGN. In addition, we show that the variability of spike responses of LGN neurons increases at lower stimulus contrasts, especially for non-preferred orientations. We suggest that such contrast- and orientation-sensitive variability not only explains the contrast invariance observed in the LGN but can also underlie the contrast-invariant orientation tuning seen at the level of the primary visual cortex. © 2015 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
Combining color and shape information for illumination-viewpoint invariant object recognition.
Diplaros, Aristeidis; Gevers, Theo; Patras, Ioannis
2006-01-01
In this paper, we propose a new scheme that merges color- and shape-invariant information for object recognition. To obtain robustness against photometric changes, color-invariant derivatives are computed first. Color invariance is an important aspect of any object recognition scheme, as color changes considerably with the variation in illumination, object pose, and camera viewpoint. These color invariant derivatives are then used to obtain similarity invariant shape descriptors. Shape invariance is equally important as, under a change in camera viewpoint and object pose, the shape of a rigid object undergoes a perspective projection on the image plane. Then, the color and shape invariants are combined in a multidimensional color-shape context which is subsequently used as an index. As the indexing scheme makes use of a color-shape invariant context, it provides a high-discriminative information cue robust against varying imaging conditions. The matching function of the color-shape context allows for fast recognition, even in the presence of object occlusion and cluttering. From the experimental results, it is shown that the method recognizes rigid objects with high accuracy in 3-D complex scenes and is robust against changing illumination, camera viewpoint, object pose, and noise.
Recognition Of Complex Three Dimensional Objects Using Three Dimensional Moment Invariants
NASA Astrophysics Data System (ADS)
Sadjadi, Firooz A.
1985-01-01
A technique for the recognition of complex three dimensional objects is presented. The complex 3-D objects are represented in terms of their 3-D moment invariants, algebraic expressions that remain invariant under changes in the 3-D objects' orientations and locations in the field of view. The technique of 3-D moment invariants has been used successfully for simple 3-D object recognition in the past. In this work we have extended this method to the representation of more complex objects. Two complex objects were represented digitally, their 3-D moment invariants were calculated, and the invariance of these moment expressions was then verified by changing the orientation and the location of the objects in the field of view. The results of this study have significant impact on 3-D robotic vision, 3-D target recognition, scene analysis and artificial intelligence.
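A minimal check of the idea, using one classical second-order 3-D moment invariant (the trace of the central second-moment matrix) rather than the paper's full invariant set, can be run on a synthetic point-cloud object:

```python
import numpy as np
from scipy.spatial.transform import Rotation

rng = np.random.default_rng(1)
pts = rng.normal(size=(500, 3)) * [3.0, 1.0, 0.5]   # an elongated "object"

def j1(points):
    centered = points - points.mean(axis=0)          # removes location
    mu = centered.T @ centered / len(points)         # central 2nd moments
    return np.trace(mu)                              # rotation-invariant

R = Rotation.from_euler('xyz', [40, 25, 70], degrees=True).as_matrix()
moved = pts @ R.T + [5.0, -2.0, 9.0]                 # reoriented and relocated
print(j1(pts), j1(moved))   # the two values agree to numerical precision
```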
Discrimination of holograms and real objects by pigeons (Columba livia) and humans (Homo sapiens).
Stephan, Claudia; Steurer, Michael M; Aust, Ulrike
2014-08-01
The type of stimulus material employed in visual tasks is crucial to all comparative cognition research that involves object recognition. There is considerable controversy about the use of 2-dimensional stimuli and the impact that the lack of the 3rd dimension (i.e., depth) may have on animals' performance in tests for their visual and cognitive abilities. We report evidence of discrimination learning using a completely novel type of stimuli, namely, holograms. Like real objects, holograms provide full 3-dimensional shape information but they also offer many possibilities for systematically modifying the appearance of a stimulus. Hence, they provide a promising means for investigating visual perception and cognition of different species in a comparative way. We trained pigeons and humans to discriminate either between 2 real objects or between holograms of the same 2 objects, and we subsequently tested both species for the transfer of discrimination to the other presentation mode. The lack of any decrements in accuracy suggests that real objects and holograms were perceived as equivalent in both species and shows the general appropriateness of holograms as stimuli in visual tasks. A follow-up experiment involving the presentation of novel views of the training objects and holograms revealed some interspecies differences in rotational invariance, thereby confirming and extending the results of previous studies. Taken together, these results suggest that holograms may not only provide a promising tool for investigating yet unexplored issues, but their use may also lead to novel insights into some crucial aspects of comparative visual perception and categorization.
The prior statistics of object colors.
Koenderink, Jan J
2010-02-01
The prior statistics of object colors is of much interest because extensive statistical investigations of reflectance spectra reveal highly non-uniform structure in color space common to several very different databases. This common structure is due to the visual system rather than to the statistics of environmental structure. Analysis involves an investigation of the proper sample space of spectral reflectance factors and of the statistical consequences of the projection of spectral reflectances on the color solid. Even in the case of reflectance statistics that are translationally invariant with respect to the wavelength dimension, the statistics of object colors is highly non-uniform. The qualitative nature of this non-uniformity is due to trichromacy.
Digital implementation of a neural network for imaging
NASA Astrophysics Data System (ADS)
Wood, Richard; McGlashan, Alex; Yatulis, Jay; Mascher, Peter; Bruce, Ian
2012-10-01
This paper outlines the design and testing of a digital imaging system that utilizes an artificial neural network with unsupervised and supervised learning to convert streaming (real-time) input image space into parameter space. The primary objective of this work is to investigate the effectiveness of using a neural network to significantly reduce the information density of streaming images so that objects can be readily identified by a limited set of primary parameters and act as an enhanced human machine interface (HMI). Many applications are envisioned, including biomedical imaging, anomaly detection, and assistive devices for the visually impaired. A digital circuit was designed and tested using a Field Programmable Gate Array (FPGA) and an off-the-shelf digital camera. Our results indicate that the networks can be readily trained when subject to limited sets of objects such as the alphabet. We can also separate limited object sets with rotational and positional invariance. The results also show that limited visual fields form with only local connectivity.
Adaptive particle filter for robust visual tracking
NASA Astrophysics Data System (ADS)
Dai, Jianghua; Yu, Shengsheng; Sun, Weiping; Chen, Xiaoping; Xiang, Jinhai
2009-10-01
Object tracking plays a key role in the field of computer vision. Particle filters have been widely used for visual tracking under nonlinear and/or non-Gaussian circumstances. In a standard particle filter, the state transition model for predicting the next location of the tracked object assumes that the object's motion is constant, which poorly approximates varying motion dynamics. In addition, the state estimate calculated as the mean of all the weighted particles is coarse or inaccurate due to various noise disturbances. Both factors may degrade tracking performance greatly. In this work, an adaptive particle filter (APF) with a velocity-updating based transition model (VTM) and an adaptive state estimate approach (ASEA) is proposed to improve object tracking. In the APF, the motion velocity embedded in the state transition model is updated continuously by a recursive equation, and the state estimate is obtained adaptively according to the state posterior distribution. The experimental results show that the APF can increase tracking accuracy and efficiency in complex environments.
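A one-dimensional sketch of the two components may help; the smoothing factor, noise scales, Gaussian observation likelihood, and top-particle selection rule below are assumptions standing in for the paper's VTM and ASEA details.

```python
import numpy as np

rng = np.random.default_rng(2)
n_particles, alpha, sigma = 200, 0.7, 1.5
particles = rng.normal(loc=100.0, scale=2.0, size=n_particles)  # 1-D positions
velocity, prev_estimate = 0.0, 100.0

def step(observation, particles, velocity, prev_estimate):
    # 1. Predict: propagate particles with the current velocity estimate.
    particles = particles + velocity + rng.normal(0, sigma, particles.size)
    # 2. Weight by an (assumed Gaussian) observation likelihood.
    w = np.exp(-0.5 * ((particles - observation) / 3.0) ** 2)
    w /= w.sum()
    # 3. Adaptive estimate: weighted mean of the highest-weight particles only.
    top = w >= 0.5 * w.max()
    estimate = np.average(particles[top], weights=w[top])
    # 4. Update the velocity recursively from successive estimates.
    velocity = alpha * velocity + (1 - alpha) * (estimate - prev_estimate)
    # 5. Resample.
    particles = rng.choice(particles, size=particles.size, p=w)
    return particles, velocity, estimate

for t, obs in enumerate([103, 106, 110, 113, 117]):   # an accelerating object
    particles, velocity, prev_estimate = step(obs, particles, velocity, prev_estimate)
    print(f"t={t}: estimate={prev_estimate:.1f}, velocity={velocity:.2f}")
```

The printed velocity ramps up as the simulated object accelerates, which is precisely the behavior a fixed-motion transition model cannot capture.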
Multi-class geospatial object detection and geographic image classification based on collection of part detectors
NASA Astrophysics Data System (ADS)
Cheng, Gong; Han, Junwei; Zhou, Peicheng; Guo, Lei
2014-12-01
The rapid development of remote sensing technology has facilitated the acquisition of remote sensing images with higher and higher spatial resolution, but how to automatically understand the image contents remains a big challenge. In this paper, we develop a practical and rotation-invariant framework for multi-class geospatial object detection and geographic image classification based on a collection of part detectors (COPD). The COPD is composed of a set of representative and discriminative part detectors, where each part detector is a linear support vector machine (SVM) classifier used for the detection of objects or recurring spatial patterns within a certain range of orientation. Specifically, when performing multi-class geospatial object detection, we learn a set of seed-based part detectors where each part detector corresponds to a particular viewpoint of an object class, so the collection of them provides a solution for rotation-invariant detection of multi-class objects. When performing geographic image classification, we utilize a large number of pre-trained part detectors to discover distinctive visual parts from images and use them as attributes to represent the images. Comprehensive evaluations on two remote sensing image databases and comparisons with some state-of-the-art approaches demonstrate the effectiveness and superiority of the developed framework.
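Stripped to its core, the rotation-invariant detection step scores a candidate feature vector under every orientation bin and keeps the best. A sketch, with random weights standing in for trained linear SVM part detectors:

```python
import numpy as np

rng = np.random.default_rng(3)
n_orientations, dim = 12, 256               # e.g., 30-degree bins, HOG-like features
W = rng.normal(size=(n_orientations, dim))  # one linear part detector per bin
b = rng.normal(size=n_orientations)

def detect(feature_vec, threshold=0.0):
    scores = W @ feature_vec + b            # score under every orientation bin
    best = int(np.argmax(scores))
    return scores[best] > threshold, best   # detection flag + estimated orientation

hit, ori_bin = detect(rng.normal(size=dim))
print(hit, ori_bin)
```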
Getting a grip: different actions and visual guidance of the thumb and finger in precision grasping.
Melmoth, Dean R; Grant, Simon
2012-10-01
We manipulated the visual information available for grasping to examine what is visually guided when subjects get a precision grip on a common class of object (upright cylinders). In Experiment 1, objects (2 sizes) were placed at different eccentricities to vary the relative proximity to the participant's (n = 6) body of their thumb and finger contact positions in the final grip orientations, with vision available throughout or only for movement programming. Thumb trajectories were straighter and less variable than finger paths, and the thumb normally made initial contact with the objects at a relatively invariant landing site, but consistent thumb first-contacts were disrupted without visual guidance. Finger deviations were more affected by the object's properties and increased when vision was unavailable after movement onset. In Experiment 2, participants (n = 12) wearing different luminous gloves grasped 'glow-in-the-dark' objects; the gloves left the whole hand visible or selectively occluded the thumb or the index finger. Grip closure times were prolonged and thumb first-contacts disrupted when subjects could not see their thumb, whereas occluding the finger resulted in wider grips at contact because this digit remained distant from the object. Results were together consistent with visual feedback guiding the thumb in the period just prior to contacting the object, with the finger more involved in opening the grip and avoiding collision with the opposite contact surface. As people can overtly fixate only one object contact point at a time, we suggest that selecting one digit for online guidance represents an optimal strategy for initial grip placement. Other grasping tasks, in which the finger appears to be used for this purpose, are discussed.
Obtaining information by dynamic (effortful) touching
Turvey, M. T.; Carello, Claudia
2011-01-01
Dynamic touching is effortful touching. It entails deformation of muscles and fascia and activation of the embedded mechanoreceptors, as when an object is supported and moved by the body. It is realized as exploratory activities that can vary widely in spatial and temporal extents (a momentary heft, an extended walk). Research has revealed the potential of dynamic touching for obtaining non-visual information about the body (e.g. limb orientation), attachments to the body (e.g. an object's height and width) and the relation of the body both to attachments (e.g. hand's location on a grasped object) and surrounding surfaces (e.g. places and their distances). Invariants over the exploratory activity (e.g. moments of a wielded object's mass distribution) seem to ground this ‘information about’. The conception of a haptic medium as a nested tensegrity structure has been proposed to express the obtained information realized by myofascia deformation, by its invariants and transformations. The tensegrity proposal rationalizes the relative indifference of dynamic touch to the site of mechanical contact (hand, foot, torso or probe) and the overtness of exploratory activity. It also provides a framework for dynamic touching's fractal nature, and the finding that its degree of fractality may matter to its accomplishments. PMID:21969694
Effect of silhouetting and inversion on view invariance in the monkey inferotemporal cortex
Ratan Murty, N Apurva; Arun, S P
2017-01-01
We effortlessly recognize objects across changes in viewpoint, but we know relatively little about the features that underlie viewpoint invariance in the brain. Here, we set out to characterize how viewpoint invariance in monkey inferior temporal (IT) neurons is influenced by two image manipulations—silhouetting and inversion. Reducing an object into its silhouette removes internal detail, so this would reveal how much viewpoint invariance depends on the external contours. Inverting an object retains but rearranges features, so this would reveal how much viewpoint invariance depends on the arrangement and orientation of features. Our main findings are 1) view invariance is weakened by silhouetting but not by inversion; 2) view invariance was stronger in neurons that generalized across silhouetting and inversion; 3) neuronal responses to natural objects matched early with that of silhouettes and only later to that of inverted objects, indicative of coarse-to-fine processing; and 4) the impact of silhouetting and inversion depended on object structure. Taken together, our results elucidate the underlying features and dynamics of view-invariant object representations in the brain. NEW & NOTEWORTHY We easily recognize objects across changes in viewpoint, but the underlying features are unknown. Here, we show that view invariance in the monkey inferotemporal cortex is driven mainly by external object contours and is not specialized for object orientation. We also find that the responses to natural objects match with that of their silhouettes early in the response, and with inverted versions later in the response—indicative of a coarse-to-fine processing sequence in the brain. PMID:28381484
Comparing the minimum spatial-frequency content for recognizing Chinese and alphabet characters
Wang, Hui; Legge, Gordon E.
2018-01-01
Visual blur is a common problem that causes difficulty in pattern recognition for normally sighted people under degraded viewing conditions (e.g., near the acuity limit, when defocused, or in fog) and also for people with impaired vision. For reliable identification, the spatial frequency content of an object needs to extend up to or exceed a minimum value in units of cycles per object, referred to as the critical spatial frequency. In this study, we investigated the critical spatial frequency for alphabet and Chinese characters, and examined the effect of pattern complexity. The stimuli were divided into seven categories based on their perimetric complexity, including the lowercase and uppercase alphabet letters, and five groups of Chinese characters. We found that the critical spatial frequency significantly increased with complexity, from 1.01 cycles per character for the simplest group to 2.00 cycles per character for the most complex group of Chinese characters. A second goal of the study was to test a space-bandwidth invariance hypothesis that would represent a tradeoff between the critical spatial frequency and the number of adjacent patterns that can be recognized at one time. We tested this hypothesis by comparing the critical spatial frequencies in cycles per character from the current study and visual-span sizes in number of characters (measured by Wang, He, & Legge, 2014) for sets of characters with different complexities. For the character size (1.2°) we used in the study, we found an invariant product of approximately 10 cycles, which may represent a capacity limitation on visual pattern recognition. PMID:29297056
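The space-bandwidth invariance hypothesis reduces to a one-line product per character set. The visual-span sizes below are the approximate values implied by the reported ~10-cycle product, not the measured data of Wang, He, and Legge (2014):

```python
# critical spatial frequency (cycles/character) x visual span (characters)
pairs = [
    ("lowercase letters",          1.01, 10),  # simple: low CSF, wide span
    ("complex Chinese characters", 2.00,  5),  # complex: high CSF, narrow span
]
for name, csf, span in pairs:
    print(f"{name}: {csf:.2f} c/char x {span} chars = {csf * span:.1f} cycles")
```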
A novel false color mapping model-based fusion method of visual and infrared images
NASA Astrophysics Data System (ADS)
Qi, Bin; Gao, Kun; Tian, Yue-xin; Zhu, Zhen-yu
2013-12-01
A fast and efficient image fusion method is presented to generate near-natural colors from panchromatic visual and thermal imaging sensors. Firstly, a set of daytime color reference images is analyzed and a false color mapping principle is proposed according to human visual and emotional habits: object colors should remain invariant after color mapping operations, differences between infrared and visual images should be enhanced, and the background color should be consistent with the main scene content. Then a novel nonlinear color mapping model is given by introducing the geometric mean of the gray levels of the input visual and infrared images together with a weighted average. To determine the control parameters in the mapping model, boundary conditions are listed according to the mapping principle above. Fusion experiments show that the new method achieves a near-natural appearance of the fused image, and enhances color contrasts and highlights infrared-bright objects compared with the traditional TNO algorithm. Moreover, it has low computational complexity and is amenable to real-time processing, so it is quite suitable for nighttime imaging apparatus.
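A rough sketch of such a mapping model follows; the channel assignments, weight, and red-blue contrast push are illustrative assumptions, since the paper derives its control parameters from the stated boundary conditions.

```python
import numpy as np

def fuse(vis, ir, w=0.6):
    """vis, ir: float arrays in [0, 1]; returns an RGB image."""
    g = np.sqrt(vis * ir)                      # geometric mean: shared scene content
    avg = w * vis + (1 - w) * ir               # weighted average: overall brightness
    r = np.clip(avg + 0.5 * (ir - vis), 0, 1)  # push IR-hot objects toward red
    b = np.clip(avg - 0.5 * (ir - vis), 0, 1)  # and IR-cold regions toward blue
    return np.stack([r, g, b], axis=-1)

vis, ir = np.random.rand(4, 4), np.random.rand(4, 4)
print(fuse(vis, ir).shape)   # (4, 4, 3)
```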
NASA Astrophysics Data System (ADS)
Fiorini, Rodolfo A.; Dacquino, Gianfranco
2005-03-01
GEOGINE (GEOmetrical enGINE), a state-of-the-art OMG (Ontological Model Generator) based on n-D Tensor Invariants for n-dimensional shape/texture optimal synthetic representation, description and learning, was presented at previous conferences. Here, improved computational algorithms based on the computational invariant theory of finite groups in Euclidean space, together with a demo application, are presented. Progressive automatic model generation is discussed. GEOGINE can be used as an efficient computational kernel for fast, reliable application development and delivery, mainly in advanced biomedical engineering, biometrics, intelligent computing, target recognition, content-based image retrieval and data mining. An ontology can be regarded as a logical theory accounting for the intended meaning of a formal dictionary, i.e., its ontological commitment to a particular conceptualization of the world. According to this approach, "n-D Tensor Calculus" can be considered a "Formal Language" to reliably compute optimized "n-Dimensional Tensor Invariants" as specific object "invariant parameter and attribute words" for automated n-dimensional shape/texture optimal synthetic object description by incremental model generation. The class of those "invariant parameter and attribute words" can be thought of as a specific "Formal Vocabulary" learned from a "Generalized Formal Dictionary" of the "Computational Tensor Invariants" language. Even object chromatic attributes can be effectively and reliably computed from object geometric parameters into robust colour shape invariant characteristics. Any highly sophisticated application needing effective, robust object geometric/colour invariant attribute capture and parameterization, for reliable automated object learning and discrimination, can benefit from the GEOGINE progressive automated model generation computational kernel. Main operational advantages over previous, similar approaches are: 1) Progressive Automated Invariant Model Generation, 2) Invariant Minimal Complete Description Set for computational efficiency, 3) Arbitrary Model Precision for robust object description and identification.
Binary optical filters for scale invariant pattern recognition
NASA Technical Reports Server (NTRS)
Reid, Max B.; Downie, John D.; Hine, Butler P.
1992-01-01
Binary synthetic discriminant function (BSDF) optical filters which are invariant to scale changes in the target object of more than 50 percent are demonstrated in simulation and experiment. Efficient databases of scale invariant BSDF filters can be designed which discriminate between two very similar objects at any view scaled over a factor of 2 or more. The BSDF technique has considerable advantages over other methods for achieving scale invariant object recognition, as it also allows determination of the object's scale. In addition to scale, the technique can be used to design recognition systems invariant to other geometric distortions.
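The recognition step of such a correlator is easy to simulate digitally: binarize the filter, correlate via the FFT, and look for a strong peak. The filter below is a random placeholder for a trained BSDF.

```python
import numpy as np

def correlate(scene, filt):
    S = np.fft.fft2(scene)
    H = np.fft.fft2(filt, s=scene.shape)     # zero-pad filter to scene size
    return np.abs(np.fft.ifft2(S * np.conj(H)))

scene = np.random.rand(64, 64)
bsdf = np.sign(np.random.randn(32, 32))      # binary (+1/-1) filter values
print(f"peak correlation: {correlate(scene, bsdf).max():.1f}")
```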
Pauls, Franz; Petermann, Franz; Lepach, Anja Christina
2013-01-01
Between-group comparisons are permissible and meaningfully interpretable only if diagnostic instruments are proven to measure the same latent dimensions across different groups. Addressing this issue, the present study was carried out to provide a rigorous test of measurement invariance. Confirmatory factor analyses were used to determine which model solution could best explain memory performance as measured by the Wechsler Memory Scale-Fourth Edition (WMS-IV) in a clinical depression sample and in healthy controls. Multigroup confirmatory factor analysis was conducted to evaluate the evidence for measurement invariance. A three-factor model solution including the dimensions of auditory memory, visual memory, and visual working memory was identified as fitting the data best in both samples, and measurement invariance was partially satisfied. The results supported the clinical utility of the WMS-IV: auditory and visual memory performances of patients with depressive disorders are interpretable on the basis of the WMS-IV standardization data. However, possible differences in visual working memory functions between healthy and depressed individuals could restrict comparisons of the WMS-IV working memory index.
2015-09-02
In this project, we hypothesized that visual memory of past motion trajectories may be used for selecting future behavior.
Interpretation of the function of the striate cortex
NASA Astrophysics Data System (ADS)
Garner, Bernardette M.; Paplinski, Andrew P.
2000-04-01
Biological neural networks do not require retraining every time objects move in the visual field. Conventional computer neural networks do not share this shift-invariance. The brain compensates for movements of the head, body, eyes and objects by allowing the sensory data to be tracked across the visual field. The neurons in the striate cortex respond to objects moving across the field of vision, as seen in many experiments. It is proposed that the neurons in the striate cortex allow the continuous angle changes needed to compensate for changes in orientation of the head and eyes and for the motion of objects in the field of vision. It is hypothesized that the neurons in the striate cortex form a system that allows for the translation, some rotation and scaling of objects and provides a continuity of objects as they move relative to other objects. The neurons in the striate cortex respond to features which are fundamental to sight, such as orientation of lines, direction of motion, color and contrast. The neurons that respond to these features are arranged on the cortex in a way that depends on the features they respond to and on the area of the retina from which they receive their inputs.
"What" and "where" in word reading: ventral coding of written words revealed by parietal atrophy.
Vinckier, Fabien; Naccache, Lionel; Papeix, Caroline; Forget, Joaquim; Hahn-Barma, Valerie; Dehaene, Stanislas; Cohen, Laurent
2006-12-01
The visual system of literate adults develops a remarkable perceptual expertise for printed words. To delineate the aspects of this competence intrinsic to the occipitotemporal "what" pathway, we studied a patient with bilateral lesions of the occipitoparietal "where" pathway. Depending on critical geometric features of the display (rotation angle, letter spacing, mirror reversal, etc.), she switched from a good performance, when her intact ventral pathway was sufficient to encode words, to severely impaired reading, when her parietal lesions prevented the use of alternative reading strategies as a result of spatial and attentional impairments. In particular, reading was disrupted (a) by rotating words by more than 50 degrees, providing an approximation of the invariance range for word encoding in the ventral pathway; (b) by separating letters with double spaces, revealing the limits of letter grouping into perceptual wholes; (c) by mirror-reversing words, showing that words escape the default mirror-invariant representation of visual objects in the ventral pathway. Moreover, because of her parietal lesions, she was unable to discriminate mirror images of common objects, although she was excellent with reversible pseudowords, confirming that the breaking of mirror symmetry was intrinsic to the occipitotemporal cortex. Thus, charting the display conditions associated with preserved or impaired performance allowed us to infer properties of word coding in the normal ventral pathway and to delineate the roles of the parietal lobes in single-word recognition.
What can fish brains tell us about visual perception?
Rosa Salva, Orsola; Sovrano, Valeria Anna; Vallortigara, Giorgio
2014-01-01
Fish are a complex taxonomic group, whose diversity and distance from other vertebrates well suits the comparative investigation of brain and behavior: in fish species we observe substantial differences with respect to the telencephalic organization of other vertebrates and an astonishing variety in the development and complexity of pallial structures. We will concentrate on the contribution of research on fish behavioral biology for the understanding of the evolution of the visual system. We shall review evidence concerning perceptual effects that reflect fundamental principles of the visual system functioning, highlighting the similarities and differences between distant fish groups and with other vertebrates. We will focus on perceptual effects reflecting some of the main tasks that the visual system must attain. In particular, we will deal with subjective contours and optical illusions, invariance effects, second order motion and biological motion and, finally, perceptual binding of object properties in a unified higher level representation. PMID:25324728
Mitchnick, Krista A; Wideman, Cassidy E; Huff, Andrew E; Palmer, Daniel; McNaughton, Bruce L; Winters, Boyer D
2018-05-15
The capacity to recognize objects from different viewpoints or angles, referred to as view-invariance, is an essential process that humans engage in daily. Currently, the ability to investigate the neurobiological underpinnings of this phenomenon is limited, as few ethologically valid view-invariant object recognition tasks exist for rodents. Here, we report two complementary, novel view-invariant object recognition tasks in which rodents physically interact with three-dimensional objects. Prior to experimentation, rats and mice were given extensive experience with a set of 'pre-exposure' objects. In a variant of the spontaneous object recognition task, novelty preference for pre-exposed or new objects was assessed at various angles of rotation (45°, 90° or 180°); unlike control rodents, for whom the objects were novel, rats and mice tested with pre-exposed objects did not discriminate between rotated and un-rotated objects in the choice phase, indicating substantial view-invariant object recognition. Secondly, using automated operant touchscreen chambers, rats were tested on pre-exposed or novel objects in a pairwise discrimination task, where the rewarded stimulus (S+) was rotated (180°) once rats had reached acquisition criterion; rats tested with pre-exposed objects re-acquired the pairwise discrimination following S+ rotation more effectively than those tested with new objects. Systemic scopolamine impaired performance on both tasks, suggesting involvement of acetylcholine at muscarinic receptors in view-invariant object processing. These tasks present novel means of studying the behavioral and neural bases of view-invariant object recognition in rodents. Copyright © 2018 Elsevier B.V. All rights reserved.
Gharat, Amol; Baker, Curtis L
2017-01-25
Many of the neurons in early visual cortex are selective for the orientation of boundaries defined by first-order cues (luminance) as well as second-order cues (contrast, texture). The neural circuit mechanism underlying this selectivity is still unclear, but some studies have proposed that it emerges from spatial nonlinearities of subcortical Y cells. To understand how inputs from the Y-cell pathway might be pooled to generate cue-invariant receptive fields, we recorded visual responses from single neurons in cat Area 18 using linear multielectrode arrays. We measured responses to drifting and contrast-reversing luminance gratings as well as contrast modulation gratings. We found that a large fraction of these neurons have nonoriented responses to gratings, similar to those of subcortical Y cells: they respond at the second harmonic (F2) to high-spatial frequency contrast-reversing gratings and at the first harmonic (F1) to low-spatial frequency drifting gratings ("Y-cell signature"). For a given neuron, spatial frequency tuning for linear (F1) and nonlinear (F2) responses is quite distinct, similar to orientation-selective cue-invariant neurons. Also, these neurons respond to contrast modulation gratings with selectivity for the carrier (texture) spatial frequency and, in some cases, orientation. Their receptive field properties suggest that they could serve as building blocks for orientation-selective cue-invariant neurons. We propose a circuit model that combines ON- and OFF-center cortical Y-like cells in an unbalanced push-pull manner to generate orientation-selective, cue-invariant receptive fields. A significant fraction of neurons in early visual cortex have specialized receptive fields that allow them to selectively respond to the orientation of boundaries that are invariant to the cue (luminance, contrast, texture, motion) that defines them. However, the neural mechanism to construct such versatile receptive fields remains unclear. Using multielectrode recording, we found a large fraction of neurons in early visual cortex with receptive fields not selective for orientation that have spatial nonlinearities like those of subcortical Y cells. These are strong candidates for building cue-invariant orientation-selective neurons; we present a neural circuit model that pools such neurons in an imbalanced "push-pull" manner, to generate orientation-selective cue-invariant receptive fields. Copyright © 2017 the authors 0270-6474/17/370998-16$15.00/0.
Neural-Network Object-Recognition Program
NASA Technical Reports Server (NTRS)
Spirkovska, L.; Reid, M. B.
1993-01-01
HONTIOR computer program implements third-order neural network exhibiting invariance under translation, change of scale, and in-plane rotation. Invariance incorporated directly into architecture of network. Only one view of each object needed to train network for two-dimensional-translation-invariant recognition of object. Also used for three-dimensional-transformation-invariant recognition by training network on only set of out-of-plane rotated views. Written in C language.
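The invariance-by-architecture idea can be sketched compactly: every triple of 'on' pixels contributes a shared weight indexed only by the sorted interior angles of the triangle it forms, and those angles are unchanged by translation, scaling, and in-plane rotation. The binning and weight-sharing scheme below is an illustrative reconstruction, not the HONTIOR source.

```python
import numpy as np
from itertools import combinations

def angle_signature(p1, p2, p3, bins=18):
    # Sorted, binned interior angles of the triangle: a transform-invariant index.
    pts = np.array([p1, p2, p3], dtype=float)
    angles = []
    for i in range(3):
        u = pts[(i + 1) % 3] - pts[i]
        v = pts[(i + 2) % 3] - pts[i]
        c = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
        angles.append(np.degrees(np.arccos(np.clip(c, -1.0, 1.0))))
    return tuple(sorted(int(a // (180 / bins)) for a in angles))

def third_order_response(image, weights):
    # Sum of the shared weight over all 'on' pixel triples; `weights` maps
    # angle signatures to values (one weight per equivalence class).
    on = list(zip(*np.nonzero(image)))
    return sum(weights.get(angle_signature(*t), 0.0) for t in combinations(on, 3))
```

Two images differing only by translation, scale, or in-plane rotation yield (up to pixel quantization) the same multiset of angle signatures, hence the same response.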
Comparison of Object Recognition Behavior in Human and Monkey
Rajalingham, Rishi; Schmidt, Kailyn
2015-01-01
Although the rhesus monkey is used widely as an animal model of human visual processing, it is not known whether invariant visual object recognition behavior is quantitatively comparable across monkeys and humans. To address this question, we systematically compared the core object recognition behavior of two monkeys with that of human subjects. To test true object recognition behavior (rather than image matching), we generated several thousand naturalistic synthetic images of 24 basic-level objects with high variation in viewing parameters and image background. Monkeys were trained to perform binary object recognition tasks on a match-to-sample paradigm. Data from 605 human subjects performing the same tasks on Mechanical Turk were aggregated to characterize “pooled human” object recognition behavior, as well as 33 separate Mechanical Turk subjects to characterize individual human subject behavior. Our results show that monkeys learn each new object in a few days, after which they not only match mean human performance but show a pattern of object confusion that is highly correlated with pooled human confusion patterns and is statistically indistinguishable from individual human subjects. Importantly, this shared human and monkey pattern of 3D object confusion is not shared with low-level visual representations (pixels, V1+; models of the retina and primary visual cortex) but is shared with a state-of-the-art computer vision feature representation. Together, these results are consistent with the hypothesis that rhesus monkeys and humans share a common neural shape representation that directly supports object perception. SIGNIFICANCE STATEMENT To date, several mammalian species have shown promise as animal models for studying the neural mechanisms underlying high-level visual processing in humans. In light of this diversity, making tight comparisons between nonhuman and human primates is particularly critical in determining the best use of nonhuman primates to further the goal of the field of translating knowledge gained from animal models to humans. To the best of our knowledge, this study is the first systematic attempt at comparing a high-level visual behavior of humans and macaque monkeys. PMID:26338324
Robust image features: concentric contrasting circles and their image extraction
NASA Astrophysics Data System (ADS)
Gatrell, Lance B.; Hoff, William A.; Sklair, Cheryl W.
1992-03-01
Many computer vision tasks can be simplified if special image features are placed on the objects to be recognized. A review of special image features that have been used in the past is given and then a new image feature, the concentric contrasting circle, is presented. The concentric contrasting circle image feature has the advantages of being easily manufactured, easily extracted from the image, robust extraction (true targets are found, while few false targets are found), it is a passive feature, and its centroid is completely invariant to the three translational and one rotational degrees of freedom and nearly invariant to the remaining two rotational degrees of freedom. There are several examples of existing parallel implementations which perform most of the extraction work. Extraction robustness was measured by recording the probability of correct detection and the false alarm rate in a set of images of scenes containing mockups of satellites, fluid couplings, and electrical components. A typical application of concentric contrasting circle features is to place them on modeled objects for monocular pose estimation or object identification. This feature is demonstrated on a visually challenging background of a specular but wrinkled surface similar to a multilayered insulation spacecraft thermal blanket.
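Extraction reduces to finding a dark blob whose centroid coincides with the centroid of the surrounding bright blob. A sketch using scipy.ndimage, with assumed intensity thresholds:

```python
import numpy as np
from scipy import ndimage

def find_targets(gray, tol=1.5):
    dark, bright = gray < 0.3, gray > 0.7       # inner-disc / outer-ring candidates
    dl, dn = ndimage.label(dark)
    bl, bn = ndimage.label(bright)
    d_cent = ndimage.center_of_mass(dark, dl, range(1, dn + 1))
    b_cent = ndimage.center_of_mass(bright, bl, range(1, bn + 1))
    return [dc for dc in d_cent for bc in b_cent
            if np.hypot(dc[0] - bc[0], dc[1] - bc[1]) < tol]  # coincident centroids

yy, xx = np.mgrid[:64, :64]
r = np.hypot(yy - 32, xx - 32)
img = np.where(r < 6, 0.0, np.where(r < 14, 1.0, 0.5))  # dark disc in a bright ring
print(find_targets(img))   # approximately [(32.0, 32.0)]
```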
The Occipital Face Area Is Causally Involved in Facial Viewpoint Perception
Kietzmann, Tim C.; Poltoratski, Sonia; König, Peter; Blake, Randolph; Tong, Frank; Ling, Sam
2015-01-01
Humans reliably recognize faces across a range of viewpoints, but the neural substrates supporting this ability remain unclear. Recent work suggests that neural selectivity to mirror-symmetric viewpoints of faces, found across a large network of visual areas, may constitute a key computational step in achieving full viewpoint invariance. In this study, we used repetitive transcranial magnetic stimulation (rTMS) to test the hypothesis that the occipital face area (OFA), putatively a key node in the face network, plays a causal role in face viewpoint symmetry perception. Each participant underwent both offline rTMS to the right OFA and sham stimulation, preceding blocks of behavioral trials. After each stimulation period, the participant performed one of two behavioral tasks involving presentation of faces in the peripheral visual field: (1) judging the viewpoint symmetry; or (2) judging the angular rotation. rTMS applied to the right OFA significantly impaired performance in both tasks when stimuli were presented in the contralateral, left visual field. Interestingly, however, rTMS had a differential effect on the two tasks performed ipsilaterally. Although viewpoint symmetry judgments were significantly disrupted, we observed no effect on the angle judgment task. This interaction, caused by ipsilateral rTMS, provides support for models emphasizing the role of interhemispheric crosstalk in the formation of viewpoint-invariant face perception. SIGNIFICANCE STATEMENT Faces are among the most salient objects we encounter during our everyday activities. Moreover, we are remarkably adept at identifying people at a glance, despite the diversity of viewpoints during our social encounters. Here, we investigate the cortical mechanisms underlying this ability by focusing on effects of viewpoint symmetry, i.e., the invariance of neural responses to mirror-symmetric facial viewpoints. We did this by temporarily disrupting neural processing in the occipital face area (OFA) using transcranial magnetic stimulation. Our results demonstrate that the OFA causally contributes to judgments of facial viewpoints and suggest that effects of viewpoint symmetry, previously observed using fMRI, arise from an interhemispheric integration of visual information even when only one hemisphere receives direct visual stimulation. PMID:26674865
First-Pass Processing of Value Cues in the Ventral Visual Pathway.
Sasikumar, Dennis; Emeric, Erik; Stuphorn, Veit; Connor, Charles E
2018-02-19
Real-world value often depends on subtle, continuously variable visual cues specific to particular object categories, like the tailoring of a suit, the condition of an automobile, or the construction of a house. Here, we used microelectrode recording in behaving monkeys to test two possible mechanisms for category-specific value-cue processing: (1) previous findings suggest that prefrontal cortex (PFC) identifies object categories, and based on category identity, PFC could use top-down attentional modulation to enhance visual processing of category-specific value cues, providing signals to PFC for calculating value, and (2) a faster mechanism would be first-pass visual processing of category-specific value cues, immediately providing the necessary visual information to PFC. This, however, would require learned mechanisms for processing the appropriate cues in a given object category. To test these hypotheses, we trained monkeys to discriminate value in four letter-like stimulus categories. Each category had a different, continuously variable shape cue that signified value (liquid reward amount) as well as other cues that were irrelevant. Monkeys chose between stimuli of different reward values. Consistent with the first-pass hypothesis, we found early signals for category-specific value cues in area TE (the final stage in the monkey ventral visual pathway) beginning 81 ms after stimulus onset, essentially at the start of TE responses. Task-related activity emerged in lateral PFC approximately 40 ms later and consisted mainly of category-invariant value tuning. Our results show that, for familiar, behaviorally relevant object categories, high-level ventral pathway cortex can implement rapid, first-pass processing of category-specific value cues. Copyright © 2018 Elsevier Ltd. All rights reserved.
Krafnick, Anthony J.; Tan, Li-Hai; Flowers, D. Lynn; Luetje, Megan M.; Napoliello, Eileen M.; Siok, Wai-Ting; Perfetti, Charles; Eden, Guinevere F.
2016-01-01
Learning to read is thought to involve the recruitment of left hemisphere ventral occipitotemporal cortex (OTC) by a process of “neuronal recycling”, whereby object processing mechanisms are co-opted for reading. Under the same theoretical framework, it has been proposed that the visual word form area (VWFA) within the OTC processes orthographic stimuli independent of culture and writing systems, suggesting that it is universally involved in written language. However, this “script invariance” has yet to be demonstrated in monolingual readers of two different writing systems studied under the same experimental conditions. Here, using functional magnetic resonance imaging (fMRI), we examined activity in response to English Words and Chinese Characters in 1st graders in the United States and China, respectively. We examined each group separately and found the readers of English as well as the readers of Chinese to activate the left ventral OTC for their respective native writing systems (using both a whole-brain and a bilateral OTC-restricted analysis). Critically, a conjunction analysis of the two groups revealed significant overlap between them for native writing system processing, located in the VWFA and therefore supporting the hypothesis of script invariance. In the second part of the study, we further examined the left OTC region responsive to each group’s native writing system and found it responded equally to Object stimuli (line drawings) in the Chinese-reading children. In English-reading children, the OTC responded much more to Objects than to English Words. Together, these results support the script invariant role of the VWFA and also support the idea that the areas recruited for character or word processing are rooted in object processing mechanisms of the left OTC. PMID:27012502
Dynamics of 3D view invariance in monkey inferotemporal cortex
Ratan Murty, N. Apurva
2015-01-01
Rotations in depth are challenging for object vision because features can appear, disappear, be stretched or compressed. Yet we easily recognize objects across views. Are the underlying representations view invariant or dependent? This question has been intensely debated in human vision, but the neuronal representations remain poorly understood. Here, we show that for naturalistic objects, neurons in the monkey inferotemporal (IT) cortex undergo a dynamic transition in time, whereby they are initially sensitive to viewpoint and later encode view-invariant object identity. This transition depended on two aspects of object structure: it was strongest when objects foreshortened strongly across views and were similar to each other. View invariance in IT neurons was present even when objects were reduced to silhouettes, suggesting that it can arise through similarity between external contours of objects across views. Our results elucidate the viewpoint debate by showing that view invariance arises dynamically in IT neurons out of a representation that is initially view dependent. PMID:25609108
Invariant Spatial Context Is Learned but Not Retrieved in Gaze-Contingent Tunnel-View Search
ERIC Educational Resources Information Center
Zang, Xuelian; Jia, Lina; Müller, Hermann J.; Shi, Zhuanghua
2015-01-01
Our visual brain is remarkable in extracting invariant properties from the noisy environment, guiding selection of where to look and what to identify. However, how the brain achieves this is still poorly understood. Here we explore interactions of local context and global structure in the long-term learning and retrieval of invariant display…
Yamashita, Wakayo; Wang, Gang; Tanaka, Keiji
2010-01-01
One usually fails to recognize an unfamiliar object across changes in viewing angle when it has to be discriminated from similar distractor objects. Previous work has demonstrated that after long-term experience in discriminating among a set of objects seen from the same viewing angle, immediate recognition of the objects across viewing-angle changes of 30-60 degrees becomes possible. The capability for view-invariant object recognition should develop during the within-viewing-angle discrimination, which includes two kinds of experience: seeing individual views and discriminating among the objects. The aim of the present study was to determine the relative contribution of each factor to the development of view-invariant object recognition capability. Monkeys were first extensively trained in a task that required view-invariant object recognition (Object task) with several sets of objects. The animals were then exposed to a new set of objects over 26 days in one of two preparatory tasks: one in which each object view was seen individually, and a second that required discrimination among the objects at each of four viewing angles. After the preparatory period, we measured the monkeys' ability to recognize the objects across changes in viewing angle by introducing the object set to the Object task. Results indicated significant view-invariant recognition after the second but not the first preparatory task. These results suggest that discrimination of objects from distractors at each of several viewing angles is required for the development of view-invariant recognition of the objects when the distractors are similar to the objects.
Reconstructing the Curve-Skeletons of 3D Shapes Using the Visual Hull.
Livesu, Marco; Guggeri, Fabio; Scateni, Riccardo
2012-11-01
Curve-skeletons are among the most important descriptors for shapes, capable of capturing the most relevant features in a synthetic manner. They are useful for many different applications: from shape matching and retrieval, to medical imaging, to animation. This has led, over the years, to the development of several different extraction techniques, each trying to comply with specific goals. We propose a novel technique which stems from the intuition of reproducing what a human being does to deduce the shape of an object: holding it in his or her hand and rotating it. To accomplish this, we use the formal definitions of epipolar geometry and the visual hull. We show how it is possible to infer the curve-skeleton of a broad class of 3D shapes, along with an estimation of the radii of the maximal inscribed balls, by gathering information about the medial axes of their projections on the image planes of the stereographic vision. It is worth pointing out that our method works indifferently on (even unoriented) polygonal meshes, voxel models, and point clouds. Moreover, it is insensitive to noise, pose-invariant, resolution-invariant, and robust when applied to incomplete data sets.
Online gesture spotting from visual hull data.
Peng, Bo; Qian, Gang
2011-06-01
This paper presents a robust framework for online full-body gesture spotting from visual hull data. Using view-invariant pose features as observations, hidden Markov models (HMMs) are trained for gesture spotting from continuous movement data streams. Two major contributions of this paper are 1) view-invariant pose feature extraction from visual hulls, and 2) a systematic approach to automatically detecting and modeling specific nongesture movement patterns and using their HMMs for outlier rejection in gesture spotting. The experimental results have shown the view-invariance property of the proposed pose features for both training poses and new poses unseen in training, as well as the efficacy of using specific nongesture models for outlier rejection. Using the IXMAS gesture data set, the proposed framework has been extensively tested and the gesture spotting results are superior to those reported on the same data set obtained using existing state-of-the-art gesture spotting methods.
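A sketch of the spotting rule: a window of pose features is accepted as a gesture only if the gesture HMM out-scores the nongesture model by some margin. Feature extraction from visual hulls is abstracted away, the margin-based decision rule is an assumption, and the hmmlearn package stands in for the paper's HMM machinery.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

def train_models(gesture_seqs, nongesture_seqs, n_states=4):
    # Each argument: list of (n_frames, n_features) pose-feature arrays.
    def fit(seqs):
        m = GaussianHMM(n_components=n_states, covariance_type="diag")
        m.fit(np.vstack(seqs), lengths=[len(s) for s in seqs])
        return m
    return fit(gesture_seqs), fit(nongesture_seqs)

def spot(window, gesture_hmm, nongesture_hmm, margin=0.0):
    # Accept only if the gesture model wins in per-frame log-likelihood.
    g = gesture_hmm.score(window) / len(window)
    n = nongesture_hmm.score(window) / len(window)
    return g - n > margin
```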
Object recognition with hierarchical discriminant saliency networks.
Han, Sunhyoung; Vasconcelos, Nuno
2014-01-01
The benefits of integrating attention and object recognition are investigated. While attention is frequently modeled as a pre-processor for recognition, we investigate the hypothesis that attention is an intrinsic component of recognition and vice-versa. This hypothesis is tested with a recognition model, the hierarchical discriminant saliency network (HDSN), whose layers are top-down saliency detectors, tuned for a visual class according to the principles of discriminant saliency. As a model of neural computation, the HDSN has two possible implementations. In a biologically plausible implementation, all layers comply with the standard neurophysiological model of visual cortex, with sub-layers of simple and complex units that implement a combination of filtering, divisive normalization, pooling, and non-linearities. In a convolutional neural network implementation, all layers are convolutional and implement a combination of filtering, rectification, and pooling. The rectification is performed with a parametric extension of the now popular rectified linear units (ReLUs), whose parameters can be tuned for the detection of target object classes. This enables a number of functional enhancements over neural network models that lack a connection to saliency, including optimal feature denoising mechanisms for recognition, modulation of saliency responses by the discriminant power of the underlying features, and the ability to detect both feature presence and absence. In either implementation, each layer has a precise statistical interpretation, and all parameters are tuned by statistical learning. Each saliency detection layer learns more discriminant saliency templates than its predecessors and higher layers have larger pooling fields. This enables the HDSN to simultaneously achieve high selectivity to target object classes and invariance. The performance of the network in saliency and object recognition tasks is compared to those of models from the biological and computer vision literatures. This demonstrates benefits for all the functional enhancements of the HDSN, the class tuning inherent to discriminant saliency, and saliency layers based on templates of increasing target selectivity and invariance. Altogether, these experiments suggest that there are non-trivial benefits in integrating attention and recognition.
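The parametric rectification mentioned above can be sketched as a shifted, scaled rectifier plus its mirror image for detecting feature absence; the parameter names and functional form are illustrative, not the HDSN's exact units.

```python
import numpy as np

def parametric_relu(x, theta=0.2, beta=1.5):
    # Responds to feature *presence* above a tunable threshold theta.
    return beta * np.maximum(x - theta, 0.0)

def absence_unit(x, theta=0.2, beta=1.5):
    # Mirror-image unit responding to feature *absence* below theta.
    return beta * np.maximum(theta - x, 0.0)
```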
New technologies lead to a new frontier: cognitive multiple data representation
NASA Astrophysics Data System (ADS)
Buffat, S.; Liege, F.; Plantier, J.; Roumes, C.
2005-05-01
The increasing number and complexity of operational sensors (radar, infrared, hyperspectral...) and the availability of huge amounts of data lead to more and more sophisticated information presentations. But one key element of the IMINT line cannot be improved beyond initial system specification: the operator. In order to overcome this issue, we have to better understand human visual object representation. Object recognition theories in human vision are balanced between matching of 2D template representations carrying viewpoint-dependent information and a viewpoint-invariant system based on structural descriptions. Spatial frequency content is relevant due to early vision filtering. Orientation in depth is an important variable for challenging object constancy. Three objects, seen from three different points of view in a natural environment, made up the original images in this study. Test images were a combination of spatial-frequency-filtered original images and an additive contrast level of white noise. In the first experiment, the observer's task was a same-versus-different forced choice with spatial alternative. Test images had the same noise level in a presentation row. Discrimination threshold was determined by modifying the white noise contrast level by means of an adaptive method. In the second experiment, a repetition blindness paradigm was used to further investigate the viewpoint effect on object recognition. The results shed some light on how the human visual system processes objects displayed under different physical descriptions. This is important because targets that do not always match the physical properties of usual visual stimuli can increase operational workload.
Deterministic object tracking using Gaussian ringlet and directional edge features
NASA Astrophysics Data System (ADS)
Krieger, Evan W.; Sidike, Paheding; Aspiras, Theus; Asari, Vijayan K.
2017-10-01
Challenges currently facing intensity-based histogram feature tracking methods in wide area motion imagery (WAMI) data include object structural information distortions, background variations, and object scale change. These issues are caused by different pavement or ground types and by changes in sensor or altitude. All of these challenges need to be overcome in order to have a robust object tracker while attaining a computation time appropriate for real-time processing. To achieve this, we present a novel method, the Directional Ringlet Intensity Feature Transform (DRIFT), which employs Kirsch kernel filtering for edge features and a ringlet feature mapping for rotational invariance. The method also includes an automatic scale change component to obtain accurate object boundaries and improvements for lowering computation times. We evaluated the DRIFT algorithm on two challenging WAMI datasets, namely Columbus Large Image Format (CLIF) and Large Area Image Recorder (LAIR), to evaluate its robustness and efficiency. Additional evaluations on general tracking video sequences were performed using the Visual Tracker Benchmark and Visual Object Tracking 2014 databases to demonstrate the algorithm's ability to handle the additional challenges of long complex sequences, including scale change. Experimental results show that the proposed approach yields competitive results compared to state-of-the-art object tracking methods on the testing datasets.
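The two named ingredients, Kirsch edge filtering and ring-shaped feature pooling, can be sketched as follows; the ring count, ring widths, and histogram bin count are illustrative choices rather than the published configuration.

```python
import numpy as np
from scipy import ndimage

def kirsch_edges(img):
    # Max response over the 8 directional Kirsch kernels (border weights
    # rolled clockwise around the 3x3 neighborhood).
    border = np.array([5, 5, 5, -3, -3, -3, -3, -3])
    idx = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    responses = []
    for s in range(8):
        k = np.zeros((3, 3))
        for (r, c), w in zip(idx, np.roll(border, s)):
            k[r, c] = w
        responses.append(ndimage.convolve(img, k))
    return np.max(responses, axis=0)

def ringlet_histogram(edge_map, n_rings=4, n_bins=16):
    # Histograms weighted by concentric Gaussian rings: rotating the patch
    # permutes pixels within each ring, leaving the descriptor ~unchanged.
    h, w = edge_map.shape
    yy, xx = np.mgrid[:h, :w]
    r = np.hypot(yy - (h - 1) / 2, xx - (w - 1) / 2)
    feats = []
    for k in range(n_rings):
        center = (k + 0.5) * r.max() / n_rings
        ring = np.exp(-0.5 * ((r - center) / (r.max() / (2 * n_rings))) ** 2)
        hist, _ = np.histogram(edge_map, bins=n_bins, weights=ring)
        feats.append(hist / (hist.sum() + 1e-9))
    return np.concatenate(feats)

patch = np.random.rand(32, 32)
print(ringlet_histogram(kirsch_edges(patch)).shape)   # (64,)
```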
Behaviorally Relevant Abstract Object Identity Representation in the Human Parietal Cortex
Jeong, Su Keun
2016-01-01
The representation of object identity is fundamental to human vision. Using fMRI and multivoxel pattern analysis, here we report the representation of highly abstract object identity information in human parietal cortex. Specifically, in superior intraparietal sulcus (IPS), a region previously shown to track visual short-term memory capacity, we found object identity representations for famous faces varying freely in viewpoint, hairstyle, facial expression, and age; and for well-known cars embedded in different scenes, and shown from different viewpoints and sizes. Critically, these parietal identity representations were behaviorally relevant as they closely tracked the perceived face-identity similarity obtained in a behavioral task. Meanwhile, the task-activated regions in prefrontal and parietal cortices (excluding superior IPS) did not exhibit such abstract object identity representations. Unlike previous studies, we also failed to observe identity representations in posterior ventral and lateral visual object-processing regions, likely due to the greater amount of identity abstraction demanded by our stimulus manipulation here. Our MRI slice coverage precluded us from examining identity representation in the anterior temporal lobe, a likely region for computing identity information in the ventral pathway. Overall, we show that human parietal cortex, part of the dorsal visual processing pathway, is capable of holding abstract and complex visual representations that are behaviorally relevant. These results argue against a “content-poor” view of the role of parietal cortex in attention. Instead, the human parietal cortex seems to be “content rich” and capable of directly participating in goal-driven visual information representation in the brain. SIGNIFICANCE STATEMENT The representation of object identity (including faces) is fundamental to human vision and shapes how we interact with the world. Although object representation has traditionally been associated with human occipital and temporal cortices, here we show, by measuring fMRI response patterns, that a region in the human parietal cortex can robustly represent task-relevant object identities. These representations are invariant to changes in a host of visual features, such as viewpoint, and reflect an abstract level of representation that has not previously been reported in the human parietal cortex. Critically, these neural representations are behaviorally relevant as they closely track the perceived object identities. Human parietal cortex thus participates in the moment-to-moment goal-directed visual information representation in the brain. PMID:26843642
Perception of biological motion from size-invariant body representations.
Lappe, Markus; Wittinghofer, Karin; de Lussanet, Marc H E
2015-01-01
The visual recognition of action is one of the socially most important and computationally demanding capacities of the human visual system. It combines visual shape recognition with complex non-rigid motion perception. Action presented as a point-light animation is a striking visual experience for anyone who sees it for the first time. Information about the shape and posture of the human body is sparse in point-light animations, but it is essential for action recognition. In the posturo-temporal filter model of biological motion perception, posture information is picked up by visual neurons tuned to the form of the human body before body motion is calculated. We tested whether point-light stimuli are processed through posture recognition of the human body form by using a typical feature of form recognition, namely size invariance. We constructed a point-light stimulus that can only be perceived through a size-invariant mechanism. This stimulus changes rapidly in size from one image to the next. It thus disrupts continuity of early visuo-spatial properties but maintains continuity of the body posture representation. Despite this massive manipulation at the visuo-spatial level, size-changing point-light figures are spontaneously recognized by naive observers, and support discrimination of human body motion.
Integration trumps selection in object recognition.
Saarela, Toni P; Landy, Michael S
2015-03-30
Finding and recognizing objects is a fundamental task of vision. Objects can be defined by several "cues" (color, luminance, texture, etc.), and humans can integrate sensory cues to improve detection and recognition [1-3]. Cortical mechanisms fuse information from multiple cues [4], and shape-selective neural mechanisms can display cue invariance by responding to a given shape independent of the visual cue defining it [5-8]. Selective attention, in contrast, improves recognition by isolating a subset of the visual information [9]. Humans can select single features (red or vertical) within a perceptual dimension (color or orientation), giving faster and more accurate responses to items having the attended feature [10, 11]. Attention elevates neural responses and sharpens neural tuning to the attended feature, as shown by studies in psychophysics and modeling [11, 12], imaging [13-16], and single-cell and neural population recordings [17, 18]. Besides single features, attention can select whole objects [19-21]. Objects are among the suggested "units" of attention because attention to a single feature of an object causes the selection of all of its features [19-21]. Here, we pit integration against attentional selection in object recognition. We find, first, that humans can integrate information near optimally from several perceptual dimensions (color, texture, luminance) to improve recognition. They cannot, however, isolate a single dimension even when the other dimensions provide task-irrelevant, potentially conflicting information. For object recognition, it appears that there is mandatory integration of information from multiple dimensions of visual experience. The advantage afforded by this integration, however, comes at the expense of attentional selection. Copyright © 2015 Elsevier Ltd. All rights reserved.
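The near-optimal integration benchmark referred to here is usually the maximum-likelihood combination of independent Gaussian cues, in which each cue is weighted by its inverse variance. A minimal sketch under that standard assumption (the paper's exact observer model is not reproduced):

```python
import numpy as np

def integrate_cues(estimates, variances):
    """Maximum-likelihood fusion of independent Gaussian cues: each
    cue is weighted by its inverse variance, and the fused estimate
    is more reliable than any single cue."""
    variances = np.asarray(variances, dtype=float)
    w = 1.0 / variances
    w /= w.sum()
    fused = np.dot(w, estimates)
    fused_var = 1.0 / np.sum(1.0 / variances)
    return fused, fused_var

# Color, texture, and luminance cues to the same shape parameter:
est, var = integrate_cues([1.2, 0.9, 1.1], [0.04, 0.09, 0.16])
print(est, var)   # fused variance < 0.04, the best single cue
```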
Does object view influence the scene consistency effect?
Sastyin, Gergo; Niimi, Ryosuke; Yokosawa, Kazuhiko
2015-04-01
Traditional research on the scene consistency effect only used clearly recognizable object stimuli to show mutually interactive context effects for both the object and background components on scene perception (Davenport & Potter in Psychological Science, 15, 559-564, 2004). However, in real environments, objects are viewed from multiple viewpoints, including an accidental, hard-to-recognize one. When the observers named target objects in scenes (Experiments 1a and 1b, object recognition task), we replicated the scene consistency effect (i.e., there was higher accuracy for the objects with consistent backgrounds). However, there was a significant interaction effect between consistency and object viewpoint, which indicated that the scene consistency effect was more important for identifying objects in the accidental view condition than in the canonical view condition. Therefore, the object recognition system may rely more on the scene context when the object is difficult to recognize. In Experiment 2, the observers identified the background (background recognition task) while the scene consistency and object views were manipulated. The results showed that object viewpoint had no effect, while the scene consistency effect was observed. More specifically, the canonical and accidental views both equally provided contextual information for scene perception. These findings suggested that the mechanism for conscious recognition of objects could be dissociated from the mechanism for visual analysis of object images that were part of a scene. The "context" that the object images provided may have been derived from its view-invariant, relatively low-level visual features (e.g., color), rather than its semantic information.
Born, Jannis; Galeazzi, Juan M; Stringer, Simon M
2017-01-01
A subset of neurons in the posterior parietal and premotor areas of the primate brain respond to the locations of visual targets in a hand-centred frame of reference. Such hand-centred visual representations are thought to play an important role in visually-guided reaching to target locations in space. In this paper we show how a biologically plausible, Hebbian learning mechanism may account for the development of localized hand-centred representations in a hierarchical neural network model of the primate visual system, VisNet. The hand-centered neurons developed in the model use an invariance learning mechanism known as continuous transformation (CT) learning. In contrast to previous theoretical proposals for the development of hand-centered visual representations, CT learning does not need a memory trace of recent neuronal activity to be incorporated in the synaptic learning rule. Instead, CT learning relies solely on a Hebbian learning rule, which is able to exploit the spatial overlap that naturally occurs between successive images of a hand-object configuration as it is shifted across different retinal locations due to saccades. Our simulations show how individual neurons in the network model can learn to respond selectively to target objects in particular locations with respect to the hand, irrespective of where the hand-object configuration occurs on the retina. The response properties of these hand-centred neurons further generalise to localised receptive fields in the hand-centred space when tested on novel hand-object configurations that have not been explored during training. Indeed, even when the network is trained with target objects presented across a near continuum of locations around the hand during training, the model continues to develop hand-centred neurons with localised receptive fields in hand-centred space. With the help of principal component analysis, we provide the first theoretical framework that explains the behavior of Hebbian learning in VisNet.
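The core of CT learning can be conveyed in a few lines. VisNet itself is a multi-layer network with local receptive fields and soft competition, none of which is modeled below; this is only a toy single-layer sketch of a trace-free Hebbian rule exploiting the overlap between successive shifted views:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, rate = 100, 10, 0.1
W = rng.random((n_out, n_in))
W /= np.linalg.norm(W, axis=1, keepdims=True)

def ct_step(x):
    """One continuous-transformation learning step: purely Hebbian,
    with no memory trace. Because successive shifted views overlap,
    the same winner tends to fire for neighboring views, so its
    weights gradually come to cover the whole transform."""
    y = W @ x
    winner = np.argmax(y)                    # simplified competition
    W[winner] += rate * y[winner] * x        # Hebbian update
    W[winner] /= np.linalg.norm(W[winner])   # weight normalization

# Train on a bar sliding one pixel at a time (overlapping views).
for pos in range(80):
    x = np.zeros(n_in)
    x[pos:pos + 20] = 1.0
    ct_step(x)
```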
Good Features to Correlate for Visual Tracking
NASA Astrophysics Data System (ADS)
Gundogdu, Erhan; Alatan, A. Aydin
2018-05-01
In recent years, correlation filters have shown dominant and spectacular results for visual object tracking. The types of features employed in this family of trackers significantly affect tracking performance. The ultimate goal is to utilize robust features that are invariant to any kind of appearance change of the object, while predicting the object location as accurately as when no appearance change occurs. As deep learning based methods have emerged, the study of learning features for specific tasks has accelerated. For instance, discriminative visual tracking methods based on deep architectures have been studied with promising performance. Nevertheless, correlation filter based (CFB) trackers confine themselves to pre-trained networks trained for the object classification problem. To this end, this manuscript formulates the problem of learning deep fully convolutional features for CFB visual tracking. In order to learn the proposed model, a novel and efficient backpropagation algorithm is presented based on the loss function of the network. The proposed learning framework enables the network model to be flexible for a custom design. Moreover, it alleviates the dependency on networks trained for classification. Extensive performance analysis shows the efficacy of the proposed custom design in the CFB tracking framework. By fine-tuning the convolutional parts of a state-of-the-art network and integrating this model into a CFB tracker, the top-performing tracker of VOT2016, an 18% increase is achieved in terms of expected average overlap, and tracking failures are decreased by 25%, while maintaining superiority over state-of-the-art methods on the OTB-2013 and OTB-2015 tracking datasets.
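For readers unfamiliar with CFB trackers, the classical single-channel correlation filter that such learned features feed into can be sketched as follows. This is the standard MOSSE-style closed form, a simplifying stand-in, not the network or backpropagation scheme proposed in the paper:

```python
import numpy as np

def train_filter(template, desired, lam=1e-2):
    """MOSSE-style correlation filter learned from one template in the
    Fourier domain: Hstar = (G * conj(F)) / (F * conj(F) + lam), where
    `desired` is typically a narrow Gaussian peak at the object centre."""
    F, G = np.fft.fft2(template), np.fft.fft2(desired)
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def respond(Hstar, patch):
    """Correlation response of a new search patch; the peak location
    gives the estimated object translation."""
    return np.real(np.fft.ifft2(np.fft.fft2(patch) * Hstar))

# Toy usage: a Gaussian peak as the desired response.
x, y = np.meshgrid(np.arange(64), np.arange(64))
desired = np.exp(-((x - 32) ** 2 + (y - 32) ** 2) / (2 * 2.0 ** 2))
template = np.random.default_rng(1).random((64, 64))
H = train_filter(template, desired)
print(np.unravel_index(respond(H, template).argmax(), (64, 64)))  # ~(32, 32)
```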
Humans and Deep Networks Largely Agree on Which Kinds of Variation Make Object Recognition Harder.
Kheradpisheh, Saeed R; Ghodrati, Masoud; Ganjtabesh, Mohammad; Masquelier, Timothée
2016-01-01
View-invariant object recognition is a challenging problem that has attracted much attention among the psychology, neuroscience, and computer vision communities. Humans are notoriously good at it, even if some variations are presumably more difficult to handle than others (e.g., 3D rotations). Humans are thought to solve the problem through hierarchical processing along the ventral stream, which progressively extracts more and more invariant visual features. This feed-forward architecture has inspired a new generation of bio-inspired computer vision systems called deep convolutional neural networks (DCNN), which are currently the best models for object recognition in natural images. Here, for the first time, we systematically compared human feed-forward vision and DCNNs on a view-invariant object recognition task using the same set of images and controlling the kinds of transformation (position, scale, rotation in plane, and rotation in depth) as well as their magnitude, which we call "variation level." We used four object categories: car, ship, motorcycle, and animal. In total, 89 human subjects participated in 10 experiments in which they had to discriminate between two or four categories after rapid presentation with backward masking. We also tested two recent DCNNs (proposed respectively by Hinton's group and Zisserman's group) on the same tasks. We found that humans and DCNNs largely agreed on the relative difficulties of each kind of variation: rotation in depth is by far the hardest transformation to handle, followed by scale, then rotation in plane, and finally position (much easier). This suggests that DCNNs would be reasonable models of human feed-forward vision. In addition, our results show that the variation levels in rotation in depth and scale strongly modulate both humans' and DCNNs' recognition performances. We thus argue that these variations should be controlled in the image datasets used in vision research.
Deep Networks Can Resemble Human Feed-forward Vision in Invariant Object Recognition
Kheradpisheh, Saeed Reza; Ghodrati, Masoud; Ganjtabesh, Mohammad; Masquelier, Timothée
2016-01-01
Deep convolutional neural networks (DCNNs) have attracted much attention recently, and have shown to be able to recognize thousands of object categories in natural image databases. Their architecture is somewhat similar to that of the human visual system: both use restricted receptive fields, and a hierarchy of layers which progressively extract more and more abstracted features. Yet it is unknown whether DCNNs match human performance at the task of view-invariant object recognition, whether they make similar errors and use similar representations for this task, and whether the answers depend on the magnitude of the viewpoint variations. To investigate these issues, we benchmarked eight state-of-the-art DCNNs, the HMAX model, and a baseline shallow model and compared their results to those of humans with backward masking. Unlike in all previous DCNN studies, we carefully controlled the magnitude of the viewpoint variations to demonstrate that shallow nets can outperform deep nets and humans when variations are weak. When facing larger variations, however, more layers were needed to match human performance and error distributions, and to have representations that are consistent with human behavior. A very deep net with 18 layers even outperformed humans at the highest variation level, using the most human-like representations. PMID:27601096
Nonretinotopic visual processing in the brain.
Melcher, David; Morrone, Maria Concetta
2015-01-01
A basic principle in visual neuroscience is the retinotopic organization of neural receptive fields. Here, we review behavioral, neurophysiological, and neuroimaging evidence for nonretinotopic processing of visual stimuli. A number of behavioral studies have shown perception depending on object or external-space coordinate systems, in addition to retinal coordinates. Both single-cell neurophysiology and neuroimaging have provided evidence for the modulation of neural firing by gaze position and processing of visual information based on craniotopic or spatiotopic coordinates. Transient remapping of the spatial and temporal properties of neurons contingent on saccadic eye movements has been demonstrated in visual cortex, as well as frontal and parietal areas involved in saliency/priority maps, and is a good candidate to mediate some of the spatial invariance demonstrated by perception. Recent studies suggest that spatiotopic selectivity depends on a low spatial resolution system of maps that operates over a longer time frame than retinotopic processing and is strongly modulated by high-level cognitive factors such as attention. The interaction of an initial and rapid retinotopic processing stage, tied to new fixations, and a longer lasting but less precise nonretinotopic level of visual representation could underlie the perception of both a detailed and a stable visual world across saccadic eye movements.
Positional priming of visual pop-out search is supported by multiple spatial reference frames
Gokce, Ahu; Müller, Hermann J.; Geyer, Thomas
2015-01-01
The present study investigates the representation(s) underlying positional priming of visual ‘pop-out’ search (Maljkovic and Nakayama, 1996). Three search items (one target and two distractors) were presented at different locations, in invariant (Experiment 1) or random (Experiment 2) cross-trial sequences. These manipulations made it possible to disentangle retinotopic, spatiotopic, and object-centered priming representations. Two forms of priming were tested: target location facilitation (i.e., faster reaction times (RTs) when the trial n target is presented at a trial n-1 target relative to an n-1 blank location) and distractor location inhibition (i.e., slower RTs for n targets presented at n-1 distractor compared to n-1 blank locations). It was found that target locations were coded in positional short-term memory with reference to both spatiotopic and object-centered representations (Experiment 1 vs. 2). In contrast, distractor locations were maintained in an object-centered reference frame (Experiments 1 and 2). We put forward the idea that the uncertainty induced by the experimental manipulation (predictable versus random cross-trial item displacements) modulates the transition from object- to space-based representations in cross-trial memory for target positions. PMID:26136718
Face processing in different brain areas, and critical band masking.
Rolls, Edmund T
2008-09-01
Neurophysiological evidence is described showing that some neurons in the macaque inferior temporal visual cortex have responses that are invariant with respect to the position, size, view, and spatial frequency of faces and objects, and that these neurons show rapid processing and rapid learning. Critical band spatial frequency masking is shown to be a property of these face-selective neurons and of the human visual perception of faces. Which face or object is present is encoded using a distributed representation in which each neuron conveys independent information in its firing rate, with little information evident in the relative time of firing of different neurons. This ensemble encoding has the advantages of maximizing the information in the representation useful for discrimination between stimuli using a simple weighted sum of the neuronal firing by the receiving neurons, generalization, and graceful degradation. These invariant representations are ideally suited to provide the inputs to brain regions such as the orbitofrontal cortex and amygdala that learn the reinforcement associations of an individual's face, for then the learning, and the appropriate social and emotional responses generalize to other views of the same face. A theory is described of how such invariant representations may be produced by self-organizing learning in a hierarchically organized set of visual cortical areas with convergent connectivity. The theory utilizes either temporal or spatial continuity with an associative synaptic modification rule. Another population of neurons in the cortex in the superior temporal sulcus encodes other aspects of faces such as face expression, eye-gaze, face view, and whether the head is moving. These neurons thus provide important additional inputs to parts of the brain such as the orbitofrontal cortex and amygdala that are involved in social communication and emotional behaviour. Outputs of these systems reach the amygdala, in which face-selective neurons are found, and also the orbitofrontal cortex, in which some neurons are tuned to face identity and others to face expression. In humans, activation of the orbitofrontal cortex is found when a change of face expression acts as a social signal that behaviour should change; and damage to the human orbitofrontal and pregenual cingulate cortex can impair face and voice expression identification, and also the reversal of emotional behaviour that normally occurs when reinforcers are reversed.
NASA Astrophysics Data System (ADS)
Kuvich, Gary
2004-08-01
Vision is only a part of a system that converts visual information into knowledge structures. These structures drive the vision process, resolving ambiguity and uncertainty via feedback, and provide image understanding, which is an interpretation of visual information in terms of these knowledge models. These mechanisms provide reliable recognition if the object is occluded or cannot be recognized as a whole. It is hard to split the entire system apart, and reliable solutions to target recognition problems are possible only within the solution of a more generic image understanding problem. The brain reduces informational and computational complexity using implicit symbolic coding of features, hierarchical compression, and selective processing of visual information. Biologically inspired Network-Symbolic representation, where both systematic structural/logical methods and neural/statistical methods are parts of a single mechanism, is the most feasible basis for such models. It converts visual information into relational Network-Symbolic structures, avoiding artificial precise computations of 3-dimensional models. Network-Symbolic transformations derive abstract structures, which allows for invariant recognition of an object as an exemplar of a class. Active vision helps create consistent models. Attention, separation of figure from ground, and perceptual grouping are special kinds of Network-Symbolic transformations. Such image/video understanding systems will reliably recognize targets.
Gestalt isomorphism and the primacy of subjective conscious experience: a Gestalt Bubble model.
Lehar, Steven
2003-08-01
A serious crisis is identified in theories of neurocomputation, marked by a persistent disparity between the phenomenological or experiential account of visual perception and the neurophysiological level of description of the visual system. In particular, conventional concepts of neural processing offer no explanation for the holistic global aspects of perception identified by Gestalt theory. The problem is paradigmatic and can be traced to contemporary concepts of the functional role of the neural cell, known as the Neuron Doctrine. In the absence of an alternative neurophysiologically plausible model, I propose a perceptual modeling approach, to model the percept as experienced subjectively, rather than modeling the objective neurophysiological state of the visual system that supposedly subserves that experience. A Gestalt Bubble model is presented to demonstrate how the elusive Gestalt principles of emergence, reification, and invariance can be expressed in a quantitative model of the subjective experience of visual consciousness. That model in turn reveals a unique computational strategy underlying visual processing, which is unlike any algorithm devised by man, and certainly unlike the atomistic feed-forward model of neurocomputation offered by the Neuron Doctrine paradigm. The perceptual modeling approach reveals the primary function of perception as that of generating a fully spatial virtual-reality replica of the external world in an internal representation. The common objections to this "picture-in-the-head" concept of perceptual representation are shown to be ill founded.
Linear and Non-Linear Visual Feature Learning in Rat and Humans
Bossens, Christophe; Op de Beeck, Hans P.
2016-01-01
The visual system processes visual input in a hierarchical manner in order to extract relevant features that can be used in tasks such as invariant object recognition. Although typically investigated in primates, recent work has shown that rats can be trained in a variety of visual object and shape recognition tasks. These studies did not pinpoint the complexity of the features used by these animals. Many tasks might be solved by using a combination of relatively simple features which tend to be correlated. Alternatively, rats might extract complex features or feature combinations which are nonlinear with respect to those simple features. In the present study, we address this question by starting from a small stimulus set for which one stimulus-response mapping involves a simple linear feature to solve the task while another mapping needs a well-defined nonlinear combination of simpler features related to shape symmetry. We verified computationally that the nonlinear task cannot be trivially solved by a simple V1-model. We show how rats are able to solve the linear feature task but are unable to acquire the nonlinear feature. In contrast, humans are able to use the nonlinear feature and are even faster in uncovering this solution as compared to the linear feature. The implications for the computational capabilities of the rat visual system are discussed. PMID:28066201
Multiplicative mixing of object identity and image attributes in single inferior temporal neurons.
Ratan Murty, N Apurva; Arun, S P
2018-04-03
Object recognition is challenging because the same object can produce vastly different images, mixing signals related to its identity with signals due to its image attributes, such as size, position, rotation, etc. Previous studies have shown that both signals are present in high-level visual areas, but precisely how they are combined has remained unclear. One possibility is that neurons might encode identity and attribute signals multiplicatively so that each can be efficiently decoded without interference from the other. Here, we show that, in high-level visual cortex, responses of single neurons can be explained better as a product rather than a sum of tuning for object identity and tuning for image attributes. This subtle effect in single neurons produced substantially better population decoding of object identity and image attributes in the neural population as a whole. This property was absent both in low-level vision models and in deep neural networks. It was also unique to invariances: when tested with two-part objects, neural responses were explained better as a sum than as a product of part tuning. Taken together, our results indicate that signals requiring separate decoding, such as object identity and image attributes, are combined multiplicatively in IT neurons, whereas signals that require integration (such as parts in an object) are combined additively. Copyright © 2018 the Author(s). Published by PNAS.
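A toy calculation shows why multiplicative mixing supports interference-free decoding: a product of identity tuning and attribute tuning yields a rank-1 response matrix whose factors can be recovered separately. This sketch illustrates the principle only and is not the paper's analysis:

```python
import numpy as np

rng = np.random.default_rng(2)
n_id, n_attr = 5, 4
f = rng.random(n_id)     # tuning to object identity
g = rng.random(n_attr)   # tuning to an image attribute (e.g., size)

R_mult = np.outer(f, g)                 # product model: r(i,a) = f(i)*g(a)
R_add = f[:, None] + g[None, :]         # sum model:     r(i,a) = f(i)+g(a)

# Under the product model, the rank-1 structure lets each factor be
# recovered (up to scale) without interference from the other:
u, s, vt = np.linalg.svd(R_mult)
print(np.corrcoef(np.abs(u[:, 0]), f)[0, 1])   # ~1.0: identity recovered
```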
The roles of perceptual and conceptual information in face recognition.
Schwartz, Linoy; Yovel, Galit
2016-11-01
The representation of familiar objects is comprised of perceptual information about their visual properties as well as the conceptual knowledge that we have about them. What is the relative contribution of perceptual and conceptual information to object recognition? Here, we examined this question by designing a face familiarization protocol during which participants were either exposed to rich perceptual information (viewing each face in different angles and illuminations) or with conceptual information (associating each face with a different name). Both conditions were compared with single-view faces presented with no labels. Recognition was tested on new images of the same identities to assess whether learning generated a view-invariant representation. Results showed better recognition of novel images of the learned identities following association of a face with a name label, but no enhancement following exposure to multiple face views. Whereas these findings may be consistent with the role of category learning in object recognition, face recognition was better for labeled faces only when faces were associated with person-related labels (name, occupation), but not with person-unrelated labels (object names or symbols). These findings suggest that association of meaningful conceptual information with an image shifts its representation from an image-based percept to a view-invariant concept. They further indicate that the role of conceptual information should be considered to account for the superior recognition that we have for familiar faces and objects. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Automated transformation-invariant shape recognition through wavelet multiresolution
NASA Astrophysics Data System (ADS)
Brault, Patrice; Mounier, Hugues
2001-12-01
We present here new results in Wavelet Multi-Resolution Analysis (W-MRA) applied to shape recognition in automatic vehicle driving applications. Different types of shapes have to be recognized in this framework. They pertain to most of the objects entering a car's sensor field. These objects can be road signs, lane separation lines, moving or static obstacles, other automotive vehicles, or visual beacons. The recognition process must be invariant to global transformations, affine or not, namely rotation, translation, and scaling. It also has to be invariant to more local, elastic deformations like perspective (in particular with wide-angle camera lenses), deformations due to environmental conditions (weather: rain, mist, light reverberation), and optical and electrical signal noise. To demonstrate our method, an initial shape, with a known contour, is compared to the same contour altered by rotation, translation, scaling, and perspective. The curvature computed at each contour point is used as the main criterion in the shape matching process. The original part of this work is to use wavelet descriptors, generated with a fast orthonormal W-MRA, rather than Fourier descriptors, in order to provide a multi-resolution description of the contour to be analyzed. In this way, the intrinsic spatial localization property of wavelet descriptors can be used and the recognition process can be sped up. The most important part of this work is to demonstrate the potential performance of W-MRA in this application of shape recognition.
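A minimal version of a wavelet contour descriptor can be sketched with PyWavelets (an assumed dependency). The paper's exact descriptor normalization and matching scheme are not reproduced; the per-band energy summary below is just one simple way to obtain a compact multi-resolution signature of a curvature sequence:

```python
import numpy as np
import pywt  # PyWavelets, assumed available

def wavelet_signature(curvature, wavelet="db4", level=4):
    """Multi-resolution descriptor of a closed contour: the curvature
    sequence is decomposed with an orthonormal DWT and summarized by
    the energy in each scale band. Coarse bands capture global shape,
    fine bands capture local detail, so matching can proceed
    coarse-to-fine and stop early on a clear mismatch."""
    coeffs = pywt.wavedec(curvature, wavelet, level=level)
    return np.array([np.sum(c ** 2) for c in coeffs])

theta = np.linspace(0, 2 * np.pi, 256, endpoint=False)
curvature = 1.0 + 0.3 * np.cos(3 * theta)   # a tri-lobed toy contour
print(wavelet_signature(curvature))
```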
Webber, C J
2001-05-01
This article shows analytically that single-cell learning rules that give rise to oriented and localized receptive fields, when their synaptic weights are randomly and independently initialized according to a plausible assumption of zero prior information, will generate visual codes that are invariant under two-dimensional translations, rotations, and scale magnifications, provided that the statistics of their training images are sufficiently invariant under these transformations. Such codes span different image locations, orientations, and size scales with equal economy. Thus, single-cell rules could account for the spatial scaling property of the cortical simple-cell code. This prediction is tested computationally by training with natural scenes; it is demonstrated that a single-cell learning rule can give rise to simple-cell receptive fields spanning the full range of orientations, image locations, and spatial frequencies (except at the extreme high and low frequencies at which the scale invariance of the statistics of digitally sampled images must ultimately break down, because of the image boundary and the finite pixel resolution). Thus, no constraint on completeness, or any other coupling between cells, is necessary to induce the visual code to span wide ranges of locations, orientations, and size scales. This prediction is made using the theory of spontaneous symmetry breaking, which we have previously shown can also explain the data-driven self-organization of a wide variety of transformation invariances in neurons' responses, such as the translation invariance of complex cell response.
Conci, Markus; Müller, Hermann J; von Mühlenen, Adrian
2013-07-09
In visual search, detection of a target is faster when it is presented within a spatial layout of repeatedly encountered nontarget items, indicating that contextual invariances can guide selective attention (contextual cueing; Chun & Jiang, 1998). However, perceptual regularities may interfere with contextual learning; for instance, no contextual facilitation occurs when four nontargets form a square-shaped grouping, even though the square location predicts the target location (Conci & von Mühlenen, 2009). Here, we further investigated potential causes for this interference effect: we show that contextual cueing can reliably occur for targets located within the region of a segmented object, but not for targets presented outside of the object's boundaries. Four experiments demonstrate an object-based facilitation in contextual cueing, with a modulation of context-based learning by relatively subtle grouping cues including closure, symmetry, and spatial regularity. Moreover, the lack of contextual cueing for targets located outside the segmented region was due to an absence of (latent) learning of contextual layouts, rather than to an attentional bias towards the grouped region. Taken together, these results indicate that perceptual segmentation provides a basic structure within which contextual scene regularities are acquired. This in turn argues that contextual learning is constrained by object-based selection.
Blazevski, Daniel; Franklin, Jennifer
2012-12-01
Scattering theory is a convenient way to describe systems that are subject to time-dependent perturbations which are localized in time. Using scattering theory, one can compute time-dependent invariant objects for the perturbed system knowing the invariant objects of the unperturbed system. In this paper, we use scattering theory to give numerical computations of invariant manifolds appearing in laser-driven reactions. In this setting, invariant manifolds separate regions of phase space that lead to different outcomes of the reaction and can be used to compute reaction rates.
Biased Competition in Visual Processing Hierarchies: A Learning Approach Using Multiple Cues.
Gepperth, Alexander R T; Rebhan, Sven; Hasler, Stephan; Fritsch, Jannik
2011-03-01
In this contribution, we present a large-scale hierarchical system for object detection fusing bottom-up (signal-driven) processing results with top-down (model or task-driven) attentional modulation. Specifically, we focus on the question of how the autonomous learning of invariant models can be embedded into a performing system and how such models can be used to define object-specific attentional modulation signals. Our system implements bi-directional data flow in a processing hierarchy. The bottom-up data flow proceeds from a preprocessing level to the hypothesis level where object hypotheses created by exhaustive object detection algorithms are represented in a roughly retinotopic way. A competitive selection mechanism is used to determine the most confident hypotheses, which are used on the system level to train multimodal models that link object identity to invariant hypothesis properties. The top-down data flow originates at the system level, where the trained multimodal models are used to obtain space- and feature-based attentional modulation signals, providing biases for the competitive selection process at the hypothesis level. This results in object-specific hypothesis facilitation/suppression in certain image regions which we show to be applicable to different object detection mechanisms. In order to demonstrate the benefits of this approach, we apply the system to the detection of cars in a variety of challenging traffic videos. Evaluating our approach on a publicly available dataset containing approximately 3,500 annotated video images from more than 1 h of driving, we can show strong increases in performance and generalization when compared to object detection in isolation. Furthermore, we compare our results to a late hypothesis rejection approach, showing that early coupling of top-down and bottom-up information is a favorable approach especially when processing resources are constrained.
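The early coupling of bottom-up confidence and top-down bias can be reduced to a toy competitive selection rule. The actual system operates on retinotopic hypothesis maps with learned multimodal models; the sketch below only illustrates how a multiplicative attentional bias shifts the outcome of the competition:

```python
import numpy as np

def biased_selection(confidences, biases, beta=2.0):
    """Competitive hypothesis selection with top-down modulation:
    bottom-up detection confidences are scaled by model-derived
    attentional biases before a softmax competition, so expected
    objects win ties against equally confident distractors."""
    s = np.asarray(confidences) * np.asarray(biases)
    p = np.exp(beta * s) / np.sum(np.exp(beta * s))
    return int(np.argmax(p)), p

# Two equally confident hypotheses; task context favors the second.
winner, p = biased_selection([0.8, 0.8, 0.3], [1.0, 1.4, 1.0])
print(winner, p)   # -> 1
```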
A comparison of visuomotor cue integration strategies for object placement and prehension.
Greenwald, Hal S; Knill, David C
2009-01-01
Visual cue integration strategies are known to depend on cue reliability and how rapidly the visual system processes incoming information. We investigated whether these strategies also depend on differences in the information demands for different natural tasks. Using two common goal-oriented tasks, prehension and object placement, we determined whether monocular and binocular information influence estimates of three-dimensional (3D) orientation differently depending on task demands. Both tasks rely on accurate 3D orientation estimates, but 3D position is potentially more important for grasping. Subjects placed an object on or picked up a disc in a virtual environment. On some trials, the monocular cues (aspect ratio and texture compression) and binocular cues (e.g., binocular disparity) suggested slightly different 3D orientations for the disc; these conflicts either were present upon initial stimulus presentation or were introduced after movement initiation, which allowed us to quantify how information from the cues accumulated over time. We analyzed the time-varying orientations of subjects' fingers in the grasping task and those of the object in the object placement task to quantify how different visual cues influenced motor control. In the first experiment, different subjects performed each task, and those performing the grasping task relied on binocular information more when orienting their hands than those performing the object placement task. When subjects in the second experiment performed both tasks in interleaved sessions, binocular cues were still more influential during grasping than object placement, and the different cue integration strategies observed for each task in isolation were maintained. In both experiments, the temporal analyses showed that subjects processed binocular information faster than monocular information, but task demands did not affect the time course of cue processing. How one uses visual cues for motor control depends on the task being performed, although how quickly the information is processed appears to be task invariant.
Spectral Skyline Separation: Extended Landmark Databases and Panoramic Imaging
Differt, Dario; Möller, Ralf
2016-01-01
Evidence from behavioral experiments suggests that insects use the skyline as a cue for visual navigation. However, changes of lighting conditions, over hours, days or possibly seasons, significantly affect the appearance of the sky and ground objects. One possible solution to this problem is to extract the “skyline” by an illumination-invariant classification of the environment into two classes, ground objects and sky. In a previous study (Insect models of illumination-invariant skyline extraction from UV (ultraviolet) and green channels), we examined the idea of using two different color channels available for many insects (UV and green) to perform this segmentation. We found out that for suburban scenes in temperate zones, where the skyline is dominated by trees and artificial objects like houses, a “local” UV segmentation with adaptive thresholds applied to individual images leads to the most reliable classification. Furthermore, a “global” segmentation with fixed thresholds (trained on an image dataset recorded over several days) using UV-only information is only slightly worse compared to using both the UV and green channel. In this study, we address three issues: First, to enhance the limited range of environments covered by the dataset collected in the previous study, we gathered additional data samples of skylines consisting of minerals (stones, sand, earth) as ground objects. We could show that also for mineral-rich environments, UV-only segmentation achieves a quality comparable to multi-spectral (UV and green) segmentation. Second, we collected a wide variety of ground objects to examine their spectral characteristics under different lighting conditions. On the one hand, we found that the special case of diffusely-illuminated minerals increases the difficulty to reliably separate ground objects from the sky. On the other hand, the spectral characteristics of this collection of ground objects agree well with the data collected in the skyline databases, which, given the increased variety of ground objects, strengthens the validity of our findings for novel environments. Third, we collected omnidirectional images of skylines, as often used for visual navigation tasks, using a UV-reflective hyperbolic mirror. We could show that “local” separation techniques can be adapted to the use of panoramic images by splitting the image into segments and finding individual thresholds for each segment. In contrast, this is not possible for “global” separation techniques. PMID:27690053
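The "local" segmentation idea adapts naturally to panoramas, as the following sketch shows. The authors' exact thresholding procedure is not reproduced here; Otsu-style per-segment thresholding on the UV channel is an illustrative stand-in:

```python
import numpy as np

def otsu(values, bins=64):
    """Adaptive threshold maximizing between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)
    w1 = 1 - w0
    m0 = np.cumsum(p * centers) / np.maximum(w0, 1e-12)
    m1 = ((p * centers).sum() - np.cumsum(p * centers)) / np.maximum(w1, 1e-12)
    between = w0 * w1 * (m0 - m1) ** 2
    return centers[np.argmax(between[:-1])]

def skyline_segments(uv_panorama, n_segments=8):
    """'Local' UV skyline extraction adapted to a panorama: split the
    unwrapped image into azimuthal segments and threshold each one
    independently, so uneven illumination around the horizon does not
    force a single global threshold."""
    h, w = uv_panorama.shape
    mask = np.zeros_like(uv_panorama, dtype=bool)
    for s in range(n_segments):
        cols = slice(s * w // n_segments, (s + 1) * w // n_segments)
        seg = uv_panorama[:, cols]
        mask[:, cols] = seg > otsu(seg.ravel())   # sky = high UV
    return mask
```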
Barenholtz, Elan; Tarr, Michael J
2008-06-01
A single biological object, such as a hand, can assume multiple, very different shapes, due to the articulation of its parts. Yet we are able to recognize all of these shapes as examples of the same object. How is this invariance to pose achieved? Here, we present evidence that the visual system maintains a model of object transformation that is based on rigid, convex parts articulating at extrema of negative curvature, i.e., part boundaries. We compared similarity judgments in a task in which subjects had to decide which of the two transformed versions of a 'base' shape (one a 'biologically valid' articulation and one a geometrically similar but 'biologically invalid' articulation) was more similar to the base shape. Two types of comparisons were made: in the figure/ground reversal, the invalid articulation consisted of exactly the same contour transformation as the valid one with reversed figural polarity. In the axis-of-rotation reversal, the valid articulation consisted of a part rotated around its concave part boundaries, while the invalid articulation consisted of the same part rotated around the endpoints on the opposite side of the part. In two separate 2AFC similarity experiments (one in which the base and transformed shapes were presented simultaneously and one in which they were presented sequentially), subjects were more likely to match the base shape to a transform when it corresponded to a legitimate articulation. These results suggest that the visual system maintains expectations about the way objects will transform, based on their static geometry.
Multi-clues image retrieval based on improved color invariants
NASA Astrophysics Data System (ADS)
Liu, Liu; Li, Jian-Xun
2012-05-01
At present, image retrieval has made great progress in indexing efficiency and memory usage, which mainly benefits from the utilization of text retrieval technology, such as the bag-of-features (BOF) model and the inverted-file structure. Meanwhile, because robust local feature invariants are selected to establish the BOF model, the retrieval precision of BOF is enhanced, especially when it is applied to a large-scale database. However, these local feature invariants mainly consider the geometric variance of the objects in the images, and thus the color information of the objects fails to be exploited. Because of the development of information technology and the Internet, the majority of our retrieval objects are color images. Therefore, retrieval performance can be further improved through proper utilization of color information. We propose an improved method based on an analysis of the flaw of the shadow-shading quasi-invariant: the response and performance of the shadow-shading quasi-invariant at object edges under varying lighting are enhanced. The color descriptors of the invariant regions are extracted and integrated into BOF based on the local feature. The robustness of the algorithm and the improvement in performance are verified in the final experiments.
Real-time classification of vehicles by type within infrared imagery
NASA Astrophysics Data System (ADS)
Kundegorski, Mikolaj E.; Akçay, Samet; Payen de La Garanderie, Grégoire; Breckon, Toby P.
2016-10-01
Real-time classification of vehicles into sub-category types poses a significant challenge within infra-red imagery due to the high levels of intra-class variation in thermal vehicle signatures caused by aspects of design, current operating duration and ambient thermal conditions. Despite these challenges, infra-red sensing offers significant generalized target object detection advantages in terms of all-weather operation and invariance to visual camouflage techniques. This work investigates the accuracy of a number of real-time object classification approaches for this task within the wider context of an existing initial object detection and tracking framework. Specifically we evaluate the use of traditional feature-driven bag of visual words and histogram of oriented gradient classification approaches against modern convolutional neural network architectures. Furthermore, we use classical photogrammetry, within the context of current target detection and classification techniques, as a means of approximating 3D target position within the scene based on this vehicle type classification. Based on photogrammetric estimation of target position, we then illustrate the use of regular Kalman filter based tracking operating on actual 3D vehicle trajectories. Results are presented using a conventional thermal-band infra-red (IR) sensor arrangement where targets are tracked over a range of evaluation scenarios.
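The final tracking stage described here is a textbook Kalman filter on 3D position fixes. The paper does not report its filter parameters, so the frame rate and noise covariances below are placeholder assumptions; the sketch shows a constant-velocity filter over photogrammetric position estimates:

```python
import numpy as np

dt = 1.0 / 25.0                               # frame interval (assumed 25 Hz)
F = np.eye(6); F[:3, 3:] = dt * np.eye(3)     # constant-velocity model
H = np.hstack([np.eye(3), np.zeros((3, 3))])  # observe 3D position only
Q = 1e-3 * np.eye(6)                          # process noise (placeholder)
R = 0.5 * np.eye(3)                           # measurement noise (placeholder)

def kalman_step(x, P, z):
    """One predict/update cycle on a photogrammetric 3D position fix z.
    State x = [x, y, z, vx, vy, vz]; returns updated state, covariance."""
    x, P = F @ x, F @ P @ F.T + Q             # predict
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)            # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(6) - K @ H) @ P               # update
    return x, P

x, P = np.zeros(6), np.eye(6)
for t in range(5):                            # noisy fixes on a moving car
    z = np.array([10.0 + t, 5.0, 0.0]) + 0.3 * np.random.randn(3)
    x, P = kalman_step(x, P, z)
```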
Foley, Nicholas C.; Grossberg, Stephen; Mingolla, Ennio
2015-01-01
How are spatial and object attention coordinated to achieve rapid object learning and recognition during eye movement search? How do prefrontal priming and parietal spatial mechanisms interact to determine the reaction time costs of intra-object attention shifts, inter-object attention shifts, and shifts between visible objects and covertly cued locations? What factors underlie individual differences in the timing and frequency of such attentional shifts? How do transient and sustained spatial attentional mechanisms work and interact? How can volition, mediated via the basal ganglia, influence the span of spatial attention? A neural model is developed of how spatial attention in the where cortical stream coordinates view-invariant object category learning in the what cortical stream under free viewing conditions. The model simulates psychological data about the dynamics of covert attention priming and switching requiring multifocal attention without eye movements. The model predicts how “attentional shrouds” are formed when surface representations in cortical area V4 resonate with spatial attention in posterior parietal cortex (PPC) and prefrontal cortex (PFC), while shrouds compete among themselves for dominance. Winning shrouds support invariant object category learning, and active surface-shroud resonances support conscious surface perception and recognition. Attentive competition between multiple objects and cues simulates reaction-time data from the two-object cueing paradigm. The relative strength of sustained surface-driven and fast-transient motion-driven spatial attention controls individual differences in reaction time for invalid cues. Competition between surface-driven attentional shrouds controls individual differences in detection rate of peripheral targets in useful-field-of-view tasks. The model proposes how the strength of competition can be mediated, though learning or momentary changes in volition, by the basal ganglia. A new explanation of crowding shows how the cortical magnification factor, among other variables, can cause multiple object surfaces to share a single surface-shroud resonance, thereby preventing recognition of the individual objects. PMID:22425615
Rotation, scale, and translation invariant pattern recognition using feature extraction
NASA Astrophysics Data System (ADS)
Prevost, Donald; Doucet, Michel; Bergeron, Alain; Veilleux, Luc; Chevrette, Paul C.; Gingras, Denis J.
1997-03-01
A rotation, scale, and translation invariant pattern recognition technique is proposed. It is based on Fourier-Mellin Descriptors (FMD). Each FMD is taken as an independent feature of the object, and a set of those features forms a signature. FMDs are naturally rotation invariant. Translation invariance is achieved through pre-processing. A proper normalization of the FMDs gives the scale invariance property. This approach offers the double advantage of providing invariant signatures of the objects and a dramatic reduction of the amount of data to process. The compressed invariant feature signature is next presented to a multi-layered perceptron neural network. This final step provides some robustness to the classification of the signatures, enabling good recognition behavior under anamorphic scale distortion. We also present an original feature extraction technique, adapted to optical calculation of the FMDs. A prototype optical setup was built, and experimental results are presented.
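One common discrete construction of such descriptors resamples the image on a log-polar grid so that rotation and scaling become axis shifts, which FFT magnitudes then absorb. The paper's optical computation and normalization details are not reproduced; this is a minimal numerical sketch:

```python
import numpy as np

def fourier_mellin_descriptors(img, n_r=64, n_theta=128, n_feat=8):
    """Magnitudes of a discrete Fourier-Mellin transform. Steps:
    (1) centre on the shape centroid -> translation invariance;
    (2) resample on a log-polar grid, so rotation and scaling become
        shifts along the two axes;
    (3) take 2-D FFT magnitudes, which absorb those shifts (exactly
        for rotation, approximately for scale at the grid borders)."""
    ys, xs = np.nonzero(img > 0)
    cy, cx = ys.mean(), xs.mean()                        # centroid
    r_max = np.hypot(img.shape[0], img.shape[1]) / 2
    log_r = np.exp(np.linspace(0, np.log(r_max), n_r))
    theta = np.linspace(0, 2 * np.pi, n_theta, endpoint=False)
    rr, tt = np.meshgrid(log_r, theta, indexing="ij")
    yy = np.clip((cy + rr * np.sin(tt)).astype(int), 0, img.shape[0] - 1)
    xx = np.clip((cx + rr * np.cos(tt)).astype(int), 0, img.shape[1] - 1)
    mag = np.abs(np.fft.fft2(img[yy, xx]))
    return mag[:n_feat, :n_feat].ravel()                 # compact signature

img = np.zeros((128, 128)); img[40:80, 50:90] = 1.0     # toy binary shape
signature = fourier_mellin_descriptors(img)
```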
Challenging Cognitive Control by Mirrored Stimuli in Working Memory Matching
Wirth, Maria; Gaschler, Robert
2017-01-01
Cognitive conflict has often been investigated by placing automatic processing originating from learned associations in competition with instructed task demands. Here we explore whether mirror generalization, as a congenital mechanism, can be employed to create cognitive conflict. Past research suggests that the visual system automatically generates an invariant representation of visual objects and their mirrored counterparts (i.e., mirror generalization), and especially so for lateral reversals (e.g., a cup seen from the left side vs. right side). Prior work suggests that mirror generalization can be reduced or even overcome by learning (i.e., for those visual objects for which it is not appropriate, such as the letters d and b). We, therefore, minimized prior practice on resolving conflicts involving mirror generalization by using kanji stimuli as non-verbal and unfamiliar material. In a 1-back task, participants had to check a stream of kanji stimuli for identical repetitions and avoid miscategorizing mirror-reversed stimuli as exact repetitions. Consistent with previous work, lateral reversals led to profound slowing of reaction times and lower accuracy in Experiment 1. Yet, in contrast to previous reports suggesting that lateral reversals lead to stronger conflict, similar slowing was observed for vertical and horizontal mirror transformations in Experiment 2. Taken together, the results suggest that transformations of visual stimuli can be employed to challenge cognitive control in the 1-back task. PMID:28503160
Modulation of neuronal responses during covert search for visual feature conjunctions
Buracas, Giedrius T.; Albright, Thomas D.
2009-01-01
While searching for an object in a visual scene, an observer's attentional focus and eye movements are often guided by information about object features and spatial locations. Both spatial and feature-specific attention are known to modulate neuronal responses in visual cortex, but little is known of the dynamics and interplay of these mechanisms as visual search progresses. To address this issue, we recorded from directionally selective cells in visual area MT of monkeys trained to covertly search for targets defined by a unique conjunction of color and motion features and to signal target detection with an eye movement to the putative target. Two patterns of response modulation were observed. One pattern consisted of enhanced responses to targets presented in the receptive field (RF). These modulations occurred at the end-stage of search and were more potent during correct target identification than during erroneous saccades to a distractor in the RF, thus suggesting that this modulation is not a mere presaccadic enhancement. A second pattern of modulation was observed when RF stimuli were nontargets that shared a feature with the target. The latter effect was observed during early stages of search and is consistent with a global feature-specific mechanism. This effect often terminated before target identification, thus suggesting that it interacts with spatial attention. This modulation was exhibited not only for the motion cue but also for the color cue, although MT neurons are known to be insensitive to color. Such cue-invariant attentional effects may contribute to a feature binding mechanism acting across visual dimensions. PMID:19805385
NASA Astrophysics Data System (ADS)
Assadi, Amir H.
2001-11-01
Perceptual geometry is an emerging field of interdisciplinary research whose objectives focus on the study of geometry from the perspective of visual perception and, in turn, apply such geometric findings to the ecological study of vision. Perceptual geometry attempts to answer fundamental questions in the perception of form and the representation of space through a synthesis of cognitive and biological theories of visual perception with geometric theories of the physical world. Perception of form and space are among the fundamental problems in vision science. In recent cognitive and computational models of human perception, natural scenes are used systematically as preferred visual stimuli. Among the key problems in perception of form and space, we have examined the perceived geometry of natural surfaces and curves, e.g., those in the observer's environment. Besides a systematic mathematical foundation for a remarkably general framework, the advantages of the Gestalt theory of natural surfaces include a concrete computational approach to simulate or recreate images whose geometric invariants and quantities might be perceived and estimated by an observer. The latter is at the very foundation of understanding the nature of perception of space and form, and of the (computer graphics) problem of rendering scenes to visually invoke virtual presence.
Integer sequence discovery from small graphs
Hoppe, Travis; Petrone, Anna
2015-01-01
We have exhaustively enumerated all simple, connected graphs of a finite order and have computed a selection of invariants over this set. Integer sequences were constructed from these invariants and checked against the Online Encyclopedia of Integer Sequences (OEIS); 141 new sequences were added and six existing sequences were extended. From the graph database, we were able to programmatically suggest relationships among the invariants. We also show that any sequence of graphs meeting given criteria can be readily visualized. The code has been released as an open-source framework for further analysis, and the database was constructed to be extensible to invariants not considered in this work. PMID:27034526
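A brute-force sketch of the enumeration idea, assuming the networkx library rather than the paper's released framework (whose API is not shown here); feasible only for small orders:

```python
import itertools
import networkx as nx  # assumption: networkx is available

def invariant_sequence(n, invariant):
    """Collect an integer invariant over all simple connected graphs of
    order n, one value per isomorphism class."""
    possible_edges = list(itertools.combinations(range(n), 2))
    reps = []  # one representative per isomorphism class
    for k in range(len(possible_edges) + 1):
        for edges in itertools.combinations(possible_edges, k):
            G = nx.Graph(list(edges))
            G.add_nodes_from(range(n))
            if not nx.is_connected(G):
                continue
            if any(nx.is_isomorphic(G, H) for H in reps):
                continue
            reps.append(G)
    return sorted(invariant(G) for G in reps)

# e.g. diameters of the six connected graphs of order 4:
# print(invariant_sequence(4, nx.diameter))
```

Sequences assembled this way, one term per graph order, are what would be checked against the OEIS.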
The shift-invariant discrete wavelet transform and application to speech waveform analysis.
Enders, Jörg; Geng, Weihua; Li, Peijun; Frazier, Michael W; Scholl, David J
2005-04-01
The discrete wavelet transform may be used as a signal-processing tool for visualization and analysis of nonstationary, time-sampled waveforms. The highly desirable property of shift invariance can be obtained at the cost of a moderate increase in computational complexity, and accepting a least-squares inverse (pseudoinverse) in place of a true inverse. A new algorithm for the pseudoinverse of the shift-invariant transform that is easier to implement in array-oriented scripting languages than existing algorithms is presented together with self-contained proofs. Representing only one of the many and varied potential applications, a recorded speech waveform illustrates the benefits of shift invariance with pseudoinvertibility. Visualization shows the glottal modulation of vowel formants and frication noise, revealing secondary glottal pulses and other waveform irregularities. Additionally, performing sound waveform editing operations (i.e., cutting and pasting sections) on the shift-invariant wavelet representation automatically produces quiet, click-free section boundaries in the resulting sound. The capabilities of this wavelet-domain editing technique are demonstrated by changing the rate of a recorded spoken word. Individual pitch periods are repeated to obtain a half-speed result, and alternate individual pitch periods are removed to obtain a double-speed result. The original pitch and formant frequencies are preserved. In informal listening tests, the results are clear and understandable.
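A brief sketch of the shift-invariant transform and a least-squares-style reconstruction, assuming the PyWavelets package; the paper's own pseudoinverse algorithm is not reproduced here:

```python
import numpy as np
import pywt  # assumption: PyWavelets is available

def swt_roundtrip(signal, wavelet="db4", level=4):
    # The stationary (undecimated) wavelet transform omits down-sampling,
    # so shifting the input merely shifts the coefficients: this is the
    # shift-invariance property, paid for with redundant coefficients.
    n = len(signal)
    pad = (-n) % (2 ** level)  # swt needs a length divisible by 2**level
    x = np.pad(np.asarray(signal, float), (0, pad))
    coeffs = pywt.swt(x, wavelet, level=level)
    # The redundant representation has no exact inverse; iswt performs a
    # reconstruction in the spirit of the paper's pseudoinverse.
    recon = pywt.iswt(coeffs, wavelet)
    return coeffs, recon[:n]
```

Waveform-editing operations such as cutting and pasting pitch periods would be applied to coeffs before reconstruction.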
Sunkara, Adhira
2015-01-01
As we navigate through the world, eye and head movements add rotational velocity patterns to the retinal image. When such rotations accompany observer translation, the rotational velocity patterns must be discounted to accurately perceive heading. The conventional view holds that this computation requires efference copies of self-generated eye/head movements. Here we demonstrate that the brain implements an alternative solution in which retinal velocity patterns are themselves used to dissociate translations from rotations. These results reveal a novel role for visual cues in achieving a rotation-invariant representation of heading in the macaque ventral intraparietal area. Specifically, we show that the visual system utilizes both local motion parallax cues and global perspective distortions to estimate heading in the presence of rotations. These findings further suggest that the brain is capable of performing complex computations to infer eye movements and discount their sensory consequences based solely on visual cues. DOI: http://dx.doi.org/10.7554/eLife.04693.001 PMID:25693417
Method of synthesized phase objects for pattern recognition with rotation invariance
NASA Astrophysics Data System (ADS)
Ostroukh, Alexander P.; Butok, Alexander M.; Shvets, Rostislav A.; Yezhov, Pavel V.; Kim, Jin-Tae; Kuzmenko, Alexander V.
2015-11-01
We present a development of the method of synthesized phase objects (SPO method) [1] for rotation-invariant pattern recognition. For both the standard recognition method and the SPO method, the parameters of the correlation signals for a number of amplitude objects were compared under rotation in an optical-digital correlator with joint Fourier transformation. We show that joint correlation of synthesized phase objects (SP objects) attains not only rotation invariance but also the main advantage of the SP-object method over the reference one: a unified δ-like recognition signal with the largest possible signal-to-noise ratio, independent of the type of object.
Towards a unified theory of neocortex: laminar cortical circuits for vision and cognition.
Grossberg, Stephen
2007-01-01
A key goal of computational neuroscience is to link brain mechanisms to behavioral functions. The present article describes recent progress towards explaining how laminar neocortical circuits give rise to biological intelligence. These circuits embody two new and revolutionary computational paradigms: Complementary Computing and Laminar Computing. Circuit properties include a novel synthesis of feedforward and feedback processing, of digital and analog processing, and of preattentive and attentive processing. This synthesis clarifies the appeal of Bayesian approaches but has a far greater predictive range that naturally extends to self-organizing processes. Examples from vision and cognition are summarized. A LAMINART architecture unifies properties of visual development, learning, perceptual grouping, attention, and 3D vision. A key modeling theme is that the mechanisms which enable development and learning to occur in a stable way imply properties of adult behavior. It is noted how higher-order attentional constraints can influence multiple cortical regions, and how spatial and object attention work together to learn view-invariant object categories. In particular, a form-fitting spatial attentional shroud can allow an emerging view-invariant object category to remain active while multiple view categories are associated with it during sequences of saccadic eye movements. Finally, the article summarizes recent work on the LIST PARSE model of cognitive information processing by the laminar circuits of prefrontal cortex. LIST PARSE models the short-term storage of event sequences in working memory, their unitization through learning into sequence, or list, chunks, and their read-out in planned sequential performance that is under volitional control. LIST PARSE provides a laminar embodiment of Item and Order working memories, also called Competitive Queuing models, that have been supported by both psychophysical and neurobiological data. These examples show how variations of a common laminar cortical design can embody properties of visual and cognitive intelligence that seem, at least on the surface, to be mechanistically unrelated.
Right fusiform response patterns reflect visual object identity rather than semantic similarity.
Bruffaerts, Rose; Dupont, Patrick; De Grauwe, Sophie; Peeters, Ronald; De Deyne, Simon; Storms, Gerrit; Vandenberghe, Rik
2013-12-01
We previously reported the neuropsychological consequences of a lesion confined to the middle and posterior part of the right fusiform gyrus (case JA) causing a partial loss of knowledge of visual attributes of concrete entities in the absence of category-selectivity (animate versus inanimate). We interpreted this in the context of a two-step model that distinguishes structural description knowledge from associative-semantic processing and implicated the lesioned area in the former process. To test this hypothesis in the intact brain, multi-voxel pattern analysis was used in a series of event-related fMRI studies in a total of 46 healthy subjects. We predicted that activity patterns in this region would be determined by the identity of, rather than the conceptual similarity between, concrete entities. In a prior behavioral experiment, features were generated for each entity by more than 1000 subjects. Based on a hierarchical clustering analysis, the entities were organised into 3 semantic clusters (musical instruments, vehicles, tools). Entities were presented as words or pictures. With foveal presentation of pictures, cosine similarity between fMRI response patterns in right fusiform cortex appeared to reflect both the identity of and the semantic similarity between the entities. No such effects were found for words in this region. The effect of object identity was invariant to location, scaling, orientation axis, and color (grayscale versus color). It also persisted across different exemplars referring to the same concrete entity. The apparent semantic similarity effect, however, was not invariant. This study provides further support for a neurobiological distinction between structural description knowledge and processing of semantic relationships and confirms the role of right mid-posterior fusiform cortex in the former process, in accordance with previous lesion evidence. © 2013.
Edge co-occurrences can account for rapid categorization of natural versus animal images
NASA Astrophysics Data System (ADS)
Perrinet, Laurent U.; Bednar, James A.
2015-06-01
Making a judgment about the semantic category of a visual scene, such as whether it contains an animal, is typically assumed to involve high-level associative brain areas. Previous explanations require progressively analyzing the scene hierarchically at increasing levels of abstraction, from edge extraction to mid-level object recognition and then object categorization. Here we show that the statistics of edge co-occurrences alone are sufficient to perform a rough yet robust (translation, scale, and rotation invariant) scene categorization. We first extracted the edges from images using a scale-space analysis coupled with a sparse coding algorithm. We then computed the “association field” for different categories (natural, man-made, or containing an animal) by computing the statistics of edge co-occurrences. These differed strongly, with animal images having more curved configurations. We show that this geometry alone is sufficient for categorization, and that the pattern of errors made by humans is consistent with this procedure. Because these statistics could be measured as early as the primary visual cortex, the results challenge widely held assumptions about the flow of computations in the visual system. The results also suggest new algorithms for image classification and signal processing that exploit correlations between low-level structure and the underlying semantic category.
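A schematic sketch of such second-order edge statistics in plain NumPy; the paper's pipeline extracts edges by scale-space sparse coding, for which a simple image gradient is substituted here:

```python
import numpy as np

def edge_cooccurrence_stats(img, frac=0.1, n_bins=16, max_dist=8, n_pts=400):
    """Histogram the relative orientation of pairs of strong edge elements
    as a function of their separation: a crude "association field"."""
    gy, gx = np.gradient(np.asarray(img, float))
    mag, ori = np.hypot(gx, gy), np.arctan2(gy, gx)
    ys, xs = np.nonzero(mag > frac * mag.max())
    hist = np.zeros((max_dist, n_bins))
    if len(ys) == 0:
        return hist
    rng = np.random.default_rng(0)
    sel = rng.choice(len(ys), size=min(len(ys), n_pts), replace=False)
    pts = np.stack([ys[sel], xs[sel]], axis=1)
    ths = ori[ys[sel], xs[sel]]
    for a in range(len(pts)):
        for b in range(a + 1, len(pts)):
            d = int(np.hypot(*(pts[a] - pts[b])))
            if 0 < d < max_dist:
                rel = (ths[a] - ths[b]) % np.pi  # orientation is mod pi
                hist[d, int(rel / np.pi * n_bins) % n_bins] += 1
    return hist / max(hist.sum(), 1.0)
```

Categories would then be separated by the shape of these histograms, with animal images placing more probability mass on curved (co-circular) configurations.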
Random Wiring, Ganglion Cell Mosaics, and the Functional Architecture of the Visual Cortex
Coppola, David; White, Leonard E.; Wolf, Fred
2015-01-01
The architecture of iso-orientation domains in the primary visual cortex (V1) of placental carnivores and primates apparently follows species invariant quantitative laws. Dynamical optimization models assuming that neurons coordinate their stimulus preferences throughout cortical circuits linking millions of cells specifically predict these invariants. This might indicate that V1’s intrinsic connectome and its functional architecture adhere to a single optimization principle with high precision and robustness. To validate this hypothesis, it is critical to closely examine the quantitative predictions of alternative candidate theories. Random feedforward wiring within the retino-cortical pathway represents a conceptually appealing alternative to dynamical circuit optimization because random dimension-expanding projections are believed to generically exhibit computationally favorable properties for stimulus representations. Here, we ask whether the quantitative invariants of V1 architecture can be explained as a generic emergent property of random wiring. We generalize and examine the stochastic wiring model proposed by Ringach and coworkers, in which iso-orientation domains in the visual cortex arise through random feedforward connections between semi-regular mosaics of retinal ganglion cells (RGCs) and visual cortical neurons. We derive closed-form expressions for cortical receptive fields and domain layouts predicted by the model for perfectly hexagonal RGC mosaics. Including spatial disorder in the RGC positions considerably changes the domain layout properties as a function of disorder parameters such as position scatter and its correlations across the retina. However, independent of parameter choice, we find that the model predictions substantially deviate from the layout laws of iso-orientation domains observed experimentally. Considering random wiring with the currently most realistic model of RGC mosaic layouts, a pairwise interacting point process, the predicted layouts remain distinct from experimental observations and resemble Gaussian random fields. We conclude that V1 layout invariants are specific quantitative signatures of visual cortical optimization, which cannot be explained by generic random feedforward-wiring models. PMID:26575467
Neurons with two sites of synaptic integration learn invariant representations.
Körding, K P; König, P
2001-12-01
Neurons in mammalian cerebral cortex combine specific responses with respect to some stimulus features with invariant responses to other stimulus features. For example, in primary visual cortex, complex cells code for orientation of a contour but ignore its position to a certain degree. In higher areas, such as the inferotemporal cortex, translation-invariant, rotation-invariant, and even viewpoint-invariant responses can be observed. Such properties are of obvious interest to artificial systems performing tasks like pattern recognition. It remains to be resolved how such response properties develop in biological systems. Here we present an unsupervised learning rule that addresses this problem. It is based on a neuron model with two sites of synaptic integration, allowing qualitatively different effects of input to basal and apical dendritic trees, respectively. Without supervision, the system learns to extract invariance properties using temporal or spatial continuity of stimuli. Furthermore, top-down information can be smoothly integrated in the same framework. Thus, this model lends a physiological implementation to approaches of unsupervised learning of invariant-response properties.
Higher-Order Neural Networks Applied to 2D and 3D Object Recognition
NASA Technical Reports Server (NTRS)
Spirkovska, Lilly; Reid, Max B.
1994-01-01
A Higher-Order Neural Network (HONN) can be designed to be invariant to geometric transformations such as scale, translation, and in-plane rotation. Invariances are built directly into the architecture of a HONN and do not need to be learned. Thus, for 2D object recognition, the network needs to be trained on just one view of each object class, not numerous scaled, translated, and rotated views. Because the 2D object recognition task is a component of the 3D object recognition task, built-in 2D invariance also decreases the size of the training set required for 3D object recognition. We present results for 2D object recognition both in simulation and within a robotic vision experiment and for 3D object recognition in simulation. We also compare our method to other approaches and show that HONNs have distinct advantages for position, scale, and rotation-invariant object recognition. The major drawback of HONNs is that the size of the input field is limited due to the memory required for the large number of interconnections in a fully connected network. We present partial connectivity strategies and a coarse-coding technique for overcoming this limitation and increasing the input field to that required by practical object recognition problems.
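A sketch of the core idea that invariance is structural rather than learned; the binning below is illustrative, while Spirkovska and Reid's networks realize it as third-order interconnection weights shared across pixel triples:

```python
import numpy as np

def honn_triangle_features(binary_img, n_angle_bins=12, n_samples=500, seed=0):
    """Third-order terms combine pixel triples; all triples whose triangles
    share the same (binned) interior angles share one weight, and interior
    angles are unchanged by translation, scaling, and in-plane rotation.
    Counting sampled triples per shared-weight bin gives an invariant
    feature vector for the trainable output stage."""
    ys, xs = np.nonzero(binary_img)
    pts = np.stack([xs, ys], axis=1).astype(float)
    if len(pts) < 3:
        return {}
    rng = np.random.default_rng(seed)
    feats = {}
    for _ in range(n_samples):
        a, b, c = pts[rng.choice(len(pts), size=3, replace=False)]
        angles = []
        for p, q, r in ((a, b, c), (b, c, a), (c, a, b)):
            v1, v2 = q - p, r - p
            cosang = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-12)
            angles.append(np.arccos(np.clip(cosang, -1.0, 1.0)))
        # The three angles sum to pi, so the two smallest identify the shape.
        key = tuple(sorted(int(t / np.pi * n_angle_bins) for t in angles)[:2])
        feats[key] = feats.get(key, 0) + 1
    return feats
```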
Nakamura, Kimihiro; Makuuchi, Michiru; Nakajima, Yasoichi
2014-01-01
Previous studies show that the primate and human visual system automatically generates a common and invariant representation from a visual object image and its mirror reflection. For humans, however, this mirror-image generalization seems to be partially suppressed through literacy acquisition, since literate adults have greater difficulty in recognizing mirror images of letters than those of other visual objects. At the neural level, this category-specific effect on mirror-image processing has been associated with the left occipitotemporal cortex (L-OTC), but it remains unclear whether the apparent "inhibition" of mirror letters is mediated by suppressing mirror-image representations covertly generated from normal letter stimuli. Using transcranial magnetic stimulation (TMS), we examined how transient disruption of the L-OTC affects mirror-image recognition during a same-different judgment task, while varying the semantic category (letters and non-letter objects), identity (same or different), and orientation (same or mirror-reversed) of the first and second stimuli. We found that magnetic stimulation of the L-OTC produced a significant delay in mirror-image recognition for letter strings but not for other objects. By contrast, this category-specific impact was not observed when TMS was applied to other control sites, including the right homologous area and the vertex. These results thus demonstrate a causal link between the L-OTC and mirror-image discrimination in literate people. We further suggest that left-right sensitivity for letters is not achieved by a local inhibitory mechanism in the L-OTC but probably relies on inter-regional coupling with other orientation-sensitive occipito-parietal regions.
An object recognition method based on fuzzy theory and BP networks
NASA Astrophysics Data System (ADS)
Wu, Chuan; Zhu, Ming; Yang, Dong
2006-01-01
It is difficult to choose feature vectors when a neural network recognizes objects: if the features are chosen inappropriately, different objects may yield similar feature vectors, or the same object may yield different feature vectors under scaling, shifting, and rotation. To solve this problem, the image is edge-detected, the membership function is reconstructed, and a new threshold-segmentation method based on fuzzy theory is proposed to obtain a binary image. Moment invariants of the binary image are extracted and normalized. Because some moment invariants are too small to compute with effectively, the logarithms of the moment invariants are taken as the input feature vectors of a BP network. The experimental results demonstrate that the proposed approach recognizes objects effectively, correctly, and quickly.
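A compact sketch of the moment-invariant step in plain NumPy, assuming the fuzzy segmentation stage has already produced the binary image; only Hu's first four invariants are shown:

```python
import numpy as np

def log_moment_invariants(binary_img):
    ys, xs = np.nonzero(binary_img)
    x0, y0 = xs.mean(), ys.mean()
    def mu(p, q):   # central moments: translation invariant
        return (((xs - x0) ** p) * ((ys - y0) ** q)).sum()
    def eta(p, q):  # normalized moments: also scale invariant
        return mu(p, q) / mu(0, 0) ** (1 + (p + q) / 2.0)
    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    hu = np.array([
        n20 + n02,                                    # Hu 1
        (n20 - n02) ** 2 + 4 * n11 ** 2,              # Hu 2
        (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2,  # Hu 3
        (n30 + n12) ** 2 + (n21 + n03) ** 2,          # Hu 4
    ])
    # Logarithms keep very small invariants numerically usable as
    # network inputs, as described above.
    return np.sign(hu) * np.log10(np.abs(hu) + 1e-30)
```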
NASA Astrophysics Data System (ADS)
Pomares, Jorge; Felicetti, Leonard; Pérez, Javier; Emami, M. Reza
2018-02-01
An image-based servo controller for the guidance of a spacecraft during non-cooperative rendezvous is presented in this paper. The controller directly utilizes the visual features from image frames of a target spacecraft for computing both attitude and orbital maneuvers concurrently. The utilization of adaptive optics, such as zooming cameras, is also addressed through developing an invariant-image servo controller. The controller allows for performing rendezvous maneuvers independently from the adjustments of the camera focal length, improving the performance and versatility of maneuvers. The stability of the proposed control scheme is proven analytically in the invariant space, and its viability is explored through numerical simulations.
NASA Astrophysics Data System (ADS)
Wihardi, Y.; Setiawan, W.; Nugraha, E.
2018-01-01
In this research we build a content-based image retrieval system (CBIRS) based on a learned distance/similarity function, using Linear Discriminant Analysis (LDA) and Histogram of Oriented Gradients (HoG) features. Our method is invariant to the depiction of an image, covering image-to-image, sketch-to-image, and painting-to-image similarity. LDA decreases execution time compared to the state-of-the-art method, but it still needs improvement in terms of accuracy. The inaccuracy in our experiment arises because we did not perform a sliding-window search and because of the low number of negative samples of natural-world images.
Priming of familiar and unfamiliar visual objects over delays in young and older adults.
Soldan, Anja; Hilton, H John; Cooper, Lynn A; Stern, Yaakov
2009-03-01
Although priming of familiar stimuli is usually age invariant, little is known about how aging affects priming of preexperimentally unfamiliar stimuli. Therefore, this study investigated the effects of aging and encoding-to-test delays (0 min, 20 min, 90 min, and 1 week) on priming of unfamiliar objects in block-based priming paradigms. During the encoding phase, participants viewed pictures of novel objects (Experiments 1 and 2) or novel and familiar objects (Experiment 3) and judged their left-right orientation. In the test block, priming was measured using the possible-impossible object-decision test (Experiment 1), symmetric-asymmetric object-decision test (Experiment 2), and real-nonreal object-decision test (Experiment 3). In Experiments 1 and 2, young adults showed priming for unfamiliar objects at all delays, whereas older adults whose baseline task performance was similar to that of young adults did not show any priming. Experiment 3 found no effects of age or delay on priming of familiar objects; however, priming of unfamiliar objects was only observed in the young participants. This suggests that when older adults cannot rely on preexisting memory representations, age-related deficits in priming can emerge.
Invariant object recognition based on the generalized discrete radon transform
NASA Astrophysics Data System (ADS)
Easley, Glenn R.; Colonna, Flavia
2004-04-01
We introduce a method for classifying objects based on special cases of the generalized discrete Radon transform. We adjust the transform and the corresponding ridgelet transform by means of circular shifting and a singular value decomposition (SVD) to obtain a translation, rotation and scaling invariant set of feature vectors. We then use a back-propagation neural network to classify the input feature vectors. We conclude with experimental results and compare these with other invariant recognition methods.
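A rough sketch of that recipe, assuming scikit-image for the (ordinary) Radon transform; the paper's generalized discrete Radon transform and neural classifier are not reproduced:

```python
import numpy as np
from skimage.transform import radon  # assumption: scikit-image available

def radon_invariant_features(img, n_angles=180, k=8):
    # An in-plane rotation of the image circularly shifts the sinogram
    # along its angle axis, so the FFT magnitude along that axis removes
    # rotation; an SVD then compresses the result to a short vector.
    # Translation and scale normalization are assumed done beforehand.
    theta = np.linspace(0.0, 180.0, n_angles, endpoint=False)
    sino = radon(np.asarray(img, float), theta=theta)  # (radial, angle)
    rot_inv = np.abs(np.fft.fft(sino, axis=1))
    s = np.linalg.svd(rot_inv, compute_uv=False)
    return s[:k]  # compact feature vector for the classifier
```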
Electrophysiological evidence for size invariance in masked picture repetition priming
Eddy, Marianna D.; Holcomb, Phillip J.
2009-01-01
This experiment examined invariance in object representations by measuring event-related potentials (ERPs) to pictures in a masked repetition priming paradigm. Pairs of pictures were presented where the prime was either the same size or half the size of the target object, and the target was either presented in a normal orientation or was a normal-sized mirror reflection of the prime object. Previous masked repetition priming studies have found a cascade of priming effects sensitive to perceptual (N190/P190) and semantic (N400) properties of the stimulus. This experiment found that both early (N190/P190) and later (N400) effects were invariant to size, whereas only the N190/P190 effect was invariant to mirror reflection. The combination of a small prime and a mirror-reflected target led to no significant priming effects. Taken together, the results of this set of experiments suggest that object recognition, more specifically, activating an object representation, occurs in a hierarchical fashion where overlapping perceptual information between the prime and target is necessary, although not always sufficient, to activate a higher-level semantic representation. PMID:19560248
Leibo, Joel Z.; Liao, Qianli; Freiwald, Winrich A.; Anselmi, Fabio; Poggio, Tomaso
2017-01-01
The primate brain contains a hierarchy of visual areas, dubbed the ventral stream, which rapidly computes object representations that are both specific for object identity and robust against identity-preserving transformations like depth-rotations [1, 2]. Current computational models of object recognition, including recent deep learning networks, generate these properties through a hierarchy of alternating selectivity-increasing filtering and tolerance-increasing pooling operations, similar to simple-complex cells operations [3, 4, 5, 6]. Here we prove that a class of hierarchical architectures and a broad set of biologically plausible learning rules generate approximate invariance to identity-preserving transformations at the top level of the processing hierarchy. However, all past models tested failed to reproduce the most salient property of an intermediate representation of a three-level face-processing hierarchy in the brain: mirror-symmetric tuning to head orientation [7]. Here we demonstrate that one specific biologically-plausible Hebb-type learning rule generates mirror-symmetric tuning to bilaterally symmetric stimuli like faces at intermediate levels of the architecture and show why it does so. Thus the tuning properties of individual cells inside the visual stream appear to result from group properties of the stimuli they encode and to reflect the learning rules that sculpted the information-processing system within which they reside. PMID:27916522
Gnadt, William; Grossberg, Stephen
2008-06-01
How do reactive and planned behaviors interact in real time? How are sequences of such behaviors released at appropriate times during autonomous navigation to realize valued goals? Controllers for both animals and mobile robots, or animats, need reactive mechanisms for exploration, and learned plans to reach goal objects once an environment becomes familiar. The SOVEREIGN (Self-Organizing, Vision, Expectation, Recognition, Emotion, Intelligent, Goal-oriented Navigation) animat model embodies these capabilities, and is tested in a 3D virtual reality environment. SOVEREIGN includes several interacting subsystems which model complementary properties of cortical What and Where processing streams and which clarify similarities between mechanisms for navigation and arm movement control. As the animat explores an environment, visual inputs are processed by networks that are sensitive to visual form and motion in the What and Where streams, respectively. Position-invariant and size-invariant recognition categories are learned by real-time incremental learning in the What stream. Estimates of target position relative to the animat are computed in the Where stream, and can activate approach movements toward the target. Motion cues from animat locomotion can elicit head-orienting movements to bring a new target into view. Approach and orienting movements are alternately performed during animat navigation. Cumulative estimates of each movement are derived from interacting proprioceptive and visual cues. Movement sequences are stored within a motor working memory. Sequences of visual categories are stored in a sensory working memory. These working memories trigger learning of sensory and motor sequence categories, or plans, which together control planned movements. Predictively effective chunk combinations are selectively enhanced via reinforcement learning when the animat is rewarded. Selected planning chunks effect a gradual transition from variable reactive exploratory movements to efficient goal-oriented planned movement sequences. Volitional signals gate interactions between model subsystems and the release of overt behaviors. The model can control different motor sequences under different motivational states and learns more efficient sequences to rewarded goals as exploration proceeds.
Transfer of contextual cueing in full-icon display remapping.
Shi, Zhuanghua; Zang, Xuelian; Jia, Lina; Geyer, Thomas; Müller, Hermann J
2013-02-25
Invariant spatial context can expedite visual search, an effect that is known as contextual cueing (e.g., Chun & Jiang, 1998). However, disrupting learned display configurations abolishes the effect. In current touch-based mobile devices, such as the iPad, icons are shuffled and remapped when the display mode is changed. However, such remapping also disrupts the spatial relationships between icons. This may hamper usability. In the present study, we examined the transfer of contextual cueing in four different methods of display remapping: position-order invariant, global rotation, local invariant, and central invariant. We used full-icon landscape mode for training and both landscape and portrait modes for testing, to check whether the cueing transfers to portrait mode. The results showed transfer of contextual cueing but only with the local invariant and the central invariant remapping methods. We take the results to mean that the predictability of target locations is a crucial factor for the transfer of contextual cueing and thus icon remapping design for mobile devices.
Hypothesis Support Mechanism for Mid-Level Visual Pattern Recognition
NASA Technical Reports Server (NTRS)
Amador, Jose J (Inventor)
2007-01-01
A method of mid-level pattern recognition provides for a pose-invariant Hough Transform by parametrizing pairs of points in a pattern with respect to at least two reference points, thereby providing a parameter table that is scale- or rotation-invariant. A corresponding inverse transform may be applied to test hypothesized matches in an image, and a distance transform may be utilized to quantify the level of match.
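A speculative sketch of that parameterization; all names and the binning are invented for illustration, and the patent's exact construction may differ:

```python
import numpy as np

def invariant_pair_table(points, ref_a, ref_b, n_bins=32):
    """Index each point pair by coordinates defined by two reference
    points: radii in units of the reference baseline, angles relative to
    its direction. The keys are then unchanged by translation, rotation,
    and uniform scaling of the whole pattern."""
    ref_a, ref_b = np.asarray(ref_a, float), np.asarray(ref_b, float)
    base = ref_b - ref_a
    length, ang = np.linalg.norm(base), np.arctan2(base[1], base[0])
    def encode(p):
        v = np.asarray(p, float) - ref_a
        r = np.linalg.norm(v) / length  # baseline units: scale-free
        th = (np.arctan2(v[1], v[0]) - ang) % (2 * np.pi)
        return (int(r * n_bins), int(th / (2 * np.pi) * n_bins))
    table = {}
    pts = list(points)
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            key = (encode(pts[i]), encode(pts[j]))
            table.setdefault(key, []).append((i, j))
    return table
```

At recognition time, hypothesized matches would be voted on by looking up the same keys computed from image points.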
Cognitive, perceptual and action-oriented representations of falling objects.
Zago, Myrka; Lacquaniti, Francesco
2005-01-01
We interact daily with moving objects. How accurate are our predictions about objects' motions? What sources of information do we use? These questions have received wide attention from a variety of different viewpoints. On one end of the spectrum are the ecological approaches assuming that all the information about the visual environment is present in the optic array, with no need to postulate conscious or unconscious representations. On the other end of the spectrum are the constructivist approaches assuming that a more or less accurate representation of the external world is built in the brain using explicit or implicit knowledge or memory besides sensory inputs. Representations can be related to naive physics or to context cue-heuristics or to the construction of internal copies of environmental invariants. We address the issue of prediction of objects' fall at different levels. Cognitive understanding and perceptual judgment of simple Newtonian dynamics can be surprisingly inaccurate. By contrast, motor interactions with falling objects are often very accurate. We argue that the pragmatic action-oriented behaviour and the perception-oriented behaviour may use different modes of operation and different levels of representation.
Neural Representations that Support Invariant Object Recognition
Goris, Robbe L. T.; Op de Beeck, Hans P.
2008-01-01
Neural mechanisms underlying invariant behaviour such as object recognition are not well understood. For brain regions critical for object recognition, such as inferior temporal cortex (ITC), there is now ample evidence indicating that single cells code for many stimulus aspects, implying that only a moderate degree of invariance is present. However, recent theoretical and empirical work seems to suggest that integrating responses of multiple non-invariant units may produce invariant representations at population level. We provide an explicit test for the hypothesis that a linear read-out mechanism of a pool of units resembling ITC neurons may achieve invariant performance in an identification task. A linear classifier was trained to decode a particular value in a 2-D stimulus space using as input the response pattern across a population of units. Only one dimension was relevant for the task, and the stimulus location on the irrelevant dimension (ID) was kept constant during training. In a series of identification tests, the stimulus location on the relevant dimension (RD) and ID was manipulated, yielding estimates for both the level of sensitivity and tolerance reached by the network. We studied the effects of several single-cell characteristics as well as population characteristics typically considered in the literature, but found little support for the hypothesis. While the classifier averages out effects of idiosyncratic tuning properties and inter-unit variability, its invariance is very much determined by the (hypothetical) ‘average’ neuron. Consequently, even at population level there exists a fundamental trade-off between selectivity and tolerance, and invariant behaviour does not emerge spontaneously. PMID:19242556
The Role of Pulvinar in the Transmission of Information in the Visual Hierarchy
Cortes, Nelson; van Vreeswijk, Carl
2012-01-01
Visual receptive field (RF) attributes in the visual cortex of primates have been explained mainly from cortical connections: visual RFs progress from simple to complex through cortico-cortical pathways from lower to higher levels in the visual hierarchy. This feedforward flow of information is paired with top-down processes through the feedback pathway. Although the hierarchical organization explains the spatial properties of RFs, it is unclear how a non-linear transmission of activity through the visual hierarchy can yield smooth contrast response functions at all levels of the hierarchy. Depending on the gain, non-linear transfer functions create either a bimodal response to contrast, or no contrast dependence of the response, in the highest level of the hierarchy. One possible mechanism to regulate this transmission of visual contrast information from low to high levels involves an external component that shortcuts the flow of information through the hierarchy. A candidate for this shortcut is the pulvinar nucleus of the thalamus. To investigate the representation of stimulus contrast, a hierarchical model network of ten cortical areas is examined. At each level of the network, the activity from the previous layer is integrated and then non-linearly transmitted to the next level. The arrangement of interactions creates a gradient from simple to complex RFs of increasing size as one moves from lower to higher cortical levels. The visual input is modeled as a Gaussian random input whose width codes for the contrast. This input is applied to the first area. The output activity ratio among different contrast values is analyzed at the last level to observe sensitivity to contrast and contrast-invariant tuning. For a purely cortical system, the output of the last area can be approximately contrast invariant, but the sensitivity to contrast is poor. To account for an alternative visual processing pathway, non-reciprocal connections from and to a parallel pulvinar-like structure of nine areas are coupled to the system. Compared to the pure feedforward model, the cortico-pulvino-cortical output presents much more sensitivity to contrast and a similar level of contrast invariance of the tuning. PMID:22654750
An, Xu; Gong, Hongliang; Yin, Jiapeng; Wang, Xiaochun; Pan, Yanxia; Zhang, Xian; Lu, Yiliang; Yang, Yupeng; Toth, Zoltan; Schiessl, Ingo; McLoughlin, Niall; Wang, Wei
2014-01-01
Visual scenes can be readily decomposed into a variety of oriented components, the processing of which is vital for object segregation and recognition. In primate V1 and V2, most neurons have small spatio-temporal receptive fields responding selectively to oriented luminance contours (first order), while only a subgroup of neurons signal non-luminance defined contours (second order). So how is the orientation of second-order contours represented at the population level in macaque V1 and V2? Here we compared the population responses in macaque V1 and V2 to two types of second-order contour stimuli generated either by modulation of contrast or phase reversal with those to first-order contour stimuli. Using intrinsic signal optical imaging, we found that the orientation of second-order contour stimuli was represented invariantly in the orientation columns of both macaque V1 and V2. A physiologically constrained spatio-temporal energy model of V1 and V2 neuronal populations could reproduce all the recorded population responses. These findings suggest that, at the population level, the primate early visual system processes the orientation of second-order contours initially through a linear spatio-temporal filter mechanism. Our results of population responses to different second-order contour stimuli support the idea that the orientation maps in primate V1 and V2 can be described as a spatial-temporal energy map. PMID:25188576
The Perspective Structure of Visual Space
2015-01-01
Luneburg’s model has been the reference for experimental studies of visual space for almost seventy years. His claim for a curved visual space has been a source of inspiration for visual scientists as well as philosophers. The conclusion of many experimental studies has been that Luneburg’s model does not describe visual space in various tasks and conditions. Remarkably, no alternative model has been suggested. The current study explores perspective transformations of Euclidean space as a model for visual space. Computations show that the geometry of perspective spaces is considerably different from that of Euclidean space. Collinearity but not parallelism is preserved in perspective space, and angles are not invariant under translation and rotation. Similar relationships have been shown to be properties of visual space. Alley experiments performed early in the twentieth century were instrumental in hypothesizing curved visual spaces. Alleys were computed in perspective space and compared with reconstructed alleys of Blumenfeld. Parallel alleys were accurately described by perspective geometry. Accurate distance alleys were derived from parallel alleys by adjusting the interstimulus distances according to the size-distance invariance hypothesis. Agreement between computed and experimental alleys, and accommodation of experimental results that rejected Luneburg’s model, shows that perspective space is an appropriate model for how we perceive orientations and angles. The model is also appropriate for perceived distance ratios between stimuli but fails to predict perceived distances. PMID:27648222
Spatial filtering, color constancy, and the color-changing dress.
Dixon, Erica L; Shapiro, Arthur G
2017-03-01
The color-changing dress is a 2015 Internet phenomenon in which the colors in a picture of a dress are reported as blue-black by some observers and white-gold by others. The standard explanation is that observers make different inferences about the lighting (is the dress in shadow or bright yellow light?); based on these inferences, observers make a best guess about the reflectance of the dress. The assumption underlying this explanation is that reflectance is the key to color constancy because reflectance alone remains invariant under changes in lighting conditions. Here, we demonstrate an alternative type of invariance across illumination conditions: an object that appears to vary in color under blue, white, or yellow illumination does not change color in the high spatial frequency region. A first approximation to color constancy can therefore be accomplished by a high-pass filter that retains enough low spatial frequency content so as not to completely desaturate the object. We demonstrate the implications of this idea on the Rubik's cube illusion; on a shirt placed under white, yellow, and blue illuminants; and on spatially filtered images of the dress. We hypothesize that observer perceptions of the dress's color vary because of individual differences in how the visual system extracts high and low spatial frequency color content from the environment, and we demonstrate cross-group differences in average sensitivity to low spatial frequency patterns.
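A small sketch of that account, assuming SciPy for the blur; sigma and low_gain are illustrative parameters, not values from the paper:

```python
import numpy as np
from scipy.ndimage import gaussian_filter  # assumption: SciPy available

def highpass_constancy(rgb, sigma=15.0, low_gain=0.3):
    # Keep the high spatial frequency color content (stable across the
    # blue/white/yellow illuminants discussed above) plus a small fraction
    # of the low frequency content so the object is not fully desaturated.
    rgb = np.asarray(rgb, float)
    low = np.stack([gaussian_filter(rgb[..., c], sigma) for c in range(3)],
                   axis=-1)
    out = (rgb - low) + low_gain * low  # high-pass plus attenuated low-pass
    out -= out.min()
    return out / max(np.ptp(out), 1e-9)  # rescale for display
```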
Altschuler, Ted S.; Molholm, Sophie; Butler, John S.; Mercier, Manuel R.; Brandwein, Alice B.; Foxe, John J.
2014-01-01
The adult human visual system can efficiently fill in missing object boundaries when low-level information from the retina is incomplete, but little is known about how these processes develop across childhood. A decade of visual-evoked potential (VEP) studies has produced a theoretical model identifying distinct phases of contour completion in adults. The first, termed a perceptual phase, occurs from approximately 100-200 ms and is associated with automatic boundary completion. The second is termed a conceptual phase occurring between 230-400 ms. The latter has been associated with the analysis of ambiguous objects which seem to require more effort to complete. The electrophysiological markers of these phases have both been localized to the lateral occipital complex, a cluster of ventral visual stream brain regions associated with object processing. We presented Kanizsa-type illusory contour stimuli, often used for exploring contour completion processes, to neurotypical persons aged 6-31 (N = 63), while parametrically varying the spatial extent of these induced contours, in order to better understand how filling-in processes develop across childhood and adolescence. Our results suggest that, while adults complete contour boundaries in a single discrete period during the automatic perceptual phase, children display an immature response pattern, engaging in more protracted processing across both timeframes and appearing to recruit more widely distributed regions resembling those evoked during adult processing of higher-order ambiguous figures. However, children older than 5 years of age were remarkably like adults in that the effects of contour processing were invariant to manipulation of contour extent. PMID:24365674
Image-based automatic recognition of larvae
NASA Astrophysics Data System (ADS)
Sang, Ru; Yu, Guiying; Fan, Weijun; Guo, Tiantai
2010-08-01
To date, quarantine pest recognition research has focused mainly on imagoes (adult insects). However, pests in their larval stage are latent, and larvae spread easily with the circulation of agricultural and forest products. In this paper, larvae are taken as new research objects and recognized by means of machine vision, image processing, and pattern recognition. Color image segmentation applied to larva images preserves more visual information and improves the recognition rate. The scale-invariant feature transform (SIFT), with its affine, perspective, and brightness invariance, is adopted for feature extraction. A neural network algorithm is utilized for pattern recognition, and automatic identification of larva images is achieved with satisfactory results.
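A sketch of the feature-extraction step, assuming OpenCV 4.4 or later, where SIFT ships in the main module; the paper's segmentation and neural-network stages are omitted:

```python
import cv2  # assumption: OpenCV >= 4.4

def larva_descriptors(gray_img):
    """SIFT keypoints and 128-D descriptors: scale- and rotation-invariant
    and robust to brightness changes, as the abstract notes. A downstream
    neural network would classify the descriptors."""
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(gray_img, None)  # 8-bit input
    return keypoints, descriptors
```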
Orientation-Invariant Object Recognition: Evidence from Repetition Blindness
ERIC Educational Resources Information Center
Harris, Irina M.; Dux, Paul E.
2005-01-01
The question of whether object recognition is orientation-invariant or orientation-dependent was investigated using a repetition blindness (RB) paradigm. In RB, the second occurrence of a repeated stimulus is less likely to be reported, compared to the occurrence of a different stimulus, if it occurs within a short time of the first presentation.…
Convergent and invariant object representations for sight, sound, and touch.
Man, Kingson; Damasio, Antonio; Meyer, Kaspar; Kaplan, Jonas T
2015-09-01
We continuously perceive objects in the world through multiple sensory channels. In this study, we investigated the convergence of information from different sensory streams within the cerebral cortex. We presented volunteers with three common objects via three different modalities-sight, sound, and touch-and used multivariate pattern analysis of functional magnetic resonance imaging data to map the cortical regions containing information about the identity of the objects. We could reliably predict which of the three stimuli a subject had seen, heard, or touched from the pattern of neural activity in the corresponding early sensory cortices. Intramodal classification was also successful in large portions of the cerebral cortex beyond the primary areas, with multiple regions showing convergence of information from two or all three modalities. Using crossmodal classification, we also searched for brain regions that would represent objects in a similar fashion across different modalities of presentation. We trained a classifier to distinguish objects presented in one modality and then tested it on the same objects presented in a different modality. We detected audiovisual invariance in the right temporo-occipital junction, audiotactile invariance in the left postcentral gyrus and parietal operculum, and visuotactile invariance in the right postcentral and supramarginal gyri. Our maps of multisensory convergence and crossmodal generalization reveal the underlying organization of the association cortices, and may be related to the neural basis for mental concepts. © 2015 Wiley Periodicals, Inc.
Mahajan, Dhruv; Ramamoorthi, Ravi; Curless, Brian
2008-02-01
This paper develops a theory of frequency domain invariants in computer vision. We derive novel identities using spherical harmonics, which are the angular frequency domain analog to common spatial domain invariants such as reflectance ratios. These invariants are derived from the spherical harmonic convolution framework for reflection from a curved surface. Our identities apply in a number of canonical cases, including single and multiple images of objects under the same and different lighting conditions. One important case we consider is two different glossy objects in two different lighting environments. For this case, we derive a novel identity, independent of the specific lighting configurations or BRDFs, that allows us to directly estimate the fourth image if the other three are available. The identity can also be used as an invariant to detect tampering in the images. While this paper is primarily theoretical, it has the potential to lay the mathematical foundations for two important practical applications. First, we can develop more general algorithms for inverse rendering problems, which can directly relight and change material properties by transferring the BRDF or lighting from another object or illumination. Second, we can check the consistency of an image, to detect tampering or image splicing.
Recognizing 3-D Objects Using 2-D Images
1993-05-01
also depends on models that contain significant numbers of viewpoint-invariant features, such as parallelograms. Biederman [9] built on Lowe's work to ... objects. Biederman suggests that we recognize images of objects by dividing the image into a few parts, called geons. Each geon is described by the ... are also described with a few view-invariant features. Together, these provide a set of features ... which Biederman
Burmann, Britta; Dehnhardt, Guido; Mauck, Björn
2005-01-01
Mental rotation is a widely accepted concept indicating an image-like mental representation of visual information and an analogue mode of information processing in certain visuospatial tasks. In the task of discriminating between image and mirror-image of rotated figures, human reaction times increase with the angular disparity between the figures. In animals, tests of this kind yield inconsistent results. Pigeons were found to use a time-independent rotational invariance, possibly indicating a non-analogue information processing system that evolved in response to the horizontal plane of reference birds perceive during flight. Despite similar ecological demands concerning the visual reference plane, a sea lion was found to use mental rotation in similar tasks, but its processing speed while rotating three-dimensional stimuli seemed to depend on the axis of rotation in a different way than found for humans in similar tasks. If ecological demands influence the way information processing systems evolve, hominids might have secondarily lost the ability of rotational invariance while retreating from arboreal living and evolving an upright gait in which the vertical reference plane is more important. We therefore conducted mental rotation experiments with an arboreal living primate species, the lion-tailed macaque. Performing a two-alternative matching-to-sample procedure, the animal had to decide between rotated figures representing image and mirror-image of a previously shown upright sample. Although non-rotated stimuli were recognized faster than rotated ones, the animal's mean reaction times did not clearly increase with the angle of rotation. These results are inconsistent with the mental rotation concept but also cannot be explained assuming a mere rotational invariance. Our study thus seems to support the idea of information processing systems evolving gradually in response to specific ecological demands.
Leibo, Joel Z; Liao, Qianli; Anselmi, Fabio; Freiwald, Winrich A; Poggio, Tomaso
2017-01-09
The primate brain contains a hierarchy of visual areas, dubbed the ventral stream, which rapidly computes object representations that are both specific for object identity and robust against identity-preserving transformations, like depth rotations [1, 2]. Current computational models of object recognition, including recent deep-learning networks, generate these properties through a hierarchy of alternating selectivity-increasing filtering and tolerance-increasing pooling operations, similar to simple-complex cell operations [3-6]. Here, we prove that a class of hierarchical architectures and a broad set of biologically plausible learning rules generate approximate invariance to identity-preserving transformations at the top level of the processing hierarchy. However, all past models tested failed to reproduce the most salient property of an intermediate representation of a three-level face-processing hierarchy in the brain: mirror-symmetric tuning to head orientation [7]. Here, we demonstrate that one specific biologically plausible Hebb-type learning rule generates mirror-symmetric tuning to bilaterally symmetric stimuli, like faces, at intermediate levels of the architecture and show why it does so. Thus, the tuning properties of individual cells inside the visual stream appear to result from group properties of the stimuli they encode and to reflect the learning rules that sculpted the information-processing system within which they reside. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Tsao, Thomas R.; Tsao, Doris
1997-04-01
In the 1980s, neurobiologists suggested a simple mechanism in primate visual cortex for maintaining a stable and invariant representation of a moving object. The receptive fields of visual neurons transform in real time in response to motion, maintaining a stable representation. When the visual stimulus changes due to motion, the geometric transform of the stimulus triggers a dual transform of the receptive field, which compensates for the geometric variation in the stimulus. This process can be modelled using a Lie group method. The massive array of affine-parameter-sensing circuits functions as a smart sensor tightly coupled to the passive imaging sensor (retina). The neural geometric engine is a neuromorphic computing device that simulates our Lie group model of spatial perception in the primate primary visual cortex. We have developed a computer simulation, experimented on realistic and synthetic image data, and performed preliminary research on using analog VLSI technology to implement the neural geometric engine. We have benchmarked the engine on DMA terrain data against their results and built an analog integrated circuit to verify its computational structure. When fully implemented on an analog VLSI chip, the engine will be able to accurately reconstruct a 3D terrain surface in real time from stereoscopic imagery.
Marginalization in neural circuits with divisive normalization
Beck, J.M.; Latham, P.E.; Pouget, A.
2011-01-01
A wide range of computations performed by the nervous system involves a type of probabilistic inference known as marginalization. This computation comes up in seemingly unrelated tasks, including causal reasoning, odor recognition, motor control, visual tracking, coordinate transformations, visual search, decision making, and object recognition, to name just a few. The question we address here is: how could neural circuits implement such marginalizations? We show that when spike trains exhibit a particular type of statistics – associated with constant Fano factors and gain-invariant tuning curves, as is often reported in vivo – some of the more common marginalizations can be achieved with networks that implement a quadratic nonlinearity and divisive normalization, the latter being a type of nonlinear lateral inhibition that has been widely reported in neural circuits. Previous studies have implicated divisive normalization in contrast gain control and attentional modulation. Our results raise the possibility that it is involved in yet another, highly critical, computation: near optimal marginalization in a remarkably wide range of tasks. PMID:22031877
An information capacity limitation of visual short-term memory.
Sewell, David K; Lilburn, Simon D; Smith, Philip L
2014-12-01
Research suggests that visual short-term memory (VSTM) has both an item capacity, of around 4 items, and an information capacity. We characterize the information capacity limits of VSTM using a task in which observers discriminated the orientation of a single probed item in displays consisting of 1, 2, 3, or 4 orthogonally oriented Gabor patch stimuli that were presented in noise for 50 ms, 100 ms, 150 ms, or 200 ms. The observed capacity limitations are well described by a sample-size model, which predicts invariance of ∑_i (d′_i)² for displays of different sizes and linearity of (d′_i)² for displays of different durations. Performance was the same for simultaneous and sequentially presented displays, which implicates VSTM as the locus of the observed invariance and rules out explanations that ascribe it to divided attention or stimulus encoding. The invariance of ∑_i (d′_i)² is predicted by the competitive interaction theory of Smith and Sewell (2013), which attributes it to the normalization of VSTM trace strengths arising from competition among stimuli entering VSTM. PsycINFO Database Record (c) 2014 APA, all rights reserved.
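A minimal worked sketch of why a sample-size model yields both signatures; the sample pool S and the proportionalities are illustrative assumptions, not the paper's derivation.

```latex
% A fixed pool of S noisy samples is split evenly among the n display
% items, and an item's sensitivity grows with its share of samples:
\begin{equation*}
  (d'_i)^2 \propto \frac{S}{n}
  \qquad\Longrightarrow\qquad
  \sum_{i=1}^{n} (d'_i)^2 \propto S
  \quad\text{(invariant over display size } n\text{)} .
\end{equation*}
% If S accrues linearly with exposure duration, (d'_i)^2 is likewise
% linear in duration for a fixed display size, matching both signatures.
```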
Devereux, Barry J; Clarke, Alex; Marouchos, Andreas; Tyler, Lorraine K
2013-11-27
Understanding the meanings of words and objects requires the activation of underlying conceptual representations. Semantic representations are often assumed to be coded such that meaning is evoked regardless of the input modality. However, the extent to which meaning is coded in modality-independent or amodal systems remains controversial. We address this issue in a human fMRI study investigating the neural processing of concepts, presented separately as written words and pictures. Activation maps for each individual word and picture were used as input for searchlight-based multivoxel pattern analyses. Representational similarity analysis was used to identify regions correlating with low-level visual models of the words and objects and the semantic category structure common to both. Common semantic category effects for both modalities were found in a left-lateralized network, including left posterior middle temporal gyrus (LpMTG), left angular gyrus, and left intraparietal sulcus (LIPS), in addition to object- and word-specific semantic processing in ventral temporal cortex and more anterior MTG, respectively. To explore differences in representational content across regions and modalities, we developed novel data-driven analyses, based on k-means clustering of searchlight dissimilarity matrices and seeded correlation analysis. These revealed subtle differences in the representations in semantic-sensitive regions, with representations in LIPS being relatively invariant to stimulus modality and representations in LpMTG being uncorrelated across modality. These results suggest that, although both LpMTG and LIPS are involved in semantic processing, only the functional role of LIPS is the same regardless of the visual input, whereas the functional role of LpMTG differs for words and objects.
Ossikovski, Razvigor; Vizet, Jérémy
2016-07-01
We report on the identification of the two algebraic invariants inherent to Mueller matrix polarimetry measurements performed through double pass illumination-collection optics (e.g., an optical fiber or an objective) of unknown polarimetric response. The practical use of the invariants, potentially applicable to the characterization of nonreciprocal media, is illustrated on experimental examples.
Fazl, Arash; Grossberg, Stephen; Mingolla, Ennio
2009-02-01
How does the brain learn to recognize an object from multiple viewpoints while scanning a scene with eye movements? How does the brain avoid the problem of erroneously classifying parts of different objects together? How are attention and eye movements intelligently coordinated to facilitate object learning? A neural model provides a unified mechanistic explanation of how spatial and object attention work together to search a scene and learn what is in it. The ARTSCAN model predicts how an object's surface representation generates a form-fitting distribution of spatial attention, or "attentional shroud". All surface representations dynamically compete for spatial attention to form a shroud. The winning shroud persists during active scanning of the object. The shroud maintains sustained activity of an emerging view-invariant category representation while multiple view-specific category representations are learned and are linked through associative learning to the view-invariant object category. The shroud also helps to restrict scanning eye movements to salient features on the attended object. Object attention plays a role in controlling and stabilizing the learning of view-specific object categories. Spatial attention hereby coordinates the deployment of object attention during object category learning. Shroud collapse releases a reset signal that inhibits the active view-invariant category in the What cortical processing stream. Then a new shroud, corresponding to a different object, forms in the Where cortical processing stream, and search using attention shifts and eye movements continues to learn new objects throughout a scene. The model mechanistically clarifies basic properties of attention shifts (engage, move, disengage) and inhibition of return. It simulates human reaction time data about object-based spatial attention shifts, and learns with 98.1% accuracy and a compression of 430 on a letter database whose letters vary in size, position, and orientation. The model provides a powerful framework for unifying many data about spatial and object attention, and their interactions during perception, cognition, and action.
Hybrid vision activities at NASA Johnson Space Center
NASA Technical Reports Server (NTRS)
Juday, Richard D.
1990-01-01
NASA's Johnson Space Center in Houston, Texas, is active in several aspects of hybrid image processing. (The term hybrid image processing refers to a system that combines digital and photonic processing). The major thrusts are autonomous space operations such as planetary landing, servicing, and rendezvous and docking. By processing images in non-Cartesian geometries to achieve shift invariance to canonical distortions, researchers use certain aspects of the human visual system for machine vision. That technology flow is bidirectional; researchers are investigating the possible utility of video-rate coordinate transformations for human low-vision patients. Man-in-the-loop teleoperations are also supported by the use of video-rate image-coordinate transformations, as researchers plan to use bandwidth compression tailored to the varying spatial acuity of the human operator. Technological elements being developed in the program include upgraded spatial light modulators, real-time coordinate transformations in video imagery, synthetic filters that robustly allow estimation of object pose parameters, convolutionally blurred filters that have continuously selectable invariance to such image changes as magnification and rotation, and optimization of optical correlation done with spatial light modulators that have limited range and couple both phase and amplitude in their response.
Image object recognition based on the Zernike moment and neural networks
NASA Astrophysics Data System (ADS)
Wan, Jianwei; Wang, Ling; Huang, Fukan; Zhou, Liangzhu
1998-03-01
This paper first gives a comprehensive discussion of the concept of artificial neural networks, their research methods, and their relation to information processing. On the basis of this discussion, we expound the mathematical similarity between artificial neural networks and information processing. The paper then presents a new method of image recognition based on invariant features and a neural network, using the Zernike transform of the image. The method is not only invariant to rotation, shift, and scale of the image object, but also has good fault tolerance and robustness. We also compare it with a statistical classifier and with invariant-moment recognition methods.
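As a rough sketch of the feature-extraction stage (not the authors' implementation; the mahotas call and the parameter values are assumptions):

```python
import numpy as np
import mahotas

def zernike_features(image, radius=64, degree=8):
    """Rotation-invariant Zernike moment magnitudes of a grayscale patch.

    Shift and scale invariance are assumed to be handled upstream by
    centering the object and normalizing the patch to a fixed radius.
    """
    patch = image.astype(np.float64)
    # mahotas returns |A_nm| for all moments up to `degree` within `radius`
    return mahotas.features.zernike_moments(patch, radius, degree=degree)
```

The resulting vector would then feed a neural-network classifier, as the abstract describes.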
Fault Diagnosis for Rolling Bearings under Variable Conditions Based on Visual Cognition
Cheng, Yujie; Zhou, Bo; Lu, Chen; Yang, Chao
2017-01-01
Fault diagnosis for rolling bearings has attracted increasing attention in recent years. However, few studies have focused on fault diagnosis for rolling bearings under variable conditions. This paper introduces a fault diagnosis method for rolling bearings under variable conditions based on visual cognition. The proposed method includes the following steps. First, the vibration signal data are transformed into a recurrence plot (RP), which is a two-dimensional image. Then, inspired by the visual invariance characteristic of the human visual system (HVS), we utilize speeded-up robust features (SURF) to extract fault features from the two-dimensional RP and generate a 64-dimensional feature vector, which is invariant to image translation, rotation, scaling variation, etc. Third, based on the manifold perception characteristic of HVS, isometric mapping, a manifold learning method that can reflect the intrinsic manifold embedded in the high-dimensional space, is employed to obtain a low-dimensional feature vector. Finally, a classical classification method, support vector machine, is utilized to realize fault diagnosis. Verification data were collected from Case Western Reserve University Bearing Data Center, and the experimental result indicates that the proposed fault diagnosis method based on visual cognition is highly effective for rolling bearings under variable conditions, thus providing a promising approach from the cognitive computing field. PMID:28772943
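A condensed sketch of the pipeline's skeleton, assuming a simple thresholded recurrence plot and sklearn's Isomap/SVC; the SURF descriptor stage is summarized as a precomputed 64-D feature vector per sample, since it requires OpenCV's contrib module.

```python
import numpy as np
from sklearn.manifold import Isomap
from sklearn.svm import SVC

def recurrence_plot(signal, eps):
    # RP[i, j] = 255 where |x_i - x_j| < eps: a 2-D image of the signal
    d = np.abs(signal[:, None] - signal[None, :])
    return np.where(d < eps, 255, 0).astype(np.uint8)

def train_diagnoser(features, labels, n_dims=10):
    # features: one 64-D descriptor per bearing sample (e.g., pooled SURF
    # responses of its recurrence plot), assumed precomputed
    iso = Isomap(n_components=n_dims)        # manifold-learning reduction
    low_dim = iso.fit_transform(features)
    clf = SVC(kernel="rbf").fit(low_dim, labels)
    return iso, clf
```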
Altschuler, Ted S; Molholm, Sophie; Butler, John S; Mercier, Manuel R; Brandwein, Alice B; Foxe, John J
2014-04-15
The adult human visual system can efficiently fill in missing object boundaries when low-level information from the retina is incomplete, but little is known about how these processes develop across childhood. A decade of visual-evoked potential (VEP) studies has produced a theoretical model identifying distinct phases of contour completion in adults. The first, termed a perceptual phase, occurs from approximately 100-200 ms and is associated with automatic boundary completion. The second is termed a conceptual phase occurring between 230 and 400 ms. The latter has been associated with the analysis of ambiguous objects which seem to require more effort to complete. The electrophysiological markers of these phases have both been localized to the lateral occipital complex, a cluster of ventral visual stream brain regions associated with object-processing. We presented Kanizsa-type illusory contour stimuli, often used for exploring contour completion processes, to neurotypical persons ages 6-31 (N=63), while parametrically varying the spatial extent of these induced contours, in order to better understand how filling-in processes develop across childhood and adolescence. Our results suggest that, while adults complete contour boundaries in a single discrete period during the automatic perceptual phase, children display an immature response pattern, engaging in more protracted processing across both timeframes and appearing to recruit more widely distributed regions which resemble those evoked during adult processing of higher-order ambiguous figures. However, children older than 5 years of age were remarkably like adults in that the effects of contour processing were invariant to manipulation of contour extent. Copyright © 2013 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Deschenes, Sylvain; Sheng, Yunlong; Chevrette, Paul C.
1998-03-01
3D object classification from 2D IR images is shown. The wavelet transform is used for edge detection. Edge tracking is used to effectively remove noise in the wavelet transform. The invariant Fourier descriptor is used to describe the contour curves. Invariance under out-of-plane rotation is achieved by the feature space trajectory neural network working as a classifier.
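A minimal sketch of rotation-invariant contour Fourier descriptors; the normalization scheme shown is one common choice, not necessarily the paper's.

```python
import numpy as np

def fourier_descriptors(contour, n_coeffs=16):
    """contour: (N, 2) array of boundary points, ordered along the curve."""
    z = contour[:, 0] + 1j * contour[:, 1]     # complex encoding of (x, y)
    F = np.fft.fft(z)
    # drop F[0] (translation), divide by |F[1]| (scale), keep magnitudes
    # (in-plane rotation and starting-point invariance)
    return np.abs(F[1:n_coeffs + 1]) / np.abs(F[1])
```

Note that such descriptors handle only in-plane transformations; out-of-plane rotation is left to the trajectory network classifier, as the abstract states.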
Mausfeld, Rainer; Andres, Johannes
2002-01-01
We argue, from an ethology-inspired perspective, that the internal concepts 'surface colours' and 'illumination colours' are part of the data format of two different representational primitives. Thus, the internal concept of 'colour' is not a unitary one but rather refers to two different types of 'data structure', each with its own proprietary types of parameters and relations. The relation of these representational structures is modulated by a class of parameterised transformations whose effects are mirrored in the idealised computational achievements of illumination invariance of colour codes, on the one hand, and scene invariance, on the other hand. Because the same characteristics of a light array reaching the eye can be physically produced in many different ways, the visual system, then, has to make an 'inference' whether a chromatic deviation of the space-averaged colour codes from the neutral point is due to a 'non-normal', ie chromatic, illumination or due to an imbalanced spectral reflectance composition. We provide evidence that the visual system uses second-order statistics of chromatic codes of a single view of a scene in order to modulate corresponding transformations. In our experiments we used centre surround configurations with inhomogeneous surrounds given by a random structure of overlapping circles, referred to as Seurat configurations. Each family of surrounds has a fixed space-average of colour codes, but differs with respect to the covariance matrix of colour codes of pixels that defines the chromatic variance along some chromatic axis and the covariance between luminance and chromatic channels. We found that dominant wavelengths of red-green equilibrium settings of the infield exhibited a stable and strong dependence on the chromatic variance of the surround. High variances resulted in a tendency towards 'scene invariance', low variances in a tendency towards 'illumination invariance' of the infield.
Coding of Border Ownership in Monkey Visual Cortex
Zhou, Hong; Friedman, Howard S.; von der Heydt, Rüdiger
2016-01-01
Areas V1 and V2 of the visual cortex have traditionally been conceived as stages of local feature representations. We investigated whether neural responses carry information about how local features belong to objects. Single-cell activity was recorded in areas V1, V2, and V4 of awake behaving monkeys. Displays were used in which the same local feature (contrast edge or line) could be presented as part of different figures. For example, the same light–dark edge could be the left side of a dark square or the right side of a light square. Each display was also presented with reversed contrast. We found significant modulation of responses as a function of the side of the figure in >50% of neurons of V2 and V4 and in 18% of neurons of the top layers of V1. Thus, besides the local contrast border information, neurons were found to encode the side to which the border belongs (“border ownership coding”). A majority of these neurons coded border ownership and the local polarity of luminance–chromaticity contrast. The others were insensitive to contrast polarity. Another 20% of the neurons of V2 and V4, and 48% of top layer V1, coded local contrast polarity, but not border ownership. The border ownership-related response differences emerged soon (<25 msec) after the response onset. In V2 and V4, the differences were found to be nearly independent of figure size up to the limit set by the size of our display (21°). Displays that differed only far outside the conventional receptive field could produce markedly different responses. When tested with more complex displays in which figure-ground cues were varied, some neurons produced invariant border ownership signals, others failed to signal border ownership for some of the displays, but neurons that reversed signals were rare. The influence of visual stimulation far from the receptive field center indicates mechanisms of global context integration. The short latencies and incomplete cue invariance suggest that the border-ownership effect is generated within the visual cortex rather than projected down from higher levels. PMID:10964965
Visual learning with reduced adaptation is eccentricity-specific.
Harris, Hila; Sagi, Dov
2018-01-12
Visual learning is known to be specific to the trained target location, showing little transfer to untrained locations. Recently, learning was shown to transfer across equal-eccentricity retinal locations when sensory adaptation due to repetitive stimulation was minimized. It was suggested that learning transfers to previously untrained locations when the learned representation is location invariant, with sensory adaptation introducing location-dependent representations, thus preventing transfer. Spatial invariance may also fail when the trained and tested locations are at different distances from the center of gaze (different retinal eccentricities), due to differences in the corresponding low-level cortical representations (e.g. allocated cortical area decreases with eccentricity). Thus, if learning improves performance by better classifying target-dependent early visual representations, generalization is predicted to fail when locations of different retinal eccentricities are trained and tested in the absence of sensory adaptation. Here, using the texture discrimination task, we show specificity of learning across different retinal eccentricities (4-8°) using reduced-adaptation training. The existence of generalization across equal-eccentricity locations but not across different eccentricities demonstrates that learning accesses visual representations preceding location-independent representations, with specificity of learning explained by inhomogeneous sensory representation.
A Multi-modal, Discriminative and Spatially Invariant CNN for RGB-D Object Labeling.
Asif, Umar; Bennamoun, Mohammed; Sohel, Ferdous
2017-08-30
While deep convolutional neural networks have shown remarkable success in image classification, the problems of inter-class similarities, intra-class variances, the effective combination of multimodal data, and the spatial variability in images of objects remain major challenges. To address these problems, this paper proposes a novel framework to learn a discriminative and spatially invariant classification model for object and indoor scene recognition using multimodal RGB-D imagery. This is achieved through three postulates: 1) spatial invariance - this is achieved by combining a spatial transformer network with a deep convolutional neural network to learn features which are invariant to spatial translations, rotations, and scale changes, 2) high discriminative capability - this is achieved by introducing Fisher encoding within the CNN architecture to learn features which have small inter-class similarities and large intra-class compactness, and 3) multimodal hierarchical fusion - this is achieved through the regularization of semantic segmentation to a multi-modal CNN architecture, where class probabilities are estimated at different hierarchical levels (i.e., image- and pixel-levels), and fused into a Conditional Random Field (CRF)-based inference hypothesis, the optimization of which produces consistent class labels in RGB-D images. Extensive experimental evaluations on RGB-D object and scene datasets, and live video streams (acquired from Kinect) show that our framework produces superior object and scene classification results compared to the state-of-the-art methods.
Affine invariants of convex polygons.
Flusser, Jan
2002-01-01
In this correspondence, we prove that the affine invariants, for image registration and object recognition, proposed recently by Yang and Cohen (see ibid., vol.8, no.7, p.934-46, July 1999) are algebraically dependent. We show how to select an independent and complete set of the invariants. The use of this new set leads to a significant reduction of the computing complexity without decreasing the discrimination power.
NASA Astrophysics Data System (ADS)
Kushwaha, Alok Kumar Singh; Srivastava, Rajeev
2015-09-01
An efficient view invariant framework for the recognition of human activities from an input video sequence is presented. The proposed framework is composed of three consecutive modules: (i) detect and locate people by background subtraction, (ii) view invariant spatiotemporal template creation for different activities, (iii) and finally, template matching is performed for view invariant activity recognition. The foreground objects present in a scene are extracted using change detection and background modeling. The view invariant templates are constructed using the motion history images and object shape information for different human activities in a video sequence. For matching the spatiotemporal templates for various activities, the moment invariants and Mahalanobis distance are used. The proposed approach is tested successfully on our own viewpoint dataset, KTH action recognition dataset, i3DPost multiview dataset, MSR viewpoint action dataset, VideoWeb multiview dataset, and WVU multiview human action recognition dataset. From the experimental results and analysis over the chosen datasets, it is observed that the proposed framework is robust, flexible, and efficient with respect to multiple views activity recognition, scale, and phase variations.
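A skeletal sketch of the template-building and matching stages; the MHI update rule, the use of Hu moments as the moment invariants, and the per-class means and inverse covariances are illustrative assumptions.

```python
import numpy as np
import cv2
from scipy.spatial.distance import mahalanobis

def update_mhi(mhi, silhouette, t, duration):
    # stamp current motion with time t; forget pixels older than `duration`
    mhi[silhouette > 0] = t
    mhi[mhi < t - duration] = 0
    return mhi

def template_features(mhi):
    # seven Hu moment invariants of the motion-history template
    return cv2.HuMoments(cv2.moments(mhi.astype(np.float32))).ravel()

def classify(feat, class_means, class_inv_covs):
    # nearest activity class under the Mahalanobis distance
    dists = {k: mahalanobis(feat, mu, class_inv_covs[k])
             for k, mu in class_means.items()}
    return min(dists, key=dists.get)
```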
Mobile visual object identification: from SIFT-BoF-RANSAC to Sketchprint
NASA Astrophysics Data System (ADS)
Voloshynovskiy, Sviatoslav; Diephuis, Maurits; Holotyak, Taras
2015-03-01
Mobile object identification based on visual features finds many applications in interaction with physical objects and in security. Discriminative and robust content representation plays a central role in object and content identification. Complex post-processing methods are used to compress descriptors and their geometrical information, aggregate them into more compact and discriminative representations, and finally re-rank the results based on the similarity geometries of descriptors. Unfortunately, most existing descriptors are not very robust or discriminative once applied to varied content such as real images, text, or noise-like microstructures, and they require at least 500-1'000 descriptors per image for reliable identification. At the same time, the geometric re-ranking procedures are still too complex to be applied to the numerous candidates obtained from feature-similarity-based search alone. This restricts the list of candidates to fewer than 1'000, which causes a higher probability of miss. In addition, the security and privacy of content representation have become a hot research topic in the multimedia and security communities. In this paper, we introduce a new framework for non-local content representation based on SketchPrint descriptors. It extends the properties of local descriptors to a more informative and discriminative, yet geometrically invariant, content representation. In particular, it allows images to be compactly represented by 100 SketchPrint descriptors without being fully dependent on re-ranking methods. We consider several use cases, applying SketchPrint descriptors to natural images, text documents, packages, and microstructures, and compare them with traditional local descriptors.
Moving attention - Evidence for time-invariant shifts of visual selective attention
NASA Technical Reports Server (NTRS)
Remington, R.; Pierce, L.
1984-01-01
Two experiments measured the time to shift spatial selective attention across the visual field to targets 2 or 10 deg from central fixation. A central arrow cued the most likely target location. The direction of attention was inferred from reaction times to expected, unexpected, and neutral locations. The development of a spatial attentional set with time was examined by presenting target probes at varying times after the cue. There were no effects of distance on the time course of the attentional set. Reaction times for far locations were slower than for near, but the effects of attention were evident by 150 msec in both cases. Spatial attention does not shift with a characteristic, fixed velocity. Rather, velocity is proportional to distance, resulting in a movement time that is invariant over the distances tested.
Fractals in geology and geophysics
NASA Technical Reports Server (NTRS)
Turcotte, Donald L.
1989-01-01
The definition of a fractal distribution is that the number of objects N with a characteristic size greater than r scales as N ∝ r^(-D). The frequency-size distributions for islands, earthquakes, fragments, ore deposits, and oil fields often satisfy this relation. This application illustrates a fundamental aspect of fractal distributions, scale invariance. The requirement of an object to define a scale in photographs of many geological features is one indication of the wide applicability of scale invariance to geological problems; scale invariance can lead to fractal clustering. Geophysical spectra can also be related to fractals; these are self-affine fractals rather than self-similar fractals. Examples include the earth's topography and geoid.
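For example, D can be estimated from frequency-size counts with a log-log fit; the numbers below are made-up illustrative data, not measurements from the paper.

```python
import numpy as np

# hypothetical cumulative counts N(> r) of objects larger than size r
r = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
N = np.array([1000.0, 260.0, 64.0, 17.0, 4.0])

# N ∝ r^(-D)  =>  log N = -D log r + const
slope, _ = np.polyfit(np.log(r), np.log(N), 1)
print(f"estimated fractal dimension D ≈ {-slope:.2f}")
```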
NASA Technical Reports Server (NTRS)
Spirkovska, Lilly; Reid, Max B.
1993-01-01
A higher-order neural network (HONN) can be designed to be invariant to changes in scale, translation, and inplane rotation. Invariances are built directly into the architecture of a HONN and do not need to be learned. Consequently, fewer training passes and a smaller training set are required to learn to distinguish between objects. The size of the input field is limited, however, because of the memory required for the large number of interconnections in a fully connected HONN. By coarse coding the input image, the input field size can be increased to allow the larger input scenes required for practical object recognition problems. We describe a coarse coding technique and present simulation results illustrating its usefulness and its limitations. Our simulations show that a third-order neural network can be trained to distinguish between two objects in a 4096 x 4096 pixel input field independent of transformations in translation, in-plane rotation, and scale in less than ten passes through the training set. Furthermore, we empirically determine the limits of the coarse coding technique in the object recognition domain.
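A toy sketch of the coarse-coding idea, representing one large binary image as several offset, block-pooled coarse fields; the block size, number of fields, and max-pooling rule are assumptions for illustration.

```python
import numpy as np

def coarse_code(image, block=16, n_fields=4):
    """Code `image` as n_fields coarse images, each offset by
    block // n_fields pixels before block-max pooling."""
    step = block // n_fields
    fields = []
    for k in range(n_fields):
        shifted = np.roll(image, shift=(-k * step, -k * step), axis=(0, 1))
        h = shifted.shape[0] - shifted.shape[0] % block
        w = shifted.shape[1] - shifted.shape[1] % block
        pooled = shifted[:h, :w].reshape(h // block, block,
                                         w // block, block).max(axis=(1, 3))
        fields.append(pooled)
    # jointly, the offset coarse fields preserve finer position information
    # than any single field, while shrinking each network's input size
    return fields
```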
HONTIOR - HIGHER-ORDER NEURAL NETWORK FOR TRANSFORMATION INVARIANT OBJECT RECOGNITION
NASA Technical Reports Server (NTRS)
Spirkovska, L.
1994-01-01
Neural networks have been applied in numerous fields, including transformation invariant object recognition, wherein an object is recognized despite changes in the object's position in the input field, size, or rotation. One of the more successful neural network methods used in invariant object recognition is the higher-order neural network (HONN) method. With a HONN, known relationships are exploited and the desired invariances are built directly into the architecture of the network, eliminating the need for the network to learn invariance to transformations. This results in a significant reduction in the training time required, since the network needs to be trained on only one view of each object, not on numerous transformed views. Moreover, one hundred percent accuracy is guaranteed for images characterized by the built-in distortions, provided noise is not introduced through pixelation. The program HONTIOR implements a third-order neural network having invariance to translation, scale, and in-plane rotation built directly into the architecture. Thus, for 2-D transformation invariance, the network needs to be trained on just one view of each object. HONTIOR can also be used for 3-D transformation invariant object recognition by training the network on a set of out-of-plane rotated views. Historically, the major drawback of HONNs has been that the size of the input field was limited by the memory required for the large number of interconnections in a fully connected network. HONTIOR solves this problem by coarse coding the input images (coding an image as a set of overlapping but offset coarser images). Using this scheme, large input fields (4096 x 4096 pixels) can easily be represented using very little virtual memory (30Mb). The HONTIOR distribution consists of three main programs. The first program contains the training and testing routines for a third-order neural network. The second program contains the same training and testing procedures as the first, but it also contains a number of functions to display and edit training and test images. Finally, the third program is an auxiliary program which calculates the included angles for a given input field size. HONTIOR is written in C, and was originally developed for Sun3 and Sun4 series computers. Both graphic and command line versions of the program are provided. The command line version has been successfully compiled and executed both on computers running the UNIX operating system and on DEC VAX series computers running VMS. The graphic version requires the SunTools windowing environment, and therefore runs only on Sun series computers. The executable for the graphics version of HONTIOR requires 1Mb of RAM. The standard distribution medium for HONTIOR is a .25 inch streaming magnetic tape cartridge in UNIX tar format. It is also available on a 3.5 inch diskette in UNIX tar format. The package includes sample input and output data. HONTIOR was developed in 1991. Sun, Sun3 and Sun4 are trademarks of Sun Microsystems, Inc. UNIX is a registered trademark of AT&T Bell Laboratories. DEC, VAX, and VMS are trademarks of Digital Equipment Corporation.
Matsukura, Michi; Vecera, Shaun P
2011-02-01
Attention selects objects as well as locations. When attention selects an object's features, observers identify two features from a single object more accurately than two features from two different objects (object-based effect of attention; e.g., Duncan, Journal of Experimental Psychology: General, 113, 501-517, 1984). Several studies have demonstrated that object-based attention can operate at a late visual processing stage that is independent of objects' spatial information (Awh, Dhaliwal, Christensen, & Matsukura, Psychological Science, 12, 329-334, 2001; Matsukura & Vecera, Psychonomic Bulletin & Review, 16, 529-536, 2009; Vecera, Journal of Experimental Psychology: General, 126, 14-18, 1997; Vecera & Farah, Journal of Experimental Psychology: General, 123, 146-160, 1994). In the present study, we asked two questions regarding this late object-based selection mechanism. In Part I, we investigated how observers' foreknowledge of to-be-reported features allows attention to select objects, as opposed to individual features. Using a feature-report task, a significant object-based effect was observed when to-be-reported features were known in advance but not when this advance knowledge was absent. In Part II, we examined what drives attention to select objects rather than individual features in the absence of observers' foreknowledge of to-be-reported features. Results suggested that, when there was no opportunity for observers to direct their attention to objects that possess to-be-reported features at the time of stimulus presentation, these stimuli must retain strong perceptual cues to establish themselves as separate objects.
fMRI-adaptation studies of viewpoint tuning in the extrastriate and fusiform body areas.
Taylor, John C; Wiggett, Alison J; Downing, Paul E
2010-03-01
People are easily able to perceive the human body across different viewpoints, but the neural mechanisms underpinning this ability are currently unclear. In three experiments, we used functional MRI (fMRI) adaptation to study the view-invariance of representations in two cortical regions that have previously been shown to be sensitive to visual depictions of the human body--the extrastriate and fusiform body areas (EBA and FBA). The BOLD response to sequentially presented pairs of bodies was treated as an index of view invariance. Specifically, we compared trials in which the bodies in each image held identical poses (seen from different views) to trials containing different poses. EBA and FBA adapted to identical views of the same pose, and both showed a progressive rebound from adaptation as a function of the angular difference between views, up to approximately 30 degrees. However, these adaptation effects were eliminated when the body stimuli were followed by a pattern mask. Delaying the mask onset increased the response (but not the adaptation effect) in EBA, leaving FBA unaffected. We interpret these masking effects as evidence that view-dependent fMRI adaptation is driven by later waves of neuronal responses in the regions of interest. Finally, in a whole brain analysis, we identified an anterior region of the left inferior temporal sulcus (l-aITS) that responded linearly to stimulus rotation, but showed no selectivity for bodies. Our results show that body-selective cortical areas exhibit a similar degree of view-invariance as other object selective areas--such as the lateral occipitotemporal area (LO) and posterior fusiform gyrus (pFs).
Computational foundations of the visual number sense.
Stoianov, Ivilin Peev; Zorzi, Marco
2017-01-01
We provide an emergentist perspective on the computational mechanism underlying numerosity perception, its development, and the role of inhibition, based on our deep neural network model. We argue that the influence of continuous visual properties does not challenge the notion of number sense, but reveals limit conditions for the computation that yields invariance in numerosity perception. Alternative accounts should be formalized in a computational model.
ERIC Educational Resources Information Center
Maeda, Yukiko; Yoon, So Yoon
2016-01-01
We investigated the extent to which the observed gender differences in mental rotation ability among the 2,468 freshmen studying engineering at a Midwest public university attributed to the gender bias of a test. The Revised Purdue Spatial Visualization Tests: Visualization of Rotations (Revised PSVT:R) is a spatial test frequently used to measure…
Shape equivalence under perspective and projective transformations.
Wagemans, J; Lamote, C; Van Gool, L
1997-06-01
When a planar shape is viewed obliquely, it is deformed by a perspective deformation. If the visual system were to pick up geometrical invariants from such projections, these would necessarily be invariant under the wider class of projective transformations. To what extent can the visual system tell the difference between perspective and nonperspective but still projective deformations of shapes? To investigate this, observers were asked to indicate which of two test patterns most resembled a standard pattern. The test patterns were related to the standard pattern by a perspective or projective transformation, or they were completely unrelated. Performance was slightly better in a matching task with perspective and unrelated test patterns (92.6%) than in a projective-random matching task (88.8%). In a direct comparison, participants had a small preference (58.5%) for the perspectively related patterns over the projectively related ones. Preferences were based on the values of the transformation parameters (slant and shear). Hence, perspective and projective transformations yielded perceptual differences, but they were not treated in a categorically different manner by the human visual system.
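For reference, a sketch of the transformation classes involved, in standard projective-geometry notation assumed here rather than drawn from the paper.

```latex
% A planar shape seen obliquely maps by a homography on homogeneous
% coordinates; the image point is (x'/w', y'/w').
\begin{equation*}
  \begin{pmatrix} x' \\ y' \\ w' \end{pmatrix}
  \sim
  \begin{pmatrix}
    h_{11} & h_{12} & h_{13} \\
    h_{21} & h_{22} & h_{23} \\
    h_{31} & h_{32} & h_{33}
  \end{pmatrix}
  \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}.
\end{equation*}
% Perspective deformations are the subset of homographies whose
% parameters are consistent with a rigid view of the plane (slant, tilt,
% viewing distance); generic projective deformations need not be.
```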
Linder, Nina; Turkki, Riku; Walliander, Margarita; Mårtensson, Andreas; Diwan, Vinod; Rahtu, Esa; Pietikäinen, Matti; Lundin, Mikael; Lundin, Johan
2014-01-01
Microscopy is the gold standard for diagnosis of malaria, however, manual evaluation of blood films is highly dependent on skilled personnel in a time-consuming, error-prone and repetitive process. In this study we propose a method using computer vision detection and visualization of only the diagnostically most relevant sample regions in digitized blood smears. Giemsa-stained thin blood films with P. falciparum ring-stage trophozoites (n = 27) and uninfected controls (n = 20) were digitally scanned with an oil immersion objective (0.1 µm/pixel) to capture approximately 50,000 erythrocytes per sample. Parasite candidate regions were identified based on color and object size, followed by extraction of image features (local binary patterns, local contrast and Scale-invariant feature transform descriptors) used as input to a support vector machine classifier. The classifier was trained on digital slides from ten patients and validated on six samples. The diagnostic accuracy was tested on 31 samples (19 infected and 12 controls). From each digitized area of a blood smear, a panel with the 128 most probable parasite candidate regions was generated. Two expert microscopists were asked to visually inspect the panel on a tablet computer and to judge whether the patient was infected with P. falciparum. The method achieved a diagnostic sensitivity and specificity of 95% and 100% as well as 90% and 100% for the two readers respectively using the diagnostic tool. Parasitemia was separately calculated by the automated system and the correlation coefficient between manual and automated parasitemia counts was 0.97. We developed a decision support system for detecting malaria parasites using a computer vision algorithm combined with visualization of sample areas with the highest probability of malaria infection. The system provides a novel method for blood smear screening with a significantly reduced need for visual examination and has a potential to increase the throughput in malaria diagnostics.
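A simplified sketch of the candidate-classification stage, using only the LBP feature for brevity; the patch handling, LBP parameters, and kernel choice are assumptions, and the actual system also used local contrast and SIFT descriptors.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def lbp_histogram(patch, P=8, R=1.0):
    # uniform LBP codes of a grayscale candidate region, pooled into a
    # normalized histogram (P + 2 bins for the 'uniform' method)
    codes = local_binary_pattern(patch, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=int(P + 2), range=(0, P + 2))
    return hist / max(hist.sum(), 1)

# patches: candidate regions; labels: 1 = parasite, 0 = artifact (given)
# X = np.stack([lbp_histogram(p) for p in patches])
# clf = SVC(kernel="rbf", probability=True).fit(X, labels)
# panel = np.argsort(clf.predict_proba(X_new)[:, 1])[::-1][:128]  # top 128
```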
Invariance to Rotation in Depth Measured by Masked Repetition Priming is Dependent on Prime Duration
Eddy, Marianna D.; Holcomb, Phillip J.
2011-01-01
The current experiment examined invariance to pictures of objects rotated in depth using event-related potentials (ERPs) and masked repetition priming. Specifically we rotated objects 30°, 60° or 150° from their canonical view and, across two experiments, varied the prime duration (50 or 90 milliseconds (ms)). We examined three ERP components, the N/P190, N300 and N400. In Experiment 1, only the 30° rotation condition produced repetition priming effects on the N/P190, N300 and N400. The other rotation conditions only showed repetition priming effects on the early perceptual component, the N/P190. Experiment 2 extended the prime duration to 90 ms to determine whether additional exposure to the prime may produce invariance on the N300 and N400 for the 60° and 150° rotation conditions. Repetition priming effects were found for all rotation conditions across the N/P190, N300 and N400 components. We interpret these results to suggest that whether or not view invariant priming effects are found depends partly on the extent to which representation of an object has been activated. PMID:22005687
ERIC Educational Resources Information Center
Gomez, Rapson
2009-01-01
Objective: This study used the mean and covariance structures analysis approach to examine the equality or invariance of ratings of the 18 ADHD symptoms. Method: 783 Australian and 928 Malaysian parents provided ratings for an ADHD rating scale. Invariance was tested across these groups (Comparison 1), and North European Australian (n = 623) and…
BRDF invariant stereo using light transport constancy.
Wang, Liang; Yang, Ruigang; Davis, James E
2007-09-01
Nearly all existing methods for stereo reconstruction assume that scene reflectance is Lambertian and make use of brightness constancy as a matching invariant. We introduce a new invariant for stereo reconstruction called light transport constancy (LTC), which allows completely arbitrary scene reflectance (bidirectional reflectance distribution functions (BRDFs)). This invariant can be used to formulate a rank constraint on multiview stereo matching when the scene is observed by several lighting configurations in which only the lighting intensity varies. In addition, we show that this multiview constraint can be used with as few as two cameras and two lighting configurations. Unlike previous methods for BRDF invariant stereo, LTC does not require precisely configured or calibrated light sources or calibration objects in the scene. Importantly, the new constraint can be used to provide BRDF invariance to any existing stereo method whenever appropriate lighting variation is available.
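A sketch of the constraint in matrix form, with notation assumed here for illustration.

```latex
% If only the source intensity changes across the n lighting
% configurations, each scene point's light transport is fixed, so its
% measured intensities scale together. For a candidate correspondence
% p <-> q between two views,
\begin{equation*}
  \operatorname{rank}
  \begin{pmatrix}
    I_1(p) & I_2(p) & \cdots & I_n(p) \\
    I_1(q) & I_2(q) & \cdots & I_n(q)
  \end{pmatrix}
  = 1
\end{equation*}
% for a correct match: the two rows differ only by a (BRDF-dependent)
% scale factor, whatever that BRDF is.
```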
Visual Stimuli Induce Waves of Electrical Activity in Turtle Cortex
NASA Astrophysics Data System (ADS)
Prechtl, J. C.; Cohen, L. B.; Pesaran, B.; Mitra, P. P.; Kleinfeld, D.
1997-07-01
The computations involved in the processing of a visual scene invariably involve the interactions among neurons throughout all of visual cortex. One hypothesis is that the timing of neuronal activity, as well as the amplitude of activity, provides a means to encode features of objects. The experimental data from studies on cat [Gray, C. M., Konig, P., Engel, A. K. & Singer, W. (1989) Nature (London) 338, 334-337] support a view in which only synchronous (no phase lags) activity carries information about the visual scene. In contrast, theoretical studies suggest, on the one hand, the utility of multiple phases within a population of neurons as a means to encode independent visual features and, on the other hand, the likely existence of timing differences solely on the basis of network dynamics. Here we use widefield imaging in conjunction with voltage-sensitive dyes to record electrical activity from the virtually intact, unanesthetized turtle brain. Our data consist of single-trial measurements. We analyze our data in the frequency domain to isolate coherent events that lie in different frequency bands. Low frequency oscillations (<5 Hz) are seen in both ongoing activity and activity induced by visual stimuli. These oscillations propagate parallel to the afferent input. Higher frequency activity, with spectral peaks near 10 and 20 Hz, is seen solely in response to stimulation. This activity consists of plane waves and spiral-like waves, as well as more complex patterns. The plane waves have an average phase gradient of ≈ π /2 radians/mm and propagate orthogonally to the low frequency waves. Our results show that large-scale differences in neuronal timing are present and persistent during visual processing.
Gohel, Bakul; Lee, Peter; Jeong, Yong
2016-08-01
Brain regions that respond to more than one sensory modality are characterized as multisensory regions. Studies on the processing of shape or object information have revealed recruitment of the lateral occipital cortex, posterior parietal cortex, and other regions regardless of input sensory modalities. However, it remains unknown whether such regions show similar (modality-invariant) or different (modality-specific) neural oscillatory dynamics, as recorded using magnetoencephalography (MEG), in response to identical shape information processing tasks delivered to different sensory modalities. Modality-invariant or modality-specific neural oscillatory dynamics indirectly suggest modality-independent or modality-dependent participation of particular brain regions, respectively. Therefore, this study investigated the modality-specificity of neural oscillatory dynamics in the form of spectral power modulation patterns in response to visual and tactile sequential shape-processing tasks that are well-matched in terms of speed and content between the sensory modalities. Task-related changes in spectral power modulation and differences in spectral power modulation between sensory modalities were investigated at source-space (voxel) level, using a multivariate pattern classification (MVPC) approach. Additionally, whole analyses were extended from the voxel level to the independent-component level to take account of signal leakage effects caused by inverse solution. The modality-specific spectral dynamics in multisensory and higher-order brain regions, such as the lateral occipital cortex, posterior parietal cortex, inferior temporal cortex, and other brain regions, showed task-related modulation in response to both sensory modalities. This suggests modality-dependency of such brain regions on the input sensory modality for sequential shape-information processing. Copyright © 2016 Elsevier B.V. All rights reserved.
How a Hat May Affect 3-Month-Olds' Recognition of a Face: An Eye-Tracking Study
Bulf, Hermann; Valenza, Eloisa; Turati, Chiara
2013-01-01
Recent studies have shown that infants’ face recognition rests on a robust face representation that is resilient to a variety of facial transformations such as rotations in depth, motion, occlusion or deprivation of inner/outer features. Here, we investigated whether 3-month-old infants’ ability to represent the invariant aspects of a face is affected by the presence of an external add-on element, i.e. a hat. Using a visual habituation task, three experiments were carried out in which face recognition was investigated by manipulating the presence/absence of a hat during face encoding (i.e. habituation phase) and face recognition (i.e. test phase). An eye-tracker system was used to record the time infants spent looking at face-relevant information compared to the hat. The results showed that infants’ face recognition was not affected by the presence of the external element when the type of the hat did not vary between the habituation and test phases, and when both the novel and the familiar face wore the same hat during the test phase (Experiment 1). Infants’ ability to recognize the invariant aspects of a face was preserved also when the hat was absent in the habituation phase and the same hat was shown only during the test phase (Experiment 2). Conversely, when the novel face identity competed with a novel hat, the hat triggered the infants’ attention, interfering with the recognition process and preventing the infants’ preference for the novel face during the test phase (Experiment 3). Findings from the current study shed light on how faces and objects are processed when they are simultaneously presented in the same visual scene, contributing to an understanding of how infants respond to the multiple and composite information available in their surrounding environment. PMID:24349378
Population Coding of Visual Space: Modeling
Lehky, Sidney R.; Sereno, Anne B.
2011-01-01
We examine how the representation of space is affected by receptive field (RF) characteristics of the encoding population. Spatial responses were defined by overlapping Gaussian RFs. These responses were analyzed using multidimensional scaling to extract the representation of global space implicit in population activity. Spatial representations were based purely on firing rates, which were not labeled with RF characteristics (tuning curve peak location, for example), differentiating this approach from many other population coding models. Because responses were unlabeled, this model represents space using intrinsic coding, extracting relative positions amongst stimuli, rather than extrinsic coding where known RF characteristics provide a reference frame for extracting absolute positions. Two parameters were particularly important: RF diameter and RF dispersion, where dispersion indicates how broadly RF centers are spread out from the fovea. For large RFs, the model was able to form metrically accurate representations of physical space on low-dimensional manifolds embedded within the high-dimensional neural population response space, suggesting that in some cases the neural representation of space may be dimensionally isomorphic with 3D physical space. Smaller RF sizes degraded and distorted the spatial representation, with the smallest RF sizes (present in early visual areas) being unable to recover even a topologically consistent rendition of space on low-dimensional manifolds. Finally, although positional invariance of stimulus responses has long been associated with large RFs in object recognition models, we found RF dispersion rather than RF diameter to be the critical parameter. In fact, at a population level, the modeling suggests that higher ventral stream areas with highly restricted RF dispersion would be unable to achieve positionally-invariant representations beyond this narrow region around fixation. PMID:21344012
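A minimal sketch of the modeling approach described above, assuming Gaussian tuning curves and off-the-shelf multidimensional scaling (an illustration of the idea, not the authors' code; the neuron count, RF diameter, and dispersion values are invented for the demo):

# Gaussian receptive fields respond to 2-D stimulus locations; multidimensional
# scaling then recovers the relative stimulus geometry from unlabeled firing
# rates alone (intrinsic coding). RF diameter (sigma) and dispersion are the
# two parameters the abstract highlights.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
n_neurons, sigma, dispersion = 200, 2.0, 5.0          # assumed values
centers = rng.normal(0.0, dispersion, size=(n_neurons, 2))  # RF centers around the "fovea"

# Stimuli on a grid; responses are unlabeled Gaussian tuning-curve firing rates.
gx, gy = np.meshgrid(np.linspace(-4, 4, 9), np.linspace(-4, 4, 9))
stimuli = np.column_stack([gx.ravel(), gy.ravel()])
d = np.linalg.norm(stimuli[:, None, :] - centers[None, :, :], axis=2)
rates = np.exp(-d**2 / (2 * sigma**2))

# Distances between population response vectors feed MDS, which returns a
# low-dimensional rendition of stimulus space (relative positions only).
dissim = squareform(pdist(rates))
embedding = MDS(n_components=2, dissimilarity="precomputed",
                random_state=0).fit_transform(dissim)
print(embedding.shape)  # (81, 2)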
Pulay, Márk Ágoston
2015-01-01
Enabling children with severe physical disabilities (such as tetraparesis spastica) to get relevant motional experiences of appropriate quality and quantity is now the greatest challenge for us in the field of neurorehabilitation. These motional experiences may ground many cognitive processes, and their lack may cause additional secondary cognitive dysfunctions such as disorders in body image, figure invariance, visual perception, auditory differentiation, concentration, analytic and synthetic ways of thinking, visual memory, etc. Virtual Reality is a technology that provides a sense of presence in a real environment with the help of 3D pictures and animations formed in a computer environment, and it enables the person to interact with the objects in that environment. One of our biggest challenges is to find a well-suited input device (hardware) that lets children with severe physical disabilities interact with the computer. Based on our own experiences and a thorough literature review, we have come to the conclusion that an effective combination of eye-tracking and EMG devices should work well.
Diesfeldt, H F A
2011-06-01
A right-handed patient, aged 72, manifested alexia without agraphia, a right homonymous hemianopia, and an impaired ability to identify visually presented objects. He was completely unable to read words aloud and severely deficient in naming visually presented letters. He responded to orthographic familiarity in the lexical decision tasks of the Psycholinguistic Assessments of Language Processing in Aphasia (PALPA) rather than to the lexicality of the letter strings. He was impaired at deciding whether two letters of different case (e.g., A, a) are the same, though he could distinguish real letters from made-up ones or from their mirror images. Consequently, his core deficit in reading was posited at the level of the abstract letter identifiers. When asked to trace a letter with his right index finger, kinesthetic facilitation enabled him to read letters and words aloud. Though he could use intact motor representations of letters to facilitate recognition and reading, the slow, sequential, and error-prone process of reading letter by letter made him abandon further training.
Local invariants in non-ideal flows of neutral fluids and two-fluid plasmas
NASA Astrophysics Data System (ADS)
Zhu, Jian-Zhou
2018-03-01
The main objective is the locally invariant geometric objects of (magneto-)fluid dynamics with forcing and damping (non-ideal flows), with particular attention to the previously untouched dynamical properties of the two-fluid description. Specifically, local structures of the two-fluid model of plasma flows are presented, beyond the well-known "frozen-in" property of the generalized vorticities in barotropic flows. More general non-barotropic situations are also considered. A modified Euler equation [T. Tao, "Finite time blowup for Lagrangian modifications of the three-dimensional Euler equation," Ann. PDE 2, 9 (2016)] is also analyzed accordingly and remarked upon from the point of view of the two-fluid model, with emphasis on the local structures. The local constraints of high-order differential forms such as helicity, among others, find simple formulations for possible practice in modeling the dynamics. Thus, the Cauchy invariants equation [N. Besse and U. Frisch, "Geometric formulation of the Cauchy invariants for incompressible Euler flow in flat and curved spaces," J. Fluid Mech. 825, 412 (2017)] may find applications in non-ideal flows. Some formal examples are offered to demonstrate the calculations; particularly interestingly, the two-dimensional three-component (2D3C) and the 2D passive scalar problems show that a local invariant Θ = 2θζ, with θ and ζ being, respectively, the scalar value of the "vertical velocity" (or the passive scalar) and the "vertical vorticity," may be used as if it were the spatial density of the globally invariant helicity, providing a Lagrangian prescription for controlling the latter in some situations when studying its physical effects in rapidly rotating flows (ubiquitous in the atmospheres of astrophysical objects) with marked 2D3C vortical modes or in purely 2D passive scalars.
Invariance algorithms for processing NDE signals
NASA Astrophysics Data System (ADS)
Mandayam, Shreekanth; Udpa, Lalita; Udpa, Satish S.; Lord, William
1996-11-01
Signals that are obtained in a variety of nondestructive evaluation (NDE) processes capture information not only about the characteristics of the flaw, but also reflect variations in the specimen's material properties. Such signal changes may be viewed as anomalies that could obscure defect related information. An example of this situation occurs during in-line inspection of gas transmission pipelines. The magnetic flux leakage (MFL) method is used to conduct noninvasive measurements of the integrity of the pipe-wall. The MFL signals contain information both about the permeability of the pipe-wall and the dimensions of the flaw. Similar operational effects can be found in other NDE processes. This paper presents algorithms to render NDE signals invariant to selected test parameters, while retaining defect related information. Wavelet transform based neural network techniques are employed to develop the invariance algorithms. The invariance transformation is shown to be a necessary pre-processing step for subsequent defect characterization and visualization schemes. Results demonstrating the successful application of the method are presented.
Okamura, Jun-ya; Yamaguchi, Reona; Honda, Kazunari; Tanaka, Keiji
2014-01-01
One fails to recognize an unfamiliar object across changes in viewing angle when it must be discriminated from similar distractor objects. View-invariant recognition gradually develops as the viewer repeatedly sees the objects in rotation. It is assumed that different views of each object are associated with one another while their successive appearance is experienced in rotation. However, natural experience of objects also contains ample opportunities to discriminate among objects at each of the multiple viewing angles. Our previous behavioral experiments showed that after experiencing a new set of object stimuli during a task that required only discrimination at each of four viewing angles at 30° intervals, monkeys could recognize the objects across changes in viewing angle up to 60°. By recording activities of neurons from the inferotemporal cortex after various types of preparatory experience, we here found a possible neural substrate for the monkeys' performance. For object sets that the monkeys had experienced during the task that required only discrimination at each of four viewing angles, many inferotemporal neurons showed object selectivity covering multiple views. The degree of view generalization found for these object sets was similar to that found for stimulus sets with which the monkeys had been trained to conduct view-invariant recognition. These results suggest that the experience of discriminating new objects in each of several viewing angles develops the partially view-generalized object selectivity distributed over many neurons in the inferotemporal cortex, which in turn bases the monkeys' emergent capability to discriminate the objects across changes in viewing angle. PMID:25378169
NASA Astrophysics Data System (ADS)
Arevalo, John; Cruz-Roa, Angel; González, Fabio A.
2013-11-01
This paper presents a novel method for basal-cell carcinoma detection, which combines state-of-the-art methods for unsupervised feature learning (UFL) and bag of features (BOF) representation. BOF, which is a form of representation learning, has shown good performance in automatic histopathology image classification. In BOF, patches are usually represented using descriptors such as SIFT and DCT. We propose to use UFL to learn the patch representation itself. This is accomplished by applying a topographic UFL method (T-RICA), which automatically learns visual invariance properties of color, scale and rotation from an image collection. These learned features also reveal the visual properties associated with cancerous and healthy tissues, and they improve carcinoma detection results by 7% with respect to traditional autoencoders and 6% with respect to standard DCT representations, obtaining on average an F-score of 92% and a balanced accuracy of 93%.
Invariants of polarization transformations.
Sadjadi, Firooz A
2007-05-20
The use of polarization-sensitive sensors is being explored in a variety of applications. Polarization diversity has been shown to improve the performance of the automatic target detection and recognition in a significant way. However, it also brings out the problems associated with processing and storing more data and the problem of polarization distortion during transmission. We present a technique for extracting attributes that are invariant under polarization transformations. The polarimetric signatures are represented in terms of the components of the Stokes vectors. Invariant algebra is then used to extract a set of signature-related attributes that are invariant under linear transformation of the Stokes vectors. Experimental results using polarimetric infrared signatures of a number of manmade and natural objects undergoing systematic linear transformations support the invariancy of these attributes.
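A standard textbook instance of the kind of invariant this abstract describes, sketched below (illustrative only; the paper derives a fuller set of attributes via invariant algebra): under a rotation of the polarization reference frame by an angle theta, the Stokes components S1 and S2 mix, while S0, S3, and sqrt(S1^2 + S2^2) are unchanged.

# Rotating the polarization reference frame mixes S1 and S2 but leaves
# S0, S3 and the linear-polarization magnitude invariant.
import numpy as np

def rotate_stokes(s, theta):
    """Mueller rotation of a Stokes vector [S0, S1, S2, S3] by angle theta."""
    c, w = np.cos(2 * theta), np.sin(2 * theta)
    m = np.array([[1, 0, 0, 0],
                  [0, c, w, 0],
                  [0, -w, c, 0],
                  [0, 0, 0, 1]], dtype=float)
    return m @ s

s = np.array([1.0, 0.4, 0.3, 0.2])
for theta in np.linspace(0, np.pi, 5):
    r = rotate_stokes(s, theta)
    # S0, S3 and hypot(S1, S2) print as constants across all rotations.
    print(round(r[0], 6), round(r[3], 6), round(np.hypot(r[1], r[2]), 6))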
Learned Non-Rigid Object Motion is a View-Invariant Cue to Recognizing Novel Objects
Chuang, Lewis L.; Vuong, Quoc C.; Bülthoff, Heinrich H.
2012-01-01
There is evidence that observers use learned object motion to recognize objects. For instance, studies have shown that reversing the learned direction in which a rigid object rotated in depth impaired recognition accuracy. This motion reversal can be achieved by playing animation sequences of moving objects in reverse frame order. In the current study, we used this sequence-reversal manipulation to investigate whether observers encode the motion of dynamic objects in visual memory, and whether such dynamic representations are encoded in a way that is dependent on the viewing conditions. Participants first learned dynamic novel objects, presented as animation sequences. Following learning, they were then tested on their ability to recognize these learned objects when their animation sequence was shown in the same sequence order as during learning or in the reverse sequence order. In Experiment 1, we found that non-rigid motion contributed to recognition performance; that is, sequence-reversal decreased sensitivity across different tasks. In subsequent experiments, we tested the recognition of non-rigidly deforming (Experiment 2) and rigidly rotating (Experiment 3) objects across novel viewpoints. Recognition performance was affected by viewpoint changes for both experiments. Learned non-rigid motion continued to contribute to recognition performance and this benefit was the same across all viewpoint changes. By comparison, learned rigid motion did not contribute to recognition performance. These results suggest that non-rigid motion provides a source of information for recognizing dynamic objects, which is not affected by changes to viewpoint. PMID:22661939
Clemens, Jan; Weschke, Gerroth; Vogel, Astrid; Ronacher, Bernhard
2010-04-01
The temporal pattern of amplitude modulations (AM) is often used to recognize acoustic objects. To identify objects reliably, intensity invariant representations have to be formed. We approached this problem within the auditory pathway of grasshoppers. We presented AM patterns modulated at different time scales and intensities. Metric space analysis of neuronal responses allowed us to determine how well, how invariantly, and at which time scales AM frequency is encoded. We find that in some neurons spike-count cues contribute substantially (20-60%) to the decoding of AM frequency at a single intensity. However, such cues are not robust when intensity varies. The general intensity invariance of the system is poor. However, there exists a range of AM frequencies around 83 Hz where intensity invariance of local interneurons is relatively high. In this range, natural communication signals exhibit much variation between species, suggesting an important behavioral role for this frequency band. We hypothesize, just as has been proposed for human speech, that the communication signals might have evolved to match the processing properties of the receivers. This contrasts with optimal coding theory, which postulates that neuronal systems are adapted to the statistics of the relevant signals.
New technique for real-time distortion-invariant multiobject recognition and classification
NASA Astrophysics Data System (ADS)
Hong, Rutong; Li, Xiaoshun; Hong, En; Wang, Zuyi; Wei, Hongan
2001-04-01
A real-time hybrid distortion-invariant OPR system was established to perform 3D multiobject distortion-invariant automatic pattern recognition. Wavelet transform techniques were used for digital preprocessing of the input scene, to suppress the noisy background and enhance the recognized object. A three-layer backpropagation artificial neural network was used in correlation signal post-processing to perform multiobject distortion-invariant recognition and classification. The C-80 and NOA real-time processing capability and multithread programming technology were used to perform high-speed parallel multitask processing and speed up the post-processing rate for the regions of interest (ROIs). The reference filter library was constructed for distorted versions of the 3D object model images based on distortion parameter tolerance measurements for rotation, azimuth, and scale. Real-time optical correlation recognition testing of this OPR system demonstrates that, using the preprocessing, the post-processing, the nonlinear algorithm of optimum filtering, the reference-filter-library construction technique, and multithread programming, a high recognition probability and recognition rate were obtained for the real-time multiobject distortion-invariant OPR system. The recognition reliability and rate were improved greatly. These techniques are very useful for automatic target recognition.
Noise-invariant Neurons in the Avian Auditory Cortex: Hearing the Song in Noise
Moore, R. Channing; Lee, Tyler; Theunissen, Frédéric E.
2013-01-01
Given the extraordinary ability of humans and animals to recognize communication signals over a background of noise, describing noise invariant neural responses is critical not only to pinpoint the brain regions that are mediating our robust perceptions but also to understand the neural computations that are performing these tasks and the underlying circuitry. Although invariant neural responses, such as rotation-invariant face cells, are well described in the visual system, high-level auditory neurons that can represent the same behaviorally relevant signal in a range of listening conditions have yet to be discovered. Here we found neurons in a secondary area of the avian auditory cortex that exhibit noise-invariant responses in the sense that they responded with similar spike patterns to song stimuli presented in silence and over a background of naturalistic noise. By characterizing the neurons' tuning in terms of their responses to modulations in the temporal and spectral envelope of the sound, we then show that noise invariance is partly achieved by selectively responding to long sounds with sharp spectral structure. Finally, to demonstrate that such computations could explain noise invariance, we designed a biologically inspired noise-filtering algorithm that can be used to separate song or speech from noise. This novel noise-filtering method performs as well as other state-of-the-art de-noising algorithms and could be used in clinical or consumer oriented applications. Our biologically inspired model also shows how high-level noise-invariant responses could be created from neural responses typically found in primary auditory cortex. PMID:23505354
The recognition of graphical patterns invariant to geometrical transformation of the models
NASA Astrophysics Data System (ADS)
Ileană, Ioan; Rotar, Corina; Muntean, Maria; Ceuca, Emilian
2010-11-01
When a pattern recognition system is used for image recognition (in robot vision, handwriting recognition, etc.), the system must have the capacity to identify an object regardless of its size or position in the image. The problem of invariant recognition can be approached in several fundamental ways. One may apply the similarity criterion used in associative recall. Alternatively, the original pattern may be replaced by a mathematical transform that assures some invariance (e.g., the magnitude of the two-dimensional Fourier transform is translation invariant, and the magnitude of the Mellin transform is scale invariant). In a different approach, the original pattern is represented through a set of features, each of them coded independently of the position, orientation, or scale of the pattern. Generally speaking, it is easy to obtain invariance with respect to one transformation group, but difficult to obtain simultaneous invariance to rotation, translation, and scale. In this paper we analyze some methods for achieving invariant recognition of images, particularly digit images. A great number of experiments were performed, and the conclusions are underlined in the paper.
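The Fourier-magnitude invariance mentioned in this abstract can be checked in a few lines (a sketch assuming cyclic translation, not code from the paper):

# Cyclically translating an image leaves |FFT2(image)| unchanged, because a
# shift only multiplies the spectrum by a phase ramp.
import numpy as np

rng = np.random.default_rng(1)
img = rng.random((32, 32))
shifted = np.roll(img, shift=(5, -7), axis=(0, 1))  # translate by (5, -7) pixels

mag = np.abs(np.fft.fft2(img))
mag_shifted = np.abs(np.fft.fft2(shifted))
print(np.allclose(mag, mag_shifted))  # True: a translation-invariant feature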
Qiao, Yu; Wang, Wei; Minematsu, Nobuaki; Liu, Jianzhuang; Takeda, Mitsuo; Tang, Xiaoou
2009-10-01
This paper studies phase singularities (PSs) for image representation. We show that PSs calculated with Laguerre-Gauss filters contain important information and provide a useful tool for image analysis. PSs are invariant to image translation and rotation. We introduce several invariant features to characterize the core structures around PSs and analyze the stability of PSs to noise addition and scale change. We also study the characteristics of PSs in a scale space, which lead to a method to select key scales along phase singularity curves. We demonstrate two applications of PSs: object tracking and image matching. In object tracking, we use the iterative closest point algorithm to determine the correspondences of PSs between two adjacent frames. The use of PSs allows us to precisely determine the motions of tracked objects. In image matching, we combine PSs and scale-invariant feature transform (SIFT) descriptor to deal with the variations between two images and examine the proposed method on a benchmark database. The results indicate that our method can find more correct matching pairs with higher repeatability rates than some well-known methods.
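A rough sketch of the phase-singularity machinery described above (the spiral-times-Gaussian filter form and the sigma value are assumptions for illustration; singularities are located by the standard winding-number test on 2x2 plaquettes):

import numpy as np

def laguerre_gauss_phase(img, sigma=0.1):
    # One common way to write the Laguerre-Gauss filter: a spiral phase
    # (fx + i*fy) under a Gaussian envelope, applied in the frequency domain.
    fy, fx = np.meshgrid(np.fft.fftfreq(img.shape[0]),
                         np.fft.fftfreq(img.shape[1]), indexing="ij")
    lg = (fx + 1j * fy) * np.exp(-(fx**2 + fy**2) / (2 * sigma**2))
    return np.angle(np.fft.ifft2(np.fft.fft2(img) * lg))

def phase_singularities(phase):
    """Mark 2x2 plaquettes whose wrapped phase circulation is +/- 2*pi."""
    wrap = lambda a: np.angle(np.exp(1j * a))
    d1 = wrap(phase[:-1, 1:] - phase[:-1, :-1])   # along top edge
    d2 = wrap(phase[1:, 1:] - phase[:-1, 1:])     # down right edge
    d3 = wrap(phase[1:, :-1] - phase[1:, 1:])     # along bottom edge
    d4 = wrap(phase[:-1, :-1] - phase[1:, :-1])   # up left edge
    circulation = d1 + d2 + d3 + d4
    return np.round(circulation / (2 * np.pi)).astype(int)  # charge -1, 0, or +1

img = np.random.default_rng(2).random((64, 64))
charges = phase_singularities(laguerre_gauss_phase(img))
print((charges != 0).sum(), "phase singularities found")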
A survey of visual preprocessing and shape representation techniques
NASA Technical Reports Server (NTRS)
Olshausen, Bruno A.
1988-01-01
Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material of this survey is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention).
New Actions Upon Old Objects: A New Ontological Perspective on Functions.
ERIC Educational Resources Information Center
Schwarz, Baruch; Dreyfus, Tommy
1995-01-01
A computer microworld called Triple Representation Model uses graphical, tabular, and algebraic representations to influence conceptions of function. A majority of students were able to cope with partial data, recognize invariants while coordinating actions among representations, and recognize invariants while creating and comparing different…
On the sighting of unicorns: A variational approach to computing invariant sets in dynamical systems
NASA Astrophysics Data System (ADS)
Junge, Oliver; Kevrekidis, Ioannis G.
2017-06-01
We propose to compute approximations to invariant sets in dynamical systems by minimizing an appropriate distance between a suitably selected finite set of points and its image under the dynamics. We demonstrate, through computational experiments, that this approach can successfully converge to approximations of (maximal) invariant sets of arbitrary topology, dimension, and stability, such as, e.g., saddle type invariant sets with complicated dynamics. We further propose to extend this approach by adding a Lennard-Jones type potential term to the objective function, which yields more evenly distributed approximating finite point sets, and illustrate the procedure through corresponding numerical experiments.
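A toy rendition of this variational idea, assuming Henon-map dynamics and a symmetric nearest-neighbor distance as the objective (the paper's exact functional and its Lennard-Jones spreading term are not reproduced here):

import numpy as np
from scipy.optimize import minimize

def henon(p, a=1.4, b=0.3):
    x, y = p[:, 0], p[:, 1]
    return np.column_stack([1.0 - a * x**2 + y, b * x])

def objective(flat):
    pts = flat.reshape(-1, 2)
    img = henon(pts)
    d2 = ((img[:, None, :] - pts[None, :, :])**2).sum(axis=-1)
    # The set should coincide with its image: every image point lies near the
    # set and every set point lies near the image (symmetric Chamfer distance).
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

rng = np.random.default_rng(3)
x0 = rng.uniform(-1.0, 1.0, size=(20, 2)).ravel()
res = minimize(objective, x0, method="Nelder-Mead",
               options={"maxiter": 20000, "fatol": 1e-12, "xatol": 1e-12})
# res.fun shrinks toward zero as the point set approaches an invariant set;
# without the paper's spreading term the points may cluster near a fixed point.
print(res.fun)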
The relativistic invariance of 4D-shapes
NASA Astrophysics Data System (ADS)
Calosi, Claudio
2015-05-01
A recent debate in the metaphysics of physics focuses on the invariance and intrinsicality of four-dimensional shapes in the Special Theory of Relativity. Davidson (2014) argues that four-dimensional shapes cannot be intrinsic properties of persisting objects because they have to be relativized to reference frames. Balashov (2014a) criticizes such an argument in that it mistakes four-dimensional shapes with their three-dimensional projections on the axes of those frames. This paper adds to that debate. Rather than criticizing an argument against the relativistic invariance of four-dimensional shapes, as Balashov did, it offers a direct argument in favor of such an invariance.
Grossberg, Stephen; Vladusich, Tony
2010-01-01
How does an infant learn through visual experience to imitate actions of adult teachers, despite the fact that the infant and adult view one another and the world from different perspectives? To accomplish this, an infant needs to learn how to share joint attention with adult teachers and to follow their gaze towards valued goal objects. The infant also needs to be capable of view-invariant object learning and recognition whereby it can carry out goal-directed behaviors, such as the use of tools, using different object views than the ones that its teachers use. Such capabilities are often attributed to "mirror neurons". This attribution does not, however, explain the brain processes whereby these competences arise. This article describes the CRIB (Circular Reactions for Imitative Behavior) neural model of how the brain achieves these goals through inter-personal circular reactions. Inter-personal circular reactions generalize the intra-personal circular reactions of Piaget, which clarify how infants learn from their own babbled arm movements and reactive eye movements how to carry out volitional reaches, with or without tools, towards valued goal objects. The article proposes how intra-personal circular reactions create a foundation for inter-personal circular reactions when infants and other learners interact with external teachers in space. Both types of circular reactions involve learned coordinate transformations between body-centered arm movement commands and retinotopic visual feedback, and coordination of processes within and between the What and Where cortical processing streams. Specific breakdowns of model processes generate formal symptoms similar to clinical symptoms of autism.
Geometric Invariants and Object Recognition.
1992-08-01
Perception of Invariance Over Perspective Transformations in Five Month Old Infants.
ERIC Educational Resources Information Center
Gibson, Eleanor; And Others
This experiment asked whether infants at 5 months perceived an invariant over four types of rigid motion (perspective transformations), and thereby differentiated rigid motion from deformation. Four perspective transformations of a sponge rubber object (rotation around the vertical axis, rotation around the horizontal axis, rotation in the frontal…
Okamura, Jun-Ya; Yamaguchi, Reona; Honda, Kazunari; Wang, Gang; Tanaka, Keiji
2014-11-05
One fails to recognize an unfamiliar object across changes in viewing angle when it must be discriminated from similar distractor objects. View-invariant recognition gradually develops as the viewer repeatedly sees the objects in rotation. It is assumed that different views of each object are associated with one another while their successive appearance is experienced in rotation. However, natural experience of objects also contains ample opportunities to discriminate among objects at each of the multiple viewing angles. Our previous behavioral experiments showed that after experiencing a new set of object stimuli during a task that required only discrimination at each of four viewing angles at 30° intervals, monkeys could recognize the objects across changes in viewing angle up to 60°. By recording activities of neurons from the inferotemporal cortex after various types of preparatory experience, we here found a possible neural substrate for the monkeys' performance. For object sets that the monkeys had experienced during the task that required only discrimination at each of four viewing angles, many inferotemporal neurons showed object selectivity covering multiple views. The degree of view generalization found for these object sets was similar to that found for stimulus sets with which the monkeys had been trained to conduct view-invariant recognition. These results suggest that the experience of discriminating new objects in each of several viewing angles develops the partially view-generalized object selectivity distributed over many neurons in the inferotemporal cortex, which in turn bases the monkeys' emergent capability to discriminate the objects across changes in viewing angle.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Changhong; Cheung, Yeuk-Kwan E., E-mail: chellifegood@gmail.com, E-mail: cheung@nju.edu.cn
2014-07-01
We investigate the spectrum of cosmological perturbations in a bounce cosmos modeled by a scalar field coupled to the string tachyon field (CSTB cosmos). By explicit computation of its primordial spectral index we show the power spectrum of curvature perturbations, generated during the tachyon matter dominated contraction phase, to be nearly scale invariant. We propose a unified parameter space for a systematic study of inflationary and bounce cosmologies. The CSTB cosmos is dual, in Wands's sense, to slow-roll inflation, as can be visualized with the aid of this parameter space. Guaranteed by the dynamical attractor behavior of the CSTB cosmos, the scale invariance of its power spectrum is free of the fine-tuning problem, in contrast to the slow-roll inflation model.
Slow Feature Analysis on Retinal Waves Leads to V1 Complex Cells
Dähne, Sven; Wilbert, Niko; Wiskott, Laurenz
2014-01-01
The developing visual system of many mammalian species is partially structured and organized even before the onset of vision. Spontaneous neural activity, which spreads in waves across the retina, has been suggested to play a major role in these prenatal structuring processes. Recently, it has been shown that when employing an efficient coding strategy, such as sparse coding, these retinal activity patterns lead to basis functions that resemble optimal stimuli of simple cells in primary visual cortex (V1). Here we present the results of applying a coding strategy that optimizes for temporal slowness, namely Slow Feature Analysis (SFA), to a biologically plausible model of retinal waves. Previously, SFA has been successfully applied to model parts of the visual system, most notably in reproducing a rich set of complex-cell features by training SFA with quasi-natural image sequences. In the present work, we obtain SFA units that share a number of properties with cortical complex-cells by training on simulated retinal waves. The emergence of two distinct properties of the SFA units (phase invariance and orientation tuning) is thoroughly investigated via control experiments and mathematical analysis of the input-output functions found by SFA. The results support the idea that retinal waves share relevant temporal and spatial properties with natural visual input. Hence, retinal waves seem suitable training stimuli to learn invariances and thereby shape the developing early visual system such that it is best prepared for coding input from the natural world. PMID:24810948
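A minimal linear Slow Feature Analysis, sketching the computation the abstract builds on (toy sinusoidal mixtures stand in for simulated retinal waves; this is not the study's implementation):

# Linear SFA: whiten the signals, then take the directions in which the
# temporal derivative has least variance; those are the slowest features.
import numpy as np

def sfa(x, n_features=2):
    x = x - x.mean(axis=0)
    cov = np.cov(x, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    white = evecs / np.sqrt(evals)           # whitening matrix (columns scaled)
    z = x @ white
    dz = np.diff(z, axis=0)                  # temporal differences
    dvals, dvecs = np.linalg.eigh(np.cov(dz, rowvar=False))
    w = white @ dvecs[:, :n_features]        # smallest eigenvalues = slowest
    return x @ w

t = np.linspace(0, 4 * np.pi, 2000)
rng = np.random.default_rng(4)
mix = rng.normal(size=(3, 3))                # unknown linear mixing
signals = np.column_stack([np.sin(t), np.sin(11 * t), np.sin(29 * t)]) @ mix
slow = sfa(signals)
print(slow.shape)  # (2000, 2); the first column recovers the slowest source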
Sereno, Anne B.; Lehky, Sidney R.
2011-01-01
Although the representation of space is as fundamental to visual processing as the representation of shape, it has received relatively little attention from neurophysiological investigations. In this study we characterize representations of space within visual cortex, and examine how they differ in a first direct comparison between dorsal and ventral subdivisions of the visual pathways. Neural activities were recorded in anterior inferotemporal cortex (AIT) and lateral intraparietal cortex (LIP) of awake behaving monkeys, structures associated with the ventral and dorsal visual pathways respectively, as a stimulus was presented at different locations within the visual field. In spatially selective cells, we find greater modulation of cell responses in LIP with changes in stimulus position. Further, using a novel population-based statistical approach (namely, multidimensional scaling), we recover the spatial map implicit within activities of neural populations, allowing us to quantitatively compare the geometry of neural space with physical space. We show that a population of spatially selective LIP neurons, despite having large receptive fields, is able to almost perfectly reconstruct stimulus locations within a low-dimensional representation. In contrast, a population of AIT neurons, despite each cell being spatially selective, provide less accurate low-dimensional reconstructions of stimulus locations. They produce instead only a topologically (categorically) correct rendition of space, which nevertheless might be critical for object and scene recognition. Furthermore, we found that the spatial representation recovered from population activity shows greater translation invariance in LIP than in AIT. We suggest that LIP spatial representations may be dimensionally isomorphic with 3D physical space, while in AIT spatial representations may reflect a more categorical representation of space (e.g., “next to” or “above”). PMID:21344010
DOE Office of Scientific and Technical Information (OSTI.GOV)
Finger, Paul T., E-mail: pfinger@eyecancer.com; Chin, Kimberly J.
2012-02-01
Purpose: To evaluate the intravitreal antivascular endothelial growth factor, bevacizumab, for treatment of radiation optic neuropathy (RON). Methods and Materials: A prospective interventional clinical case series was performed of 14 patients with RON related to plaque radiotherapy for choroidal melanoma. The RON was characterized by optic disc edema, hemorrhages, microangiopathy, and neovascularization. The entry criteria included a subjective or objective loss of vision, coupled with findings of RON. The study subjects received a minimum of two initial injections of intravitreal bevacizumab (1.25 mg in 0.05 mL) every 6-8 weeks. The primary objectives included safety and tolerability. The secondary objectives included the efficacy as measured using the Early Treatment Diabetic Retinopathy Study chart for visual acuity, fundus photography, angiography, and optical coherence tomography/scanning laser ophthalmoscopy. Results: Reductions in optic disc hemorrhage and edema were noted in all patients. The visual acuity was stable or improved in 9 (64%) of the 14 patients. Of the 5 patients who had lost vision, 2 had relatively large posterior tumors, 1 had had the vision decrease because of intraocular hemorrhage, and 1 had developed optic atrophy. The fifth patient who lost vision was noncompliant. No treatment-related ocular or systemic side effects were observed. Conclusions: Intravitreal antivascular endothelial growth factor bevacizumab was tolerated and generally associated with improved vision, reduced papillary hemorrhage, and resolution of optic disc edema. Persistent optic disc neovascularization and fluorescein angiographic leakage were invariably noted. The results of the present study support additional evaluation of antivascular endothelial growth factor medications as treatment of RON.
Do characteristics of a stationary obstacle lead to adjustments in obstacle stepping strategies?
Worden, Timothy A; De Jong, Audrey F; Vallis, Lori Ann
2016-01-01
Navigating cluttered and complex environments increases the risk of falling. To decrease this risk, it is important to understand the influence of obstacle visual cues on stepping parameters; however, the specific obstacle characteristics that have the greatest influence on avoidance strategies are still under debate. The purpose of the current work is to provide further insight into the relationship between obstacle appearance in the environment and modulation of stepping parameters. Healthy young adults (N=8) first stepped over an obstacle with one visible top edge ("floating"; 8 trials), followed by trials where experimenters randomly altered the location of a ground reference object to one of 7 different positions (8 trials per location), which ranged from 6 cm in front of, directly under, or up to 6 cm behind the floating obstacle (at 2 cm intervals). Mean take-off and landing distance as well as minimum foot clearance values were unchanged across different positions of the ground reference object; a consistent stepping trajectory was observed for all experimental conditions. Contrary to our hypotheses, the results of this study indicate that ground-based visual cues are not essential for the planning of stepping and clearance strategies. The simultaneous presentation of both floating and ground-based objects may have provided critical information that led to the adoption of a consistent strategy for clearing the top edge of the obstacle. The invariant foot placement observed here may be an appropriate stepping strategy for young adults; however, this may not be the case across the lifespan or in special populations.
Optical-Correlator Neural Network Based On Neocognitron
NASA Technical Reports Server (NTRS)
Chao, Tien-Hsin; Stoner, William W.
1994-01-01
Multichannel optical correlator implements shift-invariant, high-discrimination pattern-recognizing neural network based on paradigm of neocognitron. Selected as basic building block of this neural network because invariance under shifts is inherent advantage of Fourier optics included in optical correlators in general. Neocognitron is conceptual electronic neural-network model for recognition of visual patterns. Multilayer processing achieved by iteratively feeding back output of feature correlator to input spatial light modulator and updating Fourier filters. Neural network trained by use of characteristic features extracted from target images. Multichannel implementation enables parallel processing of large number of selected features.
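The shift invariance that motivates correlator-based architectures like this one can be illustrated with FFT-based cross-correlation (a sketch with a toy scene, not the NASA implementation): translating the input moves the correlation peak without changing its height.

import numpy as np

rng = np.random.default_rng(5)
target = rng.random((8, 8))
scene = np.zeros((64, 64))
scene[20:28, 30:38] = target                  # embed target at a known offset

def correlate(scene, template):
    # Cross-correlation via the Fourier domain, as an optical correlator does.
    pad = np.zeros_like(scene)
    pad[:template.shape[0], :template.shape[1]] = template
    return np.real(np.fft.ifft2(np.fft.fft2(scene) * np.conj(np.fft.fft2(pad))))

c1 = correlate(scene, target)
c2 = correlate(np.roll(scene, (9, -4), axis=(0, 1)), target)
print(np.unravel_index(c1.argmax(), c1.shape),   # peak at (20, 30)
      np.unravel_index(c2.argmax(), c2.shape),   # peak shifted to (29, 26)
      np.isclose(c1.max(), c2.max()))            # same peak height: shift invariance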
Stimulus homogeneity enhances implicit learning: evidence from contextual cueing.
Feldmann-Wüstefeld, Tobias; Schubö, Anna
2014-04-01
Visual search for a target object is faster if the target is embedded in a repeatedly presented invariant configuration of distractors ('contextual cueing'). It has also been shown that the homogeneity of a context affects the efficiency of visual search: targets receive prioritized processing when presented in a homogeneous context compared to a heterogeneous context, presumably due to grouping processes at early stages of visual processing. The present study investigated in three experiments whether context homogeneity also affects contextual cueing. In Experiment 1, context homogeneity varied on three levels of the task-relevant dimension (orientation), and contextual cueing was most pronounced for context configurations with high orientation homogeneity. When context homogeneity varied on three levels of the task-irrelevant dimension (color) and orientation homogeneity was fixed, no modulation of contextual cueing was observed: high orientation homogeneity led to large contextual cueing effects (Experiment 2) and low orientation homogeneity led to small contextual cueing effects (Experiment 3), irrespective of color homogeneity. Enhanced contextual cueing for homogeneous context configurations suggests that grouping processes affect not only visual search but also implicit learning. We conclude that memory representations of context configurations are more easily acquired when context configurations can be processed as larger, grouped perceptual units. However, this form of implicit perceptual learning is only improved by stimulus homogeneity when stimulus homogeneity facilitates grouping processes on a dimension that is currently relevant in the task.
Otten, Marte; Banaji, Mahzarin R.
2012-01-01
A number of recent behavioral studies have shown that emotional expressions are differently perceived depending on the race of a face, and that perception of race cues is influenced by emotional expressions. However, neural processes related to the perception of invariant cues that indicate the identity of a face (such as race) are often described to proceed independently of processes related to the perception of cues that can vary over time (such as emotion). Using a visual face adaptation paradigm, we tested whether these behavioral interactions between emotion and race also reflect interdependent neural representation of emotion and race. We compared visual emotion aftereffects when the adapting face and ambiguous test face differed in race or not. Emotion aftereffects were much smaller in different race (DR) trials than same race (SR) trials, indicating that the neural representation of a facial expression is significantly different depending on whether the emotional face is black or white. It thus seems that invariable cues such as race interact with variable face cues such as emotion not just at a response level, but also at the level of perception and neural representation. PMID:22403531
Vanrie, Jan; Béatse, Erik; Wagemans, Johan; Sunaert, Stefan; Van Hecke, Paul
2002-01-01
It has been proposed that object perception can proceed through different routes, which can be situated on a continuum ranging from complete viewpoint-dependency to complete viewpoint-independency, depending on the objects and the task at hand. Although these different routes have been extensively demonstrated on the behavioral level, the corresponding distinction in the underlying neural substrate has not received the same attention. Our goal was to disentangle, on the behavioral and the neurofunctional level, a process associated with extreme viewpoint-dependency, i.e. mental rotation, and a process associated with extreme viewpoint-independency, i.e. the use of viewpoint-invariant, diagnostic features. Two sets of 3-D block figures were created that either differed in handedness (original versus mirrored) or in the angles joining the block components (orthogonal versus skewed). Behavioral measures on a same-different judgment task were predicted to be dependent on viewpoint in the rotation condition (same versus mirrored), but not in the invariance condition (same angles versus different angles). Six subjects participated in an fMRI experiment while presented with both conditions in alternating blocks. Both reaction times and accuracy confirmed the predicted dissociation between the two conditions. Neurofunctional results indicate that all cortical areas activated in the invariance condition were also activated in the rotation condition. Parietal areas were more activated than occipito-temporal areas in the rotation condition, while this pattern was reversed in the invariance condition. Furthermore, some areas were activated uniquely by the rotation condition, probably reflecting the additional processes apparent in the behavioral response patterns.
On deformation of complex continuum immersed in a plane space
NASA Astrophysics Data System (ADS)
Kovalev, V. A.; Murashkin, E. V.; Radayev, Y. N.
2018-05-01
The present paper is devoted to mathematical modelling of the deformations of complex continua considered as immersed in an external plane space. The complex continuum is defined as a differential manifold supplied with metrics induced by the external space. A systematic derivation of strain tensors via the notion of an isometric immersion of the complex continuum into a plane space of higher dimension is proposed. The problem of establishing complete systems of irreducible objective strain and extra-strain tensors for a complex continuum immersed in an external plane space is resolved. The solution to the problem is obtained by methods of field theory and the theory of rational algebraic invariants. Strain tensors of the complex continuum are derived as irreducible algebraic invariants of contravariant vectors of the external space emerging as functional arguments in the complex continuum action density. The present analysis is restricted to rational algebraic invariants. Completeness of the considered systems of rational algebraic invariants is established for micropolar elastic continua. Rational syzygies for non-quadratic invariants are discussed. Objective strain tensors (indifferent to frame rotations in the external plane space) for a micropolar continuum are alternatively obtained by properly combining multipliers of the polar decompositions of the deformation and extra-deformation gradients. The latter is realized only for continua immersed in a plane space of equal mathematical dimension.
A biologically inspired neural network model to transformation invariant object recognition
NASA Astrophysics Data System (ADS)
Iftekharuddin, Khan M.; Li, Yaqin; Siddiqui, Faraz
2007-09-01
Transformation invariant image recognition has been an active research area due to its widespread applications in a variety of fields such as military operations, robotics, medical practices, geographic scene analysis, and many others. The primary goal for this research is detection of objects in the presence of image transformations such as changes in resolution, rotation, translation, scale and occlusion. We investigate a biologically-inspired neural network (NN) model for such transformation-invariant object recognition. In a classical training-testing setup for NN, the performance is largely dependent on the range of transformation or orientation involved in training. However, an even more serious dilemma is that there may not be enough training data available for successful learning, or even none at all. To alleviate this problem, a biologically inspired reinforcement learning (RL) approach is proposed. In this paper, the RL approach is explored for object recognition with different types of transformations such as changes in scale, size, resolution and rotation. The RL is implemented in an adaptive critic design (ACD) framework, which approximates neuro-dynamic programming with an action network and a critic network. Two ACD algorithms, Heuristic Dynamic Programming (HDP) and Dual Heuristic Programming (DHP), are investigated to obtain transformation invariant object recognition. The two learning algorithms are evaluated statistically using simulated transformations in images as well as with a large-scale UMIST face database with pose variations. In the face database authentication case, the 90° out-of-plane rotation of faces from 20 different subjects in the UMIST database is used. Our simulations show promising results for both designs for transformation-invariant object recognition and authentication of faces. Comparing the two algorithms, DHP outperforms HDP in learning capability, as DHP generally takes fewer steps to perform a successful recognition task. Further, the residual critic error in DHP is generally smaller than that of HDP, and DHP achieves a 100% success rate more frequently than HDP for individual objects/subjects. On the other hand, HDP is more robust than DHP in terms of success rate across the whole database when applied in a stochastic and uncertain environment, and DHP requires more computation time.
On Integral Invariants for Effective 3-D Motion Trajectory Matching and Recognition.
Shao, Zhanpeng; Li, Youfu
2016-02-01
Motion trajectories tracked from the motions of human, robots, and moving objects can provide an important clue for motion analysis, classification, and recognition. This paper defines some new integral invariants for a 3-D motion trajectory. Based on two typical kernel functions, we design two integral invariants, the distance and area integral invariants. The area integral invariants are estimated based on the blurred segment of noisy discrete curve to avoid the computation of high-order derivatives. Such integral invariants for a motion trajectory enjoy some desirable properties, such as computational locality, uniqueness of representation, and noise insensitivity. Moreover, our formulation allows the analysis of motion trajectories at a range of scales by varying the scale of kernel function. The features of motion trajectories can thus be perceived at multiscale levels in a coarse-to-fine manner. Finally, we define a distance function to measure the trajectory similarity to find similar trajectories. Through the experiments, we examine the robustness and effectiveness of the proposed integral invariants and find that they can capture the motion cues in trajectory matching and sign recognition satisfactorily.
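A toy version of a distance integral invariant in this spirit (the Gaussian index-domain kernel and its scale are assumptions; the paper's exact kernels and blurred-segment estimation differ): because pairwise distances are preserved by rigid motions, the signature agrees between a trajectory and a rotated, translated copy.

import numpy as np

def distance_integral_invariant(traj, scale=5.0):
    # For each point, a kernel-weighted average of distances to nearby points;
    # varying `scale` gives the coarse-to-fine multiscale view the paper describes.
    idx = np.arange(len(traj))
    w = np.exp(-((idx[:, None] - idx[None, :])**2) / (2 * scale**2))
    d = np.linalg.norm(traj[:, None, :] - traj[None, :, :], axis=2)
    return (w * d).sum(axis=1) / w.sum(axis=1)

t = np.linspace(0, 2 * np.pi, 200)
traj = np.column_stack([np.cos(3 * t), np.sin(2 * t), t])   # a 3-D motion trajectory

# Apply a random rotation (QR of a random matrix) plus a translation.
q, _ = np.linalg.qr(np.random.default_rng(6).normal(size=(3, 3)))
moved = traj @ q.T + np.array([5.0, -2.0, 1.0])

print(np.allclose(distance_integral_invariant(traj),
                  distance_integral_invariant(moved)))  # True: rigid-motion invariant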
Gender Invariance of Family, School, and Peer Influence on Volunteerism Scale
ERIC Educational Resources Information Center
Law, Ben; Shek, Daniel; Ma, Cecilia
2015-01-01
Objective: This article examines the measurement invariance of "Family, School, and Peer Influence on Volunteerism Scale" (FSPV) across genders using the mean and covariance structure analysis approach. Method: A total of 2,845 Chinese high school adolescents aged 11 to 15 years completed the FSPV scale. Results: Results of the…
Evidence for Tempo-Specific Timing in Music Using a Web-Based Experimental Setup
ERIC Educational Resources Information Center
Honing, Henkjan
2006-01-01
Perceptual invariance has been studied and found in several domains of cognition, including those of speech, motor behavior, and object motion. It has also been the topic of several studies in music perception. However, the existing perceptual studies present rather inconclusive evidence with regard to the perceptual invariance of expressive…
2D Affine and Projective Shape Analysis.
Bryner, Darshan; Klassen, Eric; Huiling Le; Srivastava, Anuj
2014-05-01
Current techniques for shape analysis tend to seek invariance to similarity transformations (rotation, translation, and scale), but certain imaging situations require invariance to larger groups, such as affine or projective groups. Here we present a general Riemannian framework for shape analysis of planar objects where metrics and related quantities are invariant to affine and projective groups. Highlighting two possibilities for representing object boundaries, ordered points (or landmarks) and parameterized curves, we study different combinations of these representations (points and curves) and transformations (affine and projective). Specifically, we provide solutions to three out of four situations and develop algorithms for computing geodesics and intrinsic sample statistics, leading up to Gaussian-type statistical models, and classifying test shapes using such models learned from training data. In the case of parameterized curves, we also achieve the desired goal of invariance to re-parameterizations. The geodesics are constructed by particularizing the path-straightening algorithm to geometries of current manifolds and are used, in turn, to compute shape statistics and Gaussian-type shape models. We demonstrate these ideas using a number of examples from shape and activity recognition.
NASA Astrophysics Data System (ADS)
Henneaux, Marc; Lekeu, Victor; Matulich, Javier; Prohazka, Stefan
2018-06-01
The action of the free N = (3, 1) theory in six spacetime dimensions is explicitly constructed. The variables of the variational principle are prepotentials adapted to the self-duality conditions on the fields. The (3, 1) supersymmetry variations are given and the invariance of the action is verified. The action is first-order in time derivatives. It is also Poincaré invariant but not manifestly so, just like the Hamiltonian action of more familiar relativistic field theories.
Three-dimensional object recognition using similar triangles and decision trees
NASA Technical Reports Server (NTRS)
Spirkovska, Lilly
1993-01-01
A system, TRIDEC, that is capable of distinguishing between a set of objects despite changes in the objects' positions in the input field, their size, or their rotational orientation in 3D space is described. TRIDEC combines very simple yet effective features with the classification capabilities of inductive decision tree methods. The feature vector is a list of all similar triangles defined by connecting all combinations of three pixels in a coarse coded 127 x 127 pixel input field. The classification is accomplished by building a decision tree using the information provided from a limited number of translated, scaled, and rotated samples. Simulation results are presented which show that TRIDEC achieves 94 percent recognition accuracy in the 2D invariant object recognition domain and 98 percent recognition accuracy in the 3D invariant object recognition domain after training on only a small sample of transformed views of the objects.
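A rough sketch of the similar-triangle feature idea (assumptions: small 2-D point sets, sorted interior angles of all 3-point combinations binned into a histogram, and a scikit-learn decision tree in place of TRIDEC's exact coarse-coded feature vector):

# Interior angles of a triangle are invariant to translation, rotation, and
# scale, so a histogram over all 3-point combinations is a similarity-invariant
# feature vector that a decision tree can classify.
import numpy as np
from itertools import combinations
from sklearn.tree import DecisionTreeClassifier

def angle_histogram(points, bins=18):
    hist = np.zeros(bins)
    for a, b, c in combinations(points, 3):
        for p, q, r in ((a, b, c), (b, c, a), (c, a, b)):
            u, v = q - p, r - p
            cosang = u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)
            ang = np.arccos(np.clip(cosang, -1, 1))
            hist[min(int(ang / np.pi * bins), bins - 1)] += 1
    return hist / hist.sum()

rng = np.random.default_rng(7)
shapes = [rng.random((6, 2)) for _ in range(3)]        # three "objects"
X, y = [], []
for label, shape in enumerate(shapes):
    for _ in range(20):                                 # transformed training views
        th = rng.uniform(0, 2 * np.pi)
        rot = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
        view = shape @ rot.T * rng.uniform(0.5, 2.0) + rng.uniform(-5, 5, 2)
        X.append(angle_histogram(view)); y.append(label)

clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Held-out check: new random similarity transforms of each shape.
correct = 0
for label, shape in enumerate(shapes):
    th = rng.uniform(0, 2 * np.pi)
    rot = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
    view = shape @ rot.T * rng.uniform(0.5, 2.0) + rng.uniform(-5, 5, 2)
    correct += int(clf.predict([angle_histogram(view)])[0] == label)
print(correct, "of", len(shapes), "held-out transformed views recognized")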
A cortical framework for invariant object categorization and recognition.
Rodrigues, João; Hans du Buf, J M
2009-08-01
In this paper we present a new model for invariant object categorization and recognition. It is based on explicit multi-scale features: lines, edges and keypoints are extracted from responses of simple, complex and end-stopped cells in cortical area V1, and keypoints are used to construct saliency maps for Focus-of-Attention. The model is a functional but dichotomous one, because keypoints are employed to model the "where" data stream, with dynamic routing of features from V1 to higher areas to obtain translation, rotation and size invariance, whereas lines and edges are employed in the "what" stream for object categorization and recognition. Furthermore, both the "where" and "what" pathways are dynamic in that information at coarse scales is employed first, after which information at progressively finer scales is added in order to refine the processes, i.e., both the dynamic feature routing and the categorization level. The construction of group and object templates, which are thought to be available in the prefrontal cortex with "what" and "where" components in PF46d and PF46v, is also illustrated. The model was tested in the framework of an integrated and biologically plausible architecture.
3-Dimensional Scene Perception during Active Electrolocation in a Weakly Electric Pulse Fish
von der Emde, Gerhard; Behr, Katharina; Bouton, Béatrice; Engelmann, Jacob; Fetz, Steffen; Folde, Caroline
2010-01-01
Weakly electric fish use active electrolocation for object detection and orientation in their environment even in complete darkness. The African mormyrid Gnathonemus petersii can detect object parameters, such as material, size, shape, and distance. Here, we tested whether individuals of this species can learn to identify 3-dimensional objects independently of the training conditions and independently of the object's position in space (rotation-invariance; size-constancy). Individual G. petersii were trained in a two-alternative forced-choice procedure to electrically discriminate between a 3-dimensional object (S+) and several alternative objects (S−). Fish were then tested whether they could identify the S+ among novel objects and whether single components of S+ were sufficient for recognition. Size-constancy was investigated by presenting the S+ together with a larger version at different distances. Rotation-invariance was tested by rotating S+ and/or S− in 3D. Our results show that electrolocating G. petersii could (1) recognize an object independently of the S− used during training. When only single components of a complex S+ were offered, recognition of S+ was more or less affected depending on which part was used. (2) Object-size was detected independently of object distance, i.e. fish showed size-constancy. (3) The majority of the fishes tested recognized their S+ even if it was rotated in space, i.e. these fishes showed rotation-invariance. (4) Object recognition was restricted to the near field around the fish and failed when objects were moved more than about 4 cm away from the animals. Our results indicate that even in complete darkness our G. petersii were capable of complex 3-dimensional scene perception using active electrolocation. PMID:20577635
Christophel, Thomas B; Allefeld, Carsten; Endisch, Christian; Haynes, John-Dylan
2018-06-01
Traditional views of visual working memory postulate that memorized contents are stored in dorsolateral prefrontal cortex using an adaptive and flexible code. In contrast, recent studies proposed that contents are maintained by posterior brain areas using codes akin to perceptual representations. An important question is whether this reflects a difference in the level of abstraction between posterior and prefrontal representations. Here, we investigated whether neural representations of visual working memory contents are view-independent, as indicated by rotation-invariance. Using functional magnetic resonance imaging and multivariate pattern analyses, we show that when subjects memorize complex shapes, both posterior and frontal brain regions maintain the memorized contents using a rotation-invariant code. Importantly, we found the representations in frontal cortex to be localized to the frontal eye fields rather than dorsolateral prefrontal cortices. Thus, our results give evidence for the view-independent storage of complex shapes in distributed representations across posterior and frontal brain regions.
A three-layer model of natural image statistics.
Gutmann, Michael U; Hyvärinen, Aapo
2013-11-01
An important property of visual systems is to be simultaneously both selective to specific patterns found in the sensory input and invariant to possible variations. Selectivity and invariance (tolerance) are opposing requirements. It has been suggested that they could be joined by iterating a sequence of elementary selectivity and tolerance computations. It is, however, unknown what should be selected or tolerated at each level of the hierarchy. We approach this issue by learning the computations from natural images. We propose and estimate a probabilistic model of natural images that consists of three processing layers. Two natural image data sets are considered: image patches, and complete visual scenes downsampled to the size of small patches. For both data sets, we find that in the first two layers, simple and complex cell-like computations are performed. In the third layer, we mainly find selectivity to longer contours; for patch data, we further find some selectivity to texture, while for the downsampled complete scenes, some selectivity to curvature is observed. Copyright © 2013 Elsevier Ltd. All rights reserved.
Synesthetic colors are elicited by sound quality in Japanese synesthetes.
Asano, Michiko; Yokosawa, Kazuhiko
2011-12-01
Determinants of synesthetic color choice for Japanese phonetic characters were studied in six Japanese synesthetes. The study used Hiragana and Katakana characters, which represent the same set of syllables although their visual forms are dissimilar. From a palette of 138 colors, synesthetes selected a color corresponding to each character. Results revealed that synesthetic color choices for Hiragana characters and those for their Katakana counterparts were remarkably consistent, indicating that color selection depended on character-related sounds and not visual form. This Hiragana-Katakana invariance cannot be regarded as the same phenomenon as letter case invariance, usually reported for English grapheme-color synesthesia, because Hiragana and Katakana characters have different identities whereas upper and lower case letters have the same identity. This involvement of phonology suggests that cross-activation between an inducer (i.e., letter/character) brain region and that of the concurrent (i.e., color) area in grapheme-color synesthesia is mediated by higher order cortical processing areas. Copyright © 2011 Elsevier Inc. All rights reserved.
Learning Rotation-Invariant Local Binary Descriptor.
Duan, Yueqi; Lu, Jiwen; Feng, Jianjiang; Zhou, Jie
2017-08-01
In this paper, we propose a rotation-invariant local binary descriptor (RI-LBD) learning method for visual recognition. Compared with hand-crafted local binary descriptors, such as local binary pattern and its variants, which require strong prior knowledge, local binary feature learning methods are more efficient and data-adaptive. Unlike existing learning-based local binary descriptors, such as compact binary face descriptor and simultaneous local binary feature learning and encoding, which are susceptible to rotations, our RI-LBD first categorizes each local patch into a rotational binary pattern (RBP), and then jointly learns the orientation for each pattern and the projection matrix to obtain RI-LBDs. As all the rotation variants of a patch belong to the same RBP, they are rotated into the same orientation and projected into the same binary descriptor. Then, we construct a codebook by a clustering method on the learned binary codes, and obtain a histogram feature for each image as the final representation. In order to exploit higher order statistical information, we extend our RI-LBD to the triple rotation-invariant co-occurrence local binary descriptor (TRICo-LBD) learning method, which learns a triple co-occurrence binary code for each local patch. Extensive experimental results on four different visual recognition tasks, including image patch matching, texture classification, face recognition, and scene classification, show that our RI-LBD and TRICo-LBD outperform most existing local descriptors.
Eger, E; Pinel, P; Dehaene, S; Kleinschmidt, A
2015-05-01
Macaque electrophysiology has revealed neurons responsive to number in lateral (LIP) and ventral (VIP) intraparietal areas. Recently, fMRI pattern recognition revealed information discriminative of individual numbers in human parietal cortex but without precisely localizing the relevant sites or testing for subregions with different response profiles. Here, we defined the human functional equivalents of LIP (feLIP) and VIP (feVIP) using neurophysiologically motivated localizers. We applied multivariate pattern recognition to investigate whether both regions represent numerical information and whether number codes are position specific or invariant. In a delayed number comparison paradigm with laterally presented numerosities, parietal cortex discriminated between numerosities better than early visual cortex, and discrimination generalized across hemifields in parietal, but not early visual cortex. Activation patterns in the 2 parietal regions of interest did not differ in the coding of position-specific or position-independent number information, but in the expression of a numerical distance effect which was more pronounced in feLIP. Thus, the representation of number in parietal cortex is at least partially position invariant. Both feLIP and feVIP contain information about individual numerosities in humans, but feLIP hosts a coarser representation of numerosity than feVIP, compatible with either broader tuning or a summation code. © The Author 2013. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Ben Ayed, Ismail; Punithakumar, Kumaradevan; Garvin, Gregory; Romano, Walter; Li, Shuo
2011-01-01
This study investigates novel object-interaction priors for graph cut image segmentation with application to intervertebral disc delineation in magnetic resonance (MR) lumbar spine images. The algorithm optimizes an original cost function which constrains the solution with learned prior knowledge about the geometric interactions between different objects in the image. Based on a global measure of similarity between distributions, the proposed priors are intrinsically invariant with respect to translation and rotation. We further introduce a scale variable from which we derive an original fixed-point equation (FPE), thereby achieving scale-invariance with only a few fast computations. The proposed priors relax the need for costly pose estimation (or registration) procedures and large training sets (we used a single subject for training), and can tolerate shape deformations, unlike template-based priors. Our formulation leads to an NP-hard problem which does not afford a form directly amenable to graph cut optimization. We proceeded to a relaxation of the problem via an auxiliary function, thereby obtaining a nearly real-time solution with few graph cuts. Quantitative evaluations over 60 intervertebral discs acquired from 10 subjects demonstrated that the proposed algorithm yields a high correlation with independent manual segmentations by an expert. We further demonstrate experimentally the invariance of the proposed geometric attributes. This supports the fact that a single subject is sufficient for training our algorithm, and confirms the relevance of the proposed priors to disc segmentation.
Position, scale, and rotation invariant holographic associative memory
NASA Astrophysics Data System (ADS)
Fielding, Kenneth H.; Rogers, Steven K.; Kabrisky, Matthew; Mills, James P.
1989-08-01
This paper describes the development and characterization of a holographic associative memory (HAM) system that is able to recall stored objects whose inputs were changed in position, scale, and rotation. The HAM is based on the single iteration model described by Owechko et al. (1987); however, the system described uses a self-pumped BaTiO3 phase conjugate mirror, rather than the degenerate four-wave mixing proposed by Owechko and his coworkers. The HAM system can store objects in a position, scale, and rotation invariant feature space. The angularly multiplexed diffuse Fourier transform holograms of the HAM feature space are characterized as the memory unit; distorted input objects are correlated with the hologram, and the nonlinear phase conjugate mirror reduces cross-correlation noise and provides object discrimination. Applications of the HAM system are presented.
Biomorphic networks: approach to invariant feature extraction and segmentation for ATR
NASA Astrophysics Data System (ADS)
Baek, Andrew; Farhat, Nabil H.
1998-10-01
Invariant features in two dimensional binary images are extracted in a single layer network of locally coupled spiking (pulsating) model neurons with prescribed synapto-dendritic response. The feature vector for an image is represented as invariant structure in the aggregate histogram of interspike intervals obtained by computing time intervals between successive spikes produced from each neuron over a given period of time and combining such intervals from all neurons in the network into a histogram. Simulation results show that the feature vectors are more pattern-specific and invariant under translation, rotation, and change in scale or intensity than achieved in earlier work. We also describe an application of such networks to segmentation of line (edge-enhanced or silhouette) images. The biomorphic spiking network's capabilities in segmentation and invariant feature extraction may prove to be, when they are combined, valuable in Automated Target Recognition (ATR) and other automated object recognition systems.
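A minimal sketch of the aggregate interspike-interval histogram used as the feature vector above; the bin width, time range, and normalization are my assumptions rather than the paper's simulation parameters.

```python
# Sketch, assuming per-neuron spike times (in seconds) are already available
# from the spiking network simulation.
import numpy as np

def isi_feature(spike_trains, t_max=1.0, bins=50):
    """Aggregate interspike-interval histogram over all neurons in the
    network; normalizing makes the vector tolerant to overall intensity."""
    intervals = [np.diff(np.asarray(s)) for s in spike_trains if len(s) > 1]
    all_isi = np.concatenate(intervals) if intervals else np.zeros(0)
    hist, _ = np.histogram(all_isi, bins=bins, range=(0.0, t_max))
    return hist / max(hist.sum(), 1)
```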
ERIC Educational Resources Information Center
Shek, Daniel T. L.; Ma, Cecilia M. S.
2010-01-01
Objective: This paper examines the dimensionality and factorial invariance of the Chinese Family Assessment Instrument (C-FAI) using multigroup confirmatory factor analyses (MCFAs). Method: A total of 3,649 students responded to the C-FAI in a community survey. Results: Results showed that there are five dimensions of the C-FAI (communication,…
Learning receptor positions from imperfectly known motions
NASA Technical Reports Server (NTRS)
Ahumada, Albert J., Jr.; Mulligan, Jeffrey B.
1990-01-01
An algorithm is described for learning image interpolation functions for sensor arrays whose sensor positions are somewhat disordered. The learning is based on failures of translation invariance, so it does not require knowledge of the images being presented to the visual system. Previously reported implementations of the method assumed the visual system to have precise knowledge of the translations. It is demonstrated that translation estimates computed from the imperfectly interpolated images can have enough accuracy to allow the learning process to converge to a correct interpolation.
Pictorial depth probed through relative sizes
Wagemans, Johan; van Doorn, Andrea J; Koenderink, Jan J
2011-01-01
In the physical environment familiar size is an effective depth cue because the distance from the eye to an object equals the ratio of its physical size to its angular extent in the visual field. Such simple geometrical relations do not apply to pictorial space, since the eye itself is not in pictorial space, and consequently the notion “distance from the eye” is meaningless. Nevertheless, relative size in the picture plane is often used by visual artists to suggest depth differences. The depth domain has no natural origin, nor a natural unit; thus only ratios of depth differences could have an invariant significance. We investigate whether the pictorial relative size cue yields coherent depth structures in pictorial spaces. Specifically, we measure the depth differences for all pairs of points in a 20-point configuration in pictorial space, and we account for these observations through 19 independent parameters (the depths of the points modulo an arbitrary offset), with no meaningful residuals. We discuss a simple formal framework that allows one to handle individual differences. We also compare the depth scale obtained by way of this method with depth scales obtained in totally different ways, finding generally good agreement. PMID:23145258
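For concreteness, the geometric relation behind the familiar-size cue mentioned above can be written in small-angle form (my own notation, not from the paper):

```latex
\[
  d \;=\; \frac{S}{2\tan(\theta/2)} \;\approx\; \frac{S}{\theta},
\]
% where $S$ is the object's physical size, $\theta$ its angular extent in
% radians, and $d$ the distance from the eye. In pictorial space $d$ is
% undefined, which is why only ratios of depth differences can carry
% invariant significance there.
```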
Object matching using a locally affine invariant and linear programming techniques.
Li, Hongsheng; Huang, Xiaolei; He, Lei
2013-02-01
In this paper, we introduce a new matching method based on a novel locally affine-invariant geometric constraint and linear programming techniques. To model and solve the matching problem in a linear programming formulation, all geometric constraints should be able to be exactly or approximately reformulated into a linear form. This is a major difficulty for this kind of matching algorithm. We propose a novel locally affine-invariant constraint which can be exactly linearized and requires far fewer auxiliary variables than other linear programming-based methods do. The key idea behind it is that each point in the template point set can be exactly represented by an affine combination of its neighboring points, whose weights can be solved easily by least squares. Errors of reconstructing each matched point using such weights are used to penalize the disagreement of geometric relationships between the template points and the matched points. The resulting overall objective function can be solved efficiently by linear programming techniques. Our experimental results on both rigid and nonrigid object matching show the effectiveness of the proposed algorithm.
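A minimal numpy sketch of the core step described above: solving for the affine-combination weights of a point with respect to its neighbors, and scoring a candidate match by the reconstruction error. The soft handling of the sum-to-one constraint is my simplification; the paper's exact formulation may differ.

```python
import numpy as np

def affine_weights(p, neighbors):
    """Solve min ||p - neighbors^T w||^2 subject to sum(w) = 1 (an affine
    combination). p: (2,) template point; neighbors: (k, 2) its neighbors.
    The constraint is enforced softly as a heavily weighted extra equation."""
    k = neighbors.shape[0]
    A = np.vstack([neighbors.T, 1e6 * np.ones((1, k))])   # (3, k)
    b = np.concatenate([p, [1e6]])
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w

def reconstruction_error(matched_p, matched_neighbors, w):
    """Matching penalty: how badly the template-side weights reproduce the
    corresponding matched point from its matched neighbors."""
    return np.linalg.norm(matched_p - matched_neighbors.T @ w)
```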
Moon illusion and spiral aftereffect: illusions due to the loom-zoom system?
Hershenson, M
1982-12-01
The moon illusion and the spiral aftereffect are illusions in which apparent size and apparent distance vary inversely. Because this relationship is exactly opposite to that predicted by the static size-distance invariance hypothesis, the illusions have been called "paradoxical." The illusions may be understood as products of a loom-zoom system, a hypothetical visual subsystem that, in its normal operation, acts according to its structural constraint, the constancy axiom, to produce perceptions that satisfy the constraints of stimulation, the kinetic size-distance invariance hypothesis. When stimulated by its characteristic stimulus of symmetrical expansion or contraction, the loom-zoom system produces the perception of a rigid object moving in depth. If this system is stimulated by a rotating spiral, a negative motion-aftereffect is produced when rotation ceases. If fixation is then shifted to a fixed-sized disc, the aftereffect process alters perceived distance and the loom-zoom system alters perceived size such that the disc appears to expand and approach or to contract and recede, depending on the direction of rotation of the spiral. If the loom-zoom system is stimulated by a moon-terrain configuration, the equidistance tendency produces a foreshortened perceived distance for the moon as an inverse function of elevation and acts in conjunction with the loom-zoom system to produce the increased perceived size of the moon.
Medical Image Tamper Detection Based on Passive Image Authentication.
Ulutas, Guzin; Ustubioglu, Arda; Ustubioglu, Beste; V Nabiyev, Vasif; Ulutas, Mustafa
2017-12-01
Telemedicine has gained popularity in recent years. Medical images can be transferred over the Internet to enable telediagnosis between medical staff and to make the patient's history accessible to medical staff from anywhere. Therefore, integrity protection of medical images is a serious concern due to the broadcast nature of the Internet. Some watermarking techniques have been proposed to control the integrity of medical images. However, they require embedding extra information (a watermark) into the image before transmission, which decreases the visual quality of the medical image and can cause false diagnoses. The proposed method uses a passive image authentication mechanism to detect tampered regions in medical images. Structural texture information is obtained from the medical image using the rotation-invariant local binary pattern (LBPROT) to make the keypoint extraction techniques more successful. Keypoints on the texture image are obtained with the scale invariant feature transform (SIFT). Tampered regions are detected by matching the keypoints. The method improves on keypoint-based passive image authentication mechanisms (which fail to detect tampering when a smooth region is used to cover an object) by applying LBPROT before keypoint extraction, because smooth regions also carry texture information. Experimental results show that the method detects tampered regions in medical images even if the forged image has undergone attacks (Gaussian blurring/additive white Gaussian noise) or the forged regions were scaled/rotated before pasting.
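A short sketch of the preprocessing order described above, assuming grayscale uint8 input; scikit-image's 'ror' (rotation-invariant) LBP encoding stands in for LBPROT, and the parameter values (P, R) are assumptions.

```python
import cv2
import numpy as np
from skimage.feature import local_binary_pattern

def texture_keypoints(gray):
    """Compute a rotation-invariant LBP texture map, then extract SIFT
    keypoints from it, so that smooth regions still yield matchable points."""
    lbp = local_binary_pattern(gray, P=8, R=1, method='ror')
    lbp = cv2.normalize(lbp, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(lbp, None)
    return keypoints, descriptors

# Tamper detection then proceeds by matching descriptors within the image:
# clusters of mutually matching keypoints in disjoint regions flag copy-move.
```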
Reflection symmetry detection using locally affine invariant edge correspondence.
Wang, Zhaozhong; Tang, Zesheng; Zhang, Xiao
2015-04-01
Reflection symmetry detection has received increasing attention in recent years. The state-of-the-art algorithms mainly use the matching of intensity-based features (such as SIFT) within a single image to find symmetry axes. This paper proposes a novel approach by establishing the correspondence of locally affine invariant edge-based features, which are superior to intensity-based features in that they are insensitive to illumination variations and applicable to textureless objects. The locally affine invariance is achieved by simple linear algebra for efficient and robust computations, making the algorithm suitable for detection under object distortions like perspective projection. Commonly used edge detectors and a voting process are, respectively, used before and after the edge description and matching steps to form a complete reflection detection pipeline. Experiments are performed using synthetic and real-world images with both multiple and single reflection symmetry axes. The test results are compared with existing algorithms to validate the proposed method.
Sprague, Briana N; Hyun, Jinshil; Molenaar, Peter C M
2017-01-01
Invariance of intelligence across age is often assumed but infrequently explicitly tested. Horn and McArdle (1992) tested the measurement invariance of intelligence, providing adequate model fit, but their model might not have considered all relevant aspects, such as sub-test differences. The goal of the current paper is to explore age-related invariance of the WAIS-R using an alternative model that allows direct tests of age on WAIS-R subtests. Cross-sectional data on 940 participants aged 16-75 from the WAIS-R normative values were used. Subtests examined were information, comprehension, similarities, vocabulary, picture completion, block design, picture arrangement, and object assembly. The two intelligence factors considered were fluid and crystallized intelligence. Self-reported ages were divided into young (16-22, n = 300), adult (29-39, n = 275), middle (40-60, n = 205), and older (61-75, n = 160) adult groups. Results suggested partial metric invariance holds. Although most of the subtests reflected fluid and crystallized intelligence similarly across different ages, invariance did not hold for block design on fluid intelligence or for picture arrangement on crystallized intelligence for older adults. Additionally, there was evidence of a correlated residual between information and vocabulary for the young adults only. This partial metric invariance model yielded acceptable model fit compared to the previously proposed invariance models of Horn and McArdle (1992). Almost complete metric invariance holds for a two-factor model of intelligence. Most of the subtests were invariant across age groups, suggesting little evidence for age-related bias in the WAIS-R. However, we did find unique relationships between two subtests and intelligence. Future studies should examine age-related differences in subtests when testing measurement invariance in intelligence.
Borzello, Mia; Freiwald, Winrich A.; Tsao, Doris
2015-01-01
Faces are a behaviorally important class of visual stimuli for primates. Recent work in macaque monkeys has identified six discrete face areas where most neurons have higher firing rates to images of faces compared with other objects (Tsao et al., 2006). While neurons in these areas appear to have different tuning (Freiwald and Tsao, 2010; Issa and DiCarlo, 2012), exactly what types of information and, consequently, which visual behaviors neural populations within each face area can support, is unknown. Here we use population decoding to better characterize three of these face patches (ML/MF, AL, and AM). We show that neural activity in all patches contains information that discriminates between the broad categories of face and nonface objects, individual faces, and nonface stimuli. Information is present in both high and lower firing rate regimes. However, there were significant differences between the patches, with the most anterior patch showing relatively weaker representation of nonface stimuli. Additionally, we find that pose-invariant face identity information increases as one moves to more anterior patches, while information about the orientation of the head decreases. Finally, we show that all the information we can extract from the population is present in patterns of activity across neurons, and there is relatively little information in the total activity of the population. These findings give new insight into the representations constructed by the face patch system and how they are successively transformed. PMID:25948258
Identification of simple objects in image sequences
NASA Astrophysics Data System (ADS)
Geiselmann, Christoph; Hahn, Michael
1994-08-01
We present an investigation into the identification and location of simple objects in color image sequences. As an example, the identification of traffic signs is discussed. Three aspects are of special interest. First, regions have to be detected which may contain the object. The separation of those regions from the background can be based on color, motion, and contours; in the experiments, all three possibilities are investigated. The second aspect focuses on the extraction of suitable features for the identification of the objects. For that purpose the border line of the region of interest is used. For planar objects, a sufficient approximation of perspective projection is affine mapping. Consequently, it is natural to extract affine-invariant features from the border line. The investigation includes invariant features based on Fourier descriptors and moments. Finally, the object is identified by maximum likelihood classification. In the experiments, all three basic object types are correctly identified. The probabilities of misclassification have been found to be below 1%.
Toying with the moon illusion.
Lockhead, G R; Wolbarsht, M L
1991-08-20
We propose that the correct interpretation of the moon illusion is that the zenith moon appears small, not that the horizon moon appears large. This illusion is caused by the visual gap between the observer and the overhead moon. Because of the gap, the observer has no or little optical information about the distance of the moon. This results in empty field myopia where the moon is neurally, although not necessarily cognitively, processed as being at about arm's length. When the moon is seen on the horizon, there usually is optical information about distance. That results in reduced accommodation, and so the moon is processed as at a greater distance. Consistent with the size-distance-invariance hypothesis, the moon is then judged as large. This is a specific example of the more general fact that all distant objects appear small in the absence of a stimulus for accommodation to be distant. This outcome produces the toy illusion.
Linke, Annika; Roach-Fox, Elizabeth; Vriezen, Ellen; Prasad, Asuri Narayan; Cusack, Rhodri
2018-06-02
Mirror writing is often produced by healthy children during early acquisition of literacy, and has been observed in adults following neurological disorders or insults. The neural mechanisms responsible for involuntary mirror writing remain debated, but in healthy children, it is typically attributed to the delayed development of a process of overcoming mirror invariance while learning to read and write. We present an unusual case of sudden-onset, persistent mirror writing in a previously typical seven-year-old girl. Using her dominant right hand only, she copied and spontaneously produced all letters, words and sentences, as well as some numbers and objects, in mirror image. Additionally, she frequently misidentified letter orientations in perceptual assessments. Clinical, neuropsychological, and functional neuroimaging studies were carried out over sixteen months. Neurologic and ophthalmologic examinations and a standard clinical MRI scan of the head were normal. Neuropsychological testing revealed average scores on most tests of intellectual function, language function, verbal learning and memory. Visual perception and visual reasoning were average, with the exception of below average form constancy, and mild difficulties on some visual memory tests. Activation and functional connectivity of the reading and writing network was assessed with fMRI. During a reading task, the VWFA showed a strong response to words in mirror but not in normal letter orientation - similar to what has been observed in typically developing children previously - but activation was atypically reduced in right primary visual cortex and Exner's Area. Resting-state connectivity within the reading and writing network was similar to that of age-matched controls, but hemispheric asymmetry between the balance of motor-to-visual input was found for Exner's Area. In summary, this unusual case suggests that a disruption to visual-motor integration rather than to the VWFA can contribute to sudden-onset, persistent mirror writing in the absence of clinically detectable neurological insult. Copyright © 2018. Published by Elsevier Ltd.
Universal brain systems for recognizing word shapes and handwriting gestures during reading
Nakamura, Kimihiro; Kuo, Wen-Jui; Pegado, Felipe; Cohen, Laurent; Tzeng, Ovid J. L.; Dehaene, Stanislas
2012-01-01
Do the neural circuits for reading vary across culture? Reading of visually complex writing systems such as Chinese has been proposed to rely on areas outside the classical left-hemisphere network for alphabetic reading. Here, however, we show that, once potential confounds in cross-cultural comparisons are controlled for by presenting handwritten stimuli to both Chinese and French readers, the underlying network for visual word recognition may be more universal than previously suspected. Using functional magnetic resonance imaging in a semantic task with words written in cursive font, we demonstrate that two universal circuits, a shape recognition system (reading by eye) and a gesture recognition system (reading by hand), are similarly activated and show identical patterns of activation and repetition priming in the two language groups. These activations cover most of the brain regions previously associated with culture-specific tuning. Our results point to an extended reading network that invariably comprises the occipitotemporal visual word-form system, which is sensitive to well-formed static letter strings, and a distinct left premotor region, Exner’s area, which is sensitive to the forward or backward direction with which cursive letters are dynamically presented. These findings suggest that cultural effects in reading merely modulate a fixed set of invariant macroscopic brain circuits, depending on surface features of orthographies. PMID:23184998
Explicit Encoding of Multimodal Percepts by Single Neurons in the Human Brain
Quiroga, Rodrigo Quian; Kraskov, Alexander; Koch, Christof; Fried, Itzhak
2010-01-01
Summary Different pictures of Marilyn Monroe can evoke the same percept, even if greatly modified as in Andy Warhol’s famous portraits. But how does the brain recognize highly variable pictures as the same percept? Various studies have provided insights into how visual information is processed along the “ventral pathway,” via both single-cell recordings in monkeys [1, 2] and functional imaging in humans [3, 4]. Interestingly, in humans, the same “concept” of Marilyn Monroe can be evoked with other stimulus modalities, for instance by hearing or reading her name. Brain imaging studies have identified cortical areas selective to voices [5, 6] and visual word forms [7, 8]. However, how visual, text, and sound information can elicit a unique percept is still largely unknown. By using presentations of pictures and of spoken and written names, we show that (1) single neurons in the human medial temporal lobe (MTL) respond selectively to representations of the same individual across different sensory modalities; (2) the degree of multimodal invariance increases along the hierarchical structure within the MTL; and (3) such neuronal representations can be generated within less than a day or two. These results demonstrate that single neurons can encode percepts in an explicit, selective, and invariant manner, even if evoked by different sensory modalities. PMID:19631538
A secure online image trading system for untrusted cloud environments.
Munadi, Khairul; Arnia, Fitri; Syaryadhi, Mohd; Fujiyoshi, Masaaki; Kiya, Hitoshi
2015-01-01
In conventional image trading systems, images are usually stored unprotected on a server, rendering them vulnerable to untrusted server providers and malicious intruders. This paper proposes a conceptual image trading framework that enables secure storage and retrieval over Internet services. The process involves three parties: an image publisher, a server provider, and an image buyer. The aim is to facilitate secure storage and retrieval of original images for commercial transactions, while preventing untrusted server providers and unauthorized users from gaining access to true contents. The framework exploits the Discrete Cosine Transform (DCT) coefficients and the moment invariants of images. Original images are visually protected in the DCT domain, and stored on a repository server. Small representations of the original images, called thumbnails, are generated and made publicly accessible for browsing. When a buyer is interested in a thumbnail, he/she sends a query to retrieve the visually protected image. The thumbnails and protected images are matched using the DC component of the DCT coefficients and the moment invariant feature. After the matching process, the server returns the corresponding protected image to the buyer. However, the image remains visually protected unless a key is granted. Our target application is the online market, where publishers sell their stock images over the Internet using public cloud servers.
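A hedged sketch of the thumbnail matching step: the DC term of the DCT and Hu's first moment invariant stand in for the paper's "DC component" and "moment invariant feature", and the equal weighting of the two cues is my assumption.

```python
import cv2
import numpy as np
from scipy.fft import dctn

def match_score(thumb, protected):
    """Lower is better. Both inputs are assumed single-channel uint8 images
    of equal size; the DC coefficient is proportional to the image mean."""
    dc_t = dctn(np.float64(thumb), norm='ortho')[0, 0]
    dc_p = dctn(np.float64(protected), norm='ortho')[0, 0]
    hu_t = cv2.HuMoments(cv2.moments(thumb)).ravel()[0]
    hu_p = cv2.HuMoments(cv2.moments(protected)).ravel()[0]
    return abs(dc_t - dc_p) + abs(hu_t - hu_p)   # naive equal weighting
```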
The Structure of Cognitive Abilities in Youths with Manic Symptoms: A Factorial Invariance Study
ERIC Educational Resources Information Center
Beaujean, A. Alexander; Freeman, Megan Joseph; Youngstrom, Eric; Carlson, Gabrielle
2012-01-01
This study compared the structure of cognitive ability (specifically, verbal/crystallized ["Gc"] and visual-spatial ability ["Gv"]), as measured in the Wechsler Intelligence Scale for Children, in youth with manic symptoms with a nationally representative group of similarly aged youth. Multigroup confirmatory factor analysis…
Measurement Invariance of the Reynolds Depression Adolescent Scale across Gender and Age
ERIC Educational Resources Information Center
Fonseca-Pedrero, Eduardo; Wells, Craig; Paino, Mercedes; Lemos-Giraldez, Serafin; Villazon-Garcia, Ursula; Sierra, Susana; Garcia-Portilla Gonzalez, Ma Paz; Bobes, Julio; Muniz, Jose
2010-01-01
The main objective of the present study was to examine measurement invariance of the Reynolds Depression Adolescent Scale (RADS) (Reynolds, 1987) across gender and age in a representative sample of nonclinical adolescents. The sample was composed of 1,659 participants, 801 males (48.3%), with a mean age of 15.9 years (SD = 1.2). Confirmatory…
The time course of shape discrimination in the human brain.
Ales, Justin M; Appelbaum, L Gregory; Cottereau, Benoit R; Norcia, Anthony M
2013-02-15
The lateral occipital cortex (LOC) activates selectively to images of intact objects versus scrambled controls, is selective for the figure-ground relationship of a scene, and exhibits at least some degree of invariance for size and position. Because of these attributes, it is considered to be a crucial part of the object recognition pathway. Here we show that human LOC is critically involved in perceptual decisions about object shape. High-density EEG was recorded while subjects performed a threshold-level shape discrimination task on texture-defined figures segmented by either phase or orientation cues. The appearance or disappearance of a figure region from a uniform background generated robust visual evoked potentials throughout retinotopic cortex as determined by inverse modeling of the scalp voltage distribution. Contrasting responses from trials containing shape changes that were correctly detected (hits) with trials in which no change occurred (correct rejects) revealed stimulus-locked, target-selective activity in the occipital visual areas LOC and V4 preceding the subject's response. Activity that was locked to the subjects' reaction time was present in the LOC. Response-locked activity in the LOC was determined to be related to shape discrimination for several reasons: shape-selective responses were silenced when subjects viewed identical stimuli but their attention was directed away from the shapes to a demanding letter discrimination task; shape-selectivity was present across four different stimulus configurations used to define the figure; LOC responses correlated with participants' reaction times. These results indicate that decision-related activity is present in the LOC when subjects are engaged in threshold-level shape discriminations. Copyright © 2012 Elsevier Inc. All rights reserved.
An, M; Kusurkar, R A; Li, L; Xiao, Y; Zheng, C; Hu, J; Chen, M
2017-07-11
The Strength of Motivation for Medical School-Revised (SMMS-R) questionnaire measures students' motivation for studying medicine. It includes three subscales: 'willingness to sacrifice', 'readiness to start', and 'persistence'. Measurement invariance is a prerequisite for group comparisons. The objectives of this study were to verify the factorial structure of the SMMS-R questionnaire and to investigate its measurement invariance. A total of 989 medical students were approached; 930 cases were kept for data analysis. The factorial structure and measurement invariance of the SMMS-R were tested using single and multiple group confirmatory factor analyses with Mplus. Traditional Cronbach's α, along with McDonald's ω and the glb, was used to measure internal consistency for each subscale. Internal consistency for the subscales and the full instrument was within the acceptable range. A 3-factor structure of the Chinese version of the SMMS-R was supported. Full configural, metric and partial scalar invariance were obtained. The SMMS-R showed measurement invariance across gender and two independent samples, so it can be used for group and cross-cultural comparisons.
Multiple degree of freedom object recognition using optical relational graph decision nets
NASA Technical Reports Server (NTRS)
Casasent, David P.; Lee, Andrew J.
1988-01-01
Multiple-degree-of-freedom object recognition concerns objects with no stable rest position with all scale, rotation, and aspect distortions possible. It is assumed that the objects are in a fairly benign background, so that feature extractors are usable. In-plane distortion invariance is provided by use of a polar-log coordinate transform feature space, and out-of-plane distortion invariance is provided by linear discriminant function design. Relational graph decision nets are considered for multiple-degree-of-freedom pattern recognition. The design of Fisher (1936) linear discriminant functions and synthetic discriminant function for use at the nodes of binary and multidecision nets is discussed. Case studies are detailed for two-class and multiclass problems. Simulation results demonstrate the robustness of the processors to quantization of the filter coefficients and to noise.
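The in-plane invariance mechanism mentioned above, a polar-log feature space, can be sketched with OpenCV's log-polar warp; the output size and interpolation mode are arbitrary choices of mine.

```python
import cv2

def log_polar(img, size=(128, 128)):
    """Map the image so that in-plane rotation becomes a circular shift
    along one axis and a scale change becomes a shift along the other."""
    center = (img.shape[1] / 2.0, img.shape[0] / 2.0)
    return cv2.warpPolar(img, size, center, min(center),
                         cv2.INTER_LINEAR | cv2.WARP_POLAR_LOG)

# A shift-invariant correlator (or a discriminant function trained on
# shifted versions) applied in this space then absorbs both distortions.
```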
NASA Astrophysics Data System (ADS)
Tehsin, Sara; Rehman, Saad; Riaz, Farhan; Saeed, Omer; Hassan, Ali; Khan, Muazzam; Alam, Muhammad S.
2017-05-01
A fully invariant system helps resolve difficulties in object detection when the camera or object orientation and position are unknown. In this paper, the proposed correlation filter based mechanism provides the capability to suppress noise, clutter and occlusion. The Minimum Average Correlation Energy (MACE) filter yields sharp correlation peaks while constraining the correlation peak value. A Difference of Gaussian (DOG) wavelet has been added at the preprocessing stage of the proposed filter design, facilitating target detection in orientation-variant cluttered environments. A logarithmic transformation is combined with the DOG composite minimum average correlation energy filter (WMACE), capable of producing sharp correlation peaks despite geometric distortion of the target object. The proposed filter shows improved performance over several other variant correlation filters, as discussed in the results section.
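For reference, a bare frequency-domain MACE filter (the standard closed form h = D^-1 X (X^H D^-1 X)^-1 c) is sketched below; the paper's DOG-wavelet preprocessing and logarithmic transform are omitted, and registered, equal-size training images are assumed.

```python
import numpy as np

def mace_filter(train_imgs):
    """Closed-form MACE: unit correlation peak at the origin for each
    training image, minimum average correlation energy elsewhere."""
    X = np.stack([np.fft.fft2(im).ravel() for im in train_imgs], axis=1)  # (d, N)
    D = np.mean(np.abs(X) ** 2, axis=1) + 1e-12   # average power spectrum (diagonal of D)
    Xd = X / D[:, None]                           # D^-1 X
    G = X.conj().T @ Xd                           # X^H D^-1 X, shape (N, N)
    c = np.ones(X.shape[1])                       # unit peak constraints
    h = Xd @ np.linalg.solve(G, c)                # filter in the frequency domain
    return h.reshape(train_imgs[0].shape)

def correlation_plane(img, H):
    """Correlation of a test image with the MACE filter H."""
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.conj(H)))
```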
[Invariants of the anthropometrical proportions].
Smolianinov, V V
2012-01-01
In this work, a general interpretation of a modulor as scales of segment proportions of anthropometrical modules (the extremities and the body) is given. The objectives of this study were: 1) to substantiate the idea of the growth modulor; 2) to prove, using modern empirical data, the validity of a principle of linear similarity for anthropometrical segments; and 3) to specify the system of invariants for constitutional anthropometrics.
NASA Astrophysics Data System (ADS)
Mansourian, Leila; Taufik Abdullah, Muhamad; Nurliyana Abdullah, Lili; Azman, Azreen; Mustaffa, Mas Rina
2017-02-01
The Pyramid Histogram of Words (PHOW) combines Bag of Visual Words (BoVW) with spatial pyramid matching (SPM) in order to add location information to the extracted features. However, the different PHOW variants are extracted from various color spaces and do not extract color information individually; that means they discard color information, which is an important characteristic of any image and is motivated by human vision. This article concatenates a PHOW Multi-Scale Dense Scale Invariant Feature Transform (MSDSIFT) histogram with a proposed color histogram to improve the performance of existing image classification algorithms. Performance evaluation on several datasets shows that the new approach outperforms existing, state-of-the-art methods.
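The proposed concatenation can be sketched as below, with a joint BGR color histogram standing in for the paper's proposed color histogram and a precomputed PHOW/BoVW histogram assumed given; the bin counts are arbitrary.

```python
import cv2
import numpy as np

def color_histogram(bgr, bins=8):
    """Joint 3-D color histogram over the BGR channels, L1-normalized."""
    hist = cv2.calcHist([bgr], [0, 1, 2], None, [bins] * 3,
                        [0, 256] * 3).ravel()
    return hist / max(hist.sum(), 1)

def combined_feature(bgr, phow_histogram):
    """Concatenate a precomputed PHOW/BoVW histogram with explicit color
    information so that color is no longer discarded."""
    return np.concatenate([phow_histogram, color_histogram(bgr)])
```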
NASA Astrophysics Data System (ADS)
Duong, Tuan A.; Duong, Nghi; Le, Duong
2017-01-01
In this paper, we present an integration technique using a bio-inspired, control-based visual and olfactory receptor system to search for elusive targets in practical environments where the targets cannot be seen clearly in either sensory data stream. The bio-inspired visual system is based on a model of the extended visual pathway, which consists of saccadic eye movements and the visual pathway (vertebrate retina, lateral geniculate nucleus and visual cortex), to enable powerful target detection from noisy, partial, incomplete visual data. The olfactory receptor algorithm, namely spatially invariant independent component analysis, developed from olfactory receptor-electronic nose (enose) data from Caltech, is adopted to enable odorant target detection in an unknown environment. The integration of the two systems is a vital approach and sets up a cornerstone for effective, low-cost miniaturized UAVs or fly robots for future DOD and NASA missions, as well as for security systems in Internet of Things environments.
1983-10-01
[Abstract garbled by extraction; numeric table residue omitted. Recoverable fragments: "…the difference between the chi-squares for these two models (pattern invariance vs. loading invariance) was computed to be 63.83 with 30 degrees of freedom…"; "…paragraphs in order to master or pass the objective or to receive a 'GO.' Typical criterion-referenced scores are number of objectives passed, GO vs. NO-GO."]
Rotation invariants of vector fields from orthogonal moments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Bo; Kostková, Jitka; Flusser, Jan
2017-09-11
Vector field images are a type of new multidimensional data that appear in many engineering areas. Although the vector fields can be visualized as images, they differ from graylevel and color images in several aspects. In order to analyze them, special methods and algorithms must be originally developed or substantially adapted from the traditional image processing area. Here, we propose a method for the description and matching of vector field patterns under an unknown rotation of the field. Rotation of a vector field is so-called total rotation, where the action is applied not only on the spatial coordinates but also on the field values. Invariants of vector fields with respect to total rotation constructed from orthogonal Gaussian–Hermite moments and Zernike moments are introduced. Their numerical stability is shown to be better than that of the invariants published so far. We demonstrate their usefulness in a real world template matching application of rotated vector fields.
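To make "total rotation" concrete, the sketch below rotates both the sampling grid and the vector values of a 2-D field (nearest-neighbor resampling for brevity; image row/column conventions are glossed over). Any invariant of the kind introduced above should score field and total_rotation(field, theta) identically.

```python
import numpy as np

def total_rotation(field, theta):
    """field: (H, W, 2) array of 2-D vectors; returns the totally rotated
    field, i.e. the rotation acts on coordinates AND on vector values."""
    H, W, _ = field.shape
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    out = np.zeros_like(field)
    cy, cx = (H - 1) / 2, (W - 1) / 2
    for y in range(H):
        for x in range(W):
            # inverse-rotate the target coordinate to find the source sample
            sx, sy = R.T @ np.array([x - cx, y - cy])
            i, j = int(round(sy + cy)), int(round(sx + cx))
            if 0 <= i < H and 0 <= j < W:
                out[y, x] = R @ field[i, j]   # rotate the vector value too
    return out
```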
Neural networks for data compression and invariant image recognition
NASA Technical Reports Server (NTRS)
Gardner, Sheldon
1989-01-01
An approach to invariant image recognition (I2R), based upon a model of biological vision in the mammalian visual system (MVS), is described. The complete I2R model incorporates several biologically inspired features: exponential mapping of retinal images, Gabor spatial filtering, and a neural network associative memory. In the I2R model, exponentially mapped retinal images are filtered by a hierarchical set of Gabor spatial filters (GSF) which provide compression of the information contained within a pixel-based image. A neural network associative memory (AM) is used to process the GSF coded images. We describe a 1-D shape function method for coding of scale and rotationally invariant shape information. This method reduces image shape information to a periodic waveform suitable for coding as an input vector to a neural network AM. The shape function method is suitable for near term applications on conventional computing architectures equipped with VLSI FFT chips to provide a rapid image search capability.
A novel method for unsteady flow field segmentation based on stochastic similarity of direction
NASA Astrophysics Data System (ADS)
Omata, Noriyasu; Shirayama, Susumu
2018-04-01
Recent developments in fluid dynamics research have opened up the possibility for the detailed quantitative understanding of unsteady flow fields. However, the visualization techniques currently in use generally provide only qualitative insights. A method for dividing the flow field into physically relevant regions of interest can help researchers quantify unsteady fluid behaviors. Most methods at present compare the trajectories of virtual Lagrangian particles. The time-invariant features of an unsteady flow are also frequently of interest, but the Lagrangian specification only reveals time-variant features. To address these challenges, we propose a novel method for the time-invariant spatial segmentation of an unsteady flow field. This segmentation method does not require Lagrangian particle tracking but instead quantitatively compares the stochastic models of the direction of the flow at each observed point. The proposed method is validated with several clustering tests for 3D flows past a sphere. Results show that the proposed method reveals the time-invariant, physically relevant structures of an unsteady flow.
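A minimal sketch of the idea just described: characterize each grid point by the distribution of flow direction over time, then cluster those distributions. Histogram binning and k-means (with an assumed k) stand in for the paper's stochastic models and clustering procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

def segment_by_direction(u, v, bins=16, k=4):
    """u, v: (T, H, W) velocity components of an unsteady 2-D flow.
    Returns an (H, W) label map of time-invariant regions."""
    T, H, W = u.shape
    angles = np.arctan2(v, u).reshape(T, H * W)     # direction in [-pi, pi]
    edges = np.linspace(-np.pi, np.pi, bins + 1)
    feats = np.stack([np.histogram(angles[:, p], bins=edges)[0] / T
                      for p in range(H * W)])       # per-point direction pdf
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(feats)
    return labels.reshape(H, W)
```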
Equicontrollability and the model following problem
NASA Technical Reports Server (NTRS)
Curran, R. T.
1971-01-01
Equicontrollability and its application to the linear time-invariant model-following problem are discussed. The problem is presented in the form of two systems, the plant and the model. The requirement is to find a controller to apply to the plant so that the resultant compensated plant behaves, in an input-output sense, the same as the model. All systems are assumed to be linear and time-invariant. The basic approach is to find suitable equicontrollable realizations of the plant and model and to utilize feedback so as to produce a controller of minimal state dimension. The concept of equicontrollability is a generalization of control canonical (phase variable) form applied to multivariable systems. It allows one to visualize clearly the effects of feedback and to pinpoint the parameters of a multivariable system which are invariant under feedback. The basic contributions are the development of equicontrollable form; solution of the model-following problem in an entirely algorithmic way, suitable for computer programming; and resolution of questions on system decoupling.
Recognition of Amodal Language Identity Emerges in Infancy
ERIC Educational Resources Information Center
Lewkowicz, David J.; Pons, Ferran
2013-01-01
Audiovisual speech consists of overlapping and invariant patterns of dynamic acoustic and optic articulatory information. Research has shown that infants can perceive a variety of basic auditory-visual (A-V) relations but no studies have investigated whether and when infants begin to perceive higher order A-V relations inherent in speech. Here, we…
NASA Astrophysics Data System (ADS)
Fortunati, Alessandro; Wiggins, Stephen
Starting from the concept of invariant KAM tori for nearly-integrable Hamiltonian systems with periodic or quasi-periodic nonautonomous perturbation, the paper analyzes the “analogue” of this class of invariant objects when the dependence on time is aperiodic. The investigation is carried out in a model motivated by the problem of a traveling wave in a channel over a smooth, quasi- and asymptotically flat bathymetry (whence the “transient” feature), representing a case in which the described structures play the role of barriers to fluid transport in phase space. The paper provides computational evidence for the existence of transient structures also for “large” values of the perturbation size, as a complement to the rigorous results already proven by the first author for real-analytic bathymetry functions.
Hansen, Eva; Grimme, Britta; Reimann, Hendrik; Schöner, Gregor
2018-05-01
In a sequence of arm movements, any given segment could be influenced by its predecessors (carry-over coarticulation) and by its successor (anticipatory coarticulation). To study the interdependence of movement segments, we asked participants to move an object from an initial position to a first and then on to a second target location. The task involved ten joint angles controlling the three-dimensional spatial path of the object and hand. We applied the principle of the uncontrolled manifold (UCM) to analyze the difference between joint trajectories that either affect (non-motor equivalent) or do not affect (motor equivalent) the hand's trajectory in space. We found evidence for anticipatory coarticulation that was distributed equally in the two directions in joint space. We also found strong carry-over coarticulation, which showed clear structure in joint space: More of the difference between joint configurations observed for different preceding movements lies in directions in joint space that leaves the hand's path in space invariant than in orthogonal directions in joint space that varies the hand's path in space. We argue that the findings are consistent with anticipatory coarticulation reflecting processes of movement planning that lie at the level of the hand's trajectory in space. Carry-over coarticulation may reflect primarily processes of motor control that are governed by the principle of the UCM, according to which changes that do not affect the hand's trajectory in space are not actively delimited. Two follow-up experiments zoomed in on anticipatory coarticulation. These experiments strengthened evidence for anticipatory coarticulation. Anticipatory coarticulation was motor-equivalent when visual information supported the steering of the object to its first target, but was not motor equivalent when that information was removed. The experiments showed that visual updating of the hand's path in space when the object approaches the first target only affected the component of the joint difference vector orthogonal to the UCM, consistent with the UCM principle.
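The UCM decomposition used in the analysis above can be sketched as a null-space projection of a joint-configuration difference through the hand Jacobian (numpy; a 3-D hand position and an n-joint arm are assumed, and the tolerance is arbitrary).

```python
import numpy as np

def ucm_split(jacobian, d_theta):
    """jacobian: (3, n) matrix d(hand position)/d(joint angles) at the
    reference posture; d_theta: (n,) difference between two joint
    configurations. Returns the motor-equivalent component (within the
    UCM, leaving the hand unchanged) and the orthogonal component."""
    _, s, Vt = np.linalg.svd(jacobian)
    rank = int(np.sum(s > 1e-10))
    null_basis = Vt[rank:].T                 # (n, n - rank) null-space basis
    within_ucm = null_basis @ (null_basis.T @ d_theta)   # motor equivalent
    orthogonal = d_theta - within_ucm                    # changes hand path
    return within_ucm, orthogonal
```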
Canonical gravity, diffeomorphisms and objective histories
NASA Astrophysics Data System (ADS)
Samuel, Joseph
2000-11-01
This paper discusses the implementation of diffeomorphism invariance in purely Hamiltonian formulations of general relativity. We observe that, if a constrained Hamiltonian formulation derives from a manifestly covariant Lagrangian, the diffeomorphism invariance of the Lagrangian results in the following properties of the constrained Hamiltonian theory: the diffeomorphisms are generated by constraints on the phase space so that: (a) the algebra of the generators reflects the algebra of the diffeomorphism group; (b) the Poisson brackets of the basic fields with the generators reflects the spacetime transformation properties of these basic fields. This suggests that in a purely Hamiltonian approach the requirement of diffeomorphism invariance should be interpreted to include (b) and not just (a) as one might naively suppose. Giving up (b) amounts to giving up objective histories, even at the classical level. This observation has implications for loop quantum gravity which are spelled out in a companion paper. We also describe an analogy between canonical gravity and relativistic particle dynamics to illustrate our main point.
A Novel Image Retrieval Based on Visual Words Integration of SIFT and SURF
Ali, Nouman; Bajwa, Khalid Bashir; Sablatnig, Robert; Chatzichristofis, Savvas A.; Iqbal, Zeshan; Rashid, Muhammad; Habib, Hafiz Adnan
2016-01-01
With the recent evolution of technology, the number of image archives has increased exponentially. In Content-Based Image Retrieval (CBIR), high-level visual information is represented in the form of low-level features. The semantic gap between the low-level features and the high-level image concepts is an open research problem. In this paper, we present a novel visual words integration of Scale Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF). The two local features representations are selected for image retrieval because SIFT is more robust to the change in scale and rotation, while SURF is robust to changes in illumination. The visual words integration of SIFT and SURF adds the robustness of both features to image retrieval. The qualitative and quantitative comparisons conducted on Corel-1000, Corel-1500, Corel-2000, Oliva and Torralba and Ground Truth image benchmarks demonstrate the effectiveness of the proposed visual words integration. PMID:27315101
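A hedged sketch of the integration logic above: two separate vocabularies, one histogram per detector, concatenated. SURF requires an opencv-contrib build with the nonfree module enabled (cv2.xfeatures2d.SURF_create), so ORB stands in for the second detector in this sketch; the integration step itself is unchanged.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def build_vocab(all_descriptors, k=200):
    """Fit a visual vocabulary offline on stacked training descriptors."""
    return KMeans(n_clusters=k, n_init=10).fit(np.float32(all_descriptors))

def bovw_histogram(descriptors, vocab):
    """Quantize descriptors against a fitted vocabulary, L1-normalize."""
    words = vocab.predict(np.float32(descriptors))
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / max(hist.sum(), 1)

def integrated_feature(gray, vocab_a, vocab_b):
    """Concatenated visual-words representation from two detectors
    (assumes the image yields keypoints for both)."""
    _, d_a = cv2.SIFT_create().detectAndCompute(gray, None)
    _, d_b = cv2.ORB_create().detectAndCompute(gray, None)
    return np.concatenate([bovw_histogram(d_a, vocab_a),
                           bovw_histogram(d_b, vocab_b)])
```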
The economics of motion perception and invariants of visual sensitivity.
Gepshtein, Sergei; Tyukin, Ivan; Kubovy, Michael
2007-06-21
Neural systems face the challenge of optimizing their performance with limited resources, just as economic systems do. Here, we use tools of neoclassical economic theory to explore how a frugal visual system should use a limited number of neurons to optimize perception of motion. The theory prescribes that vision should allocate its resources to different conditions of stimulation according to the degree of balance between measurement uncertainties and stimulus uncertainties. We find that human vision approximately follows the optimal prescription. The equilibrium theory explains why human visual sensitivity is distributed the way it is and why qualitatively different regimes of apparent motion are observed at different speeds. The theory offers a new normative framework for understanding the mechanisms of visual sensitivity at the threshold of visibility and above the threshold and predicts large-scale changes in visual sensitivity in response to changes in the statistics of stimulation and system goals.
NASA Astrophysics Data System (ADS)
Wan, Qianwen; Panetta, Karen; Agaian, Sos
2017-05-01
Autonomous facial recognition systems are widely used in real-life applications, such as homeland border security, law enforcement identification and authentication, and video-based surveillance analysis. Issues like low image quality and non-uniform illumination, as well as variations in pose and facial expression, can impair the performance of recognition systems. To address the non-uniform illumination challenge, we present a novel robust autonomous facial recognition system inspired by the human visual system and based on the so-called logarithmical image visualization technique. In this paper, the proposed method, for the first time, utilizes the logarithmical image visualization technique coupled with the local binary pattern to perform discriminative feature extraction for facial recognition. The Yale database, the Yale-B database and the ATT database are used for computer-simulation accuracy and efficiency testing. The extensive computer simulations demonstrate the method's efficiency, accuracy, and robustness to illumination variation.
Hierarchical acquisition of visual specificity in spatial contextual cueing.
Lie, Kin-Pou
2015-01-01
Spatial contextual cueing refers to visual search performance's being improved when invariant associations between target locations and distractor spatial configurations are learned incidentally. Using the instance theory of automatization and the reverse hierarchy theory of visual perceptual learning, this study explores the acquisition of visual specificity in spatial contextual cueing. Two experiments in which detailed visual features were irrelevant for distinguishing between spatial contexts found that spatial contextual cueing was visually generic in difficult trials when the trials were not preceded by easy trials (Experiment 1) but that spatial contextual cueing progressed to visual specificity when difficult trials were preceded by easy trials (Experiment 2). These findings support reverse hierarchy theory, which predicts that even when detailed visual features are irrelevant for distinguishing between spatial contexts, spatial contextual cueing can progress to visual specificity if the stimuli remain constant, the task is difficult, and difficult trials are preceded by easy trials. However, these findings are inconsistent with instance theory, which predicts that when detailed visual features are irrelevant for distinguishing between spatial contexts, spatial contextual cueing will not progress to visual specificity. This study concludes that the acquisition of visual specificity in spatial contextual cueing is more plausibly hierarchical, rather than instance-based.
NASA Astrophysics Data System (ADS)
Gulyaev, P.; Jordan, V.; Gulyaev, I.; Dolmatov, A.
2017-05-01
The paper presents an analysis of recorded tracks of high-velocity emission in an air-argon plasma flow during the break-up of tungsten microdroplets. This new physical effect of optical emission involves two stages. The first includes thermionic emission of electrons from the surface of a melted tungsten droplet of 100-200 μm size and the formation of a charged sphere 3-5 mm in diameter; once the sphere reaches the breakdown electric potential, it collapses, producing a spherical shock wave and luminous radiation. The second stage includes a previously unknown physical phenomenon: a narrowly directed energy jet, with velocity exceeding 4000 m/s, emitted from the surface of the tungsten droplet. The luminous spherical collapse and the high-velocity jets were recorded using a CMOS photo-array operating in a global-shutter charge-storage mode. Special features of the CMOS array scanning algorithm affect the formation of distinctive signs in the recorded tracks, which stay invariant under a trace transform (TT) with a specific functional. Series of concentric circles were adopted as the primitive object models (patterns) used in the TT at the spherical-collapse stage, and linear segments of fixed thickness at the high-velocity emission stage. Two invariants of the physical object, the motion velocity and the optical brightness distribution in the motion front, were adopted as the identification features of the tracks. Analytical expressions relating the 2D TT parameters to the physical object's motion invariants were obtained. The equations for the spherical-collapse stage correspond to the Radon-Nikodym transform.
Revealing the Mystery of the Galilean Principle of Relativity. Part I: Basic Assertions
NASA Astrophysics Data System (ADS)
Yarman, Tolga
2009-08-01
As Galileo formulated it, an observer embarked on a uniform translational motion, receiving no information from the outside, cannot detect how fast he is moving. Why? To our knowledge, no one has worked out the answer to this question, although the Galilean Principle of Relativity (GPR) constituted a major ingredient of the Special Theory of Relativity (STR). Thus, consider a quantum mechanical object of "clock mass" $M_0$ (which is just a mass), undergoing a "clock motion" such as rotation or vibration, with a total energy $E_0$, in a space of size $\mathcal{R}_0$. Previously we established that if the mass $M_0$ is multiplied by an arbitrary number $\gamma$, then through the relativistic or non-relativistic quantum mechanical description of the object (whichever is appropriate to the case in hand), its size $\mathcal{R}_0$ shrinks as much, and the total energy $E_0$ concomitantly increases as much. This quantum mechanical occurrence yields, at once, the invariance of the quantity $E_0 M_0 \mathcal{R}_0^2$ with regard to the mass change in question, the object being overall at rest; this latter quantity is, on the other hand, as induced by the quantum mechanical framework, necessarily strapped to $h^2$, the square of the Planck constant. But this constant is already, dimension-wise, Lorentz invariant. Thus any quantity bearing the dimension of $h^2$ is Lorentz invariant, too; so then is the quantity $E_0 M_0 \mathcal{R}_0^2$ (no matter how the size of concern lies with respect to the direction of uniform translational motion) that would come into play. Thence the quantum mechanical invariance of $E_0 M_0 \mathcal{R}_0^2$ with regard to an arbitrary mass change comes to be identical to the Lorentz invariance of this quantity, were the object brought into uniform translational motion. It is this prevalence which displays, amazingly, the underlying mechanism securing the end results of the STR, via quantum mechanics. The Lorentz-invariant quantum mechanical architecture $E_0 M_0 \mathcal{R}_0^2 \sim h^2$, more fundamentally, constitutes the answer to the mystery drawn by the GPR. In this article we frame the basic assertions which will be used in a subsequent article to display the quantum mechanical machinery making the GPR, and to draw the bridge between the GPR and the architecture we disclose.
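Restating the invariance argument compactly (a reconstruction in the abstract's notation, not a quotation): under the mass scaling $M_0 \to \gamma M_0$, quantum mechanics gives $\mathcal{R}_0 \to \mathcal{R}_0/\gamma$ and $E_0 \to \gamma E_0$, so

```latex
E_0 M_0 \mathcal{R}_0^2 \;\longrightarrow\;
(\gamma E_0)(\gamma M_0)\left(\frac{\mathcal{R}_0}{\gamma}\right)^{2}
= E_0 M_0 \mathcal{R}_0^2 \;\sim\; h^2 ,
\qquad
[E][M][L]^2 = \mathrm{J\,kg\,m^2} = (\mathrm{J\,s})^2 .
```

The dimension check on the right is what ties the quantity to $h^2$: anything carrying the dimension of (action)$^2$ inherits the Lorentz invariance of the Planck constant.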
Evidence for cross-script abstract identities in learners of Japanese kana.
Schubert, Teresa; Gawthrop, Roderick; Kinoshita, Sachiko
2018-05-07
The presence of abstract letter identity representations in the Roman alphabet has been well documented. These representations are invariant to letter case (upper vs. lower) and visual appearance. For example, "a" and "A" are represented by the same abstract identity. Recent research has begun to consider whether the processing of non-Roman orthographies also involves abstract orthographic representations. In the present study, we sought evidence for abstract identities in Japanese kana, which consist of two scripts, hiragana and katakana. Abstract identities would be invariant to the script used as well as to the degree of visual similarity. We adapted the cross-case masked-priming letter match task used in previous research on Roman letters by presenting cross-script kana pairs and testing adult beginning-to-intermediate Japanese second-language (L2) learners (first-language English readers). We found robust cross-script priming effects, which were equal in magnitude for visually similar (e.g., り/リ) and dissimilar (e.g., あ/ア) kana pairs. This pattern was found despite participants' imperfect explicit knowledge of the kana names, particularly for katakana. We also replicated prior findings on Roman abstract letter identities in the same participants. Ours is the first study reporting abstract kana identity priming (in adult L2 learners). Furthermore, these representations were acquired relatively early in our adult L2 learners.
Object-processing neural efficiency differentiates object from spatial visualizers.
Motes, Michael A; Malach, Rafael; Kozhevnikov, Maria
2008-11-19
The visual system processes object properties and spatial properties in distinct subsystems, and we hypothesized that this distinction might extend to individual differences in visual processing. We conducted a functional MRI study investigating the neural underpinnings of individual differences in object versus spatial visual processing. Nine participants of high object-processing ability ('object' visualizers) and eight participants of high spatial-processing ability ('spatial' visualizers) were scanned, while they performed an object-processing task. Object visualizers showed lower bilateral neural activity in lateral occipital complex and lower right-lateralized neural activity in dorsolateral prefrontal cortex. The data indicate that high object-processing ability is associated with more efficient use of visual-object resources, resulting in less neural activity in the object-processing pathway.
Dynamical encoding of looming, receding, and focussing
NASA Astrophysics Data System (ADS)
Longtin, Andre; Clarke, Stephen Elisha; Maler, Leonard; Center for Neural Dynamics Collaboration
This talk will discuss a non-conventional neural coding task that may apply more broadly to many senses in higher vertebrates. We ask whether and how a non-visual sensory system can focus on an object. We present recent experimental and modeling work that shows how the early sensory circuitry of electric sense can perform such neuronal focusing that is manifested behaviorally. This sense is the main one used by weakly electric fish to navigate, locate prey and communicate in the murky waters of their natural habitat. We show that there is a distance at which the Fisher information of a neuron's response to a looming and receding object is maximized, and that this distance corresponds to a behaviorally relevant one chosen by these animals. Strikingly, this maximum occurs at a bifurcation between tonic firing and bursting. We further discuss how the invariance of this distance to signal attributes can arise, a process that first involves power-law spike frequency adaptation. The talk will also highlight the importance of expanding the classic dual neural encoding of contrast using ON and OFF cells in the context of looming and receding stimuli. The authors acknowledge support from CIHR and NSERC.
ERIC Educational Resources Information Center
Fazl, Arash; Grossberg, Stephen; Mingolla, Ennio
2009-01-01
How does the brain learn to recognize an object from multiple viewpoints while scanning a scene with eye movements? How does the brain avoid the problem of erroneously classifying parts of different objects together? How are attention and eye movements intelligently coordinated to facilitate object learning? A neural model provides a unified…
NASA Astrophysics Data System (ADS)
Peng, Haijun; Wang, Wei
2016-10-01
An adaptive surrogate-model-based multi-objective optimization strategy, combining the benefits of invariant manifolds and low-thrust control, is proposed in this paper for developing a low-computational-cost transfer trajectory between libration orbits around the L1 and L2 libration points in the Sun-Earth system. A new structure for the multi-objective transfer trajectory optimization model is established, which divides the transfer trajectory into several segments and specifies whether invariant manifolds or low-thrust control dominates in each segment. To reduce the computational cost of multi-objective transfer trajectory optimization, an adaptive surrogate model based on a mixed sampling strategy is proposed. Numerical simulations show that the results obtained from the adaptive surrogate-based multi-objective optimization agree with those obtained using direct multi-objective optimization methods, while the computational workload of the adaptive surrogate-based multi-objective optimization is only approximately 10% of that of direct multi-objective optimization. Furthermore, the generating efficiency of the Pareto points of the adaptive surrogate-based multi-objective optimization is approximately 8 times that of the direct multi-objective optimization. The proposed adaptive surrogate-based multi-objective optimization therefore provides clear advantages over direct multi-objective optimization methods.
Nigg, Claudio R; Motl, Robert W; Horwath, Caroline; Dishman, Rod K
2012-01-01
Objectives Physical activity (PA) research applying the Transtheoretical Model (TTM) to examine group differences and/or change over time requires preliminary evidence of factorial validity and invariance. The current study examined the factorial validity and longitudinal invariance of TTM constructs recently revised for PA. Method Participants from an ethnically diverse sample in Hawaii (N=700) completed questionnaires capturing each TTM construct. Results Factorial validity was confirmed for each construct using confirmatory factor analysis with full-information maximum likelihood. Longitudinal invariance was evidenced across a shorter (3-month) and longer (6-month) time period via nested model comparisons. Conclusions The questionnaires for each validated TTM construct are provided, and can now be generalized across similar subgroups and time points. Further validation of the provided measures is suggested in additional populations and across extended time points. PMID:22778669
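Longitudinal invariance via nested model comparisons reduces, at the arithmetic level, to chi-square difference tests between a constrained model and a less constrained one. A minimal sketch of that computation (the fit values below are hypothetical, not taken from the study):

```python
from scipy.stats import chi2

def chi_square_difference(chisq_constrained, df_constrained,
                          chisq_free, df_free):
    """Nested-model comparison: a non-significant chi-square difference
    supports the added invariance constraints (e.g., equal loadings
    across the 3- and 6-month waves)."""
    d_chisq = chisq_constrained - chisq_free
    d_df = df_constrained - df_free
    return d_chisq, d_df, chi2.sf(d_chisq, d_df)

# Hypothetical fit values for a configural vs. metric-invariance model:
d_chisq, d_df, p = chi_square_difference(312.4, 168, 301.9, 160)
print(f"delta chi2({d_df}) = {d_chisq:.1f}, p = {p:.3f}")  # p > .05 -> invariance holds
```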
The Grassmannian origin of dual superconformal invariance
NASA Astrophysics Data System (ADS)
Arkani-Hamed, Nima; Cachazo, Freddy; Cheung, Clifford
2010-03-01
A dual formulation of the S-matrix for $\mathcal{N}=4$ SYM has recently been presented, where all leading singularities of n-particle N$^{k-2}$MHV amplitudes are given as an integral over the Grassmannian $G(k,n)$, with cyclic symmetry, parity and superconformal invariance manifest. In this short note we show that the dual superconformal invariance of this object is also manifest. The geometry naturally suggests a partial integration and a simple change of variable to an integral over $G(k-2,n)$. This change of variable precisely corresponds to the mapping between the usual momentum variables and the "momentum twistors" introduced by Hodges, and yields an elementary derivation of the momentum-twistor space formula very recently presented by Mason and Skinner, which is manifestly dual superconformal invariant. Thus the $G(k,n)$ Grassmannian formulation allows a direct understanding of all the important symmetries of $\mathcal{N}=4$ SYM scattering amplitudes.
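For reference, the Grassmannian object discussed here has the schematic twistor-space form below (a standard rendering, with the denominator the cyclic product of $k\times k$ minors of the matrix $C$; conventions may differ slightly from the paper's):

```latex
\mathcal{L}_{n,k}(\mathcal{W}) \;=\;
\int \frac{d^{\,k\times n}C_{\alpha a}}{\mathrm{vol}\,[GL(k)]}\;
\frac{\prod_{\alpha=1}^{k}\delta^{4|4}\!\Big(\sum_{a=1}^{n} C_{\alpha a}\,\mathcal{W}_a\Big)}
     {(1\,2\cdots k)\,(2\,3\cdots k{+}1)\cdots(n\,1\cdots k{-}1)} .
```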
3D shape representation with spatial probabilistic distribution of intrinsic shape keypoints
NASA Astrophysics Data System (ADS)
Ghorpade, Vijaya K.; Checchin, Paul; Malaterre, Laurent; Trassoudaine, Laurent
2017-12-01
The accelerated advancement in modeling, digitizing, and visualizing techniques for 3D shapes has led to an increasing amount of 3D model creation and usage, thanks to 3D sensors which are readily available and easy to use. As a result, determining the similarity between 3D shapes has become consequential and is a fundamental task in shape-based recognition, retrieval, clustering, and classification. Several decades of research in Content-Based Information Retrieval (CBIR) have resulted in diverse techniques for 2D and 3D shape or object classification/retrieval and many benchmark data sets. In this article, a novel technique for 3D shape representation and object classification is proposed, based on analyses of the spatial, geometric distributions of 3D keypoints. These distributions capture the intrinsic geometric structure of 3D objects. The result of the approach is a probability distribution function (PDF) produced from the spatial disposition of 3D keypoints, keypoints which are stable on the object surface and invariant to pose changes. Each class/instance of an object can be uniquely represented by a PDF. This shape representation is robust yet conceptually simple, easy to implement, and fast to compute. Both the Euclidean and the topological space on the object's surface are considered to build the PDFs. Topology-based geodesic distances between keypoints exploit the non-planar surface properties of the object. The performance of the novel shape signature is tested with object classification accuracy. The classification efficacy of the new shape analysis method is evaluated on a new dataset acquired with a Time-of-Flight camera, and a comparative evaluation against state-of-the-art methods is performed on a standard benchmark dataset. Experimental results demonstrate superior classification performance of the new approach on the RGB-D dataset and depth data.
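A minimal sketch of the Euclidean variant of such a spatial-disposition signature, assuming keypoints have already been detected (function names are illustrative; the paper's geodesic variant would replace the Euclidean metric with surface shortest-path distances):

```python
import numpy as np
from scipy.spatial.distance import pdist

def keypoint_distance_pdf(keypoints, bins=64, r_max=None):
    """Histogram of pairwise Euclidean distances between 3D keypoints,
    normalized to a PDF; pose-invariant because distances are preserved
    under rigid motion."""
    d = pdist(np.asarray(keypoints, dtype=np.float64))  # all pairwise distances
    if r_max is None:
        r_max = d.max()
    hist, edges = np.histogram(d, bins=bins, range=(0.0, r_max))
    return hist / (hist.sum() + 1e-12), edges

# Two shapes can then be compared by any histogram distance, e.g.
# score = 0.5 * np.abs(pdf_a - pdf_b).sum()   # total-variation distance
```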
Filled Radar Charts Should Not Be Used to Compare Social Indicators
ERIC Educational Resources Information Center
Feldman, Roger
2013-01-01
The use of "radar charts" is an increasingly popular way to present spatial data in a visually interesting format. Some authors recommend using "filled radar charts" to compare the performance of observational units. Filled radar charts are not appropriate for such comparisons because the size of the area within the polygon is not invariant to the…
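The non-invariance is easy to demonstrate: with n equally spaced axes, the filled polygon's area is 0.5 * sin(2*pi/n) * sum of products of adjacent radii, which depends on the (arbitrary) order in which indicators are assigned to axes. A small worked example:

```python
import numpy as np

def filled_radar_area(values):
    """Area of the polygon drawn by a filled radar chart with equally
    spaced axes: 0.5 * sin(2*pi/n) * sum of products of adjacent radii."""
    r = np.asarray(values, dtype=float)
    n = len(r)
    return 0.5 * np.sin(2 * np.pi / n) * np.sum(r * np.roll(r, -1))

# Same six indicator values, two different axis orderings:
print(filled_radar_area([1, 5, 1, 5, 1, 5]))  # highs and lows alternating
print(filled_radar_area([1, 1, 1, 5, 5, 5]))  # highs grouped together
# The areas differ (about 13.0 vs. 26.8), so area comparisons depend on axis order.
```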
ERIC Educational Resources Information Center
Quinn, Paul C.; Schyns, Philippe G.; Goldstone, Robert L.
2006-01-01
The relation between perceptual organization and categorization processes in 3- and 4-month-olds was explored. The question was whether an invariant part abstracted during category learning could interfere with Gestalt organizational processes. A 2003 study by Quinn and Schyns had reported that an initial category familiarization experience in…
Chungkham, Holendro Singh; Ingre, Michael; Karasek, Robert; Westerlund, Hugo; Theorell, Töres
2013-01-01
Objectives To examine the factor structure and to evaluate the longitudinal measurement invariance of the demand-control-support questionnaire (DCSQ), using the Swedish Longitudinal Occupational Survey of Health (SLOSH). Methods Confirmatory factor analysis (CFA) and multi-group confirmatory factor analysis (MGCFA) models within the framework of structural equation modeling (SEM) were used to examine the factor structure and invariance across time. Results Four factors: psychological demand, skill discretion, decision authority and social support, were confirmed by CFA at baseline, with the best fit obtained by removing the item repetitive work from skill discretion. A measurement-error correlation (0.42) between work fast and work intensively for psychological demands was also detected. Acceptable composite reliability measures were obtained except for skill discretion (0.68). Invariance of the same factor structure was established, but caution in comparing mean levels of factors over time is warranted, as a lack of intercept invariance was evident. However, partial intercept invariance was established for work intensively. Conclusion Our findings indicate that skill discretion and decision authority represent two distinct constructs in the retained model. However, removing the item repetitive work along with either work fast or work intensively would improve model fit. Care should also be taken when making comparisons of the constructs across time. Further research should investigate invariance across occupations and socio-economic classes. PMID:23950957
NASA Astrophysics Data System (ADS)
Bressloff, P. C.; Bressloff, N. W.
2000-02-01
Orientation tuning in a ring of pulse-coupled integrate-and-fire (IF) neurons is analyzed in terms of spontaneous pattern formation. It is shown how the ring bifurcates from a synchronous state to a non-phase-locked state whose spike trains are characterized by quasiperiodic variations of the inter-spike intervals (ISIs) on closed invariant circles. The separation of these invariant circles in phase space results in a localized peak of activity as measured by the time-averaged firing rate of the neurons. This generates a sharp orientation tuning curve that can lock to a slowly rotating, weakly tuned external stimulus. For fast synapses, breakup of the quasiperiodic orbits occurs leading to high spike time variability suggestive of chaos.
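One generic form of such a pulse-coupled IF ring (a reconstruction of the standard setup, not necessarily the authors' exact equations): neuron i, with preferred orientation θᵢ, obeys

```latex
\frac{dV_i}{dt} = -V_i + I(\theta_i)
+ \epsilon \sum_{j \neq i} w(\theta_i - \theta_j) \sum_{m} \alpha\!\left(t - T_j^{m}\right),
\qquad V_i \to 0 \ \text{when}\ V_i = 1 ,
```

where $T_j^m$ are the spike times of neuron j, $\alpha(t)$ is a synaptic kernel (e.g., $\alpha^2 t\, e^{-\alpha t}$ for $t>0$, whose rate constant sets the fast-synapse regime mentioned above), and $w(\theta)$ provides local excitation and lateral inhibition.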
NASA Astrophysics Data System (ADS)
Patil, Sandeep Baburao; Sinha, G. R.
2017-02-01
In India, limited awareness of the needs of deaf people widens the communication gap between the deaf and hard-of-hearing community and the hearing population. Sign language is commonly developed for deaf and hard-of-hearing people to convey messages by generating distinct sign patterns. The scale invariant feature transform (SIFT) was introduced by David Lowe to perform reliable matching between different images of the same object. This paper implements the various phases of the scale invariant feature transform to extract distinctive features from Indian Sign Language (ISL) gestures. The experimental results report the time required by each phase and the number of features extracted for 26 ISL gestures.
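A minimal sketch of the SIFT extraction-and-matching stages for such gesture images, using OpenCV (image variables and the ratio threshold are illustrative; the paper's classification details are not reproduced here):

```python
import cv2

def sift_features(gray_img):
    """Detect scale-invariant keypoints and compute 128-D descriptors.
    The classic SIFT phases (scale-space extrema detection, keypoint
    localization, orientation assignment, descriptor generation) all run
    inside detectAndCompute."""
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(gray_img, None)
    return keypoints, descriptors

def match_gestures(desc_query, desc_template, ratio=0.75):
    """Lowe's ratio test on 2-nearest-neighbour matches between a query
    gesture image and a stored gesture template."""
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    pairs = matcher.knnMatch(desc_query, desc_template, k=2)
    return [m for m, n in pairs if m.distance < ratio * n.distance]
```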
NASA Astrophysics Data System (ADS)
Klapa, Przemyslaw; Mitka, Bartosz; Zygmunt, Mariusz
2017-12-01
The terrestrial laser scanning (TLS) technology has a wide spectrum of applications, from land surveying, civil engineering and architecture to archaeology. The technology is capable of obtaining, in a short time, accurate coordinates of points which represent the surface of objects. Scanning of buildings is therefore a process which ensures obtaining information on all structural elements of a building. The result is a point cloud consisting of millions of elements which are a perfect source of information on the object and its surroundings. Photogrammetric techniques allow documenting an object in high resolution in the form of orthophoto plans, and are a basis for developing 2D documentation or obtaining point clouds for objects and 3D modelling. Integration of photogrammetric data and TLS brings a new quality to surveying historic monuments. Historic monuments play an important cultural and historical role. Centuries-old buildings require constant renovation and preservation of their structural and visual invariability, while maintaining the safety of the people who use them. The full surveying process allows evaluating the actual condition of monuments and planning repairs and renovations. The huge sizes and specific character of historic monuments make it difficult to obtain reliable and complete information about them. The TLS technology allows obtaining such information in a short time and is non-invasive. A point cloud is not only a basis for developing architectural and construction documentation or for evaluating the actual condition of a building; it is also a real visualization of monuments and their entire environment. The saved image of the object's surface can be presented at any time and place. A cyclical TLS survey of historic monuments allows detecting structural changes and evaluating damage and changes that cause deformation of a monument's components. The paper presents the application of integrated photogrammetric data and TLS, illustrated on an example of historic monuments from southern Poland. The cartographic materials are a basis for determining the actual condition of monuments and performing repair works. The materials also supplement the archive of monuments by recording the actual image of a monument in a virtual space.
Canonical Visual Size for Real-World Objects
Konkle, Talia; Oliva, Aude
2012-01-01
Real-world objects can be viewed at a range of distances and thus can be experienced at a range of visual angles within the visual field. Given the large amount of visual size variation possible when observing objects, we examined how internal object representations represent visual size information. In a series of experiments which required observers to access existing object knowledge, we observed that real-world objects have a consistent visual size at which they are drawn, imagined, and preferentially viewed. Importantly, this visual size is proportional to the logarithm of the assumed size of the object in the world, and is best characterized not as a fixed visual angle, but by the ratio of the object and the frame of space around it. Akin to the previous literature on canonical perspective, we term this consistent visual size information the canonical visual size. PMID:20822298
Koenderink, Jan; van Doorn, Andrea; Pinna, Baingio
2016-01-01
We investigated the familiar phenomenon of the uncanny feeling that represented people in frontal pose invariably appear to “face you” from wherever you stand. We deploy two different methods. The stimuli include the conventional one—a flat portrait rocking back and forth about a vertical axis—augmented with two novel variations. In one alternative, the portrait frame rotates whereas the actual portrait stays motionless and fronto-parallel; in the other, we replace the (flat!) portrait with a volumetric object. These variations yield exactly the same optical stimulation in frontal view, but become grossly different in very oblique views. We also let participants sample their momentary awareness through “gauge object” settings in static displays. From our results, we conclude that the psychogenesis of visual awareness maintains a number—at least two, but most likely more—of distinct spatial frameworks simultaneously involving “cue–scission.” Cues may be effective in one of these spatial frameworks but ineffective or functionally different in other ones. PMID:27895885
Capture of near-Earth objects with low-thrust propulsion and invariant manifolds
NASA Astrophysics Data System (ADS)
Tang, Gao; Jiang, Fanghua
2016-01-01
In this paper, a mission incorporating low-thrust propulsion and invariant manifolds to capture near-Earth objects (NEOs) is investigated. The initial condition has the spacecraft rendezvousing with the NEO. The mission terminates once it is inserted into a libration point orbit (LPO). The spacecraft takes advantage of stable invariant manifolds for low-energy ballistic capture. Low-thrust propulsion is employed to retrieve the joint spacecraft-asteroid system. Global optimization methods are proposed for the preliminary design. Local direct and indirect methods are applied to optimize the two-impulse transfers. Indirect methods are implemented to optimize the low-thrust trajectory and estimate the largest retrievable mass. To overcome the difficulty that arises from bang-bang control, a homotopic approach is applied to find an approximate solution. By detecting the switching moments of the bang-bang control, the efficiency and accuracy of the numerical integration are guaranteed. By using the homotopic approach as the initial guess, the shooting function is easy to solve. The relationship between the maximum thrust and the retrieval mass is investigated. We find, both numerically and theoretically, that a larger thrust is preferred.
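The homotopic approach mentioned here is commonly realized by embedding the fuel-optimal cost in a smoothed one-parameter family, for example (one standard form; the authors' exact functional may differ):

```latex
J_\epsilon \;=\; \lambda_0 \int_{t_0}^{t_f} \frac{T_{\max}}{c}
\Big[\, u \;-\; \epsilon\, u\,(1-u) \,\Big]\, dt ,
\qquad u \in [0,1] ,
```

where $u$ is the throttle, $T_{\max}$ the maximum thrust and $c$ the exhaust velocity. At $\epsilon = 1$ the integrand is smooth (energy-like) and the shooting problem is comparatively easy to solve; continuing the solution to $\epsilon \to 0$ recovers the fuel-optimal bang-bang control $u \in \{0,1\}$.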
NASA Astrophysics Data System (ADS)
Cao, L.; Cheng, Q.
2004-12-01
The scale invariant generator technique (SIG) and the spectrum-area analysis technique (S-A) were developed independently in relation to the concept of generalized scale invariance (GSI). The former was developed for characterizing the parameters involved in the GSI for characterizing and simulating multifractal measures, whereas the latter was for identifying scaling breaks for the decomposition of superimposed multifractal measures caused by multiple geophysical processes. A natural integration of these two techniques may yield a new technique serving two purposes: on the one hand, it can enrich the power of S-A by increasing the interpretability of decomposed patterns in some applications of S-A; on the other hand, it can provide a means to test the uniqueness of the multifractality of measures, which is essential for applying the SIG technique in more complicated environments. The proposed technique has been implemented as a Dynamic Link Library (DLL) in Visual C++. The program can be readily used for method validation and application in different fields.
ERIC Educational Resources Information Center
Williams, Carrick C.; Pollatsek, Alexander; Cave, Kyle R.; Stroud, Michael J.
2009-01-01
In 2 experiments, eye movements were examined during searches in which elements were grouped into four 9-item clusters. The target (a red or blue "T") was known in advance, and each cluster contained different numbers of target-color elements. Rather than color composition of a cluster invariantly guiding the order of search though…
Scene incongruity and attention.
Mack, Arien; Clarke, Jason; Erol, Muge; Bert, John
2017-02-01
Does scene incongruity (a mismatch between scene gist and a semantically incongruent object) capture attention and lead to conscious perception? We explored this question using four different procedures: inattention (Experiment 1), scene description (Experiment 2), change detection (Experiment 3), and iconic memory (Experiment 4). We found no differences between scene incongruity and scene congruity in Experiments 1, 2, and 4, although in Experiment 3 change detection was faster for scenes containing an incongruent object. We offer an explanation for why the change-detection results differ from the results of the other three experiments. In all four experiments, participants invariably failed to report the incongruity and routinely mis-described it by normalizing the incongruent object. None of the results supports the claim that semantic incongruity within a scene invariably captures attention; rather, they provide strong evidence of the dominant role of scene gist in determining what is perceived.
Senese, Vincenzo Paolo; De Lucia, Natascia; Conson, Massimiliano
2015-01-01
Cognitive models of drawing are mainly based on assessment of the copying performance of adults, whereas only a few studies have verified these models in young children. Moreover, developmental investigations have only rarely performed a systematic examination of the contribution of perceptual and representational visuo-spatial processes to copying and drawing from memory. In this study we investigated the role of visual perception and mental representation in both copying and drawing-from-memory skills in a sample of 227 typically developing children (53% females) aged 7-10 years. Participants underwent a neuropsychological assessment and the Rey-Osterrieth Complex Figure (ROCF). The fit and invariance of the predictive model considering visuo-spatial abilities, working memory, and executive functions were tested by means of hierarchical regressions and path analysis. Results showed that, in a gender-invariant way, visual perception abilities and spatial mental representation had a direct effect on copying performance, whereas copying performance was the only specific predictor of drawing from memory. These effects were independent of age and socioeconomic status, and showed that cognitive models of drawing built for adults could be considered for predicting copying and drawing from memory in children.
Visualizing the Topologically Induced States of Strongly Correlated Electrons in SmB6
NASA Astrophysics Data System (ADS)
Pirie, Harris; Hoffman, Jennifer E.; He, Yang; Yee, Michael M.; Soumyanarayanan, Anjan; Kim, Dae-Jeong; Fisk, Zachary; Morr, Dirk; Hamidian, Mohammad
The synergy between strong correlations and a topological invariant is predicted to generate exotic topological order, fractional quasiparticles and new platforms for quantum computation. SmB6 is a promising candidate in which interactions generate an insulating state whose gap arises from heavy-fermion hybridization of low-lying f-states with a Fermi sea. We used spectroscopic-imaging scanning tunneling microscopy to visualize the hybridization of distinct crystal-field-split f-levels and the temperature-dependent evolution of an insulating gap spanning the chemical potential. Here, armed with a clear description of the bulk bands, we look within the insulating gap and directly image two dispersing surface states converging to a Dirac point close to the chemical potential. We show that these measurements are consistent with Dirac cones centered at the X and Γ points in the surface Brillouin zone, corresponding to a strong topological invariant. The observation of topological states induced from strong correlations establishes SmB6 as an exciting playground for exotic physics. This work was supported by the Moore Foundation, the Canada Excellence Research Chairs Program and the US National Science Foundation under Grant DMR-1401480.
Selecting and perceiving multiple visual objects
Xu, Yaoda; Chun, Marvin M.
2010-01-01
To explain how multiple visual objects are attended and perceived, we propose that our visual system first selects a fixed number of about four objects from a crowded scene based on their spatial information (object individuation) and then encodes their details (object identification). We describe the involvement of the inferior intra-parietal sulcus (IPS) in object individuation and of the superior IPS and higher visual areas in object identification. Our neural object-file theory synthesizes and extends existing ideas in visual cognition and is supported by behavioral and neuroimaging results. It provides a better understanding of the role of the different parietal areas in encoding visual objects and can explain various forms of capacity-limited processing in visual cognition, such as working memory. PMID:19269882
Role of Temporal Processing Stages by Inferior Temporal Neurons in Facial Recognition
Sugase-Miyamoto, Yasuko; Matsumoto, Narihisa; Kawano, Kenji
2011-01-01
In this review, we focus on the role of temporal stages of encoded facial information in the visual system, which might enable the efficient determination of species, identity, and expression. Facial recognition is an important function of our brain and is known to be processed in the ventral visual pathway, where visual signals are processed through areas V1, V2, V4, and the inferior temporal (IT) cortex. In the IT cortex, neurons show selective responses to complex visual images such as faces, and at each stage along the pathway the stimulus selectivity of the neural responses becomes sharper, particularly in the later portion of the responses. In the IT cortex of the monkey, facial information is represented by different temporal stages of neural responses, as shown in our previous study: the initial transient response of face-responsive neurons represents information about global categories, i.e., human vs. monkey vs. simple shapes, whilst the later portion of these responses represents information about detailed facial categories, i.e., expression and/or identity. This suggests that the temporal stages of the neuronal firing pattern play an important role in the coding of visual stimuli, including faces. This type of coding may be a plausible mechanism underlying the temporal dynamics of recognition, including the process of detection/categorization followed by the identification of objects. Recent single-unit studies in monkeys have also provided evidence consistent with the important role of the temporal stages of encoded facial information. For example, view-invariant facial identity information is represented in the response at a later period within a region of face-selective neurons. Consistent with these findings, temporally modulated neural activity has also been observed in human studies. These results suggest a close correlation between the temporal processing stages of facial information by IT neurons and the temporal dynamics of face recognition. PMID:21734904
Jin, Xin; Liu, Li; Chen, Yanqin; Dai, Qionghai
2017-05-01
This paper derives a mathematical point spread function (PSF) and a depth-invariant focal sweep point spread function (FSPSF) for plenoptic camera 2.0. The derivation of the PSF is based on the Fresnel diffraction equation and on an image-formation analysis of a self-built imaging system, which is divided into two sub-systems to reflect the relay imaging properties of plenoptic camera 2.0. The variations in the PSF caused by changes in the object's depth and in the sensor position are analyzed. A mathematical model of the FSPSF is further derived and verified to be depth-invariant. Experiments on real imaging systems demonstrate the consistency between the proposed PSF and actual imaging results.
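The Fresnel diffraction equation underlying such a derivation, applied per sub-system, has the standard form below (the paper cascades two such propagations to model the relay imaging of plenoptic camera 2.0):

```latex
U(x,y;z) \;=\; \frac{e^{jkz}}{j\lambda z}
\iint U_0(\xi,\eta)\,
\exp\!\left\{ \frac{jk}{2z}\left[ (x-\xi)^2 + (y-\eta)^2 \right] \right\}
d\xi\, d\eta ,
```

with the PSF obtained as the squared modulus of the point-source response after both sub-systems, and the FSPSF as its accumulation over the swept sensor positions.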
Wheldon, Christopher W; Kolar, Stephanie K; Hernandez, Natalie D; Daley, Ellen M
2017-01-01
The objective of this study was to assess the factorial invariance and convergent validity of the Group-Based Medical Mistrust Scale (GBMMS) across gender (male and female) and ethnoracial identity (Latino and Black). Minority students (N = 686) attending a southeastern university were surveyed in the fall of 2011. Psychometric analysis of the GBMMS was performed. A three-factor solution fit the data after the omission of two problematic items. This revised version of the GBMMS exhibited sufficient configural, metric, and scalar invariance. Convergence of the GBMMS with conceptually related measures provided further evidence of validity; however, there was variation across ethnoracial identity. The GBMMS has viable psychometric properties across gender and ethnoracial identity in Black and Latino populations.
Combined Feature Based and Shape Based Visual Tracker for Robot Navigation
NASA Technical Reports Server (NTRS)
Deans, J.; Kunz, C.; Sargent, R.; Park, E.; Pedersen, L.
2005-01-01
We have developed a combined feature based and shape based visual tracking system designed to enable a planetary rover to visually track and servo to specific points chosen by a user with centimeter precision. The feature based tracker uses invariant feature detection and matching across a stereo pair, as well as matching pairs before and after robot movement in order to compute an incremental 6-DOF motion at each tracker update. This tracking method is subject to drift over time, which can be compensated by the shape based method. The shape based tracking method consists of 3D model registration, which recovers 6-DOF motion given sufficient shape and proper initialization. By integrating complementary algorithms, the combined tracker leverages the efficiency and robustness of feature based methods with the precision and accuracy of model registration. In this paper, we present the algorithms and their integration into a combined visual tracking system.
Measurement Invariance of the WHODAS 2.0 in a Population-Based Sample of Youth
Kimber, Melissa; Rehm, Jürgen; Ferro, Mark A.
2015-01-01
The World Health Organization Disability Assessment Schedule 2.0 (WHODAS 2.0) is a brief measure of global disability originally developed for adults, which has since been implemented among samples of children and youth. However, evidence of its validity for use among youth, particularly measurement invariance, is lacking. Investigations of measurement invariance assess the extent to which the psychometric properties of observed items in a measure are generalizable across samples. Satisfying the assumption of measurement invariance is critical for any inferences about between-group differences. The objective of this paper was to empirically assess the measurement invariance of the 12-item interview version of the WHODAS 2.0 measure in an epidemiological sample of youth (15 to 17 years) and adults (≥ 18 years) in Canada. Multiple-group confirmatory factor analysis using a categorical variable framework allowed for the sequential testing of increasingly restrictive models to evaluate measurement invariance of the WHODAS 2.0 between adults and youth. Findings provided evidence for full measurement invariance of the WHODAS 2.0 in youth aged 15 to 17 years. The final model fit the data well: χ2(159) = 769.04, p < .001; CFI = 0.950, TLI = 0.958, RMSEA (90% CI) = 0.055 [0.051, 0.059]. Results from this study build on previous work supporting the validity of the WHODAS 2.0. Findings indicate that the WHODAS 2.0 is valid for making substantive comparisons of disability among youth as young as 15 years of age. PMID:26565410
Stationary wavelet transform for under-sampled MRI reconstruction.
Kayvanrad, Mohammad H; McLeod, A Jonathan; Baxter, John S H; McKenzie, Charles A; Peters, Terry M
2014-12-01
In addition to coil sensitivity data (parallel imaging), sparsity constraints are often used as an additional $\ell_p$-penalty for under-sampled MRI reconstruction (compressed sensing). Penalizing the traditional decimated wavelet transform (DWT) coefficients, however, results in visual pseudo-Gibbs artifacts, some of which are attributed to the lack of translation invariance of the wavelet basis. We show that these artifacts can be greatly reduced by penalizing the translation-invariant stationary wavelet transform (SWT) coefficients. This holds with various additional reconstruction constraints, including coil sensitivity profiles and total variation. Additionally, SWT reconstructions result in lower error values and faster convergence compared to DWT. These concepts are illustrated with extensive experiments on in vivo MRI data, with particular emphasis on multiple-channel acquisitions.
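A minimal sketch of one SWT shrinkage step using PyWavelets (wavelet name, level, and threshold are illustrative; a full compressed-sensing reconstruction would alternate this with a k-space data-consistency step, omitted here):

```python
import pywt

def swt_soft_threshold(img, wavelet="db4", level=2, thresh=0.02):
    """One proximal step of an SWT-penalized reconstruction: soft-threshold
    the translation-invariant stationary wavelet coefficients and invert.
    Image dimensions must be divisible by 2**level for swt2."""
    coeffs = pywt.swt2(img, wavelet, level=level)
    shrunk = [
        (pywt.threshold(cA, thresh, mode="soft"),
         tuple(pywt.threshold(d, thresh, mode="soft") for d in details))
        for cA, details in coeffs
    ]
    return pywt.iswt2(shrunk, wavelet)

# Hypothetical usage on a 256x256 magnitude image:
# denoised = swt_soft_threshold(img)
```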
Visual Odometry Based on Structural Matching of Local Invariant Features Using Stereo Camera Sensor
Núñez, Pedro; Vázquez-Martín, Ricardo; Bandera, Antonio
2011-01-01
This paper describes a novel sensor system to estimate the motion of a stereo camera. Local invariant image features are matched between pairs of frames and linked into image trajectories at video rate, providing the so-called visual odometry, i.e., motion estimates from visual input alone. Our proposal conducts two matching sessions: the first between sets of features associated with the images of the stereo pairs, and the second between sets of features associated with consecutive frames. With respect to previously proposed approaches, the main novelty of this proposal is that both matching sessions are conducted by means of a fast matching algorithm which combines absolute and relative feature constraints. Finding the largest-valued set of mutually consistent matches is equivalent to finding the maximum-weighted clique on a graph. The stereo matching allows the scene view to be represented as a graph which emerges from the features of the accepted clique. On the other hand, the frame-to-frame matching defines a graph whose vertices are features in 3D space. The efficiency of the approach is increased by minimizing the geometric and algebraic errors to estimate the final displacement of the stereo camera between consecutive acquired frames. The proposed approach has been tested for mobile robotics navigation purposes in real environments and using different features. Experimental results demonstrate the performance of the proposal, which could be applied in both industrial and service robot fields. PMID:22164016
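The clique formulation is compact to sketch: candidate matches become nodes weighted by match quality, edges join pairwise-consistent candidates, and the accepted matches form a maximum-weight clique. A hedged sketch with networkx (the paper's actual absolute/relative consistency constraints are abstracted into a caller-supplied predicate; networkx's solver expects integer node weights):

```python
import networkx as nx

def consistent_matches(candidates, compatible, weight):
    """Select mutually consistent feature matches as a maximum-weight clique.
    `candidates` are hashable match records, `compatible(a, b)` tests pairwise
    consistency (e.g., preservation of 3D inter-feature distances), and
    `weight(a)` returns an integer match-quality score."""
    g = nx.Graph()
    for m in candidates:
        g.add_node(m, weight=weight(m))
    for i, a in enumerate(candidates):
        for b in candidates[i + 1:]:
            if compatible(a, b):
                g.add_edge(a, b)
    clique, total = nx.max_weight_clique(g, weight="weight")
    return clique, total
```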
Automatic topics segmentation for TV news video
NASA Astrophysics Data System (ADS)
Hmayda, Mounira; Ejbali, Ridha; Zaied, Mourad
2017-03-01
Automatic identification of television programs in the TV stream is an important task for operating archives. This article proposes a new spatio-temporal approach to identifying the programs in a TV stream in two main steps. First, a reference catalogue of video features for visual jingles is built; we exploit the features that characterize instances of the same program type to identify the different types of programs in the television flow. The role of the video features is to represent the visual invariants of each visual jingle, using appropriate automatic descriptors for each television program. Second, programs in television streams are identified by examining the similarity of the video signal to the visual jingles in the catalogue. The main idea of the identification process is to compare the visual similarity of the video signal features in the television flow to the catalogue. After presenting the proposed approach, the paper overviews encouraging experimental results on several streams extracted from different channels and composed of several programs.
Hearing in noisy environments: noise invariance and contrast gain control
Willmore, Ben D B; Cooke, James E; King, Andrew J
2014-01-01
Contrast gain control has recently been identified as a fundamental property of the auditory system. Electrophysiological recordings in ferrets have shown that neurons continuously adjust their gain (their sensitivity to change in sound level) in response to the contrast of sounds that are heard. At the level of the auditory cortex, these gain changes partly compensate for changes in sound contrast. This means that sounds which are structurally similar, but have different contrasts, have similar neuronal representations in the auditory cortex. As a result, the cortical representation is relatively invariant to stimulus contrast and robust to the presence of noise in the stimulus. In the inferior colliculus (an important subcortical auditory structure), gain changes are less reliably compensatory, suggesting that contrast- and noise-invariant representations are constructed gradually as one ascends the auditory pathway. In addition to noise invariance, contrast gain control provides a variety of computational advantages over static neuronal representations; it makes efficient use of neuronal dynamic range, may contribute to redundancy-reducing, sparse codes for sound and allows for simpler decoding of population responses. The circuits underlying auditory contrast gain control are still under investigation. As in the visual system, these circuits may be modulated by factors other than stimulus contrast, forming a potential neural substrate for mediating the effects of attention as well as interactions between the senses. PMID:24907308
Invariant 2D object recognition using the wavelet transform and structured neural networks
NASA Astrophysics Data System (ADS)
Khalil, Mahmoud I.; Bayoumi, Mohamed M.
1999-03-01
This paper applies the dyadic wavelet transform and the structured neural networks approach to recognize 2D objects under translation, rotation, and scale transformation. Experimental results are presented and compared with traditional methods. The experimental results showed that this refined technique successfully classified the objects and outperformed some traditional methods especially in the presence of noise.
2007-09-19
…extended object relations such as boundary, interior, open, closed, within, connected, and overlaps, which are invariant under elastic deformation… is required in a geo-spatial semantic web; this is challenging because the defining properties of geographic entities are very closely related to space. … Objects under Primitive will be open (i.e., they will not contain their boundary points) and the objects under Complex will be closed. In addition to…
Groups of adjacent contour segments for object detection.
Ferrari, V; Fevrier, L; Jurie, F; Schmid, C
2008-01-01
We present a family of scale-invariant local shape features formed by chains of k connected, roughly straight contour segments (kAS), and their use for object class detection. kAS are able to cleanly encode pure fragments of an object boundary, without including nearby clutter. Moreover, they offer an attractive compromise between information content and repeatability, and encompass a wide variety of local shape structures. We also define a translation and scale invariant descriptor encoding the geometric configuration of the segments within a kAS, making kAS easy to reuse in other frameworks, for example as a replacement or addition to interest points. Software for detecting and describing kAS is released on lear.inrialpes.fr/software. We demonstrate the high performance of kAS within a simple but powerful sliding-window object detection scheme. Through extensive evaluations, involving eight diverse object classes and more than 1400 images, we 1) study the evolution of performance as the degree of feature complexity k varies and determine the best degree; 2) show that kAS substantially outperform interest points for detecting shape-based classes; 3) compare our object detector to the recent, state-of-the-art system by Dalal and Triggs [4].
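A hedged sketch of what a translation- and scale-invariant kAS-style descriptor can look like, given k segments as (midpoint, orientation, length) triples; the exact normalization and segment-ordering conventions of the paper are not reproduced here:

```python
import numpy as np

def kas_descriptor(segments):
    """Illustrative descriptor for a chain of k roughly straight contour
    segments, each given as (midpoint_xy, orientation_rad, length).
    Midpoint offsets and lengths are divided by a chain scale so the vector
    is translation and scale invariant; orientations already are."""
    mids = np.array([m for m, _, _ in segments], dtype=float)
    thetas = np.array([t for _, t, _ in segments], dtype=float)
    lengths = np.array([l for _, _, l in segments], dtype=float)
    scale = lengths.mean() + 1e-12        # assumed chain-scale normalizer
    offsets = (mids - mids[0]) / scale    # positions relative to first segment
    return np.concatenate([offsets[1:].ravel(), thetas, lengths / scale])

# Example: a 2AS (k = 2) made of two segments forming a rough "L":
# desc = kas_descriptor([((0, 0), 0.0, 10.0), ((5, 5), np.pi / 2, 8.0)])
```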
Study of the Gray Scale, Polychromatic, Distortion Invariant Neural Networks Using the Ipa Model.
NASA Astrophysics Data System (ADS)
Uang, Chii-Maw
Research in the optical neural network field is primarily motivated by the fact that humans recognize objects better than conventional digital computers do, and by the inherently massively parallel nature of optics. This research represents a continuing effort over the past several years to exploit neurocomputing for pattern recognition. Based on the interpattern association (IPA) model and the Hamming net model, many new systems and applications are introduced. A gray-level discrete associative memory based on object decomposition/composition is proposed for recognizing gray-level patterns. This technique extends the processing ability from binary mode to gray-level mode, and thus the information capacity is increased. Two polychromatic optical neural networks using color liquid crystal television (LCTV) panels for color pattern recognition are introduced. By introducing a color-encoding technique in conjunction with the interpattern associative algorithm, a color associative memory was realized. Based on the color decomposition and composition technique, a color exemplar-based Hamming net was built for color image classification. A shift-invariant neural network is presented through use of the translation-invariant property of the modulus of the Fourier transform and the hetero-associative interpattern association (IPA) memory. To extract the main features, a quadrantal sampling method is used to sample the data, which then replace the training patterns; the concept of hetero-associative memory is used to recall distorted objects. A shift- and rotation-invariant neural network using an interpattern hetero-association (IHA) model is presented. To preserve the shift- and rotation-invariant properties, a set of binarized-encoded circular harmonic expansion (CHE) functions in the Fourier domain is used as the training set. We use the shift and symmetry properties of the modulus of the Fourier spectrum to avoid the problem of centering the CHE functions. Almost all neural networks have both positive and negative weights, which increases the difficulty of optical implementation. A method to construct a unipolar IPA IWM is discussed: by searching for redundant interconnection links, an effective way to remove all negative links is obtained.
Demonstration of a 3D vision algorithm for space applications
NASA Technical Reports Server (NTRS)
Defigueiredo, Rui J. P. (Editor)
1987-01-01
This paper reports an extension of the MIAG algorithm for recognition and motion parameter determination of general 3-D polyhedral objects, based on model matching techniques and using movement invariants as features of object representation. Results of tests conducted on the algorithm under conditions simulating those of space are presented.
Extraction of composite visual objects from audiovisual materials
NASA Astrophysics Data System (ADS)
Durand, Gwenael; Thienot, Cedric; Faudemay, Pascal
1999-08-01
An effective analysis of Visual Objects appearing in still images and video frames is required in order to offer fine grain access to multimedia and audiovisual contents. In previous papers, we showed how our method for segmenting still images into visual objects could improve content-based image retrieval and video analysis methods. Visual Objects are used in particular for extracting semantic knowledge about the contents. However, low-level segmentation methods for still images are not likely to extract a complex object as a whole but instead as a set of several sub-objects. For example, a person would be segmented into three visual objects: a face, hair, and a body. In this paper, we introduce the concept of Composite Visual Object. Such an object is hierarchically composed of sub-objects called Component Objects.
Biologically-inspired robust and adaptive multi-sensor fusion and active control
NASA Astrophysics Data System (ADS)
Khosla, Deepak; Dow, Paul A.; Huber, David J.
2009-04-01
In this paper, we describe a method and system for robust and efficient goal-oriented active control of a machine (e.g., robot) based on processing, hierarchical spatial understanding, representation and memory of multimodal sensory inputs. This work assumes that a high-level plan or goal is known a priori or is provided by an operator interface, which translates into an overall perceptual processing strategy for the machine. Its analogy to the human brain is the download of plans and decisions from the pre-frontal cortex into various perceptual working memories as a perceptual plan that then guides the sensory data collection and processing. For example, a goal might be to look for specific colored objects in a scene while also looking for specific sound sources. This paper combines three key ideas and methods into a single closed-loop active control system. (1) Use the high-level plan or goal to determine and prioritize spatial locations or waypoints (targets) in multimodal sensory space; (2) collect/store information about these spatial locations at the appropriate hierarchy and representation in a spatial working memory, including invariant learning of these spatial representations and how to convert between them; and (3) execute actions based on ordered retrieval of these spatial locations from hierarchical spatial working memory, using the "right" level of representation that can efficiently translate into motor actions. In its most specific form, the active control is described for a vision system (such as a pan-tilt-zoom camera system mounted on a robotic head and neck unit) which finds and then fixates on high-saliency visual objects. We also describe the approach where the goal is to turn towards and sequentially foveate on salient multimodal cues that include both visual and auditory inputs.
Analyzing Cyber Security Threats on Cyber-Physical Systems Using Model-Based Systems Engineering
NASA Technical Reports Server (NTRS)
Kerzhner, Aleksandr; Pomerantz, Marc; Tan, Kymie; Campuzano, Brian; Dinkel, Kevin; Pecharich, Jeremy; Nguyen, Viet; Steele, Robert; Johnson, Bryan
2015-01-01
The spectre of cyber attacks on aerospace systems can no longer be ignored given that many of the components and vulnerabilities that have been successfully exploited by the adversary on other infrastructures are the same as those deployed and used within the aerospace environment. An important consideration with respect to the mission/safety critical infrastructure supporting space operations is that an appropriate defensive response to an attack invariably involves the need for high precision and accuracy, because an incorrect response can trigger unacceptable losses involving lives and/or significant financial damage. A highly precise defensive response, considering the typical complexity of aerospace environments, requires a detailed and well-founded understanding of the underlying system where the goal of the defensive response is to preserve critical mission objectives in the presence of adversarial activity. In this paper, a structured approach for modeling aerospace systems is described. The approach includes physical elements, network topology, software applications, system functions, and usage scenarios. We leverage Model-Based Systems Engineering methodology by utilizing the Object Management Group's Systems Modeling Language to represent the system being analyzed and also utilize model transformations to change relevant aspects of the model into specialized analyses. A novel visualization approach is utilized to visualize the entire model as a three-dimensional graph, allowing easier interaction with subject matter experts. The model provides a unifying structure for analyzing the impact of a particular attack or a particular type of attack. Two different example analysis types are demonstrated in this paper: a graph-based propagation analysis based on edge labels, and a graph-based propagation analysis based on node labels.
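A minimal sketch of the edge-label propagation analysis described above, assuming the system model has been exported to a directed graph whose edges carry a `label` attribute (attribute names and the reachability rule are illustrative):

```python
import networkx as nx

def reachable_over_edge_label(graph, sources, label):
    """Starting from compromised nodes, follow only edges carrying a given
    label (e.g., a particular network protocol) to find every model element
    an attack could reach over that channel."""
    allowed = nx.subgraph_view(
        graph, filter_edge=lambda u, v: graph.edges[u, v].get("label") == label
    )
    reached = set()
    for s in sources:
        reached |= {s} | nx.descendants(allowed, s)
    return reached

# Hypothetical usage on a system model exported as a DiGraph:
# impacted = reachable_over_edge_label(model_graph, {"ground_station"}, "tcp")
```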
Systems and Methods for Data Visualization Using Three-Dimensional Displays
NASA Technical Reports Server (NTRS)
Davidoff, Scott (Inventor); Djorgovski, Stanislav G. (Inventor); Estrada, Vicente (Inventor); Donalek, Ciro (Inventor)
2017-01-01
Data visualization systems and methods for generating 3D visualizations of a multidimensional data space are described. In one embodiment a 3D data visualization application directs a processing system to: load a set of multidimensional data points into a visualization table; create representations of a set of 3D objects corresponding to the set of data points; receive mappings of data dimensions to visualization attributes; determine the visualization attributes of the set of 3D objects based upon the selected mappings of data dimensions to 3D object attributes; update a visibility dimension in the visualization table for each of the plurality of 3D objects to reflect the visibility of each 3D object based upon the selected mappings of data dimensions to visualization attributes; and interactively render 3D data visualizations of the 3D objects within the virtual space from viewpoints determined based upon received user input.
Visual and Non-Visual Contributions to the Perception of Object Motion during Self-Motion
Fajen, Brett R.; Matthis, Jonathan S.
2013-01-01
Many locomotor tasks involve interactions with moving objects. When observer (i.e., self-)motion is accompanied by object motion, the optic flow field includes a component due to self-motion and a component due to object motion. For moving observers to perceive the movement of other objects relative to the stationary environment, the visual system could recover the object-motion component – that is, it could factor out the influence of self-motion. In principle, this could be achieved using visual self-motion information, non-visual self-motion information, or a combination of both. In this study, we report evidence that visual information about the speed (Experiment 1) and direction (Experiment 2) of self-motion plays a role in recovering the object-motion component even when non-visual self-motion information is also available. However, the magnitude of the effect was less than one would expect if subjects relied entirely on visual self-motion information. Taken together with previous studies, we conclude that when self-motion is real and actively generated, both visual and non-visual self-motion information contribute to the perception of object motion. We also consider the possible role of this process in visually guided interception and avoidance of moving objects. PMID:23408983
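A toy illustration of the flow decomposition assumed above: subtracting an estimated self-motion component from the retinal flow leaves the object-motion component (arrays and thresholds are illustrative):

```python
import numpy as np

# Toy flow fields on an HxW grid (2 components per pixel). The retinal flow
# is modeled as the sum of a self-motion component and an object component.
H, W = 48, 64
self_motion_flow = np.random.randn(H, W, 2) * 0.1   # e.g., from heading estimate
retinal_flow = self_motion_flow.copy()
retinal_flow[20:28, 30:40] += np.array([1.0, 0.0])  # a rightward-moving object

# "Factoring out" self-motion recovers world-relative object motion:
object_flow = retinal_flow - self_motion_flow
moving = np.linalg.norm(object_flow, axis=-1) > 0.5  # crude motion mask
print("moving-object pixels:", moving.sum())
```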
Tian, Moqian; Grill-Spector, Kalanit
2015-01-01
Recognizing objects is difficult because it requires both linking views of an object that can be different and distinguishing objects with similar appearance. Interestingly, people can learn to recognize objects across views in an unsupervised way, without feedback, just from the natural viewing statistics. However, there is intense debate regarding what information during unsupervised learning is used to link among object views. Specifically, researchers argue whether temporal proximity, motion, or spatiotemporal continuity among object views during unsupervised learning is beneficial. Here, we untangled the role of each of these factors in unsupervised learning of novel three-dimensional (3-D) objects. We found that after unsupervised training with 24 object views spanning a 180° view space, participants showed significant improvement in their ability to recognize 3-D objects across rotation. Surprisingly, there was no advantage to unsupervised learning with spatiotemporal continuity or motion information over training with temporal proximity. However, we discovered that when participants were trained with just a third of the views spanning the same view space, unsupervised learning via spatiotemporal continuity yielded significantly better recognition performance on novel views than learning via temporal proximity. These results suggest that while it is possible to obtain view-invariant recognition just from observing many views of an object presented in temporal proximity, spatiotemporal information enhances performance by producing representations with broader view tuning than learning via temporal association. Our findings have important implications for theories of object recognition and for the development of computational algorithms that learn from examples. PMID:26024454
The Nature of Objectivity with the Rasch Model.
ERIC Educational Resources Information Center
Whitely, Susan E.; Dawis, Rene V.
Although it has been claimed that the Rasch model leads to a higher degree of objectivity in measurement than has been previously possible, this model has had little impact on test development. Population-invariant item and ability calibrations along with the statistical equivalency of any two item subsets are supposedly possible if the item pool…
Constrained Metric Learning by Permutation Inducing Isometries.
Bosveld, Joel; Mahmood, Arif; Huynh, Du Q; Noakes, Lyle
2016-01-01
The choice of metric critically affects the performance of classification and clustering algorithms. Metric learning algorithms attempt to improve performance, by learning a more appropriate metric. Unfortunately, most of the current algorithms learn a distance function which is not invariant to rigid transformations of images. Therefore, the distances between two images and their rigidly transformed pair may differ, leading to inconsistent classification or clustering results. We propose to constrain the learned metric to be invariant to the geometry preserving transformations of images that induce permutations in the feature space. The constraint that these transformations are isometries of the metric ensures consistent results and improves accuracy. Our second contribution is a dimension reduction technique that is consistent with the isometry constraints. Our third contribution is the formulation of the isometry constrained logistic discriminant metric learning (IC-LDML) algorithm, by incorporating the isometry constraints within the objective function of the LDML algorithm. The proposed algorithm is compared with the existing techniques on the publicly available labeled faces in the wild, viewpoint-invariant pedestrian recognition, and Toy Cars data sets. The IC-LDML algorithm has outperformed existing techniques for the tasks of face recognition, person identification, and object classification by a significant margin.
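The following sketch illustrates the underlying isometry idea rather than the IC-LDML optimization itself: a Mahalanobis metric averaged over the cyclic group generated by a permutation P satisfies P.T @ M @ P = M, so distances are unchanged when both inputs are permuted:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6

# A permutation modeling a geometry-preserving image transform acting on the
# feature vector (illustrative; the paper derives these from the image grid).
P = np.eye(d)[rng.permutation(d)]

M0 = rng.standard_normal((d, d)); M0 = M0 @ M0.T   # any learned PSD metric

# Enforce the isometry constraint by averaging M0 over the full cyclic
# group {I, P, P^2, ...} generated by P (loop ends when P^k returns to I).
Ms, Pi = [], np.eye(d)
while True:
    Ms.append(Pi.T @ M0 @ Pi)
    Pi = Pi @ P
    if np.allclose(Pi, np.eye(d)):
        break
M = sum(Ms) / len(Ms)                              # satisfies P.T @ M @ P = M

def dist(M, x, y):
    z = x - y
    return float(z @ M @ z)

x, y = rng.standard_normal(d), rng.standard_normal(d)
# Equal up to floating point: P acts as an isometry of the learned metric.
print(dist(M, x, y), dist(M, P @ x, P @ y))
```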
McMenamin, Brenton W.; Deason, Rebecca G.; Steele, Vaughn R.; Koutstaal, Wilma; Marsolek, Chad J.
2014-01-01
Previous research indicates that dissociable neural subsystems underlie abstract-category (AC) recognition and priming of objects (e.g., cat, piano) and specific-exemplar (SE) recognition and priming of objects (e.g., a calico cat, a different calico cat, a grand piano, etc.). However, the degree of separability between these subsystems is not known, despite the importance of this issue for assessing relevant theories. Visual object representations are widely distributed in visual cortex, thus a multivariate pattern analysis (MVPA) approach to analyzing functional magnetic resonance imaging (fMRI) data may be critical for assessing the separability of different kinds of visual object processing. Here we examined the neural representations of visual object categories and visual object exemplars using multi-voxel pattern analyses of brain activity elicited in visual object processing areas during a repetition-priming task. In the encoding phase, participants viewed visual objects and the printed names of other objects. In the subsequent test phase, participants identified objects that were either same-exemplar primed, different-exemplar primed, word-primed, or unprimed. In visual object processing areas, classifiers were trained to distinguish same-exemplar primed objects from word-primed objects. Then, the abilities of these classifiers to discriminate different-exemplar primed objects and word-primed objects (reflecting AC priming) and to discriminate same-exemplar primed objects and different-exemplar primed objects (reflecting SE priming) were assessed. Results indicated that (a) repetition priming in occipital-temporal regions is organized asymmetrically, such that AC priming is more prevalent in the left hemisphere and SE priming is more prevalent in the right hemisphere, and (b) AC and SE subsystems are weakly modular, not strongly modular or unified. PMID:25528436
Information theoretic analysis of linear shift-invariant edge-detection operators
NASA Astrophysics Data System (ADS)
Jiang, Bo; Rahman, Zia-ur
2012-06-01
Generally, the designs of digital image processing algorithms and image gathering devices remain separate. Consequently, the performance of digital image processing algorithms is evaluated without taking into account the influence of the image gathering process. However, experiments show that the image gathering process has a profound impact on the performance of digital image processing and the quality of the resulting images. Huck et al. proposed a definitive theoretical analysis of visual communication channels, where the different parts, such as image gathering, processing, and display, are assessed in an integrated manner using Shannon's information theory. We perform an end-to-end information-theoretic system analysis to assess linear shift-invariant edge-detection algorithms. We evaluate the performance of the different algorithms as a function of the characteristics of the scene and the parameters, such as sampling and additive noise, that define the image gathering system. The edge-detection algorithm is regarded as having high performance only if the information rate from the scene to the edge image approaches its maximum possible value. This goal can be achieved only by jointly optimizing all processes. Our information-theoretic assessment provides a new tool that allows us to compare different linear shift-invariant edge detectors in a common environment.
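A toy version of the kind of measurement involved, assuming a histogram estimate of mutual information between an ideal edge map and the edge map computed after a simulated image-gathering stage (the paper's channel model is more detailed):

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """Histogram estimate of I(A;B) in bits between two images."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / np.outer(px, py)[nz])).sum())

rng = np.random.default_rng(1)
scene = rng.random((128, 128))
gathered = scene + 0.2 * rng.standard_normal(scene.shape)  # acquisition noise

# A simple shift-invariant edge operator (horizontal gradient) applied to
# the ideal scene and to the simulated gathered image:
true_edges = np.abs(np.diff(scene, axis=1, prepend=scene[:, :1]))
edges = np.abs(np.diff(gathered, axis=1, prepend=gathered[:, :1]))
print("information rate proxy (bits):", mutual_information(true_edges, edges))
```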
Pattern recognition neural-net by spatial mapping of biology visual field
NASA Astrophysics Data System (ADS)
Lin, Xin; Mori, Masahiko
2000-05-01
The method of spatial mapping in the biological visual field is applied to artificial neural networks for pattern recognition. By a coordinate transform known as complex-logarithm mapping, followed by a Fourier transform, the input images are transformed into scale-, rotation-, and shift-invariant patterns, and then fed into a multilayer neural network for learning and recognition. The results of a computer simulation and an optical experimental system are described.
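A sketch of this classic construction (often called a Fourier-Mellin signature): the FFT magnitude removes shifts, log-polar resampling turns rotation and scale into cyclic shifts, and a second FFT magnitude removes those. Invariance is approximate, up to interpolation and windowing effects:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def invariant_signature(img, n_r=64, n_theta=64):
    """Shift-, rotation-, and scale-invariant signature: |FFT| -> log-polar -> |FFT|."""
    # 1) FFT magnitude removes translation.
    mag = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    cy, cx = np.array(mag.shape) / 2.0
    # 2) Resample onto a log-polar grid: rotation and scale become shifts.
    #    Theta covers [0, pi) because the magnitude spectrum is symmetric.
    log_r = np.linspace(0, np.log(min(cy, cx)), n_r)
    theta = np.linspace(0, np.pi, n_theta, endpoint=False)
    r = np.exp(log_r)
    ys = cy + np.outer(r, np.sin(theta))
    xs = cx + np.outer(r, np.cos(theta))
    logpolar = map_coordinates(mag, [ys, xs], order=1)
    # 3) A second FFT magnitude removes those shifts.
    return np.abs(np.fft.fft2(logpolar))

img = np.random.default_rng(0).random((128, 128))
print(invariant_signature(img).shape)
```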
Principal visual word discovery for automatic license plate detection.
Zhou, Wengang; Li, Houqiang; Lu, Yijuan; Tian, Qi
2012-09-01
License plate detection is widely considered a solved problem, with many systems already in operation. However, the existing algorithms or systems work well only under controlled conditions. There are still many challenges for license plate detection in an open environment, such as various observation angles, background clutter, scale changes, multiple plates, uneven illumination, and so on. In this paper, we propose a novel scheme to automatically locate license plates by principal visual word (PVW) discovery and local feature matching. Observing that characters in different license plates are duplicates of each other, we bring in the idea of using the bag-of-words (BoW) model popularly applied in partial-duplicate image search. Unlike the classic BoW model, for each plate character we automatically discover the PVW characterized with geometric context. Given a new image, the license plates are extracted by matching local features with the PVW. Besides license plate detection, our approach can also be extended to the detection of logos and trademarks. Due to the invariance properties of the scale-invariant feature transform features, our method can adaptively deal with various changes in the license plates, such as rotation, scaling, illumination, etc. Promising results of the proposed approach are demonstrated with an experimental study in license plate detection.
Spectral-Spatial Scale Invariant Feature Transform for Hyperspectral Images.
Al-Khafaji, Suhad Lateef; Jun Zhou; Zia, Ali; Liew, Alan Wee-Chung
2018-02-01
Spectral-spatial feature extraction is an important task in hyperspectral image processing. In this paper we propose a novel method to extract distinctive invariant features from hyperspectral images for registration of hyperspectral images with different spectral conditions. Spectral condition means that images are captured under different incident lights, viewing angles, or different hyperspectral cameras; in addition, it includes images of objects with the same shape but different materials. This method, named spectral-spatial scale invariant feature transform (SS-SIFT), explores both spectral and spatial dimensions simultaneously to extract features invariant to spectral and geometric transformations. Similar to the classic SIFT algorithm, SS-SIFT consists of keypoint detection and descriptor construction steps. Keypoints are extracted from a spectral-spatial scale space and are detected as extrema after a 3D difference of Gaussians is applied to the data cube. Two descriptors are proposed for each keypoint by exploring the distribution of spectral-spatial gradient magnitude in its local 3D neighborhood. The effectiveness of the SS-SIFT approach is validated on images collected under different light conditions and geometric projections, and using two hyperspectral cameras with different spectral wavelength ranges and resolutions. The experimental results show that our method generates robust invariant features for spectral-spatial image matching.
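A minimal sketch of 3D difference-of-Gaussians keypoint detection on a data cube, using an isotropic scale space for simplicity (SS-SIFT's actual spectral-spatial scale space and descriptors are more elaborate):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def dog_keypoints_3d(cube, sigma=1.0, k=1.6, thresh=0.02):
    """Local maxima of a 3-D difference-of-Gaussians over (row, col, band)."""
    dog = gaussian_filter(cube, sigma * k) - gaussian_filter(cube, sigma)
    # A voxel is a keypoint if it is the maximum of its 3x3x3 neighborhood
    # and its response exceeds a contrast threshold.
    local_max = maximum_filter(dog, size=3) == dog
    return np.argwhere(local_max & (np.abs(dog) > thresh))

cube = np.random.rand(64, 64, 20)  # toy hyperspectral cube: rows x cols x bands
print(len(dog_keypoints_3d(cube)), "keypoints")
```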
Object representations in visual memory: evidence from visual illusions.
Ben-Shalom, Asaf; Ganel, Tzvi
2012-07-26
Human visual memory is considered to contain different levels of object representations. Representations in visual working memory (VWM) are thought to contain relatively elaborated information about object structure. Conversely, representations in iconic memory are thought to be more perceptual in nature. In four experiments, we tested the effects of two different categories of visual illusions on representations in VWM and in iconic memory. Unlike VWM, which was affected by both types of illusions, iconic memory was immune to the effects of within-object contextual illusions and was affected only by illusions driven by between-objects contextual properties. These results show that iconic and visual working memory contain dissociable representations of object shape. These findings suggest that the global properties of the visual scene are processed prior to the processing of specific elements.
Low-level contrast statistics are diagnostic of invariance of natural textures
Groen, Iris I. A.; Ghebreab, Sennay; Lamme, Victor A. F.; Scholte, H. Steven
2012-01-01
Texture may provide important clues for real world object and scene perception. To be reliable, these clues should ideally be invariant to common viewing variations such as changes in illumination and orientation. In a large image database of natural materials, we found textures with low-level contrast statistics that varied substantially under viewing variations, as well as textures that remained relatively constant. This led us to ask whether textures with constant contrast statistics give rise to more invariant representations compared to other textures. To test this, we selected natural texture images with either high (HV) or low (LV) variance in contrast statistics and presented these to human observers. In two distinct behavioral categorization paradigms, participants more often judged HV textures as “different” compared to LV textures, showing that textures with constant contrast statistics are perceived as being more invariant. In a separate electroencephalogram (EEG) experiment, evoked responses to single texture images (single-image ERPs) were collected. The results show that differences in contrast statistics correlated with both early and late differences in occipital ERP amplitude between individual images. Importantly, ERP differences between images of HV textures were mainly driven by illumination angle, which was not the case for LV images: there, differences were completely driven by texture membership. These converging neural and behavioral results imply that some natural textures are surprisingly invariant to illumination changes and that low-level contrast statistics are diagnostic of the extent of this invariance. PMID:22701419
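A toy version of the underlying measurement: compute simple contrast statistics per image and their variance across simulated viewing conditions; low variance corresponds to an "LV"-like texture (the perturbations below stand in for real illumination and orientation changes):

```python
import numpy as np

def contrast_stats(img):
    """Two low-level contrast statistics of a grayscale texture image."""
    gy, gx = np.gradient(img.astype(float))     # local contrast proxy
    mag = np.hypot(gy, gx)
    return np.array([mag.mean(), mag.std()])

# Toy "same texture under different viewing conditions": brightness and
# contrast perturbations stand in for illumination/orientation changes.
rng = np.random.default_rng(0)
base = rng.random((64, 64))
views = [np.clip(a * base + b, 0, 2)
         for a, b in [(1.0, 0.0), (1.5, 0.1), (0.7, -0.05)]]

stats = np.array([contrast_stats(v) for v in views])
print("variance across views:", stats.var(axis=0))  # low -> "LV"-like texture
```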
Invariant spatial context is learned but not retrieved in gaze-contingent tunnel-view search.
Zang, Xuelian; Jia, Lina; Müller, Hermann J; Shi, Zhuanghua
2015-05-01
Our visual brain is remarkable in extracting invariant properties from the noisy environment, guiding selection of where to look and what to identify. However, how the brain achieves this is still poorly understood. Here we explore interactions of local context and global structure in the long-term learning and retrieval of invariant display properties. Participants searched for a target among distractors, without knowing that some "old" configurations were presented repeatedly (randomly inserted among "new" configurations). We simulated tunnel vision, limiting the visible region around fixation. Robust facilitation of performance for old versus new contexts was observed when the visible region was large but not when it was small. However, once the display was made fully visible during the subsequent transfer phase, facilitation did become manifest. Furthermore, when participants were given a brief preview of the total display layout prior to tunnel view search with 2 items visible, facilitation was already obtained during the learning phase. The eye movement results revealed contextual facilitation to be coupled with changes of saccadic planning, characterized by slightly extended gaze durations but a reduced number of fixations and shortened scan paths for old displays. Taken together, our findings show that invariant spatial display properties can be acquired based on scarce, para-/foveal information, while their effective retrieval for search guidance requires the availability (even if brief) of a certain extent of peripheral information. (c) 2015 APA, all rights reserved).
Shift and rotation invariant photorefractive crystal-based associative memory
NASA Astrophysics Data System (ADS)
Uang, Chii-Maw; Lin, Wei-Feng; Lu, Ming-Huei; Lu, Guowen; Lu, Mingzhe
1995-08-01
A shift- and rotation-invariant photorefractive (PR) crystal-based associative memory is described. The proposed associative memory has three layers: the feature extraction, inner-product, and output mapping layers. The feature extraction is performed by expanding an input object into a set of circular harmonic expansions (CHE) in the Fourier domain to acquire both shift- and rotation-invariant properties. The inner-product operation is performed by taking advantage of Bragg diffraction in the bulk PR crystal. The output mapping is achieved by using the massive storage capacity of the PR crystal. In the training process, memories are stored in another PR crystal by using the wavelength multiplexing technique. During the recall process, the output from the winner-take-all processor decides which wavelength should be used to read out the memory from the PR crystal.
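A numerical sketch of a circular harmonic component, whose magnitude is rotation invariant because rotating the image by alpha only multiplies the m-th component by exp(-i*m*alpha) (grid sizes and sampling are illustrative):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def circular_harmonic(img, m, n_r=32, n_theta=180):
    """m-th circular harmonic component about the image center:
    c_m(r) = (1 / 2*pi) * integral of f(r, theta) * exp(-i*m*theta) dtheta."""
    cy, cx = np.array(img.shape) / 2.0
    theta = np.linspace(0, 2 * np.pi, n_theta, endpoint=False)
    r = np.linspace(1, min(cy, cx) - 1, n_r)
    ys = cy + np.outer(r, np.sin(theta))
    xs = cx + np.outer(r, np.cos(theta))
    f = map_coordinates(img.astype(float), [ys, xs], order=1)
    return (f * np.exp(-1j * m * theta)).mean(axis=1)  # one value per radius

# |c_m(r)| is unchanged under image rotation; only the phase changes.
img = np.random.rand(64, 64)
print(np.abs(circular_harmonic(img, m=2)).shape)
```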
17,000 years of depicting the junction of two smooth shapes.
Biederman, Irving; Kim, Jiye G
2008-01-01
Competent realistic drawings preserve viewpoint-invariant shape characteristics of simple parts, such that a contour in the object that is straight or curved, for example, is depicted that way in the drawing. A more subtle invariant--a V-shaped singularity of the occluding boundary, containing a T-junction and a contour termination--is produced at the junction between articulated smooth surfaces, as with the leg joining the body of a horse. 45% of the drawings made in 2007 by individuals with only minimal art education correctly depicted such junctions, a proportion that is not reliably different from the incidence (42%) of correct depictions in a large sample of cave art made 17000 years ago. Whether a person did or did not include the invariant in their drawing, all agreed that it made for a better depiction.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Houdayer, Jérôme; Poitevin, Frédéric
This paper shows how small-angle scattering (SAS) curves can be decomposed into a simple sum using a set of invariant parameters called K_n which are related to the shape of the object of study. These K_n, together with a radius R, give a complete theoretical description of the SAS curve. Adding an overall constant, these parameters are easily fitted against experimental data, giving a concise comprehensive description of the data. The pair distance distribution function is also entirely described by this invariant set, and the D_max parameter can be measured. In addition to the understanding they bring, these invariants can be used to reliably estimate structural moments beyond the radius of gyration, thereby rigorously expanding the actual set of model-free quantities one can extract from experimental SAS data, and possibly paving the way to designing new shape reconstruction strategies.
Covariance and the hierarchy of frame bundles
NASA Technical Reports Server (NTRS)
Estabrook, Frank B.
1987-01-01
This is an essay on the general concept of covariance, and its connection with the structure of the nested set of higher frame bundles over a differentiable manifold. Examples of covariant geometric objects include not only linear tensor fields, densities and forms, but affinity fields, sectors and sector forms, higher order frame fields, etc., often having nonlinear transformation rules and Lie derivatives. The intrinsic, or invariant, sets of forms that arise on frame bundles satisfy the graded Cartan-Maurer structure equations of an infinite Lie algebra. Reduction of these gives invariant structure equations for Lie pseudogroups, and for G-structures of various orders. Some new results are introduced for prolongation of structure equations, and for treatment of Riemannian geometry with higher-order moving frames. The use of invariant form equations for nonlinear field physics is implicitly advocated.
Du, Shaoyi; Xu, Yiting; Wan, Teng; Hu, Huaizhong; Zhang, Sirui; Xu, Guanglin; Zhang, Xuetao
2017-01-01
The iterative closest point (ICP) algorithm is efficient and accurate for rigid registration, but it needs good initial parameters and easily fails when the rotation angle between the two point sets is large. To deal with this problem, a new objective function is proposed by introducing a rotation-invariant feature based on the Euclidean distance between each point and a global reference point, where the global reference point is a rotation invariant. This optimization problem is then solved by a variant of the ICP algorithm, which is an iterative method. Firstly, the accurate correspondence is established by using the weighted rotation-invariant feature distance together with the position distance. Secondly, the rigid transformation is solved by the singular value decomposition method. Thirdly, the weight is adjusted to control the relative contributions of the positions and features. Finally, the new algorithm accomplishes registration in a coarse-to-fine way whatever the initial rotation angle is, and is demonstrated to converge monotonically. The experimental results validate that the proposed algorithm is more accurate and robust compared with the original ICP algorithm.
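A compact sketch of the scheme just described, assuming the global reference point is the centroid and the feature weight is annealed over iterations (parameter values are illustrative):

```python
import numpy as np

def icp_rif(src, dst, iters=30, alpha0=50.0):
    """ICP variant using a rotation-invariant feature (distance to a global
    reference point, here the centroid) alongside position distance."""
    f_src = np.linalg.norm(src - src.mean(0), axis=1)   # invariant under R, t
    f_dst = np.linalg.norm(dst - dst.mean(0), axis=1)
    R = np.eye(src.shape[1]); t = np.zeros(src.shape[1])
    for it in range(iters):
        alpha = alpha0 * (1.0 - it / iters)             # anneal: coarse-to-fine
        cur = src @ R.T + t
        # Weighted correspondence: position distance + feature distance.
        cost = (np.linalg.norm(cur[:, None] - dst[None], axis=2)
                + alpha * np.abs(f_src[:, None] - f_dst[None]))
        idx = cost.argmin(axis=1)
        # Closed-form rigid transform (Kabsch / SVD) from src to dst[idx].
        cs, cd = src.mean(0), dst[idx].mean(0)
        U, _, Vt = np.linalg.svd((src - cs).T @ (dst[idx] - cd))
        d = np.sign(np.linalg.det(Vt.T @ U.T))
        D = np.diag([1.0] * (src.shape[1] - 1) + [d])   # avoid reflections
        R = Vt.T @ D @ U.T
        t = cd - cs @ R.T
    return R, t

# Toy test: recover a large rotation, where plain ICP often fails.
rng = np.random.default_rng(0)
pts = rng.standard_normal((100, 2))
ang = np.deg2rad(150)
R_true = np.array([[np.cos(ang), -np.sin(ang)], [np.sin(ang), np.cos(ang)]])
R_est, t_est = icp_rif(pts, pts @ R_true.T + np.array([3.0, -1.0]))
print("rotation error:", np.abs(R_est - R_true).max())
```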
n-D shape/texture optimal synthetic description and modeling by GEOGINE
NASA Astrophysics Data System (ADS)
Fiorini, Rodolfo A.; Dacquino, Gianfranco F.
2004-12-01
GEOGINE (GEOmetrical enGINE), a state-of-the-art OMG (Ontological Model Generator) based on n-D Tensor Invariants for multidimensional shape/texture optimal synthetic description and learning, is presented. Robust characterization of elementary geometric shapes under geometric transformations, on a rigorous mathematical level, is a key problem in many computer applications in different areas of interest. The past four decades have seen solutions based almost entirely on the use of n-Dimensional Moment and Fourier descriptor invariants. The present paper introduces a new approach for automatic model generation based on n-Dimensional Tensor Invariants as a formal dictionary. An ontological model is the kernel used for specifying ontologies, so that how close an ontology can be to the real world depends on the possibilities offered by the ontological model. By this approach, even chromatic information content can be easily and reliably decoupled from target geometric information and computed into robust colour-shape parameter attributes. The main GEOGINE operational advantages over previous approaches are: (1) automated model generation, (2) an invariant minimal complete set for computational efficiency, and (3) arbitrary model precision for robust object description.
A model of attention-guided visual perception and recognition.
Rybak, I A; Gusakova, V I; Golovan, A V; Podladchikova, L N; Shevtsova, N A
1998-08-01
A model of visual perception and recognition is described. The model contains: (i) a low-level subsystem which performs both a fovea-like transformation and detection of primary features (edges), and (ii) a high-level subsystem which includes separated 'what' (sensory memory) and 'where' (motor memory) structures. Image recognition occurs during the execution of a 'behavioral recognition program' formed during the primary viewing of the image. The recognition program contains both programmed attention window movements (stored in the motor memory) and predicted image fragments (stored in the sensory memory) for each consecutive fixation. The model shows the ability to recognize complex images (e.g. faces) invariantly with respect to shift, rotation and scale.
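A toy rendering of the "behavioral recognition program" idea: alternating stored attention-window movements (motor memory) and predicted fragments (sensory memory), replayed at recognition time (window size, tolerance, and fixation points are invented for illustration):

```python
import numpy as np

def learn_program(image, fixations, win=8):
    """Store (saccade vector, expected fragment) pairs along a scanpath."""
    program = []
    for (y0, x0), (y1, x1) in zip(fixations, fixations[1:]):
        move = (y1 - y0, x1 - x0)                       # motor memory
        frag = image[y1:y1 + win, x1:x1 + win].copy()   # sensory memory
        program.append((move, frag))
    return program

def run_program(image, start, program, win=8, tol=0.1):
    """Replay the scanpath; recognition succeeds if every prediction holds."""
    y, x = start
    for (dy, dx), expected in program:
        y, x = y + dy, x + dx                           # execute the saccade
        actual = image[y:y + win, x:x + win]
        if np.abs(actual - expected).mean() > tol:      # prediction failed
            return False
    return True

img = np.random.rand(64, 64)
fix = [(5, 5), (20, 30), (40, 10)]
prog = learn_program(img, fix)
print(run_program(img, fix[0], prog))                   # True on the same image
```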
Matthews, Andrew G
2004-08-01
It is conservatively estimated that some form of lens opacity is present in 5% to 7% of horses with otherwise clinically normal eyes. These opacities can range from small epicapsular remnants of the fetal vasculature to dense and extensive cataract. A cataract is defined technically as any opacity or alteration in the optical homogeneity of the lens involving one or more of the following: anterior epithelium, capsule, cortex, or nucleus. In the horse, cataracts rarely involve the entire lens structure (ie, complete cataracts) and are more usually localized to one anatomic landmark or sector of the lens. Complete cataracts are invariably associated with overt and significant visual disability. Focal or incomplete cataracts alone, however, seldom cause any apparent visual dysfunction in affected horses.
A Spot Reminder System for the Visually Impaired Based on a Smartphone Camera
Takizawa, Hotaka; Orita, Kazunori; Aoyagi, Mayumi; Ezaki, Nobuo; Mizuno, Shinji
2017-01-01
The present paper proposes a smartphone-camera-based system to assist visually impaired users in recalling their memories related to important locations, called spots, that they visited. The memories are recorded as voice memos, which can be played back when the users return to the spots. Spot-to-spot correspondence is determined by image matching based on the scale invariant feature transform. The main contribution of the proposed system is to allow visually impaired users to associate arbitrary voice memos with arbitrary spots. The users do not need any special devices or systems except smartphones and do not need to remember the spots where the voice memos were recorded. In addition, the proposed system can identify spots in environments that are inaccessible to the global positioning system. The proposed system has been evaluated by two experiments: image matching tests and a user study. The experimental results suggested the effectiveness of the system to help visually impaired individuals, including blind individuals, recall information about regularly-visited spots. PMID:28165403
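A sketch of the SIFT-based spot matching step, assuming OpenCV's SIFT implementation (opencv-python 4.4 or later) and a simple ratio test; the authors' exact pipeline may differ:

```python
import cv2

def match_spot(query_img, stored_imgs, min_matches=10, ratio=0.75):
    """Identify which stored spot a query photo shows via SIFT matching."""
    sift = cv2.SIFT_create()
    bf = cv2.BFMatcher()
    kq, dq = sift.detectAndCompute(query_img, None)
    if dq is None:
        return None
    best, best_count = None, 0
    for name, img in stored_imgs.items():
        ks, ds = sift.detectAndCompute(img, None)
        if ds is None:
            continue
        # Lowe's ratio test on the two nearest neighbors per descriptor.
        matches = bf.knnMatch(dq, ds, k=2)
        good = [p[0] for p in matches
                if len(p) == 2 and p[0].distance < ratio * p[1].distance]
        if len(good) > best_count:
            best, best_count = name, len(good)
    return best if best_count >= min_matches else None

# Usage: stored_imgs maps a spot name to the grayscale image taken there;
# the matched name selects which voice memo to play back.
```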
An evaluation of attention models for use in SLAM
NASA Astrophysics Data System (ADS)
Dodge, Samuel; Karam, Lina
2013-12-01
In this paper we study the application of visual saliency models for the simultaneous localization and mapping (SLAM) problem. We consider visual SLAM, where the location of the camera and a map of the environment can be generated using images from a single moving camera. In visual SLAM, the interest point detector is of key importance. This detector must be invariant to certain image transformations so that features can be matched across different frames. Recent work has used a model of human visual attention to detect interest points; however, it is unclear which attention model is best for this purpose. To this aim, we compare the performance of interest points from four saliency models (Itti, GBVS, RARE, and AWS) with the performance of four traditional interest point detectors (Harris, Shi-Tomasi, SIFT, and FAST). We evaluate these detectors under several different types of image transformation and find that the Itti saliency model, in general, achieves the best performance in terms of keypoint repeatability.
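A minimal version of the repeatability criterion used in such evaluations: the fraction of keypoints from one image that land within a tolerance of a keypoint in the transformed image, given the ground-truth homography:

```python
import numpy as np

def repeatability(kps_a, kps_b, H, tol=3.0):
    """Fraction of keypoints from image A that reappear in image B after
    mapping through the ground-truth homography H (a standard criterion)."""
    pts = np.hstack([kps_a, np.ones((len(kps_a), 1))]) @ H.T
    pts = pts[:, :2] / pts[:, 2:3]                      # project into image B
    d = np.linalg.norm(pts[:, None] - kps_b[None], axis=2)
    return (d.min(axis=1) < tol).mean()

# Toy check: a pure translation should give perfect repeatability.
rng = np.random.default_rng(0)
kps = rng.uniform(0, 100, (50, 2))
H = np.array([[1, 0, 5], [0, 1, -3], [0, 0, 1]], float)
print(repeatability(kps, kps + np.array([5, -3]), H))   # -> 1.0
```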
Direct visuomotor mapping for fast visually-evoked arm movements.
Reynolds, Raymond F; Day, Brian L
2012-12-01
In contrast to conventional reaction time (RT) tasks, saccadic RTs to visual targets are very fast and unaffected by the number of possible targets. This can be explained by the subcortical circuitry underlying eye movements, which involves direct mapping between retinal input and motor output in the superior colliculus. Here we asked whether the choice invariance established for the eyes also applies to a special class of fast visuomotor responses of the upper limb. Using a target-pointing paradigm, we observed very fast reaction times (<150 ms) which were completely unaffected as the number of possible target choices was increased from 1 to 4. When we introduced a condition of altered stimulus-response mapping, RT went up and a cost of choice was observed. These results can be explained by direct mapping between visual input and motor output, compatible with a subcortical pathway for visual control of the upper limb. Copyright © 2012 Elsevier Ltd. All rights reserved.
The Invar tensor package: Differential invariants of Riemann
NASA Astrophysics Data System (ADS)
Martín-García, J. M.; Yllanes, D.; Portugal, R.
2008-10-01
The long standing problem of the relations among the scalar invariants of the Riemann tensor is computationally solved for all 6·10^5 objects with up to 12 derivatives of the metric. This covers cases ranging from products of up to 6 undifferentiated Riemann tensors to cases with up to 10 covariant derivatives of a single Riemann. We extend our computer algebra system Invar to produce within seconds a canonical form for any of those objects in terms of a basis. The process is as follows: (1) an invariant is converted in real time into a canonical form with respect to the permutation symmetries of the Riemann tensor; (2) Invar reads a database of more than 6·10^5 relations and applies those coming from the cyclic symmetry of the Riemann tensor; (3) then applies the relations coming from the Bianchi identity, (4) the relations coming from commutations of covariant derivatives, (5) the dimensionally-dependent identities for dimension 4, and finally (6) simplifies invariants that can be expressed as a product of dual invariants. Invar runs on top of the tensor computer algebra systems xTensor (for Mathematica) and Canon (for Maple).
Program summary
Program title: Invar Tensor Package v2.0
Catalogue identifier: ADZK_v2_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADZK_v2_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 3 243 249
No. of bytes in distributed program, including test data, etc.: 939
Distribution format: tar.gz
Programming language: Mathematica and Maple
Computer: Any computer running Mathematica versions 5.0 to 6.0 or Maple versions 9 and 11
Operating system: Linux, Unix, Windows XP, MacOS
RAM: 100 Mb
Word size: 64 or 32 bits
Supplementary material: The new database of relations is much larger than that for the previous version and therefore has not been included in the distribution. To obtain the Mathematica and Maple database files click on this link.
Classification: 1.5, 5
Does the new version supersede the previous version?: Yes. The previous version (1.0) only handled algebraic invariants. The current version (2.0) has been extended to cover differential invariants as well.
Nature of problem: Manipulation and simplification of scalar polynomial expressions formed from the Riemann tensor and its covariant derivatives.
Solution method: Algorithms of computational group theory to simplify expressions with tensors that obey permutation symmetries. Tables of syzygies of the scalar invariants of the Riemann tensor.
Reasons for new version: With this new version, the user can manipulate differential invariants of the Riemann tensor. Differential invariants are required in many physical problems in classical and quantum gravity.
Summary of revisions: The database of syzygies has been expanded by a factor of 30. New commands were added in order to deal with the enlarged database and to manipulate the covariant derivative.
Restrictions: The present version only handles scalars, and not expressions with free indices.
Additional comments: The distribution file for this program is over 53 Mbytes and therefore is not delivered directly when download or Email is requested. Instead a html file giving details of how the program can be obtained is sent.
Running time: One second to fully reduce any monomial of the Riemann tensor up to degree 7 or order 10 in terms of independent invariants. The Mathematica notebook included in the distribution takes approximately 5 minutes to run.
ERIC Educational Resources Information Center
Rader, Nancy de Villiers; Zukow-Goldring, Patricia
2012-01-01
How do young infants discover word meanings? We have theorized that caregivers educate infants' attention (cf. Gibson, J.J., 1966) by synchronizing the saying of a word with a dynamic gesture displaying the object/referent (Zukow-Goldring, 1997). Detecting an amodal invariant across gesture and speech brackets the word and object within the…
DiCarlo, James J.; Zecchina, Riccardo; Zoccolan, Davide
2013-01-01
The anterior inferotemporal cortex (IT) is the highest stage along the hierarchy of visual areas that, in primates, processes visual objects. Although several lines of evidence suggest that IT primarily represents visual shape information, some recent studies have argued that neuronal ensembles in IT code the semantic membership of visual objects (i.e., represent conceptual classes such as animate and inanimate objects). In this study, we investigated to what extent semantic, rather than purely visual information, is represented in IT by performing a multivariate analysis of IT responses to a set of visual objects. By relying on a variety of machine-learning approaches (including a cutting-edge clustering algorithm that has been recently developed in the domain of statistical physics), we found that, in most instances, IT representation of visual objects is accounted for by their similarity at the level of shape or, more surprisingly, low-level visual properties. Only in a few cases did we observe IT representations of semantic classes that were not explainable by the visual similarity of their members. Overall, these findings reassert the primary function of IT as a conveyor of explicit visual shape information, and reveal that low-level visual properties are represented in IT to a greater extent than previously appreciated. In addition, our work demonstrates how combining a variety of state-of-the-art multivariate approaches, and carefully estimating the contribution of shape similarity to the representation of object categories, can substantially advance our understanding of neuronal coding of visual objects in cortex. PMID:23950700
Decoding information about dynamically occluded objects in visual cortex
Erlikhman, Gennady; Caplovitz, Gideon P.
2016-01-01
During dynamic occlusion, an object passes behind an occluding surface and then later reappears. Even when completely occluded from view, such objects are experienced as continuing to exist or persist behind the occluder, even though they are no longer visible. The contents and neural basis of this persistent representation remain poorly understood. Questions remain as to whether there is information maintained about the object itself (i.e. its shape or identity) or, non-object-specific information such as its position or velocity as it is tracked behind an occluder as well as which areas of visual cortex represent such information. Recent studies have found that early visual cortex is activated by “invisible” objects during visual imagery and by unstimulated regions along the path of apparent motion, suggesting that some properties of dynamically occluded objects may also be neurally represented in early visual cortex. We applied functional magnetic resonance imaging in human subjects to examine the representation of information within visual cortex during dynamic occlusion. For gradually occluded, but not for instantly disappearing objects, there was an increase in activity in early visual cortex (V1, V2, and V3). This activity was spatially-specific, corresponding to the occluded location in the visual field. However, the activity did not encode enough information about object identity to discriminate between different kinds of occluded objects (circles vs. stars) using MVPA. In contrast, object identity could be decoded in spatially-specific subregions of higher-order, topographically organized areas such as ventral, lateral, and temporal occipital areas (VO, LO, and TO) as well as the functionally defined LOC and hMT+. These results suggest that early visual cortex may represent the dynamically occluded object’s position or motion path, while later visual areas represent object-specific information. PMID:27663987
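A bare-bones MVPA decoding sketch of the kind used here, with synthetic stand-ins for ROI voxel patterns (condition names and effect size are invented):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy stand-in for ROI voxel patterns: n_trials x n_voxels per condition
# (circles vs. stars), with a weak fixed signal added to the "star" trials.
rng = np.random.default_rng(0)
n, voxels = 40, 200
circles = rng.standard_normal((n, voxels))
stars = rng.standard_normal((n, voxels)) + 0.3 * rng.standard_normal(voxels)

X = np.vstack([circles, stars])
y = np.array([0] * n + [1] * n)

# Cross-validated decoding accuracy; above-chance accuracy in an ROI is taken
# as evidence that it carries object-identity information.
acc = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
print(f"decoding accuracy: {acc:.2f} (chance = 0.50)")
```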
Infants' prospective control during object manipulation in an uncertain environment.
Gottwald, Janna M; Gredebäck, Gustaf
2015-08-01
This study investigates how infants use visual and sensorimotor information to prospectively control their actions. We gave 14-month-olds two objects of different weight and observed how high they were lifted, using a Qualisys Motion Capture System. In one condition, the two objects were visually distinct (different color condition); in another, they were visually identical (same color condition). Lifting amplitudes of the first movement unit were analyzed in order to assess prospective control. Results demonstrate that infants lifted a light object higher than a heavy object, especially when vision could be used to assess weight (different color condition). When being confronted with two visually identical objects of different weight (same color condition), infants showed a different lifting pattern than what could be observed in the different color condition, expressed by a significant interaction effect between object weight and color condition on lifting amplitude. These results indicate that (a) visual information about object weight can be used to prospectively control lifting actions and that (b) infants are able to prospectively control their lifting actions even without visual information about object weight. We argue that infants, in the absence of reliable visual information about object weight, heighten their dependence on non-visual information (tactile, sensorimotor memory) in order to estimate weight and pre-adjust their lifting actions in a prospective manner.
Image remapping strategies applied as prostheses for the visually impaired
NASA Technical Reports Server (NTRS)
Johnson, Curtis D.
1993-01-01
Maculopathy and retinitis pigmentosa (rp) are two vision defects which render the afflicted person with impaired ability to read and recognize visual patterns. For some time there has been interest and work on the use of image remapping techniques to provide a visual aid for individuals with these impairments. The basic concept is to remap an image according to some mathematical transformation such that the image is warped around a maculopathic defect (scotoma) or within the rp foveal region of retinal sensitivity. NASA/JSC has been pursuing this research using angle invariant transformations with testing of the resulting remapping using subjects and facilities of the University of Houston, College of Optometry. Testing is facilitated by use of a hardware device, the Programmable Remapper, to provide the remapping of video images. This report presents the results of studies of alternative remapping transformations with the objective of improving subject reading rates and pattern recognition. In particular a form of conformal transformation was developed which provides for a smooth warping of an image around a scotoma. In such a case it is shown that distortion of characters and lines of characters is minimized which should lead to enhanced character recognition. In addition studies were made of alternative transformations which, although not conformal, provide for similar low character distortion remapping. A second, non-conformal transformation was studied for remapping of images to aid rp impairments. In this case a transformation was investigated which allows remapping of a vision field into a circular area representing the foveal retina region. The size and spatial representation of the image are selectable. It is shown that parametric adjustments allow for a wide variation of how a visual field is presented to the sensitive retina. This study also presents some preliminary considerations of how a prosthetic device could be implemented in a practical sense, vis-a-vis, size, weight and portability.
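A toy remapping in the spirit of the transformations studied, not the Programmable Remapper itself: radial coordinates are compressed so that content that would fall under a circular scotoma is pushed out to its rim (center, radius, and interpolation are illustrative):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def remap_around_scotoma(img, center, radius):
    """Warp image content out of a circular scotoma: the annulus from the
    scotoma rim to the image edge samples the full radial range of the source."""
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    dy, dx = yy - center[0], xx - center[1]
    r = np.hypot(dy, dx)
    r_max = r.max()
    # Output pixel at radius r samples the source at radius r_src, so that
    # radii [radius, r_max] cover the full source range [0, r_max].
    r_src = np.clip((r - radius) / (r_max - radius), 0, 1) * r_max
    scale = np.where(r > 0, r_src / np.maximum(r, 1e-9), 0.0)
    coords = [center[0] + dy * scale, center[1] + dx * scale]
    return map_coordinates(img, coords, order=1)

img = np.random.rand(128, 128)
out = remap_around_scotoma(img, center=(64, 64), radius=20)
print(out.shape)
```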
Erlikhman, Gennady; Gurariy, Gennadiy; Mruczek, Ryan E.B.; Caplovitz, Gideon P.
2016-01-01
Oftentimes, objects are only partially and transiently visible as parts of them become occluded during observer or object motion. The visual system can integrate such object fragments across space and time into perceptual wholes or spatiotemporal objects. This integrative and dynamic process may involve both ventral and dorsal visual processing pathways, along which shape and spatial representations are thought to arise. We measured fMRI BOLD response to spatiotemporal objects and used multi-voxel pattern analysis (MVPA) to decode shape information across 20 topographic regions of visual cortex. Object identity could be decoded throughout visual cortex, including intermediate (V3A, V3B, hV4, LO1-2) and dorsal (TO1-2 and IPS0-1) visual areas. Shape-specific information, therefore, may not be limited to early and ventral visual areas, particularly when it is dynamic and must be integrated. Contrary to the classic view that the representation of objects is the purview of the ventral stream, intermediate and dorsal areas may play a distinct and critical role in the construction of object representations across space and time. PMID:27033688
Contini, Erika W; Wardle, Susan G; Carlson, Thomas A
2017-10-01
Visual object recognition is a complex, dynamic process. Multivariate pattern analysis methods, such as decoding, have begun to reveal how the brain processes complex visual information. Recently, temporal decoding methods for EEG and MEG have offered the potential to evaluate the temporal dynamics of object recognition. Here we review the contribution of M/EEG time-series decoding methods to understanding visual object recognition in the human brain. Consistent with the current understanding of the visual processing hierarchy, low-level visual features dominate decodable object representations early in the time-course, with more abstract representations related to object category emerging later. A key finding is that the time-course of object processing is highly dynamic and rapidly evolving, with limited temporal generalisation of decodable information. Several studies have examined the emergence of object category structure, and we consider to what degree category decoding can be explained by sensitivity to low-level visual features. Finally, we evaluate recent work attempting to link human behaviour to the neural time-course of object processing. Copyright © 2017 Elsevier Ltd. All rights reserved.
Changing viewer perspectives reveals constraints to implicit visual statistical learning.
Jiang, Yuhong V; Swallow, Khena M
2014-10-07
Statistical learning, that is, learning environmental regularities to guide behavior, likely plays an important role in natural human behavior. One potential use is in search for valuable items. Because visual statistical learning can be acquired quickly and without intention or awareness, it could optimize search and thereby conserve energy. For this to be true, however, visual statistical learning needs to be viewpoint invariant, facilitating search even when people walk around. To test whether implicit visual statistical learning of spatial information is viewpoint independent, we asked participants to perform a visual search task from variable locations around a monitor placed flat on a stand. Unbeknownst to participants, the target was more often in some locations than others. In contrast to previous research on stationary observers, visual statistical learning failed to produce a search advantage for targets in high-probability regions that were stable within the environment but variable relative to the viewer. This failure was observed even when conditions for spatial updating were optimized. However, learning was successful when the rich locations were referenced relative to the viewer. We conclude that changing viewer perspective disrupts implicit learning of the target's location probability. This form of learning shows limited integration with spatial updating or spatiotopic representations. © 2014 ARVO.
Sarlegna, Fabrice R; Baud-Bovy, Gabriel; Danion, Frédéric
2010-08-01
When we manipulate an object, grip force is adjusted in anticipation of the mechanical consequences of hand motion (i.e., load force) to prevent the object from slipping. This predictive behavior is assumed to rely on an internal representation of the object dynamic properties, which would be elaborated via visual information before the object is grasped and via somatosensory feedback once the object is grasped. Here we examined this view by investigating the effect of delayed visual feedback during dextrous object manipulation. Adult participants manually tracked a sinusoidal target by oscillating a handheld object whose current position was displayed as a cursor on a screen along with the visual target. A delay was introduced between actual object displacement and cursor motion. This delay was linearly increased (from 0 to 300 ms) and decreased within 2-min trials. As previously reported, delayed visual feedback altered performance in manual tracking. Importantly, although the physical properties of the object remained unchanged, delayed visual feedback altered the timing of grip force relative to load force by about 50 ms. Additional experiments showed that this effect was not due to task complexity nor to manual tracking. A model inspired by the behavior of mass-spring systems suggests that delayed visual feedback may have biased the representation of object dynamics. Overall, our findings support the idea that visual feedback of object motion can influence the predictive control of grip force even when the object is grasped.
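A sketch of how such a grip-load timing shift can be quantified, using cross-correlation between normalized force traces; a negative lag means grip leads load, i.e., anticipatory control (signal parameters are illustrative):

```python
import numpy as np

def lag_ms(grip, load, dt_ms):
    """Lag (ms) at which grip force best correlates with load force;
    negative values mean grip leads load (anticipatory control)."""
    g = (grip - grip.mean()) / grip.std()
    l = (load - load.mean()) / load.std()
    xc = np.correlate(g, l, mode="full")
    return (np.argmax(xc) - (len(load) - 1)) * dt_ms

# Toy 1 Hz oscillation sampled at 1 kHz, with grip leading load by 50 ms.
t = np.arange(0, 4, 0.001)
load = np.sin(2 * np.pi * 1.0 * t)
grip = np.sin(2 * np.pi * 1.0 * (t + 0.050)) + 0.05 * np.random.randn(len(t))
print(lag_ms(grip, load, dt_ms=1.0))   # approximately -50
```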
Overview of EVE - the event visualization environment of ROOT
NASA Astrophysics Data System (ADS)
Tadel, Matevž
2010-04-01
EVE is a high-level visualization library using ROOT's data-processing, GUI and OpenGL interfaces. It is designed as a framework for object management offering hierarchical data organization, object interaction and visualization via GUI and OpenGL representations. Automatic creation of 2D projected views is also supported. On the other hand, it can serve as an event visualization toolkit satisfying most HEP requirements: visualization of geometry, simulated and reconstructed data such as hits, clusters, tracks and calorimeter information. Special classes are available for visualization of raw-data. The object-interaction layer allows for easy selection and highlighting of objects and their derived representations (projections) across several views (3D, Rho-Z, R-Phi). Object-specific tooltips are provided in both GUI and GL views. The visual-configuration layer of EVE is built around a database of template objects that can be applied to specific instances of visualization objects to ensure consistent object presentation. The database can be retrieved from a file, edited during the framework operation and stored to file. The EVE prototype was developed within the ALICE collaboration and has been included into ROOT in December 2007. Since then all EVE components have reached maturity. EVE is used as the base of the AliEve visualization framework in ALICE, the Fireworks physics-oriented event display in CMS, and as the visualization engine of FairRoot in FAIR.
Rotation And Scale Invariant Object Recognition Using A Distributed Associative Memory
NASA Astrophysics Data System (ADS)
Wechsler, Harry; Zimmerman, George Lee
1988-04-01
This paper describes an approach to 2-dimensional object recognition. Complex-log conformal mapping is combined with a distributed associative memory to create a system which recognizes objects regardless of changes in rotation or scale. Recalled information from the memorized database is used to classify an object, reconstruct the memorized version of the object, and estimate the magnitude of changes in scale or rotation. The system response is resistant to moderate amounts of noise and occlusion. Several experiments, using real, gray scale images, are presented to show the feasibility of our approach.
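A minimal distributed (correlation-matrix) associative memory of the kind referred to above, storing label-feature outer products and recalling by a matrix product; it assumes the input features have already been made invariant, e.g., by the complex-log mapping:

```python
import numpy as np

# Store invariant feature vectors s_i with one-hot labels t_i as the sum of
# outer products M = sum_i t_i s_i^T; recall is a single matrix multiply.
rng = np.random.default_rng(0)
n_feat, n_items = 256, 5
S = rng.standard_normal((n_items, n_feat))
S /= np.linalg.norm(S, axis=1, keepdims=True)      # unit-norm stored patterns
T = np.eye(n_items)                                # one-hot class labels

M = T.T @ S                                        # distributed memory matrix

probe = S[2] + 0.3 * rng.standard_normal(n_feat)   # noisy / occluded query
recall = M @ probe                                 # cross-talk stays small
print("recalled class:", recall.argmax())          # -> 2
```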
Monaco, Simona; Gallivan, Jason P; Figley, Teresa D; Singhal, Anthony; Culham, Jody C
2017-11-29
The role of the early visual cortex and higher-order occipitotemporal cortex has been studied extensively for visual recognition and to a lesser degree for haptic recognition and visually guided actions. Using a slow event-related fMRI experiment, we investigated whether tactile and visual exploration of objects recruit the same "visual" areas (and in the case of visual cortex, the same retinotopic zones) and if these areas show reactivation during delayed actions in the dark toward haptically explored objects (and if so, whether this reactivation might be due to imagery). We examined activation during visual or haptic exploration of objects and action execution (grasping or reaching) separated by an 18 s delay. Twenty-nine human volunteers (13 females) participated in this study. Participants had their eyes open and fixated on a point in the dark. The objects were placed below the fixation point and accordingly visual exploration activated the cuneus, which processes retinotopic locations in the lower visual field. Strikingly, the occipital pole (OP), representing foveal locations, showed higher activation for tactile than visual exploration, although the stimulus was unseen and location in the visual field was peripheral. Moreover, the lateral occipital tactile-visual area (LOtv) showed comparable activation for tactile and visual exploration. Psychophysiological interaction analysis indicated that the OP showed stronger functional connectivity with anterior intraparietal sulcus and LOtv during the haptic than visual exploration of shapes in the dark. After the delay, the cuneus, OP, and LOtv showed reactivation that was independent of the sensory modality used to explore the object. These results show that haptic actions not only activate "visual" areas during object touch, but also that this information appears to be used in guiding grasping actions toward targets after a delay. SIGNIFICANCE STATEMENT Visual presentation of an object activates shape-processing areas and retinotopic locations in early visual areas. Moreover, if the object is grasped in the dark after a delay, these areas show "reactivation." Here, we show that these areas are also activated and reactivated for haptic object exploration and haptically guided grasping. Touch-related activity occurs not only in the retinotopic location of the visual stimulus, but also at the occipital pole (OP), corresponding to the foveal representation, even though the stimulus was unseen and located peripherally. That is, the same "visual" regions are implicated in both visual and haptic exploration; however, touch also recruits high-acuity central representation within early visual areas during both haptic exploration of objects and subsequent actions toward them. Functional connectivity analysis shows that the OP is more strongly connected with ventral and dorsal stream areas when participants explore an object in the dark than when they view it. Copyright © 2017 the authors 0270-6474/17/3711572-20$15.00/0.
The representational dynamics of remembered projectile locations.
De Sá Teixeira, Nuno Alexandre; Hecht, Heiko; Oliveira, Armando Mónica
2013-12-01
When people are instructed to locate the vanishing location of a moving target, systematic errors forward in the direction of motion (M-displacement) and downward in the direction of gravity (O-displacement) are found. These phenomena came to be linked with the notion that physical invariants are embedded in the dynamic representations generated by the perceptual system. We explore the nature of these invariants that determine the representational mechanics of projectiles. By manipulating the retention intervals between the target's disappearance and the participant's responses, while measuring both M- and O-displacements, we were able to uncover a representational analogue of the trajectory of a projectile. The outcomes of three experiments revealed that the shape of this trajectory is discontinuous. Although the horizontal component of such trajectory can be accounted for by perceptual and oculomotor factors, its vertical component cannot. Taken together, the outcomes support an internalization of gravity in the visual representation of projectiles.
Bag of Lines (BoL) for Improved Aerial Scene Representation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sridharan, Harini; Cheriyadat, Anil M.
2014-09-22
Feature representation is a key step in automated visual content interpretation. In this letter, we present a robust feature representation technique, referred to as bag of lines (BoL), for high-resolution aerial scenes. The proposed technique involves extracting and compactly representing low-level line primitives from the scene. The compact scene representation is generated by counting the different types of lines representing various linear structures in the scene. Through extensive experiments, we show that the proposed scene representation is invariant to scale changes and scene conditions and can discriminate urban scene categories accurately. We compare the BoL representation with the popular scale-invariant feature transform (SIFT) and Gabor wavelets for their classification and clustering performance on an aerial scene database consisting of images acquired by sensors with different spatial resolutions. The proposed BoL representation outperforms the SIFT- and Gabor-based representations.
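A toy sketch of the counting idea is given below: bin detected line segments by orientation and length, then normalize the histogram. The paper's actual line typing and primitives may differ, and all thresholds here are placeholders.

import numpy as np
import cv2

def bag_of_lines(gray, n_orient=8, n_len=4, max_len=200.0):
    # gray: 8-bit aerial image; returns a normalized line-type histogram
    edges = cv2.Canny(gray, 50, 150)
    segs = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=40,
                           minLineLength=20, maxLineGap=3)
    hist = np.zeros((n_orient, n_len))
    if segs is not None:
        for x1, y1, x2, y2 in segs[:, 0, :]:
            ang = np.arctan2(y2 - y1, x2 - x1) % np.pi   # undirected orientation
            length = np.hypot(x2 - x1, y2 - y1)
            o = min(int(ang / np.pi * n_orient), n_orient - 1)
            b = min(int(length / max_len * n_len), n_len - 1)
            hist[o, b] += 1
    total = hist.sum()
    return (hist / total).ravel() if total else hist.ravel()   # count-normalised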
Morphing continuum theory for turbulence: Theory, computation, and visualization.
Chen, James
2017-10-01
A high-order morphing continuum theory (MCT) is introduced to model highly compressible turbulence. The theory is formulated under the rigorous framework of rational continuum mechanics. A set of linear constitutive equations and balance laws is deduced from the Coleman-Noll procedure and Onsager's reciprocal relations. The governing equations are then arranged in conservation form and solved through the finite volume method with a second-order Lax-Friedrichs scheme for shock preservation. A numerical example of transonic flow over a three-dimensional bump is presented using MCT and the finite volume method. The comparison shows that MCT-based direct numerical simulation (DNS) provides a better prediction than Navier-Stokes (NS)-based DNS, with less than 10% of the mesh count, when compared with experiments. An MCT-based, frame-indifferent Q criterion is also derived to show the coherent eddy structure of the downstream turbulence in the numerical example. It should be emphasized that, unlike the NS-based Q criterion, the MCT-based Q criterion is objective and not limited to Galilean invariance.
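For context, the classical NS-based Q criterion that the paper contrasts with its objective MCT-based version is the standard strain/rotation decomposition (a textbook identity, not a result of the paper):

Q = \frac{1}{2}\left(\lVert \boldsymbol{\Omega} \rVert^{2} - \lVert \mathbf{S} \rVert^{2}\right),
\qquad
\mathbf{S} = \frac{1}{2}\left(\nabla \mathbf{u} + \nabla \mathbf{u}^{\mathsf{T}}\right),
\qquad
\boldsymbol{\Omega} = \frac{1}{2}\left(\nabla \mathbf{u} - \nabla \mathbf{u}^{\mathsf{T}}\right),

with vortical (coherent-eddy) regions identified where Q > 0, i.e. where rotation dominates strain. Because the velocity gradient transforms inhomogeneously under non-Galilean frame changes, this classical form is only Galilean-invariant; the MCT-based objective form is not reproduced here.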
Bodala, Indu P; Abbasi, Nida I; Yu Sun; Bezerianos, Anastasios; Al-Nashash, Hasan; Thakor, Nitish V
2017-07-01
Eye tracking offers a practical solution for monitoring cognitive performance in real-world tasks. However, eye tracking in dynamic environments is difficult due to the high spatial and temporal variation of stimuli, and needs further, thorough investigation. In this paper, we study the possibility of developing a novel computer-vision-assisted eye tracking analysis using fixations. Eye movement data were obtained from a long-duration naturalistic driving experiment. The scale-invariant feature transform (SIFT) algorithm was implemented using the VLFeat toolbox to identify multiple areas of interest (AOIs). A new measure called 'fixation score' was defined to capture the dynamics of fixation position between the target AOI and the non-target AOIs. The fixation score is maximal when subjects focus on the target AOI and diminishes when they gaze at the non-target AOIs. A statistically significant negative correlation was found between fixation score and reaction time data (r = -0.2253, p < 0.05). This implies that with vigilance decrement, the fixation score decreases as visual attention shifts away from the target objects, resulting in an increase in reaction time.
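The abstract does not give the formula for the fixation score, so the following is only one plausible toy formalisation, maximal inside the target AOI and decaying with distance from it; the AOI format and the decay constant sigma are assumptions.

import numpy as np

def fixation_score(fix_xy, target_aoi, sigma=50.0):
    # target_aoi = (x_min, y_min, x_max, y_max) in pixels; returns a value
    # in (0, 1], equal to 1.0 when the fixation lands inside the target AOI
    x, y = fix_xy
    x0, y0, x1, y1 = target_aoi
    if x0 <= x <= x1 and y0 <= y <= y1:
        return 1.0
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    return float(np.exp(-np.hypot(x - cx, y - cy) / sigma))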
Portable real-time color night vision
NASA Astrophysics Data System (ADS)
Toet, Alexander; Hogervorst, Maarten A.
2008-03-01
We developed a simple and fast lookup-table based method to derive and apply natural daylight colors to multi-band night-time images. The method deploys an optimal color transformation derived from a set of samples taken from a daytime color reference image. The colors in the resulting colorized multiband night-time images closely resemble the colors in the daytime color reference image. Also, object colors remain invariant under panning operations and are independent of the scene content. Here we describe the implementation of this method in two prototype portable dual band realtime night vision systems. One system provides co-aligned visual and near-infrared bands of two image intensifiers, the other provides co-aligned images from a digital image intensifier and an uncooled longwave infrared microbolometer. The co-aligned images from both systems are further processed by a notebook computer. The color mapping is implemented as a realtime lookup table transform. The resulting colorised video streams can be displayed in realtime on head mounted displays and stored on the hard disk of the notebook computer. Preliminary field trials demonstrate the potential of these systems for applications like surveillance, navigation and target detection.
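A minimal sketch of the sample-based lookup idea is shown below: learn a table from (band1, band2) intensities to the mean daytime RGB of a co-registered reference scene, then colorize by table lookup. The bin count and mean-colour pooling are assumptions, not the authors' exact optimal transform.

import numpy as np

def build_color_lut(band1, band2, rgb_ref, bins=32):
    # band1, band2: co-registered 8-bit training bands; rgb_ref: daytime colors
    i = np.clip(band1.astype(int) * bins // 256, 0, bins - 1)
    j = np.clip(band2.astype(int) * bins // 256, 0, bins - 1)
    acc = np.zeros((bins, bins, 3))
    cnt = np.zeros((bins, bins, 1))
    np.add.at(acc, (i, j), rgb_ref.astype(float))
    np.add.at(cnt, (i, j), 1.0)
    return acc / np.maximum(cnt, 1.0)          # mean reference color per bin

def colorize(band1, band2, lut):
    bins = lut.shape[0]
    i = np.clip(band1.astype(int) * bins // 256, 0, bins - 1)
    j = np.clip(band2.astype(int) * bins // 256, 0, bins - 1)
    return lut[i, j]                           # constant-time per-pixel lookup

Because the table depends only on the two band intensities, object colors stay constant under panning and are independent of scene content, as the abstract notes.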
Tafazoli, Sina; Safaai, Houman; De Franceschi, Gioia; Rosselli, Federica Bianca; Vanzella, Walter; Riggi, Margherita; Buffolo, Federica; Panzeri, Stefano; Zoccolan, Davide
2017-01-01
Rodents are emerging as increasingly popular models of visual functions. Yet, evidence that rodent visual cortex is capable of advanced visual processing, such as object recognition, is limited. Here we investigate how neurons located along the progression of extrastriate areas that, in the rat brain, run laterally to primary visual cortex, encode object information. We found a progressive functional specialization of neural responses along these areas, with: (1) a sharp reduction of the amount of low-level, energy-related visual information encoded by neuronal firing; and (2) a substantial increase in the ability of both single neurons and neuronal populations to support discrimination of visual objects under identity-preserving transformations (e.g., position and size changes). These findings strongly argue for the existence of a rat object-processing pathway, and point to the rodents as promising models to dissect the neuronal circuitry underlying transformation-tolerant recognition of visual objects. DOI: http://dx.doi.org/10.7554/eLife.22794.001 PMID:28395730
Wen, Haiguang; Shi, Junxing; Chen, Wei; Liu, Zhongming
2018-02-28
The brain represents visual objects with topographic cortical patterns. To address how distributed visual representations enable object categorization, we established predictive encoding models based on a deep residual network, and trained them to predict cortical responses to natural movies. Using this predictive model, we mapped human cortical representations to 64,000 visual objects from 80 categories with high throughput and accuracy. Such representations covered both the ventral and dorsal pathways, reflected multiple levels of object features, and preserved semantic relationships between categories. In the entire visual cortex, object representations were organized into three clusters of categories: biological objects, non-biological objects, and background scenes. In a finer scale specific to each cluster, object representations revealed sub-clusters for further categorization. Such hierarchical clustering of category representations was mostly contributed by cortical representations of object features from middle to high levels. In summary, this study demonstrates a useful computational strategy to characterize the cortical organization and representations of visual features for rapid categorization.
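The encoding-model step, predicting each voxel's response from deep-network features of the stimulus, is commonly fit as a regularized linear regression. A generic ridge-regression stand-in (not the paper's exact training procedure):

import numpy as np

def fit_encoding_model(F, Y, lam=1.0):
    # F: (time, features) deep-network activations for the movie frames
    # Y: (time, voxels) measured cortical responses
    d = F.shape[1]
    W = np.linalg.solve(F.T @ F + lam * np.eye(d), F.T @ Y)
    return W            # predict new responses as F_new @ W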
A Novel Robot Visual Homing Method Based on SIFT Features
Zhu, Qidan; Liu, Chuanjia; Cai, Chengtao
2015-01-01
Warping is an effective visual homing method for robot local navigation. However, the performance of the warping method can be greatly influenced by the changes of the environment in a real scene, thus resulting in lower accuracy. In order to solve the above problem and to get higher homing precision, a novel robot visual homing algorithm is proposed by combining SIFT (scale-invariant feature transform) features with the warping method. The algorithm is novel in using SIFT features as landmarks instead of the pixels in the horizon region of the panoramic image. In addition, to further improve the matching accuracy of landmarks in the homing algorithm, a novel mismatching elimination algorithm, based on the distribution characteristics of landmarks in the catadioptric panoramic image, is proposed. Experiments on image databases and on a real scene confirm the effectiveness of the proposed method. PMID:26473880
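For comparison with the paper's panoramic-geometry-based mismatch elimination, the standard baseline for rejecting bad SIFT correspondences is Lowe's ratio test, sketched here; the descriptor arrays and the 0.75 threshold are conventional choices, not taken from the paper.

import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.75):
    # keep a match only if its best distance clearly beats the second best;
    # assumes desc_b holds at least two descriptors
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        j, j2 = np.argsort(dists)[:2]
        if dists[j] < ratio * dists[j2]:
            matches.append((i, int(j)))
    return matches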
Lee, Kai-Hui; Chiu, Pei-Ling
2013-10-01
Conventional visual cryptography (VC) suffers from a pixel-expansion problem, or an uncontrollable display quality problem for recovered images, and lacks a general approach to construct visual secret sharing schemes for general access structures. We propose a general and systematic approach to address these issues without sophisticated codebook design. This approach can be used for binary secret images in non-computer-aided decryption environments. To avoid pixel expansion, we design a set of column vectors to encrypt secret pixels rather than using the conventional VC-based approach. We begin by formulating a mathematical model for the VC construction problem to find the column vectors for the optimal VC construction, after which we develop a simulated-annealing-based algorithm to solve the problem. The experimental results show that the display quality of the recovered image is superior to that of previous approaches.
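For intuition about size-invariant (non-expanded) sharing, a textbook probabilistic (2,2) construction is sketched below; the paper's column-vector formulation and simulated-annealing optimization generalize well beyond this baseline.

import numpy as np

def prob_vc_shares(secret):
    # secret: 2-D array with 1 = black, 0 = white; no pixel expansion
    r = np.random.randint(0, 2, secret.shape)   # one random bit per pixel
    share1 = r
    share2 = np.where(secret == 1, 1 - r, r)    # complement bits only for black
    return share1, share2

# stacking transparencies = pixelwise OR: black secret pixels recover as
# always-black, white ones as black only half the time, giving contrast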
Research on flight stability performance of rotor aircraft based on visual servo control method
NASA Astrophysics Data System (ADS)
Yu, Yanan; Chen, Jing
2016-11-01
A control method based on visual servo feedback is proposed to improve the attitude of a quad-rotor aircraft and enhance its flight stability. Ground target images are obtained by a visual platform fixed on the aircraft. The scale-invariant feature transform (SIFT) algorithm is used to extract image feature information. Based on the image feature analysis, fast motion estimation is completed and used as an input signal to a PID flight control system to realize real-time attitude adjustment in flight. Imaging tests and simulation results show that the proposed method performs well in terms of flight stability compensation and attitude adjustment. The response speed and control precision meet the requirements of actual use, and the method is able to reduce or even eliminate the influence of environmental disturbance, so it offers a useful approach to the problem of aircraft disturbance rejection.
The characteristics of low-speed streaks in the near-wall region of a turbulent boundary layer
NASA Astrophysics Data System (ADS)
Smith, C. R.; Metzler, S. P.
1983-04-01
The discovery of an instantaneous spanwise velocity distribution consisting of alternative zones of high- and low-speed fluid which develop in the viscous sublayer and extend into the logarithmic region was one of the first clues to the existence of an ordered structure within a turbulent boundary layer. The present investigation is concerned with quantitative flow-visualization results obtained with the aid of a high-speed video flow visualization system which permits the detailed visual examination of both the statistics and characteristics of low-speed streaks over a much wider range of Reynolds numbers than has been possible before. Attention is given to streak appearance, mean streak spacing, the spanwise distribution of streaks, streak persistence, and aspects of streak merging and intermittency. The results indicate that the statistical characteristics of the spanwise spacing of low-speed streaks are essentially invariant with Reynolds number.
Resilience to the contralateral visual field bias as a window into object representations
Garcea, Frank E.; Kristensen, Stephanie; Almeida, Jorge; Mahon, Bradford Z.
2016-01-01
Viewing images of manipulable objects elicits differential blood oxygen level-dependent (BOLD) contrast across parietal and dorsal occipital areas of the human brain that support object-directed reaching, grasping, and complex object manipulation. However, it is unknown which object-selective regions of parietal cortex receive their principal inputs from the ventral object-processing pathway and which receive their inputs from the dorsal object-processing pathway. Parietal areas that receive their inputs from the ventral visual pathway, rather than from the dorsal stream, will have inputs that are already filtered through object categorization and identification processes. This predicts that parietal regions that receive inputs from the ventral visual pathway should exhibit object-selective responses that are resilient to contralateral visual field biases. To test this hypothesis, adult participants viewed images of tools and animals that were presented to the left or right visual fields during functional magnetic resonance imaging (fMRI). We found that the left inferior parietal lobule showed robust tool preferences independently of the visual field in which tool stimuli were presented. In contrast, a region in posterior parietal/dorsal occipital cortex in the right hemisphere exhibited an interaction between visual field and category: tool-preferences were strongest contralateral to the stimulus. These findings suggest that action knowledge accessed in the left inferior parietal lobule operates over inputs that are abstracted from the visual input and contingent on analysis by the ventral visual pathway, consistent with its putative role in supporting object manipulation knowledge. PMID:27160998
Establishing Visual Category Boundaries between Objects: A PET Study
ERIC Educational Resources Information Center
Saumier, Daniel; Chertkow, Howard; Arguin, Martin; Whatmough, Cristine
2005-01-01
Individuals with Alzheimer's disease (AD) often have problems in recognizing common objects. This visual agnosia may stem from difficulties in establishing appropriate visual boundaries between visually similar objects. In support of this hypothesis, Saumier, Arguin, Chertkow, and Renfrew (2001) showed that AD subjects have difficulties in…
NASA Astrophysics Data System (ADS)
Dettwiller, L.; Lépine, T.
2017-12-01
A general and purely wave-based theory of image formation for all types of stellar interferometers, including hypertelescopes, is developed within the framework of Fresnel's paraxial approximation of diffraction. For a hypertelescope, we show that the severe lack of translation invariance leads to multiple and strong spatial-frequency heterodyning, which codes the very high frequencies detected by the hypertelescope into medium spatial frequencies and introduces a moiré-type ambiguity for extended objects. This explains mathematically the disappointingly poor resolution observed in some image simulations for hypertelescopes.
Modular properties of 6d (DELL) systems
NASA Astrophysics Data System (ADS)
Aminov, G.; Mironov, A.; Morozov, A.
2017-11-01
If super-Yang-Mills theory possesses exact conformal invariance, there is an additional modular invariance under the change of the complex bare charge τ → −1/τ. The low-energy Seiberg-Witten prepotential ℱ(a), however, is not explicitly invariant, because the flat moduli also change, a → a_D = ∂ℱ/∂a. As a result, the prepotential is not a modular form and depends also on the anomalous Eisenstein series E_2. This dependence is usually described by the universal MNW modular anomaly equation. We demonstrate that, in the 6d SU(N) theory with two independent modular parameters τ and τ̂, the modular anomaly equation changes, because the modular transform of τ is accompanied by an (N-dependent!) shift of τ̂ and vice versa. This is a new peculiarity of double-elliptic systems, which deserves further investigation.
Texture zeros and hierarchical masses from flavour (mis)alignment
NASA Astrophysics Data System (ADS)
Hollik, W. G.; Saldana-Salazar, U. J.
2018-03-01
We introduce an unconventional interpretation of the fermion mass matrix elements. As the full rotational freedom of the gauge-kinetic terms renders a set of infinite bases called weak bases, basis-dependent structures as mass matrices are unphysical. Matrix invariants, on the other hand, provide a set of basis-independent objects which are of more relevance. We employ one of these invariants to give a new parametrisation of the mass matrices. By virtue of it, one gains control over its implicit implications on several mass matrix structures. The key element is the trace invariant which resembles the equation of a hypersphere with a radius equal to the Frobenius norm of the mass matrix. With the concepts of alignment or misalignment we can identify texture zeros with certain alignments whereas Froggatt-Nielsen structures in the matrix elements are governed by misalignment. This method allows further insights of traditional approaches to the underlying flavour geometry.
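Concretely, the trace invariant in question is the standard Frobenius-norm identity for a mass matrix M (generic notation, not the paper's):

\operatorname{Tr}\left(M M^{\dagger}\right) = \sum_{i,j} \lvert M_{ij} \rvert^{2} = \lVert M \rVert_{F}^{2} \equiv R^{2},

so fixing this invariant confines the matrix elements to a hypersphere of radius R equal to the Frobenius norm, on which alignments (texture zeros) and misalignments (hierarchical structures) can be parametrised.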
Storage of features, conjunctions and objects in visual working memory.
Vogel, E K; Woodman, G F; Luck, S J
2001-02-01
Working memory can be divided into separate subsystems for verbal and visual information. Although the verbal system has been well characterized, the storage capacity of visual working memory has not yet been established for simple features or for conjunctions of features. The authors demonstrate that it is possible to retain information about only 3-4 colors or orientations in visual working memory at one time. Observers are also able to retain both the color and the orientation of 3-4 objects, indicating that visual working memory stores integrated objects rather than individual features. Indeed, objects defined by a conjunction of four features can be retained in working memory just as well as single-feature objects, allowing many individual features to be retained when distributed across a small number of objects. Thus, the capacity of visual working memory must be understood in terms of integrated objects rather than individual features.
Target recognition of log-polar ladar range images using moment invariants
NASA Astrophysics Data System (ADS)
Xia, Wenze; Han, Shaokun; Cao, Jie; Yu, Haoyong
2017-01-01
The ladar range image has received considerable attention in the automatic target recognition field. However, previous research does not cover target recognition using log-polar ladar range images. Therefore, we construct a target recognition system based on log-polar ladar range images in this paper. In this system, combined moment invariants and a backpropagation neural network are selected as the shape descriptor and shape classifier, respectively. In order to fully analyze the effect of the log-polar sampling pattern on recognition results, several comparative experiments based on simulated and real range images are carried out. Several important conclusions are drawn: (i) if combined moments are computed directly on log-polar range images, the translation, rotation, and scaling invariance of the combined moments is lost; (ii) when the object is located in the center of the field of view, the recognition rate for log-polar range images is less sensitive to changes of the field of view; (iii) as the object position changes from the center to the edge of the field of view, recognition performance for log-polar range images declines dramatically; and (iv) log-polar range images have better noise robustness than Cartesian range images. Finally, we suggest that in real applications it is better to divide the field of view into a recognition area and a searching area.
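As a concrete stand-in for the moment-invariant descriptor (the paper's combined moments are a particular mixture; Hu's seven invariants shown here are the standard choice with the same translation, rotation, and scale invariance on Cartesian images):

import numpy as np
import cv2

def hu_invariants(img):
    # Hu's seven moment invariants of a grayscale range image; per the
    # paper, computing moments directly on log-polar samples breaks the
    # translation/rotation/scale invariance, so compute them on the
    # Cartesian image
    m = cv2.moments(img.astype(np.float32))
    hu = cv2.HuMoments(m).ravel()
    # log-compress the wide dynamic range before feeding a classifier
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)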
ERIC Educational Resources Information Center
Abass, Bada Tayo; Isyakka, Bello; Olaolu, Ijisakin Yemi; Olusegun, Fajuyigbe Michael
2014-01-01
The study examined the effects of two and three dimensional visual objects on learners' drawing skills in junior secondary schools in OsunState, Nigeria. It also determined students' ability to identify visual objects. Furthermore, it investigated the comparative effectiveness of two and three dimensional visual objects on drawing skills of junior…
Micro-Valences: Perceiving Affective Valence in Everyday Objects
Lebrecht, Sophie; Bar, Moshe; Barrett, Lisa Feldman; Tarr, Michael J.
2012-01-01
Perceiving the affective valence of objects influences how we think about and react to the world around us. Conversely, the speed and quality with which we visually recognize objects in a visual scene can vary dramatically depending on that scene’s affective content. Although typical visual scenes contain mostly “everyday” objects, the affect perception in visual objects has been studied using somewhat atypical stimuli with strong affective valences (e.g., guns or roses). Here we explore whether affective valence must be strong or overt to exert an effect on our visual perception. We conclude that everyday objects carry subtle affective valences – “micro-valences” – which are intrinsic to their perceptual representation. PMID:22529828
Evidence for perceptual deficits in associative visual (prosop)agnosia: a single-case study.
Delvenne, Jean François; Seron, Xavier; Coyette, Françoise; Rossion, Bruno
2004-01-01
Associative visual agnosia is classically defined as normal visual perception stripped of its meaning [Archiv für Psychiatrie und Nervenkrankheiten 21 (1890) 22/English translation: Cognitive Neuropsychol. 5 (1988) 155]: these patients cannot access their stored visual memories to categorize objects that are nonetheless perceived correctly. However, according to an influential theory of visual agnosia [Farah, Visual Agnosia: Disorders of Object Recognition and What They Tell Us about Normal Vision, MIT Press, Cambridge, MA, 1990], associative visual agnosics necessarily present perceptual deficits that are the cause of their impairment in object recognition. Here we report a detailed investigation of a patient with bilateral occipito-temporal lesions who is strongly impaired at object and face recognition. NS shows normal drawing copy and normal performance on object and face matching tasks as used in classical neuropsychological tests. However, when tested with several computer tasks using carefully controlled visual stimuli and taking both his accuracy rate and response times into account, NS was found to perform abnormally in high-level visual processing of objects and faces. Albeit presenting a different pattern of deficits than previously described in integrative agnosic patients such as HJA and LH, his deficits were characterized by an inability to integrate individual parts into a whole percept, as suggested by his failure at processing structurally impossible three-dimensional (3D) objects, an absence of face inversion effects, and an advantage at detecting and matching single parts. Taken together, these observations question the idea of separate visual representations for object/face perception and object/face knowledge derived from investigations of visual associative (prosop)agnosia, and they raise some methodological issues in the analysis of single-case studies of (prosop)agnosic patients.
A foreground object features-based stereoscopic image visual comfort assessment model
NASA Astrophysics Data System (ADS)
Jin, Xin; Jiang, G.; Ying, H.; Yu, M.; Ding, S.; Peng, Z.; Shao, F.
2014-11-01
Since stereoscopic images provide observers with both realistic and, at times, uncomfortable viewing experiences, it is necessary to investigate the determinants of visual discomfort. Considering that the foreground object draws most attention when humans observe stereoscopic images, this paper proposes a new foreground-object-based visual comfort assessment (VCA) metric. In the first place, a suitable segmentation method is applied to the disparity map, and the foreground object is ascertained as the one having the largest average disparity. In the second place, three visual features, namely the average disparity, average width, and spatial complexity of the foreground object, are computed from the perspective of visual attention. Nevertheless, an object's width and complexity do not influence the perception of visual comfort as consistently as disparity does. In accordance with this psychological phenomenon, we divide the images into four categories on the basis of disparity and width, and in the third place apply four different models to predict visual comfort more precisely. Experimental results show that the proposed VCA metric outperforms other existing metrics and can achieve a high consistency between objective and subjective visual comfort scores. The Pearson Linear Correlation Coefficient (PLCC) and Spearman Rank Order Correlation Coefficient (SROCC) are over 0.84 and 0.82, respectively.
Object-based attention underlies the rehearsal of feature binding in visual working memory.
Shen, Mowei; Huang, Xiang; Gao, Zaifeng
2015-04-01
Feature binding is a core concept in many research fields, including the study of working memory (WM). Over the past decade, it has been debated whether keeping the feature binding in visual WM consumes more visual attention than the constituent single features. Previous studies have only explored the contribution of domain-general attention or space-based attention in the binding process; no study so far has explored the role of object-based attention in retaining binding in visual WM. We hypothesized that object-based attention underlay the mechanism of rehearsing feature binding in visual WM. Therefore, during the maintenance phase of a visual WM task, we inserted a secondary mental rotation (Experiments 1-3), transparent motion (Experiment 4), or an object-based feature report task (Experiment 5) to consume the object-based attention available for binding. In line with the prediction of the object-based attention hypothesis, Experiments 1-5 revealed a more significant impairment for binding than for constituent single features. However, this selective binding impairment was not observed when inserting a space-based visual search task (Experiment 6). We conclude that object-based attention underlies the rehearsal of binding representation in visual WM. (c) 2015 APA, all rights reserved.
Altered perceptual sensitivity to kinematic invariants in Parkinson's disease.
Dayan, Eran; Inzelberg, Rivka; Flash, Tamar
2012-01-01
Ample evidence exists for coupling between action and perception in neurologically healthy individuals, yet the precise nature of the internal representations shared between these domains remains unclear. One experimentally derived view is that the invariant properties and constraints characterizing movement generation are also manifested during motion perception. One prominent motor invariant is the "two-thirds power law," describing the strong relation between the kinematics of motion and the geometrical features of the path followed by the hand during planar drawing movements. The two-thirds power law not only characterizes various movement generation tasks but also seems to constrain visual perception of motion. The present study aimed to assess whether motor invariants such as the two-thirds power law also constrain motion perception in patients with Parkinson's disease (PD). Patients with PD and age-matched controls were asked to observe the movement of a light spot rotating on an elliptical path and to modify its velocity until it appeared to move most uniformly. As in previous reports, controls tended to choose movements close to obeying the two-thirds power law as most uniform. Patients with PD displayed more variable behavior, choosing on average movements closer, but not equal, to a constant velocity. Our results thus demonstrate impairments in how the two-thirds power law constrains motion perception in patients with PD, in whom the relationship between velocity and curvature appears to be preserved but scaled down. Recent hypotheses on the role of the basal ganglia in motor timing may explain these irregularities. Alternatively, these impairments in perception of movement may reflect similar deficits in motor production.
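For reference, the law can be stated compactly (a standard formulation from the motor-control literature, with K the velocity gain factor):

A(t) = K \, C(t)^{2/3}
\qquad\Longleftrightarrow\qquad
v(t) = K \, r(t)^{1/3},

where A is the angular velocity, C the curvature of the path, v the tangential velocity, and r = 1/C the radius of curvature; the name comes from the 2/3 exponent relating angular velocity to curvature.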
Beyond sensory images: Object-based representation in the human ventral pathway
Pietrini, Pietro; Furey, Maura L.; Ricciardi, Emiliano; Gobbini, M. Ida; Wu, W.-H. Carolyn; Cohen, Leonardo; Guazzelli, Mario; Haxby, James V.
2004-01-01
We investigated whether the topographically organized, category-related patterns of neural response in the ventral visual pathway are a representation of sensory images or a more abstract representation of object form that is not dependent on sensory modality. We used functional MRI to measure patterns of response evoked during visual and tactile recognition of faces and manmade objects in sighted subjects and during tactile recognition in blind subjects. Results showed that visual and tactile recognition evoked category-related patterns of response in a ventral extrastriate visual area in the inferior temporal gyrus that were correlated across modality for manmade objects. Blind subjects also demonstrated category-related patterns of response in this “visual” area, and in more ventral cortical regions in the fusiform gyrus, indicating that these patterns are not due to visual imagery and, furthermore, that visual experience is not necessary for category-related representations to develop in these cortices. These results demonstrate that the representation of objects in the ventral visual pathway is not simply a representation of visual images but, rather, is a representation of more abstract features of object form. PMID:15064396
A Cortical Network for the Encoding of Object Change
Hindy, Nicholas C.; Solomon, Sarah H.; Altmann, Gerry T.M.; Thompson-Schill, Sharon L.
2015-01-01
Understanding events often requires recognizing unique stimuli as alternative, mutually exclusive states of the same persisting object. Using fMRI, we examined the neural mechanisms underlying the representation of object states and object-state changes. We found that subjective ratings of visual dissimilarity between a depicted object and an unseen alternative state of that object predicted the corresponding multivoxel pattern dissimilarity in early visual cortex during an imagery task, while late visual cortex patterns tracked dissimilarity among distinct objects. Early visual cortex pattern dissimilarity for object states in turn predicted the level of activation in an area of left posterior ventrolateral prefrontal cortex (pVLPFC) most responsive to conflict in a separate Stroop color-word interference task, and an area of left ventral posterior parietal cortex (vPPC) implicated in the relational binding of semantic features. We suggest that when visualizing object states, representational content instantiated across early and late visual cortex is modulated by processes in left pVLPFC and left vPPC that support selection and binding, and ultimately event comprehension. PMID:24127425
Visual-Spatial Attention Aids the Maintenance of Object Representations in Visual Working Memory
Williams, Melonie; Pouget, Pierre; Boucher, Leanne; Woodman, Geoffrey F.
2013-01-01
Theories have proposed that the maintenance of object representations in visual working memory is aided by a spatial rehearsal mechanism. In this study, we used two different approaches to test the hypothesis that overt and covert visual-spatial attention mechanisms contribute to the maintenance of object representations in visual working memory. First, we tracked observers’ eye movements while remembering a variable number of objects during change-detection tasks. We observed that during the blank retention interval, participants spontaneously shifted gaze to the locations that the objects had occupied in the memory array. Next, we hypothesized that if attention mechanisms contribute to the maintenance of object representations, then drawing attention away from the object locations during the retention interval would impair object memory during these change-detection tasks. Supporting this prediction, we found that attending to the fixation point in anticipation of a brief probe stimulus during the retention interval reduced change-detection accuracy even on the trials in which no probe occurred. These findings support models of working memory in which visual-spatial selection mechanisms contribute to the maintenance of object representations. PMID:23371773
Optimization of Visual Information Presentation for Visual Prosthesis.
Guo, Fei; Yang, Yuan; Gao, Yong
2018-01-01
Visual prosthesis applying electrical stimulation to restore visual function for the blind has promising prospects. However, due to the low resolution, limited visual field, and the low dynamic range of the visual perception, huge loss of information occurred when presenting daily scenes. The ability of object recognition in real-life scenarios is severely restricted for prosthetic users. To overcome the limitations, optimizing the visual information in the simulated prosthetic vision has been the focus of research. This paper proposes two image processing strategies based on a salient object detection technique. The two processing strategies enable the prosthetic implants to focus on the object of interest and suppress the background clutter. Psychophysical experiments show that techniques such as foreground zooming with background clutter removal and foreground edge detection with background reduction have positive impacts on the task of object recognition in simulated prosthetic vision. By using edge detection and zooming technique, the two processing strategies significantly improve the recognition accuracy of objects. We can conclude that the visual prosthesis using our proposed strategy can assist the blind to improve their ability to recognize objects. The results will provide effective solutions for the further development of visual prosthesis. PMID:29731769
Hoelzle, James B; Nelson, Nathaniel W; Smith, Clifford A
2011-03-01
Dimensional structures underlying the Wechsler Memory Scale-Fourth Edition (WMS-IV) and Wechsler Memory Scale-Third Edition (WMS-III) were compared to determine whether the revised measure has a more coherent and clinically relevant factor structure. Principal component analyses were conducted in normative samples reported in the respective technical manuals. Empirically supported procedures guided retention of dimensions. An invariant two-dimensional WMS-IV structure reflecting constructs of auditory learning/memory and visual attention/memory (C1 = .97; C2 = .96) is more theoretically coherent than the replicable, heterogeneous WMS-III dimension (C1 = .97). This research suggests that the WMS-IV may have greater utility in identifying lateralized memory dysfunction.
Effectiveness of basic display augmentation in vehicular control by visual field cues
NASA Technical Reports Server (NTRS)
Grunwald, A. J.; Merhav, S. J.
1978-01-01
The paper investigates the effectiveness of different basic display augmentation concepts - fixed reticle, velocity vector, and predicted future vehicle path - for RPVs controlled by a vehicle-mounted TV camera. The task is lateral manual control of a low flying RPV along a straight reference line in the presence of random side gusts. The man-machine system and the visual interface are modeled as a linear time-invariant system. Minimization of a quadratic performance criterion is assumed to underlie the control strategy of a well-trained human operator. The solution for the optimal feedback matrix enables the explicit computation of the variances of lateral deviation and directional error of the vehicle and of the control force that are used as performance measures.
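The modelling step described here, a well-trained operator acting as an optimal controller of an LTI plant, is the classic linear-quadratic regulator setup. A minimal sketch with scipy follows; the state-space matrices below are toy placeholders, not the paper's vehicle model.

import numpy as np
from scipy.linalg import solve_continuous_are

def lqr_gain(A, B, Q, R):
    # Optimal state feedback u = -K x minimizing J = integral of (x'Qx + u'Ru)
    # for the LTI model x' = A x + B u, via the algebraic Riccati equation.
    P = solve_continuous_are(A, B, Q, R)
    return np.linalg.solve(R, B.T @ P)

# toy lateral-control model: state = [lateral deviation, directional error]
A = np.array([[0.0, 10.0],      # deviation rate ~ forward speed * heading error
              [0.0,  0.0]])
B = np.array([[0.0],
              [1.0]])           # control input commands heading-error rate
K = lqr_gain(A, B, Q=np.eye(2), R=np.array([[1.0]]))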
Douglas, Danielle; Newsome, Rachel N; Man, Louisa LY
2018-01-01
A significant body of research in cognitive neuroscience is aimed at understanding how object concepts are represented in the human brain. However, it remains unknown whether and where the visual and abstract conceptual features that define an object concept are integrated. We addressed this issue by comparing the neural pattern similarities among object-evoked fMRI responses with behavior-based models that independently captured the visual and conceptual similarities among these stimuli. Our results revealed evidence for distinctive coding of visual features in lateral occipital cortex, and conceptual features in the temporal pole and parahippocampal cortex. By contrast, we found evidence for integrative coding of visual and conceptual object features in perirhinal cortex. The neuroanatomical specificity of this effect was highlighted by results from a searchlight analysis. Taken together, our findings suggest that perirhinal cortex uniquely supports the representation of fully specified object concepts through the integration of their visual and conceptual features. PMID:29393853
A calibration method immune to the projector errors in fringe projection profilometry
NASA Astrophysics Data System (ADS)
Zhang, Ruihua; Guo, Hongwei
2017-08-01
In the fringe projection technique, system calibration is a tedious task for establishing the mapping relationship between object depths and fringe phases. In particular, it is not easy to accurately determine the parameters of the projector in this system, which may induce errors in the measurement results. To solve this problem, this paper proposes a new calibration method that uses the cross-ratio invariance in the system geometry to determine the phase-to-depth relations. We analyze the epipolar geometry of the fringe projection system. On each epipolar plane, a depth variation along an incident ray induces a pixel movement along the epipolar line on the image plane of the camera. These depth variations and pixel movements are related by projective transformations, under which the cross-ratio of each is invariant. Based on this fact, we suggest measuring the depth map by using this cross-ratio invariance. First, we shift the reference board in its perpendicular direction to three positions with known depths and measure their phase maps as the reference phase maps; second, when measuring an object, we calculate the object depth at each pixel by equating the cross-ratio of the depths to that of the corresponding pixels having the same phase on the image plane of the camera. This method is immune to errors sourced from the projector, including distortions both in the geometric shapes and in the intensity profiles of the projected fringe patterns. The experimental results demonstrate the proposed method to be feasible and valid.
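The cross-ratio of four collinear points is preserved by projective transformations, so with three reference depths and the matched pixel coordinates along the epipolar line, the unknown depth follows in closed form. A sketch of that last step (scalar coordinates along the ray and epipolar line are assumed; the variable names are illustrative):

def cross_ratio(a, b, c, d):
    # cross-ratio (a, b; c, d) of four collinear scalar coordinates
    return ((a - c) * (b - d)) / ((b - c) * (a - d))

def depth_from_cross_ratio(z, u, u_obj):
    # Solve CR(z0, z1; z2, z) = CR(u0, u1; u2, u_obj) for the unknown depth z,
    # given reference depths z = (z0, z1, z2) and the pixel coordinates
    # u = (u0, u1, u2), u_obj of the equal-phase points on the epipolar line.
    k = cross_ratio(u[0], u[1], u[2], u_obj)
    A = z[0] - z[2]
    B = k * (z[1] - z[2])
    return (B * z[0] - A * z[1]) / (B - A)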
Spatial resolution in visual memory.
Ben-Shalom, Asaf; Ganel, Tzvi
2015-04-01
Representations in visual short-term memory are considered to contain relatively elaborated information on object structure. Conversely, representations in earlier stages of the visual hierarchy are thought to be dominated by a sensory-based, feed-forward buildup of information. In four experiments, we compared the spatial resolution of different object properties between two points in time along the processing hierarchy in visual short-term memory. Subjects were asked either to estimate the distance between objects or to estimate the size of one of the objects' features under two experimental conditions, of either a short or a long delay period between the presentation of the target stimulus and the probe. When different objects were referred to, similar spatial resolution was found for the two delay periods, suggesting that initial processing stages are sensitive to object-based properties. Conversely, superior resolution was found for the short, as compared with the long, delay when features were referred to. These findings suggest that initial representations in visual memory are hybrid in that they allow fine-grained resolution for object features alongside normal visual sensitivity to the segregation between objects. The findings are also discussed in reference to the distinction made in earlier studies between visual short-term memory and iconic memory.
Context-dependent spatially periodic activity in the human entorhinal cortex
Nguyen, T. Peter; Török, Ágoston; Shen, Jason Y.; Briggs, Deborah E.; Modur, Pradeep N.; Buchanan, Robert J.
2017-01-01
The spatially periodic activity of grid cells in the entorhinal cortex (EC) of the rodent, primate, and human provides a coordinate system that, together with the hippocampus, informs an individual of its location relative to the environment and encodes the memory of that location. Among the most defining features of grid-cell activity are the 60° rotational symmetry of grids and preservation of grid scale across environments. Grid cells, however, do display a limited degree of adaptation to environments. It remains unclear if this level of environment invariance generalizes to human grid-cell analogs, where the relative contribution of visual input to the multimodal sensory input of the EC is significantly larger than in rodents. Patients diagnosed with nontractable epilepsy who were implanted with entorhinal cortical electrodes performing virtual navigation tasks to memorized locations enabled us to investigate associations between grid-like patterns and environment. Here, we report that the activity of human entorhinal cortical neurons exhibits adaptive scaling in grid period, grid orientation, and rotational symmetry in close association with changes in environment size, shape, and visual cues, suggesting scale invariance of the frequency, rather than the wavelength, of spatially periodic activity. Our results demonstrate that neurons in the human EC represent space with an enhanced flexibility relative to neurons in rodents because they are endowed with adaptive scalability and context dependency. PMID:28396399
Shi, Qing; Stell, William K.
2013-01-01
Background Through adaptation, animals can function visually under an extremely broad range of light intensities. Light adaptation starts in the retina, through shifts in photoreceptor sensitivity and kinetics plus modulation of visual processing in retinal circuits. Although considerable research has been conducted on retinal adaptation in nocturnal species with rod-dominated retinas, such as the mouse, little is known about how cone-dominated avian retinas adapt to changes in mean light intensity. Methodology/Principal Findings We used the optokinetic response to characterize contrast sensitivity (CS) in the chick retina as a function of spatial frequency and temporal frequency at different mean light intensities. We found that: 1) daytime, cone-driven CS was tuned to spatial frequency; 2) nighttime, presumably rod-driven CS was tuned to temporal frequency and spatial frequency; 3) daytime, presumably cone-driven CS at threshold intensity was invariant with temporal and spatial frequency; and 4) daytime photopic CS was invariant with clock time. Conclusion/Significance Light- and dark-adaptational changes in CS were investigated comprehensively for the first time in the cone-dominated retina of an avian, diurnal species. The chick retina, like the mouse retina, adapts by using a “day/night” or “cone/rod” switch in tuning preference during changes in lighting conditions. The chick optokinetic response is an attractive model for noninvasive, behavioral studies of adaptation in retinal circuitry in health and disease. PMID:24098693
NASA Astrophysics Data System (ADS)
Lahamy, H.; Lichti, D.
2012-07-01
The automatic interpretation of human gestures can be used for natural interaction with computers without the use of mechanical devices such as keyboards and mice. The recognition of hand postures has been studied for many years. However, most of the literature in this area has considered 2D images, which cannot provide a full description of hand gestures. In addition, rotation-invariant identification remains an unsolved problem even with the use of 2D images. The objective of the current study is to design a rotation-invariant recognition process using a 3D signature for classifying hand postures. A heuristic, voxel-based signature has been designed and implemented. The tracking of the hand motion is achieved with a Kalman filter. A single training image per posture is used in the supervised classification. The designed recognition process and the tracking procedure have been successfully evaluated. This study has demonstrated the efficiency of the proposed rotation-invariant 3D hand posture signature, which leads to a 98.24% recognition rate after testing 12,723 samples of 12 gestures taken from the alphabet of American Sign Language.
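The paper's heuristic voxel signature is not spelled out in the abstract; as an illustration of why voxel statistics can be made rotation-invariant at all, here is a toy alternative: a histogram of occupied-voxel distances from the centroid, which is unchanged by rotation. The voxel size and bin count are arbitrary choices.

import numpy as np

def radial_voxel_signature(points, voxel=5.0, n_bins=16):
    # points: (N, 3) array of 3-D hand points (e.g. from a range camera)
    occ = np.unique(np.floor(points / voxel).astype(int), axis=0)  # occupied voxels
    centers = (occ + 0.5) * voxel
    d = np.linalg.norm(centers - centers.mean(axis=0), axis=1)
    hist, _ = np.histogram(d, bins=n_bins, range=(0.0, d.max() + 1e-9))
    return hist / max(hist.sum(), 1)   # normalised, rotation-invariant by construction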
Combined Influence of Visual Scene and Body Tilt on Arm Pointing Movements: Gravity Matters!
Scotto Di Cesare, Cécile; Sarlegna, Fabrice R.; Bourdin, Christophe; Mestre, Daniel R.; Bringoux, Lionel
2014-01-01
Performing accurate actions such as goal-directed arm movements requires taking into account visual and body orientation cues to localize the target in space and produce appropriate reaching motor commands. We experimentally tilted the body and/or the visual scene to investigate how visual and body orientation cues are combined for the control of unseen arm movements. Subjects were asked to point toward a visual target using an upward movement during slow body and/or visual scene tilts. When the scene was tilted, final pointing errors varied as a function of the direction of the scene tilt (forward or backward). Actual forward body tilt resulted in systematic target undershoots, suggesting that the brain may have overcompensated for the biomechanical movement facilitation arising from body tilt. Combined body and visual scene tilts also affected final pointing errors according to the orientation of the visual scene. The data were further analysed using either a body-centered or a gravity-centered reference frame to encode visual scene orientation with simple additive models (i.e., ‘combined’ tilts equal to the sum of ‘single’ tilts). We found that the body-centered model could account only for some of the data regarding kinematic parameters and final errors. In contrast, the gravity-centered modeling in which the body and visual scene orientations were referred to vertical could explain all of these data. Therefore, our findings suggest that the brain uses gravity, thanks to its invariant properties, as a reference for the combination of visual and non-visual cues. PMID:24925371
NASA Astrophysics Data System (ADS)
Kypraios, Ioannis; Young, Rupert C. D.; Chatwin, Chris R.
2009-08-01
Motivated by the non-linear interpolation and generalization abilities of the hybrid optical neural network filter between the reference and non-reference images of the true-class object, we designed the modified hybrid optical neural network filter. We applied an optical mask to the hybrid optical neural network filter's input. The mask was built with the constant weight connections of a randomly chosen image included in the training set. The resulting design of the modified hybrid optical neural network filter is optimized for performing best in cluttered scenes of the true-class object. Due to the shift-invariance properties inherited from its correlator unit, the filter can detect multiple objects of the same class within an input cluttered image. Additionally, the architecture of the neural network unit of the general hybrid optical neural network filter allows the recognition of multiple objects of different classes within the input cluttered image by modifying the output layer of the unit. We test the modified hybrid optical neural network filter on the recognition of multiple objects of the same and of different classes within cluttered input images and video sequences of cluttered scenes. The filter is shown to exhibit, in a single pass over the input data, out-of-plane rotation invariance, shift invariance, and good clutter tolerance. It is able to successfully detect and correctly classify true-class objects within background clutter for which there has been no previous training.
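The shift invariance attributed to the correlator unit comes from frequency-domain correlation: a correlation peak appears wherever the trained pattern occurs, independent of its position. A minimal sketch of that core operation (the full hybrid filter adds the neural-network stage, which is not reproduced here):

import numpy as np

def correlate_scene(scene, template):
    # circular cross-correlation via FFT; peaks mark candidate object
    # locations regardless of where the object sits in the input
    S = np.fft.fft2(scene)
    T = np.fft.fft2(template, s=scene.shape)   # zero-padded to scene size
    return np.real(np.fft.ifft2(S * np.conj(T)))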
The case of the missing visual details: Occlusion and long-term visual memory.
Williams, Carrick C; Burkle, Kyle A
2017-10-01
To investigate the critical information in long-term visual memory representations of objects, we used occlusion to emphasize 1 type of information or another. By occluding 1 solid side of the object (e.g., top 50%) or by occluding 50% of the object with stripes (like a picket fence), we emphasized visible information about the object, processing the visible details in the former and the object's overall form in the latter. On a token discrimination test, surprisingly, memory for solid or stripe occluded objects at either encoding (Experiment 1) or test (Experiment 2) was the same. In contrast, when occluded objects matched at encoding and test (Experiment 3) or when the occlusion shifted, revealing the entire object piecemeal (Experiment 4), memory was better for solid compared with stripe occluded objects, indicating that objects are represented differently in long-term visual memory. Critically, we also found that when the task emphasized remembering exactly what was shown, memory performance in the more detailed solid occlusion condition exceeded that in the stripe condition (Experiment 5). However, when the task emphasized the whole object form, memory was better in the stripe condition (Experiment 6) than in the solid condition. We argue that long-term visual memory can represent objects flexibly, and task demands can interact with visual information, allowing the viewer to cope with changing real-world visual environments. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
ERIC Educational Resources Information Center
Pang, Ming Fai; Marton, Ference
2013-01-01
Two studies are reported in this paper. The object of learning in both is the economic principle of changes in price as a function of changes in the relative magnitude of changes in demand and supply. The patterns of variation and invariance, defining the conditions compared were built into pedagogical tools (text, graphs, and worksheets). The…
Northeast Artificial Intelligence Consortium (NAIC) Review of Technical Tasks. Volume 2, Part 1.
1987-07-01
Contents (fragments): 6.2 Transformation-Invariant Attributes for Digitized Object Outlines; 6.3 Design of an Inference Engine for an …; … Attributes for Digital Object Outlines; 7. Speech Understanding Research (Rochester Institute of Technology) … a versatile maintenance expert system (ES) for trouble-shooting digital circuits. Some diagnosis systems, such as MYCIN [19] for medical diagnosis, and CRIB …
The effects of perceptual priming on 4-year-olds' haptic-to-visual cross-modal transfer.
Kalagher, Hilary
2013-01-01
Four-year-old children often have difficulty visually recognizing objects that were previously experienced only haptically. This experiment attempts to improve their performance in these haptic-to-visual transfer tasks. Sixty-two 4-year-old children participated in priming trials in which they explored eight unfamiliar objects visually, haptically, or visually and haptically together. Subsequently, all children participated in the same haptic-to-visual cross-modal transfer task. In this task, children haptically explored the objects that were presented in the priming phase and then visually identified a match from among three test objects, each matching the object on only one dimension (shape, texture, or color). Children in all priming conditions predominantly made shape-based matches; however, the most shape-based matches were made in the Visual and Haptic condition. All kinds of priming provided the necessary memory traces upon which subsequent haptic exploration could build a strong enough representation to enable subsequent visual recognition. Haptic exploration patterns during the cross-modal transfer task are discussed and the detailed analyses provide a unique contribution to our understanding of the development of haptic exploratory procedures.
Neural representation of objects in space: a dual coding account.
Humphreys, G W
1998-01-01
I present evidence on the nature of object coding in the brain and discuss the implications of this coding for models of visual selective attention. Neuropsychological studies of task-based constraints on: (i) visual neglect; and (ii) reading and counting, reveal the existence of parallel forms of spatial representation for objects: within-object representations, where elements are coded as parts of objects, and between-object representations, where elements are coded as independent objects. Aside from these spatial codes for objects, however, the coding of visual space is limited. We are extremely poor at remembering small spatial displacements across eye movements, indicating (at best) impoverished coding of spatial position per se. Also, effects of element separation on spatial extinction can be eliminated by filling the space with an occluding object, indicating that spatial effects on visual selection are moderated by object coding. Overall, there are separate limits on visual processing reflecting: (i) the competition to code parts within objects; (ii) the small number of independent objects that can be coded in parallel; and (iii) task-based selection of whether within- or between-object codes determine behaviour. Between-object coding may be linked to the dorsal visual system while parallel coding of parts within objects takes place in the ventral system, although there may additionally be some dorsal involvement either when attention must be shifted within objects or when explicit spatial coding of parts is necessary for object identification. PMID:9770227
Grubert, Anna; Eimer, Martin
2015-11-11
During the maintenance of task-relevant objects in visual working memory, the contralateral delay activity (CDA) is elicited over the hemisphere opposite to the visual field where these objects are presented. The presence of this lateralised CDA component demonstrates the existence of position-dependent object representations in working memory. We employed a change detection task to investigate whether the represented object locations in visual working memory are shifted in preparation for the known location of upcoming comparison stimuli. On each trial, bilateral memory displays were followed after a delay period by bilateral test displays. Participants had to encode and maintain three visual objects on one side of the memory display, and to judge whether they were identical or different to three objects in the test display. Task-relevant memory and test stimuli were located in the same visual hemifield in the no-shift task, and on opposite sides in the horizontal shift task. CDA components of similar size were triggered contralateral to the memorized objects in both tasks. The absence of a polarity reversal of the CDA in the horizontal shift task demonstrated that there was no preparatory shift of memorized object location towards the side of the upcoming comparison stimuli. These results suggest that visual working memory represents the locations of visual objects during encoding, and that the matching of memorized and test objects at different locations is based on a comparison process that can bridge spatial translations between these objects. This article is part of a Special Issue entitled SI: Prediction and Attention. Copyright © 2014 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Li, Heng; Zeng, Yajie; Lu, Zhuofan; Cao, Xiaofei; Su, Xiaofan; Sui, Xiaohong; Wang, Jing; Chai, Xinyu
2018-04-01
Objective. Retinal prosthesis devices have shown great value in restoring some sight for individuals with profoundly impaired vision, but the visual acuity and visual field provided by prostheses greatly limit recipients’ visual experience. In this paper, we employ computer vision approaches to expand the perceptible visual field in patients potentially implanted with a high-density retinal prosthesis, while maintaining visual acuity as much as possible. Approach. We propose an optimized content-aware image retargeting method that introduces salient object detection based on color and intensity-difference contrast, aiming to remap the important information of a scene into a small visual field while preserving its original scale as much as possible. It may improve prosthetic recipients’ perceived visual field and aid in performing some visual tasks (e.g. object detection and object recognition). To verify our method, psychophysical experiments on detecting the number of objects and recognizing objects were conducted under simulated prosthetic vision. As controls, we used three other image retargeting techniques: Cropping, Scaling, and seam-assisted shrinkability. Main results. The results show that our method preserves more key features and achieves significantly higher recognition accuracy than the other three image retargeting methods under conditions of a small visual field and low resolution. Significance. The proposed method helps expand the perceived visual field of prosthesis recipients and improves their object detection and recognition performance, suggesting that it may provide an effective option for the image processing module in future high-density retinal implants.
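To make the retargeting idea concrete, the following is a minimal Python sketch under assumptions of our own: saliency is approximated by intensity contrast against a blurred background (a crude stand-in for the paper's color and intensity-difference contrast detector), and retargeting is reduced to choosing the most salient crop window. Function names and parameters are illustrative, not the authors' implementation.

import numpy as np
from scipy.ndimage import gaussian_filter

def intensity_contrast_saliency(img):
    """Crude saliency map: local intensity minus a heavily blurred background."""
    background = gaussian_filter(img.astype(float), sigma=16)
    return np.abs(img.astype(float) - background)

def retarget(img, out_h, out_w):
    """Crop a grayscale image to its most salient out_h x out_w window,
    approximating 'remap important information into a small visual field'."""
    sal = intensity_contrast_saliency(img)
    ii = sal.cumsum(0).cumsum(1)  # integral image: O(1) saliency sum per window
    best, best_rc = -1.0, (0, 0)
    for r in range(img.shape[0] - out_h + 1):
        for c in range(img.shape[1] - out_w + 1):
            s = ii[r + out_h - 1, c + out_w - 1]
            if r > 0: s -= ii[r - 1, c + out_w - 1]
            if c > 0: s -= ii[r + out_h - 1, c - 1]
            if r > 0 and c > 0: s += ii[r - 1, c - 1]
            if s > best:
                best, best_rc = s, (r, c)
    r, c = best_rc
    return img[r:r + out_h, c:c + out_w]

A full pipeline would combine such a window choice with scaling or seam removal, so that content outside the window is compressed rather than discarded.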
Object formation in visual working memory: Evidence from object-based attention.
Zhou, Jifan; Zhang, Haihang; Ding, Xiaowei; Shui, Rende; Shen, Mowei
2016-09-01
We report on how visual working memory (VWM) forms intact perceptual representations of visual objects from sub-object elements. Specifically, when objects were divided into fragments and sequentially encoded into VWM, the fragments were involuntarily integrated into objects in VWM, as evidenced by the occurrence of both positive and negative object-based attention effects: in Experiment 1, when subjects' attention was cued to a location occupied by the VWM object, a target presented at the location of that object was perceived as occurring earlier than one presented at the location of a different object; in Experiment 2, responses to a target were significantly slower when a distractor was presented at the same location as the cued object. These results suggest that object fragments can be integrated into objects within VWM in a manner similar to that of visual perception. Copyright © 2016 Elsevier B.V. All rights reserved.
Visual discrimination in an orangutan (Pongo pygmaeus): measuring visual preference.
Hanazuka, Yuki; Kurotori, Hidetoshi; Shimizu, Mika; Midorikawa, Akira
2012-04-01
Although previous studies have confirmed that trained orangutans visually discriminate between mammals and artificial objects, whether orangutans without operant conditioning can discriminate remains unknown. The visual discrimination ability in an orangutan (Pongo pygmaeus) with no experience in operant learning was examined using measures of visual preference. Sixteen color photographs of inanimate objects and of mammals with four legs were randomly presented to an orangutan. The results showed that the mean looking time at photographs of mammals with four legs was longer than that for inanimate objects, suggesting that the orangutan discriminated mammals with four legs from inanimate objects. The results implied that orangutans who have not experienced operant conditioning may possess the ability to discriminate visually.
Size-Sensitive Perceptual Representations Underlie Visual and Haptic Object Recognition
Craddock, Matt; Lawson, Rebecca
2009-01-01
A variety of similarities between visual and haptic object recognition suggests that the two modalities may share common representations. However, it is unclear whether such common representations preserve low-level perceptual features or whether transfer between vision and haptics is mediated by high-level, abstract representations. Two experiments used a sequential shape-matching task to examine the effects of size changes on unimodal and crossmodal visual and haptic object recognition. Participants felt or saw 3D plastic models of familiar objects. The two objects presented on a trial were either the same size or different sizes and were the same shape or different but similar shapes. Participants were told to ignore size changes and to match on shape alone. In Experiment 1, size changes on same-shape trials impaired performance similarly for both visual-to-visual and haptic-to-haptic shape matching. In Experiment 2, size changes impaired performance on both visual-to-haptic and haptic-to-visual shape matching and there was no interaction between the cost of size changes and direction of transfer. Together the unimodal and crossmodal matching results suggest that the same, size-specific perceptual representations underlie both visual and haptic object recognition, and indicate that crossmodal memory for objects must be at least partly based on common perceptual representations. PMID:19956685
A note on statistical analysis of shape through triangulation of landmarks
Rao, C. Radhakrishna
2000-01-01
In an earlier paper, the author jointly with S. Suryawanshi proposed statistical analysis of shape through triangulation of landmarks on objects. It was observed that the angles of the triangles are invariant to scaling, location, and rotation of objects. No distinction was made between an object and its reflection. The present paper provides the methodology of shape discrimination when reflection is also taken into account and makes suggestions for modifications to be made when some of the landmarks are collinear. PMID:10737780
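As a worked illustration of the invariance claim, the sketch below (assuming 2-D landmark coordinates) computes the interior angles of a landmark triangle and verifies that they are unchanged by an arbitrary similarity transform. Note that a reflection would leave the angle set unchanged as well, which is exactly the ambiguity the paper addresses.

import numpy as np

def triangle_angles(p, q, r):
    """Interior angles (radians) of triangle pqr; invariant to translation,
    rotation, and scaling of the landmark configuration."""
    def ang(a, b, c):  # angle at vertex a
        u, v = b - a, c - a
        cosang = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
        return np.arccos(np.clip(cosang, -1.0, 1.0))
    return ang(p, q, r), ang(q, r, p), ang(r, p, q)

pts = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.5]])
theta, s, t = 0.7, 3.0, np.array([5.0, -2.0])  # arbitrary rotation/scale/shift
R = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
moved = s * pts @ R.T + t
print(np.allclose(triangle_angles(*pts), triangle_angles(*moved)))  # True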
Cognition versus Constitution of Objects: From Kant to Modern Physics
NASA Astrophysics Data System (ADS)
Mittelstaedt, Peter
2009-07-01
Classical mechanics in phase space as well as quantum mechanics in Hilbert space lead to states and observables, but not to objects that may be considered as carriers of observable quantities. However, in both cases objects can be constituted as new entities by means of invariance properties of the theories in question. We show that this way of reasoning has a long history in physics and philosophy and that it can be traced back to the transcendental arguments in Kant’s critique of pure reason.
What and where information in the caudate tail guides saccades to visual objects
Yamamoto, Shinya; Monosov, Ilya E.; Yasuda, Masaharu; Hikosaka, Okihide
2012-01-01
We understand the world by making saccadic eye movements to various objects. However, it is unclear how a saccade can be aimed at a particular object, because two kinds of visual information, what the object is and where it is, are processed separately in the dorsal and ventral visual cortical pathways. Here we provide evidence suggesting that a basal ganglia circuit through the tail of the monkey caudate nucleus (CDt) guides such object-directed saccades. First, many CDt neurons responded to visual objects depending on where and what the objects were. Second, electrical stimulation in the CDt induced saccades whose directions matched the preferred directions of neurons at the stimulation site. Third, many CDt neurons increased their activity before saccades directed to the neurons’ preferred objects and directions in a free-viewing condition. Our results suggest that CDt neurons receive both ‘what’ and ‘where’ information and guide saccades to visual objects. PMID:22875934
Shape and color conjunction stimuli are represented as bound objects in visual working memory.
Luria, Roy; Vogel, Edward K
2011-05-01
The integrated-object view of visual working memory (WM) argues that objects (rather than features) are the building blocks of visual WM, so that adding an extra feature to an object does not cost any extra WM capacity. Alternative views have shown that complex objects consume additional WM storage capacity, suggesting that they may not be represented as bound objects. Additionally, it has been argued that two features from the same dimension (i.e., color-color) do not form an integrated object in visual WM. This has led some to argue for a "weak" object view of visual WM. We used the contralateral delay activity (CDA) as an electrophysiological marker of WM capacity to test these alternatives to the integrated-object account. In two experiments we presented complex stimuli and color-color conjunction stimuli, and compared performance across displays that contained one object but varying degrees of feature complexity. The results supported the integrated-object account by showing that the CDA amplitude corresponded to the number of objects regardless of the number of features within each object, even for complex objects and color-color conjunction stimuli. Copyright © 2010 Elsevier Ltd. All rights reserved.
Emotion-induced trade-offs in spatiotemporal vision.
Bocanegra, Bruno R; Zeelenberg, René
2011-05-01
It is generally assumed that emotion facilitates human vision in order to promote adaptive responses to a potential threat in the environment. Surprisingly, we recently found that emotion in some cases impairs the perception of elementary visual features (Bocanegra & Zeelenberg, 2009b). Here, we demonstrate that emotion improves fast temporal vision at the expense of fine-grained spatial vision. We tested participants' threshold resolution with Landolt circles containing a small spatial or brief temporal discontinuity. The prior presentation of a fearful face cue, compared with a neutral face cue, impaired spatial resolution but improved temporal resolution. In addition, we show that these benefits and deficits were triggered selectively by the global configural properties of the faces, which were transmitted only through low spatial frequencies. Critically, the common locus of these opposite effects suggests a trade-off between magno- and parvocellular-type visual channels, which contradicts the common assumption that emotion invariably improves vision. We show that, rather than being a general "boost" for all visual features, affective neural circuits sacrifice the slower processing of small details for a coarser but faster visual signal.
Ptak, Radek; Lazeyras, François; Di Pietro, Marie; Schnider, Armin; Simon, Stéphane R
2014-07-01
Patients with visual object agnosia fail to recognize the identity of visually presented objects despite preserved semantic knowledge. Object agnosia may result from damage to visual cortex lying close to or overlapping with the lateral occipital complex (LOC), a brain region that exhibits selectivity to the shape of visually presented objects. Despite this anatomical overlap the relationship between shape processing in the LOC and shape representations in object agnosia is unknown. We studied a patient with object agnosia following isolated damage to the left occipito-temporal cortex overlapping with the LOC. The patient showed intact processing of object structure, yet often made identification errors that were mainly based on the global visual similarity between objects. Using functional Magnetic Resonance Imaging (fMRI) we found that the damaged as well as the contralateral, structurally intact right LOC failed to show any object-selective fMRI activity, though the latter retained selectivity for faces. Thus, unilateral damage to the left LOC led to a bilateral breakdown of neural responses to a specific stimulus class (objects and artefacts) while preserving the response to a different stimulus class (faces). These findings indicate that representations of structure necessary for the identification of objects crucially rely on bilateral, distributed coding of shape features. Copyright © 2014 Elsevier Ltd. All rights reserved.
Changes in Visual Object Recognition Precede the Shape Bias in Early Noun Learning
Yee, Meagan; Jones, Susan S.; Smith, Linda B.
2012-01-01
Two of the most formidable skills that characterize human beings are language and our prowess in visual object recognition, and they may be developmentally intertwined. Two experiments, a large-sample cross-sectional study and a smaller-sample 6-month longitudinal study of 18- to 24-month-olds, tested a hypothesized developmental link between changes in visual object representation and noun learning. Previous findings in visual object recognition indicate that children’s ability to recognize common basic-level categories from sparse structural representations of object shape emerges between the ages of 18 and 24 months, is related to noun vocabulary size, and is lacking in children with language delay. Other research on artificial noun learning tasks shows that during this same developmental period, young children systematically generalize object names by shape, that this shape bias predicts future noun learning, and that it is lacking in children with language delay. The two experiments examine the developmental relation between visual object recognition and the shape bias for the first time. The results show that developmental changes in visual object recognition systematically precede the emergence of the shape bias. They suggest a developmental pathway in which early changes in visual object recognition, themselves linked to category learning, enable the discovery of higher-order regularities in category structure and thus the shape bias in novel noun learning tasks. The proposed developmental pathway has implications for understanding the role of specific experience in the development of both visual object recognition and the shape bias in early noun learning. PMID:23227015
Self-organization via active exploration in robotic applications
NASA Technical Reports Server (NTRS)
Ogmen, H.; Prakash, R. V.
1992-01-01
We describe a neural network based robotic system. Unlike traditional robotic systems, our approach focuses on non-stationary problems. We indicate that a self-organization capability is necessary for any system to operate successfully in a non-stationary environment, and we suggest that self-organization should be based on an active exploration process. We investigated neural architectures having novelty sensitivity, selective attention, reinforcement learning, habit formation, and flexible-criteria categorization properties, and analyzed the resulting behavior (consisting of an intelligent initiation of exploration) by computer simulations. While various computer vision researchers have recently acknowledged the importance of active processes (Swain and Stricker, 1991), the proposed approaches within the new framework still suffer from a lack of self-organization (Aloimonos and Bandyopadhyay, 1987; Bajcsy, 1988). A self-organizing, neural network based robot (MAVIN) has recently been proposed (Baloch and Waxman, 1991). This robot has the capability of position-, size-, and rotation-invariant pattern categorization, recognition, and Pavlovian conditioning. Our robot does not initially have invariant processing properties; the reason for this is the emphasis we put on active exploration. We maintain the point of view that such invariant properties emerge from an internalization of exploratory sensory-motor activity. Rather than coding the equilibria of such mental capabilities, we seek to capture their dynamics, to understand on the one hand how the emergence of such invariances is possible and on the other the dynamics that lead to them. The second point is crucial for an adaptive robot that must acquire new invariances in non-stationary environments, as demonstrated by the inverting-glass experiments of Helmholtz. We will introduce Pavlovian conditioning circuits in our future work with the precise objective of achieving the generation, coordination, and internalization of sequences of actions.
An insect-inspired model for visual binding II: functional analysis and visual attention.
Northcutt, Brandon D; Higgins, Charles M
2017-04-01
We have developed a neural network model capable of performing visual binding inspired by neuronal circuitry in the optic glomeruli of flies: a brain area that lies just downstream of the optic lobes where early visual processing is performed. This visual binding model is able to detect objects in dynamic image sequences and bind together their respective characteristic visual features-such as color, motion, and orientation-by taking advantage of their common temporal fluctuations. Visual binding is represented in the form of an inhibitory weight matrix which learns over time which features originate from a given visual object. In the present work, we show that information represented implicitly in this weight matrix can be used to explicitly count the number of objects present in the visual image, to enumerate their specific visual characteristics, and even to create an enhanced image in which one particular object is emphasized over others, thus implementing a simple form of visual attention. Further, we present a detailed analysis which reveals the function and theoretical limitations of the visual binding network and in this context describe a novel network learning rule which is optimized for visual binding.
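A toy sketch of the core idea, under assumptions of our own: feature channels belonging to the same object share a common temporal fluctuation, so binding can be read out from correlations (here simply thresholded, rather than learned as inhibitory synaptic weights as in the model), and the number of objects falls out as the number of connected groups of channels.

import numpy as np

rng = np.random.default_rng(0)
T = 2000
# Two hypothetical objects, each imposing its own temporal fluctuation.
env = rng.standard_normal((2, T))
channels = np.vstack([env[0]] * 3 + [env[1]] * 3)      # 3 feature channels per object
channels = channels + 0.3 * rng.standard_normal(channels.shape)  # sensor noise

corr = np.corrcoef(channels)
linked = corr > 0.8                                    # bind co-fluctuating channels

# Count objects = connected components of the binding graph.
unseen, objects = set(range(len(channels))), 0
while unseen:
    stack = [unseen.pop()]
    objects += 1
    while stack:
        i = stack.pop()
        for j in list(unseen):
            if linked[i, j]:
                unseen.discard(j)
                stack.append(j)
print("objects bound:", objects)  # expected: 2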
Generating descriptive visual words and visual phrases for large-scale image applications.
Zhang, Shiliang; Tian, Qi; Hua, Gang; Huang, Qingming; Gao, Wen
2011-09-01
Bag-of-visual Words (BoWs) representation has been applied to various problems in the fields of multimedia and computer vision. The basic idea is to represent images as visual documents composed of repeatable and distinctive visual elements, which are comparable to text words. Notwithstanding its great success and wide adoption, a visual vocabulary created from single-image local descriptors is often not as effective as desired. In this paper, descriptive visual words (DVWs) and descriptive visual phrases (DVPs) are proposed as the visual correspondences to text words and phrases, where visual phrases refer to frequently co-occurring visual word pairs. Since images are the carriers of visual objects and scenes, a descriptive visual element set can be composed of the visual words and their combinations that are effective in representing certain visual objects or scenes. Based on this idea, a general framework is proposed for generating DVWs and DVPs for image applications. In a large-scale image database containing 1506 object and scene categories, the visual words and visual word pairs descriptive of certain objects or scenes are identified and collected as the DVWs and DVPs. Experiments show that the DVWs and DVPs are informative and descriptive and, thus, more comparable to text words than classic visual words. We apply the identified DVWs and DVPs in several applications, including large-scale near-duplicate image retrieval, image search re-ranking, and object recognition. The combination of DVWs and DVPs performs better than the state of the art in large-scale near-duplicate image retrieval in terms of accuracy, efficiency, and memory consumption. The proposed image search re-ranking algorithm, DWPRank, outperforms the state-of-the-art algorithm by 12.4% in mean average precision and is about 11 times faster.
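The phrase-mining step can be pictured with a short sketch: given quantized local descriptors with image positions, count visual-word pairs that co-occur within a spatial radius and keep the frequent ones as candidate DVPs. The radius and min_support parameters are hypothetical; the paper additionally scores words and pairs by how descriptive they are of particular object and scene categories.

from collections import Counter
from itertools import combinations

def mine_visual_phrases(images, radius=30.0, min_support=5):
    """images: one list per image of (visual_word_id, (x, y)) descriptors.
    Returns word pairs co-occurring within `radius` pixels at least
    `min_support` times, as candidate descriptive visual phrases."""
    pair_counts = Counter()
    for words in images:
        for (w1, p1), (w2, p2) in combinations(words, 2):
            if ((p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2) ** 0.5 <= radius:
                pair_counts[tuple(sorted((w1, w2)))] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}

toy = [[(3, (10, 10)), (7, (25, 18)), (9, (300, 40))]] * 6
print(mine_visual_phrases(toy))  # {(3, 7): 6}: words 3 and 7 co-occur closely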
Eguchi, Akihiro; Mender, Bedeho M. W.; Evans, Benjamin D.; Humphreys, Glyn W.; Stringer, Simon M.
2015-01-01
Neurons in successive stages of the primate ventral visual pathway encode the spatial structure of visual objects. In this paper, we investigate through computer simulation how these cell firing properties may develop through unsupervised visually-guided learning. Individual neurons in the model are shown to exploit statistical regularity and temporal continuity of the visual inputs during training to learn firing properties that are similar to neurons in V4 and TEO. Neurons in V4 encode the conformation of boundary contour elements at a particular position within an object regardless of the location of the object on the retina, while neurons in TEO integrate information from multiple boundary contour elements. This representation goes beyond mere object recognition, in which neurons simply respond to the presence of a whole object, and provides an essential foundation from which the brain is subsequently able to recognize the whole object. PMID:26300766
Scene and Position Specificity in Visual Memory for Objects
ERIC Educational Resources Information Center
Hollingworth, Andrew
2006-01-01
This study investigated whether and how visual representations of individual objects are bound in memory to scene context. Participants viewed a series of naturalistic scenes, and memory for the visual form of a target object in each scene was examined in a 2-alternative forced-choice test, with the distractor object either a different object…
Affective and contextual values modulate spatial frequency use in object recognition
Caplette, Laurent; West, Gregory; Gomot, Marie; Gosselin, Frédéric; Wicker, Bruno
2014-01-01
Visual object recognition is of fundamental importance in our everyday interaction with the environment. Recent models of visual perception emphasize the role of top-down predictions facilitating object recognition via initial guesses that limit the number of object representations that need to be considered. Several results suggest that this rapid and efficient object processing relies on the early extraction and processing of low spatial frequencies (LSF). The present study aimed to investigate the spatial frequency (SF) content of visual object representations and its modulation by the contextual and affective values of the perceived object during a picture-name verification task. Stimuli consisted of pictures of objects equalized in SF content and categorized as having low or high affective and contextual values. To access the SF content of stored visual representations of objects, the SFs of each image were randomly sampled on a trial-by-trial basis. Results reveal that intermediate SFs between 14 and 24 cycles per object (2.3–4 cycles per degree) are correlated with fast and accurate identification for all categories of objects. Moreover, there was a significant interaction between affective and contextual values over the SFs correlating with fast recognition. These results suggest that the affective and contextual values of a visual object modulate the SF content of its internal representation, thus highlighting the flexibility of the visual recognition system. PMID:24904514
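The trial-by-trial SF sampling can be illustrated with a sketch of a generic "SF bubbles"-style procedure (the study's exact sampling scheme is assumed, not reproduced): each trial applies a random smooth weighting across radial frequency bands in the Fourier domain.

import numpy as np

def sample_spatial_frequencies(img, rng):
    """Filter a grayscale image with a random smooth weighting over radial
    spatial-frequency bands, as one trial of an SF-sampling experiment."""
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.hypot(fy, fx)                  # radial SF of each Fourier component
    n_bands = 64
    band = np.minimum((radius / radius.max() * (n_bands - 1)).astype(int),
                      n_bands - 1)
    # Random weights, smoothed so neighbouring bands get similar attenuation.
    weights = np.convolve(rng.random(n_bands), np.ones(7) / 7, mode="same")
    filt = weights[band]
    return np.real(np.fft.ifft2(np.fft.fft2(img) * filt))

stim = sample_spatial_frequencies(np.random.rand(256, 256),
                                  np.random.default_rng(2))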
KAM tori and whiskered invariant tori for non-autonomous systems
NASA Astrophysics Data System (ADS)
Canadell, Marta; de la Llave, Rafael
2015-08-01
We consider non-autonomous dynamical systems which converge to autonomous (or periodic) systems exponentially fast in time. Such systems appear naturally as models of many physical processes affected by external pulses. We introduce definitions of non-autonomous invariant tori and non-autonomous whiskered tori and their invariant manifolds, and we prove their persistence under small perturbations, smooth dependence on parameters, and several geometric properties (if the systems are Hamiltonian, the tori are Lagrangian manifolds). We note that such definitions are problematic for general time-dependent systems, but we show that they are unambiguous for systems converging exponentially fast to autonomous ones. The proof of persistence relies only on a standard Implicit Function Theorem in Banach spaces; it requires neither that the rotations in the tori be Diophantine nor that the systems we consider preserve any geometric structure. We only require that the autonomous system preserves these objects. In particular, when the autonomous system is integrable, we obtain the persistence of tori with rational rotation numbers. We also discuss fast and efficient algorithms for their computation. The method also applies to infinite-dimensional systems which define a good evolution, e.g., PDEs. When the systems considered are Hamiltonian, we show that the time-dependent invariant tori are isotropic; hence the invariant tori of maximal dimension are Lagrangian manifolds, and the (un)stable manifolds of whiskered tori are Lagrangian manifolds as well. We also include a comparison with the more global theory developed in Blazevski and de la Llave (2011).
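One way to write down the class of systems considered, as a sketch (the paper states its hypotheses precisely in function-space norms; the notation below is assumed for illustration):

\[
  \dot{x} = f(x) + g(x, t), \qquad \|g(\cdot, t)\|_{C^r} \le C e^{-\lambda t}, \quad \lambda > 0,
\]

so that the flow approaches that of the autonomous field $f$ exponentially fast, and the invariant tori of $\dot{x} = f(x)$ are the objects whose persistence is proved.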
Perceived object stability depends on multisensory estimates of gravity.
Barnett-Cowan, Michael; Fleming, Roland W; Singh, Manish; Bülthoff, Heinrich H
2011-04-27
How does the brain estimate object stability? Objects fall over when the gravity-projected centre-of-mass lies outside the point or area of support. To estimate an object's stability visually, the brain must integrate information across the shape and compare its orientation to gravity. When observers lie on their sides, gravity is perceived as tilted toward body orientation, consistent with a representation of gravity derived from multisensory information. We exploited this to test whether vestibular and kinesthetic information affect this visual task or whether the brain estimates object stability solely from visual information. In three body orientations, participants viewed images of objects close to a table edge. We measured the critical angle at which each object appeared equally likely to fall over or right itself. Perceived gravity was measured using the subjective visual vertical. The results show that the perceived critical angle was significantly biased in the same direction as the subjective visual vertical (i.e., towards the multisensory estimate of gravity). Our results rule out a general explanation that the brain depends solely on visual heuristics and assumptions about object stability. Instead, they suggest that multisensory estimates of gravity govern the perceived stability of objects, resulting in objects appearing more stable than they are when the head is tilted in the same direction in which they fall.
3D Visualization of an Invariant Display Strategy for Hyperspectral Imagery
2002-12-01
(Only fragments of this thesis survive extraction; bibliography entries aside, the recoverable content is: the display strategy uses Principal Component Analysis (PCA) to rotate the data into a coordinate space that can be used to display the data, and because the radiation band is the natural unit of data organization, the band-sequential (BSQ) format is easy to implement.)
2014-09-01
(Only fragments of this report survive extraction. Recoverable content: it discusses the feature space used to represent a tracking target, noting that information about one domain of the target is sometimes traded away in exchange for robustness; a distribution distance (the Kullback-Leibler distance) can be used as a similarity function between a candidate target and a template, an approach that is invariant to changes in scale; and basis vectors are updated to adapt to appearance change, learning the visual information that the set of targets has in common.)
Using Prosopagnosia to Test and Modify Visual Recognition Theory.
O'Brien, Alexander M
2018-02-01
Biederman's contemporary theory of basic visual object recognition (Recognition-by-Components) is based on structural descriptions of objects and presumes 36 visual primitives (geons) that people can discriminate, but there has been no empirical test of the actual use of all 36 geons to visually distinguish objects. In this study, we tested for the actual use of these geons in basic visual discrimination by comparing the object discrimination performance of an acquired prosopagnosia patient (LB) with that of healthy control participants across varied stimuli. LB's prosopagnosia left her heavily reliant on structural descriptions or categorical object differences in visual discrimination tasks, whereas the control participants could additionally use face recognition or coordinate systems (Coordinate Relations Hypothesis). Thus, when LB performed comparably to control participants with a given stimulus, her restricted reliance on basic or categorical discriminations meant that the stimuli must be distinguishable on the basis of a geon feature. By varying stimuli across eight separate experiments and presenting all 36 geons, we discerned that LB coded only 12 (vs. 36) distinct visual primitives (geons), apparently reflective of human visual systems generally.
Salience of the lambs: a test of the saliency map hypothesis with pictures of emotive objects.
Humphrey, Katherine; Underwood, Geoffrey; Lambert, Tony
2012-01-25
Humans have an ability to rapidly detect emotive stimuli. However, many emotional objects in a scene are also highly visually salient, which raises the question of how dependent the effects of emotionality are on visual saliency and whether the presence of an emotional object changes the power of a more visually salient object in attracting attention. Participants were shown a set of positive, negative, and neutral pictures and completed recall and recognition memory tests. Eye movement data revealed that visual saliency does influence eye movements, but the effect is reliably reduced when an emotional object is present. Pictures containing negative objects were recognized more accurately and recalled in greater detail, and participants fixated more on negative objects than positive or neutral ones. Initial fixations were more likely to be on emotional objects than more visually salient neutral ones, suggesting that the processing of emotional features occurs at a very early stage of perception.
Verspui, Remko; Gray, John R
2009-10-01
Animals rely on multimodal sensory integration for proper orientation within their environment. For example, odour-guided behaviours often require appropriate integration of concurrent visual cues. To gain a further understanding of mechanisms underlying sensory integration in odour-guided behaviour, our study examined the effects of visual stimuli induced by self-motion and object-motion on odour-guided flight in male M. sexta. By placing stationary objects (pillars) on either side of a female pheromone plume, moths produced self-induced visual motion during odour-guided flight. These flights showed a reduction in both ground and flight speeds and inter-turn interval when compared with flight tracks without stationary objects. Presentation of an approaching 20 cm disc, to simulate object-motion, resulted in interrupted odour-guided flight and changes in flight direction away from the pheromone source. Modifications of odour-guided flight behaviour in the presence of stationary objects suggest that visual information, in conjunction with olfactory cues, can be used to control the rate of counter-turning. We suggest that the behavioural responses to visual stimuli induced by object-motion indicate the presence of a neural circuit that relays visual information to initiate escape responses. These behavioural responses also suggest the presence of a sensory conflict requiring a trade-off between olfactory and visually driven behaviours. The mechanisms underlying olfactory and visual integration are discussed in the context of these behavioural responses.
Decoding visual object categories in early somatosensory cortex.
Smith, Fraser W; Goodale, Melvyn A
2015-04-01
Neurons, even in the earliest sensory areas of cortex, are subject to a great deal of contextual influence from both within and across modality connections. In the present work, we investigated whether the earliest regions of somatosensory cortex (S1 and S2) would contain content-specific information about visual object categories. We reasoned that this might be possible due to the associations formed through experience that link different sensory aspects of a given object. Participants were presented with visual images of different object categories in 2 fMRI experiments. Multivariate pattern analysis revealed reliable decoding of familiar visual object category in bilateral S1 (i.e., postcentral gyri) and right S2. We further show that this decoding is observed for familiar but not unfamiliar visual objects in S1. In addition, whole-brain searchlight decoding analyses revealed several areas in the parietal lobe that could mediate the observed context effects between vision and somatosensation. These results demonstrate that even the first cortical stages of somatosensory processing carry information about the category of visually presented familiar objects. © The Author 2013. Published by Oxford University Press.
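A minimal sketch of the decoding logic described, with synthetic data standing in for the fMRI patterns: cross-validated linear classification of object category from ROI voxel patterns. Shapes, variable names, and the injected signal are illustrative only, not the authors' pipeline.

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)
n_trials, n_voxels = 120, 200
X = rng.standard_normal((n_trials, n_voxels))   # per-trial patterns, e.g. an S1 ROI
y = rng.integers(0, 4, size=n_trials)           # 4 visual object categories
X[np.arange(n_trials), y] += 0.8                # inject a weak category signal

clf = make_pipeline(StandardScaler(), LinearSVC(dual=False))
acc = cross_val_score(clf, X, y, cv=5).mean()
print(f"decoding accuracy: {acc:.2f} (chance = 0.25)")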
Infant Visual Attention and Object Recognition
Reynolds, Greg D.
2015-01-01
This paper explores the role visual attention plays in the recognition of objects in infancy. Research and theory on the development of infant attention and recognition memory are reviewed in three major sections. The first section reviews some of the major findings and theory emerging from a rich tradition of behavioral research utilizing preferential looking tasks to examine visual attention and recognition memory in infancy. The second section examines research utilizing neural measures of attention and object recognition in infancy as well as research on brain-behavior relations in the early development of attention and recognition memory. The third section addresses potential areas of the brain involved in infant object recognition and visual attention. An integrated synthesis of some of the existing models of the development of visual attention is presented which may account for the observed changes in behavioral and neural measures of visual attention and object recognition that occur across infancy. PMID:25596333
’What’ and ’Where’ in Visual Attention: Evidence from the Neglect Syndrome
1992-01-01
(Only fragments of this report survive extraction: phrases concerning representations of the visual world, visual attention, and object representations, together with reference-list fragments such as Bauer & Rubens (1985) on agnosia and Farah (1990), Visual Agnosia: Disorders of Object Recognition.)
Odours reduce the magnitude of object substitution masking for matching visual targets in females.
Robinson, Amanda K; Laning, Julia; Reinhard, Judith; Mattingley, Jason B
2016-08-01
Recent evidence suggests that olfactory stimuli can influence early stages of visual processing, but there has been little focus on whether such olfactory-visual interactions convey an advantage in visual object identification. Moreover, despite evidence that some aspects of olfactory perception are superior in females than males, no study to date has examined whether olfactory influences on vision are gender-dependent. We asked whether inhalation of familiar odorants can modulate participants' ability to identify briefly flashed images of matching visual objects under conditions of object substitution masking (OSM). Across two experiments, we had male and female participants (N = 36 in each group) identify masked visual images of odour-related objects (e.g., orange, rose, mint) amongst nonodour-related distracters (e.g., box, watch). In each trial, participants inhaled a single odour that either matched or mismatched the masked, odour-related target. Target detection performance was analysed using a signal detection (d') approach. In females, but not males, matching odours significantly reduced OSM relative to mismatching odours, suggesting that familiar odours can enhance the salience of briefly presented visual objects. We conclude that olfactory cues exert a subtle influence on visual processes by transiently enhancing the salience of matching object representations. The results add to a growing body of literature that points towards consistent gender differences in olfactory perception.
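For reference, d' is the difference of the z-transformed hit and false-alarm rates. The sketch below uses a log-linear correction for extreme rates, which is one common convention and not necessarily the one used in the study.

from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    # Log-linear correction avoids infinite z-scores at rates of 0 or 1.
    hr = (hits + 0.5) / (hits + misses + 1)
    far = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return norm.ppf(hr) - norm.ppf(far)

print(d_prime(hits=40, misses=10, false_alarms=12, correct_rejections=38))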
The Case of the Missing Visual Details: Occlusion and Long-Term Visual Memory
ERIC Educational Resources Information Center
Williams, Carrick C.; Burkle, Kyle A.
2017-01-01
To investigate the critical information in long-term visual memory representations of objects, we used occlusion to emphasize 1 type of information or another. By occluding 1 solid side of the object (e.g., top 50%) or by occluding 50% of the object with stripes (like a picket fence), we emphasized visible information about the object, processing…
Perceptual asymmetries in greyscales: object-based versus space-based influences.
Thomas, Nicole A; Elias, Lorin J
2012-05-01
Neurologically normal individuals exhibit leftward spatial biases, resulting from both object- and space-based biases; however, their relative contributions to the overall bias remain unknown. Relative position within the display has not often been considered, as similar spatial conditions have typically been collapsed together. Study 1 used the greyscales task to investigate the influence of relative position and of object- and space-based contributions. One image in each greyscale pair was shifted towards the left or the right. A leftward object-based bias, moderated by a bias to the centre, was expected. Results confirmed this: a left object-based bias occurred in the right visual field, where the left side of the greyscale pairs was located in the centre of the visual field. Further, only lower-visual-field images exhibited a significant left bias in the left visual field. The left bias was also stronger when images were partially overlapping in the right visual field, demonstrating the importance of examining proximity. The second study examined whether object-based biases were stronger when actual objects, with directional lighting biases, were used. The direction of luminosity was congruent or incongruent with spatial location. A stronger object-based bias emerged overall; however, a leftward bias was seen in congruent conditions and a rightward bias in incongruent conditions. In conditions with significant biases, the lower-visual-field image was chosen most often. The results show that object- and space-based biases both contribute; however, the stimulus type allows either space- or object-based biases to dominate. A lower-visual-field bias also interacts with these biases, leading the left bias to be eliminated under certain conditions. The complex interaction between frame of reference and visual field makes spatial location extremely important in determining the strength of the leftward bias. Copyright © 2010 Elsevier Srl. All rights reserved.
Graewe, Britta; De Weerd, Peter; Farivar, Reza; Castelo-Branco, Miguel
2012-01-01
Many studies have linked the processing of different object categories to specific event-related potentials (ERPs) such as the face-specific N170. Despite reports showing that object-related ERPs are influenced by visual stimulus features, there is consensus that these components primarily reflect categorical aspects of the stimuli. Here, we re-investigated this idea by systematically measuring the effects of visual feature manipulations on ERP responses elicited by both structure-from-motion (SFM)-defined and luminance-defined object stimuli. SFM objects elicited a novel component at 200–250 ms (N250) over parietal and posterior temporal sites. We found, however, that the N250 amplitude was unaffected by restructuring SFM stimuli into meaningless objects based on identical visual cues. This suggests that this N250 peak was not uniquely linked to categorical aspects of the objects, but is strongly determined by visual stimulus features. We provide strong support for this hypothesis by parametrically manipulating the depth range of both SFM- and luminance-defined object stimuli and showing that the N250 evoked by SFM stimuli as well as the well-known N170 to static faces were sensitive to this manipulation. Importantly, this effect could not be attributed to compromised object categorization in low depth stimuli, confirming a strong impact of visual stimulus features on object-related ERP signals. As ERP components linked with visual categorical object perception are likely determined by multiple stimulus features, this creates an interesting inverse problem when deriving specific perceptual processes from variations in ERP components. PMID:22363479
A novel rotational invariants target recognition method for rotating motion blurred images
NASA Astrophysics Data System (ADS)
Lan, Jinhui; Gong, Meiling; Dong, Mingwei; Zeng, Yiliang; Zhang, Yuzhen
2017-11-01
Images from the sensor are blurred by the rotational motion of the carrier, which greatly reduces the target recognition rate. Although the traditional approach of first restoring the image and then identifying the target can improve the recognition rate, it takes a long time. To solve this problem, a rotational-blur-invariant feature extraction model was constructed that recognizes targets directly. The model comprises three metric layers, whose metric algorithms (a gray-value statistical algorithm, an improved round projection transformation algorithm, and rotation-convolution moment invariants) range from low to high in object description capability; the layer with the lowest description capability serves as the input, so that non-target pixels are gradually eliminated from the degraded image. Experimental results show that the proposed model improves the correct target recognition rate for blurred images and achieves a good balance between computational complexity and performance over the target region.
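The rotation invariance of a round- (ring-) projection feature can be sketched as follows: the average intensity over concentric rings about the image centre is unaffected by in-plane rotation. The paper's "improved" round projection transformation is not specified here, so this is only the basic form, with the number of rings as an assumed parameter.

import numpy as np

def ring_projection(img, n_rings=32):
    """Average grayscale intensity on concentric rings about the image centre;
    approximately invariant to in-plane rotation of the image."""
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - (h - 1) / 2, xx - (w - 1) / 2)
    edges = np.linspace(0, r.max() + 1e-9, n_rings + 1)
    feats = np.empty(n_rings)
    for k in range(n_rings):
        mask = (r >= edges[k]) & (r < edges[k + 1])
        feats[k] = img[mask].mean() if mask.any() else 0.0
    return feats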
NASA Astrophysics Data System (ADS)
Caldwell, T. Grant; Bibby, Hugh M.
1998-12-01
Long-offset transient electromagnetic (LOTEM) data have traditionally been represented as early- and late-time apparent resistivities. Time-varying electric field data recorded in a LOTEM survey made with multiple sources can be represented by an `instantaneous apparent resistivity tensor'. Three independent, coordinate-invariant, time-varying apparent resistivities can be derived from this tensor. For dipolar sources, the invariants are also independent of source orientation. In a uniform-resistivity half-space, the invariant given by the square root of the tensor determinant remains almost constant with time, deviating from the half-space resistivity by a maximum of 6 per cent. For a layered half-space, a distance-time pseudo-section of the determinant apparent resistivity produces an image of the layering beneath the measurement profile. As time increases, the instantaneous apparent resistivity tensor approaches the direct current apparent resistivity tensor. An approximate time-to-depth conversion can be achieved by integrating the diffusion depth formula with time, using the determinant apparent resistivity at each instant to represent the resistivity of the conductive medium. Localized near-surface inhomogeneities produce shifts in the time-domain apparent resistivity sounding curves that preserve the gradient, analogous to static shifts seen in magnetotelluric soundings. Instantaneous apparent resistivity tensors calculated for 3-D resistivity models suggest that profiles of LOTEM measurements across a simple 3-D structure can be used to create an image that reproduces the main features of the subsurface resistivity. Where measurements are distributed over an area, maps of the tensor invariants can be made into a sequence of images, which provides a way of `time slicing' down through the target structure.
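In symbols (notation assumed here rather than taken from the paper), the determinant invariant is

\[
  \rho_{\mathrm{det}}(t) = \sqrt{\det \boldsymbol{\rho}(t)},
\]

where $\boldsymbol{\rho}(t)$ is the instantaneous apparent resistivity tensor, and the approximate time-to-depth conversion described above amounts to integrating the growth rate of the EM diffusion depth with the resistivity evaluated at each instant:

\[
  z(t) \approx \int_0^{t} \sqrt{\frac{\rho_{\mathrm{det}}(\tau)}{2\,\mu_0\,\tau}}\, d\tau ,
\]

which reduces to the familiar diffusion depth $\sqrt{2 t \rho / \mu_0}$ when $\rho_{\mathrm{det}}$ is constant.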
Meyer, Georg F.; Shao, Fei; White, Mark D.; Hopkins, Carl; Robotham, Antony J.
2013-01-01
Externally generated visual motion signals can cause the illusion of self-motion in space (vection) and corresponding visually evoked postural responses (VEPR). These VEPRs are not simple responses to optokinetic stimulation, but are modulated by the configuration of the environment. The aim of this paper is to explore what factors modulate VEPRs in a high quality virtual reality (VR) environment where real and virtual foreground objects served as static visual, auditory and haptic reference points. Data from four experiments on visually evoked postural responses show that: 1) visually evoked postural sway in the lateral direction is modulated by the presence of static anchor points that can be haptic, visual and auditory reference signals; 2) real objects and their matching virtual reality representations as visual anchors have different effects on postural sway; 3) visual motion in the anterior-posterior plane induces robust postural responses that are not modulated by the presence of reference signals or the reality of objects that can serve as visual anchors in the scene. We conclude that automatic postural responses for laterally moving visual stimuli are strongly influenced by the configuration and interpretation of the environment and draw on multisensory representations. Different postural responses were observed for real and virtual visual reference objects. On the basis that automatic visually evoked postural responses in high fidelity virtual environments should mimic those seen in real situations we propose to use the observed effect as a robust objective test for presence and fidelity in VR. PMID:23840760
An insect-inspired model for visual binding I: learning objects and their characteristics.
Northcutt, Brandon D; Dyhr, Jonathan P; Higgins, Charles M
2017-04-01
Visual binding is the process of associating the responses of visual interneurons in different visual submodalities all of which are responding to the same object in the visual field. Recently identified neuropils in the insect brain termed optic glomeruli reside just downstream of the optic lobes and have an internal organization that could support visual binding. Working from anatomical similarities between optic and olfactory glomeruli, we have developed a model of visual binding based on common temporal fluctuations among signals of independent visual submodalities. Here we describe and demonstrate a neural network model capable both of refining selectivity of visual information in a given visual submodality, and of associating visual signals produced by different objects in the visual field by developing inhibitory neural synaptic weights representing the visual scene. We also show that this model is consistent with initial physiological data from optic glomeruli. Further, we discuss how this neural network model may be implemented in optic glomeruli at a neuronal level.
Visual Sensitivities and Discriminations and Their Roles in Aviation.
1986-03-01
(Only fragments of this report survive extraction. Recoverable content: a citation on low-contrast letter charts in early diabetic retinopathy, ocular hypertension, glaucoma, and Parkinson's disease (Br J Ophthalmol, 1984, 68, 885); measurements of the ability to detect a camouflaged object visible only when moving, compared with similar measurements for conventional objects; and a comparison of visual detection (i.e., visual acquisition) of camouflaged objects whose edges are defined by velocity differences with detection of conventional objects.)
NASA Astrophysics Data System (ADS)
Zhong, Yanfei; Han, Xiaobing; Zhang, Liangpei
2018-04-01
Multi-class geospatial object detection from high spatial resolution (HSR) remote sensing imagery is attracting increasing attention in a wide range of object-related civil and engineering applications. However, the distribution of objects in HSR remote sensing imagery is location-variable and complicated, and accurately detecting these objects is a critical problem. Owing to the powerful feature extraction and representation capability of deep learning, integrated frameworks that combine region proposal generation and object detection have greatly improved multi-class geospatial object detection for HSR remote sensing imagery. However, because the convolution operations in a convolutional neural network (CNN) are translation-invariant, the classification stage is seldom affected, but the localization accuracy of the predicted bounding boxes in the detection stage is easily degraded. This dilemma between translation invariance in the classification stage and translation variance in the object detection stage has not previously been addressed for HSR remote sensing imagery, and it causes position-accuracy problems for multi-class geospatial object detection with region proposal generation and object detection. To further improve the performance of such integrated frameworks, a position-sensitive balancing (PSB) framework is proposed in this paper for multi-class geospatial object detection from HSR remote sensing imagery. The proposed PSB framework builds on a residual network, takes full advantage of the fully convolutional network (FCN), and resolves the dilemma between translation invariance in the classification stage and translation variance in the object detection stage. In addition, a pre-training mechanism is used to accelerate the training procedure and increase the robustness of the proposed algorithm. The algorithm is validated on a publicly available 10-class object detection dataset.
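A toy NumPy sketch of the position-sensitive pooling that such frameworks (following R-FCN) use to restore translation variance in the detection stage: each cell of a k x k grid over the region of interest reads from its own bank of score maps, so a shifted object changes which responses are pooled. Shapes, the grid size, and names are assumptions, not the paper's code.

import numpy as np

def ps_roi_pool(score_maps, roi, k=3):
    """score_maps: (k*k*C, H, W) maps, one bank per cell of a k x k ROI grid.
    roi: (x0, y0, x1, y1) in map coordinates. Returns (C,) class scores."""
    n_banks, H, W = score_maps.shape
    C = n_banks // (k * k)
    x0, y0, x1, y1 = roi
    xs = np.linspace(x0, x1, k + 1).astype(int)
    ys = np.linspace(y0, y1, k + 1).astype(int)
    votes = np.zeros(C)
    for i in range(k):          # grid row
        for j in range(k):      # grid column
            bank = (i * k + j) * C
            cell = score_maps[bank:bank + C,
                              ys[i]:max(ys[i + 1], ys[i] + 1),
                              xs[j]:max(xs[j + 1], xs[j] + 1)]
            votes += cell.mean(axis=(1, 2))   # each cell votes from its own bank
    return votes / (k * k)

maps = np.random.rand(9 * 4, 32, 32)          # k=3 grid, C=4 classes (toy)
print(ps_roi_pool(maps, roi=(4, 6, 20, 26)))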
Timing the impact of literacy on visual processing
Pegado, Felipe; Comerlato, Enio; Ventura, Fabricio; Jobert, Antoinette; Nakamura, Kimihiro; Buiatti, Marco; Ventura, Paulo; Dehaene-Lambertz, Ghislaine; Kolinsky, Régine; Morais, José; Braga, Lucia W.; Cohen, Laurent; Dehaene, Stanislas
2014-01-01
Learning to read requires the acquisition of an efficient visual procedure for quickly recognizing fine print. Thus, reading practice could induce a perceptual learning effect in early vision. Using functional magnetic resonance imaging (fMRI) in literate and illiterate adults, we previously demonstrated an impact of reading acquisition on both high- and low-level occipitotemporal visual areas, but could not resolve the time course of these effects. To clarify whether literacy affects early vs. late stages of visual processing, we measured event-related potentials to various categories of visual stimuli in healthy adults with variable levels of literacy, including completely illiterate subjects, early-schooled literate subjects, and subjects who learned to read in adulthood (ex-illiterates). The stimuli included written letter strings forming pseudowords, on which literacy is expected to have a major impact, as well as faces, houses, tools, checkerboards, and false fonts. To evaluate the precision with which these stimuli were encoded, we studied repetition effects by presenting the stimuli in pairs composed of repeated, mirrored, or unrelated pictures from the same category. The results indicate that reading ability is correlated with a broad enhancement of early visual processing, including increased repetition suppression, suggesting better exemplar discrimination, and increased mirror discrimination, as early as ∼100–150 ms in the left occipitotemporal region. These effects were found with letter strings and false fonts, but also were partially generalized to other visual categories. Thus, learning to read affects the magnitude, precision, and invariance of early visual processing. PMID:25422460
Complexity of the laminar-turbulent boundary in pipe flow
NASA Astrophysics Data System (ADS)
Budanur, Nazmi Burak; Hof, Björn
2018-05-01
Over the past decade, the edge of chaos has proven to be a fruitful starting point for investigations of shear flows when the laminar base flow is linearly stable. Numerous computational studies of shear flows have demonstrated the existence of states that separate the laminar and turbulent regions of the state space. In addition, some studies have determined invariant solutions that reside on this edge. In this paper, we study the unstable manifold of one such solution with the aid of continuous symmetry reduction, which we formulate here for the simultaneous quotienting of axial and azimuthal symmetries. Upon investigating the unstable manifold, we discover a previously unknown traveling-wave solution on the laminar-turbulent boundary with a relatively complex structure. By means of low-dimensional projections, we visualize the different dynamical paths that connect these solutions to turbulence. Our numerical experiments demonstrate that the laminar-turbulent boundary exhibits qualitatively different regions whose properties are influenced by the nearby invariant solutions.
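The continuous symmetry reduction used here can be illustrated in one dimension with a first-Fourier-mode slice: a periodic field is shifted so that the phase of its first Fourier mode vanishes, so all translated copies map to one representative. The numpy sketch below is a toy analogue, not the authors' pipe-flow implementation, which quotients axial and azimuthal shifts simultaneously:

import numpy as np

def first_mode_slice(u):
    # shift a real periodic field so the phase of its k = 1 Fourier mode is zero
    uhat = np.fft.fft(u)
    phi = np.angle(uhat[1])
    k = np.fft.fftfreq(len(u)) * len(u)      # integer wavenumbers
    return np.fft.ifft(uhat * np.exp(-1j * k * phi)).real

# two translated copies of the same field map to the same representative
z = np.linspace(0, 2 * np.pi, 128, endpoint=False)
u1 = np.sin(z) + 0.3 * np.sin(2 * z + 0.7)
u2 = np.roll(u1, 17)
assert np.allclose(first_mode_slice(u1), first_mode_slice(u2))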
Dynamic Encoding of Face Information in the Human Fusiform Gyrus
Ghuman, Avniel Singh; Brunet, Nicolas M.; Li, Yuanning; Konecky, Roma O.; Pyles, John A.; Walls, Shawn A.; Destefino, Vincent; Wang, Wei; Richardson, R. Mark
2014-01-01
Humans’ ability to rapidly and accurately detect, identify, and classify faces under variable conditions derives from a network of brain regions highly tuned to face information. The fusiform face area (FFA) is thought to be a computational hub for face processing; however, the temporal dynamics of face information processing in the FFA remain unclear. Here we use multivariate pattern classification to decode the temporal dynamics of expression-invariant face information processing using electrodes placed directly on the FFA in humans. Early FFA activity (50-75 ms) contained information regarding whether participants were viewing a face. Activity between 200 and 500 ms contained expression-invariant information about which of 70 faces participants were viewing, along with the individual differences in facial features and their configurations. Long-lasting (500+ ms) broadband gamma frequency activity predicted task performance. These results elucidate the dynamic computational role the FFA plays in multiple face processing stages and indicate what information is used in performing these visual analyses. PMID:25482825
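The decoding scheme can be sketched as time-resolved multivariate classification over short windows. The sketch below uses scikit-learn on synthetic data as a stand-in for the intracranial recordings; trial counts, channel counts, and labels are illustrative assumptions:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def sliding_window_decoding(X, y, times, width=0.025, step=0.010):
    # X: (trials, channels, samples); y: condition labels per trial
    scores, t = [], times[0]
    while t + width <= times[-1]:
        mask = (times >= t) & (times < t + width)
        Xw = X[:, :, mask].mean(axis=2)          # window-averaged features
        clf = LogisticRegression(max_iter=1000)
        scores.append((t, cross_val_score(clf, Xw, y, cv=5).mean()))
        t += step
    return scores

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 32, 300))              # placeholder epochs
y = rng.integers(0, 2, size=120)                 # e.g., face vs. non-face
times = np.linspace(-0.1, 0.5, 300)
curve = sliding_window_decoding(X, y, times)     # accuracy as a function of time

Plotting such a curve against time is how early (50-75 ms) versus later (200-500 ms) information content can be separated.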
Objective Measures of Visual Function in Papilledema
Moss, Heather E.
2016-01-01
Visual function is an important parameter to consider when managing patients with papilledema. Though the current standard of care uses standard automated perimetry (SAP) to obtain this information, this test is inherently subjective and prone to patient errors. Objective visual function tests including the visual evoked potential, pattern electroretinogram, photopic negative response of the full field electroretinogram, and pupillary light response have the potential to replace or supplement subjective visual function tests in papilledema management. This article reviews the evidence for use of objective visual function tests to assess visual function in papilledema and discusses future investigations needed to develop them as clinically practical and useful measures for this purpose. PMID:28451649
Hastings, Gareth D.; Marsack, Jason D.; Nguyen, Lan Chi; Cheng, Han; Applegate, Raymond A.
2017-01-01
Purpose: To prospectively examine whether using the visual image quality metric, visual Strehl (VSX), to optimise objective refraction from wavefront error measurements can provide equivalent or better visual performance than subjective refraction, and which refraction is preferred in free viewing. Methods: Subjective refractions and wavefront aberrations were measured on 40 visually normal eyes of 20 subjects, through natural and dilated pupils. For each eye, a sphere, cylinder, and axis prescription was also objectively determined that optimised visual image quality (VSX) for the measured wavefront error. High contrast (HC) and low contrast (LC) logMAR visual acuity (VA) and short-term monocular distance vision preference were recorded and compared between the VSX-objective and subjective prescriptions, both undilated and dilated. Results: For 36 myopic eyes, clinically equivalent (and not statistically different) HC VA was provided by both the objective and subjective refractions (undilated mean ± SD was −0.06 ± 0.04 with both refractions; dilated was −0.05 ± 0.04 with the objective and −0.05 ± 0.05 with the subjective refraction). LC logMAR VA provided by the objective refraction was also clinically equivalent and not statistically different to that provided by the subjective refraction through both natural and dilated pupils for myopic eyes. In free viewing, the objective prescription was preferred over the subjective by 72% of myopic eyes when not dilated. For four habitually undercorrected, highly hyperopic eyes, the VSX-objective refraction was more positive in spherical power and VA was poorer than with the subjective refraction. Conclusions: A method of simultaneously optimising sphere, cylinder, and axis from wavefront error measurements, using the visual image quality metric VSX, is described. In myopic subjects, visual performance, as measured by HC and LC VA, with this VSX-objective refraction was equivalent to that provided by subjective refraction, and the objective refraction was typically preferred. Subjective refraction was preferred by habitually undercorrected hyperopic eyes. PMID:28370389
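For background, the standard power-vector conversion (Thibos et al.) turns second-order Zernike coefficients into a sphere/cylinder/axis prescription. The paper's method goes further, searching candidate prescriptions and keeping the one that maximizes VSX computed from the full wavefront, whereas the closed form below ignores higher-order aberrations; a hedged numpy sketch:

import numpy as np

def zernike_to_prescription(c2m2, c20, c22, r_mm):
    # power-vector conversion: coefficients in microns over a pupil of
    # radius r_mm (mm) give M, J0, J45 in diopters
    M = -4 * np.sqrt(3) * c20 / r_mm ** 2
    J0 = -2 * np.sqrt(6) * c22 / r_mm ** 2
    J45 = -2 * np.sqrt(6) * c2m2 / r_mm ** 2
    C = -2 * np.hypot(J0, J45)                    # minus-cylinder convention
    S = M - C / 2
    axis = np.degrees(0.5 * np.arctan2(J45, J0)) % 180
    return S, C, axis

# e.g., a mildly myopic, slightly astigmatic eye over a 5-mm pupil
print(zernike_to_prescription(c2m2=0.05, c20=1.2, c22=-0.10, r_mm=2.5))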
Visual search and contextual cueing: differential effects in 10-year-old children and adults.
Couperus, Jane W; Hunt, Ruskin H; Nelson, Charles A; Thomas, Kathleen M
2011-02-01
The development of contextual cueing, specifically in relation to attention, was examined in two experiments. Adult and 10-year-old participants completed a contextual cueing visual search task (Jiang & Chun, The Quarterly Journal of Experimental Psychology, 54A(4), 1105-1124, 2001) containing stimuli presented in an attended (e.g., red) and an unattended (e.g., green) color. When the spatial configuration of stimuli in the attended and unattended colors was invariant and consistently paired with the target location, adult reaction times improved, demonstrating learning. Learning also occurred if only the configuration of the attended stimuli remained fixed. In contrast, while 10-year-olds, like adults, showed incrementally slower reaction times as the number of attended stimuli increased, they did not show learning in the standard paradigm. However, they did show learning when the ratio of attended to unattended stimuli was high, irrespective of the total number of attended stimuli. These findings suggest that children show efficient attentional guidance by color in visual search, but differ from adults in contextual cueing.
Global Sensory Qualities and Aesthetic Experience in Music.
Brattico, Pauli; Brattico, Elvira; Vuust, Peter
2017-01-01
A well-known tradition in the study of visual aesthetics holds that the experience of visual beauty is grounded in global computational or statistical properties of the stimulus, for example, a scale-invariant Fourier spectrum or self-similarity. Some approaches rely on neural mechanisms, such as efficient computation, processing fluency, or the responsiveness of cells in the primary visual cortex. These proposals are united by the fact that the contributing factors are hypothesized to be global (i.e., they concern the percept as a whole), formal or non-conceptual (i.e., they concern form instead of content), computational and/or statistical, and based on relatively low-level sensory properties. Here we propose that the study of aesthetic responses to music could benefit from the same approach. Thus, along with local features such as pitch, tuning, consonance/dissonance, harmony, timbre, or beat, global sonic properties could also be viewed as contributing to the creation of an aesthetic musical experience. Several such properties are discussed, and their neural implementation is reviewed in the light of recent advances in neuroaesthetics.
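One such global statistic, the exponent of a scale-invariant 1/f^alpha power spectrum, is easy to estimate with a log-log fit; the sketch below applies it to a generic 1-D signal (an audio waveform, say), with the sampling rate as an illustrative parameter:

import numpy as np

def spectral_slope(x, fs=44100.0):
    # fit log power vs. log frequency; returns alpha in P(f) ~ 1/f^alpha
    f = np.fft.rfftfreq(len(x), d=1.0 / fs)[1:]      # drop the DC bin
    p = np.abs(np.fft.rfft(x))[1:] ** 2
    slope, _ = np.polyfit(np.log(f), np.log(p), 1)
    return -slope

rng = np.random.default_rng(2)
print(spectral_slope(rng.normal(size=2 ** 16)))      # white noise: alpha near 0

Pink (1/f) noise would give a value near 1; natural images and much music are reported to sit in that scale-invariant regime.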
A neural model of motion processing and visual navigation by cortical area MST.
Grossberg, S; Mingolla, E; Pack, C
1999-12-01
Cells in the dorsal medial superior temporal cortex (MSTd) process optic flow generated by self-motion during visually guided navigation. A neural model shows how interactions between well-known neural mechanisms (log polar cortical magnification, Gaussian motion-sensitive receptive fields, spatial pooling of motion-sensitive signals and subtractive extraretinal eye movement signals) lead to emergent properties that quantitatively simulate neurophysiological data about MSTd cell properties and psychophysical data about human navigation. Model cells match MSTd neuron responses to optic flow stimuli placed in different parts of the visual field, including position invariance, tuning curves, preferred spiral directions, direction reversals, average response curves and preferred locations for stimulus motion centers. The model shows how the preferred motion direction of the most active MSTd cells can explain human judgments of self-motion direction (heading), without using complex heading templates. The model explains when extraretinal eye movement signals are needed for accurate heading perception, and when retinal input is sufficient, and how heading judgments depend on scene layouts and rotation rates.
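The template-matching core of such models can be caricatured in a few lines: each model MSTd unit prefers one focus of expansion (FOE) and weights the match between the observed flow and its radial template with a Gaussian envelope. This toy sketch is not the authors' model (which includes log polar magnification and extraretinal signals), just the heading-from-flow idea:

import numpy as np

def heading_from_flow(flow, grid, candidates, sigma=0.5):
    # flow, grid: (N, 2) arrays; candidates: candidate FOE positions
    fn = flow / (np.linalg.norm(flow, axis=1, keepdims=True) + 1e-9)
    responses = []
    for foe in candidates:
        template = grid - foe                      # radial expansion from foe
        tn = template / (np.linalg.norm(template, axis=1, keepdims=True) + 1e-9)
        w = np.exp(-np.sum((grid - foe) ** 2, axis=1) / (2 * sigma ** 2))
        responses.append(np.sum(w * np.sum(tn * fn, axis=1)))
    return candidates[int(np.argmax(responses))]   # preferred FOE of max cell

xs = np.linspace(-1, 1, 15)
grid = np.array([(x, y) for x in xs for y in xs])
flow = grid - np.array([0.2, -0.1])                # pure observer translation
print(heading_from_flow(flow, grid, grid))         # recovers the FOE up to grid spacing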
A Model of Generating Visual Place Cells Based on Environment Perception and Similar Measure.
Zhou, Yang; Wu, Dewei
2016-01-01
Generating visual place cells (VPCs) is an important problem in the field of bioinspired navigation. By analyzing the firing characteristics of biological place cells and the existing methods for generating VPCs, a model of generating visual place cells based on environment perception and a similarity measure is abstracted in this paper. The VPC generation process is divided into three phases: environment perception, similarity measurement, and recruitment of a new place cell. Following this process, a specific method for generating VPCs is presented. External reference landmarks are obtained from local invariant image features, and a similarity measure function is designed based on Euclidean distance and a Gaussian function. Simulations validate that the proposed method is effective. The firing characteristics of the generated VPCs are similar to those of biological place cells, and the VPCs' firing fields can be adjusted flexibly by changing the adjustment factor of the firing field (AFFF) and the firing rate threshold (FRT). PMID:27597859
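The three phases lend themselves to a compact sketch. The class below is a reconstruction from the abstract alone: the Gaussian-of-Euclidean-distance similarity and the recruit-on-low-firing rule follow the text, while the feature extraction and exact function forms are assumptions (afff and frt mirror the paper's AFFF and FRT):

import numpy as np

class VisualPlaceCellMap:
    def __init__(self, afff=1.0, frt=0.5):
        self.afff = afff      # adjustment factor of firing field (width)
        self.frt = frt        # firing rate threshold for recruitment
        self.centers = []     # one stored landmark feature vector per cell

    def firing(self, features):
        # Gaussian of Euclidean distance to each stored place-cell center
        d = np.array([np.linalg.norm(features - c) for c in self.centers])
        return np.exp(-d ** 2 / (2 * self.afff ** 2))

    def observe(self, features):
        # recruit a new place cell when no existing cell fires above FRT
        if not self.centers or self.firing(features).max() < self.frt:
            self.centers.append(np.asarray(features, dtype=float))
        return self.firing(features)

Raising afff widens every firing field, and raising frt makes recruitment more frequent, which is the flexibility the abstract describes.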
A visual model for object detection based on active contours and level-set method.
Satoh, Shunji
2006-09-01
A visual model for object detection is proposed. To make its detection ability comparable with existing technical methods for object detection, an evolution equation for the model's neurons is derived from the computational principle of active contours. The hierarchical structure of the model emerges naturally from this evolution equation. A drawback of active contours, their sensitivity to initial values, is alleviated by introducing and formulating convexity, a visual property. Numerical experiments show that the proposed model detects objects with complex topologies and that it is tolerant of noise. A visual attention model is then introduced into the proposed model. Further simulations show that the visual properties of the model are consistent with the results of psychological experiments on the relation between figure-ground reversal and visual attention. We also demonstrate that the model tends to perceive smaller regions as figures, a characteristic observed in human visual perception.
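A generic level-set update from the same family (mean curvature plus a balloon force) shows the flavor of such an evolution equation; this is a bare-bones sketch, not the paper's exact equation, and real implementations add edge-stopping terms and reinitialization:

import numpy as np

def level_set_step(phi, dt=0.1, balloon=0.2):
    # evolve the zero level set of phi by phi_t = |grad phi| * (curvature + balloon)
    gy, gx = np.gradient(phi)
    mag = np.sqrt(gx ** 2 + gy ** 2) + 1e-8
    nx, ny = gx / mag, gy / mag
    kappa = np.gradient(nx, axis=1) + np.gradient(ny, axis=0)  # div of unit normal
    return phi + dt * mag * (kappa + balloon)

y, x = np.mgrid[0:64, 0:64]
phi = np.sqrt((x - 32.0) ** 2 + (y - 32.0) ** 2) - 20.0   # circle, radius 20
for _ in range(50):
    phi = level_set_step(phi)   # with this embedding, the contour shrinks

Because the contour is implicit in phi, splits and merges (complex topologies) come for free, which is the property the abstract highlights.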
When apperceptive agnosia is explained by a deficit of primary visual processing.
Serino, Andrea; Cecere, Roberto; Dundon, Neil; Bertini, Caterina; Sanchez-Castaneda, Cristina; Làdavas, Elisabetta
2014-03-01
Visual agnosia is a deficit in shape perception, affecting figure, object, face, and letter recognition. Agnosia is usually attributed to lesions of high-order modules of the visual system, which combine visual cues to represent the shape of objects. However, most previously reported agnosia cases presented visual field (VF) defects and poor primary visual processing. The present case study aims to verify whether form agnosia could be explained by a deficit in basic visual functions, rather than by a deficit in high-order shape recognition. Patient SDV suffered a bilateral lesion of the occipital cortex due to anoxia. When tested, he could navigate, interact with others, and was autonomous in daily life activities. However, he could not recognize objects from drawings and figures, read, or recognize familiar faces. He was able to recognize objects by touch and people by their voices. Assessments of visual functions showed blindness at the centre of the VF, up to almost 5°, bilaterally, with better stimulus detection in the periphery. Colour and motion perception was preserved. Psychophysical experiments showed that SDV's visual recognition deficits were not explained by poor spatial acuity or by the crowding effect. Rather, a severe deficit in line orientation processing might be a key mechanism explaining SDV's agnosia. Line orientation processing is a basic function of primary visual cortex neurons, necessary for detecting the "edges" of visual stimuli to build up a "primal sketch" for object recognition. We propose, therefore, that some forms of visual agnosia may be explained by deficits in basic visual functions due to widespread lesions of the primary visual areas, affecting primary levels of visual processing.
Seymour, K J; Williams, M A; Rich, A N
2016-05-01
Many theories of visual object perception assume the visual system initially extracts borders between objects and their background and then "fills in" color to the resulting object surfaces. We investigated the transformation of chromatic signals across the human ventral visual stream, with particular interest in distinguishing representations of object surface color from representations of chromatic signals reflecting the retinal input. We used fMRI to measure brain activity while participants viewed figure-ground stimuli that differed either in the position or in the color contrast polarity of the foreground object (the figure). Multivariate pattern analysis revealed that classifiers were able to decode information about which color was presented at a particular retinal location from early visual areas, whereas regions further along the ventral stream exhibited biases for representing color as part of an object's surface, irrespective of its position on the retina. Additional analyses showed that although activity in V2 contained strong chromatic contrast information to support the early parsing of objects within a visual scene, activity in this area also signaled information about object surface color. These findings are consistent with the view that mechanisms underlying scene segmentation and the binding of color to object surfaces converge in V2.
Robust selectivity to two-object images in human visual cortex
Agam, Yigal; Liu, Hesheng; Papanastassiou, Alexander; Buia, Calin; Golby, Alexandra J.; Madsen, Joseph R.; Kreiman, Gabriel
2010-01-01
We can recognize objects in a fraction of a second in spite of the presence of other objects [1–3]. The responses in macaque areas V4 and inferior temporal cortex [4–15] to a neuron’s preferred stimuli are typically suppressed by the addition of a second object within the receptive field (see however [16, 17]). How can this suppression be reconciled with rapid visual recognition in complex scenes? One option is that certain “special categories” are unaffected by other objects [18] but this leaves the problem unsolved for other categories. Another possibility is that serial attentional shifts help ameliorate the problem of distractor objects [19–21]. Yet, psychophysical studies [1–3], scalp recordings [1] and neurophysiological recordings [14, 16, 22–24], suggest that the initial sweep of visual processing contains a significant amount of information. We recorded intracranial field potentials in human visual cortex during presentation of flashes of two-object images. Visual selectivity from temporal cortex during the initial ~200 ms was largely robust to the presence of other objects. We could train linear decoders on the responses to isolated objects and decode information in two-object images. These observations are compatible with parallel, hierarchical and feed-forward theories of rapid visual recognition [25] and may provide a neural substrate to begin to unravel rapid recognition in natural scenes. PMID:20417105
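The decoding result can be sketched as training a linear classifier on responses to isolated objects and testing it on two-object trials; everything below (shapes, labels, synthetic data) is an illustrative placeholder for the intracranial field potentials:

import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(3)
n_electrodes = 40
X_isolated = rng.normal(size=(400, n_electrodes))   # responses to single objects
y_isolated = rng.integers(0, 5, size=400)           # 5 object categories
X_pairs = rng.normal(size=(200, n_electrodes))      # responses to object pairs
y_target = rng.integers(0, 5, size=200)             # category of one pair member

clf = LinearSVC().fit(X_isolated, y_isolated)
acc = (clf.predict(X_pairs) == y_target).mean()     # chance on this random data;
                                                    # above chance on real data
                                                    # would indicate robustness

Generalization from isolated to paired displays is the operational test of selectivity that is robust to the presence of other objects.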
Spatiotemporal dynamics underlying object completion in human ventral visual cortex.
Tang, Hanlin; Buia, Calin; Madhavan, Radhika; Crone, Nathan E; Madsen, Joseph R; Anderson, William S; Kreiman, Gabriel
2014-08-06
Natural vision often involves recognizing objects from partial information. Recognition of objects from parts presents a significant challenge for theories of vision because it requires spatial integration and extrapolation from prior knowledge. Here we recorded intracranial field potentials from 113 visually selective electrodes in epilepsy patients in response to whole and partial objects. Responses along the ventral visual stream, particularly in the inferior occipital and fusiform gyri, remained selective even when only 9%-25% of the object area was shown. However, these visually selective signals emerged ∼100 ms later for partial versus whole objects. These processing delays were particularly pronounced in higher visual areas within the ventral stream. This latency difference persisted when controlling for changes in contrast, signal amplitude, and the strength of selectivity. These results argue against a purely feedforward explanation of recognition from partial information, and provide spatiotemporal constraints on theories of object recognition that involve recurrent processing.
Hearing the shape of the Ising model with a programmable superconducting-flux annealer.
Vinci, Walter; Markström, Klas; Boixo, Sergio; Roy, Aidan; Spedalieri, Federico M; Warburton, Paul A; Severini, Simone
2014-07-16
Two objects can be distinguished if they have different measurable properties. Thus, distinguishability depends on the physics of the objects. In considering graphs, we revisit the Ising model as a framework for defining physically meaningful spectral invariants. In this context, we introduce a family of refinements of the classical spectrum and consider the quantum partition function. We demonstrate that the energy spectrum of the quantum Ising Hamiltonian is a stronger invariant than the classical one without refinements. To implement the related physical systems, we perform experiments on a programmable annealer based on superconducting flux technology. Departing from the paradigm of adiabatic computation, we take advantage of the noisy evolution of the device to generate statistics of low-energy states. The graphs considered in the experiments have the same classical partition functions but different quantum spectra. The data obtained from the annealer distinguish non-isomorphic graphs via the information contained in the classical refinements of the partition functions, but not via the differences in the quantum spectra.
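The classical invariant in question can be computed by brute force for small graphs: the multiset of Ising energies (the classical spectrum) determines the partition function at every temperature. A short sketch:

import itertools
import numpy as np

def ising_energy_spectrum(edges, n):
    # multiset of H(s) = -sum over edges of s_i * s_j over all 2^n spin states
    return sorted(-sum(s[i] * s[j] for i, j in edges)
                  for s in itertools.product((-1, 1), repeat=n))

def partition_function(edges, n, beta=1.0):
    # Z(beta) = sum_s exp(-beta * H(s)); equal spectra give equal Z at all beta
    return sum(np.exp(-beta * E) for E in ising_energy_spectrum(edges, n))

g = [(0, 1), (1, 2), (2, 3)]                 # a path on 4 vertices
print(ising_energy_spectrum(g, 4))

Two non-isomorphic graphs whose energy multisets coincide cannot be "heard apart" by this classical invariant, which is exactly where the refinements and the quantum spectrum discussed above come in.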
Automatic recognition of ship types from infrared images using superstructure moment invariants
NASA Astrophysics Data System (ADS)
Li, Heng; Wang, Xinyu
2007-11-01
Automatic object recognition is an active area of interest for military and commercial applications. In this paper, a system for the autonomous recognition of ship types in infrared images is proposed. First, an approach to segmentation based on the detection of salient target features, with subsequent shadow removal, is proposed; this forms the basis of the subsequent object recognition. Considering that the differences between the shapes of various ships mainly lie in their superstructures, we then use superstructure moment functions that are invariant to translation, rotation, and scale differences in the input patterns, and we develop a robust algorithm for extracting the ship superstructure. A back-propagation neural network is then used as the classifier in the recognition stage, with projection images of simulated three-dimensional ship models used as the training sets. Our recognition model was implemented and experimentally validated using both simulated three-dimensional ship model images and real images derived from video from an AN/AAS-44V Forward Looking Infrared (FLIR) sensor.
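Standard moment invariants illustrate the idea behind the superstructure moment functions; the sketch below computes the first four of Hu's seven invariants for a binary silhouette (the paper's functions are a superstructure-specific variant, so this is background, not its algorithm):

import numpy as np

def hu_moments(img):
    # first four Hu invariants: translation/rotation/scale-invariant shape features
    y, x = np.mgrid[:img.shape[0], :img.shape[1]].astype(float)
    m00 = img.sum()
    xc, yc = (x * img).sum() / m00, (y * img).sum() / m00
    def mu(p, q):                    # central moments (translation invariant)
        return ((x - xc) ** p * (y - yc) ** q * img).sum()
    def eta(p, q):                   # scale-normalized central moments
        return mu(p, q) / m00 ** (1 + (p + q) / 2.0)
    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    return np.array([
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2,
        (n30 + n12) ** 2 + (n21 + n03) ** 2,
    ])

mask = np.zeros((64, 64))
mask[20:40, 10:50] = 1.0             # toy segmented superstructure silhouette
print(hu_moments(mask))

A feature vector like this, computed from the segmented superstructure, is the kind of input the back-propagation classifier would receive.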