view-based object recognition: Topics by Science.gov

Sample records for view-based object recognition

Development of novel tasks for studying view-invariant object recognition in rodents: Sensitivity to scopolamine.

PubMed

Mitchnick, Krista A; Wideman, Cassidy E; Huff, Andrew E; Palmer, Daniel; McNaughton, Bruce L; Winters, Boyer D

2018-05-15

The capacity to recognize objects from different view-points or angles, referred to as view-invariance, is an essential process that humans engage in daily. Currently, the ability to investigate the neurobiological underpinnings of this phenomenon is limited, as few ethologically valid view-invariant object recognition tasks exist for rodents. Here, we report two complementary, novel view-invariant object recognition tasks in which rodents physically interact with three-dimensional objects. Prior to experimentation, rats and mice were given extensive experience with a set of 'pre-exposure' objects. In a variant of the spontaneous object recognition task, novelty preference for pre-exposed or new objects was assessed at various angles of rotation (45°, 90° or 180°); unlike control rodents, for whom the objects were novel, rats and mice tested with pre-exposed objects did not discriminate between rotated and un-rotated objects in the choice phase, indicating substantial view-invariant object recognition. Secondly, using automated operant touchscreen chambers, rats were tested on pre-exposed or novel objects in a pairwise discrimination task, where the rewarded stimulus (S+) was rotated (180°) once rats had reached acquisition criterion; rats tested with pre-exposed objects re-acquired the pairwise discrimination following S+ rotation more effectively than those tested with new objects. Systemic scopolamine impaired performance on both tasks, suggesting involvement of acetylcholine at muscarinic receptors in view-invariant object processing. These tasks present novel means of studying the behavioral and neural bases of view-invariant object recognition in rodents. Copyright © 2018 Elsevier B.V. All rights reserved.
Generalization between canonical and non-canonical views in object recognition

PubMed Central

Ghose, Tandra; Liu, Zili

2013-01-01

Viewpoint generalization in object recognition is the process that allows recognition of a given 3D object from many different viewpoints despite variations in its 2D projections. We used the canonical view effects as a foundation to empirically test the validity of a major theory in object recognition, the view-approximation model (Poggio & Edelman, 1990). This model predicts that generalization should be better when an object is first seen from a non-canonical view and then a canonical view than when seen in the reversed order. We also manipulated object similarity to study the degree to which this view generalization was constrained by shape details and task instructions (object vs. image recognition). Old-new recognition performance for basic and subordinate level objects was measured in separate blocks. We found that for object recognition, view generalization between canonical and non-canonical views was comparable for basic level objects. For subordinate level objects, recognition performance was more accurate from non-canonical to canonical views than the other way around. When the task was changed from object recognition to image recognition, the pattern of the results reversed. Interestingly, participants responded “old” to “new” images of “old” objects with a substantially higher rate than to “new” objects, despite instructions to the contrary, thereby indicating involuntary view generalization. Our empirical findings are incompatible with the prediction of the view-approximation theory, and argue against the hypothesis that views are stored independently. PMID:23283692
Three-dimensional model-based object recognition and segmentation in cluttered scenes.

PubMed

Mian, Ajmal S; Bennamoun, Mohammed; Owens, Robyn

2006-10-01

Viewpoint independent recognition of free-form objects and their segmentation in the presence of clutter and occlusions is a challenging task. We present a novel 3D model-based algorithm which performs this task automatically and efficiently. A 3D model of an object is automatically constructed offline from its multiple unordered range images (views). These views are converted into multidimensional table representations (which we refer to as tensors). Correspondences are automatically established between these views by simultaneously matching the tensors of a view with those of the remaining views using a hash table-based voting scheme. This results in a graph of relative transformations used to register the views before they are integrated into a seamless 3D model. These models and their tensor representations constitute the model library. During online recognition, a tensor from the scene is simultaneously matched with those in the library by casting votes. Similarity measures are calculated for the model tensors which receive the most votes. The model with the highest similarity is transformed to the scene and, if it aligns accurately with an object in the scene, that object is declared as recognized and is segmented. This process is repeated until the scene is completely segmented. Experiments were performed on real and synthetic data comprised of 55 models and 610 scenes and an overall recognition rate of 95 percent was achieved. Comparison with the spin images revealed that our algorithm is superior in terms of recognition rate and efficiency.
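The hash table-based voting step described in this abstract can be illustrated with a minimal sketch. It is not the authors' tensor representation: the three-number descriptors, the quantization step, and the names `model_a` and `model_b` are all hypothetical stand-ins chosen for illustration.

```python
from collections import Counter, defaultdict

def quantize(descriptor, step=0.25):
    """Map a descriptor (tuple of floats) to a coarse integer key."""
    return tuple(round(v / step) for v in descriptor)

def build_library(models, step=0.25):
    """models: {name: [descriptor, ...]} -> hash table mapping key -> set of names."""
    table = defaultdict(set)
    for name, descriptors in models.items():
        for d in descriptors:
            table[quantize(d, step)].add(name)
    return table

def vote(table, scene_descriptors, step=0.25):
    """Each scene descriptor casts one vote for every model sharing its key."""
    votes = Counter()
    for d in scene_descriptors:
        for name in table.get(quantize(d, step), ()):
            votes[name] += 1
    return votes

# Hypothetical library and scene descriptors (made-up numbers).
library = build_library({
    "model_a": [(0.10, 0.90, 0.40), (0.70, 0.20, 0.50)],
    "model_b": [(0.60, 0.60, 0.10), (0.95, 0.05, 0.30)],
})
scene = [(0.12, 0.88, 0.41), (0.71, 0.22, 0.49), (0.30, 0.30, 0.30)]
ranking = vote(library, scene).most_common()
```

In the pipeline the abstract describes, only the top-voted models would then go on to the similarity computation and alignment check, which this sketch omits.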
View-invariant object recognition ability develops after discrimination, not mere exposure, at several viewing angles.

PubMed

Yamashita, Wakayo; Wang, Gang; Tanaka, Keiji

2010-01-01

One usually fails to recognize an unfamiliar object across changes in viewing angle when it has to be discriminated from similar distractor objects. Previous work has demonstrated that after long-term experience in discriminating among a set of objects seen from the same viewing angle, immediate recognition of the objects across 30-60 degrees changes in viewing angle becomes possible. The capability for view-invariant object recognition should develop during the within-viewing-angle discrimination, which includes two kinds of experience: seeing individual views and discriminating among the objects. The aim of the present study was to determine the relative contribution of each factor to the development of view-invariant object recognition capability. Monkeys were first extensively trained in a task that required view-invariant object recognition (Object task) with several sets of objects. The animals were then exposed to a new set of objects over 26 days in one of two preparatory tasks: one in which each object view was seen individually, and a second that required discrimination among the objects at each of four viewing angles. After the preparatory period, we measured the monkeys' ability to recognize the objects across changes in viewing angle, by introducing the object set to the Object task. Results indicated significant view-invariant recognition after the second but not first preparatory task. These results suggest that discrimination of objects from distractors at each of several viewing angles is required for the development of view-invariant recognition of the objects when the distractors are similar to the objects.
Viewpoint dependence in the recognition of non-elongated familiar objects: testing the effects of symmetry, front-back axis, and familiarity.

PubMed

Niimi, Ryosuke; Yokosawa, Kazuhiko

2009-01-01

Visual recognition of three-dimensional (3-D) objects is relatively impaired for some particular views, called accidental views. For most familiar objects, the front and top views are considered to be accidental views. Previous studies have shown that foreshortening of the axes of elongation of objects in these views impairs recognition, but the influence of other possible factors is largely unknown. Using familiar objects without a salient axis of elongation, we found that a foreshortened symmetry plane of the object and low familiarity of the viewpoint accounted for the relatively worse recognition for front views and top views, independently of the effect of a foreshortened axis of elongation. We found no evidence that foreshortened front-back axes impaired recognition in front views. These results suggest that the viewpoint dependence of familiar object recognition is not a unitary phenomenon. The possible role of symmetry (either 2-D or 3-D) in familiar object recognition is also discussed.
Neural Substrates of View-Invariant Object Recognition Developed without Experiencing Rotations of the Objects

PubMed Central

Okamura, Jun-ya; Yamaguchi, Reona; Honda, Kazunari; Tanaka, Keiji

2014-01-01

One fails to recognize an unfamiliar object across changes in viewing angle when it must be discriminated from similar distractor objects. View-invariant recognition gradually develops as the viewer repeatedly sees the objects in rotation. It is assumed that different views of each object are associated with one another while their successive appearance is experienced in rotation. However, natural experience of objects also contains ample opportunities to discriminate among objects at each of the multiple viewing angles. Our previous behavioral experiments showed that after experiencing a new set of object stimuli during a task that required only discrimination at each of four viewing angles at 30° intervals, monkeys could recognize the objects across changes in viewing angle up to 60°. By recording activities of neurons from the inferotemporal cortex after various types of preparatory experience, we here found a possible neural substrate for the monkeys' performance. For object sets that the monkeys had experienced during the task that required only discrimination at each of four viewing angles, many inferotemporal neurons showed object selectivity covering multiple views. The degree of view generalization found for these object sets was similar to that found for stimulus sets with which the monkeys had been trained to conduct view-invariant recognition. These results suggest that the experience of discriminating new objects in each of several viewing angles develops the partially view-generalized object selectivity distributed over many neurons in the inferotemporal cortex, which in turn bases the monkeys' emergent capability to discriminate the objects across changes in viewing angle. PMID:25378169
The roles of perceptual and conceptual information in face recognition.

PubMed

Schwartz, Linoy; Yovel, Galit

2016-11-01

The representation of familiar objects is comprised of perceptual information about their visual properties as well as the conceptual knowledge that we have about them. What is the relative contribution of perceptual and conceptual information to object recognition? Here, we examined this question by designing a face familiarization protocol during which participants were either exposed to rich perceptual information (viewing each face in different angles and illuminations) or with conceptual information (associating each face with a different name). Both conditions were compared with single-view faces presented with no labels. Recognition was tested on new images of the same identities to assess whether learning generated a view-invariant representation. Results showed better recognition of novel images of the learned identities following association of a face with a name label, but no enhancement following exposure to multiple face views. Whereas these findings may be consistent with the role of category learning in object recognition, face recognition was better for labeled faces only when faces were associated with person-related labels (name, occupation), but not with person-unrelated labels (object names or symbols). These findings suggest that association of meaningful conceptual information with an image shifts its representation from an image-based percept to a view-invariant concept. They further indicate that the role of conceptual information should be considered to account for the superior recognition that we have for familiar faces and objects. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Automatic image database generation from CAD for 3D object recognition

NASA Astrophysics Data System (ADS)

Sardana, Harish K.; Daemi, Mohammad F.; Ibrahim, Mohammad K.

1993-06-01

The development and evaluation of multiple-view 3-D object recognition systems is based on a large set of model images. Due to the various advantages of using CAD, it is becoming more and more practical to use existing CAD data in computer vision systems. Current PC-level CAD systems are capable of providing physical image modelling and rendering involving positional variations in cameras, light sources, etc. We have formulated a modular scheme for automatic generation of various aspects (views) of the objects in a model-based 3-D object recognition system. These views are generated at desired orientations on the unit Gaussian sphere. With a suitable network file sharing system (NFS), the images can be stored directly in a database located on a file server. This paper presents the image modelling solutions using CAD in relation to the multiple-view approach. Our modular scheme for data conversion and automatic image database storage for such a system is discussed. We have used this approach in 3-D polyhedron recognition. An overview of the results, the advantages and limitations of using CAD data, and conclusions on using such a scheme are also presented.
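Generating views "at desired orientations on the unit Gaussian sphere" requires a set of camera directions. The abstract does not say how orientations are chosen, so the sketch below assumes a Fibonacci (golden-angle) lattice, one common way to spread points roughly evenly over a sphere; the function name `sphere_viewpoints` is hypothetical.

```python
import math

def sphere_viewpoints(n):
    """Return n roughly evenly spread unit vectors (candidate camera directions)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n      # heights spread evenly in (-1, 1)
        r = math.sqrt(1.0 - z * z)         # radius of the circle at that height
        phi = i * golden_angle             # azimuth advances by the golden angle
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points

views = sphere_viewpoints(20)
```

Each returned direction could serve as a camera position from which a CAD renderer produces one model image.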
Neural substrates of view-invariant object recognition developed without experiencing rotations of the objects.

PubMed

Okamura, Jun-Ya; Yamaguchi, Reona; Honda, Kazunari; Wang, Gang; Tanaka, Keiji

2014-11-05

One fails to recognize an unfamiliar object across changes in viewing angle when it must be discriminated from similar distractor objects. View-invariant recognition gradually develops as the viewer repeatedly sees the objects in rotation. It is assumed that different views of each object are associated with one another while their successive appearance is experienced in rotation. However, natural experience of objects also contains ample opportunities to discriminate among objects at each of the multiple viewing angles. Our previous behavioral experiments showed that after experiencing a new set of object stimuli during a task that required only discrimination at each of four viewing angles at 30° intervals, monkeys could recognize the objects across changes in viewing angle up to 60°. By recording activities of neurons from the inferotemporal cortex after various types of preparatory experience, we here found a possible neural substrate for the monkeys' performance. For object sets that the monkeys had experienced during the task that required only discrimination at each of four viewing angles, many inferotemporal neurons showed object selectivity covering multiple views. The degree of view generalization found for these object sets was similar to that found for stimulus sets with which the monkeys had been trained to conduct view-invariant recognition. These results suggest that the experience of discriminating new objects in each of several viewing angles develops the partially view-generalized object selectivity distributed over many neurons in the inferotemporal cortex, which in turn bases the monkeys' emergent capability to discriminate the objects across changes in viewing angle. Copyright © 2014 the authors 0270-6474/14/3415047-13$15.00/0.
Target recognition of log-polar ladar range images using moment invariants

NASA Astrophysics Data System (ADS)

Xia, Wenze; Han, Shaokun; Cao, Jie; Yu, Haoyong

2017-01-01

The ladar range image has received considerable attention in the automatic target recognition field. However, previous research does not cover target recognition using log-polar ladar range images. Therefore, we construct a target recognition system based on log-polar ladar range images in this paper. In this system, combined moment invariants and a backpropagation neural network are selected as the shape descriptor and shape classifier, respectively. In order to fully analyze the effect of the log-polar sampling pattern on the recognition result, several comparative experiments based on simulated and real range images are carried out. Eventually, several important conclusions are drawn: (i) if combined moments are computed directly from log-polar range images, the translation, rotation, and scaling invariance properties of the combined moments no longer hold; (ii) when the object is located in the center of the field of view, the recognition rate of log-polar range images is less sensitive to changes in the field of view; (iii) as the object position changes from the center to the edge of the field of view, the recognition performance of log-polar range images declines dramatically; (iv) log-polar range images have better noise robustness than Cartesian range images. Finally, we suggest that it is better to divide the field of view into a recognition area and a searching area in real applications.
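One way to see why object position matters so much for log-polar sampling is to look at the coordinate mapping itself. The sketch below is a generic log-polar transform, not the paper's sensor model; `to_log_polar` and its center parameters are assumptions for illustration.

```python
import math

def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map a Cartesian point to (log radius, angle) about the center (cx, cy)."""
    dx, dy = x - cx, y - cy
    return math.log(math.hypot(dx, dy)), math.atan2(dy, dx)

rho1, theta1 = to_log_polar(3.0, 4.0)
rho2, theta2 = to_log_polar(6.0, 8.0)   # same point, scaled by 2 about the center
```

Scaling about the center only shifts the log-radius coordinate by a constant, whereas moving the object away from the center changes both coordinates in a position-dependent way, which is in line with conclusions (i) and (iii) above.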
Multiview human activity recognition system based on spatiotemporal template for video surveillance system

NASA Astrophysics Data System (ADS)

Kushwaha, Alok Kumar Singh; Srivastava, Rajeev

2015-09-01

An efficient view invariant framework for the recognition of human activities from an input video sequence is presented. The proposed framework is composed of three consecutive modules: (i) detection and localization of people by background subtraction, (ii) creation of view invariant spatiotemporal templates for different activities, and (iii) template matching for view invariant activity recognition. The foreground objects present in a scene are extracted using change detection and background modeling. The view invariant templates are constructed using the motion history images and object shape information for different human activities in a video sequence. For matching the spatiotemporal templates for various activities, the moment invariants and Mahalanobis distance are used. The proposed approach is tested successfully on our own viewpoint dataset, KTH action recognition dataset, i3DPost multiview dataset, MSR viewpoint action dataset, VideoWeb multiview dataset, and WVU multiview human action recognition dataset. From the experimental results and analysis over the chosen datasets, it is observed that the proposed framework is robust, flexible, and efficient with respect to multiple views activity recognition, scale, and phase variations.
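The motion history images mentioned in this abstract can be sketched with the classic update rule, in which moving pixels are stamped with a duration value and all other pixels decay toward zero. This is a generic illustration on a tiny hand-made mask, not the authors' implementation; `update_mhi` and the value of `tau` are assumptions.

```python
def update_mhi(mhi, motion_mask, tau=5):
    """One motion-history update: moving pixels are set to tau, others decay by 1."""
    return [
        [tau if moving else max(0, h - 1) for h, moving in zip(h_row, m_row)]
        for h_row, m_row in zip(mhi, motion_mask)
    ]

# A single-row "image" in which motion sweeps from left to right over three frames.
mhi = [[0, 0, 0]]
for mask in ([[1, 0, 0]], [[0, 1, 0]], [[0, 0, 1]]):
    mhi = update_mhi(mhi, mask)
```

The resulting intensity gradient encodes the direction and recency of motion, and a template of this kind is what the moment invariants would then be computed on.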
On the three-quarter view advantage of familiar object recognition.

PubMed

Nonose, Kohei; Niimi, Ryosuke; Yokosawa, Kazuhiko

2016-11-01

A three-quarter view, i.e., an oblique view, of familiar objects often leads to a higher subjective goodness rating when compared with other orientations. What is the source of the high goodness for oblique views? First, we confirmed that object recognition performance was also best for oblique views around 30° view, even when the foreshortening disadvantage of front- and side-views was minimized (Experiments 1 and 2). In Experiment 3, we measured subjective ratings of view goodness and two possible determinants of view goodness: familiarity of view, and subjective impression of three-dimensionality. Three-dimensionality was measured as the subjective saliency of visual depth information. The oblique views were rated best, most familiar, and as approximating greatest three-dimensionality on average; however, the cluster analyses showed that the "best" orientation systematically varied among objects. We found three clusters of objects: front-preferred objects, oblique-preferred objects, and side-preferred objects. Interestingly, recognition performance and the three-dimensionality rating were higher for oblique views irrespective of the clusters. It appears that recognition efficiency is not the major source of the three-quarter view advantage. There are multiple determinants and variability among objects. This study suggests that the classical idea that a canonical view has a unique advantage in object perception requires further discussion.
Compensation for Blur Requires Increase in Field of View and Viewing Time

PubMed Central

Kwon, MiYoung; Liu, Rong; Chien, Lillian

2016-01-01

Spatial resolution is an important factor for human pattern recognition. In particular, low resolution (blur) is a defining characteristic of low vision. Here, we examined spatial (field of view) and temporal (stimulus duration) requirements for blurry object recognition. The spatial resolution of an image, such as a letter or face, was manipulated with a low-pass filter. In experiment 1, studying spatial requirement, observers viewed a fixed-size object through a window of varying sizes, which was repositioned until object identification (moving window paradigm). Field of view requirement, quantified as the number of “views” (window repositions) for correct recognition, was obtained for three blur levels, including no blur. In experiment 2, studying temporal requirement, we determined threshold viewing time, the stimulus duration yielding criterion recognition accuracy, at six blur levels, including no blur. For letter and face recognition, we found blur significantly increased the number of views, suggesting a larger field of view is required to recognize blurry objects. We also found blur significantly increased threshold viewing time, suggesting longer temporal integration is necessary to recognize blurry objects. The temporal integration reflects the tradeoff between stimulus intensity and time. While humans excel at recognizing blurry objects, our findings suggest compensating for blur requires increased field of view and viewing time. The need for larger spatial and longer temporal integration for recognizing blurry objects may further challenge object recognition in low vision. Thus, interactions between blur and field of view should be considered for developing low vision rehabilitation or assistive aids. PMID:27622710
Spatiotemporal information during unsupervised learning enhances viewpoint invariant object recognition

PubMed Central

Tian, Moqian; Grill-Spector, Kalanit

2015-01-01

Recognizing objects is difficult because it requires both linking views of an object that can be different and distinguishing objects with similar appearance. Interestingly, people can learn to recognize objects across views in an unsupervised way, without feedback, just from the natural viewing statistics. However, there is intense debate regarding what information during unsupervised learning is used to link among object views. Specifically, researchers debate whether temporal proximity, motion, or spatiotemporal continuity among object views during unsupervised learning is beneficial. Here, we untangled the role of each of these factors in unsupervised learning of novel three-dimensional (3-D) objects. We found that after unsupervised training with 24 object views spanning a 180° view space, participants showed significant improvement in their ability to recognize 3-D objects across rotation. Surprisingly, unsupervised learning with spatiotemporal continuity or motion information offered no advantage over training with temporal proximity. However, we discovered that when participants were trained with just a third of the views spanning the same view space, unsupervised learning via spatiotemporal continuity yielded significantly better recognition performance on novel views than learning via temporal proximity. These results suggest that while it is possible to obtain view-invariant recognition just from observing many views of an object presented in temporal proximity, spatiotemporal information enhances performance by producing representations with broader view tuning than learning via temporal association. Our findings have important implications for theories of object recognition and for the development of computational algorithms that learn from examples. PMID:26024454
A Neural-Dynamic Architecture for Concurrent Estimation of Object Pose and Identity

PubMed Central

Lomp, Oliver; Faubel, Christian; Schöner, Gregor

2017-01-01

Handling objects or interacting with a human user about objects on a shared tabletop requires that objects be identified after learning from a small number of views and that object pose be estimated. We present a neurally inspired architecture that learns object instances by storing features extracted from a single view of each object. Input features are color and edge histograms from a localized area that is updated during processing. The system finds the best-matching view for the object in a novel input image while concurrently estimating the object’s pose, aligning the learned view with current input. The system is based on neural dynamics, computationally operating in real time, and can handle dynamic scenes directly off live video input. In a scenario with 30 everyday objects, the system achieves recognition rates of 87.2% from a single training view for each object, while also estimating pose quite precisely. We further demonstrate that the system can track moving objects, and that it can segment the visual array, selecting and recognizing one object while suppressing input from another known object in the immediate vicinity. Evaluation on the COIL-100 dataset, in which objects are depicted from different viewing angles, revealed recognition rates of 91.1% on the first 30 objects, each learned from four training views. PMID:28503145
View-Based Models of 3D Object Recognition and Class-Specific Invariance

DTIC Science & Technology

1994-04-01

underlie recognition of geon-like components (see Edelman, 1991 and Biederman, 1987). $\|x - t_\alpha\|_W^2 = (x - t_\alpha)^T W^T W (x - t_\alpha)$ (3) View-invariant features...Institute of Technology, 1993. neocortex. Biological Cybernetics, 1992. [14] I. Biederman. Recognition by components: a theory of human image understanding. Psychol. Review, 94:115-147, 1987. [20] B. Olshausen, C...Anderson, and D. Van Essen. A neural model of visual attention and invariant pattern
Depth rotation and mirror-image reflection reduce affective preference as well as recognition memory for pictures of novel objects.

PubMed

Lawson, Rebecca

2004-10-01

In two experiments, the identification of novel 3-D objects was worse for depth-rotated and mirror-reflected views, compared with the study view in an implicit affective preference memory task, as well as in an explicit recognition memory task. In Experiment 1, recognition was worse and preference was lower when depth-rotated views of an object were paired with an unstudied object relative to trials when the study view of that object was shown. There was a similar trend for mirror-reflected views. In Experiment 2, the study view of an object was both recognized and preferred above chance when it was paired with either depth-rotated or mirror-reflected views of that object. These results suggest that view-sensitive representations of objects mediate performance in implicit, as well as explicit, memory tasks. The findings do not support the claim that separate episodic and structural description representations underlie performance in implicit and explicit memory tasks, respectively.
The role of perceptual load in object recognition.

PubMed

Lavie, Nilli; Lin, Zhicheng; Zokaei, Nahid; Thoma, Volker

2009-10-01

Predictions from perceptual load theory (Lavie, 1995, 2005) regarding object recognition across the same or different viewpoints were tested. Results showed that high perceptual load reduces distracter recognition levels despite always presenting distracter objects from the same view. They also showed that the levels of distracter recognition were unaffected by a change in the distracter object view under conditions of low perceptual load. These results were found both with repetition priming measures of distracter recognition and with performance on a surprise recognition memory test. The results support load theory proposals that distracter recognition critically depends on the level of perceptual load. The implications for the role of attention in object recognition theories are discussed. PsycINFO Database Record (c) 2009 APA, all rights reserved.
A knowledge-based object recognition system for applications in the space station

NASA Technical Reports Server (NTRS)

Dhawan, Atam P.

1988-01-01

A knowledge-based three-dimensional (3D) object recognition system is being developed. The system uses primitive-based hierarchical relational and structural matching for the recognition of 3D objects in the two-dimensional (2D) image for interpretation of the 3D scene. At present, the pre-processing, low-level preliminary segmentation, rule-based segmentation, and feature extraction are completed. The data structure of the primitive viewing knowledge-base (PVKB) is also completed. Algorithms and programs based on attribute-tree matching for decomposing the segmented data into valid primitives were developed. The frame-based structural and relational descriptions of some objects were created and stored in a knowledge-base. This knowledge-base of frame-based descriptions was developed on the MICROVAX-AI microcomputer in a LISP environment. A simulated 3D scene of simple non-overlapping objects, as well as real camera images of low-complexity 3D objects, has been successfully interpreted.
Does object view influence the scene consistency effect?

PubMed

Sastyin, Gergo; Niimi, Ryosuke; Yokosawa, Kazuhiko

2015-04-01

Traditional research on the scene consistency effect only used clearly recognizable object stimuli to show mutually interactive context effects for both the object and background components on scene perception (Davenport & Potter in Psychological Science, 15, 559-564, 2004). However, in real environments, objects are viewed from multiple viewpoints, including an accidental, hard-to-recognize one. When the observers named target objects in scenes (Experiments 1a and 1b, object recognition task), we replicated the scene consistency effect (i.e., there was higher accuracy for the objects with consistent backgrounds). However, there was a significant interaction effect between consistency and object viewpoint, which indicated that the scene consistency effect was more important for identifying objects in the accidental view condition than in the canonical view condition. Therefore, the object recognition system may rely more on the scene context when the object is difficult to recognize. In Experiment 2, the observers identified the background (background recognition task) while the scene consistency and object views were manipulated. The results showed that object viewpoint had no effect, while the scene consistency effect was observed. More specifically, the canonical and accidental views both equally provided contextual information for scene perception. These findings suggested that the mechanism for conscious recognition of objects could be dissociated from the mechanism for visual analysis of object images that were part of a scene. The "context" that the object images provided may have been derived from its view-invariant, relatively low-level visual features (e.g., color), rather than its semantic information.

View Combination: A Generalization Mechanism for Visual Recognition

ERIC Educational Resources Information Center

Friedman, Alinda; Waller, David; Thrash, Tyler; Greenauer, Nathan; Hodgson, Eric

2011-01-01

We examined whether view combination mechanisms shown to underlie object and scene recognition can integrate visual information across views that have little or no three-dimensional information at either the object or scene level. In three experiments, people learned four "views" of a two dimensional visual array derived from a three-dimensional…
Neural-Network Object-Recognition Program

NASA Technical Reports Server (NTRS)

Spirkovska, L.; Reid, M. B.

1993-01-01

HONTIOR computer program implements third-order neural network exhibiting invariance under translation, change of scale, and in-plane rotation. Invariance incorporated directly into architecture of network. Only one view of each object needed to train network for two-dimensional-translation-invariant recognition of object. Also used for three-dimensional-transformation-invariant recognition by training network on only set of out-of-plane rotated views. Written in C language.
Scene recognition following locomotion around a scene.

PubMed

Motes, Michael A; Finlay, Cory A; Kozhevnikov, Maria

2006-01-01

Effects of locomotion on scene-recognition reaction time (RT) and accuracy were studied. In experiment 1, observers memorized an 11-object scene and made scene-recognition judgments on subsequently presented scenes from the encoded view or different views (i.e., scenes were rotated or observers moved around the scene, both from 40 degrees to 360 degrees). In experiment 2, observers viewed different 5-object scenes on each trial and made scene-recognition judgments from the encoded view or after moving around the scene, from 36 degrees to 180 degrees. Across experiments, scene-recognition RT increased (in experiment 2 accuracy decreased) with angular distance between encoded and judged views, regardless of how the viewpoint changes occurred. The findings raise questions about conditions in which locomotion produces spatially updated representations of scenes.
New neural-networks-based 3D object recognition system

NASA Astrophysics Data System (ADS)

Abolmaesumi, Purang; Jahed, M.

1997-09-01

Three-dimensional object recognition has always been one of the challenging fields in computer vision. In recent years, Ullman and Basri (1991) proposed that this task can be done by using a database of 2-D views of the objects. The main problem in their proposed system is that the corresponding points must be known in order to interpolate the views. In addition, their system needs a supervisor to decide which class a presented view belongs to. In this paper, we propose a new momentum-Fourier descriptor that is invariant to scale, translation, and rotation. This descriptor provides the input feature vectors to our proposed system. By using the Dystal network, we show that the objects can be classified with over 95% precision. We have used this system to classify objects such as a cube, cone, sphere, torus, and cylinder. Because of the nature of the Dystal network, this system reaches its stable point after a single presentation of a view to the system. This system can also assign similar views to a single class (e.g., for the cube, the system generated 9 different classes for 50 different input views), which can be used to select an optimum database of training views. The system is also very tolerant of noise and deformed views.
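As an illustration of how a contour descriptor can be made invariant to scale, translation, and rotation, the following minimal sketch computes a classical Fourier descriptor of a closed 2-D boundary. It is not the momentum-Fourier descriptor of the abstract, whose definition is not given here; the function name `fourier_descriptor`, the sample contour, and the parameter `n_coeffs` are assumptions made for this example. The sketch also assumes the boundary is traversed counterclockwise so that the first harmonic is non-zero.

```python
import cmath
import math

def fourier_descriptor(contour, n_coeffs=8):
    """Similarity-invariant descriptor of a closed contour given as (x, y) points."""
    z = [complex(x, y) for x, y in contour]
    n = len(z)
    mags = []
    # Skipping k = 0 (the centroid term) removes the effect of translation.
    for k in range(1, n_coeffs + 1):
        c_k = sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        # Keeping only the magnitude removes rotation and the choice of start point.
        mags.append(abs(c_k))
    # Dividing by the first harmonic removes scale.
    return [m / mags[0] for m in mags]

# Hypothetical test contour: a slightly irregular closed curve, counterclockwise.
n_points = 32
shape = []
for i in range(n_points):
    t = 2 * math.pi * i / n_points
    shape.append(((2 + 0.5 * math.cos(3 * t)) * math.cos(t),
                  (1 + 0.3 * math.sin(2 * t)) * math.sin(t)))

def similarity(point, scale, theta, shift):
    """Rotate by theta, scale, then translate a 2-D point."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (scale * (c * x - s * y) + shift[0], scale * (s * x + c * y) + shift[1])

# Same shape, but rotated, scaled, shifted, and sampled from a different start point.
moved = [similarity(p, 2.5, math.radians(30), (4.0, -7.0))
         for p in shape[5:] + shape[:5]]

original_descriptor = fourier_descriptor(shape)
moved_descriptor = fourier_descriptor(moved)
```

The two descriptors agree up to floating-point error, so a classifier fed with them would see the same feature vector for both poses of the shape.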
The Last Meter: Blind Visual Guidance to a Target.

PubMed

Manduchi, Roberto; Coughlan, James M

2014-01-01

Smartphone apps can use object recognition software to provide information to blind or low vision users about objects in the visual environment. A crucial challenge for these users is aiming the camera properly to take a well-framed picture of the desired target object. We investigate the effects of two fundamental constraints of object recognition - frame rate and camera field of view - on a blind person's ability to use an object recognition smartphone app. The app was used by 18 blind participants to find visual targets beyond arm's reach and approach them to within 30 cm. While we expected that a faster frame rate or wider camera field of view should always improve search performance, our experimental results show that in many cases increasing the field of view does not help, and may even hurt, performance. These results have important implications for the design of object recognition systems for blind users.
Automated Field-of-View, Illumination, and Recognition Algorithm Design of a Vision System for Pick-and-Place Considering Colour Information in Illumination and Images

PubMed Central

Chen, Yibing; Ogata, Taiki; Ueyama, Tsuyoshi; Takada, Toshiyuki; Ota, Jun

2018-01-01

Machine vision is playing an increasingly important role in industrial applications, and the automated design of image recognition systems has been a subject of intense research. This study has proposed a system for automatically designing the field-of-view (FOV) of a camera, the illumination strength and the parameters in a recognition algorithm. We formulated the design problem as an optimisation problem and used an experiment based on a hierarchical algorithm to solve it. The evaluation experiments using translucent plastics objects showed that the use of the proposed system resulted in an effective solution with a wide FOV, recognition of all objects and 0.32 mm and 0.4° maximal positional and angular errors when all the RGB (red, green and blue) for illumination and R channel image for recognition were used. Though all the RGB illumination and grey scale images also provided recognition of all the objects, only a narrow FOV was selected. Moreover, full recognition was not achieved by using only G illumination and a grey-scale image. The results showed that the proposed method can automatically design the FOV, illumination and parameters in the recognition algorithm and that tuning all the RGB illumination is desirable even when single-channel or grey-scale images are used for recognition. PMID:29786665
Automated Field-of-View, Illumination, and Recognition Algorithm Design of a Vision System for Pick-and-Place Considering Colour Information in Illumination and Images.

PubMed

Chen, Yibing; Ogata, Taiki; Ueyama, Tsuyoshi; Takada, Toshiyuki; Ota, Jun

2018-05-22

Machine vision is playing an increasingly important role in industrial applications, and the automated design of image recognition systems has been a subject of intense research. This study has proposed a system for automatically designing the field-of-view (FOV) of a camera, the illumination strength and the parameters in a recognition algorithm. We formulated the design problem as an optimisation problem and used an experiment based on a hierarchical algorithm to solve it. The evaluation experiments using translucent plastics objects showed that the use of the proposed system resulted in an effective solution with a wide FOV, recognition of all objects and 0.32 mm and 0.4° maximal positional and angular errors when all the RGB (red, green and blue) for illumination and R channel image for recognition were used. Though all the RGB illumination and grey scale images also provided recognition of all the objects, only a narrow FOV was selected. Moreover, full recognition was not achieved by using only G illumination and a grey-scale image. The results showed that the proposed method can automatically design the FOV, illumination and parameters in the recognition algorithm and that tuning all the RGB illumination is desirable even when single-channel or grey-scale images are used for recognition.
Global ensemble texture representations are critical to rapid scene perception.

PubMed

Brady, Timothy F; Shafer-Skelton, Anna; Alvarez, George A

2017-06-01

Traditionally, recognizing the objects within a scene has been treated as a prerequisite to recognizing the scene itself. However, research now suggests that the ability to rapidly recognize visual scenes could be supported by global properties of the scene itself rather than the objects within the scene. Here, we argue for a particular instantiation of this view: That scenes are recognized by treating them as a global texture and processing the pattern of orientations and spatial frequencies across different areas of the scene without recognizing any objects. To test this model, we asked whether there is a link between how proficient individuals are at rapid scene perception and how proficiently they represent simple spatial patterns of orientation information (global ensemble texture). We find a significant and selective correlation between these tasks, suggesting a link between scene perception and spatial ensemble tasks but not nonspatial summary statistics. In a second and third experiment, we additionally show that global ensemble texture information is not only associated with scene recognition, but that preserving only global ensemble texture information from scenes is sufficient to support rapid scene perception; however, preserving the same information is not sufficient for object recognition. Thus, global ensemble texture alone is sufficient to allow activation of scene representations but not object representations. Together, these results provide evidence for a view of scene recognition based on global ensemble texture rather than a view based purely on objects or on nonspatially localized global properties. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
The Role of Perceptual Load in Object Recognition

ERIC Educational Resources Information Center

Lavie, Nilli; Lin, Zhicheng; Zokaei, Nahid; Thoma, Volker

2009-01-01

Predictions from perceptual load theory (Lavie, 1995, 2005) regarding object recognition across the same or different viewpoints were tested. Results showed that high perceptual load reduces distracter recognition levels despite always presenting distracter objects from the same view. They also showed that the levels of distracter recognition were…
Visual object recognition for mobile tourist information systems

NASA Astrophysics Data System (ADS)

Paletta, Lucas; Fritz, Gerald; Seifert, Christin; Luley, Patrick; Almer, Alexander

2005-03-01

We describe a mobile vision system that is capable of automated object identification using images captured from a PDA or a camera phone. We present a solution for the enabling technology of outdoor vision-based object recognition that will extend state-of-the-art location and context aware services towards object based awareness in urban environments. In the proposed application scenario, tourist pedestrians are equipped with GPS, W-LAN and a camera attached to a PDA or a camera phone. They are interested in whether their field of view contains tourist sights that would point to more detailed information. Multimedia type data about related history, the architecture, or other related cultural context of historic or artistic relevance might be explored by a mobile user who is intending to learn within the urban environment. Learning from ambient cues is in this way achieved by pointing the device towards the urban sight, capturing an image, and consequently getting information about the object on site and within the focus of attention, i.e., the user's current field of view.
Higher-Order Neural Networks Applied to 2D and 3D Object Recognition

NASA Technical Reports Server (NTRS)

Spirkovska, Lilly; Reid, Max B.

1994-01-01

A Higher-Order Neural Network (HONN) can be designed to be invariant to geometric transformations such as scale, translation, and in-plane rotation. Invariances are built directly into the architecture of a HONN and do not need to be learned. Thus, for 2D object recognition, the network needs to be trained on just one view of each object class, not numerous scaled, translated, and rotated views. Because the 2D object recognition task is a component of the 3D object recognition task, built-in 2D invariance also decreases the size of the training set required for 3D object recognition. We present results for 2D object recognition both in simulation and within a robotic vision experiment and for 3D object recognition in simulation. We also compare our method to other approaches and show that HONNs have distinct advantages for position, scale, and rotation-invariant object recognition. The major drawback of HONNs is that the size of the input field is limited due to the memory required for the large number of interconnections in a fully connected network. We present partial connectivity strategies and a coarse-coding technique for overcoming this limitation and increasing the input field to that required by practical object recognition problems.
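To make the idea of built-in invariance concrete, here is a minimal sketch of one commonly described third-order scheme: each triple of input pixels forms a triangle, and triples whose triangles are similar share a weight, because interior angles do not change under translation, scaling, or in-plane rotation. That weight-sharing rule is an assumption of this example rather than something stated in the abstract, and the function names `triangle_angles` and `similarity` are hypothetical.

```python
import math

def triangle_angles(p, q, r, decimals=6):
    """Sorted interior angles (radians) of the triangle formed by three pixels."""
    def angle_at(a, b, c):
        # Angle at vertex a between the rays a->b and a->c.
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - a[0], c[1] - a[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        return abs(math.atan2(cross, dot))
    angles = (angle_at(p, q, r), angle_at(q, r, p), angle_at(r, p, q))
    # Sorting makes the key independent of the order in which pixels are listed.
    return tuple(sorted(round(a, decimals) for a in angles))

def similarity(point, scale, theta, shift):
    """Rotate by theta, scale, then translate a 2-D point."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (scale * (c * x - s * y) + shift[0], scale * (s * x + c * y) + shift[1])

# A pixel triple and the same triple after scaling, in-plane rotation, and translation.
triple = [(0, 0), (4, 0), (1, 3)]
moved = [similarity(pt, 2.0, math.radians(40), (5, -2)) for pt in triple]

key_original = triangle_angles(*triple)
key_moved = triangle_angles(*moved)
```

Both triples map to the same angle key, so they would index the same shared weight. A practical network would quantize the angles much more coarsely than the six decimals used here, since pixel grids only approximate rotated and scaled shapes.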
Neural Dynamics of Object-Based Multifocal Visual Spatial Attention and Priming: Object Cueing, Useful-Field-of-View, and Crowding

ERIC Educational Resources Information Center

Foley, Nicholas C.; Grossberg, Stephen; Mingolla, Ennio

2012-01-01

How are spatial and object attention coordinated to achieve rapid object learning and recognition during eye movement search? How do prefrontal priming and parietal spatial mechanisms interact to determine the reaction time costs of intra-object attention shifts, inter-object attention shifts, and shifts between visible objects and covertly cued…
Interactive object recognition assistance: an approach to recognition starting from target objects

NASA Astrophysics Data System (ADS)

Geisler, Juergen; Littfass, Michael

1999-07-01

Recognition of target objects in remotely sensed imagery requires detailed knowledge about the target object domain as well as about the mapping properties of the sensing system. The art of object recognition is to combine both worlds appropriately and to provide models of target appearance with respect to sensor characteristics. Common approaches to support interactive object recognition are either driven from the sensor point of view and address the problem of displaying images in a manner adequate to the sensing system, or they focus on target objects and provide exhaustive encyclopedic information about this domain. Our paper discusses an approach to assist interactive object recognition based on knowledge about target objects and taking into account the significance of object features with respect to characteristics of the sensed imagery, e.g. spatial and spectral resolution. An 'interactive recognition assistant' takes the image analyst through the interpretation process by indicating, step by step, the most significant features of the objects in the current set of candidates. The significance of object features is expressed by pregenerated trees of significance, and by the dynamic computation of decision relevance for every feature at each step of the recognition process. In the context of this approach we discuss the question of modeling and storing the multisensorial/multispectral appearances of target objects and object classes as well as the problem of an adequate dynamic human-machine-interface that takes into account various mental models of human image interpretation.
Recognition Of Complex Three Dimensional Objects Using Three Dimensional Moment Invariants

NASA Astrophysics Data System (ADS)

Sadjadi, Firooz A.

1985-01-01

A technique for the recognition of complex three dimensional objects is presented. The complex 3-D objects are represented in terms of their 3-D moment invariants, algebraic expressions that remain invariant independent of the 3-D objects' orientations and locations in the field of view. The technique of 3-D moment invariants has been used successfully for simple 3-D object recognition in the past. In this work we have extended this method for the representation of more complex objects. Two complex objects are represented digitally; their 3-D moment invariants have been calculated, and then the invariancy of these 3-D invariant moment expressions is verified by changing the orientation and the location of the objects in the field of view. The results of this study have significant impact on 3-D robotic vision, 3-D target recognition, scene analysis and artificial intelligence.
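The verification step described above (compute the invariants, then change orientation and location and compute them again) can be sketched for the simplest second-order case. The quantity J1 = mu200 + mu020 + mu002, built from central moments, is unchanged by rotation and translation; it is only one illustrative invariant and not necessarily the full set used in the paper. The point set, the function names, and the pose parameters below are hypothetical.

```python
import math

def second_order_invariant(points):
    """J1 = mu200 + mu020 + mu002 for a set of equally weighted 3-D points."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    # Central moments are taken about the centroid, which removes location.
    mu200 = sum((p[0] - cx) ** 2 for p in points)
    mu020 = sum((p[1] - cy) ** 2 for p in points)
    mu002 = sum((p[2] - cz) ** 2 for p in points)
    # The sum is the trace of the second-moment matrix, which rotation preserves.
    return mu200 + mu020 + mu002

def rotate_and_shift(p, yaw, roll, shift):
    """Rotate about the z axis by yaw, then about the x axis by roll, then translate."""
    x, y, z = p
    x, y = (x * math.cos(yaw) - y * math.sin(yaw),
            x * math.sin(yaw) + y * math.cos(yaw))
    y, z = (y * math.cos(roll) - z * math.sin(roll),
            y * math.sin(roll) + z * math.cos(roll))
    return (x + shift[0], y + shift[1], z + shift[2])

# Hypothetical object: a small, asymmetric cloud of surface points.
obj = [(0, 0, 0), (3, 0, 0), (3, 1, 0), (0, 1, 2), (1, 2, 4), (2, 0, 1)]
moved_obj = [rotate_and_shift(p, math.radians(35), math.radians(-70), (10, -4, 2.5))
             for p in obj]

j1_original = second_order_invariant(obj)
j1_moved = second_order_invariant(moved_obj)
```

The two values agree up to floating-point error. Higher-order invariants are needed to tell apart objects that happen to share the same J1.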
Modeling guidance and recognition in categorical search: bridging human and computer object detection.

PubMed

Zelinsky, Gregory J; Peng, Yifan; Berg, Alexander C; Samaras, Dimitris

2013-10-08

Search is commonly described as a repeating cycle of guidance to target-like objects, followed by the recognition of these objects as targets or distractors. Are these indeed separate processes using different visual features? We addressed this question by comparing observer behavior to that of support vector machine (SVM) models trained on guidance and recognition tasks. Observers searched for a categorically defined teddy bear target in four-object arrays. Target-absent trials consisted of random category distractors rated in their visual similarity to teddy bears. Guidance, quantified as first-fixated objects during search, was strongest for targets, followed by target-similar, medium-similarity, and target-dissimilar distractors. False positive errors to first-fixated distractors also decreased with increasing dissimilarity to the target category. To model guidance, nine teddy bear detectors, using features ranging in biological plausibility, were trained on unblurred bears then tested on blurred versions of the same objects appearing in each search display. Guidance estimates were based on target probabilities obtained from these detectors. To model recognition, nine bear/nonbear classifiers, trained and tested on unblurred objects, were used to classify the object that would be fixated first (based on the detector estimates) as a teddy bear or a distractor. Patterns of categorical guidance and recognition accuracy were modeled almost perfectly by an HMAX model in combination with a color histogram feature. We conclude that guidance and recognition in the context of search are not separate processes mediated by different features, and that what the literature knows as guidance is really recognition performed on blurred objects viewed in the visual periphery.
Modeling guidance and recognition in categorical search: Bridging human and computer object detection

PubMed Central

Zelinsky, Gregory J.; Peng, Yifan; Berg, Alexander C.; Samaras, Dimitris

2013-01-01

Search is commonly described as a repeating cycle of guidance to target-like objects, followed by the recognition of these objects as targets or distractors. Are these indeed separate processes using different visual features? We addressed this question by comparing observer behavior to that of support vector machine (SVM) models trained on guidance and recognition tasks. Observers searched for a categorically defined teddy bear target in four-object arrays. Target-absent trials consisted of random category distractors rated in their visual similarity to teddy bears. Guidance, quantified as first-fixated objects during search, was strongest for targets, followed by target-similar, medium-similarity, and target-dissimilar distractors. False positive errors to first-fixated distractors also decreased with increasing dissimilarity to the target category. To model guidance, nine teddy bear detectors, using features ranging in biological plausibility, were trained on unblurred bears then tested on blurred versions of the same objects appearing in each search display. Guidance estimates were based on target probabilities obtained from these detectors. To model recognition, nine bear/nonbear classifiers, trained and tested on unblurred objects, were used to classify the object that would be fixated first (based on the detector estimates) as a teddy bear or a distractor. Patterns of categorical guidance and recognition accuracy were modeled almost perfectly by an HMAX model in combination with a color histogram feature. We conclude that guidance and recognition in the context of search are not separate processes mediated by different features, and that what the literature knows as guidance is really recognition performed on blurred objects viewed in the visual periphery. PMID:24105460
View-Invariant Object Category Learning, Recognition, and Search: How Spatial and Object Attention are Coordinated Using Surface-Based Attentional Shrouds

ERIC Educational Resources Information Center

Fazl, Arash; Grossberg, Stephen; Mingolla, Ennio

2009-01-01

How does the brain learn to recognize an object from multiple viewpoints while scanning a scene with eye movements? How does the brain avoid the problem of erroneously classifying parts of different objects together? How are attention and eye movements intelligently coordinated to facilitate object learning? A neural model provides a unified…
Non-accidental properties, metric invariance, and encoding by neurons in a model of ventral stream visual object recognition, VisNet.

PubMed

Rolls, Edmund T; Mills, W Patrick C

2018-05-01

When objects transform into different views, some properties are maintained, such as whether the edges are convex or concave, and these non-accidental properties are likely to be important in view-invariant object recognition. The metric properties, such as the degree of curvature, may change with different views, and are less likely to be useful in object recognition. It is shown that in a model of invariant visual object recognition in the ventral visual stream, VisNet, non-accidental properties are encoded much more than metric properties by neurons. Moreover, it is shown how with the temporal trace rule training in VisNet, non-accidental properties of objects become encoded by neurons, and how metric properties are treated invariantly. We also show how VisNet can generalize between different objects if they have the same non-accidental property, because the metric properties are likely to overlap. VisNet is a 4-layer unsupervised model of visual object recognition trained by competitive learning that utilizes a temporal trace learning rule to implement the learning of invariance using views that occur close together in time. A second crucial property of this model of object recognition is, when neurons in the level corresponding to the inferior temporal visual cortex respond selectively to objects, whether neurons in the intermediate layers can respond to combinations of features that may be parts of two or more objects. In an investigation using the four sides of a square presented in every possible combination, it was shown that even though different layer 4 neurons are tuned to encode each feature or feature combination orthogonally, neurons in the intermediate layers can respond to features or feature combinations present in several objects. This property is an important part of the way in which high capacity can be achieved in the four-layer ventral visual cortical pathway.
These findings concerning non-accidental properties and the use of neurons in intermediate layers of the hierarchy help to emphasise fundamental underlying principles of the computations that may be implemented in the ventral cortical visual stream used in object recognition. Copyright © 2018 Elsevier Inc. All rights reserved.
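The temporal trace learning rule mentioned in this abstract can be illustrated with a minimal sketch. It assumes one commonly cited form of the rule, in which a neuron keeps a decaying trace of its recent activity, trace = (1 - eta) * y + eta * previous_trace, and each weight changes by alpha * trace * x. The values of `eta` and `alpha`, the two-view input sequence, and the function name are hypothetical, and the weight normalization used in competitive learning is omitted.

```python
def trace_rule_update(weights, inputs, outputs, eta=0.8, alpha=0.1):
    """Apply a Hebbian update with a temporal trace over a sequence of views.

    inputs  : list of input vectors, one per time step
    outputs : list of the neuron's firing rates, one per time step
    eta     : how much of the previous trace is carried forward (0 = plain Hebb)
    alpha   : learning rate
    """
    weights = list(weights)
    trace = 0.0
    for x, y in zip(inputs, outputs):
        # The trace mixes current firing with activity from preceding views.
        trace = (1.0 - eta) * y + eta * trace
        # Hebbian step driven by the trace instead of the instantaneous rate.
        weights = [w + alpha * trace * xi for w, xi in zip(weights, x)]
    return weights

# Two views of one object shown in succession; each view activates a different input.
views = [(1.0, 0.0), (0.0, 1.0)]
# Assume the neuron fires for the first view only.
firing = [1.0, 0.0]

with_trace = trace_rule_update([0.0, 0.0], views, firing, eta=0.8)
plain_hebb = trace_rule_update([0.0, 0.0], views, firing, eta=0.0)
```

With the trace, the weight from the second view's input also grows even though the neuron did not fire for that view, which is how views that occur close together in time become bound to the same neuron. With `eta=0.0` the rule reduces to plain Hebbian learning and the second weight stays at zero.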
Object recognition with severe spatial deficits in Williams syndrome: sparing and breakdown.

PubMed

Landau, Barbara; Hoffman, James E; Kurz, Nicole

2006-07-01

Williams syndrome (WS) is a rare genetic disorder that results in severe visual-spatial cognitive deficits coupled with relative sparing in language, face recognition, and certain aspects of motion processing. Here, we look for evidence for sparing or impairment in another cognitive system: object recognition. Children with WS, normal mental-age (MA) and chronological age-matched (CA) children, and normal adults viewed pictures of a large range of objects briefly presented under various conditions of degradation, including canonical and unusual orientations, and clear or blurred contours. Objects were shown as either full-color views (Experiment 1) or line drawings (Experiment 2). Across both experiments, WS and MA children performed similarly in all conditions while CA children performed better than both the WS and MA groups with unusual views. This advantage, however, was eliminated when images were also blurred. The error types and relative difficulty of different objects were similar across all participant groups. The results indicate selective sparing of basic mechanisms of object recognition in WS, together with developmental delay or arrest in recognition of objects from unusual viewpoints. These findings are consistent with the growing literature on brain abnormalities in WS which points to selective impairment in the parietal areas of the brain. As a whole, the results lend further support to the growing literature on the functional separability of object recognition mechanisms from other spatial functions, and raise intriguing questions about the link between genetic deficits and cognition.
Neural dynamics of object-based multifocal visual spatial attention and priming: Object cueing, useful-field-of-view, and crowding

PubMed Central

Foley, Nicholas C.; Grossberg, Stephen; Mingolla, Ennio

2015-01-01

How are spatial and object attention coordinated to achieve rapid object learning and recognition during eye movement search? How do prefrontal priming and parietal spatial mechanisms interact to determine the reaction time costs of intra-object attention shifts, inter-object attention shifts, and shifts between visible objects and covertly cued locations? What factors underlie individual differences in the timing and frequency of such attentional shifts? How do transient and sustained spatial attentional mechanisms work and interact? How can volition, mediated via the basal ganglia, influence the span of spatial attention? A neural model is developed of how spatial attention in the where cortical stream coordinates view-invariant object category learning in the what cortical stream under free viewing conditions. The model simulates psychological data about the dynamics of covert attention priming and switching requiring multifocal attention without eye movements. The model predicts how “attentional shrouds” are formed when surface representations in cortical area V4 resonate with spatial attention in posterior parietal cortex (PPC) and prefrontal cortex (PFC), while shrouds compete among themselves for dominance. Winning shrouds support invariant object category learning, and active surface-shroud resonances support conscious surface perception and recognition. Attentive competition between multiple objects and cues simulates reaction-time data from the two-object cueing paradigm. The relative strength of sustained surface-driven and fast-transient motion-driven spatial attention controls individual differences in reaction time for invalid cues. Competition between surface-driven attentional shrouds controls individual differences in detection rate of peripheral targets in useful-field-of-view tasks. The model proposes how the strength of competition can be mediated, through learning or momentary changes in volition, by the basal ganglia.
A new explanation of crowding shows how the cortical magnification factor, among other variables, can cause multiple object surfaces to share a single surface-shroud resonance, thereby preventing recognition of the individual objects. PMID:22425615

Neural dynamics of object-based multifocal visual spatial attention and priming: object cueing, useful-field-of-view, and crowding.

PubMed

Foley, Nicholas C; Grossberg, Stephen; Mingolla, Ennio

2012-08-01

How are spatial and object attention coordinated to achieve rapid object learning and recognition during eye movement search? How do prefrontal priming and parietal spatial mechanisms interact to determine the reaction time costs of intra-object attention shifts, inter-object attention shifts, and shifts between visible objects and covertly cued locations? What factors underlie individual differences in the timing and frequency of such attentional shifts? How do transient and sustained spatial attentional mechanisms work and interact? How can volition, mediated via the basal ganglia, influence the span of spatial attention? A neural model is developed of how spatial attention in the where cortical stream coordinates view-invariant object category learning in the what cortical stream under free viewing conditions. The model simulates psychological data about the dynamics of covert attention priming and switching requiring multifocal attention without eye movements. The model predicts how "attentional shrouds" are formed when surface representations in cortical area V4 resonate with spatial attention in posterior parietal cortex (PPC) and prefrontal cortex (PFC), while shrouds compete among themselves for dominance. Winning shrouds support invariant object category learning, and active surface-shroud resonances support conscious surface perception and recognition. Attentive competition between multiple objects and cues simulates reaction-time data from the two-object cueing paradigm. The relative strength of sustained surface-driven and fast-transient motion-driven spatial attention controls individual differences in reaction time for invalid cues. Competition between surface-driven attentional shrouds controls individual differences in detection rate of peripheral targets in useful-field-of-view tasks. The model proposes how the strength of competition can be mediated, through learning or momentary changes in volition, by the basal ganglia.
A new explanation of crowding shows how the cortical magnification factor, among other variables, can cause multiple object surfaces to share a single surface-shroud resonance, thereby preventing recognition of the individual objects. Copyright © 2012 Elsevier Inc. All rights reserved.
A multi-view face recognition system based on cascade face detector and improved Dlib

NASA Astrophysics Data System (ADS)

Zhou, Hongjun; Chen, Pei; Shen, Wei

2018-03-01

In this research, we present a framework for a multi-view face detection and recognition system based on a cascade face detector and improved Dlib. This method aims to solve the problems of low efficiency and low accuracy in multi-view face recognition, to build a multi-view face recognition system, and to discover a suitable monitoring scheme. For face detection, the cascade face detector is used to extract Haar-like features from the training samples, and the Haar-like features are used to train a cascade classifier with the AdaBoost algorithm. Next, for face recognition, we propose an improved distance model based on Dlib to improve the accuracy of multi-view face recognition. Furthermore, we apply the proposed method to recognizing face images taken from different viewing directions, including horizontal, overlooking, and looking-up views, and investigate a suitable monitoring scheme. The method works well for multi-view face recognition; it was also simulated and tested, showing satisfactory experimental results.
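For readers unfamiliar with Haar-like features, the sketch below shows how a two-rectangle feature is typically evaluated with an integral image (summed-area table). This is the standard construction from the cascade-detector literature and is assumed here for illustration; the abstract does not specify the authors' feature set, and the function names and the toy 4x4 image are hypothetical.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros at the top-left."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            # Sum of all pixels above and to the left of (r, c), inclusive.
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def two_rect_horizontal(ii, top, left, height, width):
    """Left half minus right half of a window; width is assumed to be even."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))

# Toy image: bright left half, dark right half (a vertical edge).
img = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(img)
edge_response = two_rect_horizontal(ii, 0, 0, 4, 4)
```

Because every rectangle sum costs four lookups regardless of its size, many such features can be evaluated per window, and a boosting algorithm such as AdaBoost can then select the few that discriminate best.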
Young Children's Self-Generated Object Views and Object Recognition

ERIC Educational Resources Information Center

James, Karin H.; Jones, Susan S.; Smith, Linda B.; Swain, Shelley N.

2014-01-01

Two important and related developments in children between 18 and 24 months of age are the rapid expansion of object name vocabularies and the emergence of an ability to recognize objects from sparse representations of their geometric shapes. In the same period, children also begin to show a preference for planar views (i.e., views of objects held…
MATHEMATICS OF SENSING, EXPLOITATION, AND EXECUTION (MSEE) Sensing, Exploitation, and Execution (SEE) on a Foundation for Representation, Inference, and Learning

DTIC Science & Technology

2016-07-01

reconstruction, video synchronization, multi-view tracking, action recognition, reasoning with uncertainty ... 3.4.2. Human action recognition across multi-views ... 4.2.1. Multi-view Multi-object Tracking with 3D cues
Stereo Viewing Modulates Three-Dimensional Shape Processing During Object Recognition: A High-Density ERP Study

PubMed Central

2017-01-01

The role of stereo disparity in the recognition of 3-dimensional (3D) object shape remains an unresolved issue for theoretical models of the human visual system. We examined this issue using high-density (128 channel) recordings of event-related potentials (ERPs). A recognition memory task was used in which observers were trained to recognize a subset of complex, multipart, 3D novel objects under conditions of either (bi-) monocular or stereo viewing. In a subsequent test phase they discriminated previously trained targets from untrained distractor objects that shared either local parts, 3D spatial configuration, or neither dimension, across both previously seen and novel viewpoints. The behavioral data showed a stereo advantage for target recognition at untrained viewpoints. ERPs showed early differential amplitude modulations to shape similarity defined by local part structure and global 3D spatial configuration. This occurred initially during an N1 component around 145–190 ms poststimulus onset, and then subsequently during an N2/P3 component around 260–385 ms poststimulus onset. For mono viewing, amplitude modulation during the N1 was greatest between targets and distracters with different local parts for trained views only. For stereo viewing, amplitude modulation during the N2/P3 was greatest between targets and distracters with different global 3D spatial configurations and generalized across trained and untrained views. The results show that image classification is modulated by stereo information about the local part, and global 3D spatial configuration of object shape. The findings challenge current theoretical models that do not attribute functional significance to stereo input during the computation of 3D object shape. PMID:29022728
Under what conditions is recognition spared relative to recall after selective hippocampal damage in humans?

PubMed

Holdstock, J S; Mayes, A R; Roberts, N; Cezayirli, E; Isaac, C L; O'Reilly, R C; Norman, K A

2002-01-01

The claim that recognition memory is spared relative to recall after focal hippocampal damage has been disputed in the literature. We examined this claim by investigating object and object-location recall and recognition memory in a patient, YR, who has adult-onset selective hippocampal damage. Our aim was to identify the conditions under which recognition was spared relative to recall in this patient. She showed unimpaired forced-choice object recognition but clearly impaired recall, even when her control subjects found the object recognition task to be numerically harder than the object recall task. However, on two other recognition tests, YR's performance was not relatively spared. First, she was clearly impaired at an equivalently difficult yes/no object recognition task, but only when targets and foils were very similar. Second, YR was clearly impaired at forced-choice recognition of object-location associations. This impairment was also unrelated to difficulty because this task was no more difficult than the forced-choice object recognition task for control subjects. The clear impairment of yes/no, but not of forced-choice, object recognition after focal hippocampal damage, when targets and foils are very similar, is predicted by the neural network-based Complementary Learning Systems model of recognition. This model postulates that recognition is mediated by hippocampally dependent recollection and cortically dependent familiarity; thus hippocampal damage should not impair item familiarity. The model postulates that familiarity is ineffective when very similar targets and foils are shown one at a time and subjects have to identify which items are old (yes/no recognition). In contrast, familiarity is effective in discriminating which of similar targets and foils, seen together, is old (forced-choice recognition). Independent evidence from the remember/know procedure also indicates that YR's familiarity is normal. 
The Complementary Learning Systems model can also accommodate the clear impairment of forced-choice object-location recognition memory if it incorporates the view that the most complete convergence of spatial and object information, represented in different cortical regions, occurs in the hippocampus.
Acute effects of alcohol on intrusive memory development and viewpoint dependence in spatial memory support a dual representation model.

PubMed

Bisby, James A; King, John A; Brewin, Chris R; Burgess, Neil; Curran, H Valerie

2010-08-01

A dual representation model of intrusive memory proposes that personally experienced events give rise to two types of representation: an image-based, egocentric representation based on sensory-perceptual features; and a more abstract, allocentric representation that incorporates spatiotemporal context. The model proposes that intrusions reflect involuntary reactivation of egocentric representations in the absence of a corresponding allocentric representation. We tested the model by investigating the effect of alcohol on intrusive memories and, concurrently, on egocentric and allocentric spatial memory. In a double-blind independent group design, participants were administered alcohol (0.4 or 0.8 g/kg) or placebo. A virtual environment was used to present objects and test recognition memory from the same viewpoint as presentation (tapping egocentric memory) or a shifted viewpoint (tapping allocentric memory). Participants were also exposed to a trauma video and required to detail intrusive memories for 7 days, after which explicit memory was assessed. There was a selective impairment of shifted-view recognition after the low dose of alcohol, whereas the high dose induced a global impairment in same-view and shifted-view conditions. Alcohol showed a dose-dependent inverted "U"-shaped effect on intrusions, with only the low dose increasing the number of intrusions, replicating previous work. When same-view recognition was intact, decrements in shifted-view recognition were associated with increases in intrusions. The differential effect of alcohol on intrusive memories and on same/shifted-view recognition supports a dual representation model in which intrusions might reflect an imbalance between two types of memory representation. These findings highlight important clinical implications, given alcohol's involvement in real-life trauma. Copyright 2010 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
Priming Contour-Deleted Images: Evidence for Intermediate Representations in Visual Object Recognition.

ERIC Educational Resources Information Center

Biederman, Irving; Cooper, Eric E.

1991-01-01

Speed and accuracy of identification of pictures of objects are facilitated by prior viewing. Contributions of image features, convex or concave components, and object models in a repetition priming task were explored in 2 studies involving 96 college students. Results provide evidence of intermediate representations in visual object recognition.…
Kernel-aligned multi-view canonical correlation analysis for image recognition

NASA Astrophysics Data System (ADS)

Su, Shuzhi; Ge, Hongwei; Yuan, Yun-Hao

2016-09-01

Existing kernel-based correlation analysis methods mainly adopt a single kernel in each view. However, a single kernel is usually insufficient to characterize the nonlinear distribution information of a view. To solve this problem, we transform each original feature vector into a 2-dimensional feature matrix by means of kernel alignment, and then propose a novel kernel-aligned multi-view canonical correlation analysis (KAMCCA) method on the basis of the feature matrices. Our proposed method can simultaneously employ multiple kernels to better capture the nonlinear distribution information of each view, so that correlation features learned by KAMCCA have good discriminative power in real-world image recognition. Extensive experiments are conducted on five real-world image datasets, including NIR face images, thermal face images, visible face images, handwritten digit images, and object images. Promising experimental results on these datasets demonstrate the effectiveness of our proposed method.
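As background for the record above, the following is a minimal sketch of the standard empirical kernel alignment score between two Gram matrices, i.e. their Frobenius inner product normalized by their Frobenius norms. The abstract does not say how KAMCCA uses kernel alignment to build its feature matrices, so this is not the authors' procedure; the function names (`gram`, `frobenius`, `alignment`) and the two example kernels are assumptions chosen for illustration.

```python
from math import exp, sqrt

def linear(a, b):
    """Linear kernel: plain dot product (illustrative choice)."""
    return sum(x * y for x, y in zip(a, b))

def rbf(a, b, gamma=0.5):
    """Gaussian RBF kernel; gamma is an arbitrary illustrative value."""
    return exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def gram(samples, kernel):
    """Gram matrix K[i][j] = kernel(samples[i], samples[j])."""
    return [[kernel(a, b) for b in samples] for a in samples]

def frobenius(k1, k2):
    """Frobenius inner product of two equally sized matrices."""
    return sum(u * v for r1, r2 in zip(k1, k2) for u, v in zip(r1, r2))

def alignment(k1, k2):
    """Empirical alignment: <K1, K2>_F / (||K1||_F * ||K2||_F)."""
    return frobenius(k1, k2) / sqrt(frobenius(k1, k1) * frobenius(k2, k2))

# Hypothetical toy samples from one view.
samples = [(0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
score = alignment(gram(samples, linear), gram(samples, rbf))
```

A score of 1 means the two kernels induce the same similarity structure up to scale; lower values indicate that the kernels capture different structure in the same view.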
Learned Non-Rigid Object Motion is a View-Invariant Cue to Recognizing Novel Objects

PubMed Central

Chuang, Lewis L.; Vuong, Quoc C.; Bülthoff, Heinrich H.

2012-01-01

There is evidence that observers use learned object motion to recognize objects. For instance, studies have shown that reversing the learned direction in which a rigid object rotated in depth impaired recognition accuracy. This motion reversal can be achieved by playing animation sequences of moving objects in reverse frame order. In the current study, we used this sequence-reversal manipulation to investigate whether observers encode the motion of dynamic objects in visual memory, and whether such dynamic representations are encoded in a way that is dependent on the viewing conditions. Participants first learned dynamic novel objects, presented as animation sequences. Following learning, they were then tested on their ability to recognize these learned objects when their animation sequence was shown in the same sequence order as during learning or in the reverse sequence order. In Experiment 1, we found that non-rigid motion contributed to recognition performance; that is, sequence-reversal decreased sensitivity across different tasks. In subsequent experiments, we tested the recognition of non-rigidly deforming (Experiment 2) and rigidly rotating (Experiment 3) objects across novel viewpoints. Recognition performance was affected by viewpoint changes for both experiments. Learned non-rigid motion continued to contribute to recognition performance and this benefit was the same across all viewpoint changes. By comparison, learned rigid motion did not contribute to recognition performance. These results suggest that non-rigid motion provides a source of information for recognizing dynamic objects, which is not affected by changes to viewpoint. PMID:22661939
An Intelligent Systems Approach to Automated Object Recognition: A Preliminary Study

USGS Publications Warehouse

Maddox, Brian G.; Swadley, Casey L.

2002-01-01

Attempts at fully automated object recognition systems have met with varying levels of success over the years. However, none of the systems have achieved high enough accuracy rates to be run unattended. One of the reasons for this may be that they are designed from the computer's point of view and rely mainly on image-processing methods. A better solution to this problem may be to make use of modern advances in computational intelligence and distributed processing to try to mimic how the human brain is thought to recognize objects. As humans combine cognitive processes with detection techniques, such a system would combine traditional image-processing techniques with computer-based intelligence to determine the identity of various objects in a scene.
Recognizing 3D Objects from 2D Images Using Structural Knowledge Base of Genetic Views

DTIC Science & Technology

1988-08-31

technical report. [BIE85] I. Biederman, "Human image understanding: Recent research and a theory", Computer Vision, Graphics, and Image Processing, vol...model bases", Technical Report 87-85, COINS Dept, University of Massachusetts, Amherst, MA 01003, August 1987. [BUR87b] Burns, J. B. and L. J. Kitchen..."Recognition in 2D images of 3D objects from large model bases using prediction hierarchies", Proc. IJCAI-10, 1987. [BUR89] J. B. Burns, forthcoming
Biologically Inspired Model for Visual Cognition Achieving Unsupervised Episodic and Semantic Feature Learning.

PubMed

Qiao, Hong; Li, Yinlin; Li, Fengfu; Xi, Xuanyang; Wu, Wei

2016-10-01

Recently, many biologically inspired visual computational models have been proposed. The design of these models follows the related biological mechanisms and structures, and these models provide new solutions for visual recognition tasks. In this paper, based on recent biological evidence, we propose a framework to mimic the active and dynamic learning and recognition process of the primate visual cortex. In terms of principle, the main contribution is that the framework can achieve unsupervised learning of episodic features (including key components and their spatial relations) and semantic features (semantic descriptions of the key components), which support higher level cognition of an object. In terms of performance, the advantages of the framework are as follows: 1) learning episodic features without supervision: for a class of objects without a priori knowledge, the key components, their spatial relations and cover regions can be learned automatically through a deep neural network (DNN); 2) learning semantic features based on episodic features: within the cover regions of the key components, the semantic geometrical values of these components can be computed based on contour detection; 3) forming the general knowledge of a class of objects: this knowledge mainly includes the key components, their spatial relations and average semantic values, and is a concise description of the class; and 4) achieving higher level cognition and dynamic updating: for a test image, the model can achieve classification and subclass semantic descriptions, and test samples with high confidence are selected to dynamically update the whole model. Experiments are conducted on face images, and good performance is achieved in each layer of the DNN and in the semantic description learning process. Furthermore, the model can be generalized to recognition tasks of other objects with learning ability.
Behavioral model of visual perception and recognition

NASA Astrophysics Data System (ADS)

Rybak, Ilya A.; Golovan, Alexander V.; Gusakova, Valentina I.

1993-09-01

In the processes of visual perception and recognition, human eyes actively select essential information by way of successive fixations at the most informative points of the image. A behavioral program defining a scanpath of the image is formed at the stage of learning (object memorizing) and consists of sequential motor actions, which are shifts of attention from one point of fixation to another, and sensory signals expected to arrive in response to each shift of attention. In the modern view of the problem, invariant object recognition is provided by the following: (1) separated processing of 'what' (object features) and 'where' (spatial features) information at high levels of the visual system; (2) mechanisms of visual attention using 'where' information; (3) representation of 'what' information in an object-based frame of reference (OFR). However, most recent models of vision based on OFR have demonstrated invariant recognition of only simple objects like letters or binary objects without background, i.e. objects to which a frame of reference is easily attached. In contrast, we use not OFR, but a feature-based frame of reference (FFR), connected with the basic feature (edge) at the fixation point. This gives our model the ability to represent complex objects in gray-level images invariantly, but demands realization of the behavioral aspects of vision described above. The developed model contains a neural network subsystem of low-level vision, which extracts a set of primary features (edges) in each fixation, and a high-level subsystem consisting of 'what' (Sensory Memory) and 'where' (Motor Memory) modules. The resolution of primary feature extraction decreases with distance from the point of fixation. FFR provides both the invariant representation of object features in Sensory Memory and shifts of attention in Motor Memory. Object recognition consists of successive recall (from Motor Memory) and execution of shifts of attention and successive verification of the expected sets of features (stored in Sensory Memory). The model shows the ability to recognize complex objects (such as faces) in gray-level images invariantly with respect to shift, rotation, and scale.
2.5D multi-view gait recognition based on point cloud registration.

PubMed

Tang, Jin; Luo, Jian; Tjahjadi, Tardi; Gao, Yan

2014-03-28

This paper presents a method for modeling a 2.5-dimensional (2.5D) human body and extracting the gait features for identifying the human subject. To achieve view-invariant gait recognition, a multi-view synthesizing method based on point cloud registration (MVSM) to generate multi-view training galleries is proposed. The concept of a density and curvature-based Color Gait Curvature Image is introduced to map 2.5D data onto a 2D space to enable data dimension reduction by discrete cosine transform and 2D principal component analysis. Gait recognition is achieved via a 2.5D view-invariant gait recognition method based on point cloud registration. Experimental results on the in-house database captured by a Microsoft Kinect camera show a significant performance gain when using MVSM.
3D object retrieval using salient views

PubMed Central

Shapiro, Linda G.

2013-01-01

This paper presents a method for selecting salient 2D views to describe 3D objects for the purpose of retrieval. The views are obtained by first identifying salient points via a learning approach that uses shape characteristics of the 3D points (Atmosukarto and Shapiro in International workshop on structural, syntactic, and statistical pattern recognition, 2008; Atmosukarto and Shapiro in ACM multimedia information retrieval, 2008). The salient views are selected by choosing views with multiple salient points on the silhouette of the object. Silhouette-based similarity measures from Chen et al. (Comput Graph Forum 22(3):223–232, 2003) are then used to calculate the similarity between two 3D objects. Retrieval experiments were performed on three datasets: the Heads dataset, the SHREC2008 dataset, and the Princeton dataset. Experimental results show that the retrieval results using the salient views are comparable to the existing light field descriptor method (Chen et al. in Comput Graph Forum 22(3):223–232, 2003), and our method achieves a 15-fold speedup in the feature extraction computation time. PMID:23833704
Object similarity affects the perceptual strategy underlying invariant visual object recognition in rats

PubMed Central

Rosselli, Federica B.; Alemi, Alireza; Ansuini, Alessio; Zoccolan, Davide

2015-01-01

In recent years, a number of studies have explored the possible use of rats as models of high-level visual functions. One central question at the root of such an investigation is to understand whether rat object vision relies on the processing of visual shape features or, rather, on lower-order image properties (e.g., overall brightness). In a recent study, we have shown that rats are capable of extracting multiple features of an object that are diagnostic of its identity, at least when those features are, structure-wise, distinct enough to be parsed by the rat visual system. In the present study, we have assessed the impact of object structure on rat perceptual strategy. We trained rats to discriminate between two structurally similar objects, and compared their recognition strategies with those reported in our previous study. We found that, under conditions of lower stimulus discriminability, rat visual discrimination strategy becomes more view-dependent and subject-dependent. Rats were still able to recognize the target objects, in a way that was largely tolerant (i.e., invariant) to object transformation; however, the larger structural and pixel-wise similarity affected the way objects were processed. Compared to the findings of our previous study, the patterns of diagnostic features were: (i) smaller and more scattered; (ii) only partially preserved across object views; and (iii) only partially reproducible across rats. On the other hand, rats were still found to adopt a multi-featural processing strategy and to make use of part of the optimal discriminatory information afforded by the two objects. Our findings suggest that, as in humans, rat invariant recognition can flexibly rely on either view-invariant representations of distinctive object features or view-specific object representations, acquired through learning. PMID:25814936
2.5D Multi-View Gait Recognition Based on Point Cloud Registration

PubMed Central

Tang, Jin; Luo, Jian; Tjahjadi, Tardi; Gao, Yan

2014-01-01

This paper presents a method for modeling a 2.5-dimensional (2.5D) human body and extracting the gait features for identifying the human subject. To achieve view-invariant gait recognition, a multi-view synthesizing method based on point cloud registration (MVSM) to generate multi-view training galleries is proposed. The concept of a density and curvature-based Color Gait Curvature Image is introduced to map 2.5D data onto a 2D space to enable data dimension reduction by discrete cosine transform and 2D principal component analysis. Gait recognition is achieved via a 2.5D view-invariant gait recognition method based on point cloud registration. Experimental results on the in-house database captured by a Microsoft Kinect camera show a significant performance gain when using MVSM. PMID:24686727
The Invariance Hypothesis Implies Domain-Specific Regions in Visual Cortex

PubMed Central

Leibo, Joel Z.; Liao, Qianli; Anselmi, Fabio; Poggio, Tomaso

2015-01-01

Is visual cortex made up of general-purpose information processing machinery, or does it consist of a collection of specialized modules? If prior knowledge, acquired from learning a set of objects is only transferable to new objects that share properties with the old, then the recognition system’s optimal organization must be one containing specialized modules for different object classes. Our analysis starts from a premise we call the invariance hypothesis: that the computational goal of the ventral stream is to compute an invariant-to-transformations and discriminative signature for recognition. The key condition enabling approximate transfer of invariance without sacrificing discriminability turns out to be that the learned and novel objects transform similarly. This implies that the optimal recognition system must contain subsystems trained only with data from similarly-transforming objects and suggests a novel interpretation of domain-specific regions like the fusiform face area (FFA). Furthermore, we can define an index of transformation-compatibility, computable from videos, that can be combined with information about the statistics of natural vision to yield predictions for which object categories ought to have domain-specific regions in agreement with the available data. The result is a unifying account linking the large literature on view-based recognition with the wealth of experimental evidence concerning domain-specific regions. PMID:26496457
Three-dimensional object recognition using similar triangles and decision trees

NASA Technical Reports Server (NTRS)

Spirkovska, Lilly

1993-01-01

A system, TRIDEC, that is capable of distinguishing between a set of objects despite changes in the objects' positions in the input field, their size, or their rotational orientation in 3D space is described. TRIDEC combines very simple yet effective features with the classification capabilities of inductive decision tree methods. The feature vector is a list of all similar triangles defined by connecting all combinations of three pixels in a coarse coded 127 x 127 pixel input field. The classification is accomplished by building a decision tree using the information provided from a limited number of translated, scaled, and rotated samples. Simulation results are presented which show that TRIDEC achieves 94 percent recognition accuracy in the 2D invariant object recognition domain and 98 percent recognition accuracy in the 3D invariant object recognition domain after training on only a small sample of transformed views of the objects.
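To illustrate why triangle-based features of the kind described above tolerate changes in position, size, and orientation, here is a minimal sketch that counts triangles by their binned interior angles. Similar triangles share the same angles, so the counts are unchanged by translation, uniform scaling, and in-plane rotation. This is not the TRIDEC implementation: the function name `triangle_signature`, the 10-degree bin width, and the use of a point list instead of a coarse-coded 127 x 127 field are all assumptions, and the sketch covers only 2D transformations, not rotation in 3D space.

```python
from collections import Counter
from itertools import combinations
from math import acos, degrees, dist

def triangle_signature(points, bin_deg=10):
    """Count triangles over all point triples, keyed by the binned
    two smallest interior angles (the third angle is then determined)."""
    hist = Counter()
    for p, q, r in combinations(points, 3):
        # Side lengths opposite p, q and r respectively.
        a, b, c = dist(q, r), dist(p, r), dist(p, q)
        # Skip degenerate triples (collinear or coincident points).
        if a + b <= c + 1e-9 or a + c <= b + 1e-9 or b + c <= a + 1e-9:
            continue
        # Law of cosines; clamp the cosine to guard against rounding.
        cos_a = max(-1.0, min(1.0, (b * b + c * c - a * a) / (2 * b * c)))
        cos_b = max(-1.0, min(1.0, (a * a + c * c - b * b) / (2 * a * c)))
        ang_a = degrees(acos(cos_a))
        ang_b = degrees(acos(cos_b))
        ang_c = 180.0 - ang_a - ang_b
        smallest, middle, _ = sorted((ang_a, ang_b, ang_c))
        hist[(int(smallest // bin_deg), int(middle // bin_deg))] += 1
    return hist

# Hypothetical "on" pixels of a small shape.
shape = [(0, 0), (4, 0), (0, 3), (5, 5)]
# The same shape rotated by 90 degrees, scaled by 2 and shifted.
moved = [(-2 * y + 7, 2 * x - 1) for x, y in shape]
same = triangle_signature(shape) == triangle_signature(moved)
```

A histogram like this could then be passed as a feature vector to a decision tree learner. Angles that fall very close to a bin boundary can land in different bins after a transformation because of rounding, so a practical system would need coarser bins or soft binning.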

Binary optical filters for scale invariant pattern recognition

NASA Technical Reports Server (NTRS)

Reid, Max B.; Downie, John D.; Hine, Butler P.

1992-01-01

Binary synthetic discriminant function (BSDF) optical filters which are invariant to scale changes in the target object of more than 50 percent are demonstrated in simulation and experiment. Efficient databases of scale invariant BSDF filters can be designed which discriminate between two very similar objects at any view scaled over a factor of 2 or more. The BSDF technique has considerable advantages over other methods for achieving scale invariant object recognition, as it also allows determination of the object's scale. In addition to scale, the technique can be used to design recognition systems invariant to other geometric distortions.
ROBOSIGHT: Robotic Vision System For Inspection And Manipulation

NASA Astrophysics Data System (ADS)

Trivedi, Mohan M.; Chen, ChuXin; Marapane, Suresh

1989-02-01

Vision is an important sensory modality that can be used for deriving information critical to the proper, efficient, flexible, and safe operation of an intelligent robot. Vision systems are utilized for developing higher level interpretation of the nature of a robotic workspace using images acquired by cameras mounted on a robot. Such information can be useful for tasks such as object recognition, object location, object inspection, obstacle avoidance and navigation. In this paper we describe efforts directed towards developing a vision system useful for performing various robotic inspection and manipulation tasks. The system utilizes gray scale images and can be viewed as a model-based system. It includes general purpose image analysis modules as well as special purpose, task dependent object status recognition modules. Experiments are described to verify the robust performance of the integrated system using a robotic testbed.
Online Feature Transformation Learning for Cross-Domain Object Category Recognition.

PubMed

Zhang, Xuesong; Zhuang, Yan; Wang, Wei; Pedrycz, Witold

2017-06-09

In this paper, we introduce a new research problem termed online feature transformation learning in the context of multiclass object category recognition. The learning of a feature transformation is viewed as learning a global similarity metric function in an online manner. We first consider the problem of learning online a feature transformation matrix expressed in the original feature space and propose an online passive aggressive feature transformation algorithm. These original features are then mapped to kernel space, and an online single kernel feature transformation (OSKFT) algorithm is developed to learn a nonlinear feature transformation. Based on OSKFT and the existing Hedge algorithm, a novel online multiple kernel feature transformation algorithm is also proposed, which can further improve the performance of online feature transformation learning in large-scale applications. The classifier is trained with the k-nearest-neighbor algorithm together with the learned similarity metric function. Finally, we experimentally examine the effect of setting different parameter values in the proposed algorithms and evaluate the model performance on several multiclass object recognition data sets. The experimental results demonstrate the validity and good performance of our methods in cross-domain and multiclass object recognition applications.
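The final classification step mentioned above, k-nearest-neighbor voting under a learned feature transformation, can be sketched as follows. The transformation matrix here is fixed by hand purely for illustration; the online passive aggressive and OSKFT updates that would learn it are not shown, and the names `transform` and `knn_predict` as well as the toy data are assumptions rather than anything taken from the paper.

```python
from collections import Counter
from math import dist

def transform(matrix, x):
    """Apply a linear feature transformation (matrix-vector product)."""
    return tuple(sum(w * v for w, v in zip(row, x)) for row in matrix)

def knn_predict(matrix, train, query, k=3):
    """Majority vote among the k training samples closest to the query,
    with Euclidean distance measured in the transformed feature space."""
    q = transform(matrix, query)
    ranked = sorted(train, key=lambda item: dist(transform(matrix, item[0]), q))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical data: the first feature carries the class, the second is a
# nuisance dimension that differs between domains.
train = [((0.0, 0.0), "a"), ((0.2, 9.0), "a"),
         ((1.0, 0.1), "b"), ((1.2, 4.0), "b")]
identity = [[1.0, 0.0], [0.0, 1.0]]
# Stand-in for a learned transformation that suppresses the nuisance feature.
suppress_second = [[1.0, 0.0], [0.0, 0.0]]

plain = knn_predict(identity, train, (0.1, 4.0))
adapted = knn_predict(suppress_second, train, (0.1, 4.0))
```

With the identity matrix the nuisance feature dominates the distances, while the hand-picked transformation lets the vote follow the informative feature. The same mechanism is why a well-learned transformation can help when training and test samples come from different domains.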
Can Changes in Eye Movement Scanning Alter the Age-Related Deficit in Recognition Memory?

PubMed Central

Chan, Jessica P. K.; Kamino, Daphne; Binns, Malcolm A.; Ryan, Jennifer D.

2011-01-01

Older adults typically exhibit poorer face recognition compared to younger adults. These recognition differences may be due to underlying age-related changes in eye movement scanning. We examined whether older adults’ recognition could be improved by yoking their eye movements to those of younger adults. Participants studied younger and older faces, under free viewing conditions (bases), through a gaze-contingent moving window (own), or a moving window which replayed the eye movements of a base participant (yoked). During the recognition test, participants freely viewed the faces with no viewing restrictions. Own-age recognition biases were observed for older adults in all viewing conditions, suggesting that this effect occurs independently of scanning. Participants in the bases condition had the highest recognition accuracy, and participants in the yoked condition were more accurate than participants in the own condition. Among yoked participants, recognition did not depend on age of the base participant. These results suggest that successful encoding for all participants requires the bottom-up contribution of peripheral information, regardless of the locus of control of the viewer. Although altering the pattern of eye movements did not increase recognition, the amount of sampling of the face during encoding predicted subsequent recognition accuracy for all participants. Increased sampling may confer some advantages for subsequent recognition, particularly for people who have declining memory abilities. PMID:21687460
A role for the CAMKK pathway in visual object recognition memory.

PubMed

Tinsley, Chris J; Narduzzo, Katherine E; Brown, Malcolm W; Warburton, E Clea

2012-03-01

The role of the CAMKK pathway in object recognition memory was investigated. Rats' performance in a preferential object recognition test was examined after local infusion into the perirhinal cortex of the CAMKK inhibitor STO-609. STO-609 infused either before or immediately after acquisition impaired memory tested after a 24 h but not a 20-min delay. Memory was not impaired when STO-609 was infused 20 min after acquisition. The expression of a downstream reaction product of CAMKK was measured by immunohistochemical staining for phospho-CAMKI(Thr177) at 10, 40, 70, and 100 min following the viewing of novel and familiar images of objects. Processing familiar images resulted in more pCAMKI stained neurons in the perirhinal cortex than processing novel images at the 10- and 40-min delays. Prior infusion of STO-609 caused a reduction in pCAMKI stained neurons in response to viewing either novel or familiar images, consistent with its role as an inhibitor of CAMKK. The results establish that the CAMKK pathway within the perirhinal cortex is important for the consolidation of object recognition memory. The activation of pCAMKI after acquisition is earlier than previously reported for pCAMKII. Copyright © 2011 Wiley Periodicals, Inc.
Multivariate fMRI and Eye Tracking Reveal Differential Effects of Visual Interference on Recognition Memory Judgments for Objects and Scenes.

PubMed

O'Neil, Edward B; Watson, Hilary C; Dhillon, Sonya; Lobaugh, Nancy J; Lee, Andy C H

2015-09-01

Recent work has demonstrated that the perirhinal cortex (PRC) supports conjunctive object representations that aid object recognition memory following visual object interference. It is unclear, however, how these representations interact with other brain regions implicated in mnemonic retrieval and how congruent and incongruent interference influences the processing of targets and foils during object recognition. To address this, multivariate partial least squares was applied to fMRI data acquired during an interference match-to-sample task, in which participants made object or scene recognition judgments after object or scene interference. This revealed a pattern of activity sensitive to object recognition following congruent (i.e., object) interference that included PRC, prefrontal, and parietal regions. Moreover, functional connectivity analysis revealed a common pattern of PRC connectivity across interference and recognition conditions. Examination of eye movements during the same task in a separate study revealed that participants gazed more at targets than foils during correct object recognition decisions, regardless of interference congruency. By contrast, participants viewed foils more than targets for incorrect object memory judgments, but only after congruent interference. Our findings suggest that congruent interference makes object foils appear familiar and that a network of regions, including PRC, is recruited to overcome the effects of interference.
Episodic Short-Term Recognition Requires Encoding into Visual Working Memory: Evidence from Probe Recognition after Letter Report

PubMed Central

Poth, Christian H.; Schneider, Werner X.

2016-01-01

Human vision is organized in discrete processing episodes (e.g., eye fixations or task-steps). Object information must be transmitted across episodes to enable episodic short-term recognition: recognizing whether a current object has been seen in a previous episode. We ask whether episodic short-term recognition presupposes that objects have been encoded into capacity-limited visual working memory (VWM), which retains visual information for report. Alternatively, it could rely on the activation of visual features or categories that occurs before encoding into VWM. We assessed the dependence of episodic short-term recognition on VWM by a new paradigm combining letter report and probe recognition. Participants viewed displays of 10 letters and reported as many as possible after a retention interval (whole report). Next, participants viewed a probe letter and indicated whether it had been one of the 10 letters (probe recognition). In Experiment 1, probe recognition was more accurate for letters that had been encoded into VWM (reported letters) compared with non-encoded letters (non-reported letters). Interestingly, those letters that participants reported in their whole report had been near to one another within the letter displays. This suggests that the encoding into VWM proceeded in a spatially clustered manner. In Experiment 2, participants reported only one of 10 letters (partial report) and probes either referred to this letter, to letters that had been near to it, or far from it. Probe recognition was more accurate for near than for far letters, although none of these letters had to be reported. These findings indicate that episodic short-term recognition is constrained to a small number of simultaneously presented objects that have been encoded into VWM. PMID:27713722
Learning to distinguish similar objects

NASA Astrophysics Data System (ADS)

Seibert, Michael; Waxman, Allen M.; Gove, Alan N.

1995-04-01

This paper describes how the similarities and differences among similar objects can be discovered during learning to facilitate recognition. The application domain is single views of flying model aircraft captured in silhouette by a CCD camera. The approach was motivated by human psychovisual and monkey neurophysiological data. The implementation uses neural net processing mechanisms to build a hierarchy that relates similar objects to superordinate classes, while simultaneously discovering the salient differences between objects within a class. Learning and recognition experiments both with and without the class similarity and difference learning show the effectiveness of the approach on this visual data. To test the approach, the hierarchical approach was compared to a non-hierarchical approach, and was found to improve the average percentage of correctly classified views from 77% to 84%.
Category Specificity in Normal Episodic Learning: Applications to Object Recognition and Category-Specific Agnosia

ERIC Educational Resources Information Center

Bukach, Cindy M.; Bub, Daniel N.; Masson, Michael E. J.; Lindsay, D. Stephen

2004-01-01

Studies of patients with category-specific agnosia (CSA) have given rise to multiple theories of object recognition, most of which assume the existence of a stable, abstract semantic memory system. We applied an episodic view of memory to questions raised by CSA in a series of studies examining normal observers' recall of newly learned attributes…
Object recognition contributions to figure-ground organization: operations on outlines and subjective contours.

PubMed

Peterson, M A; Gibson, B S

1994-11-01

In previous research, replicated here, we found that some object recognition processes influence figure-ground organization. We have proposed that these object recognition processes operate on edges (or contours) detected early in visual processing, rather than on regions. Consistent with this proposal, influences from object recognition on figure-ground organization were previously observed in both pictures and stereograms depicting regions of different luminance, but not in random-dot stereograms, where edges arise late in processing (Peterson & Gibson, 1993). In the present experiments, we examined whether or not two other types of contours--outlines and subjective contours--enable object recognition influences on figure-ground organization. For both types of contours we observed a pattern of effects similar to that originally obtained with luminance edges. The results of these experiments are valuable for distinguishing between alternative views of the mechanisms mediating object recognition influences on figure-ground organization. In addition, in both Experiments 1 and 2, fixated regions were seen as figure longer than nonfixated regions, suggesting that fixation location must be included among the variables relevant to figure-ground organization.
View-invariant object category learning, recognition, and search: how spatial and object attention are coordinated using surface-based attentional shrouds.

PubMed

Fazl, Arash; Grossberg, Stephen; Mingolla, Ennio

2009-02-01

How does the brain learn to recognize an object from multiple viewpoints while scanning a scene with eye movements? How does the brain avoid the problem of erroneously classifying parts of different objects together? How are attention and eye movements intelligently coordinated to facilitate object learning? A neural model provides a unified mechanistic explanation of how spatial and object attention work together to search a scene and learn what is in it. The ARTSCAN model predicts how an object's surface representation generates a form-fitting distribution of spatial attention, or "attentional shroud". All surface representations dynamically compete for spatial attention to form a shroud. The winning shroud persists during active scanning of the object. The shroud maintains sustained activity of an emerging view-invariant category representation while multiple view-specific category representations are learned and are linked through associative learning to the view-invariant object category. The shroud also helps to restrict scanning eye movements to salient features on the attended object. Object attention plays a role in controlling and stabilizing the learning of view-specific object categories. Spatial attention hereby coordinates the deployment of object attention during object category learning. Shroud collapse releases a reset signal that inhibits the active view-invariant category in the What cortical processing stream. Then a new shroud, corresponding to a different object, forms in the Where cortical processing stream, and search using attention shifts and eye movements continues to learn new objects throughout a scene. The model mechanistically clarifies basic properties of attention shifts (engage, move, disengage) and inhibition of return. It simulates human reaction time data about object-based spatial attention shifts, and learns with 98.1% accuracy and a compression of 430 on a letter database whose letters vary in size, position, and orientation. 
The model provides a powerful framework for unifying many data about spatial and object attention, and their interactions during perception, cognition, and action.
Learning the 3-D structure of objects from 2-D views depends on shape, not format

PubMed Central

Tian, Moqian; Yamins, Daniel; Grill-Spector, Kalanit

2016-01-01

Humans can learn to recognize new objects just from observing example views. However, it is unknown what structural information enables this learning. To address this question, we manipulated the amount of structural information given to subjects during unsupervised learning by varying the format of the trained views. We then tested how format affected participants' ability to discriminate similar objects across views that were rotated 90° apart. We found that, after training, participants' performance increased and generalized to new views in the same format. Surprisingly, the improvement was similar across line drawings, shape from shading, and shape from shading + stereo even though the latter two formats provide richer depth information compared to line drawings. In contrast, participants' improvement was significantly lower when training used silhouettes, suggesting that silhouettes do not have enough information to generate a robust 3-D structure. To test whether the learned object representations were format-specific or format-invariant, we examined if learning novel objects from example views transfers across formats. We found that learning objects from example line drawings transferred to shape from shading and vice versa. These results have important implications for theories of object recognition because they suggest that (a) learning the 3-D structure of objects does not require rich structural cues during training as long as shape information of internal and external features is provided and (b) learning generates shape-based object representations independent of the training format. PMID:27153196
On the psychology of the recognition heuristic: retrieval primacy as a key determinant of its use.

PubMed

Pachur, Thorsten; Hertwig, Ralph

2006-09-01

The recognition heuristic is a prime example of a boundedly rational mind tool that rests on an evolved capacity, recognition, and exploits environmental structures. When originally proposed, it was conjectured that no other probabilistic cue reverses the recognition-based inference (D. G. Goldstein & G. Gigerenzer, 2002). More recent studies challenged this view and gave rise to the argument that recognition enters inferences just like any other probabilistic cue. By linking research on the heuristic with research on recognition memory, the authors argue that the retrieval of recognition information is not tantamount to the retrieval of other probabilistic cues. Specifically, the retrieval of subjective recognition precedes that of an objective probabilistic cue and occurs at little to no cognitive cost. This retrieval primacy gives rise to 2 predictions, both of which have been empirically supported: Inferences in line with the recognition heuristic (a) are made faster than inferences inconsistent with it and (b) are more prevalent under time pressure. Suspension of the heuristic, in contrast, requires additional time, and direct knowledge of the criterion variable, if available, can trigger such suspension. Copyright 2006 APA
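As a minimal sketch of the inference rule discussed in this abstract (Goldstein & Gigerenzer, 2002): when exactly one of two objects is recognized, the heuristic infers that the recognized object has the higher value on the criterion. The function name, the city names, and the set of recognized items below are hypothetical illustrations, not materials from the study.

```python
from typing import Optional, Set


def recognition_heuristic(a: str, b: str, recognized: Set[str]) -> Optional[str]:
    """Return the object inferred to have the higher criterion value, or None.

    The heuristic applies only when exactly one of the two objects is
    recognized; otherwise it does not discriminate and other knowledge
    (or guessing) would be needed.
    """
    a_known = a in recognized
    b_known = b in recognized
    if a_known and not b_known:
        return a
    if b_known and not a_known:
        return b
    return None  # both or neither recognized: heuristic is silent


# Hypothetical example: which of two cities is larger?
recognized_cities = {"Berlin", "Munich"}
print(recognition_heuristic("Berlin", "Herne", recognized_cities))
print(recognition_heuristic("Berlin", "Munich", recognized_cities))
```

Returning None rather than a guess keeps the cases where the heuristic applies separate from the cases where, as the abstract notes, other cues or direct criterion knowledge would have to take over.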
Leveraging Cognitive Context for Object Recognition

DTIC Science & Technology

2014-06-01

learned from large image databases. We build upon this concept by exploring cognitive context, demonstrating how rich dynamic context provided by...context that people rely upon as they perceive the world. Context in ACT-R/E takes the form of associations between related concepts that are learned ...and accuracy of object recognition. Context is most often viewed as a static concept, learned from large image databases. We build upon this concept by
Appearance-based face recognition and light-fields.

PubMed

Gross, Ralph; Matthews, Iain; Baker, Simon

2004-04-01

Arguably the most important decision to be made when developing an object recognition algorithm is selecting the scene measurements or features on which to base the algorithm. In appearance-based object recognition, the features are chosen to be the pixel intensity values in an image of the object. These pixel intensities correspond directly to the radiance of light emitted from the object along certain rays in space. The set of all such radiance values over all possible rays is known as the plenoptic function or light-field. In this paper, we develop a theory of appearance-based object recognition from light-fields. This theory leads directly to an algorithm for face recognition across pose that uses as many images of the face as are available, from one upwards. All of the pixels, whichever image they come from, are treated equally and used to estimate the (eigen) light-field of the object. The eigen light-field is then used as the set of features on which to base recognition, analogously to how the pixel intensities are used in appearance-based face and object recognition.
Breaking object correspondence across saccades impairs object recognition: The role of color and luminance.

PubMed

Poth, Christian H; Schneider, Werner X

2016-09-01

Rapid saccadic eye movements bring the foveal region of the eye's retina onto objects for high-acuity vision. Saccades change the location and resolution of objects' retinal images. To perceive objects as visually stable across saccades, correspondence between the objects before and after the saccade must be established. We have previously shown that breaking object correspondence across the saccade causes a decrement in object recognition (Poth, Herwig, & Schneider, 2015). Color and luminance can establish object correspondence, but it is unknown how these surface features contribute to transsaccadic visual processing. Here, we investigated whether changing the surface features color-and-luminance and color alone across saccades impairs postsaccadic object recognition. Participants made saccades to peripheral objects, which either maintained or changed their surface features across the saccade. After the saccade, participants briefly viewed a letter within the saccade target object (terminated by a pattern mask). Postsaccadic object recognition was assessed as participants' accuracy in reporting the letter. Experiment A used the colors green and red with different luminances as surface features, Experiment B blue and yellow with approximately the same luminances. Changing the surface features across the saccade deteriorated postsaccadic object recognition in both experiments. These findings reveal a link between object recognition and object correspondence relying on the surface features colors and luminance, which is currently not addressed in theories of transsaccadic perception. We interpret the findings within a recent theory ascribing this link to visual attention (Schneider, 2013).
The roles of scene gist and spatial dependency among objects in the semantic guidance of attention in real-world scenes.

PubMed

Wu, Chia-Chien; Wang, Hsueh-Cheng; Pomplun, Marc

2014-12-01

A previous study (Vision Research 51 (2011) 1192-1205) found evidence for semantic guidance of visual attention during the inspection of real-world scenes, i.e., an influence of semantic relationships among scene objects on overt shifts of attention. In particular, the results revealed an observer bias toward gaze transitions between semantically similar objects. However, this effect is not necessarily indicative of semantic processing of individual objects but may be mediated by knowledge of the scene gist, which does not require object recognition, or by known spatial dependency among objects. To examine the mechanisms underlying semantic guidance, in the present study, participants were asked to view a series of displays with the scene gist excluded and spatial dependency varied. Our results show that spatial dependency among objects seems to be sufficient to induce semantic guidance. Scene gist, on the other hand, does not seem to affect how observers use semantic information to guide attention while viewing natural scenes. Extracting semantic information mainly based on spatial dependency may be an efficient strategy of the visual system that only adds little cognitive load to the viewing task. Copyright © 2014 Elsevier Ltd. All rights reserved.
Core geometry in perspective

PubMed Central

Dillon, Moira R.; Spelke, Elizabeth S.

2015-01-01

Research on animals, infants, children, and adults provides evidence that distinct cognitive systems underlie navigation and object recognition. Here we examine whether and how these systems interact when children interpret 2D edge-based perspectival line drawings of scenes and objects. Such drawings serve as symbols early in development, and they preserve scene and object geometry from canonical points of view. Young children show limits when using geometry both in non-symbolic tasks and in symbolic map tasks that present 3D contexts from unusual, unfamiliar points of view. When presented with the familiar viewpoints in perspectival line drawings, however, do children engage more integrated geometric representations? In three experiments, children successfully interpreted line drawings with respect to their depicted scene or object. Nevertheless, children recruited distinct processes when navigating based on the information in these drawings, and these processes depended on the context in which the drawings were presented. These results suggest that children are flexible but limited in using geometric information to form integrated representations of scenes and objects, even when interpreting spatial symbols that are highly familiar and faithful renditions of the visual world. PMID:25441089
Agnosic vision is like peripheral vision, which is limited by crowding.

PubMed

Strappini, Francesca; Pelli, Denis G; Di Pace, Enrico; Martelli, Marialuisa

2017-04-01

Visual agnosia is a neuropsychological impairment of visual object recognition despite near-normal acuity and visual fields. A century of research has provided only a rudimentary account of the functional damage underlying this deficit. We find that the object-recognition ability of agnosic patients viewing an object directly is like that of normally-sighted observers viewing it indirectly, with peripheral vision. Thus, agnosic vision is like peripheral vision. We obtained 14 visual-object-recognition tests that are commonly used for diagnosis of visual agnosia. Our "standard" normal observer took these tests at various eccentricities in his periphery. Analyzing the published data of 32 apperceptive agnosia patients and a group of 14 posterior cortical atrophy (PCA) patients on these tests, we find that each patient's pattern of object recognition deficits is well characterized by one number, the equivalent eccentricity at which our standard observer's peripheral vision is like the central vision of the agnosic patient. In other words, each agnosic patient's equivalent eccentricity is conserved across tests. Across patients, equivalent eccentricity ranges from 4 to 40 deg, which rates severity of the visual deficit. In normal peripheral vision, the required size to perceive a simple image (e.g., an isolated letter) is limited by acuity, and that for a complex image (e.g., a face or a word) is limited by crowding. In crowding, adjacent simple objects appear unrecognizably jumbled unless their spacing exceeds the crowding distance, which grows linearly with eccentricity. Besides conservation of equivalent eccentricity across object-recognition tests, we also find conservation, from eccentricity to agnosia, of the relative susceptibility of recognition of ten visual tests. These findings show that agnosic vision is like eccentric vision. Whence crowding? 
Peripheral vision, strabismic amblyopia, and possibly apperceptive agnosia are all limited by crowding, making it urgent to know what drives crowding. Acuity does not (Song et al., 2014), but neural density might: neurons per deg² in the crowding-relevant cortical area. Copyright © 2017 Elsevier Ltd. All rights reserved.
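The linear relation between crowding distance and eccentricity described above can be sketched as follows. This is an illustration only: the abstract does not give the slope, so the default of 0.5 and the zero intercept are assumptions, and the function names are hypothetical.

```python
def crowding_distance(eccentricity_deg: float, slope: float = 0.5) -> float:
    """Crowding distance in degrees, assumed proportional to eccentricity.

    The slope value is an assumed placeholder, not a figure from the study.
    """
    return slope * eccentricity_deg


def is_crowded(spacing_deg: float, eccentricity_deg: float, slope: float = 0.5) -> bool:
    """Objects are treated as crowded unless their spacing exceeds the crowding distance."""
    return spacing_deg <= crowding_distance(eccentricity_deg, slope)


# The abstract reports equivalent eccentricities from 4 to 40 deg across patients.
for ecc in (4.0, 40.0):
    print(ecc, crowding_distance(ecc), is_crowded(3.0, ecc))
```

Under these assumptions, a patient with a larger equivalent eccentricity would need proportionally wider spacing between adjacent objects before they become recognizable.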

Is a Pink Cow Still a Cow? Individual Differences in Toddlers' Vocabulary Knowledge and Lexical Representations.

PubMed

Perry, Lynn K; Saffran, Jenny R

2017-05-01

When a toddler knows a word, what does she actually know? Many categories have multiple relevant properties; for example, shape and color are relevant to membership in the category banana. How do toddlers prioritize these properties when recognizing familiar words, and are there systematic differences among children? In this study, toddlers viewed pairs of objects associated with prototypical colors. On some trials, objects were typically colored (e.g., Holstein cow and pink pig); on other trials, colors were switched (e.g., pink cow and Holstein-patterned pig). On each trial, toddlers were directed to find a target object. Overall, recognition was disrupted when colors were switched, as measured by eye movements. Moreover, individual differences in vocabularies predicted recognition differences: Toddlers who say fewer shape-based words were more disrupted by color switches. "Knowing" a word may not mean the same thing for all toddlers; different toddlers prioritize different facets of familiar objects in their lexical representations. Copyright © 2016 Cognitive Science Society, Inc.
A computerized recognition system for the home-based physiotherapy exercises using an RGBD camera.

PubMed

Ar, Ilktan; Akgul, Yusuf Sinan

2014-11-01

Computerized recognition of home-based physiotherapy exercises has many benefits and has attracted considerable interest in the computer vision community. However, most methods in the literature view this task as a special case of motion recognition. In contrast, we propose to treat the three main components of a physiotherapy exercise (the motion patterns, the stance knowledge, and the exercise object) as different recognition tasks and embed them separately into the recognition system. The low-level information about each component is gathered using machine learning methods. Then, we use a generative Bayesian network to recognize the exercise types by combining the information from these sources at an abstract level, which takes advantage of domain knowledge for a more robust system. Finally, a novel postprocessing step is employed to estimate the exercise repetition counts. The performance evaluation of the system is conducted with a new dataset which contains RGB (red, green, and blue) and depth videos of home-based exercise sessions for commonly applied shoulder and knee exercises. The proposed system works without any body-part segmentation, body-part tracking, joint detection, or temporal segmentation methods. In the end, favorable exercise recognition rates and encouraging results on the estimation of repetition counts are obtained.
Aging and solid shape recognition: Vision and haptics.

PubMed

Norman, J Farley; Cheeseman, Jacob R; Adkins, Olivia C; Cox, Andrea G; Rogers, Connor E; Dowell, Catherine J; Baxter, Michael W; Norman, Hideko F; Reyes, Cecia M

2015-10-01

The ability of 114 younger and older adults to recognize naturally-shaped objects was evaluated in three experiments. The participants viewed or haptically explored six randomly-chosen bell peppers (Capsicum annuum) in a study session and were later required to judge whether each of twelve bell peppers was "old" (previously presented during the study session) or "new" (not presented during the study session). When recognition memory was tested immediately after study, the younger adults' (Experiment 1) performance for vision and haptics was identical when the individual study objects were presented once. Vision became superior to haptics, however, when the individual study objects were presented multiple times. When 10- and 20-min delays (Experiment 2) were inserted in between study and test sessions, no significant differences occurred between vision and haptics: recognition performance in both modalities was comparable. When the recognition performance of older adults was evaluated (Experiment 3), a negative effect of age was found for visual shape recognition (younger adults' overall recognition performance was 60% higher). There was no age effect, however, for haptic shape recognition. The results of the present experiments indicate that the visual recognition of natural object shape is different from haptic recognition in multiple ways: visual shape recognition can be superior to that of haptics and is affected by aging, while haptic shape recognition is less accurate and unaffected by aging. Copyright © 2015 Elsevier Ltd. All rights reserved.
Electrophysiological evidence for effects of color knowledge in object recognition.

PubMed

Lu, Aitao; Xu, Guiping; Jin, Hua; Mo, Lei; Zhang, Jijia; Zhang, John X

2010-01-29

Knowledge about the typical colors associated with familiar everyday objects (e.g., strawberries are red) is well known to be represented in the conceptual semantic system. Evidence that such knowledge may also play a role in early perceptual processes for object recognition is scant. In the present ERP study, participants viewed a list of object pictures and detected infrequent stimulus repetitions. Results show that shortly after stimulus onset, ERP components indexing early perceptual processes, including N1, P2, and N2, differentiated between objects in their appropriate or congruent color and the same objects in an inappropriate or incongruent color. Such a congruence effect also occurred in N3, associated with semantic processing of pictures, but not in N4, associated with domain-general semantic processing. Our results demonstrate a clear effect of color knowledge in early object recognition stages and support the following proposal: color as a surface property is stored in a multiple-memory system where pre-semantic perceptual and semantic conceptual representations interact during object recognition. (c) 2009 Elsevier Ireland Ltd. All rights reserved.
A rodent model for the study of invariant visual object recognition

PubMed Central

Zoccolan, Davide; Oertelt, Nadja; DiCarlo, James J.; Cox, David D.

2009-01-01

The human visual system is able to recognize objects despite tremendous variation in their appearance on the retina resulting from variation in view, size, lighting, etc. This ability—known as “invariant” object recognition—is central to visual perception, yet its computational underpinnings are poorly understood. Traditionally, nonhuman primates have been the animal model-of-choice for investigating the neuronal substrates of invariant recognition, because their visual systems closely mirror our own. Meanwhile, simpler and more accessible animal models such as rodents have been largely overlooked as possible models of higher-level visual functions, because their brains are often assumed to lack advanced visual processing machinery. As a result, little is known about rodents' ability to process complex visual stimuli in the face of real-world image variation. In the present work, we show that rats possess more advanced visual abilities than previously appreciated. Specifically, we trained pigmented rats to perform a visual task that required them to recognize objects despite substantial variation in their appearance, due to changes in size, view, and lighting. Critically, rats were able to spontaneously generalize to previously unseen transformations of learned objects. These results provide the first systematic evidence for invariant object recognition in rats and argue for an increased focus on rodents as models for studying high-level visual processing. PMID:19429704
Multi-object detection and tracking technology based on hexagonal opto-electronic detector

NASA Astrophysics Data System (ADS)

Song, Yong; Hao, Qun; Li, Xiang

2008-02-01

A novel multi-object detection and tracking technology based on a hexagonal opto-electronic detector is proposed, in which (1) a new hexagonal detector, composed of 6 linear CCDs, has been developed to achieve a 360-degree field of view, and (2) to detect and track multiple objects at high speed, two object recognition criteria, the Object Signal Width Criterion (OSWC) and the Horizontal Scale Ratio Criterion (HSRC), are proposed. Simulated experiments have been carried out to verify the validity of the proposed technology; they show that multiple objects can be detected and tracked at high speed by using the proposed hexagonal detector and the OSWC and HSRC criteria, indicating that the technology offers significant advantages in photoelectric detection, computer vision, virtual reality, augmented reality, etc.
Sparse aperture 3D passive image sensing and recognition

NASA Astrophysics Data System (ADS)

Daneshpanah, Mehdi

The way we perceive, capture, store, communicate and visualize the world has greatly changed in the past century. Novel three-dimensional (3D) imaging and display systems are being pursued both in academic and industrial settings. In many cases, these systems have revolutionized traditional approaches and/or enabled new technologies in other disciplines including medical imaging and diagnostics, industrial metrology, entertainment, robotics as well as defense and security. In this dissertation, we focus on novel aspects of sparse aperture multi-view imaging systems and their application in quantum-limited object recognition in two separate parts. In the first part, two concepts are proposed. First, a solution is presented that involves a generalized framework for 3D imaging using randomly distributed sparse apertures. Second, a method is suggested to extract the profile of objects in the scene through statistical properties of the reconstructed light field. In both cases, experimental results are presented that demonstrate the feasibility of the techniques. In the second part, the application of 3D imaging systems in sensing and recognition of objects is addressed. In particular, we focus on the scenario in which only tens of photons reach the sensor from the object of interest, as opposed to hundreds of billions of photons in normal imaging conditions. At this level, the quantum-limited behavior of light will dominate and traditional object recognition practices may fail. We suggest a likelihood-based object recognition framework that incorporates the physics of sensing at quantum-limited conditions. Sensor dark noise has been modeled and taken into account. This framework is applied to 3D sensing of thermal objects using visible spectrum detectors. Thermal objects as cold as 250K are shown to provide enough signature photons to be sensed and recognized within background and dark noise with mature, visible band, image forming optics and detector arrays.
The results suggest that one might not need to venture into exotic and expensive detector arrays and associated optics for sensing room-temperature thermal objects in complete darkness.
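To illustrate what a likelihood-based recognizer under photon-starved conditions might look like, here is a minimal sketch. It is not the dissertation's framework: it assumes that the photon count at each pixel is an independent Poisson variable whose mean is a class-specific expected rate plus a constant dark-count rate, and it picks the class with the highest log-likelihood. The template names, rates, and observed counts are hypothetical.

```python
import math
from typing import Dict, Sequence


def poisson_log_likelihood(counts: Sequence[int],
                           rates: Sequence[float],
                           dark_rate: float) -> float:
    """Log-likelihood of observed photon counts under a Poisson model.

    Each pixel's mean is the template rate plus the dark-count rate
    (assumed model); dark_rate > 0 keeps the logarithm finite.
    """
    total = 0.0
    for k, lam in zip(counts, rates):
        mean = lam + dark_rate
        total += k * math.log(mean) - mean - math.lgamma(k + 1)
    return total


def classify(counts: Sequence[int],
             templates: Dict[str, Sequence[float]],
             dark_rate: float) -> str:
    """Return the template name with the maximum log-likelihood."""
    return max(templates,
               key=lambda name: poisson_log_likelihood(counts, templates[name], dark_rate))


# Hypothetical 4-pixel templates (expected photons per pixel) and observation.
templates = {
    "bar_left": [2.0, 2.0, 0.1, 0.1],
    "bar_right": [0.1, 0.1, 2.0, 2.0],
}
observed = [3, 1, 0, 0]  # only four photons in total
print(classify(observed, templates, dark_rate=0.05))
```

Working with the full Poisson likelihood rather than a Gaussian approximation matters in this regime, because with only a handful of photons the counts are far from normally distributed.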
The Dark Side of Context: Context Reinstatement Can Distort Memory.

PubMed

Doss, Manoj K; Picart, Jamila K; Gallo, David A

2018-04-01

It is widely assumed that context reinstatement benefits memory, but our experiments revealed that context reinstatement can systematically distort memory. Participants viewed pictures of objects superimposed over scenes, and we later tested their ability to differentiate these old objects from similar new objects. Context reinstatement was manipulated by presenting objects on the reinstated or switched scene at test. Not only did context reinstatement increase correct recognition of old objects, but it also consistently increased incorrect recognition of similar objects as old ones. This false recognition effect was robust, as it was found in several experiments, occurred after both immediate and delayed testing, and persisted with high confidence even after participants were warned to avoid the distorting effects of context. To explain this memory illusion, we propose that context reinstatement increases the likelihood of confusing conceptual and perceptual information, potentially in medial temporal brain regions that integrate this information.
Object recognition and pose estimation of planar objects from range data

NASA Technical Reports Server (NTRS)

Pendleton, Thomas W.; Chien, Chiun Hong; Littlefield, Mark L.; Magee, Michael

1994-01-01

The Extravehicular Activity Helper/Retriever (EVAHR) is a robotic device currently under development at the NASA Johnson Space Center that is designed to fetch objects or to assist in retrieving an astronaut who may have become inadvertently de-tethered. The EVAHR will be required to exhibit a high degree of intelligent autonomous operation and will base much of its reasoning upon information obtained from one or more three-dimensional sensors that it will carry and control. At the highest level of visual cognition and reasoning, the EVAHR will be required to detect objects, recognize them, and estimate their spatial orientation and location. The recognition phase and estimation of spatial pose will depend on the ability of the vision system to reliably extract geometric features of the objects such as whether the surface topologies observed are planar or curved and the spatial relationships between the component surfaces. In order to achieve these tasks, three-dimensional sensing of the operational environment and objects in the environment will therefore be essential. One of the sensors being considered to provide image data for object recognition and pose estimation is a phase-shift laser scanner. The characteristics of the data provided by this scanner have been studied and algorithms have been developed for segmenting range images into planar surfaces, extracting basic features such as surface area, and recognizing the object based on the characteristics of extracted features. Also, an approach has been developed for estimating the spatial orientation and location of the recognized object based on orientations of extracted planes and their intersection points. 
This paper presents some of the algorithms that have been developed for the purpose of recognizing and estimating the pose of objects as viewed by the laser scanner, and characterizes the desirability and utility of these algorithms within the context of the scanner itself, considering data quality and noise.
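The abstract above mentions estimating object location from the orientations of extracted planes and their intersection points. A minimal sketch of one ingredient, the intersection point of three fitted planes, is given below. It is not the authors' algorithm: the plane representation (unit normal `n` and offset `d` with `n . x = d`) and the function names are assumptions made for illustration.

```python
def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def three_plane_intersection(planes, eps=1e-12):
    """Intersection point of three planes given as (normal, d) with normal . x = d.

    Returns None when the normals are (nearly) linearly dependent, i.e. the
    planes do not meet in a single point.
    """
    (n1, d1), (n2, d2), (n3, d3) = planes
    c23, c31, c12 = cross(n2, n3), cross(n3, n1), cross(n1, n2)
    det = dot(n1, c23)  # scalar triple product n1 . (n2 x n3)
    if abs(det) < eps:
        return None
    # Closed-form solution: x = (d1 (n2 x n3) + d2 (n3 x n1) + d3 (n1 x n2)) / det
    return tuple((d1 * c23[i] + d2 * c31[i] + d3 * c12[i]) / det for i in range(3))


if __name__ == "__main__":
    corner = three_plane_intersection([((1, 0, 0), 1.0),
                                       ((0, 1, 0), 2.0),
                                       ((0, 0, 1), 3.0)])
    print(corner)
```

With noisy range data the plane parameters would first come from a fitting step, which this sketch leaves out.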
On techniques for angle compensation in nonideal iris recognition.

PubMed

Schuckers, Stephanie A C; Schmid, Natalia A; Abhyankar, Aditya; Dorairaj, Vivekanand; Boyce, Christopher K; Hornak, Lawrence A

2007-10-01

The popularity of the iris biometric has grown considerably over the past two to three years. Most research has been focused on the development of new iris processing and recognition algorithms for frontal view iris images. However, a few challenging directions in iris research have been identified, including processing of a nonideal iris and iris at a distance. In this paper, we describe two nonideal iris recognition systems and analyze their performance. The word "nonideal" is used in the sense of compensating for off-angle occluded iris images. The system is designed to process nonideal iris images in two steps: 1) compensation for off-angle gaze direction and 2) processing and encoding of the rotated iris image. Two approaches are presented to account for angular variations in the iris images. In the first approach, we use Daugman's integrodifferential operator as an objective function to estimate the gaze direction. After the angle is estimated, the off-angle iris image undergoes geometric transformations involving the estimated angle and is further processed as if it were a frontal view image. The encoding technique developed for a frontal image is based on the application of the global independent component analysis. The second approach uses an angular deformation calibration model. The angular deformations are modeled, and calibration parameters are calculated. The proposed method consists of a closed-form solution, followed by an iterative optimization procedure. The images are projected on the plane closest to the base calibrated plane. Biorthogonal wavelets are used for encoding to perform iris recognition. We use a special dataset of the off-angle iris images to quantify the performance of the designed systems. A series of receiver operating characteristics demonstrate various effects on the performance of the nonideal-iris-based recognition system.
Higher-order neural network software for distortion invariant object recognition

NASA Technical Reports Server (NTRS)

Reid, Max B.; Spirkovska, Lilly

1991-01-01

The state-of-the-art in pattern recognition for such applications as automatic target recognition and industrial robotic vision relies on digital image processing. We present a higher-order neural network model and software which performs the complete feature extraction-pattern classification paradigm required for automatic pattern recognition. Using a third-order neural network, we demonstrate complete, 100 percent accurate invariance to distortions of scale, position, and in-plane rotation. In a higher-order neural network, feature extraction is built into the network, and does not have to be learned. Only the relatively simple classification step must be learned. This is key to achieving very rapid training. The training set is much smaller than with standard neural network software because the higher-order network only has to be shown one view of each object to be learned, not every possible view. The software and graphical user interface run on any Sun workstation. Results of the use of the neural software in autonomous robotic vision systems are presented. Such a system could have extensive application in robotic manufacturing.
Comparing visual representations across human fMRI and computational vision

PubMed Central

Leeds, Daniel D.; Seibert, Darren A.; Pyles, John A.; Tarr, Michael J.

2013-01-01

Feedforward visual object perception recruits a cortical network that is assumed to be hierarchical, progressing from basic visual features to complete object representations. However, the nature of the intermediate features related to this transformation remains poorly understood. Here, we explore how well different computer vision recognition models account for neural object encoding across the human cortical visual pathway as measured using fMRI. These neural data, collected during the viewing of 60 images of real-world objects, were analyzed with a searchlight procedure as in Kriegeskorte, Goebel, and Bandettini (2006): Within each searchlight sphere, the obtained patterns of neural activity for all 60 objects were compared to model responses for each computer recognition algorithm using representational dissimilarity analysis (Kriegeskorte et al., 2008). Although each of the computer vision methods significantly accounted for some of the neural data, among the different models, the scale invariant feature transform (Lowe, 2004), encoding local visual properties gathered from “interest points,” was best able to accurately and consistently account for stimulus representations within the ventral pathway. More generally, when present, significance was observed in regions of the ventral-temporal cortex associated with intermediate-level object perception. Differences in model effectiveness and the neural location of significant matches may be attributable to the fact that each model implements a different featural basis for representing objects (e.g., more holistic or more parts-based). Overall, we conclude that well-known computer vision recognition systems may serve as viable proxies for theories of intermediate visual object representation. PMID:24273227
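The representational dissimilarity analysis named in the abstract above can be sketched as follows, assuming each stimulus is described by a response pattern (voxel activities for the brain, feature values for a model). The dissimilarity measure (1 minus Pearson correlation), the rank-based comparison of the two matrices, and all function names are illustrative assumptions, not details taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rdm_upper(patterns):
    """Upper triangle of the dissimilarity matrix: 1 - r for every stimulus pair."""
    n = len(patterns)
    return [1.0 - pearson(patterns[i], patterns[j])
            for i in range(n) for j in range(i + 1, n)]


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def rdm_similarity(neural_patterns, model_patterns):
    """Rank correlation between the neural and the model dissimilarity structure."""
    return pearson(ranks(rdm_upper(neural_patterns)),
                   ranks(rdm_upper(model_patterns)))


if __name__ == "__main__":
    neural = [[1, 2, 3, 4], [2, 1, 4, 3], [4, 3, 2, 1], [1, 3, 2, 5]]
    model = [[2 * v + 1 for v in row] for row in neural]  # same geometry, rescaled
    print(rdm_similarity(neural, model))
```

In a searchlight analysis this comparison would be repeated for the voxels inside each sphere, with the significance testing described in the abstract applied on top.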
Recognition-induced forgetting is not due to category-based set size.

PubMed

Maxcey, Ashleigh M

2016-01-01

What are the consequences of accessing a visual long-term memory representation? Previous work has shown that accessing a long-term memory representation via retrieval improves memory for the targeted item and hurts memory for related items, a phenomenon called retrieval-induced forgetting. Recently we found a similar forgetting phenomenon with recognition of visual objects. Recognition-induced forgetting occurs when practice recognizing an object during a two-alternative forced-choice task, from a group of objects learned at the same time, leads to worse memory for objects from that group that were not practiced. An alternative explanation of this effect is that category-based set size is inducing forgetting, not recognition practice as claimed by some researchers. This alternative explanation is possible because during recognition practice subjects make old-new judgments in a two-alternative forced-choice task, and are thus exposed to more objects from practiced categories, potentially inducing forgetting due to set-size. Herein I pitted the category-based set size hypothesis against the recognition-induced forgetting hypothesis. To this end, I parametrically manipulated the amount of practice objects received in the recognition-induced forgetting paradigm. If forgetting is due to category-based set size, then the magnitude of forgetting of related objects will increase as the number of practice trials increases. If forgetting is recognition induced, the set size of exemplars from any given category should not be predictive of memory for practiced objects. Consistent with this latter hypothesis, additional practice systematically improved memory for practiced objects, but did not systematically affect forgetting of related objects. These results firmly establish that recognition practice induces forgetting of related memories. 
Future directions and important real-world applications of using recognition to access our visual memories of previously encountered objects are discussed.
A role for calcium-calmodulin-dependent protein kinase II in the consolidation of visual object recognition memory.

PubMed

Tinsley, C J; Narduzzo, K E; Ho, J W; Barker, G R; Brown, M W; Warburton, E C

2009-09-01

The aim was to investigate the role of calcium-calmodulin-dependent protein kinase (CAMK)II in object recognition memory. The performance of rats in a preferential object recognition test was examined after local infusion of the CAMKII inhibitors KN-62 or autocamtide-2-related inhibitory peptide (AIP) into the perirhinal cortex. KN-62 or AIP infused after acquisition impaired memory tested at 24 h, indicating an involvement of CAMKII in the consolidation of recognition memory. Memory was impaired when KN-62 was infused at 20 min after acquisition or when AIP was infused at 20, 40, 60 or 100 min after acquisition. The time-course of CAMKII activation in rats was further examined by immunohistochemical staining for phospho-CAMKII(Thre286)alpha at 10, 40, 70 and 100 min following the viewing of novel and familiar images. At 70 min, processing novel images resulted in more phospho-CAMKII(Thre286)alpha-stained neurons in the perirhinal cortex than did the processing of familiar images, consistent with the viewing of novel images increasing the activity of CAMKII at this time. This difference was eliminated by prior infusion of AIP. These findings establish that CAMKII is active within the perirhinal region between approximately 20 and 100 min following learning and then returns to baseline. Thus, increased CAMKII activity is essential for the consolidation of long-term object recognition memory but continuation of that increased activity throughout the 24 h memory delay is not necessary for maintenance of the memory.
Cognitive object recognition system (CORS)

NASA Astrophysics Data System (ADS)

Raju, Chaitanya; Varadarajan, Karthik Mahesh; Krishnamurthi, Niyant; Xu, Shuli; Biederman, Irving; Kelley, Troy

2010-04-01

We have developed a framework, Cognitive Object Recognition System (CORS), inspired by current neurocomputational models and psychophysical research in which multiple recognition algorithms (shape based geometric primitives, 'geons,' and non-geometric feature-based algorithms) are integrated to provide a comprehensive solution to object recognition and landmarking. Objects are defined as a combination of geons, corresponding to their simple parts, and the relations among the parts. However, those objects that are not easily decomposable into geons, such as bushes and trees, are recognized by CORS using "feature-based" algorithms. The unique interaction between these algorithms is a novel approach that combines the effectiveness of both algorithms and takes us closer to a generalized approach to object recognition. CORS allows recognition of objects through a larger range of poses using geometric primitives and performs well under heavy occlusion - about 35% of object surface is sufficient. Furthermore, geon composition of an object allows image understanding and reasoning even with novel objects. With reliable landmarking capability, the system improves vision-based robot navigation in GPS-denied environments. Feasibility of the CORS system was demonstrated with real stereo images captured from a Pioneer robot. The system can currently identify doors, door handles, staircases, trashcans and other relevant landmarks in the indoor environment.
Neural Correlates of Individual Differences in Infant Visual Attention and Recognition Memory

PubMed Central

Reynolds, Greg D.; Guy, Maggie W.; Zhang, Dantong

2010-01-01

Past studies have identified individual differences in infant visual attention based upon peak look duration during initial exposure to a stimulus. Colombo and colleagues (e.g., Colombo & Mitchell, 1990) found that infants that demonstrate brief visual fixations (i.e., short lookers) during familiarization are more likely to demonstrate evidence of recognition memory during subsequent stimulus exposure than infants that demonstrate long visual fixations (i.e., long lookers). The current study utilized event-related potentials to examine possible neural mechanisms associated with individual differences in visual attention and recognition memory for 6- and 7.5-month-old infants. Short- and long-looking infants viewed images of familiar and novel objects during ERP testing. There was a stimulus type by looker type interaction at temporal and frontal electrodes on the late slow wave (LSW). Short lookers demonstrated a LSW that was significantly greater in amplitude in response to novel stimulus presentations. No significant differences in LSW amplitude were found based on stimulus type for long lookers. These results indicate deeper processing and recognition memory of the familiar stimulus for short lookers. PMID:21666833
Shape and texture fused recognition of flying targets

NASA Astrophysics Data System (ADS)

Kovács, Levente; Utasi, Ákos; Kovács, Andrea; Szirányi, Tamás

2011-06-01

This paper presents visual detection and recognition of flying targets (e.g. planes, missiles) based on automatically extracted shape and object texture information, for application areas like alerting, recognition and tracking. Targets are extracted based on robust background modeling and a novel contour extraction approach, and object recognition is done by comparisons to shape and texture based query results on a previously gathered real life object dataset. Application areas involve passive defense scenarios, including automatic object detection and tracking with cheap commodity hardware components (CPU, camera and GPS).
Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence

PubMed Central

Cichy, Radoslaw Martin; Khosla, Aditya; Pantazis, Dimitrios; Torralba, Antonio; Oliva, Aude

2016-01-01

The complex multi-stage architecture of cortical visual pathways provides the neural basis for efficient visual object recognition in humans. However, the stage-wise computations therein remain poorly understood. Here, we compared temporal (magnetoencephalography) and spatial (functional MRI) visual brain representations with representations in an artificial deep neural network (DNN) tuned to the statistics of real-world visual recognition. We showed that the DNN captured the stages of human visual processing in both time and space from early visual areas towards the dorsal and ventral streams. Further investigation of crucial DNN parameters revealed that while model architecture was important, training on real-world categorization was necessary to enforce spatio-temporal hierarchical relationships with the brain. Together our results provide an algorithmically informed view on the spatio-temporal dynamics of visual object recognition in the human visual brain. PMID:27282108
Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence.

PubMed

Cichy, Radoslaw Martin; Khosla, Aditya; Pantazis, Dimitrios; Torralba, Antonio; Oliva, Aude

2016-06-10

The complex multi-stage architecture of cortical visual pathways provides the neural basis for efficient visual object recognition in humans. However, the stage-wise computations therein remain poorly understood. Here, we compared temporal (magnetoencephalography) and spatial (functional MRI) visual brain representations with representations in an artificial deep neural network (DNN) tuned to the statistics of real-world visual recognition. We showed that the DNN captured the stages of human visual processing in both time and space from early visual areas towards the dorsal and ventral streams. Further investigation of crucial DNN parameters revealed that while model architecture was important, training on real-world categorization was necessary to enforce spatio-temporal hierarchical relationships with the brain. Together our results provide an algorithmically informed view on the spatio-temporal dynamics of visual object recognition in the human visual brain.
View-invariant gait recognition method by three-dimensional convolutional neural network

NASA Astrophysics Data System (ADS)

Xing, Weiwei; Li, Ying; Zhang, Shunli

2018-01-01

Gait is an important biometric feature that can identify a person at a long distance. View change is one of the most challenging factors for gait recognition. To address cross-view issues in gait recognition, we propose a view-invariant gait recognition method based on a three-dimensional (3-D) convolutional neural network. First, a 3-D convolutional neural network (3DCNN) is introduced to learn view-invariant features, capturing spatial and temporal information simultaneously from normalized silhouette sequences. Second, a network training method based on cross-domain transfer learning is proposed to address the limited number of gait training samples. We choose C3D as the base model, pretrained on Sports-1M, and then fine-tune it for gait recognition. In the recognition stage, we use the fine-tuned model to extract gait features and use Euclidean distance to measure the similarity of gait sequences. Experiments on the CASIA-B dataset demonstrate that our method outperforms many other methods.
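The recognition stage described above (features compared by Euclidean distance) can be sketched as a nearest-neighbour lookup. This is a minimal illustration that assumes the feature vectors have already been extracted by some network. The gallery layout, the subject labels, and the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(probe, gallery):
    """Return (subject_id, distance) of the gallery entry closest to the probe.

    gallery maps a subject id to one feature vector (hypothetical layout).
    """
    return min(((sid, euclidean(probe, feat)) for sid, feat in gallery.items()),
               key=lambda pair: pair[1])


if __name__ == "__main__":
    gallery = {"subject_a": [0.0, 0.0], "subject_b": [3.0, 4.0]}
    print(identify([2.9, 4.1], gallery))
```

A cross-view evaluation would build the gallery from one viewing angle and draw probes from another.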

Eye movements during spoken word recognition in Russian children.

PubMed

Sekerina, Irina A; Brooks, Patricia J

2007-09-01

This study explores incremental processing in spoken word recognition in Russian 5- and 6-year-olds and adults using free-viewing eye-tracking. Participants viewed scenes containing pictures of four familiar objects and clicked on a target embedded in a spoken instruction. In the cohort condition, two object names shared identical three-phoneme onsets. In the noncohort condition, all object names had unique onsets. Coarse-grain analyses of eye movements indicated that adults produced looks to the competitor on significantly more cohort trials than on noncohort trials, whereas children surprisingly failed to demonstrate cohort competition due to widespread exploratory eye movements across conditions. Fine-grain analyses, in contrast, showed a similar time course of eye movements across children and adults, but with cohort competition lingering more than 1s longer in children. The dissociation between coarse-grain and fine-grain eye movements indicates a need to consider multiple behavioral measures in making developmental comparisons in language processing.
The Cambridge Car Memory Test: a task matched in format to the Cambridge Face Memory Test, with norms, reliability, sex differences, dissociations from face memory, and expertise effects.

PubMed

Dennett, Hugh W; McKone, Elinor; Tavashmi, Raka; Hall, Ashleigh; Pidcock, Madeleine; Edwards, Mark; Duchaine, Bradley

2012-06-01

Many research questions require a within-class object recognition task matched for general cognitive requirements with a face recognition task. If the object task also has high internal reliability, it can improve accuracy and power in group analyses (e.g., mean inversion effects for faces vs. objects), individual-difference studies (e.g., correlations between certain perceptual abilities and face/object recognition), and case studies in neuropsychology (e.g., whether a prosopagnosic shows a face-specific or object-general deficit). Here, we present such a task. Our Cambridge Car Memory Test (CCMT) was matched in format to the established Cambridge Face Memory Test, requiring recognition of exemplars across view and lighting change. We tested 153 young adults (93 female). Results showed high reliability (Cronbach's alpha = .84) and a range of scores suitable both for normal-range individual-difference studies and, potentially, for diagnosis of impairment. The mean for males was much higher than the mean for females. We demonstrate independence between face memory and car memory (dissociation based on sex, plus a modest correlation between the two), including where participants have high relative expertise with cars. We also show that expertise with real car makes and models of the era used in the test significantly predicts CCMT performance. Surprisingly, however, regression analyses imply that there is an effect of sex per se on the CCMT that is not attributable to a stereotypical male advantage in car expertise.
Exogenous temporal cues enhance recognition memory in an object-based manner.

PubMed

Ohyama, Junji; Watanabe, Katsumi

2010-11-01

Exogenous attention enhances the perception of attended items in both a space-based and an object-based manner. Exogenous attention also improves recognition memory for attended items in the space-based mode. However, it has not been examined whether object-based exogenous attention enhances recognition memory. To address this issue, we examined whether a sudden visual change in a task-irrelevant stimulus (an exogenous cue) would affect participants' recognition memory for items that were serially presented around a cued time. The results showed that recognition accuracy for an item was strongly enhanced when the visual cue occurred at the same location and time as the item (Experiments 1 and 2). The memory enhancement effect occurred when the exogenous visual cue and an item belonged to the same object (Experiments 3 and 4) and even when the cue was counterpredictive of the timing of an item to be asked about (Experiment 5). The present study suggests that an exogenous temporal cue automatically enhances the recognition accuracy for an item that is presented at close temporal proximity to the cue and that recognition memory enhancement occurs in an object-based manner.
Repetition priming of face recognition in a serial choice reaction-time task.

PubMed

Roberts, T; Bruce, V

1989-05-01

Marshall & Walker (1987) found that pictorial stimuli yield visual priming that is disrupted by an unpredictable visual event in the response-stimulus interval. They argue that visual stimuli are represented in memory in the form of distinct visual and object codes. Bruce & Young (1986) propose similar pictorial, structural and semantic codes which mediate the recognition of faces, yet repetition priming results obtained with faces as stimuli (Bruce & Valentine, 1985), and with objects (Warren & Morton, 1982) are quite different from those of Marshall & Walker (1987), in the sense that recognition is facilitated by pictures presented 20 minutes earlier. The experiment reported here used different views of familiar and unfamiliar faces as stimuli in a serial choice reaction-time task and found that, with identical pictures, repetition priming survives an intervening item requiring a response, with both familiar and unfamiliar faces. Furthermore, with familiar faces such priming was present even when the view of the prime was different from the target. The theoretical implications of these results are discussed.
HWDA: A coherence recognition and resolution algorithm for hybrid web data aggregation

NASA Astrophysics Data System (ADS)

Guo, Shuhang; Wang, Jian; Wang, Tong

2017-09-01

Aiming at the object conflict recognition and resolution problem in hybrid distributed data stream aggregation, a distributed data stream object coherence solution is proposed. First, a framework for object coherence conflict recognition and resolution, named HWDA, is defined. Second, an object coherence recognition technique is proposed based on formal-language description logic and the hierarchical dependency relationships between logic rules. Third, a conflict traversal recognition algorithm is proposed based on the defined dependency graph. Next, a conflict resolution technique is proposed based on resolution pattern matching, including the definition of three types of conflict, the conflict resolution matching pattern, and an arbitration resolution method. Finally, experiments use two kinds of web test data sets to validate the effect of applying the conflict recognition and resolution technology of HWDA.
Generation, recognition, and consistent fusion of partial boundary representations from range images

NASA Astrophysics Data System (ADS)

Kohlhepp, Peter; Hanczak, Andrzej M.; Li, Gang

1994-10-01

This paper presents SOMBRERO, a new system for recognizing and locating 3D, rigid, non-moving objects from range data. The objects may be polyhedral or curved, partially occluding, touching or lying flush with each other. For data collection, we employ 2D time-of-flight laser scanners mounted to a moving gantry robot. By combining sensor and robot coordinates, we obtain 3D cartesian coordinates. Boundary representations (Brep's) provide view independent geometry models that are both efficiently recognizable and derivable automatically from sensor data. SOMBRERO's methods for generating, matching and fusing Brep's are highly synergetic. A split-and-merge segmentation algorithm with dynamic triangulation builds a partial (2 1/2D) Brep from scattered data. The recognition module matches this scene description with a model database and outputs recognized objects, their positions and orientations, and possibly surfaces corresponding to unknown objects. We present preliminary results in scene segmentation and recognition. Partial Brep's corresponding to different range sensors or viewpoints can be merged into a consistent, complete and irredundant 3D object or scene model. This fusion algorithm itself uses the recognition and segmentation methods.
Object Recognition in Mental Representations: Directions for Exploring Diagnostic Features through Visual Mental Imagery.

PubMed

Roldan, Stephanie M

2017-01-01

One of the fundamental goals of object recognition research is to understand how a cognitive representation produced from the output of filtered and transformed sensory information facilitates efficient viewer behavior. Given that mental imagery strongly resembles perceptual processes in both cortical regions and subjective visual qualities, it is reasonable to question whether mental imagery facilitates cognition in a manner similar to that of perceptual viewing: via the detection and recognition of distinguishing features. Categorizing the feature content of mental imagery holds potential as a reverse pathway by which to identify the components of a visual stimulus which are most critical for the creation and retrieval of a visual representation. This review will examine the likelihood that the information represented in visual mental imagery reflects distinctive object features thought to facilitate efficient object categorization and recognition during perceptual viewing. If it is the case that these representational features resemble their sensory counterparts in both spatial and semantic qualities, they may well be accessible through mental imagery as evaluated through current investigative techniques. In this review, methods applied to mental imagery research and their findings are reviewed and evaluated for their efficiency in accessing internal representations, and implications for identifying diagnostic features are discussed. An argument is made for the benefits of combining mental imagery assessment methods with diagnostic feature research to advance the understanding of visual perceptive processes, with suggestions for avenues of future investigation.
Object Recognition in Mental Representations: Directions for Exploring Diagnostic Features through Visual Mental Imagery

PubMed Central

Roldan, Stephanie M.

2017-01-01

One of the fundamental goals of object recognition research is to understand how a cognitive representation produced from the output of filtered and transformed sensory information facilitates efficient viewer behavior. Given that mental imagery strongly resembles perceptual processes in both cortical regions and subjective visual qualities, it is reasonable to question whether mental imagery facilitates cognition in a manner similar to that of perceptual viewing: via the detection and recognition of distinguishing features. Categorizing the feature content of mental imagery holds potential as a reverse pathway by which to identify the components of a visual stimulus which are most critical for the creation and retrieval of a visual representation. This review will examine the likelihood that the information represented in visual mental imagery reflects distinctive object features thought to facilitate efficient object categorization and recognition during perceptual viewing. If it is the case that these representational features resemble their sensory counterparts in both spatial and semantic qualities, they may well be accessible through mental imagery as evaluated through current investigative techniques. In this review, methods applied to mental imagery research and their findings are reviewed and evaluated for their efficiency in accessing internal representations, and implications for identifying diagnostic features are discussed. An argument is made for the benefits of combining mental imagery assessment methods with diagnostic feature research to advance the understanding of visual perceptive processes, with suggestions for avenues of future investigation. PMID:28588538
Finding and recognizing objects in natural scenes: complementary computations in the dorsal and ventral visual systems

PubMed Central

Rolls, Edmund T.; Webb, Tristan J.

2014-01-01

Searching for and recognizing objects in complex natural scenes is implemented by multiple saccades until the eyes reach within the reduced receptive field sizes of inferior temporal cortex (IT) neurons. We analyze and model how the dorsal and ventral visual streams both contribute to this. Saliency detection in the dorsal visual system including area LIP is modeled by graph-based visual saliency, and allows the eyes to fixate potential objects within several degrees. Visual information at the fixated location subtending approximately 9° corresponding to the receptive fields of IT neurons is then passed through a four layer hierarchical model of the ventral cortical visual system, VisNet. We show that VisNet can be trained using a synaptic modification rule with a short-term memory trace of recent neuronal activity to capture both the required view and translation invariances to allow in the model approximately 90% correct object recognition for 4 objects shown in any view across a range of 135° anywhere in a scene. The model was able to generalize correctly within the four trained views and the 25 trained translations. This approach analyses the principles by which complementary computations in the dorsal and ventral visual cortical streams enable objects to be located and recognized in complex natural scenes. PMID:25161619
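The synaptic modification rule with a short-term memory trace mentioned above can be illustrated with one commonly described form of a trace learning rule: the trace is a running average of the neuron's output, and the weight change is proportional to the trace times the current input. This sketch is not necessarily the exact variant used in VisNet. The parameter values `eta` and `alpha` and the function name are assumptions, and the weight normalization a full model would need is omitted.

```python
def trace_update(weights, inputs, output, prev_trace, eta=0.8, alpha=0.1):
    """One step of a trace learning rule for a single neuron (illustrative form).

    trace = (1 - eta) * output + eta * prev_trace
    w_j  += alpha * trace * x_j
    Returns the new weights and the new trace value.
    """
    trace = (1.0 - eta) * output + eta * prev_trace
    new_weights = [w + alpha * trace * x for w, x in zip(weights, inputs)]
    return new_weights, trace


if __name__ == "__main__":
    # Two successive views of the same object activate different inputs.
    w, tr = [0.0, 0.0], 0.0
    w, tr = trace_update(w, inputs=[1.0, 0.0], output=1.0, prev_trace=tr)
    # The neuron is silent for the second view, yet the decaying trace of its
    # earlier firing still strengthens the input driven by that view.
    w, tr = trace_update(w, inputs=[0.0, 1.0], output=0.0, prev_trace=tr)
    print(w, tr)
```

Because the trace persists across consecutive presentations, inputs that occur close together in time (such as different views of one object) become associated with the same output neuron. This is the intuition behind learning view and translation invariance with such a rule.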
The utility of multiple synthesized views in the recognition of unfamiliar faces.

PubMed

Jones, Scott P; Dwyer, Dominic M; Lewis, Michael B

2017-05-01

The ability to recognize an unfamiliar individual on the basis of prior exposure to a photograph is notoriously poor and prone to errors, but recognition accuracy is improved when multiple photographs are available. In applied situations, when only limited real images are available (e.g., from a mugshot or CCTV image), the generation of new images might provide a technological prosthesis for otherwise fallible human recognition. We report two experiments examining the effects of providing computer-generated additional views of a target face. In Experiment 1, provision of computer-generated views supported better target face recognition than exposure to the target image alone and equivalent performance to that for exposure of multiple photograph views. Experiment 2 replicated the advantage of providing generated views, but also indicated an advantage for multiple viewings of the single target photograph. These results strengthen the claim that identifying a target face can be improved by providing multiple synthesized views based on a single target image. In addition, our results suggest that the degree of advantage provided by synthesized views may be affected by the quality of synthesized material.
Do object refixations during scene viewing indicate rehearsal in visual working memory?

PubMed

Zelinsky, Gregory J; Loschky, Lester C; Dickinson, Christopher A

2011-05-01

Do refixations serve a rehearsal function in visual working memory (VWM)? We analyzed refixations from observers freely viewing multiobject scenes. An eyetracker was used to limit the viewing of a scene to a specified number of objects fixated after the target (intervening objects), followed by a four-alternative forced choice recognition test. Results showed that the probability of target refixation increased with the number of fixated intervening objects, and these refixations produced a 16% accuracy benefit over the first five intervening-object conditions. Additionally, refixations most frequently occurred after fixations on only one to two other objects, regardless of the intervening-object condition. These behaviors could not be explained by random or minimally constrained computational models; a VWM component was required to completely describe these data. We explain these findings in terms of a monitor-refixate rehearsal system: The activations of object representations in VWM are monitored, with refixations occurring when these activations decrease suddenly.
Spatial resolution enhancement of satellite image data using fusion approach

NASA Astrophysics Data System (ADS)

Lestiana, H.; Sukristiyanti

2018-02-01

Object identification using remote sensing data is problematic when the spatial resolution is not suited to the object. The fusion approach is one method of solving this problem: it improves object recognition and increases the information available about objects by combining data from multiple sensors. Image fusion can be used to estimate environmental components that need to be monitored from multiple views, such as evapotranspiration estimation, 3D ground-based characterisation, smart city applications, urban environments, terrestrial mapping, and water vegetation. With the fusion method, visible objects on land are easily recognized, and the greater variety of object information on land has widened the range of environmental components that can be estimated. The difficulty of recognizing invisible objects such as Submarine Groundwater Discharge (SGD), especially in tropical areas, might be reduced by the fusion method. The low variation of objects in sea surface temperature remains a challenge to be solved.
Retrieval Failure Contributes to Gist-Based False Recognition

PubMed Central

Guerin, Scott A.; Robbins, Clifford A.; Gilmore, Adrian W.; Schacter, Daniel L.

2011-01-01

People often falsely recognize items that are similar to previously encountered items. This robust memory error is referred to as gist-based false recognition. A widely held view is that this error occurs because the details fade rapidly from our memory. Contrary to this view, an initial experiment revealed that, following the same encoding conditions that produce high rates of gist-based false recognition, participants overwhelmingly chose the correct target rather than its related foil when given the option to do so. A second experiment showed that this result is due to increased access to stored details provided by reinstatement of the originally encoded photograph, rather than to increased attention to the details. Collectively, these results suggest that details needed for accurate recognition are, to a large extent, still stored in memory and that a critical factor determining whether false recognition will occur is whether these details can be accessed during retrieval. PMID:22125357
Robust kernel collaborative representation for face recognition

NASA Astrophysics Data System (ADS)

Huang, Wei; Wang, Xiaohui; Ma, Yanbo; Jiang, Yuzheng; Zhu, Yinghui; Jin, Zhong

2015-05-01

One of the greatest challenges of representation-based face recognition is that the training samples are usually insufficient. In other words, the training set usually does not include enough samples to show varieties of high-dimensional face images caused by illuminations, facial expressions, and postures. When the test sample is significantly different from the training samples of the same subject, the recognition performance will be sharply reduced. We propose a robust kernel collaborative representation based on virtual samples for face recognition. We think that the virtual training set conveys some reasonable and possible variations of the original training samples. Hence, we design a new object function to more closely match the representation coefficients generated from the original and virtual training sets. In order to further improve the robustness, we implement the corresponding representation-based face recognition in kernel space. It is noteworthy that any kind of virtual training samples can be used in our method. We use noised face images to obtain virtual face samples. The noise can be approximately viewed as a reflection of the varieties of illuminations, facial expressions, and postures. Our work is a simple and feasible way to obtain virtual face samples to impose Gaussian noise (and other types of noise) specifically to the original training samples to obtain possible variations of the original samples. Experimental results on the FERET, Georgia Tech, and ORL face databases show that the proposed method is more robust than two state-of-the-art face recognition methods, such as CRC and Kernel CRC.
Using an Improved SIFT Algorithm and Fuzzy Closed-Loop Control Strategy for Object Recognition in Cluttered Scenes

PubMed Central

Nie, Haitao; Long, Kehui; Ma, Jun; Yue, Dan; Liu, Jinguo

2015-01-01

Partial occlusions, large pose variations, and extreme ambient illumination conditions generally degrade the performance of object recognition systems. Therefore, this paper presents a novel approach for fast and robust object recognition in cluttered scenes based on an improved scale invariant feature transform (SIFT) algorithm and a fuzzy closed-loop control method. First, a fast SIFT algorithm is proposed by classifying SIFT features into several clusters based on several attributes computed from the sub-orientation histogram (SOH); in the feature matching phase, only features that share nearly the same corresponding attributes are compared. Second, a feature matching step is performed following a prioritized order based on the scale factor, which is calculated between the object image and the target object image, guaranteeing robust feature matching. Finally, a fuzzy closed-loop control strategy is applied to increase the accuracy of the object recognition and is essential for the autonomous object manipulation process. Compared to the original SIFT algorithm for object recognition, the result of the proposed method shows that the number of SIFT features extracted from an object increases significantly, and the computing speed of the object recognition process increases by more than 40%. The experimental results confirmed that the proposed method performs effectively and accurately in cluttered scenes. PMID:25714094
Automated detection and recognition of wildlife using thermal cameras.

PubMed

Christiansen, Peter; Steen, Kim Arild; Jørgensen, Rasmus Nyholm; Karstoft, Henrik

2014-07-30

In agricultural mowing operations, thousands of animals are injured or killed each year, due to the increased working widths and speeds of agricultural machinery. Detection and recognition of wildlife within the agricultural fields is important to reduce wildlife mortality and, thereby, promote wildlife-friendly farming. The work presented in this paper contributes to the automated detection and classification of animals in thermal imaging. The methods and results are based on top-view images taken manually from a lift to motivate work towards unmanned aerial vehicle-based detection and recognition. Hot objects are detected based on a threshold dynamically adjusted to each frame. For the classification of animals, we propose a novel thermal feature extraction algorithm. For each detected object, a thermal signature is calculated using morphological operations. The thermal signature describes heat characteristics of objects and is partly invariant to translation, rotation, scale and posture. The discrete cosine transform (DCT) is used to parameterize the thermal signature and, thereby, calculate a feature vector, which is used for subsequent classification. Using a k-nearest-neighbor (kNN) classifier, animals are discriminated from non-animals with a balanced classification accuracy of 84.7% in an altitude range of 3-10 m and an accuracy of 75.2% for an altitude range of 10-20 m. To incorporate temporal information in the classification, a tracking algorithm is proposed. Using temporal information improves the balanced classification accuracy to 93.3% in an altitude range of 3-10 m and 77.7% in an altitude range of 10-20 m.
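The DCT-plus-kNN step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 1-D "thermal signatures" below are invented numbers, the paper derives its signatures with morphological operations that are not reproduced here, and the function names `dct2` and `knn_predict` are hypothetical.

```python
import math
from collections import Counter

def dct2(signal, n_coeffs):
    """First n_coeffs coefficients of the (unnormalized) DCT-II of a 1-D signal."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n_coeffs)
    ]

def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to query (Euclidean)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical signatures (assumption): a warm peak for animals, flat otherwise.
signatures = [
    ([30, 34, 37, 34, 30, 28, 27, 27], "animal"),
    ([31, 35, 38, 35, 31, 28, 27, 26], "animal"),
    ([29, 33, 36, 33, 30, 28, 27, 27], "animal"),
    ([28, 28, 29, 28, 28, 29, 28, 28], "non-animal"),
    ([27, 28, 28, 27, 28, 28, 27, 28], "non-animal"),
    ([29, 29, 28, 29, 29, 28, 29, 29], "non-animal"),
]

# Parameterize each signature by its first four DCT coefficients.
train = [(dct2(sig, 4), label) for sig, label in signatures]
query = dct2([30, 34, 36, 34, 31, 28, 27, 27], 4)
print(knn_predict(train, query, k=3))
```

Keeping only the low-order coefficients gives a short feature vector that summarizes the coarse shape of the signature; how many coefficients to keep and which value of k to use are tuning choices the abstract does not specify.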
Human-inspired sound environment recognition system for assistive vehicles

NASA Astrophysics Data System (ADS)

González Vidal, Eduardo; Fredes Zarricueta, Ernesto; Auat Cheein, Fernando

2015-02-01

Objective. The human auditory system acquires environmental information under sound stimuli faster than visual or touch systems, which, in turn, allows for faster human responses to such stimuli. It also complements senses such as sight, where direct line-of-view is necessary to identify objects, in the environment recognition process. This work focuses on implementing human reaction to sound stimuli and environment recognition on assistive robotic devices, such as robotic wheelchairs or robotized cars. These vehicles need environment information to ensure safe navigation. Approach. In the field of environment recognition, range sensors (such as LiDAR and ultrasonic systems) and artificial vision devices are widely used; however, these sensors depend on environment constraints (such as lighting variability or color of objects), and sound can provide important information for the characterization of an environment. In this work, we propose a sound-based approach to enhance the environment recognition process, mainly for cases that compromise human integrity, according to the International Classification of Functioning (ICF). Our proposal is based on a neural network implementation that is able to classify up to 15 different environments, each selected according to the ICF considerations on environment factors in the community-based physical activities of people with disabilities. Main results. The accuracy rates in environment classification range from 84% to 93%. This classification is later used to constrain assistive vehicle navigation in order to protect the user during daily activities. This work also includes real-time outdoor experimentation (performed on an assistive vehicle) by seven volunteers with different disabilities (but without cognitive impairment and experienced in the use of wheelchairs), statistical validation, comparison with previously published work, and a discussion section where the pros and cons of our system are evaluated. Significance. The proposed sound-based system is very efficient at providing general descriptions of the environment. Such descriptions are focused on vulnerable situations described by the ICF. The volunteers answered a questionnaire regarding the importance of constraining the vehicle velocities in risky environments, showing that all the volunteers felt comfortable with the system and its performance.
Novelty preference in patients with developmental amnesia.

PubMed

Munoz, M; Chadwick, M; Perez-Hernandez, E; Vargha-Khadem, F; Mishkin, M

2011-12-01

To re-examine whether or not selective hippocampal damage reduces novelty preference in visual paired comparison (VPC), we presented two different versions of the task to a group of patients with developmental amnesia (DA), each of whom sustained this form of pathology early in life. Compared with normal control participants, the DA group showed a delay-dependent reduction in novelty preference on one version of the task and an overall reduction on both versions combined. Because VPC is widely considered to be a measure of incidental recognition, the results appear to support the view that the hippocampus contributes to recognition memory. A difficulty for this conclusion, however, is that according to one current view the hippocampal contribution to recognition is limited to task conditions that encourage recollection of an item in some associated context, and according to another current view, to recognition of an item with the high confidence judgment that reflects a strong memory. By contrast, VPC, throughout which the participant remains entirely uninstructed other than to view the stimuli, would seem to lack such task conditions and so would likely lead to recognition based on familiarity rather than recollection or, alternatively, weak memories rather than strong. However, before concluding that the VPC impairment therefore contradicts both current views regarding the role of the hippocampus in recognition memory, two possibilities that would resolve this issue need to be investigated. One is that some variable in VPC, such as the extended period of stimulus encoding during familiarization, overrides its incidental nature, and, because this condition promotes either recollection- or strength-based recognition, renders the task hippocampal-dependent. The other possibility is that VPC, rather than providing a measure of incidental recognition, actually assesses an implicit, information-gathering process modulated by habituation, for which the hippocampus is also partly responsible, independent of its role in recognition. Copyright © 2010 Wiley Periodicals, Inc.
Object Recognition Under Semantic Impairment: The Effects of Conceptual Regularities on Perceptual Decisions.

ERIC Educational Resources Information Center

Rogers, Timothy T.; Hodges, John R.; Ralph, Matthew A. Lambon; Patterson, Karalyn

2003-01-01

Presents evidence that although patients with semantic deficits can sometimes show good performance on tests of object decision, this pattern applies when nonsense-objects do not respect the regularities of the domain. Patients with semantic dementia viewed line drawings of real and chimeric animals side-by-side and were asked to decide which was…
Humans and Deep Networks Largely Agree on Which Kinds of Variation Make Object Recognition Harder.

PubMed

Kheradpisheh, Saeed R; Ghodrati, Masoud; Ganjtabesh, Mohammad; Masquelier, Timothée

2016-01-01

View-invariant object recognition is a challenging problem that has attracted much attention among the psychology, neuroscience, and computer vision communities. Humans are notoriously good at it, even if some variations are presumably more difficult to handle than others (e.g., 3D rotations). Humans are thought to solve the problem through hierarchical processing along the ventral stream, which progressively extracts more and more invariant visual features. This feed-forward architecture has inspired a new generation of bio-inspired computer vision systems called deep convolutional neural networks (DCNN), which are currently the best models for object recognition in natural images. Here, for the first time, we systematically compared human feed-forward vision and DCNNs at view-invariant object recognition task using the same set of images and controlling the kinds of transformation (position, scale, rotation in plane, and rotation in depth) as well as their magnitude, which we call "variation level." We used four object categories: car, ship, motorcycle, and animal. In total, 89 human subjects participated in 10 experiments in which they had to discriminate between two or four categories after rapid presentation with backward masking. We also tested two recent DCNNs (proposed respectively by Hinton's group and Zisserman's group) on the same tasks. We found that humans and DCNNs largely agreed on the relative difficulties of each kind of variation: rotation in depth is by far the hardest transformation to handle, followed by scale, then rotation in plane, and finally position (much easier). This suggests that DCNNs would be reasonable models of human feed-forward vision. In addition, our results show that the variation levels in rotation in depth and scale strongly modulate both humans' and DCNNs' recognition performances. We thus argue that these variations should be controlled in the image datasets used in vision research.

Robot Vision

NASA Technical Reports Server (NTRS)

Sutro, L. L.; Lerman, J. B.

1973-01-01

The operation of a system is described that is built both to model the vision of primate animals, including man, and to serve as a pre-prototype of a possible object recognition system. It was employed in a series of experiments to determine the practicability of matching left and right images of a scene to determine the range and form of objects. The experiments started with computer generated random-dot stereograms as inputs and progressed through random square stereograms to a real scene. The major problems were the elimination of spurious matches between the left and right views, and the interpretation of ambiguous regions: on the left side of an object that can be viewed only by the left camera, and on the right side of an object that can be viewed only by the right camera.
The implementation of aerial object recognition algorithm based on contour descriptor in FPGA-based on-board vision system

NASA Astrophysics Data System (ADS)

Babayan, Pavel; Smirnov, Sergey; Strotov, Valery

2017-10-01

This paper describes an aerial object recognition algorithm for on-board and stationary vision systems. The suggested algorithm is intended to recognize objects of a specific kind using a set of reference objects defined by 3D models. The proposed algorithm is based on building an outer contour descriptor. The algorithm consists of two stages: learning and recognition. The learning stage is devoted to exploring the reference objects. Using 3D models, we can build a database of training images by rendering each 3D model from viewpoints evenly distributed on a sphere. The points are distributed on the sphere according to the geosphere principle. The gathered training image set is used to calculate descriptors, which are then used in the recognition stage of the algorithm. The recognition stage focuses on estimating the similarity of the captured object and the reference objects by matching the observed image descriptor against the reference object descriptors. The experimental research was performed using a set of models of aircraft of different types (airplanes, helicopters, UAVs). The proposed orientation estimation algorithm showed good accuracy in all case studies. The real-time performance of the algorithm in an FPGA-based vision system was demonstrated.
System of technical vision for autonomous unmanned aerial vehicles

NASA Astrophysics Data System (ADS)

Bondarchuk, A. S.

2018-05-01

This paper is devoted to the implementation of an image recognition algorithm using the LabVIEW software. The created virtual instrument is designed to detect objects in frames from a camera mounted on the UAV. The trained classifier is invariant to changes in rotation, as well as to small changes in the camera's viewing angle. Finding objects in the image using particle analysis makes it possible to classify regions of different sizes. This method allows the system of technical vision to more accurately determine the location of the objects of interest and their movement relative to the camera.
Face learning and the emergence of view-independent face recognition: an event-related brain potential study.

PubMed

Zimmermann, Friederike G S; Eimer, Martin

2013-06-01

Recognizing unfamiliar faces is more difficult than familiar face recognition, and this has been attributed to qualitative differences in the processing of familiar and unfamiliar faces. Familiar faces are assumed to be represented by view-independent codes, whereas unfamiliar face recognition depends mainly on view-dependent low-level pictorial representations. We employed an electrophysiological marker of visual face recognition processes in order to track the emergence of view-independence during the learning of previously unfamiliar faces. Two face images showing either the same or two different individuals in the same or two different views were presented in rapid succession, and participants had to perform an identity-matching task. On trials where both faces showed the same view, repeating the face of the same individual triggered an N250r component at occipito-temporal electrodes, reflecting the rapid activation of visual face memory. A reliable N250r component was also observed on view-change trials. Crucially, this view-independence emerged as a result of face learning. In the first half of the experiment, N250r components were present only on view-repetition trials but were absent on view-change trials, demonstrating that matching unfamiliar faces was initially based on strictly view-dependent codes. In the second half, the N250r was triggered not only on view-repetition trials but also on view-change trials, indicating that face recognition had now become more view-independent. This transition may be due to the acquisition of abstract structural codes of individual faces during face learning, but could also reflect the formation of associative links between sets of view-specific pictorial representations of individual faces. Copyright © 2013 Elsevier Ltd. All rights reserved.
A focus of attention mechanism for gaze control within a framework for intelligent image analysis tools

NASA Astrophysics Data System (ADS)

Rodrigo, Ranga P.; Ranaweera, Kamal; Samarabandu, Jagath K.

2004-05-01

Focus of attention is often attributed to biological vision systems, where the entire field of view is first monitored and then the attention is focused on the object of interest. We propose using a similar approach for object recognition in a color image sequence. The intention is to locate an object based on a prior motive and concentrate on the detected object so that the imaging device can be guided toward it. We use the abilities of the intelligent image analysis framework developed in our laboratory to generate an algorithm dynamically to detect the particular type of object based on the user's object description. The proposed method uses color clustering along with segmentation. The segmented image with labeled regions is used to calculate the shape descriptor parameters. These and the color information are matched with the input description. Gaze is then controlled by issuing camera movement commands as appropriate. We present some preliminary results that demonstrate the success of this approach.
The Influence of Action Perception on Object Recognition: A Developmental Study

ERIC Educational Resources Information Center

Mounoud, Pierre; Duscherer, Katia; Moy, Guenael; Perraudin, Sandrine

2007-01-01

Two experiments explored the existence and the development of relations between action representations and object representations. A priming paradigm was used in which participants viewed an action pantomime followed by the picture of a tool, the tool being either associated or unassociated with the preceding action. Overall, we observed that the…
Object Recognition and Localization: The Role of Tactile Sensors

PubMed Central

Aggarwal, Achint; Kirchner, Frank

2014-01-01

Tactile sensors, because of their intrinsic insensitivity to lighting conditions and water turbidity, provide promising opportunities for augmenting the capabilities of vision sensors in applications involving object recognition and localization. This paper presents two approaches for haptic object recognition and localization for ground and underwater environments. The first approach called Batch Ransac and Iterative Closest Point augmented Particle Filter (BRICPPF) is based on an innovative combination of particle filters, Iterative-Closest-Point algorithm, and a feature-based Random Sampling and Consensus (RANSAC) algorithm for database matching. It can handle a large database of 3D-objects of complex shapes and performs a complete six-degree-of-freedom localization of static objects. The algorithms are validated by experimentation in ground and underwater environments using real hardware. To our knowledge this is the first instance of haptic object recognition and localization in underwater environments. The second approach is biologically inspired, and provides a close integration between exploration and recognition. An edge following exploration strategy is developed that receives feedback from the current state of recognition. A recognition by parts approach is developed which uses the BRICPPF for object sub-part recognition. Object exploration is either directed to explore a part until it is successfully recognized, or is directed towards new parts to endorse the current recognition belief. This approach is validated by simulation experiments. PMID:24553087
Shape and Color Features for Object Recognition Search

NASA Technical Reports Server (NTRS)

Duong, Tuan A.; Duong, Vu A.; Stubberud, Allen R.

2012-01-01

A bio-inspired shape feature of an object of interest emulates the integration of the saccadic eye movement and horizontal layer in vertebrate retina for object recognition search where a single object can be used one at a time. The optimal computational model for shape-extraction-based principal component analysis (PCA) was also developed to reduce processing time and enable the real-time adaptive system capability. A color feature of the object is employed as color segmentation to empower the shape feature recognition to solve the object recognition in the heterogeneous environment where a single technique - shape or color - may expose its difficulties. To enable the effective system, an adaptive architecture and autonomous mechanism were developed to recognize and adapt the shape and color feature of the moving object. The bio-inspired object recognition based on bio-inspired shape and color can be effective to recognize a person of interest in the heterogeneous environment where the single technique exposed its difficulties to perform effective recognition. Moreover, this work also demonstrates the mechanism and architecture of the autonomous adaptive system to enable the realistic system for the practical use in the future.
Weighted fusion of depth and inertial data to improve view invariance for real-time human action recognition

NASA Astrophysics Data System (ADS)

Chen, Chen; Hao, Huiyan; Jafari, Roozbeh; Kehtarnavaz, Nasser

2017-05-01

This paper presents an extension to our previously developed fusion framework [10] involving a depth camera and an inertial sensor in order to improve its view invariance aspect for real-time human action recognition applications. A computationally efficient view estimation based on skeleton joints is considered in order to select the most relevant depth training data when recognizing test samples. Two collaborative representation classifiers, one for depth features and one for inertial features, are appropriately weighted to generate a decision making probability. The experimental results applied to a multi-view human action dataset show that this weighted extension improves the recognition performance by about 5% over equally weighted fusion deployed in our previous fusion framework.
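The weighted decision fusion mentioned in this abstract can be sketched as a weighted sum of two per-class probability vectors. This is only an illustration: the abstract does not state how the weights are chosen, so the weight `w_depth`, the function name `fuse`, and the example probabilities are all assumptions.

```python
def fuse(depth_probs, inertial_probs, w_depth=0.5):
    """Combine two class-probability vectors with weights w_depth and 1 - w_depth.

    Returns the fused vector and the index of the winning class.
    """
    if not 0.0 <= w_depth <= 1.0:
        raise ValueError("w_depth must lie in [0, 1]")
    if len(depth_probs) != len(inertial_probs):
        raise ValueError("both classifiers must score the same classes")
    fused = [
        w_depth * d + (1.0 - w_depth) * i
        for d, i in zip(depth_probs, inertial_probs)
    ]
    best = max(range(len(fused)), key=fused.__getitem__)
    return fused, best

# Hypothetical outputs of the depth and inertial classifiers for three actions.
depth = [0.6, 0.3, 0.1]
inertial = [0.2, 0.7, 0.1]

print(fuse(depth, inertial, w_depth=0.5)[1])  # equal weighting
print(fuse(depth, inertial, w_depth=0.8)[1])  # trust the depth classifier more
```

With these made-up numbers the two weightings select different classes, which shows why the choice of weights can change the recognition result.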
Image processing strategies based on saliency segmentation for object recognition under simulated prosthetic vision.

PubMed

Li, Heng; Su, Xiaofan; Wang, Jing; Kan, Han; Han, Tingting; Zeng, Yajie; Chai, Xinyu

2018-01-01

Current retinal prostheses can only generate low-resolution visual percepts constituted of limited phosphenes which are elicited by an electrode array and with uncontrollable color and restricted grayscale. Under this visual perception, prosthetic recipients can just complete some simple visual tasks, but more complex tasks like face identification/object recognition are extremely difficult. Therefore, it is necessary to investigate and apply image processing strategies for optimizing the visual perception of the recipients. This study focuses on recognition of the object of interest employing simulated prosthetic vision. We used a saliency segmentation method based on a biologically plausible graph-based visual saliency model and a GrabCut-based self-adaptive-iterative optimization framework to automatically extract foreground objects. Based on this, two image processing strategies, Addition of Separate Pixelization and Background Pixel Shrink, were further utilized to enhance the extracted foreground objects. i) Psychophysical experiments verified that under simulated prosthetic vision, both strategies had marked advantages over Direct Pixelization in terms of recognition accuracy and efficiency. ii) We also found that recognition performance under the two strategies was tied to the segmentation results and was affected positively by the paired-interrelated objects in the scene. The use of the saliency segmentation method and image processing strategies can automatically extract and enhance foreground objects, and significantly improve object recognition performance for recipients implanted with a high-density implant. Copyright © 2017 Elsevier B.V. All rights reserved.
Robust Pedestrian Tracking and Recognition from FLIR Video: A Unified Approach via Sparse Coding

PubMed Central

Li, Xin; Guo, Rui; Chen, Chao

2014-01-01

Sparse coding is an emerging method that has been successfully applied to both robust object tracking and recognition in the vision literature. In this paper, we propose to explore a sparse coding-based approach toward joint object tracking-and-recognition and explore its potential in the analysis of forward-looking infrared (FLIR) video to support nighttime machine vision systems. A key technical contribution of this work is to unify existing sparse coding-based approaches toward tracking and recognition under the same framework, so that they can benefit from each other in a closed-loop. On the one hand, tracking the same object through temporal frames allows us to achieve improved recognition performance through dynamical updating of template/dictionary and combining multiple recognition results; on the other hand, the recognition of individual objects facilitates the tracking of multiple objects (i.e., walking pedestrians), especially in the presence of occlusion within a crowded environment. We report experimental results on both the CASIA Pedestrian Database and our own collected FLIR video database to demonstrate the effectiveness of the proposed joint tracking-and-recognition approach. PMID:24961216
An approach to computing direction relations between separated object groups

NASA Astrophysics Data System (ADS)

Yan, H.; Wang, Z.; Li, J.

2013-06-01

Direction relations between object groups play an important role in qualitative spatial reasoning, spatial computation and spatial recognition. However, none of the existing models can be used to compute direction relations between object groups. To fill this gap, an approach to computing direction relations between separated object groups is proposed in this paper, which is theoretically based on Gestalt principles and the idea of multi-directions. The approach first triangulates the two object groups; then it constructs the Voronoi diagram between the two groups using the triangular network; after this, the normal of each Voronoi edge is calculated, and the quantitative expression of the direction relations is constructed; finally, the quantitative direction relations are transformed into qualitative ones. The psychological experiments show that the proposed approach can obtain direction relations both between two single objects and between two object groups, and the results are correct from the point of view of spatial cognition.
An approach to computing direction relations between separated object groups

NASA Astrophysics Data System (ADS)

Yan, H.; Wang, Z.; Li, J.

2013-09-01

Direction relations between object groups play an important role in qualitative spatial reasoning, spatial computation and spatial recognition. However, none of the existing models can be used to compute direction relations between object groups. To fill this gap, an approach to computing direction relations between separated object groups is proposed in this paper, which is theoretically based on Gestalt principles and the idea of multi-directions. The approach first triangulates the two object groups, and then it constructs the Voronoi diagram between the two groups using the triangular network. After this, the normal of each Voronoi edge is calculated, and the quantitative expression of the direction relations is constructed. Finally, the quantitative direction relations are transformed into qualitative ones. The psychological experiments show that the proposed approach can obtain direction relations both between two single objects and between two object groups, and the results are correct from the point of view of spatial cognition.
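The pipeline in this abstract (triangulate, build the Voronoi diagram between the groups, take the edge normals, convert to qualitative directions) can be illustrated with a much-simplified sketch. It leans on one geometric fact: a Voronoi edge is the perpendicular bisector of the Delaunay edge joining its two sites, so the normal of a Voronoi edge that separates a point of group A from a point of group B is parallel to the vector between those two points. The sketch does not build a real triangulation. As a stated assumption, it substitutes nearest-neighbour links between the groups for the cross-group Delaunay edges, and the function names and the eight-sector compass scheme are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter

# Eight 45-degree compass sectors, counter-clockwise from east (an assumed scheme).
SECTORS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]

def sector(dx, dy):
    """Map a direction vector to one of the eight compass sectors."""
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    return SECTORS[int(((angle + 22.5) % 360.0) // 45.0)]

def direction_relations(group_a, group_b):
    """Quantitative relation: share of A-to-B links falling in each sector.

    Points are (x, y) tuples. Nearest-neighbour links stand in for the
    cross-group Delaunay edges; this is a simplification, not the paper's method.
    """
    links = set()
    for a in group_a:
        links.add((a, min(group_b, key=lambda b: math.dist(a, b))))
    for b in group_b:
        links.add((min(group_a, key=lambda a: math.dist(a, b)), b))
    counts = Counter(sector(b[0] - a[0], b[1] - a[1]) for a, b in links)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}

def qualitative(relations):
    """Qualitative relation: the sector with the largest share."""
    return max(relations, key=relations.get)

if __name__ == "__main__":
    houses = [(0.0, 0.0), (0.0, 1.0)]   # made-up coordinates
    trees = [(5.0, 0.0), (5.0, 1.0)]
    shares = direction_relations(houses, trees)
    print(shares, qualitative(shares))
```

Keeping the per-sector shares before collapsing them to a single label mirrors the abstract's two-stage idea of a quantitative expression followed by a qualitative one.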
Road sign recognition using Viapix module and correlation

NASA Astrophysics Data System (ADS)

Ouerhani, Y.; Desthieux, M.; Alfalou, A.

2015-03-01

In this paper, we propose and validate a new system for surveying road assets, focusing on vertical road signs. The system combines road sign detection, recognition and identification using data provided by sensors. The proposed approach uses panoramic views provided by an innovative device, VIAPIX®, developed by our company, ACTRIS. It also relies on an optimized correlation technique for road sign recognition and identification in images. The results obtained show the benefit of using panoramic views compared with results obtained from images provided by a single camera.
Safe trajectory estimation at a pedestrian crossing to assist visually impaired people.

PubMed

Alghamdi, Saleh; van Schyndel, Ron; Khalil, Ibrahim

2012-01-01

The aim of this paper is to present a service that helps blind people and people with low vision to cross the street independently. The presented approach provides the user with significant information such as detection of the pedestrian crossing signal from any point of view, when the pedestrian crossing signal light is green, the detection of dynamic and fixed obstacles, predictions of the movement of fellow pedestrians and information on objects which may intersect the user's path. Our approach is based on capturing multiple frames using a depth camera which is attached to a user's headgear. Currently a testbed system is built on a helmet and is connected to a laptop in the user's backpack. In this paper, we discuss the efficiency of using the Speeded-Up Robust Features (SURF) algorithm for object recognition to assist blind people. The system predicts the movement of objects of interest to provide the user with information on the safest path to navigate and information on the surrounding area. Evaluation of this approach on real video frame sequences yields 90% human detection and more than 80% recognition of other related objects.
Northeast Artificial Intelligence Consortium Annual Report. Volume 7. 1988 Research in Automated Photointerpretation

DTIC Science & Technology

1989-10-01

weight based on how powerful the corresponding feature is for object recognition and discrimination. For example, consider an arbitrary weight, denoted...quality of the segmentation, how powerful the features and spatial constraints in the knowledge base are (as far as object recognition is concern...that are powerful for object recognition and discrimination. At this point, this selection is performed heuristically through trial-and-error. As a
Newborn chickens generate invariant object representations at the onset of visual object experience

PubMed Central

Wood, Justin N.

2013-01-01

To recognize objects quickly and accurately, mature visual systems build invariant object representations that generalize across a range of novel viewing conditions (e.g., changes in viewpoint). To date, however, the origins of this core cognitive ability have not yet been established. To examine how invariant object recognition develops in a newborn visual system, I raised chickens from birth for 2 weeks within controlled-rearing chambers. These chambers provided complete control over all visual object experiences. In the first week of life, subjects’ visual object experience was limited to a single virtual object rotating through a 60° viewpoint range. In the second week of life, I examined whether subjects could recognize that virtual object from novel viewpoints. Newborn chickens were able to generate viewpoint-invariant representations that supported object recognition across large, novel, and complex changes in the object’s appearance. Thus, newborn visual systems can begin building invariant object representations at the onset of visual object experience. These abstract representations can be generated from sparse data, in this case from a visual world containing a single virtual object seen from a limited range of viewpoints. This study shows that powerful, robust, and invariant object recognition machinery is an inherent feature of the newborn brain. PMID:23918372
OPTICAL INFORMATION PROCESSING: Synthesis of an object recognition system based on the profile of the envelope of a laser pulse in pulsed lidars

NASA Astrophysics Data System (ADS)

Buryi, E. V.

1998-05-01

The main problems in the synthesis of an object recognition system based on the operating principles of neural networks are considered. The advantages of a hierarchical structure for the recognition algorithm are demonstrated. The use of reading of the amplitude spectrum of signals as information tags is justified and a method is developed for determination of the dimensionality of the tag space. Methods are suggested for ensuring the stability of object recognition in the optical range. It is concluded that it should be possible to recognise perspectives of complex objects.
Learning and recognition of on-premise signs from weakly labeled street view images.

PubMed

Tsai, Tsung-Hung; Cheng, Wen-Huang; You, Chuang-Wen; Hu, Min-Chun; Tsui, Arvin Wen; Chi, Heng-Yu

2014-03-01

Camera-enabled mobile devices are commonly used as interaction platforms for linking the user's virtual and physical worlds in numerous research and commercial applications, such as serving as an augmented reality interface for mobile information retrieval. The various application scenarios give rise to a key technique of daily life visual object recognition. On-premise signs (OPSs), a popular form of commercial advertising, are widely used in daily life. The OPSs often exhibit great visual diversity (e.g., appearing in arbitrary size), accompanied by complex environmental conditions (e.g., foreground and background clutter). Observing that such real-world characteristics are lacking in most of the existing image data sets, in this paper, we first propose an OPS data set, namely OPS-62, in which a total of 4649 OPS images of 62 different businesses are collected from Google's Street View. Further, to address the problem of real-world OPS learning and recognition, we developed a probabilistic framework based on distributional clustering, in which we propose to exploit the distributional information of each visual feature (the distribution of its associated OPS labels) as a reliable selection criterion for building discriminative OPS models. Experiments on the OPS-62 data set demonstrated that our approach outperforms the state-of-the-art probabilistic latent semantic analysis models, with more accurate recognition and fewer false alarms, and a significant 151.28% relative improvement in the average recognition rate. Meanwhile, our approach is simple, linear, and can be executed in a parallel fashion, making it practical and scalable for large-scale multimedia applications.
Toward faster and more accurate star sensors using recursive centroiding and star identification

NASA Astrophysics Data System (ADS)

Samaan, Malak Anees

The objective of this research is to study novel techniques for spacecraft attitude determination using star tracker sensors. This dissertation addresses various issues in developing improved star tracker software, presents new approaches for better performance of star trackers, and considers applications to realize high precision attitude estimates. Star sensors are often included in a spacecraft attitude-system instrument suite, where high accuracy pointing capability is required. Novel methods for image processing, ground calibration of camera parameters, autonomous star pattern recognition, and recursive star identification are researched and implemented to achieve a high accuracy, high frame rate star tracker that can be used for many space missions. This dissertation presents the methods and algorithms implemented for the one-field-of-view (FOV) StarNav I sensor that was tested aboard the STS-107 mission in spring 2003 and the two-fields-of-view StarNav II sensor for the EO-3 spacecraft scheduled for launch in 2007. The results of this research enable advances in spacecraft attitude determination based upon real time star sensing and pattern recognition. Building upon recent developments in image processing, pattern recognition algorithms, focal plane detectors, electro-optics, and microprocessors, the star tracker concept utilized in this research has the following key objectives for spacecraft of the future: lower cost, lower mass and smaller volume, increased robustness to environment-induced aging and instrument response variations, increased adaptability and autonomy via recursive self-calibration and health-monitoring on-orbit. Many of these attributes are consequences of improved algorithms that are derived in this dissertation.

A text input system developed by using lips image recognition based LabVIEW for the seriously disabled.

PubMed

Chen, S C; Shao, C L; Liang, C K; Lin, S W; Huang, T H; Hsieh, M C; Yang, C H; Luo, C H; Wuo, C M

2004-01-01

In this paper, we present a text input system for the seriously disabled that uses lip image recognition based on LabVIEW. This system can be divided into a software subsystem and a hardware subsystem. In the software subsystem, we adopted image processing techniques to recognize whether the mouth is open or closed, depending on the relative distance between the upper lip and the lower lip. In the hardware subsystem, the parallel port built into the PC is used to transmit the recognized mouth status to the Morse-code text input system. Integrating the software subsystem with the hardware subsystem, we implement a text input system using lip image recognition programmed in the LabVIEW language. We hope the system can help the seriously disabled to communicate with normal people more easily.
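The chain described in this abstract (lip distance, then mouth-open or mouth-closed status, then Morse code, then text) can be sketched in a few lines. The abstract gives no thresholds or timings, so everything numeric below is an assumption: a mouth-open run shorter than `dash_frames` frames counts as a dot, a longer one as a dash, and a mouth-closed run of at least `gap_frames` frames ends a letter. The function name and parameters are hypothetical, and the original system is implemented in LabVIEW, not Python.

```python
from itertools import groupby

# International Morse code for the letters A-Z.
MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E", "..-.": "F",
    "--.": "G", "....": "H", "..": "I", ".---": "J", "-.-": "K", ".-..": "L",
    "--": "M", "-.": "N", "---": "O", ".--.": "P", "--.-": "Q", ".-.": "R",
    "...": "S", "-": "T", "..-": "U", "...-": "V", ".--": "W", "-..-": "X",
    "-.--": "Y", "--..": "Z",
}

def decode_lip_distances(distances, open_threshold=10.0, dash_frames=3, gap_frames=3):
    """Turn per-frame upper-to-lower lip distances into text.

    All three parameters are illustrative assumptions, not values from the paper.
    """
    # Step 1: classify every frame as mouth-open (True) or mouth-closed (False).
    states = [d > open_threshold for d in distances]
    text, symbol = [], ""
    # Step 2: collapse consecutive identical states into runs and interpret them.
    for is_open, run in groupby(states):
        length = len(list(run))
        if is_open:
            symbol += "-" if length >= dash_frames else "."
        elif length >= gap_frames and symbol:
            text.append(MORSE.get(symbol, "?"))  # unknown patterns become "?"
            symbol = ""
    if symbol:  # flush the last letter if the sequence ends without a gap
        text.append(MORSE.get(symbol, "?"))
    return "".join(text)

if __name__ == "__main__":
    # One short opening, a long pause, then one long opening: "." then "-".
    print(decode_lip_distances([20, 2, 2, 2, 20, 20, 20]))  # prints ET
```

Working on run lengths rather than single frames makes the decoder tolerant of the frame rate, at the cost of needing the user to hold each mouth state for a consistent duration.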
Formal implementation of a performance evaluation model for the face recognition system.

PubMed

Shin, Yong-Nyuo; Kim, Jason; Lee, Yong-Jun; Shin, Woochang; Choi, Jin-Young

2008-01-01

Due to usability features, practical applications, and its lack of intrusiveness, face recognition technology, based on information derived from individuals' facial features, has been attracting considerable attention recently. Reported recognition rates of commercialized face recognition systems cannot be admitted as official recognition rates, as they are based on assumptions that are beneficial to the specific system and face database. Therefore, performance evaluation methods and tools are necessary to objectively measure the accuracy and performance of any face recognition system. In this paper, we propose and formalize a performance evaluation model for the biometric recognition system, implementing an evaluation tool for face recognition systems based on the proposed model. Furthermore, we performed evaluations objectively by providing guidelines for the design and implementation of a performance evaluation system, formalizing the performance test process.
A Method to Recognize Anatomical Site and Image Acquisition View in X-ray Images.

PubMed

Chang, Xiao; Mazur, Thomas; Li, H Harold; Yang, Deshan

2017-12-01

A method was developed to recognize anatomical site and image acquisition view automatically in 2D X-ray images that are used in image-guided radiation therapy. The purpose is to enable site- and view-dependent automation and optimization in image processing tasks including 2D-2D image registration, 2D image contrast enhancement, and independent treatment site confirmation. The X-ray images for 180 patients of six disease sites (the brain, head-neck, breast, lung, abdomen, and pelvis) were included in this study, with 30 patients per site and two orthogonal-view images per patient. A hierarchical multiclass recognition model was developed to recognize the general site first and then the specific site. Each node of the hierarchical model recognized the images using a feature extraction step based on principal component analysis followed by a binary classification step based on a support vector machine. Given two images in known orthogonal views, the site recognition model achieved a 99% average F1 score across the six sites. If the views were unknown in the images, the average F1 score was 97%. If only one image was taken, either with or without view information, the average F1 score was 94%. The accuracy of the site-specific view recognition models was 100%.
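The headline metric here is an "average F1 score across the six sites". For one class, F1 is the harmonic mean of precision and recall, which simplifies to 2·TP / (2·TP + FP + FN). The sketch below computes per-class F1 and their unweighted mean (macro-averaging). Whether the authors averaged in exactly this way is not stated in the abstract, so treat the averaging rule as an assumption; the labels in the example are made up and are not study data.

```python
def f1_per_class(y_true, y_pred):
    """Return {label: F1} using F1 = 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging rule)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)

if __name__ == "__main__":
    # Toy labels for illustration only.
    truth = ["brain", "brain", "lung", "lung"]
    predicted = ["brain", "lung", "lung", "lung"]
    print(f1_per_class(truth, predicted))
    print(macro_f1(truth, predicted))
```

Macro-averaging gives each site equal weight, which is a natural choice when, as in this study, every site contributes the same number of patients.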
Music Recognition in Frontotemporal Lobar Degeneration and Alzheimer Disease

PubMed Central

Johnson, Julene K; Chang, Chiung-Chih; Brambati, Simona M; Migliaccio, Raffaella; Gorno-Tempini, Maria Luisa; Miller, Bruce L; Janata, Petr

2013-01-01

Objective: To compare music recognition in patients with frontotemporal dementia, semantic dementia, Alzheimer disease, and controls and to evaluate the relationship between music recognition and brain volume. Background: Recognition of familiar music depends on several levels of processing. There are few studies about how patients with dementia recognize familiar music. Methods: Subjects were administered tasks that assess pitch and melody discrimination, detection of pitch errors in familiar melodies, and naming of familiar melodies. Results: There were no group differences on pitch and melody discrimination tasks. However, patients with semantic dementia had considerable difficulty naming familiar melodies and also scored the lowest when asked to identify pitch errors in the same melodies. Naming familiar melodies, but not other music tasks, was strongly related to measures of semantic memory. Voxel-based morphometry analysis of brain MRI showed that difficulty in naming songs was associated with the bilateral temporal lobes and inferior frontal gyrus, whereas difficulty in identifying pitch errors in familiar melodies correlated with primarily the right temporal lobe. Conclusions: The results support a view that the anterior temporal lobes play a role in familiar melody recognition, and that musical functions are affected differentially across forms of dementia. PMID:21617528
Visual working memory is more tolerant than visual long-term memory.

PubMed

Schurgin, Mark W; Flombaum, Jonathan I

2018-05-07

Human visual memory is tolerant, meaning that it supports object recognition despite variability across encounters at the image level. Tolerant object recognition remains one capacity in which artificial intelligence trails humans. Typically, tolerance is described as a property of human visual long-term memory (VLTM). In contrast, visual working memory (VWM) is not usually ascribed a role in tolerant recognition, with tests of that system usually demanding discriminatory power (identifying changes, not sameness). There are good reasons to expect that VLTM is more tolerant; functionally, recognition over the long-term must accommodate the fact that objects will not be viewed under identical conditions; and practically, the passive and massive nature of VLTM may impose relatively permissive criteria for thinking that two inputs are the same. But empirically, tolerance has never been compared across working and long-term visual memory. We therefore developed a novel paradigm for equating encoding and test across different memory types. In each experiment trial, participants saw two objects, memory for one tested immediately (VWM) and later for the other (VLTM). VWM performance was better than VLTM and remained robust despite the introduction of image and object variability. In contrast, VLTM performance suffered linearly as more variability was introduced into test stimuli. Additional experiments excluded interference effects as causes for the observed differences. These results suggest the possibility of a previously unidentified role for VWM in the acquisition of tolerant representations for object recognition. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
The representation of object viewpoint in human visual cortex.

PubMed

Andresen, David R; Vinberg, Joakim; Grill-Spector, Kalanit

2009-04-01

Understanding the nature of object representations in the human brain is critical for understanding the neural basis of invariant object recognition. However, the degree to which object representations are sensitive to object viewpoint is unknown. Using fMRI we employed a parametric approach to examine the sensitivity to object view as a function of rotation (0°-180°), category (animal/vehicle) and fMRI-adaptation paradigm (short or long-lagged). For both categories and fMRI-adaptation paradigms, object-selective regions recovered from adaptation when a rotated view of an object was shown after adaptation to a specific view of that object, suggesting that representations are sensitive to object rotation. However, we found evidence for differential representations across categories and ventral stream regions. Rotation cross-adaptation was larger for animals than vehicles, suggesting higher sensitivity to vehicle than animal rotation, and was largest in the left fusiform/occipito-temporal sulcus (pFUS/OTS), suggesting that this region has low sensitivity to rotation. Moreover, right pFUS/OTS and FFA responded more strongly to front than back views of animals (without adaptation) and rotation cross-adaptation depended both on the level of rotation and the adapting view. This result suggests a prevalence of neurons that prefer frontal views of animals in fusiform regions. Using a computational model of view-tuned neurons, we demonstrate that differential neural view tuning widths and relative distributions of neural-tuned populations in fMRI voxels can explain the fMRI results. Overall, our findings underscore the utility of parametric approaches for studying the neural basis of object invariance and suggest that there is no complete invariance to object view in the human ventral stream.
A new selective developmental deficit: Impaired object recognition with normal face recognition.

PubMed

Germine, Laura; Cashdollar, Nathan; Düzel, Emrah; Duchaine, Bradley

2011-05-01

Studies of developmental deficits in face recognition, or developmental prosopagnosia, have shown that individuals who have not suffered brain damage can show face recognition impairments coupled with normal object recognition (Duchaine and Nakayama, 2005; Duchaine et al., 2006; Nunn et al., 2001). However, no developmental cases with the opposite dissociation - normal face recognition with impaired object recognition - have been reported. The existence of a case of non-face developmental visual agnosia would indicate that the development of normal face recognition mechanisms does not rely on the development of normal object recognition mechanisms. To see whether a developmental variant of non-face visual object agnosia exists, we conducted a series of web-based object and face recognition tests to screen for individuals showing object recognition memory impairments but not face recognition impairments. Through this screening process, we identified AW, an otherwise normal 19-year-old female, who was then tested in the lab on face and object recognition tests. AW's performance was impaired in within-class visual recognition memory across six different visual categories (guns, horses, scenes, tools, doors, and cars). In contrast, she scored normally on seven tests of face recognition, tests of memory for two other object categories (houses and glasses), and tests of recall memory for visual shapes. Testing confirmed that her impairment was not related to a general deficit in lower-level perception, object perception, basic-level recognition, or memory. AW's results provide the first neuropsychological evidence that recognition memory for non-face visual object categories can be selectively impaired in individuals without brain damage or other memory impairment. 
These results indicate that the development of recognition memory for faces does not depend on intact object recognition memory and provide further evidence for category-specific dissociations in visual recognition. Copyright © 2010 Elsevier Srl. All rights reserved.
Similarity-Based Fusion of MEG and fMRI Reveals Spatio-Temporal Dynamics in Human Cortex During Visual Object Recognition

PubMed Central

Cichy, Radoslaw Martin; Pantazis, Dimitrios; Oliva, Aude

2016-01-01

Every human cognitive function, such as visual object recognition, is realized in a complex spatio-temporal activity pattern in the brain. Current brain imaging techniques in isolation cannot resolve the brain's spatio-temporal dynamics, because they provide either high spatial or temporal resolution but not both. To overcome this limitation, we developed an integration approach that uses representational similarities to combine measurements of magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI) to yield a spatially and temporally integrated characterization of neuronal activation. Applying this approach to 2 independent MEG–fMRI data sets, we observed that neural activity first emerged in the occipital pole at 50–80 ms, before spreading rapidly and progressively in the anterior direction along the ventral and dorsal visual streams. Further region-of-interest analyses established that dorsal and ventral regions showed MEG–fMRI correspondence in representations later than early visual cortex. Together, these results provide a novel and comprehensive, spatio-temporally resolved view of the rapid neural dynamics during the first few hundred milliseconds of object vision. They further demonstrate the feasibility of spatially unbiased representational similarity-based fusion of MEG and fMRI, promising new insights into how the brain computes complex cognitive functions. PMID:27235099
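The fusion idea in this abstract, comparing measurements through their representational similarities, can be sketched as follows. Each modality is summarised by a representational dissimilarity matrix (RDM) over the same set of stimulus conditions, and the MEG RDM at each time point is then rank-correlated with the fMRI RDM of a region. The choice of 1 minus Pearson correlation as the dissimilarity, Spearman correlation for the comparison, and all function names are assumptions made for illustration; the abstract does not specify them, and the data layout here (plain lists of numbers) is hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (undefined if one is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rdm(patterns):
    """Upper triangle of the RDM: 1 - Pearson r for every pair of condition patterns."""
    return [1.0 - pearson(p, q) for p, q in combinations(patterns, 2)]

def ranks(values):
    """1-based ranks in which tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def fuse(meg_patterns_by_time, fmri_patterns):
    """One similarity value per MEG time point for a single fMRI region.

    meg_patterns_by_time[t][c] is the sensor pattern for condition c at time t;
    fmri_patterns[c] is the voxel pattern for condition c.
    """
    fmri_rdm = rdm(fmri_patterns)
    return [spearman(rdm(meg), fmri_rdm) for meg in meg_patterns_by_time]
```

Because only the condition-by-condition dissimilarity structure is compared, the MEG sensor patterns and fMRI voxel patterns never need to share a common measurement space, which is what makes the two modalities combinable.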
An Exemplar-Based Multi-View Domain Generalization Framework for Visual Recognition.

PubMed

Niu, Li; Li, Wen; Xu, Dong; Cai, Jianfei

2018-02-01

In this paper, we propose a new exemplar-based multi-view domain generalization (EMVDG) framework for visual recognition by learning robust classifiers that are able to generalize well to arbitrary target domains based on training samples with multiple types of features (i.e., multi-view features). In this framework, we aim to address two issues simultaneously. First, the distribution of training samples (i.e., the source domain) is often considerably different from that of testing samples (i.e., the target domain), so the performance of the classifiers learnt on the source domain may drop significantly on the target domain. Moreover, the testing data are often unseen during the training procedure. Second, when the training data are associated with multi-view features, the recognition performance can be further improved by exploiting the relation among multiple types of features. To address the first issue, considering that it has been shown that fusing multiple SVM classifiers can enhance the domain generalization ability, we build our EMVDG framework upon exemplar SVMs (ESVMs), in which a set of ESVM classifiers are learnt with each one trained based on one positive training sample and all the negative training samples. When the source domain contains multiple latent domains, the learnt ESVM classifiers are expected to be grouped into multiple clusters. To address the second issue, we propose two approaches under the EMVDG framework based on the consensus principle and the complementary principle, respectively. Specifically, we propose an EMVDG_CO method by adding a co-regularizer to enforce the cluster structures of ESVM classifiers on different views to be consistent based on the consensus principle. Inspired by multiple kernel learning, we also propose another EMVDG_MK method by fusing the ESVM classifiers from different views based on the complementary principle. 
In addition, we further extend our EMVDG framework to exemplar-based multi-view domain adaptation (EMVDA) framework when the unlabeled target domain data are available during the training procedure. The effectiveness of our EMVDG and EMVDA frameworks for visual recognition is clearly demonstrated by comprehensive experiments on three benchmark data sets.
Use of the recognition heuristic depends on the domain's recognition validity, not on the recognition validity of selected sets of objects.

PubMed

Pohl, Rüdiger F; Michalkiewicz, Martha; Erdfelder, Edgar; Hilbig, Benjamin E

2017-07-01

According to the recognition-heuristic theory, decision makers solve paired comparisons in which one object is recognized and the other not by recognition alone, inferring that recognized objects have higher criterion values than unrecognized ones. However, success, and thus usefulness, of this heuristic depends on the validity of recognition as a cue, and adaptive decision making, in turn, requires that decision makers are sensitive to it. To this end, decision makers could base their evaluation of the recognition validity either on the selected set of objects (the set's recognition validity), or on the underlying domain from which the objects were drawn (the domain's recognition validity). In two experiments, we manipulated the recognition validity both in the selected set of objects and between domains from which the sets were drawn. The results clearly show that use of the recognition heuristic depends on the domain's recognition validity, not on the set's recognition validity. In other words, participants treat all sets as roughly representative of the underlying domain and adjust their decision strategy adaptively (only) with respect to the more general environment rather than the specific items they are faced with.
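The distinction between a set's and a domain's recognition validity is easier to see with numbers. Recognition validity is commonly defined as the proportion of pairs, among those in which exactly one object is recognized, where the recognized object has the higher criterion value. The sketch below computes it for a small made-up domain and for a subset drawn from it. The data, the function name, and the decision to skip pairs with tied criterion values are illustrative assumptions, not taken from the study.

```python
from itertools import combinations

def recognition_validity(objects):
    """objects: list of (criterion_value, recognized) tuples.

    Returns the share of one-recognized pairs in which the recognized object
    has the higher criterion value, or None if there are no such pairs.
    """
    correct = total = 0
    for (value_a, rec_a), (value_b, rec_b) in combinations(objects, 2):
        if rec_a == rec_b or value_a == value_b:
            continue  # the heuristic only applies when exactly one is recognized
        total += 1
        recognized, other = (value_a, value_b) if rec_a else (value_b, value_a)
        correct += recognized > other
    return correct / total if total else None

if __name__ == "__main__":
    # Hypothetical domain: (city population in millions, recognized by the participant?)
    domain = [(3.5, True), (1.8, True), (0.6, False), (2.0, False)]
    selected_set = [(1.8, True), (2.0, False)]
    print(recognition_validity(domain))        # 0.75
    print(recognition_validity(selected_set))  # 0.0
```

In this toy case the heuristic is right in three of four applicable pairs across the domain but wrong in the only pair of the selected set, which is the kind of mismatch the experiments manipulate.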
Perceptual Integration Deficits in Autism Spectrum Disorders Are Associated with Reduced Interhemispheric Gamma-Band Coherence.

PubMed

Peiker, Ina; David, Nicole; Schneider, Till R; Nolte, Guido; Schöttle, Daniel; Engel, Andreas K

2015-12-16

The integration of visual details into a holistic percept is essential for object recognition. This integration has been reported as a key deficit in patients with autism spectrum disorders (ASDs). The weak central coherence account posits an altered disposition to integrate features into a coherent whole in ASD. Here, we test the hypothesis that such weak perceptual coherence may be reflected in weak neural coherence across different cortical sites. We recorded magnetoencephalography from 20 adult human participants with ASD and 20 matched controls, who performed a slit-viewing paradigm, in which objects gradually passed behind a vertical or horizontal slit so that only fragments of the object were visible at any given moment. Object recognition thus required perceptual integration over time and, in case of the horizontal slit, also across visual hemifields. ASD participants were selectively impaired in the horizontal slit condition, indicating specific difficulties in long-range synchronization between the hemispheres. Specifically, the ASD group failed to show condition-related enhancement of imaginary coherence between the posterior superior temporal sulci in both hemispheres during horizontal slit-viewing in contrast to controls. Moreover, local synchronization reflected in occipitocerebellar beta-band power was selectively reduced for horizontal compared with vertical slit-viewing in ASD. Furthermore, we found disturbed connectivity between right posterior superior temporal sulcus and left cerebellum. Together, our results suggest that perceptual integration deficits co-occur with specific patterns of abnormal global and local synchronization in ASD. The weak central coherence account proposes a tendency of individuals with autism spectrum disorders (ASDs) to focus on details at the cost of an integrated coherent whole. 
Here, we provide evidence, at the behavioral and the neural level, that visual integration in object recognition is impaired in ASD, when details had to be integrated across both visual hemifields. We found enhanced interhemispheric gamma-band coherence in typically developed participants when communication between cortical hemispheres was required by the task. Importantly, participants with ASD failed to show this enhanced coherence between bilateral posterior superior temporal sulci. The findings suggest that visual integration is disturbed at the local and global synchronization scale, which might bear implications for object recognition in ASD. Copyright © 2015 the authors.
Reconciling change blindness with long-term memory for objects.

PubMed

Wood, Katherine; Simons, Daniel J

2017-02-01

How can we reconcile remarkably precise long-term memory for thousands of images with failures to detect changes to similar images? We explored whether people can use detailed, long-term memory to improve change detection performance. Subjects studied a set of images of objects and then performed recognition and change detection tasks with those images. Recognition memory performance exceeded change detection performance, even when a single familiar object in the postchange display consistently indicated the change location. In fact, participants were no better when a familiar object predicted the change location than when the displays consisted of unfamiliar objects. When given an explicit strategy to search for a familiar object as a way to improve performance on the change detection task, they performed no better than in a 6-alternative recognition memory task. Subjects only benefited from the presence of familiar objects in the change detection task when they had more time to view the prechange array before it switched. Once the cost to using the change detection information decreased, subjects made use of it in conjunction with memory to boost performance on the familiar-item change detection task. This suggests that even useful information will go unused if it is sufficiently difficult to extract.
Target recognitions in multiple-camera closed-circuit television using color constancy

NASA Astrophysics Data System (ADS)

Soori, Umair; Yuen, Peter; Han, Ji Wen; Ibrahim, Izzati; Chen, Wentao; Hong, Kan; Merfort, Christian; James, David; Richardson, Mark

2013-04-01

People tracking in crowded scenes from closed-circuit television (CCTV) footage has been a popular and challenging task in computer vision. Due to the limited spatial resolution in the CCTV footage, the color of people's dress may offer an alternative feature for their recognition and tracking. However, there are many factors, such as variable illumination conditions, viewing angles, and camera calibration, that may induce illusive modification of intrinsic color signatures of the target. Our objective is to recognize and track targets in multiple camera views using color as the detection feature, and to understand if a color constancy (CC) approach may help to reduce these color illusions due to illumination and camera artifacts and thereby improve target recognition performance. We have tested a number of CC algorithms using various color descriptors to assess the efficiency of target recognition from a real multicamera Imagery Library for Intelligent Detection Systems (i-LIDS) data set. Various classifiers have been used for target detection, and the figure of merit to assess the efficiency of target recognition is achieved through the area under the receiver operating characteristics (AUROC). We have proposed two modifications of luminance-based CC algorithms: one with a color transfer mechanism and the other using a pixel-wise sigmoid function for an adaptive dynamic range compression, a method termed enhanced luminance reflectance CC (ELRCC). We found that both algorithms improve the efficiency of target recognitions substantially better than that of the raw data without CC treatment, and in some cases the ELRCC improves target tracking by over 100% within the AUROC assessment metric. 
The performance of the ELRCC has been assessed over 10 selected targets from three different camera views of the i-LIDS footage, and the averaged target recognition efficiency over all these targets is found to be improved by about 54% in AUROC after the data are processed by the proposed ELRCC algorithm. This amount of improvement represents a reduction of probability of false alarm by about a factor of 5 at the probability of detection of 0.5. Our study concerns mainly the detection of colored targets; and issues for the recognition of white or gray targets will be addressed in a forthcoming study.
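The AUROC figure of merit used above can be illustrated with a minimal sketch. It uses the rank interpretation of AUROC: the probability that a randomly chosen target outscores a randomly chosen non-target. The function name `auroc` and the example scores are hypothetical and are not taken from the study.

```python
def auroc(target_scores, clutter_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (target, non-target) pairs in which the
    target receives the higher classifier score; ties count as half.
    """
    wins = 0.0
    for t in target_scores:
        for c in clutter_scores:
            if t > c:
                wins += 1.0
            elif t == c:
                wins += 0.5
    return wins / (len(target_scores) * len(clutter_scores))


# Hypothetical classifier scores, for illustration only.
raw = auroc([0.9, 0.4], [0.5, 0.1])
perfect = auroc([0.9, 0.8], [0.2, 0.1])
```

A value of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, so a relative AUROC improvement is bounded by how far the raw-data baseline sits below 1.0.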
Effects of Power on Mental Rotation and Emotion Recognition in Women.

PubMed

Nissan, Tali; Shapira, Oren; Liberman, Nira

2015-10-01

Based on construal-level theory (CLT) and its view of power as an instance of social distance, we predicted that high, relative to low, power would enhance women's mental-rotation performance and impede their emotion-recognition performance. The predicted effects of power emerged both when it was manipulated via a recall priming task (Study 1) and environmental cues (Studies 2 and 3). Studies 3 and 4 found evidence for mediation by construal level of the effect of power on emotion recognition but not on mental rotation. We discuss potential mediating mechanisms for these effects based on both the social distance/construal level and the approach/inhibition views of power. We also discuss implications for optimizing performance on mental rotation and emotion recognition in everyday life. © 2015 by the Society for Personality and Social Psychology, Inc.
Parts and Relations in Young Children's Shape-Based Object Recognition

ERIC Educational Resources Information Center

Augustine, Elaine; Smith, Linda B.; Jones, Susan S.

2011-01-01

The ability to recognize common objects from sparse information about geometric shape emerges during the same period in which children learn object names and object categories. Hummel and Biederman's (1992) theory of object recognition proposes that the geometric shapes of objects have two components--geometric volumes representing major object…
It Takes Two–Skilled Recognition of Objects Engages Lateral Areas in Both Hemispheres

PubMed Central

Bilalić, Merim; Kiesel, Andrea; Pohl, Carsten; Erb, Michael; Grodd, Wolfgang

2011-01-01

Our object recognition abilities, a direct product of our experience with objects, are fine-tuned to perfection. Left temporal and lateral areas along the dorsal, action related stream, as well as left infero-temporal areas along the ventral, object related stream are engaged in object recognition. Here we show that expertise modulates the activity of dorsal areas in the recognition of man-made objects with clearly specified functions. Expert chess players were faster than chess novices in identifying chess objects and their functional relations. Experts' advantage was domain-specific as there were no differences between groups in a control task featuring geometrical shapes. The pattern of eye movements supported the notion that experts' extensive knowledge about domain objects and their functions enabled superior recognition even when experts were not directly fixating the objects of interest. Functional magnetic resonance imaging (fMRI) related exclusively the areas along the dorsal stream to chess specific object recognition. Besides the commonly involved left temporal and parietal lateral brain areas, we found that only in experts homologous areas on the right hemisphere were also engaged in chess specific object recognition. Based on these results, we discuss whether skilled object recognition does not only involve a more efficient version of the processes found in non-skilled recognition, but also qualitatively different cognitive processes which engage additional brain areas. PMID:21283683
View-Invariant Gait Recognition Through Genetic Template Segmentation

NASA Astrophysics Data System (ADS)

Isaac, Ebenezer R. H. P.; Elias, Susan; Rajagopalan, Srinivasan; Easwarakumar, K. S.

2017-08-01

The template-based, model-free approach provides by far the most successful solution to the gait recognition problem in the literature. Recent work discusses how isolating the head and leg portions of the template increases the performance of a gait recognition system, making it robust against covariates like clothing and carrying conditions. However, most such methods involve a manual definition of the boundaries. The method we propose, genetic template segmentation (GTS), employs a genetic algorithm to automate the boundary selection process. This method was tested on the GEI, GEnI and AEI templates. GEI seems to exhibit the best result when segmented with our approach. Experimental results show that our approach significantly outperforms the existing implementations of view-invariant gait recognition.
Trajectory Recognition as the Basis for Object Individuation: A Functional Model of Object File Instantiation and Object-Token Encoding

PubMed Central

Fields, Chris

2011-01-01

The perception of persisting visual objects is mediated by transient intermediate representations, object files, that are instantiated in response to some, but not all, visual trajectories. The standard object file concept does not, however, provide a mechanism sufficient to account for all experimental data on visual object persistence, object tracking, and the ability to perceive spatially disconnected stimuli as continuously existing objects. Based on relevant anatomical, functional, and developmental data, a functional model is constructed that bases visual object individuation on the recognition of temporal sequences of apparent center-of-mass positions that are specifically identified as trajectories by dedicated “trajectory recognition networks” downstream of the medial–temporal motion-detection area. This model is shown to account for a wide range of data, and to generate a variety of testable predictions. Individual differences in the recognition, abstraction, and encoding of trajectory information are expected to generate distinct object persistence judgments and object recognition abilities. Dominance of trajectory information over feature information in stored object tokens during early infancy, in particular, is expected to disrupt the ability to re-identify human and other individuals across perceptual episodes, and lead to developmental outcomes with characteristics of autism spectrum disorders. PMID:21716599
Perceptual Plasticity for Auditory Object Recognition

PubMed Central

Heald, Shannon L. M.; Van Hedger, Stephen C.; Nusbaum, Howard C.

2017-01-01

In our auditory environment, we rarely experience the exact acoustic waveform twice. This is especially true for communicative signals that have meaning for listeners. In speech and music, the acoustic signal changes as a function of the talker (or instrument), speaking (or playing) rate, and room acoustics, to name a few factors. Yet, despite this acoustic variability, we are able to recognize a sentence or melody as the same across various kinds of acoustic inputs and determine meaning based on listening goals, expectations, context, and experience. The recognition process relates acoustic signals to prior experience despite variability in signal-relevant and signal-irrelevant acoustic properties, some of which could be considered as “noise” in service of a recognition goal. However, some acoustic variability, if systematic, is lawful and can be exploited by listeners to aid in recognition. Perceivable changes in systematic variability can herald a need for listeners to reorganize perception and reorient their attention to more immediately signal-relevant cues. This view is not incorporated currently in many extant theories of auditory perception, which traditionally reduce psychological or neural representations of perceptual objects and the processes that act on them to static entities. While this reduction is likely done for the sake of empirical tractability, such a reduction may seriously distort the perceptual process to be modeled. We argue that perceptual representations, as well as the processes underlying perception, are dynamically determined by an interaction between the uncertainty of the auditory signal and constraints of context. This suggests that the process of auditory recognition is highly context-dependent in that the identity of a given auditory object may be intrinsically tied to its preceding context. 
To argue for the flexible neural and psychological updating of sound-to-meaning mappings across speech and music, we draw upon examples of perceptual categories that are thought to be highly stable. This framework suggests that the process of auditory recognition cannot be divorced from the short-term context in which an auditory object is presented. Implications for auditory category acquisition and extant models of auditory perception, both cognitive and neural, are discussed. PMID:28588524
Integrated approach for automatic target recognition using a network of collaborative sensors.

PubMed

Mahalanobis, Abhijit; Van Nevel, Alan

2006-10-01

We introduce what is believed to be a novel concept by which several sensors with automatic target recognition (ATR) capability collaborate to recognize objects. Such an approach would be suitable for netted systems in which the sensors and platforms can coordinate to optimize end-to-end performance. We use correlation filtering techniques to facilitate the development of the concept, although other ATR algorithms may be easily substituted. Essentially, a self-configuring geometry of netted platforms is proposed that positions the sensors optimally with respect to each other, and takes into account the interactions among the sensor, the recognition algorithms, and the classes of the objects to be recognized. We show how such a paradigm optimizes overall performance, and illustrate the collaborative ATR scheme for recognizing targets in synthetic aperture radar imagery by using viewing position as a sensor parameter.

High-order distance-based multiview stochastic learning in image classification.

PubMed

Yu, Jun; Rui, Yong; Tang, Yuan Yan; Tao, Dacheng

2014-12-01

How do we find all images in a larger set of images which have a specific content? Or estimate the position of a specific object relative to the camera? Image classification methods, like support vector machine (supervised) and transductive support vector machine (semi-supervised), are invaluable tools for the applications of content-based image retrieval, pose estimation, and optical character recognition. However, these methods can only handle images represented by a single feature. In many cases, different features (or multiview data) can be obtained, and how to utilize them efficiently is a challenge. The traditional concatenation scheme, which links the features of different views into one long vector, is inappropriate because each view has its own statistical properties and physical interpretation. In this paper, we propose a high-order distance-based multiview stochastic learning (HD-MSL) method for image classification. HD-MSL effectively combines varied features into a unified representation and integrates the labeling information based on a probabilistic framework. In comparison with the existing strategies, our approach adopts the high-order distance obtained from the hypergraph to replace pairwise distance in estimating the probability matrix of data distribution. In addition, the proposed approach can automatically learn a combination coefficient for each view, which plays an important role in utilizing the complementary information of multiview data. An alternative optimization is designed to solve the objective functions of HD-MSL and obtain the coefficients of the different views and the classification scores simultaneously. Experiments on two real world datasets demonstrate the effectiveness of HD-MSL in image classification.
Object Recognition using Feature- and Color-Based Methods

NASA Technical Reports Server (NTRS)

Duong, Tuan; Duong, Vu; Stubberud, Allen

2008-01-01

An improved adaptive method of processing image data in an artificial neural network has been developed to enable automated, real-time recognition of possibly moving objects under changing (including suddenly changing) conditions of illumination and perspective. The method involves a combination of two prior object-recognition methods, one based on adaptive detection of shape features and one based on adaptive color segmentation, to enable recognition in situations in which either prior method by itself may be inadequate. The chosen prior feature-based method is known as adaptive principal-component analysis (APCA); the chosen prior color-based method is known as adaptive color segmentation (ACOSE). These methods are made to interact with each other in a closed-loop system to obtain an optimal solution of the object-recognition problem in a dynamic environment. One of the results of the interaction is to increase, beyond what would otherwise be possible, the accuracy of the determination of a region of interest (containing an object that one seeks to recognize) within an image. Another result is to provide a minimized adaptive step that can be used to update the results obtained by the two component methods when changes of color and apparent shape occur. The net effect is to enable the neural network to update its recognition output and improve its recognition capability via an adaptive learning sequence. In principle, the improved method could readily be implemented in integrated circuitry to make a compact, low-power, real-time object-recognition system. It has been proposed to demonstrate the feasibility of such a system by integrating a 256-by-256 active-pixel sensor with APCA, ACOSE, and neural processing circuitry on a single chip. It has been estimated that such a system on a chip would have a volume no larger than a few cubic centimeters, could operate at a rate as high as 1,000 frames per second, and would consume on the order of milliwatts of power.
Orientation congruency effects for familiar objects: coordinate transformations in object recognition.

PubMed

Graf, M; Kaping, D; Bülthoff, H H

2005-03-01

How do observers recognize objects after spatial transformations? Recent neurocomputational models have proposed that object recognition is based on coordinate transformations that align memory and stimulus representations. If the recognition of a misoriented object is achieved by adjusting a coordinate system (or reference frame), then recognition should be facilitated when the object is preceded by a different object in the same orientation. In the two experiments reported here, two objects were presented in brief masked displays that were in close temporal contiguity; the objects were in either congruent or incongruent picture-plane orientations. Results showed that naming accuracy was higher for congruent than for incongruent orientations. The congruency effect was independent of superordinate category membership (Experiment 1) and was found for objects with different main axes of elongation (Experiment 2). The results indicate congruency effects for common familiar objects even when they have dissimilar shapes. These findings are compatible with models in which object recognition is achieved by an adjustment of a perceptual coordinate system.
Computing multiple aggregation levels and contextual features for road facilities recognition using mobile laser scanning data

NASA Astrophysics Data System (ADS)

Yang, Bisheng; Dong, Zhen; Liu, Yuan; Liang, Fuxun; Wang, Yongjun

2017-04-01

Updating the inventory of road infrastructure based on field work is labor intensive, time consuming, and costly. Fortunately, vehicle-based mobile laser scanning (MLS) systems provide an efficient solution to rapidly capture three-dimensional (3D) point clouds of road environments with high flexibility and precision. However, robust recognition of road facilities from huge volumes of 3D point clouds is still a challenging issue because of complicated and incomplete structures, occlusions and varied point densities. Most existing methods utilize point or object based features to recognize object candidates, and can only extract limited types of objects with a relatively low recognition rate, especially for incomplete and small objects. To overcome these drawbacks, this paper proposes a semantic labeling framework that combines multiple aggregation levels (point-segment-object) of features and contextual features to recognize road facilities, such as road surfaces, road boundaries, buildings, guardrails, street lamps, traffic signs, roadside-trees, power lines, and cars, for highway infrastructure inventory. The proposed method first identifies ground and non-ground points, and extracts road surface facilities from ground points. Non-ground points are segmented into individual candidate objects based on the proposed multi-rule region growing method. Then, the multiple aggregation levels of features and the contextual features (relative positions, relative directions, and spatial patterns) associated with each candidate object are calculated and fed into an SVM classifier to label the corresponding candidate object. The recognition performance of combining multiple aggregation levels and contextual features was compared with single level (point, segment, or object) based features using large-scale highway scene point clouds. 
Comparative studies demonstrated that the proposed semantic labeling framework significantly improves road facilities recognition precision (90.6%) and recall (91.2%), particularly for incomplete and small objects.
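For readers unfamiliar with the two metrics reported above, the following minimal sketch shows how precision and recall are computed for one object class from ground-truth and predicted labels. The function name and the label values are hypothetical; the sketch does not reproduce the paper's evaluation protocol.

```python
def precision_recall(true_labels, predicted_labels, target):
    """Precision and recall for a single class.

    precision = TP / (TP + FP): how many predicted targets are correct.
    recall    = TP / (TP + FN): how many actual targets were found.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels for four candidate objects.
truth = ["sign", "sign", "lamp", "car"]
predicted = ["sign", "lamp", "lamp", "sign"]
sign_precision, sign_recall = precision_recall(truth, predicted, "sign")
```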
Toward a Unified Theory of Visual Area V4

PubMed Central

Roe, Anna W.; Chelazzi, Leonardo; Connor, Charles E.; Conway, Bevil R.; Fujita, Ichiro; Gallant, Jack L.; Lu, Haidong; Vanduffel, Wim

2016-01-01

Visual area V4 is a midtier cortical area in the ventral visual pathway. It is crucial for visual object recognition and has been a focus of many studies on visual attention. However, there is no unifying view of V4’s role in visual processing. Neither is there an understanding of how its role in feature processing interfaces with its role in visual attention. This review captures our current knowledge of V4, largely derived from electrophysiological and imaging studies in the macaque monkey. Based on recent discovery of functionally specific domains in V4, we propose that the unifying function of V4 circuitry is to enable selective extraction of specific functional domain-based networks, whether it be by bottom-up specification of object features or by top-down attentionally driven selection. PMID:22500626
Advanced optical correlation and digital methods for pattern matching—50th anniversary of Vander Lugt matched filter

NASA Astrophysics Data System (ADS)

Millán, María S.

2012-10-01

On the verge of the 50th anniversary of Vander Lugt’s formulation for pattern matching based on matched filtering and optical correlation, we acknowledge the very intense research activity developed in the field of correlation-based pattern recognition during this period of time. The paper reviews some domains that appeared as emerging fields in the last years of the 20th century and have been developed later on in the 21st century. Such is the case of three-dimensional (3D) object recognition, biometric pattern matching, optical security and hybrid optical-digital processors. 3D object recognition is a challenging case of multidimensional image recognition because of its implications in the recognition of real-world objects independent of their perspective. Biometric recognition is essentially pattern recognition for which the personal identification is based on the authentication of a specific physiological characteristic possessed by the subject (e.g. fingerprint, face, iris, retina, and multifactor combinations). Biometric recognition often appears combined with encryption-decryption processes to secure information. The optical implementations of correlation-based pattern recognition processes still rely on the 4f-correlator, the joint transform correlator, or some of their variants. But the many applications developed in the field have been pushing the systems for a continuous improvement of their architectures and algorithms, thus leading towards merged optical-digital solutions.
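The core idea of correlation-based pattern matching can be sketched digitally in one dimension: slide a template over a signal, compute the correlation at each offset, and take the peak as the match location. This is only an illustrative digital analogue under simplifying assumptions (unnormalized correlation, 1D data, hypothetical function names), not a model of the optical 4f-correlator or joint transform correlator.

```python
def cross_correlate(signal, template):
    """Correlation score of the template at every valid offset."""
    n, m = len(signal), len(template)
    return [
        sum(signal[i + j] * template[j] for j in range(m))
        for i in range(n - m + 1)
    ]


def best_match(signal, template):
    """Offset at which the correlation score peaks."""
    scores = cross_correlate(signal, template)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical signal containing the pattern [1, 2, 1] at offset 2.
offset = best_match([0, 0, 1, 2, 1, 0, 0], [1, 2, 1])
```

Because the correlation here is unnormalized, uniformly bright regions can produce spuriously high scores; practical systems typically normalize the correlation or use designed filters.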
Strategies for the Interpretive Integration of Ground and Aerial Views in UGV Operations

DTIC Science & Technology

2006-11-01

conjoinment of the psychological processes and effects of perception, object recognition (i.e. Biederman & Gerhardstein, 1993), navigation (Wickens...rather simple geon structural descriptions (GSD, see Biederman & Gerhardstein, 1993). A geon is a basic three dimensional solid shape, such as a...large and reliable sex difference. Behavioral Brain Research, 93, 185-190. Biederman, I., & Gerhardstein, C. (1993). Recognizing depth-rotated objects
Thesis and Antithesis

ERIC Educational Resources Information Center

Baggaley, Jon

2012-01-01

Behind every educational concept an opposing notion is waiting for recognition. Despite their avowed objectives, however, academic debates do not always encourage the discussion of opposing views. A review of sessions at the December 2011 Online Educa Conference illustrates that point and others about academic meetings. Opposing viewpoints may be…
Vision-based object detection and recognition system for intelligent vehicles

NASA Astrophysics Data System (ADS)

Ran, Bin; Liu, Henry X.; Martono, Wilfung

1999-01-01

Recently, a proactive crash mitigation system was proposed to enhance the crash avoidance and survivability of intelligent vehicles. An accurate object detection and recognition system is a prerequisite for a proactive crash mitigation system, as system component deployment algorithms rely on accurate hazard detection, recognition, and tracking information. In this paper, we present a vision-based approach to detect and recognize vehicles and traffic signs, obtain their information, and track multiple objects by using a sequence of color images taken from a moving vehicle. The entire system consists of two sub-systems: the vehicle detection and recognition sub-system and the traffic sign detection and recognition sub-system. Both sub-systems consist of four models: object detection model, object recognition model, object information model, and object tracking model. In order to detect potential objects on the road, several features of the objects are investigated, which include the symmetrical shape and aspect ratio of a vehicle and the color and shape information of the signs. A two-layer neural network is trained to recognize different types of vehicles, and a parameterized traffic sign model is established in the process of recognizing a sign. Tracking is accomplished by combining the analysis of a single image frame with the analysis of consecutive image frames. The analysis of the single image frame is performed every ten full-size images. The information model obtains the information related to the object, such as time to collision for the object vehicle and relative distance from the traffic signs. Experimental results demonstrated a robust and accurate system in real-time object detection and recognition over thousands of image frames.
Comparison of Object Recognition Behavior in Human and Monkey

PubMed Central

Rajalingham, Rishi; Schmidt, Kailyn

2015-01-01

Although the rhesus monkey is used widely as an animal model of human visual processing, it is not known whether invariant visual object recognition behavior is quantitatively comparable across monkeys and humans. To address this question, we systematically compared the core object recognition behavior of two monkeys with that of human subjects. To test true object recognition behavior (rather than image matching), we generated several thousand naturalistic synthetic images of 24 basic-level objects with high variation in viewing parameters and image background. Monkeys were trained to perform binary object recognition tasks on a match-to-sample paradigm. Data from 605 human subjects performing the same tasks on Mechanical Turk were aggregated to characterize “pooled human” object recognition behavior, as well as 33 separate Mechanical Turk subjects to characterize individual human subject behavior. Our results show that monkeys learn each new object in a few days, after which they not only match mean human performance but show a pattern of object confusion that is highly correlated with pooled human confusion patterns and is statistically indistinguishable from individual human subjects. Importantly, this shared human and monkey pattern of 3D object confusion is not shared with low-level visual representations (pixels, V1+; models of the retina and primary visual cortex) but is shared with a state-of-the-art computer vision feature representation. Together, these results are consistent with the hypothesis that rhesus monkeys and humans share a common neural shape representation that directly supports object perception. SIGNIFICANCE STATEMENT To date, several mammalian species have shown promise as animal models for studying the neural mechanisms underlying high-level visual processing in humans. 
In light of this diversity, making tight comparisons between nonhuman and human primates is particularly critical in determining the best use of nonhuman primates to further the goal of the field of translating knowledge gained from animal models to humans. To the best of our knowledge, this study is the first systematic attempt at comparing a high-level visual behavior of humans and macaque monkeys. PMID:26338324
Human-inspired sound environment recognition system for assistive vehicles.

PubMed

Vidal, Eduardo González; Zarricueta, Ernesto Fredes; Cheein, Fernando Auat

2015-02-01

The human auditory system acquires environmental information under sound stimuli faster than visual or touch systems, which in turn, allows for faster human responses to such stimuli. It also complements senses such as sight, where direct line-of-view is necessary to identify objects, in the environment recognition process. This work focuses on implementing human reaction to sound stimuli and environment recognition on assistive robotic devices, such as robotic wheelchairs or robotized cars. These vehicles need environment information to ensure safe navigation. In the field of environment recognition, range sensors (such as LiDAR and ultrasonic systems) and artificial vision devices are widely used; however, these sensors depend on environment constraints (such as lighting variability or color of objects), and sound can provide important information for the characterization of an environment. In this work, we propose a sound-based approach to enhance the environment recognition process, mainly for cases that compromise human integrity, according to the International Classification of Functioning (ICF). Our proposal is based on a neural network implementation that is able to classify up to 15 different environments, each selected according to the ICF considerations on environment factors in the community-based physical activities of people with disabilities. The accuracy rates in environment classification range from 84% to 93%. This classification is later used to constrain assistive vehicle navigation in order to protect the user during daily activities. This work also includes real-time outdoor experimentation (performed on an assistive vehicle) by seven volunteers with different disabilities (but without cognitive impairment and experienced in the use of wheelchairs), statistical validation, comparison with previously published work, and a discussion section where the pros and cons of our system are evaluated. 
The proposed sound-based system is very efficient at providing general descriptions of the environment. Such descriptions are focused on vulnerable situations described by the ICF. The volunteers answered a questionnaire regarding the importance of constraining the vehicle velocities in risky environments, showing that all the volunteers felt comfortable with the system and its performance.
Digital and optical shape representation and pattern recognition; Proceedings of the Meeting, Orlando, FL, Apr. 4-6, 1988

NASA Technical Reports Server (NTRS)

Juday, Richard D. (Editor)

1988-01-01

The present conference discusses topics in pattern-recognition correlator architectures, digital stereo systems, geometric image transformations and their applications, topics in pattern recognition, filter algorithms, object detection and classification, shape representation techniques, and model-based object recognition methods. Attention is given to edge-enhancement preprocessing using liquid crystal TVs, massively-parallel optical data base management, three-dimensional sensing with polar exponential sensor arrays, the optical processing of imaging spectrometer data, hybrid associative memories and metric data models, the representation of shape primitives in neural networks, and the Monte Carlo estimation of moment invariants for pattern recognition.
Invariant visual object recognition: a model, with lighting invariance.

PubMed

Rolls, Edmund T; Stringer, Simon M

2006-01-01

How are invariant representations of objects formed in the visual cortex? We describe a neurophysiological and computational approach which focusses on a feature hierarchy model in which invariant representations can be built by self-organizing learning based on the statistics of the visual input. The model can use temporal continuity in an associative synaptic learning rule with a short term memory trace, and/or it can use spatial continuity in Continuous Transformation learning. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size, and in this paper we show also lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects such as looming, rotation, and object-based movement. The model has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in for example spatial and object search tasks. The model has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene.
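The idea of an associative learning rule with a short-term memory trace can be sketched as follows. This is one common textbook form of a trace rule, assumed here for illustration: the postsynaptic activation is blended with its own recent history, and the weight change is proportional to that trace times the current input. The function name, parameter names, and parameter values are hypothetical and are not taken from the abstract.

```python
def trace_rule_update(weights, inputs, activation, prev_trace,
                      learning_rate=0.1, eta=0.8):
    """One step of a Hebbian-style update using an activity trace.

    trace = (1 - eta) * activation + eta * prev_trace
    w_j  += learning_rate * trace * x_j

    Because the trace carries over from preceding inputs, successive
    views of the same object tend to strengthen the same weights.
    """
    trace = (1.0 - eta) * activation + eta * prev_trace
    new_weights = [w + learning_rate * trace * x
                   for w, x in zip(weights, inputs)]
    return new_weights, trace


# Hypothetical single step: two inputs, only the first is active.
w, tr = trace_rule_update([0.0, 0.0], [1.0, 0.0], activation=1.0,
                          prev_trace=0.0, learning_rate=0.5, eta=0.5)
```

In practice such rules are usually paired with weight normalization so that the weights do not grow without bound; that step is omitted here.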
Familiarity Breeds Attempts: A Critical Review of Dual-Process Theories of Recognition.

PubMed

Mandler, George

2008-09-01

Recognition memory and recall/recollection are the major divisions of the psychology of human memory. Theories of recognition have shifted from a "strength" approach to a dual-process view, which distinguishes between knowing that one has experienced an object before and knowing what it was. In this article, I discuss the history of this approach and the two processes of familiarity and recollection and locate their origin in pattern matching and organization. I evaluate various theories in terms of their basic requirements and their defining research and propose the extension of the original two process theory to domains such as pictorial recognition. Finally, I present the main phenomena that a dual-process theory of recognition must account for and discuss future needs and directions of research and development. © 2008 Association for Psychological Science.
Augmented reality three-dimensional object visualization and recognition with axially distributed sensing.

PubMed

Markman, Adam; Shen, Xin; Hua, Hong; Javidi, Bahram

2016-01-15

An augmented reality (AR) smartglass display combines real-world scenes with digital information enabling the rapid growth of AR-based applications. We present an augmented reality-based approach for three-dimensional (3D) optical visualization and object recognition using axially distributed sensing (ADS). For object recognition, the 3D scene is reconstructed, and feature extraction is performed by calculating the histogram of oriented gradients (HOG) of a sliding window. A support vector machine (SVM) is then used for classification. Once an object has been identified, the 3D reconstructed scene with the detected object is optically displayed in the smartglasses allowing the user to see the object, remove partial occlusions of the object, and provide critical information about the object such as 3D coordinates, which are not possible with conventional AR devices. To the best of our knowledge, this is the first report on combining axially distributed sensing with 3D object visualization and recognition for applications to augmented reality. The proposed approach can have benefits for many applications, including medical, military, transportation, and manufacturing.
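The feature-extraction step mentioned above can be illustrated with a minimal sketch of a histogram of oriented gradients for a single image patch. It assumes a grayscale patch given as a list of rows, central-difference gradients, and nine unsigned orientation bins; the function name and defaults are hypothetical. Block normalization, the sliding window, and the SVM classifier are omitted.

```python
import math


def hog_cell_histogram(patch, n_bins=9):
    """Magnitude-weighted histogram of gradient orientations (0-180 degrees)."""
    height, width = len(patch), len(patch[0])
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    # Skip the one-pixel border so central differences stay in bounds.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The extra modulo guards against an angle that rounds to 180.
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


# Hypothetical 4x4 patch whose intensity increases from left to right.
ramp = [[0, 1, 2, 3] for _ in range(4)]
hist = hog_cell_histogram(ramp)
```

For the left-to-right ramp, all gradient energy falls in the first bin (orientation near 0 degrees); transposing the patch moves it to the bin containing 90 degrees.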
Evidence for the activation of sensorimotor information during visual word recognition: the body-object interaction effect.

PubMed

Siakaluk, Paul D; Pexman, Penny M; Aguilera, Laura; Owen, William J; Sears, Christopher R

2008-01-01

We examined the effects of sensorimotor experience in two visual word recognition tasks. Body-object interaction (BOI) ratings were collected for a large set of words. These ratings assess perceptions of the ease with which a human body can physically interact with a word's referent. A set of high BOI words (e.g., mask) and a set of low BOI words (e.g., ship) were created, matched on imageability and concreteness. Facilitatory BOI effects were observed in lexical decision and phonological lexical decision tasks: responses were faster for high BOI words than for low BOI words. We discuss how our findings may be accounted for by (a) semantic feedback within the visual word recognition system, and (b) an embodied view of cognition (e.g., Barsalou's perceptual symbol systems theory), which proposes that semantic knowledge is grounded in sensorimotor interactions with the environment.
Evidence for view-invariant face recognition units in unfamiliar face learning.

PubMed

Etchells, David B; Brooks, Joseph L; Johnston, Robert A

2017-05-01

Many models of face recognition incorporate the idea of a face recognition unit (FRU), an abstracted representation formed from each experience of a face which aids recognition under novel viewing conditions. Some previous studies have failed to find evidence of this FRU representation. Here, we report three experiments which investigated this theoretical construct by modifying the face learning procedure from that in previous work. During learning, one or two views of previously unfamiliar faces were shown to participants in a serial matching task. Later, participants attempted to recognize both seen and novel views of the learned faces (recognition phase). Experiment 1 tested participants' recognition of a novel view, a day after learning. Experiment 2 was identical, but tested participants on the same day as learning. Experiment 3 repeated Experiment 1, but tested participants on a novel view that was outside the rotation of those views learned. Results revealed a significant advantage, across all experiments, for recognizing a novel view when two views had been learned compared to single view learning. The observed view invariance supports the notion that an FRU representation is established during multi-view face learning under particular learning conditions.
Thoracic lymph node station recognition on CT images based on automatic anatomy recognition with an optimal parent strategy

NASA Astrophysics Data System (ADS)

Xu, Guoping; Udupa, Jayaram K.; Tong, Yubing; Cao, Hanqiang; Odhner, Dewey; Torigian, Drew A.; Wu, Xingyu

2018-03-01

Many papers have been published on the detection and segmentation of lymph nodes from medical images. However, it is still a challenging problem owing to low contrast with surrounding soft tissues and the variations of lymph node size and shape on computed tomography (CT) images. This is particularly difficult on low-dose CT of PET/CT acquisitions. In this study, we utilize our previous automatic anatomy recognition (AAR) framework to recognize the thoracic lymph node stations defined by the International Association for the Study of Lung Cancer (IASLC) lymph node map. The lymph node stations themselves are viewed as anatomic objects and are localized by using a one-shot method in the AAR framework. Two strategies have been taken in this paper for integration into the AAR framework. The first is to combine some lymph node stations into composite lymph node stations according to their geometrical nearness. The other is to find the optimal parent (organ or union of organs) as an anchor for each lymph node station based on the recognition error and thereby find an overall optimal hierarchy to arrange anchor organs and lymph node stations. Based on 28 contrast-enhanced thoracic CT image data sets for model building and 12 independent data sets for testing, our results show that thoracic lymph node stations can be localized within 2-3 voxels compared to the ground truth.
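The optimal-parent strategy described above can be illustrated with a minimal sketch: for each lymph node station, pick the candidate anchor with the lowest recognition error. The station names, anchor names, and error values below are made up for illustration, and the independent per-station minimum is an assumption; the abstract does not say how the overall hierarchy is optimized.

```python
def choose_parents(errors):
    """Map each station to the candidate parent with the lowest recognition error.

    errors: {station: {candidate_parent: mean localization error in voxels}}
    """
    return {station: min(cands, key=cands.get) for station, cands in errors.items()}

# Hypothetical error table (voxels); not data from the study.
example_errors = {
    "station_A": {"trachea": 2.1, "heart": 3.4, "trachea+heart": 2.6},
    "station_B": {"trachea": 4.0, "heart": 2.2, "trachea+heart": 2.9},
}
parents = choose_parents(example_errors)
```

Each station ends up attached to its own anchor, which together define one candidate hierarchy.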
On the road to invariant recognition: explaining tradeoff and morph properties of cells in inferotemporal cortex using multiple-scale task-sensitive attentive learning.

PubMed

Grossberg, Stephen; Markowitz, Jeffrey; Cao, Yongqiang

2011-12-01

Visual object recognition is an essential accomplishment of advanced brains. Object recognition needs to be tolerant, or invariant, with respect to changes in object position, size, and view. In monkeys and humans, a key area for recognition is the anterior inferotemporal cortex (ITa). Recent neurophysiological data show that ITa cells with high object selectivity often have low position tolerance. We propose a neural model whose cells learn to simulate this tradeoff, as well as ITa responses to image morphs, while explaining how invariant recognition properties may arise in stages due to processes across multiple cortical areas. These processes include the cortical magnification factor, multiple receptive field sizes, and top-down attentive matching and learning properties that may be tuned by task requirements to attend to either concrete or abstract visual features with different levels of vigilance. The model predicts that data from the tradeoff and image morph tasks emerge from different levels of vigilance in the animals performing them. This result illustrates how different vigilance requirements of a task may change the course of category learning, notably the critical features that are attended and incorporated into learned category prototypes. The model outlines a path for developing an animal model of how defective vigilance control can lead to symptoms of various mental disorders, such as autism and amnesia. Copyright © 2011 Elsevier Ltd. All rights reserved.
The effect of scene context on episodic object recognition: parahippocampal cortex mediates memory encoding and retrieval success.

PubMed

Hayes, Scott M; Nadel, Lynn; Ryan, Lee

2007-01-01

Previous research has investigated intentional retrieval of contextual information and contextual influences on object identification and word recognition, yet few studies have investigated context effects in episodic memory for objects. To address this issue, unique objects embedded in a visually rich scene or on a white background were presented to participants. At test, objects were presented either in the original scene or on a white background. A series of behavioral studies with young adults demonstrated a context shift decrement (CSD): decreased recognition performance when context is changed between encoding and retrieval. The CSD was not attenuated by encoding or retrieval manipulations, suggesting that binding of object and context may be automatic. A final experiment explored the neural correlates of the CSD, using functional Magnetic Resonance Imaging. Parahippocampal cortex (PHC) activation (right greater than left) during incidental encoding was associated with subsequent memory of objects in the context shift condition. Greater activity in right PHC was also observed during successful recognition of objects previously presented in a scene. Finally, a subset of regions activated during scene encoding, such as bilateral PHC, was reactivated when the object was presented on a white background at retrieval. Although participants were not required to intentionally retrieve contextual information, the results suggest that PHC may reinstate visual context to mediate successful episodic memory retrieval. The CSD is attributed to automatic and obligatory binding of object and context. The results suggest that PHC is important not only for processing of scene information, but also plays a role in successful episodic memory encoding and retrieval. These findings are consistent with the view that spatial information is stored in the hippocampal complex, one of the central tenets of Multiple Trace Theory. (c) 2007 Wiley-Liss, Inc.

3D Object Recognition: Symmetry and Virtual Views

DTIC Science & Technology

1992-12-01

NAME(S) AND ADDRESS(ES) 8. PERFORMING ORGANIZATION Artificial Intelligence Laboratory REPORT NUMBER 545 Technology Square AIM 1409 Cambridge... ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING A.I. Memo No. 1409 December 1992 C.B.C.L. Paper No. 76 3D Object... research done within the Center for Biological and Computational Learning in the Department of Brain and Cognitive Sciences, and at the Artificial
Complementary Hemispheric Asymmetries in Object Naming and Recognition: A Voxel-Based Correlational Study

ERIC Educational Resources Information Center

Acres, K.; Taylor, K. I.; Moss, H. E.; Stamatakis, E. A.; Tyler, L. K.

2009-01-01

Cognitive neuroscientific research proposes complementary hemispheric asymmetries in naming and recognising visual objects, with a left temporal lobe advantage for object naming and a right temporal lobe advantage for object recognition. Specifically, it has been proposed that the left inferior temporal lobe plays a mediational role linking…
Automatic anatomy recognition using neural network learning of object relationships via virtual landmarks

NASA Astrophysics Data System (ADS)

Yan, Fengxia; Udupa, Jayaram K.; Tong, Yubing; Xu, Guoping; Odhner, Dewey; Torigian, Drew A.

2018-03-01

The recently developed body-wide Automatic Anatomy Recognition (AAR) methodology depends on fuzzy modeling of individual objects, hierarchically arranging objects, constructing an anatomy ensemble of these models, and a dichotomous object recognition-delineation process. The parent-to-offspring spatial relationship in the object hierarchy is crucial in the AAR method. We have found this relationship to be quite complex, and as such any improvement in capturing this relationship information in the anatomy model will improve the process of recognition itself. Currently, the method encodes this relationship based on the layout of the geometric centers of the objects. Motivated by the concept of virtual landmarks (VLs), this paper presents a new one-shot AAR recognition method that utilizes the VLs to learn object relationships by training a neural network to predict the pose and the VLs of an offspring object given the VLs of the parent object in the hierarchy. We set up two neural networks for each parent-offspring object pair in a body region, one for predicting the VLs and another for predicting the pose parameters. The VL-based learning/prediction method is evaluated on two object hierarchies involving 14 objects. We utilize 54 computed tomography (CT) image data sets of head and neck cancer patients and the associated object contours drawn by dosimetrists for routine radiation therapy treatment planning. The VL neural network method is found to yield more accurate object localization than the currently used simple AAR method.
Sensor agnostic object recognition using a map seeking circuit

NASA Astrophysics Data System (ADS)

Overman, Timothy L.; Hart, Michael

2012-05-01

Automatic object recognition capabilities are traditionally tuned to exploit the specific sensing modality they were designed for. Their successes (and shortcomings) are tied to object segmentation from the background, they typically require highly skilled personnel to train them, and they become cumbersome with the introduction of new objects. In this paper we describe a sensor-independent algorithm based on the biologically inspired technology of map seeking circuits (MSC) which overcomes many of these obstacles. In particular, the MSC concept offers transparency in object recognition from a common interface to all sensor types, analogous to a USB device. It also provides a common core framework that is independent of the sensor and expandable to support high dimensionality decision spaces. Ease in training is assured by using commercially available 3D models from the video game community. The search time remains linear no matter how many objects are introduced, ensuring rapid object recognition. Here, we report results of an MSC algorithm applied to object recognition and pose estimation from high range resolution radar (1D), electrooptical imagery (2D), and LIDAR point clouds (3D) separately. By abstracting the sensor phenomenology from the underlying a priori knowledge base, MSC shows promise as an easily adaptable tool for incorporating additional sensor inputs.
Item-method directed forgetting: Effects at retrieval?

PubMed

Taylor, Tracy L; Cutmore, Laura; Pries, Lotta

2018-02-01

In an item-method directed forgetting paradigm, words are presented one at a time, each followed by an instruction to Remember or Forget; a directed forgetting effect is measured as better subsequent memory for Remember words than Forget words. The dominant view is that the directed forgetting effect arises during encoding due to selective rehearsal of Remember over Forget items. In three experiments we attempted to falsify a strong view that directed forgetting effects in recognition are due only to encoding mechanisms when an item method is used. Across the three experiments we tested for retrieval-based processes by colour-coding the recognition test items. Black colour provided no information; green colour cued a potential Remember item; and red colour cued a potential Forget item. Recognition cues were mixed within-blocks in Experiment 1 and between-blocks in Experiments 2 and 3; Experiment 3 added explicit feedback on the accuracy of the recognition decision. Although overall recognition improved with cueing when explicit test performance feedback was added in Experiment 3, in no case was the magnitude of the directed forgetting effect influenced by recognition cueing. Our results argue against a role for retrieval-based strategies that limit recognition of Forget items at test and posit a role for encoding intentions only. Copyright © 2017 Elsevier B.V. All rights reserved.
Mechanisms of object recognition: what we have learned from pigeons

PubMed Central

Soto, Fabian A.; Wasserman, Edward A.

2014-01-01

Behavioral studies of object recognition in pigeons have been conducted for 50 years, yielding a large body of data. Recent work has been directed toward synthesizing this evidence and understanding the visual, associative, and cognitive mechanisms that are involved. The outcome is that pigeons are likely to be the non-primate species for which the computational mechanisms of object recognition are best understood. Here, we review this research and suggest that a core set of mechanisms for object recognition might be present in all vertebrates, including pigeons and people, making pigeons an excellent candidate model to study the neural mechanisms of object recognition. Behavioral and computational evidence suggests that error-driven learning participates in object category learning by pigeons and people, and recent neuroscientific research suggests that the basal ganglia, which are homologous in these species, may implement error-driven learning of stimulus-response associations. Furthermore, learning of abstract category representations can be observed in pigeons and other vertebrates. Finally, there is evidence that feedforward visual processing, a central mechanism in models of object recognition in the primate ventral stream, plays a role in object recognition by pigeons. We also highlight differences between pigeons and people in object recognition abilities, and propose candidate adaptive specializations which may explain them, such as holistic face processing and rule-based category learning in primates. From a modern comparative perspective, such specializations are to be expected regardless of the model species under study. The fact that we have a good idea of which aspects of object recognition differ in people and pigeons should be seen as an advantage over other animal models. From this perspective, we suggest that there is much to learn about human object recognition from studying the “simple” brains of pigeons. PMID:25352784
How does the brain rapidly learn and reorganize view-invariant and position-invariant object representations in the inferotemporal cortex?

PubMed

Cao, Yongqiang; Grossberg, Stephen; Markowitz, Jeffrey

2011-12-01

All primates depend for their survival on being able to rapidly learn about and recognize objects. Objects may be visually detected at multiple positions, sizes, and viewpoints. How does the brain rapidly learn and recognize objects while scanning a scene with eye movements, without causing a combinatorial explosion in the number of cells that are needed? How does the brain avoid the problem of erroneously classifying parts of different objects together at the same or different positions in a visual scene? In monkeys and humans, a key area for such invariant object category learning and recognition is the inferotemporal cortex (IT). A neural model is proposed to explain how spatial and object attention coordinate the ability of IT to learn invariant category representations of objects that are seen at multiple positions, sizes, and viewpoints. The model clarifies how interactions within a hierarchy of processing stages in the visual brain accomplish this. These stages include the retina, lateral geniculate nucleus, and cortical areas V1, V2, V4, and IT in the brain's What cortical stream, as they interact with spatial attention processes within the parietal cortex of the Where cortical stream. The model builds upon the ARTSCAN model, which proposed how view-invariant object representations are generated. The positional ARTSCAN (pARTSCAN) model proposes how the following additional processes in the What cortical processing stream also enable position-invariant object representations to be learned: IT cells with persistent activity, and a combination of normalizing object category competition and a view-to-object learning law which together ensure that unambiguous views have a larger effect on object recognition than ambiguous views. The model explains how such invariant learning can be fooled when monkeys, or other primates, are presented with an object that is swapped with another object during eye movements to foveate the original object. 
The swapping procedure is predicted to prevent the reset of spatial attention, which would otherwise keep the representations of multiple objects from being combined by learning. Li and DiCarlo (2008) have presented neurophysiological data from monkeys showing how unsupervised natural experience in a target swapping experiment can rapidly alter object representations in IT. The model quantitatively simulates the swapping data by showing how the swapping procedure fools the spatial attention mechanism. More generally, the model provides a unifying framework, and testable predictions in both monkeys and humans, for understanding object learning data using neurophysiological methods in monkeys, and spatial attention, episodic learning, and memory retrieval data using functional imaging methods in humans. Copyright © 2011 Elsevier Ltd. All rights reserved.
Two Ways to Facial Expression Recognition? Motor and Visual Information Have Different Effects on Facial Expression Recognition.

PubMed

de la Rosa, Stephan; Fademrecht, Laura; Bülthoff, Heinrich H; Giese, Martin A; Curio, Cristóbal

2018-06-01

Motor-based theories of facial expression recognition propose that the visual perception of facial expression is aided by sensorimotor processes that are also used for the production of the same expression. Accordingly, sensorimotor and visual processes should provide congruent emotional information about a facial expression. Here, we report evidence that challenges this view. Specifically, the repeated execution of facial expressions has the opposite effect on the recognition of a subsequent facial expression to that of the repeated viewing of facial expressions. Moreover, the findings of the motor condition, but not of the visual condition, were correlated with a nonsensory condition in which participants imagined an emotional situation. These results can be well accounted for by the idea that facial expression recognition is not always mediated by motor processes but can also be accomplished on the basis of visual information alone.
Breastfeeding experience differentially impacts recognition of happiness and anger in mothers.

PubMed

Krol, Kathleen M; Kamboj, Sunjeev K; Curran, H Valerie; Grossmann, Tobias

2014-11-12

Breastfeeding is a dynamic biological and social process based on hormonal regulation involving oxytocin. While there is much work on the role of breastfeeding in infant development and on the role of oxytocin in socio-emotional functioning in adults, little is known about how breastfeeding impacts emotion perception during motherhood. We therefore examined whether breastfeeding influences emotion recognition in mothers. Using a dynamic emotion recognition task, we found that longer durations of exclusive breastfeeding were associated with faster recognition of happiness, providing evidence for a facilitation of processing positive facial expressions. In addition, we found that a greater number of breastfed meals per day was associated with slower recognition of anger. Our findings are in line with current views of oxytocin function and support accounts that view maternal behaviour as tuned to prosocial responsiveness, by showing that vital elements of maternal care can facilitate rapid responding to affiliative stimuli by reducing the importance of threatening stimuli.
New technologies lead to a new frontier: cognitive multiple data representation

NASA Astrophysics Data System (ADS)

Buffat, S.; Liege, F.; Plantier, J.; Roumes, C.

2005-05-01

The increasing number and complexity of operational sensors (radar, infrared, hyperspectral...) and the availability of huge amounts of data lead to increasingly sophisticated information presentations. But one key element of the IMINT line cannot be improved beyond initial system specification: the operator.... In order to overcome this issue, we have to better understand human visual object representation. Object recognition theories in human vision balance between matching 2D template representations with viewpoint-dependent information, and a viewpoint-invariant system based on structural description. Spatial frequency content is relevant due to early vision filtering. Orientation in depth is an important variable to challenge object constancy. Three objects, seen from three different points of view in a natural environment, made up the original images in this study. Test images were a combination of spatial frequency filtered original images and an additive contrast level of white noise. In the first experiment, the observer's task was a same versus different forced choice with spatial alternative. Test images had the same noise level in a presentation row. Discrimination threshold was determined by modifying the white noise contrast level by means of an adaptive method. In the second experiment, a repetition blindness paradigm was used to further investigate the viewpoint effect on object recognition. The results shed some light on the human visual system processing of objects displayed under different physical descriptions. This is an important achievement because targets that do not always match the physical properties of usual visual stimuli can increase operational workload.
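The abstract does not say which adaptive procedure was used to find the discrimination threshold. As one common possibility, the sketch below assumes a staircase in which two consecutive correct responses raise the noise contrast and a single error lowers it, a rule that converges near 70.7% correct. The function name, the `respond` callback, and all parameter values are hypothetical.

```python
def staircase(respond, start, step, n_reversals=6, max_trials=1000):
    """Estimate a noise-contrast threshold as the mean level at staircase reversals.

    respond(level) -> True if the observer answers correctly at that noise level.
    """
    level, run, direction, reversals = start, 0, 0, []
    for _ in range(max_trials):          # cap trials so the loop always ends
        if len(reversals) >= n_reversals:
            break
        if respond(level):
            run += 1
            if run < 2:
                continue                 # wait for a second correct response
            move = +1                    # two correct in a row: more noise (harder)
        else:
            move = -1                    # one error: less noise (easier)
        run = 0
        if direction and move != direction:
            reversals.append(level)      # direction changed at this level
        direction = move
        level = max(0.0, level + move * step)
    return sum(reversals) / len(reversals) if reversals else None
```

With a real observer, `respond` would present a trial at the given noise contrast and return whether the same/different judgment was correct.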
Visual face-movement sensitive cortex is relevant for auditory-only speech recognition.

PubMed

Riedel, Philipp; Ragert, Patrick; Schelinski, Stefanie; Kiebel, Stefan J; von Kriegstein, Katharina

2015-07-01

It is commonly assumed that the recruitment of visual areas during audition is not relevant for performing auditory tasks ('auditory-only view'). According to an alternative view, however, the recruitment of visual cortices is thought to optimize auditory-only task performance ('auditory-visual view'). This alternative view is based on functional magnetic resonance imaging (fMRI) studies. These studies have shown, for example, that even if there is only auditory input available, face-movement sensitive areas within the posterior superior temporal sulcus (pSTS) are involved in understanding what is said (auditory-only speech recognition). This is particularly the case when speakers are known audio-visually, that is, after brief voice-face learning. Here we tested whether the left pSTS involvement is causally related to performance in auditory-only speech recognition when speakers are known by face. To test this hypothesis, we applied cathodal transcranial direct current stimulation (tDCS) to the pSTS during (i) visual-only speech recognition of a speaker known only visually to participants and (ii) auditory-only speech recognition of speakers they learned by voice and face. We defined the cathode as active electrode to down-regulate cortical excitability by hyperpolarization of neurons. tDCS to the pSTS interfered with visual-only speech recognition performance compared to a control group without pSTS stimulation (tDCS to BA6/44 or sham). Critically, compared to controls, pSTS stimulation additionally decreased auditory-only speech recognition performance selectively for voice-face learned speakers. These results are important in two ways. First, they provide direct evidence that the pSTS is causally involved in visual-only speech recognition; this confirms a long-standing prediction of current face-processing models. Secondly, they show that visual face-sensitive pSTS is causally involved in optimizing auditory-only speech recognition. 
These results are in line with the 'auditory-visual view' of auditory speech perception, which assumes that auditory speech recognition is optimized by using predictions from previously encoded speaker-specific audio-visual internal models. Copyright © 2015 Elsevier Ltd. All rights reserved.
Bottlenose dolphins perceive object features through echolocation.

PubMed

Harley, Heidi E; Putman, Erika A; Roitblat, Herbert L

2003-08-07

How organisms (including people) recognize distant objects is a fundamental question. The correspondence between object characteristics (distal stimuli), like visual shape, and sensory characteristics (proximal stimuli), like retinal projection, is ambiguous. The view that sensory systems are 'designed' to 'pick up' ecologically useful information is vague about how such mechanisms might work. In echolocating dolphins, which are studied as models for object recognition sonar systems, the correspondence between echo characteristics and object characteristics is less clear. Many cognitive scientists assume that object characteristics are extracted from proximal stimuli, but evidence for this remains ambiguous. For example, a dolphin may store 'sound templates' in its brain and identify whole objects by listening for a particular sound. Alternatively, a dolphin's brain may contain algorithms, derived through natural endowments or experience or both, which allow it to identify object characteristics based on sounds. The standard method used to address this question in many species is indirect and has led to equivocal results with dolphins. Here we outline an appropriate method and test it to show that dolphins extract object characteristics directly from echoes.
Recognition of lesion correspondence on two mammographic views: a new method of false-positive reduction for computerized mass detection

NASA Astrophysics Data System (ADS)

Sahiner, Berkman; Petrick, Nicholas; Chan, Heang-Ping; Paquerault, Sophie; Helvie, Mark A.; Hadjiiski, Lubomir M.

2001-07-01

We used the correspondence of detected structures on two views of the same breast for false-positive (FP) reduction in computerized detection of mammographic masses. For each initially detected object on one view, we considered all possible pairings with objects on the other view that fell within a radial band defined by the nipple-to-object distances. We designed a 'correspondence classifier' to classify these pairs as either the same mass (a TP-TP pair) or a mismatch (a TP-FP, FP-TP or FP-FP pair). For each pair, similarity measures of morphological and texture features were derived and used as input features in the correspondence classifier. Two-view mammograms from 94 cases were used as a preliminary data set. Initial detection provided 6.3 FPs/image at 96% sensitivity. Further FP reduction in single view resulted in 1.9 FPs/image at 80% sensitivity and 1.1 FPs/image at 70% sensitivity. By combining single-view detection with the correspondence classifier, detection accuracy improved to 1.5 FPs/image at 80% sensitivity and 0.7 FPs/image at 70% sensitivity. Our preliminary results indicate that the correspondence of geometric, morphological, and textural features of a mass on two different views provides valuable additional information for reducing FPs.
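The pairing step described above (objects on one view matched to objects on the other view within a radial band defined by nipple-to-object distance) might look like the following sketch. The band half-width, the coordinate convention, and the function name are assumptions; the abstract does not give them.

```python
import math

def candidate_pairs(objs_a, nipple_a, objs_b, nipple_b, band_halfwidth):
    """Pair objects across two views whose nipple-to-object distances are similar.

    objs_a, objs_b: lists of (x, y) object centroids on each view.
    nipple_a, nipple_b: (x, y) nipple location on each view.
    Returns index pairs (i, j) to pass on to a correspondence classifier.
    """
    pairs = []
    for i, a in enumerate(objs_a):
        ra = math.dist(a, nipple_a)
        for j, b in enumerate(objs_b):
            rb = math.dist(b, nipple_b)
            if abs(ra - rb) <= band_halfwidth:  # b lies inside the radial band
                pairs.append((i, j))
    return pairs
```

Each surviving pair would then be described by similarity measures of morphological and texture features and classified as the same mass or a mismatch.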
HONTIOR - HIGHER-ORDER NEURAL NETWORK FOR TRANSFORMATION INVARIANT OBJECT RECOGNITION

NASA Technical Reports Server (NTRS)

Spirkovska, L.

1994-01-01

Neural networks have been applied in numerous fields, including transformation invariant object recognition, wherein an object is recognized despite changes in the object's position in the input field, size, or rotation. One of the more successful neural network methods used in invariant object recognition is the higher-order neural network (HONN) method. With a HONN, known relationships are exploited and the desired invariances are built directly into the architecture of the network, eliminating the need for the network to learn invariance to transformations. This results in a significant reduction in the training time required, since the network needs to be trained on only one view of each object, not on numerous transformed views. Moreover, one hundred percent accuracy is guaranteed for images characterized by the built-in distortions, provided that noise is not introduced through pixelation. The program HONTIOR implements a third-order neural network having invariance to translation, scale, and in-plane rotation built directly into the architecture. Thus, for 2-D transformation invariance, the network needs to be trained on only one view of each object. HONTIOR can also be used for 3-D transformation invariant object recognition by training the network only on a set of out-of-plane rotated views. Historically, the major drawback of HONNs has been that the size of the input field was limited to the memory required for the large number of interconnections in a fully connected network. HONTIOR solves this problem by coarse coding the input images (coding an image as a set of overlapping but offset coarser images). Using this scheme, large input fields (4096 x 4096 pixels) can easily be represented using very little virtual memory (30Mb). The HONTIOR distribution consists of three main programs. The first program contains the training and testing routines for a third-order neural network. 
The second program contains the same training and testing procedures as the first, but it also contains a number of functions to display and edit training and test images. Finally, the third program is an auxiliary program which calculates the included angles for a given input field size. HONTIOR is written in the C language, and was originally developed for Sun3 and Sun4 series computers. Both graphic and command line versions of the program are provided. The command line version has been successfully compiled and executed both on computers running the UNIX operating system and on DEC VAX series computers running VMS. The graphic version requires the SunTools windowing environment, and therefore runs only on Sun series computers. The executable for the graphics version of HONTIOR requires 1Mb of RAM. The standard distribution medium for HONTIOR is a .25 inch streaming magnetic tape cartridge in UNIX tar format. It is also available on a 3.5 inch diskette in UNIX tar format. The package includes sample input and output data. HONTIOR was developed in 1991. Sun, Sun3 and Sun4 are trademarks of Sun Microsystems, Inc. UNIX is a registered trademark of AT&T Bell Laboratories. DEC, VAX, and VMS are trademarks of Digital Equipment Corporation.
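The abstract mentions an auxiliary program that calculates included angles. One way to see why such angles matter, assuming (as is common for third-order networks) that pixel triples are grouped by the triangle they form, is that a triangle's interior angles do not change under translation, scaling, or in-plane rotation. The sketch below is in Python rather than HONTIOR's C, and the function names are hypothetical.

```python
import math

def included_angles(p, q, r):
    """Sorted interior angles (degrees) of the triangle with vertices p, q, r."""
    def angle_at(a, b, c):
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - a[0], c[1] - a[1])
        cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
        return math.degrees(math.acos(max(-1.0, min(1.0, cos))))  # clamp rounding
    return sorted([angle_at(p, q, r), angle_at(q, p, r), angle_at(r, p, q)])

def transform(pt, scale, theta, shift):
    """Apply in-plane rotation by theta (radians), uniform scale, then translation."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = pt
    return (scale * (c * x - s * y) + shift[0], scale * (s * x + c * y) + shift[1])

triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
moved = [transform(p, scale=2.5, theta=0.7, shift=(10.0, -3.0)) for p in triangle]
angles_before = included_angles(*triangle)
angles_after = included_angles(*moved)  # matches angles_before up to rounding
```

Triples with the same included angles can share a weight, which is how the invariance can be wired into the architecture instead of being learned.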
Exploring the feasibility of traditional image querying tasks for industrial radiographs

NASA Astrophysics Data System (ADS)

Bray, Iliana E.; Tsai, Stephany J.; Jimenez, Edward S.

2015-08-01

Although there have been great strides in object recognition with optical images (photographs), there has been comparatively little research into object recognition for X-ray radiographs. Our exploratory work contributes to this area by creating an object recognition system designed to recognize components from a related database of radiographs. Object recognition for radiographs must be approached differently than for optical images, because radiographs have much less color-based information to distinguish objects, and they exhibit transmission overlap that alters perceived object shapes. The dataset used in this work contained more than 55,000 intermixed radiographs and photographs, all in a compressed JPEG form and with multiple ways of describing pixel information. For this work, a robust and efficient system is needed to combat problems presented by properties of the X-ray imaging modality, the large size of the given database, and the quality of the images contained in said database. We have explored various pre-processing techniques to clean the cluttered and low-quality images in the database, and we have developed our object recognition system by combining multiple object detection and feature extraction methods. We present the preliminary results of the still-evolving hybrid object recognition system.
RecceMan: an interactive recognition assistance for image-based reconnaissance: synergistic effects of human perception and computational methods for object recognition, identification, and infrastructure analysis

NASA Astrophysics Data System (ADS)

El Bekri, Nadia; Angele, Susanne; Ruckhäberle, Martin; Peinsipp-Byma, Elisabeth; Haelke, Bruno

2015-10-01

This paper introduces an interactive recognition assistance system for imaging reconnaissance. This system supports aerial image analysts on missions during two main tasks: object recognition and infrastructure analysis. Object recognition concentrates on the classification of one single object. Infrastructure analysis deals with the description of the components of an infrastructure and the recognition of the infrastructure type (e.g. military airfield). Based on satellite or aerial images, aerial image analysts are able to extract single object features and thereby recognize different object types. It is one of the most challenging tasks in imaging reconnaissance. Currently, no high-potential ATR (automatic target recognition) applications are available; as a consequence, the human observer cannot be replaced entirely. State-of-the-art ATR applications cannot match human perception and interpretation. Why is this still such a critical issue? First, cluttered and noisy images make it difficult to automatically extract, classify and identify object types. Second, due to changes in warfare and the rise of asymmetric threats, it is nearly impossible to create an underlying data set containing all features, objects or infrastructure types. Many other factors, such as environmental parameters or aspect angles, further complicate the application of ATR. Due to the lack of suitable ATR procedures, the human factor is still important and so far irreplaceable. In order to use the potential benefits of human perception and computational methods in a synergistic way, both are unified in an interactive assistance system. RecceMan® (Reconnaissance Manual) offers two different modes for aerial image analysts on missions: the object recognition mode and the infrastructure analysis mode. The aim of the object recognition mode is to recognize a certain object type based on the object features that originated from the image signatures. 
The infrastructure analysis mode pursues the goal of analyzing the function of the infrastructure. The image analyst visually extracts certain target object signatures, assigns them to corresponding object features and is finally able to recognize the object type. The system offers the analyst the possibility to assign the image signatures to features given by sample images. The underlying data set contains a wide range of object features and object types for different domains like ships or land vehicles. Each domain has its own feature tree developed by aerial image analyst experts. By selecting the corresponding features, the possible solution set of objects is automatically reduced and matches only the objects that contain the selected features. Moreover, we give an outlook on current research in the field of ground target analysis, in which we deal with partly automated methods to extract image signatures and assign them to the corresponding features. This research includes methods for automatically determining the orientation of an object and geometric features like width and length of the object. This step makes it possible to automatically reduce the possible object types offered to the image analyst by the interactive recognition assistance system.
Track Everything: Limiting Prior Knowledge in Online Multi-Object Recognition.

PubMed

Wong, Sebastien C; Stamatescu, Victor; Gatt, Adam; Kearney, David; Lee, Ivan; McDonnell, Mark D

2017-10-01

This paper addresses the problem of online tracking and classification of multiple objects in an image sequence. Our proposed solution is to first track all objects in the scene without relying on object-specific prior knowledge, which in other systems can take the form of hand-crafted features or user-based track initialization. We then classify the tracked objects with a fast-learning image classifier that is based on a shallow convolutional neural network architecture, and demonstrate that object recognition improves when this is combined with object state information from the tracking algorithm. We argue that by transferring the use of prior knowledge from the detection and tracking stages to the classification stage, we can design a robust, general purpose object recognition system with the ability to detect and track a variety of object types. We describe our biologically inspired implementation, which adaptively learns the shape and motion of tracked objects, and apply it to the Neovision2 Tower benchmark data set, which contains multiple object types. An experimental evaluation demonstrates that our approach is competitive with the state-of-the-art video object recognition systems that do make use of object-specific prior knowledge in detection and tracking, while providing additional practical advantages by virtue of its generality.
Impaired recognition of faces and objects in dyslexia: Evidence for ventral stream dysfunction?

PubMed

Sigurdardottir, Heida Maria; Ívarsson, Eysteinn; Kristinsdóttir, Kristjana; Kristjánsson, Árni

2015-09-01

The objective of this study was to establish whether or not dyslexics are impaired at the recognition of faces and other complex nonword visual objects. This would be expected based on a meta-analysis revealing that children and adult dyslexics show functional abnormalities within the left fusiform gyrus, a brain region high up in the ventral visual stream, which is thought to support the recognition of words, faces, and other objects. 20 adult dyslexics (M = 29 years) and 20 matched typical readers (M = 29 years) participated in the study. One dyslexic-typical reader pair was excluded based on Adult Reading History Questionnaire scores and IS-FORM reading scores. Performance was measured on 3 high-level visual processing tasks: the Cambridge Face Memory Test, the Vanderbilt Holistic Face Processing Test, and the Vanderbilt Expertise Test. People with dyslexia are impaired in their recognition of faces and other visually complex objects. Their holistic processing of faces appears to be intact, suggesting that dyslexics may instead be specifically impaired at part-based processing of visual objects. The difficulty that people with dyslexia experience with reading might be the most salient manifestation of a more general high-level visual deficit. (c) 2015 APA, all rights reserved.
Multi-objects recognition for distributed intelligent sensor networks

NASA Astrophysics Data System (ADS)

He, Haibo; Chen, Sheng; Cao, Yuan; Desai, Sachi; Hohil, Myron E.

2008-04-01

This paper proposes an innovative approach for multi-objects recognition for homeland security and defense based intelligent sensor networks. Unlike the conventional way of information analysis, data mining in such networks is typically characterized by high information ambiguity/uncertainty, data redundancy, high dimensionality and real-time constraints. Furthermore, since a typical military based network normally includes multiple mobile sensor platforms, ground forces, fortified tanks, combat flights, and other resources, it is critical to develop intelligent data mining approaches to fuse different information resources to understand dynamic environments, to support decision making processes, and finally to achieve the goals. This paper aims to address these issues with a focus on multi-objects recognition. Instead of classifying a single object as in the traditional image classification problems, the proposed method can automatically learn multiple objectives simultaneously. Image segmentation techniques are used to identify the interesting regions in the field, which correspond to multiple objects such as soldiers or tanks. Since different objects will come with different feature sizes, we propose a feature scaling method to represent each object in the same number of dimensions. This is achieved by linear/nonlinear scaling and sampling techniques. Finally, support vector machine (SVM) based learning algorithms are developed to learn and build the associations for different objects, and such knowledge will be adaptively accumulated for objects recognition in the testing stage. We test the effectiveness of the proposed method in different simulated military environments.
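The feature scaling step described in the abstract above, representing objects of different sizes in the same number of dimensions, can be sketched as follows. This is a minimal illustration that assumes each segmented object yields a one-dimensional feature list and uses plain linear interpolation. The abstract does not specify the authors' scaling and sampling scheme, so the function name and sample values are hypothetical.

```python
def resample_linear(features, target_len):
    """Resample a 1-D feature list to target_len values by linear interpolation."""
    n = len(features)
    if n == 0 or target_len <= 0:
        raise ValueError("need a non-empty input and a positive target length")
    if n == 1 or target_len == 1:
        # Nothing to interpolate between: repeat the first value.
        return [float(features[0])] * target_len
    out = []
    for i in range(target_len):
        # Map output index i onto the input index range [0, n - 1].
        pos = i * (n - 1) / (target_len - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(features[lo] * (1.0 - frac) + features[hi] * frac)
    return out

# Two hypothetical objects with different feature sizes, mapped to 4 dimensions each.
small_object = resample_linear([0.0, 10.0], 4)
large_object = resample_linear([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0], 4)
```

After this step every object has a vector of the same length, so the vectors could be passed to a classifier such as an SVM that expects fixed-size inputs.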
Dissociating electrophysiological correlates of subjective, objective, and correct memory in investigating the emotion-induced recognition bias.

PubMed

Windmann, Sabine; Hill, Holger

2014-10-01

Performance on tasks requiring discrimination of at least two stimuli can be viewed either from an objective perspective (referring to actual stimulus differences), or from a subjective perspective (corresponding to participant's responses). Using event-related potentials recorded during an old/new recognition memory test involving emotionally laden and neutral words studied either blockwise or randomly intermixed, we show here how the objective perspective (old versus new items) yields late effects of blockwise emotional item presentation at parietal sites that the subjective perspective fails to find, whereas the subjective perspective ("old" versus "new" responses) is more sensitive to early effects of emotion at anterior sites than the objective perspective. Our results demonstrate the potential advantage of dissociating the subjective and the objective perspective onto task performance (in addition to analyzing trials with correct responses), especially for investigations of illusions and information processing biases, in behavioral and cognitive neuroscience studies. Copyright © 2014 Elsevier Inc. All rights reserved.

The gender congruency effect during bilingual spoken-word recognition

PubMed Central

Morales, Luis; Paolieri, Daniela; Dussias, Paola E.; Valdés Kroff, Jorge R.; Gerfen, Chip; Bajo, María Teresa

2016-01-01

We investigate the ‘gender-congruency’ effect during a spoken-word recognition task using the visual world paradigm. Eye movements of Italian–Spanish bilinguals and Spanish monolinguals were monitored while they viewed a pair of objects on a computer screen. Participants listened to instructions in Spanish (encuentra la bufanda / ‘find the scarf’) and clicked on the object named in the instruction. Grammatical gender of the objects’ name was manipulated so that pairs of objects had the same (congruent) or different (incongruent) gender in Italian, but gender in Spanish was always congruent. Results showed that bilinguals, but not monolinguals, looked at target objects less when they were incongruent in gender, suggesting a between-language gender competition effect. In addition, bilinguals looked at target objects more when the definite article in the spoken instructions provided a valid cue to anticipate its selection (different-gender condition). The temporal dynamics of gender processing and cross-language activation in bilinguals are discussed. PMID:28018132
Short temporal asynchrony disrupts visual object recognition

PubMed Central

Singer, Jedediah M.; Kreiman, Gabriel

2014-01-01

Humans can recognize objects and scenes in a small fraction of a second. The cascade of signals underlying rapid recognition might be disrupted by temporally jittering different parts of complex objects. Here we investigated the time course over which shape information can be integrated to allow for recognition of complex objects. We presented fragments of object images in an asynchronous fashion and behaviorally evaluated categorization performance. We observed that visual recognition was significantly disrupted by asynchronies of approximately 30 ms, suggesting that spatiotemporal integration begins to break down with even small deviations from simultaneity. However, moderate temporal asynchrony did not completely obliterate recognition; in fact, integration of visual shape information persisted even with an asynchrony of 100 ms. We describe the data with a concise model based on the dynamic reduction of uncertainty about what image was presented. These results emphasize the importance of timing in visual processing and provide strong constraints for the development of dynamical models of visual shape recognition. PMID:24819738
Simple Learned Weighted Sums of Inferior Temporal Neuronal Firing Rates Accurately Predict Human Core Object Recognition Performance

PubMed Central

Hong, Ha; Solomon, Ethan A.; DiCarlo, James J.

2015-01-01

To go beyond qualitative models of the biological substrate of object recognition, we ask: can a single ventral stream neuronal linking hypothesis quantitatively account for core object recognition performance over a broad range of tasks? We measured human performance in 64 object recognition tests using thousands of challenging images that explore shape similarity and identity preserving object variation. We then used multielectrode arrays to measure neuronal population responses to those same images in visual areas V4 and inferior temporal (IT) cortex of monkeys and simulated V1 population responses. We tested leading candidate linking hypotheses and control hypotheses, each postulating how ventral stream neuronal responses underlie object recognition behavior. Specifically, for each hypothesis, we computed the predicted performance on the 64 tests and compared it with the measured pattern of human performance. All tested hypotheses based on low- and mid-level visually evoked activity (pixels, V1, and V4) were very poor predictors of the human behavioral pattern. However, simple learned weighted sums of distributed average IT firing rates accurately predicted the behavioral pattern. More elaborate linking hypotheses relying on IT trial-by-trial correlational structure, finer IT temporal codes, or ones that strictly respect the known spatial substructures of IT (“face patches”) did not improve predictive power. Although these results do not reject those more elaborate hypotheses, they suggest a simple, sufficient quantitative model: each object recognition task is learned from the spatially distributed mean firing rates (100 ms) of ∼60,000 IT neurons and is executed as a simple weighted sum of those firing rates. SIGNIFICANCE STATEMENT We sought to go beyond qualitative models of visual object recognition and determine whether a single neuronal linking hypothesis can quantitatively account for core object recognition behavior. 
To achieve this, we designed a database of images for evaluating object recognition performance. We used multielectrode arrays to characterize hundreds of neurons in the visual ventral stream of nonhuman primates and measured the object recognition performance of >100 human observers. Remarkably, we found that simple learned weighted sums of firing rates of neurons in monkey inferior temporal (IT) cortex accurately predicted human performance. Although previous work led us to expect that IT would outperform V4, we were surprised by the quantitative precision with which simple IT-based linking hypotheses accounted for human behavior. PMID:26424887
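The phrase "simple learned weighted sums of firing rates" in the abstract above can be made concrete with a toy linear readout. The sketch below learns weights with the classic perceptron rule on made-up firing rates for three model neurons. The learning rule, the data, and all names are assumptions chosen for illustration and do not reproduce the authors' decoders or recordings.

```python
def weighted_sum(weights, bias, rates):
    """Linear readout: bias plus the weighted sum of mean firing rates."""
    return bias + sum(w * r for w, r in zip(weights, rates))

def train_readout(samples, labels, epochs=50, lr=0.1):
    """Learn readout weights with the perceptron rule; labels are +1 or -1."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for rates, label in zip(samples, labels):
            # Update only when the current readout misclassifies the sample.
            if label * weighted_sum(weights, bias, rates) <= 0:
                weights = [w + lr * label * r for w, r in zip(weights, rates)]
                bias += lr * label
    return weights, bias

# Hypothetical mean firing rates (arbitrary units) of three model neurons,
# one row per image; labels mark object A (+1) versus object B (-1).
rates = [[0.9, 0.1, 0.4], [0.8, 0.2, 0.5], [0.2, 0.9, 0.4], [0.1, 0.8, 0.6]]
labels = [1, 1, -1, -1]

weights, bias = train_readout(rates, labels)
predictions = [1 if weighted_sum(weights, bias, r) > 0 else -1 for r in rates]
```

In this toy setting a task is "executed" by taking the sign of a single weighted sum, which matches the form of the model summarized in the abstract even though the scale (three neurons rather than tens of thousands) is far smaller.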
Symbolic feature detection for image understanding

NASA Astrophysics Data System (ADS)

Aslan, Sinem; Akgül, Ceyhun Burak; Sankur, Bülent

2014-03-01

In this study we propose a model-driven codebook generation method used to assign probability scores to pixels in order to represent the underlying local shapes they reside in. In the first version of the symbol library we limited ourselves to photometric and similarity transformations applied to eight prototypical shapes (flat plateau, ramp, valley, ridge, circular and elliptic pit, and circular and elliptic hill) and used a randomized decision forest as the statistical classifier to compute the shape class ambiguity of each pixel. We achieved 90% accuracy in identification of known objects from alternate views; however, in recognition of unknown objects we outperformed only the color-based method, not the texture, global and local shape methods. We present a plan for future work to improve the proposed approach further.
Automatic anatomy recognition on CT images with pathology

NASA Astrophysics Data System (ADS)

Huang, Lidong; Udupa, Jayaram K.; Tong, Yubing; Odhner, Dewey; Torigian, Drew A.

2016-03-01

Body-wide anatomy recognition on CT images with pathology becomes crucial for quantifying body-wide disease burden. This, however, is a challenging problem because various diseases result in various abnormalities of objects such as shape and intensity patterns. We previously developed an automatic anatomy recognition (AAR) system [1] whose applicability was demonstrated on near normal diagnostic CT images in different body regions on 35 organs. The aim of this paper is to investigate strategies for adapting the previous AAR system to diagnostic CT images of patients with various pathologies as a first step toward automated body-wide disease quantification. The AAR approach consists of three main steps - model building, object recognition, and object delineation. In this paper, within the broader AAR framework, we describe a new strategy for object recognition to handle abnormal images. In the model building stage an optimal threshold interval is learned from near-normal training images for each object. This threshold is optimally tuned to the pathological manifestation of the object in the test image. Recognition is performed following a hierarchical representation of the objects. Experimental results for the abdominal body region based on 50 near-normal images used for model building and 20 abnormal images used for object recognition show that object localization accuracy within 2 voxels for liver and spleen and 3 voxels for kidney can be achieved with the new strategy.
Research on Attribute Reduction in Hoisting Motor State Recognition of Quayside Container Crane

NASA Astrophysics Data System (ADS)

Li, F.; Tang, G.; Hu, X.

2017-07-01

In view of the large number of attributes involved in hoisting motor state recognition for quayside container cranes, an attribute reduction method based on the discernibility matrix is applied to the hoisting motor state information table. A method of attribute reduction based on the combination of rough set and genetic algorithm is proposed to deal with the hoisting motor state decision table. Redundant attributes are deleted under the condition that the information system's decision-making ability is unchanged, which reduces the complexity and computation of the recognition process of the hoisting motor and makes fast state recognition possible.
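Attribute reduction with a discernibility matrix, as named in the abstract above, can be sketched on a toy decision table. For every pair of records with different decisions, the matrix entry lists the condition attributes that tell them apart, and a reduced attribute set must contain at least one attribute from every entry. The sketch uses a simple greedy selection in place of the genetic algorithm mentioned in the abstract. The attribute names and table values are hypothetical.

```python
from itertools import combinations

def discernibility_entries(rows, condition_attrs, decision_attr):
    """For each pair of rows with different decisions, collect the differing attributes."""
    entries = []
    for a, b in combinations(rows, 2):
        if a[decision_attr] != b[decision_attr]:
            diff = frozenset(k for k in condition_attrs if a[k] != b[k])
            if diff:
                entries.append(diff)
    return entries

def greedy_reduct(entries):
    """Pick attributes until every discernibility entry contains a chosen one."""
    remaining = list(entries)
    chosen = []
    while remaining:
        counts = {}
        for entry in remaining:
            for attr in entry:
                counts[attr] = counts.get(attr, 0) + 1
        # Most frequent attribute first; sorting makes tie-breaking deterministic.
        best = max(sorted(counts), key=counts.get)
        chosen.append(best)
        remaining = [e for e in remaining if best not in e]
    return chosen

# Hypothetical hoisting motor decision table.
table = [
    {"current": "high", "vibration": "low", "temperature": "normal", "noise": "low", "state": "normal"},
    {"current": "high", "vibration": "high", "temperature": "normal", "noise": "low", "state": "fault"},
    {"current": "low", "vibration": "low", "temperature": "normal", "noise": "low", "state": "normal"},
    {"current": "low", "vibration": "high", "temperature": "high", "noise": "low", "state": "fault"},
]
conditions = ["current", "vibration", "temperature", "noise"]

entries = discernibility_entries(table, conditions, "state")
reduct = greedy_reduct(entries)
```

In this table the vibration attribute alone separates every normal record from every fault record, so the other three attributes are redundant for the decision. The greedy choice returns a set that preserves this discernibility but is not guaranteed to be minimal in general, which is one reason a search method such as a genetic algorithm may be preferred on larger tables.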
Computational approaches to cognition: the bottom-up view.

PubMed

Koch, C

1993-04-01

How can higher level aspects of cognition, such as figure-ground segregation, object recognition, selective focal attention and ultimately even awareness, be implemented at the level of synapses and neurons? A number of theoretical studies emerging out of the connectionist and the computational neuroscience communities are starting to address these issues using neurally plausible models.
Shape Recognition in Infancy: Visual Integration of Sequential Information.

ERIC Educational Resources Information Center

Rose, Susan A

1988-01-01

Investigated infants' integration of visual information across space and time. In four experiments, infants aged 12 months and 6 months viewed objects after watching light trace similar and dissimilar shapes. Infants looked longer at novel shapes, although six-month-olds did not recognize figures taking more than 10 seconds to trace. One-year-old…
Examining the Conflict and Interconnectedness of Young People's Ideas about Environmental Issues, Responsibility and Action

ERIC Educational Resources Information Center

Wilks, Leigh; Harris, Neil

2016-01-01

Objective: Young people's environmental views are typically conflicted, with little recognition of the links between environmental issues or between environmental responsibility and action. The purpose of this study was to clarify whether young people's understanding of the environment is in conflict or whether they are forming interconnections…
Recognition of neural brain activity patterns correlated with complex motor activity

NASA Astrophysics Data System (ADS)

Kurkin, Semen; Musatov, Vyacheslav Yu.; Runnova, Anastasia E.; Grubov, Vadim V.; Efremova, Tatyana Yu.; Zhuravlev, Maxim O.

2018-04-01

In this paper, based on the apparatus of artificial neural networks, a technique for recognizing and classifying patterns corresponding to imaginary movements on electroencephalograms (EEGs) obtained from a group of untrained subjects was developed. The optimal type, topology, training algorithms and neural network parameters were selected from the point of view of the most accurate and fast recognition and classification of patterns on multi-channel EEGs associated with the imagination of movements. The influence of the number and choice of the analyzed channels of a multichannel EEG on the quality of recognition of imaginary movements was also studied, and optimal configurations of electrode arrangements were obtained. The effect of pre-processing of EEG signals is analyzed from the point of view of improving the accuracy of recognition of imaginary movements.
An Effective 3D Shape Descriptor for Object Recognition with RGB-D Sensors

PubMed Central

Liu, Zhong; Zhao, Changchen; Wu, Xingming; Chen, Weihai

2017-01-01

RGB-D sensors have been widely used in various areas of computer vision and graphics. A good descriptor will effectively improve the performance of operation. This article further analyzes the recognition performance of shape features extracted from multi-modality source data using RGB-D sensors. A hybrid shape descriptor is proposed as a representation of objects for recognition. We first extracted five 2D shape features from contour-based images and five 3D shape features over point cloud data to capture the global and local shape characteristics of an object. The recognition performance was tested for category recognition and instance recognition. Experimental results show that the proposed shape descriptor outperforms several common global-to-global shape descriptors and is comparable to some partial-to-global shape descriptors that achieved the best accuracies in category and instance recognition. Contribution of partial features and computational complexity were also analyzed. The results indicate that the proposed shape features are strong cues for object recognition and can be combined with other features to boost accuracy. PMID:28245553
Three-dimensional object recognition based on planar images

NASA Astrophysics Data System (ADS)

Mital, Dinesh P.; Teoh, Eam-Khwang; Au, K. C.; Chng, E. K.

1993-01-01

This paper presents the development and realization of a robotic vision system for the recognition of 3-dimensional (3-D) objects. The system can recognize a single object from among a group of known regular convex polyhedron objects that is constrained to lie on a calibrated flat platform. The approach adopted comprises a series of image processing operations on a single 2-dimensional (2-D) intensity image to derive an image line drawing. Subsequently, a feature matching technique is employed to determine 2-D spatial correspondences of the image line drawing with the model in the database. Besides its identification ability, the system can also provide important position and orientation information of the recognized object. The system was implemented on an IBM-PC AT machine executing at 8 MHz without the 80287 Maths Co-processor. In our overall performance evaluation based on a 600 recognition cycles test, the system demonstrated an accuracy of above 80% with recognition time well within 10 seconds. The recognition time is, however, indirectly dependent on the number of models in the database. The reliability of the system is also affected by illumination conditions which must be clinically controlled as in any industrial robotic vision system.
How landmark suitability shapes recognition memory signals for objects in the medial temporal lobes.

PubMed

Martin, Chris B; Sullivan, Jacqueline A; Wright, Jessey; Köhler, Stefan

2018-02-01

A role of perirhinal cortex (PrC) in recognition memory for objects has been well established. Contributions of parahippocampal cortex (PhC) to this function, while documented, remain less well understood. Here, we used fMRI to examine whether the organization of item-based recognition memory signals across these two structures is shaped by object category, independent of any difference in representing episodic context. Guided by research suggesting that PhC plays a critical role in processing landmarks, we focused on three categories of objects that differ from each other in their landmark suitability as confirmed with behavioral ratings (buildings > trees > aircraft). Participants made item-based recognition-memory decisions for novel and previously studied objects from these categories, which were matched in accuracy. Multi-voxel pattern classification revealed category-specific item-recognition memory signals along the long axis of PrC and PhC, with no sharp functional boundaries between these structures. Memory signals for buildings were observed in the mid to posterior extent of PhC, signals for trees in anterior to posterior segments of PhC, and signals for aircraft in mid to posterior aspects of PrC and the anterior extent of PhC. Notably, item-based memory signals for the category with highest landmark suitability ratings were observed only in those posterior segments of PhC that also allowed for classification of landmark suitability of objects when memory status was held constant. These findings provide new evidence in support of the notion that item-based memory signals for objects are not limited to PrC, and that the organization of these signals along the longitudinal axis that crosses PrC and PhC can be captured with reference to landmark suitability. Copyright © 2017 Elsevier Inc. All rights reserved.
Age-related impairments in active learning and strategic visual exploration.

PubMed

Brandstatt, Kelly L; Voss, Joel L

2014-01-01

Old age could impair memory by disrupting learning strategies used by younger individuals. We tested this possibility by manipulating the ability to use visual-exploration strategies during learning. Subjects controlled visual exploration during active learning, thus permitting the use of strategies, whereas strategies were limited during passive learning via predetermined exploration patterns. Performance on tests of object recognition and object-location recall was matched for younger and older subjects for objects studied passively, when learning strategies were restricted. Active learning improved object recognition similarly for younger and older subjects. However, active learning improved object-location recall for younger subjects, but not older subjects. Exploration patterns were used to identify a learning strategy involving repeat viewing. Older subjects used this strategy less frequently and it provided less memory benefit compared to younger subjects. In previous experiments, we linked hippocampal-prefrontal co-activation to improvements in object-location recall from active learning and to the exploration strategy. Collectively, these findings suggest that age-related memory problems result partly from impaired strategies during learning, potentially due to reduced hippocampal-prefrontal co-engagement.
Speckle-learning-based object recognition through scattering media.

PubMed

Ando, Takamasa; Horisaki, Ryoichi; Tanida, Jun

2015-12-28

We experimentally demonstrated object recognition through scattering media based on direct machine learning of a number of speckle intensity images. In the experiments, speckle intensity images of amplitude or phase objects on a spatial light modulator between scattering plates were captured by a camera. We used the support vector machine for binary classification of the captured speckle intensity images of face and non-face data. The experimental results showed that speckles are sufficient for machine learning.
Reassessing the 3/4 view effect in face recognition.

PubMed

Liu, Chang Hong; Chaudhuri, Avi

2002-02-01

It is generally accepted that unfamiliar faces are better recognized if presented in 3/4 view. A common interpretation of this result is that the 3/4 view represents a canonical view for faces. This article presents a critical review of this claim. Two kinds of advantage, in which a 3/4 view either generalizes better to a different view or produces better recognition in the same view, are discussed. Our analysis of the literature shows that the first effect almost invariably depended on different amounts of angular rotation that was present between learning and test views. The advantage usually vanished when angular rotation was equalized between conditions. Reports in favor of the second effect are scant and can be countered by studies reporting negative findings. To clarify this ambiguity, we conducted a recognition experiment. Subjects were trained and tested on the same three views (full-face, 3/4 and profile). The results showed no difference between the three view conditions. Our analysis of the literature, along with the new results, shows that the evidence for a 3/4 view advantage in both categories is weak at best. We suggest that a better predictor of performance for recognition in different views is the angular difference between learning and test views. For recognition in the same view, there may be a wide range of views whose effectiveness is comparable to the 3/4 view.
The Development of Adaptive Decision Making: Recognition-Based Inference in Children and Adolescents

ERIC Educational Resources Information Center

Horn, Sebastian S.; Ruggeri, Azzurra; Pachur, Thorsten

2016-01-01

Judgments about objects in the world are often based on probabilistic information (or cues). A frugal judgment strategy that utilizes memory (i.e., the ability to discriminate between known and unknown objects) as a cue for inference is the recognition heuristic (RH). The usefulness of the RH depends on the structure of the environment,…
Multidimensional brain activity dictated by winner-take-all mechanisms.

PubMed

Tozzi, Arturo; Peters, James F

2018-06-21

A novel demon-based architecture is introduced to elucidate brain functions such as pattern recognition during human perception and mental interpretation of visual scenes. Starting from the topological concepts of invariance and persistence, we introduce a Selfridge pandemonium variant of brain activity that takes into account a novel feature, namely, demons that recognize short straight-line segments, curved lines and scene shapes, such as shape interior, density and texture. Low-level representations of objects can be mapped to higher-level views (our mental interpretations): a series of transformations can be gradually applied to a pattern in a visual scene, without affecting its invariant properties. This makes it possible to construct a symbolic multi-dimensional representation of the environment. These representations can be projected continuously to an object that we have seen and continue to see, thanks to the mapping from shapes in our memory to shapes in Euclidean space. Although perceived shapes are 3-dimensional (plus time), the evaluation of shape features (volume, color, contour, closeness, texture, and so on) leads to n-dimensional brain landscapes. Here we discuss the advantages of our parallel, hierarchical model in pattern recognition, computer vision and biological nervous system's evolution. Copyright © 2018 Elsevier B.V. All rights reserved.
Real-time object recognition in multidimensional images based on joined extended structural tensor and higher-order tensor decomposition methods

NASA Astrophysics Data System (ADS)

Cyganek, Boguslaw; Smolka, Bogdan

2015-02-01

In this paper a system for real-time recognition of objects in multidimensional video signals is proposed. Object recognition is done by pattern projection into the tensor subspaces obtained from the factorization of the signal tensors representing the input signal. However, instead of using only the intensity signal, the novelty of this paper is to first build the Extended Structural Tensor representation from the intensity signal, which conveys information on signal intensities as well as on higher-order statistics of the input signals. This way the higher-order input pattern tensors are built from the training samples. Then, the tensor subspaces are built based on the Higher-Order Singular Value Decomposition of the prototype pattern tensors. Finally, recognition relies on measurements of the distance of a test pattern projected into the tensor subspaces obtained from the training tensors. Due to the high dimensionality of the input data, tensor-based methods require high memory and computational resources. However, recent achievements in the technology of multi-core microprocessors and graphics cards allow real-time operation of the multidimensional methods, as is shown and analyzed in this paper based on real examples of object detection in digital images.
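The subspace-projection step described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it works on raw 2D patterns rather than the Extended Structural Tensor, and the function names (unfold, class_bases, residual, classify) and the rank parameter are hypothetical choices made for illustration.

```python
import numpy as np

def unfold(t, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the rest."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def class_bases(patterns, rank):
    """Build `rank` orthonormal basis patterns for one class.

    patterns: array of shape (h, w, n) stacking n training patterns.
    Assumes the first `rank` core slices are non-zero.
    """
    # Mode matrices of the HOSVD: left singular vectors of each unfolding.
    u1 = np.linalg.svd(unfold(patterns, 0), full_matrices=False)[0]
    u2 = np.linalg.svd(unfold(patterns, 1), full_matrices=False)[0]
    u3 = np.linalg.svd(unfold(patterns, 2), full_matrices=False)[0]
    # Core tensor: patterns multiplied by the transposed mode matrices.
    core = np.einsum('ijk,ia,jb,kc->abc', patterns, u1, u2, u3)
    # Basis pattern h is the h-th core slice mapped back through u1 and u2.
    bases = np.einsum('abh,ia,jb->hij', core[:, :, :rank], u1, u2)
    norms = np.linalg.norm(bases.reshape(rank, -1), axis=1)
    return bases / norms[:, None, None]

def residual(pattern, bases):
    """1 minus the squared projection of the normalized pattern onto the bases."""
    p = pattern / np.linalg.norm(pattern)
    coeffs = np.einsum('hij,ij->h', bases, p)
    return 1.0 - np.sum(coeffs ** 2)

def classify(pattern, bases_per_class):
    """Return the index of the class subspace closest to the pattern."""
    return int(np.argmin([residual(pattern, b) for b in bases_per_class]))
```

In this sketch a test pattern is assigned to the class whose subspace leaves the smallest residual; how many basis patterns to keep per class (rank) is a tuning choice the abstract does not specify.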
A Novel Locally Linear KNN Method With Applications to Visual Recognition.

PubMed

Liu, Qingfeng; Liu, Chengjun

2017-09-01

A locally linear K Nearest Neighbor (LLK) method is presented in this paper with applications to robust visual recognition. Specifically, the concept of an ideal representation is first presented, which improves upon the traditional sparse representation in many ways. The objective function based on a host of criteria for sparsity, locality, and reconstruction is then optimized to derive a novel representation, which is an approximation to the ideal representation. The novel representation is further processed by two classifiers, namely, an LLK-based classifier and a locally linear nearest mean-based classifier, for visual recognition. The proposed classifiers are shown to connect to the Bayes decision rule for minimum error. Additional new theoretical analysis is presented, such as the nonnegative constraint, the group regularization, and the computational efficiency of the proposed LLK method. New methods such as a shifted power transformation for improving reliability, a coefficients' truncating method for enhancing generalization, and an improved marginal Fisher analysis method for feature extraction are proposed to further improve visual recognition performance. Extensive experiments are implemented to evaluate the proposed LLK method for robust visual recognition. In particular, eight representative data sets are applied for assessing the performance of the LLK method for various visual recognition applications, such as action recognition, scene recognition, object recognition, and face recognition.

The roles of categorical and coordinate spatial relations in recognizing buildings.

PubMed

Palermo, Liana; Piccardi, Laura; Nori, Raffaella; Giusberti, Fiorella; Guariglia, Cecilia

2012-11-01

Categorical spatial information is considered more useful for recognizing objects, and coordinate spatial information for guiding actions--for example, during navigation or grasping. In contrast with this assumption, we hypothesized that buildings, unlike other categories of objects, require both categorical and coordinate spatial information in order to be recognized. This hypothesis arose from evidence that right-brain-damaged patients have deficits in both coordinate judgments and recognition of buildings and from the fact that buildings are very useful for guiding navigation in urban environments. To test this hypothesis, we assessed 210 healthy college students while they performed four different tasks that required categorical and coordinate judgments and the recognition of common objects and buildings. Our results showed that both categorical and coordinate spatial representations are necessary to recognize a building, whereas only categorical representations are necessary to recognize an object. We discuss our data in view of a recent neural framework for visuospatial processing, suggesting that recognizing buildings may specifically activate the parieto-medial-temporal pathway.
Short memory fuzzy fusion image recognition schema employing spatial and Fourier descriptors

NASA Astrophysics Data System (ADS)

Raptis, Sotiris N.; Tzafestas, Spyros G.

2001-03-01

Single images quite often do not bear enough information for precise interpretation, for a variety of reasons. Multiple image fusion and adequate integration have recently become the state of the art in the pattern recognition field. In this paper, an enhanced multiple-observation schema is discussed, investigating improvements to the baseline fuzzy-probabilistic image fusion methodology. The first innovation consists in considering only a limited but seemingly more effective part of the uncertainty information obtained by a certain time, restricting older uncertainty dependencies and alleviating the computational burden, which is now needed only for a short sequence of samples (stored in memory). The second innovation essentially consists in grouping them into feature-blind object hypotheses. Experiment settings include a sequence of independent views obtained by a camera being moved around the investigated object.
The effects of a convex rear-view mirror on ocular accommodative responses.

PubMed

Nagata, Tatsuo; Iwasaki, Tsuneto; Kondo, Hiroyuki; Tawara, Akihiko

2013-11-01

Convex mirrors are universally used as rear-view mirrors in automobiles. However, the ocular accommodative responses during the use of these mirrors have not yet been examined. This study investigated the effects of a convex mirror on the ocular accommodative systems. Seven young adults with normal visual functions were instructed to binocularly watch an object in a convex or plane mirror. The accommodative responses were measured with an infrared optometer. The average accommodation of all subjects while viewing the object in the convex mirror was significantly nearer than in the plane mirror, although all subjects perceived the position of the object in the convex mirror as being farther away. Moreover, the fluctuations of accommodation were significantly larger for the convex mirror. The convex mirror caused the 'false recognition of distance', which induced the large accommodative fluctuations and blurred vision. Manufacturers should consider the ocular accommodative responses as a new indicator for increasing automotive safety. Copyright © 2013 Elsevier Ltd and The Ergonomics Society. All rights reserved.
Picture Details in Recognition Memory.

ERIC Educational Resources Information Center

Cody, James A.; Madigan, Stephen

A study was conducted to investigate the effects of symbolic format of test material on short- and long-term recognition. Subjects, 104 undergraduate students, viewed slides of either a black-and-white photograph, a one-sentence verbal description of the photo, a black-and-white drawing based on the verbal description, or a black-and-white line…
Simple Learned Weighted Sums of Inferior Temporal Neuronal Firing Rates Accurately Predict Human Core Object Recognition Performance.

PubMed

Majaj, Najib J; Hong, Ha; Solomon, Ethan A; DiCarlo, James J

2015-09-30

To go beyond qualitative models of the biological substrate of object recognition, we ask: can a single ventral stream neuronal linking hypothesis quantitatively account for core object recognition performance over a broad range of tasks? We measured human performance in 64 object recognition tests using thousands of challenging images that explore shape similarity and identity preserving object variation. We then used multielectrode arrays to measure neuronal population responses to those same images in visual areas V4 and inferior temporal (IT) cortex of monkeys and simulated V1 population responses. We tested leading candidate linking hypotheses and control hypotheses, each postulating how ventral stream neuronal responses underlie object recognition behavior. Specifically, for each hypothesis, we computed the predicted performance on the 64 tests and compared it with the measured pattern of human performance. All tested hypotheses based on low- and mid-level visually evoked activity (pixels, V1, and V4) were very poor predictors of the human behavioral pattern. However, simple learned weighted sums of distributed average IT firing rates exactly predicted the behavioral pattern. More elaborate linking hypotheses relying on IT trial-by-trial correlational structure, finer IT temporal codes, or ones that strictly respect the known spatial substructures of IT ("face patches") did not improve predictive power. Although these results do not reject those more elaborate hypotheses, they suggest a simple, sufficient quantitative model: each object recognition task is learned from the spatially distributed mean firing rates (100 ms) of ∼60,000 IT neurons and is executed as a simple weighted sum of those firing rates. Significance statement: We sought to go beyond qualitative models of visual object recognition and determine whether a single neuronal linking hypothesis can quantitatively account for core object recognition behavior. 
To achieve this, we designed a database of images for evaluating object recognition performance. We used multielectrode arrays to characterize hundreds of neurons in the visual ventral stream of nonhuman primates and measured the object recognition performance of >100 human observers. Remarkably, we found that simple learned weighted sums of firing rates of neurons in monkey inferior temporal (IT) cortex accurately predicted human performance. Although previous work led us to expect that IT would outperform V4, we were surprised by the quantitative precision with which simple IT-based linking hypotheses accounted for human behavior. Copyright © 2015 the authors 0270-6474/15/3513402-17$15.00/0.
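The "simple learned weighted sum of firing rates" in this abstract can be illustrated with a small linear readout. The sketch below is an assumption-laden illustration, not the authors' analysis: ridge-regularized least squares is swapped in as the learning rule, the data are meant to be synthetic, and the names fit_readout, predict and the ridge parameter are hypothetical.

```python
import numpy as np

def _with_bias(rates):
    """Append a constant column so the readout can learn an offset."""
    return np.hstack([rates, np.ones((rates.shape[0], 1))])

def fit_readout(rates, labels, ridge=1.0):
    """Learn one weight per neuron (plus a bias) by ridge least squares.

    rates:  (n_images, n_neurons) mean firing rates.
    labels: (n_images,) task labels coded as +1 / -1.
    """
    x = _with_bias(rates)
    reg = ridge * np.eye(x.shape[1])
    reg[-1, -1] = 0.0  # leave the bias term unpenalized
    return np.linalg.solve(x.T @ x + reg, x.T @ labels)

def predict(rates, weights):
    """Decision = sign of the weighted sum of firing rates."""
    return np.sign(_with_bias(rates) @ weights)
```

One such readout would be fitted per object recognition task, and its accuracy on held-out images compared with the measured behavioral accuracy for that task.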
Advances in image compression and automatic target recognition; Proceedings of the Meeting, Orlando, FL, Mar. 30, 31, 1989

NASA Technical Reports Server (NTRS)

Tescher, Andrew G. (Editor)

1989-01-01

Various papers on image compression and automatic target recognition are presented. Individual topics addressed include: target cluster detection in cluttered SAR imagery, model-based target recognition using laser radar imagery, Smart Sensor front-end processor for feature extraction of images, object attitude estimation and tracking from a single video sensor, symmetry detection in human vision, analysis of high resolution aerial images for object detection, obscured object recognition for an ATR application, neural networks for adaptive shape tracking, statistical mechanics and pattern recognition, detection of cylinders in aerial range images, moving object tracking using local windows, new transform method for image data compression, quad-tree product vector quantization of images, predictive trellis encoding of imagery, reduced generalized chain code for contour description, compact architecture for a real-time vision system, use of human visibility functions in segmentation coding, color texture analysis and synthesis using Gibbs random fields.
Ignorance- versus evidence-based decision making: a decision time analysis of the recognition heuristic.

PubMed

Hilbig, Benjamin E; Pohl, Rüdiger F

2009-09-01

According to part of the adaptive toolbox notion of decision making known as the recognition heuristic (RH), the decision process in comparative judgments-and its duration-is determined by whether recognition discriminates between objects. By contrast, some recently proposed alternative models predict that choices largely depend on the amount of evidence speaking for each of the objects and that decision times thus depend on the evidential difference between objects, or the degree of conflict between options. This article presents 3 experiments that tested predictions derived from the RH against those from alternative models. All experiments used naturally recognized objects without teaching participants any information and thus provided optimal conditions for application of the RH. However, results supported the alternative, evidence-based models and often conflicted with the RH. Recognition was not the key determinant of decision times, whereas differences between objects with respect to (both positive and negative) evidence predicted effects well. In sum, alternative models that allow for the integration of different pieces of information may well provide a better account of comparative judgments. (c) 2009 APA, all rights reserved.
Get rich quick: the signal to respond procedure reveals the time course of semantic richness effects during visual word recognition.

PubMed

Hargreaves, Ian S; Pexman, Penny M

2014-05-01

According to several current frameworks, semantic processing involves an early influence of language-based information followed by later influences of object-based information (e.g., situated simulations; Santos, Chaigneau, Simmons, & Barsalou, 2011). In the present study we examined whether these predictions extend to the influence of semantic variables in visual word recognition. We investigated the time course of semantic richness effects in visual word recognition using a signal-to-respond (STR) paradigm fitted to a lexical decision (LDT) and a semantic categorization (SCT) task. We used linear mixed effects to examine the relative contributions of language-based (number of senses, ARC) and object-based (imageability, number of features, body-object interaction ratings) descriptions of semantic richness at four STR durations (75, 100, 200, and 400ms). Results showed an early influence of number of senses and ARC in the SCT. In both LDT and SCT, object-based effects were the last to influence participants' decision latencies. We interpret our results within a framework in which semantic processes are available to influence word recognition as a function of their availability over time, and of their relevance to task-specific demands. Copyright © 2014 Elsevier B.V. All rights reserved.
Running Improves Pattern Separation during Novel Object Recognition.

PubMed

Bolz, Leoni; Heigele, Stefanie; Bischofberger, Josef

2015-10-09

Running increases adult neurogenesis and improves pattern separation in various memory tasks including context fear conditioning or touch-screen based spatial learning. However, it is unknown whether pattern separation is improved in spontaneous behavior, not emotionally biased by positive or negative reinforcement. Here we investigated the effect of voluntary running on pattern separation during novel object recognition in mice using relatively similar or substantially different objects. We show that running increases hippocampal neurogenesis but does not affect object recognition memory with 1.5 h delay after sample phase. By contrast, at 24 h delay, running significantly improves recognition memory for similar objects, whereas highly different objects can be distinguished by both running and sedentary mice. These data show that physical exercise improves pattern separation, independent of negative or positive reinforcement. In sedentary mice there is a pronounced temporal gradient for remembering object details. In running mice, however, increased neurogenesis improves hippocampal coding and temporally preserves distinction of novel objects from familiar ones.
Children's Face Identity Representations Are No More View Specific than Those of Adults

ERIC Educational Resources Information Center

Jeffery, Linda; Rathbone, Cameron; Read, Ainsley; Rhodes, Gillian

2013-01-01

Face recognition performance improves during childhood, not reaching adult levels until late adolescence, yet the source of this improvement is unclear. Recognition of faces across changes in viewpoint appears particularly slow to develop. Poor cross-view recognition suggests that children's face representations may be more view specific than…
Remembering the snake in the grass: Threat enhances recognition but not source memory.

PubMed

Meyer, Miriam Magdalena; Bell, Raoul; Buchner, Axel

2015-12-01

Research on the influence of emotion on source memory has yielded inconsistent findings. The object-based framework (Mather, 2007) predicts that negatively arousing stimuli attract attention, resulting in enhanced within-object binding, and, thereby, enhanced source memory for intrinsic context features of emotional stimuli. To test this prediction, we presented pictures of threatening and harmless animals, the color of which had been experimentally manipulated. In a memory test, old-new recognition for the animals and source memory for their color was assessed. In all 3 experiments, old-new recognition was better for the more threatening material, which supports previous reports of an emotional memory enhancement. This recognition advantage was due to the emotional properties of the stimulus material, and not specific for snake stimuli. However, inconsistent with the prediction of the object-based framework, intrinsic source memory was not affected by emotion. (c) 2015 APA, all rights reserved.
Identification and location of catenary insulator in complex background based on machine vision

NASA Astrophysics Data System (ADS)

Yao, Xiaotong; Pan, Yingli; Liu, Li; Cheng, Xiao

2018-04-01

Precisely locating the insulator is an important prerequisite for fault detection. Because current algorithms for locating insulators in catenary inspection images are not accurate, a target recognition and localization method based on binocular vision combined with SURF features is proposed. First, because the insulator is located in a complex environment, SURF features are used to recognize the target and achieve coarse positioning. Then, the binocular vision principle is used to calculate the 3D coordinates of the coarsely located object, achieving target recognition and fine localization. Finally, the 3D coordinates of the object's center of mass are retained and transferred to the inspection robot to control its detection position. Experimental results demonstrate that the proposed method has better recognition efficiency and accuracy, can successfully identify the target, and has definite application value.
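The binocular step, computing 3D coordinates for an already located object, can be sketched with the standard triangulation formulas for a rectified pinhole stereo pair. The abstract does not give the camera model, so the rectified setup and all parameter names below (focal_px, baseline_m, cx, cy) are assumptions made for illustration.

```python
def triangulate(u_left, u_right, v, focal_px, baseline_m, cx, cy):
    """Recover (X, Y, Z) in metres from a matched point in a rectified stereo pair.

    u_left, u_right: horizontal pixel coordinates of the same point in each image.
    v:               vertical pixel coordinate (equal in both images after rectification).
    focal_px:        focal length in pixels; baseline_m: camera separation in metres.
    cx, cy:          principal point in pixels.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity   # depth from similar triangles
    x = (u_left - cx) * z / focal_px        # back-project through the left camera
    y = (v - cy) * z / focal_px
    return x, y, z
```

In a pipeline like the one described, the pixel coordinates would come from the centre of the SURF-matched insulator region in the left and right images.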
Traffic Sign Recognition with Invariance to Lighting in Dual-Focal Active Camera System

NASA Astrophysics Data System (ADS)

Gu, Yanlei; Panahpour Tehrani, Mehrdad; Yendo, Tomohiro; Fujii, Toshiaki; Tanimoto, Masayuki

In this paper, we present an automatic vision-based traffic sign recognition system, which can detect and classify traffic signs at long distance under different lighting conditions. To realize this purpose, the traffic sign recognition is developed in an originally proposed dual-focal active camera system. In this system, a telephoto camera is equipped as an assistant of a wide angle camera. The telephoto camera can capture a high accuracy image for an object of interest in the view field of the wide angle camera. The image from the telephoto camera provides enough information for recognition when the accuracy of traffic sign is low from the wide angle camera. In the proposed system, the traffic sign detection and classification are processed separately for different images from the wide angle camera and telephoto camera. Besides, in order to detect traffic sign from complex background in different lighting conditions, we propose a type of color transformation which is invariant to light changing. This color transformation is conducted to highlight the pattern of traffic signs by reducing the complexity of background. Based on the color transformation, a multi-resolution detector with cascade mode is trained and used to locate traffic signs at low resolution in the image from the wide angle camera. After detection, the system actively captures a high accuracy image of each detected traffic sign by controlling the direction and exposure time of the telephoto camera based on the information from the wide angle camera. Moreover, in classification, a hierarchical classifier is constructed and used to recognize the detected traffic signs in the high accuracy image from the telephoto camera. Finally, based on the proposed system, a set of experiments in the domain of traffic sign recognition is presented. The experimental results demonstrate that the proposed system can effectively recognize traffic signs at low resolution in different lighting conditions.
Representations of Shape in Object Recognition and Long-Term Visual Memory

DTIC Science & Technology

1993-02-11

in anything other than linguistic terms (Biederman, 1987, for example). STATUS 1. Viewpoint-Dependent Features in Object Representation Tarr and...is object-based orientation-independent representations sufficient for "basic-level" categorization (Biederman, 1987; Corballis, 1988). Alternatively...space. REFERENCES Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94, 115-147. Cooper, L
A sensor and video based ontology for activity recognition in smart environments.

PubMed

Mitchell, D; Morrow, Philip J; Nugent, Chris D

2014-01-01

Activity recognition is used in a wide range of applications including healthcare and security. In a smart environment activity recognition can be used to monitor and support the activities of a user. There have been a range of methods used in activity recognition including sensor-based approaches, vision-based approaches and ontological approaches. This paper presents a novel approach to activity recognition in a smart home environment which combines sensor and video data through an ontological framework. The ontology describes the relationships and interactions between activities, the user, objects, sensors and video data.
Two areas for familiar face recognition in the primate brain.

PubMed

Landi, Sofia M; Freiwald, Winrich A

2017-08-11

Familiarity alters face recognition: Familiar faces are recognized more accurately than unfamiliar ones and under difficult viewing conditions when unfamiliar face recognition fails. The neural basis for this fundamental difference remains unknown. Using whole-brain functional magnetic resonance imaging, we found that personally familiar faces engage the macaque face-processing network more than unfamiliar faces. Familiar faces also recruited two hitherto unknown face areas at anatomically conserved locations within the perirhinal cortex and the temporal pole. These two areas, but not the core face-processing network, responded to familiar faces emerging from a blur with a characteristic nonlinear surge, akin to the abruptness of familiar face recognition. In contrast, responses to unfamiliar faces and objects remained linear. Thus, two temporal lobe areas extend the core face-processing network into a familiar face-recognition system. Copyright © 2017 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.
A Multistage Approach for Image Registration.

PubMed

Bowen, Francis; Hu, Jianghai; Du, Eliza Yingzi

2016-09-01

Successful image registration is an important step for object recognition, target detection, remote sensing, multimodal content fusion, scene blending, and disaster assessment and management. The geometric and photometric variations between images adversely affect the ability for an algorithm to estimate the transformation parameters that relate the two images. Local deformations, lighting conditions, object obstructions, and perspective differences all contribute to the challenges faced by traditional registration techniques. In this paper, a novel multistage registration approach is proposed that is resilient to view point differences, image content variations, and lighting conditions. Robust registration is realized through the utilization of a novel region descriptor which couples with the spatial and texture characteristics of invariant feature points. The proposed region descriptor is exploited in a multistage approach. A multistage process allows the utilization of the graph-based descriptor in many scenarios thus allowing the algorithm to be applied to a broader set of images. Each successive stage of the registration technique is evaluated through an effective similarity metric which determines subsequent action. The registration of aerial and street view images from pre- and post-disaster provide strong evidence that the proposed method estimates more accurate global transformation parameters than traditional feature-based methods. Experimental results show the robustness and accuracy of the proposed multistage image registration methodology.
Modulation of microsaccades by spatial frequency during object categorization.

PubMed

Craddock, Matt; Oppermann, Frank; Müller, Matthias M; Martinovic, Jasna

2017-01-01

The organization of visual processing into a coarse-to-fine information processing based on the spatial frequency properties of the input forms an important facet of the object recognition process. During visual object categorization tasks, microsaccades occur frequently. One potential functional role of these eye movements is to resolve high spatial frequency information. To assess this hypothesis, we examined the rate, amplitude and speed of microsaccades in an object categorization task in which participants viewed object and non-object images and classified them as showing either natural objects, man-made objects or non-objects. Images were presented unfiltered (broadband; BB) or filtered to contain only low (LSF) or high spatial frequency (HSF) information. This allowed us to examine whether microsaccades were modulated independently by the presence of a high-level feature - the presence of an object - and by low-level stimulus characteristics - spatial frequency. We found a bimodal distribution of saccades based on their amplitude, with a split between smaller and larger microsaccades at 0.4° of visual angle. The rate of larger saccades (⩾0.4°) was higher for objects than non-objects, and higher for objects with high spatial frequency content (HSF and BB objects) than for LSF objects. No effects were observed for smaller microsaccades (<0.4°). This is consistent with a role for larger microsaccades in resolving HSF information for object identification, and previous evidence that more microsaccades are directed towards informative image regions. Copyright © 2016 Elsevier Ltd. All rights reserved.
One-Reason Decision Making Unveiled: A Measurement Model of the Recognition Heuristic

ERIC Educational Resources Information Center

Hilbig, Benjamin E.; Erdfelder, Edgar; Pohl, Rudiger F.

2010-01-01

The fast-and-frugal recognition heuristic (RH) theory provides a precise process description of comparative judgments. It claims that, in suitable domains, judgments between pairs of objects are based on recognition alone, whereas further knowledge is ignored. However, due to the confound between recognition and further knowledge, previous…
Pattern recognition for passive polarimetric data using nonparametric classifiers

NASA Astrophysics Data System (ADS)

Thilak, Vimal; Saini, Jatinder; Voelz, David G.; Creusere, Charles D.

2005-08-01

Passive polarization based imaging is a useful tool in computer vision and pattern recognition. A passive polarization imaging system forms a polarimetric image from the reflection of ambient light that contains useful information for computer vision tasks such as object detection (classification) and recognition. Applications of polarization based pattern recognition include material classification and automatic shape recognition. In this paper, we present two target detection algorithms for images captured by a passive polarimetric imaging system. The proposed detection algorithms are based on Bayesian decision theory. In these approaches, an object can belong to one of any given number of classes, and classification involves making decisions that minimize the average probability of making incorrect decisions. This minimum is achieved by assigning an object to the class that maximizes the a posteriori probability. Computing a posteriori probabilities requires estimates of class conditional probability density functions (likelihoods) and prior probabilities. A probabilistic neural network (PNN), which is a nonparametric method that can compute Bayes optimal boundaries, and a k-nearest neighbor (KNN) classifier are used for density estimation and classification. The proposed algorithms are applied to polarimetric image data gathered in the laboratory with a liquid crystal-based system. The experimental results validate the effectiveness of the above algorithms for target detection from polarimetric data.
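The decision rule in this abstract, assigning an object to the class with the largest a posteriori probability, can be sketched with a Gaussian-kernel (Parzen window) density estimate, which is the core of a probabilistic neural network. This is an illustrative sketch only: the feature vectors, the kernel width sigma and the function name pnn_classify are assumptions, not details taken from the paper.

```python
import numpy as np

def pnn_classify(x, train_x, train_y, sigma=0.5, priors=None):
    """Return the class that maximizes prior * estimated likelihood for sample x.

    train_x: (n_samples, n_features) training vectors; train_y: (n_samples,) labels.
    priors:  optional sequence of class priors ordered like np.unique(train_y);
             if omitted, class frequencies in the training set are used.
    """
    classes = np.unique(train_y)
    scores = []
    for i, c in enumerate(classes):
        pts = train_x[train_y == c]
        sq_dist = np.sum((pts - x) ** 2, axis=1)
        # Parzen estimate of p(x | c); the Gaussian normalizing constant is
        # identical for every class, so it is dropped from the comparison.
        likelihood = np.mean(np.exp(-sq_dist / (2.0 * sigma ** 2)))
        prior = len(pts) / len(train_x) if priors is None else priors[i]
        scores.append(prior * likelihood)
    return classes[int(np.argmax(scores))]
```

A KNN variant would replace the kernel average with the fraction of the k nearest training samples that belong to each class.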

3D interactive augmented reality-enhanced digital learning systems for mobile devices

NASA Astrophysics Data System (ADS)

Feng, Kai-Ten; Tseng, Po-Hsuan; Chiu, Pei-Shuan; Yang, Jia-Lin; Chiu, Chun-Jie

2013-03-01

With the enhanced processing capability of mobile platforms, augmented reality (AR) has been considered a promising technology for achieving enhanced user experiences (UX). Augmented reality imposes virtual information, e.g., videos and images, onto a live-view digital display. UX of the real-world environment via the display can be effectively enhanced with the adoption of interactive AR technology. Enhancement of UX can be beneficial for digital learning systems. There are existing research works based on AR that target the design of e-learning systems. However, none of these works focuses on providing three-dimensional (3-D) object modeling for enhanced UX based on interactive AR techniques. In this paper, 3-D interactive augmented reality-enhanced learning (IARL) systems are proposed to provide enhanced UX for digital learning. The proposed IARL systems consist of two major components: markerless pattern recognition (MPR) for 3-D models and velocity-based object tracking (VOT) algorithms. A realistic implementation of the proposed IARL system is conducted on Android-based mobile platforms. UX in digital learning can be greatly improved with the adoption of the proposed IARL systems.
Orientation estimation of anatomical structures in medical images for object recognition

NASA Astrophysics Data System (ADS)

Bağci, Ulaş; Udupa, Jayaram K.; Chen, Xinjian

2011-03-01

Recognition of anatomical structures is an important step in model-based medical image segmentation. It provides pose estimation of objects and information about roughly "where" the objects are in the image, distinguishing them from other object-like entities. In [1], we presented a general method of model-based multi-object recognition to assist in segmentation (delineation) tasks. It exploits the pose relationship that can be encoded, via the concept of ball scale (b-scale), between the binary training objects and their associated grey images. The goal was to place the model, in a single shot, close to the right pose (position, orientation, and scale) in a given image so that the model boundaries fall in the close vicinity of object boundaries in the image. Unlike position and scale parameters, we observe that orientation parameters require more attention when estimating the pose of the model, as even small differences in orientation parameters can lead to inappropriate recognition. Motivated by the non-Euclidean nature of the pose information, we propose in this paper the use of non-Euclidean metrics to estimate orientation of the anatomical structures for more accurate recognition and segmentation. We statistically analyze and evaluate the following metrics for orientation estimation: Euclidean, Log-Euclidean, Root-Euclidean, Procrustes Size-and-Shape, and mean Hermitian metrics. The results show that mean Hermitian and Cholesky decomposition metrics provide more accurate orientation estimates than other Euclidean and non-Euclidean metrics.
Influence of motion on face recognition.

PubMed

Bonfiglio, Natale S; Manfredi, Valentina; Pessa, Eliano

2012-02-01

The influence of motion information and temporal associations on recognition of non-familiar faces was investigated using two groups which performed a face recognition task. One group was presented with regular temporal sequences of face views designed to produce the impression of motion of the face rotating in depth, the other group with random sequences of the same views. In one condition, participants viewed the sequences of the views in rapid succession with a negligible interstimulus interval (ISI). This condition was characterized by three different presentation times. In another condition, participants were presented a sequence with a 1-sec. ISI among the views. Regular sequences of views with a negligible ISI and a shorter presentation time were hypothesized to give rise to better recognition, related to a stronger impression of face rotation. Analysis of data from 45 participants showed that a shorter presentation time was associated with significantly better accuracy on the recognition task; however, differences between performances associated with regular and random sequences were not significant.
Recognition and characterization of networks of water bodies in the Arctic ice-wedge polygonal tundra using high-resolution satellite imagery

NASA Astrophysics Data System (ADS)

Skurikhin, A. N.; Gangodagamage, C.; Rowland, J. C.; Wilson, C. J.

2013-12-01

Arctic lowland landscapes underlain by permafrost are often characterized by polygon-like patterns such as ice-wedge polygons outlined by networks of ice wedges and complemented with polygon rims, troughs, shallow ponds and thermokarst lakes. Polygonal patterns and corresponding features are relatively easy to recognize in high spatial resolution satellite imagery by a human, but their automated recognition is challenging due to the variability in their spectral appearance, the irregularity of individual trough spacing and orientation within the patterns, and a lack of unique spectral response attributable to troughs with widths commonly between 1 m and 2 m. Accurate identification of fine scale elements of ice-wedge polygonal tundra is important as their imprecise recognition may bias estimates of water, heat and carbon fluxes in large-scale climate models. Our focus is on the problem of identification of Arctic polygonal tundra fine-scale landscape elements (as small as 1 m - 2 m width). The challenge of the considered problem is that while large water bodies (e.g. lakes and rivers) can be recognized based on spectral response, reliable recognition of troughs is more difficult. Troughs do not have unique spectral signature, their appearance is noisy (edges are not strong), their width is small, and they often form connected networks with ponds and lakes, and thus they have overlapping spectral response with other water bodies and surrounding non-water bodies. We present a semi-automated approach to identify and classify Arctic polygonal tundra landscape components across the range of spatial scales, such as troughs, ponds, river- and lake-like objects, using high spatial resolution satellite imagery. 
The novelty of the approach lies in: (1) the combined use of segmentation and shape-based classification to identify a broad range of water bodies, including troughs, and (2) the use of high-resolution WorldView-2 satellite imagery (with resolution of 0.6 m) for this identification. The approach starts by segmenting water bodies from an image, which are then categorized using shape-based classification. Segmentation uses a combination of pan-sharpened multispectral bands and is based on the active contours without edges technique. The segmentation is robust to noise and can detect objects with weak boundaries, which is important for extraction of troughs. We then categorize the segmented regions via shape-based classification. Because segmentation accuracy is the main factor impacting the quality of the shape-based classification, for segmentation accuracy assessment we created a reference image using a WorldView-2 satellite image of ice-wedge polygonal tundra. The reference image contained manually labelled image regions which cover components of drainage networks, such as troughs, ponds, rivers and lakes. The evaluation has shown that the approach provides good segmentation accuracy and reasonable classification results. The overall accuracy of the segmentation is approximately 95%, and the segmentation user's and producer's accuracies are approximately 92% and 97% respectively.
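For readers unfamiliar with the three accuracy figures quoted above, the following sketch shows how overall, user's, and producer's accuracy are conventionally derived from a two-class (water versus non-water) confusion matrix. The pixel counts are hypothetical and are not taken from the study; the function name is also an assumption.

```python
def segmentation_accuracies(tp, fp, fn, tn):
    """Standard remote-sensing accuracy measures for a two-class segmentation.

    tp: water pixels labelled water      fp: non-water pixels labelled water
    fn: water pixels labelled non-water  tn: non-water pixels labelled non-water
    """
    total = tp + fp + fn + tn
    return {
        "overall": (tp + tn) / total,   # share of all pixels labelled correctly
        "users": tp / (tp + fp),        # share of pixels labelled water that are water
        "producers": tp / (tp + fn),    # share of true water pixels that were found
    }


# Hypothetical counts, for illustration only.
result = segmentation_accuracies(tp=90, fp=10, fn=30, tn=70)
print(result)
```

User's accuracy falls as commission errors rise (false water), while producer's accuracy falls as omission errors rise (missed water), which is why both are reported alongside the overall figure.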
Mechanisms and neural basis of object and pattern recognition: a study with chess experts.

PubMed

Bilalić, Merim; Langner, Robert; Erb, Michael; Grodd, Wolfgang

2010-11-01

Comparing experts with novices offers unique insights into the functioning of cognition, based on the maximization of individual differences. Here we used this expertise approach to disentangle the mechanisms and neural basis behind two processes that contribute to everyday expertise: object and pattern recognition. We compared chess experts and novices performing chess-related and -unrelated (visual) search tasks. As expected, the superiority of experts was limited to the chess-specific task, as there were no differences in a control task that used the same chess stimuli but did not require chess-specific recognition. The analysis of eye movements showed that experts immediately and exclusively focused on the relevant aspects in the chess task, whereas novices also examined irrelevant aspects. With random chess positions, when pattern knowledge could not be used to guide perception, experts nevertheless maintained an advantage. Experts' superior domain-specific parafoveal vision, a consequence of their knowledge about individual domain-specific symbols, enabled improved object recognition. Functional magnetic resonance imaging corroborated this differentiation between object and pattern recognition and showed that chess-specific object recognition was accompanied by bilateral activation of the occipitotemporal junction, whereas chess-specific pattern recognition was related to bilateral activations in the middle part of the collateral sulci. Using the expertise approach together with carefully chosen controls and multiple dependent measures, we identified object and pattern recognition as two essential cognitive processes in expert visual cognition, which may also help to explain the mechanisms of everyday perception.
Automated Recognition of 3D Features in GPIR Images

NASA Technical Reports Server (NTRS)

Park, Han; Stough, Timothy; Fijany, Amir

2007-01-01

A method of automated recognition of three-dimensional (3D) features in images generated by ground-penetrating imaging radar (GPIR) is undergoing development. GPIR 3D images can be analyzed to detect and identify such subsurface features as pipes and other utility conduits. Until now, much of the analysis of GPIR images has been performed manually by expert operators who must visually identify and track each feature. The present method is intended to satisfy a need for more efficient and accurate analysis by means of algorithms that can automatically identify and track subsurface features, with minimal supervision by human operators. In this method, data from multiple sources (for example, data on different features extracted by different algorithms) are fused together for identifying subsurface objects. The algorithms of this method can be classified in several different ways. In one classification, the algorithms fall into three classes: (1) image-processing algorithms, (2) feature-extraction algorithms, and (3) a multiaxis data-fusion/pattern-recognition algorithm that includes a combination of machine-learning, pattern-recognition, and object-linking algorithms. The image-processing class includes preprocessing algorithms for reducing noise and enhancing target features for pattern recognition. The feature-extraction algorithms operate on preprocessed data to extract such specific features in images as two-dimensional (2D) slices of a pipe. Then the multiaxis data-fusion/pattern-recognition algorithm identifies, classifies, and reconstructs 3D objects from the extracted features. In this process, multiple 2D features extracted by use of different algorithms and representing views along different directions are used to identify and reconstruct 3D objects.
In object linking, which is an essential part of this process, features identified in successive 2D slices and located within a threshold radius of identical features in adjacent slices are linked in a directed-graph data structure. Relative to past approaches, this multiaxis approach offers the advantages of more reliable detections, better discrimination of objects, and provision of redundant information, which can be helpful in filling gaps in feature recognition by one of the component algorithms. The image-processing class also includes postprocessing algorithms that enhance identified features to prepare them for further scrutiny by human analysts (see figure). Enhancement of images as a postprocessing step is a significant departure from traditional practice, in which enhancement of images is a preprocessing step.
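The object-linking step can be sketched as follows. This is a simplified illustration rather than the NASA implementation: it assumes each feature is reduced to a 2D centre point, ignores the check that linked features are of the same type, and stores the directed graph as an adjacency dictionary. All names and coordinates are hypothetical.

```python
import math


def link_features(slices, radius):
    """Link features in successive 2D slices that lie within `radius` of each other.

    slices: list of slices, each a list of (x, y) feature centres.
    Returns a directed graph as {(slice_idx, feat_idx): [(slice_idx + 1, feat_idx), ...]}.
    """
    edges = {}
    for i in range(len(slices) - 1):
        for a, p in enumerate(slices[i]):
            for b, q in enumerate(slices[i + 1]):
                if math.dist(p, q) <= radius:
                    edges.setdefault((i, a), []).append((i + 1, b))
    return edges


def trace(edges, start):
    """Follow the first outgoing link from `start` until no link remains."""
    path = [start]
    while path[-1] in edges:
        path.append(edges[path[-1]][0])
    return path


# Hypothetical feature centres in three adjacent slices.
slices = [
    [(1.0, 1.0), (5.0, 5.0)],
    [(1.2, 1.1), (9.0, 9.0)],
    [(1.3, 1.3)],
]
edges = link_features(slices, radius=0.5)
print(trace(edges, (0, 0)))
```

A chain of linked nodes running through many slices is the kind of structure that would correspond to a pipe-like object, whereas isolated nodes are more likely to be noise.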
Neurocomputational bases of object and face recognition.

PubMed Central

Biederman, I; Kalocsai, P

1997-01-01

A number of behavioural phenomena distinguish the recognition of faces and objects, even when members of a set of objects are highly similar. Because faces have the same parts in approximately the same relations, individuation of faces typically requires specification of the metric variation in a holistic and integral representation of the facial surface. The direct mapping of a hypercolumn-like pattern of activation onto a representation layer that preserves relative spatial filter values in a two-dimensional (2D) coordinate space, as proposed by C. von der Malsburg and his associates, may account for many of the phenomena associated with face recognition. An additional refinement, in which each column of filters (termed a 'jet') is centred on a particular facial feature (or fiducial point), allows selectivity of the input into the holistic representation to avoid incorporation of occluding or nearby surfaces. The initial hypercolumn representation also characterizes the first stage of object perception, but the image variation for objects at a given location in a 2D coordinate space may be too great to yield sufficient predictability directly from the output of spatial kernels. Consequently, objects can be represented by a structural description specifying qualitative (typically, non-accidental) characterizations of an object's parts, the attributes of the parts, and the relations among the parts, largely based on orientation and depth discontinuities (as shown by Hummel & Biederman). A series of experiments on the name priming or physical matching of complementary images (in the Fourier domain) of objects and faces documents that whereas face recognition is strongly dependent on the original spatial filter values, evidence from object recognition indicates strong invariance to these values, even when distinguishing among objects that are as similar as faces. PMID:9304687
A Two-Step Classification Approach to Distinguishing Similar Objects in Mobile LIDAR Point Clouds

NASA Astrophysics Data System (ADS)

He, H.; Khoshelham, K.; Fraser, C.

2017-09-01

Nowadays, lidar is widely used in cultural heritage documentation, urban modeling, and driverless car technology for its fast and accurate 3D scanning ability. However, full exploitation of the potential of point cloud data for efficient and automatic object recognition remains elusive. Recently, feature-based methods have become very popular in object recognition on account of their good performance in capturing object details. Compared with global features describing the whole shape of the object, local features recording the fractional details are more discriminative and are applicable to object classes with considerable similarity. In this paper, we propose a two-step classification approach based on point feature histograms and the bag-of-features method for automatic recognition of similar objects in mobile lidar point clouds. Lamp posts, street lights and traffic signs are grouped as one category in the first-step classification because they are more similar to one another than to trees and vehicles. A finer classification of the lamp posts, street lights and traffic signs, based on the result of the first-step classification, is implemented in the second step. The proposed two-step classification approach is shown to yield a considerable improvement over the conventional one-step classification approach.
Smart mobile robot system for rubbish collection

NASA Astrophysics Data System (ADS)

Ali, Mohammed A. H.; Sien Siang, Tan

2018-03-01

This paper records the research and procedures for developing a smart mobile robot with a detection system to collect rubbish. The objective is to design a mobile robot that can detect and recognize medium-size rubbish such as drinking cans, estimate the position of the rubbish relative to the robot, and approach the rubbish based on that position. The paper reviews types of image processing, detection and recognition methods, and image filters. This project implements the RGB subtraction method as the prior system. In addition, an algorithm for distance measurement based on the image plane is implemented. The project is limited to using a computer webcam as the sensor. Furthermore, the robot is only able to approach the nearest rubbish within the camera's field of view, and only rubbish that contains RGB colour components on its body.
A study of payload specialist station monitor size constraints. [space shuttle orbiters]

NASA Technical Reports Server (NTRS)

Kirkpatrick, M., III; Shields, N. L., Jr.; Malone, T. B.

1975-01-01

Constraints on the CRT display size for the shuttle orbiter cabin are studied. The viewing requirements placed on these monitors were assumed to involve display of imaged scenes providing visual feedback during payload operations and display of alphanumeric characters. Data on target recognition/resolution, target recognition, and range rate detection by human observers were utilized to determine viewing requirements for imaged scenes. Field-of-view and acuity requirements for a variety of payload operations were obtained along with the necessary detection capability in terms of range-to-target size ratios. The monitor size necessary to meet the acuity requirements was established. An empirical test was conducted to determine required recognition sizes for displayed alphanumeric characters. The results of the test were used to determine the number of characters which could be simultaneously displayed based on the recognition size requirements using the proposed monitor size. A CRT display of 20 x 20 cm is recommended. A portion of the display area is used for displaying imaged scenes and the remaining display area is used for alphanumeric characters pertaining to the displayed scene. The entire display is used for the character alone mode.
Object, spatial and social recognition testing in a single test paradigm.

PubMed

Lian, Bin; Gao, Jun; Sui, Nan; Feng, Tingyong; Li, Ming

2018-07-01

Animals have the ability to process information about an object or a conspecific's physical features and location, and alter their behavior when such information is updated. In the laboratory, object, spatial and social recognition are often studied in separate tasks, making them unsuitable to study the potential dissociations and interactions among various types of recognition memories. The present study introduced a single paradigm to detect object and spatial recognition, and social recognition of a familiar and novel conspecific. Specifically, male and female Sprague-Dawley adult (>75 days old) or preadolescent (25-28 days old) rats were tested with two objects and one social partner in an open-field arena for four 10-min sessions with a 20-min inter-session interval. After the first sample session, a new object replaced one of the sampled objects in the second session, and the location of one of the old objects was changed in the third session. Finally, a new social partner was introduced in the fourth session and replaced the familiar one. Exploration time with each stimulus was recorded and measures for the three recognitions were calculated based on the discrimination ratio. Overall results show that adult and preadolescent male and female rats spent more time exploring the social partner than the objects, showing a clear preference for social stimuli over nonsocial ones. They also did not differ in their abilities to discriminate a new object, a new location and a new social partner from a familiar one, and to recognize a familiar conspecific. Acute administration of MK-801 (an NMDA receptor antagonist, 0.025 and 0.10 mg/kg, i.p.) after the sample session dose-dependently reduced the total time spent on exploring the social partner and objects in the adult rats, and had a significantly larger effect in the females than in the males. MK-801 also dose-dependently increased motor activity.
However, it did not alter the object, spatial and social recognitions. These findings indicate that the new triple recognition paradigm is capable of recording the object, spatial location and social recognition together and revealing potential sex and age differences. This paradigm is also useful for the study of object and social exploration concurrently and can be used to evaluate cognition-altering drugs in various stages of recognition memories. Copyright © 2018. Published by Elsevier Inc.
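The abstract does not state which form of the discrimination ratio was used. As a sketch, one definition commonly used in spontaneous recognition tasks is the difference between exploration times for the novel and familiar stimulus divided by their sum; the function name and the example times below are hypothetical.

```python
def discrimination_ratio(novel_s, familiar_s):
    """(novel - familiar) / (novel + familiar), with exploration times in seconds.

    Ranges from -1 (only the familiar stimulus explored) to +1 (only the novel one);
    0 means no preference. This is an assumed definition, not one quoted from the study.
    """
    total = novel_s + familiar_s
    if total == 0:
        raise ValueError("no exploration recorded for either stimulus")
    return (novel_s - familiar_s) / total


# Hypothetical exploration times: 30 s on the new object, 10 s on the old one.
print(discrimination_ratio(30.0, 10.0))  # 0.5
```

Because the ratio is normalized by total exploration, a drug that reduces overall exploration time can leave it unchanged, which matches the pattern reported here for MK-801.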
Implementation of a Peltier-based cooling device for localized deep cortical deactivation during in vivo object recognition testing

NASA Astrophysics Data System (ADS)

Marra, Kyle; Graham, Brett; Carouso, Samantha; Cox, David

2012-02-01

While the application of local cortical cooling has recently become a focus of neurological research, extended localized deactivation deep within brain structures is still unexplored. Using a wirelessly controlled thermoelectric (Peltier) device and water-based heat sink, we have achieved inactivating temperatures (<20 °C) at greater depths (>8 mm) than previously reported. After implanting the device into Long Evans rats' basolateral amygdala (BLA), an inhibitory brain center that controls anxiety and fear, we ran an open field test during which anxiety-driven behavioral tendencies were observed to decrease during cooling, thus confirming the device's effect on behavior. Our device will next be implanted in the rats' temporal association cortex (TeA) and recordings from our signal-tracing multichannel microelectrodes will measure and compare activated and deactivated neuronal activity so as to isolate and study the TeA signals responsible for object recognition. Having already achieved a top performing computational face-recognition system, the lab will utilize this TeA activity data to generalize its computational efforts of face recognition to achieve general object recognition.
Deficits in long-term recognition memory reveal dissociated subtypes in congenital prosopagnosia.

PubMed

Stollhoff, Rainer; Jost, Jürgen; Elze, Tobias; Kennerknecht, Ingo

2011-01-25

The study investigates long-term recognition memory in congenital prosopagnosia (CP), a lifelong impairment in face identification that is present from birth. Previous investigations of processing deficits in CP have mostly relied on short-term recognition tests to estimate the scope and severity of individual deficits. We firstly report on a controlled test of long-term (one year) recognition memory for faces and objects conducted with a large group of participants with CP. Long-term recognition memory is significantly impaired in eight CP participants (CPs). In all but one case, this deficit was selective to faces and didn't extend to intra-class recognition of object stimuli. In a test of famous face recognition, long-term recognition deficits were less pronounced, even after accounting for differences in media consumption between controls and CPs. Secondly, we combined test results on long-term and short-term recognition of faces and objects, and found a large heterogeneity in severity and scope of individual deficits. Analysis of the observed heterogeneity revealed a dissociation of CP into subtypes with a homogeneous phenotypical profile. Thirdly, we found that among CPs self-assessment of real-life difficulties, based on a standardized questionnaire, and experimentally assessed face recognition deficits are strongly correlated. Our results demonstrate that controlled tests of long-term recognition memory are needed to fully assess face recognition deficits in CP. Based on controlled and comprehensive experimental testing, CP can be dissociated into subtypes with a homogeneous phenotypical profile. The CP subtypes identified align with those found in prosopagnosia caused by cortical lesions; they can be interpreted with respect to a hierarchical neural system for face perception.
Deficits in Long-Term Recognition Memory Reveal Dissociated Subtypes in Congenital Prosopagnosia

PubMed Central

Stollhoff, Rainer; Jost, Jürgen; Elze, Tobias; Kennerknecht, Ingo

2011-01-01

The study investigates long-term recognition memory in congenital prosopagnosia (CP), a lifelong impairment in face identification that is present from birth. Previous investigations of processing deficits in CP have mostly relied on short-term recognition tests to estimate the scope and severity of individual deficits. We firstly report on a controlled test of long-term (one year) recognition memory for faces and objects conducted with a large group of participants with CP. Long-term recognition memory is significantly impaired in eight CP participants (CPs). In all but one case, this deficit was selective to faces and didn't extend to intra-class recognition of object stimuli. In a test of famous face recognition, long-term recognition deficits were less pronounced, even after accounting for differences in media consumption between controls and CPs. Secondly, we combined test results on long-term and short-term recognition of faces and objects, and found a large heterogeneity in severity and scope of individual deficits. Analysis of the observed heterogeneity revealed a dissociation of CP into subtypes with a homogeneous phenotypical profile. Thirdly, we found that among CPs self-assessment of real-life difficulties, based on a standardized questionnaire, and experimentally assessed face recognition deficits are strongly correlated. Our results demonstrate that controlled tests of long-term recognition memory are needed to fully assess face recognition deficits in CP. Based on controlled and comprehensive experimental testing, CP can be dissociated into subtypes with a homogeneous phenotypical profile. The CP subtypes identified align with those found in prosopagnosia caused by cortical lesions; they can be interpreted with respect to a hierarchical neural system for face perception. PMID:21283572
Apparent Frequency of Words and Pictures as a Function of Pronunciation and Imagery. Technical Report No. 238.

ERIC Educational Resources Information Center

Ghatala, Elizabeth S.; And Others

This study applied a frequency theory to measure the superiority of pictures over words in both discrimination learning and recognition memory tasks. Three groups of sixth grade students were given separate instructions before viewing slides of either common objects or words. The first group (control) was asked to study the items shown, the second…
Automatic recognition of ship types from infrared images using superstructure moment invariants

NASA Astrophysics Data System (ADS)

Li, Heng; Wang, Xinyu

2007-11-01

Automatic object recognition is an active area of interest for military and commercial applications. In this paper, a system for autonomous recognition of ship types in infrared images is proposed. First, a segmentation approach based on detection of salient features of the target, followed by shadow removal, is proposed; it forms the basis of the subsequent object recognition. Because the differences between the shapes of various ships lie mainly in their superstructures, we then use superstructure moment functions that are invariant to translation, rotation and scale differences in input patterns, and develop a robust algorithm for obtaining the ship superstructure. Subsequently, a back-propagation neural network is used as a classifier in the recognition stage, and projection images of simulated three-dimensional ship models are used as the training sets. Our recognition model was implemented and experimentally validated using both simulated three-dimensional ship model images and real images derived from video of an AN/AAS-44V Forward Looking Infrared (FLIR) sensor.
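The abstract does not specify which moment functions were used. As an illustrative sketch of the idea, the code below computes the first of Hu's classical moment invariants for a binary silhouette; this is an assumed stand-in, not the authors' superstructure moments. Centring on the centroid removes translation, the sum of the two second-order moments is unchanged by rotation, and division by the squared area normalizes scale (only approximately on a pixel grid).

```python
def hu_phi1(pixels):
    """First Hu invariant, eta20 + eta02, of a binary shape.

    pixels: iterable of (x, y) coordinates of the foreground pixels.
    """
    pts = list(pixels)
    m00 = len(pts)                       # zeroth moment: area in pixels
    cx = sum(x for x, _ in pts) / m00    # centroid
    cy = sum(y for _, y in pts) / m00
    mu20 = sum((x - cx) ** 2 for x, _ in pts)
    mu02 = sum((y - cy) ** 2 for _, y in pts)
    # eta_pq = mu_pq / m00 ** (1 + (p + q) / 2); the exponent is 2 when p + q = 2.
    return (mu20 + mu02) / m00 ** 2


# Hypothetical L-shaped silhouette, then a shifted and a 90-degree rotated copy.
shape = [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (0, 2)]
shifted = [(x + 7, y - 3) for x, y in shape]
rotated = [(-y, x) for x, y in shape]

print(hu_phi1(shape), hu_phi1(shifted), hu_phi1(rotated))
```

In a pipeline like the one described, a vector of several such invariants computed on the extracted superstructure would be the input to the neural network classifier.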
Mechanisms and Neural Basis of Object and Pattern Recognition: A Study with Chess Experts

ERIC Educational Resources Information Center

Bilalic, Merim; Langner, Robert; Erb, Michael; Grodd, Wolfgang

2010-01-01

Comparing experts with novices offers unique insights into the functioning of cognition, based on the maximization of individual differences. Here we used this expertise approach to disentangle the mechanisms and neural basis behind two processes that contribute to everyday expertise: object and pattern recognition. We compared chess experts and…
Developmental Trajectories of Part-Based and Configural Object Recognition in Adolescence

ERIC Educational Resources Information Center

Juttner, Martin; Wakui, Elley; Petters, Dean; Kaur, Surinder; Davidoff, Jules

2013-01-01

Three experiments assessed the development of children's part and configural (part-relational) processing in object recognition during adolescence. In total, 312 school children aged 7-16 years and 80 adults were tested in 3-alternative forced choice (3-AFC) tasks. They judged the correct appearance of upright and inverted presented familiar…
The “parts and wholes” of face recognition: a review of the literature

PubMed Central

Tanaka, James W.; Simonyi, Diana

2016-01-01

It has been claimed that faces are recognized as a “whole” rather than by the recognition of individual parts. In a paper published in the Quarterly Journal of Experimental Psychology in 1993, Martha Farah and I attempted to operationalize the holistic claim using the part/whole task. In this task, participants studied a face and then their memory for a face part was tested with the part presented in isolation and in the whole face. Consistent with the holistic view, recognition of the part was superior when tested in the whole-face condition compared to when it was tested in isolation. The “whole face” or holistic advantage was not found for faces that were inverted or scrambled, nor for non-face objects, suggesting that holistic encoding was specific to normal, intact faces. In this paper, we reflect on the part/whole paradigm and how it has contributed to our understanding of what it means to recognize a face as a “whole” stimulus. We describe the value of the part/whole task for developing theories of holistic and non-holistic recognition of faces and objects. We discuss the research that has probed the neural substrates of holistic processing in healthy adults and people with prosopagnosia and autism. Finally, we examine how experience shapes holistic face recognition in children and recognition of own- and other-race faces in adults. The goal of this article is to summarize the research on the part/whole task and speculate on how it has informed our understanding of holistic face processing. PMID:26886495
The "parts and wholes" of face recognition: A review of the literature.

PubMed

Tanaka, James W; Simonyi, Diana

2016-10-01

It has been claimed that faces are recognized as a "whole" rather than by the recognition of individual parts. In a paper published in the Quarterly Journal of Experimental Psychology in 1993, Martha Farah and I attempted to operationalize the holistic claim using the part/whole task. In this task, participants studied a face and then their memory for a face part was tested with the part presented in isolation and in the whole face. Consistent with the holistic view, recognition of the part was superior when tested in the whole-face condition compared to when it was tested in isolation. The "whole face" or holistic advantage was not found for faces that were inverted or scrambled, nor for non-face objects, suggesting that holistic encoding was specific to normal, intact faces. In this paper, we reflect on the part/whole paradigm and how it has contributed to our understanding of what it means to recognize a face as a "whole" stimulus. We describe the value of the part/whole task for developing theories of holistic and non-holistic recognition of faces and objects. We discuss the research that has probed the neural substrates of holistic processing in healthy adults and people with prosopagnosia and autism. Finally, we examine how experience shapes holistic face recognition in children and recognition of own- and other-race faces in adults. The goal of this article is to summarize the research on the part/whole task and speculate on how it has informed our understanding of holistic face processing.

Knowledge-based vision for space station object motion detection, recognition, and tracking

NASA Technical Reports Server (NTRS)

Symosek, P.; Panda, D.; Yalamanchili, S.; Wehner, W., III

1987-01-01

Computer vision, especially color image analysis and understanding, has much to offer in the area of the automation of Space Station tasks such as construction, satellite servicing, rendezvous and proximity operations, inspection, experiment monitoring, data management and training. Knowledge-based techniques improve the performance of vision algorithms for unstructured environments because of their ability to deal with imprecise a priori information or inaccurately estimated feature data and still produce useful results. Conventional techniques using statistical and purely model-based approaches lack flexibility in dealing with the variabilities anticipated in the unstructured viewing environment of space. Algorithms developed under NASA sponsorship for Space Station applications to demonstrate the value of a hypothesized architecture for a Video Image Processor (VIP) are presented. Approaches to the enhancement of the performance of these algorithms with knowledge-based techniques and the potential for deployment of highly-parallel multi-processor systems for these algorithms are discussed.
Dietary effects on object recognition: The impact of high-fat high-sugar diets on recollection and familiarity-based memory.

PubMed

Tran, Dominic M D; Westbrook, R Frederick

2018-05-31

Exposure to a high-fat high-sugar (HFHS) diet rapidly impairs novel-place- but not novel-object-recognition memory in rats (Tran & Westbrook, 2015, 2017). Three experiments sought to investigate the generality of diet-induced cognitive deficits by examining whether there are conditions under which object-recognition memory is impaired. Experiments 1 and 3 tested the strength of short- and long-term object-memory trace, respectively, by varying the interval of time between object familiarization and subsequent novel object test. Experiment 2 tested the effect of increasing working memory load on object-recognition memory by interleaving additional object exposures between familiarization and test in an n-back style task. Experiments 1-3 failed to detect any differences in object recognition between HFHS and control rats. Experiment 4 controlled for object novelty by separately familiarizing both objects presented at test, which included one remote-familiar and one recent-familiar object. Under these conditions, when test objects differed in their relative recency, HFHS rats showed a weaker memory trace for the remote object compared to chow rats. This result suggests that the diet leaves intact recollection judgments, but impairs familiarity judgments. We speculate that the HFHS diet adversely affects "where" memories as well as the quality of "what" memories, and discuss these effects in relation to recollection and familiarity memory models, hippocampal-dependent functions, and episodic food memories. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Object recognition in images via a factor graph model

NASA Astrophysics Data System (ADS)

He, Yong; Wang, Long; Wu, Zhaolin; Zhang, Haisu

2018-04-01

Object recognition in images suffers from a huge search space and uncertain object profiles. Recently, Bag-of-Words methods have been used to address these problems, especially the two-dimensional CRF (Conditional Random Field) model. In this paper we propose a method based on a general and flexible factor graph model, which can capture long-range correlations in Bag-of-Words by constructing a network learning framework, in contrast to the lattice used in CRF. Furthermore, we explore a parameter learning algorithm based on gradient descent and Loopy Sum-Product algorithms for the factor graph model. Experimental results on the Graz 02 dataset show that the recognition performance of our method in precision and recall is better than that of a state-of-the-art method and the original CRF model, demonstrating the effectiveness of the proposed method.
Image object recognition based on the Zernike moment and neural networks

NASA Astrophysics Data System (ADS)

Wan, Jianwei; Wang, Ling; Huang, Fukan; Zhou, Liangzhu

1998-03-01

This paper first gives a comprehensive discussion of the concept of artificial neural networks, their research methods, and their relation to information processing. On the basis of this discussion, we expound the mathematical similarity between artificial neural networks and information processing. Then, the paper presents a new method of image recognition based on invariant features and a neural network, using the image Zernike transform. The method is not only invariant to rotation, shift and scale of the image object, but also has good fault tolerance and robustness. It is also compared with a statistical classifier and an invariant-moments recognition method.
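The rotation invariance mentioned in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: the function names, the 33 x 33 grid and the plain pixel-sum approximation of the integral are assumptions made here for illustration. It computes one Zernike moment with NumPy; under an image rotation only the phase of the moment changes, so its magnitude can serve as a rotation-invariant feature. Shift and scale invariance would need extra normalisation that is not shown.

```python
import numpy as np
from math import factorial


def radial_poly(n, m, rho):
    """Zernike radial polynomial R_n^|m|(rho); requires n - |m| even and |m| <= n."""
    m = abs(m)
    out = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * factorial(n - s)
                 / (factorial(s)
                    * factorial((n + m) // 2 - s)
                    * factorial((n - m) // 2 - s)))
        out += coeff * rho ** (n - 2 * s)
    return out


def zernike_moment(img, n, m):
    """Discrete approximation of the Zernike moment Z_nm of a square image.

    The image is mapped onto [-1, 1] x [-1, 1] and only pixels inside the
    unit disk contribute (an assumption of this sketch).
    """
    size = img.shape[0]
    coords = np.linspace(-1.0, 1.0, size)
    x, y = np.meshgrid(coords, coords)
    rho = np.hypot(x, y)
    theta = np.arctan2(y, x)
    inside = rho <= 1.0
    basis_conj = radial_poly(n, m, rho) * np.exp(-1j * m * theta)
    pixel_area = (2.0 / (size - 1)) ** 2
    return (n + 1) / np.pi * pixel_area * np.sum(img[inside] * basis_conj[inside])


# Illustrative data: a random image and the same image rotated by 90 degrees.
rng = np.random.default_rng(0)
image = rng.random((33, 33))
rotated = np.rot90(image)

magnitude_original = abs(zernike_moment(image, 4, 2))
magnitude_rotated = abs(zernike_moment(rotated, 4, 2))
```

A 90-degree rotation is used because it maps the pixel grid onto itself, so the two magnitudes agree up to floating-point error; arbitrary angles would add interpolation error.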
Enhancing Perception with Tactile Object Recognition in Adaptive Grippers for Human-Robot Interaction.

PubMed

Gandarias, Juan M; Gómez-de-Gabriel, Jesús M; García-Cerezo, Alfonso J

2018-02-26

The use of tactile perception can help first response robotic teams in disaster scenarios, where visibility conditions are often reduced due to the presence of dust, mud, or smoke, distinguishing human limbs from other objects with similar shapes. Here, the integration of the tactile sensor in adaptive grippers is evaluated, measuring the performance of an object recognition task based on deep convolutional neural networks (DCNNs) using a flexible sensor mounted in adaptive grippers. A total of 15 classes with 50 tactile images each were trained, including human body parts and common environment objects, in semi-rigid and flexible adaptive grippers based on the fin ray effect. The classifier was compared against the rigid configuration and a support vector machine classifier (SVM). Finally, a two-level output network has been proposed to provide both object-type recognition and human/non-human classification. Sensors in adaptive grippers have a higher number of non-null tactels (up to 37% more), with a lower mean of pressure values (up to 72% less) than when using a rigid sensor, with a softer grip, which is needed in physical human-robot interaction (pHRI). A semi-rigid implementation with 95.13% object recognition rate was chosen, even though the human/non-human classification had better results (98.78%) with a rigid sensor.
Bilateral Theta-Burst TMS to Influence Global Gestalt Perception

PubMed Central

Ritzinger, Bernd; Huberle, Elisabeth; Karnath, Hans-Otto

2012-01-01

While early and higher visual areas along the ventral visual pathway in the inferotemporal cortex are critical for the recognition of individual objects, the neural representation of human perception of complex global visual scenes remains under debate. Stroke patients with a selective deficit in the perception of a complex global Gestalt with intact recognition of individual objects – a deficit termed simultanagnosia – greatly helped to study this question. Interestingly, simultanagnosia typically results from bilateral lesions of the temporo-parietal junction (TPJ). The present study aimed to verify the relevance of this area for human global Gestalt perception. We applied continuous theta-burst TMS either unilaterally (left or right) or bilateral simultaneously over TPJ. Healthy subjects were presented with hierarchically organized visual stimuli that allowed parametrical degrading of the object at the global level. Identification of the global Gestalt was significantly modulated only for the bilateral TPJ stimulation condition. Our results strengthen the view that global Gestalt perception in the human brain involves TPJ and is co-dependent on both hemispheres. PMID:23110106
Bilateral theta-burst TMS to influence global gestalt perception.

PubMed

Ritzinger, Bernd; Huberle, Elisabeth; Karnath, Hans-Otto

2012-01-01

While early and higher visual areas along the ventral visual pathway in the inferotemporal cortex are critical for the recognition of individual objects, the neural representation of human perception of complex global visual scenes remains under debate. Stroke patients with a selective deficit in the perception of a complex global Gestalt with intact recognition of individual objects - a deficit termed simultanagnosia - greatly helped to study this question. Interestingly, simultanagnosia typically results from bilateral lesions of the temporo-parietal junction (TPJ). The present study aimed to verify the relevance of this area for human global Gestalt perception. We applied continuous theta-burst TMS either unilaterally (left or right) or bilateral simultaneously over TPJ. Healthy subjects were presented with hierarchically organized visual stimuli that allowed parametrical degrading of the object at the global level. Identification of the global Gestalt was significantly modulated only for the bilateral TPJ stimulation condition. Our results strengthen the view that global Gestalt perception in the human brain involves TPJ and is co-dependent on both hemispheres.
Fluent, fast, and frugal? A formal model evaluation of the interplay between memory, fluency, and comparative judgments.

PubMed

Hilbig, Benjamin E; Erdfelder, Edgar; Pohl, Rüdiger F

2011-07-01

A new process model of the interplay between memory and judgment processes was recently suggested, assuming that retrieval fluency, that is, the speed with which objects are recognized, will determine inferences concerning such objects in a single-cue fashion. This aspect of the fluency heuristic, an extension of the recognition heuristic, has remained largely untested due to methodological difficulties. To overcome the latter, we propose a measurement model from the class of multinomial processing tree models that can estimate true single-cue reliance on recognition and retrieval fluency. We applied this model to aggregate and individual data from a probabilistic inference experiment and considered both goodness of fit and model complexity to evaluate different hypotheses. The results were relatively clear-cut, revealing that the fluency heuristic is an unlikely candidate for describing comparative judgments concerning recognized objects. These findings are discussed in light of a broader theoretical view on the interplay of memory and judgment processes.
Geometry and Gesture-Based Features from Saccadic Eye-Movement as a Biometric in Radiology

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hammond, Tracy; Tourassi, Georgia; Yoon, Hong-Jun

In this study, we present a novel application of sketch gesture recognition on eye-movement for biometric identification and estimating task expertise. The study was performed for the task of mammographic screening with simultaneous viewing of four coordinated breast views as typically done in clinical practice. Eye-tracking data and diagnostic decisions collected for 100 mammographic cases (25 normal, 25 benign, 50 malignant) and 10 readers (three board certified radiologists and seven radiology residents) formed the corpus for this study. Sketch gesture recognition techniques were employed to extract geometric and gesture-based features from saccadic eye-movements. Our results show that saccadic eye-movement, characterized using sketch-based features, results in more accurate models for predicting individual identity and level of expertise than more traditional eye-tracking features.
A 3D camera for improved facial recognition

NASA Astrophysics Data System (ADS)

Lewin, Andrew; Orchard, David A.; Scott, Andrew M.; Walton, Nicholas A.; Austin, Jim

2004-12-01

We describe a camera capable of recording 3D images of objects. It does this by projecting thousands of spots onto an object and then measuring the range to each spot by determining the parallax from a single frame. A second frame can be captured to record a conventional image, which can then be projected onto the surface mesh to form a rendered skin. The camera is capable of locating the images of the spots to a precision of better than one tenth of a pixel, and from this it can determine range to an accuracy of less than 1 mm at 1 meter. The data can be recorded as a set of two images, and is reconstructed by forming a 'wire mesh' of range points and morphing the 2D image over this structure. The camera can be used to record the images of faces and reconstruct the shape of the face, which allows viewing of the face from various angles. This allows images to be more critically inspected for the purpose of identifying individuals. Multiple images can be stitched together to create full panoramic images of head sized objects that can be viewed from any direction. The system is being tested with a graph matching system capable of fast and accurate shape comparisons for facial recognition. It can also be used with "models" of heads and faces to provide a means of obtaining biometric data.
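The link between sub-pixel spot localisation and range accuracy in the abstract above can be illustrated with the standard pinhole triangulation relation z = f·b/d. The following is a minimal sketch and not the authors' calibration: the focal length in pixels and the projector-camera baseline are hypothetical values chosen for illustration, and only the 0.1-pixel precision and the 1 m range are taken from the abstract.

```python
def range_from_parallax(focal_px, baseline_m, disparity_px):
    """Triangulated range z = f * b / d under a simple pinhole model (assumed)."""
    return focal_px * baseline_m / disparity_px


def range_error(focal_px, baseline_m, range_m, disparity_error_px):
    """First-order error propagation: |dz| ~= z**2 * |dd| / (f * b)."""
    return range_m ** 2 * disparity_error_px / (focal_px * baseline_m)


# Hypothetical optics, NOT taken from the paper.
FOCAL_PX = 2000.0    # focal length expressed in pixels (assumption)
BASELINE_M = 0.1     # projector-to-camera baseline in metres (assumption)

z = range_from_parallax(FOCAL_PX, BASELINE_M, disparity_px=200.0)
err = range_error(FOCAL_PX, BASELINE_M, range_m=z, disparity_error_px=0.1)
```

With these assumed values a 0.1-pixel localisation error at about 1 m corresponds to a range error of roughly 0.5 mm, which shows how a sub-millimetre figure can follow from sub-pixel spot localisation; the real figure depends on the actual optics.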
Familiarity and recollection produce distinct eye movement, pupil and medial temporal lobe responses when memory strength is matched.

PubMed

Kafkas, Alexandros; Montaldi, Daniela

2012-11-01

Two experiments explored eye measures (fixations and pupil response patterns) and brain responses (BOLD) accompanying the recognition of visual object stimuli based on familiarity and recollection. In both experiments, the use of a modified remember/know procedure led to high confidence and matched accuracy levels characterising strong familiarity (F3) and recollection (R) responses. In Experiment 1, visual scanning behaviour at retrieval distinguished familiarity-based from recollection-based recognition. Recollection, relative to strength-matched familiarity, involved significantly larger pupil dilations and more dispersed fixation patterns. In Experiment 2, the hippocampus was selectively activated for recollected stimuli, while no evidence of activation was observed in the hippocampus for strong familiarity of matched accuracy. Recollection also activated the parahippocampal cortex (PHC), while the adjacent perirhinal cortex (PRC) was more actively engaged in response to strong familiarity than to recollection. Activity in prefrontal and parietal areas differentiated familiarity and recollection in both the extent and the magnitude of activity they exhibited, while the dorsomedial thalamus showed selective familiarity-related activity, and the ventrolateral and anterior thalamus selective recollection-related activity. These findings are consistent with the view that the hippocampus and PRC play contrasting roles in supporting recollection and familiarity and that these differences are not a result of differences in memory strength. Overall, the combined pupil dilation, eye movement and fMRI data suggest the operation of recognition mechanisms drawing differentially on familiarity and recollection, whose neural bases are distinct within the MTL. Copyright © 2012 Elsevier Ltd. All rights reserved.
SEMI-SUPERVISED OBJECT RECOGNITION USING STRUCTURE KERNEL

PubMed Central

Wang, Botao; Xiong, Hongkai; Jiang, Xiaoqian; Ling, Fan

2013-01-01

Object recognition is a fundamental problem in computer vision. Part-based models offer a sparse, flexible representation of objects, but suffer from difficulties in training and often use standard kernels. In this paper, we propose a positive definite kernel called “structure kernel”, which measures the similarity of two part-based represented objects. The structure kernel has three terms: 1) the global term that measures the global visual similarity of two objects; 2) the part term that measures the visual similarity of corresponding parts; 3) the spatial term that measures the spatial similarity of geometric configuration of parts. The contribution of this paper is to generalize the discriminant capability of local kernels to complex part-based object models. Experimental results show that the proposed kernel exhibits higher accuracy than state-of-the-art approaches using standard kernels. PMID:23666108
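The three-term structure described in the abstract above can be sketched as follows. This is an illustrative assumption rather than the paper's definition: the abstract does not say how the terms are combined, so the sketch simply sums a Gaussian (RBF) kernel on a global descriptor, on each pair of corresponding part descriptors, and on each pair of part positions. The dictionary layout and all names are hypothetical. A sum of positive definite kernels is itself positive definite, which is the property the abstract emphasises.

```python
import numpy as np


def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.exp(-gamma * np.sum(diff ** 2)))


def structure_kernel(obj_a, obj_b):
    """Sum of a global term, a part term and a spatial term (assumed combination)."""
    global_term = rbf(obj_a["global"], obj_b["global"])
    # Parts are assumed to be stored in corresponding order in both objects.
    part_term = sum(rbf(pa, pb) for pa, pb in zip(obj_a["parts"], obj_b["parts"]))
    spatial_term = sum(rbf(sa, sb)
                       for sa, sb in zip(obj_a["positions"], obj_b["positions"]))
    return global_term + part_term + spatial_term


def random_object(rng, n_parts=3, dim=4):
    """Hypothetical part-based object: global descriptor, part descriptors, 2D positions."""
    return {
        "global": rng.normal(size=dim),
        "parts": rng.normal(size=(n_parts, dim)),
        "positions": rng.normal(size=(n_parts, 2)),
    }


rng = np.random.default_rng(0)
objects = [random_object(rng) for _ in range(6)]
gram = np.array([[structure_kernel(a, b) for b in objects] for a in objects])
```

The resulting Gram matrix is symmetric with non-negative eigenvalues, so it could be passed to any kernel method that accepts a precomputed kernel.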
Modal-Power-Based Haptic Motion Recognition

NASA Astrophysics Data System (ADS)

Kasahara, Yusuke; Shimono, Tomoyuki; Kuwahara, Hiroaki; Sato, Masataka; Ohnishi, Kouhei

Motion recognition based on sensory information is important for providing assistance to humans using robots. Several studies have been carried out on motion recognition based on image information. However, human motion that involves contact with an object cannot be evaluated precisely by image-based recognition, because force information is very important for describing contact motion. In this paper, a modal-power-based haptic motion recognition is proposed; modal power is considered to reveal information on both position and force. Modal power is considered to be one of the defining features of human motion. A motion recognition algorithm based on linear discriminant analysis is proposed to distinguish between similar motions. Haptic information is extracted using a bilateral master-slave system. Then, the observed motion is decomposed in terms of primitive functions in a modal space. The experimental results show the effectiveness of the proposed method.
Component-based target recognition inspired by human vision

NASA Astrophysics Data System (ADS)

Zheng, Yufeng; Agyepong, Kwabena

2009-05-01

In contrast with machine vision, humans can recognize an object against a complex background with great flexibility. For example, given the task of finding and circling all cars (no further information) in a picture, you may build a virtual image in mind from the task (or target) description before looking at the picture. Specifically, the virtual car image may be composed of the key components such as driver cabin and wheels. In this paper, we propose a component-based target recognition method by simulating the human recognition process. The component templates (equivalent to the virtual image in mind) of the target (car) are manually decomposed from the target feature image. Meanwhile, the edges of the testing image can be extracted by using a difference of Gaussian (DOG) model that simulates the spatiotemporal response in visual process. A phase correlation matching algorithm is then applied to match the templates with the testing edge image. If all key component templates are matched with the examining object, then this object is recognized as the target. Besides the recognition accuracy, we will also investigate whether this method works with partial targets (half cars). In our experiments, several natural pictures taken on streets were used to test the proposed method. The preliminary results show that the component-based recognition method is very promising.
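The phase correlation matching step named in the abstract above can be sketched in a few lines of NumPy. This is a generic textbook formulation under simplifying assumptions (two same-sized images related by a circular shift), not the authors' template-matching pipeline; the function name is hypothetical. The normalised cross-power spectrum keeps only phase differences, and its inverse transform peaks at the displacement between the two images.

```python
import numpy as np


def phase_correlation_shift(reference, moved):
    """Estimate the (row, column) circular shift that maps `reference` onto `moved`."""
    spectrum_ref = np.fft.fft2(reference)
    spectrum_mov = np.fft.fft2(moved)
    cross_power = spectrum_mov * np.conj(spectrum_ref)
    # Keep only the phase; the small floor avoids division by zero.
    cross_power /= np.maximum(np.abs(cross_power), 1e-12)
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    return tuple(int(p) for p in peak)


# Illustrative data: a random "edge image" and a circularly shifted copy.
rng = np.random.default_rng(1)
edge_image = rng.random((32, 32))
shifted = np.roll(edge_image, shift=(5, 9), axis=(0, 1))
estimated_shift = phase_correlation_shift(edge_image, shifted)
```

Matching a small component template against a larger edge image would additionally require zero-padding the template to the image size, and the peak height could then be thresholded to decide whether the component is present.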
Exploiting range imagery: techniques and applications

NASA Astrophysics Data System (ADS)

Armbruster, Walter

2009-07-01

Practically no applications exist for which automatic processing of 2D intensity imagery can equal human visual perception. This is not the case for range imagery. The paper gives examples of 3D laser radar applications, for which automatic data processing can exceed human visual cognition capabilities and describes basic processing techniques for attaining these results. The examples are drawn from the fields of helicopter obstacle avoidance, object detection in surveillance applications, object recognition at high range, multi-object-tracking, and object re-identification in range image sequences. Processing times and recognition performances are summarized. The techniques used exploit the bijective continuity of the imaging process as well as its independence of object reflectivity, emissivity and illumination. This allows precise formulations of the probability distributions involved in figure-ground segmentation, feature-based object classification and model based object recognition. The probabilistic approach guarantees optimal solutions for single images and enables Bayesian learning in range image sequences. Finally, due to recent results in 3D-surface completion, no prior model libraries are required for recognizing and re-identifying objects of quite general object categories, opening the way to unsupervised learning and fully autonomous cognitive systems.
Minimum Colour Differences Required To Recognise Small Objects On A Colour CRT

NASA Astrophysics Data System (ADS)

Phillips, Peter L.

1985-05-01

Data is required to assist in the assessment, evaluation and optimisation of colour and other displays for both military and general use. A general aim is to develop a mathematical technique to aid optimisation and reduce the amount of expensive hardware development and trials necessary when introducing new displays. The present standards and methods available for evaluating colour differences are known not to apply to the perception of typical objects on a display. Data is required for irregular objects viewed at small angular subtense (<1°) and relating to the recognition of form rather than colour matching. Therefore laboratory experiments have been carried out using a computer controlled CRT to measure the threshold colour difference that an observer requires between object and background so that he can discriminate a variety of similar objects. Measurements are included for a variety of background and object colourings. The results are presented in the CIE colorimetric system similar to current standards used by the display engineer. Apart from the characteristic small field tritanopia, the results show that larger colour differences are required for object recognition than those assumed from conventional colour discrimination data. A simple relationship to account for object size and background colour is suggested to aid visual performance assessments and modelling.
Image Processing Strategies Based on a Visual Saliency Model for Object Recognition Under Simulated Prosthetic Vision.

PubMed

Wang, Jing; Li, Heng; Fu, Weizhen; Chen, Yao; Li, Liming; Lyu, Qing; Han, Tingting; Chai, Xinyu

2016-01-01

Retinal prostheses have the potential to restore partial vision. Object recognition in scenes of daily life is one of the essential tasks for implant wearers. Because wearers are still limited by the low-resolution visual percepts provided by retinal prostheses, it is important to investigate and apply image processing methods to convey more useful visual information to them. We proposed two image processing strategies based on Itti's visual saliency map, region of interest (ROI) extraction, and image segmentation. Itti's saliency model generated a saliency map from the original image, in which salient regions were grouped into ROI by the fuzzy c-means clustering. Then Grabcut generated a proto-object from the ROI labeled image which was recombined with background and enhanced in two ways: 8-4 separated pixelization (8-4 SP) and background edge extraction (BEE). Results showed that both 8-4 SP and BEE had significantly higher recognition accuracy in comparison with direct pixelization (DP). Each saliency-based image processing strategy was subject to the performance of image segmentation. Under good and perfect segmentation conditions, BEE and 8-4 SP obtained noticeably higher recognition accuracy than DP, and under bad segmentation condition, only BEE boosted the performance. The application of saliency-based image processing strategies was verified to be beneficial to object recognition in daily scenes under simulated prosthetic vision. It is hoped that they will help the development of the image processing module for future retinal prostheses, and thus provide more benefit for the patients. Copyright © 2015 International Center for Artificial Organs and Transplantation and Wiley Periodicals, Inc.
Enhanced Gender Recognition System Using an Improved Histogram of Oriented Gradient (HOG) Feature from Quality Assessment of Visible Light and Thermal Images of the Human Body.

PubMed

Nguyen, Dat Tien; Park, Kang Ryoung

2016-07-21

With higher demand from users, surveillance systems are currently being designed to provide more information about the observed scene, such as the appearance of objects, types of objects, and other information extracted from detected objects. Although the recognition of gender of an observed human can be easily performed using human perception, it remains a difficult task when using computer vision system images. In this paper, we propose a new human gender recognition method that can be applied to surveillance systems based on quality assessment of human areas in visible light and thermal camera images. Our research is novel in the following two ways: First, we utilize the combination of visible light and thermal images of the human body for a recognition task based on quality assessment. We propose a quality measurement method to assess the quality of image regions so as to remove the effects of background regions in the recognition system. Second, by combining the features extracted using the histogram of oriented gradient (HOG) method and the measured qualities of image regions, we form a new image feature, called the weighted HOG (wHOG), which is used for efficient gender recognition. Experimental results show that our method produces more accurate estimation results than the state-of-the-art recognition method that uses human body images.
Enhanced Gender Recognition System Using an Improved Histogram of Oriented Gradient (HOG) Feature from Quality Assessment of Visible Light and Thermal Images of the Human Body

PubMed Central

Nguyen, Dat Tien; Park, Kang Ryoung

2016-01-01

With higher demand from users, surveillance systems are currently being designed to provide more information about the observed scene, such as the appearance of objects, types of objects, and other information extracted from detected objects. Although the recognition of gender of an observed human can be easily performed using human perception, it remains a difficult task when using computer vision system images. In this paper, we propose a new human gender recognition method that can be applied to surveillance systems based on quality assessment of human areas in visible light and thermal camera images. Our research is novel in the following two ways: First, we utilize the combination of visible light and thermal images of the human body for a recognition task based on quality assessment. We propose a quality measurement method to assess the quality of image regions so as to remove the effects of background regions in the recognition system. Second, by combining the features extracted using the histogram of oriented gradient (HOG) method and the measured qualities of image regions, we form a new image feature, called the weighted HOG (wHOG), which is used for efficient gender recognition. Experimental results show that our method produces more accurate estimation results than the state-of-the-art recognition method that uses human body images. PMID:27455264
Australian Recognition Framework Arrangements. Australia's National Training Framework.

ERIC Educational Resources Information Center

Australian National Training Authority, Brisbane.

This document explains the objectives, principles, standards, and protocols of the Australian Recognition Framework (ARF), which is a comprehensive approach to national recognition of vocational education and training (VET) that is based on a quality-assured approach to the registration of training organizations seeking to deliver training, assess…

Invariant object recognition based on the generalized discrete radon transform

NASA Astrophysics Data System (ADS)

Easley, Glenn R.; Colonna, Flavia

2004-04-01

We introduce a method for classifying objects based on special cases of the generalized discrete Radon transform. We adjust the transform and the corresponding ridgelet transform by means of circular shifting and a singular value decomposition (SVD) to obtain a translation, rotation and scaling invariant set of feature vectors. We then use a back-propagation neural network to classify the input feature vectors. We conclude with experimental results and compare these with other invariant recognition methods.
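One reason circular shifting and an SVD pair naturally, as in the abstract above, is that the singular values of a matrix do not change when its rows or columns are circularly shifted. The sketch below demonstrates only this general linear-algebra fact with NumPy. The matrix is a random stand-in for a transform-domain array, the function name is hypothetical, and this is not claimed to be the authors' feature construction.

```python
import numpy as np


def singular_value_signature(transform_array):
    """Singular values of a 2D array, sorted in descending order by NumPy."""
    return np.linalg.svd(np.asarray(transform_array, dtype=float), compute_uv=False)


# Stand-in for a Radon-domain array (rows: offsets, columns: directions); assumption.
rng = np.random.default_rng(2)
radon_like = rng.random((16, 24))

# Circularly shift rows and columns, e.g. as a translation or rotation might do
# in such a transform domain.
shifted = np.roll(radon_like, shift=(3, 7), axis=(0, 1))

signature_original = singular_value_signature(radon_like)
signature_shifted = singular_value_signature(shifted)
```

Because a circular shift is a permutation of rows or columns, the two signatures agree to floating-point precision, so a classifier fed with them would see the same input for both arrays.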
TU-FG-209-12: Treatment Site and View Recognition in X-Ray Images with Hierarchical Multiclass Recognition Models

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chang, X; Mazur, T; Yang, D

Purpose: To investigate an approach of automatically recognizing anatomical sites and imaging views (the orientation of the image acquisition) in 2D X-ray images. Methods: A hierarchical (binary tree) multiclass recognition model was developed to recognize the treatment sites and views in x-ray images. From top to bottom of the tree, the treatment sites are grouped hierarchically from more general to more specific. Each node in the hierarchical model was designed to assign images to one of two categories of anatomical sites. The binary image classification function of each node in the hierarchical model is implemented by using a PCA transformation and a support vector machine (SVM) model. The optimal PCA transformation matrices and SVM models are obtained by learning from a set of sample images. Alternatives of the hierarchical model were developed to support three scenarios of site recognition that may happen in radiotherapy clinics, including two or one X-ray images with or without view information. The performance of the approach was tested with images of 120 patients from six treatment sites – brain, head-neck, breast, lung, abdomen and pelvis – with 20 patients per site and two views (AP and RT) per patient. Results: Given two images in known orthogonal views (AP and RT), the hierarchical model achieved a 99% average F1 score to recognize the six sites. Site specific view recognition models have 100 percent accuracy. The computation time to process a new patient case (preprocessing, site and view recognition) is 0.02 seconds. Conclusion: The proposed hierarchical model of site and view recognition is effective and computationally efficient. It could be useful to automatically and independently confirm the treatment sites and views in daily setup x-ray 2D images. It could also be applied to guide subsequent image processing tasks, e.g. site and view dependent contrast enhancement and image registration.
The senior author received research grants from ViewRay Inc. and Varian Medical System.
Microoptical compound lens

DOEpatents

Sweatt, William C.; Gill, David D.

2007-10-23

An apposition microoptical compound lens comprises a plurality of lenslets arrayed around a segment of a hollow, three-dimensional optical shell. The lenslets collect light from an object and focus the light rays onto the concentric, curved front surface of a coherent fiber bundle. The fiber bundle transports the light rays to a planar detector, forming a plurality of sub-images that can be reconstructed as a full image. The microoptical compound lens can have a small size (millimeters), wide field of view (up to 180°), and adequate resolution for object recognition and tracking.
Products recognition on shop-racks from local scale-invariant features

NASA Astrophysics Data System (ADS)

Zawistowski, Jacek; Kurzejamski, Grzegorz; Garbat, Piotr; Naruniec, Jacek

2016-04-01

This paper presents a system designed for multi-object detection and adjusted for the application of product search on market shelves. The system uses well-known binary keypoint detection algorithms for finding characteristic points in the image. One of the main ideas is object recognition based on the Implicit Shape Model method. The authors of the article propose many improvements to the algorithm. Originally, fiducial points are matched with a very simple function. This limits the number of object parts that can be successfully separated, while various methods of classification may be validated in order to achieve higher performance. Such an extension implies research on a training procedure able to deal with many object categories. The proposed solution opens new possibilities for many algorithms demanding fast and robust multi-object recognition.
Recognition by Linear Combination of Models

DTIC Science & Technology

1989-08-01

to the model (or to the viewed object) prior to, or during the matching stage. Such an approach is used in [Chien & Aggarwal 1987, Faugeras & Hebert...1986, Fischler & Bolles 1981, Huttenlocher & Ullman 1987, Lowe 1985, Thompson & Mundy 1987, Ullman 1986]. Key problems that arise in any alignment...cludes 3-D rotation, translation and scaling, followed by an orthographic projection. The 1 transformation is determined as in [Huttenlocher & Ullman 1987
Three-dimensional passive sensing photon counting for object classification

NASA Astrophysics Data System (ADS)

Yeom, Seokwon; Javidi, Bahram; Watson, Edward

2007-04-01

In this keynote address, we address three-dimensional (3D) distortion-tolerant object recognition using photon-counting integral imaging (II). A photon-counting linear discriminant analysis (LDA) is discussed for classification of photon-limited images. We develop a compact distortion-tolerant recognition system based on the multiple-perspective imaging of II. Experimental and simulation results have shown that a low level of photons is sufficient to classify out-of-plane rotated objects.
Recognition of upper airway and surrounding structures at MRI in pediatric PCOS and OSAS

NASA Astrophysics Data System (ADS)

Tong, Yubing; Udupa, J. K.; Odhner, D.; Sin, Sanghun; Arens, Raanan

2013-03-01

Obstructive Sleep Apnea Syndrome (OSAS) is common in obese children, with the risk being 4.5-fold compared to normal control subjects. Polycystic Ovary Syndrome (PCOS) has recently been shown to be associated with OSAS, which may further lead to significant cardiovascular and neuro-cognitive deficits. We are investigating image-based biomarkers to understand the architectural and dynamic changes in the upper airway and the surrounding hard and soft tissue structures via MRI in obese teenage children to study OSAS. At previous SPIE conferences, we presented methods underlying Fuzzy Object Models (FOMs) for Automatic Anatomy Recognition (AAR) based on CT images of the thorax and the abdomen. The purpose of this paper is to demonstrate that the AAR approach is applicable to a different combination of body region and image modality, namely the study of upper airway structures via MRI. FOMs were built hierarchically, with the smaller sub-objects forming the offspring of larger parent objects. FOMs encode the uncertainty and variability present in the form of, and relationships among, the objects over a study population. In total, 11 basic objects (17 including composite objects) were modeled. Automatic recognition of the best pose of FOMs in a given image was implemented using four methods: a one-shot method that does not require search, and three search methods, namely Fisher Linear Discriminant (FLD), a b-scale energy optimization strategy, and an optimum threshold recognition method. In all, 30 multi-fold cross-validation experiments based on 15 patient MRI data sets were carried out to assess the accuracy of recognition. The results indicate that the objects can be recognized with an average location error of less than 5 mm or 2-3 voxels. The iterative relative fuzzy connectedness (IRFC) algorithm was then adopted for delineation of the target organs based on the recognition results. The delineation results showed overall FP and TP volume fractions of 0.02 and 0.93, respectively.
A new method of edge detection for object recognition

USGS Publications Warehouse

Maddox, Brian G.; Rhew, Benjamin

2004-01-01

Traditional edge detection systems function by returning every edge in an input image. This can result in a large amount of clutter and make certain vectorization algorithms less accurate. Accuracy problems can then have a large impact on automated object recognition systems that depend on edge information. A new method of directed edge detection can be used to limit the number of edges returned based on a particular feature. This results in a cleaner image that is easier for vectorization. Vectorized edges from this process could then feed an object recognition system where the edge data would also contain information as to what type of feature it bordered.
3D face analysis by using Mesh-LBP feature

NASA Astrophysics Data System (ADS)

Wang, Haoyu; Yang, Fumeng; Zhang, Yuming; Wu, Congzhong

2017-11-01

Objective: Face recognition is one of the most widely used applications of image processing. The limitations of two-dimensional approaches, such as sensitivity to pose and illumination changes, restrict its accuracy and further development to a certain extent. How to overcome pose and illumination changes and the effects of self-occlusion is a research hotspot and a difficulty, and it is attracting more and more domestic and foreign experts and scholars. 3D face recognition that fuses shape and texture descriptors has become a very promising research direction. Method: Our paper presents feature extraction for 3D face recognition from a 3D point cloud, based on the mesh local binary pattern (Mesh-LBP) and fusing shape and texture descriptors. 3D Mesh-LBP not only retains the integrity of the 3D geometry but also reduces the need for normalization steps in the recognition process, because the triangle Mesh-LBP descriptor is calculated on the 3D grid. In addition, given the advantage of multi-modal consistency in face recognition, the LBP construction can fuse shape and texture information on the triangular mesh. In this paper, several operators are used to extract Mesh-LBP, such as the normal vectors of each triangle face and vertex, the Gaussian curvature, the mean curvature, and the Laplace operator. Conclusion: First, a Kinect device acquires the 3D point cloud of the face; after pretreatment and normalization, the point cloud is transformed into a triangular grid, and grid local binary pattern features are extracted from the key significant parts of the face. For each local face region, the Mesh-LBP feature is calculated with the Gaussian curvature, mean curvature, Laplace operator, and so on. Experiments on our research database show that the method is robust and achieves high recognition accuracy.
Invariant Visual Object and Face Recognition: Neural and Computational Bases, and a Model, VisNet

PubMed Central

Rolls, Edmund T.

2012-01-01

Neurophysiological evidence for invariant representations of objects and faces in the primate inferior temporal visual cortex is described. Then a computational approach to how invariant representations are formed in the brain is described that builds on the neurophysiology. A feature hierarchy model in which invariant representations can be built by self-organizing learning based on the temporal and spatial statistics of the visual input produced by objects as they transform in the world is described. VisNet can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or it can use spatial continuity in continuous spatial transformation learning which does not require a temporal trace. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size, and also lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects such as looming, rotation, and object-based movement. The model has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in, for example, spatial and object search tasks. The approach has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene. The approach has also been extended to provide, with an additional layer, for the development of representations of spatial scenes of the type found in the hippocampus. PMID:22723777
Image-based automatic recognition of larvae

NASA Astrophysics Data System (ADS)

Sang, Ru; Yu, Guiying; Fan, Weijun; Guo, Tiantai

2010-08-01

In recent research on quarantine pest recognition, imagoes (adult insects) have been the main objects of study. However, pests in their larval stage are latent, and larvae spread much more easily with the circulation of agricultural and forest products. This paper presents larvae as new research objects, recognized by means of machine vision, image processing and pattern recognition. Applying color image segmentation to images of larvae preserves more visual information and improves the recognition rate. Because of its affine invariance, perspective invariance and brightness invariance, the scale-invariant feature transform (SIFT) is adopted for feature extraction. A neural network algorithm is used for pattern recognition, and automatic identification of larvae images is achieved with satisfactory results.
Biometric identification

NASA Astrophysics Data System (ADS)

Syryamkim, V. I.; Kuznetsov, D. N.; Kuznetsova, A. S.

2018-05-01

Image recognition is an information process implemented by an information converter (an intelligent information channel or recognition system) that has an input and an output. The input of the system is fed with information about the characteristics of the objects being presented. The output of the system displays information about which classes (generalized images) the recognized objects are assigned to. Creating and operating an automated pattern recognition system involves solving a number of problems; different authors formulate these problems, and the set of problems itself, differently, since both depend to a certain extent on the specific mathematical model on which a given recognition system is based. These problems include formalizing the domain, forming a training sample, training the recognition system, and reducing the dimensionality of the space.
Efficient view based 3-D object retrieval using Hidden Markov Model

NASA Astrophysics Data System (ADS)

Jain, Yogendra Kumar; Singh, Roshan Kumar

2013-12-01

Recent research effort has been dedicated to view-based 3-D object retrieval because of the highly discriminative property of 3-D objects and their multi-view representation. State-of-the-art methods depend heavily on their own camera array settings for capturing views of a 3-D object, and they use the complex Zernike descriptor and HAC for representative view selection, which limits their practical application and makes them inefficient for retrieval. Therefore, an efficient and effective algorithm is required for 3-D object retrieval. To move toward a general framework for efficient 3-D object retrieval that is independent of the camera array setting and avoids representative view selection, we propose an Efficient View Based 3-D Object Retrieval (EVBOR) method using a Hidden Markov Model (HMM). In this framework, each object is represented by an independent set of views, which means that views are captured from any direction without any camera array restriction. The views (including the query view) are clustered to generate view clusters, which are then used to build the query model with the HMM. In the proposed method, the HMM is used in two ways: in training (i.e. HMM estimation) and in retrieval (i.e. HMM decoding). The query model is trained using these view clusters. The EVBOR query model works on the basis of the query model combined with the HMM. The proposed approach removes the static camera array setting for view capturing and can be applied to any 3-D object database to retrieve 3-D objects efficiently and effectively. Experimental results demonstrate that the proposed scheme performs better than existing methods. [Figure not available: see fulltext.]
Towards NIRS-based hand movement recognition.

PubMed

Paleari, Marco; Luciani, Riccardo; Ariano, Paolo

2017-07-01

This work reports preliminary results on hand movement recognition with Near InfraRed Spectroscopy (NIRS) and surface ElectroMyoGraphy (sEMG). Whether based on physical contact (touchscreens, data-gloves, etc.), vision techniques (Microsoft Kinect, Sony PlayStation Move, etc.), or other modalities, hand movement recognition is a pervasive function in today's environment, and it underlies many gaming, social, and medical applications. Although in recent years the use of muscle information extracted by sEMG has spread from medical applications into the consumer world, this technique still falls short when dealing with movements of the hand. We tested NIRS as a technique to get another point of view on the muscle phenomena and proved that, within a specific selection of movements, NIRS can be used to recognize movements and return information regarding muscles at different depths. Furthermore, we propose three different multimodal movement recognition approaches and compare their performance.
Artificially intelligent recognition of Arabic speaker using voice print-based local features

NASA Astrophysics Data System (ADS)

Mahmood, Awais; Alsulaiman, Mansour; Muhammad, Ghulam; Akram, Sheeraz

2016-11-01

Local features for any pattern recognition system are based on information extracted locally. In this paper, a local feature extraction technique was developed. The feature was extracted in the time-frequency plane by taking the moving average along the diagonal directions of that plane. This feature captured the time-frequency events, producing a unique pattern for each speaker that can be viewed as a voice print of the speaker. Hence, we refer to this technique as the voice print-based local feature. The proposed feature was compared to other features, including the mel-frequency cepstral coefficient (MFCC), for speaker recognition using two different databases. One of the databases used in the comparison is a subset of an LDC database that consisted of two short sentences uttered by 182 speakers. The proposed feature attained a 98.35% recognition rate compared to 96.7% for MFCC using the LDC subset.
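The diagonal moving average described in this abstract can be illustrated with a minimal sketch. The abstract does not give the window length, the exact diagonal directions, or any normalization, so the function name `diagonal_moving_average`, the `window` parameter, and the choice of one rising and one falling diagonal are assumptions made only for illustration, not the authors' implementation.

```python
def diagonal_moving_average(spec, window):
    """Average a time-frequency matrix along both diagonal directions.

    spec[t][f] is the magnitude at time frame t and frequency bin f.
    Returns (rising, falling): two matrices of window-length averages.
    Hypothetical sketch; the window length is an assumed parameter.
    """
    n_t, n_f = len(spec), len(spec[0])
    rising, falling = [], []
    for t in range(n_t - window + 1):
        rising_row, falling_row = [], []
        for f in range(n_f - window + 1):
            # Rising diagonal: frequency increases as time increases.
            rising_row.append(
                sum(spec[t + k][f + k] for k in range(window)) / window
            )
            # Falling diagonal: frequency decreases as time increases.
            top = f + window - 1
            falling_row.append(
                sum(spec[t + k][top - k] for k in range(window)) / window
            )
        rising.append(rising_row)
        falling.append(falling_row)
    return rising, falling
```

In this sketch, a ridge of energy that rises in frequency over time produces large values in the first output and small values in the second, which is one way such a feature could separate time-frequency events by direction.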
A Fully Automated Pipeline for Classification Tasks with an Application to Remote Sensing

NASA Astrophysics Data System (ADS)

Suzuki, K.; Claesen, M.; Takeda, H.; De Moor, B.

2016-06-01

Deep learning has recently been in the spotlight owing to its great victories at major competitions, which has undeservedly pushed 'shallow' machine learning methods, relatively naive/handy algorithms commonly used by industrial engineers, into the background in spite of advantages such as the small amount of time and data they require for training. From a practical point of view, we used shallow learning algorithms to construct a learning pipeline such that operators can use machine learning without any special knowledge, an expensive computation environment, or a large amount of labelled data. The proposed pipeline automates the whole classification process, namely feature selection, feature weighting, and the selection of the most suitable classifier with optimized hyperparameters. The configuration uses particle swarm optimization, a well-known metaheuristic algorithm chosen for its generally fast and fine optimization, which enables us not only to optimize (hyper)parameters but also to determine the features and classifier appropriate to the problem, a choice that has conventionally been made a priori based on domain knowledge and either left untouched or handled with naïve algorithms such as grid search. In experiments with the MNIST and CIFAR-10 datasets, common datasets in the computer vision field for character recognition and object recognition problems respectively, our automated learning approach provides high performance considering its simple setting (i.e. a setting not specialized to the dataset), small amount of training data, and practical learning time. Moreover, compared to deep learning, the performance stays robust with almost no modification even on a remote sensing object recognition problem, which in turn indicates a high possibility that our approach contributes to general classification problems.
Critical object recognition in millimeter-wave images with robustness to rotation and scale.

PubMed

Mohammadzade, Hoda; Ghojogh, Benyamin; Faezi, Sina; Shabany, Mahdi

2017-06-01

Locating critical objects is crucial in various security applications and industries. For example, in security applications, such as in airports, these objects might be hidden or covered under shields or secret sheaths. Millimeter-wave images can be utilized to discover and recognize the critical objects out of the hidden cases without any health risk due to their non-ionizing features. However, millimeter-wave images usually have waves in and around the detected objects, making object recognition difficult. Thus, regular image processing and classification methods cannot be used for these images and additional pre-processings and classification methods should be introduced. This paper proposes a novel pre-processing method for canceling rotation and scale using principal component analysis. In addition, a two-layer classification method is introduced and utilized for recognition. Moreover, a large dataset of millimeter-wave images is collected and created for experiments. Experimental results show that a typical classification method such as support vector machines can recognize 45.5% of a type of critical objects at 34.2% false alarm rate (FAR), which is a drastically poor recognition. The same method within the proposed recognition framework achieves 92.9% recognition rate at 0.43% FAR, which indicates a highly significant improvement. The significant contribution of this work is to introduce a new method for analyzing millimeter-wave images based on machine vision and learning approaches, which is not yet widely noted in the field of millimeter-wave image analysis.
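As a rough illustration of how principal component analysis can cancel rotation and scale, the sketch below normalizes a set of 2D points, for example the coordinates of an object's foreground pixels. The abstract does not describe the authors' pre-processing in detail, so the function name `pca_normalize`, the use of point coordinates rather than image intensities, and the unit-variance scaling are all assumptions.

```python
import math


def pca_normalize(points):
    """Center 2D points, align the major axis with x, scale it to unit variance.

    Hypothetical sketch: the result is the same (up to a 180-degree flip)
    for any rotated, uniformly scaled, or translated copy of the input.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 covariance matrix.
    cxx = sum(x * x for x, _ in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    cxy = sum(x * y for x, y in centered) / n

    # Orientation of the first principal component (direction of max variance).
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    c, s = math.cos(theta), math.sin(theta)

    # Rotate by -theta so that the principal axis lies along x.
    rotated = [(c * x + s * y, -s * x + c * y) for x, y in centered]

    # Divide by the standard deviation along the principal axis.
    var_major = sum(x * x for x, _ in rotated) / n
    scale = 1.0 / math.sqrt(var_major) if var_major > 0 else 1.0
    return [(scale * x, scale * y) for x, y in rotated]
```

The 180-degree ambiguity is inherent to PCA, since a principal axis has no preferred sign; a real pipeline would need an extra rule to resolve it.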
Extensions of the picture superiority effect in associative recognition.

PubMed

Hockley, William E; Bancroft, Tyler

2011-12-01

Previous research has shown that the picture superiority effect (PSE) is seen in tests of associative recognition for random pairs of line drawings compared to pairs of concrete words (Hockley, 2008). In the present study we demonstrated that the PSE for associative recognition is still observed when subjects have correctly identified the individual items of each pair as old (Experiment 1), and that this effect is not due to rehearsal borrowing (Experiment 2). The PSE for associative recognition also is shown to be present but attenuated for mixed picture-word pairs (Experiment 3), and similar in magnitude for pairs of simple black and white line drawings and coloured photographs of detailed objects (Experiment 4). The results are consistent with the view that the semantic meaning of nameable pictures is activated faster than that of words thereby affording subjects more time to generate and elaborate meaningful associations between items depicted in picture form. PsycINFO Database Record (c) 2011 APA, all rights reserved.
Maximum mutual information estimation of a simplified hidden MRF for offline handwritten Chinese character recognition

NASA Astrophysics Data System (ADS)

Xiong, Yan; Reichenbach, Stephen E.

1999-01-01

Understanding of hand-written Chinese characters is at such a primitive stage that models include some assumptions about hand-written Chinese characters that are simply false. Maximum Likelihood Estimation (MLE) may therefore not be an optimal method for hand-written Chinese character recognition. This concern motivates the research effort to consider alternative criteria. Maximum Mutual Information Estimation (MMIE) is an alternative method for parameter estimation that does not derive its rationale from presumed model correctness, but instead examines the pattern-modeling problem in automatic recognition systems from an information-theoretic point of view. The objective of MMIE is to find a set of parameters such that the resultant model allows the system to derive from the observed data as much information as possible about the class. We consider MMIE for recognition of hand-written Chinese characters using a simplified hidden Markov Random Field. MMIE provides improved performance over MLE in this application.
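For orientation, the two criteria contrasted in this abstract are commonly written as follows. The notation is an assumption introduced here and does not come from the paper: $x_i$ is the $i$-th training observation, $c_i$ its class label, $\theta$ the model parameters, and $P(c)$ the class prior.

```latex
% MLE: maximize the likelihood of each observation under its own class model.
\hat{\theta}_{\mathrm{MLE}}
  = \arg\max_{\theta} \sum_{i=1}^{N} \log p_{\theta}(x_i \mid c_i)

% MMIE: maximize the posterior of the correct class, which is equivalent to
% maximizing the empirical mutual information between observations and
% classes when the priors P(c) are fixed.
\hat{\theta}_{\mathrm{MMIE}}
  = \arg\max_{\theta} \sum_{i=1}^{N}
    \log \frac{p_{\theta}(x_i \mid c_i)\, P(c_i)}
              {\sum_{c'} p_{\theta}(x_i \mid c')\, P(c')}
```

The denominator is what makes MMIE discriminative: raising the likelihood of the correct class only helps if the competing classes do not gain as much.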

On a problematic procedure to manipulate response biases in recognition experiments: the case of "implied" base rates.

PubMed

Bröder, Arndt; Malejka, Simone

2017-07-01

The experimental manipulation of response biases in recognition-memory tests is an important means for testing recognition models and for estimating their parameters. The textbook manipulations for binary-response formats either vary the payoff scheme or the base rate of targets in the recognition test, with the latter being the more frequently applied procedure. However, some published studies reverted to implying different base rates by instruction rather than actually changing them. Aside from unnecessarily deceiving participants, this procedure may lead to cognitive conflicts that prompt response strategies unknown to the experimenter. To test our objection, implied base rates were compared to actual base rates in a recognition experiment followed by a post-experimental interview to assess participants' response strategies. The behavioural data show that recognition-memory performance was estimated to be lower in the implied base-rate condition. The interview data demonstrate that participants used various second-order response strategies that jeopardise the interpretability of the recognition data. We thus advise researchers against substituting actual base rates with implied base rates.
Research on autonomous identification of airport targets based on Gabor filtering and Radon transform

NASA Astrophysics Data System (ADS)

Yi, Juan; Du, Qingyu; Zhang, Hong jiang; Zhang, Yao lei

2017-11-01

Target recognition is currently a key technology in intelligent image processing and application development. With the enhancement of computer processing ability, autonomous target recognition algorithms have gradually become more intelligent and have shown good adaptability. Taking the airport target as the research object, we analyze the layout characteristics of airports, construct a knowledge model, and independently design a target recognition algorithm based on Gabor filtering and the Radon transform. The algorithm was verified through image processing and feature extraction on airport images and achieved better recognition results.
Deep learning based hand gesture recognition in complex scenes

NASA Astrophysics Data System (ADS)

Ni, Zihan; Sang, Nong; Tan, Cheng

2018-03-01

Recently, region-based convolutional neural networks (R-CNNs) have achieved significant success in the field of object detection, but their accuracy is limited for small objects and similar objects, such as gestures. To solve this problem, we present an online hard example testing (OHET) technique that evaluates the confidence of the R-CNNs' outputs and regards outputs with low confidence as hard examples. In this paper, we propose a cascaded network to recognize gestures. First, we use the region-based fully convolutional neural network (R-FCN), which is capable of detecting small objects, to detect the gestures, and then use OHET to select the hard examples. To enhance the accuracy of gesture recognition, we re-classify the hard examples through the VGG-19 classification network to obtain the final output of the gesture recognition system. In comparison experiments with other methods, the cascaded networks combined with OHET reached state-of-the-art results of 99.3% mAP on small and similar gestures in complex scenes.
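The control flow of such a cascade can be sketched in a few lines. The abstract does not say how OHET scores confidence or where it draws the line, so the fixed `threshold`, the function name `cascade_classify`, and the `(label, confidence, crop)` tuple layout are hypothetical; the `reclassify` callable stands in for the second-stage classifier.

```python
def cascade_classify(detections, reclassify, threshold=0.5):
    """Keep confident first-stage labels; send the rest to a second stage.

    detections: iterable of (label, confidence, crop) from the detector.
    reclassify: callable mapping a crop to a label (second-stage model).
    threshold:  assumed confidence cut-off that marks a "hard example".
    """
    labels = []
    for label, confidence, crop in detections:
        if confidence < threshold:
            # Low-confidence output: treat as a hard example and re-classify.
            label = reclassify(crop)
        labels.append(label)
    return labels
```

Only the hard examples pay the cost of the second network, which is the usual motivation for a cascade of this kind.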
Learning Distance Functions for Exemplar-Based Object Recognition

DTIC Science & Technology

2007-08-08

requires prior specific permission. Learning Distance Functions for Exemplar-Based Object Recognition by Andrea Lynn Frome B.S. (Mary Washington...fantastic advisor and advocate when I was at Mary Washington College and has since become a dear friend. Thank you, Dr. Bass, for continuing to stand...Antonio Torralba. Chapter 1. Introduction [Figure: x-axis "Number of training examples per class", y-axis "Mean..." (truncated)]
Learning Distance Functions for Exemplar-Based Object Recognition

DTIC Science & Technology

2007-01-01

Learning Distance Functions for Exemplar-Based Object Recognition by Andrea Lynn Frome B.S. (Mary Washington College) 1996 A dissertation submitted...advisor and advocate when I was at Mary Washington College and has since become a dear friend. Thank you, Dr. Bass, for continuing to stand by my...Torralba. Chapter 1. Introduction [Figure: x-axis "Number of training examples per class", y-axis "Mean reco..." (truncated)]
Hand biometric recognition based on fused hand geometry and vascular patterns.

PubMed

Park, GiTae; Kim, Soowon

2013-02-28

A hand biometric authentication method based on measurements of the user's hand geometry and vascular pattern is proposed. To acquire the hand geometry, the thickness of the side view of the hand, the K-curvature with a hand-shaped chain code, the lengths and angles of the finger valleys, and the lengths and profiles of the fingers were used, and for the vascular pattern, the direction-based vascular-pattern extraction method was used, and thus, a new multimodal biometric approach is proposed. The proposed multimodal biometric system uses only one image to extract the feature points. This system can be configured for low-cost devices. Our multimodal biometric-approach hand-geometry (the side view of the hand and the back of hand) and vascular-pattern recognition method performs at the score level. The results of our study showed that the equal error rate of the proposed system was 0.06%.
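The equal error rate (EER) reported here is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below shows one simple way to estimate it from fused match scores. It assumes that higher scores mean a better match and uses observed scores as candidate thresholds; the function name `equal_error_rate` and the sample scores in the usage note are illustrative and not data from the study.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from genuine and impostor match scores.

    Assumes a sample is accepted when its score is >= the threshold.
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    best_gap, eer = None, None
    for threshold in sorted(set(genuine) | set(impostor)):
        # False acceptance rate: impostors scoring at or above the threshold.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        # False rejection rate: genuine users scoring below the threshold.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

For example, `equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6])` returns `0.25`: at a threshold of 0.6, one of four impostors is accepted and one of four genuine samples is rejected.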
Hand Biometric Recognition Based on Fused Hand Geometry and Vascular Patterns

PubMed Central

Park, GiTae; Kim, Soowon

2013-01-01

A hand biometric authentication method based on measurements of the user's hand geometry and vascular pattern is proposed. To acquire the hand geometry, the thickness of the side view of the hand, the K-curvature with a hand-shaped chain code, the lengths and angles of the finger valleys, and the lengths and profiles of the fingers were used, and for the vascular pattern, the direction-based vascular-pattern extraction method was used, and thus, a new multimodal biometric approach is proposed. The proposed multimodal biometric system uses only one image to extract the feature points. This system can be configured for low-cost devices. Our multimodal biometric-approach hand-geometry (the side view of the hand and the back of hand) and vascular-pattern recognition method performs at the score level. The results of our study showed that the equal error rate of the proposed system was 0.06%. PMID:23449119
Deep neural network features for horses identity recognition using multiview horses' face pattern

NASA Astrophysics Data System (ADS)

Jarraya, Islem; Ouarda, Wael; Alimi, Adel M.

2017-03-01

To monitor the state of horses in the barn, breeders need a monitoring system with a surveillance camera that can identify and distinguish between horses. In [5] we proposed a method for identifying horses at a distance using the frontal facial biometric modality. Face recognition becomes more difficult when the view changes. In this paper, the number of images in our THoDBRL'2015 database (Tunisian Horses DataBase of Regim Lab) is augmented by adding images of other views. Thus, we used frontal, right-profile and left-profile views of the face. Moreover, we suggest an approach for multiview face recognition. First, we propose to use the Gabor filter for face characterization. Next, because of the larger number of images and the large number of Gabor features, we propose to test a deep neural network with an auto-encoder to obtain the most pertinent features and to reduce the size of the feature vector. Finally, we applied the proposed approach to our THoDBRL'2015 database and used a linear SVM for classification.
Game theoretic approach for cooperative feature extraction in camera networks

NASA Astrophysics Data System (ADS)

Redondi, Alessandro E. C.; Baroffio, Luca; Cesana, Matteo; Tagliasacchi, Marco

2016-07-01

Visual sensor networks (VSNs) consist of several camera nodes with wireless communication capabilities that can perform visual analysis tasks such as object identification, recognition, and tracking. Often, VSN deployments result in many camera nodes with overlapping fields of view. In the past, such redundancy has been exploited in two different ways: (1) to improve the accuracy/quality of the visual analysis task by exploiting multiview information or (2) to reduce the energy consumed for performing the visual task, by applying temporal scheduling techniques among the cameras. We propose a game theoretic framework based on the Nash bargaining solution to bridge the gap between the two aforementioned approaches. The key tenet of the proposed framework is for cameras to reduce the consumed energy in the analysis process by exploiting the redundancy in the reciprocal fields of view. Experimental results in both simulated and real-life scenarios confirm that the proposed scheme is able to increase the network lifetime, with a negligible loss in terms of visual analysis accuracy.
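The Nash bargaining solution selects the agreement that maximizes the product of each player's utility gain over its disagreement payoff. The sketch below is a generic two-player illustration using a grid search over how a shared quantity is split. The utility functions, the disagreement payoffs, and the name `nash_bargaining` are hypothetical and do not reproduce the camera-network model of the paper.

```python
def nash_bargaining(u1, u2, d1, d2, steps=1000):
    """Grid-search the two-player Nash bargaining solution on [0, 1].

    u1, u2: utility of each player as a function of the split x in [0, 1].
    d1, d2: disagreement payoffs (what each player gets with no agreement).
    Returns the x that maximizes (u1(x) - d1) * (u2(x) - d2).
    """
    best_x, best_product = None, -1.0
    for i in range(steps + 1):
        x = i / steps
        gain1, gain2 = u1(x) - d1, u2(x) - d2
        if gain1 < 0 or gain2 < 0:
            # Skip splits that leave a player worse off than disagreement.
            continue
        product = gain1 * gain2
        if product > best_product:
            best_x, best_product = x, product
    return best_x
```

With symmetric linear utilities and equal disagreement payoffs, the solution is an even split; raising one player's disagreement payoff shifts the split in that player's favour.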
Robust and Effective Component-based Banknote Recognition for the Blind

PubMed Central

Hasanuzzaman, Faiz M.; Yang, Xiaodong; Tian, YingLi

2012-01-01

We develop a novel camera-based computer vision technology to automatically recognize banknotes for assisting visually impaired people. Our banknote recognition system is robust and effective with the following features: 1) high accuracy: high true recognition rate and low false recognition rate, 2) robustness: handles a variety of currency designs and bills in various conditions, 3) high efficiency: recognizes banknotes quickly, and 4) ease of use: helps blind users to aim the target for image capture. To make the system robust to a variety of conditions including occlusion, rotation, scaling, cluttered background, illumination change, viewpoint variation, and worn or wrinkled bills, we propose a component-based framework using Speeded Up Robust Features (SURF). Furthermore, we employ the spatial relationship of matched SURF features to detect whether there is a bill in the camera view. This process largely alleviates false recognition and can guide the user to correctly aim at the bill to be recognized. The robustness and generalizability of the proposed system are evaluated on a dataset including both positive images (with U.S. banknotes) and negative images (no U.S. banknotes) collected under a variety of conditions. The proposed algorithm achieves a 100% true recognition rate and a 0% false recognition rate. Our banknote recognition system is also tested by blind users. PMID:22661884
Beyond perceptual expertise: revisiting the neural substrates of expert object recognition

PubMed Central

Harel, Assaf; Kravitz, Dwight; Baker, Chris I.

2013-01-01

Real-world expertise provides a valuable opportunity to understand how experience shapes human behavior and neural function. In the visual domain, the study of expert object recognition, such as in car enthusiasts or bird watchers, has produced a large, growing, and often-controversial literature. Here, we synthesize this literature, focusing primarily on results from functional brain imaging, and propose an interactive framework that incorporates the impact of high-level factors, such as attention and conceptual knowledge, in supporting expertise. This framework contrasts with the perceptual view of object expertise that has concentrated largely on stimulus-driven processing in visual cortex. One prominent version of this perceptual account has almost exclusively focused on the relation of expertise to face processing and, in terms of the neural substrates, has centered on face-selective cortical regions such as the Fusiform Face Area (FFA). We discuss the limitations of this face-centric approach as well as the more general perceptual view, and highlight that expert related activity is: (i) found throughout visual cortex, not just FFA, with a strong relationship between neural response and behavioral expertise even in the earliest stages of visual processing, (ii) found outside visual cortex in areas such as parietal and prefrontal cortices, and (iii) modulated by the attentional engagement of the observer suggesting that it is neither automatic nor driven solely by stimulus properties. These findings strongly support a framework in which object expertise emerges from extensive interactions within and between the visual system and other cognitive systems, resulting in widespread, distributed patterns of expertise-related activity across the entire cortex. PMID:24409134
User acceptability--a critical success factor for picture archiving and communication system implementation.

PubMed

Crivianu-Gaita, D; Babyn, P; Gilday, D; O'Brien, B; Charkot, E

2000-05-01

The Department of Diagnostic Imaging at the Hospital for Sick Children (HSC), Toronto, implemented a picture archiving and communication system (PACS) during the last year. This report describes our experience from the point of view of user acceptability. Based on objective data, the following key success factors were identified: user involvement in PACS planning, training, technical support, and rollout of pilot projects. Although technical factors are critical and must be addressed, the main conclusion of our study is that other nontechnical factors need to be recognized and resolved. Recognition of the importance of these factors to user acceptance and clear communication and consultation will help reduce negative user attitudes and increase the chance of a successful PACS implementation.
Ball-scale based hierarchical multi-object recognition in 3D medical images

NASA Astrophysics Data System (ADS)

Bağci, Ulas; Udupa, Jayaram K.; Chen, Xinjian

2010-03-01

This paper investigates, using prior shape models and the concept of ball scale (b-scale), ways of automatically recognizing objects in 3D images without performing elaborate searches or optimization. That is, the goal is to place the model in a single shot close to the right pose (position, orientation, and scale) in a given image so that the model boundaries fall in the close vicinity of object boundaries in the image. This is achieved via the following set of key ideas: (a) A semi-automatic way of constructing a multi-object shape model assembly. (b) A novel strategy of encoding, via b-scale, the pose relationship between objects in the training images and their intensity patterns captured in b-scale images. (c) A hierarchical mechanism of positioning the model, in a one-shot way, in a given image from a knowledge of the learnt pose relationship and the b-scale image of the given image to be segmented. The evaluation results on a set of 20 routine clinical abdominal female and male CT data sets indicate the following: (1) Incorporating a large number of objects improves the recognition accuracy dramatically. (2) The recognition algorithm can be thought of as a hierarchical framework such that quick replacement of the model assembly is defined as coarse recognition and delineation itself is known as finest recognition. (3) Scale yields useful information about the relationship between the model assembly and any given image such that the recognition results in a placement of the model close to the actual pose without doing any elaborate searches or optimization. (4) Effective object recognition can make delineation most accurate.
Progestogens’ effects and mechanisms for object recognition memory across the lifespan

PubMed Central

Walf, Alicia A.; Koonce, Carolyn J.; Frye, Cheryl A.

2016-01-01

This review explores the effects of female reproductive hormones, estrogens and progestogens, with a focus on progesterone and allopregnanolone, on object memory. Progesterone and its metabolites, in particular allopregnanolone, exert various effects on both cognitive and non-mnemonic functions in females. The well-known object recognition task is a valuable experimental paradigm that can be used to determine the effects and mechanisms of progestogens for mnemonic effects across the lifespan, which will be discussed herein. In this task there is little test-decay when different objects are used as targets and baseline valence for objects is controlled. This allows repeated testing, within-subjects designs, and longitudinal assessments, which aid understanding of changes in hormonal milieu. Objects are not aversive or food-based, which are hormone-sensitive factors. This review focuses on published data from our laboratory, and others, using the object recognition task in rodents to assess the role and mechanisms of progestogens throughout the lifespan. Improvements in object recognition performance of rodents are often associated with higher hormone levels in the hippocampus and prefrontal cortex during natural cycles, with hormone replacement following ovariectomy in young animals, or with aging. The capacity for reversal of age- and reproductive senescence-related decline in cognitive performance, and changes in neural plasticity that may be dissociated from peripheral effects with such decline, are discussed. The focus here will be on the effects of brain-derived factors, such as the neurosteroid, allopregnanolone, and other hormones, for enhancing object recognition across the lifespan. PMID:26235328
STDP in lateral connections creates category-based perceptual cycles for invariance learning with multiple stimuli.

PubMed

Evans, Benjamin D; Stringer, Simon M

2015-04-01

Learning to recognise objects and faces is an important and challenging problem tackled by the primate ventral visual system. One major difficulty lies in recognising an object despite profound differences in the retinal images it projects, due to changes in view, scale, position and other identity-preserving transformations. Several models of the ventral visual system have been successful in coping with these issues, but have typically been privileged by exposure to only one object at a time. In natural scenes, however, the challenges of object recognition are typically further compounded by the presence of several objects which should be perceived as distinct entities. In the present work, we explore one possible mechanism by which the visual system may overcome these two difficulties simultaneously, through segmenting unseen (artificial) stimuli using information about their category encoded in plastic lateral connections. We demonstrate that these experience-guided lateral interactions robustly organise input representations into perceptual cycles, allowing feed-forward connections trained with spike-timing-dependent plasticity to form independent, translation-invariant output representations. We present these simulations as a functional explanation for the role of plasticity in the lateral connectivity of visual cortex.
A New Experiment on Bengali Character Recognition

NASA Astrophysics Data System (ADS)

Barman, Sumana; Bhattacharyya, Debnath; Jeon, Seung-Whan; Kim, Tai-Hoon; Kim, Haeng-Kon

This paper presents a method that uses a view-based approach in a Bangla Optical Character Recognition (OCR) system, providing a reduced data set to the ANN classification engine compared with traditional OCR methods. It describes how Bangla characters are processed, trained and then recognized with the use of a backpropagation artificial neural network. This is the first published account of using a segmentation-free optical character recognition system for Bangla using a view-based approach. The methodology presented here assumes that the OCR pre-processor has presented the input images to the classification engine described here. The size and the font face used to render the characters are also significant in both training and classification. The images are first converted into greyscale and then to binary images; these images are then scaled to fit a pre-determined area with a fixed but significant number of pixels. The feature vectors are then formed by extracting the characteristic points, which in this case are simply a series of 0s and 1s of fixed length. Finally, an artificial neural network is chosen for the training and classification process.
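The preprocessing chain described in this abstract (greyscale, then binary, then scaling to a fixed area, then a fixed-length vector of 0s and 1s) can be illustrated with a minimal sketch. This is not the authors' code: the threshold of 128, the nearest-neighbour rescaling, the 4x4 target grid and all function names are assumptions chosen only for illustration.

```python
from typing import List

Image = List[List[int]]  # rows of greyscale pixel values in 0..255


def binarize(grey: Image, threshold: int = 128) -> Image:
    """Map dark (ink) pixels to 1 and light (background) pixels to 0.

    The threshold value is an assumption, not taken from the paper.
    """
    return [[1 if px < threshold else 0 for px in row] for row in grey]


def rescale_nearest(img: Image, out_h: int, out_w: int) -> Image:
    """Scale an image to a fixed grid using nearest-neighbour sampling."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def feature_vector(grey: Image, out_h: int = 4, out_w: int = 4) -> List[int]:
    """Binarize, rescale to a fixed area, and flatten into a 0/1 vector."""
    scaled = rescale_nearest(binarize(grey), out_h, out_w)
    return [bit for row in scaled for bit in row]


# A tiny 2x2 "character": dark on the main diagonal, light elsewhere.
vec = feature_vector([[0, 255], [255, 0]])
print(len(vec), vec)
```

Every input image yields a vector of the same length (here 16), which is what allows a network with a fixed input layer to consume characters rendered at different sizes.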
Complex scenes and situations visualization in hierarchical learning algorithm with dynamic 3D NeoAxis engine

NASA Astrophysics Data System (ADS)

Graham, James; Ternovskiy, Igor V.

2013-06-01

We applied a two stage unsupervised hierarchical learning system to model complex dynamic surveillance and cyber space monitoring systems using a non-commercial version of the NeoAxis visualization software. The hierarchical scene learning and recognition approach is based on hierarchical expectation maximization, and was linked to a 3D graphics engine for validation of learning and classification results and understanding the human - autonomous system relationship. Scene recognition is performed by taking synthetically generated data and feeding it to a dynamic logic algorithm. The algorithm performs hierarchical recognition of the scene by first examining the features of the objects to determine which objects are present, and then determines the scene based on the objects present. This paper presents a framework within which low level data linked to higher-level visualization can provide support to a human operator and be evaluated in a detailed and systematic way.
The posterior parietal cortex in recognition memory: a neuropsychological study.

PubMed

Haramati, Sharon; Soroker, Nachum; Dudai, Yadin; Levy, Daniel A

2008-01-01

Several recent functional neuroimaging studies have reported robust bilateral activation (L>R) in lateral posterior parietal cortex and precuneus during recognition memory retrieval tasks. It has not yet been determined what cognitive processes are represented by those activations. In order to examine whether parietal lobe-based processes are necessary for basic episodic recognition abilities, we tested a group of 17 first-incident CVA patients whose cortical damage included (but was not limited to) extensive unilateral posterior parietal lesions. These patients performed a series of tasks that yielded parietal activations in previous fMRI studies: yes/no recognition judgments on visual words and on colored object pictures and identifiable environmental sounds. We found that patients with left hemisphere lesions were not impaired compared to controls in any of the tasks. Patients with right hemisphere lesions were not significantly impaired in memory for visual words, but were impaired in recognition of object pictures and sounds. Two lesion-behavior analyses, area-based correlations and voxel-based lesion symptom mapping (VLSM), indicate that these impairments resulted from extra-parietal damage, specifically to frontal and lateral temporal areas. These findings suggest that extensive parietal damage does not impair recognition performance. We suggest that parietal activations recorded during recognition memory tasks might reflect peri-retrieval processes, such as the storage of retrieved memoranda in a working memory buffer for further cognitive processing.
Bimodal benefits on objective and subjective outcomes for adult cochlear implant users.

PubMed

Heo, Ji-Hye; Lee, Jae-Hee; Lee, Won-Sang

2013-09-01

Given that only a few studies have focused on the bimodal benefits on objective and subjective outcomes and emphasized the importance of individual data, the present study aimed to measure the bimodal benefits on the objective and subjective outcomes for adults with cochlear implant. Fourteen listeners with bimodal devices were tested on the localization and recognition abilities using environmental sounds, 1-talker, and 2-talker speech materials. The localization ability was measured through an 8-loudspeaker array. For the recognition measures, listeners were asked to repeat the sentences or say the environmental sounds the listeners heard. As a subjective questionnaire, three domains of Korean-version of Speech, Spatial, Qualities of Hearing scale (K-SSQ) were used to explore any relationships between objective and subjective outcomes. Based on the group-mean data, the bimodal hearing enhanced both localization and recognition regardless of test material. However, the inter- and intra-subject variability appeared to be large across test materials for both localization and recognition abilities. Correlation analyses revealed that the relationships were not always consistent between the objective outcomes and the subjective self-reports with bimodal devices. Overall, this study supports significant bimodal advantages on localization and recognition measures, yet the large individual variability in bimodal benefits should be considered carefully for the clinical assessment as well as counseling. The discrepant relations between objective and subjective results suggest that the bimodal benefits in traditional localization or recognition measures might not necessarily correspond to the self-reported subjective advantages in everyday listening environments.
Locally linear regression for pose-invariant face recognition.

PubMed

Chai, Xiujuan; Shan, Shiguang; Chen, Xilin; Gao, Wen

2007-07-01

The variation of facial appearance due to the viewpoint (/pose) degrades face recognition systems considerably, which is one of the bottlenecks in face recognition. One of the possible solutions is generating virtual frontal view from any given nonfrontal view to obtain a virtual gallery/probe face. Following this idea, this paper proposes a simple, but efficient, novel locally linear regression (LLR) method, which generates the virtual frontal view from a given nonfrontal face image. We first justify the basic assumption of the paper that there exists an approximate linear mapping between a nonfrontal face image and its frontal counterpart. Then, by formulating the estimation of the linear mapping as a prediction problem, we present the regression-based solution, i.e., globally linear regression. To improve the prediction accuracy in the case of coarse alignment, LLR is further proposed. In LLR, we first perform dense sampling in the nonfrontal face image to obtain many overlapped local patches. Then, the linear regression technique is applied to each small patch for the prediction of its virtual frontal patch. Through the combination of all these patches, the virtual frontal view is generated. The experimental results on the CMU PIE database show distinct advantage of the proposed method over Eigen light-field method.
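The patch-wise idea in this abstract (dense overlapping patches, one linear regression per patch, patches recombined into a virtual frontal view) can be sketched as follows. This is a simplified illustration and not the authors' formulation: it treats each "image" as a 1-D list of intensities, fits one ridge-regularized affine map per patch position, and averages overlapping predictions. The function names, the patch size and the ridge term are all assumptions.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[Vector]


def solve(a: Matrix, b: Matrix) -> Matrix:
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_patch_map(xs: Matrix, ys: Matrix, ridge: float) -> Matrix:
    """Least-squares affine map from nonfrontal patches xs to frontal patches ys."""
    xb = [x + [1.0] for x in xs]  # append a bias term to every input patch
    d, k = len(xb[0]), len(ys[0])
    xtx = [[sum(x[i] * x[j] for x in xb) + (ridge if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    xty = [[sum(x[i] * y[c] for x, y in zip(xb, ys)) for c in range(k)]
           for i in range(d)]
    return solve(xtx, xty)


def fit_llr(nonfrontal: Matrix, frontal: Matrix, size: int = 2,
            step: int = 1, ridge: float = 1e-6) -> Dict[int, Matrix]:
    """Fit one linear map per (overlapping) patch position."""
    starts = range(0, len(nonfrontal[0]) - size + 1, step)
    return {s: fit_patch_map([img[s:s + size] for img in nonfrontal],
                             [img[s:s + size] for img in frontal], ridge)
            for s in starts}


def predict_llr(maps: Dict[int, Matrix], image: Vector, size: int) -> Vector:
    """Predict each frontal patch and average the overlapping predictions."""
    total, count = [0.0] * len(image), [0] * len(image)
    for s, w in maps.items():
        x = image[s:s + size] + [1.0]
        for c in range(size):
            total[s + c] += sum(x[i] * w[i][c] for i in range(len(x)))
            count[s + c] += 1
    return [t / n if n else 0.0 for t, n in zip(total, count)]
```

On toy data where every "frontal" value is an affine function of the "nonfrontal" value, the per-patch maps recover that relationship; real face images would need 2-D patches and aligned training pairs, which this sketch does not attempt.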

Ignorance- versus Evidence-Based Decision Making: A Decision Time Analysis of the Recognition Heuristic

ERIC Educational Resources Information Center

Hilbig, Benjamin E.; Pohl, Rudiger F.

2009-01-01

According to part of the adaptive toolbox notion of decision making known as the recognition heuristic (RH), the decision process in comparative judgments--and its duration--is determined by whether recognition discriminates between objects. By contrast, some recently proposed alternative models predict that choices largely depend on the amount of…
The influence of writing practice on letter recognition in preschool children: a comparison between handwriting and typing.

PubMed

Longcamp, Marieke; Zerbato-Poudou, Marie-Thérèse; Velay, Jean-Luc

2005-05-01

A large body of data supports the view that movement plays a crucial role in letter representation and suggests that handwriting contributes to the visual recognition of letters. If so, changing the motor conditions while children are learning to write by using a method based on typing instead of handwriting should affect their subsequent letter recognition performances. In order to test this hypothesis, we trained two groups of 38 children (aged 3-5 years) to copy letters of the alphabet either by hand or by typing them. After three weeks of learning, we ran two recognition tests, one week apart, to compare the letter recognition performances of the two groups. The results showed that in the older children, the handwriting training gave rise to a better letter recognition than the typing training.
Facial Expression Influences Face Identity Recognition During the Attentional Blink

PubMed Central

2014-01-01

Emotional stimuli (e.g., negative facial expressions) enjoy prioritized memory access when task relevant, consistent with their ability to capture attention. Whether emotional expression also impacts on memory access when task-irrelevant is important for arbitrating between feature-based and object-based attentional capture. Here, the authors address this question in 3 experiments using an attentional blink task with face photographs as first and second target (T1, T2). They demonstrate reduced neutral T2 identity recognition after angry or happy T1 expression, compared to neutral T1, and this supports attentional capture by a task-irrelevant feature. Crucially, after neutral T1, T2 identity recognition was enhanced and not suppressed when T2 was angry—suggesting that attentional capture by this task-irrelevant feature may be object-based and not feature-based. As an unexpected finding, both angry and happy facial expressions suppress memory access for competing objects, but only angry facial expression enjoyed privileged memory access. This could imply that these 2 processes are relatively independent from one another. PMID:25286076
Facial expression influences face identity recognition during the attentional blink.

PubMed

Bach, Dominik R; Schmidt-Daffy, Martin; Dolan, Raymond J

2014-12-01

Emotional stimuli (e.g., negative facial expressions) enjoy prioritized memory access when task relevant, consistent with their ability to capture attention. Whether emotional expression also impacts on memory access when task-irrelevant is important for arbitrating between feature-based and object-based attentional capture. Here, the authors address this question in 3 experiments using an attentional blink task with face photographs as first and second target (T1, T2). They demonstrate reduced neutral T2 identity recognition after angry or happy T1 expression, compared to neutral T1, and this supports attentional capture by a task-irrelevant feature. Crucially, after neutral T1, T2 identity recognition was enhanced and not suppressed when T2 was angry, suggesting that attentional capture by this task-irrelevant feature may be object-based and not feature-based. As an unexpected finding, both angry and happy facial expressions suppress memory access for competing objects, but only angry facial expression enjoyed privileged memory access. This could imply that these 2 processes are relatively independent from one another.
Evaluating structural pattern recognition for handwritten math via primitive label graphs

NASA Astrophysics Data System (ADS)

Zanibbi, Richard; Mouchère, Harold; Viard-Gaudin, Christian

2013-01-01

Currently, structural pattern recognizer evaluations compare graphs of detected structure to target structures (i.e. ground truth) using recognition rates, recall and precision for object segmentation, classification and relationships. In document recognition, these target objects (e.g. symbols) are frequently comprised of multiple primitives (e.g. connected components, or strokes for online handwritten data), but current metrics do not characterize errors at the primitive level, from which object-level structure is obtained. Primitive label graphs are directed graphs defined over primitives and primitive pairs. We define new metrics obtained by Hamming distances over label graphs, which allow classification, segmentation and parsing errors to be characterized separately, or using a single measure. Recall and precision for detected objects may also be computed directly from label graphs. We illustrate the new metrics by comparing a new primitive-level evaluation to the symbol-level evaluation performed for the CROHME 2012 handwritten math recognition competition. A Python-based set of utilities for evaluating, visualizing and translating label graphs is publicly available.
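To make the idea of Hamming distances over primitive label graphs concrete, here is a minimal sketch. It is not the publicly available utility set mentioned in the abstract: the data layout (one label per primitive, one label per ordered primitive pair), the `NO_EDGE` placeholder and the function name are assumptions, and the finer split of edge errors into segmentation and relationship errors is left out.

```python
from itertools import permutations
from typing import Dict, Tuple

NodeLabels = Dict[str, str]              # primitive id -> class label
EdgeLabels = Dict[Tuple[str, str], str]  # ordered primitive pair -> label
NO_EDGE = "_"                            # assumed label for "no relationship"


def hamming_distances(out_nodes: NodeLabels, out_edges: EdgeLabels,
                      gt_nodes: NodeLabels, gt_edges: EdgeLabels) -> Dict[str, int]:
    """Count label disagreements between a recognizer output and ground truth.

    Both graphs must be defined over the same primitives, so every node and
    every ordered pair can be compared position by position.
    """
    if set(out_nodes) != set(gt_nodes):
        raise ValueError("both graphs must be defined over the same primitives")
    node_errors = sum(out_nodes[p] != gt_nodes[p] for p in gt_nodes)
    edge_errors = sum(
        out_edges.get(pair, NO_EDGE) != gt_edges.get(pair, NO_EDGE)
        for pair in permutations(gt_nodes, 2)
    )
    return {"nodes": node_errors, "edges": edge_errors,
            "total": node_errors + edge_errors}


# Ground truth: stroke s1 is "2"; strokes s2 and s3 together form one "+".
gt_nodes = {"s1": "2", "s2": "+", "s3": "+"}
gt_edges = {("s2", "s3"): "*", ("s3", "s2"): "*",   # "*" = same symbol (assumed)
            ("s1", "s2"): "R", ("s1", "s3"): "R"}   # "R" = right-of (assumed)

# Recognizer output: the "+" was split into "1" and "-".
out_nodes = {"s1": "2", "s2": "1", "s3": "-"}
out_edges = {("s1", "s2"): "R", ("s2", "s3"): "R"}

print(hamming_distances(out_nodes, out_edges, gt_nodes, gt_edges))
# {'nodes': 2, 'edges': 3, 'total': 5}
```

Because node and edge disagreements are counted separately, a classification error (wrong node label) can be told apart from a structural error (wrong edge label), while the total gives a single summary measure.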
Bimodal Benefits on Objective and Subjective Outcomes for Adult Cochlear Implant Users

PubMed Central

Heo, Ji-Hye; Lee, Won-Sang

2013-01-01

Background and Objectives Given that only a few studies have focused on the bimodal benefits on objective and subjective outcomes and emphasized the importance of individual data, the present study aimed to measure the bimodal benefits on the objective and subjective outcomes for adults with cochlear implant. Subjects and Methods Fourteen listeners with bimodal devices were tested on the localization and recognition abilities using environmental sounds, 1-talker, and 2-talker speech materials. The localization ability was measured through an 8-loudspeaker array. For the recognition measures, listeners were asked to repeat the sentences or say the environmental sounds the listeners heard. As a subjective questionnaire, three domains of Korean-version of Speech, Spatial, Qualities of Hearing scale (K-SSQ) were used to explore any relationships between objective and subjective outcomes. Results Based on the group-mean data, the bimodal hearing enhanced both localization and recognition regardless of test material. However, the inter- and intra-subject variability appeared to be large across test materials for both localization and recognition abilities. Correlation analyses revealed that the relationships were not always consistent between the objective outcomes and the subjective self-reports with bimodal devices. Conclusions Overall, this study supports significant bimodal advantages on localization and recognition measures, yet the large individual variability in bimodal benefits should be considered carefully for the clinical assessment as well as counseling. The discrepant relations between objective and subjective results suggest that the bimodal benefits in traditional localization or recognition measures might not necessarily correspond to the self-reported subjective advantages in everyday listening environments. PMID:24653909
Incrementally learning objects by touch: online discriminative and generative models for tactile-based recognition.

PubMed

Soh, Harold; Demiris, Yiannis

2014-01-01

Human beings not only possess the remarkable ability to distinguish objects through tactile feedback but are further able to improve upon recognition competence through experience. In this work, we explore tactile-based object recognition with learners capable of incremental learning. Using the sparse online infinite Echo-State Gaussian process (OIESGP), we propose and compare two novel discriminative and generative tactile learners that produce probability distributions over objects during object grasping/palpation. To enable iterative improvement, our online methods incorporate training samples as they become available. We also describe incremental unsupervised learning mechanisms, based on novelty scores and extreme value theory, when teacher labels are not available. We present experimental results for both supervised and unsupervised learning tasks using the iCub humanoid, with tactile sensors on its five-fingered anthropomorphic hand, and 10 different object classes. Our classifiers perform comparably to state-of-the-art methods (C4.5 and SVM classifiers) and findings indicate that tactile signals are highly relevant for making accurate object classifications. We also show that accurate "early" classifications are possible using only 20-30 percent of the grasp sequence. For unsupervised learning, our methods generate high quality clusterings relative to the widely-used sequential k-means and self-organising map (SOM), and we present analyses into the differences between the approaches.
Body-wide hierarchical fuzzy modeling, recognition, and delineation of anatomy in medical images.

PubMed

Udupa, Jayaram K; Odhner, Dewey; Zhao, Liming; Tong, Yubing; Matsumoto, Monica M S; Ciesielski, Krzysztof C; Falcao, Alexandre X; Vaideeswaran, Pavithra; Ciesielski, Victoria; Saboury, Babak; Mohammadianrasanani, Syedmehrdad; Sin, Sanghun; Arens, Raanan; Torigian, Drew A

2014-07-01

To make Quantitative Radiology (QR) a reality in radiological practice, computerized body-wide Automatic Anatomy Recognition (AAR) becomes essential. With the goal of building a general AAR system that is not tied to any specific organ system, body region, or image modality, this paper presents an AAR methodology for localizing and delineating all major organs in different body regions based on fuzzy modeling ideas and a tight integration of fuzzy models with an Iterative Relative Fuzzy Connectedness (IRFC) delineation algorithm. The methodology consists of five main steps: (a) gathering image data for both building models and testing the AAR algorithms from patient image sets existing in our health system; (b) formulating precise definitions of each body region and organ and delineating them following these definitions; (c) building hierarchical fuzzy anatomy models of organs for each body region; (d) recognizing and locating organs in given images by employing the hierarchical models; and (e) delineating the organs following the hierarchy. In Step (c), we explicitly encode object size and positional relationships into the hierarchy and subsequently exploit this information in object recognition in Step (d) and delineation in Step (e). Modality-independent and dependent aspects are carefully separated in model encoding. At the model building stage, a learning process is carried out for rehearsing an optimal threshold-based object recognition method. The recognition process in Step (d) starts from large, well-defined objects and proceeds down the hierarchy in a global to local manner. A fuzzy model-based version of the IRFC algorithm is created by naturally integrating the fuzzy model constraints into the delineation algorithm. The AAR system is tested on three body regions - thorax (on CT), abdomen (on CT and MRI), and neck (on MRI and CT) - involving a total of over 35 organs and 130 data sets (the total used for model building and testing). 
The training and testing data sets are divided into equal size in all cases except for the neck. Overall the AAR method achieves a mean accuracy of about 2 voxels in localizing non-sparse blob-like objects and most sparse tubular objects. The delineation accuracy in terms of mean false positive and negative volume fractions is 2% and 8%, respectively, for non-sparse objects, and 5% and 15%, respectively, for sparse objects. The two object groups achieve mean boundary distance relative to ground truth of 0.9 and 1.5 voxels, respectively. Some sparse objects - venous system (in the thorax on CT), inferior vena cava (in the abdomen on CT), and mandible and naso-pharynx (in neck on MRI, but not on CT) - pose challenges at all levels, leading to poor recognition and/or delineation results. The AAR method fares quite favorably when compared with methods from the recent literature for liver, kidneys, and spleen on CT images. We conclude that separation of modality-independent from dependent aspects, organization of objects in a hierarchy, encoding of object relationship information explicitly into the hierarchy, optimal threshold-based recognition learning, and fuzzy model-based IRFC are effective concepts which allowed us to demonstrate the feasibility of a general AAR system that works in different body regions on a variety of organs and on different modalities. Copyright © 2014 Elsevier B.V. All rights reserved.
Short- and long-term effects of nicotine and the histone deacetylase inhibitor phenylbutyrate on novel object recognition in zebrafish.

PubMed

Faillace, M P; Pisera-Fuster, A; Medrano, M P; Bejarano, A C; Bernabeu, R O

2017-03-01

Zebrafish have a sophisticated color- and shape-sensitive visual system, so we examined color cue-based novel object recognition in zebrafish. We evaluated preference in the absence or presence of drugs that affect attention and memory retention in rodents: nicotine and the histone deacetylase inhibitor (HDACi) phenylbutyrate (PhB). The objective of this study was to evaluate whether nicotine and PhB affect innate preferences of zebrafish for familiar and novel objects after short- and long-retention intervals. We developed modified object recognition (OR) tasks using neutral novel and familiar objects in different colors. We also tested objects which differed with respect to the exploratory behavior they elicited from naïve zebrafish. Zebrafish showed an innate preference for exploring red or green objects rather than yellow or blue objects. Zebrafish were better at discriminating color changes than changes in object shape or size. Nicotine significantly enhanced or changed short-term innate novel object preference whereas PhB had similar effects when preference was assessed 24 h after training. Analysis of other zebrafish behaviors corroborated these results. Zebrafish were innately reluctant or prone to explore colored novel objects, so drug effects on innate preference for objects can be evaluated changing the color of objects with a simple geometry. Zebrafish exhibited recognition memory for novel objects with similar innate significance. Interestingly, nicotine and PhB significantly modified innate object preference.
Finger tips detection for two handed gesture recognition

NASA Astrophysics Data System (ADS)

Bhuyan, M. K.; Kar, Mithun Kumar; Neog, Debanga Raj

2011-10-01

In this paper, a novel algorithm is proposed for fingertip detection with a view to two-handed static hand pose recognition. In our method, the fingertips of both hands are detected after the hand regions have been found by skin color-based segmentation. First, the face is removed from the image using a Haar classifier, and the regions corresponding to the gesturing hands are then isolated by a region labeling technique. Next, the key geometric features characterizing the gesturing hands are extracted for both hands. Finally, a probabilistic model covering all possible/allowable finger movements is developed for pose recognition. The proposed method can be employed in a variety of applications such as sign language recognition and human-robot interaction.
Bidirectional Modulation of Recognition Memory

PubMed Central

Ho, Jonathan W.; Poeta, Devon L.; Jacobson, Tara K.; Zolnik, Timothy A.; Neske, Garrett T.; Connors, Barry W.

2015-01-01

Perirhinal cortex (PER) has a well established role in the familiarity-based recognition of individual items and objects. For example, animals and humans with perirhinal damage are unable to distinguish familiar from novel objects in recognition memory tasks. In the normal brain, perirhinal neurons respond to novelty and familiarity by increasing or decreasing firing rates. Recent work also implicates oscillatory activity in the low-beta and low-gamma frequency bands in sensory detection, perception, and recognition. Using optogenetic methods in a spontaneous object exploration (SOR) task, we altered recognition memory performance in rats. In the SOR task, normal rats preferentially explore novel images over familiar ones. We modulated exploratory behavior in this task by optically stimulating channelrhodopsin-expressing perirhinal neurons at various frequencies while rats looked at novel or familiar 2D images. Stimulation at 30–40 Hz during looking caused rats to treat a familiar image as if it were novel by increasing time looking at the image. Stimulation at 30–40 Hz was not effective in increasing exploration of novel images. Stimulation at 10–15 Hz caused animals to treat a novel image as familiar by decreasing time looking at the image, but did not affect looking times for images that were already familiar. We conclude that optical stimulation of PER at different frequencies can alter visual recognition memory bidirectionally. SIGNIFICANCE STATEMENT Recognition of novelty and familiarity are important for learning, memory, and decision making. Perirhinal cortex (PER) has a well established role in the familiarity-based recognition of individual items and objects, but how novelty and familiarity are encoded and transmitted in the brain is not known. Perirhinal neurons respond to novelty and familiarity by changing firing rates, but recent work suggests that brain oscillations may also be important for recognition. 
In this study, we showed that stimulation of the PER could increase or decrease exploration of novel and familiar images depending on the frequency of stimulation. Our findings suggest that optical stimulation of PER at specific frequencies can predictably alter recognition memory. PMID:26424881
Using Prosopagnosia to Test and Modify Visual Recognition Theory.

PubMed

O'Brien, Alexander M

2018-02-01

Biederman's contemporary theory of basic visual object recognition (Recognition-by-Components) is based on structural descriptions of objects and presumes 36 visual primitives (geons) people can discriminate, but there has been no empirical test of the actual use of these 36 geons to visually distinguish objects. In this study, we tested for the actual use of these geons in basic visual discrimination by comparing object discrimination performance patterns (when distinguishing varied stimuli) of an acquired prosopagnosia patient (LB) and healthy control participants. LB's prosopagnosia left her heavily reliant on structural descriptions or categorical object differences in visual discrimination tasks versus the control participants' additional ability to use face recognition or coordinate systems (Coordinate Relations Hypothesis). Thus, when LB performed comparably to control participants with a given stimulus, her restricted reliance on basic or categorical discriminations meant that the stimuli must be distinguishable on the basis of a geon feature. By varying stimuli in eight separate experiments and presenting all 36 geons, we discerned that LB coded only 12 (vs. 36) distinct visual primitives (geons), apparently reflective of human visual systems generally.
Tactile agnosia. Casuistic evidence and theoretical remarks on modality-specific meaning representations and sensorimotor integration.

PubMed

Platz, T

1996-10-01

Somaesthetic, motor and cognitive functions were studied in a man with impaired tactile object-recognition (TOR) in his left hand due to a right parietal convexity meningioma which had been surgically removed. Primary motor and somatosensory functions were not impaired, and discriminative abilities for various tactile aspects and cognitive skills were preserved. Nevertheless, the patient could often not appreciate the object's nature or significance when it was placed in his left hand and was unable to name or to describe or demonstrate the use of these objects. Therefore, he can be regarded as an example of associative tactile agnosia. The view is taken and elaborated that defective modality-specific meaning representations account for associative tactile agnosia. These meaning representations are conceptualized as learned unimodal feature-entity relationships which are thought to be defective in tactile agnosia. In line with this hypothesis, tactile feature analysis and cross-modal matching of features were largely preserved in the investigated patient, while combining features to form entities was defective in the tactile domain. The alternative hypothesis of agnosia as a deficit of cross-modal association of features was not supported. The presumed distributed functional network responsible for TOR is thought to involve perception of features, object recognition and related tactile motor behaviour interactively. A deficit leading primarily to impaired combining of features to form entities can therefore be expected to result in additional minor impairment of related perceptual-motor processes. Unilaterality of the gnostic deficit can be explained by a lateralized organization of the functional network responsible for tactile recognition of objects.
Using LabView for real-time monitoring and tracking of multiple biological objects

NASA Astrophysics Data System (ADS)

Nikolskyy, Aleksandr I.; Krasilenko, Vladimir G.; Bilynsky, Yosyp Y.; Starovier, Anzhelika

2017-04-01

Real-time monitoring and tracking of the movement dynamics of biological objects is an important and widely researched problem. The features of the objects, the conditions under which they are visualized, and the model parameters strongly influence the choice of optimal methods and algorithms for a specific task. To automate the adaptation of recognition and tracking algorithms, this article considers several LabVIEW tracker projects. The projects allow templates to be changed quickly for training and retraining the system, and they adapt to the speed of objects and to the statistical characteristics of noise in images. New functions for comparing images or their features, descriptors, and pre-processing methods are discussed. Experiments carried out to test the trackers on real video files are presented and analyzed.
An enhanced multi-view vertical line locus matching algorithm of object space ground primitives based on positioning consistency for aerial and space images

NASA Astrophysics Data System (ADS)

Zhang, Ka; Sheng, Yehua; Wang, Meizhen; Fu, Suxia

2018-05-01

The traditional multi-view vertical line locus (TMVLL) matching method is an object-space-based method that is commonly used to directly acquire spatial 3D coordinates of ground objects in photogrammetry. However, the TMVLL method can only obtain one elevation and lacks an accurate means of validating the matching results. In this paper, we propose an enhanced multi-view vertical line locus (EMVLL) matching algorithm based on positioning consistency for aerial or space images. The algorithm involves three components: confirming candidate pixels of the ground primitive in the base image, multi-view image matching based on the object space constraints for all candidate pixels, and validating the consistency of the object space coordinates with the multi-view matching result. The proposed algorithm was tested using actual aerial images and space images. Experimental results show that the EMVLL method successfully solves the problems associated with the TMVLL method, and has greater reliability, accuracy and computing efficiency.
Paradoxical false memory for objects after brain damage.

PubMed

McTighe, Stephanie M; Cowell, Rosemary A; Winters, Boyer D; Bussey, Timothy J; Saksida, Lisa M

2010-12-03

Poor memory after brain damage is usually considered to be a result of information being lost or rendered inaccessible. It is assumed that such memory impairment must be due to the incorrect interpretation of previously encountered information as being novel. In object recognition memory experiments with rats, we found that memory impairment can take the opposite form: a tendency to treat novel experiences as familiar. This impairment could be rescued with the use of a visual-restriction procedure that reduces interference. Such a pattern of data can be explained in terms of a recent representational-hierarchical view of cognition.
Incongruence Between Observers’ and Observed Facial Muscle Activation Reduces Recognition of Emotional Facial Expressions From Video Stimuli

PubMed Central

Wingenbach, Tanja S. H.; Brosnan, Mark; Pfaltz, Monique C.; Plichta, Michael M.; Ashwin, Chris

2018-01-01

According to embodied cognition accounts, viewing others’ facial emotion can elicit the respective emotion representation in observers which entails simulations of sensory, motor, and contextual experiences. In line with that, published research found viewing others’ facial emotion to elicit automatic matched facial muscle activation, which was further found to facilitate emotion recognition. Perhaps making congruent facial muscle activity explicit produces an even greater recognition advantage. If there is conflicting sensory information, i.e., incongruent facial muscle activity, this might impede recognition. The effects of actively manipulating facial muscle activity on facial emotion recognition from videos were investigated across three experimental conditions: (a) explicit imitation of viewed facial emotional expressions (stimulus-congruent condition), (b) pen-holding with the lips (stimulus-incongruent condition), and (c) passive viewing (control condition). It was hypothesised that (1) experimental condition (a) and (b) result in greater facial muscle activity than (c), (2) experimental condition (a) increases emotion recognition accuracy from others’ faces compared to (c), (3) experimental condition (b) lowers recognition accuracy for expressions with a salient facial feature in the lower, but not the upper face area, compared to (c). Participants (42 males, 42 females) underwent a facial emotion recognition experiment (ADFES-BIV) while electromyography (EMG) was recorded from five facial muscle sites. The experimental conditions’ order was counter-balanced. Pen-holding caused stimulus-incongruent facial muscle activity for expressions with facial feature saliency in the lower face region, which reduced recognition of lower face region emotions. Explicit imitation caused stimulus-congruent facial muscle activity without modulating recognition. Methodological implications are discussed. PMID:29928240
Magical thinking and memory: distinctiveness effect for tv commercials with magical content.

PubMed

Subbotsky, Eugene; Mathews, Jayne

2011-10-01

The aim of this study was to examine whether memorizing advertised products of television advertisements with magical effects (i.e., talking animals, inanimate objects which turn into humans, objects that appear from thin air or instantly turn into other objects) is easier than memorizing products of advertisements without such effects, by testing immediate and delayed retention. Adolescents and adults viewed two films containing television advertisements and were asked to recall and recognize the films' characters, events, and advertised products. Film 1 included magical effects, but Film 2 did not. On a free-recall test, no differences in the number of items recalled were noted for the two films. On the immediate recognition test, adolescents, but not adults, showed significantly better recognition for the magical than the nonmagical film. When this test was repeated two weeks later, results were reversed: adults, but not adolescents, recognized a significantly larger number of items from the magical film than the nonmagical one. These results are interpreted to accentuate the role of magical thinking in cognitive processes.
Quantifying Novice and Expert Differences in Visual Diagnostic Reasoning in Veterinary Pathology Using Eye-Tracking Technology.

PubMed

Warren, Amy L; Donnon, Tyrone L; Wagg, Catherine R; Priest, Heather; Fernandez, Nicole J

2018-01-18

Visual diagnostic reasoning is the cognitive process by which pathologists reach a diagnosis based on visual stimuli (cytologic, histopathologic, or gross imagery). Currently, there is little to no literature examining visual reasoning in veterinary pathology. The objective of the study was to use eye tracking to establish baseline quantitative and qualitative differences between the visual reasoning processes of novice and expert veterinary pathologists viewing cytology specimens. Novice and expert participants were each shown 10 cytology images and asked to formulate a diagnosis while wearing eye-tracking equipment (10 slides) and while concurrently verbalizing their thought processes using the think-aloud protocol (5 slides). Compared to novices, experts demonstrated significantly higher diagnostic accuracy (p<.017), shorter time to diagnosis (p<.017), and a higher percentage of time spent viewing areas of diagnostic interest (p<.017). Experts elicited more key diagnostic features in the think-aloud protocol and had more efficient patterns of eye movement. These findings suggest that experts' fast time to diagnosis, efficient eye-movement patterns, and preference for viewing areas of interest support system 1 (pattern-recognition) reasoning and script-inductive knowledge structures, with system 2 (analytic) reasoning used to verify their diagnosis.

Target recognition based on convolutional neural network

NASA Astrophysics Data System (ADS)

Wang, Liqiang; Wang, Xin; Xi, Fubiao; Dong, Jian

2017-11-01

One important part of target recognition is feature extraction, which can be divided into manual feature extraction and automatic feature extraction. The traditional neural network is one automatic feature extraction method, but its global (fully connected) structure makes over-fitting likely. The deep learning algorithm used in this paper is a hierarchical automatic feature extraction method: a convolutional neural network (CNN), trained layer by layer, that extracts features from lower layers to higher layers. The resulting features are more discriminative, which benefits target recognition.
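The abstract does not describe the network's architecture, so the following is only a minimal sketch of a single convolutional stage (convolution, ReLU, 2x2 max pooling) in plain Python. The function names, the toy 6x6 image, and the edge kernel are illustrative assumptions, not the authors' model.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding).

    As in most CNN libraries this is a cross-correlation: the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap):
    """Keep positive responses, zero out the rest."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2x2(fmap):
    """Non-overlapping 2x2 max pooling; a trailing odd row or column is dropped."""
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, len(fmap[0]) - 1, 2)]
        for r in range(0, len(fmap) - 1, 2)
    ]


def conv_stage(image, kernel):
    """One stage of the hierarchy: convolution -> ReLU -> pooling."""
    return max_pool2x2(relu(conv2d_valid(image, kernel)))


# Toy input (assumed): dark left half, bright right half, i.e. one vertical edge.
image = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(6)]
# Hypothetical hand-set kernel that responds to dark-to-bright vertical edges.
edge_kernel = [[-1.0, 1.0], [-1.0, 1.0]]

features = conv_stage(image, edge_kernel)
print(features)  # [[0.0, 2.0], [0.0, 2.0]]
```

The shared 2x2 kernel has only 4 weights, whereas a fully connected layer mapping the same 6x6 input to a 5x5 output would need 36 x 25 = 900. That difference in parameter count is one way to read the abstract's remark that global connection raises the risk of over-fitting.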
Identification and detection of simple 3D objects with severely blurred vision.

PubMed

Kallie, Christopher S; Legge, Gordon E; Yu, Deyue

2012-12-05

Detecting and recognizing three-dimensional (3D) objects is an important component of the visual accessibility of public spaces for people with impaired vision. The present study investigated the impact of environmental factors and object properties on the recognition of objects by subjects who viewed physical objects with severely reduced acuity. The experiment was conducted in an indoor testing space. We examined detection and identification of simple convex objects by normally sighted subjects wearing diffusing goggles that reduced effective acuity to 20/900. We used psychophysical methods to examine the effect on performance of important environmental variables: viewing distance (from 10-24 feet, or 3.05-7.32 m) and illumination (overhead fluorescent and artificial window), and object variables: shape (boxes and cylinders), size (heights from 2-6 feet, or 0.61-1.83 m), and color (gray and white). Object identification was significantly affected by distance, color, height, and shape, as well as interactions between illumination, color, and shape. A stepwise regression analysis showed that 64% of the variability in identification could be explained by object contrast values (58%) and object visual angle (6%). When acuity is severely limited, illumination, distance, color, height, and shape influence the identification and detection of simple 3D objects. These effects can be explained in large part by the impact of these variables on object contrast and visual angle. Basic design principles for improving object visibility are discussed.
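Because the abstract attributes part of the identification variance to object visual angle, a small worked calculation may help. The sketch below uses the standard geometric formula for the angle subtended by an object of a given height at a given distance, and the usual conversion of a Snellen fraction to a minimum angle of resolution (MAR). The function names are illustrative, and the study's own analysis code is not given in the abstract.

```python
import math


def visual_angle_deg(size, distance):
    """Angle (degrees) subtended by an object of height `size` seen from `distance`.

    Both arguments must use the same unit (feet here, as in the abstract).
    """
    return math.degrees(2.0 * math.atan(size / (2.0 * distance)))


def snellen_to_mar_arcmin(numerator, denominator):
    """Minimum angle of resolution in arcminutes for a Snellen fraction (20/20 -> 1)."""
    return denominator / numerator


# Extremes of the ranges reported in the abstract:
smallest = visual_angle_deg(2.0, 24.0)   # shortest object at the farthest distance
largest = visual_angle_deg(6.0, 10.0)    # tallest object at the nearest distance
mar = snellen_to_mar_arcmin(20, 900)     # diffusing goggles: 20/900

print(f"{smallest:.2f} deg, {largest:.2f} deg, MAR = {mar:.0f} arcmin")
```

Under these assumptions, object heights ranged from roughly 4.8 to 33.4 degrees of visual angle, while the goggles limited resolution to about 45 arcminutes (0.75 degrees).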
Design and implementation of face recognition system based on Windows

NASA Astrophysics Data System (ADS)

Zhang, Min; Liu, Ting; Li, Ailan

2015-07-01

Because the basic Windows password login lacks both security and convenience, we introduce a biometric technology, face recognition, into the computer login system. It not only protects the computer system but also identifies administrators according to their privilege level. With system security enhanced, users no longer need to enter cumbersome passwords or worry about their passwords being stolen.
Standard object recognition memory and "what" and "where" components: Improvement by post-training epinephrine in highly habituated rats.

PubMed

Jurado-Berbel, Patricia; Costa-Miserachs, David; Torras-Garcia, Meritxell; Coll-Andreu, Margalida; Portell-Cortés, Isabel

2010-02-11

The present work examined whether post-training systemic epinephrine (EPI) is able to modulate short-term (3 h) and long-term (24 h and 48 h) memory of standard object recognition, as well as long-term (24 h) memory of separate "what" (object identity) and "where" (object location) components of object recognition. Although object recognition training is associated with low arousal levels, all the animals received habituation to the training box in order to further reduce emotional arousal. Post-training EPI improved long-term (24 h and 48 h), but not short-term (3 h), memory in the standard object recognition task, as well as 24 h memory for both object identity and object location. These data indicate that post-training epinephrine: (1) facilitates long-term memory for standard object recognition; (2) exerts separate facilitatory effects on "what" (object identity) and "where" (object location) components of object recognition; and (3) is capable of improving memory for a low arousing task even in highly habituated rats.
[Visual Texture Agnosia in Humans].

PubMed

Suzuki, Kyoko

2015-06-01

Visual object recognition requires the processing of both geometric and surface properties. Patients with occipital lesions may have visual agnosia, which is impairment in the recognition and identification of visually presented objects primarily through their geometric features. An analogous condition involving the failure to recognize an object by its texture may exist, which can be called visual texture agnosia. Here we present two cases with visual texture agnosia. Case 1 had left homonymous hemianopia and right upper quadrantanopia, along with achromatopsia, prosopagnosia, and texture agnosia, because of damage to his left ventromedial occipitotemporal cortex and right lateral occipito-temporo-parietal cortex due to multiple cerebral embolisms. Although he showed difficulty matching and naming textures of real materials, he could readily name visually presented objects by their contours. Case 2 had right lower quadrantanopia, along with impairment in stereopsis and recognition of texture in 2D images, because of subcortical hemorrhage in the left occipitotemporal region. He failed to recognize shapes based on texture information, whereas shape recognition based on contours was well preserved. Our findings, along with those of three reported cases with texture agnosia, indicate that there are separate channels for processing texture, color, and geometric features, and that the regions around the left collateral sulcus are crucial for texture processing.
Experience moderates overlap between object and face recognition, suggesting a common ability

PubMed Central

Gauthier, Isabel; McGugin, Rankin W.; Richler, Jennifer J.; Herzmann, Grit; Speegle, Magen; Van Gulick, Ana E.

2014-01-01

Some research finds that face recognition is largely independent from the recognition of other objects; a specialized and innate ability to recognize faces could therefore have little or nothing to do with our ability to recognize objects. We propose a new framework in which recognition performance for any category is the product of domain-general ability and category-specific experience. In Experiment 1, we show that the overlap between face and object recognition depends on experience with objects. In 256 subjects we measured face recognition, object recognition for eight categories, and self-reported experience with these categories. Experience predicted neither face recognition nor object recognition but moderated their relationship: Face recognition performance is increasingly similar to object recognition performance with increasing object experience. If a subject has a lot of experience with objects and is found to perform poorly, they also prove to have a low ability with faces. In a follow-up survey, we explored the dimensions of experience with objects that may have contributed to self-reported experience in Experiment 1. Different dimensions of experience appear to be more salient for different categories, with general self-reports of expertise reflecting judgments of verbal knowledge about a category more than judgments of visual performance. The complexity of experience and current limitations in its measurement support the importance of aggregating across multiple categories. Our findings imply that both face and object recognition are supported by a common, domain-general ability expressed through experience with a category and best measured when accounting for experience. PMID:24993021
Experience moderates overlap between object and face recognition, suggesting a common ability.

PubMed

Gauthier, Isabel; McGugin, Rankin W; Richler, Jennifer J; Herzmann, Grit; Speegle, Magen; Van Gulick, Ana E

2014-07-03

Some research finds that face recognition is largely independent from the recognition of other objects; a specialized and innate ability to recognize faces could therefore have little or nothing to do with our ability to recognize objects. We propose a new framework in which recognition performance for any category is the product of domain-general ability and category-specific experience. In Experiment 1, we show that the overlap between face and object recognition depends on experience with objects. In 256 subjects we measured face recognition, object recognition for eight categories, and self-reported experience with these categories. Experience predicted neither face recognition nor object recognition but moderated their relationship: Face recognition performance is increasingly similar to object recognition performance with increasing object experience. If a subject has a lot of experience with objects and is found to perform poorly, they also prove to have a low ability with faces. In a follow-up survey, we explored the dimensions of experience with objects that may have contributed to self-reported experience in Experiment 1. Different dimensions of experience appear to be more salient for different categories, with general self-reports of expertise reflecting judgments of verbal knowledge about a category more than judgments of visual performance. The complexity of experience and current limitations in its measurement support the importance of aggregating across multiple categories. Our findings imply that both face and object recognition are supported by a common, domain-general ability expressed through experience with a category and best measured when accounting for experience. © 2014 ARVO.
Remembering the object you fear: brain potentials during recognition of spiders in spider-fearful individuals.

PubMed

Michalowski, Jaroslaw M; Weymar, Mathias; Hamm, Alfons O

2014-01-01

In the present study we investigated long-term memory for unpleasant, neutral and spider pictures in 15 spider-fearful and 15 non-fearful control individuals using behavioral and electrophysiological measures. During the initial (incidental) encoding, pictures were passively viewed in three separate blocks and were subsequently rated for valence and arousal. A recognition memory task was performed one week later in which old and new unpleasant, neutral and spider pictures were presented. Replicating previous results, we found enhanced memory performance and higher confidence ratings for unpleasant when compared to neutral materials in both animal fearful individuals and controls. When compared to controls, high animal fearful individuals also showed a tendency towards better memory accuracy and significantly higher confidence during recognition of spider pictures, suggesting that memory of objects prompting specific fear is also facilitated in fearful individuals. In line with this, spider-fearful but not control participants responded with larger ERP positivity for correctly recognized old when compared to correctly rejected new spider pictures, thus showing the same effects in the neural signature of emotional memory for feared objects that were already discovered for other emotional materials. The increased fear memory for phobic materials observed in the present study in spider-fearful individuals might result in an enhanced fear response and reinforce negative beliefs aggravating anxiety symptomatology and hindering recovery.
The Memory State Heuristic: A Formal Model Based on Repeated Recognition Judgments

ERIC Educational Resources Information Center

Castela, Marta; Erdfelder, Edgar

2017-01-01

The recognition heuristic (RH) theory predicts that, in comparative judgment tasks, if one object is recognized and the other is not, the recognized one is chosen. The memory-state heuristic (MSH) extends the RH by assuming that choices are not affected by recognition judgments per se, but by the memory states underlying these judgments (i.e.,…
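The RH decision rule stated in the abstract can be written as a short function. This is a minimal sketch of the RH only; the abstract is truncated before the MSH is specified, so no attempt is made to model it. The function name and the example objects are hypothetical.

```python
def recognition_heuristic(option_a, option_b, recognized):
    """Return the option the RH predicts will be chosen, or None if the RH is silent.

    `recognized` is the set of objects the decision maker reports recognizing.
    The RH applies only when exactly one of the two options is recognized.
    """
    a_known = option_a in recognized
    b_known = option_b in recognized
    if a_known and not b_known:
        return option_a
    if b_known and not a_known:
        return option_b
    return None  # both or neither recognized: the RH makes no prediction


# Hypothetical comparative judgment: "Which city is larger?"
known = {"Berlin", "Munich"}
print(recognition_heuristic("Berlin", "Herne", known))   # Berlin
print(recognition_heuristic("Berlin", "Munich", known))  # None
```

Returning `None` for pairs where both or neither object is recognized keeps the sketch faithful to the rule as stated, which makes a prediction only for the mixed case.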
Extraction of edge-based and region-based features for object recognition

NASA Astrophysics Data System (ADS)

Coutts, Benjamin; Ravi, Srinivas; Hu, Gongzhu; Shrikhande, Neelima

1993-08-01

One of the central problems of computer vision is object recognition. A catalogue of model objects is described as a set of features such as edges and surfaces. The same features are extracted from the scene and matched against the models for object recognition. Edges and surfaces extracted from the scenes are often noisy and imperfect. In this paper algorithms are described for improving low level edge and surface features. Existing edge extraction algorithms are applied to the intensity image to obtain edge features. Initial edges are traced by following directions of the current contour. These are improved by using corresponding depth and intensity information for decision making at branch points. Surface fitting routines are applied to the range image to obtain planar surface patches. An algorithm of region growing is developed that starts with a coarse segmentation and uses quadric surface fitting to iteratively merge adjacent regions into quadric surfaces based on approximate orthogonal distance regression. Surface information obtained is returned to the edge extraction routine to detect and remove fake edges. This process repeats until no more merging or edge improvement can take place. Both synthetic (with Gaussian noise) and real images containing multiple object scenes have been tested using the merging criteria. Results appeared quite encouraging.
Object Representations in Human Visual Cortex Formed Through Temporal Integration of Dynamic Partial Shape Views.

PubMed

Orlov, Tanya; Zohary, Ehud

2018-01-17

We typically recognize visual objects using the spatial layout of their parts, which are present simultaneously on the retina. Therefore, shape extraction is based on integration of the relevant retinal information over space. The lateral occipital complex (LOC) can represent shape faithfully in such conditions. However, integration over time is sometimes required to determine object shape. To study shape extraction through temporal integration of successive partial shape views, we presented human participants (both men and women) with artificial shapes that moved behind a narrow vertical or horizontal slit. Only a tiny fraction of the shape was visible at any instant at the same retinal location. However, observers perceived a coherent whole shape instead of a jumbled pattern. Using fMRI and multivoxel pattern analysis, we searched for brain regions that encode temporally integrated shape identity. We further required that the representation of shape should be invariant to changes in the slit orientation. We show that slit-invariant shape information is most accurate in the LOC. Importantly, the slit-invariant shape representations matched the conventional whole-shape representations assessed during full-image runs. Moreover, when the same slit-dependent shape slivers were shuffled, thereby preventing their spatiotemporal integration, slit-invariant shape information was reduced dramatically. The slit-invariant representation of the various shapes also mirrored the structure of shape perceptual space as assessed by perceptual similarity judgment tests. Therefore, the LOC is likely to mediate temporal integration of slit-dependent shape views, generating a slit-invariant whole-shape percept. These findings provide strong evidence for a global encoding of shape in the LOC regardless of integration processes required to generate the shape percept. 
SIGNIFICANCE STATEMENT Visual objects are recognized through spatial integration of features available simultaneously on the retina. The lateral occipital complex (LOC) represents shape faithfully in such conditions even if the object is partially occluded. However, shape must sometimes be reconstructed over both space and time. Such is the case in anorthoscopic perception, when an object is moving behind a narrow slit. In this scenario, spatial information is limited at any moment so the whole-shape percept can only be inferred by integration of successive shape views over time. We find that LOC carries shape-specific information recovered using such temporal integration processes. The shape representation is invariant to slit orientation and is similar to that evoked by a fully viewed image. Existing models of object recognition lack such capabilities. Copyright © 2018 the authors.
The Potential of Using Brain Images for Authentication

PubMed Central

Zhou, Zongtan; Shen, Hui; Hu, Dewen

2014-01-01

Biometric recognition (also known as biometrics) refers to the automated recognition of individuals based on their biological or behavioral traits. Examples of biometric traits include fingerprint, palmprint, iris, and face. The brain is the most important and complex organ in the human body. Can it be used as a biometric trait? In this study, we analyze the uniqueness of the brain and try to use the brain for identity authentication. The proposed brain-based verification system operates in two stages: gray matter extraction and gray matter matching. A modified brain segmentation algorithm is implemented for extracting gray matter from an input brain image. Then, an alignment-based matching algorithm is developed for brain matching. Experimental results on two data sets show that the proposed brain recognition system meets the high accuracy requirement of identity authentication. Though currently the acquisition of the brain is still time consuming and expensive, brain images are highly unique and have the potential possibility for authentication in view of pattern recognition. PMID:25126604
The potential of using brain images for authentication.

PubMed

Chen, Fanglin; Zhou, Zongtan; Shen, Hui; Hu, Dewen

2014-01-01

Biometric recognition (also known as biometrics) refers to the automated recognition of individuals based on their biological or behavioral traits. Examples of biometric traits include fingerprint, palmprint, iris, and face. The brain is the most important and complex organ in the human body. Can it be used as a biometric trait? In this study, we analyze the uniqueness of the brain and try to use the brain for identity authentication. The proposed brain-based verification system operates in two stages: gray matter extraction and gray matter matching. A modified brain segmentation algorithm is implemented for extracting gray matter from an input brain image. Then, an alignment-based matching algorithm is developed for brain matching. Experimental results on two data sets show that the proposed brain recognition system meets the high accuracy requirement of identity authentication. Though currently the acquisition of the brain is still time consuming and expensive, brain images are highly unique and have the potential possibility for authentication in view of pattern recognition.
A Low-Dimensional Radial Silhouette-Based Feature for Fast Human Action Recognition Fusing Multiple Views.

PubMed

Chaaraoui, Alexandros Andre; Flórez-Revuelta, Francisco

2014-01-01

This paper presents a novel silhouette-based feature for vision-based human action recognition, which relies on the contour of the silhouette and a radial scheme. Its low dimensionality and ease of extraction make it highly suitable for real-time scenarios. This feature is used in a learning algorithm that, by means of model fusion of multiple camera streams, builds a bag of key poses, which serves as a dictionary of known poses and allows converting the training sequences into sequences of key poses. These are used to perform action recognition by means of a sequence matching algorithm. Experimentation on three different datasets returns high and stable recognition rates. To the best of our knowledge, this paper presents the highest results so far on the MuHAVi-MAS dataset. Real-time suitability is given, since the method easily performs above video frequency. Therefore, the related requirements that applications such as ambient-assisted living services impose are successfully fulfilled.
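The abstract above describes the feature only as contour-based and radial, so the following is a minimal sketch of one plausible radial contour descriptor, not the authors' implementation. The function name `radial_feature`, the number of bins, and the choice of the mean centroid distance per bin are all assumptions made for illustration.

```python
import math

def radial_feature(contour, bins=8):
    """Summarise a contour as one normalised value per angular bin.

    Assumption: each bin stores the mean distance from the centroid to the
    contour points falling in that angular sector; the abstract does not
    say which summary statistic the original feature uses.
    """
    # Centroid of the contour points.
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    sums = [0.0] * bins
    counts = [0] * bins
    for x, y in contour:
        # Angle of the point around the centroid, mapped to [0, 2*pi).
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        idx = min(int(angle / (2 * math.pi) * bins), bins - 1)
        sums[idx] += math.hypot(x - cx, y - cy)
        counts[idx] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    # Divide by the largest bin value so the feature is scale-invariant.
    peak = max(means)
    return [m / peak if peak else 0.0 for m in means]
```

With eight bins the descriptor has only eight values per frame, which illustrates why a radial summary of this kind can be cheap enough for real-time use.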
Cultural differences in visual object recognition in 3-year-old children

PubMed Central

Kuwabara, Megumi; Smith, Linda B.

2016-01-01

Recent research indicates that culture penetrates fundamental processes of perception and cognition (e.g. Nisbett & Miyamoto, 2005). Here, we provide evidence that these influences begin early and influence how preschool children recognize common objects. The three tasks (n=128) examined the degree to which nonface object recognition by 3 year olds was based on individual diagnostic features versus more configural and holistic processing. Task 1 used a 6-alternative forced choice task in which children were asked to find a named category in arrays of masked objects in which only 3 diagnostic features were visible for each object. U.S. children outperformed age-matched Japanese children. Task 2 presented pictures of objects to children piece by piece. U.S. children recognized the objects given fewer pieces than Japanese children and likelihood of recognition increased for U.S., but not Japanese children when the piece added was rated by both U.S. and Japanese adults as highly defining. Task 3 used a standard measure of configural processing, asking the degree to which recognition of matching pictures was disrupted by the rotation of one picture. Japanese children’s recognition was more disrupted by inversion than was that of U.S. children, indicating more configural processing by Japanese than U.S. children. The pattern suggests early cross-cultural differences in visual processing; findings that raise important questions about how visual experiences differ across cultures and about universal patterns of cognitive development. PMID:26985576
Cultural differences in visual object recognition in 3-year-old children.

PubMed

Kuwabara, Megumi; Smith, Linda B

2016-07-01

Recent research indicates that culture penetrates fundamental processes of perception and cognition. Here, we provide evidence that these influences begin early and influence how preschool children recognize common objects. The three tasks (N=128) examined the degree to which nonface object recognition by 3-year-olds was based on individual diagnostic features versus more configural and holistic processing. Task 1 used a 6-alternative forced choice task in which children were asked to find a named category in arrays of masked objects where only three diagnostic features were visible for each object. U.S. children outperformed age-matched Japanese children. Task 2 presented pictures of objects to children piece by piece. U.S. children recognized the objects given fewer pieces than Japanese children, and the likelihood of recognition increased for U.S. children, but not Japanese children, when the piece added was rated by both U.S. and Japanese adults as highly defining. Task 3 used a standard measure of configural processing, asking the degree to which recognition of matching pictures was disrupted by the rotation of one picture. Japanese children's recognition was more disrupted by inversion than was that of U.S. children, indicating more configural processing by Japanese than U.S. children. The pattern suggests early cross-cultural differences in visual processing; findings that raise important questions about how visual experiences differ across cultures and about universal patterns of cognitive development. Copyright © 2016 Elsevier Inc. All rights reserved.
Associative (prosop)agnosia without (apparent) perceptual deficits: a case-study.

PubMed

Anaki, David; Kaufman, Yakir; Freedman, Morris; Moscovitch, Morris

2007-04-09

In associative agnosia, early perceptual processing of faces or objects is considered to be intact, while the ability to access stored semantic information about the individual face or object is impaired. Recent claims, however, have asserted that associative agnosia is also characterized by deficits at the perceptual level, which are too subtle to be detected by current neuropsychological tests. Thus, the impaired identification of famous faces or common objects in associative agnosia stems from difficulties in extracting the minute perceptual details required to identify a face or an object. In the present study, we report the case of a patient DBO with a left occipital infarct, who shows impaired object and famous face recognition. Despite his disability, he exhibits a face inversion effect, and is able to select a famous face from among non-famous distractors. In addition, his performance is normal in an immediate and delayed recognition memory for faces, whose external features were deleted. His deficits in face recognition are apparent only when he is required to name a famous face, or select two faces from among a triad of famous figures based on their semantic relationships (a task which does not require access to names). The nature of his deficits in object perception and recognition is similar to his impairments in the face domain. This pattern of behavior supports the notion that apperceptive and associative agnosia reflect distinct and dissociated deficits, which result from damage to different stages of the face and object recognition process.
Toward a Computer Vision-based Wayfinding Aid for Blind Persons to Access Unfamiliar Indoor Environments.

PubMed

Tian, Yingli; Yang, Xiaodong; Yi, Chucai; Arditi, Aries

2013-04-01

Independent travel is a well known challenge for blind and visually impaired persons. In this paper, we propose a proof-of-concept computer vision-based wayfinding aid for blind people to independently access unfamiliar indoor environments. In order to find different rooms (e.g. an office, a lab, or a bathroom) and other building amenities (e.g. an exit or an elevator), we incorporate object detection with text recognition. First we develop a robust and efficient algorithm to detect doors, elevators, and cabinets based on their general geometric shape, by combining edges and corners. The algorithm is general enough to handle large intra-class variations of objects with different appearances among different indoor environments, as well as small inter-class differences between different objects such as doors and door-like cabinets. Next, in order to distinguish intra-class objects (e.g. an office door from a bathroom door), we extract and recognize text information associated with the detected objects. For text recognition, we first extract text regions from signs with multiple colors and possibly complex backgrounds, and then apply character localization and topological analysis to filter out background interference. The extracted text is recognized using off-the-shelf optical character recognition (OCR) software products. The object type, orientation, location, and text information are presented to the blind traveler as speech.
Toward a Computer Vision-based Wayfinding Aid for Blind Persons to Access Unfamiliar Indoor Environments

PubMed Central

Tian, YingLi; Yang, Xiaodong; Yi, Chucai; Arditi, Aries

2012-01-01

Independent travel is a well known challenge for blind and visually impaired persons. In this paper, we propose a proof-of-concept computer vision-based wayfinding aid for blind people to independently access unfamiliar indoor environments. In order to find different rooms (e.g. an office, a lab, or a bathroom) and other building amenities (e.g. an exit or an elevator), we incorporate object detection with text recognition. First we develop a robust and efficient algorithm to detect doors, elevators, and cabinets based on their general geometric shape, by combining edges and corners. The algorithm is general enough to handle large intra-class variations of objects with different appearances among different indoor environments, as well as small inter-class differences between different objects such as doors and door-like cabinets. Next, in order to distinguish intra-class objects (e.g. an office door from a bathroom door), we extract and recognize text information associated with the detected objects. For text recognition, we first extract text regions from signs with multiple colors and possibly complex backgrounds, and then apply character localization and topological analysis to filter out background interference. The extracted text is recognized using off-the-shelf optical character recognition (OCR) software products. The object type, orientation, location, and text information are presented to the blind traveler as speech. PMID:23630409
Classification of UXO by Principal Dipole Polarizability

NASA Astrophysics Data System (ADS)

Kappler, K. N.

2010-12-01

Data acquired by multiple-transmitter, multiple-receiver time-domain electromagnetic devices show great potential for determining the geometric and compositional information relating to near-surface conductive targets. An analysis of data from one such system, the Berkeley Unexploded-ordnance Discriminator (BUD) system, is presented here. BUD data are succinctly reduced by processing the multi-static data matrices to obtain magnetic dipole polarizability matrices for data from each time gate. When viewed over all time gates, the projections of the data onto the principal polar axes yield so-called polarizability curves. These curves are especially well suited to discriminating between subsurface conductivity anomalies which correspond to objects of rotational symmetry and irregularly shaped objects. The curves have previously been successfully employed as library elements in a pattern recognition scheme aimed at discriminating harmless scrap metal from dangerous intact unexploded ordnance. However, previous polarizability-curve matching methods have only been applied at field sites which are known a priori to be contaminated by a single type of ordnance, and furthermore, the particular ordnance present in the subsurface was known to be large. Thus signal amplitude was a key element in the discrimination process. The work presented here applies feature-based pattern classification techniques to BUD field data where more than 20 categories of object are present. Data soundings from a calibration grid at the Yuma, AZ proving ground are used in a cross-validation study to calibrate the pattern recognition method. The resultant method is then applied to a Blind Test Grid. Results indicate that when lone UXO are present and SNR is reasonably high, Polarizability Curve Matching successfully discriminates UXO from scrap metal when a broad range of objects is present.
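The reduction from a polarizability matrix to principal polarizabilities can be illustrated with a small sketch. It assumes each time gate yields a real symmetric 3x3 matrix and takes its eigenvalues as the principal values; the abstract does not give the BUD processing details, so the function name `principal_polarizabilities` and the closed-form eigenvalue routine used here are illustrative choices only.

```python
import math

def principal_polarizabilities(m):
    """Eigenvalues of a real symmetric 3x3 matrix, largest first.

    Uses the standard trigonometric closed form for symmetric 3x3 matrices.
    """
    p1 = m[0][1] ** 2 + m[0][2] ** 2 + m[1][2] ** 2
    if p1 == 0:
        # Already diagonal: the diagonal entries are the eigenvalues.
        return sorted((m[0][0], m[1][1], m[2][2]), reverse=True)
    q = (m[0][0] + m[1][1] + m[2][2]) / 3
    p2 = sum((m[i][i] - q) ** 2 for i in range(3)) + 2 * p1
    p = math.sqrt(p2 / 6)
    # B = (M - q*I) / p has eigenvalues 2*cos(angle), so det(B)/2 = cos(3*phi).
    b = [[(m[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2))  # clamp against rounding error
    phi = math.acos(r) / 3
    e1 = q + 2 * p * math.cos(phi)
    e3 = q + 2 * p * math.cos(phi + 2 * math.pi / 3)
    e2 = 3 * q - e1 - e3  # the trace equals the sum of the eigenvalues
    return [e1, e2, e3]
```

Applying the function to the matrix of every time gate and collecting the three values per gate would give three curves over time. Under this reading, an object with rotational symmetry would show two curves that nearly coincide.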

Development of a sonar-based object recognition system

NASA Astrophysics Data System (ADS)

Ecemis, Mustafa Ihsan

2001-02-01

Sonars are used extensively in mobile robotics for obstacle detection, ranging and avoidance. However, these range-finding applications do not exploit the full range of information carried in sonar echoes. In addition, mobile robots need robust object recognition systems. Therefore, a simple and robust object recognition system using ultrasonic sensors may have a wide range of applications in robotics. This dissertation develops and analyzes an object recognition system that uses ultrasonic sensors of the type commonly found on mobile robots. Three principal experiments are used to test the sonar recognition system: object recognition at various distances, object recognition during unconstrained motion, and softness discrimination. The hardware setup, consisting of an inexpensive Polaroid sonar and a data acquisition board, is described first. The software for ultrasound signal generation, echo detection, data collection, and data processing is then presented. Next, the dissertation describes two methods to extract information from the echoes, one in the frequency domain and the other in the time domain. The system uses the fuzzy ARTMAP neural network to recognize objects on the basis of the information content of their echoes. In order to demonstrate that the performance of the system does not depend on the specific classification method being used, the K-Nearest Neighbors (KNN) algorithm is also implemented. KNN yields a test accuracy similar to fuzzy ARTMAP in all experiments. Finally, the dissertation describes a method for extracting features from the envelope function in order to reduce the dimension of the input vector used by the classifiers. Decreasing the size of the input vectors reduces the memory requirements of the system and makes it run faster. It is shown that this method does not affect the performance of the system dramatically and is more appropriate for some tasks.
The results of these experiments demonstrate that sonar can be used to develop a low-cost, low-computation system for real-time object recognition tasks on mobile robots. This system differs from all previous approaches in that it is relatively simple, robust, fast, and inexpensive.
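The K-Nearest Neighbors classifier mentioned above can be sketched in a few lines. This is a generic KNN with Euclidean distance and majority voting, not the dissertation's code; the feature vectors and labels in the usage note are hypothetical stand-ins for echo features.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_vector, label) pairs.
    """
    # Sort training samples by Euclidean distance to the query vector.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

For example, with hypothetical two-dimensional features, `knn_predict([((0.1, 0.2), "soft"), ((0.0, 0.1), "soft"), ((0.9, 1.0), "hard"), ((1.0, 0.8), "hard")], (0.05, 0.15), k=3)` votes among the three closest samples. Shorter feature vectors make every distance computation cheaper, which is consistent with the speed and memory benefit of dimension reduction that the abstract reports.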
Viewpoint Integration for Hand-Based Recognition of Social Interactions from a First-Person View.

PubMed

Bambach, Sven; Crandall, David J; Yu, Chen

2015-11-01

Wearable devices are becoming part of everyday life, from first-person cameras (GoPro, Google Glass), to smart watches (Apple Watch), to activity trackers (FitBit). These devices are often equipped with advanced sensors that gather data about the wearer and the environment. These sensors enable new ways of recognizing and analyzing the wearer's everyday personal activities, which could be used for intelligent human-computer interfaces and other applications. We explore one possible application by investigating how egocentric video data collected from head-mounted cameras can be used to recognize social activities between two interacting partners (e.g. playing chess or cards). In particular, we demonstrate that just the positions and poses of hands within the first-person view are highly informative for activity recognition, and present a computer vision approach that detects hands to automatically estimate activities. While hand pose detection is imperfect, we show that combining evidence across first-person views from the two social partners significantly improves activity recognition accuracy. This result highlights how integrating weak but complementary sources of evidence from social partners engaged in the same task can help to recognize the nature of their interaction.
Viewpoint Integration for Hand-Based Recognition of Social Interactions from a First-Person View

PubMed Central

Bambach, Sven; Crandall, David J.; Yu, Chen

2016-01-01

Wearable devices are becoming part of everyday life, from first-person cameras (GoPro, Google Glass), to smart watches (Apple Watch), to activity trackers (FitBit). These devices are often equipped with advanced sensors that gather data about the wearer and the environment. These sensors enable new ways of recognizing and analyzing the wearer’s everyday personal activities, which could be used for intelligent human-computer interfaces and other applications. We explore one possible application by investigating how egocentric video data collected from head-mounted cameras can be used to recognize social activities between two interacting partners (e.g. playing chess or cards). In particular, we demonstrate that just the positions and poses of hands within the first-person view are highly informative for activity recognition, and present a computer vision approach that detects hands to automatically estimate activities. While hand pose detection is imperfect, we show that combining evidence across first-person views from the two social partners significantly improves activity recognition accuracy. This result highlights how integrating weak but complementary sources of evidence from social partners engaged in the same task can help to recognize the nature of their interaction. PMID:28966999
Hierarchical Context Modeling for Video Event Recognition.

PubMed

Wang, Xiaoyang; Ji, Qiang

2016-10-11

Current video event recognition research remains largely target-centered. For real-world surveillance videos, target-centered event recognition faces great challenges due to large intra-class target variation, limited image resolution, and poor detection and tracking results. To mitigate these challenges, we introduced a context-augmented video event recognition approach. Specifically, we explicitly capture different types of contexts from three levels including image level, semantic level, and prior level. At the image level, we introduce two types of contextual features including the appearance context features and interaction context features to capture the appearance of context objects and their interactions with the target objects. At the semantic level, we propose a deep model based on a deep Boltzmann machine to learn event object representations and their interactions. At the prior level, we utilize two types of prior-level contexts including scene priming and dynamic cueing. Finally, we introduce a hierarchical context model that systematically integrates the contextual information at different levels. Through the hierarchical context model, contexts at different levels jointly contribute to the event recognition. We evaluate the hierarchical context model for event recognition on benchmark surveillance video datasets. Results show that incorporating contexts in each level can improve event recognition performance, and jointly integrating three levels of contexts through our hierarchical model achieves the best performance.
Size-Sensitive Perceptual Representations Underlie Visual and Haptic Object Recognition

PubMed Central

Craddock, Matt; Lawson, Rebecca

2009-01-01

A variety of similarities between visual and haptic object recognition suggests that the two modalities may share common representations. However, it is unclear whether such common representations preserve low-level perceptual features or whether transfer between vision and haptics is mediated by high-level, abstract representations. Two experiments used a sequential shape-matching task to examine the effects of size changes on unimodal and crossmodal visual and haptic object recognition. Participants felt or saw 3D plastic models of familiar objects. The two objects presented on a trial were either the same size or different sizes and were the same shape or different but similar shapes. Participants were told to ignore size changes and to match on shape alone. In Experiment 1, size changes on same-shape trials impaired performance similarly for both visual-to-visual and haptic-to-haptic shape matching. In Experiment 2, size changes impaired performance on both visual-to-haptic and haptic-to-visual shape matching and there was no interaction between the cost of size changes and direction of transfer. Together the unimodal and crossmodal matching results suggest that the same, size-specific perceptual representations underlie both visual and haptic object recognition, and indicate that crossmodal memory for objects must be at least partly based on common perceptual representations. PMID:19956685
Learning the moves: the effect of familiarity and facial motion on person recognition across large changes in viewing format.

PubMed

Roark, Dana A; O'Toole, Alice J; Abdi, Hervé; Barrett, Susan E

2006-01-01

Familiarity with a face or person can support recognition in tasks that require generalization to novel viewing contexts. Using naturalistic viewing conditions requiring recognition of people from face or whole body gait stimuli, we investigated the effects of familiarity, facial motion, and direction of learning/test transfer on person recognition. Participants were familiarized with previously unknown people from gait videos and were tested on faces (experiment 1a) or were familiarized with faces and were tested with gait videos (experiment 1b). Recognition was more accurate when learning from the face and testing with the gait videos, than when learning from the gait videos and testing with the face. The repetition of a single stimulus, either the face or gait, produced strong recognition gains across transfer conditions. Also, the presentation of moving faces resulted in better performance than that of static faces. In experiment 2, we investigated the role of facial motion further by testing recognition with static profile images. Motion provided no benefit for recognition, indicating that structure-from-motion is an unlikely source of the motion advantage found in the first set of experiments.
Image registration under translation and rotation in two-dimensional planes using Fourier slice theorem.

PubMed

Pohit, M; Sharma, J

2015-05-10

Image recognition in the presence of both rotation and translation is a longstanding problem in correlation pattern recognition. The log-polar transform offers a solution to this problem, but at the cost of losing the vital phase information in the image. The main objective of this paper is to develop an algorithm based on the Fourier slice theorem for measuring the simultaneous rotation and translation of an object in a 2D plane. The algorithm is applicable to arbitrary object shifts over a full 180° rotation.
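The Fourier slice theorem that the algorithm builds on can be checked numerically. The sketch below covers only the discrete, axis-aligned special case: the 1D DFT of the projection of an image onto the x axis equals the v = 0 row of its 2D DFT. It is not the registration algorithm from the paper, and the helper names `dft` and `dft2` are illustrative.

```python
import cmath

def dft(seq):
    """Naive 1D discrete Fourier transform of a sequence."""
    n = len(seq)
    return [sum(seq[k] * cmath.exp(-2j * cmath.pi * u * k / n)
                for k in range(n))
            for u in range(n)]

def dft2(image):
    """2D DFT by transforming rows, then columns; result is indexed [v][u]."""
    rows = [dft(r) for r in image]              # transform along x
    cols = [dft(list(c)) for c in zip(*rows)]   # transform along y, cols[u][v]
    return [list(r) for r in zip(*cols)]        # transpose back to [v][u]

image = [[1, 2, 0, 3],
         [0, 1, 4, 1],
         [2, 0, 1, 5]]

# Projection onto the x axis: sum each column over y.
projection = [sum(col) for col in zip(*image)]

# Central slice of the 2D spectrum along the u axis (v = 0).
central_slice = dft2(image)[0]
slice_error = max(abs(a - b) for a, b in zip(dft(projection), central_slice))
```

Here `slice_error` is at the level of floating-point rounding. For other projection angles the same relation holds along the correspondingly rotated line through the origin of the spectrum, which is the property a slice-based method can use to relate rotations of the object to its projections.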
Rehabilitation regimes based upon psychophysical studies of prosthetic vision

NASA Astrophysics Data System (ADS)

Chen, S. C.; Suaning, G. J.; Morley, J. W.; Lovell, N. H.

2009-06-01

Human trials of prototype visual prostheses have successfully elicited visual percepts (phosphenes) in the visual field of implant recipients blinded through retinitis pigmentosa and age-related macular degeneration. Researchers are progressing rapidly towards a device that utilizes individual phosphenes as the elementary building blocks to compose a visual scene. This form of prosthetic vision is expected, in the near term, to have low resolution, large inter-phosphene gaps, distorted spatial distribution of phosphenes, restricted field of view, an eccentrically located phosphene field and limited number of expressible luminance levels. In order to fully realize the potential of these devices, there needs to be a training and rehabilitation program which aims to assist the prosthesis recipients to understand what they are seeing, and also to adapt their viewing habits to optimize the performance of the device. Based on the literature of psychophysical studies in simulated and real prosthetic vision, this paper proposes a comprehensive, theoretical training regime for a prosthesis recipient: visual search, visual acuity, reading, face/object recognition, hand-eye coordination and navigation. The aim of these tasks is to train the recipients to conduct visual scanning, eccentric viewing and reading, discerning low-contrast visual information, and coordinating bodily actions for visual-guided tasks under prosthetic vision. These skills have been identified as playing an important role in making prosthetic vision functional for the daily activities of their recipients.
Neuro-inspired smart image sensor: analog Hmax implementation

NASA Astrophysics Data System (ADS)

Paindavoine, Michel; Dubois, Jérôme; Musa, Purnawarman

2015-03-01

A neuro-inspired vision approach, based on models from biology, makes it possible to reduce computational complexity. One of these models, the Hmax model, shows that the recognition of an object in the visual cortex mobilizes the V1, V2 and V4 areas. From the computational point of view, V1 corresponds to the area of the directional filters (for example Sobel filters, Gabor filters or wavelet filters). This information is then processed in area V2 in order to obtain local maxima. This new information is then sent to an artificial neural network. This neural processing module corresponds to area V4 of the visual cortex and is intended to categorize objects present in the scene. In order to realize autonomous vision systems (consumption of a few milliwatts) with such processing built in, we designed and fabricated, in 0.35 μm CMOS technology, prototypes of two image sensors that perform the V1 and V2 processing of the Hmax model.
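The first two stages described above, a directional filter followed by local maxima, can be sketched in software. This is a minimal illustration that assumes a single horizontal Sobel filter and non-overlapping 2x2 pooling; the sensors in the paper implement these stages in analog hardware, and the helper names are hypothetical.

```python
# Horizontal-gradient Sobel kernel, one of the directional filters named above.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

def filter_valid(img, kernel):
    """V1-like stage: correlate img with a 3x3 kernel where the kernel fits."""
    h, w = len(img), len(img[0])
    return [[sum(kernel[a][b] * img[i + a][j + b]
                 for a in range(3) for b in range(3))
             for j in range(w - 2)]
            for i in range(h - 2)]

def local_max(resp, size=2):
    """V2-like stage: maximum absolute response in non-overlapping windows."""
    h, w = len(resp), len(resp[0])
    return [[max(abs(resp[i + a][j + b])
                 for a in range(size) for b in range(size))
             for j in range(0, w - size + 1, size)]
            for i in range(0, h - size + 1, size)]

# A 6x6 test image with a vertical edge between columns 2 and 3.
edge_image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
v1 = filter_valid(edge_image, SOBEL_X)
v2 = local_max(v1)
```

The pooled map `v2` is a quarter of the size of `v1` while still signalling the edge, which is the kind of data reduction that makes the later classification stage cheaper.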
Eye movements during object recognition in visual agnosia.

PubMed

Charles Leek, E; Patterson, Candy; Paul, Matthew A; Rafal, Robert; Cristino, Filipe

2012-07-01

This paper reports the first ever detailed study about eye movement patterns during single object recognition in visual agnosia. Eye movements were recorded in a patient with an integrative agnosic deficit during two recognition tasks: common object naming and novel object recognition memory. The patient showed normal directional biases in saccades and fixation dwell times in both tasks and was as likely as controls to fixate within object bounding contour regardless of recognition accuracy. In contrast, following initial saccades of similar amplitude to controls, the patient showed a bias for short saccades. In object naming, but not in recognition memory, the similarity of the spatial distributions of patient and control fixations was modulated by recognition accuracy. The study provides new evidence about how eye movements can be used to elucidate the functional impairments underlying object recognition deficits. We argue that the results reflect a breakdown in normal functional processes involved in the integration of shape information across object structure during the visual perception of shape. Copyright © 2012 Elsevier Ltd. All rights reserved.
Multifunctional microcontrollable interface module

NASA Astrophysics Data System (ADS)

Spitzer, Mark B.; Zavracky, Paul M.; Rensing, Noa M.; Crawford, J.; Hockman, Angela H.; Aquilino, P. D.; Girolamo, Henry J.

2001-08-01

This paper reports the development of a complete eyeglass-mounted computer interface system including display, camera and audio subsystems. The display system provides an SVGA image with a 20 degree horizontal field of view. The camera system has been optimized for face recognition and provides a 19 degree horizontal field of view. A microphone and built-in pre-amp optimized for voice recognition and a speaker on an articulated arm are included for audio. An important feature of the system is a high degree of adjustability and reconfigurability. The system has been developed for testing by the Military Police, in a complete system comprising the eyeglass-mounted interface, a wearable computer, and an RF link. Details of the design, construction, and performance of the eyeglass-based system are discussed.
Automatic Target Recognition Based on Cross-Plot

PubMed Central

Wong, Kelvin Kian Loong; Abbott, Derek

2011-01-01

Automatic target recognition that relies on rapid feature extraction of real-time targets from photo-realistic imaging will enable efficient identification of target patterns. To achieve this objective, cross-plots of binary patterns are explored as potential signatures for the observed target, capturing the crucial spatial features at high speed using minimal computational resources. Target recognition was implemented based on the proposed pattern recognition concept and tested rigorously for its precision and recall performance. We conclude that cross-plotting is able to produce a digital fingerprint of a target that correlates efficiently and effectively with the signatures of patterns whose identities are stored in a target repository. PMID:21980508
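Precision and recall, the two measures used to test the recognizer above, can be computed as in this minimal sketch. The binary decision lists in the usage note are made-up illustrations, not results from the paper.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for paired binary decisions.

    `predicted` and `actual` are equal-length sequences of truthy/falsy
    values, where truthy means "target present".
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # true positives
    fp = sum(1 for p, a in pairs if p and not a)    # false alarms
    fn = sum(1 for p, a in pairs if not p and a)    # missed targets
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

For instance, `precision_recall([1, 1, 0, 1, 0], [1, 0, 0, 1, 1])` counts two true positives, one false alarm and one miss, so both measures come out at two thirds.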
The role of color information on object recognition: a review and meta-analysis.

PubMed

Bramão, Inês; Reis, Alexandra; Petersson, Karl Magnus; Faísca, Luís

2011-09-01

In this study, we systematically review the scientific literature on the effect of color on object recognition. Thirty-five independent experiments, comprising 1535 participants, were included in a meta-analysis. We found a moderate effect of color on object recognition (d=0.28). Specific effects of moderator variables were analyzed and we found that color diagnosticity is the factor with the greatest moderator effect on the influence of color in object recognition; studies using color diagnostic objects showed a significant color effect (d=0.43), whereas a marginal color effect was found in studies that used non-color diagnostic objects (d=0.18). The present study did not permit the drawing of specific conclusions about the moderator effect of the object recognition task; while the meta-analytic review showed that color information improves object recognition mainly in studies using naming tasks (d=0.36), the literature review revealed a large body of evidence showing positive effects of color information on object recognition in studies using a large variety of visual recognition tasks. We also found that color is important for the ability to recognize artifacts and natural objects, to recognize objects presented as types (line-drawings) or as tokens (photographs), and to recognize objects that are presented without surface details, such as texture or shadow. Taken together, the results of the meta-analysis strongly support the contention that color plays a role in object recognition. This suggests that the role of color should be taken into account in models of visual object recognition. Copyright © 2011 Elsevier B.V. All rights reserved.
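The effect sizes quoted above are reported as d values. As a reminder of what such a number measures, the sketch below computes Cohen's d with a pooled standard deviation for two independent groups. The abstract does not state which estimator the meta-analysis used, so this standard textbook form is an assumption, and the scores in the usage note are invented.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardised mean difference between two independent samples."""
    na, nb = len(group_a), len(group_b)
    # Pooled standard deviation from the two sample variances.
    pooled_sd = math.sqrt(((na - 1) * variance(group_a)
                           + (nb - 1) * variance(group_b)) / (na + nb - 2))
    return (mean(group_a) - mean(group_b)) / pooled_sd
```

With invented recognition scores `[2, 4, 6]` for colored objects and `[1, 3, 5]` for grayscale ones, the means differ by 1 and the pooled standard deviation is 2, giving d = 0.5.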
Automated Recognition of Vegetation and Water Bodies on the Territory of Megacities in Satellite Images of Visible and IR Bands

NASA Astrophysics Data System (ADS)

Mozgovoy, Dmitry K.; Hnatushenko, Volodymyr V.; Vasyliev, Volodymyr V.

2018-04-01

Vegetation and water bodies are fundamental elements of urban ecosystems, and water mapping is critical for urban and landscape planning and management. A methodology for the automated recognition of vegetation and water bodies on the territory of megacities in satellite images of sub-meter spatial resolution in the visible and IR bands is proposed. By processing multispectral images from the satellite SuperView-1A, vector layers of recognized plant and water objects were obtained. Analysis of the results of image processing showed a sufficiently high accuracy of the delineation of the boundaries of recognized objects and a good separation of classes. The developed methodology provides a significant increase in the efficiency and reliability of updating maps of large cities while reducing financial costs. Due to the high degree of automation, the proposed methodology can be implemented in the form of a geo-information web service functioning in the interests of a wide range of public services and commercial institutions.
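The abstract does not say how vegetation and water are separated. One common approach for visible and near-infrared bands uses normalized difference indices, and the sketch below shows that approach as an assumption, not as the authors' methodology. The thresholds, the function name `classify_pixel`, and the reflectance values in the usage note are hypothetical.

```python
def classify_pixel(green, red, nir, veg_threshold=0.3, water_threshold=0.0):
    """Label one pixel from its green, red and near-infrared reflectance.

    NDVI = (NIR - Red) / (NIR + Red) is high over vegetation.
    NDWI = (Green - NIR) / (Green + NIR) is positive over open water.
    Both thresholds are illustrative and would need tuning per sensor.
    """
    ndvi = (nir - red) / (nir + red) if nir + red else 0.0
    ndwi = (green - nir) / (green + nir) if green + nir else 0.0
    if ndwi > water_threshold:
        return "water"
    if ndvi > veg_threshold:
        return "vegetation"
    return "other"
```

For example, a pixel with reflectances green 0.1, red 0.05 and NIR 0.5 has a high NDVI and is labelled vegetation, while green 0.1, red 0.05 and NIR 0.02 gives a positive NDWI and is labelled water. Turning such a per-pixel mask into vector layers would need a further polygonization step that is not shown here.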
Analysis and Recognition of Curve Type as The Basis of Object Recognition in Image

NASA Astrophysics Data System (ADS)

Nugraha, Nurma; Madenda, Sarifuddin; Indarti, Dina; Dewi Agushinta, R.; Ernastuti

2016-06-01

An object in an image, when analyzed further, shows characteristics that distinguish it from other objects in the image. The characteristics used for object recognition include color, shape, pattern, texture and spatial information, all of which can represent objects in a digital image. A method was recently developed for extracting object features based on curve analysis (simple curves) and a chain-code search of the object. This study develops an algorithm for analyzing and recognizing curve types as the basis for object recognition in images, and proposes adding complex-curve characteristics, with a maximum of four branches, to the object recognition process. A complex curve is defined as a curve that has a point of intersection. Tested on several edge-detected images, the algorithm analyzed and recognized complex curve shapes well.
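The chain code mentioned above can be illustrated with a minimal sketch of the 8-direction Freeman chain code. It assumes the curve is already available as an ordered list of 8-connected pixel coordinates with y increasing downward. The abstract does not specify the authors' numbering or their branch handling, so the convention below (0 = east, counted counter-clockwise) is an assumption.

```python
# Map each unit step (dx, dy) to a Freeman direction code.
# Image coordinates: x grows to the right, y grows downward.
FREEMAN = {
    (1, 0): 0,    # east
    (1, -1): 1,   # north-east
    (0, -1): 2,   # north
    (-1, -1): 3,  # north-west
    (-1, 0): 4,   # west
    (-1, 1): 5,   # south-west
    (0, 1): 6,    # south
    (1, 1): 7,    # south-east
}

def chain_code(points):
    """Encode an ordered path of 8-connected pixels as Freeman codes.

    Raises KeyError if two consecutive points are not neighbours.
    """
    return [FREEMAN[(x1 - x0, y1 - y0)]
            for (x0, y0), (x1, y1) in zip(points, points[1:])]
```

Tracing a unit square from (0, 0) through (1, 0), (1, 1), (0, 1) and back to (0, 0) produces the codes 0, 6, 4, 2. In this representation an intersection point, which defines a complex curve in the abstract, would be a pixel from which more than one branch must be encoded.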
3D Viewing: Odd Perception - Illusion? reality? or both?

NASA Astrophysics Data System (ADS)

Kisimoto, K.; Iizasa, K.

2008-12-01

We live in three-dimensional space, don't we? It could be at least four dimensions, but that is another story. Either way, our perceptual capability of 3D viewing is constrained by our 2D perception (our intrinsic tools of perception). I carried out a few visual experiments using topographic data to show our intrinsic (or biological) disability (or shortcoming) in 3D recognition of our world. Results of the experiments suggest: (1) a 3D surface model displayed on a 2D computer screen (or paper) always has two interpretations of the 3D surface geometry; if we choose one of the interpretations (in other words, if we are hooked by one perception of the two), we maintain that perception even if the 3D model changes its viewing perspective over time on the screen; (2) more interesting is that a real 3D solid object (e.g., made of clay) also gives the above-mentioned two interpretations of the geometry of the object if we observe the object with one eye. The most famous example of this viewing illusion comes from the magician Jerry Andrus, who died in 2007 and who made a super-cool paper-crafted dragon which causes a visual illusion for a one-eyed viewer. I, by the experiments, confirmed this phenomenon in another perceptually persuasive (deceptive?) way. My conclusion is that this illusion is intrinsic, i.e. reality for humans, because, even if we live in 3D space, our perceptual tool (the eyes) is composed of 2D sensors whose information is reconstructed or processed into 3D by our experience-based brain. So, (3) when we observe the 3D surface model on the computer screen, we are always one eye short even if we use both eyes. One last suggestion from my experiments is that recent highly sophisticated 3D models might include too much information for human perception to handle properly, i.e. we might not be understanding the 3D world (geospace) at all, just illusioned.
Cross-View Action Recognition via Transferable Dictionary Learning.

PubMed

Zheng, Jingjing; Jiang, Zhuolin; Chellappa, Rama

2016-05-01

Discriminative appearance features are effective for recognizing actions in a fixed view, but may not generalize well to a new view. In this paper, we present two effective approaches to learn dictionaries for robust action recognition across views. In the first approach, we learn a set of view-specific dictionaries where each dictionary corresponds to one camera view. These dictionaries are learned simultaneously from the sets of correspondence videos taken at different views with the aim of encouraging each video in the set to have the same sparse representation. In the second approach, we additionally learn a common dictionary shared by different views to model view-shared features. This approach represents the videos in each view using a view-specific dictionary and the common dictionary. More importantly, it encourages the set of videos taken from the different views of the same action to have the similar sparse representations. The learned common dictionary not only has the capability to represent actions from unseen views, but also makes our approach effective in a semi-supervised setting where no correspondence videos exist and only a few labeled videos exist in the target view. The extensive experiments using three public datasets demonstrate that the proposed approach outperforms recently developed approaches for cross-view action recognition.
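
One common way to write down the idea of "the same sparse representation" across views is the joint objective below. It is a sketch under assumed notation, not necessarily the authors' exact formulation: $Y_s$ and $Y_t$ are feature matrices whose $i$-th columns describe the same action seen from two views, $D_s$ and $D_t$ are the view-specific dictionaries, $X$ holds the shared sparse codes with columns $x_i$, and $T$ is a sparsity limit.

```latex
\min_{D_s,\,D_t,\,X}\;
  \lVert Y_s - D_s X \rVert_F^2 + \lVert Y_t - D_t X \rVert_F^2
\quad \text{s.t.} \quad \lVert x_i \rVert_0 \le T \;\; \text{for all } i
```

Because both reconstruction terms share $X$, each pair of corresponding videos is forced onto one code. Stacking $Y_s$ over $Y_t$ and $D_s$ over $D_t$ turns this into an ordinary dictionary learning problem.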
Infrared detection, recognition and identification of handheld objects

NASA Astrophysics Data System (ADS)

Adomeit, Uwe

2012-10-01

A main criterion for comparison and selection of thermal imagers for military applications is their nominal range performance. This nominal range performance is calculated for a defined task and standardized target and environmental conditions. The only standardization available to date is STANAG 4347. The target defined there is based on a main battle tank in front view. Because of modified military requirements, this target is no longer up-to-date. Today, different topics are of interest, especially differentiation between friend and foe and identification of humans. There is no direct way to differentiate between friend and foe in asymmetric scenarios, but one clue can be that someone is carrying a weapon. This clue can be transformed into the observer tasks detection (a person is carrying or is not carrying an object), recognition (the object is a long / medium / short range weapon or civil equipment) and identification (the object can be named, e.g. AK-47, M-4, G36, RPG7, Axe, Shovel etc.). These tasks can be assessed experimentally and from the results of such an assessment, a standard target for handheld objects may be derived. For a first assessment, a human carrying 13 different handheld objects in front of his chest was recorded at four different ranges with an IR-dual-band camera. From the recorded data, a perception experiment was prepared. It was conducted with 17 observers in a 13-alternative forced choice, unlimited observation time arrangement. The results of the test together with Minimum Temperature Difference Perceived measurements of the camera and temperature difference and critical dimension derived from the recorded imagery allowed defining a first standard target according to the above tasks. This standard target consists of 2.5 / 3.5 / 5 DRI line pairs on target, 0.24 m critical size and 1 K temperature difference. The values are preliminary and have to be refined in the future. Different aspect angles, different carriage and movement are also necessary.
A Low-Cost EEG System-Based Hybrid Brain-Computer Interface for Humanoid Robot Navigation and Recognition

PubMed Central

Choi, Bongjae; Jo, Sungho

2013-01-01

This paper describes a hybrid brain-computer interface (BCI) technique that combines the P300 potential, the steady state visually evoked potential (SSVEP), and event related de-synchronization (ERD) to solve a complicated multi-task problem consisting of humanoid robot navigation and control along with object recognition using a low-cost BCI system. Our approach enables subjects to control the navigation and exploration of a humanoid robot and recognize a desired object among candidates. This study aims to demonstrate the possibility of a hybrid BCI based on a low-cost system for a realistic and complex task. It also shows that the use of a simple image processing technique, combined with BCI, can further aid in making these complex tasks simpler. An experimental scenario is proposed in which a subject remotely controls a humanoid robot in a properly sized maze. The subject sees what the surrogate robot sees through visual feedback and can navigate the surrogate robot. While navigating, the robot encounters objects located in the maze. It then recognizes if the encountered object is of interest to the subject. The subject communicates with the robot through SSVEP and ERD-based BCIs to navigate and explore with the robot, and P300-based BCI to allow the surrogate robot to recognize their favorites. Using several evaluation metrics, the performances of five subjects navigating the robot were quite comparable to manual keyboard control. During object recognition mode, favorite objects were successfully selected from two to four choices. Subjects conducted humanoid navigation and recognition tasks as if they embodied the robot. Analysis of the data supports the potential usefulness of the proposed hybrid BCI system for extended applications. This work presents an important implication for future work: a hybridization of simple BCI protocols provides extended controllability to carry out complicated tasks even with a low-cost system. PMID:24023953
A low-cost EEG system-based hybrid brain-computer interface for humanoid robot navigation and recognition.

PubMed

Choi, Bongjae; Jo, Sungho

2013-01-01

This paper describes a hybrid brain-computer interface (BCI) technique that combines the P300 potential, the steady state visually evoked potential (SSVEP), and event related de-synchronization (ERD) to solve a complicated multi-task problem consisting of humanoid robot navigation and control along with object recognition using a low-cost BCI system. Our approach enables subjects to control the navigation and exploration of a humanoid robot and recognize a desired object among candidates. This study aims to demonstrate the possibility of a hybrid BCI based on a low-cost system for a realistic and complex task. It also shows that the use of a simple image processing technique, combined with BCI, can further aid in making these complex tasks simpler. An experimental scenario is proposed in which a subject remotely controls a humanoid robot in a properly sized maze. The subject sees what the surrogate robot sees through visual feedback and can navigate the surrogate robot. While navigating, the robot encounters objects located in the maze. It then recognizes if the encountered object is of interest to the subject. The subject communicates with the robot through SSVEP and ERD-based BCIs to navigate and explore with the robot, and P300-based BCI to allow the surrogate robot to recognize their favorites. Using several evaluation metrics, the performances of five subjects navigating the robot were quite comparable to manual keyboard control. During object recognition mode, favorite objects were successfully selected from two to four choices. Subjects conducted humanoid navigation and recognition tasks as if they embodied the robot. Analysis of the data supports the potential usefulness of the proposed hybrid BCI system for extended applications. This work presents an important implication for future work: a hybridization of simple BCI protocols provides extended controllability to carry out complicated tasks even with a low-cost system.

Traffic Behavior Recognition Using the Pachinko Allocation Model

PubMed Central

Huynh-The, Thien; Banos, Oresti; Le, Ba-Vui; Bui, Dinh-Mao; Yoon, Yongik; Lee, Sungyoung

2015-01-01

CCTV-based behavior recognition systems have gained considerable attention in recent years in the transportation surveillance domain for identifying unusual patterns, such as traffic jams, accidents, dangerous driving and other abnormal behaviors. In this paper, a novel approach for traffic behavior modeling is presented for video-based road surveillance. The proposed system combines the pachinko allocation model (PAM) and support vector machine (SVM) for a hierarchical representation and identification of traffic behavior. A background subtraction technique using Gaussian mixture models (GMMs) and an object tracking mechanism based on Kalman filters are utilized to firstly construct the object trajectories. Then, the sparse features comprising the locations and directions of the moving objects are modeled by PAM into traffic topics, namely activities and behaviors. As a key innovation, PAM captures not only the correlation among the activities, but also among the behaviors based on the arbitrary directed acyclic graph (DAG). The SVM classifier is then utilized on top to train and recognize the traffic activity and behavior. The proposed model shows more flexibility and greater expressive power than the commonly-used latent Dirichlet allocation (LDA) approach, leading to a higher recognition accuracy in the behavior classification. PMID:26151213
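
The Kalman-filter tracking step named in this abstract can be sketched for a single image coordinate. This is a minimal illustration assuming a constant-velocity motion model and hand-picked noise variances `q` and `r`. The function name and parameters are hypothetical, and the paper's actual tracker (state layout, noise settings, data association) is not described in the abstract.

```python
def kalman_track(measurements, dt=1.0, q=1e-3, r=1.0):
    """Filter noisy 1-D positions with a constant-velocity Kalman filter.

    The state is (position, velocity); q and r are assumed process and
    measurement noise variances. Returns one (position, velocity) per input.
    """
    x, v = measurements[0], 0.0               # initial state guess
    p00, p01, p10, p11 = 1.0, 0.0, 0.0, 1.0   # initial covariance P
    filtered = []
    for z in measurements:
        # Predict: x <- F x and P <- F P F^T + Q, with F = [[1, dt], [0, 1]].
        x = x + dt * v
        a00 = p00 + dt * (p10 + p01) + dt * dt * p11 + q
        a01 = p01 + dt * p11
        a10 = p10 + dt * p11
        a11 = p11 + q
        # Update with a position-only measurement, H = [1, 0].
        s = a00 + r                            # innovation variance
        k0, k1 = a00 / s, a10 / s              # Kalman gain
        residual = z - x
        x, v = x + k0 * residual, v + k1 * residual
        # P <- (I - K H) P
        p00, p01 = (1 - k0) * a00, (1 - k0) * a01
        p10, p11 = a10 - k1 * a00, a11 - k1 * a01
        filtered.append((x, v))
    return filtered


# A vehicle centroid moving two pixels per frame along one axis.
track = kalman_track([2.0 * t for t in range(30)])
print(track[-1])
```

Running one such filter per coordinate gives smoothed locations and velocities, from which the motion directions used as features could be derived.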
Grayscale image segmentation for real-time traffic sign recognition: the hardware point of view

NASA Astrophysics Data System (ADS)

Cao, Tam P.; Deng, Guang; Elton, Darrell

2009-02-01

In this paper, we study several grayscale-based image segmentation methods for real-time road sign recognition applications on an FPGA hardware platform. The performance of different image segmentation algorithms in different lighting conditions is initially compared using PC simulation. Based on these results and analysis, suitable algorithms are implemented and tested on a real-time FPGA speed sign detection system. Experimental results show that the system using segmented images uses significantly fewer hardware resources on an FPGA while maintaining comparable system performance. The system is capable of processing 60 live video frames per second.
Art critic: Multisignal vision and speech interaction system in a gaming context.

PubMed

Reale, Michael J; Liu, Peng; Yin, Lijun; Canavan, Shaun

2013-12-01

True immersion of a player within a game can only occur when the world simulated looks and behaves as close to reality as possible. This implies that the game must correctly read and understand, among other things, the player's focus, attitude toward the objects/persons in focus, gestures, and speech. In this paper, we proposed a novel system that integrates eye gaze estimation, head pose estimation, facial expression recognition, speech recognition, and text-to-speech components for use in real-time games. Both the eye gaze and head pose components utilize underlying 3-D models, and our novel head pose estimation algorithm uniquely combines scene flow with a generic head model. The facial expression recognition module uses the local binary patterns with three orthogonal planes approach on the 2-D shape index domain rather than the pixel domain, resulting in improved classification. Our system has also been extended to use a pan-tilt-zoom camera driven by the Kinect, allowing us to track a moving player. A test game, Art Critic, is also presented, which not only demonstrates the utility of our system but also provides a template for player/non-player character (NPC) interaction in a gaming context. The player alters his/her view of the 3-D world using head pose, looks at paintings/NPCs using eye gaze, and makes an evaluation based on the player's expression and speech. The NPC artist will respond with facial expression and synthetic speech based on its personality. Both qualitative and quantitative evaluations of the system are performed to illustrate the system's effectiveness.
Projective Structure from Two Uncalibrated Images: Structure from Motion and Recognition

DTIC Science & Technology

1992-09-01

correspondence between points in Maybank 1990). The question, therefore, is why look for both views more of a problem, and hence, may make the...plane is fixed with respect to the 1987, Faugeras, Luong and Maybank 1992). The prob- camera coordinate frame. A rigid camera motion, there- lem of...the second reference Rieger-Lawton 1985, Faugeras and Maybank 1990, Hil- plane (assuming the four object points Pi, j = 1, ...,4, dreth 1991, Faugeras
Separability of Abstract-Category and Specific-Exemplar Visual Object Subsystems: Evidence from fMRI Pattern Analysis

PubMed Central

McMenamin, Brenton W.; Deason, Rebecca G.; Steele, Vaughn R.; Koutstaal, Wilma; Marsolek, Chad J.

2014-01-01

Previous research indicates that dissociable neural subsystems underlie abstract-category (AC) recognition and priming of objects (e.g., cat, piano) and specific-exemplar (SE) recognition and priming of objects (e.g., a calico cat, a different calico cat, a grand piano, etc.). However, the degree of separability between these subsystems is not known, despite the importance of this issue for assessing relevant theories. Visual object representations are widely distributed in visual cortex, thus a multivariate pattern analysis (MVPA) approach to analyzing functional magnetic resonance imaging (fMRI) data may be critical for assessing the separability of different kinds of visual object processing. Here we examined the neural representations of visual object categories and visual object exemplars using multi-voxel pattern analyses of brain activity elicited in visual object processing areas during a repetition-priming task. In the encoding phase, participants viewed visual objects and the printed names of other objects. In the subsequent test phase, participants identified objects that were either same-exemplar primed, different-exemplar primed, word-primed, or unprimed. In visual object processing areas, classifiers were trained to distinguish same-exemplar primed objects from word-primed objects. Then, the abilities of these classifiers to discriminate different-exemplar primed objects and word-primed objects (reflecting AC priming) and to discriminate same-exemplar primed objects and different-exemplar primed objects (reflecting SE priming) were assessed. Results indicated that (a) repetition priming in occipital-temporal regions is organized asymmetrically, such that AC priming is more prevalent in the left hemisphere and SE priming is more prevalent in the right hemisphere, and (b) AC and SE subsystems are weakly modular, not strongly modular or unified. PMID:25528436
Separability of abstract-category and specific-exemplar visual object subsystems: evidence from fMRI pattern analysis.

PubMed

McMenamin, Brenton W; Deason, Rebecca G; Steele, Vaughn R; Koutstaal, Wilma; Marsolek, Chad J

2015-02-01

Previous research indicates that dissociable neural subsystems underlie abstract-category (AC) recognition and priming of objects (e.g., cat, piano) and specific-exemplar (SE) recognition and priming of objects (e.g., a calico cat, a different calico cat, a grand piano, etc.). However, the degree of separability between these subsystems is not known, despite the importance of this issue for assessing relevant theories. Visual object representations are widely distributed in visual cortex, thus a multivariate pattern analysis (MVPA) approach to analyzing functional magnetic resonance imaging (fMRI) data may be critical for assessing the separability of different kinds of visual object processing. Here we examined the neural representations of visual object categories and visual object exemplars using multi-voxel pattern analyses of brain activity elicited in visual object processing areas during a repetition-priming task. In the encoding phase, participants viewed visual objects and the printed names of other objects. In the subsequent test phase, participants identified objects that were either same-exemplar primed, different-exemplar primed, word-primed, or unprimed. In visual object processing areas, classifiers were trained to distinguish same-exemplar primed objects from word-primed objects. Then, the abilities of these classifiers to discriminate different-exemplar primed objects and word-primed objects (reflecting AC priming) and to discriminate same-exemplar primed objects and different-exemplar primed objects (reflecting SE priming) were assessed. Results indicated that (a) repetition priming in occipital-temporal regions is organized asymmetrically, such that AC priming is more prevalent in the left hemisphere and SE priming is more prevalent in the right hemisphere, and (b) AC and SE subsystems are weakly modular, not strongly modular or unified. Copyright © 2014 Elsevier Inc. All rights reserved.
Activity and function recognition for moving and static objects in urban environments from wide-area persistent surveillance inputs

NASA Astrophysics Data System (ADS)

Levchuk, Georgiy; Bobick, Aaron; Jones, Eric

2010-04-01

In this paper, we describe results from experimental analysis of a model designed to recognize activities and functions of moving and static objects from low-resolution wide-area video inputs. Our model is based on representing the activities and functions using three variables: (i) time; (ii) space; and (iii) structures. The activity and function recognition is achieved by imposing lexical, syntactic, and semantic constraints on the lower-level event sequences. In the reported research, we have evaluated the utility and sensitivity of several algorithms derived from natural language processing and pattern recognition domains. We achieved high recognition accuracy for a wide range of activity and function types in the experiments using Electro-Optical (EO) imagery collected by Wide Area Airborne Surveillance (WAAS) platform.
Method and System for Object Recognition Search

NASA Technical Reports Server (NTRS)

Duong, Tuan A. (Inventor); Duong, Vu A. (Inventor); Stubberud, Allen R. (Inventor)

2012-01-01

A method for object recognition using shape and color features of the object to be recognized. An adaptive architecture is used to recognize and adapt the shape and color features for moving objects to enable object recognition.
Object memory effects on figure assignment: conscious object recognition is not necessary or sufficient.

PubMed

Peterson, M A; de Gelder, B; Rapcsak, S Z; Gerhardstein, P C; Bachoud-Lévi, A

2000-01-01

In three experiments we investigated whether conscious object recognition is necessary or sufficient for effects of object memories on figure assignment. In experiment 1, we examined a brain-damaged participant, AD, whose conscious object recognition is severely impaired. AD's responses about figure assignment do reveal effects from memories of object structure, indicating that conscious object recognition is not necessary for these effects, and identifying the figure-ground test employed here as a new implicit test of access to memories of object structure. In experiments 2 and 3, we tested a second brain-damaged participant, WG, for whom conscious object recognition was relatively spared. Nevertheless, effects from memories of object structure on figure assignment were not evident in WG's responses about figure assignment in experiment 2, indicating that conscious object recognition is not sufficient for effects of object memories on figure assignment. WG's performance sheds light on AD's performance, and has implications for the theoretical understanding of object memory effects on figure assignment.
Fast neuromimetic object recognition using FPGA outperforms GPU implementations.

PubMed

Orchard, Garrick; Martin, Jacob G; Vogelstein, R Jacob; Etienne-Cummings, Ralph

2013-08-01

Recognition of objects in still images has traditionally been regarded as a difficult computational problem. Although modern automated methods for visual object recognition have achieved steadily increasing recognition accuracy, even the most advanced computational vision approaches are unable to obtain performance equal to that of humans. This has led to the creation of many biologically inspired models of visual object recognition, among them the hierarchical model and X (HMAX) model. HMAX is traditionally known to achieve high accuracy in visual object recognition tasks at the expense of significant computational complexity. Increasing complexity, in turn, increases computation time, reducing the number of images that can be processed per unit time. In this paper we describe how the computationally intensive and biologically inspired HMAX model for visual object recognition can be modified for implementation on a commercial field-programmable gate array, specifically the Xilinx Virtex 6 ML605 evaluation board with XC6VLX240T FPGA. We show that with minor modifications to the traditional HMAX model we can perform recognition on images of size 128 × 128 pixels at a rate of 190 images per second with a less than 1% loss in recognition accuracy in both binary and multiclass visual object recognition tasks.
Some logical functions of joint control.

PubMed Central

Lowenkron, B

1998-01-01

Constructing a behavioral account of the language-related performances that characterize responding to logical and symbolic relations between stimuli is commonly viewed as a problem for the area of stimulus control. In response to this problem, the notion of joint control is presented here, and its ability to provide an interpretative account of these kinds of performances is explored. Joint control occurs when the currently rehearsed topography of a verbal operant, as evoked by one stimulus, is simultaneously evoked by another stimulus. This event, the onset of joint stimulus control by two stimuli over a common response topography, then sets the occasion for a response appropriate to this special relation between the stimuli. Although the mechanism described is simple, it seems to have broad explanatory properties. In what follows, these properties are applied to provide a behavioral interpretation of two sorts of fundamental, putatively cognitive, performances: those based on logical relations and those based on semantic relations. The first includes responding to generalized conceptual relations such as identity, order, relative size, distance, and orientation. The second includes responding to relations usually ascribed to word meaning. These include relations between words and objects, the specification of objects by words, name-object bidirectionality, and the recognition of objects from their description. Finally, as a preview of some further possibilities, the role of joint control in goal-oriented behavior is considered briefly. PMID:9599452
A bio-inspired system for spatio-temporal recognition in static and video imagery

NASA Astrophysics Data System (ADS)

Khosla, Deepak; Moore, Christopher K.; Chelian, Suhas

2007-04-01

This paper presents a bio-inspired method for spatio-temporal recognition in static and video imagery. It builds upon and extends our previous work on a bio-inspired Visual Attention and object Recognition System (VARS). The VARS approach locates and recognizes objects in a single frame. This work presents two extensions of VARS. The first extension is a Scene Recognition Engine (SCE) that learns to recognize spatial relationships between objects that compose a particular scene category in static imagery. This could be used for recognizing the category of a scene, e.g., office vs. kitchen scene. The second extension is the Event Recognition Engine (ERE) that recognizes spatio-temporal sequences or events in sequences. This extension uses a working memory model to recognize events and behaviors in video imagery by maintaining and recognizing ordered spatio-temporal sequences. The working memory model is based on an ARTSTORE neural network that combines an ART-based neural network with a cascade of sustained temporal order recurrent (STORE) neural networks. A series of Default ARTMAP classifiers ascribes event labels to these sequences. Our preliminary studies have shown that this extension is robust to variations in an object's motion profile. We evaluated the performance of the SCE and ERE on real datasets. The SCE module was tested on a visual scene classification task using the LabelMe dataset. The ERE was tested on real world video footage of vehicles and pedestrians in a street scene. Our system is able to recognize the events in this footage involving vehicles and pedestrians.
BDNF Expression in Perirhinal Cortex is Associated with Exercise-Induced Improvement in Object Recognition Memory

PubMed Central

Hopkins, Michael E.; Bucci, David J.

2010-01-01

Physical exercise induces widespread neurobiological adaptations and improves learning and memory. Most research in this field has focused on hippocampus-based spatial tasks and changes in brain-derived neurotrophic factor (BDNF) as a putative substrate underlying exercise-induced cognitive improvements. Chronic exercise can also be anxiolytic and causes adaptive changes in stress reactivity. The present study employed a perirhinal cortex-dependent object recognition task as well as the elevated plus maze to directly test for interactions between the cognitive and anxiolytic effects of exercise in male Long Evans rats. Hippocampal and perirhinal cortex tissue was collected to determine whether the relationship between BDNF and cognitive performance extends to this non-spatial and non-hippocampal-dependent task. We also examined whether the cognitive improvements persisted once the exercise regimen was terminated. Our data indicate that 4 weeks of voluntary exercise every other day improved object recognition memory. Importantly, BDNF expression in the perirhinal cortex of exercising rats was strongly correlated with object recognition memory. Exercise also decreased anxiety-like behavior; however, there was no evidence to support a relationship between anxiety-like behavior and performance on the novel object recognition task. There was a trend for a negative relationship between anxiety-like behavior and hippocampal BDNF. Neither the cognitive improvements nor the relationship between cognitive function and perirhinal BDNF levels persisted after 2 weeks of inactivity. These are the first data demonstrating that region-specific changes in BDNF protein levels are correlated with exercise-induced improvements in non-spatial memory, mediated by structures outside the hippocampus, and are consistent with the theory that, with regard to object recognition, the anxiolytic and cognitive effects of exercise may be mediated through separable mechanisms. PMID:20601027
Resolving human object recognition in space and time

PubMed Central

Cichy, Radoslaw Martin; Pantazis, Dimitrios; Oliva, Aude

2014-01-01

A comprehensive picture of object processing in the human brain requires combining both spatial and temporal information about brain activity. Here, we acquired human magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI) responses to 92 object images. Multivariate pattern classification applied to MEG revealed the time course of object processing: whereas individual images were discriminated by visual representations early, ordinate and superordinate category levels emerged relatively later. Using representational similarity analysis, we combine human fMRI and MEG to show content-specific correspondence between early MEG responses and primary visual cortex (V1), and later MEG responses and inferior temporal (IT) cortex. We identified transient and persistent neural activities during object processing, with sources in V1 and IT. Finally, human MEG signals were correlated to single-unit responses in monkey IT. Together, our findings provide an integrated space- and time-resolved view of human object categorization during the first few hundred milliseconds of vision. PMID:24464044
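
Representational similarity analysis, as named in this abstract, compares two measurement modalities through their dissimilarity structure rather than through raw signals. The sketch below assumes two common choices that the abstract does not specify: 1 minus Pearson correlation as the dissimilarity between condition patterns, and Spearman rank correlation to compare the resulting matrices. All function names and the toy data are hypothetical.

```python
from itertools import combinations
from statistics import mean


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den


def rdm(patterns):
    """Upper triangle of the dissimilarity matrix: 1 - r for each condition pair."""
    return [1.0 - pearson(patterns[i], patterns[j])
            for i, j in combinations(range(len(patterns)), 2)]


def ranks(values):
    """Ranks starting at 1; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def rdm_similarity(patterns_a, patterns_b):
    """Spearman correlation between the RDMs of two modalities."""
    return pearson(ranks(rdm(patterns_a)), ranks(rdm(patterns_b)))


# Toy data: four conditions (images), three channels per modality.
meg_like = [[1, 2, 3], [1, 3, 2], [3, 2, 1], [2, 1, 3]]
fmri_like = [[2 * x + 1 for x in row] for row in meg_like]
print(rdm_similarity(meg_like, fmri_like))
```

Because only the dissimilarity structure is compared, the two modalities may have different numbers of channels, which is what allows sensor-level MEG patterns to be related to voxel-level fMRI patterns.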
Fast and efficient indexing approach for object recognition

NASA Astrophysics Data System (ADS)

Hefnawy, Alaa; Mashali, Samia A.; Rashwan, Mohsen; Fikri, Magdi

1999-08-01

This paper introduces a fast and efficient indexing approach for both 2D and 3D model-based object recognition in the presence of rotation, translation, and scale variations of objects. The indexing entries are computed after preprocessing the data by Haar wavelet decomposition. The scheme is based on a unified image feature detection approach based on Zernike moments. A set of low-level features, e.g. high-precision edges and gray-level corners, is estimated by a set of orthogonal Zernike moments, calculated locally around every image point. High-dimensional, highly descriptive indexing entries are then calculated based on the correlation of these local features and employed for fast access to the model database to generate hypotheses. A list of the most likely candidate models is then presented by evaluating the hypotheses. Experimental results are included to demonstrate the effectiveness of the proposed indexing approach.
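
The Haar wavelet preprocessing step can be illustrated with one level of 2-D decomposition. This sketch assumes the averaging (unnormalized) form of the Haar transform and an image with even width and height. The function names are hypothetical, and the paper's actual preprocessing (number of levels, normalization) is not given in the abstract.

```python
def haar_step(row):
    """Split an even-length sequence into pairwise averages and differences."""
    pairs = list(zip(row[0::2], row[1::2]))
    return [(a + b) / 2 for a, b in pairs], [(a - b) / 2 for a, b in pairs]


def transpose(matrix):
    """Swap rows and columns of a rectangular list of lists."""
    return [list(col) for col in zip(*matrix)]


def haar_2d(image):
    """One level of 2-D Haar decomposition: filter rows first, then columns.

    Returns four half-size sub-bands: the approximation, then the details
    across rows, across columns, and along the diagonal.
    """
    low, high = zip(*(haar_step(row) for row in image))

    def filter_columns(block):
        lo, hi = zip(*(haar_step(col) for col in transpose(block)))
        return transpose(lo), transpose(hi)

    approx, detail_rows = filter_columns(low)
    detail_cols, detail_diag = filter_columns(high)
    return approx, detail_rows, detail_cols, detail_diag


image = [[1, 3],
         [5, 7]]
for band in haar_2d(image):
    print(band)
```

The approximation band is a half-resolution copy of the image, so local features such as moments can be computed on fewer pixels, which is one plausible reason for using the decomposition before indexing.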
Short-Term Memory Scanning Viewed as Exemplar-Based Categorization

ERIC Educational Resources Information Center

Nosofsky, Robert M.; Little, Daniel R.; Donkin, Christopher; Fific, Mario

2011-01-01

Exemplar-similarity models such as the exemplar-based random walk (EBRW) model (Nosofsky & Palmeri, 1997b) were designed to provide a formal account of multidimensional classification choice probabilities and response times (RTs). At the same time, a recurring theme has been to use exemplar models to account for old-new item recognition and to…
Surface versus Edge-Based Determinants of Visual Recognition.

ERIC Educational Resources Information Center

Biederman, Irving; Ju, Ginny

1988-01-01

The latency at which objects could be identified by 126 subjects was compared through line drawings (edge-based) or color photography (surface depiction). The line drawing was identified about as quickly as the photograph; primal access to a mental representation of an object can be modeled from an edge-based description. (SLD)
New technique for real-time distortion-invariant multiobject recognition and classification

NASA Astrophysics Data System (ADS)

Hong, Rutong; Li, Xiaoshun; Hong, En; Wang, Zuyi; Wei, Hongan

2001-04-01

A real-time hybrid distortion-invariant OPR system was established to perform 3D multiobject distortion-invariant automatic pattern recognition. A wavelet transform technique was used for digital preprocessing of the input scene, to suppress the noisy background and enhance the object to be recognized. A three-layer backpropagation artificial neural network was used in correlation signal post-processing to perform multiobject distortion-invariant recognition and classification. The C-80 and NOA real-time processing ability and multithread programming technology were used to perform high-speed parallel multitask processing and speed up the post-processing rate for ROIs. The reference filter library was constructed for the distorted versions of 3D object model images, based on measured tolerances for distortion parameters such as rotation, azimuth and scale. Real-time optical correlation recognition testing of this OPR system demonstrates that, using the preprocessing, the post-processing, the nonlinear algorithm of optimum filtering, the RFL construction technique and the multithread programming technology, a high probability of recognition and a high recognition rate were obtained for the real-time multiobject distortion-invariant OPR system. The recognition reliability and rate were improved greatly. These techniques are very useful for automatic target recognition.
The development of newborn object recognition in fast and slow visual worlds

PubMed Central

Wood, Justin N.; Wood, Samantha M. W.

2016-01-01

Object recognition is central to perception and cognition. Yet relatively little is known about the environmental factors that cause invariant object recognition to emerge in the newborn brain. Is this ability a hardwired property of vision? Or does the development of invariant object recognition require experience with a particular kind of visual environment? Here, we used a high-throughput controlled-rearing method to examine whether newborn chicks (Gallus gallus) require visual experience with slowly changing objects to develop invariant object recognition abilities. When newborn chicks were raised with a slowly rotating virtual object, the chicks built invariant object representations that generalized across novel viewpoints and rotation speeds. In contrast, when newborn chicks were raised with a virtual object that rotated more quickly, the chicks built viewpoint-specific object representations that failed to generalize to novel viewpoints and rotation speeds. Moreover, there was a direct relationship between the speed of the object and the amount of invariance in the chick's object representation. Thus, visual experience with slowly changing objects plays a critical role in the development of invariant object recognition. These results indicate that invariant object recognition is not a hardwired property of vision, but is learned rapidly when newborns encounter a slowly changing visual world. PMID:27097925
Intelligent fault recognition strategy based on adaptive optimized multiple centers

NASA Astrophysics Data System (ADS)

Zheng, Bo; Li, Yan-Feng; Huang, Hong-Zhong

2018-06-01

For recognition principles based on a single optimized center, one important issue is that data with a nonlinear separatrix cannot be recognized accurately. To solve this problem, a novel recognition strategy based on adaptive optimized multiple centers is proposed in this paper. This strategy recognizes data sets with a nonlinear separatrix by means of multiple centers. Meanwhile, priority levels are introduced into the multi-objective optimization, which covers recognition accuracy, the number of optimized centers, and the distance relationship. According to the characteristics of the data, the priority levels are adjusted to set the number of optimized centers adaptively while keeping the original accuracy. The proposed method is compared with other methods, including support vector machine (SVM), neural network, and Bayesian classifier. The results demonstrate that the proposed strategy has the same or even better recognition ability on data with different distribution characteristics.
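Why several centers per class help with a nonlinear separatrix can be shown with a minimal sketch. It assumes classification by the nearest center under Euclidean distance and uses hand-picked centers on an XOR-like layout. The paper's adaptive multi-objective optimization of the centers is not reproduced here, and all names and values are hypothetical.

```python
import math


def nearest_center_label(point, centers):
    """Return the label whose closest center is nearest to the point.

    centers maps each class label to a list of center coordinates.
    """
    best_label, best_dist = None, math.inf
    for label, class_centers in centers.items():
        for center in class_centers:
            dist = math.dist(point, center)
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label


# XOR-like layout (assumed): class A sits in two opposite corners, class B in the
# other two. One center per class would place both centers at (0.5, 0.5), so the
# classes could not be told apart; two centers per class separate them.
multi_centers = {
    "A": [(0.0, 0.0), (1.0, 1.0)],
    "B": [(0.0, 1.0), (1.0, 0.0)],
}

label_near_origin = nearest_center_label((0.1, 0.1), multi_centers)
label_lower_right = nearest_center_label((0.9, 0.1), multi_centers)
```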

Effects of Pictorial Cues on Reaching Depend on the Distinctiveness of Target Objects

PubMed Central

Himmelbach, Marc

2013-01-01

There is an ongoing debate under what conditions learned object sizes influence visuomotor control under preserved stereovision. Using meaningful objects (matchboxes of locally well-known brands in the UK) a previous study has nicely shown that the recognition of these objects influences action programming by means of reach amplitude and grasp pre-shaping even under binocular vision. Using the same paradigm, we demonstrated that short-term learning of colour-size associations was not sufficient to induce any visuomotor effects under binocular viewing conditions. Now we used the same matchboxes, for which the familiarity effect was shown in the UK, with German participants who have never seen these objects before. We addressed the question whether simply a high degree of distinctness, or whether instead actual prior familiarity of these objects, are required to affect motor computations. We found that under monocular and binocular viewing conditions the learned size and location influenced the amplitude of the reaching component significantly. In contrast, the maximum grip aperture remained unaffected for binocular vision. We conclude that visual distinctness is sufficient to form reliable associations in short-term learning to influence reaching even for preserved stereovision. Grasp pre-shaping instead seems to be less susceptible to such perceptual effects. PMID:23382882
Learning viewpoint invariant object representations using a temporal coherence principle.

PubMed

Einhäuser, Wolfgang; Hipp, Jörg; Eggert, Julian; Körner, Edgar; König, Peter

2005-07-01

Invariant object recognition is arguably one of the major challenges for contemporary machine vision systems. In contrast, the mammalian visual system performs this task virtually effortlessly. How can we exploit our knowledge on the biological system to improve artificial systems? Our understanding of the mammalian early visual system has been augmented by the discovery that general coding principles could explain many aspects of neuronal response properties. How can such schemes be transferred to system level performance? In the present study we train cells on a particular variant of the general principle of temporal coherence, the "stability" objective. These cells are trained on unlabeled real-world images without a teaching signal. We show that after training, the cells form a representation that is largely independent of the viewpoint from which the stimulus is looked at. This finding includes generalization to previously unseen viewpoints. The achieved representation is better suited for view-point invariant object classification than the cells' input patterns. This property to facilitate view-point invariant classification is maintained even if training and classification take place in the presence of an--also unlabeled--distractor object. In summary, here we show that unsupervised learning using a general coding principle facilitates the classification of real-world objects, that are not segmented from the background and undergo complex, non-isomorphic, transformations.
Towards discrete wavelet transform-based human activity recognition

NASA Astrophysics Data System (ADS)

Khare, Manish; Jeon, Moongu

2017-06-01

Providing accurate recognition of human activities is a challenging problem for visual surveillance applications. In this paper, we present a simple and efficient algorithm for human activity recognition based on a wavelet transform. We adopt discrete wavelet transform (DWT) coefficients as a feature of human objects to obtain advantages of its multiresolution approach. The proposed method is tested on multiple levels of DWT. Experiments are carried out on different standard action datasets including KTH and i3D Post. The proposed method is compared with other state-of-the-art methods in terms of different quantitative performance measures. The proposed method is found to have better recognition accuracy in comparison to the state-of-the-art methods.
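The multilevel decomposition behind DWT features can be sketched as follows. The abstract does not name the wavelet or the feature layout, so this example assumes the orthonormal Haar wavelet and a one-dimensional signal; a real pipeline would apply a two-dimensional transform to video frames. The function name and the sample values are hypothetical.

```python
import math


def haar_dwt(signal, levels):
    """Multilevel orthonormal Haar DWT of a 1D signal.

    Returns the coarsest approximation and a list of detail bands,
    finest band first.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) < 2 or len(approx) % 2:
            raise ValueError("signal length must be divisible by 2 at every level")
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Detail coefficients capture local differences, approximations local averages.
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx, details


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_dwt(signal, levels=2)

# One possible feature vector: approximation followed by all detail bands.
features = approx + [c for band in details for c in band]
```

Because the transform is orthonormal, the squared coefficients sum to the energy of the input signal, which gives a quick sanity check for an implementation.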
The review and results of different methods for facial recognition

NASA Astrophysics Data System (ADS)

Le, Yifan

2017-09-01

In recent years, facial recognition has drawn much attention due to its wide range of potential applications. As a distinctive technology in biometric identification, facial recognition represents a significant improvement because it can operate without the cooperation of the people being detected. Hence, facial recognition is expected to be adopted in defense systems, medical detection, human behavior understanding, etc. Several theories and methods have been established to make progress in facial recognition: (1) a novel two-stage facial landmark localization method, which localizes faces more accurately on a specific database; (2) a statistical face frontalization method, which outperforms state-of-the-art methods for face landmark localization; (3) a general facial landmark detection algorithm that handles images with severe occlusion and images with large head poses; (4) three face alignment methods, namely a shape-augmented regression method, a pose-indexed multi-view method, and a learning-based method that regresses local binary features. The aim of this paper is to analyze previous work on different aspects of facial recognition, focusing on concrete methods and their performance on various databases. In addition, some improvement measures and suggestions for potential applications are put forward.
A Biologically Plausible Transform for Visual Recognition that is Invariant to Translation, Scale, and Rotation.

PubMed

Sountsov, Pavel; Santucci, David M; Lisman, John E

2011-01-01

Visual object recognition occurs easily despite differences in position, size, and rotation of the object, but the neural mechanisms responsible for this invariance are not known. We have found a set of transforms that achieve invariance in a neurally plausible way. We find that a transform based on local spatial frequency analysis of oriented segments and on logarithmic mapping, when applied twice in an iterative fashion, produces an output image that is unique to the object and that remains constant as the input image is shifted, scaled, or rotated.
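The role of the logarithmic mapping can be illustrated with a minimal sketch. In log-polar coordinates, scaling an image point adds a constant to the log-radius and rotating it adds a constant to the angle, so both become plain shifts. The sketch shows only this property; the spatial frequency analysis of oriented segments and the second application of the transform described in the abstract are not reproduced, and the function names are hypothetical.

```python
import math


def log_polar(x, y):
    """Map a point (not at the origin) to (log radius, angle)."""
    return math.log(math.hypot(x, y)), math.atan2(y, x)


def scale_and_rotate(x, y, scale, angle):
    """Scale a point about the origin and rotate it counter-clockwise by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return scale * (c * x - s * y), scale * (s * x + c * y)


log_r, theta = log_polar(3.0, 4.0)
log_r2, theta2 = log_polar(*scale_and_rotate(3.0, 4.0, scale=2.0, angle=0.3))

# Scaling by 2 shifts the log-radius by log(2); rotating by 0.3 rad shifts the angle by 0.3.
radial_shift = log_r2 - log_r
angular_shift = theta2 - theta
```

A shift-invariant operation applied after such a mapping, for example taking the magnitude of a Fourier transform, would then remove the dependence on scale and rotation.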
A Biologically Plausible Transform for Visual Recognition that is Invariant to Translation, Scale, and Rotation

PubMed Central

Sountsov, Pavel; Santucci, David M.; Lisman, John E.

2011-01-01

Visual object recognition occurs easily despite differences in position, size, and rotation of the object, but the neural mechanisms responsible for this invariance are not known. We have found a set of transforms that achieve invariance in a neurally plausible way. We find that a transform based on local spatial frequency analysis of oriented segments and on logarithmic mapping, when applied twice in an iterative fashion, produces an output image that is unique to the object and that remains constant as the input image is shifted, scaled, or rotated. PMID:22125522
Recognition of 3-D symmetric objects from range images in automated assembly tasks

NASA Technical Reports Server (NTRS)

Alvertos, Nicolas; Dcunha, Ivan

1990-01-01

A new technique is presented for the three dimensional recognition of symmetric objects from range images. Beginning from the implicit representation of quadrics, a set of ten coefficients is determined for symmetric objects like spheres, cones, cylinders, ellipsoids, and parallelepipeds. Instead of using these ten coefficients trying to fit them to smooth surfaces (patches) based on the traditional way of determining curvatures, a new approach based on two dimensional geometry is used. For each symmetric object, a unique set of two dimensional curves is obtained from the various angles at which the object is intersected with a plane. Using the same ten coefficients obtained earlier and based on the discriminant method, each of these curves is classified as a parabola, circle, ellipse, or hyperbola. Each symmetric object is found to possess a unique set of these two dimensional curves whereby it can be differentiated from the others. It is shown that instead of using the three dimensional discriminant which involves evaluation of the rank of its matrix, it is sufficient to use the two dimensional discriminant which only requires three arithmetic operations.
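The two-dimensional discriminant test can be sketched with the standard textbook rule for a conic A x^2 + B x y + C y^2 + D x + E y + F = 0, where the sign of B^2 - 4AC determines the curve type. The abstract does not spell out its exact procedure, so this is an assumed illustration; degenerate conics (points, line pairs) are ignored and the function name is hypothetical.

```python
def classify_conic(a, b, c, tol=1e-12):
    """Classify a non-degenerate conic from its quadratic coefficients A, B, C."""
    discriminant = b * b - 4.0 * a * c
    if abs(discriminant) <= tol:
        return "parabola"
    if discriminant > 0:
        return "hyperbola"
    # Negative discriminant: an ellipse, which is a circle when A == C and B == 0.
    if abs(b) <= tol and abs(a - c) <= tol:
        return "circle"
    return "ellipse"


# Example cross-sections (coefficients A, B, C only):
curve_types = [
    classify_conic(1.0, 0.0, 1.0),   # x^2 + y^2 = r^2
    classify_conic(1.0, 0.0, 4.0),   # x^2 + 4 y^2 = 1
    classify_conic(1.0, 0.0, 0.0),   # y = x^2
    classify_conic(1.0, 0.0, -1.0),  # x^2 - y^2 = 1
]
```

An object would then be described by the set of curve types found across its planar cross-sections.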
Object recognition with hierarchical discriminant saliency networks.

PubMed

Han, Sunhyoung; Vasconcelos, Nuno

2014-01-01

The benefits of integrating attention and object recognition are investigated. While attention is frequently modeled as a pre-processor for recognition, we investigate the hypothesis that attention is an intrinsic component of recognition and vice-versa. This hypothesis is tested with a recognition model, the hierarchical discriminant saliency network (HDSN), whose layers are top-down saliency detectors, tuned for a visual class according to the principles of discriminant saliency. As a model of neural computation, the HDSN has two possible implementations. In a biologically plausible implementation, all layers comply with the standard neurophysiological model of visual cortex, with sub-layers of simple and complex units that implement a combination of filtering, divisive normalization, pooling, and non-linearities. In a convolutional neural network implementation, all layers are convolutional and implement a combination of filtering, rectification, and pooling. The rectification is performed with a parametric extension of the now popular rectified linear units (ReLUs), whose parameters can be tuned for the detection of target object classes. This enables a number of functional enhancements over neural network models that lack a connection to saliency, including optimal feature denoising mechanisms for recognition, modulation of saliency responses by the discriminant power of the underlying features, and the ability to detect both feature presence and absence. In either implementation, each layer has a precise statistical interpretation, and all parameters are tuned by statistical learning. Each saliency detection layer learns more discriminant saliency templates than its predecessors and higher layers have larger pooling fields. This enables the HDSN to simultaneously achieve high selectivity to target object classes and invariance. 
The performance of the network in saliency and object recognition tasks is compared to those of models from the biological and computer vision literatures. This demonstrates benefits for all the functional enhancements of the HDSN, the class tuning inherent to discriminant saliency, and saliency layers based on templates of increasing target selectivity and invariance. Altogether, these experiments suggest that there are non-trivial benefits in integrating attention and recognition.
Automatic anatomy recognition via multiobject oriented active shape models.

PubMed

Chen, Xinjian; Udupa, Jayaram K; Alavi, Abass; Torigian, Drew A

2010-12-01

This paper studies the feasibility of developing an automatic anatomy recognition (AAR) system in clinical radiology and demonstrates its operation on clinical 2D images. The anatomy recognition method described here consists of two main components: (a) multiobject generalization of OASM and (b) object recognition strategies. The OASM algorithm is generalized to multiple objects by including a model for each object and assigning a cost structure specific to each object in the spirit of live wire. The delineation of multiobject boundaries is done in MOASM via a three level dynamic programming algorithm, wherein the first level is at pixel level which aims to find optimal oriented boundary segments between successive landmarks, the second level is at landmark level which aims to find optimal location for the landmarks, and the third level is at the object level which aims to find optimal arrangement of object boundaries over all objects. The object recognition strategy attempts to find that pose vector (consisting of translation, rotation, and scale component) for the multiobject model that yields the smallest total boundary cost for all objects. The delineation and recognition accuracies were evaluated separately utilizing routine clinical chest CT, abdominal CT, and foot MRI data sets. The delineation accuracy was evaluated in terms of true and false positive volume fractions (TPVF and FPVF). The recognition accuracy was assessed (1) in terms of the size of the space of the pose vectors for the model assembly that yielded high delineation accuracy, (2) as a function of the number of objects and objects' distribution and size in the model, (3) in terms of the interdependence between delineation and recognition, and (4) in terms of the closeness of the optimum recognition result to the global optimum. When multiple objects are included in the model, the delineation accuracy in terms of TPVF can be improved to 97%-98% with a low FPVF of 0.1%-0.2%. 
Typically, a recognition accuracy of ≥ 90% yielded a TPVF ≥ 95% and FPVF ≤ 0.5%. Over the three data sets and over all tested objects, in 97% of the cases, the optimal solutions found by the proposed method constituted the true global optimum. The experimental results showed the feasibility and efficacy of the proposed automatic anatomy recognition system. Increasing the number of objects in the model can significantly improve both recognition and delineation accuracy. More spread out arrangement of objects in the model can lead to improved recognition and delineation accuracy. Including larger objects in the model also improved recognition and delineation. The proposed method almost always finds globally optimum solutions.
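The delineation measures can be sketched as follows. The abstract does not define them, so this example assumes one common convention: TPVF is the fraction of true object voxels that were delineated, and FPVF is the number of falsely delineated voxels divided by the number of background voxels in the image domain. Other normalizations of FPVF exist. The function name and the toy voxel sets are hypothetical.

```python
def volume_fractions(segmented, truth, domain_size):
    """Return (TPVF, FPVF) for voxel index sets within a domain of domain_size voxels.

    Assumed convention: TPVF is normalized by the true object size and
    FPVF by the background size (domain minus true object).
    """
    true_positive = len(segmented & truth)
    false_positive = len(segmented - truth)
    tpvf = true_positive / len(truth)
    fpvf = false_positive / (domain_size - len(truth))
    return tpvf, fpvf


# Toy example: a 110-voxel image, a 10-voxel object, and a delineation shifted by one voxel.
truth = set(range(0, 10))
segmented = set(range(1, 11))
tpvf, fpvf = volume_fractions(segmented, truth, domain_size=110)
```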
The memory state heuristic: A formal model based on repeated recognition judgments.

PubMed

Castela, Marta; Erdfelder, Edgar

2017-02-01

The recognition heuristic (RH) theory predicts that, in comparative judgment tasks, if one object is recognized and the other is not, the recognized one is chosen. The memory-state heuristic (MSH) extends the RH by assuming that choices are not affected by recognition judgments per se, but by the memory states underlying these judgments (i.e., recognition certainty, uncertainty, or rejection certainty). Specifically, the larger the discrepancy between memory states, the larger the probability of choosing the object in the higher state. The typical RH paradigm does not allow estimation of the underlying memory states because it is unknown whether the objects were previously experienced or not. Therefore, we extended the paradigm by repeating the recognition task twice. In line with high threshold models of recognition, we assumed that inconsistent recognition judgments result from uncertainty whereas consistent judgments most likely result from memory certainty. In Experiment 1, we fitted 2 nested multinomial models to the data: an MSH model that formalizes the relation between memory states and binary choices explicitly and an approximate model that ignores the (unlikely) possibility of consistent guesses. Both models provided converging results. As predicted, reliance on recognition increased with the discrepancy in the underlying memory states. In Experiment 2, we replicated these results and found support for choice consistency predictions of the MSH. Additionally, recognition and choice latencies were in agreement with the MSH in both experiments. Finally, we validated critical parameters of our MSH model through a cross-validation method and a third experiment. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Thermalnet: a Deep Convolutional Network for Synthetic Thermal Image Generation

NASA Astrophysics Data System (ADS)

Kniaz, V. V.; Gorbatsevich, V. S.; Mizginov, V. A.

2017-05-01

Deep convolutional neural networks have dramatically changed the landscape of modern computer vision. Nowadays, methods based on deep neural networks show the best performance among image recognition and object detection algorithms. While the refinement of network architectures has received a lot of scholarly attention, from a practical point of view the preparation of a large image dataset for successful training of a neural network has become one of the major challenges. This challenge is particularly pronounced for image recognition in wavelengths lying outside the visible spectrum. For example, no infrared or radar image datasets large enough for successful training of a deep neural network are available in the public domain to date. Recent advances in deep neural networks show that they are also capable of performing arbitrary image transformations such as super-resolution image generation, grayscale image colorisation and imitation of the style of a given artist. Thus a natural question arises: how could deep neural networks be used for augmentation of existing large image datasets? This paper is focused on the development of the Thermalnet deep convolutional neural network for augmentation of existing large visible image datasets with synthetic thermal images. The Thermalnet network architecture is inspired by colorisation deep neural networks.
DORSAL HIPPOCAMPAL PROGESTERONE INFUSIONS ENHANCE OBJECT RECOGNITION IN YOUNG FEMALE MICE

PubMed Central

Orr, Patrick T.; Lewis, Michael C.; Frick, Karyn M.

2009-01-01

The effects of progesterone on memory are not nearly as well studied as the effects of estrogens. Although progesterone can reportedly enhance spatial and/or object recognition in female rodents when given immediately after training, previous studies have injected progesterone systemically, and therefore, the brain regions mediating this enhancement are not clear. As such, this study was designed to determine the role of the dorsal hippocampus in mediating the beneficial effect of progesterone on object recognition. Young ovariectomized C57BL/6 mice were trained in a hippocampal-dependent object recognition task utilizing two identical objects, and then immediately or 2 hrs afterwards, received bilateral dorsal hippocampal infusions of vehicle or 0.01, 0.1, or 1.0 μg/μl water-soluble progesterone. Forty-eight hours later, object recognition memory was tested using a previously explored object and a novel object. Relative to the vehicle group, memory for the familiar object was enhanced in all groups receiving immediate infusions of progesterone. Progesterone infusion delayed 2 hrs after training did not affect object recognition. These data suggest that the dorsal hippocampus may play a critical role in progesterone-induced enhancement of object recognition. PMID:19477194
Face Recognition From One Example View.

DTIC Science & Technology

1995-09-01

Proceedings, International Workshop on Automatic Face- and Gesture-Recognition, pages 248–253, Zurich, 1995. [32] Yael Moses, Shimon Ullman, and Shimon...recognition. Journal of Cognitive Neuroscience, 3(1):71–86, 1991. [49] Shimon Ullman and Ronen Basri. Recognition by linear combinations of models
Beyond sensory images: Object-based representation in the human ventral pathway

PubMed Central

Pietrini, Pietro; Furey, Maura L.; Ricciardi, Emiliano; Gobbini, M. Ida; Wu, W.-H. Carolyn; Cohen, Leonardo; Guazzelli, Mario; Haxby, James V.

2004-01-01

We investigated whether the topographically organized, category-related patterns of neural response in the ventral visual pathway are a representation of sensory images or a more abstract representation of object form that is not dependent on sensory modality. We used functional MRI to measure patterns of response evoked during visual and tactile recognition of faces and manmade objects in sighted subjects and during tactile recognition in blind subjects. Results showed that visual and tactile recognition evoked category-related patterns of response in a ventral extrastriate visual area in the inferior temporal gyrus that were correlated across modality for manmade objects. Blind subjects also demonstrated category-related patterns of response in this “visual” area, and in more ventral cortical regions in the fusiform gyrus, indicating that these patterns are not due to visual imagery and, furthermore, that visual experience is not necessary for category-related representations to develop in these cortices. These results demonstrate that the representation of objects in the ventral visual pathway is not simply a representation of visual images but, rather, is a representation of more abstract features of object form. PMID:15064396
Multidimensional display controller for displaying to a user an aspect of a multidimensional space visible from a base viewing location along a desired viewing orientation

DOEpatents

Davidson, George S.; Anderson, Thomas G.

2001-01-01

A display controller allows a user to control a base viewing location, a base viewing orientation, and a relative viewing orientation. The base viewing orientation and relative viewing orientation are combined to determine a desired viewing orientation. An aspect of a multidimensional space visible from the base viewing location along the desired viewing orientation is displayed to the user. The user can change the base viewing location, base viewing orientation, and relative viewing orientation by changing the location or other properties of input objects.
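How a base and a relative viewing orientation might be combined can be sketched with rotation matrices. The abstract does not say how the two are combined, so this example assumes that each orientation is a rotation about the vertical axis and that the desired orientation is the matrix product of the two. All names and angles are hypothetical.

```python
import math


def rot_z(angle):
    """3x3 rotation matrix about the z axis (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(matrix, vector):
    """Apply a 3x3 matrix to a 3-vector."""
    return tuple(sum(matrix[i][k] * vector[k] for k in range(3)) for i in range(3))


# Assumed inputs: the base orientation faces 90 degrees, and the user looks
# 30 degrees to the right of it.
base_orientation = rot_z(math.radians(90.0))
relative_orientation = rot_z(math.radians(-30.0))

# Desired viewing orientation: the relative rotation expressed in the base frame.
desired_orientation = matmul(base_orientation, relative_orientation)
view_direction = apply(desired_orientation, (1.0, 0.0, 0.0))
```

With rotations about more than one axis the order of multiplication matters, which is one reason to keep the base and relative orientations as separate inputs.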
A Vision-Based System for Intelligent Monitoring: Human Behaviour Analysis and Privacy by Context

PubMed Central

Chaaraoui, Alexandros Andre; Padilla-López, José Ramón; Ferrández-Pastor, Francisco Javier; Nieto-Hidalgo, Mario; Flórez-Revuelta, Francisco

2014-01-01

Due to progress and demographic change, society is facing a crucial challenge related to increased life expectancy and a higher number of people in situations of dependency. As a consequence, there exists a significant demand for support systems for personal autonomy. This article outlines the vision@home project, whose goal is to extend independent living at home for elderly and impaired people, providing care and safety services by means of vision-based monitoring. Different kinds of ambient-assisted living services are supported, from the detection of home accidents, to telecare services. In this contribution, the specification of the system is presented, and novel contributions are made regarding human behaviour analysis and privacy protection. By means of a multi-view setup of cameras, people's behaviour is recognised based on human action recognition. For this purpose, a weighted feature fusion scheme is proposed to learn from multiple views. In order to protect the right to privacy of the inhabitants when a remote connection occurs, a privacy-by-context method is proposed. The experimental results of the behaviour recognition method show an outstanding performance, as well as support for multi-view scenarios and real-time execution, which are required in order to provide the proposed services. PMID:24854209
A vision-based system for intelligent monitoring: human behaviour analysis and privacy by context.

PubMed

Chaaraoui, Alexandros Andre; Padilla-López, José Ramón; Ferrández-Pastor, Francisco Javier; Nieto-Hidalgo, Mario; Flórez-Revuelta, Francisco

2014-05-20

Due to progress and demographic change, society is facing a crucial challenge related to increased life expectancy and a higher number of people in situations of dependency. As a consequence, there exists a significant demand for support systems for personal autonomy. This article outlines the vision@home project, whose goal is to extend independent living at home for elderly and impaired people, providing care and safety services by means of vision-based monitoring. Different kinds of ambient-assisted living services are supported, from the detection of home accidents, to telecare services. In this contribution, the specification of the system is presented, and novel contributions are made regarding human behaviour analysis and privacy protection. By means of a multi-view setup of cameras, people's behaviour is recognised based on human action recognition. For this purpose, a weighted feature fusion scheme is proposed to learn from multiple views. In order to protect the right to privacy of the inhabitants when a remote connection occurs, a privacy-by-context method is proposed. The experimental results of the behaviour recognition method show an outstanding performance, as well as support for multi-view scenarios and real-time execution, which are required in order to provide the proposed services.
Out of place, out of mind: Schema-driven false memory effects for object-location bindings.

PubMed

Lew, Adina R; Howe, Mark L

2017-03-01

Events consist of diverse elements, each processed in specialized neocortical networks, with temporal lobe memory systems binding these elements to form coherent event memories. We provide a novel theoretical analysis of an unexplored consequence of the independence of memory systems for elements and their bindings, one that raises the paradoxical prediction that schema-driven false memories can act solely on the binding of event elements despite the superior retrieval of individual elements. This is because if two or more schema-relevant elements are bound together in unexpected conjunctions, the unexpected conjunction will increase attention during encoding to both the elements and their bindings, but only the bindings will receive competition with evoked schema-expected bindings. We test our model by examining memory for object-location bindings in recognition (Study 1) and recall (Studies 2 and 3) tasks. After studying schema-relevant objects in unexpected locations (e.g., pan on a stool in a kitchen scene), participants who then viewed these objects in expected locations (e.g., pan on stove) at test were more likely to falsely remember this object-location pairing as correct, compared with participants that viewed a different unexpected object-location pairing (e.g., pan on floor). In recall, participants were more likely to correctly remember individual schema-relevant objects originally viewed in unexpected, as opposed to expected locations, but were then more likely to misplace these items in the original room scene to expected places, relative to control schema-irrelevant objects. Our theoretical analysis and novel paradigm provide a tool for investigating memory distortions acting on binding processes. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Hierarchical Object Recognition Using Libraries of Parameterized Model Sub-Parts.

DTIC Science & Technology

1987-06-01

Sketch; Structure Hierarchy; Constrained Search. This thesis describes the... these hierarchies to achieve robust recognition based on effective organization and indexing schemes for model libraries. The goal of the system is to...with different relative scaling, rotation, or translation than in the models. The approach taken in this thesis is to develop an object shape
Object recognition of real targets using modelled SAR images

NASA Astrophysics Data System (ADS)

Zherdev, D. A.

2017-12-01

In this work, the problem of recognition is studied using SAR images. The recognition algorithm is based on the computation of conjugation indices with the vectors of a class. The support subspaces for each class are constructed by excluding the most and the least correlated vectors in the class. We examine whether the feature vector size can be reduced significantly, which would decrease recognition time. The feature vectors are formed from the target images and transformed using a pre-trained convolutional neural network (CNN).

A rat in the sewer: How mental imagery interacts with object recognition

PubMed Central

Hamburger, Kai

2018-01-01

The role of mental imagery has been puzzling researchers for more than two millennia. Both positive and negative effects of mental imagery on information processing have been discussed. The aim of this work was to examine how mental imagery affects object recognition and associative learning. Based on different perceptual and cognitive accounts we tested our imagery-induced interaction hypothesis in a series of two experiments. According to that, mental imagery could lead to (1) a superior performance in object recognition and associative learning if these objects are imagery-congruent (semantically) and to (2) an inferior performance if these objects are imagery-incongruent. In the first experiment, we used a static environment and tested associative learning. In the second experiment, subjects encoded object information in a dynamic environment by means of a virtual sewer system. Our results demonstrate that subjects who received a role adoption task (by means of guided mental imagery) performed better when imagery-congruent objects were used and worse when imagery-incongruent objects were used. We finally discuss our findings also with respect to alternative accounts and plead for a multi-methodological approach for future research in order to solve this issue. PMID:29590161
A rat in the sewer: How mental imagery interacts with object recognition.

PubMed

Karimpur, Harun; Hamburger, Kai

2018-01-01

The role of mental imagery has been puzzling researchers for more than two millennia. Both positive and negative effects of mental imagery on information processing have been discussed. The aim of this work was to examine how mental imagery affects object recognition and associative learning. Based on different perceptual and cognitive accounts we tested our imagery-induced interaction hypothesis in a series of two experiments. According to that, mental imagery could lead to (1) a superior performance in object recognition and associative learning if these objects are imagery-congruent (semantically) and to (2) an inferior performance if these objects are imagery-incongruent. In the first experiment, we used a static environment and tested associative learning. In the second experiment, subjects encoded object information in a dynamic environment by means of a virtual sewer system. Our results demonstrate that subjects who received a role adoption task (by means of guided mental imagery) performed better when imagery-congruent objects were used and worse when imagery-incongruent objects were used. We finally discuss our findings also with respect to alternative accounts and plead for a multi-methodological approach for future research in order to solve this issue.
Infant Visual Attention and Object Recognition

PubMed Central

Reynolds, Greg D.

2015-01-01

This paper explores the role visual attention plays in the recognition of objects in infancy. Research and theory on the development of infant attention and recognition memory are reviewed in three major sections. The first section reviews some of the major findings and theory emerging from a rich tradition of behavioral research utilizing preferential looking tasks to examine visual attention and recognition memory in infancy. The second section examines research utilizing neural measures of attention and object recognition in infancy as well as research on brain-behavior relations in the early development of attention and recognition memory. The third section addresses potential areas of the brain involved in infant object recognition and visual attention. An integrated synthesis of some of the existing models of the development of visual attention is presented which may account for the observed changes in behavioral and neural measures of visual attention and object recognition that occur across infancy. PMID:25596333
Magnet status and registered nurse views of the work environment and nursing as a career.

PubMed

Ulrich, Beth T; Buerhaus, Peter I; Donelan, Karen; Norman, Linda; Dittus, Robert

2007-05-01

To compare how registered nurses view the work environment and the nursing shortage based on the Magnet status of their organizations. The upsurge in organizations pursuing and obtaining Magnet recognition provides increased opportunities to investigate whether and how registered nurses who are employed in Magnet organizations and organizations pursuing Magnet status perceive differences in the nursing shortage, hospitals' responses to the shortage, characteristics of the work environment, and professional relationships. A nationally representative sample of registered nurses licensed to practice in the United States was surveyed. The views of registered nurses who worked in Magnet organizations, organizations in the process of applying for Magnet status, and non-Magnet organizations were analyzed as independent groups. Significant differences were found. Although there is a clear Magnet difference, there are also identifiable differences that occur during the pursuit of Magnet recognition. Many organizations in the process of applying for Magnet status rated higher than Magnet organizations, indicating that there is much to do to maintain the comparative advantages for Magnet hospitals.
Magnet status and registered nurse views of the work environment and nursing as a career.

PubMed

Ulrich, Beth T; Buerhaus, Peter I; Donelan, Karen; Norman, Linda; Dittus, Robert

2009-01-01

To compare how registered nurses view the work environment and the nursing shortage based on the Magnet status of their organizations. The upsurge in organizations pursuing and obtaining Magnet recognition provides increased opportunities to investigate whether and how registered nurses who are employed in Magnet organizations and organizations pursuing Magnet status perceive differences in the nursing shortage, hospitals' responses to the shortage, characteristics of the work environment, and professional relationships. A nationally representative sample of registered nurses licensed to practice in the United States was surveyed. The views of registered nurses who worked in Magnet organizations, organizations in the process of applying for Magnet status, and non-Magnet organizations were analyzed as independent groups. Significant differences were found. Although there is a clear Magnet difference, there are also identifiable differences that occur during the pursuit of Magnet recognition. Many organizations in the process of applying for Magnet status rated higher than Magnet organizations, indicating that there is much to do to maintain the comparative advantages for Magnet hospitals.
Development of visuo-haptic transfer for object recognition in typical preschool and school-aged children.

PubMed

Purpura, Giulia; Cioni, Giovanni; Tinelli, Francesca

2018-07-01

Object recognition is a long and complex adaptive process and its full maturation requires combination of many different sensory experiences as well as cognitive abilities to manipulate previous experiences in order to develop new percepts and subsequently to learn from the environment. It is well recognized that the transfer of visual and haptic information facilitates object recognition in adults, but less is known about development of this ability. In this study, we explored the developmental course of object recognition capacity in children using unimodal visual information, unimodal haptic information, and visuo-haptic information transfer in children from 4 years to 10 years and 11 months of age. Participants were tested through a clinical protocol, involving visual exploration of black-and-white photographs of common objects, haptic exploration of real objects, and visuo-haptic transfer of these two types of information. Results show an age-dependent development of object recognition abilities for visual, haptic, and visuo-haptic modalities. A significant effect of time on development of unimodal and crossmodal recognition skills was found. Moreover, our data suggest that multisensory processes for common object recognition are active at 4 years of age. They facilitate recognition of common objects, and, although not fully mature, are significant in adaptive behavior from the first years of age. The study of typical development of visuo-haptic processes in childhood is a starting point for future studies regarding object recognition in impaired populations.
Behavior analysis of video object in complicated background

NASA Astrophysics Data System (ADS)

Zhao, Wenting; Wang, Shigang; Liang, Chao; Wu, Wei; Lu, Yang

2016-10-01

This paper aims to achieve robust behavior recognition of video objects against complicated backgrounds. Features of the video object are described and modeled according to the depth information of three-dimensional video. Multi-dimensional eigenvectors are constructed and used to process high-dimensional data. Stable object tracking in complex scenes can be achieved with multi-feature-based behavior analysis, so as to obtain the motion trail. Subsequently, effective behavior recognition of the video object is obtained according to the decision criteria. Moreover, both the real-time performance of the algorithms and the accuracy of the analysis are greatly improved. The theory and method for behavior analysis of video objects in real scenes put forward by this project have broad application prospects and practical significance in security, counter-terrorism, military, and many other fields.
Robust Dehaze Algorithm for Degraded Image of CMOS Image Sensors.

PubMed

Qu, Chen; Bi, Du-Yan; Sui, Ping; Chao, Ai-Nong; Wang, Yun-Fei

2017-09-22

The CMOS (Complementary Metal-Oxide-Semiconductor) is a new type of solid image sensor device widely used in object tracking, object recognition, intelligent navigation fields, and so on. However, images captured by outdoor CMOS sensor devices are usually affected by suspended atmospheric particles (such as haze), causing a reduction in image contrast, color distortion problems, and so on. In view of this, we propose a novel dehazing approach based on a local consistent Markov random field (MRF) framework. The neighboring clique in traditional MRF is extended to the non-neighboring clique, which is defined on local consistent blocks based on two clues, where both the atmospheric light and transmission map satisfy the character of local consistency. In this framework, our model can strengthen the restriction of the whole image while incorporating more sophisticated statistical priors, resulting in more expressive power of modeling, thus, solving inadequate detail recovery effectively and alleviating color distortion. Moreover, the local consistent MRF framework can obtain details while maintaining better results for dehazing, which effectively improves the image quality captured by the CMOS image sensor. Experimental results verified that the method proposed has the combined advantages of detail recovery and color preservation.
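The abstract above refers to the atmospheric light and the transmission map without stating how they enter the image model. As a hedged illustration, the sketch below assumes the widely used haze imaging model I = J·t + A·(1 − t), where I is the observed pixel, J the haze-free pixel, t the transmission, and A the atmospheric light. It shows only the final recovery step once t and A are known. The MRF-based estimation described in the abstract is not reproduced, and the lower bound `t_min` is a common convention assumed here, not a detail from the paper.

```python
def add_haze(scene, transmission, airlight):
    """Forward model per pixel: I = J * t + A * (1 - t)."""
    return [j * t + airlight * (1.0 - t) for j, t in zip(scene, transmission)]

def remove_haze(hazy, transmission, airlight, t_min=0.1):
    """Invert the model: J = (I - A) / max(t, t_min) + A.

    t_min (an assumed safeguard) keeps the division stable where the
    transmission estimate is close to zero.
    """
    return [(i - airlight) / max(t, t_min) + airlight
            for i, t in zip(hazy, transmission)]

# Hypothetical single-channel intensities in [0, 1].
scene = [0.2, 0.5, 0.8]
transmission = [0.9, 0.6, 0.3]
airlight = 1.0

hazy = add_haze(scene, transmission, airlight)
restored = remove_haze(hazy, transmission, airlight)
```

With exact values of t and A the scene is recovered up to floating-point error. In practice both quantities are estimates, which is where priors such as local consistency come in.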
Object recognition and localization from 3D point clouds by maximum-likelihood estimation

NASA Astrophysics Data System (ADS)

Dantanarayana, Harshana G.; Huntley, Jonathan M.

2017-08-01

We present an algorithm based on maximum-likelihood analysis for the automated recognition of objects, and estimation of their pose, from 3D point clouds. Surfaces segmented from depth images are used as the features, unlike 'interest point'-based algorithms which normally discard such data. Compared to the 6D Hough transform, it has negligible memory requirements, and is computationally efficient compared to iterative closest point algorithms. The same method is applicable to both the initial recognition/pose estimation problem as well as subsequent pose refinement through appropriate choice of the dispersion of the probability density functions. This single unified approach therefore avoids the usual requirement for different algorithms for these two tasks. In addition to the theoretical description, a simple 2 degrees of freedom (d.f.) example is given, followed by a full 6 d.f. analysis of 3D point cloud data from a cluttered scene acquired by a projected fringe-based scanner, which demonstrated an RMS alignment error as low as 0.3 mm.
Implementation of Augmented Reality Technology in Sangiran Museum with Vuforia

NASA Astrophysics Data System (ADS)

Purnomo, F. A.; Santosa, P. I.; Hartanto, R.; Pratisto, E. H.; Purbayu, A.

2018-03-01

Archaeological objects are evidence of ancient life, with relics dating back millions of years. Objects discovered by the Sangiran Museum are then preserved and protected from potential damage. This research develops an augmented reality (AR) application for the museum that displays virtual information about the ancient objects on display. The content includes text, audio, and an animated 3D model representing each ancient object. This study emphasizes the 3D markerless recognition process using the Vuforia AR system so that visitors can access the exhibition objects from different viewpoints. Test results show that, by registering image targets at 25° angle intervals, 3D markerless keypoint features can be detected from different viewpoints. The device must meet minimum specifications of a dual-core 1.2 GHz processor, a Power VR SG5X GPU, an 8 MP autofocus camera, and 1 GB of memory to run the application. The AR application's average success rate in detecting museum exhibition objects was 40% for single-view 3D markerless targets, and for multiview markerless targets 86% (for angles 0° - 180°) and 100% (for angles 0° - 360°). The detection distance ranges from 23 cm to 540 cm, with an average response time of 12 seconds to detect a 3D markerless target.
Culture modulates implicit ownership-induced self-bias in memory.

PubMed

Sparks, Samuel; Cunningham, Sheila J; Kritikos, Ada

2016-08-01

The relation of incoming stimuli to the self implicitly determines the allocation of cognitive resources. Cultural variations in the self-concept shape cognition, but the extent is unclear because the majority of studies sample only Western participants. We report cultural differences (Asian versus Western) in ownership-induced self-bias in recognition memory for objects. In two experiments, participants allocated a series of images depicting household objects to self-owned or other-owned virtual baskets based on colour cues before completing a surprise recognition memory test for the objects. The 'other' was either a stranger or a close other. In both experiments, Western participants showed greater recognition memory accuracy for self-owned compared with other-owned objects, consistent with an independent self-construal. In Experiment 1, which required minimal attention to the owned objects, Asian participants showed no such ownership-related bias in recognition accuracy. In Experiment 2, which required attention to owned objects to move them along the screen, Asian participants again showed no overall memory advantage for self-owned items and actually exhibited higher recognition accuracy for mother-owned than self-owned objects, reversing the pattern observed for Westerners. This is consistent with an interdependent self-construal which is sensitive to the particular relationship between the self and other. Overall, our results suggest that the self acts as an organising principle for allocating cognitive resources, but that the way it is constructed depends upon cultural experience. Additionally, the manifestation of these cultural differences in self-representation depends on the allocation of attentional resources to self- and other-associated stimuli. Crown Copyright © 2016. Published by Elsevier B.V. All rights reserved.
If you watch it move, you'll recognize it in 3D: Transfer of depth cues between encoding and retrieval.

PubMed

Papenmeier, Frank; Schwan, Stephan

2016-02-01

Viewing objects with stereoscopic displays provides additional depth cues through binocular disparity supporting object recognition. So far, it was unknown whether this results from the representation of specific stereoscopic information in memory or a more general representation of an object's depth structure. Therefore, we investigated whether continuous object rotation acting as depth cue during encoding results in a memory representation that can subsequently be accessed by stereoscopic information during retrieval. In Experiment 1, we found such transfer effects from continuous object rotation during encoding to stereoscopic presentations during retrieval. In Experiments 2a and 2b, we found that the continuity of object rotation is important because only continuous rotation and/or stereoscopic depth but not multiple static snapshots presented without stereoscopic information caused the extraction of an object's depth structure into memory. We conclude that an object's depth structure and not specific depth cues are represented in memory. Copyright © 2015 Elsevier B.V. All rights reserved.
Automatic anatomy recognition in whole-body PET/CT images

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Huiqian; Udupa, Jayaram K., E-mail: jay@mail.med.upenn.edu; Odhner, Dewey

Purpose: Whole-body positron emission tomography/computed tomography (PET/CT) has become a standard method of imaging patients with various disease conditions, especially cancer. Body-wide accurate quantification of disease burden in PET/CT images is important for characterizing lesions, staging disease, prognosticating patient outcome, planning treatment, and evaluating disease response to therapeutic interventions. However, body-wide anatomy recognition in PET/CT is a critical first step for accurately and automatically quantifying disease body-wide, body-region-wise, and organwise. This latter process, however, has remained a challenge due to the lower quality of the anatomic information portrayed in the CT component of this imaging modality and the paucity of anatomic details in the PET component. In this paper, the authors demonstrate the adaptation of a recently developed automatic anatomy recognition (AAR) methodology [Udupa et al., “Body-wide hierarchical fuzzy modeling, recognition, and delineation of anatomy in medical images,” Med. Image Anal. 18, 752–771 (2014)] to PET/CT images. Their goal was to test what level of object localization accuracy can be achieved on PET/CT compared to that achieved on diagnostic CT images. Methods: The authors advance the AAR approach in this work on three fronts: (i) from body-region-wise treatment in the work of Udupa et al. to whole body; (ii) from the use of image intensity in optimal object recognition in the work of Udupa et al. to intensity plus object-specific texture properties, and (iii) from the intramodality model-building-recognition strategy to the intermodality approach. The whole-body approach allows consideration of relationships among objects in different body regions, which was previously not possible.
Consideration of object texture allows generalizing the previous optimal threshold-based fuzzy model recognition method from intensity images to any derived fuzzy membership image, and in the process, to bring performance to the level achieved on diagnostic CT and MR images in body-region-wise approaches. The intermodality approach fosters the use of already existing fuzzy models, previously created from diagnostic CT images, on PET/CT and other derived images, thus truly separating the modality-independent object assembly anatomy from modality-specific tissue property portrayal in the image. Results: Key ways of combining the above three basic ideas lead them to 15 different strategies for recognizing objects in PET/CT images. Utilizing 50 diagnostic CT image data sets from the thoracic and abdominal body regions and 16 whole-body PET/CT image data sets, the authors compare the recognition performance among these 15 strategies on 18 objects from the thorax, abdomen, and pelvis in object localization error and size estimation error. Particularly on texture membership images, object localization is within 3 voxels on whole-body low-dose CT images and 2 voxels on body-region-wise low-dose images of known true locations. Surprisingly, even on direct body-region-wise PET images, localization error within 3 voxels seems possible. Conclusions: The previous body-region-wise approach can be extended to whole-body torso with similar object localization performance. Combined use of image texture and intensity property yields the best object localization accuracy. In both body-region-wise and whole-body approaches, recognition performance on low-dose CT images reaches levels previously achieved on diagnostic CT images. The best object recognition strategy varies among objects; the proposed framework however allows employing a strategy that is optimal for each object.
TU-C-17A-03: An Integrated Contour Evaluation Software Tool Using Supervised Pattern Recognition for Radiotherapy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, H; Tan, J; Kavanaugh, J

Purpose: Radiotherapy (RT) contours delineated either manually or semiautomatically require verification before clinical usage. Manual evaluation is very time consuming. A new integrated software tool using supervised pattern contour recognition was thus developed to facilitate this process. Methods: The contouring tool was developed using the object-oriented programming language C# and application programming interfaces, e.g. the visualization toolkit (VTK). The C# language served as the tool design basis. The Accord.Net scientific computing libraries were utilized for the required statistical data processing and pattern recognition, while the VTK was used to build and render 3-D mesh models from critical RT structures in real-time and 360° visualization. Principal component analysis (PCA) was used to let the system self-update the geometric variations of normal structures, using physician-approved RT contours as a training dataset. The in-house supervised PCA-based contour recognition method was used for automatically evaluating contour normality/abnormality. The function for reporting the contour evaluation results was implemented using C# and Windows Form Designer. Results: The software input was RT simulation images and RT structures from commercial clinical treatment planning systems. Several abilities were demonstrated: automatic assessment of RT contours, file loading/saving of various modality medical images and RT contours, and generation/visualization of 3-D images and anatomical models. Moreover, it supported the 360° rendering of the RT structures in a multi-slice view, which allows physicians to visually check and edit abnormally contoured structures. Conclusion: This new software integrates the supervised learning framework with image processing and graphical visualization modules for RT contour verification.
This tool has great potential for facilitating treatment planning with the assistance of an automatic contour evaluation module in avoiding unnecessary manual verification for physicians/dosimetrists. In addition, its nature as a compact and stand-alone tool allows for future extensibility to include additional functions for physicians’ clinical needs.
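The abstract above does not describe how the PCA-based normality check is computed. One common pattern, sketched here purely as an assumption, is to learn the main direction of variation from approved examples and then flag a new contour whose geometric descriptor lies far from that direction. The two-number descriptors, the threshold, and the use of a single principal component found by power iteration are all hypothetical choices for illustration. The sketch is written in Python rather than the C#/Accord.Net stack named in the abstract.

```python
import math

def fit_first_component(rows, iterations=50):
    """Return (mean, unit first principal component) via power iteration."""
    n = len(rows)
    mu = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, mu)] for row in rows]
    v = [1.0] * len(mu)
    for _ in range(iterations):
        # Multiply v by the (unnormalized) covariance: sum_i (c_i . v) c_i
        w = [0.0] * len(mu)
        for c in centered:
            s = sum(ci * vi for ci, vi in zip(c, v))
            w = [wi + s * ci for wi, ci in zip(w, c)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    return mu, v

def residual(x, mu, v):
    """Distance from x to the line through mu along the principal direction."""
    d = [xi - m for xi, m in zip(x, mu)]
    s = sum(di * vi for di, vi in zip(d, v))
    r = [di - s * vi for di, vi in zip(d, v)]
    return math.sqrt(sum(ri * ri for ri in r))

# Hypothetical descriptors of approved contours (e.g. two size measures).
approved = [[10.0, 20.0], [12.0, 24.0], [14.0, 28.0], [16.0, 32.0]]
mu, v = fit_first_component(approved)

THRESHOLD = 1.0  # assumed cut-off; a real tool would calibrate this
is_abnormal = residual([13.0, 20.0], mu, v) > THRESHOLD
```

Refitting `fit_first_component` whenever new approved contours arrive would be one simple way to realize the self-updating behavior the abstract mentions.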
Intrinsic Bayesian Active Contours for Extraction of Object Boundaries in Images

PubMed Central

Srivastava, Anuj

2010-01-01

We present a framework for incorporating prior information about high-probability shapes in the process of contour extraction and object recognition in images. Here one studies shapes as elements of an infinite-dimensional, non-linear quotient space, and statistics of shapes are defined and computed intrinsically using differential geometry of this shape space. Prior models on shapes are constructed using probability distributions on tangent bundles of shape spaces. Similar to the past work on active contours, where curves are driven by vector fields based on image gradients and roughness penalties, we incorporate the prior shape knowledge in the form of vector fields on curves. Through experimental results, we demonstrate the use of prior shape models in the estimation of object boundaries, and their success in handling partial obscuration and missing data. Furthermore, we describe the use of this framework in shape-based object recognition or classification. PMID:21076692
Vertex Space Analysis for Model-Based Target Recognition.

DTIC Science & Technology

1996-08-01

performed in our unique invariant representation, Vertex Space, that reduces both the dimensionality and size of the required search space. Vertex Space ... mapping results in a reduced representation that serves as a characteristic target signature which is invariant to four of the six viewing geometry
Microcomputers: Independence and Information Access for the Physically Handicapped.

ERIC Educational Resources Information Center

Regen, Shari S.; Chen, Ching-chih

1984-01-01

Provides overview of recent technological developments in microcomputer technology for the physically disabled, including discussion of view expansion, "talking terminals," voice recognition, and price and convenience of micro-based products. Equipment manufacturers and training centers for the physically disabled are listed and microcomputer…
Visual agnosia and focal brain injury.

PubMed

Martinaud, O

Visual agnosia encompasses all disorders of visual recognition within a selective visual modality not due to an impairment of elementary visual processing or other cognitive deficit. Based on a sequential dichotomy between the perceptual and memory systems, two different categories of visual object agnosia are usually considered: 'apperceptive agnosia' and 'associative agnosia'. Impaired visual recognition within a single category of stimuli is also reported in: (i) visual object agnosia of the ventral pathway, such as prosopagnosia (for faces), pure alexia (for words), or topographagnosia (for landmarks); (ii) visual spatial agnosia of the dorsal pathway, such as cerebral akinetopsia (for movement), or orientation agnosia (for the placement of objects in space). Focal brain injuries provide a unique opportunity to better understand regional brain function, particularly with the use of effective statistical approaches such as voxel-based lesion-symptom mapping (VLSM). The aim of the present work was twofold: (i) to review the various agnosia categories according to the traditional visual dual-pathway model; and (ii) to better assess the anatomical network underlying visual recognition through lesion-mapping studies correlating neuroanatomical and clinical outcomes. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Dissociation of rapid response learning and facilitation in perceptual and conceptual networks of person recognition.

PubMed

Valt, Christian; Klein, Christoph; Boehm, Stephan G

2015-08-01

Repetition priming is a prominent example of non-declarative memory, and it increases the accuracy and speed of responses to repeatedly processed stimuli. Major long-held memory theories posit that repetition priming results from facilitation within perceptual and conceptual networks for stimulus recognition and categorization. Stimuli can also be bound to particular responses, and it has recently been suggested that this rapid response learning, not network facilitation, provides a sound theory of priming of object recognition. Here, we addressed the relevance of network facilitation and rapid response learning for priming of person recognition with a view to advance general theories of priming. In four experiments, participants performed conceptual decisions like occupation or nationality judgments for famous faces. The magnitude of rapid response learning varied across experiments, and rapid response learning co-occurred and interacted with facilitation in perceptual and conceptual networks. These findings indicate that rapid response learning and facilitation in perceptual and conceptual networks are complementary rather than competing theories of priming. Thus, future memory theories need to incorporate both rapid response learning and network facilitation as individual facets of priming. © 2014 The British Psychological Society.
3D automatic anatomy recognition based on iterative graph-cut-ASM

NASA Astrophysics Data System (ADS)

Chen, Xinjian; Udupa, Jayaram K.; Bagci, Ulas; Alavi, Abass; Torigian, Drew A.

2010-02-01

We call the computerized assistive process of recognizing, delineating, and quantifying organs and tissue regions in medical imaging, occurring automatically during clinical image interpretation, automatic anatomy recognition (AAR). The AAR system we are developing includes five main parts: model building, object recognition, object delineation, pathology detection, and organ system quantification. In this paper, we focus on the delineation part. For the modeling part, we employ the active shape model (ASM) strategy. For recognition and delineation, we integrate several hybrid strategies of combining purely image based methods with ASM. In this paper, an iterative Graph-Cut ASM (IGCASM) method is proposed for object delineation. An algorithm called GC-ASM was presented at this symposium last year for object delineation in 2D images which attempted to combine synergistically ASM and GC. Here, we extend this method to 3D medical image delineation. The IGCASM method effectively combines the rich statistical shape information embodied in ASM with the globally optimal delineation capability of the GC method. We propose a new GC cost function, which effectively integrates the specific image information with the ASM shape model information. The proposed methods are tested on a clinical abdominal CT data set. The preliminary results show that: (a) it is feasible to explicitly bring prior 3D statistical shape information into the GC framework; (b) the 3D IGCASM delineation method improves on ASM and GC and can provide practical operational time on clinical images.

Object and event recognition for stroke rehabilitation

NASA Astrophysics Data System (ADS)

Ghali, Ahmed; Cunningham, Andrew S.; Pridmore, Tony P.

2003-06-01

Stroke is a major cause of disability and health care expenditure around the world. Existing stroke rehabilitation methods can be effective but are costly and need to be improved. Even modest improvements in the effectiveness of rehabilitation techniques could produce large benefits in terms of quality of life. The work reported here is part of an ongoing effort to integrate virtual reality and machine vision technologies to produce innovative stroke rehabilitation methods. We describe a combined object recognition and event detection system that provides real time feedback to stroke patients performing everyday kitchen tasks necessary for independent living, e.g. making a cup of coffee. The image plane position of each object, including the patient's hand, is monitored using histogram-based recognition methods. The relative positions of hand and objects are then reported to a task monitor that compares the patient's actions against a model of the target task. A prototype system has been constructed and is currently undergoing technical and clinical evaluation.
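The abstract above mentions histogram-based recognition without naming a specific method. As one plausible illustration, the sketch below uses color histogram intersection: each object model is a normalized histogram of quantized colors, and an image patch is assigned to the model with the largest overlap. The object names, pixel values, and bin count are hypothetical, and the actual system may use a different histogram technique.

```python
from collections import Counter

def color_histogram(pixels, bins_per_channel=4):
    """Normalized histogram over quantized (r, g, b) bins."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_: n / total for bin_, n in counts.items()}

def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint."""
    return sum(min(p, h2.get(bin_, 0.0)) for bin_, p in h1.items())

def recognize(patch_pixels, models):
    """Return the name of the model histogram that best matches the patch."""
    patch_hist = color_histogram(patch_pixels)
    return max(models, key=lambda name: intersection(patch_hist, models[name]))

# Hypothetical object models built from example pixels.
models = {
    "mug": color_histogram([(200, 30, 30)] * 8 + [(250, 250, 250)] * 2),
    "kettle": color_histogram([(30, 30, 200)] * 9 + [(10, 10, 10)] * 1),
}
patch = [(210, 40, 35)] * 7 + [(245, 245, 245)] * 3
best = recognize(patch, models)
```

Running such a match over candidate image regions in every frame would yield an image-plane position for each object, which could then be passed to a task monitor like the one the abstract describes.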
A biologically plausible computational model for auditory object recognition.

PubMed

Larson, Eric; Billimoria, Cyrus P; Sen, Kamal

2009-01-01

Object recognition is a task of fundamental importance for sensory systems. Although this problem has been intensively investigated in the visual system, relatively little is known about the recognition of complex auditory objects. Recent work has shown that spike trains from individual sensory neurons can be used to discriminate between and recognize stimuli. Multiple groups have developed spike similarity or dissimilarity metrics to quantify the differences between spike trains. Using a nearest-neighbor approach the spike similarity metrics can be used to classify the stimuli into groups used to evoke the spike trains. The nearest prototype spike train to the tested spike train can then be used to identify the stimulus. However, how biological circuits might perform such computations remains unclear. Elucidating this question would facilitate the experimental search for such circuits in biological systems, as well as the design of artificial circuits that can perform such computations. Here we present a biologically plausible model for discrimination inspired by a spike distance metric using a network of integrate-and-fire model neurons coupled to a decision network. We then apply this model to the birdsong system in the context of song discrimination and recognition. We show that the model circuit is effective at recognizing individual songs, based on experimental input data from field L, the avian primary auditory cortex analog. We also compare the performance and robustness of this model to two alternative models of song discrimination: a model based on coincidence detection and a model based on firing rate.
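The nearest-prototype classification described above can be sketched with a concrete spike distance. The example below assumes a van Rossum-style distance, in which each spike is replaced by a decaying exponential with time constant tau and the squared difference of the filtered trains is integrated; the closed form used here follows from that definition. The choice of metric, the value of tau, and the spike times are assumptions for illustration. The sketch does not reproduce the integrate-and-fire network proposed in the paper.

```python
import math

def van_rossum_sq(train_a, train_b, tau=0.01):
    """Squared van Rossum-style distance between two spike trains (seconds).

    Closed form for exponential kernels:
    D^2 = 0.5 * (S_aa + S_bb) - S_ab, with S_xy = sum exp(-|x - y| / tau).
    """
    def kernel_sum(xs, ys):
        return sum(math.exp(-abs(x - y) / tau) for x in xs for y in ys)

    return (0.5 * (kernel_sum(train_a, train_a) + kernel_sum(train_b, train_b))
            - kernel_sum(train_a, train_b))

def classify(test_train, prototypes, tau=0.01):
    """Nearest-prototype rule: stimulus whose spike train is closest."""
    return min(prototypes,
               key=lambda name: van_rossum_sq(test_train, prototypes[name], tau))

# Hypothetical prototype spike times (seconds) evoked by two songs.
prototypes = {
    "song_A": [0.010, 0.050, 0.120],
    "song_B": [0.030, 0.080, 0.200],
}
test_train = [0.012, 0.049, 0.123]
predicted = classify(test_train, prototypes)
```

A small tau makes the distance sensitive to precise spike timing, while a large tau makes it behave more like a comparison of spike counts. This is the kind of trade-off that timing-based and rate-based models differ on.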
A high-fat high-sugar diet-induced impairment in place-recognition memory is reversible and training-dependent.

PubMed

Tran, Dominic M D; Westbrook, R Frederick

2017-03-01

A high-fat high-sugar (HFHS) diet is associated with cognitive deficits in people and produces spatial learning and memory deficits in rodents. Notably, such diets rapidly impair place-, but not object-recognition memory in rats within one week of exposure. Three experiments examined whether this impairment was reversed by removal of the diet, or prevented by pre-diet training. Experiment 1 showed that rats switched from HFHS to chow recovered from the place-recognition impairment that they displayed while on HFHS. Experiment 2 showed that control rats ("Untrained") who were exposed to an empty testing arena while on chow, were impaired in place-recognition when switched to HFHS and tested for the first time. However, rats tested ("Trained") on the place and object task while on chow, were protected from the diet-induced deficit and maintained good place-recognition when switched to HFHS. Experiment 3 examined the conditions of this protection effect by training rats in a square arena while on chow, and testing them in a rectangular arena while on HFHS. We have previously demonstrated that chow rats, but not HFHS rats, show geometry-based reorientation on a rectangular arena place-recognition task (Tran & Westbrook, 2015). Experiment 3 assessed whether rats switched to the HFHS diet after training on the place and object tasks in a square arena, would show geometry-based reorientation in a rectangular arena. The protective benefit of training was replicated in the square arena, but both Untrained and Trained HFHS failed to show geometry-based reorientation in the rectangular arena. These findings are discussed in relation to the specificity of the training effect, the role of the hippocampus in diet-induced deficits, and their implications for dietary effects on cognition in people. Copyright © 2016 Elsevier Ltd. All rights reserved.
Semantic Image Based Geolocation Given a Map (Author’s Initial Manuscript)

DTIC Science & Technology

2016-09-01

novel technique for detection and identification of building facades from geo-tagged reference view using the map and geometry of the building facades. We...2D map of the environment, and geometry of building facades. We evaluate our approach for building identification and geo-localization on a new...location recognition and building identification is done by matching the query view to a reference set, followed by estimation of 3D building facades
How does the brain solve visual object recognition?

PubMed Central

Zoccolan, Davide; Rust, Nicole C.

2012-01-01

Mounting evidence suggests that “core object recognition,” the ability to rapidly recognize objects despite substantial appearance variation, is solved in the brain via a cascade of reflexive, largely feedforward computations that culminate in a powerful neuronal representation in the inferior temporal cortex. However, the algorithm that produces this solution remains little-understood. Here we review evidence ranging from individual neurons, to neuronal populations, to behavior, to computational models. We propose that understanding this algorithm will require using neuronal and psychophysical data to sift through many computational models, each based on building blocks of small, canonical sub-networks with a common functional goal. PMID:22325196
Infant visual attention and object recognition.

PubMed

Reynolds, Greg D

2015-05-15

This paper explores the role visual attention plays in the recognition of objects in infancy. Research and theory on the development of infant attention and recognition memory are reviewed in three major sections. The first section reviews some of the major findings and theory emerging from a rich tradition of behavioral research utilizing preferential looking tasks to examine visual attention and recognition memory in infancy. The second section examines research utilizing neural measures of attention and object recognition in infancy as well as research on brain-behavior relations in the early development of attention and recognition memory. The third section addresses potential areas of the brain involved in infant object recognition and visual attention. An integrated synthesis of some of the existing models of the development of visual attention is presented which may account for the observed changes in behavioral and neural measures of visual attention and object recognition that occur across infancy. Copyright © 2015 Elsevier B.V. All rights reserved.
How Artists Working in Academia View Artistic Practice as Research: Implications for Tertiary Music Education

ERIC Educational Resources Information Center

Blom, Diana; Bennett, Dawn; Wright, David

2011-01-01

Artistic research output struggles for recognition as "legitimate" research within the highly-competitive and often traditional university sector. Often recognition requires the underpinning processes and thinking to be documented in a traditional written format. This article discusses the views of eight arts practitioners working in…
Exploring the association between visual perception abilities and reading of musical notation.

PubMed

Lee, Horng-Yih

2012-06-01

In the reading of music, the acquisition of pitch information depends primarily upon the spatial position of notes as well as upon an individual's spatial processing ability. This study investigated the relationship between the ability to read single notes and visual-spatial ability. Participants with high and low single-note reading abilities were differentiated based upon differences in musical notation-reading abilities and their spatial processing; object recognition abilities were then assessed. It was found that the group with lower note-reading abilities made more errors than did the group with higher note-reading abilities in the mental rotation task. In contrast, there was no apparent significant difference between the two groups in the object recognition task. These results suggest that note-reading may be related to visual spatial processing abilities, and not to an individual's ability with object recognition.
Distinct roles of basal forebrain cholinergic neurons in spatial and object recognition memory.

PubMed

Okada, Kana; Nishizawa, Kayo; Kobayashi, Tomoko; Sakata, Shogo; Kobayashi, Kazuto

2015-08-06

Recognition memory requires processing of various types of information such as objects and locations. Impairment in recognition memory is a prominent feature of amnesia and a symptom of Alzheimer's disease (AD). Basal forebrain cholinergic neurons contain two major groups, one localized in the medial septum (MS)/vertical diagonal band of Broca (vDB), and the other in the nucleus basalis magnocellularis (NBM). The roles of these cell groups in recognition memory have been debated, and it remains unclear how they contribute to it. We use a genetic cell targeting technique to selectively eliminate cholinergic cell groups and then test spatial and object recognition memory through different behavioural tasks. Eliminating MS/vDB neurons impairs spatial but not object recognition memory in the reference and working memory tasks, whereas NBM elimination undermines only object recognition memory in the working memory task. These impairments are restored by treatment with acetylcholinesterase inhibitors, anti-dementia drugs for AD. Our results highlight that MS/vDB and NBM cholinergic neurons are not only implicated in recognition memory but also have essential roles in different types of recognition memory.
Data-centric method for object observation through scattering media

NASA Astrophysics Data System (ADS)

Tanida, Jun; Horisaki, Ryoichi

2018-03-01

A data-centric method is introduced for object observation through scattering media. A large number of training pairs are used to characterize the relation between the object and the observation signals based on machine learning. Using the method object information can be retrieved even from strongly-disturbed signals. As potential applications, object recognition, imaging, and focusing through scattering media were demonstrated.
Dentate gyrus supports slope recognition memory, shades of grey-context pattern separation and recognition memory, and CA3 supports pattern completion for object memory.

PubMed

Kesner, Raymond P; Kirk, Ryan A; Yu, Zhenghui; Polansky, Caitlin; Musso, Nick D

2016-03-01

In order to examine the role of the dorsal dentate gyrus (dDG) in slope (vertical space) recognition and possible pattern separation, various slope (vertical space) degrees were used in a novel exploratory paradigm to measure novelty detection for changes in slope (vertical space) recognition memory and slope memory pattern separation in Experiment 1. The results of the experiment indicate that control rats displayed a slope recognition memory function with a pattern separation process for slope memory that is dependent upon the magnitude of change in slope between study and test phases. In contrast, the dDG lesioned rats displayed an impairment in slope recognition memory, though because there was no significant interaction between the two groups and slope memory, a reliable pattern separation impairment for slope could not be firmly established in the DG lesioned rats. In Experiment 2, in order to determine whether the dDG plays a role in shades of grey spatial context recognition and possible pattern separation, shades of grey were used in a novel exploratory paradigm to measure novelty detection for changes in the shades of grey context environment. The results of the experiment indicate that control rats displayed a shades of grey-context pattern separation effect across levels of separation of context (shades of grey). In contrast, the DG lesioned rats displayed a significant interaction between the two groups and levels of shades of grey suggesting impairment in a pattern separation function for levels of shades of grey. In Experiment 3, in order to determine whether the dorsal CA3 (dCA3) plays a role in object pattern completion, a new task requiring less training and using a choice that was based on choosing the correct set of objects on a two-choice discrimination task was used. The results indicated that control rats displayed a pattern completion function based on the availability of one, two, three or four cues. In contrast, the dCA3 lesioned rats displayed a significant interaction between the two groups and the number of available objects suggesting impairment in a pattern completion function for object cues. Copyright © 2015 Elsevier Inc. All rights reserved.
View subspaces for indexing and retrieval of 3D models

NASA Astrophysics Data System (ADS)

Dutagaci, Helin; Godil, Afzal; Sankur, Bülent; Yemez, Yücel

2010-02-01

View-based indexing schemes for 3D object retrieval are gaining popularity since they provide good retrieval results. These schemes are coherent with the theory that humans recognize objects based on their 2D appearances. The view-based techniques also allow users to search with various queries such as binary images, range images and even 2D sketches. The previous view-based techniques use classical 2D shape descriptors such as Fourier invariants, Zernike moments, Scale Invariant Feature Transform-based local features and 2D Digital Fourier Transform coefficients. These methods describe each object independent of others. In this work, we explore data driven subspace models, such as Principal Component Analysis, Independent Component Analysis and Nonnegative Matrix Factorization to describe the shape information of the views. We treat the depth images obtained from various points of the view sphere as 2D intensity images and train a subspace to extract the inherent structure of the views within a database. We also show the benefit of categorizing shapes according to their eigenvalue spread. Both the shape categorization and data-driven feature set conjectures are tested on the PSB database and compared with the competitor view-based 3D shape retrieval algorithms.
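Editorial illustration (not from the record above): the idea of training a subspace on flattened view images and describing each view by its coordinates in that subspace can be sketched with Principal Component Analysis. This minimal version finds only the first principal component by power iteration, using the Python standard library; the function names and the toy two-pixel "images" are hypothetical, and a practical system would likely use a linear algebra library and keep many components.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (mean, unit direction of largest variance) for equal-length rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - mean[k] for k in range(d)] for r in rows]
    v = [1.0 / math.sqrt(d)] * d  # fixed start vector keeps the result deterministic
    for _ in range(iterations):
        # w = X^T (X v), i.e. the covariance matrix applied to v up to a 1/n factor.
        proj = [sum(x[k] * v[k] for k in range(d)) for x in centered]
        w = [sum(proj[i] * centered[i][k] for i in range(n)) for k in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:  # no variance along the current direction
            break
        v = [c / norm for c in w]
    return mean, v


def project(row, mean, v):
    """Coordinate of one flattened view along the learned direction."""
    return sum((row[k] - mean[k]) * v[k] for k in range(len(v)))


# Toy "views": each row stands in for a flattened depth image with two pixels.
views = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
mean, component = first_principal_component(views)
feature = project([3.0, 6.0], mean, component)
```

Retrieval would then compare such projected coordinates between a query view and the database views; the distance measure and number of components are design choices the abstract does not specify.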
Effects of long-term voluntary exercise on learning and memory processes: dependency of the task and level of exercise.

PubMed

García-Capdevila, Sílvia; Portell-Cortés, Isabel; Torras-Garcia, Meritxell; Coll-Andreu, Margalida; Costa-Miserachs, David

2009-09-14

The effect of long-term voluntary exercise (running wheel) on anxiety-like behaviour (plus maze and open field) and learning and memory processes (object recognition and two-way active avoidance) was examined on Wistar rats. Because major individual differences in running wheel behaviour were observed, the data were analysed considering the exercising animals both as a whole and grouped according to the time spent in the running wheel (low, high, and very-high running). Although some variables related to anxiety-like behaviour seem to reflect an anxiogenic compatible effect, the view of the complete set of variables could be interpreted as an enhancement of defensive and risk assessment behaviours in exercised animals, without major differences depending on the exercise level. Effects on learning and memory processes were dependent on task and level of exercise. Two-way avoidance was not affected either in the acquisition or in the retention session, while the retention of object recognition task was affected. In this latter task, an enhancement in low running subjects and impairment in high and very-high running animals were observed.
Measuring the Speed of Newborn Object Recognition in Controlled Visual Worlds

ERIC Educational Resources Information Center

Wood, Justin N.; Wood, Samantha M. W.

2017-01-01

How long does it take for a newborn to recognize an object? Adults can recognize objects rapidly, but measuring object recognition speed in newborns has not previously been possible. Here we introduce an automated controlled-rearing method for measuring the speed of newborn object recognition in controlled visual worlds. We raised newborn chicks…
Clinical Views: Object-Oriented Views for Clinical Databases

PubMed Central

Portoni, Luisa; Combi, Carlo; Pinciroli, Francesco

1998-01-01

We present here a prototype of a clinical information system for the archiving and the management of multimedia and temporally-oriented clinical data related to PTCA patients. The system is based on an object-oriented DBMS and supports multiple views and view schemas on patients' data. Remote data access is supported too.
Deletion of the GluA1 AMPA receptor subunit impairs recency-dependent object recognition memory

PubMed Central

Sanderson, David J.; Hindley, Emma; Smeaton, Emily; Denny, Nick; Taylor, Amy; Barkus, Chris; Sprengel, Rolf; Seeburg, Peter H.; Bannerman, David M.

2011-01-01

Deletion of the GluA1 AMPA receptor subunit impairs short-term spatial recognition memory. It has been suggested that short-term recognition depends upon memory caused by the recent presentation of a stimulus that is independent of contextual–retrieval processes. The aim of the present set of experiments was to test whether the role of GluA1 extends to nonspatial recognition memory. Wild-type and GluA1 knockout mice were tested on the standard object recognition task and a context-independent recognition task that required recency-dependent memory. In a first set of experiments it was found that GluA1 deletion failed to impair performance on either of the object recognition or recency-dependent tasks. However, GluA1 knockout mice displayed increased levels of exploration of the objects in both the sample and test phases compared to controls. In contrast, when the time that GluA1 knockout mice spent exploring the objects was yoked to control mice during the sample phase, it was found that GluA1 deletion now impaired performance on both the object recognition and the recency-dependent tasks. GluA1 deletion failed to impair performance on a context-dependent recognition task regardless of whether object exposure in knockout mice was yoked to controls or not. These results demonstrate that GluA1 is necessary for nonspatial as well as spatial recognition memory and plays an important role in recency-dependent memory processes. PMID:21378100
Wavelet-based study of valence-arousal model of emotions on EEG signals with LabVIEW.

PubMed

Guzel Aydin, Seda; Kaya, Turgay; Guler, Hasan

2016-06-01

This paper illustrates the wavelet-based feature extraction for emotion assessment using electroencephalogram (EEG) signal through graphical coding design. Two-dimensional (valence-arousal) emotion model was studied. Different emotions (happy, joy, melancholy, and disgust) were studied for assessment. These emotions were stimulated by video clips. EEG signals obtained from four subjects were decomposed into five frequency bands (gamma, beta, alpha, theta, and delta) using "db5" wavelet function. Relative features were calculated to obtain further information. Impact of the emotions according to valence value was observed to be optimal on power spectral density of gamma band. The main objective of this work is not only to investigate the influence of the emotions on different frequency bands but also to overcome the difficulties in the text-based program. This work offers an alternative approach for emotion evaluation through EEG processing. There are a number of methods for emotion recognition such as wavelet transform-based, Fourier transform-based, and Hilbert-Huang transform-based methods. However, the majority of these methods have been applied with the text-based programming languages. In this study, we proposed and implemented an experimental feature extraction with graphics-based language, which provides great convenience in bioelectrical signal processing.
Activity-based exploitation of Full Motion Video (FMV)

NASA Astrophysics Data System (ADS)

Kant, Shashi

2012-06-01

Video has been a game-changer in how US forces are able to find, track and defeat their adversaries. With millions of minutes of video being generated from an increasing number of sensor platforms, the DOD has stated that the rapid increase in video is overwhelming their analysts. The manpower required to view and garner useable information from the flood of video is unaffordable, especially in light of current fiscal restraints. "Search" within full-motion video has traditionally relied on human tagging of content, and video metadata, to provision filtering and locate segments of interest, in the context of analyst query. Our approach uses novel machine-vision techniques to index FMV through object recognition and tracking and through event and activity detection. This approach enables FMV exploitation in real-time, as well as a forensic look-back within archives. This approach can help get the most information out of video sensor collection, help focus the attention of overburdened analysts, form connections in activity over time, and conserve national fiscal resources in exploiting FMV.
What is a new drug worth? An innovative model for performance-based pricing.

PubMed

Dranitsaris, G; Dorward, K; Owens, R C; Schipper, H

2015-05-01

This article focuses on a novel method to derive prices for new pharmaceuticals by making price a function of drug performance. We briefly review current models for determining price for a new product and discuss alternatives that have historically been favoured by various funding bodies. The progressive approach to drug pricing, proposed herein, may better address the views and concerns of multiple stakeholders in a developed healthcare system by acknowledging and incorporating input from disparate parties via comprehensive and successive negotiation stages. In proposing a valid construct for performance-based pricing, the following model seeks to achieve several crucial objectives: earlier and wider access to new treatments; improved transparency in drug pricing; multi-stakeholder involvement through phased pricing negotiations; recognition of innovative product performance and latent changes in value; an earlier and more predictable return for developers without sacrificing total return on investment (ROI); more involved and informed risk sharing by the end-user. © 2014 John Wiley & Sons Ltd.
Noisy Ocular Recognition Based on Three Convolutional Neural Networks.

PubMed

Lee, Min Beom; Hong, Hyung Gil; Park, Kang Ryoung

2017-12-17

In recent years, the iris recognition system has been gaining increasing acceptance for applications such as access control and smartphone security. When the images of the iris are obtained under unconstrained conditions, an issue of undermined quality is caused by optical and motion blur, off-angle view (the user's eyes looking somewhere else, not into the front of the camera), specular reflection (SR) and other factors. Such noisy iris images increase intra-individual variations and, as a result, reduce the accuracy of iris recognition. A typical iris recognition system requires a near-infrared (NIR) illuminator along with an NIR camera, which are larger and more expensive than fingerprint recognition equipment. Hence, many studies have proposed methods of using iris images captured by a visible light camera without the need for an additional illuminator. In this research, we propose a new recognition method for noisy iris and ocular images by using one iris and two periocular regions, based on three convolutional neural networks (CNNs). Experiments were conducted by using the noisy iris challenge evaluation-part II (NICE.II) training dataset (selected from the university of Beira iris (UBIRIS).v2 database), mobile iris challenge evaluation (MICHE) database, and institute of automation of Chinese academy of sciences (CASIA)-Iris-Distance database. As a result, the method proposed by this study outperformed previous methods.

Three-dimensional obstacle classification in laser range data

NASA Astrophysics Data System (ADS)

Armbruster, Walter; Bers, Karl-Heinz

1998-10-01

The threat of hostile surveillance and weapon systems requires military aircraft to fly under extreme conditions such as low altitude, high speed, poor visibility and incomplete terrain information. The probability of collision with natural and man-made obstacles during such contour missions is high if detection capability is restricted to conventional vision aids. Forward-looking scanning laser rangefinders, which are presently being flight tested and evaluated at German proving grounds, provide a possible solution, having a large field of view, high angular and range resolution, a high pulse repetition rate, and sufficient pulse energy to register returns from wires at over 500 m range (depends on the system) with a high hit-and-detect probability. Despite the efficiency of the sensor, acceptance of current obstacle warning systems by test pilots is not very high, mainly due to the systems' inadequacies in obstacle recognition and visualization. This has motivated the development and the testing of more advanced 3d-scene analysis algorithms at FGAN-FIM to replace the obstacle recognition component of current warning systems. The basic ideas are to increase the recognition probability and to reduce the false alarm rate for hard-to-extract obstacles such as wires, by using more readily recognizable objects such as terrain, poles, pylons, trees, etc. and by implementing a hierarchical classification procedure to generate a parametric description of the terrain surface as well as the class, position, orientation, size and shape of all objects in the scene. The algorithms can be used for other applications such as terrain following, autonomous obstacle avoidance, and automatic target recognition.
Recognizing familiar objects by hand and foot: Haptic shape perception generalizes to inputs from unusual locations and untrained body parts.

PubMed

Lawson, Rebecca

2014-02-01

The limits of generalization of our 3-D shape recognition system to identifying objects by touch were investigated by testing exploration at unusual locations and using untrained effectors. In Experiments 1 and 2, people found identification by hand of real objects, plastic 3-D models of objects, and raised line drawings placed in front of themselves no easier than when exploration was behind their back. Experiment 3 compared one-handed, two-handed, one-footed, and two-footed haptic object recognition of familiar objects. Recognition by foot was slower (7 vs. 13 s) and much less accurate (9 % vs. 47 % errors) than recognition by either one or both hands. Nevertheless, item difficulty was similar across hand and foot exploration, and there was a strong correlation between an individual's hand and foot performance. Furthermore, foot recognition was better with the largest 20 of the 80 items (32 % errors), suggesting that physical limitations hampered exploration by foot. Thus, object recognition by hand generalized efficiently across the spatial location of stimuli, while object recognition by foot seemed surprisingly good given that no prior training was provided. Active touch (haptics) thus efficiently extracts 3-D shape information and accesses stored representations of familiar objects from novel modes of input.
Real-time unconstrained object recognition: a processing pipeline based on the mammalian visual system.

PubMed

Aguilar, Mario; Peot, Mark A; Zhou, Jiangying; Simons, Stephen; Liao, Yuwei; Metwalli, Nader; Anderson, Mark B

2012-03-01

The mammalian visual system is still the gold standard for recognition accuracy, flexibility, efficiency, and speed. Ongoing advances in our understanding of function and mechanisms in the visual system can now be leveraged to pursue the design of computer vision architectures that will revolutionize the state of the art in computer vision.
Insular Cortex Is Involved in Consolidation of Object Recognition Memory

ERIC Educational Resources Information Center

Bermudez-Rattoni, Federico; Okuda, Shoki; Roozendaal, Benno; McGaugh, James L.

2005-01-01

Extensive evidence indicates that the insular cortex (IC), also termed gustatory cortex, is critically involved in conditioned taste aversion and taste recognition memory. Although most studies of the involvement of the IC in memory have investigated taste, there is some evidence that the IC is involved in memory that is not based on taste. In…
SU-D-201-05: On the Automatic Recognition of Patient Safety Hazards in a Radiotherapy Setup Using a Novel 3D Camera System and a Deep Learning Framework

DOE Office of Scientific and Technical Information (OSTI.GOV)

Santhanam, A; Min, Y; Beron, P

Purpose: Patient safety hazards such as a wrong patient/site getting treated can lead to catastrophic results. The purpose of this project is to automatically detect potential patient safety hazards during the radiotherapy setup and alert the therapist before the treatment is initiated. Methods: We employed a set of co-located and co-registered 3D cameras placed inside the treatment room. Each camera provided a point-cloud of fraxels (fragment pixels with 3D depth information). Each of the cameras was calibrated using a custom-built calibration target to provide 3D information with less than 2 mm error in the 500 mm neighborhood around the isocenter. To identify potential patient safety hazards, the treatment room components and the patient’s body needed to be identified and tracked in real-time. For feature recognition purposes, we used a graph-cut based feature recognition with principal component analysis (PCA) based feature-to-object correlation to segment the objects in real-time. Changes in the object’s position were tracked using the CamShift algorithm. The 3D object information was then stored for each classified object (e.g. gantry, couch). A deep learning framework was then used to analyze all the classified objects in both 2D and 3D and was then used to fine-tune a convolutional network for object recognition. The number of network layers was optimized to identify the tracked objects with >95% accuracy. Results: Our systematic analyses showed that the system was effectively able to recognize wrong patient setups and wrong patient accessories. The combined usage of 2D camera information (color + depth) enabled a topology-preserving approach to verify patient safety hazards in an automatic manner and even in scenarios where the depth information is partially available. Conclusion: By utilizing the 3D cameras inside the treatment room and a deep learning based image classification, potential patient safety hazards can be effectively avoided.
Dopamine D1 receptor stimulation modulates the formation and retrieval of novel object recognition memory: Role of the prelimbic cortex

PubMed Central

Pezze, Marie A.; Marshall, Hayley J.; Fone, Kevin C.F.; Cassaday, Helen J.

2015-01-01

Previous studies have shown that dopamine D1 receptor antagonists impair novel object recognition memory but the effects of dopamine D1 receptor stimulation remain to be determined. This study investigated the effects of the selective dopamine D1 receptor agonist SKF81297 on acquisition and retrieval in the novel object recognition task in male Wistar rats. SKF81297 (0.4 and 0.8 mg/kg s.c.) given 15 min before the sampling phase impaired novel object recognition evaluated 10 min or 24 h later. The same treatments also reduced novel object recognition memory tested 24 h after the sampling phase and when given 15 min before the choice session. These data indicate that D1 receptor stimulation modulates both the encoding and retrieval of object recognition memory. Microinfusion of SKF81297 (0.025 or 0.05 μg/side) into the prelimbic sub-region of the medial prefrontal cortex (mPFC) in this case 10 min before the sampling phase also impaired novel object recognition memory, suggesting that the mPFC is one important site mediating the effects of D1 receptor stimulation on visual recognition memory. PMID:26277743
Towards evidence-based, quality-controlled health promotion: the Dutch recognition system for health promotion interventions

PubMed Central

Brug, Johannes; van Dale, Djoeke; Lanting, Loes; Kremers, Stef; Veenhof, Cindy; Leurs, Mariken; van Yperen, Tom; Kok, Gerjo

2010-01-01

Registration or recognition systems for best-practice health promotion interventions may contribute to better quality assurance and control in health promotion practice. In the Netherlands, such a system has been developed and is being implemented aiming to provide policy makers and professionals with more information on the quality and effectiveness of available health promotion interventions and to promote use of good-practice and evidence-based interventions by health promotion organizations. The quality assessments are supervised by the Netherlands Organization for Public Health and the Environment and the Netherlands Youth Institute and conducted by two committees, one for interventions aimed at youth and one for adults. These committees consist of experts in the fields of research, policy and practice. Four levels of recognition are distinguished inspired by the UK Medical Research Council's evaluation framework for complex interventions to improve health: (i) theoretically sound, (ii) probable effectiveness, (iii) established effectiveness, and (iv) established cost effectiveness. Specific criteria have been set for each level of recognition, except for Level 4 which will be included from 2011. This point of view article describes and discusses the rationale, organization and criteria of this Dutch recognition system and the first experiences with the system. PMID:20841318
Case-Based Learning in Athletic Training

ERIC Educational Resources Information Center

Berry, David C.

2013-01-01

The National Athletic Trainers' Association (NATA) Executive Committee for Education has emphasized the need for proper recognition and management of orthopaedic and general medical conditions through their support of numerous learning objectives and the clinical integrated proficiencies. These learning objectives and integrated clinical…
Optimization of Visual Information Presentation for Visual Prosthesis.

PubMed

Guo, Fei; Yang, Yuan; Gao, Yong

2018-01-01

Visual prosthesis applying electrical stimulation to restore visual function for the blind has promising prospects. However, due to the low resolution, limited visual field, and the low dynamic range of the visual perception, huge loss of information occurred when presenting daily scenes. The ability of object recognition in real-life scenarios is severely restricted for prosthetic users. To overcome the limitations, optimizing the visual information in the simulated prosthetic vision has been the focus of research. This paper proposes two image processing strategies based on a salient object detection technique. The two processing strategies enable the prosthetic implants to focus on the object of interest and suppress the background clutter. Psychophysical experiments show that techniques such as foreground zooming with background clutter removal and foreground edge detection with background reduction have positive impacts on the task of object recognition in simulated prosthetic vision. By using edge detection and zooming technique, the two processing strategies significantly improve the recognition accuracy of objects. We can conclude that the visual prosthesis using our proposed strategy can assist the blind to improve their ability to recognize objects. The results will provide effective solutions for the further development of visual prosthesis.
Optimization of Visual Information Presentation for Visual Prosthesis

PubMed Central

Gao, Yong

2018-01-01

Visual prosthesis applying electrical stimulation to restore visual function for the blind has promising prospects. However, due to the low resolution, limited visual field, and the low dynamic range of the visual perception, huge loss of information occurred when presenting daily scenes. The ability of object recognition in real-life scenarios is severely restricted for prosthetic users. To overcome the limitations, optimizing the visual information in the simulated prosthetic vision has been the focus of research. This paper proposes two image processing strategies based on a salient object detection technique. The two processing strategies enable the prosthetic implants to focus on the object of interest and suppress the background clutter. Psychophysical experiments show that techniques such as foreground zooming with background clutter removal and foreground edge detection with background reduction have positive impacts on the task of object recognition in simulated prosthetic vision. By using edge detection and zooming technique, the two processing strategies significantly improve the recognition accuracy of objects. We can conclude that the visual prosthesis using our proposed strategy can assist the blind to improve their ability to recognize objects. The results will provide effective solutions for the further development of visual prosthesis. PMID:29731769
Performance improvement of multi-class detection using greedy algorithm for Viola-Jones cascade selection

NASA Astrophysics Data System (ADS)

Tereshin, Alexander A.; Usilin, Sergey A.; Arlazarov, Vladimir V.

2018-04-01

This paper studies the problem of multi-class object detection in a video stream with Viola-Jones cascades. An adaptive algorithm for selecting a Viola-Jones cascade, based on the greedy choice strategy for solving the N-armed bandit problem, is proposed. The efficiency of the algorithm is shown on the problem of detecting and recognizing bank card logos in a video stream. The proposed algorithm can be used effectively in document localization and identification, recognition of road scene elements, localization and tracking of lengthy objects, and other problems of rigid object detection in heterogeneous data flows. The computational efficiency of the algorithm makes it possible to use it both on personal computers and on mobile devices based on processors with low power consumption.
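The abstract does not spell out the selection rule, so the following is only a minimal sketch of a greedy N-armed bandit choice among detectors, not the authors' algorithm. It assumes one arm per cascade and a reward of 1.0 when the chosen cascade fires on the current frame; the names `GreedyCascadeSelector`, `run`, and the `detect` callback are hypothetical.

```python
import random


class GreedyCascadeSelector:
    """Greedy N-armed bandit over detectors: one arm per cascade (illustrative)."""

    def __init__(self, n_cascades, epsilon=0.0, seed=None):
        self.counts = [0] * n_cascades    # how often each cascade was run
        self.values = [0.0] * n_cascades  # running mean reward per cascade
        self.epsilon = epsilon            # 0.0 gives a purely greedy choice
        self.rng = random.Random(seed)

    def select(self):
        # Run every cascade once before trusting the reward estimates.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        # Optional exploration step (assumption: disabled by default).
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        # Greedy step: pick the cascade with the best mean reward so far.
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental mean: Q <- Q + (r - Q) / n
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def run(selector, detect, n_frames):
    """Run one cascade per frame; detect(arm) returns the reward for that frame."""
    picks = []
    for _ in range(n_frames):
        arm = selector.select()
        selector.update(arm, detect(arm))
        picks.append(arm)
    return picks
```

Running a single cascade per frame, rather than all of them, is what would keep the per-frame cost low on a mobile processor; in a real system `detect` would wrap the actual cascade evaluation on the current frame.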
Object representations in ventral and dorsal visual streams: fMRI repetition effects depend on attention and part–whole configuration

PubMed Central

Thoma, Volker; Henson, Richard N.

2011-01-01

The effects of attention and object configuration on the neural responses to short-lag visual image repetition were investigated with fMRI. Attention to one of two object images in a prime display was cued spatially. The images were either intact or split vertically; a manipulation that negates the influence of view-based representations. A subsequent single intact probe image was named covertly. Behavioural priming observed as faster button presses was found for attended primes in both intact and split configurations, but only for uncued primes in the intact configuration. In a voxel-wise analysis, fMRI repetition suppression (RS) was observed in a left mid-fusiform region for attended primes, both intact and split, whilst a right intraparietal region showed repetition enhancement (RE) for intact primes, regardless of attention. In a factorial analysis across regions of interest (ROIs) defined from independent localiser contrasts, RS for attended objects in the ventral stream was significantly left-lateralised, whilst repetition effects in ventral and dorsal ROIs correlated with the amount of priming in specific conditions. These fMRI results extend hybrid theories of object recognition, implicating left ventral stream regions in analytic processing (requiring attention), consistent with prior hypotheses about hemispheric specialisation, and implicating dorsal stream regions in holistic processing (independent of attention). PMID:21554967
Definition and automatic anatomy recognition of lymph node zones in the pelvis on CT images

NASA Astrophysics Data System (ADS)

Liu, Yu; Udupa, Jayaram K.; Odhner, Dewey; Tong, Yubing; Guo, Shuxu; Attor, Rosemary; Reinicke, Danica; Torigian, Drew A.

2016-03-01

Currently, unlike IALSC-defined thoracic lymph node zones, no explicitly provided definitions for lymph nodes in other body regions are available. Yet, definitions are critical for standardizing the recognition, delineation, quantification, and reporting of lymphadenopathy in other body regions. Continuing from our previous work in the thorax, this paper proposes a standardized definition of the grouping of pelvic lymph nodes into 10 zones. We subsequently employ our earlier Automatic Anatomy Recognition (AAR) framework designed for body-wide organ modeling, recognition, and delineation to actually implement these zonal definitions where the zones are treated as anatomic objects. First, all 10 zones and key anatomic organs used as anchors are manually delineated under expert supervision for constructing fuzzy anatomy models of the assembly of organs together with the zones. Then, optimal hierarchical arrangement of these objects is constructed for the purpose of achieving the best zonal recognition. For actual localization of the objects, two strategies are used -- optimal thresholded search for organs and one-shot method for the zones where the known relationship of the zones to key organs is exploited. Based on 50 computed tomography (CT) image data sets for the pelvic body region and an equal division into training and test subsets, automatic zonal localization within 1-3 voxels is achieved.
No childhood development of viewpoint-invariant face recognition: evidence from 8-year-olds and adults.

PubMed

Crookes, Kate; Robbins, Rachel A

2014-10-01

Performance on laboratory face tasks improves across childhood, not reaching adult levels until adolescence. Debate surrounds the source of this development, with recent reviews suggesting that underlying face processing mechanisms are mature early in childhood and that the improvement seen on experimental tasks instead results from general cognitive/perceptual development. One face processing mechanism that has been argued to develop slowly is the ability to encode faces in a view-invariant manner (i.e., allowing recognition across changes in viewpoint). However, many previous studies have not controlled for general cognitive factors. In the current study, 8-year-olds and adults performed a recognition memory task with two study-test viewpoint conditions: same view (study front view, test front view) and change view (study front view, test three-quarter view). To allow quantitative comparison between children and adults, performance in the same view condition was matched across the groups by increasing the learning set size for adults. Results showed poorer memory in the change view condition than in the same view condition for both adults and children. Importantly, there was no quantitative difference between children and adults in the size of decrement in memory performance resulting from a change in viewpoint. This finding adds to growing evidence that face processing mechanisms are mature early in childhood. Copyright © 2014 Elsevier Inc. All rights reserved.
Superordinate Level Processing Has Priority Over Basic-Level Processing in Scene Gist Recognition

PubMed Central

Sun, Qi; Zheng, Yang; Sun, Mingxia; Zheng, Yuanjie

2016-01-01

By combining a perceptual discrimination task and a visuospatial working memory task, the present study examined the effects of visuospatial working memory load on the hierarchical processing of scene gist. In the perceptual discrimination task, two scene images from the same (manmade–manmade pairing or natural–natural pairing) or different superordinate level categories (manmade–natural pairing) were presented simultaneously, and participants were asked to judge whether these two images belonged to the same basic-level category (e.g., street–street pairing) or not (e.g., street–highway pairing). In the concurrent working memory task, spatial load (position-based load in Experiment 1) and object load (figure-based load in Experiment 2) were manipulated. The results were as follows: (a) spatial load and object load have stronger effects on discrimination of same basic-level scene pairing than same superordinate level scene pairing; (b) spatial load has a larger impact on the discrimination of scene pairings at early stages than at later stages; conversely, object information has a larger influence at later stages than at early stages. It follows that superordinate level processing has priority over basic-level processing in scene gist recognition, and that spatial information contributes to the earlier and object information to the later stages in scene gist recognition. PMID:28382195
Recognition-induced forgetting of faces in visual long-term memory.

PubMed

Rugo, Kelsi F; Tamler, Kendall N; Woodman, Geoffrey F; Maxcey, Ashleigh M

2017-10-01

Despite more than a century of evidence that long-term memory for pictures and words are different, much of what we know about memory comes from studies using words. Recent research examining visual long-term memory has demonstrated that recognizing an object induces the forgetting of objects from the same category. This recognition-induced forgetting has been shown with a variety of everyday objects. However, unlike everyday objects, faces are objects of expertise. As a result, faces may be immune to recognition-induced forgetting. However, despite excellent memory for such stimuli, we found that faces were susceptible to recognition-induced forgetting. Our findings have implications for how models of human memory account for recognition-induced forgetting as well as represent objects of expertise and consequences for eyewitness testimony and the justice system.
Psychopolitical Validity: Power, Culture, and Wellness

ERIC Educational Resources Information Center

Fisher, Adrian T.; Sonn, Christopher C.

2008-01-01

In this commentary, the authors review and critique Prilleltensky's model of psychopolitical validity and wellness. Although the overt recognition of power, oppression, and political forces are viewed most favorably, cautions are also given. Of most importance is the way in which his model is based in an undeclared North American model of…
Community Asset Mapping. Trends and Issues Alert.

ERIC Educational Resources Information Center

Kerka, Sandra

Asset mapping involves documenting tangible and intangible resources of a community viewed as a place with assets to be preserved and enhanced, not deficits to be remedied. Kretzmann and McKnight (1993) are credited with developing the concept of asset-based community development (ABCD) that draws on appreciative inquiry; recognition of social…
Acute Effects of Alcohol on Encoding and Consolidation of Memory for Emotional Stimuli

PubMed Central

Weafer, Jessica; Gallo, David A.; De Wit, Harriet

2016-01-01

Objective: Acute doses of alcohol impair memory when administered before encoding of emotionally neutral stimuli but enhance memory when administered immediately after encoding, potentially by affecting memory consolidation. Here, we examined whether alcohol produces similar biphasic effects on memory for positive or negative emotional stimuli. Method: The current study examined memory for emotional stimuli after alcohol (0.8 g/kg) was administered either before stimulus viewing (encoding group; n = 20) or immediately following stimulus viewing (consolidation group; n = 20). A third group received placebo both before and after stimulus viewing (control group; n = 19). Participants viewed the stimuli on one day, and their retrieval was assessed exactly 48 hours later, when they performed a surprise cued recollection and recognition test of the stimuli in a drug-free state. Results: As in previous studies, alcohol administered before encoding impaired memory accuracy, whereas alcohol administered after encoding enhanced memory accuracy. Critically, alcohol effects on cued recollection depended on the valence of the emotional stimuli: Its memory-impairing effects during encoding were greatest for emotional stimuli, whereas its memory-enhancing effects during consolidation were greatest for emotionally neutral stimuli. Effects of alcohol on recognition were not related to stimulus valence. Conclusions: This study extends previous findings with memory for neutral stimuli, showing that alcohol differentially affects the encoding and consolidation of memory for emotional stimuli. These effects of alcohol on memory for emotionally salient material may contribute to the development of alcohol-related problems, perhaps by dampening memory for adverse consequences of alcohol consumption. PMID:26751358
Ceres In Context: What the Rest of the Asteroid Population Tells Us About Its Largest Member

NASA Astrophysics Data System (ADS)

Rivkin, A.

2015-12-01

Ceres is famously the largest object in the asteroid belt. Over the course of the last 215 years it has been considered everything from a unique protoplanet (or indeed full-fledged "planet") to a large but run-of-the-mill piece of rock. Over the last decade, models of Ceres' thermal history and shape measurements based on HST imagery have led to the recognition that Ceres is a differentiated object, and likely an ice-rich one. In the last year the Dawn spacecraft has provided unprecedented views of Ceres' surface and combined with data from observational facilities like Herschel and countless telescopes it has shown the varied nature of its geology and ongoing processes. Even given these recent results, Ceres remains an inhabitant of the asteroid belt, existing in the ambient environment and affected by impactors, micrometeorites, solar wind, and other factors. While we only have spacecraft imagery from a very small number of targets, we do have a wealth of Earth-based data from the objects that have shared space with Ceres for billions of years. The insights gained from studying these objects can be applied to Ceres to understand its context and nature. Similarly, what we learn at Ceres will be applicable in many ways to other objects, particularly the twenty or so largest asteroids, which tend to be low-albedo, water-rich bodies. I will discuss our current understanding of the asteroids, particularly those that share important characteristics with Ceres, and focus on what we can learn about Ceres from these bodies.

Language deficits in poor comprehenders: a case for the simple view of reading.

PubMed

Catts, Hugh W; Adlof, Suzanne M; Ellis Weismer, Susan

2006-04-01

To examine concurrently and retrospectively the language abilities of children with specific reading comprehension deficits ("poor comprehenders") and compare them to typical readers and children with specific decoding deficits ("poor decoders"). In Study 1, the authors identified 57 poor comprehenders, 27 poor decoders, and 98 typical readers on the basis of 8th-grade reading achievement. These subgroups' performances on 8th-grade measures of language comprehension and phonological processing were investigated. In Study 2, the authors examined retrospectively subgroups' performances on measures of language comprehension and phonological processing in kindergarten, 2nd, and 4th grades. Word recognition and reading comprehension in 2nd and 4th grades were also considered. Study 1 showed that poor comprehenders had concurrent deficits in language comprehension but normal abilities in phonological processing. Poor decoders were characterized by the opposite pattern of language abilities. Study 2 results showed that subgroups had language (and word recognition) profiles in the earlier grades that were consistent with those observed in 8th grade. Subgroup differences in reading comprehension were inconsistent across grades but reflective of the changes in the components of reading comprehension over time. The results support the simple view of reading and the phonological deficit hypothesis. Furthermore, the findings indicate that a classification system that is based on the simple view has advantages over standard systems that focus only on word recognition and/or reading comprehension.
Human activities recognition by head movement using partial recurrent neural network

NASA Astrophysics Data System (ADS)

Tan, Henry C. C.; Jia, Kui; De Silva, Liyanage C.

2003-06-01

Traditionally, human activities recognition has been achieved mainly by the statistical pattern recognition methods or the Hidden Markov Model (HMM). In this paper, we propose a novel use of the connectionist approach for the recognition of ten simple human activities: walking, sitting down, getting up, squatting down and standing up, in both lateral and frontal views, in an office environment. By means of tracking the head movement of the subjects over consecutive frames from a database of different color image sequences, and incorporating the Elman model of the partial recurrent neural network (RNN) that learns the sequential patterns of relative change of the head location in the images, the proposed system is able to robustly classify all the ten activities performed by unseen subjects from both sexes, of different race and physique, with a recognition rate as high as 92.5%. This demonstrates the potential of employing partial RNN to recognize complex activities in the increasingly popular human-activities-based applications.
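As a rough illustration of the Elman-style recurrence mentioned above, the sketch below feeds a sequence of head displacements through one hidden layer whose previous state is copied into context units. The weights, layer sizes, and the choice of frame-to-frame displacement as the input feature are assumptions for illustration only; the abstract does not give the network configuration.

```python
import math


def head_displacements(track):
    """Convert a list of (x, y) head positions into per-frame changes [dx, dy]."""
    return [[x1 - x0, y1 - y0] for (x0, y0), (x1, y1) in zip(track, track[1:])]


def elman_forward(xs, W_in, W_ctx, b_h, W_out, b_out):
    """Run a sequence through an Elman network and return the output scores.

    xs    : list of input vectors, one per frame
    W_in  : hidden x input weights
    W_ctx : hidden x hidden weights applied to the context units
    b_h   : hidden biases
    W_out : classes x hidden weights
    b_out : output biases
    """
    n_hidden = len(b_h)
    context = [0.0] * n_hidden  # context units start at zero
    for x in xs:
        hidden = []
        for j in range(n_hidden):
            s = b_h[j]
            s += sum(W_in[j][i] * x[i] for i in range(len(x)))
            s += sum(W_ctx[j][k] * context[k] for k in range(n_hidden))
            hidden.append(math.tanh(s))
        context = hidden  # the hidden state becomes the next step's context
    # Linear read-out from the final hidden state, one score per activity class.
    return [
        b_out[c] + sum(W_out[c][j] * context[j] for j in range(n_hidden))
        for c in range(len(b_out))
    ]
```

The predicted activity would be the index of the largest score. Training the weights (for example by backpropagation) is outside the scope of this sketch.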
Object recognition of ladar with support vector machine

NASA Astrophysics Data System (ADS)

Sun, Jian-Feng; Li, Qi; Wang, Qi

2005-01-01

Intensity, range and Doppler images can be obtained by using laser radar. Laser radar can detect much more object information than other sensors, such as passive infrared imaging and synthetic aperture radar (SAR), so it is well suited as a sensor for object recognition. The traditional method of laser radar object recognition is to extract target features, which can be influenced by noise. In this paper, a laser radar recognition method based on the Support Vector Machine (SVM) is introduced. SVM has become a new focus of recognition research after neural networks. It performs well on handwritten digit and face recognition. Two series of experiments with SVMs, designed for preprocessed and non-preprocessed samples, are performed on real laser radar images, and the experimental results are compared.
How similar are recognition memory and inductive reasoning?

PubMed

Hayes, Brett K; Heit, Evan

2013-07-01

Conventionally, memory and reasoning are seen as different types of cognitive activities driven by different processes. In two experiments, we challenged this view by examining the relationship between recognition memory and inductive reasoning involving multiple forms of similarity. A common study set (members of a conjunctive category) was followed by a test set containing old and new category members, as well as items that matched the study set on only one dimension. The study and test sets were presented under recognition or induction instructions. In Experiments 1 and 2, the inductive property being generalized was varied in order to direct attention to different dimensions of similarity. When there was no time pressure on decisions, patterns of positive responding were strongly affected by property type, indicating that different types of similarity were driving recognition and induction. By comparison, speeded judgments showed weaker property effects and could be explained by generalization based on overall similarity. An exemplar model, GEN-EX (GENeralization from EXamples), could account for both the induction and recognition data. These findings show that induction and recognition share core component processes, even when the tasks involve flexible forms of similarity.
Line-based logo recognition through a web-camera

NASA Astrophysics Data System (ADS)

Chen, Xiaolu; Wang, Yangsheng; Feng, Xuetao

2007-11-01

Logo recognition has seen much development in the document retrieval and shape analysis domains. As human-computer interaction becomes more and more popular, logo recognition through a web-camera is a promising technology from an application point of view. For practical applications, however, logo recognition in real scenes is much more difficult than in clear scenes. To meet this need, we make some improvements to the conventional method. First, moment information is used to calculate the test image's orientation angle, which is used to normalize the test image. Second, the main structure of the test image, which is represented by line patterns, is acquired, and the modified Hausdorff distance is employed to match the image against each of the existing templates. The proposed method, which is invariant to scale and rotation, gives good results and can work in real time. The main contribution of this paper is that some improvements are introduced into the existing recognition framework, and the result performs much better than the original one. Besides, we have built a highly successful logo recognition system using our improved method.
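The matching step can be illustrated with the standard point-set form of the modified Hausdorff distance: the larger of the two directed mean nearest-neighbour distances. This is a simplified sketch that assumes the line patterns have already been sampled into 2D points and normalized for orientation and scale; the paper's own line-based variant is not described in the abstract, and the helper names are hypothetical.

```python
import math


def directed_mean_distance(A, B):
    """Mean, over points a in A, of the distance from a to its nearest point in B."""
    return sum(min(math.dist(a, b) for b in B) for a in A) / len(A)


def modified_hausdorff(A, B):
    """Modified Hausdorff distance between two non-empty point sets."""
    return max(directed_mean_distance(A, B), directed_mean_distance(B, A))


def best_template(points, templates):
    """Return the name of the template whose point set is closest to `points`.

    templates: dict mapping a template name to its list of (x, y) points.
    """
    return min(templates, key=lambda name: modified_hausdorff(points, templates[name]))
```

Because the directed term averages nearest-neighbour distances instead of taking their maximum, a few outlier points from a cluttered real scene shift the score only slightly, which is the usual motivation for preferring it over the classical Hausdorff distance.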
Research on application of LADAR in ground vehicle recognition

NASA Astrophysics Data System (ADS)

Lan, Jinhui; Shen, Zhuoxun

2009-11-01

To meet the requirements of many practical military applications, research on 3D target recognition is active. A representation that captures the salient attributes of a 3D target independent of the viewing angle will be especially useful to an automatic 3D target recognition system. This paper presents a new approach to image generation based on Laser Detection and Ranging (LADAR) data. A range image of the target is obtained by transformation of the point cloud. In order to extract features of different ground vehicle targets and to recognize targets, the Zernike moment properties of typical ground vehicle targets are studied in this paper. A support vector machine is applied to the classification and recognition of targets. The new method of image generation and feature representation has been applied in outdoor experiments. These experiments show that the method of image generation is stable, that the moments are effective as features for recognition, and that LADAR can be applied to the field of 3D target recognition.
Emotion Recognition in Frontotemporal Dementia and Alzheimer's Disease: A New Film-Based Assessment

PubMed Central

Goodkind, Madeleine S.; Sturm, Virginia E.; Ascher, Elizabeth A.; Shdo, Suzanne M.; Miller, Bruce L.; Rankin, Katherine P.; Levenson, Robert W.

2015-01-01

Deficits in recognizing others' emotions are reported in many psychiatric and neurological disorders, including autism, schizophrenia, behavioral variant frontotemporal dementia (bvFTD) and Alzheimer's disease (AD). Most previous emotion recognition studies have required participants to identify emotional expressions in photographs. This type of assessment differs from real-world emotion recognition in important ways: Images are static rather than dynamic, include only 1 modality of emotional information (i.e., visual information), and are presented absent a social context. Additionally, existing emotion recognition batteries typically include multiple negative emotions, but only 1 positive emotion (i.e., happiness) and no self-conscious emotions (e.g., embarrassment). We present initial results using a new task for assessing emotion recognition that was developed to address these limitations. In this task, respondents view a series of short film clips and are asked to identify the main characters' emotions. The task assesses multiple negative, positive, and self-conscious emotions based on information that is multimodal, dynamic, and socially embedded. We evaluate this approach in a sample of patients with bvFTD, AD, and normal controls. Results indicate that patients with bvFTD have emotion recognition deficits in all 3 categories of emotion compared to the other groups. These deficits were especially pronounced for negative and self-conscious emotions. Emotion recognition in this sample of patients with AD was indistinguishable from controls. These findings underscore the utility of this approach to assessing emotion recognition and suggest that previous findings that recognition of positive emotion was preserved in dementia patients may have resulted from the limited sampling of positive emotion in traditional tests. PMID:26010574
Age-related increases in false recognition: the role of perceptual and conceptual similarity.

PubMed

Pidgeon, Laura M; Morcom, Alexa M

2014-01-01

Older adults (OAs) are more likely to falsely recognize novel events than young adults, and recent behavioral and neuroimaging evidence points to a reduced ability to distinguish overlapping information due to decline in hippocampal pattern separation. However, other data suggest a critical role for semantic similarity. Koutstaal et al. [(2003) false recognition of abstract vs. common objects in older and younger adults: testing the semantic categorization account, J. Exp. Psychol. Learn. 29, 499-510] reported that OAs were only vulnerable to false recognition of items with pre-existing semantic representations. We replicated Koutstaal et al.'s (2003) second experiment and examined the influence of independently rated perceptual and conceptual similarity between stimuli and lures. At study, young and OAs judged the pleasantness of pictures of abstract (unfamiliar) and concrete (familiar) items, followed by a surprise recognition test including studied items, similar lures, and novel unrelated items. Experiment 1 used dichotomous "old/new" responses at test, while in Experiment 2 participants were also asked to judge lures as "similar," to increase explicit demands on pattern separation. In both experiments, OAs showed a greater increase in false recognition for concrete than abstract items relative to the young, replicating Koutstaal et al.'s (2003) findings. However, unlike in the earlier study, there was also an age-related increase in false recognition of abstract lures when multiple similar images had been studied. In line with pattern separation accounts of false recognition, OAs were more likely to misclassify concrete lures with high and moderate, but not low degrees of rated similarity to studied items. Results are consistent with the view that OAs are particularly susceptible to semantic interference in recognition memory, and with the possibility that this reflects age-related decline in pattern separation.
Age-related increases in false recognition: the role of perceptual and conceptual similarity

PubMed Central

Pidgeon, Laura M.; Morcom, Alexa M.

2014-01-01

Older adults (OAs) are more likely to falsely recognize novel events than young adults, and recent behavioral and neuroimaging evidence points to a reduced ability to distinguish overlapping information due to decline in hippocampal pattern separation. However, other data suggest a critical role for semantic similarity. Koutstaal et al. [(2003) false recognition of abstract vs. common objects in older and younger adults: testing the semantic categorization account, J. Exp. Psychol. Learn. 29, 499–510] reported that OAs were only vulnerable to false recognition of items with pre-existing semantic representations. We replicated Koutstaal et al.’s (2003) second experiment and examined the influence of independently rated perceptual and conceptual similarity between stimuli and lures. At study, young and OAs judged the pleasantness of pictures of abstract (unfamiliar) and concrete (familiar) items, followed by a surprise recognition test including studied items, similar lures, and novel unrelated items. Experiment 1 used dichotomous “old/new” responses at test, while in Experiment 2 participants were also asked to judge lures as “similar,” to increase explicit demands on pattern separation. In both experiments, OAs showed a greater increase in false recognition for concrete than abstract items relative to the young, replicating Koutstaal et al.’s (2003) findings. However, unlike in the earlier study, there was also an age-related increase in false recognition of abstract lures when multiple similar images had been studied. In line with pattern separation accounts of false recognition, OAs were more likely to misclassify concrete lures with high and moderate, but not low degrees of rated similarity to studied items. Results are consistent with the view that OAs are particularly susceptible to semantic interference in recognition memory, and with the possibility that this reflects age-related decline in pattern separation. PMID:25368576
The recognition of emotional expression in prosopagnosia: decoding whole and part faces.

PubMed

Stephan, Blossom Christa Maree; Breen, Nora; Caine, Diana

2006-11-01

Prosopagnosia is currently viewed within the constraints of two competing theories of face recognition, one highlighting the analysis of features, the other focusing on configural processing of the whole face. This study investigated the role of feature analysis versus whole face configural processing in the recognition of facial expression. A prosopagnosic patient, SC, made expression decisions from whole and incomplete (eyes-only and mouth-only) faces where features had been obscured. SC was impaired at recognizing some (e.g., anger, sadness, and fear), but not all (e.g., happiness) emotional expressions from the whole face. Analyses of his performance on incomplete faces indicated that his recognition of some expressions actually improved relative to his performance on the whole face condition. We argue that in SC, interference from damaged configural processes seems to override an intact ability to utilize part-based or local feature cues.
Online graphic symbol recognition using neural network and ARG matching

NASA Astrophysics Data System (ADS)

Yang, Bing; Li, Changhua; Xie, Weixing

2001-09-01

This paper proposes a novel method for on-line recognition of line-based graphic symbols. The input strokes are usually warped into a cursive form due to varied drawing styles, and classifying them is very difficult. To deal with this, an ART-2 neural network is used to classify the input strokes. It has the advantages of a high recognition rate, short recognition time, and forming classes in a self-organized manner. The symbol recognition is achieved by an Attribute Relational Graph (ARG) matching algorithm. The ARG is very efficient for representing complex objects, but its computation cost is very high. To overcome this, we suggest a fast graph matching algorithm using symbol structure information. The experimental results show that the proposed method is effective for recognition of symbols with hierarchical structure.
Overt attention in natural scenes: objects dominate features.

PubMed

Stoll, Josef; Thrun, Michael; Nuthmann, Antje; Einhäuser, Wolfgang

2015-02-01

Whether overt attention in natural scenes is guided by object content or by low-level stimulus features has become a matter of intense debate. Experimental evidence seemed to indicate that once object locations in a scene are known, salience models provide little extra explanatory power. This approach has recently been criticized for using inadequate models of early salience; and indeed, state-of-the-art salience models outperform trivial object-based models that assume a uniform distribution of fixations on objects. Here we propose to use object-based models that take a preferred viewing location (PVL) close to the centre of objects into account. In experiment 1, we demonstrate that, when including this comparably subtle modification, object-based models again are on par with state-of-the-art salience models in predicting fixations in natural scenes. One possible interpretation of these results is that objects rather than early salience dominate attentional guidance. In this view, early-salience models predict fixations through the correlation of their features with object locations. To test this hypothesis directly, in two additional experiments we reduced low-level salience in image areas of high object content. For these modified stimuli, the object-based model predicted fixations significantly better than early salience. This finding held in an object-naming task (experiment 2) and a free-viewing task (experiment 3). These results provide further evidence for object-based fixation selection--and by inference object-based attentional guidance--in natural scenes. Copyright © 2014 The Authors. Published by Elsevier Ltd. All rights reserved.
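The difference between a uniform object-based model and one with a preferred viewing location can be sketched as follows. This is a toy construction, not the authors' model: the Gaussian form, the `sigma_frac` parameter, the bounding-box representation of objects, and the function name `object_map` are assumptions made for illustration.

```python
import math

def object_map(width, height, boxes, use_pvl=True, sigma_frac=0.25):
    """Build a normalised fixation-probability map from object bounding boxes.

    boxes: list of (x0, y0, x1, y1) with x1 and y1 exclusive.
    use_pvl=False: every pixel inside an object gets equal weight.
    use_pvl=True: weight falls off as a Gaussian around the box centre
    (the Gaussian shape and sigma_frac are assumptions of this sketch).
    """
    grid = [[0.0] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        cx = (x0 + x1 - 1) / 2.0
        cy = (y0 + y1 - 1) / 2.0
        sx = sigma_frac * (x1 - x0)
        sy = sigma_frac * (y1 - y0)
        for y in range(y0, y1):
            for x in range(x0, x1):
                if use_pvl:
                    dx = (x - cx) / sx
                    dy = (y - cy) / sy
                    weight = math.exp(-0.5 * (dx * dx + dy * dy))
                else:
                    weight = 1.0
                grid[y][x] += weight
    total = sum(sum(row) for row in grid)
    # Normalise so the map sums to one and can be read as a probability map.
    return [[value / total for value in row] for row in grid]

# One 5x5 object in a 10x10 scene.
pvl_map = object_map(10, 10, [(2, 2, 7, 7)], use_pvl=True)
uniform_map = object_map(10, 10, [(2, 2, 7, 7)], use_pvl=False)
```

In `pvl_map` the object centre receives more probability than its corners, whereas `uniform_map` treats every object pixel alike; either map could then be scored against recorded fixations.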
View-tolerant face recognition and Hebbian learning imply mirror-symmetric neural tuning to head orientation

PubMed Central

Leibo, Joel Z.; Liao, Qianli; Freiwald, Winrich A.; Anselmi, Fabio; Poggio, Tomaso

2017-01-01

SUMMARY The primate brain contains a hierarchy of visual areas, dubbed the ventral stream, which rapidly computes object representations that are both specific for object identity and robust against identity-preserving transformations like depth-rotations [1, 2]. Current computational models of object recognition, including recent deep learning networks, generate these properties through a hierarchy of alternating selectivity-increasing filtering and tolerance-increasing pooling operations, similar to simple-complex cells operations [3, 4, 5, 6]. Here we prove that a class of hierarchical architectures and a broad set of biologically plausible learning rules generate approximate invariance to identity-preserving transformations at the top level of the processing hierarchy. However, all past models tested failed to reproduce the most salient property of an intermediate representation of a three-level face-processing hierarchy in the brain: mirror-symmetric tuning to head orientation [7]. Here we demonstrate that one specific biologically-plausible Hebb-type learning rule generates mirror-symmetric tuning to bilaterally symmetric stimuli like faces at intermediate levels of the architecture and show why it does so. Thus the tuning properties of individual cells inside the visual stream appear to result from group properties of the stimuli they encode and to reflect the learning rules that sculpted the information-processing system within which they reside. PMID:27916522
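A toy illustration of how a Hebb-type rule can yield mirror-symmetric tuning is sketched below. The abstract does not name the rule, so Oja's rule is used here as an assumed stand-in, and the five-element "views" are invented data, not stimuli from the study. The key assumption is that the training set contains each view together with its mirror image.

```python
def mirror(x):
    """Left-right reflection of a 1-D 'view'."""
    return x[::-1]

def response(w, x):
    """Linear unit output: dot product of weights and input."""
    return sum(wi * xi for wi, xi in zip(w, x))

def oja_train(samples, eta=0.002, epochs=7000):
    """Online Oja's rule: w += eta * y * (x - y * w), with y = w . x."""
    w = [1.0] + [0.0] * (len(samples[0]) - 1)  # deliberately asymmetric start
    for _ in range(epochs):
        for x in samples:
            y = response(w, x)
            w = [wi + eta * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w

# Two hypothetical views of a bilaterally symmetric object, plus their mirrors.
views = [[1.0, 0.8, 0.3, 0.1, 0.0],
         [0.2, 1.0, 0.6, 0.1, 0.0]]
samples = views + [mirror(v) for v in views]

w = oja_train(samples)

# Response to each view and to its mirror image.
tuning = [(response(w, v), response(w, mirror(v))) for v in views]
```

Because the training set is closed under reflection, its second-moment matrix commutes with the reflection, so the direction Oja's rule converges to is either symmetric or antisymmetric. In both cases the unit responds with equal magnitude to a view and to its mirror image, which is what the pairs in `tuning` show approximately.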
Deep Networks Can Resemble Human Feed-forward Vision in Invariant Object Recognition

PubMed Central

Kheradpisheh, Saeed Reza; Ghodrati, Masoud; Ganjtabesh, Mohammad; Masquelier, Timothée

2016-01-01

Deep convolutional neural networks (DCNNs) have attracted much attention recently, and have been shown to be able to recognize thousands of object categories in natural image databases. Their architecture is somewhat similar to that of the human visual system: both use restricted receptive fields, and a hierarchy of layers which progressively extract more and more abstracted features. Yet it is unknown whether DCNNs match human performance at the task of view-invariant object recognition, whether they make similar errors and use similar representations for this task, and whether the answers depend on the magnitude of the viewpoint variations. To investigate these issues, we benchmarked eight state-of-the-art DCNNs, the HMAX model, and a baseline shallow model and compared their results to those of humans with backward masking. Unlike in all previous DCNN studies, we carefully controlled the magnitude of the viewpoint variations to demonstrate that shallow nets can outperform deep nets and humans when variations are weak. When facing larger variations, however, more layers were needed to match human performance and error distributions, and to have representations that are consistent with human behavior. A very deep net with 18 layers even outperformed humans at the highest variation level, using the most human-like representations. PMID:27601096
Event completion: event based inferences distort memory in a matter of seconds.

PubMed

Strickland, Brent; Keil, Frank

2011-12-01

We present novel evidence that implicit causal inferences distort memory for events only seconds after viewing. Adults watched videos of someone launching (or throwing) an object. However, the videos omitted the moment of contact (or release). Subjects falsely reported seeing the moment of contact when it was implied by subsequent footage but did not do so when the contact was not implied. Causal implications were disrupted either by replacing the resulting flight of the ball with irrelevant video or by scrambling event segments. Subjects in the different causal implication conditions did not differ on false alarms for other moments of the event, nor did they differ in general recognition accuracy. These results suggest that as people perceive events, they generate rapid conceptual interpretations that can have a powerful effect on how events are remembered. Copyright © 2011 Elsevier B.V. All rights reserved.
Experience with Malleable Objects Influences Shape-based Object Individuation by Infants

PubMed Central

Woods, Rebecca J.; Schuler, Jena

2014-01-01

Infants’ ability to accurately represent and later recognize previously viewed objects, and conversely, to discriminate novel objects from those previously seen improves remarkably over the first two years of life. During this time, infants acquire extensive experience viewing and manipulating objects and these experiences influence their physical reasoning. Here we posited that infants’ observations of object feature stability (rigid versus malleable) can influence use of those features to individuate two successively viewed objects. We showed 8.5-month-olds a series of objects that could or could not change shape then assessed their use of shape as a basis for object individuation. Infants who explored rigid objects later used shape differences to individuate objects; however, infants who explored malleable objects did not. This outcome suggests that the latter infants did not take into account shape differences during the physical reasoning task and provides further evidence that infants’ attention to object features can be readily modified based on recent experiences. PMID:24561541
Feedforward object-vision models only tolerate small image variations compared to human

PubMed Central

Ghodrati, Masoud; Farzmahdi, Amirhossein; Rajaei, Karim; Ebrahimpour, Reza; Khaligh-Razavi, Seyed-Mahdi

2014-01-01

Invariant object recognition is a remarkable ability of the primate visual system, and its underlying mechanism has constantly been under intense investigation. Computational modeling is a valuable tool toward understanding the processes involved in invariant object recognition. Although recent computational models have shown outstanding performance on challenging image databases, they fail to perform well in image categorization under more complex image variations. Studies have shown that making sparse representations of objects by extracting more informative visual features through a feedforward sweep can lead to higher recognition performance. Here, however, we show that when the complexity of image variations is high, even this approach results in poor performance compared to humans. To assess the performance of models and humans in invariant object recognition tasks, we built a parametrically controlled image database consisting of several object categories varied in different dimensions and levels, rendered from 3D planes. Comparing the performance of several object recognition models with human observers shows that the models perform similarly to humans in categorization tasks only at low levels of image variation. Furthermore, the results of our behavioral experiments demonstrate that, even under difficult experimental conditions (i.e., briefly presented masked stimuli with complex image variations), human observers performed outstandingly well, suggesting that the models are still far from resembling humans in invariant object recognition. Taken together, we suggest that learning sparse informative visual features, although desirable, is not a complete solution for future progress in object-vision modeling. We show that this approach is not of significant help in solving the computational crux of object recognition (i.e., invariant object recognition) when the identity-preserving image variations become more complex. PMID:25100986
A Scientific Workflow Platform for Generic and Scalable Object Recognition on Medical Images

NASA Astrophysics Data System (ADS)

Möller, Manuel; Tuot, Christopher; Sintek, Michael

In the research project THESEUS MEDICO we aim at a system combining medical image information with semantic background knowledge from ontologies to give clinicians fully cross-modal access to biomedical image repositories. Therefore joint efforts have to be made in more than one dimension: Object detection processes have to be specified in which an abstraction is performed starting from low-level image features across landmark detection utilizing abstract domain knowledge up to high-level object recognition. We propose a system based on a client-server extension of the scientific workflow platform Kepler that assists the collaboration of medical experts and computer scientists during development and parameter learning.
Use of iris recognition camera technology for the quantification of corneal opacification in mucopolysaccharidoses.

PubMed

Aslam, Tariq Mehmood; Shakir, Savana; Wong, James; Au, Leon; Ashworth, Jane

2012-12-01

Mucopolysaccharidoses (MPS) can cause corneal opacification that is currently difficult to objectively quantify. With newer treatments for MPS comes an increased need for a more objective, valid and reliable index of disease severity for clinical and research use. Clinical evaluation by slit lamp is very subjective and techniques based on colour photography are difficult to standardise. In this article the authors present evidence for the utility of dedicated image analysis algorithms applied to images obtained by a highly sophisticated iris recognition camera that is small, manoeuvrable and adapted to achieve rapid, reliable and standardised objective imaging in a wide variety of patients while minimising artefactual interference in image quality.
Study on road sign recognition in LabVIEW

NASA Astrophysics Data System (ADS)

Panoiu, M.; Rat, C. L.; Panoiu, C.

2016-02-01

Road and traffic sign identification is a field of study that can be used to aid the development of in-car advisory systems. It uses computer vision and artificial intelligence to extract road signs from outdoor images acquired by a camera in uncontrolled lighting conditions, where the signs may be occluded by other objects or may suffer from problems such as color fading, disorientation, and variations in shape and size. An automatic means of identifying traffic signs in these conditions can make a significant contribution to the development of Intelligent Transport Systems (ITS) that continuously monitor the driver, the vehicle, and the road. Road and traffic signs are characterized by a number of features which make them recognizable from the environment. Road signs are located in standard positions and have standard shapes, standard colors, and known pictograms. These characteristics make them suitable for image identification. Traffic sign identification covers two problems: traffic sign detection and traffic sign recognition. Traffic sign detection is meant for the accurate localization of traffic signs in the image space, while traffic sign recognition handles the labeling of such detections into specific traffic sign types or subcategories [1].
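The split between detection and recognition can be sketched in a few lines. This is a deliberately simplified toy in Python, not the LabVIEW implementation the study describes: the red-dominance thresholds, the fill-ratio cut-offs, and the function names `detect_sign` and `recognise_shape` are all assumptions made for illustration.

```python
def detect_sign(image, min_red=150, max_other=100):
    """Detection: locate red-dominant pixels and return their bounding box.

    image: list of rows, each row a list of (r, g, b) tuples.
    Returns ((x0, y0, x1, y1), pixel_count) with inclusive coordinates,
    or (None, 0) when no pixel passes the (assumed) colour thresholds.
    """
    hits = [(x, y)
            for y, row in enumerate(image)
            for x, (r, g, b) in enumerate(row)
            if r >= min_red and g <= max_other and b <= max_other]
    if not hits:
        return None, 0
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return (min(xs), min(ys), max(xs), max(ys)), len(hits)

def recognise_shape(box, pixel_count):
    """Recognition: label the detection by how much of its box is filled.

    The cut-offs are rough assumptions: a filled rectangle covers the whole
    box, a disc about 79% of it, and a triangle about half.
    """
    x0, y0, x1, y1 = box
    fill = pixel_count / ((x1 - x0 + 1) * (y1 - y0 + 1))
    if fill > 0.9:
        return "rectangle"
    if fill > 0.65:
        return "circle"
    return "triangle"

# A 6x6 grey scene containing a 3x3 red patch.
grey, red = (120, 120, 120), (200, 30, 30)
scene = [[red if 2 <= x <= 4 and 1 <= y <= 3 else grey for x in range(6)]
         for y in range(6)]

box, count = detect_sign(scene)
label = recognise_shape(box, count)
```

Fixed colour thresholds of this kind are exactly what breaks under the fading and lighting changes mentioned above, which is why practical systems need more robust detection.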
