model-based object recognition: Topics by Science.gov

Sample records for model-based object recognition

The implementation of aerial object recognition algorithm based on contour descriptor in FPGA-based on-board vision system

NASA Astrophysics Data System (ADS)

Babayan, Pavel; Smirnov, Sergey; Strotov, Valery

2017-10-01

This paper describes an aerial object recognition algorithm for on-board and stationary vision systems. The suggested algorithm is intended to recognize objects of a specific kind using a set of reference objects defined by 3D models. The proposed algorithm is based on building an outer contour descriptor. The algorithm consists of two stages: learning and recognition. The learning stage is devoted to exploring the reference objects. Using the 3D models, we can build a database of training images by rendering each 3D model from viewpoints evenly distributed on a sphere. The sphere points are distributed according to the geosphere principle. The gathered training image set is used to calculate descriptors, which are then used in the recognition stage of the algorithm. The recognition stage focuses on estimating the similarity of the captured object and the reference objects by matching the observed image descriptor against the reference object descriptors. The experimental research was performed using a set of models of aircraft of different types (airplanes, helicopters, UAVs). The proposed orientation estimation algorithm showed good accuracy in all case studies. The real-time performance of the algorithm in an FPGA-based vision system was demonstrated.
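
As a rough illustration of the viewpoint sampling mentioned in this abstract, the sketch below builds a one-level geosphere: the 12 vertices of an icosahedron plus the 30 edge midpoints, all projected onto the unit sphere. The abstract does not say how many subdivision levels or viewpoints the authors used, so the single subdivision level and the function names (`icosahedron_vertices`, `geosphere_viewpoints`) are assumptions made only for this example.

```python
import itertools
import math


def icosahedron_vertices():
    """Return the 12 vertices of a regular icosahedron with edge length 2."""
    phi = (1 + math.sqrt(5)) / 2  # golden ratio
    verts = []
    for a, b in itertools.product((-1, 1), repeat=2):
        verts.append((0.0, a, b * phi))
        verts.append((a, b * phi, 0.0))
        verts.append((b * phi, 0.0, a))
    return verts


def normalize(p):
    """Project a 3D point onto the unit sphere."""
    n = math.sqrt(sum(c * c for c in p))
    return tuple(c / n for c in p)


def geosphere_viewpoints():
    """Icosahedron vertices plus normalized edge midpoints (assumed: one level)."""
    verts = icosahedron_vertices()
    points = [normalize(v) for v in verts]
    for p, q in itertools.combinations(verts, 2):
        d2 = sum((x - y) ** 2 for x, y in zip(p, q))
        if abs(d2 - 4.0) < 1e-9:  # neighbouring vertices are exactly 2 apart
            mid = tuple((x + y) / 2 for x, y in zip(p, q))
            points.append(normalize(mid))
    return points


if __name__ == "__main__":
    for direction in geosphere_viewpoints():
        print(direction)  # each tuple is a camera direction for rendering
```

Each returned unit vector could serve as a camera direction for rendering one training image. A denser sampling would need further subdivision of the triangular faces, which this sketch omits.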
Formal implementation of a performance evaluation model for the face recognition system.

PubMed

Shin, Yong-Nyuo; Kim, Jason; Lee, Yong-Jun; Shin, Woochang; Choi, Jin-Young

2008-01-01

Owing to its usability, practical applications, and lack of intrusiveness, face recognition technology, which is based on information derived from individuals' facial features, has been attracting considerable attention recently. Reported recognition rates of commercialized face recognition systems cannot be accepted as official recognition rates, as they are based on assumptions that are beneficial to the specific system and face database. Therefore, performance evaluation methods and tools are necessary to objectively measure the accuracy and performance of any face recognition system. In this paper, we propose and formalize a performance evaluation model for biometric recognition systems and implement an evaluation tool for face recognition systems based on the proposed model. Furthermore, we performed evaluations objectively by providing guidelines for the design and implementation of a performance evaluation system and by formalizing the performance test process.
Vision-based object detection and recognition system for intelligent vehicles

NASA Astrophysics Data System (ADS)

Ran, Bin; Liu, Henry X.; Martono, Wilfung

1999-01-01

Recently, a proactive crash mitigation system was proposed to enhance the crash avoidance and survivability of Intelligent Vehicles. An accurate object detection and recognition system is a prerequisite for a proactive crash mitigation system, as system component deployment algorithms rely on accurate hazard detection, recognition, and tracking information. In this paper, we present a vision-based approach to detect and recognize vehicles and traffic signs, obtain their information, and track multiple objects by using a sequence of color images taken from a moving vehicle. The entire system consists of two sub-systems: the vehicle detection and recognition sub-system and the traffic sign detection and recognition sub-system. Both sub-systems consist of four models: an object detection model, an object recognition model, an object information model, and an object tracking model. In order to detect potential objects on the road, several features of the objects are investigated, which include the symmetrical shape and aspect ratio of a vehicle and the color and shape information of the signs. A two-layer neural network is trained to recognize different types of vehicles, and a parameterized traffic sign model is established in the process of recognizing a sign. Tracking is accomplished by combining the analysis of a single image frame with the analysis of consecutive image frames. The analysis of the single image frame is performed every ten full-size images. The information model obtains the information related to the object, such as the time to collision for the object vehicle and the relative distance from the traffic signs. Experimental results demonstrated a robust and accurate system for real-time object detection and recognition over thousands of image frames.
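
The abstract names time to collision as one output of the information model but does not give a formula. A minimal sketch, assuming a constant closing speed estimated from two consecutive range measurements, might look like the following; the function names, units, and the constant-speed assumption are illustrative and are not taken from the paper.

```python
def closing_speed(prev_distance_m, curr_distance_m, dt_s):
    """Estimate closing speed (m/s) from two range measurements dt_s apart."""
    if dt_s <= 0:
        raise ValueError("dt_s must be positive")
    return (prev_distance_m - curr_distance_m) / dt_s


def time_to_collision(distance_m, closing_speed_mps):
    """Return seconds until contact, or None if the gap is not closing.

    Assumes the closing speed stays constant (an assumption of this sketch).
    """
    if distance_m < 0:
        raise ValueError("distance_m must be non-negative")
    if closing_speed_mps <= 0:
        return None  # the object is holding its distance or pulling away
    return distance_m / closing_speed_mps


if __name__ == "__main__":
    speed = closing_speed(prev_distance_m=32.0, curr_distance_m=30.0, dt_s=0.2)
    print(time_to_collision(30.0, speed))
```

Returning `None` when the gap is not closing keeps callers from treating a negative or infinite value as a valid warning time.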
Shape and texture fused recognition of flying targets

NASA Astrophysics Data System (ADS)

Kovács, Levente; Utasi, Ákos; Kovács, Andrea; Szirányi, Tamás

2011-06-01

This paper presents visual detection and recognition of flying targets (e.g. planes, missiles) based on automatically extracted shape and object texture information, for application areas like alerting, recognition and tracking. Targets are extracted based on robust background modeling and a novel contour extraction approach, and object recognition is done by comparisons to shape and texture based query results on a previously gathered real life object dataset. Application areas involve passive defense scenarios, including automatic object detection and tracking with cheap commodity hardware components (CPU, camera and GPS).
Modeling guidance and recognition in categorical search: bridging human and computer object detection.

PubMed

Zelinsky, Gregory J; Peng, Yifan; Berg, Alexander C; Samaras, Dimitris

2013-10-08

Search is commonly described as a repeating cycle of guidance to target-like objects, followed by the recognition of these objects as targets or distractors. Are these indeed separate processes using different visual features? We addressed this question by comparing observer behavior to that of support vector machine (SVM) models trained on guidance and recognition tasks. Observers searched for a categorically defined teddy bear target in four-object arrays. Target-absent trials consisted of random category distractors rated in their visual similarity to teddy bears. Guidance, quantified as first-fixated objects during search, was strongest for targets, followed by target-similar, medium-similarity, and target-dissimilar distractors. False positive errors to first-fixated distractors also decreased with increasing dissimilarity to the target category. To model guidance, nine teddy bear detectors, using features ranging in biological plausibility, were trained on unblurred bears then tested on blurred versions of the same objects appearing in each search display. Guidance estimates were based on target probabilities obtained from these detectors. To model recognition, nine bear/nonbear classifiers, trained and tested on unblurred objects, were used to classify the object that would be fixated first (based on the detector estimates) as a teddy bear or a distractor. Patterns of categorical guidance and recognition accuracy were modeled almost perfectly by an HMAX model in combination with a color histogram feature. We conclude that guidance and recognition in the context of search are not separate processes mediated by different features, and that what the literature knows as guidance is really recognition performed on blurred objects viewed in the visual periphery.
Modeling guidance and recognition in categorical search: Bridging human and computer object detection

PubMed Central

Zelinsky, Gregory J.; Peng, Yifan; Berg, Alexander C.; Samaras, Dimitris

2013-01-01

Search is commonly described as a repeating cycle of guidance to target-like objects, followed by the recognition of these objects as targets or distractors. Are these indeed separate processes using different visual features? We addressed this question by comparing observer behavior to that of support vector machine (SVM) models trained on guidance and recognition tasks. Observers searched for a categorically defined teddy bear target in four-object arrays. Target-absent trials consisted of random category distractors rated in their visual similarity to teddy bears. Guidance, quantified as first-fixated objects during search, was strongest for targets, followed by target-similar, medium-similarity, and target-dissimilar distractors. False positive errors to first-fixated distractors also decreased with increasing dissimilarity to the target category. To model guidance, nine teddy bear detectors, using features ranging in biological plausibility, were trained on unblurred bears then tested on blurred versions of the same objects appearing in each search display. Guidance estimates were based on target probabilities obtained from these detectors. To model recognition, nine bear/nonbear classifiers, trained and tested on unblurred objects, were used to classify the object that would be fixated first (based on the detector estimates) as a teddy bear or a distractor. Patterns of categorical guidance and recognition accuracy were modeled almost perfectly by an HMAX model in combination with a color histogram feature. We conclude that guidance and recognition in the context of search are not separate processes mediated by different features, and that what the literature knows as guidance is really recognition performed on blurred objects viewed in the visual periphery. PMID:24105460
Trajectory Recognition as the Basis for Object Individuation: A Functional Model of Object File Instantiation and Object-Token Encoding

PubMed Central

Fields, Chris

2011-01-01

The perception of persisting visual objects is mediated by transient intermediate representations, object files, that are instantiated in response to some, but not all, visual trajectories. The standard object file concept does not, however, provide a mechanism sufficient to account for all experimental data on visual object persistence, object tracking, and the ability to perceive spatially disconnected stimuli as continuously existing objects. Based on relevant anatomical, functional, and developmental data, a functional model is constructed that bases visual object individuation on the recognition of temporal sequences of apparent center-of-mass positions that are specifically identified as trajectories by dedicated “trajectory recognition networks” downstream of the medial–temporal motion-detection area. This model is shown to account for a wide range of data, and to generate a variety of testable predictions. Individual differences in the recognition, abstraction, and encoding of trajectory information are expected to generate distinct object persistence judgments and object recognition abilities. Dominance of trajectory information over feature information in stored object tokens during early infancy, in particular, is expected to disrupt the ability to re-identify human and other individuals across perceptual episodes, and lead to developmental outcomes with characteristics of autism spectrum disorders. PMID:21716599
Three-dimensional model-based object recognition and segmentation in cluttered scenes.

PubMed

Mian, Ajmal S; Bennamoun, Mohammed; Owens, Robyn

2006-10-01

Viewpoint independent recognition of free-form objects and their segmentation in the presence of clutter and occlusions is a challenging task. We present a novel 3D model-based algorithm which performs this task automatically and efficiently. A 3D model of an object is automatically constructed offline from its multiple unordered range images (views). These views are converted into multidimensional table representations (which we refer to as tensors). Correspondences are automatically established between these views by simultaneously matching the tensors of a view with those of the remaining views using a hash table-based voting scheme. This results in a graph of relative transformations used to register the views before they are integrated into a seamless 3D model. These models and their tensor representations constitute the model library. During online recognition, a tensor from the scene is simultaneously matched with those in the library by casting votes. Similarity measures are calculated for the model tensors which receive the most votes. The model with the highest similarity is transformed to the scene and, if it aligns accurately with an object in the scene, that object is declared as recognized and is segmented. This process is repeated until the scene is completely segmented. Experiments were performed on real and synthetic data comprised of 55 models and 610 scenes and an overall recognition rate of 95 percent was achieved. Comparison with the spin images revealed that our algorithm is superior in terms of recognition rate and efficiency.
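
The hash table-based voting described in this abstract can be sketched in a much simplified form. In the example below, short numeric tuples stand in for the paper's tensors, and the quantization step, the function names, and the model names are all hypothetical; the sketch shows only the general idea of casting votes through a hash table, not the authors' matching or similarity computation.

```python
from collections import Counter, defaultdict


def quantize(descriptor, step=0.1):
    """Map a descriptor (tuple of floats) to a discrete hash key."""
    return tuple(round(x / step) for x in descriptor)


def build_library(models, step=0.1):
    """Index every model descriptor by its quantized key."""
    table = defaultdict(set)
    for name, descriptors in models.items():
        for d in descriptors:
            table[quantize(d, step)].add(name)
    return table


def vote(table, scene_descriptors, step=0.1):
    """Let each scene descriptor vote for every model sharing its key."""
    votes = Counter()
    for d in scene_descriptors:
        for name in table.get(quantize(d, step), ()):
            votes[name] += 1
    return votes


if __name__ == "__main__":
    # Hypothetical toy data: two models with two descriptors each.
    models = {
        "model_a": [(0.11, 0.52), (0.90, 0.33)],
        "model_b": [(0.40, 0.41), (0.72, 0.10)],
    }
    scene = [(0.12, 0.49), (0.88, 0.31), (0.42, 0.43)]
    print(vote(build_library(models), scene).most_common())
```

In a fuller pipeline the top-voted candidates would then be verified, for example by the similarity measure and alignment check the abstract describes.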
Digital and optical shape representation and pattern recognition; Proceedings of the Meeting, Orlando, FL, Apr. 4-6, 1988

NASA Technical Reports Server (NTRS)

Juday, Richard D. (Editor)

1988-01-01

The present conference discusses topics in pattern-recognition correlator architectures, digital stereo systems, geometric image transformations and their applications, topics in pattern recognition, filter algorithms, object detection and classification, shape representation techniques, and model-based object recognition methods. Attention is given to edge-enhancement preprocessing using liquid crystal TVs, massively-parallel optical data base management, three-dimensional sensing with polar exponential sensor arrays, the optical processing of imaging spectrometer data, hybrid associative memories and metric data models, the representation of shape primitives in neural networks, and the Monte Carlo estimation of moment invariants for pattern recognition.
Cognitive object recognition system (CORS)

NASA Astrophysics Data System (ADS)

Raju, Chaitanya; Varadarajan, Karthik Mahesh; Krishnamurthi, Niyant; Xu, Shuli; Biederman, Irving; Kelley, Troy

2010-04-01

We have developed a framework, Cognitive Object Recognition System (CORS), inspired by current neurocomputational models and psychophysical research in which multiple recognition algorithms (shape based geometric primitives, 'geons,' and non-geometric feature-based algorithms) are integrated to provide a comprehensive solution to object recognition and landmarking. Objects are defined as a combination of geons, corresponding to their simple parts, and the relations among the parts. However, those objects that are not easily decomposable into geons, such as bushes and trees, are recognized by CORS using "feature-based" algorithms. The unique interaction between these algorithms is a novel approach that combines the effectiveness of both algorithms and takes us closer to a generalized approach to object recognition. CORS allows recognition of objects through a larger range of poses using geometric primitives and performs well under heavy occlusion - about 35% of object surface is sufficient. Furthermore, geon composition of an object allows image understanding and reasoning even with novel objects. With reliable landmarking capability, the system improves vision-based robot navigation in GPS-denied environments. Feasibility of the CORS system was demonstrated with real stereo images captured from a Pioneer robot. The system can currently identify doors, door handles, staircases, trashcans and other relevant landmarks in the indoor environment.
Mechanisms of object recognition: what we have learned from pigeons

PubMed Central

Soto, Fabian A.; Wasserman, Edward A.

2014-01-01

Behavioral studies of object recognition in pigeons have been conducted for 50 years, yielding a large body of data. Recent work has been directed toward synthesizing this evidence and understanding the visual, associative, and cognitive mechanisms that are involved. The outcome is that pigeons are likely to be the non-primate species for which the computational mechanisms of object recognition are best understood. Here, we review this research and suggest that a core set of mechanisms for object recognition might be present in all vertebrates, including pigeons and people, making pigeons an excellent candidate model to study the neural mechanisms of object recognition. Behavioral and computational evidence suggests that error-driven learning participates in object category learning by pigeons and people, and recent neuroscientific research suggests that the basal ganglia, which are homologous in these species, may implement error-driven learning of stimulus-response associations. Furthermore, learning of abstract category representations can be observed in pigeons and other vertebrates. Finally, there is evidence that feedforward visual processing, a central mechanism in models of object recognition in the primate ventral stream, plays a role in object recognition by pigeons. We also highlight differences between pigeons and people in object recognition abilities, and propose candidate adaptive specializations which may explain them, such as holistic face processing and rule-based category learning in primates. From a modern comparative perspective, such specializations are to be expected regardless of the model species under study. The fact that we have a good idea of which aspects of object recognition differ in people and pigeons should be seen as an advantage over other animal models. From this perspective, we suggest that there is much to learn about human object recognition from studying the “simple” brains of pigeons. PMID:25352784
Interactive object recognition assistance: an approach to recognition starting from target objects

NASA Astrophysics Data System (ADS)

Geisler, Juergen; Littfass, Michael

1999-07-01

Recognition of target objects in remotely sensed imagery requires detailed knowledge about the target object domain as well as about the mapping properties of the sensing system. The art of object recognition is to combine both worlds appropriately and to provide models of target appearance with respect to sensor characteristics. Common approaches to supporting interactive object recognition are either driven from the sensor point of view and address the problem of displaying images in a manner adequate to the sensing system, or they focus on target objects and provide exhaustive encyclopedic information about this domain. Our paper discusses an approach to assisting interactive object recognition based on knowledge about target objects, taking into account the significance of object features with respect to characteristics of the sensed imagery, e.g. spatial and spectral resolution. An 'interactive recognition assistant' takes the image analyst through the interpretation process by indicating, step by step, the most significant features of objects in the current set of candidates. The significance of object features is expressed by pregenerated trees of significance, and by the dynamic computation of decision relevance for every feature at each step of the recognition process. In the context of this approach we discuss the question of modeling and storing the multisensorial/multispectral appearances of target objects and object classes as well as the problem of an adequate dynamic human-machine interface that takes into account various mental models of human image interpretation.
Orientation congruency effects for familiar objects: coordinate transformations in object recognition.

PubMed

Graf, M; Kaping, D; Bülthoff, H H

2005-03-01

How do observers recognize objects after spatial transformations? Recent neurocomputational models have proposed that object recognition is based on coordinate transformations that align memory and stimulus representations. If the recognition of a misoriented object is achieved by adjusting a coordinate system (or reference frame), then recognition should be facilitated when the object is preceded by a different object in the same orientation. In the two experiments reported here, two objects were presented in brief masked displays that were in close temporal contiguity; the objects were in either congruent or incongruent picture-plane orientations. Results showed that naming accuracy was higher for congruent than for incongruent orientations. The congruency effect was independent of superordinate category membership (Experiment 1) and was found for objects with different main axes of elongation (Experiment 2). The results indicate congruency effects for common familiar objects even when they have dissimilar shapes. These findings are compatible with models in which object recognition is achieved by an adjustment of a perceptual coordinate system.
Under what conditions is recognition spared relative to recall after selective hippocampal damage in humans?

PubMed

Holdstock, J S; Mayes, A R; Roberts, N; Cezayirli, E; Isaac, C L; O'Reilly, R C; Norman, K A

2002-01-01

The claim that recognition memory is spared relative to recall after focal hippocampal damage has been disputed in the literature. We examined this claim by investigating object and object-location recall and recognition memory in a patient, YR, who has adult-onset selective hippocampal damage. Our aim was to identify the conditions under which recognition was spared relative to recall in this patient. She showed unimpaired forced-choice object recognition but clearly impaired recall, even when her control subjects found the object recognition task to be numerically harder than the object recall task. However, on two other recognition tests, YR's performance was not relatively spared. First, she was clearly impaired at an equivalently difficult yes/no object recognition task, but only when targets and foils were very similar. Second, YR was clearly impaired at forced-choice recognition of object-location associations. This impairment was also unrelated to difficulty because this task was no more difficult than the forced-choice object recognition task for control subjects. The clear impairment of yes/no, but not of forced-choice, object recognition after focal hippocampal damage, when targets and foils are very similar, is predicted by the neural network-based Complementary Learning Systems model of recognition. This model postulates that recognition is mediated by hippocampally dependent recollection and cortically dependent familiarity; thus hippocampal damage should not impair item familiarity. The model postulates that familiarity is ineffective when very similar targets and foils are shown one at a time and subjects have to identify which items are old (yes/no recognition). In contrast, familiarity is effective in discriminating which of similar targets and foils, seen together, is old (forced-choice recognition). Independent evidence from the remember/know procedure also indicates that YR's familiarity is normal. 
The Complementary Learning Systems model can also accommodate the clear impairment of forced-choice object-location recognition memory if it incorporates the view that the most complete convergence of spatial and object information, represented in different cortical regions, occurs in the hippocampus.
Object recognition in images via a factor graph model

NASA Astrophysics Data System (ADS)

He, Yong; Wang, Long; Wu, Zhaolin; Zhang, Haisu

2018-04-01

Object recognition in images suffers from a huge search space and uncertain object profiles. Recently, Bag-of-Words methods have been used to address these problems, especially the two-dimensional CRF (Conditional Random Field) model. In this paper we propose a method based on a general and flexible factor graph model, which can capture long-range correlations in Bag-of-Words by constructing a network learning framework, in contrast to the lattice in CRF. Furthermore, we explore a parameter learning algorithm based on the gradient descent and Loopy Sum-Product algorithms for the factor graph model. Experimental results on the Graz 02 dataset show that the recognition performance of our method in precision and recall is better than that of a state-of-the-art method and the original CRF model, demonstrating the effectiveness of the proposed method.
Ignorance- versus evidence-based decision making: a decision time analysis of the recognition heuristic.

PubMed

Hilbig, Benjamin E; Pohl, Rüdiger F

2009-09-01

According to part of the adaptive toolbox notion of decision making known as the recognition heuristic (RH), the decision process in comparative judgments, and its duration, is determined by whether recognition discriminates between objects. By contrast, some recently proposed alternative models predict that choices largely depend on the amount of evidence speaking for each of the objects and that decision times thus depend on the evidential difference between objects, or the degree of conflict between options. This article presents 3 experiments that tested predictions derived from the RH against those from alternative models. All experiments used naturally recognized objects without teaching participants any information and thus provided optimal conditions for application of the RH. However, results supported the alternative, evidence-based models and often conflicted with the RH. Recognition was not the key determinant of decision times, whereas differences between objects with respect to (both positive and negative) evidence predicted effects well. In sum, alternative models that allow for the integration of different pieces of information may well provide a better account of comparative judgments. (c) 2009 APA, all rights reserved.
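
The two accounts contrasted in this abstract can be written as simple decision rules. The sketch below is only an illustration: the recognition heuristic is coded as choosing the recognized object when exactly one of the two is recognized, and the evidence-based alternative as choosing the object with more net evidence and reporting the size of the difference. The numeric evidence scores and function names are hypothetical and do not correspond to the models fitted in the article.

```python
def recognition_heuristic(recognized_a, recognized_b):
    """Return 'a' or 'b' if recognition discriminates, otherwise None."""
    if recognized_a and not recognized_b:
        return "a"
    if recognized_b and not recognized_a:
        return "b"
    return None  # both or neither recognized: the heuristic does not apply


def evidence_based_choice(evidence_a, evidence_b):
    """Return (choice, evidential difference) from net evidence scores.

    A small difference stands for high conflict between the options, which
    the evidence-based accounts associate with longer decision times.
    """
    diff = evidence_a - evidence_b
    if diff > 0:
        return "a", diff
    if diff < 0:
        return "b", -diff
    return None, 0


if __name__ == "__main__":
    print(recognition_heuristic(True, False))
    print(evidence_based_choice(evidence_a=3, evidence_b=1))
```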
Automatic anatomy recognition using neural network learning of object relationships via virtual landmarks

NASA Astrophysics Data System (ADS)

Yan, Fengxia; Udupa, Jayaram K.; Tong, Yubing; Xu, Guoping; Odhner, Dewey; Torigian, Drew A.

2018-03-01

The recently developed body-wide Automatic Anatomy Recognition (AAR) methodology depends on fuzzy modeling of individual objects, hierarchically arranging objects, constructing an anatomy ensemble of these models, and a dichotomous object recognition-delineation process. The parent-to-offspring spatial relationship in the object hierarchy is crucial in the AAR method. We have found this relationship to be quite complex, and as such any improvement in capturing this relationship information in the anatomy model will improve the process of recognition itself. Currently, the method encodes this relationship based on the layout of the geometric centers of the objects. Motivated by the concept of virtual landmarks (VLs), this paper presents a new one-shot AAR recognition method that utilizes the VLs to learn object relationships by training a neural network to predict the pose and the VLs of an offspring object given the VLs of the parent object in the hierarchy. We set up two neural networks for each parent-offspring object pair in a body region, one for predicting the VLs and another for predicting the pose parameters. The VL-based learning/prediction method is evaluated on two object hierarchies involving 14 objects. We utilize 54 computed tomography (CT) image data sets of head and neck cancer patients and the associated object contours drawn by dosimetrists for routine radiation therapy treatment planning. The VL neural network method is found to yield more accurate object localization than the currently used simple AAR method.
Orientation estimation of anatomical structures in medical images for object recognition

NASA Astrophysics Data System (ADS)

Bağci, Ulaş; Udupa, Jayaram K.; Chen, Xinjian

2011-03-01

Recognition of anatomical structures is an important step in model-based medical image segmentation. It provides pose estimation of objects and information about roughly "where" the objects are in the image, distinguishing them from other object-like entities. In [1], we presented a general method of model-based multi-object recognition to assist in segmentation (delineation) tasks. It exploits the pose relationship that can be encoded, via the concept of ball scale (b-scale), between the binary training objects and their associated grey images. The goal was to place the model, in a single shot, close to the right pose (position, orientation, and scale) in a given image so that the model boundaries fall in the close vicinity of object boundaries in the image. Unlike position and scale parameters, we observe that orientation parameters require more attention when estimating the pose of the model, as even small differences in orientation parameters can lead to inappropriate recognition. Motivated by the non-Euclidean nature of the pose information, we propose in this paper the use of non-Euclidean metrics to estimate the orientation of anatomical structures for more accurate recognition and segmentation. We statistically analyze and evaluate the following metrics for orientation estimation: Euclidean, Log-Euclidean, Root-Euclidean, Procrustes Size-and-Shape, and mean Hermitian metrics. The results show that mean Hermitian and Cholesky decomposition metrics provide more accurate orientation estimates than other Euclidean and non-Euclidean metrics.
Automatic anatomy recognition on CT images with pathology

NASA Astrophysics Data System (ADS)

Huang, Lidong; Udupa, Jayaram K.; Tong, Yubing; Odhner, Dewey; Torigian, Drew A.

2016-03-01

Body-wide anatomy recognition on CT images with pathology becomes crucial for quantifying body-wide disease burden. This, however, is a challenging problem because various diseases result in various abnormalities of objects such as shape and intensity patterns. We previously developed an automatic anatomy recognition (AAR) system [1] whose applicability was demonstrated on near normal diagnostic CT images in different body regions on 35 organs. The aim of this paper is to investigate strategies for adapting the previous AAR system to diagnostic CT images of patients with various pathologies as a first step toward automated body-wide disease quantification. The AAR approach consists of three main steps - model building, object recognition, and object delineation. In this paper, within the broader AAR framework, we describe a new strategy for object recognition to handle abnormal images. In the model building stage an optimal threshold interval is learned from near-normal training images for each object. This threshold is optimally tuned to the pathological manifestation of the object in the test image. Recognition is performed following a hierarchical representation of the objects. Experimental results for the abdominal body region based on 50 near-normal images used for model building and 20 abnormal images used for object recognition show that object localization accuracy within 2 voxels for liver and spleen and 3 voxels for kidney can be achieved with the new strategy.
Behavioral model of visual perception and recognition

NASA Astrophysics Data System (ADS)

Rybak, Ilya A.; Golovan, Alexander V.; Gusakova, Valentina I.

1993-09-01

In the processes of visual perception and recognition, human eyes actively select essential information by way of successive fixations at the most informative points of the image. A behavioral program defining a scanpath of the image is formed at the stage of learning (object memorizing) and consists of sequential motor actions, which are shifts of attention from one point of fixation to another, and sensory signals expected to arrive in response to each shift of attention. In the modern view of the problem, invariant object recognition is provided by the following: (1) separated processing of 'what' (object features) and 'where' (spatial features) information at high levels of the visual system; (2) mechanisms of visual attention using 'where' information; (3) representation of 'what' information in an object-based frame of reference (OFR). However, most recent models of vision based on OFR have demonstrated invariant recognition of only simple objects like letters or binary objects without background, i.e. objects to which a frame of reference is easily attached. In contrast, we use not OFR but a feature-based frame of reference (FFR), connected with the basic feature (edge) at the fixation point. This has given our model the ability to form invariant representations of complex objects in gray-level images, but demands realization of the behavioral aspects of vision described above. The developed model contains a neural network subsystem of low-level vision, which extracts a set of primary features (edges) in each fixation, and a high-level subsystem consisting of 'what' (Sensory Memory) and 'where' (Motor Memory) modules. The resolution of primary feature extraction decreases with distance from the point of fixation. FFR provides both the invariant representation of object features in Sensory Memory and shifts of attention in Motor Memory. 
Object recognition consists in successive recall (from Motor Memory) and execution of shifts of attention and successive verification of the expected sets of features (stored in Sensory Memory). The model shows the ability of recognition of complex objects (such as faces) in gray-level images invariant with respect to shift, rotation, and scale.

Shape and Color Features for Object Recognition Search

NASA Technical Reports Server (NTRS)

Duong, Tuan A.; Duong, Vu A.; Stubberud, Allen R.

2012-01-01

A bio-inspired shape feature of an object of interest emulates the integration of the saccadic eye movement and horizontal layer in vertebrate retina for object recognition search where a single object can be used one at a time. The optimal computational model for shape-extraction-based principal component analysis (PCA) was also developed to reduce processing time and enable the real-time adaptive system capability. A color feature of the object is employed as color segmentation to empower the shape feature recognition to solve the object recognition in the heterogeneous environment where a single technique - shape or color - may expose its difficulties. To enable the effective system, an adaptive architecture and autonomous mechanism were developed to recognize and adapt the shape and color feature of the moving object. The bio-inspired object recognition based on bio-inspired shape and color can be effective to recognize a person of interest in the heterogeneous environment where the single technique exposed its difficulties to perform effective recognition. Moreover, this work also demonstrates the mechanism and architecture of the autonomous adaptive system to enable the realistic system for the practical use in the future.
Short temporal asynchrony disrupts visual object recognition

PubMed Central

Singer, Jedediah M.; Kreiman, Gabriel

2014-01-01

Humans can recognize objects and scenes in a small fraction of a second. The cascade of signals underlying rapid recognition might be disrupted by temporally jittering different parts of complex objects. Here we investigated the time course over which shape information can be integrated to allow for recognition of complex objects. We presented fragments of object images in an asynchronous fashion and behaviorally evaluated categorization performance. We observed that visual recognition was significantly disrupted by asynchronies of approximately 30 ms, suggesting that spatiotemporal integration begins to break down with even small deviations from simultaneity. However, moderate temporal asynchrony did not completely obliterate recognition; in fact, integration of visual shape information persisted even with an asynchrony of 100 ms. We describe the data with a concise model based on the dynamic reduction of uncertainty about what image was presented. These results emphasize the importance of timing in visual processing and provide strong constraints for the development of dynamical models of visual shape recognition. PMID:24819738
Ball-scale based hierarchical multi-object recognition in 3D medical images

NASA Astrophysics Data System (ADS)

Bağci, Ulas; Udupa, Jayaram K.; Chen, Xinjian

2010-03-01

This paper investigates, using prior shape models and the concept of ball scale (b-scale), ways of automatically recognizing objects in 3D images without performing elaborate searches or optimization. That is, the goal is to place the model in a single shot close to the right pose (position, orientation, and scale) in a given image so that the model boundaries fall in the close vicinity of object boundaries in the image. This is achieved via the following set of key ideas: (a) A semi-automatic way of constructing a multi-object shape model assembly. (b) A novel strategy of encoding, via b-scale, the pose relationship between objects in the training images and their intensity patterns captured in b-scale images. (c) A hierarchical mechanism of positioning the model, in a one-shot way, in a given image from a knowledge of the learnt pose relationship and the b-scale image of the given image to be segmented. The evaluation results on a set of 20 routine clinical abdominal female and male CT data sets indicate the following: (1) Incorporating a large number of objects improves the recognition accuracy dramatically. (2) The recognition algorithm can be thought of as a hierarchical framework such that quick replacement of the model assembly is defined as coarse recognition and delineation itself is known as finest recognition. (3) Scale yields useful information about the relationship between the model assembly and any given image such that the recognition results in a placement of the model close to the actual pose without doing any elaborate searches or optimization. (4) Effective object recognition can make delineation most accurate.
Automatic recognition of ship types from infrared images using superstructure moment invariants

NASA Astrophysics Data System (ADS)

Li, Heng; Wang, Xinyu

2007-11-01

Automatic object recognition is an active area of interest for military and commercial applications. In this paper, a system addressing autonomous recognition of ship types in infrared images is proposed. Firstly, a segmentation approach based on detection of salient features of the target with subsequent shadow removal is proposed, which forms the basis for the subsequent object recognition. Considering that the differences between the shapes of various ships mainly lie in their superstructures, we then use superstructure moment functions invariant to translation, rotation and scale differences in input patterns and develop a robust algorithm for obtaining the ship superstructure. Subsequently, a back-propagation neural network is used as a classifier in the recognition stage, and projection images of simulated three-dimensional ship models are used as the training sets. Our recognition model was implemented and experimentally validated using both simulated three-dimensional ship model images and real images derived from video of an AN/AAS-44V Forward Looking Infrared (FLIR) sensor.
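The abstract does not define its superstructure moment functions, so the following is only a minimal sketch of the general idea, using the first two of Hu's classical moment invariants as a stand-in. The function names (hu_invariants, rotate90, shift) and the toy mask are illustrative assumptions, not taken from the paper.

```python
def hu_invariants(mask):
    """Return Hu's first two moment invariants for a binary mask (list of rows)."""
    # Coordinates of foreground pixels; assumes the mask has at least one.
    pixels = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    m00 = float(len(pixels))
    cx = sum(x for x, _ in pixels) / m00
    cy = sum(y for _, y in pixels) / m00

    def eta(p, q):
        # Central moment (translation invariant), normalized by area for scale.
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pixels)
        return mu / m00 ** (1 + (p + q) / 2)

    # Combinations of second-order moments that do not change under rotation.
    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2


def rotate90(mask):
    """Rotate a mask by 90 degrees clockwise."""
    return [list(row) for row in zip(*mask[::-1])]


def shift(mask, dx, dy):
    """Translate a mask by padding background pixels on the left and top."""
    width = len(mask[0]) + dx
    return [[0] * width for _ in range(dy)] + [[0] * dx + list(row) for row in mask]


# Hypothetical toy silhouette, used only to exercise the functions.
shape = [[1, 1, 1, 1],
         [1, 0, 0, 0],
         [1, 0, 0, 0]]
print(hu_invariants(shape))
print(hu_invariants(shift(shape, 3, 2)))
print(hu_invariants(rotate90(shape)))
```

On a pixel grid the invariance to translation and to 90-degree rotation holds up to floating-point rounding, while invariance to scale is only approximate because of discretization.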
Image processing strategies based on saliency segmentation for object recognition under simulated prosthetic vision.

PubMed

Li, Heng; Su, Xiaofan; Wang, Jing; Kan, Han; Han, Tingting; Zeng, Yajie; Chai, Xinyu

2018-01-01

Current retinal prostheses can only generate low-resolution visual percepts composed of a limited number of phosphenes, which are elicited by an electrode array and have uncontrollable color and restricted grayscale. Under this visual perception, prosthetic recipients can complete only some simple visual tasks, while more complex tasks like face identification/object recognition are extremely difficult. Therefore, it is necessary to investigate and apply image processing strategies for optimizing the visual perception of the recipients. This study focuses on recognition of the object of interest employing simulated prosthetic vision. We used a saliency segmentation method based on a biologically plausible graph-based visual saliency model and a GrabCut-based self-adaptive-iterative optimization framework to automatically extract foreground objects. Based on this, two image processing strategies, Addition of Separate Pixelization and Background Pixel Shrink, were further utilized to enhance the extracted foreground objects. i) Psychophysical experiments verified that, under simulated prosthetic vision, both strategies had marked advantages over Direct Pixelization in terms of recognition accuracy and efficiency. ii) We also found that recognition performance under the two strategies was tied to the segmentation results and was affected positively by the paired-interrelated objects in the scene. The use of the saliency segmentation method and image processing strategies can automatically extract and enhance foreground objects, and significantly improve object recognition performance for recipients implanted with a high-density implant. Copyright © 2017 Elsevier B.V. All rights reserved.
Body-wide hierarchical fuzzy modeling, recognition, and delineation of anatomy in medical images.

PubMed

Udupa, Jayaram K; Odhner, Dewey; Zhao, Liming; Tong, Yubing; Matsumoto, Monica M S; Ciesielski, Krzysztof C; Falcao, Alexandre X; Vaideeswaran, Pavithra; Ciesielski, Victoria; Saboury, Babak; Mohammadianrasanani, Syedmehrdad; Sin, Sanghun; Arens, Raanan; Torigian, Drew A

2014-07-01

To make Quantitative Radiology (QR) a reality in radiological practice, computerized body-wide Automatic Anatomy Recognition (AAR) becomes essential. With the goal of building a general AAR system that is not tied to any specific organ system, body region, or image modality, this paper presents an AAR methodology for localizing and delineating all major organs in different body regions based on fuzzy modeling ideas and a tight integration of fuzzy models with an Iterative Relative Fuzzy Connectedness (IRFC) delineation algorithm. The methodology consists of five main steps: (a) gathering image data for both building models and testing the AAR algorithms from patient image sets existing in our health system; (b) formulating precise definitions of each body region and organ and delineating them following these definitions; (c) building hierarchical fuzzy anatomy models of organs for each body region; (d) recognizing and locating organs in given images by employing the hierarchical models; and (e) delineating the organs following the hierarchy. In Step (c), we explicitly encode object size and positional relationships into the hierarchy and subsequently exploit this information in object recognition in Step (d) and delineation in Step (e). Modality-independent and dependent aspects are carefully separated in model encoding. At the model building stage, a learning process is carried out for rehearsing an optimal threshold-based object recognition method. The recognition process in Step (d) starts from large, well-defined objects and proceeds down the hierarchy in a global to local manner. A fuzzy model-based version of the IRFC algorithm is created by naturally integrating the fuzzy model constraints into the delineation algorithm. The AAR system is tested on three body regions - thorax (on CT), abdomen (on CT and MRI), and neck (on MRI and CT) - involving a total of over 35 organs and 130 data sets (the total used for model building and testing). 
The training and testing data sets are divided into equal size in all cases except for the neck. Overall the AAR method achieves a mean accuracy of about 2 voxels in localizing non-sparse blob-like objects and most sparse tubular objects. The delineation accuracy in terms of mean false positive and negative volume fractions is 2% and 8%, respectively, for non-sparse objects, and 5% and 15%, respectively, for sparse objects. The two object groups achieve mean boundary distance relative to ground truth of 0.9 and 1.5 voxels, respectively. Some sparse objects - venous system (in the thorax on CT), inferior vena cava (in the abdomen on CT), and mandible and naso-pharynx (in neck on MRI, but not on CT) - pose challenges at all levels, leading to poor recognition and/or delineation results. The AAR method fares quite favorably when compared with methods from the recent literature for liver, kidneys, and spleen on CT images. We conclude that separation of modality-independent from dependent aspects, organization of objects in a hierarchy, encoding of object relationship information explicitly into the hierarchy, optimal threshold-based recognition learning, and fuzzy model-based IRFC are effective concepts which allowed us to demonstrate the feasibility of a general AAR system that works in different body regions on a variety of organs and on different modalities. Copyright © 2014 Elsevier B.V. All rights reserved.
Three-dimensional object recognition based on planar images

NASA Astrophysics Data System (ADS)

Mital, Dinesh P.; Teoh, Eam-Khwang; Au, K. C.; Chng, E. K.

1993-01-01

This paper presents the development and realization of a robotic vision system for the recognition of 3-dimensional (3-D) objects. The system can recognize a single object from among a group of known regular convex polyhedron objects that is constrained to lie on a calibrated flat platform. The approach adopted comprises a series of image processing operations on a single 2-dimensional (2-D) intensity image to derive an image line drawing. Subsequently, a feature matching technique is employed to determine 2-D spatial correspondences of the image line drawing with the model in the database. Besides its identification ability, the system can also provide important position and orientation information of the recognized object. The system was implemented on an IBM-PC AT machine executing at 8 MHz without the 80287 Maths Co-processor. In our overall performance evaluation based on a 600 recognition cycles test, the system demonstrated an accuracy of above 80% with recognition time well within 10 seconds. The recognition time is, however, indirectly dependent on the number of models in the database. The reliability of the system is also affected by illumination conditions which must be clinically controlled as in any industrial robotic vision system.
Simple Learned Weighted Sums of Inferior Temporal Neuronal Firing Rates Accurately Predict Human Core Object Recognition Performance

PubMed Central

Hong, Ha; Solomon, Ethan A.; DiCarlo, James J.

2015-01-01

To go beyond qualitative models of the biological substrate of object recognition, we ask: can a single ventral stream neuronal linking hypothesis quantitatively account for core object recognition performance over a broad range of tasks? We measured human performance in 64 object recognition tests using thousands of challenging images that explore shape similarity and identity preserving object variation. We then used multielectrode arrays to measure neuronal population responses to those same images in visual areas V4 and inferior temporal (IT) cortex of monkeys and simulated V1 population responses. We tested leading candidate linking hypotheses and control hypotheses, each postulating how ventral stream neuronal responses underlie object recognition behavior. Specifically, for each hypothesis, we computed the predicted performance on the 64 tests and compared it with the measured pattern of human performance. All tested hypotheses based on low- and mid-level visually evoked activity (pixels, V1, and V4) were very poor predictors of the human behavioral pattern. However, simple learned weighted sums of distributed average IT firing rates exactly predicted the behavioral pattern. More elaborate linking hypotheses relying on IT trial-by-trial correlational structure, finer IT temporal codes, or ones that strictly respect the known spatial substructures of IT (“face patches”) did not improve predictive power. Although these results do not reject those more elaborate hypotheses, they suggest a simple, sufficient quantitative model: each object recognition task is learned from the spatially distributed mean firing rates (100 ms) of ∼60,000 IT neurons and is executed as a simple weighted sum of those firing rates. SIGNIFICANCE STATEMENT We sought to go beyond qualitative models of visual object recognition and determine whether a single neuronal linking hypothesis can quantitatively account for core object recognition behavior. 
To achieve this, we designed a database of images for evaluating object recognition performance. We used multielectrode arrays to characterize hundreds of neurons in the visual ventral stream of nonhuman primates and measured the object recognition performance of >100 human observers. Remarkably, we found that simple learned weighted sums of firing rates of neurons in monkey inferior temporal (IT) cortex accurately predicted human performance. Although previous work led us to expect that IT would outperform V4, we were surprised by the quantitative precision with which simple IT-based linking hypotheses accounted for human behavior. PMID:26424887
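As a rough illustration of what a "simple learned weighted sum of firing rates" can look like, the sketch below learns a linear readout for a two-object task from synthetic mean firing rates. Everything in it is an assumption made for illustration: the data are simulated with Python's random module, the weights are set to the difference of class means rather than fitted with the authors' procedure, and names such as learn_readout and demo_accuracy are hypothetical.

```python
import random


def simulate_trials(mean_rates, n_trials, noise_sd, rng):
    """Draw noisy population responses (one rate per unit) around the given means."""
    return [[rng.gauss(m, noise_sd) for m in mean_rates] for _ in range(n_trials)]


def learn_readout(trials_a, trials_b):
    """Learn weights and bias from training trials of two objects (mean-difference rule)."""
    n_units = len(trials_a[0])
    mean_a = [sum(t[i] for t in trials_a) / len(trials_a) for i in range(n_units)]
    mean_b = [sum(t[i] for t in trials_b) / len(trials_b) for i in range(n_units)]
    weights = [a - b for a, b in zip(mean_a, mean_b)]
    # Place the decision boundary halfway between the two class means.
    bias = -sum(w * (a + b) / 2.0 for w, a, b in zip(weights, mean_a, mean_b))
    return weights, bias


def readout(weights, bias, rates):
    """Decision variable: a weighted sum of firing rates plus a bias."""
    return sum(w * r for w, r in zip(weights, rates)) + bias


def demo_accuracy(seed=0, n_units=50, n_train=100, n_test=100):
    """Train and test the readout on synthetic data; returns the fraction correct."""
    rng = random.Random(seed)
    rates_a = [rng.uniform(5.0, 15.0) for _ in range(n_units)]
    rates_b = [m + rng.gauss(0.0, 2.0) for m in rates_a]
    weights, bias = learn_readout(simulate_trials(rates_a, n_train, 3.0, rng),
                                  simulate_trials(rates_b, n_train, 3.0, rng))
    test_a = simulate_trials(rates_a, n_test, 3.0, rng)
    test_b = simulate_trials(rates_b, n_test, 3.0, rng)
    correct = sum(readout(weights, bias, t) > 0 for t in test_a)
    correct += sum(readout(weights, bias, t) <= 0 for t in test_b)
    return correct / (2 * n_test)


print(demo_accuracy())
```

The point of the sketch is only that the decision is a single dot product per task; the study's comparison against human behavior across 64 tests is not reproduced here.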
Automatic image database generation from CAD for 3D object recognition

NASA Astrophysics Data System (ADS)

Sardana, Harish K.; Daemi, Mohammad F.; Ibrahim, Mohammad K.

1993-06-01

The development and evaluation of Multiple-View 3-D object recognition systems is based on a large set of model images. Due to the various advantages of using CAD, it is becoming more and more practical to use existing CAD data in computer vision systems. Current PC-level CAD systems are capable of providing physical image modelling and rendering involving positional variations in cameras, light sources etc. We have formulated a modular scheme for automatic generation of various aspects (views) of the objects in a model-based 3-D object recognition system. These views are generated at desired orientations on the unit Gaussian sphere. With a suitable network file sharing system (NFS), the images can directly be stored on a database located on a file server. This paper presents the image modelling solutions using CAD in relation to the multiple-view approach. Our modular scheme for data conversion and automatic image database storage for such a system is discussed. We have used this approach in 3-D polyhedron recognition. An overview of the results, the advantages and limitations of using CAD data, and conclusions from using such a scheme are also presented.
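The abstract does not say how view directions are chosen on the unit sphere. One common way to obtain roughly evenly spaced viewpoints is a Fibonacci (golden-angle) lattice, sketched below as an assumption rather than as the authors' scheme; fibonacci_sphere and to_azimuth_elevation are hypothetical helper names.

```python
import math


def fibonacci_sphere(n_views):
    """Return n_views approximately evenly spread unit vectors (camera directions)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n_views):
        # Heights are spaced uniformly in (-1, 1), which gives equal-area bands.
        z = 1.0 - (2.0 * i + 1.0) / n_views
        radius = math.sqrt(1.0 - z * z)
        theta = golden_angle * i
        points.append((radius * math.cos(theta), radius * math.sin(theta), z))
    return points


def to_azimuth_elevation(point):
    """Convert a unit vector to (azimuth, elevation) in degrees for a renderer."""
    x, y, z = point
    return math.degrees(math.atan2(y, x)), math.degrees(math.asin(z))


for view in fibonacci_sphere(8):
    print(to_azimuth_elevation(view))
```

Each direction could then be handed to a CAD renderer as a camera position looking at the object's centre; that rendering step depends on the CAD package and is left out.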
Hierarchical Context Modeling for Video Event Recognition.

PubMed

Wang, Xiaoyang; Ji, Qiang

2016-10-11

Current video event recognition research remains largely target-centered. For real-world surveillance videos, target-centered event recognition faces great challenges due to large intra-class target variation, limited image resolution, and poor detection and tracking results. To mitigate these challenges, we introduced a context-augmented video event recognition approach. Specifically, we explicitly capture different types of contexts from three levels including image level, semantic level, and prior level. At the image level, we introduce two types of contextual features including the appearance context features and interaction context features to capture the appearance of context objects and their interactions with the target objects. At the semantic level, we propose a deep model based on deep Boltzmann machine to learn event object representations and their interactions. At the prior level, we utilize two types of prior-level contexts including scene priming and dynamic cueing. Finally, we introduce a hierarchical context model that systematically integrates the contextual information at different levels. Through the hierarchical context model, contexts at different levels jointly contribute to the event recognition. We evaluate the hierarchical context model for event recognition on benchmark surveillance video datasets. Results show that incorporating contexts in each level can improve event recognition performance, and jointly integrating three levels of contexts through our hierarchical model achieves the best performance.
Traffic Behavior Recognition Using the Pachinko Allocation Model

PubMed Central

Huynh-The, Thien; Banos, Oresti; Le, Ba-Vui; Bui, Dinh-Mao; Yoon, Yongik; Lee, Sungyoung

2015-01-01

CCTV-based behavior recognition systems have gained considerable attention in recent years in the transportation surveillance domain for identifying unusual patterns, such as traffic jams, accidents, dangerous driving and other abnormal behaviors. In this paper, a novel approach for traffic behavior modeling is presented for video-based road surveillance. The proposed system combines the pachinko allocation model (PAM) and support vector machine (SVM) for a hierarchical representation and identification of traffic behavior. A background subtraction technique using Gaussian mixture models (GMMs) and an object tracking mechanism based on Kalman filters are utilized to firstly construct the object trajectories. Then, the sparse features comprising the locations and directions of the moving objects are modeled by PAM into traffic topics, namely activities and behaviors. As a key innovation, PAM captures not only the correlation among the activities, but also among the behaviors based on the arbitrary directed acyclic graph (DAG). The SVM classifier is then utilized on top to train and recognize the traffic activity and behavior. The proposed model shows more flexibility and greater expressive power than the commonly-used latent Dirichlet allocation (LDA) approach, leading to a higher recognition accuracy in the behavior classification. PMID:26151213
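The trajectory-building step relies on Kalman-filter tracking, but the abstract gives no state model or noise settings. The sketch below is a minimal one-dimensional constant-velocity Kalman filter under assumed values (dt, q, r and the initial covariance are illustrative, and kalman_track is a hypothetical name); a real tracker would typically use image-plane x and y coordinates of detections from the background-subtraction stage.

```python
def kalman_track(measurements, dt=1.0, q=1e-4, r=1.0):
    """Filter noisy 1-D positions; returns a list of (position, velocity) estimates."""
    pos, vel = 0.0, 0.0
    # State covariance P (2x2), initialised large to express an uninformed start.
    p00, p01, p10, p11 = 100.0, 0.0, 0.0, 100.0
    estimates = []
    for z in measurements:
        # Predict: x <- F x and P <- F P F^T + Q, with F = [[1, dt], [0, 1]].
        pos = pos + dt * vel
        n00 = p00 + dt * (p01 + p10) + dt * dt * p11 + q
        n01 = p01 + dt * p11
        n10 = p10 + dt * p11
        n11 = p11 + q
        p00, p01, p10, p11 = n00, n01, n10, n11
        # Update with a position-only measurement, H = [1, 0].
        innovation = z - pos
        s = p00 + r
        k0, k1 = p00 / s, p10 / s
        pos += k0 * innovation
        vel += k1 * innovation
        p00, p01, p10, p11 = ((1 - k0) * p00, (1 - k0) * p01,
                              p10 - k1 * p00, p11 - k1 * p01)
        estimates.append((pos, vel))
    return estimates


# An object moving 2 units per frame, observed without noise for simplicity.
track = kalman_track([2.0 * k for k in range(50)])
print(track[-1])
```

The filtered positions and velocities are the kind of location and direction features that the abstract says are passed on to the topic model.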
One-Reason Decision Making Unveiled: A Measurement Model of the Recognition Heuristic

ERIC Educational Resources Information Center

Hilbig, Benjamin E.; Erdfelder, Edgar; Pohl, Rudiger F.

2010-01-01

The fast-and-frugal recognition heuristic (RH) theory provides a precise process description of comparative judgments. It claims that, in suitable domains, judgments between pairs of objects are based on recognition alone, whereas further knowledge is ignored. However, due to the confound between recognition and further knowledge, previous…
SEMI-SUPERVISED OBJECT RECOGNITION USING STRUCTURE KERNEL

PubMed Central

Wang, Botao; Xiong, Hongkai; Jiang, Xiaoqian; Ling, Fan

2013-01-01

Object recognition is a fundamental problem in computer vision. Part-based models offer a sparse, flexible representation of objects, but suffer from difficulties in training and often use standard kernels. In this paper, we propose a positive definite kernel called “structure kernel”, which measures the similarity of two part-based represented objects. The structure kernel has three terms: 1) the global term that measures the global visual similarity of two objects; 2) the part term that measures the visual similarity of corresponding parts; 3) the spatial term that measures the spatial similarity of geometric configuration of parts. The contribution of this paper is to generalize the discriminant capability of local kernels to complex part-based object models. Experimental results show that the proposed kernel exhibits higher accuracy than state-of-the-art approaches using standard kernels. PMID:23666108
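The abstract names the three terms of the structure kernel but not their functional form. The sketch below shows one way such a kernel could be assembled, assuming Gaussian (RBF) similarities, equal weighting of the terms, a fixed part correspondence, and part positions taken relative to their centroid. None of these choices comes from the paper, and the object layout (keys "global", "parts", "positions") is hypothetical.

```python
import math


def rbf(u, v, gamma=1.0):
    """Gaussian similarity between two equal-length vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def centered(positions):
    """Express part positions relative to their centroid (removes translation)."""
    n = len(positions)
    cx = sum(p[0] for p in positions) / n
    cy = sum(p[1] for p in positions) / n
    return [(x - cx, y - cy) for x, y in positions]


def structure_kernel_sketch(obj_a, obj_b):
    """Sum of a global term, a per-part appearance term and a spatial term."""
    global_term = rbf(obj_a["global"], obj_b["global"])
    part_term = sum(rbf(fa, fb) for fa, fb in zip(obj_a["parts"], obj_b["parts"]))
    spatial_term = sum(rbf(pa, pb) for pa, pb in
                       zip(centered(obj_a["positions"]), centered(obj_b["positions"])))
    return global_term + part_term + spatial_term


# Two hypothetical objects with two parts each.
a = {"global": [0.2, 0.4], "parts": [[1.0, 0.0], [0.0, 1.0]],
     "positions": [(0.0, 0.0), (2.0, 0.0)]}
b = {"global": [0.9, 0.1], "parts": [[0.5, 0.5], [0.0, 1.0]],
     "positions": [(0.0, 0.0), (0.0, 3.0)]}
print(structure_kernel_sketch(a, a), structure_kernel_sketch(a, b))
```

Because each term is a Gaussian kernel applied to a fixed feature of the object, and sums of positive definite kernels are positive definite, this construction keeps the property the abstract highlights.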
Fast neuromimetic object recognition using FPGA outperforms GPU implementations.

PubMed

Orchard, Garrick; Martin, Jacob G; Vogelstein, R Jacob; Etienne-Cummings, Ralph

2013-08-01

Recognition of objects in still images has traditionally been regarded as a difficult computational problem. Although modern automated methods for visual object recognition have achieved steadily increasing recognition accuracy, even the most advanced computational vision approaches are unable to obtain performance equal to that of humans. This has led to the creation of many biologically inspired models of visual object recognition, among them the hierarchical model and X (HMAX) model. HMAX is traditionally known to achieve high accuracy in visual object recognition tasks at the expense of significant computational complexity. Increasing complexity, in turn, increases computation time, reducing the number of images that can be processed per unit time. In this paper we describe how the computationally intensive and biologically inspired HMAX model for visual object recognition can be modified for implementation on a commercial field-programmable gate array (FPGA), specifically the Xilinx Virtex 6 ML605 evaluation board with XC6VLX240T FPGA. We show that with minor modifications to the traditional HMAX model we can perform recognition on images of size 128 × 128 pixels at a rate of 190 images per second with a less than 1% loss in recognition accuracy in both binary and multiclass visual object recognition tasks.
Simple Learned Weighted Sums of Inferior Temporal Neuronal Firing Rates Accurately Predict Human Core Object Recognition Performance.

PubMed

Majaj, Najib J; Hong, Ha; Solomon, Ethan A; DiCarlo, James J

2015-09-30

To go beyond qualitative models of the biological substrate of object recognition, we ask: can a single ventral stream neuronal linking hypothesis quantitatively account for core object recognition performance over a broad range of tasks? We measured human performance in 64 object recognition tests using thousands of challenging images that explore shape similarity and identity preserving object variation. We then used multielectrode arrays to measure neuronal population responses to those same images in visual areas V4 and inferior temporal (IT) cortex of monkeys and simulated V1 population responses. We tested leading candidate linking hypotheses and control hypotheses, each postulating how ventral stream neuronal responses underlie object recognition behavior. Specifically, for each hypothesis, we computed the predicted performance on the 64 tests and compared it with the measured pattern of human performance. All tested hypotheses based on low- and mid-level visually evoked activity (pixels, V1, and V4) were very poor predictors of the human behavioral pattern. However, simple learned weighted sums of distributed average IT firing rates exactly predicted the behavioral pattern. More elaborate linking hypotheses relying on IT trial-by-trial correlational structure, finer IT temporal codes, or ones that strictly respect the known spatial substructures of IT ("face patches") did not improve predictive power. Although these results do not reject those more elaborate hypotheses, they suggest a simple, sufficient quantitative model: each object recognition task is learned from the spatially distributed mean firing rates (100 ms) of ∼60,000 IT neurons and is executed as a simple weighted sum of those firing rates. Significance statement: We sought to go beyond qualitative models of visual object recognition and determine whether a single neuronal linking hypothesis can quantitatively account for core object recognition behavior. 
To achieve this, we designed a database of images for evaluating object recognition performance. We used multielectrode arrays to characterize hundreds of neurons in the visual ventral stream of nonhuman primates and measured the object recognition performance of >100 human observers. Remarkably, we found that simple learned weighted sums of firing rates of neurons in monkey inferior temporal (IT) cortex accurately predicted human performance. Although previous work led us to expect that IT would outperform V4, we were surprised by the quantitative precision with which simple IT-based linking hypotheses accounted for human behavior. Copyright © 2015 the authors 0270-6474/15/3513402-17$15.00/0.
The memory state heuristic: A formal model based on repeated recognition judgments.

PubMed

Castela, Marta; Erdfelder, Edgar

2017-02-01

The recognition heuristic (RH) theory predicts that, in comparative judgment tasks, if one object is recognized and the other is not, the recognized one is chosen. The memory-state heuristic (MSH) extends the RH by assuming that choices are not affected by recognition judgments per se, but by the memory states underlying these judgments (i.e., recognition certainty, uncertainty, or rejection certainty). Specifically, the larger the discrepancy between memory states, the larger the probability of choosing the object in the higher state. The typical RH paradigm does not allow estimation of the underlying memory states because it is unknown whether the objects were previously experienced or not. Therefore, we extended the paradigm by repeating the recognition task twice. In line with high threshold models of recognition, we assumed that inconsistent recognition judgments result from uncertainty whereas consistent judgments most likely result from memory certainty. In Experiment 1, we fitted 2 nested multinomial models to the data: an MSH model that formalizes the relation between memory states and binary choices explicitly and an approximate model that ignores the (unlikely) possibility of consistent guesses. Both models provided converging results. As predicted, reliance on recognition increased with the discrepancy in the underlying memory states. In Experiment 2, we replicated these results and found support for choice consistency predictions of the MSH. Additionally, recognition and choice latencies were in agreement with the MSH in both experiments. Finally, we validated critical parameters of our MSH model through a cross-validation method and a third experiment. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
A biologically plausible computational model for auditory object recognition.

PubMed

Larson, Eric; Billimoria, Cyrus P; Sen, Kamal

2009-01-01

Object recognition is a task of fundamental importance for sensory systems. Although this problem has been intensively investigated in the visual system, relatively little is known about the recognition of complex auditory objects. Recent work has shown that spike trains from individual sensory neurons can be used to discriminate between and recognize stimuli. Multiple groups have developed spike similarity or dissimilarity metrics to quantify the differences between spike trains. Using a nearest-neighbor approach the spike similarity metrics can be used to classify the stimuli into groups used to evoke the spike trains. The nearest prototype spike train to the tested spike train can then be used to identify the stimulus. However, how biological circuits might perform such computations remains unclear. Elucidating this question would facilitate the experimental search for such circuits in biological systems, as well as the design of artificial circuits that can perform such computations. Here we present a biologically plausible model for discrimination inspired by a spike distance metric using a network of integrate-and-fire model neurons coupled to a decision network. We then apply this model to the birdsong system in the context of song discrimination and recognition. We show that the model circuit is effective at recognizing individual songs, based on experimental input data from field L, the avian primary auditory cortex analog. We also compare the performance and robustness of this model to two alternative models of song discrimination: a model based on coincidence detection and a model based on firing rate.
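The nearest-prototype classification described above can be sketched with a spike-train distance. The abstract does not name the metric, so the example assumes a van Rossum-style distance (each spike smoothed with a causal exponential kernel of time constant tau, then the squared difference integrated in closed form). The spike times, labels and the 10 ms time constant are invented for illustration.

```python
import math


def _overlap(train_a, train_b, tau):
    """Sum of exp(-|t_a - t_b| / tau) over all spike pairs."""
    return sum(math.exp(-abs(a - b) / tau) for a in train_a for b in train_b)


def van_rossum_distance(train_a, train_b, tau=10.0):
    """Distance between two spike trains (lists of spike times, same units as tau)."""
    d2 = 0.5 * tau * (_overlap(train_a, train_a, tau)
                      + _overlap(train_b, train_b, tau)
                      - 2.0 * _overlap(train_a, train_b, tau))
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(d2, 0.0))


def classify(train, prototypes, tau=10.0):
    """Return the label of the prototype spike train nearest to the tested train."""
    return min(prototypes,
               key=lambda label: van_rossum_distance(train, prototypes[label], tau))


# Hypothetical prototype responses (spike times in ms) to two songs.
prototypes = {"song_a": [10.0, 50.0, 90.0], "song_b": [30.0, 70.0]}
print(classify([12.0, 48.0, 93.0], prototypes))
```

This is the metric-based baseline that the abstract contrasts with circuit models; the integrate-and-fire network itself is not modeled here.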
Automatic anatomy recognition via multiobject oriented active shape models.

PubMed

Chen, Xinjian; Udupa, Jayaram K; Alavi, Abass; Torigian, Drew A

2010-12-01

This paper studies the feasibility of developing an automatic anatomy recognition (AAR) system in clinical radiology and demonstrates its operation on clinical 2D images. The anatomy recognition method described here consists of two main components: (a) multiobject generalization of OASM and (b) object recognition strategies. The OASM algorithm is generalized to multiple objects by including a model for each object and assigning a cost structure specific to each object in the spirit of live wire. The delineation of multiobject boundaries is done in MOASM via a three level dynamic programming algorithm, wherein the first level is at pixel level which aims to find optimal oriented boundary segments between successive landmarks, the second level is at landmark level which aims to find optimal location for the landmarks, and the third level is at the object level which aims to find optimal arrangement of object boundaries over all objects. The object recognition strategy attempts to find that pose vector (consisting of translation, rotation, and scale component) for the multiobject model that yields the smallest total boundary cost for all objects. The delineation and recognition accuracies were evaluated separately utilizing routine clinical chest CT, abdominal CT, and foot MRI data sets. The delineation accuracy was evaluated in terms of true and false positive volume fractions (TPVF and FPVF). The recognition accuracy was assessed (1) in terms of the size of the space of the pose vectors for the model assembly that yielded high delineation accuracy, (2) as a function of the number of objects and objects' distribution and size in the model, (3) in terms of the interdependence between delineation and recognition, and (4) in terms of the closeness of the optimum recognition result to the global optimum. When multiple objects are included in the model, the delineation accuracy in terms of TPVF can be improved to 97%-98% with a low FPVF of 0.1%-0.2%. 
Typically, a recognition accuracy of ≥ 90% yielded a TPVF ≥ 95% and FPVF ≤ 0.5%. Over the three data sets and over all tested objects, in 97% of the cases, the optimal solutions found by the proposed method constituted the true global optimum. The experimental results showed the feasibility and efficacy of the proposed automatic anatomy recognition system. Increasing the number of objects in the model can significantly improve both recognition and delineation accuracy. A more spread-out arrangement of objects in the model can lead to improved recognition and delineation accuracy. Including larger objects in the model also improved recognition and delineation. The proposed method almost always finds globally optimum solutions.
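The delineation metrics named in this abstract, TPVF and FPVF, can be illustrated with a minimal Python sketch. The function name, the flat-list mask representation, and the choice to normalize FPVF by the number of background voxels are assumptions made for illustration; the abstract does not state the exact definitions the authors used.

```python
def volume_fractions(segmented, truth):
    """Return (TPVF, FPVF) for two equal-length binary masks (one 0/1 entry per voxel)."""
    if len(segmented) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled as object in both masks.
    tp = sum(1 for s, t in zip(segmented, truth) if s and t)
    # Voxels labeled as object by the segmentation but not by the ground truth.
    fp = sum(1 for s, t in zip(segmented, truth) if s and not t)
    n_object = sum(1 for t in truth if t)
    n_background = len(truth) - n_object
    if n_object == 0 or n_background == 0:
        raise ValueError("ground truth must contain both object and background voxels")
    # Assumed convention: TPVF relative to the true object, FPVF relative to the background.
    return tp / n_object, fp / n_background


truth = [0, 1, 1, 1, 1, 0, 0, 0, 0, 0]
segmented = [0, 1, 1, 1, 0, 1, 0, 0, 0, 0]
tpvf, fpvf = volume_fractions(segmented, truth)
print(f"TPVF = {tpvf:.3f}, FPVF = {fpvf:.3f}")
```

Other conventions normalize FPVF by the true object volume instead of the background, so reported values are only comparable when the definition is stated.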
Hierarchical Object Recognition Using Libraries of Parameterized Model Sub-Parts.

DTIC Science & Technology

1987-06-01

Sketch Structure Hierarchy Constrained Search. This thesis describes the... these hierarchies to achieve robust recognition based on effective organization and indexing schemes for model libraries. The goal of the system is to...with different relative scaling, rotation, or translation than in the models. The approach taken in this thesis is to develop an object shape
Exploiting range imagery: techniques and applications

NASA Astrophysics Data System (ADS)

Armbruster, Walter

2009-07-01

Practically no applications exist for which automatic processing of 2D intensity imagery can equal human visual perception. This is not the case for range imagery. The paper gives examples of 3D laser radar applications, for which automatic data processing can exceed human visual cognition capabilities and describes basic processing techniques for attaining these results. The examples are drawn from the fields of helicopter obstacle avoidance, object detection in surveillance applications, object recognition at high range, multi-object-tracking, and object re-identification in range image sequences. Processing times and recognition performances are summarized. The techniques used exploit the bijective continuity of the imaging process as well as its independence of object reflectivity, emissivity and illumination. This allows precise formulations of the probability distributions involved in figure-ground segmentation, feature-based object classification and model based object recognition. The probabilistic approach guarantees optimal solutions for single images and enables Bayesian learning in range image sequences. Finally, due to recent results in 3D-surface completion, no prior model libraries are required for recognizing and re-identifying objects of quite general object categories, opening the way to unsupervised learning and fully autonomous cognitive systems.

Object recognition with hierarchical discriminant saliency networks.

PubMed

Han, Sunhyoung; Vasconcelos, Nuno

2014-01-01

The benefits of integrating attention and object recognition are investigated. While attention is frequently modeled as a pre-processor for recognition, we investigate the hypothesis that attention is an intrinsic component of recognition and vice-versa. This hypothesis is tested with a recognition model, the hierarchical discriminant saliency network (HDSN), whose layers are top-down saliency detectors, tuned for a visual class according to the principles of discriminant saliency. As a model of neural computation, the HDSN has two possible implementations. In a biologically plausible implementation, all layers comply with the standard neurophysiological model of visual cortex, with sub-layers of simple and complex units that implement a combination of filtering, divisive normalization, pooling, and non-linearities. In a convolutional neural network implementation, all layers are convolutional and implement a combination of filtering, rectification, and pooling. The rectification is performed with a parametric extension of the now popular rectified linear units (ReLUs), whose parameters can be tuned for the detection of target object classes. This enables a number of functional enhancements over neural network models that lack a connection to saliency, including optimal feature denoising mechanisms for recognition, modulation of saliency responses by the discriminant power of the underlying features, and the ability to detect both feature presence and absence. In either implementation, each layer has a precise statistical interpretation, and all parameters are tuned by statistical learning. Each saliency detection layer learns more discriminant saliency templates than its predecessors and higher layers have larger pooling fields. This enables the HDSN to simultaneously achieve high selectivity to target object classes and invariance. 
The performance of the network in saliency and object recognition tasks is compared to those of models from the biological and computer vision literatures. This demonstrates benefits for all the functional enhancements of the HDSN, the class tuning inherent to discriminant saliency, and saliency layers based on templates of increasing target selectivity and invariance. Altogether, these experiments suggest that there are non-trivial benefits in integrating attention and recognition.
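The convolutional implementation described in this abstract combines filtering, rectification, and pooling in each layer. A minimal one-dimensional sketch of such a layer follows. It uses a plain ReLU with a hypothetical `threshold` parameter as a stand-in for the paper's parametric rectifier, whose exact form the abstract does not give, and every name here is illustrative.

```python
def conv1d_valid(signal, kernel):
    """Cross-correlate signal with kernel, keeping only fully overlapping positions."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def rectify(values, threshold=0.0):
    """Shifted ReLU: responses at or below the threshold are set to zero."""
    return [max(0.0, v - threshold) for v in values]


def max_pool(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


def layer(signal, kernel, threshold=0.0, pool_size=2):
    """One filter -> rectify -> pool stage."""
    return max_pool(rectify(conv1d_valid(signal, kernel), threshold), pool_size)


signal = [0.0, 1.0, 3.0, 1.0, 0.0, -2.0, 0.0]
edge_kernel = [-1.0, 1.0]  # responds to rising edges
out = layer(signal, edge_kernel)
print(out)
```

Raising `threshold` suppresses weak filter responses before pooling, which is the kind of tunable behavior a parametric rectifier is meant to provide.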
Sensor agnostic object recognition using a map seeking circuit

NASA Astrophysics Data System (ADS)

Overman, Timothy L.; Hart, Michael

2012-05-01

Automatic object recognition capabilities are traditionally tuned to exploit the specific sensing modality they were designed for. Their successes (and shortcomings) are tied to object segmentation from the background, they typically require highly skilled personnel to train them, and they become cumbersome with the introduction of new objects. In this paper we describe a sensor independent algorithm based on the biologically inspired technology of map seeking circuits (MSC) which overcomes many of these obstacles. In particular, the MSC concept offers transparency in object recognition from a common interface to all sensor types, analogous to a USB device. It also provides a common core framework that is independent of the sensor and expandable to support high dimensionality decision spaces. Ease in training is assured by using commercially available 3D models from the video game community. The search time remains linear no matter how many objects are introduced, ensuring rapid object recognition. Here, we report results of an MSC algorithm applied to object recognition and pose estimation from high range resolution radar (1D), electrooptical imagery (2D), and LIDAR point clouds (3D) separately. By abstracting the sensor phenomenology from the underlying a priori knowledge base, MSC shows promise as an easily adaptable tool for incorporating additional sensor inputs.
Fast and efficient indexing approach for object recognition

NASA Astrophysics Data System (ADS)

Hefnawy, Alaa; Mashali, Samia A.; Rashwan, Mohsen; Fikri, Magdi

1999-08-01

This paper introduces a fast and efficient indexing approach for both 2D and 3D model-based object recognition in the presence of rotation, translation, and scale variations of objects. The indexing entries are computed after preprocessing the data by Haar wavelet decomposition. The scheme is based on a unified image feature detection approach based on Zernike moments. A set of low-level features, e.g. high-precision edges and gray-level corners, is estimated by a set of orthogonal Zernike moments, calculated locally around every image point. High-dimensional, highly descriptive indexing entries are then calculated based on the correlation of these local features and employed for fast access to the model database to generate hypotheses. A list of the top candidate models is then presented by evaluating the hypotheses. Experimental results are included to demonstrate the effectiveness of the proposed indexing approach.
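The preprocessing step mentioned in this abstract, Haar wavelet decomposition, can be illustrated with a single-level transform of a one-dimensional signal. This is a minimal sketch that assumes the orthonormal scaling convention and an even-length input; the abstract does not specify the paper's actual decomposition (dimensionality, number of levels), and the function names are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Scaled pairwise sums capture the coarse signal; scaled differences capture local change.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_step, recovering the original samples."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


signal = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 8.0]
approx, detail = haar_step(signal)
restored = haar_inverse(approx, detail)
```

Applying `haar_step` again to `approx` gives the next coarser level, which is how a multi-level decomposition halves the data volume at each stage.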
Introducing memory and association mechanism into a biologically inspired visual model.

PubMed

Qiao, Hong; Li, Yinlin; Tang, Tang; Wang, Peng

2014-09-01

A famous biologically inspired hierarchical model (HMAX model), which was proposed recently and corresponds to V1 to V4 of the ventral pathway in primate visual cortex, has been successfully applied to multiple visual recognition tasks. The model achieves position- and scale-tolerant recognition, which is a central problem in pattern recognition. In this paper, based on some other biological experimental evidence, we introduce the memory and association mechanism into the HMAX model. The main contributions of the work are: 1) mimicking the active memory and association mechanism and adding top-down adjustment to the HMAX model, which is the first attempt to add active adjustment to this famous model, and 2) from the perspective of information, algorithms based on the new model can reduce computational storage requirements and achieve good recognition performance. The new model is also applied to object recognition processes. The primary experimental results show that our method is efficient with a much lower memory requirement.
Feedforward object-vision models only tolerate small image variations compared to human

PubMed Central

Ghodrati, Masoud; Farzmahdi, Amirhossein; Rajaei, Karim; Ebrahimpour, Reza; Khaligh-Razavi, Seyed-Mahdi

2014-01-01

Invariant object recognition is a remarkable ability of the primate visual system, and its underlying mechanism has been under constant, intense investigation. Computational modeling is a valuable tool toward understanding the processes involved in invariant object recognition. Although recent computational models have shown outstanding performances on challenging image databases, they fail to perform well in image categorization under more complex image variations. Studies have shown that making sparse representation of objects by extracting more informative visual features through a feedforward sweep can lead to higher recognition performances. Here, however, we show that when the complexity of image variations is high, even this approach results in poor performance compared to humans. To assess the performance of models and humans in invariant object recognition tasks, we built a parametrically controlled image database consisting of several object categories varied in different dimensions and levels, rendered from 3D planes. Comparing the performance of several object recognition models with human observers shows that the models perform similarly to humans in categorization tasks only under low-level image variations. Furthermore, the results of our behavioral experiments demonstrate that, even under difficult experimental conditions (i.e., briefly presented masked stimuli with complex image variations), human observers performed outstandingly well, suggesting that the models are still far from resembling humans in invariant object recognition. Taken together, we suggest that learning sparse informative visual features, although desirable, is not a complete solution for future progress in object-vision modeling. We show that this approach is not of significant help in solving the computational crux of object recognition (i.e., invariant object recognition) when the identity-preserving image variations become more complex. PMID:25100986
Advances in image compression and automatic target recognition; Proceedings of the Meeting, Orlando, FL, Mar. 30, 31, 1989

NASA Technical Reports Server (NTRS)

Tescher, Andrew G. (Editor)

1989-01-01

Various papers on image compression and automatic target recognition are presented. Individual topics addressed include: target cluster detection in cluttered SAR imagery, model-based target recognition using laser radar imagery, Smart Sensor front-end processor for feature extraction of images, object attitude estimation and tracking from a single video sensor, symmetry detection in human vision, analysis of high resolution aerial images for object detection, obscured object recognition for an ATR application, neural networks for adaptive shape tracking, statistical mechanics and pattern recognition, detection of cylinders in aerial range images, moving object tracking using local windows, new transform method for image data compression, quad-tree product vector quantization of images, predictive trellis encoding of imagery, reduced generalized chain code for contour description, compact architecture for a real-time vision system, use of human visibility functions in segmentation coding, color texture analysis and synthesis using Gibbs random fields.
Image Processing Strategies Based on a Visual Saliency Model for Object Recognition Under Simulated Prosthetic Vision.

PubMed

Wang, Jing; Li, Heng; Fu, Weizhen; Chen, Yao; Li, Liming; Lyu, Qing; Han, Tingting; Chai, Xinyu

2016-01-01

Retinal prostheses have the potential to restore partial vision. Object recognition in scenes of daily life is one of the essential tasks for implant wearers. Because wearers are still limited by the low-resolution visual percepts provided by retinal prostheses, it is important to investigate and apply image processing methods to convey more useful visual information to them. We proposed two image processing strategies based on Itti's visual saliency map, region of interest (ROI) extraction, and image segmentation. Itti's saliency model generated a saliency map from the original image, in which salient regions were grouped into ROI by fuzzy c-means clustering. Then Grabcut generated a proto-object from the ROI-labeled image, which was recombined with the background and enhanced in two ways: 8-4 separated pixelization (8-4 SP) and background edge extraction (BEE). Results showed that both 8-4 SP and BEE had significantly higher recognition accuracy in comparison with direct pixelization (DP). Each saliency-based image processing strategy was subject to the performance of image segmentation. Under good and perfect segmentation conditions, BEE and 8-4 SP obtained noticeably higher recognition accuracy than DP, and under the bad segmentation condition, only BEE boosted the performance. The application of saliency-based image processing strategies was verified to be beneficial to object recognition in daily scenes under simulated prosthetic vision. These strategies may help the development of the image processing module for future retinal prostheses, and thus provide more benefit for patients. Copyright © 2015 International Center for Artificial Organs and Transplantation and Wiley Periodicals, Inc.
Ignorance- versus Evidence-Based Decision Making: A Decision Time Analysis of the Recognition Heuristic

ERIC Educational Resources Information Center

Hilbig, Benjamin E.; Pohl, Rudiger F.

2009-01-01

According to part of the adaptive toolbox notion of decision making known as the recognition heuristic (RH), the decision process in comparative judgments--and its duration--is determined by whether recognition discriminates between objects. By contrast, some recently proposed alternative models predict that choices largely depend on the amount of…
Appearance-based face recognition and light-fields.

PubMed

Gross, Ralph; Matthews, Iain; Baker, Simon

2004-04-01

Arguably the most important decision to be made when developing an object recognition algorithm is selecting the scene measurements or features on which to base the algorithm. In appearance-based object recognition, the features are chosen to be the pixel intensity values in an image of the object. These pixel intensities correspond directly to the radiance of light emitted from the object along certain rays in space. The set of all such radiance values over all possible rays is known as the plenoptic function or light-field. In this paper, we develop a theory of appearance-based object recognition from light-fields. This theory leads directly to an algorithm for face recognition across pose that uses as many images of the face as are available, from one upwards. All of the pixels, whichever image they come from, are treated equally and used to estimate the (eigen) light-field of the object. The eigen light-field is then used as the set of features on which to base recognition, analogously to how the pixel intensities are used in appearance-based face and object recognition.
Activity and function recognition for moving and static objects in urban environments from wide-area persistent surveillance inputs

NASA Astrophysics Data System (ADS)

Levchuk, Georgiy; Bobick, Aaron; Jones, Eric

2010-04-01

In this paper, we describe results from experimental analysis of a model designed to recognize activities and functions of moving and static objects from low-resolution wide-area video inputs. Our model is based on representing the activities and functions using three variables: (i) time; (ii) space; and (iii) structures. The activity and function recognition is achieved by imposing lexical, syntactic, and semantic constraints on the lower-level event sequences. In the reported research, we have evaluated the utility and sensitivity of several algorithms derived from natural language processing and pattern recognition domains. We achieved high recognition accuracy for a wide range of activity and function types in the experiments using Electro-Optical (EO) imagery collected by Wide Area Airborne Surveillance (WAAS) platform.
View-Based Models of 3D Object Recognition and Class-Specific Invariance

DTIC Science & Technology

1994-04-01

underlie recognition of geon-like components (see Edelman, 1991 and Biederman, 1987). $\|x - t_a\|_W^2 = (x - t_a)^T W^T W (x - t_a)$ (3) View-invariant features...Institute of Technology, 1993. neocortex. Biological Cybernetics, 1992. [14] I. Biederman. Recognition by components: a theory of human image understanding. Psychol. Review, 94:115-147, 1987. [20] B. Olshausen, C...Anderson, and D. Van Essen. A neural model of visual attention and invariant pattern
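Equation (3) in this excerpt appears to be the standard weighted squared norm, in which the distance between an input x and a center t is measured after applying a weight matrix W, so that ‖x − t‖²_W = (x − t)ᵀ WᵀW (x − t). Under that reading, which is an assumption because the extracted text is damaged, the following sketch checks numerically that the two sides agree. The helper names and the example values are illustrative only.

```python
def mat_vec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def weighted_sq_norm(x, t, W):
    """Left-hand side: squared Euclidean length of W (x - t)."""
    diff = [xi - ti for xi, ti in zip(x, t)]
    return sum(c * c for c in mat_vec(W, diff))


def quadratic_form(x, t, W):
    """Right-hand side: (x - t)^T (W^T W) (x - t), with W^T W formed explicitly."""
    diff = [xi - ti for xi, ti in zip(x, t)]
    n = len(diff)
    M = [[sum(W[k][i] * W[k][j] for k in range(len(W))) for j in range(n)]
         for i in range(n)]
    return sum(diff[i] * M[i][j] * diff[j] for i in range(n) for j in range(n))


x = [3.0, 1.0]
t = [1.0, 0.0]
W = [[2.0, 0.0],
     [1.0, 1.0]]
lhs = weighted_sq_norm(x, t, W)
rhs = quadratic_form(x, t, W)
```

With W equal to the identity matrix, both expressions reduce to the ordinary squared Euclidean distance between x and t.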
Research on autonomous identification of airport targets based on Gabor filtering and Radon transform

NASA Astrophysics Data System (ADS)

Yi, Juan; Du, Qingyu; Zhang, Hong jiang; Zhang, Yao lei

2017-11-01

Target recognition is currently a key technology in intelligent image processing and application development. With the enhancement of computer processing ability, autonomous target recognition algorithms have gradually become more intelligent and have shown good adaptability. Taking the airport target as the research object, this work analyzes airport layout characteristics, constructs a knowledge model, and independently designs a target recognition algorithm based on Gabor filtering and the Radon transform. After image processing and feature extraction of the airport, the algorithm was verified and achieved better recognition results.
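Gabor filtering, one of the two techniques this abstract names, convolves the image with kernels formed by a Gaussian envelope multiplied by an oriented sinusoid. A minimal sketch of building one real-valued kernel follows, using the common textbook parameterization. The abstract gives no parameter values, so the function name, argument names, and example settings are all assumptions.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, psi=0.0):
    """Return a size x size real Gabor kernel as a list of rows (size must be odd)."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a center pixel")
    half = size // 2
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    kernel = []
    for row in range(size):
        y = row - half
        values = []
        for col in range(size):
            x = col - half
            # Rotate coordinates so the sinusoid varies along direction theta.
            x_r = x * cos_t + y * sin_t
            y_r = -x * sin_t + y * cos_t
            envelope = math.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * x_r / wavelength + psi)
            values.append(envelope * carrier)
        kernel.append(values)
    return kernel


kernel = gabor_kernel(size=7, wavelength=4.0, theta=math.pi / 4, sigma=2.0)
```

A bank of such kernels at several values of `theta` responds strongly to straight structures at matching orientations, which is plausibly why the technique suits elongated features such as runways.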
3D automatic anatomy recognition based on iterative graph-cut-ASM

NASA Astrophysics Data System (ADS)

Chen, Xinjian; Udupa, Jayaram K.; Bagci, Ulas; Alavi, Abass; Torigian, Drew A.

2010-02-01

We call the computerized assistive process of recognizing, delineating, and quantifying organs and tissue regions in medical imaging, occurring automatically during clinical image interpretation, automatic anatomy recognition (AAR). The AAR system we are developing includes five main parts: model building, object recognition, object delineation, pathology detection, and organ system quantification. In this paper, we focus on the delineation part. For the modeling part, we employ the active shape model (ASM) strategy. For recognition and delineation, we integrate several hybrid strategies of combining purely image based methods with ASM. In this paper, an iterative Graph-Cut ASM (IGCASM) method is proposed for object delineation. An algorithm called GC-ASM was presented at this symposium last year for object delineation in 2D images which attempted to combine synergistically ASM and GC. Here, we extend this method to 3D medical image delineation. The IGCASM method effectively combines the rich statistical shape information embodied in ASM with the globally optimal delineation capability of the GC method. We propose a new GC cost function, which effectively integrates the specific image information with the ASM shape model information. The proposed methods are tested on a clinical abdominal CT data set. The preliminary results show that: (a) it is feasible to explicitly bring prior 3D statistical shape information into the GC framework; (b) the 3D IGCASM delineation method improves on ASM and GC and can provide practical operational time on clinical images.
The Memory State Heuristic: A Formal Model Based on Repeated Recognition Judgments

ERIC Educational Resources Information Center

Castela, Marta; Erdfelder, Edgar

2017-01-01

The recognition heuristic (RH) theory predicts that, in comparative judgment tasks, if one object is recognized and the other is not, the recognized one is chosen. The memory-state heuristic (MSH) extends the RH by assuming that choices are not affected by recognition judgments per se, but by the memory states underlying these judgments (i.e.,…
Automatic thoracic anatomy segmentation on CT images using hierarchical fuzzy models and registration

NASA Astrophysics Data System (ADS)

Sun, Kaioqiong; Udupa, Jayaram K.; Odhner, Dewey; Tong, Yubing; Torigian, Drew A.

2014-03-01

This paper proposes a thoracic anatomy segmentation method based on hierarchical recognition and delineation guided by a built fuzzy model. Labeled binary samples for each organ are registered and aligned into a 3D fuzzy set representing the fuzzy shape model for the organ. The gray intensity distributions of the corresponding regions of the organ in the original image are recorded in the model. The hierarchical relation and mean location relation between different organs are also captured in the model. Following the hierarchical structure and location relation, the fuzzy shape model of different organs is registered to the given target image to achieve object recognition. A fuzzy connected delineation method is then used to obtain the final segmentation result of organs with seed points provided by recognition. The hierarchical structure and location relation integrated in the model provide the initial parameters for registration and make the recognition efficient and robust. The 3D fuzzy model combined with hierarchical affine registration ensures that accurate recognition can be obtained for both non-sparse and sparse organs. The results on real images are presented and shown to be better than a recently reported fuzzy model-based anatomy recognition strategy.
Recognition of upper airway and surrounding structures at MRI in pediatric PCOS and OSAS

NASA Astrophysics Data System (ADS)

Tong, Yubing; Udupa, J. K.; Odhner, D.; Sin, Sanghun; Arens, Raanan

2013-03-01

Obstructive Sleep Apnea Syndrome (OSAS) is common in obese children with risk being 4.5 fold compared to normal control subjects. Polycystic Ovary Syndrome (PCOS) has recently been shown to be associated with OSAS that may further lead to significant cardiovascular and neuro-cognitive deficits. We are investigating image-based biomarkers to understand the architectural and dynamic changes in the upper airway and the surrounding hard and soft tissue structures via MRI in obese teenage children to study OSAS. At previous SPIE conferences, we presented methods underlying Fuzzy Object Models (FOMs) for Automatic Anatomy Recognition (AAR) based on CT images of the thorax and the abdomen. The purpose of this paper is to demonstrate that the AAR approach is applicable to a different body region and image modality combination, namely in the study of upper airway structures via MRI. FOMs were built hierarchically, the smaller sub-objects forming the offspring of larger parent objects. FOMs encode the uncertainty and variability present in the form and relationships among the objects over a study population. In total, 11 basic objects (17 including composite) were modeled. Automatic recognition of the best pose of FOMs in a given image was implemented using four methods: a one-shot method that does not require search, and three search-based methods, namely Fisher Linear Discriminant (FLD), a b-scale energy optimization strategy, and an optimum threshold recognition method. In all, 30 multi-fold cross validation experiments based on 15 patient MRI data sets were carried out to assess the accuracy of recognition. The results indicate that the objects can be recognized with an average location error of less than 5 mm or 2-3 voxels. Then the iterative relative fuzzy connectedness (IRFC) algorithm was adopted for delineation of the target organs based on the recognized results. The delineation results showed an overall FP and TP volume fraction of 0.02 and 0.93.
Extraction of edge-based and region-based features for object recognition

NASA Astrophysics Data System (ADS)

Coutts, Benjamin; Ravi, Srinivas; Hu, Gongzhu; Shrikhande, Neelima

1993-08-01

One of the central problems of computer vision is object recognition. A catalogue of model objects is described as a set of features such as edges and surfaces. The same features are extracted from the scene and matched against the models for object recognition. Edges and surfaces extracted from the scenes are often noisy and imperfect. In this paper algorithms are described for improving low level edge and surface features. Existing edge extraction algorithms are applied to the intensity image to obtain edge features. Initial edges are traced by following directions of the current contour. These are improved by using corresponding depth and intensity information for decision making at branch points. Surface fitting routines are applied to the range image to obtain planar surface patches. An algorithm of region growing is developed that starts with a coarse segmentation and uses quadric surface fitting to iteratively merge adjacent regions into quadric surfaces based on approximate orthogonal distance regression. Surface information obtained is returned to the edge extraction routine to detect and remove fake edges. This process repeats until no more merging or edge improvement can take place. Both synthetic (with Gaussian noise) and real images containing multiple object scenes have been tested using the merging criteria. Results appeared quite encouraging.
Biometric identification

NASA Astrophysics Data System (ADS)

Syryamkim, V. I.; Kuznetsov, D. N.; Kuznetsova, A. S.

2018-05-01

Image recognition is an information process implemented by some information converter (intelligent information channel, recognition system) having an input and an output. The input of the system is fed with information about the characteristics of the objects being presented. The output of the system displays information about which classes (generalized images) the recognized objects are assigned to. When creating and operating an automated system for pattern recognition, a number of problems are solved. Different authors formulate these tasks differently, and the set of tasks itself varies, since it depends to a certain extent on the specific mathematical model on which a given recognition system is based. The tasks include formalizing the domain, forming a training sample, training the recognition system, and reducing the dimensionality of the space.
Biologically Inspired Model for Visual Cognition Achieving Unsupervised Episodic and Semantic Feature Learning.

PubMed

Qiao, Hong; Li, Yinlin; Li, Fengfu; Xi, Xuanyang; Wu, Wei

2016-10-01

Recently, many biologically inspired visual computational models have been proposed. The design of these models follows the related biological mechanisms and structures, and these models provide new solutions for visual recognition tasks. In this paper, based on recent biological evidence, we propose a framework to mimic the active and dynamic learning and recognition process of the primate visual cortex. From the point of view of principle, the main contributions are that the framework can achieve unsupervised learning of episodic features (including key components and their spatial relations) and semantic features (semantic descriptions of the key components), which support higher level cognition of an object. From the point of view of performance, the advantages of the framework are as follows: 1) learning episodic features without supervision: for a class of objects without a priori knowledge, the key components, their spatial relations and cover regions can be learned automatically through a deep neural network (DNN); 2) learning semantic features based on episodic features: within the cover regions of the key components, the semantic geometrical values of these components can be computed based on contour detection; 3) forming the general knowledge of a class of objects: the general knowledge of a class of objects can be formed, mainly including the key components, their spatial relations and average semantic values, which is a concise description of the class; and 4) achieving higher level cognition and dynamic updating: for a test image, the model can achieve classification and subclass semantic descriptions. Test samples with high confidence are then selected to dynamically update the whole model. Experiments are conducted on face images, and a good performance is achieved in each layer of the DNN and the semantic description learning process. Furthermore, the model can be generalized to recognition tasks of other objects with learning ability.
Products recognition on shop-racks from local scale-invariant features

NASA Astrophysics Data System (ADS)

Zawistowski, Jacek; Kurzejamski, Grzegorz; Garbat, Piotr; Naruniec, Jacek

2016-04-01

This paper presents a system designed for multi-object detection and adjusted for the application of product search on market shelves. The system uses well-known binary keypoint detection algorithms for finding characteristic points in the image. One of the main ideas is object recognition based on the Implicit Shape Model method. The authors of the article proposed many improvements to the algorithm. Originally, fiducial points are matched with a very simple function. This limits the number of object parts that can be successfully separated, while various methods of classification may be validated in order to achieve higher performance. Such an extension implies research on a training procedure able to deal with many object categories. The proposed solution opens new possibilities for many algorithms demanding fast and robust multi-object recognition.

How does the brain solve visual object recognition?

PubMed Central

Zoccolan, Davide; Rust, Nicole C.

2012-01-01

Mounting evidence suggests that “core object recognition,” the ability to rapidly recognize objects despite substantial appearance variation, is solved in the brain via a cascade of reflexive, largely feedforward computations that culminate in a powerful neuronal representation in the inferior temporal cortex. However, the algorithm that produces this solution remains little-understood. Here we review evidence ranging from individual neurons, to neuronal populations, to behavior, to computational models. We propose that understanding this algorithm will require using neuronal and psychophysical data to sift through many computational models, each based on building blocks of small, canonical sub-networks with a common functional goal. PMID:22325196
Environmental modeling and recognition for an autonomous land vehicle

NASA Technical Reports Server (NTRS)

Lawton, D. T.; Levitt, T. S.; Mcconnell, C. C.; Nelson, P. C.

1987-01-01

An architecture for object modeling and recognition for an autonomous land vehicle is presented. Examples of objects of interest include terrain features, fields, roads, horizon features, trees, etc. The architecture is organized around a set of data bases for generic object models and perceptual structures, temporary memory for the instantiation of object and relational hypotheses, and a long term memory for storing stable hypotheses that are affixed to the terrain representation. Multiple inference processes operate over these databases. Researchers describe these particular components: the perceptual structure database, the grouping processes that operate over this, schemas, and the long term terrain database. A processing example that matches predictions from the long term terrain model to imagery, extracts significant perceptual structures for consideration as potential landmarks, and extracts a relational structure to update the long term terrain database is given.
How does aging affect recognition-based inference? A hierarchical Bayesian modeling approach.

PubMed

Horn, Sebastian S; Pachur, Thorsten; Mata, Rui

2015-01-01

The recognition heuristic (RH) is a simple strategy for probabilistic inference according to which recognized objects are judged to score higher on a criterion than unrecognized objects. In this article, a hierarchical Bayesian extension of the multinomial r-model is applied to measure use of the RH on the individual participant level and to re-evaluate differences between younger and older adults' strategy reliance across environments. Further, it is explored how individual r-model parameters relate to alternative measures of the use of recognition and other knowledge, such as adherence rates and indices from signal-detection theory (SDT). Both younger and older adults used the RH substantially more often in an environment with high than low recognition validity, reflecting adaptivity in strategy use across environments. In extension of previous analyses (based on adherence rates), hierarchical modeling revealed that in an environment with low recognition validity, (a) older adults had a stronger tendency than younger adults to rely on the RH and (b) variability in RH use between individuals was larger than in an environment with high recognition validity; variability did not differ between age groups. Further, the r-model parameters correlated moderately with an SDT measure expressing how well people can discriminate cases where the RH leads to a correct vs. incorrect inference; this suggests that the r-model and the SDT measures may offer complementary insights into the use of recognition in decision making. In conclusion, younger and older adults are largely adaptive in their application of the RH, but cognitive aging may be associated with an increased tendency to rely on this strategy. Copyright © 2014 Elsevier B.V. All rights reserved.
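The adherence rate mentioned above can be illustrated with a minimal sketch. It assumes the usual definition: among paired comparisons where exactly one object is recognized, the proportion of choices that favor the recognized object. The function name `rh_adherence_rate` and the trial encoding are hypothetical and not taken from the article; the hierarchical Bayesian r-model itself is not reproduced here.

```python
def rh_adherence_rate(trials):
    """Proportion of RH-consistent choices.

    trials: iterable of (recognized_a, recognized_b, chose_a) booleans.
    Only trials where exactly one object is recognized are applicable.
    (Hypothetical encoding, chosen for illustration.)
    """
    applicable = [(ra, rb, ca) for ra, rb, ca in trials if ra != rb]
    if not applicable:
        return None  # the RH cannot be applied to any trial
    # The choice is RH-consistent when the chosen object is the recognized one.
    consistent = sum(1 for ra, rb, ca in applicable if ca == ra)
    return consistent / len(applicable)


# Made-up example data: three applicable trials, two of them RH-consistent.
example_trials = [
    (True, False, True),    # recognized A, chose A: consistent
    (True, False, False),   # recognized A, chose B: inconsistent
    (False, True, False),   # recognized B, chose B: consistent
    (True, True, True),     # both recognized: not applicable
    (False, False, False),  # neither recognized: not applicable
]
print(rh_adherence_rate(example_trials))
```

Adherence rates count consistent choices only, so they cannot separate genuine use of the heuristic from choices based on other knowledge that happen to agree with it.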
A neurophysiologically plausible population code model for feature integration explains visual crowding.

PubMed

van den Berg, Ronald; Roerdink, Jos B T M; Cornelissen, Frans W

2010-01-22

An object in the peripheral visual field is more difficult to recognize when surrounded by other objects. This phenomenon is called "crowding". Crowding places a fundamental constraint on human vision that limits performance on numerous tasks. It has been suggested that crowding results from spatial feature integration necessary for object recognition. However, in the absence of convincing models, this theory has remained controversial. Here, we present a quantitative and physiologically plausible model for spatial integration of orientation signals, based on the principles of population coding. Using simulations, we demonstrate that this model coherently accounts for fundamental properties of crowding, including critical spacing, "compulsory averaging", and a foveal-peripheral anisotropy. Moreover, we show that the model predicts increased responses to correlated visual stimuli. Altogether, these results suggest that crowding has little immediate bearing on object recognition but is a by-product of a general, elementary integration mechanism in early vision aimed at improving signal quality.
A Two-Step Classification Approach to Distinguishing Similar Objects in Mobile LIDAR Point Clouds

NASA Astrophysics Data System (ADS)

He, H.; Khoshelham, K.; Fraser, C.

2017-09-01

Nowadays, lidar is widely used in cultural heritage documentation, urban modeling, and driverless car technology for its fast and accurate 3D scanning ability. However, full exploitation of the potential of point cloud data for efficient and automatic object recognition remains elusive. Recently, feature-based methods have become very popular in object recognition on account of their good performance in capturing object details. Compared with global features describing the whole shape of the object, local features recording the fractional details are more discriminative and are applicable for object classes with considerable similarity. In this paper, we propose a two-step classification approach based on point feature histograms and the bag-of-features method for automatic recognition of similar objects in mobile lidar point clouds. Lamp posts, street lights and traffic signs are grouped into one category in the first-step classification because of their mutual similarity compared with trees and vehicles. A finer classification of the lamp posts, street lights and traffic signs, based on the result of the first-step classification, is implemented in the second step. The proposed two-step classification approach is shown to yield a considerable improvement over the conventional one-step classification approach.
Surface versus Edge-Based Determinants of Visual Recognition.

ERIC Educational Resources Information Center

Biederman, Irving; Ju, Ginny

1988-01-01

The latency at which objects could be identified by 126 subjects was compared through line drawings (edge-based) or color photography (surface depiction). The line drawing was identified about as quickly as the photograph; primal access to a mental representation of an object can be modeled from an edge-based description. (SLD)
Generalization between canonical and non-canonical views in object recognition

PubMed Central

Ghose, Tandra; Liu, Zili

2013-01-01

Viewpoint generalization in object recognition is the process that allows recognition of a given 3D object from many different viewpoints despite variations in its 2D projections. We used the canonical view effects as a foundation to empirically test the validity of a major theory in object recognition, the view-approximation model (Poggio & Edelman, 1990). This model predicts that generalization should be better when an object is first seen from a non-canonical view and then a canonical view than when seen in the reversed order. We also manipulated object similarity to study the degree to which this view generalization was constrained by shape details and task instructions (object vs. image recognition). Old-new recognition performance for basic and subordinate level objects was measured in separate blocks. We found that for object recognition, view generalization between canonical and non-canonical views was comparable for basic level objects. For subordinate level objects, recognition performance was more accurate from non-canonical to canonical views than the other way around. When the task was changed from object recognition to image recognition, the pattern of the results reversed. Interestingly, participants responded “old” to “new” images of “old” objects with a substantially higher rate than to “new” objects, despite instructions to the contrary, thereby indicating involuntary view generalization. Our empirical findings are incompatible with the prediction of the view-approximation theory, and argue against the hypothesis that views are stored independently. PMID:23283692
Intrinsic Bayesian Active Contours for Extraction of Object Boundaries in Images

PubMed Central

Srivastava, Anuj

2010-01-01

We present a framework for incorporating prior information about high-probability shapes in the process of contour extraction and object recognition in images. Here one studies shapes as elements of an infinite-dimensional, non-linear quotient space, and statistics of shapes are defined and computed intrinsically using differential geometry of this shape space. Prior models on shapes are constructed using probability distributions on tangent bundles of shape spaces. Similar to the past work on active contours, where curves are driven by vector fields based on image gradients and roughness penalties, we incorporate the prior shape knowledge in the form of vector fields on curves. Through experimental results, we demonstrate the use of prior shape models in the estimation of object boundaries, and their success in handling partial obscuration and missing data. Furthermore, we describe the use of this framework in shape-based object recognition or classification. PMID:21076692
Recognition-induced forgetting is not due to category-based set size.

PubMed

Maxcey, Ashleigh M

2016-01-01

What are the consequences of accessing a visual long-term memory representation? Previous work has shown that accessing a long-term memory representation via retrieval improves memory for the targeted item and hurts memory for related items, a phenomenon called retrieval-induced forgetting. Recently we found a similar forgetting phenomenon with recognition of visual objects. Recognition-induced forgetting occurs when practice recognizing an object during a two-alternative forced-choice task, from a group of objects learned at the same time, leads to worse memory for objects from that group that were not practiced. An alternative explanation of this effect is that category-based set size is inducing forgetting, not recognition practice as claimed by some researchers. This alternative explanation is possible because during recognition practice subjects make old-new judgments in a two-alternative forced-choice task, and are thus exposed to more objects from practiced categories, potentially inducing forgetting due to set-size. Herein I pitted the category-based set size hypothesis against the recognition-induced forgetting hypothesis. To this end, I parametrically manipulated the amount of practice objects received in the recognition-induced forgetting paradigm. If forgetting is due to category-based set size, then the magnitude of forgetting of related objects will increase as the number of practice trials increases. If forgetting is recognition induced, the set size of exemplars from any given category should not be predictive of memory for practiced objects. Consistent with this latter hypothesis, additional practice systematically improved memory for practiced objects, but did not systematically affect forgetting of related objects. These results firmly establish that recognition practice induces forgetting of related memories. Future directions and important real-world applications of using recognition to access our visual memories of previously encountered objects are discussed.
Search algorithm complexity modeling with application to image alignment and matching

NASA Astrophysics Data System (ADS)

DelMarco, Stephen

2014-05-01

Search algorithm complexity modeling, in the form of penetration rate estimation, provides a useful way to estimate search efficiency in application domains which involve searching over a hypothesis space of reference templates or models, as in model-based object recognition, automatic target recognition, and biometric recognition. The penetration rate quantifies the expected portion of the database that must be searched, and is useful for estimating search algorithm computational requirements. In this paper we perform mathematical modeling to derive general equations for penetration rate estimates that are applicable to a wide range of recognition problems. We extend previous penetration rate analyses to use more general probabilistic modeling assumptions. In particular we provide penetration rate equations within the framework of a model-based image alignment application domain in which a prioritized hierarchical grid search is used to rank subspace bins based on matching probability. We derive general equations, and provide special cases based on simplifying assumptions. We show how previously-derived penetration rate equations are special cases of the general formulation. We apply the analysis to model-based logo image alignment in which a hierarchical grid search is used over a geometric misalignment transform hypothesis space. We present numerical results validating the modeling assumptions and derived formulation.
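To make the notion of penetration rate concrete, here is a minimal sketch of the expected fraction of a database searched under a prioritized bin search. It is not the paper's derivation: it assumes that each bin has a known probability of containing the match, that these probabilities sum to 1, that bins are visited in order of decreasing probability, and that a bin is searched in full before the match is declared. The function name `penetration_rate` is hypothetical.

```python
def penetration_rate(bins):
    """Expected fraction of the database searched before the match is found.

    bins: list of (match_probability, bin_size) pairs; the probabilities
    are assumed to sum to 1 (illustrative assumption, see lead-in).
    """
    total = sum(size for _, size in bins)
    # Visit the most probable bins first.
    ordered = sorted(bins, key=lambda b: b[0], reverse=True)
    searched = 0
    expected = 0.0
    for prob, size in ordered:
        searched += size
        # With probability `prob` the search stops after this bin.
        expected += prob * searched / total
    return expected


# A well-prioritized search touches a small part of the database on average.
print(penetration_rate([(0.7, 10), (0.2, 30), (0.1, 60)]))
# With uninformative probabilities the rate approaches one half.
print(penetration_rate([(0.25, 5)] * 4))
```

Under these assumptions the first call evaluates 0.7(0.1) + 0.2(0.4) + 0.1(1.0) = 0.25, while four equally likely, equally sized bins give 0.625.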
Comparing visual representations across human fMRI and computational vision

PubMed Central

Leeds, Daniel D.; Seibert, Darren A.; Pyles, John A.; Tarr, Michael J.

2013-01-01

Feedforward visual object perception recruits a cortical network that is assumed to be hierarchical, progressing from basic visual features to complete object representations. However, the nature of the intermediate features related to this transformation remains poorly understood. Here, we explore how well different computer vision recognition models account for neural object encoding across the human cortical visual pathway as measured using fMRI. These neural data, collected during the viewing of 60 images of real-world objects, were analyzed with a searchlight procedure as in Kriegeskorte, Goebel, and Bandettini (2006): Within each searchlight sphere, the obtained patterns of neural activity for all 60 objects were compared to model responses for each computer recognition algorithm using representational dissimilarity analysis (Kriegeskorte et al., 2008). Although each of the computer vision methods significantly accounted for some of the neural data, among the different models, the scale invariant feature transform (Lowe, 2004), encoding local visual properties gathered from “interest points,” was best able to accurately and consistently account for stimulus representations within the ventral pathway. More generally, when present, significance was observed in regions of the ventral-temporal cortex associated with intermediate-level object perception. Differences in model effectiveness and the neural location of significant matches may be attributable to the fact that each model implements a different featural basis for representing objects (e.g., more holistic or more parts-based). Overall, we conclude that well-known computer vision recognition systems may serve as viable proxies for theories of intermediate visual object representation. PMID:24273227
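The core of a representational dissimilarity analysis can be sketched in a few lines. The sketch assumes correlation distance (1 minus Pearson r) between response patterns and a Pearson correlation between the two resulting dissimilarity structures; published analyses often use rank correlation instead, and the searchlight procedure is not reproduced. The names `pearson`, `rdm` and `rdm_similarity`, and the toy patterns, are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def rdm(patterns):
    """Upper triangle of the dissimilarity matrix: 1 - r per stimulus pair."""
    pairs = combinations(range(len(patterns)), 2)
    return [1.0 - pearson(patterns[i], patterns[j]) for i, j in pairs]


def rdm_similarity(patterns_a, patterns_b):
    """Compare two representations through their dissimilarity structure."""
    return pearson(rdm(patterns_a), rdm(patterns_b))


# Toy data: one response pattern per stimulus (made-up numbers).
neural = [[1, 2, 3, 4], [2, 4, 6, 9], [4, 3, 2, 1], [1, 3, 2, 4]]
# A "model" whose responses are an affine rescaling of the neural ones
# has the same dissimilarity structure.
model = [[2 * v + 1 for v in pattern] for pattern in neural]
print(rdm_similarity(neural, model))
```

Because only the pairwise dissimilarities are compared, the two representations need not share units or dimensionality, which is what allows voxel patterns to be compared with computer vision features.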
Complex scenes and situations visualization in hierarchical learning algorithm with dynamic 3D NeoAxis engine

NASA Astrophysics Data System (ADS)

Graham, James; Ternovskiy, Igor V.

2013-06-01

We applied a two stage unsupervised hierarchical learning system to model complex dynamic surveillance and cyber space monitoring systems using a non-commercial version of the NeoAxis visualization software. The hierarchical scene learning and recognition approach is based on hierarchical expectation maximization, and was linked to a 3D graphics engine for validation of learning and classification results and understanding the human - autonomous system relationship. Scene recognition is performed by taking synthetically generated data and feeding it to a dynamic logic algorithm. The algorithm performs hierarchical recognition of the scene by first examining the features of the objects to determine which objects are present, and then determines the scene based on the objects present. This paper presents a framework within which low level data linked to higher-level visualization can provide support to a human operator and be evaluated in a detailed and systematic way.
Modeling recall memory for emotional objects in Alzheimer's disease.

PubMed

Sundstrøm, Martin

2011-07-01

To examine whether emotional memory (EM) of objects with self-reference in Alzheimer's disease (AD) can be modeled with binomial logistic regression in a free recall and an object recognition test to predict EM enhancement. Twenty patients with AD and twenty healthy controls were studied. Six objects (three presented as gifts) were shown to each participant. Ten minutes later, a free recall and a recognition test were applied. The recognition test had target-objects mixed with six similar distracter objects. Participants were asked to name any object in the recall test and identify each object in the recognition test as known or unknown. The total of gift objects recalled in AD patients (41.6%) was larger than neutral objects (13.3%) and a significant EM recall effect for gifts was found (Wilcoxon: p < .003). EM was not found for recognition in AD patients due to a ceiling effect. Healthy older adults scored overall higher in recall and recognition but showed no EM enhancement due to a ceiling effect. A logistic regression showed that likelihood of emotional recall memory can be modeled as a function of MMSE score (p < .014) and object status (p < .0001) as gift or non-gift. Recall memory was enhanced in AD patients for emotional objects indicating that EM in mild to moderate AD although impaired can be provoked with strong emotional load. The logistic regression model suggests that EM declines with the progression of AD rather than disrupts and may be a useful tool for evaluating magnitude of emotional load.
Component-based target recognition inspired by human vision

NASA Astrophysics Data System (ADS)

Zheng, Yufeng; Agyepong, Kwabena

2009-05-01

In contrast with machine vision, humans can recognize an object against a complex background with great flexibility. For example, given the task of finding and circling all cars (no further information) in a picture, you may build a virtual image in your mind from the task (or target) description before looking at the picture. Specifically, the virtual car image may be composed of key components such as the driver cabin and wheels. In this paper, we propose a component-based target recognition method by simulating the human recognition process. The component templates (equivalent to the virtual image in mind) of the target (car) are manually decomposed from the target feature image. Meanwhile, the edges of the testing image can be extracted by using a difference of Gaussian (DOG) model that simulates the spatiotemporal response in visual processing. A phase correlation matching algorithm is then applied to match the templates with the testing edge image. If all key component templates are matched with the examined object, then this object is recognized as the target. Besides the recognition accuracy, we also investigate whether this method works with partial targets (half cars). In our experiments, several natural pictures taken on streets were used to test the proposed method. The preliminary results show that the component-based recognition method is very promising.
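Phase correlation matching can be illustrated with a simplified 1-D sketch. The paper matches 2-D component templates against an edge image; the version here works on 1-D signals with a naive DFT so that it needs only the standard library, and it assumes a circular shift. The names `dft` and `phase_correlation_shift` are hypothetical, and the DOG edge extraction step is not reproduced.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for a short demo)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[k] * cmath.exp(sign * 2j * cmath.pi * f * k / n) for k in range(n))
        for f in range(n)
    ]
    return [v / n for v in out] if inverse else out


def phase_correlation_shift(template, signal):
    """Estimate s such that signal[i] == template[(i - s) % n]."""
    spectrum_t, spectrum_s = dft(template), dft(signal)
    cross = []
    for t, s in zip(spectrum_t, spectrum_s):
        c = s * t.conjugate()
        magnitude = abs(c)
        # Keep only the phase; skip frequencies with no energy.
        cross.append(c / magnitude if magnitude > 1e-12 else 0)
    correlation = [v.real for v in dft(cross, inverse=True)]
    # The peak of the inverse transform marks the displacement.
    return max(range(len(correlation)), key=correlation.__getitem__)


template = [0, 0, 1, 3, 1, 0, 0, 0]
shifted = [template[(i - 3) % len(template)] for i in range(len(template))]
print(phase_correlation_shift(template, shifted))  # → 3
```

Normalizing the cross-power spectrum to unit magnitude discards amplitude differences, which is why the method locates a template by its structure rather than by its brightness.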
Exogenous temporal cues enhance recognition memory in an object-based manner.

PubMed

Ohyama, Junji; Watanabe, Katsumi

2010-11-01

Exogenous attention enhances the perception of attended items in both a space-based and an object-based manner. Exogenous attention also improves recognition memory for attended items in the space-based mode. However, it has not been examined whether object-based exogenous attention enhances recognition memory. To address this issue, we examined whether a sudden visual change in a task-irrelevant stimulus (an exogenous cue) would affect participants' recognition memory for items that were serially presented around a cued time. The results showed that recognition accuracy for an item was strongly enhanced when the visual cue occurred at the same location and time as the item (Experiments 1 and 2). The memory enhancement effect occurred when the exogenous visual cue and an item belonged to the same object (Experiments 3 and 4) and even when the cue was counterpredictive of the timing of an item to be asked about (Experiment 5). The present study suggests that an exogenous temporal cue automatically enhances the recognition accuracy for an item that is presented at close temporal proximity to the cue and that recognition memory enhancement occurs in an object-based manner.
On a problematic procedure to manipulate response biases in recognition experiments: the case of "implied" base rates.

PubMed

Bröder, Arndt; Malejka, Simone

2017-07-01

The experimental manipulation of response biases in recognition-memory tests is an important means for testing recognition models and for estimating their parameters. The textbook manipulations for binary-response formats either vary the payoff scheme or the base rate of targets in the recognition test, with the latter being the more frequently applied procedure. However, some published studies reverted to implying different base rates by instruction rather than actually changing them. Aside from unnecessarily deceiving participants, this procedure may lead to cognitive conflicts that prompt response strategies unknown to the experimenter. To test our objection, implied base rates were compared to actual base rates in a recognition experiment followed by a post-experimental interview to assess participants' response strategies. The behavioural data show that recognition-memory performance was estimated to be lower in the implied base-rate condition. The interview data demonstrate that participants used various second-order response strategies that jeopardise the interpretability of the recognition data. We thus advise researchers against substituting actual base rates with implied base rates.
A cortical framework for invariant object categorization and recognition.

PubMed

Rodrigues, João; Hans du Buf, J M

2009-08-01

In this paper we present a new model for invariant object categorization and recognition. It is based on explicit multi-scale features: lines, edges and keypoints are extracted from responses of simple, complex and end-stopped cells in cortical area V1, and keypoints are used to construct saliency maps for Focus-of-Attention. The model is a functional but dichotomous one, because keypoints are employed to model the "where" data stream, with dynamic routing of features from V1 to higher areas to obtain translation, rotation and size invariance, whereas lines and edges are employed in the "what" stream for object categorization and recognition. Furthermore, both the "where" and "what" pathways are dynamic in that information at coarse scales is employed first, after which information at progressively finer scales is added in order to refine the processes, i.e., both the dynamic feature routing and the categorization level. The construction of group and object templates, which are thought to be available in the prefrontal cortex with "what" and "where" components in PF46d and PF46v, is also illustrated. The model was tested in the framework of an integrated and biologically plausible architecture.
HWDA: A coherence recognition and resolution algorithm for hybrid web data aggregation

NASA Astrophysics Data System (ADS)

Guo, Shuhang; Wang, Jian; Wang, Tong

2017-09-01

Aiming at the object conflict recognition and resolution problem in hybrid distributed data stream aggregation, a distributed data stream object coherence solution is proposed. First, a framework for object coherence conflict recognition and resolution, named HWDA, is defined. Second, an object coherence recognition technique is proposed based on formal-language description logic and the hierarchical dependency relationships between logic rules. Third, a conflict traversal recognition algorithm is proposed based on the defined dependency graph. Next, a conflict resolution technique is presented based on resolution pattern matching, including the definition of three types of conflict, the conflict resolution matching pattern, and the arbitration resolution method. Finally, experiments use two kinds of web test data sets to validate the effect of applying the conflict recognition and resolution technology of HWDA.
Dietary effects on object recognition: The impact of high-fat high-sugar diets on recollection and familiarity-based memory.

PubMed

Tran, Dominic M D; Westbrook, R Frederick

2018-05-31

Exposure to a high-fat high-sugar (HFHS) diet rapidly impairs novel-place- but not novel-object-recognition memory in rats (Tran & Westbrook, 2015, 2017). Three experiments sought to investigate the generality of diet-induced cognitive deficits by examining whether there are conditions under which object-recognition memory is impaired. Experiments 1 and 3 tested the strength of short- and long-term object-memory trace, respectively, by varying the interval of time between object familiarization and subsequent novel object test. Experiment 2 tested the effect of increasing working memory load on object-recognition memory by interleaving additional object exposures between familiarization and test in an n-back style task. Experiments 1-3 failed to detect any differences in object recognition between HFHS and control rats. Experiment 4 controlled for object novelty by separately familiarizing both objects presented at test, which included one remote-familiar and one recent-familiar object. Under these conditions, when test objects differed in their relative recency, HFHS rats showed a weaker memory trace for the remote object compared to chow rats. This result suggests that the diet leaves intact recollection judgments, but impairs familiarity judgments. We speculate that the HFHS diet adversely affects "where" memories as well as the quality of "what" memories, and discuss these effects in relation to recollection and familiarity memory models, hippocampal-dependent functions, and episodic food memories. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
A bio-inspired system for spatio-temporal recognition in static and video imagery

NASA Astrophysics Data System (ADS)

Khosla, Deepak; Moore, Christopher K.; Chelian, Suhas

2007-04-01

This paper presents a bio-inspired method for spatio-temporal recognition in static and video imagery. It builds upon and extends our previous work on a bio-inspired Visual Attention and object Recognition System (VARS). The VARS approach locates and recognizes objects in a single frame. This work presents two extensions of VARS. The first extension is a Scene Recognition Engine (SCE) that learns to recognize spatial relationships between objects that compose a particular scene category in static imagery. This could be used for recognizing the category of a scene, e.g., office vs. kitchen scene. The second extension is the Event Recognition Engine (ERE) that recognizes spatio-temporal sequences or events in sequences. This extension uses a working memory model to recognize events and behaviors in video imagery by maintaining and recognizing ordered spatio-temporal sequences. The working memory model is based on an ARTSTORE neural network that combines an ART-based neural network with a cascade of sustained temporal order recurrent (STORE) neural networks. A series of Default ARTMAP classifiers ascribes event labels to these sequences. Our preliminary studies have shown that this extension is robust to variations in an object's motion profile. We evaluated the performance of the SCE and ERE on real datasets. The SCE module was tested on a visual scene classification task using the LabelMe dataset. The ERE was tested on real world video footage of vehicles and pedestrians in a street scene. Our system is able to recognize the events in this footage involving vehicles and pedestrians.

A Neurophysiologically Plausible Population Code Model for Feature Integration Explains Visual Crowding

PubMed Central

van den Berg, Ronald; Roerdink, Jos B. T. M.; Cornelissen, Frans W.

2010-01-01

An object in the peripheral visual field is more difficult to recognize when surrounded by other objects. This phenomenon is called “crowding”. Crowding places a fundamental constraint on human vision that limits performance on numerous tasks. It has been suggested that crowding results from spatial feature integration necessary for object recognition. However, in the absence of convincing models, this theory has remained controversial. Here, we present a quantitative and physiologically plausible model for spatial integration of orientation signals, based on the principles of population coding. Using simulations, we demonstrate that this model coherently accounts for fundamental properties of crowding, including critical spacing, “compulsory averaging”, and a foveal-peripheral anisotropy. Moreover, we show that the model predicts increased responses to correlated visual stimuli. Altogether, these results suggest that crowding has little immediate bearing on object recognition but is a by-product of a general, elementary integration mechanism in early vision aimed at improving signal quality. PMID:20098499
Striatal and Hippocampal Entropy and Recognition Signals in Category Learning: Simultaneous Processes Revealed by Model-Based fMRI

PubMed Central

Davis, Tyler; Love, Bradley C.; Preston, Alison R.

2012-01-01

Category learning is a complex phenomenon that engages multiple cognitive processes, many of which occur simultaneously and unfold dynamically over time. For example, as people encounter objects in the world, they simultaneously engage processes to determine their fit with current knowledge structures, gather new information about the objects, and adjust their representations to support behavior in future encounters. Many techniques that are available to understand the neural basis of category learning assume that the multiple processes that subserve it can be neatly separated between different trials of an experiment. Model-based functional magnetic resonance imaging offers a promising tool to separate multiple, simultaneously occurring processes and bring the analysis of neuroimaging data more in line with category learning’s dynamic and multifaceted nature. We use model-based imaging to explore the neural basis of recognition and entropy signals in the medial temporal lobe and striatum that are engaged while participants learn to categorize novel stimuli. Consistent with theories suggesting a role for the anterior hippocampus and ventral striatum in motivated learning in response to uncertainty, we find that activation in both regions correlates with a model-based measure of entropy. Simultaneously, separate subregions of the hippocampus and striatum exhibit activation correlated with a model-based recognition strength measure. Our results suggest that model-based analyses are exceptionally useful for extracting information about cognitive processes from neuroimaging data. Models provide a basis for identifying the multiple neural processes that contribute to behavior, and neuroimaging data can provide a powerful test bed for constraining and testing model predictions. PMID:22746951
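As a rough illustration of what an entropy signal measures, the sketch below computes the Shannon entropy of a set of category probabilities. This is a generic uncertainty measure and only an assumption about the form of such a signal: the study derives its entropy and recognition strength measures from its own category learning model, which is not reproduced here. The function name `shannon_entropy` is hypothetical.

```python
from math import log2


def shannon_entropy(probs):
    """Entropy in bits of a discrete probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)


# Early in learning a stimulus may be equally likely to belong to either
# category (maximal uncertainty); later the prediction becomes confident.
print(shannon_entropy([0.5, 0.5]))  # → 1.0
print(shannon_entropy([0.9, 0.1]))
```

In a model-based analysis, a trial-by-trial quantity of this kind would be entered as a regressor so that brain activation can be related to the model's uncertainty rather than to the trial type alone.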
Object recognition of real targets using modelled SAR images

NASA Astrophysics Data System (ADS)

Zherdev, D. A.

2017-12-01

In this work, the recognition problem is studied using SAR images. The recognition algorithm is based on computing conjugation indices with class vectors. The support subspaces for each class are constructed by excluding the most and least correlated vectors in a class. We examine whether the feature vector size can be reduced significantly, which decreases recognition time. The target images form feature vectors that are transformed using a pre-trained convolutional neural network (CNN).
Feature extraction for face recognition via Active Shape Model (ASM) and Active Appearance Model (AAM)

NASA Astrophysics Data System (ADS)

Iqtait, M.; Mohamad, F. S.; Mamat, M.

2018-03-01

A biometric system is a pattern recognition system used for automatic recognition of persons based on an individual's characteristics and features. Face recognition with a high recognition rate is still a challenging task and is usually accomplished in three phases: face detection, feature extraction, and expression classification. Precise and robust location of trait points is a complicated and difficult issue in face recognition. Cootes proposed a Multi Resolution Active Shape Models (ASM) algorithm, which could extract a specified shape accurately and efficiently. Furthermore, as an improvement on ASM, the Active Appearance Models (AAM) algorithm was proposed to extract both the shape and texture of a specified object simultaneously. In this paper we give more details about the two algorithms and report the results of experiments testing their performance on one dataset of faces. We found that the ASM is faster and achieves more accurate trait point location than the AAM, but the AAM achieves a better match to the texture.
Automatic anatomy recognition in whole-body PET/CT images

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Huiqian; Udupa, Jayaram K., E-mail: jay@mail.med.upenn.edu; Odhner, Dewey

Purpose: Whole-body positron emission tomography/computed tomography (PET/CT) has become a standard method of imaging patients with various disease conditions, especially cancer. Body-wide accurate quantification of disease burden in PET/CT images is important for characterizing lesions, staging disease, prognosticating patient outcome, planning treatment, and evaluating disease response to therapeutic interventions. However, body-wide anatomy recognition in PET/CT is a critical first step for accurately and automatically quantifying disease body-wide, body-region-wise, and organwise. This latter process, however, has remained a challenge due to the lower quality of the anatomic information portrayed in the CT component of this imaging modality and the paucity of anatomic details in the PET component. In this paper, the authors demonstrate the adaptation of a recently developed automatic anatomy recognition (AAR) methodology [Udupa et al., “Body-wide hierarchical fuzzy modeling, recognition, and delineation of anatomy in medical images,” Med. Image Anal. 18, 752–771 (2014)] to PET/CT images. Their goal was to test what level of object localization accuracy can be achieved on PET/CT compared to that achieved on diagnostic CT images. Methods: The authors advance the AAR approach in this work in three fronts: (i) from body-region-wise treatment in the work of Udupa et al. to whole body; (ii) from the use of image intensity in optimal object recognition in the work of Udupa et al. to intensity plus object-specific texture properties, and (iii) from the intramodality model-building-recognition strategy to the intermodality approach. The whole-body approach allows consideration of relationships among objects in different body regions, which was previously not possible.
Consideration of object texture allows generalizing the previous optimal threshold-based fuzzy model recognition method from intensity images to any derived fuzzy membership image, and in the process, to bring performance to the level achieved on diagnostic CT and MR images in body-region-wise approaches. The intermodality approach fosters the use of already existing fuzzy models, previously created from diagnostic CT images, on PET/CT and other derived images, thus truly separating the modality-independent object assembly anatomy from modality-specific tissue property portrayal in the image. Results: Key ways of combining the above three basic ideas lead them to 15 different strategies for recognizing objects in PET/CT images. Utilizing 50 diagnostic CT image data sets from the thoracic and abdominal body regions and 16 whole-body PET/CT image data sets, the authors compare the recognition performance among these 15 strategies on 18 objects from the thorax, abdomen, and pelvis in object localization error and size estimation error. Particularly on texture membership images, object localization is within three voxels on whole-body low-dose CT images and 2 voxels on body-region-wise low-dose images of known true locations. Surprisingly, even on direct body-region-wise PET images, localization error within 3 voxels seems possible. Conclusions: The previous body-region-wise approach can be extended to whole-body torso with similar object localization performance. Combined use of image texture and intensity property yields the best object localization accuracy. In both body-region-wise and whole-body approaches, recognition performance on low-dose CT images reaches levels previously achieved on diagnostic CT images. The best object recognition strategy varies among objects; the proposed framework however allows employing a strategy that is optimal for each object.
Toward a unified model of face and object recognition in the human visual system

PubMed Central

Wallis, Guy

2013-01-01

Our understanding of the mechanisms and neural substrates underlying visual recognition has made considerable progress over the past 30 years. During this period, accumulating evidence has led many scientists to conclude that objects and faces are recognised in fundamentally distinct ways, and in fundamentally distinct cortical areas. In the psychological literature, in particular, this dissociation has led to a palpable disconnect between theories of how we process and represent the two classes of object. This paper follows a trend in part of the recognition literature to try to reconcile what we know about these two forms of recognition by considering the effects of learning. Taking a widely accepted, self-organizing model of object recognition, this paper explains how such a system is affected by repeated exposure to specific stimulus classes. In so doing, it explains how many aspects of recognition generally regarded as unusual to faces (holistic processing, configural processing, sensitivity to inversion, the other-race effect, the prototype effect, etc.) are emergent properties of category-specific learning within such a system. Overall, the paper describes how a single model of recognition learning can and does produce the seemingly very different types of representation associated with faces and objects. PMID:23966963
Figure-ground organization and object recognition processes: an interactive account.

PubMed

Vecera, S P; O'Reilly, R C

1998-04-01

Traditional bottom-up models of visual processing assume that figure-ground organization precedes object recognition. This assumption seems logically necessary: How can object recognition occur before a region is labeled as figure? However, some behavioral studies find that familiar regions are more likely to be labeled figure than less familiar regions, a problematic finding for bottom-up models. An interactive account is proposed in which figure-ground processes receive top-down input from object representations in a hierarchical system. A graded, interactive computational model is presented that accounts for behavioral results in which familiarity effects are found. The interactive model offers an alternative conception of visual processing to bottom-up models.
Recognizing 3 D Objects from 2D Images Using Structural Knowledge Base of Genetic Views

DTIC Science & Technology

1988-08-31

technical report. [BIE85] I. Biederman, "Human image understanding: Recent research and a theory", Computer Vision, Graphics, and Image Processing, vol...model bases", Technical Report 87-85, COINS Dept, University of Massachusetts, Amherst, MA 01003, August 1987. [BUR87b] Burns, J. B. and L. J. Kitchen..."Recognition in 2D images of 3D objects from large model bases using prediction hierarchies", Proc. IJCAI-10, 1987. [BUR89] J. B. Burns, forthcoming
Model-based occluded object recognition using Petri nets

NASA Astrophysics Data System (ADS)

Zhou, Chuan; Hura, Gurdeep S.

1998-09-01

This paper discusses the use of Petri nets to model the process of matching an object between an image and a model under different 2D geometric transformations. Such matching finds applications in sensor-based robot control, flexible manufacturing systems, and industrial inspection. An object's structure is described by a topological relation called the Point-Line Relation Structure (PLRS). The paper shows how Petri nets can be used to model the matching process, and how an optimal or near-optimal match can be obtained by tracking the reachability graph of the net. The experimental results show that objects can be successfully identified and located under 2D transformations such as translation, rotation, and scale change, and under distortions caused by partial occlusion.
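The idea of tracking a net's reachability graph, mentioned in the abstract above, can be illustrated with a minimal sketch. The toy net, its place and transition names, and the function `reachable_markings` are hypothetical and are not taken from the paper; the code only shows a breadth-first enumeration of the markings of a small place/transition net.

```python
from collections import deque

def enabled(marking, pre):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking[p] >= n for p, n in pre.items())

def fire(marking, pre, post):
    """Consume tokens from input places and produce tokens in output places."""
    new = dict(marking)
    for p, n in pre.items():
        new[p] -= n
    for p, n in post.items():
        new[p] = new.get(p, 0) + n
    return new

def reachable_markings(initial, transitions, limit=1000):
    """Breadth-first enumeration of the reachability graph, bounded by `limit`."""
    seen = {tuple(sorted(initial.items()))}
    edges = []
    queue = deque([initial])
    while queue and len(seen) < limit:
        m = queue.popleft()
        for name, (pre, post) in transitions.items():
            if enabled(m, pre):
                nxt = fire(m, pre, post)
                key = tuple(sorted(nxt.items()))
                edges.append((tuple(sorted(m.items())), name, key))
                if key not in seen:
                    seen.add(key)
                    queue.append(nxt)
    return seen, edges

# Hypothetical toy net: match a model point first, then a model line.
initial = {"unmatched": 1, "point_matched": 0, "line_matched": 0}
transitions = {
    "match_point": ({"unmatched": 1}, {"point_matched": 1}),
    "match_line": ({"point_matched": 1}, {"line_matched": 1}),
}
markings, edges = reachable_markings(initial, transitions)
```

In this toy net each marking stands for a partial match state, so a path through the returned edges corresponds to one matching sequence.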
View-Invariant Object Category Learning, Recognition, and Search: How Spatial and Object Attention are Coordinated Using Surface-Based Attentional Shrouds

ERIC Educational Resources Information Center

Fazl, Arash; Grossberg, Stephen; Mingolla, Ennio

2009-01-01

How does the brain learn to recognize an object from multiple viewpoints while scanning a scene with eye movements? How does the brain avoid the problem of erroneously classifying parts of different objects together? How are attention and eye movements intelligently coordinated to facilitate object learning? A neural model provides a unified…
Definition and automatic anatomy recognition of lymph node zones in the pelvis on CT images

NASA Astrophysics Data System (ADS)

Liu, Yu; Udupa, Jayaram K.; Odhner, Dewey; Tong, Yubing; Guo, Shuxu; Attor, Rosemary; Reinicke, Danica; Torigian, Drew A.

2016-03-01

Currently, unlike IALSC-defined thoracic lymph node zones, no explicitly provided definitions for lymph nodes in other body regions are available. Yet, definitions are critical for standardizing the recognition, delineation, quantification, and reporting of lymphadenopathy in other body regions. Continuing from our previous work in the thorax, this paper proposes a standardized definition of the grouping of pelvic lymph nodes into 10 zones. We subsequently employ our earlier Automatic Anatomy Recognition (AAR) framework designed for body-wide organ modeling, recognition, and delineation to actually implement these zonal definitions where the zones are treated as anatomic objects. First, all 10 zones and key anatomic organs used as anchors are manually delineated under expert supervision for constructing fuzzy anatomy models of the assembly of organs together with the zones. Then, optimal hierarchical arrangement of these objects is constructed for the purpose of achieving the best zonal recognition. For actual localization of the objects, two strategies are used -- optimal thresholded search for organs and one-shot method for the zones where the known relationship of the zones to key organs is exploited. Based on 50 computed tomography (CT) image data sets for the pelvic body region and an equal division into training and test subsets, automatic zonal localization within 1-3 voxels is achieved.
Using an Improved SIFT Algorithm and Fuzzy Closed-Loop Control Strategy for Object Recognition in Cluttered Scenes

PubMed Central

Nie, Haitao; Long, Kehui; Ma, Jun; Yue, Dan; Liu, Jinguo

2015-01-01

Partial occlusions, large pose variations, and extreme ambient illumination conditions generally degrade the performance of object recognition systems. This paper therefore presents a novel approach for fast and robust object recognition in cluttered scenes based on an improved scale invariant feature transform (SIFT) algorithm and a fuzzy closed-loop control method. First, a fast SIFT algorithm is proposed that classifies SIFT features into several clusters based on several attributes computed from the sub-orientation histogram (SOH); in the feature matching phase, only features that share nearly the same corresponding attributes are compared. Second, a feature matching step is performed following a prioritized order based on the scale factor, which is calculated between the object image and the target object image, guaranteeing robust feature matching. Finally, a fuzzy closed-loop control strategy is applied to increase the accuracy of object recognition, which is essential for autonomous object manipulation. Compared to the original SIFT algorithm for object recognition, the proposed method significantly increases the number of SIFT features extracted from an object, and the computing speed of the object recognition process increases by more than 40%. The experimental results confirmed that the proposed method performs effectively and accurately in cluttered scenes. PMID:25714094
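The speed-up described above comes from comparing only features that fall into the same cluster. A minimal sketch of that idea follows; the integer cluster key is a hypothetical stand-in for the SOH-derived attributes, and the descriptors, the threshold `max_dist`, and the function name are assumptions made for illustration rather than details from the paper.

```python
import math
from collections import defaultdict

def match_within_clusters(object_feats, scene_feats, max_dist=0.5):
    """Match each object feature only against scene features with the same cluster key.

    Each feature is a (cluster_key, descriptor) pair; a descriptor is a list of floats.
    Returns (object_index, scene_index) pairs for the nearest in-cluster neighbor.
    """
    buckets = defaultdict(list)
    for idx, (key, desc) in enumerate(scene_feats):
        buckets[key].append((idx, desc))

    matches = []
    for i, (key, desc) in enumerate(object_feats):
        best_j, best_d = None, max_dist
        for j, cand in buckets.get(key, []):
            d = math.dist(desc, cand)  # Euclidean distance between descriptors
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            matches.append((i, best_j))
    return matches

# Toy data: scene feature 0 has a nearby descriptor but a different cluster key,
# so it is never compared against object feature 0.
object_feats = [(0, [0.1, 0.9]), (1, [0.8, 0.2])]
scene_feats = [(1, [0.1, 0.9]), (0, [0.15, 0.85]), (1, [0.75, 0.25])]
matches = match_within_clusters(object_feats, scene_feats)
```

Bucketing trades a small risk of missed matches across cluster boundaries for far fewer distance computations.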
Incrementally learning objects by touch: online discriminative and generative models for tactile-based recognition.

PubMed

Soh, Harold; Demiris, Yiannis

2014-01-01

Human beings not only possess the remarkable ability to distinguish objects through tactile feedback but are further able to improve upon recognition competence through experience. In this work, we explore tactile-based object recognition with learners capable of incremental learning. Using the sparse online infinite Echo-State Gaussian process (OIESGP), we propose and compare two novel discriminative and generative tactile learners that produce probability distributions over objects during object grasping/palpation. To enable iterative improvement, our online methods incorporate training samples as they become available. We also describe incremental unsupervised learning mechanisms, based on novelty scores and extreme value theory, when teacher labels are not available. We present experimental results for both supervised and unsupervised learning tasks using the iCub humanoid, with tactile sensors on its five-fingered anthropomorphic hand, and 10 different object classes. Our classifiers perform comparably to state-of-the-art methods (C4.5 and SVM classifiers) and findings indicate that tactile signals are highly relevant for making accurate object classifications. We also show that accurate "early" classifications are possible using only 20-30 percent of the grasp sequence. For unsupervised learning, our methods generate high quality clusterings relative to the widely-used sequential k-means and self-organising map (SOM), and we present analyses into the differences between the approaches.
EMG-based speech recognition using hidden markov models with global control variables.

PubMed

Lee, Ki-Seung

2008-03-01

It is well known that a strong relationship exists between human voices and the movement of articulatory facial muscles. In this paper, we utilize this knowledge to implement an automatic speech recognition scheme which uses solely surface electromyogram (EMG) signals. The sequence of EMG signals for each word is modelled by a hidden Markov model (HMM) framework. The main objective of the work involves building a model for state observation density when multichannel observation sequences are given. The proposed model reflects the dependencies between each of the EMG signals, which are described by introducing a global control variable. We also develop an efficient model training method, based on a maximum likelihood criterion. In a preliminary study, 60 isolated words were used as recognition variables. EMG signals were acquired from three articulatory facial muscles. The findings indicate that such a system may have the capacity to recognize speech signals with an accuracy of up to 87.07%, which is superior to the independent probabilistic model.
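Word recognition with one HMM per word usually picks the word whose model assigns the observed sequence the highest likelihood. The sketch below shows this with the standard forward algorithm on discrete observation symbols. It is not the paper's model: the global control variable and the multichannel observation densities are omitted, and the two toy word models are hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """Forward algorithm: P(obs | model) for a discrete-observation HMM."""
    n_states = len(start)
    # Initialization with the first observation symbol.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    # Induction over the remaining symbols.
    for o in obs[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][o]
            for s in range(n_states)
        ]
    return sum(alpha)

def recognize(obs, models):
    """Return the word whose HMM gives the observation sequence the highest likelihood."""
    return max(models, key=lambda w: forward_likelihood(obs, *models[w]))

# Hypothetical two-state, two-symbol models: (start, transition, emission).
models = {
    "yes": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]),
    "no": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]]),
}
word = recognize([0, 0, 1], models)
```

For long sequences a practical implementation would work in log space or rescale `alpha` at each step to avoid numerical underflow.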
Target recognition for ladar range image using slice image

NASA Astrophysics Data System (ADS)

Xia, Wenze; Han, Shaokun; Wang, Liang

2015-12-01

A shape descriptor and a complete shape-based recognition system using slice images as geometric feature descriptor for ladar range images are introduced. A slice image is a two-dimensional image generated by three-dimensional Hough transform and the corresponding mathematical transformation. The system consists of two processes, the model library construction and recognition. In the model library construction process, a series of range images are obtained after the model object is sampled at preset attitude angles. Then, all the range images are converted into slice images. The number of slice images is reduced by clustering analysis and finding a representation to reduce the size of the model library. In the recognition process, the slice image of the scene is compared with the slice image in the model library. The recognition results depend on the comparison. Simulated ladar range images are used to analyze the recognition and misjudgment rates, and comparison between the slice image representation method and moment invariants representation method is performed. The experimental results show that whether in conditions without noise or with ladar noise, the system has a high recognition rate and low misjudgment rate. The comparison experiment demonstrates that the slice image has better representation ability than moment invariants.
Exploiting Attribute Correlations: A Novel Trace Lasso-Based Weakly Supervised Dictionary Learning Method.

PubMed

Wu, Lin; Wang, Yang; Pan, Shirui

2017-12-01

It is now well established that sparse representation models work effectively for many visual recognition tasks and have driven the success of dictionary learning therein. Recent studies of dictionary learning focus on learning discriminative atoms instead of purely reconstructive ones. However, the existence of intraclass diversity (i.e., data objects within the same category that exhibit large visual dissimilarities) and interclass similarity (i.e., data objects from distinct classes that share many visual similarities) makes it challenging to learn effective recognition models. A large number of labeled data objects are therefore required to learn models that can effectively characterize these subtle differences. However, access to labeled data objects is always limited, making it difficult to learn a monolithic dictionary that is discriminative enough. To address these limitations, we propose a weakly supervised dictionary learning method that automatically learns a discriminative dictionary by fully exploiting visual attribute correlations rather than label priors. In particular, the intrinsic attribute correlations are deployed as a critical cue to guide the process of object categorization, and a set of subdictionaries is then jointly learned with respect to each category. The resulting dictionary is highly discriminative and leads to intraclass-diversity-aware sparse representations. Extensive experiments on image classification and object recognition show the effectiveness of our approach.
Object Recognition and Localization: The Role of Tactile Sensors

PubMed Central

Aggarwal, Achint; Kirchner, Frank

2014-01-01

Tactile sensors, because of their intrinsic insensitivity to lighting conditions and water turbidity, provide promising opportunities for augmenting the capabilities of vision sensors in applications involving object recognition and localization. This paper presents two approaches for haptic object recognition and localization for ground and underwater environments. The first approach called Batch Ransac and Iterative Closest Point augmented Particle Filter (BRICPPF) is based on an innovative combination of particle filters, Iterative-Closest-Point algorithm, and a feature-based Random Sampling and Consensus (RANSAC) algorithm for database matching. It can handle a large database of 3D-objects of complex shapes and performs a complete six-degree-of-freedom localization of static objects. The algorithms are validated by experimentation in ground and underwater environments using real hardware. To our knowledge this is the first instance of haptic object recognition and localization in underwater environments. The second approach is biologically inspired, and provides a close integration between exploration and recognition. An edge following exploration strategy is developed that receives feedback from the current state of recognition. A recognition by parts approach is developed which uses the BRICPPF for object sub-part recognition. Object exploration is either directed to explore a part until it is successfully recognized, or is directed towards new parts to endorse the current recognition belief. This approach is validated by simulation experiments. PMID:24553087
Demonstration of a 3D vision algorithm for space applications

NASA Technical Reports Server (NTRS)

Defigueiredo, Rui J. P. (Editor)

1987-01-01

This paper reports an extension of the MIAG algorithm for recognition and motion parameter determination of general 3-D polyhedral objects based on model matching techniques and using movement invariants as features of object representation. Results of tests conducted on the algorithm under conditions simulating space conditions are presented.
A Joint Gaussian Process Model for Active Visual Recognition with Expertise Estimation in Crowdsourcing

PubMed Central

Long, Chengjiang; Hua, Gang; Kapoor, Ashish

2015-01-01

We present a noise resilient probabilistic model for active learning of a Gaussian process classifier from crowds, i.e., a set of noisy labelers. It explicitly models both the overall label noise and the expertise level of each individual labeler with two levels of flip models. Expectation propagation is adopted for efficient approximate Bayesian inference of our probabilistic model for classification, based on which, a generalized EM algorithm is derived to estimate both the global label noise and the expertise of each individual labeler. The probabilistic nature of our model immediately allows the adoption of the prediction entropy for active selection of data samples to be labeled, and active selection of high quality labelers based on their estimated expertise to label the data. We apply the proposed model for four visual recognition tasks, i.e., object category recognition, multi-modal activity recognition, gender recognition, and fine-grained classification, on four datasets with real crowd-sourced labels from the Amazon Mechanical Turk. The experiments clearly demonstrate the efficacy of the proposed model. In addition, we extend the proposed model with the Predictive Active Set Selection Method to speed up the active learning system, whose efficacy is verified by conducting experiments on the first three datasets. The results show our extended model can not only preserve a higher accuracy, but also achieve a higher efficiency. PMID:26924892
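Using prediction entropy for active sample selection, as the abstract above describes, amounts to querying the sample whose predicted class distribution is most uncertain. A minimal sketch follows; the probability vectors are made-up values and the function names are hypothetical, and the Gaussian process classifier that would produce such probabilities is not shown.

```python
import math

def prediction_entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_most_uncertain(predictions):
    """Index of the unlabeled sample with the highest prediction entropy."""
    return max(range(len(predictions)),
               key=lambda i: prediction_entropy(predictions[i]))

# Made-up predictive distributions for three unlabeled samples.
predictions = [[0.95, 0.05], [0.55, 0.45], [0.80, 0.20]]
chosen = select_most_uncertain(predictions)
```

Here the second sample is selected because its distribution is closest to uniform.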
Behavior analysis of video object in complicated background

NASA Astrophysics Data System (ADS)

Zhao, Wenting; Wang, Shigang; Liang, Chao; Wu, Wei; Lu, Yang

2016-10-01

This paper aims to achieve robust behavior recognition of video objects against complicated backgrounds. Features of the video object are described and modeled according to the depth information of three-dimensional video. Multi-dimensional eigenvectors are constructed and used to process high-dimensional data. Stable object tracking in complex scenes is achieved with multi-feature-based behavior analysis, yielding the motion trail. Effective behavior recognition of the video object is then obtained according to the decision criteria. Moreover, both the real-time performance of the algorithms and the accuracy of the analysis are greatly improved. The theory and method for behavior analysis of video objects in real scenes proposed in this project have broad application prospects and practical significance in security, counter-terrorism, military, and many other fields.

The research of edge extraction and target recognition based on inherent feature of objects

NASA Astrophysics Data System (ADS)

Xie, Yu-chan; Lin, Yu-chi; Huang, Yin-guo

2008-03-01

Current research on computer vision often needs specific techniques for particular problems. Little use has been made of high-level aspects of computer vision, such as three-dimensional (3D) object recognition, that are appropriate for large classes of problems and situations. In particular, high-level vision often focuses mainly on the extraction of symbolic descriptions and pays little attention to processing speed. To extract and recognize targets intelligently and rapidly, we developed a new 3D target recognition method based on the inherent features of objects, taking a cuboid as the model. Based on an analysis of the cuboid's natural contour and gray-level distribution characteristics, an overall fuzzy evaluation technique is used to recognize and segment the target. The Hough transform is then used to extract and match the model's main edges, and finally the target edges are reconstructed by stereo techniques. This paper makes three major contributions. First, the correspondence between the parameters of the cuboid model's straight edge lines in the image domain and in the transform domain is summarized; this greatly reduces aimless computation and search in Hough transform processing and improves efficiency. Second, because the geometry of the cuboid's contour is known a priori, the intersections of the extracted component edges are computed, and the geometry of candidate edge matches is assessed from these intersections rather than from the extracted edges themselves; as a result, the outlines are enhanced and the noise is suppressed. Finally, a 3D target recognition method is proposed. Compared with other recognition methods, the new method has a quick response time and can be realized with high-level computer vision.
The method presented here can be used widely in vision-guided techniques to strengthen their intelligence and generalization, and it can also play an important role in object tracking, port AGVs, and robotics. Simulation experiments and theoretical analysis demonstrate that the proposed method suppresses noise effectively, extracts target edges robustly, and meets real-time requirements. They also show that the method is reasonable and efficient.
Robust Pedestrian Tracking and Recognition from FLIR Video: A Unified Approach via Sparse Coding

PubMed Central

Li, Xin; Guo, Rui; Chen, Chao

2014-01-01

Sparse coding is an emerging method that has been successfully applied to both robust object tracking and recognition in the vision literature. In this paper, we propose a sparse coding-based approach toward joint object tracking-and-recognition and explore its potential in the analysis of forward-looking infrared (FLIR) video to support nighttime machine vision systems. A key technical contribution of this work is to unify existing sparse coding-based approaches toward tracking and recognition under the same framework, so that they can benefit from each other in a closed loop. On the one hand, tracking the same object through temporal frames allows us to achieve improved recognition performance through dynamic updating of the template/dictionary and combining multiple recognition results; on the other hand, the recognition of individual objects facilitates the tracking of multiple objects (i.e., walking pedestrians), especially in the presence of occlusion within a crowded environment. We report experimental results on both the CASIA Pedestrian Database and our own collected FLIR video database to demonstrate the effectiveness of the proposed joint tracking-and-recognition approach. PMID:24961216
Comparison Analysis of Recognition Algorithms of Forest-Cover Objects on Hyperspectral Air-Borne and Space-Borne Images

NASA Astrophysics Data System (ADS)

Kozoderov, V. V.; Kondranin, T. V.; Dmitriev, E. V.

2017-12-01

The basic model for recognizing natural and anthropogenic objects from their spectral and textural features is described for the problem of processing hyperspectral air-borne and space-borne imagery. The model is based on improvements of the Bayesian classifier, a computational procedure of statistical decision making in machine-learning methods of pattern recognition. The principal component method is implemented to decompose the hyperspectral measurements on the basis of empirical orthogonal functions. Application examples are shown for various modifications of the Bayesian classifier and the Support Vector Machine method. These classifiers are also compared with a metrical classifier that finds the minimal Euclidean distance between different points and sets in the multidimensional feature space. A comparison is also carried out with the "K-weighted neighbors" method, which is close to the nonparametric Bayesian classifier.
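A metrical classifier of the kind mentioned above can be sketched as a minimum-distance rule: a sample is assigned to the class whose training set contains the nearest point in feature space. The two-dimensional features, class labels, and function names below are hypothetical, and the authors' exact distance definition between points and sets may differ.

```python
import math

def distance_to_set(x, samples):
    """Smallest Euclidean distance from point x to any sample in the set."""
    return min(math.dist(x, s) for s in samples)

def classify(x, training):
    """Assign x to the class whose training set lies closest to it."""
    return min(training, key=lambda label: distance_to_set(x, training[label]))

# Made-up two-dimensional feature vectors for two forest-cover classes.
training = {
    "pine": [[0.20, 0.60], [0.25, 0.55]],
    "birch": [[0.70, 0.30], [0.65, 0.35]],
}
label = classify([0.30, 0.50], training)
```

In practice the features would be the principal-component coefficients of a pixel's spectrum rather than raw two-dimensional points.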
Size-Sensitive Perceptual Representations Underlie Visual and Haptic Object Recognition

PubMed Central

Craddock, Matt; Lawson, Rebecca

2009-01-01

A variety of similarities between visual and haptic object recognition suggests that the two modalities may share common representations. However, it is unclear whether such common representations preserve low-level perceptual features or whether transfer between vision and haptics is mediated by high-level, abstract representations. Two experiments used a sequential shape-matching task to examine the effects of size changes on unimodal and crossmodal visual and haptic object recognition. Participants felt or saw 3D plastic models of familiar objects. The two objects presented on a trial were either the same size or different sizes and were the same shape or different but similar shapes. Participants were told to ignore size changes and to match on shape alone. In Experiment 1, size changes on same-shape trials impaired performance similarly for both visual-to-visual and haptic-to-haptic shape matching. In Experiment 2, size changes impaired performance on both visual-to-haptic and haptic-to-visual shape matching and there was no interaction between the cost of size changes and direction of transfer. Together the unimodal and crossmodal matching results suggest that the same, size-specific perceptual representations underlie both visual and haptic object recognition, and indicate that crossmodal memory for objects must be at least partly based on common perceptual representations. PMID:19956685
Infant Visual Attention and Object Recognition

PubMed Central

Reynolds, Greg D.

2015-01-01

This paper explores the role visual attention plays in the recognition of objects in infancy. Research and theory on the development of infant attention and recognition memory are reviewed in three major sections. The first section reviews some of the major findings and theory emerging from a rich tradition of behavioral research utilizing preferential looking tasks to examine visual attention and recognition memory in infancy. The second section examines research utilizing neural measures of attention and object recognition in infancy as well as research on brain-behavior relations in the early development of attention and recognition memory. The third section addresses potential areas of the brain involved in infant object recognition and visual attention. An integrated synthesis of some of the existing models of the development of visual attention is presented which may account for the observed changes in behavioral and neural measures of visual attention and object recognition that occur across infancy. PMID:25596333
Northeast Artificial Intelligence Consortium Annual Report. Volume 7. 1988 Research in Automated Photointerpretation

DTIC Science & Technology

1989-10-01

weight based on how powerful the corresponding feature is for object recognition and discrimination. For example, consider an arbitrary weight, denoted...quality of the segmentation, how powerful the features and spatial constraints in the knowledge base are (as far as object recognition is concern...that are powerful for object recognition and discrimination. At this point, this selection is performed heuristically through trial-and-error. As a
People's Risk Recognition Preceding Evacuation and Its Role in Demand Modeling and Planning.

PubMed

Urata, Junji; Pel, Adam J

2018-05-01

Evacuation planning and management involves estimating the travel demand in the event that such action is required. This is usually done as a function of people's decision to evacuate, which we show is strongly linked to their risk awareness. We use an empirical data set, which shows tsunami evacuation behavior, to demonstrate that risk recognition is not synonymous with objective risk, but is instead determined by a combination of factors including risk education, information, and sociodemographics, and that it changes dynamically over time. Based on these findings, we formulate an ordered logit model to describe risk recognition combined with a latent class model to describe evacuation choices. Our proposed evacuation choice model along with a risk recognition class can evaluate quantitatively the influence of disaster mitigation measures, risk education, and risk information. The results obtained from the risk recognition model show that risk information has a greater impact in the sense that people recognize their high risk. The results of the evacuation choice model show that people who are unaware of their risk take a longer time to evacuate. © 2017 Society for Risk Analysis.
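An ordered logit model, as used above for risk recognition, maps a linear predictor to probabilities over ordered categories through a set of increasing cutpoints. The sketch below uses the common parameterization P(y <= k) = 1 / (1 + exp(-(c_k - x·beta))); the covariates, coefficients, and cutpoints are invented for illustration and are not estimates from the paper.

```python
import math

def ordered_logit_probs(x, beta, cutpoints):
    """Category probabilities for an ordered logit model.

    `cutpoints` must be increasing; len(cutpoints) + 1 categories are returned,
    ordered from lowest to highest.
    """
    eta = sum(b * v for b, v in zip(beta, x))  # linear predictor
    # Cumulative probabilities P(y <= k), closed with 1.0 for the top category.
    cdf = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints] + [1.0]
    probs, prev = [], 0.0
    for c in cdf:
        probs.append(c - prev)
        prev = c
    return probs

# Hypothetical covariates (e.g. received a warning, attended risk education).
beta = [0.8, 0.5]
cutpoints = [-0.5, 1.0]  # separates low / medium / high risk recognition
probs = ordered_logit_probs([1.0, 0.0], beta, cutpoints)
probs_more_info = ordered_logit_probs([3.0, 0.0], beta, cutpoints)
```

Raising a covariate with a positive coefficient shifts probability mass toward the highest category, which is how such a model can express the effect of risk information.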
Target recognition and scene interpretation in image/video understanding systems based on network-symbolic models

NASA Astrophysics Data System (ADS)

Kuvich, Gary

2004-08-01

Vision is only a part of a system that converts visual information into knowledge structures. These structures drive the vision process, resolving ambiguity and uncertainty via feedback, and provide image understanding, which is an interpretation of visual information in terms of these knowledge models. These mechanisms provide reliable recognition when the object is occluded or cannot be recognized as a whole. It is hard to split the entire system apart, and reliable solutions to target recognition problems are possible only within the solution of a more generic Image Understanding Problem. The brain reduces informational and computational complexity by using implicit symbolic coding of features, hierarchical compression, and selective processing of visual information. A biologically inspired Network-Symbolic representation, in which both systematic structural/logical methods and neural/statistical methods are parts of a single mechanism, is the most feasible for such models. It converts visual information into relational Network-Symbolic structures, avoiding artificial precise computations of 3-dimensional models. Network-Symbolic Transformations derive abstract structures, which allow invariant recognition of an object as an exemplar of a class. Active vision helps create consistent models. Attention, separation of figure from ground, and perceptual grouping are special kinds of network-symbolic transformations. Such Image/Video Understanding Systems will reliably recognize targets.
Human recognition based on head-shoulder contour extraction and BP neural network

NASA Astrophysics Data System (ADS)

Kong, Xiao-fang; Wang, Xiu-qin; Gu, Guohua; Chen, Qian; Qian, Wei-xian

2014-11-01

In practical application scenarios like video surveillance and human-computer interaction, human body movements are uncertain because the human body is a non-rigid object. Because the head-shoulder part of the human body is less affected by movement and is seldom obscured by other objects, a head-shoulder model with its stable characteristics can be applied as a detection feature to describe the human body in human detection and recognition. In order to extract the head-shoulder contour accurately, this paper proposes a method for establishing the head-shoulder model that combines edge detection with the mean-shift algorithm for image clustering. First, an adaptive mixture-of-Gaussians background update method is used to extract targets from the video sequence. Second, edge detection is used to extract the contour of moving objects, and the mean-shift algorithm is used to cluster parts of the target's contour. Third, the head-shoulder model is established according to the width-to-height ratio of the human head-shoulder region combined with the projection histogram of the binary image, and the eigenvectors of the head-shoulder contour are acquired. Finally, the relationship between the head-shoulder contour eigenvectors and the moving objects is learned by training a back-propagation (BP) neural network classifier, and the human head-shoulder model can be clustered for human detection and recognition. Experiments have shown that the proposed method, which combines edge detection and the mean-shift algorithm, can extract the complete head-shoulder contour with low computational complexity and high efficiency.
OPTICAL INFORMATION PROCESSING: Synthesis of an object recognition system based on the profile of the envelope of a laser pulse in pulsed lidars

NASA Astrophysics Data System (ADS)

Buryi, E. V.

1998-05-01

The main problems in the synthesis of an object recognition system based on the principles of operation of neural networks are considered. The advantages of a hierarchical structure of the recognition algorithm are demonstrated. The use of readings of the amplitude spectrum of signals as information tags is justified, and a method is developed for determining the dimensionality of the tag space. Methods are suggested for ensuring the stability of object recognition in the optical range. It is concluded that it should be possible to recognise perspectives of complex objects.
Virtual Environment for Surgical Room of the Future.

DTIC Science & Technology

1995-10-01

Design; 1. wire frame Dynamic Interaction 2. surface B. Acoustic Three-Dimensional Modeling; 3. solid based on radiosity modeling B. Dynamic...infection control of people and E. Rendering and Shadowing equipment 1. ray tracing D. Fluid Flow 2. radiosity F. Animation OBJECT RECOGNITION COMMUNICATION
Non-accidental properties, metric invariance, and encoding by neurons in a model of ventral stream visual object recognition, VisNet.

PubMed

Rolls, Edmund T; Mills, W Patrick C

2018-05-01

When objects transform into different views, some properties are maintained, such as whether the edges are convex or concave, and these non-accidental properties are likely to be important in view-invariant object recognition. The metric properties, such as the degree of curvature, may change with different views, and are less likely to be useful in object recognition. It is shown that in a model of invariant visual object recognition in the ventral visual stream, VisNet, non-accidental properties are encoded much more than metric properties by neurons. Moreover, it is shown how with the temporal trace rule training in VisNet, non-accidental properties of objects become encoded by neurons, and how metric properties are treated invariantly. We also show how VisNet can generalize between different objects if they have the same non-accidental property, because the metric properties are likely to overlap. VisNet is a 4-layer unsupervised model of visual object recognition trained by competitive learning that utilizes a temporal trace learning rule to implement the learning of invariance using views that occur close together in time. A second crucial property of this model of object recognition is, when neurons in the level corresponding to the inferior temporal visual cortex respond selectively to objects, whether neurons in the intermediate layers can respond to combinations of features that may be parts of two or more objects. In an investigation using the four sides of a square presented in every possible combination, it was shown that even though different layer 4 neurons are tuned to encode each feature or feature combination orthogonally, neurons in the intermediate layers can respond to features or feature combinations present in several objects. This property is an important part of the way in which high capacity can be achieved in the four-layer ventral visual cortical pathway.
These findings concerning non-accidental properties and the use of neurons in intermediate layers of the hierarchy help to emphasise fundamental underlying principles of the computations that may be implemented in the ventral cortical visual stream used in object recognition. Copyright © 2018 Elsevier Inc. All rights reserved.
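A temporal trace learning rule of the kind mentioned in the abstract above can be sketched as follows. This is one common textbook form (a Hebbian update driven by an exponentially decaying trace of the neuron's output); the exact variant, the parameter values `eta` and `alpha`, and the omission of weight normalization and competition are assumptions made for illustration and are not taken from the paper.

```python
def trace_rule_update(weights, inputs, output, prev_trace, eta=0.8, alpha=0.1):
    """Apply one step of a trace-based Hebbian update for a single neuron.

    weights:    current synaptic weights, one per input.
    inputs:     presynaptic firing rates for the current view.
    output:     postsynaptic firing rate for the current view.
    prev_trace: trace of the postsynaptic rate from the previous view.
    eta:        trace decay (assumed value); larger means longer memory.
    alpha:      learning rate (assumed value).
    """
    # The trace mixes the current output with the remembered earlier activity.
    trace = (1.0 - eta) * output + eta * prev_trace
    # Hebbian step: inputs active while the trace is high are strengthened.
    new_weights = [w + alpha * trace * x for w, x in zip(weights, inputs)]
    return new_weights, trace


# Two views of the same object shown close together in time (hypothetical).
w, t = trace_rule_update([0.0, 0.0], [1.0, 0.0], output=1.0, prev_trace=0.0)
w, t = trace_rule_update(w, [0.0, 1.0], output=0.0, prev_trace=t)
```

In the second step the neuron does not yet respond to the new view, but the lingering trace still strengthens the synapse for that view, which is the mechanism by which temporally adjacent views become associated.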
[NDVI difference rate recognition model of deciduous broad-leaved forest based on HJ-CCD remote sensing data].

PubMed

Wang, Yan; Tian, Qing-Jiu; Huang, Yan; Wei, Hong-Wei

2013-04-01

The present paper takes Chuzhou in Anhui Province as the research area and deciduous broad-leaved forest as the research object. A recognition model for deciduous broad-leaved forest was constructed using the NDVI difference rate between the leaf expansion stage and the flowering and fruit-bearing stage, and the model was applied to HJ-CCD remote sensing images acquired on April 1, 2012 and May 4, 2012. Finally, the spatial distribution map of deciduous broad-leaved forest was extracted effectively, and the extraction results were verified and evaluated. The results show the validity of the NDVI difference rate extraction method proposed in this paper and also verify the applicability of HJ-CCD data for vegetation classification and recognition.
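The quantities in the abstract above can be sketched per pixel. The NDVI formula is the standard one; the definition of the "difference rate" as a relative change between two dates is an assumption made here for illustration, since the abstract does not give the formula, and the reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def ndvi_difference_rate(ndvi_early, ndvi_late):
    """Relative NDVI change between two dates.

    Assumed definition (not from the paper): (late - early) / |early|.
    """
    if ndvi_early == 0:
        return 0.0
    return (ndvi_late - ndvi_early) / abs(ndvi_early)


# Hypothetical reflectances for one pixel on the two acquisition dates.
early = ndvi(nir=0.35, red=0.15)   # e.g. the April image
late = ndvi(nir=0.50, red=0.10)    # e.g. the May image
rate = ndvi_difference_rate(early, late)
```

A classifier in this style would threshold `rate` per pixel; the threshold itself would have to come from field data and is not specified in the abstract.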
A new selective developmental deficit: Impaired object recognition with normal face recognition.

PubMed

Germine, Laura; Cashdollar, Nathan; Düzel, Emrah; Duchaine, Bradley

2011-05-01

Studies of developmental deficits in face recognition, or developmental prosopagnosia, have shown that individuals who have not suffered brain damage can show face recognition impairments coupled with normal object recognition (Duchaine and Nakayama, 2005; Duchaine et al., 2006; Nunn et al., 2001). However, no developmental cases with the opposite dissociation - normal face recognition with impaired object recognition - have been reported. The existence of a case of non-face developmental visual agnosia would indicate that the development of normal face recognition mechanisms does not rely on the development of normal object recognition mechanisms. To see whether a developmental variant of non-face visual object agnosia exists, we conducted a series of web-based object and face recognition tests to screen for individuals showing object recognition memory impairments but not face recognition impairments. Through this screening process, we identified AW, an otherwise normal 19-year-old female, who was then tested in the lab on face and object recognition tests. AW's performance was impaired in within-class visual recognition memory across six different visual categories (guns, horses, scenes, tools, doors, and cars). In contrast, she scored normally on seven tests of face recognition, tests of memory for two other object categories (houses and glasses), and tests of recall memory for visual shapes. Testing confirmed that her impairment was not related to a general deficit in lower-level perception, object perception, basic-level recognition, or memory. AW's results provide the first neuropsychological evidence that recognition memory for non-face visual object categories can be selectively impaired in individuals without brain damage or other memory impairment. 
These results indicate that the development of recognition memory for faces does not depend on intact object recognition memory and provide further evidence for category-specific dissociations in visual recognition. Copyright © 2010 Elsevier Srl. All rights reserved.
New technique for real-time distortion-invariant multiobject recognition and classification

NASA Astrophysics Data System (ADS)

Hong, Rutong; Li, Xiaoshun; Hong, En; Wang, Zuyi; Wei, Hongan

2001-04-01

A real-time hybrid distortion-invariant OPR system was established to perform 3D multiobject distortion-invariant automatic pattern recognition. A wavelet transform technique was used for digital preprocessing of the input scene, to suppress the noisy background and enhance the object to be recognized. A three-layer backpropagation artificial neural network was used in correlation signal post-processing to perform multiobject distortion-invariant recognition and classification. The C-80 and NOA real-time processing ability and multithread programming technology were used to perform high-speed parallel multitask processing and to speed up the post-processing rate for ROIs. The reference filter library (RFL) was constructed for the distorted versions of 3D object model images based on measured tolerances for distortion parameters such as rotation, azimuth, and scale. Real-time optical correlation recognition testing of this OPR system demonstrates that, using the preprocessing, the post-processing, the nonlinear algorithm of optimum filtering, the RFL construction technique, and the multithread programming technology, a high probability of recognition and a high recognition rate were obtained for the real-time multiobject distortion-invariant OPR system. The recognition reliability and rate were improved greatly. These techniques are very useful for automatic target recognition.
Use of the recognition heuristic depends on the domain's recognition validity, not on the recognition validity of selected sets of objects.

PubMed

Pohl, Rüdiger F; Michalkiewicz, Martha; Erdfelder, Edgar; Hilbig, Benjamin E

2017-07-01

According to the recognition-heuristic theory, decision makers solve paired comparisons in which one object is recognized and the other not by recognition alone, inferring that recognized objects have higher criterion values than unrecognized ones. However, the success, and thus the usefulness, of this heuristic depends on the validity of recognition as a cue, and adaptive decision making, in turn, requires that decision makers are sensitive to it. To this end, decision makers could base their evaluation of the recognition validity either on the selected set of objects (the set's recognition validity), or on the underlying domain from which the objects were drawn (the domain's recognition validity). In two experiments, we manipulated the recognition validity both in the selected set of objects and between domains from which the sets were drawn. The results clearly show that use of the recognition heuristic depends on the domain's recognition validity, not on the set's recognition validity. In other words, participants treat all sets as roughly representative of the underlying domain and adjust their decision strategy adaptively (only) with respect to the more general environment rather than the specific items they are faced with.
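Recognition validity, as used in the abstract above, is conventionally computed as the share of pairs with exactly one recognized object in which the recognized object has the higher criterion value. A minimal sketch under that definition follows; the object list is hypothetical illustration data, not data from the study.

```python
from itertools import combinations


def recognition_validity(objects):
    """Compute recognition validity for a set of objects.

    objects: list of (criterion_value, recognized) tuples.
    Returns the proportion of discriminating pairs (exactly one object
    recognized, criterion values differ) in which the recognized object
    has the higher criterion value, or None if no such pair exists.
    """
    correct = 0
    total = 0
    for (value_a, rec_a), (value_b, rec_b) in combinations(objects, 2):
        if rec_a == rec_b or value_a == value_b:
            continue  # recognition cannot discriminate in this pair
        total += 1
        recognized, other = (value_a, value_b) if rec_a else (value_b, value_a)
        if recognized > other:
            correct += 1
    return correct / total if total else None


# Hypothetical set: (city population in millions, recognized by participant).
cities = [(3.6, True), (1.8, True), (0.6, False), (0.3, True), (0.2, False)]
validity = recognition_validity(cities)
```

Running the same function on a whole domain and on a selected subset gives the two quantities the experiments contrast: the domain's recognition validity and the set's recognition validity.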
Implicit Shape Models for Object Detection in 3d Point Clouds

NASA Astrophysics Data System (ADS)

Velizhev, A.; Shapovalov, R.; Schindler, K.

2012-07-01

We present a method for automatic object localization and recognition in 3D point clouds representing outdoor urban scenes. The method is based on the implicit shape models (ISM) framework, which recognizes objects by voting for their center locations. It requires only a few training examples per class, which is an important property for practical use. We also introduce and evaluate an improved version of the spin image descriptor, more robust to point density variation and uncertainty in normal direction estimation. Our experiments reveal a significant impact of these modifications on the recognition performance. We compare our results against the state-of-the-art method and get significant improvement in both precision and recall on the Ohio dataset, consisting of combined aerial and terrestrial LiDAR scans of 150,000 m² of urban area in total.
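The center-voting idea behind the ISM framework described above can be sketched in a simplified form. The codeword names, offsets, and grid cell size are hypothetical, and the sketch uses unweighted votes on a coarse grid rather than the weighted votes and mode search a full implementation would typically use.

```python
from collections import Counter


def vote_for_centers(features, codebook_offsets, cell_size=1.0):
    """Accumulate object-center votes from matched local features.

    features:         list of (codeword_id, (x, y, z)) for detected features.
    codebook_offsets: dict mapping codeword_id to a list of (dx, dy, dz)
                      offsets from feature to object center seen in training.
    cell_size:        edge length of the voting grid cells (assumed value).
    Returns a Counter mapping grid cells to their number of votes.
    """
    votes = Counter()
    for codeword, (x, y, z) in features:
        for dx, dy, dz in codebook_offsets.get(codeword, []):
            # Each training offset proposes one possible center location.
            cell = (
                round((x + dx) / cell_size),
                round((y + dy) / cell_size),
                round((z + dz) / cell_size),
            )
            votes[cell] += 1
    return votes


# Hypothetical codebook and scene: two features agree on one center.
offsets = {"wheel": [(1.0, 0.0, 0.0)], "roof": [(0.0, 0.0, -1.0)]}
scene = [("wheel", (0.0, 0.0, 0.0)), ("roof", (1.0, 0.0, 1.0)),
         ("wheel", (5.0, 5.0, 0.0))]
votes = vote_for_centers(scene, offsets)
best_cell, best_count = votes.most_common(1)[0]
```

Cells that collect many votes are object hypotheses; isolated votes, such as the stray "wheel" feature in this example, stay below any reasonable detection threshold.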
A depictive neural model for the representation of motion verbs.

PubMed

Rao, Sunil; Aleksander, Igor

2011-11-01

In this paper, we present a depictive neural model for the representation of motion verb semantics in neural models of visual awareness. The problem of modelling motion verb representation is shown to be one of function application, mapping a set of given input variables defining the moving object and the path of motion to a defined output outcome in the motion recognition context. The particular function-applicative implementation and consequent recognition model design presented are seen as arising from a noun-adjective recognition model enabling the recognition of colour adjectives as applied to a set of shapes representing objects to be recognised. The presence of such a function application scheme and a separately implemented position identification and path labelling scheme are accordingly shown to be the primitives required to enable the design and construction of a composite depictive motion verb recognition scheme. Extensions to the presented design to enable the representation of transitive verbs are also discussed.
Object and event recognition for stroke rehabilitation

NASA Astrophysics Data System (ADS)

Ghali, Ahmed; Cunningham, Andrew S.; Pridmore, Tony P.

2003-06-01

Stroke is a major cause of disability and health care expenditure around the world. Existing stroke rehabilitation methods can be effective but are costly and need to be improved. Even modest improvements in the effectiveness of rehabilitation techniques could produce large benefits in terms of quality of life. The work reported here is part of an ongoing effort to integrate virtual reality and machine vision technologies to produce innovative stroke rehabilitation methods. We describe a combined object recognition and event detection system that provides real time feedback to stroke patients performing everyday kitchen tasks necessary for independent living, e.g. making a cup of coffee. The image plane position of each object, including the patient's hand, is monitored using histogram-based recognition methods. The relative positions of hand and objects are then reported to a task monitor that compares the patient's actions against a model of the target task. A prototype system has been constructed and is currently undergoing technical and clinical evaluation.
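The abstract above does not say which histogram-based method is used. As one illustration of the general idea, the sketch below compares normalized intensity histograms with histogram intersection; the choice of that measure, the bin count, and the pixel values are all assumptions.

```python
def normalized_histogram(values, bins, lo=0, hi=256):
    """Build a normalized histogram of values assumed to lie in [lo, hi)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def histogram_intersection(h1, h2):
    """Similarity of two normalized histograms: 1.0 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical pixel intensities for a stored model and two image regions.
model = normalized_histogram([10, 20, 200, 250], bins=4)
same_object = normalized_histogram([15, 30, 210, 240], bins=4)
other_object = normalized_histogram([70, 80, 90, 100], bins=4)

score_same = histogram_intersection(model, same_object)
score_other = histogram_intersection(model, other_object)
```

Scanning candidate regions and keeping the one with the highest score for each stored model would give an image-plane position per object, which is the kind of output the task monitor in the abstract consumes.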
Recognition-induced forgetting of faces in visual long-term memory.

PubMed

Rugo, Kelsi F; Tamler, Kendall N; Woodman, Geoffrey F; Maxcey, Ashleigh M

2017-10-01

Despite more than a century of evidence that long-term memory for pictures and words are different, much of what we know about memory comes from studies using words. Recent research examining visual long-term memory has demonstrated that recognizing an object induces the forgetting of objects from the same category. This recognition-induced forgetting has been shown with a variety of everyday objects. However, unlike everyday objects, faces are objects of expertise. As a result, faces may be immune to recognition-induced forgetting. However, despite excellent memory for such stimuli, we found that faces were susceptible to recognition-induced forgetting. Our findings have implications for how models of human memory account for recognition-induced forgetting as well as represent objects of expertise and consequences for eyewitness testimony and the justice system.

A knowledge-based object recognition system for applications in the space station

NASA Technical Reports Server (NTRS)

Dhawan, Atam P.

1988-01-01

A knowledge-based three-dimensional (3D) object recognition system is being developed. The system uses primitive-based hierarchical relational and structural matching for the recognition of 3D objects in the two-dimensional (2D) image for interpretation of the 3D scene. At present, the pre-processing, low-level preliminary segmentation, rule-based segmentation, and the feature extraction are completed. The data structure of the primitive viewing knowledge-base (PVKB) is also completed. Algorithms and programs based on attribute-trees matching for decomposing the segmented data into valid primitives were developed. The frame-based structural and relational descriptions of some objects were created and stored in a knowledge-base. This knowledge-base of the frame-based descriptions was developed on the MICROVAX-AI microcomputer in a LISP environment. The simulated 3D scene of simple non-overlapping objects as well as real camera data of images of 3D objects of low-complexity have been successfully interpreted.
Parallel and distributed computation for fault-tolerant object recognition

NASA Technical Reports Server (NTRS)

Wechsler, Harry

1988-01-01

The distributed associative memory (DAM) model is suggested for distributed and fault-tolerant computation as it relates to object recognition tasks. The fault-tolerance is with respect to geometrical distortions (scale and rotation), noisy inputs, occlusion/overlap, and memory faults. An experimental system was developed for fault-tolerant structure recognition which shows the feasibility of such an approach. The approach is further extended to the problem of multisensory data integration and applied successfully to the recognition of colored polyhedral objects.
The role of color information on object recognition: a review and meta-analysis.

PubMed

Bramão, Inês; Reis, Alexandra; Petersson, Karl Magnus; Faísca, Luís

2011-09-01

In this study, we systematically review the scientific literature on the effect of color on object recognition. Thirty-five independent experiments, comprising 1535 participants, were included in a meta-analysis. We found a moderate effect of color on object recognition (d=0.28). Specific effects of moderator variables were analyzed and we found that color diagnosticity is the factor with the greatest moderator effect on the influence of color in object recognition; studies using color diagnostic objects showed a significant color effect (d=0.43), whereas a marginal color effect was found in studies that used non-color diagnostic objects (d=0.18). The present study did not permit the drawing of specific conclusions about the moderator effect of the object recognition task; while the meta-analytic review showed that color information improves object recognition mainly in studies using naming tasks (d=0.36), the literature review revealed a large body of evidence showing positive effects of color information on object recognition in studies using a large variety of visual recognition tasks. We also found that color is important for the ability to recognize artifacts and natural objects, to recognize objects presented as types (line-drawings) or as tokens (photographs), and to recognize objects that are presented without surface details, such as texture or shadow. Taken together, the results of the meta-analysis strongly support the contention that color plays a role in object recognition. This suggests that the role of color should be taken into account in models of visual object recognition. Copyright © 2011 Elsevier B.V. All rights reserved.
Multiview human activity recognition system based on spatiotemporal template for video surveillance system

NASA Astrophysics Data System (ADS)

Kushwaha, Alok Kumar Singh; Srivastava, Rajeev

2015-09-01

An efficient view invariant framework for the recognition of human activities from an input video sequence is presented. The proposed framework is composed of three consecutive modules: (i) detect and locate people by background subtraction, (ii) view invariant spatiotemporal template creation for different activities, (iii) and finally, template matching is performed for view invariant activity recognition. The foreground objects present in a scene are extracted using change detection and background modeling. The view invariant templates are constructed using the motion history images and object shape information for different human activities in a video sequence. For matching the spatiotemporal templates for various activities, the moment invariants and Mahalanobis distance are used. The proposed approach is tested successfully on our own viewpoint dataset, KTH action recognition dataset, i3DPost multiview dataset, MSR viewpoint action dataset, VideoWeb multiview dataset, and WVU multiview human action recognition dataset. From the experimental results and analysis over the chosen datasets, it is observed that the proposed framework is robust, flexible, and efficient with respect to multiple views activity recognition, scale, and phase variations.
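The motion history images mentioned in the abstract above are commonly built with a simple per-pixel update: pixels that move are set to a maximum value and all other pixels decay by one per frame. The sketch below shows that standard update on nested lists; the frame size, the duration `tau`, and the motion masks are hypothetical, and the paper's exact template construction is not given in the abstract.

```python
def update_mhi(mhi, motion_mask, tau):
    """Return the motion history image after one new frame.

    mhi:         2D list of current history values (0 means no recent motion).
    motion_mask: 2D list of 0/1 flags, 1 where motion was detected this frame.
    tau:         value assigned to moving pixels, i.e. the history duration.
    """
    return [
        [tau if moving else max(0, history - 1)
         for history, moving in zip(history_row, mask_row)]
        for history_row, mask_row in zip(mhi, motion_mask)
    ]


# Hypothetical 2x2 sequence: motion moves from the left pixel to the right.
mhi = [[0, 0], [0, 0]]
mhi = update_mhi(mhi, [[1, 0], [0, 0]], tau=3)
mhi = update_mhi(mhi, [[0, 1], [0, 0]], tau=3)
```

The resulting image is brightest where motion was most recent, so its intensity gradient encodes the direction of movement; shape descriptors such as moment invariants can then be computed from it for template matching.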
Dynamic information processing states revealed through neurocognitive models of object semantics

PubMed Central

Clarke, Alex

2015-01-01

Recognising objects relies on highly dynamic, interactive brain networks to process multiple aspects of object information. To fully understand how different forms of information about objects are represented and processed in the brain requires a neurocognitive account of visual object recognition that combines a detailed cognitive model of semantic knowledge with a neurobiological model of visual object processing. Here we ask how specific cognitive factors are instantiated in our mental processes and how they dynamically evolve over time. We suggest that coarse semantic information, based on generic shared semantic knowledge, is rapidly extracted from visual inputs and is sufficient to drive rapid category decisions. Subsequent recurrent neural activity between the anterior temporal lobe and posterior fusiform supports the formation of object-specific semantic representations – a conjunctive process primarily driven by the perirhinal cortex. These object-specific representations require the integration of shared and distinguishing object properties and support the unique recognition of objects. We conclude that a valuable way of understanding the cognitive activity of the brain is through testing the relationship between specific cognitive measures and dynamic neural activity. This kind of approach allows us to move towards uncovering the information processing states of the brain and how they evolve over time. PMID:25745632
Infant visual attention and object recognition.

PubMed

Reynolds, Greg D

2015-05-15

This paper explores the role visual attention plays in the recognition of objects in infancy. Research and theory on the development of infant attention and recognition memory are reviewed in three major sections. The first section reviews some of the major findings and theory emerging from a rich tradition of behavioral research utilizing preferential looking tasks to examine visual attention and recognition memory in infancy. The second section examines research utilizing neural measures of attention and object recognition in infancy as well as research on brain-behavior relations in the early development of attention and recognition memory. The third section addresses potential areas of the brain involved in infant object recognition and visual attention. An integrated synthesis of some of the existing models of the development of visual attention is presented which may account for the observed changes in behavioral and neural measures of visual attention and object recognition that occur across infancy. Copyright © 2015 Elsevier B.V. All rights reserved.
Three-dimensional deformable-model-based localization and recognition of road vehicles.

PubMed

Zhang, Zhaoxiang; Tan, Tieniu; Huang, Kaiqi; Wang, Yunhong

2012-01-01

We address the problem of model-based object recognition. Our aim is to localize and recognize road vehicles from monocular images or videos in calibrated traffic scenes. A 3-D deformable vehicle model with 12 shape parameters is set up as prior information, and its pose is determined by three parameters, which are its position on the ground plane and its orientation about the vertical axis under ground-plane constraints. An efficient local gradient-based method is proposed to evaluate the fitness between the projection of the vehicle model and image data, which is combined into a novel evolutionary computing framework to estimate the 12 shape parameters and three pose parameters by iterative evolution. The recovery of pose parameters achieves vehicle localization, whereas the shape parameters are used for vehicle recognition. Numerous experiments are conducted in this paper to demonstrate the performance of our approach. It is shown that the local gradient-based method can evaluate accurately and efficiently the fitness between the projection of the vehicle model and the image data. The evolutionary computing framework is effective for vehicles of different types and poses and is robust to all kinds of occlusion.
Neural dynamics of object-based multifocal visual spatial attention and priming: Object cueing, useful-field-of-view, and crowding

PubMed Central

Foley, Nicholas C.; Grossberg, Stephen; Mingolla, Ennio

2015-01-01

How are spatial and object attention coordinated to achieve rapid object learning and recognition during eye movement search? How do prefrontal priming and parietal spatial mechanisms interact to determine the reaction time costs of intra-object attention shifts, inter-object attention shifts, and shifts between visible objects and covertly cued locations? What factors underlie individual differences in the timing and frequency of such attentional shifts? How do transient and sustained spatial attentional mechanisms work and interact? How can volition, mediated via the basal ganglia, influence the span of spatial attention? A neural model is developed of how spatial attention in the where cortical stream coordinates view-invariant object category learning in the what cortical stream under free viewing conditions. The model simulates psychological data about the dynamics of covert attention priming and switching requiring multifocal attention without eye movements. The model predicts how “attentional shrouds” are formed when surface representations in cortical area V4 resonate with spatial attention in posterior parietal cortex (PPC) and prefrontal cortex (PFC), while shrouds compete among themselves for dominance. Winning shrouds support invariant object category learning, and active surface-shroud resonances support conscious surface perception and recognition. Attentive competition between multiple objects and cues simulates reaction-time data from the two-object cueing paradigm. The relative strength of sustained surface-driven and fast-transient motion-driven spatial attention controls individual differences in reaction time for invalid cues. Competition between surface-driven attentional shrouds controls individual differences in detection rate of peripheral targets in useful-field-of-view tasks. The model proposes how the strength of competition can be mediated, through learning or momentary changes in volition, by the basal ganglia.
A new explanation of crowding shows how the cortical magnification factor, among other variables, can cause multiple object surfaces to share a single surface-shroud resonance, thereby preventing recognition of the individual objects. PMID:22425615
Neural dynamics of object-based multifocal visual spatial attention and priming: object cueing, useful-field-of-view, and crowding.

PubMed

Foley, Nicholas C; Grossberg, Stephen; Mingolla, Ennio

2012-08-01

How are spatial and object attention coordinated to achieve rapid object learning and recognition during eye movement search? How do prefrontal priming and parietal spatial mechanisms interact to determine the reaction time costs of intra-object attention shifts, inter-object attention shifts, and shifts between visible objects and covertly cued locations? What factors underlie individual differences in the timing and frequency of such attentional shifts? How do transient and sustained spatial attentional mechanisms work and interact? How can volition, mediated via the basal ganglia, influence the span of spatial attention? A neural model is developed of how spatial attention in the where cortical stream coordinates view-invariant object category learning in the what cortical stream under free viewing conditions. The model simulates psychological data about the dynamics of covert attention priming and switching requiring multifocal attention without eye movements. The model predicts how "attentional shrouds" are formed when surface representations in cortical area V4 resonate with spatial attention in posterior parietal cortex (PPC) and prefrontal cortex (PFC), while shrouds compete among themselves for dominance. Winning shrouds support invariant object category learning, and active surface-shroud resonances support conscious surface perception and recognition. Attentive competition between multiple objects and cues simulates reaction-time data from the two-object cueing paradigm. The relative strength of sustained surface-driven and fast-transient motion-driven spatial attention controls individual differences in reaction time for invalid cues. Competition between surface-driven attentional shrouds controls individual differences in detection rate of peripheral targets in useful-field-of-view tasks. The model proposes how the strength of competition can be mediated, through learning or momentary changes in volition, by the basal ganglia.
A new explanation of crowding shows how the cortical magnification factor, among other variables, can cause multiple object surfaces to share a single surface-shroud resonance, thereby preventing recognition of the individual objects. Copyright © 2012 Elsevier Inc. All rights reserved.
Visual agnosia and focal brain injury.

PubMed

Martinaud, O

Visual agnosia encompasses all disorders of visual recognition within a selective visual modality not due to an impairment of elementary visual processing or other cognitive deficit. Based on a sequential dichotomy between the perceptual and memory systems, two different categories of visual object agnosia are usually considered: 'apperceptive agnosia' and 'associative agnosia'. Impaired visual recognition within a single category of stimuli is also reported in: (i) visual object agnosia of the ventral pathway, such as prosopagnosia (for faces), pure alexia (for words), or topographagnosia (for landmarks); (ii) visual spatial agnosia of the dorsal pathway, such as cerebral akinetopsia (for movement), or orientation agnosia (for the placement of objects in space). Focal brain injuries provide a unique opportunity to better understand regional brain function, particularly with the use of effective statistical approaches such as voxel-based lesion-symptom mapping (VLSM). The aim of the present work was twofold: (i) to review the various agnosia categories according to the traditional visual dual-pathway model; and (ii) to better assess the anatomical network underlying visual recognition through lesion-mapping studies correlating neuroanatomical and clinical outcomes. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Object classification for obstacle avoidance

NASA Astrophysics Data System (ADS)

Regensburger, Uwe; Graefe, Volker

1991-03-01

Object recognition is necessary for any mobile robot operating autonomously in the real world. This paper discusses an object classifier based on a 2-D object model. Obstacle candidates are tracked and analyzed; false alarms generated by the object detector are recognized and rejected. The methods have been implemented on a multi-processor system and tested in real-world experiments. They work reliably under favorable conditions, but problems sometimes occur, e.g., when objects contain many features (edges) or move in front of a structured background.
Parts and Relations in Young Children's Shape-Based Object Recognition

ERIC Educational Resources Information Center

Augustine, Elaine; Smith, Linda B.; Jones, Susan S.

2011-01-01

The ability to recognize common objects from sparse information about geometric shape emerges during the same period in which children learn object names and object categories. Hummel and Biederman's (1992) theory of object recognition proposes that the geometric shapes of objects have two components--geometric volumes representing major object…
Online Feature Transformation Learning for Cross-Domain Object Category Recognition.

PubMed

Zhang, Xuesong; Zhuang, Yan; Wang, Wei; Pedrycz, Witold

2017-06-09

In this paper, we introduce a new research problem termed online feature transformation learning in the context of multiclass object category recognition. The learning of a feature transformation is viewed as learning a global similarity metric function in an online manner. We first consider the problem of online learning a feature transformation matrix expressed in the original feature space and propose an online passive aggressive feature transformation algorithm. Then these original features are mapped to kernel space and an online single kernel feature transformation (OSKFT) algorithm is developed to learn a nonlinear feature transformation. Based on the OSKFT and the existing Hedge algorithm, a novel online multiple kernel feature transformation algorithm is also proposed, which can further improve the performance of online feature transformation learning in large-scale application. The classifier is trained with k nearest neighbor algorithm together with the learned similarity metric function. Finally, we experimentally examined the effect of setting different parameter values in the proposed algorithms and evaluate the model performance on several multiclass object recognition data sets. The experimental results demonstrate the validity and good performance of our methods on cross-domain and multiclass object recognition application.
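The final classification step described above, k nearest neighbors applied in a transformed feature space, can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the transformation matrix is supplied by hand rather than learned online, and the names `transform` and `knn_predict` are hypothetical.

```python
from collections import Counter


def transform(matrix, x):
    """Apply a linear feature transformation (a list of rows) to vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def knn_predict(query, samples, labels, matrix, k=3):
    """Classify `query` by majority vote among its k nearest samples.

    Distances are squared Euclidean distances measured after applying
    `matrix`, which stands in for a learned feature transformation.
    """
    q = transform(matrix, query)
    scored = []
    for x, y in zip(samples, labels):
        t = transform(matrix, x)
        scored.append((sum((a - b) ** 2 for a, b in zip(q, t)), y))
    scored.sort(key=lambda pair: pair[0])
    votes = Counter(y for _, y in scored[:k])
    return votes.most_common(1)[0][0]


# Toy data: the second feature is pure nuisance, so the hand-picked
# (assumed, not learned) transformation simply discards it.
samples = [[0.0, 5.0], [0.1, -5.0], [1.0, 5.0], [0.9, -5.0]]
labels = ["a", "a", "b", "b"]
ignore_second = [[1.0, 0.0], [0.0, 0.0]]
print(knn_predict([0.2, 5.0], samples, labels, ignore_second))
```

In the paper's setting the matrix would be updated as each new sample arrives; here it is fixed only to show how a transformation changes which neighbors count as close.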
Striatal and Hippocampal Entropy and Recognition Signals in Category Learning: Simultaneous Processes Revealed by Model-Based fMRI

ERIC Educational Resources Information Center

Davis, Tyler; Love, Bradley C.; Preston, Alison R.

2012-01-01

Category learning is a complex phenomenon that engages multiple cognitive processes, many of which occur simultaneously and unfold dynamically over time. For example, as people encounter objects in the world, they simultaneously engage processes to determine their fit with current knowledge structures, gather new information about the objects, and…
It Takes Two–Skilled Recognition of Objects Engages Lateral Areas in Both Hemispheres

PubMed Central

Bilalić, Merim; Kiesel, Andrea; Pohl, Carsten; Erb, Michael; Grodd, Wolfgang

2011-01-01

Our object recognition abilities, a direct product of our experience with objects, are fine-tuned to perfection. Left temporal and lateral areas along the dorsal, action related stream, as well as left infero-temporal areas along the ventral, object related stream are engaged in object recognition. Here we show that expertise modulates the activity of dorsal areas in the recognition of man-made objects with clearly specified functions. Expert chess players were faster than chess novices in identifying chess objects and their functional relations. Experts' advantage was domain-specific as there were no differences between groups in a control task featuring geometrical shapes. The pattern of eye movements supported the notion that experts' extensive knowledge about domain objects and their functions enabled superior recognition even when experts were not directly fixating the objects of interest. Functional magnetic resonance imaging (fMRI) related exclusively the areas along the dorsal stream to chess specific object recognition. Besides the commonly involved left temporal and parietal lateral brain areas, we found that only in experts homologous areas on the right hemisphere were also engaged in chess specific object recognition. Based on these results, we discuss whether skilled object recognition does not only involve a more efficient version of the processes found in non-skilled recognition, but also qualitatively different cognitive processes which engage additional brain areas. PMID:21283683
Graded effects in hierarchical figure-ground organization: reply to Peterson (1999).

PubMed

Vecera, S P; O'Reilly, R C

2000-06-01

An important issue in vision research concerns the order of visual processing. S. P. Vecera and R. C. O'Reilly (1998) presented an interactive, hierarchical model that placed figure-ground segregation prior to object recognition. M. A. Peterson (1999) critiqued this model, arguing that because it used ambiguous stimulus displays, figure-ground processing did not precede object processing. In the current article, the authors respond to Peterson's (1999) interpretation of ambiguity in the model and her interpretation of what it means for figure-ground processing to come before object recognition. The authors argue that complete stimulus ambiguity is not critical to the model and that figure-ground precedes object recognition architecturally in the model. The arguments are supported with additional simulation results and an experiment, demonstrating that top-down inputs can influence figure-ground organization in displays that contain stimulus cues.
Recognition vs Reverse Engineering in Boolean Concepts Learning

ERIC Educational Resources Information Center

Shafat, Gabriel; Levin, Ilya

2012-01-01

This paper deals with two types of logical problems--recognition problems and reverse engineering problems, and with the interrelations between these types of problems. The recognition problems are modeled in the form of a visual representation of various objects in a common pattern, with a composition of represented objects in the pattern.…
Three Dimensional Object Recognition Using an Unsupervised Neural Network: Understanding the Distinguishing Features

DTIC Science & Technology

1992-12-23

predominance of structural models of recognition, of which a recent example is the Recognition By Components (RBC) theory (Biederman, 1987). Structural...related to recent statistical theory (Huber, 1985; Friedman, 1987) and is derived from a biologically motivated computational theory (Bienenstock et...dimensional object recognition (Intrator and Gold, 1991). The method is related to recent statistical theory (Huber, 1985; Friedman, 1987) and is derived
Priming Contour-Deleted Images: Evidence for Intermediate Representations in Visual Object Recognition.

ERIC Educational Resources Information Center

Biederman, Irving; Cooper, Eric E.

1991-01-01

Speed and accuracy of identification of pictures of objects are facilitated by prior viewing. Contributions of image features, convex or concave components, and object models in a repetition priming task were explored in 2 studies involving 96 college students. Results provide evidence of intermediate representations in visual object recognition.…
Coding of visual object features and feature conjunctions in the human brain.

PubMed

Martinovic, Jasna; Gruber, Thomas; Müller, Matthias M

2008-01-01

Object recognition is achieved through neural mechanisms reliant on the activity of distributed coordinated neural assemblies. In the initial steps of this process, an object's features are thought to be coded very rapidly in distinct neural assemblies. These features play different functional roles in the recognition process--while colour facilitates recognition, additional contours and edges delay it. Here, we selectively varied the amount and role of object features in an entry-level categorization paradigm and related them to the electrical activity of the human brain. We found that early synchronizations (approx. 100 ms) increased quantitatively when more image features had to be coded, without reflecting their qualitative contribution to the recognition process. Later activity (approx. 200-400 ms) was modulated by the representational role of object features. These findings demonstrate that although early synchronizations may be sufficient for relatively crude discrimination of objects in visual scenes, they cannot support entry-level categorization. This was subserved by later processes of object model selection, which utilized the representational value of object features such as colour or edges to select the appropriate model and achieve identification.

Convolutional Neural Network Based on Extreme Learning Machine for Maritime Ships Recognition in Infrared Images.

PubMed

Khellal, Atmane; Ma, Hongbin; Fei, Qing

2018-05-09

The success of Deep Learning models, notably convolutional neural networks (CNNs), makes them the favorable solution for object recognition systems in both visible and infrared domains. However, the lack of training data in the case of maritime ships research leads to poor performance due to the problem of overfitting. In addition, the back-propagation algorithm used to train CNN is very slow and requires tuning many hyperparameters. To overcome these weaknesses, we introduce a new approach fully based on Extreme Learning Machine (ELM) to learn useful CNN features and perform a fast and accurate classification, which is suitable for infrared-based recognition systems. The proposed approach combines an ELM based learning algorithm to train CNN for discriminative feature extraction and an ELM based ensemble for classification. The experimental results on VAIS dataset, which is the largest dataset of maritime ships, confirm that the proposed approach outperforms the state-of-the-art models in terms of generalization performance and training speed. For instance, the proposed model is up to 950 times faster than the traditional back-propagation based training of convolutional neural networks, primarily for low-level feature extraction.
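The core idea behind an Extreme Learning Machine can be sketched in a few lines: hidden-layer weights are drawn at random and left fixed, and only the output weights are fitted, in closed form, by regularized least squares. The sketch below is a plain single-hidden-layer ELM classifier, not the ELM-trained CNN or the ensemble from the paper. The tanh activation, the ridge term, and all function names are assumptions chosen for illustration.

```python
import math
import random


def solve_linear(a, b):
    """Solve a @ x = b (a square, b a matrix) by Gauss-Jordan elimination."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def hidden(x, weights, biases):
    """Random-feature hidden layer: tanh(w . x + b) for each hidden unit."""
    return [math.tanh(sum(w * v for w, v in zip(ws, x)) + b)
            for ws, b in zip(weights, biases)]


def train_elm(xs, ys, n_hidden=20, ridge=1e-3, seed=0):
    """Fit output weights only; hidden weights stay at their random values."""
    rng = random.Random(seed)
    n_in, n_out = len(xs[0]), len(ys[0])
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    h = [hidden(x, weights, biases) for x in xs]
    # Ridge-regularized normal equations: (H^T H + ridge * I) beta = H^T Y
    hth = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    hty = [[sum(row[i] * y[k] for row, y in zip(h, ys)) for k in range(n_out)]
           for i in range(n_hidden)]
    return weights, biases, solve_linear(hth, hty)


def predict(x, model):
    """Return the index of the class with the highest output score."""
    weights, biases, beta = model
    h = hidden(x, weights, biases)
    scores = [sum(hv * beta[i][k] for i, hv in enumerate(h))
              for k in range(len(beta[0]))]
    return max(range(len(scores)), key=scores.__getitem__)
```

Because training reduces to one linear solve, there is no iterative gradient descent and no learning rate to tune, which is the property the abstract credits for the speed-up over back-propagation.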
Object Recognition using Feature- and Color-Based Methods

NASA Technical Reports Server (NTRS)

Duong, Tuan; Duong, Vu; Stubberud, Allen

2008-01-01

An improved adaptive method of processing image data in an artificial neural network has been developed to enable automated, real-time recognition of possibly moving objects under changing (including suddenly changing) conditions of illumination and perspective. The method involves a combination of two prior object-recognition methods, one based on adaptive detection of shape features and one based on adaptive color segmentation, to enable recognition in situations in which either prior method by itself may be inadequate. The chosen prior feature-based method is known as adaptive principal-component analysis (APCA); the chosen prior color-based method is known as adaptive color segmentation (ACOSE). These methods are made to interact with each other in a closed-loop system to obtain an optimal solution of the object-recognition problem in a dynamic environment. One of the results of the interaction is to increase, beyond what would otherwise be possible, the accuracy of the determination of a region of interest (containing an object that one seeks to recognize) within an image. Another result is to provide a minimized adaptive step that can be used to update the results obtained by the two component methods when changes of color and apparent shape occur. The net effect is to enable the neural network to update its recognition output and improve its recognition capability via an adaptive learning sequence. In principle, the improved method could readily be implemented in integrated circuitry to make a compact, low-power, real-time object-recognition system. It has been proposed to demonstrate the feasibility of such a system by integrating a 256-by-256 active-pixel sensor with APCA, ACOSE, and neural processing circuitry on a single chip. It has been estimated that such a system on a chip would have a volume no larger than a few cubic centimeters, could operate at a rate as high as 1,000 frames per second, and would consume on the order of milliwatts of power.
Generation, recognition, and consistent fusion of partial boundary representations from range images

NASA Astrophysics Data System (ADS)

Kohlhepp, Peter; Hanczak, Andrzej M.; Li, Gang

1994-10-01

This paper presents SOMBRERO, a new system for recognizing and locating 3D, rigid, non-moving objects from range data. The objects may be polyhedral or curved, partially occluding, touching or lying flush with each other. For data collection, we employ 2D time-of-flight laser scanners mounted to a moving gantry robot. By combining sensor and robot coordinates, we obtain 3D Cartesian coordinates. Boundary representations (Brep's) provide view independent geometry models that are both efficiently recognizable and derivable automatically from sensor data. SOMBRERO's methods for generating, matching and fusing Brep's are highly synergetic. A split-and-merge segmentation algorithm with dynamic triangulation builds a partial (2½D) Brep from scattered data. The recognition module matches this scene description with a model database and outputs recognized objects, their positions and orientations, and possibly surfaces corresponding to unknown objects. We present preliminary results in scene segmentation and recognition. Partial Brep's corresponding to different range sensors or viewpoints can be merged into a consistent, complete and irredundant 3D object or scene model. This fusion algorithm itself uses the recognition and segmentation methods.
Ground target recognition using rectangle estimation.

PubMed

Grönwall, Christina; Gustafsson, Fredrik; Millnert, Mille

2006-11-01

We propose a ground target recognition method based on 3-D laser radar data. The method handles general 3-D scattered data. It is based on the fact that man-made objects of complex shape can be decomposed into a set of rectangles. The ground target recognition method consists of four steps: 3-D size and orientation estimation, target segmentation into parts of approximately rectangular shape, identification of segments that represent the target's functional/main parts, and target matching with CAD models. The core of this approach is rectangle estimation. The performance of the rectangle estimation method is evaluated statistically using Monte Carlo simulations. A case study on tank recognition is shown, where 3-D data from four fundamentally different types of laser radar systems are used. Although the approach is tested on rather few examples, we believe that the approach is promising.
Computing multiple aggregation levels and contextual features for road facilities recognition using mobile laser scanning data

NASA Astrophysics Data System (ADS)

Yang, Bisheng; Dong, Zhen; Liu, Yuan; Liang, Fuxun; Wang, Yongjun

2017-04-01

In recent years, updating the inventory of road infrastructures based on field work is labor intensive, time consuming, and costly. Fortunately, vehicle-based mobile laser scanning (MLS) systems provide an efficient solution to rapidly capture three-dimensional (3D) point clouds of road environments with high flexibility and precision. However, robust recognition of road facilities from huge volumes of 3D point clouds is still a challenging issue because of complicated and incomplete structures, occlusions and varied point densities. Most existing methods utilize point or object based features to recognize object candidates, and can only extract limited types of objects with a relatively low recognition rate, especially for incomplete and small objects. To overcome these drawbacks, this paper proposes a semantic labeling framework by combining multiple aggregation levels (point-segment-object) of features and contextual features to recognize road facilities, such as road surfaces, road boundaries, buildings, guardrails, street lamps, traffic signs, roadside-trees, power lines, and cars, for highway infrastructure inventory. The proposed method first identifies ground and non-ground points, and extracts road surface facilities from ground points. Non-ground points are segmented into individual candidate objects based on the proposed multi-rule region growing method. Then, the multiple aggregation levels of features and the contextual features (relative positions, relative directions, and spatial patterns) associated with each candidate object are calculated and fed into an SVM classifier to label the corresponding candidate object. The recognition performance of combining multiple aggregation levels and contextual features was compared with single level (point, segment, or object) based features using large-scale highway scene point clouds.
Comparative studies demonstrated that the proposed semantic labeling framework significantly improves road facilities recognition precision (90.6%) and recall (91.2%), particularly for incomplete and small objects.
Advanced optical correlation and digital methods for pattern matching—50th anniversary of Vander Lugt matched filter

NASA Astrophysics Data System (ADS)

Millán, María S.

2012-10-01

On the verge of the 50th anniversary of Vander Lugt’s formulation for pattern matching based on matched filtering and optical correlation, we acknowledge the very intense research activity developed in the field of correlation-based pattern recognition during this period of time. The paper reviews some domains that appeared as emerging fields in the last years of the 20th century and have been developed later on in the 21st century. Such is the case of three-dimensional (3D) object recognition, biometric pattern matching, optical security and hybrid optical-digital processors. 3D object recognition is a challenging case of multidimensional image recognition because of its implications in the recognition of real-world objects independent of their perspective. Biometric recognition is essentially pattern recognition for which the personal identification is based on the authentication of a specific physiological characteristic possessed by the subject (e.g. fingerprint, face, iris, retina, and multifactor combinations). Biometric recognition often appears combined with encryption-decryption processes to secure information. The optical implementations of correlation-based pattern recognition processes still rely on the 4f-correlator, the joint transform correlator, or some of their variants. But the many applications developed in the field have been pushing the systems for a continuous improvement of their architectures and algorithms, thus leading towards merged optical-digital solutions.
A Single-System Model Predicts Recognition Memory and Repetition Priming in Amnesia

PubMed Central

Kessels, Roy P.C.; Wester, Arie J.; Shanks, David R.

2014-01-01

We challenge the claim that there are distinct neural systems for explicit and implicit memory by demonstrating that a formal single-system model predicts the pattern of recognition memory (explicit) and repetition priming (implicit) in amnesia. In the current investigation, human participants with amnesia categorized pictures of objects at study and then, at test, identified fragmented versions of studied (old) and nonstudied (new) objects (providing a measure of priming), and made a recognition memory judgment (old vs new) for each object. Numerous results in the amnesic patients were predicted in advance by the single-system model, as follows: (1) deficits in recognition memory and priming were evident relative to a control group; (2) items judged as old were identified at greater levels of fragmentation than items judged new, regardless of whether the items were actually old or new; and (3) the magnitude of the priming effect (the identification advantage for old vs new items) overall was greater than that of items judged new. Model evidence measures also favored the single-system model over two formal multiple-systems models. The findings support the single-system model, which explains the pattern of recognition and priming in amnesia primarily as a reduction in the strength of a single dimension of memory strength, rather than a selective explicit memory system deficit. PMID:25122896
Implications of Animal Object Memory Research for Human Amnesia

ERIC Educational Resources Information Center

Winters, Boyer D.; Saksida, Lisa M.; Bussey, Timothy J.

2010-01-01

Damage to structures in the human medial temporal lobe causes severe memory impairment. Animal object recognition tests gained prominence from attempts to model "global" human medial temporal lobe amnesia, such as that observed in patient HM. These tasks, such as delayed nonmatching-to-sample and spontaneous object recognition, for assessing…
Crowded and Sparse Domains in Object Recognition: Consequences for Categorization and Naming

ERIC Educational Resources Information Center

Gale, Tim M.; Laws, Keith R.; Foley, Kerry

2006-01-01

Some models of object recognition propose that items from structurally crowded categories (e.g., living things) permit faster access to superordinate semantic information than structurally dissimilar categories (e.g., nonliving things), but slower access to individual object information when naming items. We present four experiments that utilize…
Pose estimation of industrial objects towards robot operation

NASA Astrophysics Data System (ADS)

Niu, Jie; Zhou, Fuqiang; Tan, Haishu; Cao, Yu

2017-10-01

With the advantages of wide range, non-contact and high flexibility, the visual estimation technology of target pose has been widely applied in modern industry, robot guidance and other engineering practices. However, due to the influence of complicated industrial environment, outside interference factors, lack of object characteristics, restrictions of camera and other limitations, the visual estimation technology of target pose is still faced with many challenges. Focusing on the above problems, a pose estimation method of the industrial objects is developed based on 3D models of targets. By matching the extracted shape characteristics of objects with the a priori 3D model database of targets, the method realizes the recognition of target. Thus a pose estimation of objects can be determined based on the monocular vision measuring model. The experimental results show that this method can be implemented to estimate the position of rigid objects from poor image information, and provides a guiding basis for the operation of the industrial robot.
Recognizing familiar objects by hand and foot: Haptic shape perception generalizes to inputs from unusual locations and untrained body parts.

PubMed

Lawson, Rebecca

2014-02-01

The limits of generalization of our 3-D shape recognition system to identifying objects by touch were investigated by testing exploration at unusual locations and using untrained effectors. In Experiments 1 and 2, people found identification by hand of real objects, plastic 3-D models of objects, and raised line drawings placed in front of themselves no easier than when exploration was behind their back. Experiment 3 compared one-handed, two-handed, one-footed, and two-footed haptic object recognition of familiar objects. Recognition by foot was slower (7 vs. 13 s) and much less accurate (9 % vs. 47 % errors) than recognition by either one or both hands. Nevertheless, item difficulty was similar across hand and foot exploration, and there was a strong correlation between an individual's hand and foot performance. Furthermore, foot recognition was better with the largest 20 of the 80 items (32 % errors), suggesting that physical limitations hampered exploration by foot. Thus, object recognition by hand generalized efficiently across the spatial location of stimuli, while object recognition by foot seemed surprisingly good given that no prior training was provided. Active touch (haptics) thus efficiently extracts 3-D shape information and accesses stored representations of familiar objects from novel modes of input.
Comparison of Object Recognition Behavior in Human and Monkey

PubMed Central

Rajalingham, Rishi; Schmidt, Kailyn

2015-01-01

Although the rhesus monkey is used widely as an animal model of human visual processing, it is not known whether invariant visual object recognition behavior is quantitatively comparable across monkeys and humans. To address this question, we systematically compared the core object recognition behavior of two monkeys with that of human subjects. To test true object recognition behavior (rather than image matching), we generated several thousand naturalistic synthetic images of 24 basic-level objects with high variation in viewing parameters and image background. Monkeys were trained to perform binary object recognition tasks on a match-to-sample paradigm. Data from 605 human subjects performing the same tasks on Mechanical Turk were aggregated to characterize “pooled human” object recognition behavior, as well as 33 separate Mechanical Turk subjects to characterize individual human subject behavior. Our results show that monkeys learn each new object in a few days, after which they not only match mean human performance but show a pattern of object confusion that is highly correlated with pooled human confusion patterns and is statistically indistinguishable from individual human subjects. Importantly, this shared human and monkey pattern of 3D object confusion is not shared with low-level visual representations (pixels, V1+; models of the retina and primary visual cortex) but is shared with a state-of-the-art computer vision feature representation. Together, these results are consistent with the hypothesis that rhesus monkeys and humans share a common neural shape representation that directly supports object perception. SIGNIFICANCE STATEMENT To date, several mammalian species have shown promise as animal models for studying the neural mechanisms underlying high-level visual processing in humans. 
In light of this diversity, making tight comparisons between nonhuman and human primates is particularly critical in determining the best use of nonhuman primates to further the goal of the field of translating knowledge gained from animal models to humans. To the best of our knowledge, this study is the first systematic attempt at comparing a high-level visual behavior of humans and macaque monkeys. PMID:26338324
Recognizing Spoken Words: The Neighborhood Activation Model

PubMed Central

Luce, Paul A.; Pisoni, David B.

2012-01-01

Objective A fundamental problem in the study of human spoken word recognition concerns the structural relations among the sound patterns of words in memory and the effects these relations have on spoken word recognition. In the present investigation, computational and experimental methods were employed to address a number of fundamental issues related to the representation and structural organization of spoken words in the mental lexicon and to lay the groundwork for a model of spoken word recognition. Design Using a computerized lexicon consisting of transcriptions of 20,000 words, similarity neighborhoods for each of the transcriptions were computed. Among the variables of interest in the computation of the similarity neighborhoods were: 1) the number of words occurring in a neighborhood, 2) the degree of phonetic similarity among the words, and 3) the frequencies of occurrence of the words in the language. The effects of these variables on auditory word recognition were examined in a series of behavioral experiments employing three experimental paradigms: perceptual identification of words in noise, auditory lexical decision, and auditory word naming. Results The results of each of these experiments demonstrated that the number and nature of words in a similarity neighborhood affect the speed and accuracy of word recognition. A neighborhood probability rule was developed that adequately predicted identification performance. This rule, based on Luce's (1959) choice rule, combines stimulus word intelligibility, neighborhood confusability, and frequency into a single expression. Based on this rule, a model of auditory word recognition, the neighborhood activation model, was proposed. This model describes the effects of similarity neighborhood structure on the process of discriminating among the acoustic-phonetic representations of words in memory. 
The results of these experiments have important implications for current conceptions of auditory word recognition in normal and hearing impaired populations of children and adults. PMID:9504270
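The neighborhood probability rule mentioned above can be illustrated with a short sketch. It assumes the commonly cited form of the rule, in which the frequency-weighted probability of the stimulus word is divided by the sum of that term and the frequency-weighted probabilities of all neighbors. The function name and the example numbers are hypothetical and are not taken from the study.

```python
def neighborhood_probability(stimulus_prob, stimulus_freq, neighbors):
    """Frequency-weighted choice rule for identifying a spoken word.

    stimulus_prob: probability of the stimulus word given the input
                   (its intelligibility).
    stimulus_freq: frequency of occurrence of the stimulus word.
    neighbors:     iterable of (confusion_prob, frequency) pairs, one
                   per word in the similarity neighborhood.
    """
    target = stimulus_prob * stimulus_freq
    competition = sum(p * f for p, f in neighbors)
    return target / (target + competition)


# Hypothetical values: the same word is predicted to be harder to
# identify in a dense, high-frequency neighborhood than in a sparse one.
sparse = neighborhood_probability(0.8, 10.0, [(0.1, 5.0)])
dense = neighborhood_probability(0.8, 10.0, [(0.1, 5.0), (0.3, 50.0), (0.2, 40.0)])
print(sparse > dense)
```

The denominator grows with both the number of neighbors and their frequencies, which is how a single expression captures the effects of neighborhood density and word frequency reported in the abstract.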
ROBOSIGHT: Robotic Vision System For Inspection And Manipulation

NASA Astrophysics Data System (ADS)

Trivedi, Mohan M.; Chen, ChuXin; Marapane, Suresh

1989-02-01

Vision is an important sensory modality that can be used for deriving information critical to the proper, efficient, flexible, and safe operation of an intelligent robot. Vision systems are utilized for developing higher level interpretation of the nature of a robotic workspace using images acquired by cameras mounted on a robot. Such information can be useful for tasks such as object recognition, object location, object inspection, obstacle avoidance and navigation. In this paper we describe efforts directed towards developing a vision system useful for performing various robotic inspection and manipulation tasks. The system utilizes gray scale images and can be viewed as a model-based system. It includes general purpose image analysis modules as well as special purpose, task dependent object status recognition modules. Experiments are described to verify the robust performance of the integrated system using a robotic testbed.
Crowding by a single bar: probing pattern recognition mechanisms in the visual periphery.

PubMed

Põder, Endel

2014-11-06

Whereas visual crowding does not greatly affect the detection of the presence of simple visual features, it heavily inhibits combining them into recognizable objects. Still, crowding effects have rarely been directly related to general pattern recognition mechanisms. In this study, pattern recognition mechanisms in visual periphery were probed using a single crowding feature. Observers had to identify the orientation of a rotated T presented briefly in a peripheral location. Adjacent to the target, a single bar was presented. The bar was either horizontal or vertical and located in a random direction from the target. It appears that such a crowding bar has very strong and regular effects on the identification of the target orientation. The observer's responses are determined by approximate relative positions of basic visual features; exact image-based similarity to the target is not important. A version of the "standard model" of object recognition with second-order features explains the main regularities of the data. © 2014 ARVO.
Augmented reality three-dimensional object visualization and recognition with axially distributed sensing.

PubMed

Markman, Adam; Shen, Xin; Hua, Hong; Javidi, Bahram

2016-01-15

An augmented reality (AR) smartglass display combines real-world scenes with digital information enabling the rapid growth of AR-based applications. We present an augmented reality-based approach for three-dimensional (3D) optical visualization and object recognition using axially distributed sensing (ADS). For object recognition, the 3D scene is reconstructed, and feature extraction is performed by calculating the histogram of oriented gradients (HOG) of a sliding window. A support vector machine (SVM) is then used for classification. Once an object has been identified, the 3D reconstructed scene with the detected object is optically displayed in the smartglasses allowing the user to see the object, remove partial occlusions of the object, and provide critical information about the object such as 3D coordinates, which are not possible with conventional AR devices. To the best of our knowledge, this is the first report on combining axially distributed sensing with 3D object visualization and recognition for applications to augmented reality. The proposed approach can have benefits for many applications, including medical, military, transportation, and manufacturing.
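The feature-extraction step described above (a histogram of oriented gradients computed over a sliding window) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 9-bin unsigned-orientation layout, and the central-difference gradient are assumptions, and the SVM classification stage is left out.

```python
import math

def hog_cell_histogram(cell, bins=9):
    """Unsigned gradient-orientation histogram for one grayscale cell.

    `cell` is a list of rows of pixel intensities. Border pixels are skipped
    because central differences need a neighbour on both sides.
    """
    hist = [0.0] * bins
    height, width = len(cell), len(cell[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            # Fold the angle into [0, 180) so opposite directions share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle / (180.0 / bins)) % bins] += magnitude
    return hist

def sliding_windows(width, height, win, stride):
    """Yield the (left, top) corner of every win x win window in the image."""
    for top in range(0, height - win + 1, stride):
        for left in range(0, width - win + 1, stride):
            yield left, top

# A 4x4 patch containing a vertical edge: all gradient energy is horizontal.
patch = [[0, 0, 10, 10] for _ in range(4)]
histogram = hog_cell_histogram(patch)
corners = list(sliding_windows(4, 4, 2, 2))
```

In a full pipeline, the histograms of all cells inside a window would be concatenated and normalized into one descriptor per window before being passed to the classifier.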
Ventral-stream-like shape representation: from pixel intensity values to trainable object-selective COSFIRE models

PubMed Central

Azzopardi, George; Petkov, Nicolai

2014-01-01

The remarkable abilities of the primate visual system have inspired the construction of computational models of some visual neurons. We propose a trainable hierarchical object recognition model, which we call S-COSFIRE (S stands for Shape and COSFIRE stands for Combination Of Shifted FIlter REsponses) and use it to localize and recognize objects of interest embedded in complex scenes. It is inspired by the visual processing in the ventral stream (V1/V2 → V4 → TEO). Recognition and localization of objects embedded in complex scenes is important for many computer vision applications. Most existing methods require prior segmentation of the objects from the background, which in turn requires recognition. An S-COSFIRE filter is automatically configured to be selective for an arrangement of contour-based features that belong to a prototype shape specified by an example. The configuration comprises selecting relevant vertex detectors and determining certain blur and shift parameters. The response is computed as the weighted geometric mean of the blurred and shifted responses of the selected vertex detectors. S-COSFIRE filters share similar properties with some neurons in inferotemporal cortex, which provided inspiration for this work. We demonstrate the effectiveness of S-COSFIRE filters in two applications: letter and keyword spotting in handwritten manuscripts and object spotting in complex scenes for the computer vision system of a domestic robot. S-COSFIRE filters are effective for recognizing and localizing (deformable) objects in images of complex scenes without requiring prior segmentation. They are versatile trainable shape detectors, conceptually simple and easy to implement. The presented hierarchical shape representation contributes to a better understanding of the brain and to more robust computer vision algorithms. PMID:25126068
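The response rule stated in this abstract, a weighted geometric mean of the selected detector responses, can be illustrated with a short sketch. The function name, the example values, and the handling of zero responses are assumptions made for illustration; the blurring and shifting steps that produce the inputs are not shown.

```python
import math

def weighted_geometric_mean(responses, weights):
    """Combine non-negative detector responses multiplicatively.

    Returns 0.0 if any response is zero, so the combined filter fires only
    when every selected detector contributes (an AND-like behaviour).
    """
    if not responses or len(responses) != len(weights):
        raise ValueError("responses and weights must be non-empty and equal in length")
    if any(r <= 0 for r in responses):
        return 0.0
    total_weight = sum(weights)
    log_sum = sum(w * math.log(r) for r, w in zip(responses, weights))
    return math.exp(log_sum / total_weight)

# Hypothetical responses of two vertex detectors, equally weighted.
combined = weighted_geometric_mean([4.0, 1.0], [1.0, 1.0])
silenced = weighted_geometric_mean([4.0, 0.0], [1.0, 1.0])
```

Unlike an arithmetic mean, the geometric mean lets one missing feature suppress the whole response, which is what makes the filter selective for the full arrangement of features rather than for any single one.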
Invariant visual object recognition: a model, with lighting invariance.

PubMed

Rolls, Edmund T; Stringer, Simon M

2006-01-01

How are invariant representations of objects formed in the visual cortex? We describe a neurophysiological and computational approach which focusses on a feature hierarchy model in which invariant representations can be built by self-organizing learning based on the statistics of the visual input. The model can use temporal continuity in an associative synaptic learning rule with a short term memory trace, and/or it can use spatial continuity in Continuous Transformation learning. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size, and in this paper we show also lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects such as looming, rotation, and object-based movement. The model has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in for example spatial and object search tasks. The model has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene.
A Comparison of the Effects of Depth Rotation on Visual and Haptic Three-Dimensional Object Recognition

ERIC Educational Resources Information Center

Lawson, Rebecca

2009-01-01

A sequential matching task was used to compare how the difficulty of shape discrimination influences the achievement of object constancy for depth rotations across haptic and visual object recognition. Stimuli were nameable, 3-dimensional plastic models of familiar objects (e.g., bed, chair) and morphs midway between these endpoint shapes (e.g., a…
Localized contourlet features in vehicle make and model recognition

NASA Astrophysics Data System (ADS)

Zafar, I.; Edirisinghe, E. A.; Acar, B. S.

2009-02-01

Automatic vehicle Make and Model Recognition (MMR) systems provide useful performance enhancements to vehicle recognition systems that are solely based on Automatic Number Plate Recognition (ANPR) systems. Several vehicle MMR systems have been proposed in the literature. In parallel, multi-resolution feature analysis techniques that lead to efficient object classification algorithms have received close attention from the research community. To this effect, Contourlet transforms, which can provide an efficient directional multi-resolution image representation, have recently been introduced. An attempt has already been made in the literature to use Curvelet/Contourlet transforms in vehicle MMR. In this paper we propose a novel localized feature detection method in the Contourlet transform domain that is capable of increasing the classification rates by up to 4%, as compared to the previously proposed Contourlet-based vehicle MMR approach, in which the features are non-localized and thus result in sub-optimal classification. Further, we show that the proposed algorithm can achieve the increased classification accuracy of 96% at significantly lower computational complexity due to the use of Two Dimensional Linear Discriminant Analysis (2DLDA) for dimensionality reduction by preserving the features with high between-class variance and low within-class variance.

Complementary Hemispheric Asymmetries in Object Naming and Recognition: A Voxel-Based Correlational Study

ERIC Educational Resources Information Center

Acres, K.; Taylor, K. I.; Moss, H. E.; Stamatakis, E. A.; Tyler, L. K.

2009-01-01

Cognitive neuroscientific research proposes complementary hemispheric asymmetries in naming and recognising visual objects, with a left temporal lobe advantage for object naming and a right temporal lobe advantage for object recognition. Specifically, it has been proposed that the left inferior temporal lobe plays a mediational role linking…
A defense of the subordinate-level expertise account for the N170 component.

PubMed

Rossion, Bruno; Curran, Tim; Gauthier, Isabel

2002-09-01

A recent paper in this journal reports two event-related potential (ERP) experiments interpreted as supporting the domain specificity of the visual mechanisms implicated in processing faces (Cognition 83 (2002) 1). The authors argue that because a large neurophysiological response to faces (N170) is less influenced by the task than the response to objects, and because the response for human faces extends to ape faces (for which we are not expert), we should reject the hypothesis that the face-sensitivity reflected by the N170 can be accounted for by the subordinate-level expertise model of object recognition (Nature Neuroscience 3 (2000) 764). In this commentary, we question this conclusion based on some of our own ERP work on expert object recognition as well as the work of others.
A Theory of How Columns in the Neocortex Enable Learning the Structure of the World

PubMed Central

Hawkins, Jeff; Ahmad, Subutai; Cui, Yuwei

2017-01-01

Neocortical regions are organized into columns and layers. Connections between layers run mostly perpendicular to the surface, suggesting a columnar functional organization. Some layers have long-range excitatory lateral connections, suggesting interactions between columns. Similar patterns of connectivity exist in all regions but their exact role remains a mystery. In this paper, we propose a network model composed of columns and layers that performs robust object learning and recognition. Each column integrates its changing input over time to learn complete predictive models of observed objects. Excitatory lateral connections across columns allow the network to more rapidly infer objects based on the partial knowledge of adjacent columns. Because columns integrate input over time and space, the network learns models of complex objects that extend well beyond the receptive field of individual cells. Our network model introduces a new feature to cortical columns. We propose that a representation of location relative to the object being sensed is calculated within the sub-granular layers of each column. The location signal is provided as an input to the network, where it is combined with sensory data. Our model contains two layers and one or more columns. Simulations show that, using Hebbian-like learning rules, small single-column networks can learn to recognize hundreds of objects, with each object containing tens of features. Multi-column networks recognize objects with significantly fewer movements of the sensory receptors. Given the ubiquity of columnar and laminar connectivity patterns throughout the neocortex, we propose that columns and regions have more powerful recognition and modeling capabilities than previously assumed. PMID:29118696
Central administration of angiotensin IV rapidly enhances novel object recognition among mice.

PubMed

Paris, Jason J; Eans, Shainnel O; Mizrachi, Elisa; Reilley, Kate J; Ganno, Michelle L; McLaughlin, Jay P

2013-07-01

Angiotensin IV (Val(1)-Tyr(2)-Ile(3)-His(4)-Pro(5)-Phe(6)) has demonstrated potential cognitive-enhancing effects. The present investigation assessed and characterized: (1) dose-dependency of angiotensin IV's cognitive enhancement in a C57BL/6J mouse model of novel object recognition, (2) the time-course for these effects, (3) the identity of residues in the hexapeptide important to these effects and (4) the necessity of actions at angiotensin IV receptors for procognitive activity. Assessment of C57BL/6J mice in a novel object recognition task demonstrated that prior administration of angiotensin IV (0.1, 1.0, or 10.0, but not 0.01 nmol, i.c.v.) significantly enhanced novel object recognition in a dose-dependent manner. These effects were time dependent, with improved novel object recognition observed when angiotensin IV (0.1 nmol, i.c.v.) was administered 10 or 20, but not 30 min prior to the onset of the novel object recognition testing. An alanine scan of the angiotensin IV peptide revealed that replacement of the Val(1), Ile(3), His(4), or Phe(6) residues with Ala attenuated peptide-induced improvements in novel object recognition, whereas Tyr(2) or Pro(5) replacement did not significantly affect performance. Administration of the angiotensin IV receptor antagonist, divalinal-Ang IV (20 nmol, i.c.v.), reduced (but did not abolish) novel object recognition; however, this antagonist completely blocked the procognitive effects of angiotensin IV (0.1 nmol, i.c.v.) in this task. Rotorod testing demonstrated no locomotor effects with any angiotensin IV or divalinal-Ang IV dose tested. These data demonstrate that angiotensin IV produces a rapid enhancement of associative learning and memory performance in a mouse model that was dependent on the angiotensin IV receptor. Copyright © 2013 Elsevier Ltd. All rights reserved.
Central administration of angiotensin IV rapidly enhances novel object recognition among mice

PubMed Central

Paris, Jason J.; Eans, Shainnel O.; Mizrachi, Elisa; Reilley, Kate J.; Ganno, Michelle L.; McLaughlin, Jay P.

2013-01-01

Angiotensin IV (Val1-Tyr2-Ile3-His4-Pro5-Phe6) has demonstrated potential cognitive-enhancing effects. The present investigation assessed and characterized: (1) dose-dependency of angiotensin IV's cognitive enhancement in a C57BL/6J mouse model of novel object recognition, (2) the time-course for these effects, (3) the identity of residues in the hexapeptide important to these effects and (4) the necessity of actions at angiotensin IV receptors for pro-cognitive activity. Assessment of C57BL/6J mice in a novel object recognition task demonstrated that prior administration of angiotensin IV (0.1, 1.0, or 10.0, but not 0.01, nmol, i.c.v.) significantly enhanced novel object recognition in a dose-dependent manner. These effects were time dependent, with improved novel object recognition observed when angiotensin IV (0.1 nmol, i.c.v.) was administered 10 or 20, but not 30, min prior to the onset of the novel object recognition testing. An alanine scan of the angiotensin IV peptide revealed that replacement of the Val1, Ile3, His4, or Phe6 residues with Ala attenuated peptide-induced improvements in novel object recognition, whereas Tyr2 or Pro5 replacement did not significantly affect performance. Administration of the angiotensin IV receptor antagonist, divalinal-Ang IV (20 nmol, i.c.v.), reduced (but did not abolish) novel object recognition; however, this antagonist completely blocked the pro-cognitive effects of angiotensin IV (0.1 nmol, i.c.v.) in this task. Rotorod testing demonstrated no locomotor effects for any angiotensin IV or divalinal-Ang IV dose tested. These data demonstrate that angiotensin IV produces a rapid enhancement of associative learning and memory performance in a mouse model that was dependent on the angiotensin IV receptor. PMID:23416700
From brain synapses to systems for learning and memory: Object recognition, spatial navigation, timed conditioning, and movement control.

PubMed

Grossberg, Stephen

2015-09-24

This article provides an overview of neural models of synaptic learning and memory whose expression in adaptive behavior depends critically on the circuits and systems in which the synapses are embedded. It reviews Adaptive Resonance Theory, or ART, models that use excitatory matching and match-based learning to achieve fast category learning and whose learned memories are dynamically stabilized by top-down expectations, attentional focusing, and memory search. ART clarifies mechanistic relationships between consciousness, learning, expectation, attention, resonance, and synchrony. ART models are embedded in ARTSCAN architectures that unify processes of invariant object category learning, recognition, spatial and object attention, predictive remapping, and eye movement search, and that clarify how conscious object vision and recognition may fail during perceptual crowding and parietal neglect. The generality of learned categories depends upon a vigilance process that is regulated by acetylcholine via the nucleus basalis. Vigilance can get stuck at too high or too low values, thereby causing learning problems in autism and medial temporal amnesia. Similar synaptic learning laws support qualitatively different behaviors: Invariant object category learning in the inferotemporal cortex; learning of grid cells and place cells in the entorhinal and hippocampal cortices during spatial navigation; and learning of time cells in the entorhinal-hippocampal system during adaptively timed conditioning, including trace conditioning. Spatial and temporal processes through the medial and lateral entorhinal-hippocampal system seem to be carried out with homologous circuit designs. Variations of a shared laminar neocortical circuit design have modeled 3D vision, speech perception, and cognitive working memory and learning. A complementary kind of inhibitory matching and mismatch learning controls movement. This article is part of a Special Issue entitled SI: Brain and Memory. 
Copyright © 2014 Elsevier B.V. All rights reserved.
Global ensemble texture representations are critical to rapid scene perception.

PubMed

Brady, Timothy F; Shafer-Skelton, Anna; Alvarez, George A

2017-06-01

Traditionally, recognizing the objects within a scene has been treated as a prerequisite to recognizing the scene itself. However, research now suggests that the ability to rapidly recognize visual scenes could be supported by global properties of the scene itself rather than the objects within the scene. Here, we argue for a particular instantiation of this view: That scenes are recognized by treating them as a global texture and processing the pattern of orientations and spatial frequencies across different areas of the scene without recognizing any objects. To test this model, we asked whether there is a link between how proficient individuals are at rapid scene perception and how proficiently they represent simple spatial patterns of orientation information (global ensemble texture). We find a significant and selective correlation between these tasks, suggesting a link between scene perception and spatial ensemble tasks but not nonspatial summary statistics. In a second and third experiment, we additionally show that global ensemble texture information is not only associated with scene recognition, but that preserving only global ensemble texture information from scenes is sufficient to support rapid scene perception; however, preserving the same information is not sufficient for object recognition. Thus, global ensemble texture alone is sufficient to allow activation of scene representations but not object representations. Together, these results provide evidence for a view of scene recognition based on global ensemble texture rather than a view based purely on objects or on nonspatially localized global properties. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
DIAC object recognition system

NASA Astrophysics Data System (ADS)

Buurman, Johannes

1992-03-01

This paper describes the object recognition system used in an intelligent robot cell. It is used to recognize and estimate pose and orientation of parts as they enter the cell. The parts are mostly metal and consist of polyhedral and cylindrical shapes. The system uses feature-based stereo vision to acquire a wireframe of the observed part. Features are defined as straight lines and ellipses, which lead to a wireframe of straight lines and circular arcs (the latter using a new algorithm). This wireframe is compared to a number of wire frame models obtained from the CAD database. Experimental results show that image processing hardware and parallelization may add considerably to the speed of the system.
Exploring the feasibility of traditional image querying tasks for industrial radiographs

NASA Astrophysics Data System (ADS)

Bray, Iliana E.; Tsai, Stephany J.; Jimenez, Edward S.

2015-08-01

Although there have been great strides in object recognition with optical images (photographs), there has been comparatively little research into object recognition for X-ray radiographs. Our exploratory work contributes to this area by creating an object recognition system designed to recognize components from a related database of radiographs. Object recognition for radiographs must be approached differently than for optical images, because radiographs have much less color-based information to distinguish objects, and they exhibit transmission overlap that alters perceived object shapes. The dataset used in this work contained more than 55,000 intermixed radiographs and photographs, all in a compressed JPEG form and with multiple ways of describing pixel information. For this work, a robust and efficient system is needed to combat problems presented by properties of the X-ray imaging modality, the large size of the given database, and the quality of the images contained in said database. We have explored various pre-processing techniques to clean the cluttered and low-quality images in the database, and we have developed our object recognition system by combining multiple object detection and feature extraction methods. We present the preliminary results of the still-evolving hybrid object recognition system.
Event Recognition Based on Deep Learning in Chinese Texts

PubMed Central

Zhang, Yajun; Liu, Zongtian; Zhou, Wen

2016-01-01

Event recognition is the most fundamental and critical task in event-based natural language processing systems. Existing event recognition methods based on rules and shallow neural networks have certain limitations. For example, extracting features using methods based on rules is difficult; methods based on shallow neural networks converge too quickly to a local minimum, resulting in low recognition precision. To address these problems, we propose the Chinese emergency event recognition model based on deep learning (CEERM). Firstly, we use a word segmentation system to segment sentences. According to event elements labeled in the CEC 2.0 corpus, we classify words into five categories: trigger words, participants, objects, time and location. Each word is vectorized according to the following six feature layers: part of speech, dependency grammar, length, location, distance between trigger word and core word and trigger word frequency. We obtain deep semantic features of words by training a feature vector set using a deep belief network (DBN), then analyze those features in order to identify trigger words by means of a back propagation neural network. Extensive testing shows that the CEERM achieves excellent recognition performance, with a maximum F-measure value of 85.17%. Moreover, we propose the dynamic-supervised DBN, which adds supervised fine-tuning to a restricted Boltzmann machine layer by monitoring its training performance. Test analysis reveals that the new DBN improves recognition performance and effectively controls the training time. Although the F-measure increases to 88.11%, the training time increases by only 25.35%. PMID:27501231
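The F-measure values quoted above combine precision and recall into one score. The abstract does not say which variant was used, so the sketch below assumes the common F-beta definition with beta = 1 (the harmonic mean); the function name and the sample inputs are illustrative only.

```python
def f_measure(precision, recall, beta=1.0):
    """F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator

# With equal precision and recall, the balanced F-measure equals both.
balanced = f_measure(0.5, 0.5)
# A lopsided classifier is pulled toward its weaker figure.
lopsided = f_measure(1.0, 0.5)
```

Because the harmonic mean is dominated by the smaller of the two inputs, a high F-measure requires both precision and recall to be high.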
Event Recognition Based on Deep Learning in Chinese Texts.

PubMed

Zhang, Yajun; Liu, Zongtian; Zhou, Wen

2016-01-01

Event recognition is the most fundamental and critical task in event-based natural language processing systems. Existing event recognition methods based on rules and shallow neural networks have certain limitations. For example, extracting features using methods based on rules is difficult; methods based on shallow neural networks converge too quickly to a local minimum, resulting in low recognition precision. To address these problems, we propose the Chinese emergency event recognition model based on deep learning (CEERM). Firstly, we use a word segmentation system to segment sentences. According to event elements labeled in the CEC 2.0 corpus, we classify words into five categories: trigger words, participants, objects, time and location. Each word is vectorized according to the following six feature layers: part of speech, dependency grammar, length, location, distance between trigger word and core word and trigger word frequency. We obtain deep semantic features of words by training a feature vector set using a deep belief network (DBN), then analyze those features in order to identify trigger words by means of a back propagation neural network. Extensive testing shows that the CEERM achieves excellent recognition performance, with a maximum F-measure value of 85.17%. Moreover, we propose the dynamic-supervised DBN, which adds supervised fine-tuning to a restricted Boltzmann machine layer by monitoring its training performance. Test analysis reveals that the new DBN improves recognition performance and effectively controls the training time. Although the F-measure increases to 88.11%, the training time increases by only 25.35%.
RecceMan: an interactive recognition assistance for image-based reconnaissance: synergistic effects of human perception and computational methods for object recognition, identification, and infrastructure analysis

NASA Astrophysics Data System (ADS)

El Bekri, Nadia; Angele, Susanne; Ruckhäberle, Martin; Peinsipp-Byma, Elisabeth; Haelke, Bruno

2015-10-01

This paper introduces an interactive recognition assistance system for imaging reconnaissance. This system supports aerial image analysts on missions during two main tasks: object recognition and infrastructure analysis. Object recognition concentrates on the classification of one single object. Infrastructure analysis deals with the description of the components of an infrastructure and the recognition of the infrastructure type (e.g. military airfield). Based on satellite or aerial images, aerial image analysts are able to extract single object features and thereby recognize different object types. It is one of the most challenging tasks in imaging reconnaissance. Currently, no high-potential ATR (automatic target recognition) applications are available; as a consequence, the human observer cannot be replaced entirely. State-of-the-art ATR applications cannot match human perception and interpretation. Why is this still such a critical issue? First, cluttered and noisy images make it difficult to automatically extract, classify and identify object types. Second, due to the changed warfare and the rise of asymmetric threats, it is nearly impossible to create an underlying data set containing all features, objects or infrastructure types. Many other factors, such as environmental parameters or aspect angles, further complicate the application of ATR. Due to the lack of suitable ATR procedures, the human factor is still important and so far irreplaceable. In order to use the potential benefits of human perception and computational methods in a synergistic way, both are unified in an interactive assistance system. RecceMan® (Reconnaissance Manual) offers two different modes for aerial image analysts on missions: the object recognition mode and the infrastructure analysis mode. The aim of the object recognition mode is to recognize a certain object type based on the object features that originated from the image signatures. The infrastructure analysis mode pursues the goal of analyzing the function of the infrastructure. The image analyst visually extracts certain target object signatures, assigns them to corresponding object features and is finally able to recognize the object type. The system offers the analyst the possibility to assign the image signatures to features given by sample images. The underlying data set contains a wide range of object features and object types for different domains like ships or land vehicles. Each domain has its own feature tree developed by aerial image analyst experts. By selecting the corresponding features, the possible solution set of objects is automatically reduced and matches only the objects that contain the selected features. Moreover, we give an outlook on current research in the field of ground target analysis, in which we deal with partly automated methods to extract image signatures and assign them to the corresponding features. This research includes methods for automatically determining the orientation of an object and geometric features like the width and length of the object. This step makes it possible to automatically reduce the possible object types offered to the image analyst by the interactive recognition assistance system.
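The narrowing step described in this abstract, where each selected feature shrinks the set of matching object types, amounts to a subset test. The sketch below assumes a flat catalog that maps object types to feature sets; the catalog contents and names are invented for illustration and do not come from RecceMan.

```python
def filter_candidates(catalog, selected_features):
    """Return the object types whose feature set contains every selected feature."""
    wanted = set(selected_features)
    return sorted(name for name, features in catalog.items() if wanted <= set(features))

# Hypothetical catalog for a "land vehicles" domain.
catalog = {
    "main battle tank": {"tracked", "turret", "long gun barrel"},
    "infantry fighting vehicle": {"tracked", "turret"},
    "cargo truck": {"wheeled", "cargo bed"},
}

after_one_feature = filter_candidates(catalog, ["tracked"])
after_two_features = filter_candidates(catalog, ["tracked", "long gun barrel"])
```

Each additional selection can only keep or shrink the candidate list, which is why the order in which the analyst picks features does not change the final result.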
Track Everything: Limiting Prior Knowledge in Online Multi-Object Recognition.

PubMed

Wong, Sebastien C; Stamatescu, Victor; Gatt, Adam; Kearney, David; Lee, Ivan; McDonnell, Mark D

2017-10-01

This paper addresses the problem of online tracking and classification of multiple objects in an image sequence. Our proposed solution is to first track all objects in the scene without relying on object-specific prior knowledge, which in other systems can take the form of hand-crafted features or user-based track initialization. We then classify the tracked objects with a fast-learning image classifier that is based on a shallow convolutional neural network architecture, and demonstrate that object recognition improves when this is combined with object state information from the tracking algorithm. We argue that by transferring the use of prior knowledge from the detection and tracking stages to the classification stage, we can design a robust, general purpose object recognition system with the ability to detect and track a variety of object types. We describe our biologically inspired implementation, which adaptively learns the shape and motion of tracked objects, and apply it to the Neovision2 Tower benchmark data set, which contains multiple object types. An experimental evaluation demonstrates that our approach is competitive with the state-of-the-art video object recognition systems that do make use of object-specific prior knowledge in detection and tracking, while providing additional practical advantages by virtue of its generality.
Impaired recognition of faces and objects in dyslexia: Evidence for ventral stream dysfunction?

PubMed

Sigurdardottir, Heida Maria; Ívarsson, Eysteinn; Kristinsdóttir, Kristjana; Kristjánsson, Árni

2015-09-01

The objective of this study was to establish whether or not dyslexics are impaired at the recognition of faces and other complex nonword visual objects. This would be expected based on a meta-analysis revealing that children and adult dyslexics show functional abnormalities within the left fusiform gyrus, a brain region high up in the ventral visual stream, which is thought to support the recognition of words, faces, and other objects. 20 adult dyslexics (M = 29 years) and 20 matched typical readers (M = 29 years) participated in the study. One dyslexic-typical reader pair was excluded based on Adult Reading History Questionnaire scores and IS-FORM reading scores. Performance was measured on 3 high-level visual processing tasks: the Cambridge Face Memory Test, the Vanderbilt Holistic Face Processing Test, and the Vanderbilt Expertise Test. People with dyslexia are impaired in their recognition of faces and other visually complex objects. Their holistic processing of faces appears to be intact, suggesting that dyslexics may instead be specifically impaired at part-based processing of visual objects. The difficulty that people with dyslexia experience with reading might be the most salient manifestation of a more general high-level visual deficit. (c) 2015 APA, all rights reserved.
Multi-objects recognition for distributed intelligent sensor networks

NASA Astrophysics Data System (ADS)

He, Haibo; Chen, Sheng; Cao, Yuan; Desai, Sachi; Hohil, Myron E.

2008-04-01

This paper proposes an innovative approach for multi-objects recognition for homeland security and defense based intelligent sensor networks. Unlike the conventional way of information analysis, data mining in such networks is typically characterized by high information ambiguity/uncertainty, data redundancy, high dimensionality and real-time constraints. Furthermore, since a typical military based network normally includes multiple mobile sensor platforms, ground forces, fortified tanks, combat flights, and other resources, it is critical to develop intelligent data mining approaches to fuse different information resources to understand dynamic environments, to support decision making processes, and finally to achieve the goals. This paper aims to address these issues with a focus on multi-objects recognition. Instead of classifying a single object as in the traditional image classification problems, the proposed method can automatically learn multiple objectives simultaneously. Image segmentation techniques are used to identify the interesting regions in the field, which correspond to multiple objects such as soldiers or tanks. Since different objects will come with different feature sizes, we propose a feature scaling method to represent each object in the same number of dimensions. This is achieved by linear/nonlinear scaling and sampling techniques. Finally, support vector machine (SVM) based learning algorithms are developed to learn and build the associations for different objects, and such knowledge will be adaptively accumulated for objects recognition in the testing stage. We test the effectiveness of the proposed method in different simulated military environments.
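The feature scaling idea in this abstract, representing objects of different sizes in the same number of dimensions, can be illustrated with linear resampling of a one-dimensional feature vector. This is only one possible reading of "linear scaling and sampling"; the function name and the choice of linear interpolation are assumptions, not the authors' method.

```python
def resample_linear(values, target_len):
    """Resample a 1-D feature vector to target_len entries by linear interpolation."""
    if not values or target_len < 1:
        raise ValueError("need a non-empty vector and a positive target length")
    if len(values) == 1 or target_len == 1:
        return [float(values[0])] * target_len
    # Spacing between output samples, measured in input index units.
    step = (len(values) - 1) / (target_len - 1)
    resampled = []
    for i in range(target_len):
        position = i * step
        low = int(position)
        high = min(low + 1, len(values) - 1)
        fraction = position - low
        resampled.append(values[low] * (1 - fraction) + values[high] * fraction)
    return resampled

# Two segmented regions of different size end up with three features each.
small_region = resample_linear([0, 10], 3)
large_region = resample_linear([1, 2, 3, 4, 5], 3)
```

Once every region has the same dimensionality, the vectors can be passed to a fixed-input classifier such as the SVM mentioned in the abstract.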
Learning object-to-class kernels for scene classification.

PubMed

Zhang, Lei; Zhen, Xiantong; Shao, Ling

2014-08-01

High-level image representations have drawn increasing attention in visual recognition, e.g., scene classification, since the invention of the object bank. The object bank represents an image as a response map of a large number of pretrained object detectors and has achieved superior performance for visual recognition. In this paper, based on the object bank representation, we propose the object-to-class (O2C) distances to model scene images. In particular, four variants of O2C distances are presented, and with the O2C distances, we can represent the images using the object bank by lower-dimensional but more discriminative spaces, called distance spaces, which are spanned by the O2C distances. Due to the explicit computation of O2C distances based on the object bank, the obtained representations can possess more semantic meanings. To combine the discriminant ability of the O2C distances to all scene classes, we further propose to kernelize the distance representation for the final classification. We have conducted extensive experiments on four benchmark data sets, UIUC-Sports, Scene-15, MIT Indoor, and Caltech-101, which demonstrate that the proposed approaches can significantly improve the original object bank approach and achieve state-of-the-art performance.
Sparse aperture 3D passive image sensing and recognition

NASA Astrophysics Data System (ADS)

Daneshpanah, Mehdi

The way we perceive, capture, store, communicate and visualize the world has greatly changed in the past century. Novel three dimensional (3D) imaging and display systems are being pursued both in academic and industrial settings. In many cases, these systems have revolutionized traditional approaches and/or enabled new technologies in other disciplines including medical imaging and diagnostics, industrial metrology, entertainment, robotics as well as defense and security. In this dissertation, we focus on novel aspects of sparse aperture multi-view imaging systems and their application in quantum-limited object recognition in two separate parts. In the first part, two concepts are proposed. First, a solution is presented that involves a generalized framework for 3D imaging using randomly distributed sparse apertures. Second, a method is suggested to extract the profile of objects in the scene through statistical properties of the reconstructed light field. In both cases, experimental results are presented that demonstrate the feasibility of the techniques. In the second part, the application of 3D imaging systems in sensing and recognition of objects is addressed. In particular, we focus on the scenario in which only tens of photons reach the sensor from the object of interest, as opposed to hundreds of billions of photons in normal imaging conditions. At this level, the quantum-limited behavior of light will dominate and traditional object recognition practices may fail. We suggest a likelihood-based object recognition framework that incorporates the physics of sensing at quantum-limited conditions. Sensor dark noise has been modeled and taken into account. This framework is applied to 3D sensing of thermal objects using visible spectrum detectors. Thermal objects as cold as 250K are shown to provide enough signature photons to be sensed and recognized within background and dark noise with mature, visible band, image forming optics and detector arrays. The results suggest that one might not need to venture into exotic and expensive detector arrays and associated optics for sensing room-temperature thermal objects in complete darkness.
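A likelihood-based recognizer for photon-starved images, as outlined in the abstract above, can be sketched as follows. This is only an illustration under stated assumptions: photon counts per pixel are taken to be independent Poisson variables, dark noise is modeled as a constant additive rate, and the templates, counts, and function names are hypothetical. The dissertation's actual sensing model is not given in the abstract.

```python
import math


def poisson_log_likelihood(counts, expected, dark_rate):
    """Log-likelihood of observed photon counts under a Poisson model.

    Assumption: each pixel count is Poisson with mean equal to the
    template's expected signal photons plus a constant dark-noise rate.
    """
    total = 0.0
    for k, lam_signal in zip(counts, expected):
        lam = lam_signal + dark_rate
        # log of Poisson pmf: k*log(lam) - lam - log(k!)
        total += k * math.log(lam) - lam - math.lgamma(k + 1)
    return total


def classify(counts, templates, dark_rate):
    """Return the name of the template with the highest log-likelihood."""
    return max(
        templates,
        key=lambda name: poisson_log_likelihood(counts, templates[name], dark_rate),
    )


# Hypothetical 4-pixel templates: expected signal photons per pixel.
templates = {
    "bar": [0.2, 3.0, 3.0, 0.2],
    "edge": [3.0, 3.0, 0.2, 0.2],
}

# An observation with only 7 detected photons in total.
observed = [0, 4, 2, 1]
label = classify(observed, templates, dark_rate=0.1)
```

Here `label` is `"bar"`, since the counts concentrate on the two middle pixels. Keeping `dark_rate` strictly positive also keeps the logarithm defined for pixels where a template expects no signal.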
A reciprocal model of face recognition and autistic traits: evidence from an individual differences perspective.

PubMed

Halliday, Drew W R; MacDonald, Stuart W S; Scherf, K Suzanne; Sherf, Suzanne K; Tanaka, James W

2014-01-01

Although not a core symptom of the disorder, individuals with autism often exhibit selective impairments in their face processing abilities. Importantly, the reciprocal connection between autistic traits and face perception has rarely been examined within the typically developing population. In this study, university participants from the social sciences, physical sciences, and humanities completed a battery of measures that assessed face, object and emotion recognition abilities, general perceptual-cognitive style, and sub-clinical autistic traits (the Autism Quotient (AQ)). We employed separate hierarchical multiple regression analyses to evaluate which factors could predict face recognition scores and AQ scores. Gender, object recognition performance, and AQ scores predicted face recognition behaviour. Specifically, males, individuals with more autistic traits, and those with lower object recognition scores performed more poorly on the face recognition test. Conversely, university major, gender and face recognition performance reliably predicted AQ scores. Science majors, males, and individuals with poor face recognition skills showed more autistic-like traits. These results suggest that the broader autism phenotype is associated with lower face recognition abilities, even among typically developing individuals.
A Reciprocal Model of Face Recognition and Autistic Traits: Evidence from an Individual Differences Perspective

PubMed Central

Halliday, Drew W. R.; MacDonald, Stuart W. S.; Sherf, Suzanne K.; Tanaka, James W.

2014-01-01

Although not a core symptom of the disorder, individuals with autism often exhibit selective impairments in their face processing abilities. Importantly, the reciprocal connection between autistic traits and face perception has rarely been examined within the typically developing population. In this study, university participants from the social sciences, physical sciences, and humanities completed a battery of measures that assessed face, object and emotion recognition abilities, general perceptual-cognitive style, and sub-clinical autistic traits (the Autism Quotient (AQ)). We employed separate hierarchical multiple regression analyses to evaluate which factors could predict face recognition scores and AQ scores. Gender, object recognition performance, and AQ scores predicted face recognition behaviour. Specifically, males, individuals with more autistic traits, and those with lower object recognition scores performed more poorly on the face recognition test. Conversely, university major, gender and face recognition performance reliably predicted AQ scores. Science majors, males, and individuals with poor face recognition skills showed more autistic-like traits. These results suggest that the broader autism phenotype is associated with lower face recognition abilities, even among typically developing individuals. PMID:24853862
Body-wide anatomy recognition in PET/CT images

NASA Astrophysics Data System (ADS)

Wang, Huiqian; Udupa, Jayaram K.; Odhner, Dewey; Tong, Yubing; Zhao, Liming; Torigian, Drew A.

2015-03-01

With the rapid growth of positron emission tomography/computed tomography (PET/CT)-based medical applications, body-wide anatomy recognition on whole-body PET/CT images becomes crucial for quantifying body-wide disease burden. This, however, is a challenging problem and seldom studied due to unclear anatomy reference frame and low spatial resolution of PET images as well as low contrast and spatial resolution of the associated low-dose CT images. We previously developed an automatic anatomy recognition (AAR) system [15] whose applicability was demonstrated on diagnostic computed tomography (CT) and magnetic resonance (MR) images in different body regions on 35 objects. The aim of the present work is to investigate strategies for adapting the previous AAR system to low-dose CT and PET images toward automated body-wide disease quantification. Our adaptation of the previous AAR methodology to PET/CT images in this paper focuses on 16 objects in three body regions - thorax, abdomen, and pelvis - and consists of the following steps: collecting whole-body PET/CT images from existing patient image databases, delineating all objects in these images, modifying the previous hierarchical models built from diagnostic CT images to account for differences in appearance in low-dose CT and PET images, automatically locating objects in these images following object hierarchy, and evaluating performance. Our preliminary evaluations indicate that the performance of the AAR approach on low-dose CT images achieves object localization accuracy within about 2 voxels, which is comparable to the accuracies achieved on diagnostic contrast-enhanced CT images. Object recognition on low-dose CT images from PET/CT examinations without requiring diagnostic contrast-enhanced CT seems feasible.

Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition

PubMed Central

Cadieu, Charles F.; Hong, Ha; Yamins, Daniel L. K.; Pinto, Nicolas; Ardila, Diego; Solomon, Ethan A.; Majaj, Najib J.; DiCarlo, James J.

2014-01-01

The primate visual system achieves remarkable visual object recognition performance even in brief presentations, and under changes to object exemplar, geometric transformations, and background variation (a.k.a. core visual object recognition). This remarkable performance is mediated by the representation formed in inferior temporal (IT) cortex. In parallel, recent advances in machine learning have led to ever higher performing models of object recognition using artificial deep neural networks (DNNs). It remains unclear, however, whether the representational performance of DNNs rivals that of the brain. To accurately produce such a comparison, a major difficulty has been a unifying metric that accounts for experimental limitations, such as the amount of noise, the number of neural recording sites, and the number of trials, and computational limitations, such as the complexity of the decoding classifier and the number of classifier training examples. In this work, we perform a direct comparison that corrects for these experimental limitations and computational considerations. As part of our methodology, we propose an extension of “kernel analysis” that measures the generalization accuracy as a function of representational complexity. Our evaluations show that, unlike previous bio-inspired models, the latest DNNs rival the representational performance of IT cortex on this visual object recognition task. Furthermore, we show that models that perform well on measures of representational performance also perform well on measures of representational similarity to IT, and on measures of predicting individual IT multi-unit responses. Whether these DNNs rely on computational mechanisms similar to the primate visual system is yet to be determined, but, unlike all previous bio-inspired models, that possibility cannot be ruled out merely on representational performance grounds. PMID:25521294
Multilevel depth and image fusion for human activity detection.

PubMed

Ni, Bingbing; Pei, Yong; Moulin, Pierre; Yan, Shuicheng

2013-10-01

Recognizing complex human activities usually requires the detection and modeling of individual visual features and the interactions between them. Current methods only rely on the visual features extracted from 2-D images, and therefore often lead to unreliable salient visual feature detection and inaccurate modeling of the interaction context between individual features. In this paper, we show that these problems can be addressed by combining data from a conventional camera and a depth sensor (e.g., Microsoft Kinect). We propose a novel complex activity recognition and localization framework that effectively fuses information from both grayscale and depth image channels at multiple levels of the video processing pipeline. In the individual visual feature detection level, depth-based filters are applied to the detected human/object rectangles to remove false detections. In the next level of interaction modeling, 3-D spatial and temporal contexts among human subjects or objects are extracted by integrating information from both grayscale and depth images. Depth information is also utilized to distinguish different types of indoor scenes. Finally, a latent structural model is developed to integrate the information from multiple levels of video processing for an activity detection. Extensive experiments on two activity recognition benchmarks (one with depth information) and a challenging grayscale + depth human activity database that contains complex interactions between human-human, human-object, and human-surroundings demonstrate the effectiveness of the proposed multilevel grayscale + depth fusion scheme. Higher recognition and localization accuracies are obtained relative to the previous methods.
Rapid Target Detection in High Resolution Remote Sensing Images Using Yolo Model

NASA Astrophysics Data System (ADS)

Wu, Z.; Chen, X.; Gao, Y.; Li, Y.

2018-04-01

Object detection in high resolution remote sensing images is a fundamental and challenging problem in the field of remote sensing imagery analysis for civil and military applications, because complex neighboring environments can cause recognition algorithms to mistake irrelevant ground objects for target objects. Deep convolutional neural networks (DCNNs) are a research hotspot in object detection because of their powerful feature extraction ability, and they have achieved state-of-the-art results in computer vision. The common DCNN-based object detection pipeline consists of region proposal, CNN feature extraction, region classification, and post-processing. The YOLO model frames object detection as a regression problem: a single CNN predicts bounding boxes and class probabilities end to end, which makes prediction faster. In this paper, a YOLO-based model is used for object detection in high resolution remote sensing images. Experiments on the NWPU VHR-10 dataset and on our airport/airplane dataset collected from Google Earth show that, compared with the common pipeline, the proposed model speeds up the detection process and achieves good accuracy.
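To make the "detection as regression" idea above concrete, the sketch below decodes one grid-cell prediction into an image-space box, following the parameterization of the original YOLO formulation (box center as an offset within the cell, width and height relative to the whole image, class score as confidence times class probability). The function name, grid size, and numbers are illustrative assumptions; the abstract does not state which YOLO variant or settings the authors used.

```python
def decode_cell(row, col, grid_size, pred, class_probs, img_w, img_h):
    """Convert one YOLO-style cell prediction to (box, class index, score).

    pred = (x_off, y_off, w_rel, h_rel, conf), where x_off/y_off are the
    box-center offsets inside the cell (0..1) and w_rel/h_rel are fractions
    of the full image size.
    """
    x_off, y_off, w_rel, h_rel, conf = pred
    # Box center in pixels: cell index plus in-cell offset, scaled to the image.
    cx = (col + x_off) / grid_size * img_w
    cy = (row + y_off) / grid_size * img_h
    w = w_rel * img_w
    h = h_rel * img_h
    # Most probable class and its class-specific confidence score.
    best = max(range(len(class_probs)), key=lambda i: class_probs[i])
    score = conf * class_probs[best]
    box = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
    return box, best, score


# Hypothetical prediction from cell (row 3, col 2) of a 7x7 grid on a 448x448 image.
box, cls, score = decode_cell(
    row=3, col=2, grid_size=7,
    pred=(0.5, 0.5, 0.2, 0.1, 0.9),
    class_probs=[0.1, 0.8, 0.1],
    img_w=448, img_h=448,
)
```

Because every cell is decoded with the same arithmetic from a single network output, no separate region proposal stage is needed, which is where the speed advantage over the common pipeline comes from.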
What Three-Year-Olds Remember from Their Past: Long-Term Memory for Persons, Objects, and Actions

ERIC Educational Resources Information Center

Hirte, Monika; Graf, Frauke; Kim, Ziyon; Knopf, Monika

2017-01-01

From birth on, infants show long-term recognition memory for persons. Furthermore, infants from six months onwards are able to store and retrieve demonstrated actions over long-term intervals in deferred imitation tasks. Thus, information about the model demonstrating the object-related actions is stored and recognition memory for the objects as…
Objective Auscultation of TCM Based on Wavelet Packet Fractal Dimension and Support Vector Machine.

PubMed

Yan, Jian-Jun; Guo, Rui; Wang, Yi-Qin; Liu, Guo-Ping; Yan, Hai-Xia; Xia, Chun-Ming; Shen, Xiaojing

2014-01-01

This study was conducted to illustrate that auscultation features based on the fractal dimension combined with wavelet packet transform (WPT) were conducive to the identification of the pattern of syndromes of Traditional Chinese Medicine (TCM). The WPT and the fractal dimension were employed to extract features of auscultation signals of 137 patients with lung Qi-deficient pattern, 49 patients with lung Yin-deficient pattern, and 43 healthy subjects. With these features, the classification model was constructed based on multiclass support vector machine (SVM). When all auscultation signals were trained by SVM to decide the patterns of TCM syndromes, the overall recognition rate of the model was 79.49%; when male and female auscultation signals were trained, respectively, to decide the patterns, the overall recognition rate of the model reached 86.05%. The results showed that the methods proposed in this paper were effective for analyzing auscultation signals, and the performance of the model can be greatly improved when the distinction of gender was considered.
Objective Auscultation of TCM Based on Wavelet Packet Fractal Dimension and Support Vector Machine

PubMed Central

Yan, Jian-Jun; Wang, Yi-Qin; Liu, Guo-Ping; Yan, Hai-Xia; Xia, Chun-Ming; Shen, Xiaojing

2014-01-01

This study was conducted to illustrate that auscultation features based on the fractal dimension combined with wavelet packet transform (WPT) were conducive to the identification of the pattern of syndromes of Traditional Chinese Medicine (TCM). The WPT and the fractal dimension were employed to extract features of auscultation signals of 137 patients with lung Qi-deficient pattern, 49 patients with lung Yin-deficient pattern, and 43 healthy subjects. With these features, the classification model was constructed based on multiclass support vector machine (SVM). When all auscultation signals were trained by SVM to decide the patterns of TCM syndromes, the overall recognition rate of the model was 79.49%; when male and female auscultation signals were trained, respectively, to decide the patterns, the overall recognition rate of the model reached 86.05%. The results showed that the methods proposed in this paper were effective for analyzing auscultation signals, and the performance of the model can be greatly improved when the distinction of gender was considered. PMID:24883068
Detailed 3D representations for object recognition and modeling.

PubMed

Zia, M Zeeshan; Stark, Michael; Schiele, Bernt; Schindler, Konrad

2013-11-01

Geometric 3D reasoning at the level of objects has received renewed attention recently in the context of visual scene understanding. The level of geometric detail, however, is typically limited to qualitative representations or coarse boxes. This is linked to the fact that today's object class detectors are tuned toward robust 2D matching rather than accurate 3D geometry, encouraged by bounding-box-based benchmarks such as Pascal VOC. In this paper, we revisit ideas from the early days of computer vision, namely, detailed, 3D geometric object class representations for recognition. These representations can recover geometrically far more accurate object hypotheses than just bounding boxes, including continuous estimates of object pose and 3D wireframes with relative 3D positions of object parts. In combination with robust techniques for shape description and inference, we outperform state-of-the-art results in monocular 3D pose estimation. In a series of experiments, we analyze our approach in detail and demonstrate novel applications enabled by such an object class representation, such as fine-grained categorization of cars and bicycles, according to their 3D geometry, and ultrawide baseline matching.
An Effective 3D Shape Descriptor for Object Recognition with RGB-D Sensors

PubMed Central

Liu, Zhong; Zhao, Changchen; Wu, Xingming; Chen, Weihai

2017-01-01

RGB-D sensors have been widely used in various areas of computer vision and graphics. A good descriptor will effectively improve the performance of operation. This article further analyzes the recognition performance of shape features extracted from multi-modality source data using RGB-D sensors. A hybrid shape descriptor is proposed as a representation of objects for recognition. We first extracted five 2D shape features from contour-based images and five 3D shape features over point cloud data to capture the global and local shape characteristics of an object. The recognition performance was tested for category recognition and instance recognition. Experimental results show that the proposed shape descriptor outperforms several common global-to-global shape descriptors and is comparable to some partial-to-global shape descriptors that achieved the best accuracies in category and instance recognition. Contribution of partial features and computational complexity were also analyzed. The results indicate that the proposed shape features are strong cues for object recognition and can be combined with other features to boost accuracy. PMID:28245553
Data-driven indexing mechanism for the recognition of polyhedral objects

NASA Astrophysics Data System (ADS)

McLean, Stewart; Horan, Peter; Caelli, Terry M.

1992-02-01

This paper is concerned with the problem of searching large model databases. To date, most object recognition systems have concentrated on the problem of matching using simple searching algorithms. This is quite acceptable when the number of object models is small. However, in the future, general purpose computer vision systems will be required to recognize hundreds or perhaps thousands of objects and, in such circumstances, efficient searching algorithms will be needed. The problem of searching a large model database is one which must be addressed if future computer vision systems are to be at all effective. In this paper we present a method we call data-driven feature-indexed hypothesis generation as one solution to the problem of searching large model databases.
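The general idea of feature-indexed hypothesis generation mentioned above can be sketched with an inverted index: each feature points to the models that contain it, so observed features vote for candidate models without scanning the whole database. The feature encoding (a kind label plus a count), the model names, and the voting rule are hypothetical illustrations, not the authors' actual indexing mechanism.

```python
from collections import Counter, defaultdict


def build_index(models):
    """Map each feature to the set of model names that contain it."""
    index = defaultdict(set)
    for name, features in models.items():
        for feature in features:
            index[feature].add(name)
    return index


def generate_hypotheses(index, observed, top_k=2):
    """Let each observed feature vote for the models indexed under it."""
    votes = Counter()
    for feature in observed:
        for name in index.get(feature, ()):
            votes[name] += 1
    # Only models sharing at least one feature with the scene are considered.
    return votes.most_common(top_k)


# Hypothetical polyhedral models described by (feature kind, size) pairs,
# e.g. ("face", 4) for a four-sided face, ("vertex", 3) for a 3-edge vertex.
models = {
    "cube": {("face", 4), ("vertex", 3)},
    "tetrahedron": {("face", 3), ("vertex", 3)},
    "prism": {("face", 3), ("face", 4), ("vertex", 3)},
}

index = build_index(models)
hyps = generate_hypotheses(index, [("face", 4), ("vertex", 3)])
```

With these inputs the two best hypotheses are `cube` and `prism`, each with two votes. Lookup cost depends on the number of observed features and the models indexed under them rather than on the total database size, which is the property that matters when the database holds hundreds or thousands of models.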
On the road to invariant recognition: explaining tradeoff and morph properties of cells in inferotemporal cortex using multiple-scale task-sensitive attentive learning.

PubMed

Grossberg, Stephen; Markowitz, Jeffrey; Cao, Yongqiang

2011-12-01

Visual object recognition is an essential accomplishment of advanced brains. Object recognition needs to be tolerant, or invariant, with respect to changes in object position, size, and view. In monkeys and humans, a key area for recognition is the anterior inferotemporal cortex (ITa). Recent neurophysiological data show that ITa cells with high object selectivity often have low position tolerance. We propose a neural model whose cells learn to simulate this tradeoff, as well as ITa responses to image morphs, while explaining how invariant recognition properties may arise in stages due to processes across multiple cortical areas. These processes include the cortical magnification factor, multiple receptive field sizes, and top-down attentive matching and learning properties that may be tuned by task requirements to attend to either concrete or abstract visual features with different levels of vigilance. The model predicts that data from the tradeoff and image morph tasks emerge from different levels of vigilance in the animals performing them. This result illustrates how different vigilance requirements of a task may change the course of category learning, notably the critical features that are attended and incorporated into learned category prototypes. The model outlines a path for developing an animal model of how defective vigilance control can lead to symptoms of various mental disorders, such as autism and amnesia. Copyright © 2011 Elsevier Ltd. All rights reserved.
A rodent model for the study of invariant visual object recognition

PubMed Central

Zoccolan, Davide; Oertelt, Nadja; DiCarlo, James J.; Cox, David D.

2009-01-01

The human visual system is able to recognize objects despite tremendous variation in their appearance on the retina resulting from variation in view, size, lighting, etc. This ability—known as “invariant” object recognition—is central to visual perception, yet its computational underpinnings are poorly understood. Traditionally, nonhuman primates have been the animal model-of-choice for investigating the neuronal substrates of invariant recognition, because their visual systems closely mirror our own. Meanwhile, simpler and more accessible animal models such as rodents have been largely overlooked as possible models of higher-level visual functions, because their brains are often assumed to lack advanced visual processing machinery. As a result, little is known about rodents' ability to process complex visual stimuli in the face of real-world image variation. In the present work, we show that rats possess more advanced visual abilities than previously appreciated. Specifically, we trained pigmented rats to perform a visual task that required them to recognize objects despite substantial variation in their appearance, due to changes in size, view, and lighting. Critically, rats were able to spontaneously generalize to previously unseen transformations of learned objects. These results provide the first systematic evidence for invariant object recognition in rats and argue for an increased focus on rodents as models for studying high-level visual processing. PMID:19429704
Research on improving image recognition robustness by combining multiple features with associative memory

NASA Astrophysics Data System (ADS)

Guo, Dongwei; Wang, Zhe

2018-05-01

Convolutional neural networks (CNNs) have achieved great success in computer vision: they can learn hierarchical representations from raw pixels and perform outstandingly in various image recognition tasks [1]. However, CNNs are easily fooled: it is possible to produce images that are totally unrecognizable to human eyes but that CNNs believe with near certainty to be familiar objects [2]. In this paper, an associative memory model based on multiple features is proposed. Within this model, feature extraction and classification are carried out by a CNN, t-SNE, and an exponential bidirectional associative memory neural network (EBAM). The geometric features extracted by the CNN and the digital features extracted by t-SNE are associated by the EBAM. Thus we ensure robust recognition through a comprehensive assessment of the two features. With our model, the error rate on such fraudulent data is only 8%. Strong robustness is extremely important in systems that require a high safety factor and in some key areas; if the robustness of image recognition can be ensured, network security will be greatly improved and social production efficiency will be greatly enhanced.
How landmark suitability shapes recognition memory signals for objects in the medial temporal lobes.

PubMed

Martin, Chris B; Sullivan, Jacqueline A; Wright, Jessey; Köhler, Stefan

2018-02-01

A role of perirhinal cortex (PrC) in recognition memory for objects has been well established. Contributions of parahippocampal cortex (PhC) to this function, while documented, remain less well understood. Here, we used fMRI to examine whether the organization of item-based recognition memory signals across these two structures is shaped by object category, independent of any difference in representing episodic context. Guided by research suggesting that PhC plays a critical role in processing landmarks, we focused on three categories of objects that differ from each other in their landmark suitability as confirmed with behavioral ratings (buildings > trees > aircraft). Participants made item-based recognition-memory decisions for novel and previously studied objects from these categories, which were matched in accuracy. Multi-voxel pattern classification revealed category-specific item-recognition memory signals along the long axis of PrC and PhC, with no sharp functional boundaries between these structures. Memory signals for buildings were observed in the mid to posterior extent of PhC, signals for trees in anterior to posterior segments of PhC, and signals for aircraft in mid to posterior aspects of PrC and the anterior extent of PhC. Notably, item-based memory signals for the category with highest landmark suitability ratings were observed only in those posterior segments of PhC that also allowed for classification of landmark suitability of objects when memory status was held constant. These findings provide new evidence in support of the notion that item-based memory signals for objects are not limited to PrC, and that the organization of these signals along the longitudinal axis that crosses PrC and PhC can be captured with reference to landmark suitability. Copyright © 2017 Elsevier Inc. All rights reserved.
Tc1 mouse model of trisomy-21 dissociates properties of short- and long-term recognition memory.

PubMed

Hall, Jessica H; Wiseman, Frances K; Fisher, Elizabeth M C; Tybulewicz, Victor L J; Harwood, John L; Good, Mark A

2016-04-01

The present study examined memory function in Tc1 mice, a transchromosomic model of Down syndrome (DS). Tc1 mice demonstrated an unusual delay-dependent deficit in recognition memory. More specifically, Tc1 mice showed intact immediate (30-sec), impaired short-term (10-min) and intact long-term (24-h) memory for objects. A similar pattern was observed for olfactory stimuli, confirming the generality of the pattern across sensory modalities. The specificity of the behavioural deficits in Tc1 mice was confirmed using APP-overexpressing mice that showed the opposite pattern of object memory deficits. In contrast to object memory, Tc1 mice showed no deficit in either immediate or long-term memory for object-in-place information. Similarly, Tc1 mice showed no deficit in short-term memory for object-location information. The latter result indicates that Tc1 mice were able to detect and react to spatial novelty at the same delay interval that was sensitive to an object novelty recognition impairment. These results demonstrate that (1) novelty detection per se and (2) the encoding of visuo-spatial information were not disrupted in adult Tc1 mice. The authors conclude that the task-specific nature of the short-term recognition memory deficit suggests that the trisomy of genes on human chromosome 21 in Tc1 mice impacts on (perirhinal) cortical systems supporting short-term object and olfactory recognition memory. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Exploiting core knowledge for visual object recognition.

PubMed

Schurgin, Mark W; Flombaum, Jonathan I

2017-03-01

Humans recognize thousands of objects, and with relative tolerance to variable retinal inputs. The acquisition of this ability is not fully understood, and it remains an area in which artificial systems have yet to surpass people. We sought to investigate the memory process that supports object recognition. Specifically, we investigated the association of inputs that co-occur over short periods of time. We tested the hypothesis that human perception exploits expectations about object kinematics to limit the scope of association to inputs that are likely to have the same token as a source. In several experiments we exposed participants to images of objects, and we then tested recognition sensitivity. Using motion, we manipulated whether successive encounters with an image took place through kinematics that implied the same or a different token as the source of those encounters. Images were injected with noise, or shown at varying orientations, and we included 2 manipulations of motion kinematics. Across all experiments, memory performance was better for images that had been previously encountered with kinematics that implied a single token. A model-based analysis similarly showed greater memory strength when images were shown via kinematics that implied a single token. These results suggest that constraints from physics are built into the mechanisms that support memory about objects. Such constraints, often characterized as 'Core Knowledge', are known to support perception and cognition broadly, even in young infants. But they have never been considered as a mechanism for memory with respect to recognition. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Speckle-learning-based object recognition through scattering media.

PubMed

Ando, Takamasa; Horisaki, Ryoichi; Tanida, Jun

2015-12-28

We experimentally demonstrated object recognition through scattering media based on direct machine learning of a number of speckle intensity images. In the experiments, speckle intensity images of amplitude or phase objects on a spatial light modulator between scattering plates were captured by a camera. We used the support vector machine for binary classification of the captured speckle intensity images of face and non-face data. The experimental results showed that speckles are sufficient for machine learning.
The Development of Adaptive Decision Making: Recognition-Based Inference in Children and Adolescents

ERIC Educational Resources Information Center

Horn, Sebastian S.; Ruggeri, Azzurra; Pachur, Thorsten

2016-01-01

Judgments about objects in the world are often based on probabilistic information (or cues). A frugal judgment strategy that utilizes memory (i.e., the ability to discriminate between known and unknown objects) as a cue for inference is the recognition heuristic (RH). The usefulness of the RH depends on the structure of the environment,…
Real-time object recognition in multidimensional images based on joined extended structural tensor and higher-order tensor decomposition methods

NASA Astrophysics Data System (ADS)

Cyganek, Boguslaw; Smolka, Bogdan

2015-02-01

In this paper a system for real-time recognition of objects in multidimensional video signals is proposed. Object recognition is done by pattern projection into the tensor subspaces obtained from the factorization of the signal tensors representing the input signal. However, instead of taking only the intensity signal, the novelty of this paper is to first build the Extended Structural Tensor representation from the intensity signal, which conveys information on signal intensities as well as on higher-order statistics of the input signals. This way the higher-order input pattern tensors are built from the training samples. Then, the tensor subspaces are built based on the Higher-Order Singular Value Decomposition of the prototype pattern tensors. Finally, recognition relies on measurements of the distance of a test pattern projected into the tensor subspaces obtained from the training tensors. Due to the high dimensionality of the input data, tensor-based methods require high memory and computational resources. However, recent achievements in the technology of multi-core microprocessors and graphics cards allow real-time operation of the multidimensional methods, as is shown and analyzed in this paper based on real examples of object detection in digital images.
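The Higher-Order Singular Value Decomposition mentioned in the record above can be sketched in a few lines of NumPy. This is a generic textbook HOSVD (one SVD per mode unfolding, followed by projection of the tensor onto the factor matrices), not the authors' implementation; the helper names unfold, mode_dot, hosvd and reconstruct are assumptions made for the example, and the Extended Structural Tensor construction and the classifier are not shown.

```python
import numpy as np

def unfold(t, mode):
    """Matricize tensor t so that the chosen mode indexes the rows."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def mode_dot(t, m, mode):
    """Multiply tensor t by matrix m (shape J x I_mode) along one mode."""
    out = np.tensordot(m, t, axes=(1, mode))  # contracted mode ends up first
    return np.moveaxis(out, 0, mode)

def hosvd(t, ranks):
    """Return (core, factors) with one orthonormal factor matrix per mode."""
    factors = []
    for mode, r in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(t, mode), full_matrices=False)
        factors.append(u[:, :r])  # keep the r leading left singular vectors
    core = t
    for mode, u in enumerate(factors):
        core = mode_dot(core, u.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Map the core tensor back to the original space."""
    t = core
    for mode, u in enumerate(factors):
        t = mode_dot(t, u, mode)
    return t

# Toy data (assumption): a random 4 x 5 x 6 tensor standing in for patterns.
rng = np.random.default_rng(0)
tensor = rng.normal(size=(4, 5, 6))
core, factors = hosvd(tensor, ranks=tensor.shape)
full_rank_error = float(np.linalg.norm(reconstruct(core, factors) - tensor))
```

Passing smaller values in ranks truncates each factor matrix and yields a lower-dimensional subspace, at the cost of a nonzero reconstruction error.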
A Novel Locally Linear KNN Method With Applications to Visual Recognition.

PubMed

Liu, Qingfeng; Liu, Chengjun

2017-09-01

A locally linear K Nearest Neighbor (LLK) method is presented in this paper with applications to robust visual recognition. Specifically, the concept of an ideal representation is first presented, which improves upon the traditional sparse representation in many ways. The objective function based on a host of criteria for sparsity, locality, and reconstruction is then optimized to derive a novel representation, which is an approximation to the ideal representation. The novel representation is further processed by two classifiers, namely, an LLK-based classifier and a locally linear nearest mean-based classifier, for visual recognition. The proposed classifiers are shown to connect to the Bayes decision rule for minimum error. Additional new theoretical analysis is presented, such as the nonnegative constraint, the group regularization, and the computational efficiency of the proposed LLK method. New methods such as a shifted power transformation for improving reliability, a coefficients' truncating method for enhancing generalization, and an improved marginal Fisher analysis method for feature extraction are proposed to further improve visual recognition performance. Extensive experiments are implemented to evaluate the proposed LLK method for robust visual recognition. In particular, eight representative data sets are applied for assessing the performance of the LLK method for various visual recognition applications, such as action recognition, scene recognition, object recognition, and face recognition.
Invariant visual object recognition and shape processing in rats

PubMed Central

Zoccolan, Davide

2015-01-01

Invariant visual object recognition is the ability to recognize visual objects despite the vastly different images that each object can project onto the retina during natural vision, depending on its position and size within the visual field, its orientation relative to the viewer, etc. Achieving invariant recognition represents such a formidable computational challenge that it is often assumed to be a unique hallmark of primate vision. Historically, this has limited the invasive investigation of its neuronal underpinnings to monkey studies, in spite of the narrow range of experimental approaches that these animal models allow. Meanwhile, rodents have been largely neglected as models of object vision, because of the widespread belief that they are incapable of advanced visual processing. However, the powerful array of experimental tools that have been developed to dissect neuronal circuits in rodents has made these species very attractive to vision scientists too, promoting a new tide of studies that have started to systematically explore visual functions in rats and mice. Rats, in particular, have been the subjects of several behavioral studies, aimed at assessing how advanced object recognition and shape processing is in this species. Here, I review these recent investigations, as well as earlier studies of rat pattern vision, to provide an historical overview and a critical summary of the status of the knowledge about rat object vision. The picture emerging from this survey is very encouraging with regard to the possibility of using rats as complementary models to monkeys in the study of higher-level vision. PMID:25561421

Assessment of accuracy and recognition of three-dimensional computerized forensic craniofacial reconstruction.

PubMed

Miranda, Geraldo Elias; Wilkinson, Caroline; Roughley, Mark; Beaini, Thiago Leite; Melani, Rodolfo Francisco Haltenhoff

2018-01-01

Facial reconstruction is a technique that aims to reproduce the individual facial characteristics based on interpretation of the skull, with the objective of recognition leading to identification. The aim of this paper was to evaluate the accuracy and recognition level of three-dimensional (3D) computerized forensic craniofacial reconstruction (CCFR) performed in a blind test on open-source software using computed tomography (CT) data from live subjects. Four CCFRs were produced by one of the researchers, who was provided with information concerning the age, sex, and ethnic group of each subject. The CCFRs were produced using Blender® with 3D models obtained from the CT data and templates from the MakeHuman® program. The evaluation of accuracy was carried out in CloudCompare, by geometric comparison of the CCFR to the subject 3D face model (obtained from the CT data). A recognition level was performed using the Picasa® recognition tool with a frontal standardized photography, images of the subject CT face model and the CCFR. Soft-tissue depth and nose, ears and mouth were based on published data, observing Brazilian facial parameters. The results were presented from all the points that form the CCFR model, with an average for each comparison between 63% and 74% with a distance -2.5 ≤ x ≤ 2.5 mm from the skin surface. The average distances were 1.66 to 0.33 mm and greater distances were observed around the eyes, cheeks, mental and zygomatic regions. Two of the four CCFRs were correctly matched by the Picasa® tool. Free software programs are capable of producing 3D CCFRs with plausible levels of accuracy and recognition and therefore indicate their value for use in forensic applications.
Assessment of accuracy and recognition of three-dimensional computerized forensic craniofacial reconstruction

PubMed Central

Wilkinson, Caroline; Roughley, Mark; Beaini, Thiago Leite; Melani, Rodolfo Francisco Haltenhoff

2018-01-01

Facial reconstruction is a technique that aims to reproduce the individual facial characteristics based on interpretation of the skull, with the objective of recognition leading to identification. The aim of this paper was to evaluate the accuracy and recognition level of three-dimensional (3D) computerized forensic craniofacial reconstruction (CCFR) performed in a blind test on open-source software using computed tomography (CT) data from live subjects. Four CCFRs were produced by one of the researchers, who was provided with information concerning the age, sex, and ethnic group of each subject. The CCFRs were produced using Blender® with 3D models obtained from the CT data and templates from the MakeHuman® program. The evaluation of accuracy was carried out in CloudCompare, by geometric comparison of the CCFR to the subject 3D face model (obtained from the CT data). A recognition level was performed using the Picasa® recognition tool with a frontal standardized photography, images of the subject CT face model and the CCFR. Soft-tissue depth and nose, ears and mouth were based on published data, observing Brazilian facial parameters. The results were presented from all the points that form the CCFR model, with an average for each comparison between 63% and 74% with a distance -2.5 ≤ x ≤ 2.5 mm from the skin surface. The average distances were 1.66 to 0.33 mm and greater distances were observed around the eyes, cheeks, mental and zygomatic regions. Two of the four CCFRs were correctly matched by the Picasa® tool. Free software programs are capable of producing 3D CCFRs with plausible levels of accuracy and recognition and therefore indicate their value for use in forensic applications. PMID:29718983
Bibliography of In-House and Contract Reports, Supplement 18

DTIC Science & Technology

1992-10-01

Transparent Conforming Overlays 46 TITLE REPORT NO. YEAR Development, Service Tests, and Production Model 1307 -TR 1953 Tests, Autofocusing Rectifier...Development, Test, Preparation, Delivery, and ETL- 1307 1982 Installation of Algorithms for Optimal Adjustment of Inertial Survey Data Developmental Optical...B: Terrain ETL- 0428 1986 and Object Modeling Recognition (March 13, 1985 - March 13, 1986) Knowledge-Based Vision Techniques - Task B: Terrain ETL
Seismic slope-performance analysis: from hazard map to decision support system

USGS Publications Warehouse

Miles, Scott B.; Keefer, David K.; Ho, Carlton L.

1999-01-01

In response to the growing recognition of engineers and decision-makers of the regional effects of earthquake-induced landslides, this paper presents a general approach to conducting seismic landslide zonation, based on the popular Newmark's sliding block analogy for modeling coherent landslides. Four existing models based on the sliding block analogy are compared. The comparison shows that the models forecast notably different levels of slope performance. Considering this discrepancy along with the limitations of static maps as a decision tool, a spatial decision support system (SDSS) for seismic landslide analysis is proposed, which will support investigations over multiple scales for any number of earthquake scenarios and input conditions. Most importantly, the SDSS will allow use of any seismic landslide analysis model and zonation approach. Developments associated with the SDSS will produce an object-oriented model for encapsulating spatial data, an object-oriented specification to allow construction of models using modular objects, and a direct-manipulation, dynamic user-interface that adapts to the particular seismic landslide model configuration.
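The sliding block analogy referred to in the record above can be illustrated with a minimal rigid-block integration: the block starts to slide once ground acceleration exceeds a critical acceleration and keeps sliding until its relative velocity returns to zero. The sketch below assumes one-directional (downslope) sliding, a constant critical acceleration, and a synthetic rectangular pulse as input; the function name newmark_displacement is hypothetical, and none of the four models compared in the paper is reproduced here.

```python
def newmark_displacement(accel, dt, a_crit):
    """Cumulative downslope displacement of a rigid sliding block.

    accel  : ground acceleration samples (same units as a_crit)
    dt     : time step between samples, in seconds
    a_crit : critical (yield) acceleration of the slope
    """
    vel = 0.0   # velocity of the block relative to the ground
    disp = 0.0  # accumulated relative displacement
    for a in accel:
        # Sliding continues while the block moves, or starts when a > a_crit.
        if vel > 0.0 or a > a_crit:
            vel_new = max(vel + (a - a_crit) * dt, 0.0)  # no upslope sliding
            disp += 0.5 * (vel + vel_new) * dt            # trapezoidal rule
            vel = vel_new
    return disp

# Synthetic input (assumption): a 0.5 s pulse at twice the critical
# acceleration, followed by 1.0 s of no shaking.
dt = 0.001
a_crit = 1.0
record = [2.0] * 500 + [0.0] * 1000
displacement = newmark_displacement(record, dt, a_crit)
```

For this pulse the block accelerates at a_crit for 0.5 s and decelerates at a_crit for another 0.5 s, so the analytical displacement is a_crit times 0.5 squared, which the numerical result should match closely.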
Development of novel tasks for studying view-invariant object recognition in rodents: Sensitivity to scopolamine.

PubMed

Mitchnick, Krista A; Wideman, Cassidy E; Huff, Andrew E; Palmer, Daniel; McNaughton, Bruce L; Winters, Boyer D

2018-05-15

The capacity to recognize objects from different view-points or angles, referred to as view-invariance, is an essential process that humans engage in daily. Currently, the ability to investigate the neurobiological underpinnings of this phenomenon is limited, as few ethologically valid view-invariant object recognition tasks exist for rodents. Here, we report two complementary, novel view-invariant object recognition tasks in which rodents physically interact with three-dimensional objects. Prior to experimentation, rats and mice were given extensive experience with a set of 'pre-exposure' objects. In a variant of the spontaneous object recognition task, novelty preference for pre-exposed or new objects was assessed at various angles of rotation (45°, 90° or 180°); unlike control rodents, for whom the objects were novel, rats and mice tested with pre-exposed objects did not discriminate between rotated and un-rotated objects in the choice phase, indicating substantial view-invariant object recognition. Secondly, using automated operant touchscreen chambers, rats were tested on pre-exposed or novel objects in a pairwise discrimination task, where the rewarded stimulus (S+) was rotated (180°) once rats had reached acquisition criterion; rats tested with pre-exposed objects re-acquired the pairwise discrimination following S+ rotation more effectively than those tested with new objects. Systemic scopolamine impaired performance on both tasks, suggesting involvement of acetylcholine at muscarinic receptors in view-invariant object processing. These tasks present novel means of studying the behavioral and neural bases of view-invariant object recognition in rodents. Copyright © 2018 Elsevier B.V. All rights reserved.
Get rich quick: the signal to respond procedure reveals the time course of semantic richness effects during visual word recognition.

PubMed

Hargreaves, Ian S; Pexman, Penny M

2014-05-01

According to several current frameworks, semantic processing involves an early influence of language-based information followed by later influences of object-based information (e.g., situated simulations; Santos, Chaigneau, Simmons, & Barsalou, 2011). In the present study we examined whether these predictions extend to the influence of semantic variables in visual word recognition. We investigated the time course of semantic richness effects in visual word recognition using a signal-to-respond (STR) paradigm fitted to a lexical decision (LDT) and a semantic categorization (SCT) task. We used linear mixed effects to examine the relative contributions of language-based (number of senses, ARC) and object-based (imageability, number of features, body-object interaction ratings) descriptions of semantic richness at four STR durations (75, 100, 200, and 400ms). Results showed an early influence of number of senses and ARC in the SCT. In both LDT and SCT, object-based effects were the last to influence participants' decision latencies. We interpret our results within a framework in which semantic processes are available to influence word recognition as a function of their availability over time, and of their relevance to task-specific demands. Copyright © 2014 Elsevier B.V. All rights reserved.
Single prolonged stress impairs social and object novelty recognition in rats.

PubMed

Eagle, Andrew L; Fitzpatrick, Chris J; Perrine, Shane A

2013-11-01

Posttraumatic stress disorder (PTSD) results from exposure to a traumatic event and manifests as re-experiencing, arousal, avoidance, and negative cognition/mood symptoms. Avoidant symptoms, as well as the newly defined negative cognitions/mood, are a serious complication leading to diminished interest in once important or positive activities, such as social interaction; however, the basis of these symptoms remains poorly understood. PTSD patients also exhibit impaired object and social recognition, which may underlie the avoidance and symptoms of negative cognition, such as social estrangement or diminished interest in activities. Previous studies have demonstrated that single prolonged stress (SPS) models PTSD phenotypes, including impairments in learning and memory. Therefore, it was hypothesized that SPS would impair social and object recognition memory. Male Sprague Dawley rats were exposed to SPS then tested in the social choice test (SCT) or novel object recognition test (NOR). These tests measure recognition of novelty over familiarity, a natural preference of rodents. Results show that SPS impaired preference for both social and object novelty. In addition, SPS impairment in social recognition may be caused by impaired behavioral flexibility, or an inability to shift behavior during the SCT. These results demonstrate that traumatic stress can impair social and object recognition memory, which may underlie certain avoidant symptoms or negative cognition in PTSD and be related to impaired behavioral flexibility. Copyright © 2013 Elsevier B.V. All rights reserved.
Perceptual Plasticity for Auditory Object Recognition

PubMed Central

Heald, Shannon L. M.; Van Hedger, Stephen C.; Nusbaum, Howard C.

2017-01-01

In our auditory environment, we rarely experience the exact acoustic waveform twice. This is especially true for communicative signals that have meaning for listeners. In speech and music, the acoustic signal changes as a function of the talker (or instrument), speaking (or playing) rate, and room acoustics, to name a few factors. Yet, despite this acoustic variability, we are able to recognize a sentence or melody as the same across various kinds of acoustic inputs and determine meaning based on listening goals, expectations, context, and experience. The recognition process relates acoustic signals to prior experience despite variability in signal-relevant and signal-irrelevant acoustic properties, some of which could be considered as “noise” in service of a recognition goal. However, some acoustic variability, if systematic, is lawful and can be exploited by listeners to aid in recognition. Perceivable changes in systematic variability can herald a need for listeners to reorganize perception and reorient their attention to more immediately signal-relevant cues. This view is not incorporated currently in many extant theories of auditory perception, which traditionally reduce psychological or neural representations of perceptual objects and the processes that act on them to static entities. While this reduction is likely done for the sake of empirical tractability, such a reduction may seriously distort the perceptual process to be modeled. We argue that perceptual representations, as well as the processes underlying perception, are dynamically determined by an interaction between the uncertainty of the auditory signal and constraints of context. This suggests that the process of auditory recognition is highly context-dependent in that the identity of a given auditory object may be intrinsically tied to its preceding context. 
To argue for the flexible neural and psychological updating of sound-to-meaning mappings across speech and music, we draw upon examples of perceptual categories that are thought to be highly stable. This framework suggests that the process of auditory recognition cannot be divorced from the short-term context in which an auditory object is presented. Implications for auditory category acquisition and extant models of auditory perception, both cognitive and neural, are discussed. PMID:28588524
Automatic pole-like object modeling via 3D part-based analysis of point cloud

NASA Astrophysics Data System (ADS)

He, Liu; Yang, Haoxiang; Huang, Yuchun

2016-10-01

Pole-like objects, including trees, lampposts and traffic signs, are an indispensable part of urban infrastructure. With the advance of vehicle-based laser scanning (VLS), massive point clouds of roadside urban areas are increasingly applied in 3D digital city modeling. Based on the property that different pole-like objects have various canopy parts and similar trunk parts, this paper proposes a 3D part-based shape analysis to robustly extract, identify and model pole-like objects. The proposed method includes 3D clustering and recognition of trunks, voxel growing, and part-based 3D modeling. After preprocessing, the trunk center is identified as the point that has a local density peak and the largest minimum inter-cluster distance. Starting from the trunk centers, the remaining points are iteratively assigned to the same center as their nearest point with higher density. To eliminate noisy points, the cluster border is refined by trimming boundary outliers. Then, candidate trunks are extracted based on the clustering results in three orthogonal planes by shape analysis. Voxel growing obtains the complete pole-like objects regardless of overlaying. Finally, the entire trunk, branch and crown parts are analyzed to obtain seven feature parameters. These parameters are used to model the three parts respectively and obtain a single part-assembled 3D model. The proposed method is tested using the VLS-based point cloud of Wuhan University, China. The point cloud includes many kinds of trees, lampposts and other pole-like posters under different occlusions and overlaying. Experimental results show that the proposed method can extract the exact attributes and model the roadside pole-like objects efficiently.
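The clustering step described in the record above (centers chosen as density peaks that lie far from any denser point, remaining points assigned to the cluster of their nearest denser neighbor) resembles the general density-peak clustering scheme. The sketch below is a generic 2D version of that scheme under stated assumptions: a Gaussian kernel density, a user-supplied cutoff d_c, and a fixed number of centers. The function name density_peak_clusters and the synthetic points are hypothetical, and the border trimming, voxel growing and part modeling of the paper are not shown.

```python
import numpy as np

def density_peak_clusters(points, d_c, n_centers):
    """Cluster points by density peaks; returns (labels, center_indices)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    # Local density from a Gaussian kernel (self-contribution removed).
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0
    order = np.argsort(-rho)  # indices sorted by decreasing density
    delta = np.zeros(n)       # distance to the nearest denser point
    nearest_higher = np.full(n, -1)
    delta[order[0]] = dist[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        nearest_higher[i] = j
    # Centers combine high density with a large distance to denser points.
    centers = np.argsort(-(rho * delta))[:n_centers]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_centers)
    for i in order:  # denser points are labeled before sparser ones
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels, centers

# Synthetic stand-in for two trunk cross-sections (assumption).
rng = np.random.default_rng(0)
trunk_a = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(30, 2))
trunk_b = rng.normal(loc=(5.0, 5.0), scale=0.3, size=(30, 2))
cloud = np.vstack([trunk_a, trunk_b])
labels, centers = density_peak_clusters(cloud, d_c=0.5, n_centers=2)
```

In practice the number of centers is usually read off a plot of density against distance rather than fixed in advance; fixing it here keeps the example short.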
Affective and contextual values modulate spatial frequency use in object recognition

PubMed Central

Caplette, Laurent; West, Gregory; Gomot, Marie; Gosselin, Frédéric; Wicker, Bruno

2014-01-01

Visual object recognition is of fundamental importance in our everyday interaction with the environment. Recent models of visual perception emphasize the role of top-down predictions facilitating object recognition via initial guesses that limit the number of object representations that need to be considered. Several results suggest that this rapid and efficient object processing relies on the early extraction and processing of low spatial frequencies (LSF). The present study aimed to investigate the SF content of visual object representations and its modulation by contextual and affective values of the perceived object during a picture-name verification task. Stimuli consisted of pictures of objects equalized in SF content and categorized as having low or high affective and contextual values. To access the SF content of stored visual representations of objects, SFs of each image were then randomly sampled on a trial-by-trial basis. Results reveal that intermediate SFs between 14 and 24 cycles per object (2.3–4 cycles per degree) are correlated with fast and accurate identification for all categories of objects. Moreover, there was a significant interaction between affective and contextual values over the SFs correlating with fast recognition. These results suggest that affective and contextual values of a visual object modulate the SF content of its internal representation, thus highlighting the flexibility of the visual recognition system. PMID:24904514
Running Improves Pattern Separation during Novel Object Recognition.

PubMed

Bolz, Leoni; Heigele, Stefanie; Bischofberger, Josef

2015-10-09

Running increases adult neurogenesis and improves pattern separation in various memory tasks including context fear conditioning or touch-screen based spatial learning. However, it is unknown whether pattern separation is improved in spontaneous behavior, not emotionally biased by positive or negative reinforcement. Here we investigated the effect of voluntary running on pattern separation during novel object recognition in mice using relatively similar or substantially different objects. We show that running increases hippocampal neurogenesis but does not affect object recognition memory with 1.5 h delay after sample phase. By contrast, at 24 h delay, running significantly improves recognition memory for similar objects, whereas highly different objects can be distinguished by both running and sedentary mice. These data show that physical exercise improves pattern separation, independent of negative or positive reinforcement. In sedentary mice there is a pronounced temporal gradient for remembering object details. In running mice, however, increased neurogenesis improves hippocampal coding and temporally preserves distinction of novel objects from familiar ones.
Recurrent Convolutional Neural Networks: A Better Model of Biological Object Recognition.

PubMed

Spoerer, Courtney J; McClure, Patrick; Kriegeskorte, Nikolaus

2017-01-01

Feedforward neural networks provide the dominant model of how the brain performs visual object recognition. However, these networks lack the lateral and feedback connections, and the resulting recurrent neuronal dynamics, of the ventral visual pathway in the human and non-human primate brain. Here we investigate recurrent convolutional neural networks with bottom-up (B), lateral (L), and top-down (T) connections. Combining these types of connections yields four architectures (B, BT, BL, and BLT), which we systematically test and compare. We hypothesized that recurrent dynamics might improve recognition performance in the challenging scenario of partial occlusion. We introduce two novel occluded object recognition tasks to test the efficacy of the models, digit clutter (where multiple target digits occlude one another) and digit debris (where target digits are occluded by digit fragments). We find that recurrent neural networks outperform feedforward control models (approximately matched in parametric complexity) at recognizing objects, both in the absence of occlusion and in all occlusion conditions. Recurrent networks were also found to be more robust to the inclusion of additive Gaussian noise. Recurrent neural networks are better in two respects: (1) they are more neurobiologically realistic than their feedforward counterparts; (2) they are better in terms of their ability to recognize objects, especially under challenging conditions. This work shows that computer vision can benefit from using recurrent convolutional architectures and suggests that the ubiquitous recurrent connections in biological brains are essential for task performance.
Comparative Study on Interaction of Form and Motion Processing Streams by Applying Two Different Classifiers in Mechanism for Recognition of Biological Movement

PubMed Central

2014-01-01

Research on psychophysics, neurophysiology, and functional imaging shows a particular representation of biological movements which contains two pathways. The visual perception of biological movements is formed through the dorsal and ventral processing streams of the visual system. The ventral processing stream is associated with the extraction of form information; the dorsal processing stream, on the other hand, provides motion information. The active basic model (ABM), as a hierarchical representation of the human object, introduced novelty in the form pathway by applying a Gabor-based supervised object recognition method. It creates more biological plausibility along with similarity to the original model. A fuzzy inference system is used for motion pattern information in the motion pathway, creating more robustness in the recognition process. Besides, the interaction of these paths is intriguing, and many studies in various fields have considered it. Here, the interaction of the pathways to obtain more appropriate results has been investigated. The extreme learning machine (ELM) has been employed as the classification unit of this model, because it has the main properties of artificial neural networks while substantially reducing the difficulty of training time. Here, two different configurations, interactions using a synergetic neural network and ELM, are compared in terms of accuracy and compatibility. PMID:25276860
Remembering the snake in the grass: Threat enhances recognition but not source memory.

PubMed

Meyer, Miriam Magdalena; Bell, Raoul; Buchner, Axel

2015-12-01

Research on the influence of emotion on source memory has yielded inconsistent findings. The object-based framework (Mather, 2007) predicts that negatively arousing stimuli attract attention, resulting in enhanced within-object binding, and, thereby, enhanced source memory for intrinsic context features of emotional stimuli. To test this prediction, we presented pictures of threatening and harmless animals, the color of which had been experimentally manipulated. In a memory test, old-new recognition for the animals and source memory for their color was assessed. In all 3 experiments, old-new recognition was better for the more threatening material, which supports previous reports of an emotional memory enhancement. This recognition advantage was due to the emotional properties of the stimulus material, and not specific for snake stimuli. However, inconsistent with the prediction of the object-based framework, intrinsic source memory was not affected by emotion. (c) 2015 APA, all rights reserved.
Identification and location of catenary insulator in complex background based on machine vision

NASA Astrophysics Data System (ADS)

Yao, Xiaotong; Pan, Yingli; Liu, Li; Cheng, Xiao

2018-04-01

Locating the insulator precisely is an important premise for fault detection. Because current algorithms for locating insulators in catenary inspection images are not accurate, a target recognition and localization method based on binocular vision combined with SURF features is proposed. First, because the insulator is located in a complex environment, SURF features are used to recognize the target and achieve coarse positioning. Then, the binocular vision principle is used to calculate the 3D coordinates of the coarsely located object, achieving target recognition and fine location. Finally, the 3D coordinates of the object's center of mass are preserved and transferred to the inspection robot to control its detection position. Experimental results demonstrate that the proposed method has better recognition efficiency and accuracy, can successfully identify the target, and has definite application value.
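The binocular step in the record above, recovering 3D coordinates from a matched point in the left and right images, can be illustrated with standard rectified-stereo triangulation. The sketch assumes an ideal, rectified camera pair with identical focal length and a purely horizontal baseline; the function name triangulate and the calibration values are hypothetical, and the SURF matching stage of the paper is not shown.

```python
def triangulate(u_left, u_right, v, focal_px, baseline_m, cx, cy):
    """Return (X, Y, Z) in meters in the left-camera frame.

    u_left, u_right : horizontal pixel coordinates of the matched point
    v               : vertical pixel coordinate (equal in both images)
    focal_px        : focal length in pixels
    baseline_m      : distance between the two camera centers in meters
    cx, cy          : principal point in pixels
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity  # depth from similar triangles
    x = (u_left - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return x, y, z

# Hypothetical calibration and match (assumptions, not values from the paper).
X, Y, Z = triangulate(u_left=400.0, u_right=380.0, v=260.0,
                      focal_px=800.0, baseline_m=0.1, cx=320.0, cy=240.0)
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant objects than for near ones, which is one reason a coarse recognition step before triangulation can help.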
Representations of Shape in Object Recognition and Long-Term Visual Memory

DTIC Science & Technology

1993-02-11

in anything other than linguistic terms (Biederman, 1987, for example). STATUS 1. Viewpoint-Dependent Features in Object Representation Tarr and...is object-based orientation-independent representations sufficient for "basic-level" categorization (Biederman, 1987; Corballis, 1988). Alternatively...space. REFERENCES Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94, 115-147. Cooper, L
A sensor and video based ontology for activity recognition in smart environments.

PubMed

Mitchell, D; Morrow, Philip J; Nugent, Chris D

2014-01-01

Activity recognition is used in a wide range of applications including healthcare and security. In a smart environment activity recognition can be used to monitor and support the activities of a user. There have been a range of methods used in activity recognition including sensor-based approaches, vision-based approaches and ontological approaches. This paper presents a novel approach to activity recognition in a smart home environment which combines sensor and video data through an ontological framework. The ontology describes the relationships and interactions between activities, the user, objects, sensors and video data.
Classification of Anticipatory Signals for Grasp and Release from Surface Electromyography.

PubMed

Siu, Ho Chit; Shah, Julie A; Stirling, Leia A

2016-10-25

Surface electromyography (sEMG) is a technique for recording natural muscle activation signals, which can serve as control inputs for exoskeletons and prosthetic devices. Previous experiments have incorporated these signals using both classical and pattern-recognition control methods in order to actuate such devices. We used the results of an experiment incorporating grasp and release actions with object contact to develop an intent-recognition system based on Gaussian mixture models (GMM) and continuous-emission hidden Markov models (HMM) of sEMG data. We tested this system with data collected from 16 individuals using a forearm band with distributed sEMG sensors. The data contain trials with shifted band alignments to assess robustness to sensor placement. This study evaluated and found that pattern-recognition-based methods could classify transient anticipatory sEMG signals in the presence of shifted sensor placement and object contact. With the best-performing classifier, the effect of label lengths in the training data was also examined. A mean classification accuracy of 75.96% was achieved through a unigram HMM method with five mixture components. Classification accuracy on different sub-movements was found to be limited by the length of the shortest sub-movement, which means that shorter sub-movements within dynamic sequences require larger training sets to be classified correctly. This classification of user intent is a potential control mechanism for a dynamic grasping task involving user contact with external objects and noise. Further work is required to test its performance as part of an exoskeleton controller, which involves contact with actuated external surfaces.
Classification of Anticipatory Signals for Grasp and Release from Surface Electromyography

PubMed Central

Siu, Ho Chit; Shah, Julie A.; Stirling, Leia A.

2016-01-01

Surface electromyography (sEMG) is a technique for recording natural muscle activation signals, which can serve as control inputs for exoskeletons and prosthetic devices. Previous experiments have incorporated these signals using both classical and pattern-recognition control methods in order to actuate such devices. We used the results of an experiment incorporating grasp and release actions with object contact to develop an intent-recognition system based on Gaussian mixture models (GMM) and continuous-emission hidden Markov models (HMM) of sEMG data. We tested this system with data collected from 16 individuals using a forearm band with distributed sEMG sensors. The data contain trials with shifted band alignments to assess robustness to sensor placement. This study evaluated and found that pattern-recognition-based methods could classify transient anticipatory sEMG signals in the presence of shifted sensor placement and object contact. With the best-performing classifier, the effect of label lengths in the training data was also examined. A mean classification accuracy of 75.96% was achieved through a unigram HMM method with five mixture components. Classification accuracy on different sub-movements was found to be limited by the length of the shortest sub-movement, which means that shorter sub-movements within dynamic sequences require larger training sets to be classified correctly. This classification of user intent is a potential control mechanism for a dynamic grasping task involving user contact with external objects and noise. Further work is required to test its performance as part of an exoskeleton controller, which involves contact with actuated external surfaces. PMID:27792155
Contributions of Low and High Spatial Frequency Processing to Impaired Object Recognition Circuitry in Schizophrenia

PubMed Central

Calderone, Daniel J.; Hoptman, Matthew J.; Martínez, Antígona; Nair-Collins, Sangeeta; Mauro, Cristina J.; Bar, Moshe; Javitt, Daniel C.; Butler, Pamela D.

2013-01-01

Patients with schizophrenia exhibit cognitive and sensory impairment, and object recognition deficits have been linked to sensory deficits. The “frame and fill” model of object recognition posits that low spatial frequency (LSF) information rapidly reaches the prefrontal cortex (PFC) and creates a general shape of an object that feeds back to the ventral temporal cortex to assist object recognition. Visual dysfunction findings in schizophrenia suggest a preferential loss of LSF information. This study used functional magnetic resonance imaging (fMRI) and resting state functional connectivity (RSFC) to investigate the contribution of visual deficits to impaired object “framing” circuitry in schizophrenia. Participants were shown object stimuli that were intact or contained only LSF or high spatial frequency (HSF) information. For controls, fMRI revealed preferential activation to LSF information in precuneus, superior temporal, and medial and dorsolateral PFC areas, whereas patients showed a preference for HSF information or no preference. RSFC revealed a lack of connectivity between early visual areas and PFC for patients. These results demonstrate impaired processing of LSF information during object recognition in schizophrenia, with patients instead displaying increased processing of HSF information. This is consistent with findings of a preference for local over global visual information in schizophrenia. PMID:22735157

Acute effects of alcohol on intrusive memory development and viewpoint dependence in spatial memory support a dual representation model.

PubMed

Bisby, James A; King, John A; Brewin, Chris R; Burgess, Neil; Curran, H Valerie

2010-08-01

A dual representation model of intrusive memory proposes that personally experienced events give rise to two types of representation: an image-based, egocentric representation based on sensory-perceptual features; and a more abstract, allocentric representation that incorporates spatiotemporal context. The model proposes that intrusions reflect involuntary reactivation of egocentric representations in the absence of a corresponding allocentric representation. We tested the model by investigating the effect of alcohol on intrusive memories and, concurrently, on egocentric and allocentric spatial memory. With a double-blind independent group design, participants were administered alcohol (0.4 or 0.8 g/kg) or placebo. A virtual environment was used to present objects and test recognition memory from the same viewpoint as presentation (tapping egocentric memory) or a shifted viewpoint (tapping allocentric memory). Participants were also exposed to a trauma video and required to detail intrusive memories for 7 days, after which explicit memory was assessed. There was a selective impairment of shifted-view recognition after the low dose of alcohol, whereas the high dose induced a global impairment in same-view and shifted-view conditions. Alcohol showed a dose-dependent inverted "U"-shaped effect on intrusions, with only the low dose increasing the number of intrusions, replicating previous work. When same-view recognition was intact, decrements in shifted-view recognition were associated with increases in intrusions. The differential effect of alcohol on intrusive memories and on same/shifted-view recognition supports a dual representation model in which intrusions might reflect an imbalance between two types of memory representation. These findings highlight important clinical implications, given alcohol's involvement in real-life trauma. Copyright 2010 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
Human action recognition based on point context tensor shape descriptor

NASA Astrophysics Data System (ADS)

Li, Jianjun; Mao, Xia; Chen, Lijiang; Wang, Lan

2017-07-01

Motion trajectory recognition is one of the most important means to determine the identity of a moving object. A compact and discriminative feature representation method can improve the trajectory recognition accuracy. This paper presents an efficient framework for action recognition using a three-dimensional skeleton kinematic joint model. First, we put forward a rotation-scale-translation-invariant shape descriptor based on point context (PC) and the normal vector of hypersurface to jointly characterize local motion and shape information. Meanwhile, an algorithm for extracting the key trajectory based on the confidence coefficient is proposed to reduce the randomness and computational complexity. Second, to decrease the eigenvalue decomposition time complexity, a tensor shape descriptor (TSD) based on PC that can globally capture the spatial layout and temporal order to preserve the spatial information of each frame is proposed. Then, a multilinear projection process is achieved by tensor dynamic time warping to map the TSD to a low-dimensional tensor subspace of the same size. Experimental results show that the proposed shape descriptor is effective and feasible, and the proposed approach obtains considerable performance improvement over the state-of-the-art approaches with respect to accuracy on a public action dataset.
Pattern recognition for passive polarimetric data using nonparametric classifiers

NASA Astrophysics Data System (ADS)

Thilak, Vimal; Saini, Jatinder; Voelz, David G.; Creusere, Charles D.

2005-08-01

Passive polarization-based imaging is a useful tool in computer vision and pattern recognition. A passive polarization imaging system forms a polarimetric image from the reflection of ambient light that contains useful information for computer vision tasks such as object detection (classification) and recognition. Applications of polarization-based pattern recognition include material classification and automatic shape recognition. In this paper, we present two target detection algorithms for images captured by a passive polarimetric imaging system. The proposed detection algorithms are based on Bayesian decision theory. In these approaches, an object can belong to one of any given number of classes, and classification involves making decisions that minimize the average probability of making incorrect decisions. This minimum is achieved by assigning an object to the class that maximizes the a posteriori probability. Computing a posteriori probabilities requires estimates of class conditional probability density functions (likelihoods) and prior probabilities. A probabilistic neural network (PNN), which is a nonparametric method that can compute Bayes-optimal boundaries, and a k-nearest neighbor (KNN) classifier are used for density estimation and classification. The proposed algorithms are applied to polarimetric image data gathered in the laboratory with a liquid crystal-based system. The experimental results validate the effectiveness of the above algorithms for target detection from polarimetric data.
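The maximum a posteriori rule described in this abstract can be illustrated with a minimal sketch. It assumes that the posterior for each class is approximated by the fraction of the k nearest training samples carrying that class label, which is the usual KNN estimate; the two-dimensional feature vectors, the class names, and the function names are hypothetical and are not taken from the paper.

```python
from collections import Counter
import math

def knn_posteriors(train, query, k=3):
    """Approximate P(class | query) by class fractions among the k nearest samples.

    train is a list of (feature_vector, label) pairs (hypothetical data layout).
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    counts = Counter(label for _, label in nearest)
    return {label: n / k for label, n in counts.items()}

def map_decision(posteriors):
    """Assign the class with the largest estimated a posteriori probability."""
    return max(posteriors, key=posteriors.get)

# Hypothetical two-feature samples, e.g. two polarimetric measurements per pixel.
train = [
    ((0.0, 0.1), "background"), ((0.1, 0.0), "background"), ((0.2, 0.1), "background"),
    ((0.9, 1.0), "target"), ((1.0, 0.9), "target"), ((0.8, 0.8), "target"),
]
decision = map_decision(knn_posteriors(train, (0.85, 0.9), k=3))
```

With this estimate the class priors are implied by the class frequencies in the training set, so an unbalanced training set shifts the decision boundary.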
Evolutionary Design of Convolutional Neural Networks for Human Activity Recognition in Sensor-Rich Environments.

PubMed

Baldominos, Alejandro; Saez, Yago; Isasi, Pedro

2018-04-23

Human activity recognition is a challenging problem for context-aware systems and applications. It is gaining interest due to the ubiquity of different sensor sources, wearable smart objects, ambient sensors, etc. This task is usually approached as a supervised machine learning problem, where a label is to be predicted given some input data, such as the signals retrieved from different sensors. For tackling the human activity recognition problem in sensor network environments, in this paper we propose the use of deep learning (convolutional neural networks) to perform activity recognition using the publicly available OPPORTUNITY dataset. Instead of manually choosing a suitable topology, we will let an evolutionary algorithm design the optimal topology in order to maximize the classification F1 score. After that, we will also explore the performance of committees of the models resulting from the evolutionary process. Results analysis indicates that the proposed model was able to perform activity recognition within a heterogeneous sensor network environment, achieving very high accuracies when tested with new sensor data. Based on all conducted experiments, the proposed neuroevolutionary system has proved to be able to systematically find a classification model which is capable of outperforming previous results reported in the state-of-the-art, showing that this approach is useful and improves upon previously manually-designed architectures.
Evolutionary Design of Convolutional Neural Networks for Human Activity Recognition in Sensor-Rich Environments

PubMed Central

2018-01-01

Human activity recognition is a challenging problem for context-aware systems and applications. It is gaining interest due to the ubiquity of different sensor sources, wearable smart objects, ambient sensors, etc. This task is usually approached as a supervised machine learning problem, where a label is to be predicted given some input data, such as the signals retrieved from different sensors. For tackling the human activity recognition problem in sensor network environments, in this paper we propose the use of deep learning (convolutional neural networks) to perform activity recognition using the publicly available OPPORTUNITY dataset. Instead of manually choosing a suitable topology, we will let an evolutionary algorithm design the optimal topology in order to maximize the classification F1 score. After that, we will also explore the performance of committees of the models resulting from the evolutionary process. Results analysis indicates that the proposed model was able to perform activity recognition within a heterogeneous sensor network environment, achieving very high accuracies when tested with new sensor data. Based on all conducted experiments, the proposed neuroevolutionary system has proved to be able to systematically find a classification model which is capable of outperforming previous results reported in the state-of-the-art, showing that this approach is useful and improves upon previously manually-designed architectures. PMID:29690587
Emotional valence of stimuli modulates false recognition: Using a modified version of the simplified conjoint recognition paradigm.

PubMed

Gong, Xianmin; Xiao, Hongrui; Wang, Dahua

2016-11-01

False recognition results from the interplay of multiple cognitive processes, including verbatim memory, gist memory, phantom recollection, and response bias. In the current study, we modified the simplified Conjoint Recognition (CR) paradigm to investigate the way in which the valence of emotional stimuli affects the cognitive process and behavioral outcome of false recognition. In Study 1, we examined the applicability of the modification to the simplified CR paradigm and model. Twenty-six undergraduate students (13 females, aged 21.00 ± 2.30 years) learned and recognized both the large and small categories of photo objects. The applicability of the paradigm and model was confirmed by a fair goodness-of-fit of the model to the observational data and by their competence in detecting the memory differences between the large- and small-category conditions. In Study 2, we recruited another sample of 29 undergraduate students (14 females, aged 22.60 ± 2.74 years) to learn and recognize the categories of photo objects that were emotionally provocative. The results showed that negative valence increased false recognition, particularly the rate of false "remember" responses, by facilitating phantom recollection; positive valence did not influence false recognition significantly, though it enhanced gist processing. Copyright © 2016 Elsevier B.V. All rights reserved.
The development of adaptive decision making: Recognition-based inference in children and adolescents.

PubMed

Horn, Sebastian S; Ruggeri, Azzurra; Pachur, Thorsten

2016-09-01

Judgments about objects in the world are often based on probabilistic information (or cues). A frugal judgment strategy that utilizes memory (i.e., the ability to discriminate between known and unknown objects) as a cue for inference is the recognition heuristic (RH). The usefulness of the RH depends on the structure of the environment, particularly the predictive power (validity) of recognition. Little is known about developmental differences in use of the RH. In this study, the authors examined (a) to what extent children and adolescents recruit the RH when making judgments, and (b) around what age adaptive use of the RH emerges. Primary schoolchildren (M = 9 years), younger adolescents (M = 12 years), and older adolescents (M = 17 years) made comparative judgments in task environments with either high or low recognition validity. Reliance on the RH was measured with a hierarchical multinomial model. Results indicated that primary schoolchildren already made systematic use of the RH. However, only older adolescents adaptively adjusted their strategy use between environments and were better able to discriminate between situations in which the RH led to correct versus incorrect inferences. These findings suggest that the use of simple heuristics does not progress unidirectionally across development but strongly depends on the task environment, in line with the perspective of ecological rationality. Moreover, adaptive heuristic inference seems to require experience and a developed base of domain knowledge. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
The Cambridge Car Memory Test: a task matched in format to the Cambridge Face Memory Test, with norms, reliability, sex differences, dissociations from face memory, and expertise effects.

PubMed

Dennett, Hugh W; McKone, Elinor; Tavashmi, Raka; Hall, Ashleigh; Pidcock, Madeleine; Edwards, Mark; Duchaine, Bradley

2012-06-01

Many research questions require a within-class object recognition task matched for general cognitive requirements with a face recognition task. If the object task also has high internal reliability, it can improve accuracy and power in group analyses (e.g., mean inversion effects for faces vs. objects), individual-difference studies (e.g., correlations between certain perceptual abilities and face/object recognition), and case studies in neuropsychology (e.g., whether a prosopagnosic shows a face-specific or object-general deficit). Here, we present such a task. Our Cambridge Car Memory Test (CCMT) was matched in format to the established Cambridge Face Memory Test, requiring recognition of exemplars across view and lighting change. We tested 153 young adults (93 female). Results showed high reliability (Cronbach's alpha = .84) and a range of scores suitable both for normal-range individual-difference studies and, potentially, for diagnosis of impairment. The mean for males was much higher than the mean for females. We demonstrate independence between face memory and car memory (dissociation based on sex, plus a modest correlation between the two), including where participants have high relative expertise with cars. We also show that expertise with real car makes and models of the era used in the test significantly predicts CCMT performance. Surprisingly, however, regression analyses imply that there is an effect of sex per se on the CCMT that is not attributable to a stereotypical male advantage in car expertise.
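The internal reliability figure quoted in this abstract is Cronbach's alpha. As a hedged illustration of how that statistic is computed from item-level scores, the sketch below uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the score matrix is made up for the example and has nothing to do with the CCMT data.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a participants-by-items score matrix.

    scores is a list of rows, one per participant, each row holding
    that participant's score on every item (hypothetical layout).
    """
    k = len(scores[0])  # number of items
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Made-up data: three participants, two items.
consistent = [[1, 1], [2, 2], [3, 3]]   # items agree perfectly
noisy = [[1, 2], [2, 1], [3, 3]]        # items agree only partly
alpha_consistent = cronbach_alpha(consistent)
alpha_noisy = cronbach_alpha(noisy)
```

Alpha approaches 1 when items rank participants the same way and drops as the items disagree.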
Modeling global scene factors in attention

NASA Astrophysics Data System (ADS)

Torralba, Antonio

2003-07-01

Models of visual attention have focused predominantly on bottom-up approaches that ignored structured contextual and scene information. I propose a model of contextual cueing for attention guidance based on the global scene configuration. It is shown that the statistics of low-level features across the whole image can be used to prime the presence or absence of objects in the scene and to predict their location, scale, and appearance before exploring the image. In this scheme, visual context information can become available early in the visual processing chain, which allows modulation of the saliency of image regions and provides an efficient shortcut for object detection and recognition. © 2003 Optical Society of America
The role of familiarity in binary choice inferences.

PubMed

Honda, Hidehito; Abe, Keiga; Matsuka, Toshihiko; Yamagishi, Kimihiko

2011-07-01

In research on the recognition heuristic (Goldstein & Gigerenzer, Psychological Review, 109, 75-90, 2002), knowledge of recognized objects has been categorized as "recognized" or "unrecognized" without regard to the degree of familiarity of the recognized object. In the present article, we propose a new inference model--familiarity-based inference. We hypothesize that when subjective knowledge levels (familiarity) of recognized objects differ, the degree of familiarity of recognized objects will influence inferences. Specifically, people are predicted to infer that the more familiar object in a pair of two objects has a higher criterion value on the to-be-judged dimension. In two experiments, using a binary choice task, we examined inferences about populations in a pair of two cities. Results support predictions of familiarity-based inference. Participants inferred that the more familiar city in a pair was more populous. Statistical modeling showed that individual differences in familiarity-based inference lie in the sensitivity to differences in familiarity. In addition, we found that familiarity-based inference can be generally regarded as an ecologically rational inference. Furthermore, when cue knowledge about the inference criterion was available, participants made inferences based on the cue knowledge about population instead of familiarity. Implications of the role of familiarity in psychological processes are discussed.
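The decision rule of familiarity-based inference, as stated in this abstract, can be sketched in a few lines: in a pair, infer that the more familiar object has the higher criterion value. The familiarity ratings, the city names, and the treatment of ties (no prediction) are assumptions made for illustration only; the abstract does not specify them.

```python
def familiarity_based_choice(pair, familiarity):
    """Return the object inferred to have the higher criterion value.

    pair is a tuple of two object names; familiarity maps names to
    subjective familiarity ratings (hypothetical scale, higher = more familiar).
    Returns None when the ratings are equal, i.e. the rule makes no prediction.
    """
    a, b = pair
    fa = familiarity.get(a, 0.0)
    fb = familiarity.get(b, 0.0)
    if fa == fb:
        return None
    return a if fa > fb else b

# Hypothetical ratings for a binary choice task about city populations.
ratings = {"City A": 0.9, "City B": 0.4, "City C": 0.4}
choice = familiarity_based_choice(("City A", "City B"), ratings)
tie = familiarity_based_choice(("City B", "City C"), ratings)
```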
Mechanisms and neural basis of object and pattern recognition: a study with chess experts.

PubMed

Bilalić, Merim; Langner, Robert; Erb, Michael; Grodd, Wolfgang

2010-11-01

Comparing experts with novices offers unique insights into the functioning of cognition, based on the maximization of individual differences. Here we used this expertise approach to disentangle the mechanisms and neural basis behind two processes that contribute to everyday expertise: object and pattern recognition. We compared chess experts and novices performing chess-related and -unrelated (visual) search tasks. As expected, the superiority of experts was limited to the chess-specific task, as there were no differences in a control task that used the same chess stimuli but did not require chess-specific recognition. The analysis of eye movements showed that experts immediately and exclusively focused on the relevant aspects in the chess task, whereas novices also examined irrelevant aspects. With random chess positions, when pattern knowledge could not be used to guide perception, experts nevertheless maintained an advantage. Experts' superior domain-specific parafoveal vision, a consequence of their knowledge about individual domain-specific symbols, enabled improved object recognition. Functional magnetic resonance imaging corroborated this differentiation between object and pattern recognition and showed that chess-specific object recognition was accompanied by bilateral activation of the occipitotemporal junction, whereas chess-specific pattern recognition was related to bilateral activations in the middle part of the collateral sulci. Using the expertise approach together with carefully chosen controls and multiple dependent measures, we identified object and pattern recognition as two essential cognitive processes in expert visual cognition, which may also help to explain the mechanisms of everyday perception.
A method of 3D object recognition and localization in a cloud of points

NASA Astrophysics Data System (ADS)

Bielicki, Jerzy; Sitnik, Robert

2013-12-01

The proposed method given in this article is prepared for analysis of data in the form of cloud of points directly from 3D measurements. It is designed for use in the end-user applications that can directly be integrated with 3D scanning software. The method utilizes locally calculated feature vectors (FVs) in point cloud data. Recognition is based on comparison of the analyzed scene with reference object library. A global descriptor in the form of a set of spatially distributed FVs is created for each reference model. During the detection process, correlation of subsets of reference FVs with FVs calculated in the scene is computed. Features utilized in the algorithm are based on parameters, which qualitatively estimate mean and Gaussian curvatures. Replacement of differentiation with averaging in the curvatures estimation makes the algorithm more resistant to discontinuities and poor quality of the input data. Utilization of the FV subsets allows to detect partially occluded and cluttered objects in the scene, while additional spatial information maintains false positive rate at a reasonably low level.
Neurocomputational bases of object and face recognition.

PubMed Central

Biederman, I; Kalocsai, P

1997-01-01

A number of behavioural phenomena distinguish the recognition of faces and objects, even when members of a set of objects are highly similar. Because faces have the same parts in approximately the same relations, individuation of faces typically requires specification of the metric variation in a holistic and integral representation of the facial surface. The direct mapping of a hypercolumn-like pattern of activation onto a representation layer that preserves relative spatial filter values in a two-dimensional (2D) coordinate space, as proposed by C. von der Malsburg and his associates, may account for many of the phenomena associated with face recognition. An additional refinement, in which each column of filters (termed a 'jet') is centred on a particular facial feature (or fiducial point), allows selectivity of the input into the holistic representation to avoid incorporation of occluding or nearby surfaces. The initial hypercolumn representation also characterizes the first stage of object perception, but the image variation for objects at a given location in a 2D coordinate space may be too great to yield sufficient predictability directly from the output of spatial kernels. Consequently, objects can be represented by a structural description specifying qualitative (typically, non-accidental) characterizations of an object's parts, the attributes of the parts, and the relations among the parts, largely based on orientation and depth discontinuities (as shown by Hummel & Biederman). A series of experiments on the name priming or physical matching of complementary images (in the Fourier domain) of objects and faces documents that whereas face recognition is strongly dependent on the original spatial filter values, evidence from object recognition indicates strong invariance to these values, even when distinguishing among objects that are as similar as faces. PMID:9304687
Object, spatial and social recognition testing in a single test paradigm.

PubMed

Lian, Bin; Gao, Jun; Sui, Nan; Feng, Tingyong; Li, Ming

2018-07-01

Animals have the ability to process information about an object or a conspecific's physical features and location, and alter their behavior when such information is updated. In the laboratory, object, spatial and social recognition are often studied in separate tasks, making them unsuitable to study the potential dissociations and interactions among various types of recognition memories. The present study introduced a single paradigm to detect the object and spatial recognition, and social recognition of a familiar and novel conspecific. Specifically, male and female Sprague-Dawley adult (>75 days old) or preadolescent (25-28 days old) rats were tested with two objects and one social partner in an open-field arena for four 10-min sessions with a 20-min inter-session interval. After the first sample session, a new object replaced one of the sampled objects in the second session, and the location of one of the old objects was changed in the third session. Finally, a new social partner was introduced in the fourth session and replaced the familiar one. Exploration time with each stimulus was recorded and measures for the three recognitions were calculated based on the discrimination ratio. Overall results show that adult and preadolescent male and female rats spent more time exploring the social partner than the objects, showing a clear preference for the social stimulus over the nonsocial ones. They also did not differ in their abilities to discriminate a new object, a new location and a new social partner from a familiar one, and to recognize a familiar conspecific. Acute administration of MK-801 (an NMDA receptor antagonist, 0.025 and 0.10 mg/kg, i.p.) after the sample session dose-dependently reduced the total time spent on exploring the social partner and objects in the adult rats, and had a significantly larger effect in the females than in the males. MK-801 also dose-dependently increased motor activity. However, it did not alter the object, spatial and social recognitions. These findings indicate that the new triple recognition paradigm is capable of recording the object, spatial location and social recognition together and revealing potential sex and age differences. This paradigm is also useful for the study of object and social exploration concurrently and can be used to evaluate cognition-altering drugs in various stages of recognition memories. Copyright © 2018. Published by Elsevier Inc.
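The abstract does not give the formula behind its discrimination ratio. One common convention in novelty-preference tasks, assumed here purely for illustration, is the difference between exploration time on the novel and the familiar stimulus divided by their sum, which ranges from -1 to 1 with 0 meaning no preference. The function name and the exploration times below are hypothetical.

```python
def discrimination_ratio(novel_time, familiar_time):
    """(novel - familiar) / (novel + familiar); an assumed, commonly used definition.

    Times are exploration durations in seconds. Raises ValueError when the
    animal explored neither stimulus, since the ratio is then undefined.
    """
    total = novel_time + familiar_time
    if total == 0:
        raise ValueError("no exploration recorded for either stimulus")
    return (novel_time - familiar_time) / total

# Hypothetical exploration times for one session.
ratio_preference = discrimination_ratio(30.0, 10.0)
ratio_none = discrimination_ratio(10.0, 10.0)
```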
Implementation of a Peltier-based cooling device for localized deep cortical deactivation during in vivo object recognition testing

NASA Astrophysics Data System (ADS)

Marra, Kyle; Graham, Brett; Carouso, Samantha; Cox, David

2012-02-01

While the application of local cortical cooling has recently become a focus of neurological research, extended localized deactivation deep within brain structures is still unexplored. Using a wirelessly controlled thermoelectric (Peltier) device and water-based heat sink, we have achieved inactivating temperatures (<20 °C) at greater depths (>8 mm) than previously reported. After implanting the device into Long Evans rats' basolateral amygdala (BLA), an inhibitory brain center that controls anxiety and fear, we ran an open field test during which anxiety-driven behavioral tendencies were observed to decrease during cooling, thus confirming the device's effect on behavior. Our device will next be implanted in the rats' temporal association cortex (TeA) and recordings from our signal-tracing multichannel microelectrodes will measure and compare activated and deactivated neuronal activity so as to isolate and study the TeA signals responsible for object recognition. Having already achieved a top-performing computational face-recognition system, the lab will utilize this TeA activity data to generalize its computational efforts of face recognition to achieve general object recognition.
Deficits in long-term recognition memory reveal dissociated subtypes in congenital prosopagnosia.

PubMed

Stollhoff, Rainer; Jost, Jürgen; Elze, Tobias; Kennerknecht, Ingo

2011-01-25

The study investigates long-term recognition memory in congenital prosopagnosia (CP), a lifelong impairment in face identification that is present from birth. Previous investigations of processing deficits in CP have mostly relied on short-term recognition tests to estimate the scope and severity of individual deficits. We firstly report on a controlled test of long-term (one year) recognition memory for faces and objects conducted with a large group of participants with CP. Long-term recognition memory is significantly impaired in eight CP participants (CPs). In all but one case, this deficit was selective to faces and didn't extend to intra-class recognition of object stimuli. In a test of famous face recognition, long-term recognition deficits were less pronounced, even after accounting for differences in media consumption between controls and CPs. Secondly, we combined test results on long-term and short-term recognition of faces and objects, and found a large heterogeneity in severity and scope of individual deficits. Analysis of the observed heterogeneity revealed a dissociation of CP into subtypes with a homogeneous phenotypical profile. Thirdly, we found that among CPs self-assessment of real-life difficulties, based on a standardized questionnaire, and experimentally assessed face recognition deficits are strongly correlated. Our results demonstrate that controlled tests of long-term recognition memory are needed to fully assess face recognition deficits in CP. Based on controlled and comprehensive experimental testing, CP can be dissociated into subtypes with a homogeneous phenotypical profile. The CP subtypes identified align with those found in prosopagnosia caused by cortical lesions; they can be interpreted with respect to a hierarchical neural system for face perception.
Deficits in Long-Term Recognition Memory Reveal Dissociated Subtypes in Congenital Prosopagnosia

PubMed Central

Stollhoff, Rainer; Jost, Jürgen; Elze, Tobias; Kennerknecht, Ingo

2011-01-01

The study investigates long-term recognition memory in congenital prosopagnosia (CP), a lifelong impairment in face identification that is present from birth. Previous investigations of processing deficits in CP have mostly relied on short-term recognition tests to estimate the scope and severity of individual deficits. We firstly report on a controlled test of long-term (one year) recognition memory for faces and objects conducted with a large group of participants with CP. Long-term recognition memory is significantly impaired in eight CP participants (CPs). In all but one case, this deficit was selective to faces and didn't extend to intra-class recognition of object stimuli. In a test of famous face recognition, long-term recognition deficits were less pronounced, even after accounting for differences in media consumption between controls and CPs. Secondly, we combined test results on long-term and short-term recognition of faces and objects, and found a large heterogeneity in severity and scope of individual deficits. Analysis of the observed heterogeneity revealed a dissociation of CP into subtypes with a homogeneous phenotypical profile. Thirdly, we found that among CPs self-assessment of real-life difficulties, based on a standardized questionnaire, and experimentally assessed face recognition deficits are strongly correlated. Our results demonstrate that controlled tests of long-term recognition memory are needed to fully assess face recognition deficits in CP. Based on controlled and comprehensive experimental testing, CP can be dissociated into subtypes with a homogeneous phenotypical profile. The CP subtypes identified align with those found in prosopagnosia caused by cortical lesions; they can be interpreted with respect to a hierarchical neural system for face perception. PMID:21283572
A simulation study of detection of weapon of mass destruction based on radar

NASA Astrophysics Data System (ADS)

Sharifahmadian, E.; Choi, Y.; Latifi, S.

2013-05-01

Typical systems used for detection of Weapons of Mass Destruction (WMD) are based on sensing objects using gamma rays or neutrons. Nonetheless, depending on environmental conditions, current methods for detecting fissile materials have a limited effective distance. Moreover, radiation related to gamma rays can be easily shielded. Here, detecting concealed WMD from a distance is simulated and studied based on radar, especially WideBand (WB) technology. The WB-based method capitalizes on the fact that electromagnetic waves penetrate through different materials at different rates. While low-frequency waves can pass through objects more easily, high-frequency waves have a higher rate of absorption by objects, making object recognition easier. Measuring the penetration depth allows one to identify the sensed material. During simulation, radar waves and the propagation area, including free space and objects in the scene, are modeled. Each material is modeled as a layer with a certain thickness. At the start of the simulation, a modeled radar wave is radiated toward the layers. At the receiver side, each layer can be identified based on the signals received from it. When an electromagnetic wave passes through an object, the wave's power is subject to a certain level of attenuation depending on the object's characteristics. Simulation is performed using radar signals with different frequencies (in the MHz-GHz range) and powers to identify different layers.
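
The layer-by-layer attenuation described in this abstract can be illustrated with a minimal sketch. It assumes a simple exponential power decay, exp(-2 * alpha * d), per layer; the function names, the attenuation coefficients, and the thicknesses are hypothetical illustration values, not the authors' simulator or data.

```python
import math

def transmitted_power(p_in, layers):
    """Return the power remaining after each layer, in order.

    layers: list of (alpha_per_m, thickness_m) pairs (hypothetical values).
    Assumes power decays as exp(-2 * alpha * d) inside each layer.
    """
    powers = []
    p = p_in
    for alpha, thickness in layers:
        p *= math.exp(-2.0 * alpha * thickness)
        powers.append(p)
    return powers

def attenuation_db(p_in, p_out):
    """Attenuation between two power levels, in decibels."""
    return 10.0 * math.log10(p_in / p_out)

# Two made-up layers: a weakly absorbing one and a strongly absorbing one.
layers = [(0.5, 0.10), (2.0, 0.05)]
powers = transmitted_power(1.0, layers)
```

Under this assumed model, a layer with a larger attenuation coefficient produces a larger drop in received power for the same thickness, which is the kind of difference a receiver could use to tell layers apart.
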
Recognition ROCS Are Curvilinear--Or Are They? On Premature Arguments against the Two-High-Threshold Model of Recognition

ERIC Educational Resources Information Center

Broder, Arndt; Schutz, Julia

2009-01-01

Recent reviews of recognition receiver operating characteristics (ROCs) claim that their curvilinear shape rules out threshold models of recognition. However, the shape of ROCs based on confidence ratings is not diagnostic to refute threshold models, whereas ROCs based on experimental bias manipulations are. Also, fitting predicted frequencies to…
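
As background for the argument in this abstract, the textbook two-high-threshold model predicts a linear ROC when response bias is manipulated. The sketch below uses the standard formulation (detection probabilities for old and new items plus a guessing rate); the parameter names and values are assumptions for illustration and do not come from the paper.

```python
def two_ht_roc_point(p_old, p_new, guess_old):
    """Return (false_alarm_rate, hit_rate) under a two-high-threshold model.

    p_old: probability that an old item is detected as old
    p_new: probability that a new item is detected as new
    guess_old: probability of guessing "old" when nothing is detected
    """
    hit = p_old + (1.0 - p_old) * guess_old
    false_alarm = (1.0 - p_new) * guess_old
    return false_alarm, hit

# Varying only the guessing bias traces out a straight line in ROC space.
points = [two_ht_roc_point(0.6, 0.6, g) for g in (0.0, 0.25, 0.5, 0.75, 1.0)]
```
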
Combining heterogenous features for 3D hand-held object recognition

NASA Astrophysics Data System (ADS)

Lv, Xiong; Wang, Shuang; Li, Xiangyang; Jiang, Shuqiang

2014-10-01

Object recognition has wide applications in the area of human-machine interaction and multimedia retrieval. However, due to the problems of visual polysemy and concept polymorphism, it is still a great challenge to obtain reliable recognition results for 2D images. Recently, with the emergence and easy availability of RGB-D equipment such as Kinect, this challenge could be relieved because the depth channel provides more information. A very special and important case of object recognition is hand-held object recognition, as the hand is a direct and natural means of both human-human interaction and human-machine interaction. In this paper, we study the problem of 3D object recognition by combining heterogeneous features with different modalities and extraction techniques. Although hand-crafted features preserve low-level information such as shape and color, they are weaker at representing high-level semantic information than automatically learned features, especially deep features. Deep features have shown great advantages in large-scale dataset recognition but are not always as robust to rotation or scale variance as hand-crafted features. We propose a method to combine hand-crafted point cloud features and deep learned features in the RGB and depth channels. First, hand-held object segmentation is implemented using depth cues and human skeleton information. Second, we combine the extracted heterogeneous 3D features at different stages using linear concatenation and multiple kernel learning (MKL). Then a trained model is used to recognize 3D hand-held objects. Experimental results validate the effectiveness and generalization ability of the proposed method.

Deep Neural Networks as a Computational Model for Human Shape Sensitivity

PubMed Central

Op de Beeck, Hans P.

2016-01-01

Theories of object recognition agree that shape is of primordial importance, but there is no consensus about how shape might be represented, and so far attempts to implement a model of shape perception that would work with realistic stimuli have largely failed. Recent studies suggest that state-of-the-art convolutional ‘deep’ neural networks (DNNs) capture important aspects of human object perception. We hypothesized that these successes might be partially related to a human-like representation of object shape. Here we demonstrate that sensitivity for shape features, characteristic to human and primate vision, emerges in DNNs when trained for generic object recognition from natural photographs. We show that these models explain human shape judgments for several benchmark behavioral and neural stimulus sets on which earlier models mostly failed. In particular, although never explicitly trained for such stimuli, DNNs develop acute sensitivity to minute variations in shape and to non-accidental properties that have long been implicated to form the basis for object recognition. Even more strikingly, when tested with a challenging stimulus set in which shape and category membership are dissociated, the most complex model architectures capture human shape sensitivity as well as some aspects of the category structure that emerges from human judgments. As a whole, these results indicate that convolutional neural networks not only learn physically correct representations of object categories but also develop perceptually accurate representational spaces of shapes. An even more complete model of human object representations might be in sight by training deep architectures for multiple tasks, which is so characteristic in human development. PMID:27124699
Mechanisms and Neural Basis of Object and Pattern Recognition: A Study with Chess Experts

ERIC Educational Resources Information Center

Bilalic, Merim; Langner, Robert; Erb, Michael; Grodd, Wolfgang

2010-01-01

Comparing experts with novices offers unique insights into the functioning of cognition, based on the maximization of individual differences. Here we used this expertise approach to disentangle the mechanisms and neural basis behind two processes that contribute to everyday expertise: object and pattern recognition. We compared chess experts and…
Developmental Trajectories of Part-Based and Configural Object Recognition in Adolescence

ERIC Educational Resources Information Center

Juttner, Martin; Wakui, Elley; Petters, Dean; Kaur, Surinder; Davidoff, Jules

2013-01-01

Three experiments assessed the development of children's part and configural (part-relational) processing in object recognition during adolescence. In total, 312 school children aged 7-16 years and 80 adults were tested in 3-alternative forced choice (3-AFC) tasks. They judged the correct appearance of upright and inverted presented familiar…
Spatial-frequency cutoff requirements for pattern recognition in central and peripheral vision

PubMed Central

Kwon, MiYoung; Legge, Gordon E.

2011-01-01

It is well known that object recognition requires spatial frequencies exceeding some critical cutoff value. People with central scotomas who rely on peripheral vision have substantial difficulty with reading and face recognition. Deficiencies of pattern recognition in peripheral vision might result in higher cutoff requirements, and may contribute to the functional problems of people with central-field loss. Here we asked about differences in spatial-cutoff requirements in central and peripheral vision for letter and face recognition. The stimuli were the 26 letters of the English alphabet and 26 celebrity faces. Each image was blurred using a low-pass filter in the spatial frequency domain. Critical cutoffs (defined as the minimum low-pass filter cutoff yielding 80% accuracy) were obtained by measuring recognition accuracy as a function of cutoff (in cycles per object). Our data showed that critical cutoffs increased from central to peripheral vision by 20% for letter recognition and by 50% for face recognition. We asked whether these differences could be accounted for by central/peripheral differences in the contrast sensitivity function (CSF). We addressed this question by implementing an ideal-observer model which incorporates empirical CSF measurements and tested the model on letter and face recognition. The success of the model indicates that central/peripheral differences in the cutoff requirements for letter and face recognition can be accounted for by the information content of the stimulus limited by the shape of the human CSF, combined with a source of internal noise and followed by an optimal decision rule. PMID:21854800
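
The blurring step mentioned in this abstract (a low-pass filter applied in the spatial frequency domain) can be sketched as follows. This is a minimal illustration that assumes an ideal, hard-edged circular filter and an image that spans exactly one object, so that cycles per image equal cycles per object; the abstract does not state the filter shape the authors used, and the function name is hypothetical.

```python
import numpy as np

def low_pass(image, cutoff_cycles):
    """Remove spatial frequencies above cutoff_cycles (cycles per image).

    Assumes an ideal circular low-pass mask; real experiments often use
    smoother filter profiles.
    """
    height, width = image.shape
    fy = np.fft.fftfreq(height) * height   # vertical frequency, cycles per image
    fx = np.fft.fftfreq(width) * width     # horizontal frequency, cycles per image
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    mask = radius <= cutoff_cycles
    spectrum = np.fft.fft2(image)
    return np.real(np.fft.ifft2(spectrum * mask))

# Example: a grating with a 2-cycle and a 10-cycle component, filtered at 5 cycles.
x = np.arange(64)
low = np.cos(2 * np.pi * 2 * x / 64)
high = np.cos(2 * np.pi * 10 * x / 64)
image = np.tile(low + high, (64, 1))
blurred = low_pass(image, 5)
```

In the example, only the 2-cycle component survives the 5-cycle cutoff, which mirrors how lowering the cutoff progressively strips fine detail from a letter or face image.
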
Normative Data on Audiovisual Speech Integration Using Sentence Recognition and Capacity Measures

PubMed Central

Altieri, Nicholas; Hudock, Daniel

2016-01-01

Objective: The ability to use visual speech cues and integrate them with auditory information is important, especially in noisy environments and for hearing-impaired (HI) listeners. Providing data on measures of integration skills that encompass accuracy and processing speed will benefit researchers and clinicians. Design: The study consisted of two experiments: First, accuracy scores were obtained using CUNY sentences, and capacity measures that assessed reaction-time distributions were obtained from a monosyllabic word recognition task. Study Sample: We report data on two measures of integration obtained from a sample comprised of 86 young and middle-age adult listeners. Results: To summarize our results, capacity showed a positive correlation with accuracy measures of audiovisual benefit obtained from sentence recognition. More relevant, factor analysis indicated that a single-factor model captured audiovisual speech integration better than models containing more factors. Capacity exhibited strong loadings on the factor, while the accuracy-based measures from sentence recognition exhibited weaker loadings. Conclusions: Results suggest that a listener’s integration skills may be assessed optimally using a measure that incorporates both processing speed and accuracy. PMID:26853446
Speaker-independent phoneme recognition with a binaural auditory image model

NASA Astrophysics Data System (ADS)

Francis, Keith Ivan

1997-09-01

This dissertation presents phoneme recognition techniques based on a binaural fusion of outputs of the auditory image model and subsequent azimuth-selective phoneme recognition in a noisy environment. Background information concerning speech variations, phoneme recognition, current binaural fusion techniques and auditory modeling issues is explained. The research is constrained to sources in the frontal azimuthal plane of a simulated listener. A new method based on coincidence detection of neural activity patterns from the auditory image model of Patterson is used for azimuth-selective phoneme recognition. The method is tested in various levels of noise and the results are reported in contrast to binaural fusion methods based on various forms of correlation to demonstrate the potential of coincidence-based binaural phoneme recognition. This method overcomes smearing of fine speech detail typical of correlation-based methods. Nevertheless, coincidence is able to measure similarity of left and right inputs and fuse them into useful feature vectors for phoneme recognition in noise.
Image object recognition based on the Zernike moment and neural networks

NASA Astrophysics Data System (ADS)

Wan, Jianwei; Wang, Ling; Huang, Fukan; Zhou, Liangzhu

1998-03-01

This paper first gives a comprehensive discussion of the concept of the artificial neural network, its research methods, and its relation to information processing. On the basis of this discussion, we expound the mathematical similarity of artificial neural networks and information processing. Then, the paper presents a new method of image recognition based on invariant features and a neural network, using the image Zernike transform. The method is not only invariant to rotation, shift, and scale of the image object, but also has good fault tolerance and robustness. It is also compared with a statistical classifier and the invariant-moments recognition method.
Enhancing Perception with Tactile Object Recognition in Adaptive Grippers for Human-Robot Interaction.

PubMed

Gandarias, Juan M; Gómez-de-Gabriel, Jesús M; García-Cerezo, Alfonso J

2018-02-26

The use of tactile perception can help first response robotic teams in disaster scenarios, where visibility conditions are often reduced due to the presence of dust, mud, or smoke, distinguishing human limbs from other objects with similar shapes. Here, the integration of the tactile sensor in adaptive grippers is evaluated, measuring the performance of an object recognition task based on deep convolutional neural networks (DCNNs) using a flexible sensor mounted in adaptive grippers. A total of 15 classes with 50 tactile images each were trained, including human body parts and common environment objects, in semi-rigid and flexible adaptive grippers based on the fin ray effect. The classifier was compared against the rigid configuration and a support vector machine classifier (SVM). Finally, a two-level output network has been proposed to provide both object-type recognition and human/non-human classification. Sensors in adaptive grippers have a higher number of non-null tactels (up to 37% more), with a lower mean of pressure values (up to 72% less) than when using a rigid sensor, with a softer grip, which is needed in physical human-robot interaction (pHRI). A semi-rigid implementation with 95.13% object recognition rate was chosen, even though the human/non-human classification had better results (98.78%) with a rigid sensor.
Localization and recognition of traffic signs for automated vehicle control systems

NASA Astrophysics Data System (ADS)

Zadeh, Mahmoud M.; Kasvand, T.; Suen, Ching Y.

1998-01-01

We present a computer vision system for detection and recognition of traffic signs. Such systems are required to assist drivers and for guidance and control of autonomous vehicles on roads and city streets. For experiments we use sequences of digitized photographs and off-line analysis. The system contains four stages. The first is region segmentation based on color pixel classification, called SRSM. SRSM limits the search to regions of interest in the scene. Second, we use edge tracing to find parts of outer edges of signs which are circular or straight, corresponding to the geometrical shapes of traffic signs. The third step is geometrical analysis of the outer edge and preliminary recognition of each candidate region, which may be a potential traffic sign. The final step in recognition uses color combinations within each region and model matching. This system may be used for recognition of other types of objects, provided that the geometrical shape and color content remain reasonably constant. The method is reliable, easy to implement, and fast. This differs from the road signs recognition method in the PROMETEUS. The overall structure of the approach is sketched.
Biologically Inspired Visual Model With Preliminary Cognition and Active Attention Adjustment.

PubMed

Qiao, Hong; Xi, Xuanyang; Li, Yinlin; Wu, Wei; Li, Fengfu

2015-11-01

Recently, many computational models have been proposed to simulate visual cognition process. For example, the hierarchical Max-Pooling (HMAX) model was proposed according to the hierarchical and bottom-up structure of V1 to V4 in the ventral pathway of primate visual cortex, which could achieve position- and scale-tolerant recognition. In our previous work, we have introduced memory and association into the HMAX model to simulate visual cognition process. In this paper, we improve our theoretical framework by mimicking a more elaborate structure and function of the primate visual cortex. We will mainly focus on the new formation of memory and association in visual processing under different circumstances as well as preliminary cognition and active adjustment in the inferior temporal cortex, which are absent in the HMAX model. The main contributions of this paper are: 1) in the memory and association part, we apply deep convolutional neural networks to extract various episodic features of the objects since people use different features for object recognition. Moreover, to achieve a fast and robust recognition in the retrieval and association process, different types of features are stored in separated clusters and the feature binding of the same object is stimulated in a loop discharge manner and 2) in the preliminary cognition and active adjustment part, we introduce preliminary cognition to classify different types of objects since distinct neural circuits in a human brain are used for identification of various types of objects. Furthermore, active cognition adjustment of occlusion and orientation is implemented to the model to mimic the top-down effect in human cognition process. Finally, our model is evaluated on two face databases CAS-PEAL-R1 and AR. The results demonstrate that our model exhibits its efficiency on visual recognition process with much lower memory storage requirement and a better performance compared with the traditional purely computational methods.
Service-based analysis of biological pathways

PubMed Central

Zheng, George; Bouguettaya, Athman

2009-01-01

Background: Computer-based pathway discovery is concerned with two important objectives: pathway identification and analysis. Conventional mining and modeling approaches aimed at pathway discovery are often effective at achieving either objective, but not both. Such limitations can be effectively tackled leveraging a Web service-based modeling and mining approach. Results: Inspired by molecular recognitions and drug discovery processes, we developed a Web service mining tool, named PathExplorer, to discover potentially interesting biological pathways linking service models of biological processes. The tool uses an innovative approach to identify useful pathways based on graph-based hints and service-based simulation verifying user's hypotheses. Conclusion: Web service modeling of biological processes allows the easy access and invocation of these processes on the Web. Web service mining techniques described in this paper enable the discovery of biological pathways linking these process service models. Algorithms presented in this paper for automatically highlighting interesting subgraphs within an identified pathway network enable the user to formulate hypotheses, which can be tested using our simulation algorithm, which is also described in this paper. PMID:19796403
Modal-Power-Based Haptic Motion Recognition

NASA Astrophysics Data System (ADS)

Kasahara, Yusuke; Shimono, Tomoyuki; Kuwahara, Hiroaki; Sato, Masataka; Ohnishi, Kouhei

Motion recognition based on sensory information is important for robots that provide assistance to humans. Several studies have been carried out on motion recognition based on image information. However, human motion that involves contact with an object cannot be evaluated precisely by image-based recognition, because force information is very important for describing contact motion. In this paper, modal-power-based haptic motion recognition is proposed; modal power is considered to reveal information on both position and force and to be one of the defining features of human motion. A motion recognition algorithm based on linear discriminant analysis is proposed to distinguish between similar motions. Haptic information is extracted using a bilateral master-slave system. Then, the observed motion is decomposed in terms of primitive functions in a modal space. The experimental results show the effectiveness of the proposed method.
In search of a recognition memory engram

PubMed Central

Brown, M.W.; Banks, P.J.

2015-01-01

A large body of data from human and animal studies using psychological, recording, imaging, and lesion techniques indicates that recognition memory involves at least two separable processes: familiarity discrimination and recollection. Familiarity discrimination for individual visual stimuli seems to be effected by a system centred on the perirhinal cortex of the temporal lobe. The fundamental change that encodes prior occurrence within the perirhinal cortex is a reduction in the responses of neurones when a stimulus is repeated. Neuronal network modelling indicates that a system based on such a change in responsiveness is potentially highly efficient in information theoretic terms. A review is given of findings indicating that perirhinal cortex acts as a storage site for recognition memory of objects and that such storage depends upon processes producing synaptic weakening. PMID:25280908
Atoms of recognition in human and computer vision.

PubMed

Ullman, Shimon; Assif, Liav; Fetaya, Ethan; Harari, Daniel

2016-03-08

Discovering the visual features and representations used by the brain to recognize objects is a central problem in the study of vision. Recently, neural network models of visual object recognition, including biological and deep network models, have shown remarkable progress and have begun to rival human performance in some challenging tasks. These models are trained on image examples and learn to extract features and representations and to use them for categorization. It remains unclear, however, whether the representations and learning processes discovered by current models are similar to those used by the human visual system. Here we show, by introducing and using minimal recognizable images, that the human visual system uses features and processes that are not used by current models and that are critical for recognition. We found by psychophysical studies that at the level of minimal recognizable images a minute change in the image can have a drastic effect on recognition, thus identifying features that are critical for the task. Simulations then showed that current models cannot explain this sensitivity to precise feature configurations and, more generally, do not learn to recognize minimal images at a human level. The role of the features shown here is revealed uniquely at the minimal level, where the contribution of each feature is essential. A full understanding of the learning and use of such features will extend our understanding of visual recognition and its cortical mechanisms and will enhance the capacity of computational models to learn from visual experience and to deal with recognition and detailed image interpretation.
Invariant Visual Object and Face Recognition: Neural and Computational Bases, and a Model, VisNet

PubMed Central

Rolls, Edmund T.

2012-01-01

Neurophysiological evidence for invariant representations of objects and faces in the primate inferior temporal visual cortex is described. Then a computational approach to how invariant representations are formed in the brain is described that builds on the neurophysiology. A feature hierarchy model in which invariant representations can be built by self-organizing learning based on the temporal and spatial statistics of the visual input produced by objects as they transform in the world is described. VisNet can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or it can use spatial continuity in continuous spatial transformation learning which does not require a temporal trace. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size, and also lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects such as looming, rotation, and object-based movement. The model has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in, for example, spatial and object search tasks. The approach has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene. The approach has also been extended to provide, with an additional layer, for the development of representations of spatial scenes of the type found in the hippocampus. PMID:22723777
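
The "associative synaptic learning rule with a short-term memory trace" mentioned in this abstract can be sketched in a few lines. The specific form below (an exponentially decaying trace of the neuron's output that gates a Hebbian weight update) is a commonly cited formulation and is assumed here for illustration; the parameter names `alpha` and `eta`, their values, and the omission of weight normalization are choices of this sketch, not details taken from the abstract.

```python
def trace_rule_update(weights, inputs, output, prev_trace, alpha=0.1, eta=0.8):
    """Apply one step of a Hebbian update gated by a memory trace.

    trace = (1 - eta) * output + eta * prev_trace
    each weight grows by alpha * trace * input
    """
    trace = (1.0 - eta) * output + eta * prev_trace
    new_weights = [w + alpha * trace * x for w, x in zip(weights, inputs)]
    return new_weights, trace

# Two successive views of the same object activate different inputs.
weights, trace = trace_rule_update([0.0, 0.0], [1.0, 0.0], output=1.0, prev_trace=0.0)
weights, trace = trace_rule_update(weights, [0.0, 1.0], output=0.0, prev_trace=trace)
```

In the second step the neuron's instantaneous output is zero, yet the weight for the second input still increases because the trace carries over activity from the first view. That carry-over is how temporal continuity could bind different views of one object onto the same neuron.
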
Enhanced Gender Recognition System Using an Improved Histogram of Oriented Gradient (HOG) Feature from Quality Assessment of Visible Light and Thermal Images of the Human Body.

PubMed

Nguyen, Dat Tien; Park, Kang Ryoung

2016-07-21

With higher demand from users, surveillance systems are currently being designed to provide more information about the observed scene, such as the appearance of objects, types of objects, and other information extracted from detected objects. Although the recognition of gender of an observed human can be easily performed using human perception, it remains a difficult task when using computer vision system images. In this paper, we propose a new human gender recognition method that can be applied to surveillance systems based on quality assessment of human areas in visible light and thermal camera images. Our research is novel in the following two ways: First, we utilize the combination of visible light and thermal images of the human body for a recognition task based on quality assessment. We propose a quality measurement method to assess the quality of image regions so as to remove the effects of background regions in the recognition system. Second, by combining the features extracted using the histogram of oriented gradient (HOG) method and the measured qualities of image regions, we form a new image feature, called the weighted HOG (wHOG), which is used for efficient gender recognition. Experimental results show that our method produces more accurate estimation results than the state-of-the-art recognition method that uses human body images.
Enhanced Gender Recognition System Using an Improved Histogram of Oriented Gradient (HOG) Feature from Quality Assessment of Visible Light and Thermal Images of the Human Body

PubMed Central

Nguyen, Dat Tien; Park, Kang Ryoung

2016-01-01

With higher demand from users, surveillance systems are currently being designed to provide more information about the observed scene, such as the appearance of objects, types of objects, and other information extracted from detected objects. Although the recognition of gender of an observed human can be easily performed using human perception, it remains a difficult task when using computer vision system images. In this paper, we propose a new human gender recognition method that can be applied to surveillance systems based on quality assessment of human areas in visible light and thermal camera images. Our research is novel in the following two ways: First, we utilize the combination of visible light and thermal images of the human body for a recognition task based on quality assessment. We propose a quality measurement method to assess the quality of image regions so as to remove the effects of background regions in the recognition system. Second, by combining the features extracted using the histogram of oriented gradient (HOG) method and the measured qualities of image regions, we form a new image feature, called the weighted HOG (wHOG), which is used for efficient gender recognition. Experimental results show that our method produces more accurate estimation results than the state-of-the-art recognition method that uses human body images. PMID:27455264
Integration trumps selection in object recognition.

PubMed

Saarela, Toni P; Landy, Michael S

2015-03-30

Finding and recognizing objects is a fundamental task of vision. Objects can be defined by several "cues" (color, luminance, texture, etc.), and humans can integrate sensory cues to improve detection and recognition [1-3]. Cortical mechanisms fuse information from multiple cues [4], and shape-selective neural mechanisms can display cue invariance by responding to a given shape independent of the visual cue defining it [5-8]. Selective attention, in contrast, improves recognition by isolating a subset of the visual information [9]. Humans can select single features (red or vertical) within a perceptual dimension (color or orientation), giving faster and more accurate responses to items having the attended feature [10, 11]. Attention elevates neural responses and sharpens neural tuning to the attended feature, as shown by studies in psychophysics and modeling [11, 12], imaging [13-16], and single-cell and neural population recordings [17, 18]. Besides single features, attention can select whole objects [19-21]. Objects are among the suggested "units" of attention because attention to a single feature of an object causes the selection of all of its features [19-21]. Here, we pit integration against attentional selection in object recognition. We find, first, that humans can integrate information near optimally from several perceptual dimensions (color, texture, luminance) to improve recognition. They cannot, however, isolate a single dimension even when the other dimensions provide task-irrelevant, potentially conflicting information. For object recognition, it appears that there is mandatory integration of information from multiple dimensions of visual experience. The advantage afforded by this integration, however, comes at the expense of attentional selection. Copyright © 2015 Elsevier Ltd. All rights reserved.
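
The phrase "near optimally" in this abstract implies a benchmark for ideal cue combination. A standard signal detection benchmark, assumed here for illustration, is that the sensitivities (d') of independent cues with Gaussian noise combine as the square root of the sum of squares. The abstract does not state which model the authors used, and the function name below is hypothetical.

```python
import math

def optimal_combined_dprime(dprimes):
    """Ideal-observer sensitivity for independent cues.

    Assumes independent, equal-variance Gaussian noise on each cue, so the
    combined d' is the Euclidean norm of the single-cue d' values.
    """
    return math.sqrt(sum(d * d for d in dprimes))

# Example with made-up single-cue sensitivities for color, texture, and luminance.
combined = optimal_combined_dprime([1.0, 1.0, 1.0])
```

An observer whose measured multi-cue sensitivity approaches this value is integrating near optimally; one who could perfectly isolate a single cue would instead be unaffected by the others.
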
Integration trumps selection in object recognition

PubMed Central

Saarela, Toni P.; Landy, Michael S.

2015-01-01

Summary Finding and recognizing objects is a fundamental task of vision. Objects can be defined by several “cues” (color, luminance, texture etc.), and humans can integrate sensory cues to improve detection and recognition [1–3]. Cortical mechanisms fuse information from multiple cues [4], and shape-selective neural mechanisms can display cue-invariance by responding to a given shape independent of the visual cue defining it [5–8]. Selective attention, in contrast, improves recognition by isolating a subset of the visual information [9]. Humans can select single features (red or vertical) within a perceptual dimension (color or orientation), giving faster and more accurate responses to items having the attended feature [10,11]. Attention elevates neural responses and sharpens neural tuning to the attended feature, as shown by studies in psychophysics and modeling [11,12], imaging [13–16], and single-cell and neural population recordings [17,18]. Besides single features, attention can select whole objects [19–21]. Objects are among the suggested “units” of attention because attention to a single feature of an object causes the selection of all of its features [19–21]. Here, we pit integration against attentional selection in object recognition. We find, first, that humans can integrate information near-optimally from several perceptual dimensions (color, texture, luminance) to improve recognition. They cannot, however, isolate a single dimension even when the other dimensions provide task-irrelevant, potentially conflicting information. For object recognition, it appears that there is mandatory integration of information from multiple dimensions of visual experience. The advantage afforded by this integration, however, comes at the expense of attentional selection. PMID:25802154

Australian Recognition Framework Arrangements. Australia's National Training Framework.

ERIC Educational Resources Information Center

Australian National Training Authority, Brisbane.

This document explains the objectives, principles, standards, and protocols of the Australian Recognition Framework (ARF), which is a comprehensive approach to national recognition of vocational education and training (VET) that is based on a quality-assured approach to the registration of training organizations seeking to deliver training, assess…
Universal in vivo Textural Model for Human Skin based on Optical Coherence Tomograms.

PubMed

Adabi, Saba; Hosseinzadeh, Matin; Noei, Shahryar; Conforto, Silvia; Daveluy, Steven; Clayton, Anne; Mehregan, Darius; Nasiriavanaki, Mohammadreza

2017-12-20

Currently, diagnosis of skin diseases is based primarily on the visual pattern recognition skills and expertise of the physician observing the lesion. Even though dermatologists are trained to recognize patterns of morphology, it is still a subjective visual assessment. Tools for automated pattern recognition can provide objective information to support clinical decision-making. Noninvasive skin imaging techniques provide complementary information to the clinician. In recent years, optical coherence tomography (OCT) has become a powerful skin imaging technique. According to specific functional needs, skin architecture varies across different parts of the body, as do the textural characteristics in OCT images. There is, therefore, a critical need to systematically analyze OCT images from different body sites, to identify their significant qualitative and quantitative differences. Sixty-three optical and textural features extracted from OCT images of healthy and diseased skin are analyzed and, in conjunction with decision-theoretic approaches, used to create computational models of the diseases. We demonstrate that these models provide objective information to the clinician to assist in the diagnosis of abnormalities of cutaneous microstructure, and hence, aid in the determination of treatment. Specifically, we demonstrate the performance of this methodology on differentiating basal cell carcinoma (BCC) and squamous cell carcinoma (SCC) from healthy tissue.
Invariant object recognition based on the generalized discrete radon transform

NASA Astrophysics Data System (ADS)

Easley, Glenn R.; Colonna, Flavia

2004-04-01

We introduce a method for classifying objects based on special cases of the generalized discrete Radon transform. We adjust the transform and the corresponding ridgelet transform by means of circular shifting and a singular value decomposition (SVD) to obtain a translation, rotation and scaling invariant set of feature vectors. We then use a back-propagation neural network to classify the input feature vectors. We conclude with experimental results and compare these with other invariant recognition methods.
Tactile Recognition and Localization Using Object Models: The Case of Polyhedra on a Plane.

DTIC Science & Technology

1983-03-01

poor force resolution, but high spatial resolution. We feel that the viability of this recognition approach has important implications on the design of...of the touched object: 1. Surface point - On the basis of sensor readings, some points on the sensor can be identified as being in contact with...the sensor’s shape and location in space are known, one can determine the position of some point on the touched object, to within some uncertainty
Three-dimensional passive sensing photon counting for object classification

NASA Astrophysics Data System (ADS)

Yeom, Seokwon; Javidi, Bahram; Watson, Edward

2007-04-01

In this keynote address, we address three-dimensional (3D) distortion-tolerant object recognition using photon-counting integral imaging (II). A photon-counting linear discriminant analysis (LDA) is discussed for classification of photon-limited images. We develop a compact distortion-tolerant recognition system based on the multiple-perspective imaging of II. Experimental and simulation results have shown that a low level of photons is sufficient to classify out-of-plane rotated objects.
A new method of edge detection for object recognition

USGS Publications Warehouse

Maddox, Brian G.; Rhew, Benjamin

2004-01-01

Traditional edge detection systems function by returning every edge in an input image. This can result in a large amount of clutter and make certain vectorization algorithms less accurate. Accuracy problems can then have a large impact on automated object recognition systems that depend on edge information. A new method of directed edge detection can be used to limit the number of edges returned based on a particular feature. This results in a cleaner image that is easier for vectorization. Vectorized edges from this process could then feed an object recognition system where the edge data would also contain information as to what type of feature it bordered.
Lateral Entorhinal Cortex is Critical for Novel Object-Context Recognition

PubMed Central

Wilson, David IG; Langston, Rosamund F; Schlesiger, Magdalene I; Wagner, Monica; Watanabe, Sakurako; Ainge, James A

2013-01-01

Episodic memory incorporates information about specific events or occasions including spatial locations and the contextual features of the environment in which the event took place. It has been modeled in rats using spontaneous exploration of novel configurations of objects, their locations, and the contexts in which they are presented. While we have a detailed understanding of how spatial location is processed in the brain relatively little is known about where the nonspatial contextual components of episodic memory are processed. Initial experiments measured c-fos expression during an object-context recognition (OCR) task to examine which networks within the brain process contextual features of an event. Increased c-fos expression was found in the lateral entorhinal cortex (LEC; a major hippocampal afferent) during OCR relative to control conditions. In a subsequent experiment it was demonstrated that rats with lesions of LEC were unable to recognize object-context associations yet showed normal object recognition and normal context recognition. These data suggest that contextual features of the environment are integrated with object identity in LEC and demonstrate that recognition of such object-context associations requires the LEC. This is consistent with the suggestion that contextual features of an event are processed in LEC and that this information is combined with spatial information from medial entorhinal cortex to form episodic memory in the hippocampus. © 2013 Wiley Periodicals, Inc. PMID:23389958
Image-based automatic recognition of larvae

NASA Astrophysics Data System (ADS)

Sang, Ru; Yu, Guiying; Fan, Weijun; Guo, Tiantai

2010-08-01

Research on quarantine pest recognition has so far focused mainly on imagoes (adult insects). However, pests in their larval stage are latent, and larvae spread abroad much more easily with the circulation of agricultural and forest products. This paper presents larvae as new research objects, recognized by means of machine vision, image processing and pattern recognition. Applying color image segmentation to images of larvae preserves more visual information and improves the recognition rate. Because of its affine invariance, perspective invariance and brightness invariance, the scale-invariant feature transform (SIFT) is adopted for feature extraction. A neural network algorithm is used for pattern recognition, and automatic identification of larvae images is achieved with satisfactory results.
Model-Driven Study of Visual Memory

DTIC Science & Technology

2004-12-01

dimensional stimuli (synthetic human faces) afford important insights into episodic recognition memory. The results were well accommodated by a summed...the unusual properties of the z-transformed ROCs. 15. SUBJECT TERMS Memory, visual memory, computational model, human memory, faces, identity 16...3 Accomplishments/New Findings 3 Work on Objective One: Recognition Memory for Synthetic Faces. 3 Experiment 1
Critical object recognition in millimeter-wave images with robustness to rotation and scale.

PubMed

Mohammadzade, Hoda; Ghojogh, Benyamin; Faezi, Sina; Shabany, Mahdi

2017-06-01

Locating critical objects is crucial in various security applications and industries. For example, in security applications, such as in airports, these objects might be hidden or covered under shields or secret sheaths. Millimeter-wave images can be utilized to discover and recognize the critical objects out of the hidden cases without any health risk due to their non-ionizing features. However, millimeter-wave images usually have waves in and around the detected objects, making object recognition difficult. Thus, regular image processing and classification methods cannot be used for these images, and additional pre-processing and classification methods must be introduced. This paper proposes a novel pre-processing method for canceling rotation and scale using principal component analysis. In addition, a two-layer classification method is introduced and utilized for recognition. Moreover, a large dataset of millimeter-wave images is collected and created for experiments. Experimental results show that a typical classification method such as support vector machines can recognize 45.5% of one type of critical object at a 34.2% false alarm rate (FAR), which is very poor recognition performance. The same method within the proposed recognition framework achieves a 92.9% recognition rate at 0.43% FAR, which indicates a highly significant improvement. The significant contribution of this work is to introduce a new method for analyzing millimeter-wave images based on machine vision and learning approaches, which is not yet widely noted in the field of millimeter-wave image analysis.
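The abstract above does not say how principal component analysis is applied to cancel rotation and scale, so the following is only a minimal sketch of the general idea on 2-D point coordinates. The function name `pca_normalize` and the choice to divide by the standard deviation along the first principal axis are assumptions made for illustration, not the authors' method.

```python
import math

def pca_normalize(points):
    """Center 2-D points, rotate the first principal axis onto the x-axis,
    and scale so the variance along that axis is 1 (illustrative sketch)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    centered = [(x - mx, y - my) for x, y in points]

    # Entries of the 2x2 covariance matrix.
    cxx = sum(x * x for x, _ in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    cxy = sum(x * y for x, y in centered) / n

    # Orientation of the first principal axis and its variance (largest eigenvalue).
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    lam = 0.5 * (cxx + cyy) + math.hypot(0.5 * (cxx - cyy), cxy)
    scale = math.sqrt(lam)

    # Rotate by -theta, then divide by the spread along the principal axis.
    c, s = math.cos(theta), math.sin(theta)
    return [((c * x + s * y) / scale, (-s * x + c * y) / scale) for x, y in centered]
```

A principal axis has no preferred direction, so two normalized copies of the same shape can still differ by a 180-degree turn; a real pipeline would need an extra rule to resolve that ambiguity.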
Fat segmentation on chest CT images via fuzzy models

NASA Astrophysics Data System (ADS)

Tong, Yubing; Udupa, Jayaram K.; Wu, Caiyun; Pednekar, Gargi; Subramanian, Janani Rajan; Lederer, David J.; Christie, Jason; Torigian, Drew A.

2016-03-01

Quantification of fat throughout the body is vital for the study of many diseases. In the thorax, it is important for lung transplant candidates since obesity and being underweight are contraindications to lung transplantation given their associations with increased mortality. Common approaches for thoracic fat segmentation are all interactive in nature, requiring significant manual effort to draw the interfaces between fat and muscle with low efficiency and questionable repeatability. The goal of this paper is to explore a practical way for the segmentation of subcutaneous adipose tissue (SAT) and visceral adipose tissue (VAT) components of chest fat based on a recently developed body-wide automatic anatomy recognition (AAR) methodology. The AAR approach involves 3 main steps: building a fuzzy anatomy model of the body region involving all its major representative objects, recognizing objects in any given test image, and delineating the objects. We made several modifications to these steps to develop an effective solution to delineate SAT/VAT components of fat. Two new objects representing interfaces of SAT and VAT regions with other tissues, SatIn and VatIn are defined, rather than using directly the SAT and VAT components as objects for constructing the models. A hierarchical arrangement of these new and other reference objects is built to facilitate their recognition in the hierarchical order. Subsequently, accurate delineations of the SAT/VAT components are derived from these objects. Unenhanced CT images from 40 lung transplant candidates were utilized in experimentally evaluating this new strategy. Mean object location error achieved was about 2 voxels and delineation error in terms of false positive and false negative volume fractions were, respectively, 0.07 and 0.1 for SAT and 0.04 and 0.2 for VAT.
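The delineation errors quoted above are false positive and false negative volume fractions. The abstract does not give the exact normalization, so the sketch below assumes both counts are divided by the ground-truth volume; `volume_fractions` and the voxel-set representation are hypothetical choices for illustration.

```python
def volume_fractions(segmented, truth):
    """Return (FPVF, FNVF) for two collections of voxel coordinates.

    Assumption: both fractions are normalized by the ground-truth volume.
    """
    seg, ref = set(segmented), set(truth)
    if not ref:
        raise ValueError("ground truth must contain at least one voxel")
    false_pos = len(seg - ref)   # voxels labeled as object but not in the truth
    false_neg = len(ref - seg)   # truth voxels missed by the segmentation
    return false_pos / len(ref), false_neg / len(ref)
```

Under this convention, an FPVF of 0.07 means the falsely included voxels amount to 7% of the true object volume.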
Deep learning based hand gesture recognition in complex scenes

NASA Astrophysics Data System (ADS)

Ni, Zihan; Sang, Nong; Tan, Cheng

2018-03-01

Recently, region-based convolutional neural networks (R-CNNs) have achieved significant success in the field of object detection, but their accuracy is limited for small objects and for similar-looking objects, such as gestures. To solve this problem, we present an online hard example testing (OHET) technique that evaluates the confidence of the R-CNNs' outputs and regards outputs with low confidence as hard examples. In this paper, we propose a cascaded network to recognize gestures. First, we use the region-based fully convolutional neural network (R-FCN), which is capable of detecting small objects, to detect the gestures, and then use OHET to select the hard examples. To enhance the accuracy of gesture recognition, we re-classify the hard examples with a VGG-19 classification network to obtain the final output of the gesture recognition system. Comparative experiments with other methods show that the cascaded network combined with OHET reaches state-of-the-art results of 99.3% mAP on small and similar gestures in complex scenes.
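The cascade described above can be sketched as control flow: keep confident detections and send low-confidence ones to a second classifier. Everything named here is hypothetical: `cascade_classify`, the `reclassify` callback standing in for the second-stage network, and the 0.8 threshold are assumptions, since the abstract does not state how confidence is thresholded.

```python
def cascade_classify(detections, reclassify, threshold=0.8):
    """Re-label low-confidence detections with a second-stage classifier.

    detections: iterable of (region, label, confidence) from a first-stage detector.
    reclassify: callable mapping a region to a label (stand-in for the second stage).
    threshold:  assumed cut-off below which a detection counts as a hard example.
    """
    results = []
    for region, label, confidence in detections:
        if confidence < threshold:
            # Hard example: defer to the second-stage classifier.
            label = reclassify(region)
        results.append((region, label))
    return results
```

Only the hard examples pay the cost of the second network, which is the usual motivation for a cascade of this kind.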
Learning Distance Functions for Exemplar-Based Object Recognition

DTIC Science & Technology

2007-08-08

requires prior specific permission. Learning Distance Functions for Exemplar-Based Object Recognition by Andrea Lynn Frome B.S. (Mary Washington...fantastic advisor and advocate when I was at Mary Washington College and has since become a dear friend. Thank you, Dr. Bass, for continuing to stand...Antonio Torralba. Chapter 1. Introduction [Figure residue: axis labeled "Number of training examples per class"; second axis label truncated ("Mean...")]
Learning Distance Functions for Exemplar-Based Object Recognition

DTIC Science & Technology

2007-01-01

Learning Distance Functions for Exemplar-Based Object Recognition by Andrea Lynn Frome B.S. (Mary Washington College) 1996 A dissertation submitted...advisor and advocate when I was at Mary Washington College and has since become a dear friend. Thank you, Dr. Bass, for continuing to stand by my...Torralba. Chapter 1. Introduction [Figure residue: axis labeled "Number of training examples per class"; second axis label truncated ("Mean reco...")]
Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence

PubMed Central

Cichy, Radoslaw Martin; Khosla, Aditya; Pantazis, Dimitrios; Torralba, Antonio; Oliva, Aude

2016-01-01

The complex multi-stage architecture of cortical visual pathways provides the neural basis for efficient visual object recognition in humans. However, the stage-wise computations therein remain poorly understood. Here, we compared temporal (magnetoencephalography) and spatial (functional MRI) visual brain representations with representations in an artificial deep neural network (DNN) tuned to the statistics of real-world visual recognition. We showed that the DNN captured the stages of human visual processing in both time and space from early visual areas towards the dorsal and ventral streams. Further investigation of crucial DNN parameters revealed that while model architecture was important, training on real-world categorization was necessary to enforce spatio-temporal hierarchical relationships with the brain. Together our results provide an algorithmically informed view on the spatio-temporal dynamics of visual object recognition in the human visual brain. PMID:27282108
Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence.

PubMed

Cichy, Radoslaw Martin; Khosla, Aditya; Pantazis, Dimitrios; Torralba, Antonio; Oliva, Aude

2016-06-10

The complex multi-stage architecture of cortical visual pathways provides the neural basis for efficient visual object recognition in humans. However, the stage-wise computations therein remain poorly understood. Here, we compared temporal (magnetoencephalography) and spatial (functional MRI) visual brain representations with representations in an artificial deep neural network (DNN) tuned to the statistics of real-world visual recognition. We showed that the DNN captured the stages of human visual processing in both time and space from early visual areas towards the dorsal and ventral streams. Further investigation of crucial DNN parameters revealed that while model architecture was important, training on real-world categorization was necessary to enforce spatio-temporal hierarchical relationships with the brain. Together our results provide an algorithmically informed view on the spatio-temporal dynamics of visual object recognition in the human visual brain.
Strategies for memory-based decision making: Modeling behavioral and neural signatures within a cognitive architecture.

PubMed

Fechner, Hanna B; Pachur, Thorsten; Schooler, Lael J; Mehlhorn, Katja; Battal, Ceren; Volz, Kirsten G; Borst, Jelmer P

2016-12-01

How do people use memories to make inferences about real-world objects? We tested three strategies based on predicted patterns of response times and blood-oxygen-level-dependent (BOLD) responses: one strategy that relies solely on recognition memory, a second that retrieves additional knowledge, and a third, lexicographic (i.e., sequential) strategy, that considers knowledge conditionally on the evidence obtained from recognition memory. We implemented the strategies as computational models within the Adaptive Control of Thought-Rational (ACT-R) cognitive architecture, which allowed us to derive behavioral and neural predictions that we then compared to the results of a functional magnetic resonance imaging (fMRI) study in which participants inferred which of two cities is larger. Overall, versions of the lexicographic strategy, according to which knowledge about many but not all alternatives is searched, provided the best account of the joint patterns of response times and BOLD responses. These results provide insights into the interplay between recognition and additional knowledge in memory, hinting at an adaptive use of these two sources of information in decision making. The results highlight the usefulness of implementing models of decision making within a cognitive architecture to derive predictions on the behavioral and neural level. Copyright © 2016 Elsevier B.V. All rights reserved.
Experimental study on GMM-based speaker recognition

NASA Astrophysics Data System (ADS)

Ye, Wenxing; Wu, Dapeng; Nucci, Antonio

2010-04-01

Speaker recognition plays a very important role in the field of biometric security. In order to improve recognition performance, many pattern recognition techniques have been explored in the literature. Among these techniques, the Gaussian Mixture Model (GMM) has proved to be an effective statistical model for speaker recognition and is used in most state-of-the-art speaker recognition systems. The GMM represents the 'voice print' of a speaker by modeling the spectral characteristics of the speaker's speech signals. In this paper, we implement a speaker recognition system that consists of preprocessing, Mel-Frequency Cepstrum Coefficient (MFCC) based feature extraction, and GMM-based classification. We test our system with the TIDIGITS data set (325 speakers) and our own recordings of more than 200 speakers; our system achieves a 100% correct recognition rate. Moreover, we also test our system under the scenario in which training samples are from one language but test samples are from a different language; our system again achieves a 100% correct recognition rate, which indicates that our system is language independent.
Analysis and Recognition of Traditional Chinese Medicine Pulse Based on the Hilbert-Huang Transform and Random Forest in Patients with Coronary Heart Disease

PubMed Central

Wang, Yiqin; Yan, Hanxia; Yan, Jianjun; Yuan, Fengyin; Xu, Zhaoxia; Liu, Guoping; Xu, Wenjie

2015-01-01

Objective. This research provides objective and quantitative parameters of the traditional Chinese medicine (TCM) pulse conditions for distinguishing between patients with the coronary heart disease (CHD) and normal people by using the proposed classification approach based on Hilbert-Huang transform (HHT) and random forest. Methods. The energy and the sample entropy features were extracted by applying the HHT to TCM pulse by treating these pulse signals as time series. By using the random forest classifier, the extracted two types of features and their combination were, respectively, used as input data to establish classification model. Results. Statistical results showed that there were significant differences in the pulse energy and sample entropy between the CHD group and the normal group. Moreover, the energy features, sample entropy features, and their combination were inputted as pulse feature vectors; the corresponding average recognition rates were 84%, 76.35%, and 90.21%, respectively. Conclusion. The proposed approach could be appropriately used to analyze pulses of patients with CHD, which can lay a foundation for research on objective and quantitative criteria on disease diagnosis or Zheng differentiation. PMID:26180536
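Sample entropy, one of the two feature types named above, measures how often patterns that match over m points still match over m + 1 points. The sketch below follows the common SampEn(m, r) definition with a Chebyshev distance and an absolute tolerance r. The parameter defaults are assumptions, and the abstract does not state which settings were used on the Hilbert-Huang components.

```python
import math

def sample_entropy(series, m=2, r=0.2):
    """SampEn(m, r) = -ln(A / B) for a sequence of numbers.

    B counts template pairs of length m within tolerance r, and A counts
    pairs of length m + 1; self-matches are excluded. r is an absolute
    tolerance here (it is often set relative to the standard deviation).
    """
    n = len(series)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        raise ValueError("no matching templates; sample entropy is undefined")
    return -math.log(a / b)
```

A perfectly regular signal gives a value of zero, and less predictable signals give larger values.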
Analysis and Recognition of Traditional Chinese Medicine Pulse Based on the Hilbert-Huang Transform and Random Forest in Patients with Coronary Heart Disease.

PubMed

Guo, Rui; Wang, Yiqin; Yan, Hanxia; Yan, Jianjun; Yuan, Fengyin; Xu, Zhaoxia; Liu, Guoping; Xu, Wenjie

2015-01-01

Objective. This research provides objective and quantitative parameters of the traditional Chinese medicine (TCM) pulse conditions for distinguishing between patients with the coronary heart disease (CHD) and normal people by using the proposed classification approach based on Hilbert-Huang transform (HHT) and random forest. Methods. The energy and the sample entropy features were extracted by applying the HHT to TCM pulse by treating these pulse signals as time series. By using the random forest classifier, the extracted two types of features and their combination were, respectively, used as input data to establish classification model. Results. Statistical results showed that there were significant differences in the pulse energy and sample entropy between the CHD group and the normal group. Moreover, the energy features, sample entropy features, and their combination were inputted as pulse feature vectors; the corresponding average recognition rates were 84%, 76.35%, and 90.21%, respectively. Conclusion. The proposed approach could be appropriately used to analyze pulses of patients with CHD, which can lay a foundation for research on objective and quantitative criteria on disease diagnosis or Zheng differentiation.

Describing, using 'recognition cones'. [parallel-series model with English-like computer program]

NASA Technical Reports Server (NTRS)

Uhr, L.

1973-01-01

A parallel-serial 'recognition cone' model is examined, taking into account the model's ability to describe scenes of objects. An actual program is presented in an English-like language. The concept of a 'description' is discussed together with possible types of descriptive information. Questions regarding the level and the variety of detail are considered along with approaches for improving the serial representations of parallel systems.
Position estimation and driving of an autonomous vehicle by monocular vision

NASA Astrophysics Data System (ADS)

Hanan, Jay C.; Kayathi, Pavan; Hughlett, Casey L.

2007-04-01

Automatic adaptive tracking in real-time for target recognition provided autonomous control of a scale model electric truck. The two-wheel drive truck was modified as an autonomous rover test-bed for vision based guidance and navigation. Methods were implemented to monitor tracking error and ensure a safe, accurate arrival at the intended science target. Some methods are situation independent relying only on the confidence error of the target recognition algorithm. Other methods take advantage of the scenario of combined motion and tracking to filter out anomalies. In either case, only a single calibrated camera was needed for position estimation. Results from real-time autonomous driving tests on the JPL simulated Mars yard are presented. Recognition error was often situation dependent. For the rover case, the background was in motion and may be characterized to provide visual cues on rover travel such as rate, pitch, roll, and distance to objects of interest or hazards. Objects in the scene may be used as landmarks, or waypoints, for such estimations. As objects are approached, their scale increases and their orientation may change. In addition, particularly on rough terrain, these orientation and scale changes may be unpredictable. Feature extraction combined with the neural network algorithm was successful in providing visual odometry in the simulated Mars environment.
In search of a recognition memory engram.

PubMed

Brown, M W; Banks, P J

2015-03-01

A large body of data from human and animal studies using psychological, recording, imaging, and lesion techniques indicates that recognition memory involves at least two separable processes: familiarity discrimination and recollection. Familiarity discrimination for individual visual stimuli seems to be effected by a system centred on the perirhinal cortex of the temporal lobe. The fundamental change that encodes prior occurrence within the perirhinal cortex is a reduction in the responses of neurones when a stimulus is repeated. Neuronal network modelling indicates that a system based on such a change in responsiveness is potentially highly efficient in information theoretic terms. A review is given of findings indicating that perirhinal cortex acts as a storage site for recognition memory of objects and that such storage depends upon processes producing synaptic weakening. Copyright © 2014 The Authors. Published by Elsevier Ltd. All rights reserved.
Effective connectivity of visual word recognition and homophone orthographic errors

PubMed Central

Guàrdia-Olmos, Joan; Peró-Cebollero, Maribel; Zarabozo-Hurtado, Daniel; González-Garrido, Andrés A.; Gudayol-Ferré, Esteve

2015-01-01

The study of orthographic errors in a transparent language like Spanish is an important topic in relation to writing acquisition. The development of neuroimaging techniques, particularly functional magnetic resonance imaging (fMRI), has enabled the study of such relationships between brain areas. The main objective of the present study was to explore the patterns of effective connectivity by processing pseudohomophone orthographic errors among subjects with high and low spelling skills. Two groups of 12 Mexican subjects each, matched by age, were formed based on their results in a series of ad hoc spelling-related out-scanner tests: a high spelling skills (HSSs) group and a low spelling skills (LSSs) group. During the fMRI session, two experimental tasks were applied (spelling recognition task and visuoperceptual recognition task). Regions of Interest and their signal values were obtained for both tasks. Based on these values, structural equation models (SEMs) were obtained for each group of spelling competence (HSS and LSS) and task through maximum likelihood estimation, and the model with the best fit was chosen in each case. Likewise, dynamic causal models (DCMs) were estimated for all the conditions across tasks and groups. The HSS group’s SEM results suggest that, in the spelling recognition task, the right middle temporal gyrus, and, to a lesser extent, the left parahippocampal gyrus receive most of the significant effects, whereas the DCM results in the visuoperceptual recognition task show less complex effects, but still congruent with the previous results, with an important role in several areas. In general, these results are consistent with the major findings in partial studies about linguistic activities but they are the first analyses of statistical effective brain connectivity in transparent languages. PMID:26042070
The roles of perceptual and conceptual information in face recognition.

PubMed

Schwartz, Linoy; Yovel, Galit

2016-11-01

The representation of familiar objects is comprised of perceptual information about their visual properties as well as the conceptual knowledge that we have about them. What is the relative contribution of perceptual and conceptual information to object recognition? Here, we examined this question by designing a face familiarization protocol during which participants were either exposed to rich perceptual information (viewing each face in different angles and illuminations) or with conceptual information (associating each face with a different name). Both conditions were compared with single-view faces presented with no labels. Recognition was tested on new images of the same identities to assess whether learning generated a view-invariant representation. Results showed better recognition of novel images of the learned identities following association of a face with a name label, but no enhancement following exposure to multiple face views. Whereas these findings may be consistent with the role of category learning in object recognition, face recognition was better for labeled faces only when faces were associated with person-related labels (name, occupation), but not with person-unrelated labels (object names or symbols). These findings suggest that association of meaningful conceptual information with an image shifts its representation from an image-based percept to a view-invariant concept. They further indicate that the role of conceptual information should be considered to account for the superior recognition that we have for familiar faces and objects. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Target recognition of log-polar ladar range images using moment invariants

NASA Astrophysics Data System (ADS)

Xia, Wenze; Han, Shaokun; Cao, Jie; Yu, Haoyong

2017-01-01

The ladar range image has received considerable attention in the automatic target recognition field. However, previous research does not cover target recognition using log-polar ladar range images. Therefore, we construct a target recognition system based on log-polar ladar range images in this paper. In this system, combined moment invariants and a backpropagation neural network are selected as the shape descriptor and shape classifier, respectively. In order to fully analyze the effect of the log-polar sampling pattern on the recognition result, several comparative experiments based on simulated and real range images are carried out. Several important conclusions are drawn: (i) if combined moments are computed directly from log-polar range images, the translation, rotation and scaling invariance properties of the combined moments no longer hold; (ii) when the object is located in the center of the field of view, the recognition rate of log-polar range images is less sensitive to changes in the field of view; (iii) as the object position changes from the center to the edge of the field of view, the recognition performance of log-polar range images declines dramatically; (iv) log-polar range images have better noise robustness than Cartesian range images. Finally, we suggest dividing the field of view into a recognition area and a searching area in real applications.
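To illustrate what a moment invariant is, the sketch below computes the first Hu invariant, the sum of the two normalized second-order central moments, on an ordinary Cartesian pixel grid. It is not the combined-moment descriptor of the paper, whose exact form the abstract does not give, and `hu_phi1` is a hypothetical helper name.

```python
def hu_phi1(image):
    """First Hu moment invariant (eta20 + eta02) of a 2-D image given as rows of intensities."""
    m00 = sum(v for row in image for v in row)
    if m00 == 0:
        raise ValueError("image has zero total intensity")

    # Intensity-weighted centroid.
    cx = sum(x * v for row in image for x, v in enumerate(row)) / m00
    cy = sum(y * v for y, row in enumerate(image) for v in row) / m00

    # Second-order central moments about the centroid.
    mu20 = sum((x - cx) ** 2 * v for row in image for x, v in enumerate(row))
    mu02 = sum((y - cy) ** 2 * v for y, row in enumerate(image) for v in row)

    # Normalizing by m00 ** 2 gives eta20 + eta02.
    return (mu20 + mu02) / m00 ** 2
```

On a Cartesian grid this value is unchanged when the shape is shifted or turned by 90 degrees. Log-polar sampling weights and positions pixels differently, which is consistent with conclusion (i) that the invariance is lost when moments are computed directly from log-polar images.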
Progestogens’ effects and mechanisms for object recognition memory across the lifespan

PubMed Central

Walf, Alicia A.; Koonce, Carolyn J.; Frye, Cheryl A.

2016-01-01

This review explores the effects of female reproductive hormones, estrogens and progestogens, with a focus on progesterone and allopregnanolone, on object memory. Progesterone and its metabolites, in particular allopregnanolone, exert various effects on both cognitive and non-mnemonic functions in females. The well-known object recognition task is a valuable experimental paradigm that can be used to determine the effects and mechanisms of progestogens for mnemonic effects across the lifespan, which will be discussed herein. In this task there is little test-decay when different objects are used as targets and baseline valence for objects is controlled. This allows repeated testing, within-subjects designs, and longitudinal assessments, which aid understanding of changes in hormonal milieu. Objects are not aversive or food-based, which are hormone-sensitive factors. This review focuses on published data from our laboratory, and others, using the object recognition task in rodents to assess the role and mechanisms of progestogens throughout the lifespan. Improvements in object recognition performance of rodents are often associated with higher hormone levels in the hippocampus and prefrontal cortex during natural cycles, with hormone replacement following ovariectomy in young animals, or with aging. The capacity for reversal of age- and reproductive senescence-related decline in cognitive performance, and changes in neural plasticity that may be dissociated from peripheral effects with such decline, are discussed. The focus here will be on the effects of brain-derived factors, such as the neurosteroid, allopregnanolone, and other hormones, for enhancing object recognition across the lifespan. PMID:26235328
Modelling of DNA-protein recognition

NASA Technical Reports Server (NTRS)

Rein, R.; Garduno, R.; Colombano, S.; Nir, S.; Haydock, K.; Macelroy, R. D.

1980-01-01

Computer model-building procedures using stereochemical principles together with theoretical energy calculations appear to be, at this stage, the most promising route toward the elucidation of DNA-protein binding schemes and recognition principles. A review of models and bonding principles is conducted and approaches to modeling are considered, taking into account possible di-hydrogen-bonding schemes between a peptide and a base (or a base pair) of a double-stranded nucleic acid in the major groove, aspects of computer graphic modeling, and a search for isogeometric helices. The energetics of recognition complexes is discussed and several models for peptide DNA recognition are presented.
Invariant recognition drives neural representations of action sequences

PubMed Central

Poggio, Tomaso

2017-01-01

Recognizing the actions of others from visual stimuli is a crucial aspect of human perception that allows individuals to respond to social cues. Humans are able to discriminate between similar actions despite transformations, like changes in viewpoint or actor, that substantially alter the visual appearance of a scene. This ability to generalize across complex transformations is a hallmark of human visual intelligence. Advances in understanding action recognition at the neural level have not always translated into precise accounts of the computational principles underlying what representations of action sequences are constructed by human visual cortex. Here we test the hypothesis that invariant action discrimination might fill this gap. Recently, the study of artificial systems for static object perception has produced models, Convolutional Neural Networks (CNNs), that achieve human level performance in complex discriminative tasks. Within this class, architectures that better support invariant object recognition also produce image representations that better match those implied by human and primate neural data. However, whether these models produce representations of action sequences that support recognition across complex transformations and closely follow neural representations of actions remains unknown. Here we show that spatiotemporal CNNs accurately categorize video stimuli into action classes, and that deliberate model modifications that improve performance on an invariant action recognition task lead to data representations that better match human neural recordings. Our results support our hypothesis that performance on invariant discrimination dictates the neural representations of actions computed in the brain. These results broaden the scope of the invariant recognition framework for understanding visual intelligence from perception of inanimate objects and faces in static images to the study of human perception of action sequences. PMID:29253864
On techniques for angle compensation in nonideal iris recognition.

PubMed

Schuckers, Stephanie A C; Schmid, Natalia A; Abhyankar, Aditya; Dorairaj, Vivekanand; Boyce, Christopher K; Hornak, Lawrence A

2007-10-01

The popularity of the iris biometric has grown considerably over the past two to three years. Most research has been focused on the development of new iris processing and recognition algorithms for frontal view iris images. However, a few challenging directions in iris research have been identified, including processing of a nonideal iris and iris at a distance. In this paper, we describe two nonideal iris recognition systems and analyze their performance. The word "nonideal" is used in the sense of compensating for off-angle occluded iris images. The system is designed to process nonideal iris images in two steps: 1) compensation for off-angle gaze direction and 2) processing and encoding of the rotated iris image. Two approaches are presented to account for angular variations in the iris images. In the first approach, we use Daugman's integrodifferential operator as an objective function to estimate the gaze direction. After the angle is estimated, the off-angle iris image undergoes geometric transformations involving the estimated angle and is further processed as if it were a frontal view image. The encoding technique developed for a frontal image is based on the application of the global independent component analysis. The second approach uses an angular deformation calibration model. The angular deformations are modeled, and calibration parameters are calculated. The proposed method consists of a closed-form solution, followed by an iterative optimization procedure. The images are projected on the plane closest to the base calibrated plane. Biorthogonal wavelets are used for encoding to perform iris recognition. We use a special dataset of the off-angle iris images to quantify the performance of the designed systems. A series of receiver operating characteristics demonstrate various effects on the performance of the nonideal-iris-based recognition system.
The posterior parietal cortex in recognition memory: a neuropsychological study.

PubMed

Haramati, Sharon; Soroker, Nachum; Dudai, Yadin; Levy, Daniel A

2008-01-01

Several recent functional neuroimaging studies have reported robust bilateral activation (L>R) in lateral posterior parietal cortex and precuneus during recognition memory retrieval tasks. It has not yet been determined what cognitive processes are represented by those activations. In order to examine whether parietal lobe-based processes are necessary for basic episodic recognition abilities, we tested a group of 17 first-incident CVA patients whose cortical damage included (but was not limited to) extensive unilateral posterior parietal lesions. These patients performed a series of tasks that yielded parietal activations in previous fMRI studies: yes/no recognition judgments on visual words and on colored object pictures and identifiable environmental sounds. We found that patients with left hemisphere lesions were not impaired compared to controls in any of the tasks. Patients with right hemisphere lesions were not significantly impaired in memory for visual words, but were impaired in recognition of object pictures and sounds. Two lesion-behavior analyses, area-based correlations and voxel-based lesion symptom mapping (VLSM), indicate that these impairments resulted from extra-parietal damage, specifically to frontal and lateral temporal areas. These findings suggest that extensive parietal damage does not impair recognition performance. We suggest that parietal activations recorded during recognition memory tasks might reflect peri-retrieval processes, such as the storage of retrieved memoranda in a working memory buffer for further cognitive processing.
Bimodal benefits on objective and subjective outcomes for adult cochlear implant users.

PubMed

Heo, Ji-Hye; Lee, Jae-Hee; Lee, Won-Sang

2013-09-01

Given that only a few studies have focused on the bimodal benefits on objective and subjective outcomes and emphasized the importance of individual data, the present study aimed to measure the bimodal benefits on the objective and subjective outcomes for adults with cochlear implant. Fourteen listeners with bimodal devices were tested on the localization and recognition abilities using environmental sounds, 1-talker, and 2-talker speech materials. The localization ability was measured through an 8-loudspeaker array. For the recognition measures, listeners were asked to repeat the sentences or say the environmental sounds the listeners heard. As a subjective questionnaire, three domains of Korean-version of Speech, Spatial, Qualities of Hearing scale (K-SSQ) were used to explore any relationships between objective and subjective outcomes. Based on the group-mean data, the bimodal hearing enhanced both localization and recognition regardless of test material. However, the inter- and intra-subject variability appeared to be large across test materials for both localization and recognition abilities. Correlation analyses revealed that the relationships were not always consistent between the objective outcomes and the subjective self-reports with bimodal devices. Overall, this study supports significant bimodal advantages on localization and recognition measures, yet the large individual variability in bimodal benefits should be considered carefully for the clinical assessment as well as counseling. The discrepant relations between objective and subjective results suggest that the bimodal benefits in traditional localization or recognition measures might not necessarily correspond to the self-reported subjective advantages in everyday listening environments.
The cognitive structural approach for image restoration

NASA Astrophysics Data System (ADS)

Mardare, Igor; Perju, Veacheslav; Casasent, David

2008-03-01

This work analyzes the important and topical problem of restoring defective images of scenes. The proposed approach restores scenes with a system that reproduces phenomena of human intelligence used for the restoration-recognition of images. Cognitive models of the restoration process are elaborated. The models are realized by intellectual processors constructed on the basis of neural networks and associative memory, using the neural network simulator NNToolbox from MATLAB 7.0. The models provide restoration and semantic designing of images of scenes from defective images of the separate objects.
Image understanding and the man-machine interface II; Proceedings of the Meeting, Los Angeles, CA, Jan. 17, 18, 1989

NASA Technical Reports Server (NTRS)

Barrett, Eamon B. (Editor); Pearson, James J. (Editor)

1989-01-01

Image understanding concepts and models, image understanding systems and applications, advanced digital processors and software tools, and advanced man-machine interfaces are among the topics discussed. Particular papers are presented on such topics as neural networks for computer vision, object-based segmentation and color recognition in multispectral images, the application of image algebra to image measurement and feature extraction, and the integration of modeling and graphics to create an infrared signal processing test bed.
Fusion of Multiple Sensing Modalities for Machine Vision

DTIC Science & Technology

1994-05-31

Modeling of Non-Homogeneous 3-D Objects for Thermal and Visual Image Synthesis," Pattern Recognition, in press. [11] Nair, Dinesh, and J. K. Aggarwal...20th AIPR Workshop: Computer Vision--Meeting the Challenges, McLean, Virginia, October 1991. Nair, Dinesh, and J. K. Aggarwal, "An Object Recognition...Computer Engineering August 1992 Sunil Gupta Ph.D. Student Mohan Kumar M.S. Student Sandeep Kumar M.S. Student Xavier Lebegue Ph.D., Computer
Reconstruction of audio waveforms from spike trains of artificial cochlea models

PubMed Central

Zai, Anja T.; Bhargava, Saurabh; Mesgarani, Nima; Liu, Shih-Chii

2015-01-01

Spiking cochlea models describe the analog processing and spike generation process within the biological cochlea. Reconstructing the audio input from the artificial cochlea spikes is therefore useful for understanding the fidelity of the information preserved in the spikes. The reconstruction process is challenging particularly for spikes from the mixed signal (analog/digital) integrated circuit (IC) cochleas because of multiple non-linearities in the model and the additional variance caused by random transistor mismatch. This work proposes an offline method for reconstructing the audio input from spike responses of both a particular spike-based hardware model called the AEREAR2 cochlea and an equivalent software cochlea model. This method was previously used to reconstruct the auditory stimulus based on the peri-stimulus histogram of spike responses recorded in the ferret auditory cortex. The reconstructed audio from the hardware cochlea is evaluated against an analogous software model using objective measures of speech quality and intelligibility; and further tested in a word recognition task. The reconstructed audio under low signal-to-noise (SNR) conditions (SNR < –5 dB) gives a better classification performance than the original SNR input in this word recognition task. PMID:26528113
A biologically inspired neural network model to transformation invariant object recognition

NASA Astrophysics Data System (ADS)

Iftekharuddin, Khan M.; Li, Yaqin; Siddiqui, Faraz

2007-09-01

Transformation invariant image recognition has been an active research area due to its widespread applications in a variety of fields such as military operations, robotics, medical practices, geographic scene analysis, and many others. The primary goal for this research is detection of objects in the presence of image transformations such as changes in resolution, rotation, translation, scale and occlusion. We investigate a biologically-inspired neural network (NN) model for such transformation-invariant object recognition. In a classical training-testing setup for NN, the performance is largely dependent on the range of transformation or orientation involved in training. However, an even more serious dilemma is that there may not be enough training data available for successful learning or even no training data at all. To alleviate this problem, a biologically inspired reinforcement learning (RL) approach is proposed. In this paper, the RL approach is explored for object recognition with different types of transformations such as changes in scale, size, resolution and rotation. The RL is implemented in an adaptive critic design (ACD) framework, which approximates the neuro-dynamic programming of an action network and a critic network, respectively. Two ACD algorithms such as Heuristic Dynamic Programming (HDP) and Dual Heuristic dynamic Programming (DHP) are investigated to obtain transformation invariant object recognition. The two learning algorithms are evaluated statistically using simulated transformations in images as well as with a large-scale UMIST face database with pose variations. In the face database authentication case, the 90° out-of-plane rotation of faces from 20 different subjects in the UMIST database is used. Our simulations show promising results for both designs for transformation-invariant object recognition and authentication of faces. 
Comparing the two algorithms, DHP outperforms HDP in learning capability, as DHP generally takes fewer steps to perform a successful recognition task. Further, the residual critic error in DHP is generally smaller than that of HDP, and DHP achieves a 100% success rate more frequently than HDP for individual objects/subjects. On the other hand, HDP is more robust than DHP in terms of success rate across the database when applied in a stochastic and uncertain environment, and DHP requires more computational time.
Facial Expression Influences Face Identity Recognition During the Attentional Blink

PubMed Central

2014-01-01

Emotional stimuli (e.g., negative facial expressions) enjoy prioritized memory access when task relevant, consistent with their ability to capture attention. Whether emotional expression also impacts on memory access when task-irrelevant is important for arbitrating between feature-based and object-based attentional capture. Here, the authors address this question in 3 experiments using an attentional blink task with face photographs as first and second target (T1, T2). They demonstrate reduced neutral T2 identity recognition after angry or happy T1 expression, compared to neutral T1, and this supports attentional capture by a task-irrelevant feature. Crucially, after neutral T1, T2 identity recognition was enhanced and not suppressed when T2 was angry—suggesting that attentional capture by this task-irrelevant feature may be object-based and not feature-based. As an unexpected finding, both angry and happy facial expressions suppress memory access for competing objects, but only angry facial expression enjoyed privileged memory access. This could imply that these 2 processes are relatively independent from one another. PMID:25286076
Facial expression influences face identity recognition during the attentional blink.

PubMed

Bach, Dominik R; Schmidt-Daffy, Martin; Dolan, Raymond J

2014-12-01

Emotional stimuli (e.g., negative facial expressions) enjoy prioritized memory access when task relevant, consistent with their ability to capture attention. Whether emotional expression also impacts on memory access when task-irrelevant is important for arbitrating between feature-based and object-based attentional capture. Here, the authors address this question in 3 experiments using an attentional blink task with face photographs as first and second target (T1, T2). They demonstrate reduced neutral T2 identity recognition after angry or happy T1 expression, compared to neutral T1, and this supports attentional capture by a task-irrelevant feature. Crucially, after neutral T1, T2 identity recognition was enhanced and not suppressed when T2 was angry, suggesting that attentional capture by this task-irrelevant feature may be object-based and not feature-based. As an unexpected finding, both angry and happy facial expressions suppress memory access for competing objects, but only angry facial expression enjoyed privileged memory access. This could imply that these 2 processes are relatively independent from one another.
Evaluating structural pattern recognition for handwritten math via primitive label graphs

NASA Astrophysics Data System (ADS)

Zanibbi, Richard; Mouchère, Harold; Viard-Gaudin, Christian

2013-01-01

Currently, structural pattern recognizer evaluations compare graphs of detected structure to target structures (i.e. ground truth) using recognition rates, recall and precision for object segmentation, classification and relationships. In document recognition, these target objects (e.g. symbols) are frequently comprised of multiple primitives (e.g. connected components, or strokes for online handwritten data), but current metrics do not characterize errors at the primitive level, from which object-level structure is obtained. Primitive label graphs are directed graphs defined over primitives and primitive pairs. We define new metrics obtained by Hamming distances over label graphs, which allow classification, segmentation and parsing errors to be characterized separately, or using a single measure. Recall and precision for detected objects may also be computed directly from label graphs. We illustrate the new metrics by comparing a new primitive-level evaluation to the symbol-level evaluation performed for the CROHME 2012 handwritten math recognition competition. A Python-based set of utilities for evaluating, visualizing and translating label graphs is publicly available.
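The Hamming-distance idea over label graphs described above can be sketched as follows. This is a simplified illustration, not the published tool: it assumes both graphs are defined over the same primitives, stores node and edge labels in plain dictionaries, and uses a hypothetical placeholder label `NO_EDGE` for primitive pairs with no relationship. The labels `"*"` (same-symbol merge) and `"Right"` in the example are likewise illustrative.

```python
from itertools import permutations

NO_EDGE = "_"  # hypothetical label meaning "no relationship between this pair"

def label_graph_distance(nodes_a, edges_a, nodes_b, edges_b):
    """Return (node_errors, edge_errors, total) between two label graphs.

    nodes_*: dict mapping primitive id -> class label
    edges_*: dict mapping (src_id, dst_id) -> relationship label
    Both graphs must be defined over the same set of primitives.
    """
    if nodes_a.keys() != nodes_b.keys():
        raise ValueError("label graphs must share the same primitives")
    # Disagreeing primitive labels capture classification errors.
    node_errors = sum(nodes_a[p] != nodes_b[p] for p in nodes_a)
    # Disagreeing labels on ordered primitive pairs capture
    # segmentation and relationship (parsing) errors.
    edge_errors = sum(
        edges_a.get(pair, NO_EDGE) != edges_b.get(pair, NO_EDGE)
        for pair in permutations(nodes_a, 2)
    )
    return node_errors, edge_errors, node_errors + edge_errors

# Ground truth for "2x": stroke s1 is "2"; strokes s2 and s3 together form "x".
truth_nodes = {"s1": "2", "s2": "x", "s3": "x"}
truth_edges = {("s2", "s3"): "*", ("s3", "s2"): "*",
               ("s1", "s2"): "Right", ("s1", "s3"): "Right"}
# Recognizer output: the two strokes of "x" were split into ")" and "(".
out_nodes = {"s1": "2", "s2": ")", "s3": "("}
out_edges = {("s1", "s2"): "Right", ("s1", "s3"): "Right",
             ("s2", "s3"): "Right"}

result = label_graph_distance(truth_nodes, truth_edges, out_nodes, out_edges)
```

Keeping the node and edge counts separate is what lets classification errors be reported apart from segmentation and parsing errors, while their sum gives a single overall measure.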

Bimodal Benefits on Objective and Subjective Outcomes for Adult Cochlear Implant Users

PubMed Central

Heo, Ji-Hye; Lee, Won-Sang

2013-01-01

Background and Objectives: Given that only a few studies have focused on the bimodal benefits on objective and subjective outcomes and emphasized the importance of individual data, the present study aimed to measure the bimodal benefits on the objective and subjective outcomes for adults with cochlear implant. Subjects and Methods: Fourteen listeners with bimodal devices were tested on the localization and recognition abilities using environmental sounds, 1-talker, and 2-talker speech materials. The localization ability was measured through an 8-loudspeaker array. For the recognition measures, listeners were asked to repeat the sentences or say the environmental sounds the listeners heard. As a subjective questionnaire, three domains of Korean-version of Speech, Spatial, Qualities of Hearing scale (K-SSQ) were used to explore any relationships between objective and subjective outcomes. Results: Based on the group-mean data, the bimodal hearing enhanced both localization and recognition regardless of test material. However, the inter- and intra-subject variability appeared to be large across test materials for both localization and recognition abilities. Correlation analyses revealed that the relationships were not always consistent between the objective outcomes and the subjective self-reports with bimodal devices. Conclusions: Overall, this study supports significant bimodal advantages on localization and recognition measures, yet the large individual variability in bimodal benefits should be considered carefully for the clinical assessment as well as counseling. The discrepant relations between objective and subjective results suggest that the bimodal benefits in traditional localization or recognition measures might not necessarily correspond to the self-reported subjective advantages in everyday listening environments. PMID:24653909
Short- and long-term effects of nicotine and the histone deacetylase inhibitor phenylbutyrate on novel object recognition in zebrafish.

PubMed

Faillace, M P; Pisera-Fuster, A; Medrano, M P; Bejarano, A C; Bernabeu, R O

2017-03-01

Zebrafish have a sophisticated color- and shape-sensitive visual system, so we examined color cue-based novel object recognition in zebrafish. We evaluated preference in the absence or presence of drugs that affect attention and memory retention in rodents: nicotine and the histone deacetylase inhibitor (HDACi) phenylbutyrate (PhB). The objective of this study was to evaluate whether nicotine and PhB affect innate preferences of zebrafish for familiar and novel objects after short- and long-retention intervals. We developed modified object recognition (OR) tasks using neutral novel and familiar objects in different colors. We also tested objects which differed with respect to the exploratory behavior they elicited from naïve zebrafish. Zebrafish showed an innate preference for exploring red or green objects rather than yellow or blue objects. Zebrafish were better at discriminating color changes than changes in object shape or size. Nicotine significantly enhanced or changed short-term innate novel object preference whereas PhB had similar effects when preference was assessed 24 h after training. Analysis of other zebrafish behaviors corroborated these results. Zebrafish were innately reluctant or prone to explore colored novel objects, so drug effects on innate preference for objects can be evaluated changing the color of objects with a simple geometry. Zebrafish exhibited recognition memory for novel objects with similar innate significance. Interestingly, nicotine and PhB significantly modified innate object preference.
Bidirectional Modulation of Recognition Memory

PubMed Central

Ho, Jonathan W.; Poeta, Devon L.; Jacobson, Tara K.; Zolnik, Timothy A.; Neske, Garrett T.; Connors, Barry W.

2015-01-01

Perirhinal cortex (PER) has a well established role in the familiarity-based recognition of individual items and objects. For example, animals and humans with perirhinal damage are unable to distinguish familiar from novel objects in recognition memory tasks. In the normal brain, perirhinal neurons respond to novelty and familiarity by increasing or decreasing firing rates. Recent work also implicates oscillatory activity in the low-beta and low-gamma frequency bands in sensory detection, perception, and recognition. Using optogenetic methods in a spontaneous object exploration (SOR) task, we altered recognition memory performance in rats. In the SOR task, normal rats preferentially explore novel images over familiar ones. We modulated exploratory behavior in this task by optically stimulating channelrhodopsin-expressing perirhinal neurons at various frequencies while rats looked at novel or familiar 2D images. Stimulation at 30–40 Hz during looking caused rats to treat a familiar image as if it were novel by increasing time looking at the image. Stimulation at 30–40 Hz was not effective in increasing exploration of novel images. Stimulation at 10–15 Hz caused animals to treat a novel image as familiar by decreasing time looking at the image, but did not affect looking times for images that were already familiar. We conclude that optical stimulation of PER at different frequencies can alter visual recognition memory bidirectionally. SIGNIFICANCE STATEMENT Recognition of novelty and familiarity are important for learning, memory, and decision making. Perirhinal cortex (PER) has a well established role in the familiarity-based recognition of individual items and objects, but how novelty and familiarity are encoded and transmitted in the brain is not known. Perirhinal neurons respond to novelty and familiarity by changing firing rates, but recent work suggests that brain oscillations may also be important for recognition. 
In this study, we showed that stimulation of the PER could increase or decrease exploration of novel and familiar images depending on the frequency of stimulation. Our findings suggest that optical stimulation of PER at specific frequencies can predictably alter recognition memory. PMID:26424881
Object detection from images obtained through underwater turbulence medium

NASA Astrophysics Data System (ADS)

Furhad, Md. Hasan; Tahtali, Murat; Lambert, Andrew

2017-09-01

Imaging through water suffers severe distortions due to random fluctuations of temperature and salinity, which produce underwater turbulence and diffraction-limited blur. Light reflected from objects is perturbed and its contrast attenuated, making the recognition of objects of interest difficult. Detecting underwater objects of interest thus becomes a challenging task, as they are inherently confused with the background, foreground and other image properties. In this paper, a saliency-based approach is proposed to detect objects imaged through an underwater turbulent medium. Saliency-based approaches have drawn attention across a wide range of computer vision applications, such as image retrieval, artificial intelligence, neuro-imaging and object detection. The image is first processed through a deblurring filter. Next, a saliency technique is applied to the image for object detection. In this step, a saliency map that highlights the target regions is generated and then a graph-based model is proposed to extract these target regions for object detection.
Using Prosopagnosia to Test and Modify Visual Recognition Theory.

PubMed

O'Brien, Alexander M

2018-02-01

Biederman's contemporary theory of basic visual object recognition (Recognition-by-Components) is based on structural descriptions of objects and presumes 36 visual primitives (geons) people can discriminate, but there has been no empirical test of the actual use of these 36 geons to visually distinguish objects. In this study, we tested for the actual use of these geons in basic visual discrimination by comparing object discrimination performance patterns (when distinguishing varied stimuli) of an acquired prosopagnosia patient (LB) and healthy control participants. LB's prosopagnosia left her heavily reliant on structural descriptions or categorical object differences in visual discrimination tasks versus the control participants' additional ability to use face recognition or coordinate systems (Coordinate Relations Hypothesis). Thus, when LB performed comparably to control participants with a given stimulus, her restricted reliance on basic or categorical discriminations meant that the stimuli must be distinguishable on the basis of a geon feature. By varying stimuli in eight separate experiments and presenting all 36 geons, we discerned that LB coded only 12 (vs. 36) distinct visual primitives (geons), apparently reflective of human visual systems generally.
Distorted Character Recognition Via An Associative Neural Network

NASA Astrophysics Data System (ADS)

Messner, Richard A.; Szu, Harold H.

1987-03-01

The purpose of this paper is two-fold: first, to provide some preliminary results of a character recognition scheme which has foundations in on-going neural network architecture modeling, and second, to apply some of the neural network results in a real application area where thirty years of effort has had little effect on providing the machine an ability to recognize distorted objects within the same object class. It is the author's belief that the time is ripe to start applying in earnest the results of over twenty years of effort in neural modeling to some of the more difficult problems which seem so hard to solve by conventional means. The character recognition scheme proposed utilizes a preprocessing stage which performs a 2-dimensional Walsh transform of an input Cartesian image field, then sequency filters this spectrum into three feature bands. Various features are then extracted and organized into three sets of feature vectors. These vector patterns are stored and recalled associatively. Two possible associative neural memory models are proposed for further investigation. The first is an outer-product linear matrix associative memory with a threshold function controlling the strength of the output pattern (similar to Kohonen's crosscorrelation approach [1]). The second is based upon a modified version of Grossberg's neural architecture [2], which provides better self-organizing properties due to its adaptive nature. Preliminary results of the sequency filtering and feature extraction preprocessing stage and a discussion of the use of the proposed neural architectures are included.
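The first memory model mentioned above, an outer-product linear associative memory followed by a threshold, can be sketched in a few lines. This is a generic textbook-style illustration under stated assumptions, not the scheme from the paper: patterns are assumed to be bipolar (+1/-1) vectors, the weight matrix is the sum of outer products of stored value and key vectors, and the threshold simply binarizes the linear response. All function names and the toy patterns are hypothetical.

```python
def train_outer_product(pairs):
    """Build W = sum_k value_k * key_k^T from (key, value) bipolar vector pairs."""
    n_in, n_out = len(pairs[0][0]), len(pairs[0][1])
    weights = [[0] * n_in for _ in range(n_out)]
    for key, value in pairs:
        for i in range(n_out):
            for j in range(n_in):
                weights[i][j] += value[i] * key[j]
    return weights

def recall(weights, key, threshold=0):
    """Threshold the linear response W * key to a bipolar output pattern."""
    response = [sum(w * x for w, x in zip(row, key)) for row in weights]
    return [1 if r > threshold else -1 for r in response]

# Two orthogonal keys (standing in for feature vectors) and their stored outputs.
key_a, out_a = [1, 1, 1, 1, 1, 1, 1, 1], [1, -1]
key_b, out_b = [1, -1, 1, -1, 1, -1, 1, -1], [-1, 1]
W = train_outer_product([(key_a, out_a), (key_b, out_b)])

# A distorted version of key_a (one element flipped) still recalls out_a.
distorted_a = [1, -1, 1, 1, 1, 1, 1, 1]
recalled = recall(W, distorted_a)
```

Recall is exact here because the stored keys are orthogonal; with correlated keys the linear response contains cross-talk between stored patterns, which is one reason an adaptive architecture might be preferred.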
Social and organizational factors affecting implementation of evidence-informed practice in a public health department in Ontario: a network modelling approach

PubMed Central

2014-01-01

Objective: The objective of this study is to develop a statistical model to assess factors associated with information seeking in a Canadian public health department. Methods: Managers and professional consultants of a public health department serving a large urban population named whom they turned to for help, whom they considered experts in evidence-informed practice, and whom they considered friends. Multilevel regression analysis and exponential random graph modeling were used to predict the formation of information seeking and expertise-recognition connections by personal characteristics of the seeker and source, and the structural attributes of the social networks. Results: The respondents were more likely to recognize the members of the supervisory/administrative division as experts. The extent to which an individual implemented evidence-based practice (EBP) principles in daily practice was a significant predictor of both being an information source and being recognized as expert by peers. Friendship was a significant predictor of both information seeking and expertise-recognition connections. Conclusion: The analysis showed a communication network segregated by organizational divisions. Managers were identified frequently as information sources, even though this is not a part of their formal role. Self-perceived implementation of EBP in practice was a significant predictor of being an information source or an expert, implying a positive atmosphere towards implementation of evidence-informed decision making in this public health organization. Results also implied that the perception of accessibility and trust were significant predictors of expertise recognition. PMID:24565228
Visual search in scenes involves selective and non-selective pathways

PubMed Central

Wolfe, Jeremy M; Vo, Melissa L-H; Evans, Karla K; Greene, Michelle R

2010-01-01

How do we find objects in scenes? For decades, visual search models have been built on experiments in which observers search for targets, presented among distractor items, isolated and randomly arranged on blank backgrounds. Are these models relevant to search in continuous scenes? This paper argues that the mechanisms that govern artificial, laboratory search tasks do play a role in visual search in scenes. However, scene-based information is used to guide search in ways that had no place in earlier models. Search in scenes may be best explained by a dual-path model: A “selective” path in which candidate objects must be individually selected for recognition and a “non-selective” path in which information can be extracted from global / statistical information. PMID:21227734
Automated Field-of-View, Illumination, and Recognition Algorithm Design of a Vision System for Pick-and-Place Considering Colour Information in Illumination and Images

PubMed Central

Chen, Yibing; Ogata, Taiki; Ueyama, Tsuyoshi; Takada, Toshiyuki; Ota, Jun

2018-01-01

Machine vision is playing an increasingly important role in industrial applications, and the automated design of image recognition systems has been a subject of intense research. This study has proposed a system for automatically designing the field-of-view (FOV) of a camera, the illumination strength and the parameters in a recognition algorithm. We formulated the design problem as an optimisation problem and used an experiment based on a hierarchical algorithm to solve it. The evaluation experiments using translucent plastics objects showed that the use of the proposed system resulted in an effective solution with a wide FOV, recognition of all objects and 0.32 mm and 0.4° maximal positional and angular errors when all the RGB (red, green and blue) for illumination and R channel image for recognition were used. Though all the RGB illumination and grey scale images also provided recognition of all the objects, only a narrow FOV was selected. Moreover, full recognition was not achieved by using only G illumination and a grey-scale image. The results showed that the proposed method can automatically design the FOV, illumination and parameters in the recognition algorithm and that tuning all the RGB illumination is desirable even when single-channel or grey-scale images are used for recognition. PMID:29786665
Automated Field-of-View, Illumination, and Recognition Algorithm Design of a Vision System for Pick-and-Place Considering Colour Information in Illumination and Images.

PubMed

Chen, Yibing; Ogata, Taiki; Ueyama, Tsuyoshi; Takada, Toshiyuki; Ota, Jun

2018-05-22

Machine vision is playing an increasingly important role in industrial applications, and the automated design of image recognition systems has been a subject of intense research. This study has proposed a system for automatically designing the field-of-view (FOV) of a camera, the illumination strength and the parameters in a recognition algorithm. We formulated the design problem as an optimisation problem and used an experiment based on a hierarchical algorithm to solve it. The evaluation experiments using translucent plastics objects showed that the use of the proposed system resulted in an effective solution with a wide FOV, recognition of all objects and 0.32 mm and 0.4° maximal positional and angular errors when all the RGB (red, green and blue) for illumination and R channel image for recognition were used. Though all the RGB illumination and grey scale images also provided recognition of all the objects, only a narrow FOV was selected. Moreover, full recognition was not achieved by using only G illumination and a grey-scale image. The results showed that the proposed method can automatically design the FOV, illumination and parameters in the recognition algorithm and that tuning all the RGB illumination is desirable even when single-channel or grey-scale images are used for recognition.
Collision detection in complex dynamic scenes using an LGMD-based visual neural network with feature enhancement.

PubMed

Yue, Shigang; Rind, F Claire

2006-05-01

The lobula giant movement detector (LGMD) is an identified neuron in the locust brain that responds most strongly to the images of an approaching object such as a predator. Its computational model can cope with unpredictable environments without using specific object recognition algorithms. In this paper, an LGMD-based neural network is proposed with a new feature enhancement mechanism to enhance the expanded edges of colliding objects via grouped excitation for collision detection with complex backgrounds. The isolated excitation caused by background detail will be filtered out by the new mechanism. Offline tests demonstrated the advantages of the presented LGMD-based neural network in complex backgrounds. Real time robotics experiments using the LGMD-based neural network as the only sensory system showed that the system worked reliably in a wide range of conditions; in particular, the robot was able to navigate in arenas with structured surrounds and complex backgrounds.
Visual shape perception as Bayesian inference of 3D object-centered shape representations.

PubMed

Erdogan, Goker; Jacobs, Robert A

2017-11-01

Despite decades of research, little is known about how people visually perceive object shape. We hypothesize that a promising approach to shape perception is provided by a "visual perception as Bayesian inference" framework which augments an emphasis on visual representation with an emphasis on the idea that shape perception is a form of statistical inference. Our hypothesis claims that shape perception of unfamiliar objects can be characterized as statistical inference of 3D shape in an object-centered coordinate system. We describe a computational model based on our theoretical framework, and provide evidence for the model along two lines. First, we show that, counterintuitively, the model accounts for viewpoint-dependency of object recognition, traditionally regarded as evidence against people's use of 3D object-centered shape representations. Second, we report the results of an experiment using a shape similarity task, and present an extensive evaluation of existing models' abilities to account for the experimental data. We find that our shape inference model captures subjects' behaviors better than competing models. Taken as a whole, our experimental and computational results illustrate the promise of our approach and suggest that people's shape representations of unfamiliar objects are probabilistic, 3D, and object-centered. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Neuroscience-inspired computational systems for speech recognition under noisy conditions

NASA Astrophysics Data System (ADS)

Schafer, Phillip B.

Humans routinely recognize speech in challenging acoustic environments with background music, engine sounds, competing talkers, and other acoustic noise. However, today's automatic speech recognition (ASR) systems perform poorly in such environments. In this dissertation, I present novel methods for ASR designed to approach human-level performance by emulating the brain's processing of sounds. I exploit recent advances in auditory neuroscience to compute neuron-based representations of speech, and design novel methods for decoding these representations to produce word transcriptions. I begin by considering speech representations modeled on the spectrotemporal receptive fields of auditory neurons. These representations can be tuned to optimize a variety of objective functions, which characterize the response properties of a neural population. I propose an objective function that explicitly optimizes the noise invariance of the neural responses, and find that it gives improved performance on an ASR task in noise compared to other objectives. The method as a whole, however, fails to significantly close the performance gap with humans. I next consider speech representations that make use of spiking model neurons. The neurons in this method are feature detectors that selectively respond to spectrotemporal patterns within short time windows in speech. I consider a number of methods for training the response properties of the neurons. In particular, I present a method using linear support vector machines (SVMs) and show that this method produces spikes that are robust to additive noise. I compute the spectrotemporal receptive fields of the neurons for comparison with previous physiological results. To decode the spike-based speech representations, I propose two methods designed to work on isolated word recordings. The first method uses a classical ASR technique based on the hidden Markov model. 
The second method is a novel template-based recognition scheme that takes advantage of the neural representation's invariance in noise. The scheme centers on a speech similarity measure based on the longest common subsequence between spike sequences. The combined encoding and decoding scheme outperforms a benchmark system in extremely noisy acoustic conditions. Finally, I consider methods for decoding spike representations of continuous speech. To help guide the alignment of templates to words, I design a syllable detection scheme that robustly marks the locations of syllabic nuclei. The scheme combines SVM-based training with a peak selection algorithm designed to improve noise tolerance. By incorporating syllable information into the ASR system, I obtain strong recognition results in noisy conditions, although the performance in noiseless conditions is below the state of the art. The work presented here constitutes a novel approach to the problem of ASR that can be applied in the many challenging acoustic environments in which we use computer technologies today. The proposed spike-based processing methods can potentially be exploited in efficient hardware implementations and could significantly reduce the computational costs of ASR. The work also provides a framework for understanding the advantages of spike-based acoustic coding in the human brain.
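The similarity measure mentioned in this abstract is built on the longest common subsequence (LCS) between spike sequences. The following is a minimal sketch of that idea, not the dissertation's implementation: it assumes each spike sequence is simply a list of neuron labels in firing order, and the normalization by the longer sequence length is a hypothetical choice made only for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)           # DP row for the previous element of a
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)       # extend a common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a, b):
    """Hypothetical normalization: LCS length over the longer sequence length."""
    if not a and not b:
        return 1.0
    return lcs_length(a, b) / max(len(a), len(b))


# Made-up neuron labels in firing order; not data from the dissertation.
template = [3, 7, 7, 1, 4]
observed = [3, 9, 7, 1, 1, 4]
print(lcs_similarity(template, observed))
```

Because a subsequence may skip elements, spurious or missing spikes lower the score gradually instead of breaking the match outright, which is one plausible reason such a measure tolerates noise.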
Lymph node detection in IASLC-defined zones on PET/CT images

NASA Astrophysics Data System (ADS)

Song, Yihua; Udupa, Jayaram K.; Odhner, Dewey; Tong, Yubing; Torigian, Drew A.

2016-03-01

Lymph node detection is challenging due to the low contrast between lymph nodes and surrounding soft tissues and the variation in nodal size and shape. In this paper, we propose several novel ideas which are combined into a system to operate on positron emission tomography/computed tomography (PET/CT) images to detect abnormal thoracic nodes. First, our previous Automatic Anatomy Recognition (AAR) approach is modified where lymph node zones predominantly following International Association for the Study of Lung Cancer (IASLC) specifications are modeled as objects arranged in a hierarchy along with key anatomic anchor objects. This fuzzy anatomy model built from diagnostic CT images is then deployed on PET/CT images for automatically recognizing the zones. A novel globular filter (g-filter) to detect blob-like objects over a specified range of sizes is designed to detect the most likely locations and sizes of diseased nodes. Abnormal nodes within each automatically localized zone are subsequently detected via combined use of different items of information at various scales: lymph node zone model poses found at recognition indicating the geographic layout at the global level of node clusters, g-filter response which homes in on and carefully selects node-like globular objects at the node level, and CT and PET gray value but within only the most plausible nodal regions for node presence at the voxel level. The models are built from 25 diagnostic CT scans and refined for an object hierarchy based on a separate set of 20 diagnostic CT scans. Node detection is tested on an additional set of 20 PET/CT scans. Our preliminary results indicate node detection sensitivity and specificity at around 90% and 85%, respectively.
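For readers less familiar with the two figures reported at the end of this abstract, the sketch below shows how sensitivity and specificity are computed from a confusion matrix. The counts are invented purely so that the arithmetic lands near the reported 90% and 85%; they are not taken from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of true nodes that were detected
    specificity = tn / (tn + fp)   # fraction of non-nodes correctly rejected
    return sensitivity, specificity


# Hypothetical counts chosen for illustration only.
sens, spec = sensitivity_specificity(tp=45, fn=5, tn=85, fp=15)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```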
Role of fusiform and anterior temporal cortical areas in facial recognition.

PubMed

Nasr, Shahin; Tootell, Roger B H

2012-11-15

Recent fMRI studies suggest that cortical face processing extends well beyond the fusiform face area (FFA), including unspecified portions of the anterior temporal lobe. However, the exact location of such anterior temporal region(s), and their role during active face recognition, remain unclear. Here we demonstrate that (in addition to FFA) a small bilateral site in the anterior tip of the collateral sulcus ('AT'; the anterior temporal face patch) is selectively activated during recognition of faces but not houses (a non-face object). In contrast to the psychophysical prediction that inverted and contrast reversed faces are processed like other non-face objects, both FFA and AT (but not other visual areas) were also activated during recognition of inverted and contrast reversed faces. However, response accuracy was better correlated to recognition-driven activity in AT, compared to FFA. These data support a segregated, hierarchical model of face recognition processing, extending to the anterior temporal cortex. Copyright © 2012 Elsevier Inc. All rights reserved.
Role of Fusiform and Anterior Temporal Cortical Areas in Facial Recognition

PubMed Central

Nasr, Shahin; Tootell, Roger BH

2012-01-01

Recent FMRI studies suggest that cortical face processing extends well beyond the fusiform face area (FFA), including unspecified portions of the anterior temporal lobe. However, the exact location of such anterior temporal region(s), and their role during active face recognition, remain unclear. Here we demonstrate that (in addition to FFA) a small bilateral site in the anterior tip of the collateral sulcus (‘AT’; the anterior temporal face patch) is selectively activated during recognition of faces but not houses (a non-face object). In contrast to the psychophysical prediction that inverted and contrast reversed faces are processed like other non-face objects, both FFA and AT (but not other visual areas) were also activated during recognition of inverted and contrast reversed faces. However, response accuracy was better correlated to recognition-driven activity in AT, compared to FFA. These data support a segregated, hierarchical model of face recognition processing, extending to the anterior temporal cortex. PMID:23034518
Restoration of Dopamine Release Deficits during Object Recognition Memory Acquisition Attenuates Cognitive Impairment in a Triple Transgenic Mice Model of Alzheimer's Disease

ERIC Educational Resources Information Center

Guzman-Ramos, Kioko; Moreno-Castilla, Perla; Castro-Cruz, Monica; McGaugh, James L.; Martinez-Coria, Hilda; LaFerla, Frank M.; Bermudez-Rattoni, Federico

2012-01-01

Previous findings indicate that the acquisition and consolidation of recognition memory involves dopaminergic activity. Although dopamine deregulation has been observed in Alzheimer's disease (AD) patients, the dysfunction of this neurotransmitter has not been investigated in animal models of AD. The aim of this study was to assess, by in vivo…
Tactile recognition and localization using object models: the case of polyhedra on a plane.

PubMed

Gaston, P C; Lozano-Perez, T

1984-03-01

This paper discusses how data from multiple tactile sensors may be used to identify and locate one object, from among a set of known objects. We use only local information from sensors: 1) the position of contact points and 2) ranges of surface normals at the contact points. The recognition and localization process is structured as the development and pruning of a tree of consistent hypotheses about pairings between contact points and object surfaces. In this paper, we deal with polyhedral objects constrained to lie on a known plane, i.e., having three degrees of positioning freedom relative to the sensors. We illustrate the performance of the algorithm by simulation.
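The hypothesis tree described in this abstract can be sketched as a depth-first search that assigns each contact to a model face and abandons a branch as soon as a pairing becomes inconsistent. The version below is a simplified illustration under stated assumptions, not the authors' algorithm: it works in 2D, uses only a pairwise constraint on the angle between surface normals (which is unaffected by rotating the object in the plane), and omits the distance constraints on contact positions. All function names and the tolerance `tol` are hypothetical.

```python
import math


def angle_between(n1, n2):
    """Signed angle from normal n1 to normal n2, wrapped to (-pi, pi]."""
    a = math.atan2(n2[1], n2[0]) - math.atan2(n1[1], n1[0])
    return math.atan2(math.sin(a), math.cos(a))


def consistent(face_normals, pairing, measured_normals, tol):
    """Check the newest contact-face pairing against all earlier pairings."""
    k = len(pairing) - 1
    for i in range(k):
        model = angle_between(face_normals[pairing[i]], face_normals[pairing[k]])
        sensed = angle_between(measured_normals[i], measured_normals[k])
        diff = math.atan2(math.sin(model - sensed), math.cos(model - sensed))
        if abs(diff) > tol:
            return False
    return True


def interpretations(face_normals, measured_normals, tol=0.1):
    """Enumerate all assignments of contacts to faces that survive pruning."""
    results = []

    def expand(pairing):
        if len(pairing) == len(measured_normals):
            results.append(tuple(pairing))
            return
        for face in range(len(face_normals)):
            pairing.append(face)
            if consistent(face_normals, pairing, measured_normals, tol):
                expand(pairing)          # grow this branch of the tree
            pairing.pop()                # prune or backtrack

    expand([])
    return results


def rotate(n, theta):
    """Rotate a 2D vector by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * n[0] - s * n[1], s * n[0] + c * n[1])


# Right triangle with vertices (0,0), (2,0), (0,1): bottom, left, hypotenuse.
faces = [(0.0, -1.0), (-1.0, 0.0), (1.0, 2.0)]
# Simulated contacts on bottom, hypotenuse, left, with the object rotated 0.7 rad.
sensed = [rotate(faces[0], 0.7), rotate(faces[2], 0.7), rotate(faces[1], 0.7)]
print(interpretations(faces, sensed))
```

In this toy case every pair of faces has a distinct relative angle, so the normal constraint alone leaves a single interpretation. For objects with parallel or symmetric faces, the position-based constraints that the abstract mentions would be needed to disambiguate.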
Selective visual attention in object detection processes

NASA Astrophysics Data System (ADS)

Paletta, Lucas; Goyal, Anurag; Greindl, Christian

2003-03-01

Object detection is an enabling technology that plays a key role in many application areas, such as content based media retrieval. Attentive cognitive vision systems are here proposed where the focus of attention is directed towards the most relevant target. The most promising information is interpreted in a sequential process that dynamically makes use of knowledge and that enables spatial reasoning on the local object information. The presented work proposes an innovative application of attention mechanisms for object detection which is most general in its understanding of information and action selection. The attentive detection system uses a cascade of increasingly complex classifiers for the stepwise identification of regions of interest (ROIs) and recursively refined object hypotheses. While the coarsest classifiers are used to determine first approximations on a region of interest in the input image, more complex classifiers are used for more refined ROIs to give more confident estimates. Objects are modelled by local appearance based representations and in terms of posterior distributions of the object samples in eigenspace. The discrimination function to discern between objects is modeled by a radial basis functions (RBF) network that has been compared with alternative networks and been proved consistent and superior to other artificial neural networks for appearance based object recognition. The experiments were conducted for the automatic detection of brand objects in Formula One broadcasts within the European Commission's cognitive vision project DETECT.
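The cascade idea in this abstract, where cheap classifiers discard most candidate regions before costlier ones run, can be sketched as follows. This is an illustrative toy under stated assumptions, not the DETECT system: regions are represented as short feature tuples, the coarse stage is a mean-value check, and the final stage is a single Gaussian radial basis function around a made-up prototype standing in for the RBF network. All names, features and thresholds are hypothetical.

```python
import math


def run_cascade(regions, stages):
    """Filter candidate regions through stages ordered from cheap to expensive."""
    survivors = list(regions)
    for score_fn, threshold in stages:
        survivors = [r for r in survivors if score_fn(r) >= threshold]
        if not survivors:            # nothing left to refine
            break
    return survivors


def coarse_score(region):
    """Cheap first approximation: mean feature value."""
    return sum(region) / len(region)


def rbf_score(region, center=(0.8, 0.9, 0.7), width=0.2):
    """Gaussian RBF response around a hypothetical object prototype."""
    dist2 = sum((a - b) ** 2 for a, b in zip(region, center))
    return math.exp(-dist2 / (2 * width ** 2))


# Made-up feature vectors for three candidate regions of interest.
candidates = [(0.1, 0.2, 0.1), (0.8, 0.85, 0.75), (0.9, 0.2, 0.9)]
stages = [(coarse_score, 0.5), (rbf_score, 0.5)]
detections = run_cascade(candidates, stages)
print(detections)
```

Ordering the stages by cost means the expensive score is evaluated only on regions that passed the cheap test, which is the main computational appeal of a cascade.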
Target recognition based on convolutional neural network

NASA Astrophysics Data System (ADS)

Wang, Liqiang; Wang, Xin; Xi, Fubiao; Dong, Jian

2017-11-01

One of the important parts of object target recognition is feature extraction, which can be classified into manual feature extraction and automatic feature extraction. The traditional neural network is one of the automatic feature extraction methods, but it is highly prone to over-fitting because of its global (fully connected) structure. The deep learning algorithm used in this paper is a hierarchical automatic feature extraction method, trained with the layer-by-layer convolutional neural network (CNN), which can extract the features from lower layers to higher layers. The resulting features are more discriminative, which benefits object target recognition.

A fast 3-D object recognition algorithm for the vision system of a special-purpose dexterous manipulator

NASA Technical Reports Server (NTRS)

Hung, Stephen H. Y.

1989-01-01

A fast 3-D object recognition algorithm that can be used as a quick-look subsystem to the vision system for the Special-Purpose Dexterous Manipulator (SPDM) is described. Global features that can be easily computed from range data are used to characterize the images of a viewer-centered model of an object. This algorithm will speed up the processing by eliminating the low level processing whenever possible. It may identify the object, reject a set of bad data in the early stage, or create a better environment for a more powerful algorithm to carry the work further.
Effects of curcumin on short-term spatial and recognition memory, adult neurogenesis and neuroinflammation in a streptozotocin-induced rat model of dementia of Alzheimer's type.

PubMed

Bassani, Taysa B; Turnes, Joelle M; Moura, Eric L R; Bonato, Jéssica M; Cóppola-Segovia, Valentín; Zanata, Silvio M; Oliveira, Rúbia M M W; Vital, Maria A B F

2017-09-29

Curcumin is a natural polyphenol with evidence of antioxidant, anti-inflammatory and neuroprotective properties. Recent evidence also suggests that curcumin increases cognitive performance in animal models of dementia, and this effect would be related to its capacity to enhance adult neurogenesis. The aim of this study was to test the hypothesis that curcumin treatment would be able to preserve cognition by increasing neurogenesis and decreasing neuroinflammation in the model of dementia of Alzheimer's type induced by an intracerebroventricular injection of streptozotocin (ICV-STZ) in Wistar rats. The animals were injected with ICV-STZ or vehicle and curcumin treatments (25, 50 and 100 mg/kg, gavage) were performed for 30 days. Four weeks after surgery, STZ-lesioned animals exhibited impairments in short-term spatial memory (Object Location Test (OLT) and Y maze) and short-term recognition memory (Object Recognition Test - ORT), decreased cell proliferation and immature neurons (Ki-67- and doublecortin-positive cells, respectively) in the subventricular zone (SVZ) and dentate gyrus (DG) of hippocampus, and increased immunoreactivity for the glial markers GFAP and Iba-1 (neuroinflammation). Curcumin treatment in the doses of 50 and 100 mg/kg prevented the deficits in recognition memory in the ORT, but not in spatial memory in the OLT and Y maze. Curcumin treatment exerted only slight improvements in neuroinflammation, resulting in no improvements in hippocampal and subventricular neurogenesis. These results suggest a positive effect of curcumin in object recognition memory which was not related to hippocampal neurogenesis. Copyright © 2017 Elsevier B.V. All rights reserved.
Fluent, fast, and frugal? A formal model evaluation of the interplay between memory, fluency, and comparative judgments.

PubMed

Hilbig, Benjamin E; Erdfelder, Edgar; Pohl, Rüdiger F

2011-07-01

A new process model of the interplay between memory and judgment processes was recently suggested, assuming that retrieval fluency-that is, the speed with which objects are recognized-will determine inferences concerning such objects in a single-cue fashion. This aspect of the fluency heuristic, an extension of the recognition heuristic, has remained largely untested due to methodological difficulties. To overcome the latter, we propose a measurement model from the class of multinomial processing tree models that can estimate true single-cue reliance on recognition and retrieval fluency. We applied this model to aggregate and individual data from a probabilistic inference experiment and considered both goodness of fit and model complexity to evaluate different hypotheses. The results were relatively clear-cut, revealing that the fluency heuristic is an unlikely candidate for describing comparative judgments concerning recognized objects. These findings are discussed in light of a broader theoretical view on the interplay of memory and judgment processes.
Standard object recognition memory and "what" and "where" components: Improvement by post-training epinephrine in highly habituated rats.

PubMed

Jurado-Berbel, Patricia; Costa-Miserachs, David; Torras-Garcia, Meritxell; Coll-Andreu, Margalida; Portell-Cortés, Isabel

2010-02-11

The present work examined whether post-training systemic epinephrine (EPI) is able to modulate short-term (3 h) and long-term (24 h and 48 h) memory of standard object recognition, as well as long-term (24 h) memory of separate "what" (object identity) and "where" (object location) components of object recognition. Although object recognition training is associated with low arousal levels, all the animals received habituation to the training box in order to further reduce emotional arousal. Post-training EPI improved long-term (24 h and 48 h), but not short-term (3 h), memory in the standard object recognition task, as well as 24 h memory for both object identity and object location. These data indicate that post-training epinephrine: (1) facilitates long-term memory for standard object recognition; (2) exerts separate facilitatory effects on "what" (object identity) and "where" (object location) components of object recognition; and (3) is capable of improving memory for a low arousing task even in highly habituated rats.
[Visual Texture Agnosia in Humans].

PubMed

Suzuki, Kyoko

2015-06-01

Visual object recognition requires the processing of both geometric and surface properties. Patients with occipital lesions may have visual agnosia, which is impairment in the recognition and identification of visually presented objects primarily through their geometric features. An analogous condition involving the failure to recognize an object by its texture may exist, which can be called visual texture agnosia. Here we present two cases with visual texture agnosia. Case 1 had left homonymous hemianopia and right upper quadrantanopia, along with achromatopsia, prosopagnosia, and texture agnosia, because of damage to his left ventromedial occipitotemporal cortex and right lateral occipito-temporo-parietal cortex due to multiple cerebral embolisms. Although he showed difficulty matching and naming textures of real materials, he could readily name visually presented objects by their contours. Case 2 had right lower quadrantanopia, along with impairment in stereopsis and recognition of texture in 2D images, because of subcortical hemorrhage in the left occipitotemporal region. He failed to recognize shapes based on texture information, whereas shape recognition based on contours was well preserved. Our findings, along with those of three reported cases with texture agnosia, indicate that there are separate channels for processing texture, color, and geometric features, and that the regions around the left collateral sulcus are crucial for texture processing.
The Development of Invariant Object Recognition Requires Visual Experience with Temporally Smooth Objects

ERIC Educational Resources Information Center

Wood, Justin N.; Wood, Samantha M. W.

2018-01-01

How do newborns learn to recognize objects? According to temporal learning models in computational neuroscience, the brain constructs object representations by extracting smoothly changing features from the environment. To date, however, it is unknown whether newborns depend on smoothly changing features to build invariant object representations.…
Experience moderates overlap between object and face recognition, suggesting a common ability

PubMed Central

Gauthier, Isabel; McGugin, Rankin W.; Richler, Jennifer J.; Herzmann, Grit; Speegle, Magen; Van Gulick, Ana E.

2014-01-01

Some research finds that face recognition is largely independent from the recognition of other objects; a specialized and innate ability to recognize faces could therefore have little or nothing to do with our ability to recognize objects. We propose a new framework in which recognition performance for any category is the product of domain-general ability and category-specific experience. In Experiment 1, we show that the overlap between face and object recognition depends on experience with objects. In 256 subjects we measured face recognition, object recognition for eight categories, and self-reported experience with these categories. Experience predicted neither face recognition nor object recognition but moderated their relationship: Face recognition performance is increasingly similar to object recognition performance with increasing object experience. If a subject has a lot of experience with objects and is found to perform poorly, they also prove to have a low ability with faces. In a follow-up survey, we explored the dimensions of experience with objects that may have contributed to self-reported experience in Experiment 1. Different dimensions of experience appear to be more salient for different categories, with general self-reports of expertise reflecting judgments of verbal knowledge about a category more than judgments of visual performance. The complexity of experience and current limitations in its measurement support the importance of aggregating across multiple categories. Our findings imply that both face and object recognition are supported by a common, domain-general ability expressed through experience with a category and best measured when accounting for experience. PMID:24993021
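The claim that experience "moderated" the relationship between face and object recognition is conventionally expressed as a regression with an interaction term. The equation below is a generic sketch of such a moderation model, given only as a reading aid; the symbols are illustrative and the abstract does not confirm that the authors fitted exactly this specification.

```latex
\[
\mathrm{Face}_i = \beta_0 + \beta_1\,\mathrm{Object}_i + \beta_2\,\mathrm{Exp}_i
  + \beta_3\,(\mathrm{Object}_i \times \mathrm{Exp}_i) + \varepsilon_i
\]
```

Under this reading, the reported pattern corresponds to a positive interaction coefficient $\beta_3$: the slope relating object recognition to face recognition, $\beta_1 + \beta_3\,\mathrm{Exp}_i$, grows as self-reported experience increases.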
Experience moderates overlap between object and face recognition, suggesting a common ability.

PubMed

Gauthier, Isabel; McGugin, Rankin W; Richler, Jennifer J; Herzmann, Grit; Speegle, Magen; Van Gulick, Ana E

2014-07-03

Some research finds that face recognition is largely independent from the recognition of other objects; a specialized and innate ability to recognize faces could therefore have little or nothing to do with our ability to recognize objects. We propose a new framework in which recognition performance for any category is the product of domain-general ability and category-specific experience. In Experiment 1, we show that the overlap between face and object recognition depends on experience with objects. In 256 subjects we measured face recognition, object recognition for eight categories, and self-reported experience with these categories. Experience predicted neither face recognition nor object recognition but moderated their relationship: Face recognition performance is increasingly similar to object recognition performance with increasing object experience. If a subject has a lot of experience with objects and is found to perform poorly, they also prove to have a low ability with faces. In a follow-up survey, we explored the dimensions of experience with objects that may have contributed to self-reported experience in Experiment 1. Different dimensions of experience appear to be more salient for different categories, with general self-reports of expertise reflecting judgments of verbal knowledge about a category more than judgments of visual performance. The complexity of experience and current limitations in its measurement support the importance of aggregating across multiple categories. Our findings imply that both face and object recognition are supported by a common, domain-general ability expressed through experience with a category and best measured when accounting for experience. © 2014 ARVO.
Visual recognition and inference using dynamic overcomplete sparse learning.

PubMed

Murray, Joseph F; Kreutz-Delgado, Kenneth

2007-09-01

We present a hierarchical architecture and learning algorithm for visual recognition and other visual inference tasks such as imagination, reconstruction of occluded images, and expectation-driven segmentation. Using properties of biological vision for guidance, we posit a stochastic generative world model and from it develop a simplified world model (SWM) based on a tractable variational approximation that is designed to enforce sparse coding. Recent developments in computational methods for learning overcomplete representations (Lewicki & Sejnowski, 2000; Teh, Welling, Osindero, & Hinton, 2003) suggest that overcompleteness can be useful for visual tasks, and we use an overcomplete dictionary learning algorithm (Kreutz-Delgado, et al., 2003) as a preprocessing stage to produce accurate, sparse codings of images. Inference is performed by constructing a dynamic multilayer network with feedforward, feedback, and lateral connections, which is trained to approximate the SWM. Learning is done with a variant of the back-propagation-through-time algorithm, which encourages convergence to desired states within a fixed number of iterations. Vision tasks require large networks, and to make learning efficient, we take advantage of the sparsity of each layer to update only a small subset of elements in a large weight matrix at each iteration. Experiments on a set of rotated objects demonstrate various types of visual inference and show that increasing the degree of overcompleteness improves recognition performance in difficult scenes with occluded objects in clutter.
The evolution of meaning: spatio-temporal dynamics of visual object recognition.

PubMed

Clarke, Alex; Taylor, Kirsten I; Tyler, Lorraine K

2011-08-01

Research on the spatio-temporal dynamics of visual object recognition suggests a recurrent, interactive model whereby an initial feedforward sweep through the ventral stream to prefrontal cortex is followed by recurrent interactions. However, critical questions remain regarding the factors that mediate the degree of recurrent interactions necessary for meaningful object recognition. The novel prediction we test here is that recurrent interactivity is driven by increasing semantic integration demands as defined by the complexity of semantic information required by the task and driven by the stimuli. To test this prediction, we recorded magnetoencephalography data while participants named living and nonliving objects during two naming tasks. We found that the spatio-temporal dynamics of neural activity were modulated by the level of semantic integration required. Specifically, source reconstructed time courses and phase synchronization measures showed increased recurrent interactions as a function of semantic integration demands. These findings demonstrate that the cortical dynamics of object processing are modulated by the complexity of semantic information required from the visual input.
Cultural differences in visual object recognition in 3-year-old children

PubMed Central

Kuwabara, Megumi; Smith, Linda B.

2016-01-01

Recent research indicates that culture penetrates fundamental processes of perception and cognition (e.g. Nisbett & Miyamoto, 2005). Here, we provide evidence that these influences begin early and influence how preschool children recognize common objects. The three tasks (n=128) examined the degree to which nonface object recognition by 3-year-olds was based on individual diagnostic features versus more configural and holistic processing. Task 1 used a 6-alternative forced choice task in which children were asked to find a named category in arrays of masked objects in which only 3 diagnostic features were visible for each object. U.S. children outperformed age-matched Japanese children. Task 2 presented pictures of objects to children piece by piece. U.S. children recognized the objects given fewer pieces than Japanese children, and the likelihood of recognition increased for U.S. children, but not Japanese children, when the piece added was rated by both U.S. and Japanese adults as highly defining. Task 3 used a standard measure of configural processing, asking the degree to which recognition of matching pictures was disrupted by the rotation of one picture. Japanese children’s recognition was more disrupted by inversion than was that of U.S. children, indicating more configural processing by Japanese than U.S. children. The pattern suggests early cross-cultural differences in visual processing, findings that raise important questions about how visual experiences differ across cultures and about universal patterns of cognitive development. PMID:26985576
Cultural differences in visual object recognition in 3-year-old children.

PubMed

Kuwabara, Megumi; Smith, Linda B

2016-07-01

Recent research indicates that culture penetrates fundamental processes of perception and cognition. Here, we provide evidence that these influences begin early and influence how preschool children recognize common objects. The three tasks (N=128) examined the degree to which nonface object recognition by 3-year-olds was based on individual diagnostic features versus more configural and holistic processing. Task 1 used a 6-alternative forced choice task in which children were asked to find a named category in arrays of masked objects where only three diagnostic features were visible for each object. U.S. children outperformed age-matched Japanese children. Task 2 presented pictures of objects to children piece by piece. U.S. children recognized the objects given fewer pieces than Japanese children, and the likelihood of recognition increased for U.S. children, but not Japanese children, when the piece added was rated by both U.S. and Japanese adults as highly defining. Task 3 used a standard measure of configural processing, asking the degree to which recognition of matching pictures was disrupted by the rotation of one picture. Japanese children's recognition was more disrupted by inversion than was that of U.S. children, indicating more configural processing by Japanese than U.S. children. The pattern suggests early cross-cultural differences in visual processing, findings that raise important questions about how visual experiences differ across cultures and about universal patterns of cognitive development. Copyright © 2016 Elsevier Inc. All rights reserved.
Associative (prosop)agnosia without (apparent) perceptual deficits: a case-study.

PubMed

Anaki, David; Kaufman, Yakir; Freedman, Morris; Moscovitch, Morris

2007-04-09

In associative agnosia, early perceptual processing of faces or objects is considered to be intact, while the ability to access stored semantic information about the individual face or object is impaired. Recent claims, however, have asserted that associative agnosia is also characterized by deficits at the perceptual level, which are too subtle to be detected by current neuropsychological tests. Thus, the impaired identification of famous faces or common objects in associative agnosia stems from difficulties in extracting the minute perceptual details required to identify a face or an object. In the present study, we report the case of a patient, DBO, with a left occipital infarct, who shows impaired object and famous face recognition. Despite his disability, he exhibits a face inversion effect, and is able to select a famous face from among non-famous distractors. In addition, his performance is normal in immediate and delayed recognition memory for faces whose external features were deleted. His deficits in face recognition are apparent only when he is required to name a famous face, or select two faces from among a triad of famous figures based on their semantic relationships (a task which does not require access to names). The nature of his deficits in object perception and recognition is similar to his impairments in the face domain. This pattern of behavior supports the notion that apperceptive and associative agnosia reflect distinct and dissociated deficits, which result from damage to different stages of the face and object recognition process.
Toward a Computer Vision-based Wayfinding Aid for Blind Persons to Access Unfamiliar Indoor Environments.

PubMed

Tian, Yingli; Yang, Xiaodong; Yi, Chucai; Arditi, Aries

2013-04-01

Independent travel is a well known challenge for blind and visually impaired persons. In this paper, we propose a proof-of-concept computer vision-based wayfinding aid for blind people to independently access unfamiliar indoor environments. In order to find different rooms (e.g. an office, a lab, or a bathroom) and other building amenities (e.g. an exit or an elevator), we incorporate object detection with text recognition. First we develop a robust and efficient algorithm to detect doors, elevators, and cabinets based on their general geometric shape, by combining edges and corners. The algorithm is general enough to handle large intra-class variations of objects with different appearances among different indoor environments, as well as small inter-class differences between different objects such as doors and door-like cabinets. Next, in order to distinguish intra-class objects (e.g. an office door from a bathroom door), we extract and recognize text information associated with the detected objects. For text recognition, we first extract text regions from signs with multiple colors and possibly complex backgrounds, and then apply character localization and topological analysis to filter out background interference. The extracted text is recognized using off-the-shelf optical character recognition (OCR) software products. The object type, orientation, location, and text information are presented to the blind traveler as speech.
Toward a Computer Vision-based Wayfinding Aid for Blind Persons to Access Unfamiliar Indoor Environments

PubMed Central

Tian, YingLi; Yang, Xiaodong; Yi, Chucai; Arditi, Aries

2012-01-01

Independent travel is a well known challenge for blind and visually impaired persons. In this paper, we propose a proof-of-concept computer vision-based wayfinding aid for blind people to independently access unfamiliar indoor environments. In order to find different rooms (e.g. an office, a lab, or a bathroom) and other building amenities (e.g. an exit or an elevator), we incorporate object detection with text recognition. First we develop a robust and efficient algorithm to detect doors, elevators, and cabinets based on their general geometric shape, by combining edges and corners. The algorithm is general enough to handle large intra-class variations of objects with different appearances among different indoor environments, as well as small inter-class differences between different objects such as doors and door-like cabinets. Next, in order to distinguish intra-class objects (e.g. an office door from a bathroom door), we extract and recognize text information associated with the detected objects. For text recognition, we first extract text regions from signs with multiple colors and possibly complex backgrounds, and then apply character localization and topological analysis to filter out background interference. The extracted text is recognized using off-the-shelf optical character recognition (OCR) software products. The object type, orientation, location, and text information are presented to the blind traveler as speech. PMID:23630409
Dynamic and Contextual Information in HMM Modeling for Handwritten Word Recognition.

PubMed

Bianne-Bernard, Anne-Laure; Menasri, Farès; Al-Hajj Mohamad, Rami; Mokbel, Chafic; Kermorvant, Christopher; Likforman-Sulem, Laurence

2011-10-01

This study aims at building an efficient word recognition system resulting from the combination of three handwriting recognizers. The main component of this combined system is an HMM-based recognizer which considers dynamic and contextual information for a better modeling of writing units. For modeling the contextual units, a state-tying process based on decision tree clustering is introduced. Decision trees are built according to a set of expert-based questions on how characters are written. Questions are divided into global questions, yielding larger clusters, and precise questions, yielding smaller ones. Such clustering enables us to reduce the total number of models and Gaussian densities by 10. We then apply this modeling to the recognition of handwritten words. Experiments are conducted on three publicly available databases based on Latin or Arabic languages: Rimes, IAM, and OpenHart. The results obtained show that contextual information embedded with dynamic modeling significantly improves recognition.
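As a rough illustration of how an HMM-based recognizer scores a candidate word, the sketch below runs the scaled forward algorithm on a discrete-observation HMM and picks the word model with the higher likelihood. It does not reproduce the state tying, decision trees, or Gaussian densities described above; the two toy word models and the observation sequence are hypothetical.

```python
import numpy as np

def forward_log_likelihood(pi, A, B, obs):
    """Log P(obs | model) for a discrete HMM, using the scaled forward recursion.

    pi: initial state probabilities, A: state transition matrix,
    B[state, symbol]: emission probabilities, obs: list of symbol indices.
    """
    alpha = pi * B[:, obs[0]]
    log_lik = 0.0
    for symbol in obs[1:]:
        scale = alpha.sum()
        log_lik += np.log(scale)             # accumulate the scaling factors
        alpha = ((alpha / scale) @ A) * B[:, symbol]
    return log_lik + np.log(alpha.sum())

# Hypothetical two-state left-to-right topology shared by both word models.
pi = np.array([1.0, 0.0])
A = np.array([[0.6, 0.4],
              [0.0, 1.0]])
# Hypothetical emission tables over a two-symbol alphabet.
B_word1 = np.array([[0.9, 0.1],
                    [0.2, 0.8]])
B_word2 = np.array([[0.1, 0.9],
                    [0.8, 0.2]])

obs = [0, 0, 1, 1]
scores = {
    "word1": forward_log_likelihood(pi, A, B_word1, obs),
    "word2": forward_log_likelihood(pi, A, B_word2, obs),
}
best = max(scores, key=scores.get)
print(best, scores)
```

Rescaling at each step keeps the recursion numerically stable on long sequences, which is why the log-likelihood is accumulated from the scale factors rather than from the raw forward variables.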
Development of a sonar-based object recognition system

NASA Astrophysics Data System (ADS)

Ecemis, Mustafa Ihsan

2001-02-01

Sonars are used extensively in mobile robotics for obstacle detection, ranging and avoidance. However, these range-finding applications do not exploit the full range of information carried in sonar echoes. In addition, mobile robots need robust object recognition systems. Therefore, a simple and robust object recognition system using ultrasonic sensors may have a wide range of applications in robotics. This dissertation develops and analyzes an object recognition system that uses ultrasonic sensors of the type commonly found on mobile robots. Three principal experiments are used to test the sonar recognition system: object recognition at various distances, object recognition during unconstrained motion, and softness discrimination. The hardware setup, consisting of an inexpensive Polaroid sonar and a data acquisition board, is described first. The software for ultrasound signal generation, echo detection, data collection, and data processing is then presented. Next, the dissertation describes two methods to extract information from the echoes, one in the frequency domain and the other in the time domain. The system uses the fuzzy ARTMAP neural network to recognize objects on the basis of the information content of their echoes. In order to demonstrate that the performance of the system does not depend on the specific classification method being used, the K-Nearest Neighbors (KNN) algorithm is also implemented. KNN yields a test accuracy similar to fuzzy ARTMAP in all experiments. Finally, the dissertation describes a method for extracting features from the envelope function in order to reduce the dimension of the input vector used by the classifiers. Decreasing the size of the input vectors reduces the memory requirements of the system and makes it run faster. It is shown that this method does not affect the performance of the system dramatically and is more appropriate for some tasks.
The results of these experiments demonstrate that sonar can be used to develop a low-cost, low-computation system for real-time object recognition tasks on mobile robots. This system differs from all previous approaches in that it is relatively simple, robust, fast, and inexpensive.
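The K-Nearest Neighbors classification step mentioned above can be sketched as follows. The two-dimensional "echo feature" vectors and the `soft`/`hard` labels are made-up placeholders, not data or features from the dissertation.

```python
import numpy as np
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors."""
    dists = np.linalg.norm(train_X - query, axis=1)   # Euclidean distances
    nearest = np.argsort(dists)[:k]                   # indices of the k closest samples
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical reduced echo features (e.g. two envelope statistics per echo).
train_X = np.array([[0.1, 0.2], [0.0, 0.3], [0.2, 0.1],
                    [0.9, 1.0], [1.0, 0.8], [0.8, 0.9]])
train_y = ["soft", "soft", "soft", "hard", "hard", "hard"]

print(knn_predict(train_X, train_y, np.array([0.15, 0.15])))
print(knn_predict(train_X, train_y, np.array([0.95, 0.90])))
```

Because KNN stores every training vector, reducing the input dimension as the abstract describes lowers both memory use and the cost of each distance computation.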
Severe Cross-Modal Object Recognition Deficits in Rats Treated Sub-Chronically with NMDA Receptor Antagonists are Reversed by Systemic Nicotine: Implications for Abnormal Multisensory Integration in Schizophrenia

PubMed Central

Jacklin, Derek L; Goel, Amit; Clementino, Kyle J; Hall, Alexander W M; Talpos, John C; Winters, Boyer D

2012-01-01

Schizophrenia is a complex and debilitating disorder, characterized by positive, negative, and cognitive symptoms. Among the cognitive deficits observed in patients with schizophrenia, recent work has indicated abnormalities in multisensory integration, a process that is important for the formation of comprehensive environmental percepts and for the appropriate guidance of behavior. Very little is known about the neural bases of such multisensory integration deficits, partly because of the lack of viable behavioral tasks to assess this process in animal models. In this study, we used our recently developed rodent cross-modal object recognition (CMOR) task to investigate multisensory integration functions in rats treated sub-chronically with one of two N-methyl-D-aspartate receptor (NMDAR) antagonists, MK-801, or ketamine; such treatment is known to produce schizophrenia-like symptoms. Rats treated with the NMDAR antagonists were impaired on the standard spontaneous object recognition (SOR) task, unimodal (tactile or visual only) versions of SOR, and the CMOR task with intermediate to long retention delays between acquisition and testing phases, but they displayed a selective CMOR task deficit when mnemonic demand was minimized. This selective impairment in multisensory information processing was dose-dependently reversed by acute systemic administration of nicotine. These findings suggest that persistent NMDAR hypofunction may contribute to the multisensory integration deficits observed in patients with schizophrenia and highlight the valuable potential of the CMOR task to facilitate further systematic investigation of the neural bases of, and potential treatments for, this hitherto overlooked aspect of cognitive dysfunction in schizophrenia. PMID:22669170
Research and Implementation of Tibetan Word Segmentation Based on Syllable Methods

NASA Astrophysics Data System (ADS)

Jiang, Jing; Li, Yachao; Jiang, Tao; Yu, Hongzhi

2018-03-01

Tibetan word segmentation (TWS) is an important problem in Tibetan information processing, and abbreviated word recognition is one of the key and most difficult problems in TWS. Most existing methods of Tibetan abbreviated word recognition are rule-based approaches, which need vocabulary support. In this paper, we propose a method based on a sequence tagging model for abbreviated word recognition, and then implement it in TWS systems with sequence labeling models. The experimental results show that our abbreviated word recognition method is fast and effective and can be combined easily with the segmentation model. This significantly improves the effectiveness of Tibetan word segmentation.
Are Face and Object Recognition Independent? A Neurocomputational Modeling Exploration.

PubMed

Wang, Panqu; Gauthier, Isabel; Cottrell, Garrison

2016-04-01

Are face and object recognition abilities independent? Although it is commonly believed that they are, Gauthier et al. [Gauthier, I., McGugin, R. W., Richler, J. J., Herzmann, G., Speegle, M., & VanGulick, A. E. Experience moderates overlap between object and face recognition, suggesting a common ability. Journal of Vision, 14, 7, 2014] recently showed that these abilities become more correlated as experience with nonface categories increases. They argued that there is a single underlying visual ability, v, that is expressed in performance with both face and nonface categories as experience grows. Using the Cambridge Face Memory Test and the Vanderbilt Expertise Test, they showed that the shared variance between Cambridge Face Memory Test and Vanderbilt Expertise Test performance increases monotonically as experience increases. Here, we address why a shared resource across different visual domains does not lead to competition and to an inverse correlation in abilities. We explain this conundrum using our neurocomputational model of face and object processing ["The Model", TM, Cottrell, G. W., & Hsiao, J. H. Neurocomputational models of face processing. In A. J. Calder, G. Rhodes, M. Johnson, & J. Haxby (Eds.), The Oxford handbook of face perception. Oxford, UK: Oxford University Press, 2011]. We model the domain general ability v as the available computational resources (number of hidden units) in the mapping from input to label and experience as the frequency of individual exemplars in an object category appearing during network training. Our results show that, as in the behavioral data, the correlation between subordinate level face and object recognition accuracy increases as experience grows. We suggest that different domains do not compete for resources because the relevant features are shared between faces and objects.
The essential power of experience is to generate a "spreading transform" for faces (separating them in representational space) that generalizes to objects that must be individuated. Interestingly, when the task of the network is basic level categorization, no increase in the correlation between domains is observed. Hence, our model predicts that it is the type of experience that matters and that the source of the correlation is in the fusiform face area, rather than in cortical areas that subserve basic level categorization. This result is consistent with our previous modeling elucidating why the FFA is recruited for novel domains of expertise [Tong, M. H., Joyce, C. A., & Cottrell, G. W. Why is the fusiform face area recruited for novel categories of expertise? A neurocomputational investigation. Brain Research, 1202, 14-24, 2008].

Effects of heavy particle irradiation and diet on object recognition memory in rats

NASA Astrophysics Data System (ADS)

Rabin, Bernard M.; Carrihill-Knoll, Kirsty; Hinchman, Marie; Shukitt-Hale, Barbara; Joseph, James A.; Foster, Brian C.

2009-04-01

On long-duration missions to other planets astronauts will be exposed to types and doses of radiation that are not experienced in low earth orbit. Previous research using a ground-based model for exposure to cosmic rays has shown that exposure to heavy particles, such as 56Fe, disrupts spatial learning and memory measured using the Morris water maze. Maintaining rats on diets containing antioxidant phytochemicals for 2 weeks prior to irradiation ameliorated this deficit. The present experiments were designed to determine: (1) the generality of the particle-induced disruption of memory by examining the effects of exposure to 56Fe particles on object recognition memory; and (2) whether maintaining rats on these antioxidant diets for 2 weeks prior to irradiation would also ameliorate any potential deficit. The results showed that exposure to low doses of 56Fe particles does disrupt recognition memory and that maintaining rats on antioxidant diets containing blueberry and strawberry extract for only 2 weeks was effective in ameliorating the disruptive effects of irradiation. The results are discussed in terms of the mechanisms by which exposure to these particles may produce effects on neurocognitive performance.
Nicotine Administration Attenuates Methamphetamine-Induced Novel Object Recognition Deficits

PubMed Central

Vieira-Brock, Paula L.; McFadden, Lisa M.; Nielsen, Shannon M.; Smith, Misty D.; Hanson, Glen R.

2015-01-01

Background: Previous studies have demonstrated that methamphetamine abuse leads to memory deficits and these are associated with relapse. Furthermore, extensive evidence indicates that nicotine prevents and/or improves memory deficits in different models of cognitive dysfunction and these nicotinic effects might be mediated by hippocampal or cortical nicotinic acetylcholine receptors. The present study investigated whether nicotine attenuates methamphetamine-induced novel object recognition deficits in rats and explored potential underlying mechanisms. Methods: Adolescent or adult male Sprague-Dawley rats received either nicotine water (10–75 μg/mL) or tap water for several weeks. Methamphetamine (4×7.5mg/kg/injection) or saline was administered either before or after chronic nicotine exposure. Novel object recognition was evaluated 6 days after methamphetamine or saline. Serotonin transporter function and density and α4β2 nicotinic acetylcholine receptor density were assessed on the following day. Results: Chronic nicotine intake via drinking water beginning during either adolescence or adulthood attenuated the novel object recognition deficits caused by a high-dose methamphetamine administration. Similarly, nicotine attenuated methamphetamine-induced deficits in novel object recognition when administered after methamphetamine treatment. However, nicotine did not attenuate the serotonergic deficits caused by methamphetamine in adults. Conversely, nicotine attenuated methamphetamine-induced deficits in α4β2 nicotinic acetylcholine receptor density in the hippocampal CA1 region. Furthermore, nicotine increased α4β2 nicotinic acetylcholine receptor density in the hippocampal CA3, dentate gyrus and perirhinal cortex in both saline- and methamphetamine-treated rats. 
Conclusions: Overall, these findings suggest that nicotine-induced increases in α4β2 nicotinic acetylcholine receptors in the hippocampus and perirhinal cortex might be one mechanism by which novel object recognition deficits are attenuated by nicotine in methamphetamine-treated rats. PMID:26164716
Image registration under translation and rotation in two-dimensional planes using Fourier slice theorem.

PubMed

Pohit, M; Sharma, J

2015-05-10

Image recognition in the presence of both rotation and translation is a longstanding problem in correlation pattern recognition. Use of the log-polar transform gives a solution to this problem, but at the cost of losing the vital phase information from the image. The main objective of this paper is to develop an algorithm based on the Fourier slice theorem for measuring the simultaneous rotation and translation of an object in a 2D plane. The algorithm is applicable to any arbitrary object shift for a full 180° rotation.
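The Fourier slice theorem that the algorithm builds on can be checked numerically in a few lines. This is only a sketch of the theorem itself in its discrete, axis-aligned form, not the registration algorithm of the paper; the random test image and the function name are assumptions.

```python
import numpy as np

def central_slice_error(image):
    """Largest gap between the 1D FFT of a projection and the matching 2D FFT slice."""
    projection = image.sum(axis=0)            # integrate the image along its rows
    slice_from_projection = np.fft.fft(projection)
    central_slice = np.fft.fft2(image)[0, :]  # zero-frequency row of the 2D spectrum
    return np.max(np.abs(slice_from_projection - central_slice))

rng = np.random.default_rng(0)
image = rng.random((32, 32))
err = central_slice_error(image)

# A circular shift of the image changes only the phase of the slice,
# so the magnitude of the projection spectrum is unchanged.
shifted = np.roll(image, shift=(5, 3), axis=(0, 1))
mag_diff = np.max(np.abs(
    np.abs(np.fft.fft(shifted.sum(axis=0))) - np.abs(np.fft.fft(image.sum(axis=0)))
))
print(err, mag_diff)
```

The second check hints at why phase matters for the problem stated above: translation lives entirely in the phase of the spectrum, so discarding phase discards the shift.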
Emergence of transformation-tolerant representations of visual objects in rat lateral extrastriate cortex

PubMed Central

Tafazoli, Sina; Safaai, Houman; De Franceschi, Gioia; Rosselli, Federica Bianca; Vanzella, Walter; Riggi, Margherita; Buffolo, Federica; Panzeri, Stefano; Zoccolan, Davide

2017-01-01

Rodents are emerging as increasingly popular models of visual functions. Yet, evidence that rodent visual cortex is capable of advanced visual processing, such as object recognition, is limited. Here we investigate how neurons located along the progression of extrastriate areas that, in the rat brain, run laterally to primary visual cortex, encode object information. We found a progressive functional specialization of neural responses along these areas, with: (1) a sharp reduction of the amount of low-level, energy-related visual information encoded by neuronal firing; and (2) a substantial increase in the ability of both single neurons and neuronal populations to support discrimination of visual objects under identity-preserving transformations (e.g., position and size changes). These findings strongly argue for the existence of a rat object-processing pathway, and point to the rodents as promising models to dissect the neuronal circuitry underlying transformation-tolerant recognition of visual objects. DOI: http://dx.doi.org/10.7554/eLife.22794.001 PMID:28395730
On the Use of Sensor Fusion to Reduce the Impact of Rotational and Additive Noise in Human Activity Recognition

PubMed Central

Banos, Oresti; Damas, Miguel; Pomares, Hector; Rojas, Ignacio

2012-01-01

The main objective of fusion mechanisms is to increase the individual reliability of the systems through the use of collective knowledge. Moreover, fusion models are also intended to guarantee a certain level of robustness. This is particularly required for problems such as human activity recognition, where runtime changes in the sensor setup seriously disturb the reliability of the initially deployed systems. For commonly used recognition systems based on inertial sensors, these changes are primarily characterized as sensor rotations, displacements or faults related to the batteries or calibration. In this work we show the robustness capabilities of a sensor-weighted fusion model when dealing with such disturbances under different circumstances. Using the proposed method, an improvement of up to 60% is obtained when a minority of the sensors are artificially rotated or degraded, independent of the level of disturbance (noise) imposed. These robustness capabilities also apply for any number of sensors affected by a low to moderate noise level. The presented fusion mechanism compensates for the poor performance that would otherwise be obtained when just a single sensor is considered. PMID:22969386
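One simple way to picture a sensor-weighted fusion rule is a weighted vote over per-sensor decisions, sketched below. The abstract does not spell out the weighting scheme, so this is not the authors' model; the sensor names, labels, and weights are hypothetical.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Fuse per-sensor activity labels by summing each sensor's weight per label."""
    scores = defaultdict(float)
    for sensor, label in predictions.items():
        scores[label] += weights[sensor]
    return max(scores, key=scores.get)

# Hypothetical per-sensor decisions; the ankle sensor is assumed to be rotated.
predictions = {"wrist": "walking", "hip": "walking", "ankle": "running"}
# Hypothetical reliability weights, e.g. each sensor's validation accuracy.
weights = {"wrist": 0.8, "hip": 0.9, "ankle": 0.3}

fused = weighted_vote(predictions, weights)
print(fused)
```

A single disturbed sensor is outvoted here, which matches the case of a minority of degraded sensors described above.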
On the use of sensor fusion to reduce the impact of rotational and additive noise in human activity recognition.

PubMed

Banos, Oresti; Damas, Miguel; Pomares, Hector; Rojas, Ignacio

2012-01-01

The main objective of fusion mechanisms is to increase the individual reliability of the systems through the use of collective knowledge. Moreover, fusion models are also intended to guarantee a certain level of robustness. This is particularly required for problems such as human activity recognition, where runtime changes in the sensor setup seriously disturb the reliability of the initially deployed systems. For commonly used recognition systems based on inertial sensors, these changes are primarily characterized as sensor rotations, displacements or faults related to the batteries or calibration. In this work we show the robustness capabilities of a sensor-weighted fusion model when dealing with such disturbances under different circumstances. Using the proposed method, an improvement of up to 60% is obtained when a minority of the sensors are artificially rotated or degraded, independent of the level of disturbance (noise) imposed. These robustness capabilities also apply for any number of sensors affected by a low to moderate noise level. The presented fusion mechanism compensates for the poor performance that would otherwise be obtained when just a single sensor is considered.
Eye movements during object recognition in visual agnosia.

PubMed

Charles Leek, E; Patterson, Candy; Paul, Matthew A; Rafal, Robert; Cristino, Filipe

2012-07-01

This paper reports the first ever detailed study about eye movement patterns during single object recognition in visual agnosia. Eye movements were recorded in a patient with an integrative agnosic deficit during two recognition tasks: common object naming and novel object recognition memory. The patient showed normal directional biases in saccades and fixation dwell times in both tasks and was as likely as controls to fixate within object bounding contour regardless of recognition accuracy. In contrast, following initial saccades of similar amplitude to controls, the patient showed a bias for short saccades. In object naming, but not in recognition memory, the similarity of the spatial distributions of patient and control fixations was modulated by recognition accuracy. The study provides new evidence about how eye movements can be used to elucidate the functional impairments underlying object recognition deficits. We argue that the results reflect a breakdown in normal functional processes involved in the integration of shape information across object structure during the visual perception of shape. Copyright © 2012 Elsevier Ltd. All rights reserved.
Automatic Target Recognition Based on Cross-Plot

PubMed Central

Wong, Kelvin Kian Loong; Abbott, Derek

2011-01-01

Automatic target recognition that relies on rapid feature extraction of real-time targets from photo-realistic imaging will enable efficient identification of target patterns. To achieve this objective, cross-plots of binary patterns are explored as potential signatures for the observed target, capturing the crucial spatial features at high speed with minimal computational resources. Target recognition was implemented based on the proposed pattern recognition concept and tested rigorously for its precision and recall performance. We conclude that cross-plotting is able to produce a digital fingerprint of a target that correlates efficiently and effectively with the signatures of patterns whose identities are stored in a target repository. PMID:21980508
Improved object optimal synthetic description, modeling, learning, and discrimination by GEOGINE computational kernel

NASA Astrophysics Data System (ADS)

Fiorini, Rodolfo A.; Dacquino, Gianfranco

2005-03-01

GEOGINE (GEOmetrical enGINE), a state-of-the-art OMG (Ontological Model Generator) based on n-D Tensor Invariants for n-Dimensional shape/texture optimal synthetic representation, description and learning, was presented recently at previous conferences. Improved computational algorithms based on the computational invariant theory of finite groups in Euclidean space and a demo application are presented. Progressive automatic model generation is discussed. GEOGINE can be used as an efficient computational kernel for fast, reliable application development and delivery, mainly in the technological areas of advanced biomedical engineering, biometrics, intelligent computing, target recognition, content image retrieval, and data mining. Ontology can be regarded as a logical theory accounting for the intended meaning of a formal dictionary, i.e., its ontological commitment to a particular conceptualization of the world object. According to this approach, "n-D Tensor Calculus" can be considered a "Formal Language" to reliably compute optimized "n-Dimensional Tensor Invariants" as specific object "invariant parameter and attribute words" for automated n-Dimensional shape/texture optimal synthetic object description by incremental model generation. The class of those "invariant parameter and attribute words" can be thought of as a specific "Formal Vocabulary" learned from a "Generalized Formal Dictionary" of the "Computational Tensor Invariants" language. Even object chromatic attributes can be effectively and reliably computed from object geometric parameters into robust colour shape invariant characteristics. As a matter of fact, any highly sophisticated application needing effective, robust object geometric/colour invariant attribute capture and parameterization features for reliable automated object learning and discrimination can deeply benefit from the performance of the GEOGINE progressive automated model generation computational kernel.
Main operational advantages over previous, similar approaches are: 1) Progressive Automated Invariant Model Generation, 2) Invariant Minimal Complete Description Set for computational efficiency, 3) Arbitrary Model Precision for robust object description and identification.
Maximal likelihood correspondence estimation for face recognition across pose.

PubMed

Li, Shaoxin; Liu, Xin; Chai, Xiujuan; Zhang, Haihong; Lao, Shihong; Shan, Shiguang

2014-10-01

Due to the misalignment of image features, the performance of many conventional face recognition methods degrades considerably in cross-pose scenarios. To address this problem, many image matching-based methods have been proposed to estimate semantic correspondence between faces in different poses. In this paper, we aim to solve two critical problems in previous image matching-based correspondence learning methods: 1) failure to fully exploit face-specific structure information in correspondence estimation and 2) failure to learn personalized correspondence for each probe image. To this end, we first build a model, termed the morphable displacement field (MDF), to encode face-specific structure information of semantic correspondence from a set of real samples of correspondences calculated from 3D face models. Then, we propose a maximal likelihood correspondence estimation (MLCE) method to learn personalized correspondence based on a maximal likelihood frontal face assumption. After obtaining the semantic correspondence encoded in the learned displacement, we can synthesize virtual frontal images of the profile faces for subsequent recognition. Using the linear discriminant analysis method with pixel-intensity features, state-of-the-art performance is achieved on three multipose benchmarks, i.e., the CMU-PIE, FERET, and MultiPIE databases. Owing to the rational MDF regularization and the use of a novel maximal likelihood objective, the proposed MLCE method can reliably learn correspondence between faces in different poses even in complex wild environments, i.e., the Labeled Faces in the Wild database.
The role of perceptual load in object recognition.

PubMed

Lavie, Nilli; Lin, Zhicheng; Zokaei, Nahid; Thoma, Volker

2009-10-01

Predictions from perceptual load theory (Lavie, 1995, 2005) regarding object recognition across the same or different viewpoints were tested. Results showed that high perceptual load reduces distracter recognition levels despite always presenting distracter objects from the same view. They also showed that the levels of distracter recognition were unaffected by a change in the distracter object view under conditions of low perceptual load. These results were found both with repetition priming measures of distracter recognition and with performance on a surprise recognition memory test. The results support load theory proposals that distracter recognition critically depends on the level of perceptual load. The implications for the role of attention in object recognition theories are discussed. PsycINFO Database Record (c) 2009 APA, all rights reserved.
Analysis and Recognition of Curve Type as The Basis of Object Recognition in Image

NASA Astrophysics Data System (ADS)

Nugraha, Nurma; Madenda, Sarifuddin; Indarti, Dina; Dewi Agushinta, R.; Ernastuti

2016-06-01

An object in an image, when analyzed further, shows characteristics that distinguish it from other objects in the image. Characteristics used for object recognition can include color, shape, pattern, texture, and spatial information that represent objects in the digital image. A method has recently been developed for image feature extraction on objects based on simple-curve analysis and on searching the chain code of the object. This study develops an algorithm for analyzing and recognizing curve types as the basis for object recognition in images, proposing the addition of complex-curve characteristics with a maximum of four branches for use in the recognition process. A complex curve is defined as a curve that has a point of intersection. Applied to several edge-detected images, the algorithm was able to analyze and recognize complex curve shapes well.
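Illustrative sketch (not taken from the paper): the chain code mentioned in the abstract can be pictured with a standard 8-direction Freeman chain code. The direction numbering, the (x, y) convention with y pointing up, and the function names `chain_code` and `count_direction_changes` are assumptions made for this example only.

```python
# Assumed convention: code 0 = east, then counter-clockwise in 45-degree steps,
# with points given as (x, y) and y increasing upward.
FREEMAN_DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def chain_code(points):
    """Encode an ordered list of 8-connected pixel coordinates as Freeman codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in FREEMAN_DIRECTIONS:
            raise ValueError(f"points {(x0, y0)} and {(x1, y1)} are not 8-connected neighbours")
        codes.append(FREEMAN_DIRECTIONS[step])
    return codes


def count_direction_changes(codes):
    """Count positions where consecutive codes differ (a crude curvature cue)."""
    return sum(1 for a, b in zip(codes, codes[1:]) if a != b)


if __name__ == "__main__":
    curve = [(0, 0), (1, 0), (2, 1), (2, 2)]
    print(chain_code(curve))
```

A straight segment yields a constant code sequence, so changes in the sequence are one simple way to separate straight runs from bends; how the paper itself classifies curve types is not specified in the abstract.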
Structured prediction models for RNN based sequence labeling in clinical text.

PubMed

Jagannatha, Abhyuday N; Yu, Hong

2016-11-01

Sequence labeling is a widely used method for named entity recognition and information extraction from unstructured natural language data. In the clinical domain, one major application of sequence labeling involves extraction of medical entities such as medication, indication, and side-effects from Electronic Health Record narratives. Sequence labeling in this domain presents its own set of challenges and objectives. In this work we experimented with various CRF-based structured learning models with Recurrent Neural Networks. We extend the previously studied LSTM-CRF models with explicit modeling of pairwise potentials. We also propose an approximate version of skip-chain CRF inference with RNN potentials. We use these methodologies for structured prediction in order to improve the exact phrase detection of various medical entities.
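Illustrative sketch (not the authors' implementation): in a linear-chain CRF, the best label sequence is found by combining per-token (unary) scores with label-to-label (pairwise) scores, typically via Viterbi decoding. In an LSTM-CRF setup the unary scores would come from the recurrent network; here they are hard-coded, and the function name `viterbi` and the toy scores are assumptions for the example.

```python
def viterbi(unary, pairwise):
    """Return the highest-scoring label path for a linear-chain model.

    unary[t][k]    : score of label k at position t (e.g. produced by an RNN)
    pairwise[j][k] : score of moving from label j to label k
    """
    n_labels = len(unary[0])
    score = list(unary[0])          # best score of any path ending in each label
    backpointers = []
    for t in range(1, len(unary)):
        new_score, pointers = [], []
        for k in range(n_labels):
            best_prev = max(range(n_labels), key=lambda j: score[j] + pairwise[j][k])
            new_score.append(score[best_prev] + pairwise[best_prev][k] + unary[t][k])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final label.
    path = [max(range(n_labels), key=lambda k: score[k])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    unary = [[3, 0], [0, 1], [0, 1]]
    print(viterbi(unary, [[0, 0], [0, 0]]))    # no transition preference
    print(viterbi(unary, [[0, -5], [-5, 0]]))  # label switches are penalised
```

With zero pairwise scores the decoder simply follows the per-token maxima; a strong penalty on label switches changes the decoded sequence, which is the effect pairwise potentials are meant to capture.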
Structured prediction models for RNN based sequence labeling in clinical text

PubMed Central

Jagannatha, Abhyuday N; Yu, Hong

2016-01-01

Sequence labeling is a widely used method for named entity recognition and information extraction from unstructured natural language data. In the clinical domain, one major application of sequence labeling involves extraction of medical entities such as medication, indication, and side-effects from Electronic Health Record narratives. Sequence labeling in this domain presents its own set of challenges and objectives. In this work we experimented with various CRF-based structured learning models with Recurrent Neural Networks. We extend the previously studied LSTM-CRF models with explicit modeling of pairwise potentials. We also propose an approximate version of skip-chain CRF inference with RNN potentials. We use these methodologies for structured prediction in order to improve the exact phrase detection of various medical entities. PMID:28004040
Joint object and action recognition via fusion of partially observable surveillance imagery data

NASA Astrophysics Data System (ADS)

Shirkhodaie, Amir; Chan, Alex L.

2017-05-01

Partially observable group activities (POGA) occurring in confined spaces are epitomized by their limited observability of the objects and actions involved. In many POGA scenarios, different objects are being used by human operators for the conduct of various operations. In this paper, we describe the ontology of such POGA in the context of In-Vehicle Group Activity (IVGA) recognition. Initially, we describe the virtue of ontology modeling in the context of IVGA and show how such an ontology and a priori knowledge about the classes of in-vehicle activities can be fused for inference of human actions that consequentially leads to understanding of human activity inside the confined space of a vehicle. In this paper, we treat the problem of "action-object" as a duality problem. We postulate a correlation between observed human actions and the object that is being utilized within those actions, and conversely, if an object being handled is recognized, we may be able to expect a number of actions that are likely to be performed on that object. In this study, we use partially observable human postural sequences to recognize actions. Inspired by convolutional neural networks (CNNs) learning capability, we present an architecture design using a new CNN model to learn "action-object" perception from surveillance videos. In this study, we apply a sequential Deep Hidden Markov Model (DHMM) as a post-processor to CNN to decode realized observations into recognized actions and activities. To generate the needed imagery data set for the training and testing of these new methods, we use the IRIS virtual simulation software to generate high-fidelity and dynamic animated scenarios that depict in-vehicle group activities under different operational contexts. The results of our comparative investigation are discussed and presented in detail.
Objects predict fixations better than early saliency.

PubMed

Einhäuser, Wolfgang; Spain, Merrielle; Perona, Pietro

2008-11-20

Humans move their eyes while looking at scenes and pictures. Eye movements correlate with shifts in attention and are thought to be a consequence of optimal resource allocation for high-level tasks such as visual recognition. Models of attention, such as "saliency maps," are often built on the assumption that "early" features (color, contrast, orientation, motion, and so forth) drive attention directly. We explore an alternative hypothesis: Observers attend to "interesting" objects. To test this hypothesis, we measure the eye position of human observers while they inspect photographs of common natural scenes. Our observers perform different tasks: artistic evaluation, analysis of content, and search. Immediately after each presentation, our observers are asked to name objects they saw. Weighted with recall frequency, these objects predict fixations in individual images better than early saliency, irrespective of task. Also, saliency combined with object positions predicts which objects are frequently named. This suggests that early saliency has only an indirect effect on attention, acting through recognized objects. Consequently, rather than treating attention as a mere preprocessing step for object recognition, models of both need to be integrated.
A Low-Cost EEG System-Based Hybrid Brain-Computer Interface for Humanoid Robot Navigation and Recognition

PubMed Central

Choi, Bongjae; Jo, Sungho

2013-01-01

This paper describes a hybrid brain-computer interface (BCI) technique that combines the P300 potential, the steady state visually evoked potential (SSVEP), and event related de-synchronization (ERD) to solve a complicated multi-task problem consisting of humanoid robot navigation and control along with object recognition using a low-cost BCI system. Our approach enables subjects to control the navigation and exploration of a humanoid robot and recognize a desired object among candidates. This study aims to demonstrate the possibility of a hybrid BCI based on a low-cost system for a realistic and complex task. It also shows that the use of a simple image processing technique, combined with BCI, can further aid in making these complex tasks simpler. An experimental scenario is proposed in which a subject remotely controls a humanoid robot in a properly sized maze. The subject sees what the surrogate robot sees through visual feedback and can navigate the surrogate robot. While navigating, the robot encounters objects located in the maze. It then recognizes if the encountered object is of interest to the subject. The subject communicates with the robot through SSVEP and ERD-based BCIs to navigate and explore with the robot, and P300-based BCI to allow the surrogate robot to recognize their favorites. Using several evaluation metrics, the performances of five subjects navigating the robot were quite comparable to manual keyboard control. During object recognition mode, favorite objects were successfully selected from two to four choices. Subjects conducted humanoid navigation and recognition tasks as if they embodied the robot. Analysis of the data supports the potential usefulness of the proposed hybrid BCI system for extended applications. This work presents an important implication for future work: a hybridization of simple BCI protocols provides extended controllability to carry out complicated tasks even with a low-cost system. PMID:24023953
A low-cost EEG system-based hybrid brain-computer interface for humanoid robot navigation and recognition.

PubMed

Choi, Bongjae; Jo, Sungho

2013-01-01

This paper describes a hybrid brain-computer interface (BCI) technique that combines the P300 potential, the steady state visually evoked potential (SSVEP), and event related de-synchronization (ERD) to solve a complicated multi-task problem consisting of humanoid robot navigation and control along with object recognition using a low-cost BCI system. Our approach enables subjects to control the navigation and exploration of a humanoid robot and recognize a desired object among candidates. This study aims to demonstrate the possibility of a hybrid BCI based on a low-cost system for a realistic and complex task. It also shows that the use of a simple image processing technique, combined with BCI, can further aid in making these complex tasks simpler. An experimental scenario is proposed in which a subject remotely controls a humanoid robot in a properly sized maze. The subject sees what the surrogate robot sees through visual feedback and can navigate the surrogate robot. While navigating, the robot encounters objects located in the maze. It then recognizes if the encountered object is of interest to the subject. The subject communicates with the robot through SSVEP and ERD-based BCIs to navigate and explore with the robot, and P300-based BCI to allow the surrogate robot to recognize their favorites. Using several evaluation metrics, the performances of five subjects navigating the robot were quite comparable to manual keyboard control. During object recognition mode, favorite objects were successfully selected from two to four choices. Subjects conducted humanoid navigation and recognition tasks as if they embodied the robot. Analysis of the data supports the potential usefulness of the proposed hybrid BCI system for extended applications. This work presents an important implication for future work: a hybridization of simple BCI protocols provides extended controllability to carry out complicated tasks even with a low-cost system.
Two Visual Pathways in Primates Based on Sampling of Space: Exploitation and Exploration of Visual Information

PubMed Central

Sheth, Bhavin R.; Young, Ryan

2016-01-01

Evidence is strong that the visual pathway is segregated into two distinct streams—ventral and dorsal. Two proposals theorize that the pathways are segregated in function: The ventral stream processes information about object identity, whereas the dorsal stream, according to one model, processes information about either object location, and according to another, is responsible in executing movements under visual control. The models are influential; however recent experimental evidence challenges them, e.g., the ventral stream is not solely responsible for object recognition; conversely, its function is not strictly limited to object vision; the dorsal stream is not responsible by itself for spatial vision or visuomotor control; conversely, its function extends beyond vision or visuomotor control. In their place, we suggest a robust dichotomy consisting of a ventral stream selectively sampling high-resolution/focal spaces, and a dorsal stream sampling nearly all of space with reduced foveal bias. The proposal hews closely to the theme of embodied cognition: Function arises as a consequence of an extant sensory underpinning. A continuous, not sharp, segregation based on function emerges, and carries with it an undercurrent of an exploitation-exploration dichotomy. Under this interpretation, cells of the ventral stream, which individually have more punctate receptive fields that generally include the fovea or parafovea, provide detailed information about object shapes and features and lead to the systematic exploitation of said information; cells of the dorsal stream, which individually have large receptive fields, contribute to visuospatial perception, provide information about the presence/absence of salient objects and their locations for novel exploration and subsequent exploitation by the ventral stream or, under certain conditions, the dorsal stream. 
We leverage the dichotomy to unify neuropsychological cases under a common umbrella, account for the increased prevalence of multisensory integration in the dorsal stream under a Bayesian framework, predict conditions under which object recognition utilizes the ventral or dorsal stream, and explain why cells of the dorsal stream drive sensorimotor control and motion processing and have poorer feature selectivity. Finally, the model speculates on a dynamic interaction between the two streams that underscores a unified, seamless perception. Existing theories are subsumed under our proposal. PMID:27920670
Two Visual Pathways in Primates Based on Sampling of Space: Exploitation and Exploration of Visual Information.

PubMed

Sheth, Bhavin R; Young, Ryan

2016-01-01

Evidence is strong that the visual pathway is segregated into two distinct streams-ventral and dorsal. Two proposals theorize that the pathways are segregated in function: The ventral stream processes information about object identity, whereas the dorsal stream, according to one model, processes information about either object location, and according to another, is responsible in executing movements under visual control. The models are influential; however recent experimental evidence challenges them, e.g., the ventral stream is not solely responsible for object recognition; conversely, its function is not strictly limited to object vision; the dorsal stream is not responsible by itself for spatial vision or visuomotor control; conversely, its function extends beyond vision or visuomotor control. In their place, we suggest a robust dichotomy consisting of a ventral stream selectively sampling high-resolution/ focal spaces, and a dorsal stream sampling nearly all of space with reduced foveal bias. The proposal hews closely to the theme of embodied cognition: Function arises as a consequence of an extant sensory underpinning. A continuous, not sharp, segregation based on function emerges, and carries with it an undercurrent of an exploitation-exploration dichotomy. Under this interpretation, cells of the ventral stream, which individually have more punctate receptive fields that generally include the fovea or parafovea, provide detailed information about object shapes and features and lead to the systematic exploitation of said information; cells of the dorsal stream, which individually have large receptive fields, contribute to visuospatial perception, provide information about the presence/absence of salient objects and their locations for novel exploration and subsequent exploitation by the ventral stream or, under certain conditions, the dorsal stream. 
We leverage the dichotomy to unify neuropsychological cases under a common umbrella, account for the increased prevalence of multisensory integration in the dorsal stream under a Bayesian framework, predict conditions under which object recognition utilizes the ventral or dorsal stream, and explain why cells of the dorsal stream drive sensorimotor control and motion processing and have poorer feature selectivity. Finally, the model speculates on a dynamic interaction between the two streams that underscores a unified, seamless perception. Existing theories are subsumed under our proposal.

Higher-Order Neural Networks Applied to 2D and 3D Object Recognition

NASA Technical Reports Server (NTRS)

Spirkovska, Lilly; Reid, Max B.

1994-01-01

A Higher-Order Neural Network (HONN) can be designed to be invariant to geometric transformations such as scale, translation, and in-plane rotation. Invariances are built directly into the architecture of a HONN and do not need to be learned. Thus, for 2D object recognition, the network needs to be trained on just one view of each object class, not numerous scaled, translated, and rotated views. Because the 2D object recognition task is a component of the 3D object recognition task, built-in 2D invariance also decreases the size of the training set required for 3D object recognition. We present results for 2D object recognition both in simulation and within a robotic vision experiment and for 3D object recognition in simulation. We also compare our method to other approaches and show that HONNs have distinct advantages for position, scale, and rotation-invariant object recognition. The major drawback of HONNs is that the size of the input field is limited due to the memory required for the large number of interconnections in a fully connected network. We present partial connectivity strategies and a coarse-coding technique for overcoming this limitation and increasing the input field to that required by practical object recognition problems.
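Illustrative sketch (not taken from the report): one way to see how invariance can be built into an architecture rather than learned is to note that the interior angles of the triangle formed by any three input points do not change under translation, scaling, or in-plane rotation. A third-order network could therefore share weights among point triples that form similar triangles; that weight-sharing scheme is an assumption of this example, as are the function names.

```python
import math


def triangle_angles(p, q, r):
    """Return the sorted interior angles (radians) of triangle p-q-r.

    Assumes the three points are distinct.
    """
    def angle_at(a, b, c):
        # Angle at vertex a between the rays a->b and a->c.
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - a[0], c[1] - a[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        norm = math.hypot(*v1) * math.hypot(*v2)
        return math.acos(max(-1.0, min(1.0, dot / norm)))

    return sorted([angle_at(p, q, r), angle_at(q, p, r), angle_at(r, p, q)])


def similarity_transform(point, scale, theta, shift):
    """Rotate by theta, scale uniformly, then translate a 2D point."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (scale * (c * x - s * y) + shift[0],
            scale * (s * x + c * y) + shift[1])


if __name__ == "__main__":
    original = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
    moved = [similarity_transform(p, 2.5, 0.7, (10.0, -3.0)) for p in original]
    print(triangle_angles(*original))
    print(triangle_angles(*moved))  # same angles up to rounding error
```

Because the angle triple is unchanged by the transform, any feature or weight indexed by it responds identically to scaled, shifted, and rotated copies of the same input.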
A Neural Network Architecture For Rapid Model Indexing In Computer Vision Systems

NASA Astrophysics Data System (ADS)

Pawlicki, Ted

1988-03-01

Models of objects stored in memory have been shown to be useful for guiding the processing of computer vision systems. A major consideration in such systems, however, is how stored models are initially accessed and indexed by the system. As the number of stored models increases, the time required to search memory for the correct model becomes high. Parallel, distributed, connectionist neural networks have been shown to have appealing content addressable memory properties. This paper discusses an architecture for efficient storage and reference of model memories stored as stable patterns of activity in a parallel, distributed, connectionist neural network. The emergent properties of content addressability and resistance to noise are exploited to perform indexing of the appropriate object centered model from image centered primitives. The system consists of three network modules, each of which represents information relative to a different frame of reference. The model memory network is a large state space vector where fields in the vector correspond to ordered component objects and relative, object based spatial relationships between the component objects. The component assertion network represents evidence about the existence of object primitives in the input image. It establishes local frames of reference for object primitives relative to the image based frame of reference. The spatial relationship constraint network is an intermediate representation which enables the association between the object based and the image based frames of reference. This intermediate level represents information about possible object orderings and establishes relative spatial relationships from the image based information in the component assertion network below. It is also constrained by the lawful object orderings in the model memory network above. The system design is consistent with current psychological theories of recognition by component. 
It also seems to support Marr's notions of hierarchical indexing (i.e., the specificity, adjunct, and parent indices). It supports the notion that multiple canonical views of an object may have to be stored in memory to enable its efficient identification. The use of variable fields in the state space vectors appears to keep the number of required nodes in the network down to a tractable number while imposing a semantic value on different areas of the state space. This semantic imposition supports an interface between the analogical aspects of neural networks and the propositional paradigms of symbolic processing.
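Illustrative sketch (not the architecture described in the paper): the content addressability and noise resistance mentioned above can be demonstrated with a classic Hopfield-style network, in which stored patterns are stable states and a corrupted cue settles to the nearest stored pattern. The Hebbian storage rule, the toy patterns, and the function names are assumptions for this example.

```python
def train_hopfield(patterns):
    """Build a symmetric weight matrix from +1/-1 patterns (Hebbian rule, zero diagonal)."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += pattern[i] * pattern[j]
    return weights


def recall(weights, cue, max_sweeps=10):
    """Asynchronously update units until the state stops changing."""
    state = list(cue)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            field = sum(w * s for w, s in zip(weights[i], state))
            # Keep the current value when the input field is exactly zero.
            new_value = state[i] if field == 0 else (1 if field > 0 else -1)
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


if __name__ == "__main__":
    stored = [
        [1, 1, 1, 1, -1, -1, -1, -1],
        [1, -1, 1, -1, 1, -1, 1, -1],
    ]
    weights = train_hopfield(stored)
    noisy = [-1, 1, 1, 1, -1, -1, -1, -1]  # first stored pattern with one unit flipped
    print(recall(weights, noisy))
```

The memory is addressed by (partial or noisy) content rather than by an index, which is the property the abstract proposes to exploit for model indexing.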
Stereo Viewing Modulates Three-Dimensional Shape Processing During Object Recognition: A High-Density ERP Study

PubMed Central

2017-01-01

The role of stereo disparity in the recognition of 3-dimensional (3D) object shape remains an unresolved issue for theoretical models of the human visual system. We examined this issue using high-density (128 channel) recordings of event-related potentials (ERPs). A recognition memory task was used in which observers were trained to recognize a subset of complex, multipart, 3D novel objects under conditions of either (bi-) monocular or stereo viewing. In a subsequent test phase they discriminated previously trained targets from untrained distractor objects that shared either local parts, 3D spatial configuration, or neither dimension, across both previously seen and novel viewpoints. The behavioral data showed a stereo advantage for target recognition at untrained viewpoints. ERPs showed early differential amplitude modulations to shape similarity defined by local part structure and global 3D spatial configuration. This occurred initially during an N1 component around 145–190 ms poststimulus onset, and then subsequently during an N2/P3 component around 260–385 ms poststimulus onset. For mono viewing, amplitude modulation during the N1 was greatest between targets and distracters with different local parts for trained views only. For stereo viewing, amplitude modulation during the N2/P3 was greatest between targets and distracters with different global 3D spatial configurations and generalized across trained and untrained views. The results show that image classification is modulated by stereo information about the local part, and global 3D spatial configuration of object shape. The findings challenge current theoretical models that do not attribute functional significance to stereo input during the computation of 3D object shape. PMID:29022728
Convolutional Neural Network-Based Embarrassing Situation Detection under Camera for Social Robot in Smart Homes

PubMed Central

Yang, Guanci; Yang, Jing; Sheng, Weihua; Junior, Francisco Erivaldo Fernandes; Li, Shaobo

2018-01-01

Recent research has shown that the ubiquitous use of cameras and voice monitoring equipment in a home environment can raise privacy concerns and affect human mental health. This can be a major obstacle to the deployment of smart home systems for elderly or disabled care. This study uses a social robot to detect embarrassing situations. Firstly, we designed an improved neural network structure based on the You Only Look Once (YOLO) model to obtain feature information. By focusing on reducing area redundancy and computation time, we proposed a bounding-box merging algorithm based on region proposal networks (B-RPN), to merge the areas that have similar features and determine the borders of the bounding box. Thereafter, we designed a feature extraction algorithm based on our improved YOLO and B-RPN, called F-YOLO, for our training datasets, and then proposed a real-time object detection algorithm based on F-YOLO (RODA-FY). We implemented RODA-FY and compared models on our MAT social robot. Secondly, we considered six types of situations in smart homes, and developed training and validation datasets, containing 2580 and 360 images, respectively. Meanwhile, we designed three types of experiments with four types of test datasets composed of 960 sample images. Thirdly, we analyzed how a different number of training iterations affects our prediction estimation, and then we explored the relationship between recognition accuracy and learning rates. Our results show that our proposed privacy detection system can recognize designed situations in the smart home with an acceptable recognition accuracy of 94.48%. Finally, we compared the results among RODA-FY, Inception V3, and YOLO, which indicate that our proposed RODA-FY outperforms the other comparison models in recognition accuracy. PMID:29757211
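Illustrative sketch (not the B-RPN algorithm from the paper): the abstract describes merging candidate areas to reduce redundancy but does not give the merging rule. The example below swaps in a common stand-in, greedy merging of boxes whose intersection-over-union (IoU) exceeds a threshold. The box format `(x1, y1, x2, y2)`, the threshold value, and the function names are assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_boxes(boxes, threshold=0.5):
    """Greedily replace overlapping boxes with their enclosing box."""
    merged = []
    for box in boxes:
        for idx, kept in enumerate(merged):
            if iou(box, kept) >= threshold:
                # Grow the kept box so that it covers both boxes.
                merged[idx] = (min(kept[0], box[0]), min(kept[1], box[1]),
                               max(kept[2], box[2]), max(kept[3], box[3]))
                break
        else:
            merged.append(tuple(box))
    return merged


if __name__ == "__main__":
    candidates = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    print(merge_boxes(candidates))
```

A feature-similarity criterion, as the abstract suggests, could replace the `iou` test without changing the overall merging loop.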
Convolutional Neural Network-Based Embarrassing Situation Detection under Camera for Social Robot in Smart Homes.

PubMed

Yang, Guanci; Yang, Jing; Sheng, Weihua; Junior, Francisco Erivaldo Fernandes; Li, Shaobo

2018-05-12

Recent research has shown that the ubiquitous use of cameras and voice monitoring equipment in a home environment can raise privacy concerns and affect human mental health. This can be a major obstacle to the deployment of smart home systems for elderly or disabled care. This study uses a social robot to detect embarrassing situations. Firstly, we designed an improved neural network structure based on the You Only Look Once (YOLO) model to obtain feature information. By focusing on reducing area redundancy and computation time, we proposed a bounding-box merging algorithm based on region proposal networks (B-RPN), to merge the areas that have similar features and determine the borders of the bounding box. Thereafter, we designed a feature extraction algorithm based on our improved YOLO and B-RPN, called F-YOLO, for our training datasets, and then proposed a real-time object detection algorithm based on F-YOLO (RODA-FY). We implemented RODA-FY and compared models on our MAT social robot. Secondly, we considered six types of situations in smart homes, and developed training and validation datasets, containing 2580 and 360 images, respectively. Meanwhile, we designed three types of experiments with four types of test datasets composed of 960 sample images. Thirdly, we analyzed how a different number of training iterations affects our prediction estimation, and then we explored the relationship between recognition accuracy and learning rates. Our results show that our proposed privacy detection system can recognize designed situations in the smart home with an acceptable recognition accuracy of 94.48%. Finally, we compared the results among RODA-FY, Inception V3, and YOLO, which indicate that our proposed RODA-FY outperforms the other comparison models in recognition accuracy.
Assessment of Motor Function, Sensory Motor Gating and Recognition Memory in a Novel BACHD Transgenic Rat Model for Huntington Disease

PubMed Central

Abada, Yah-se K.; Nguyen, Huu Phuc; Schreiber, Rudy; Ellenbroek, Bart

2013-01-01

Rationale: Huntington disease (HD) is frequently first diagnosed by the appearance of motor symptoms; the diagnosis is subsequently confirmed by the presence of expanded CAG repeats (> 35) in the HUNTINGTIN (HTT) gene. A BACHD rat model for HD carrying the human full length mutated HTT with 97 CAG-CAA repeats has been established recently. Behavioral phenotyping of BACHD rats will help to determine the validity of this model and its potential use in preclinical drug discovery studies. Objectives: The present study seeks to characterize the progressive emergence of motor, sensorimotor and cognitive deficits in BACHD rats. Materials and Methods: Wild type and transgenic rats were tested from 1 to 12 months of age. Motor tests were selected to measure spontaneous locomotor activity (open field) and gait coordination. Sensorimotor gating was assessed in acoustic startle response paradigms and recognition memory was evaluated in an object recognition test. Results: Transgenic rats showed hyperactivity at 1 month and hypoactivity starting at 4 months of age. Motor coordination imbalance in a Rotarod test was present at 2 months and gait abnormalities were seen in a Catwalk test at 12 months. Subtle sensorimotor changes were observed, whereas object recognition was unimpaired in BACHD rats up to 12 months of age. Conclusion: The current BACHD rat model recapitulates certain symptoms from HD patients, especially the marked motor deficits. A subtle neuropsychological phenotype was found and further studies are needed to fully address the sensorimotor phenotype and the potential use of BACHD rats for drug discovery purposes. PMID:23874679
Astrocytes contribute to gamma oscillations and recognition memory.

PubMed

Lee, Hosuk Sean; Ghetti, Andrea; Pinto-Duarte, António; Wang, Xin; Dziewczapolski, Gustavo; Galimi, Francesco; Huitron-Resendiz, Salvador; Piña-Crespo, Juan C; Roberts, Amanda J; Verma, Inder M; Sejnowski, Terrence J; Heinemann, Stephen F

2014-08-12

Glial cells are an integral part of functional communication in the brain. Here we show that astrocytes contribute to the fast dynamics of neural circuits that underlie normal cognitive behaviors. In particular, we found that the selective expression of tetanus neurotoxin (TeNT) in astrocytes significantly reduced the duration of carbachol-induced gamma oscillations in hippocampal slices. These data prompted us to develop a novel transgenic mouse model, specifically with inducible tetanus toxin expression in astrocytes. In this in vivo model, we found evidence of a marked decrease in electroencephalographic (EEG) power in the gamma frequency range in awake-behaving mice, whereas neuronal synaptic activity remained intact. The reduction in cortical gamma oscillations was accompanied by impaired behavioral performance in the novel object recognition test, whereas other forms of memory, including working memory and fear conditioning, remained unchanged. These results support a key role for gamma oscillations in recognition memory. Both EEG alterations and behavioral deficits in novel object recognition were reversed by suppression of tetanus toxin expression. These data reveal an unexpected role for astrocytes as essential contributors to information processing and cognitive behavior.
Method and System for Object Recognition Search

NASA Technical Reports Server (NTRS)

Duong, Tuan A. (Inventor); Duong, Vu A. (Inventor); Stubberud, Allen R. (Inventor)

2012-01-01

A method for object recognition using shape and color features of the object to be recognized. An adaptive architecture is used to recognize and adapt the shape and color features for moving objects to enable object recognition.
Applied virtual reality at the Research Triangle Institute

NASA Technical Reports Server (NTRS)

Montoya, R. Jorge

1994-01-01

Virtual Reality (VR) is a way for humans to use computers in visualizing, manipulating and interacting with large geometric data bases. This paper describes a VR infrastructure and its application to marketing, modeling, architectural walk through, and training problems. VR integration techniques used in these applications are based on a uniform approach which promotes portability and reusability of developed modules. For each problem, a 3D object data base is created using data captured by hand or electronically. The object's realism is enhanced through either procedural or photo textures. The virtual environment is created and populated with the data base using software tools which also support interactions with and immersivity in the environment. These capabilities are augmented by other sensory channels such as voice recognition, 3D sound, and tracking. Four applications are presented: a virtual furniture showroom, virtual reality models of the North Carolina Global TransPark, a walk through the Dresden Frauenkirche, and the maintenance training simulator for the National Guard.
Object memory effects on figure assignment: conscious object recognition is not necessary or sufficient.

PubMed

Peterson, M A; de Gelder, B; Rapcsak, S Z; Gerhardstein, P C; Bachoud-Lévi, A

2000-01-01

In three experiments we investigated whether conscious object recognition is necessary or sufficient for effects of object memories on figure assignment. In experiment 1, we examined a brain-damaged participant, AD, whose conscious object recognition is severely impaired. AD's responses about figure assignment do reveal effects from memories of object structure, indicating that conscious object recognition is not necessary for these effects, and identifying the figure-ground test employed here as a new implicit test of access to memories of object structure. In experiments 2 and 3, we tested a second brain-damaged participant, WG, for whom conscious object recognition was relatively spared. Nevertheless, effects from memories of object structure on figure assignment were not evident in WG's responses about figure assignment in experiment 2, indicating that conscious object recognition is not sufficient for effects of object memories on figure assignment. WG's performance sheds light on AD's performance, and has implications for the theoretical understanding of object memory effects on figure assignment.
Circular blurred shape model for multiclass symbol recognition.

PubMed

Escalera, Sergio; Fornés, Alicia; Pujol, Oriol; Lladós, Josep; Radeva, Petia

2011-04-01

In this paper, we propose a circular blurred shape model descriptor to deal with the problem of symbol detection and classification as a particular case of object recognition. The feature extraction is performed by capturing the spatial arrangement of significant object characteristics in a correlogram structure. The shape information from objects is shared among correlogram regions, where a prior blurring degree defines the level of distortion allowed in the symbol, making the descriptor tolerant to irregular deformations. Moreover, the descriptor is rotation invariant by definition. We validate the effectiveness of the proposed descriptor in both the multiclass symbol recognition and symbol detection domains. In order to perform the symbol detection, the descriptors are learned using a cascade of classifiers. In the case of multiclass categorization, the new feature space is learned using a set of binary classifiers which are embedded in an error-correcting output code design. The results over four symbol data sets show the significant improvements of the proposed descriptor compared to the state-of-the-art descriptors. In particular, the results are even more significant in those cases where the symbols suffer from elastic deformations.
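The multiclass step described above relies on an error-correcting output code (ECOC) design. The following is a minimal sketch of ECOC decoding only, not the authors' implementation: the 4-class, 7-bit code matrix, the function name `ecoc_decode`, and the use of Hamming distance are illustrative assumptions. Each column of the matrix stands for one binary classifier, and a test sample is assigned to the class whose codeword is closest to the vector of binary outputs.

```python
import numpy as np

# Hypothetical code matrix: one row (codeword) per class, one column per
# binary classifier. Any two rows differ in 4 positions, so a single
# wrong binary classifier can still be corrected.
CODE_MATRIX = np.array([
    [1, 1, 1, 1, 1, 1, 1],  # class 0
    [0, 0, 0, 0, 1, 1, 1],  # class 1
    [0, 0, 1, 1, 0, 0, 1],  # class 2
    [0, 1, 0, 1, 0, 1, 0],  # class 3
])


def ecoc_decode(code_matrix, binary_outputs):
    """Return the index of the class whose codeword has the smallest
    Hamming distance to the observed binary classifier outputs."""
    code_matrix = np.asarray(code_matrix)
    outputs = np.asarray(binary_outputs)
    if outputs.shape != (code_matrix.shape[1],):
        raise ValueError("need one output per binary classifier")
    # Count disagreeing bits between the outputs and every codeword.
    distances = np.sum(code_matrix != outputs, axis=1)
    return int(np.argmin(distances))


# Codeword of class 2 with the first bit flipped by a faulty classifier.
noisy = [1, 0, 1, 1, 0, 0, 1]
predicted_class = ecoc_decode(CODE_MATRIX, noisy)
```

The redundancy in the codewords is the design choice that matters here: with a minimum row distance of 4, one erroneous binary decision still decodes to the intended class.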
Random-Profiles-Based 3D Face Recognition System

PubMed Central

Joongrock, Kim; Sunjin, Yu; Sangyoun, Lee

2014-01-01

In this paper, a novel nonintrusive three-dimensional (3D) face modeling system for random-profile-based 3D face recognition is presented. Although recent two-dimensional (2D) face recognition systems can achieve a reliable recognition rate under certain conditions, their performance is limited by internal and external changes, such as illumination and pose variation. To address these issues, 3D face recognition, which uses 3D face data, has recently received much attention. However, the performance of 3D face recognition highly depends on the precision of acquired 3D face data, while also requiring more computational power and storage capacity than 2D face recognition systems. In this paper, we present a developed nonintrusive 3D face modeling system composed of a stereo vision system and an invisible near-infrared line laser, which can be directly applied to profile-based 3D face recognition. We further propose a novel random-profile-based 3D face recognition method that is memory-efficient and pose-invariant. The experimental results demonstrate that the reconstructed 3D face data consist of more than 50 k 3D points and that the method achieves a reliable recognition rate against pose variation. PMID:24691101
The Invariance Hypothesis Implies Domain-Specific Regions in Visual Cortex

PubMed Central

Leibo, Joel Z.; Liao, Qianli; Anselmi, Fabio; Poggio, Tomaso

2015-01-01

Is visual cortex made up of general-purpose information processing machinery, or does it consist of a collection of specialized modules? If prior knowledge acquired from learning a set of objects is only transferable to new objects that share properties with the old, then the recognition system’s optimal organization must be one containing specialized modules for different object classes. Our analysis starts from a premise we call the invariance hypothesis: that the computational goal of the ventral stream is to compute an invariant-to-transformations and discriminative signature for recognition. The key condition enabling approximate transfer of invariance without sacrificing discriminability turns out to be that the learned and novel objects transform similarly. This implies that the optimal recognition system must contain subsystems trained only with data from similarly-transforming objects and suggests a novel interpretation of domain-specific regions like the fusiform face area (FFA). Furthermore, we can define an index of transformation-compatibility, computable from videos, that can be combined with information about the statistics of natural vision to yield predictions for which object categories ought to have domain-specific regions in agreement with the available data. The result is a unifying account linking the large literature on view-based recognition with the wealth of experimental evidence concerning domain-specific regions. PMID:26496457
Real and virtual explorations of the environment and interactive tracking of movable objects for the blind on the basis of tactile-acoustical maps and 3D environment models.

PubMed

Hub, Andreas; Hartter, Tim; Kombrink, Stefan; Ertl, Thomas

2008-01-01

PURPOSE: This study describes the development of a multi-functional assistant system for the blind which combines localisation, real and virtual navigation within modelled environments and the identification and tracking of fixed and movable objects. The approximate position of buildings is determined with a global positioning sensor (GPS), then the user establishes exact position at a specific landmark, like a door. This location initialises indoor navigation, based on an inertial sensor, a step recognition algorithm and map. Tracking of movable objects is provided by another inertial sensor and a head-mounted stereo camera, combined with 3D environmental models. This study developed an algorithm based on shape and colour to identify objects and used a common face detection algorithm to inform the user of the presence and position of others. The system allows blind people to determine their position with approximately 1 metre accuracy. Virtual exploration of the environment can be accomplished by moving one's finger on a touch screen of a small portable tablet PC. The name of rooms, building features and hazards, modelled objects and their positions are presented acoustically or in Braille. Given adequate environmental models, this system offers blind people the opportunity to navigate independently and safely, even within unknown environments. Additionally, the system facilitates education and rehabilitation by providing, in several languages, object names, features and relative positions.
BDNF Expression in Perirhinal Cortex is Associated with Exercise-Induced Improvement in Object Recognition Memory

PubMed Central

Hopkins, Michael E.; Bucci, David J.

2010-01-01

Physical exercise induces widespread neurobiological adaptations and improves learning and memory. Most research in this field has focused on hippocampus-based spatial tasks and changes in brain-derived neurotrophic factor (BDNF) as a putative substrate underlying exercise-induced cognitive improvements. Chronic exercise can also be anxiolytic and causes adaptive changes in stress reactivity. The present study employed a perirhinal cortex-dependent object recognition task as well as the elevated plus maze to directly test for interactions between the cognitive and anxiolytic effects of exercise in male Long Evans rats. Hippocampal and perirhinal cortex tissue was collected to determine whether the relationship between BDNF and cognitive performance extends to this non-spatial and non-hippocampal-dependent task. We also examined whether the cognitive improvements persisted once the exercise regimen was terminated. Our data indicate that 4 weeks of voluntary exercise every other day improved object recognition memory. Importantly, BDNF expression in the perirhinal cortex of exercising rats was strongly correlated with object recognition memory. Exercise also decreased anxiety-like behavior; however, there was no evidence to support a relationship between anxiety-like behavior and performance on the novel object recognition task. There was a trend for a negative relationship between anxiety-like behavior and hippocampal BDNF. Neither the cognitive improvements nor the relationship between cognitive function and perirhinal BDNF levels persisted after 2 weeks of inactivity. These are the first data demonstrating that region-specific changes in BDNF protein levels are correlated with exercise-induced improvements in non-spatial memory, mediated by structures outside the hippocampus, and are consistent with the theory that, with regard to object recognition, the anxiolytic and cognitive effects of exercise may be mediated through separable mechanisms. PMID:20601027
Muscarinic Receptor-Dependent Long Term Depression in the Perirhinal Cortex and Recognition Memory are Impaired in the rTg4510 Mouse Model of Tauopathy.

PubMed

Scullion, Sarah E; Barker, Gareth R I; Warburton, E Clea; Randall, Andrew D; Brown, Jonathan T

2018-02-26

Neurodegenerative diseases that cause cognitive dysfunction, such as Alzheimer's disease and fronto-temporal dementia, are often associated with impairments in the visual recognition memory system. Recent evidence suggests that synaptic plasticity, in particular long term depression (LTD), in the perirhinal cortex (PRh) is a critical cellular mechanism underlying recognition memory. In this study, we have examined novel object recognition and PRh LTD in rTg4510 mice, which transgenically overexpress tau P301L. We found that 8-9 month old rTg4510 mice had significant deficits in long- but not short-term novel object recognition memory. Furthermore, we also established that PRh slices prepared from rTg4510 mice, unlike those prepared from wildtype littermates, could not support a muscarinic acetylcholine receptor-dependent form of LTD, induced by a 5 Hz stimulation protocol. In contrast, bath application of the muscarinic agonist carbachol induced a form of chemical LTD in both WT and rTg4510 slices. Finally, when rTg4510 slices were preincubated with the acetylcholinesterase inhibitor donepezil, the 5 Hz stimulation protocol was capable of inducing significant levels of LTD. These data suggest that dysfunctional cholinergic innervation of the PRh of rTg4510 mice results in deficits in synaptic LTD which may contribute to aberrant recognition memory in this rodent model of tauopathy.
Estradiol and luteinizing hormone regulate recognition memory following subchronic phencyclidine: Evidence for hippocampal GABA action.

PubMed

Riordan, Alexander J; Schaler, Ari W; Fried, Jenny; Paine, Tracie A; Thornton, Janice E

2018-05-01

The cognitive symptoms of schizophrenia are poorly understood and difficult to treat. Estrogens may mitigate these symptoms via unknown mechanisms. To examine these mechanisms, we tested whether increasing estradiol (E) or decreasing luteinizing hormone (LH) could mitigate short-term episodic memory loss in a phencyclidine (PCP) model of schizophrenia. We then assessed whether changes in cortical or hippocampal GABA may underlie these effects. Female rats were ovariectomized and injected subchronically with PCP. To modulate E and LH, animals received estradiol capsules or Antide injections. Short-term episodic memory was assessed using the novel object recognition task (NORT). Brain expression of GAD67 was analyzed via western blot, and parvalbumin-containing cells were counted using immunohistochemistry. Some rats received hippocampal infusions of a GABA-A agonist, GABA-A antagonist, or GAD inhibitor before behavioral testing. We found that PCP reduced hippocampal GAD67 and abolished recognition memory. Antide restored hippocampal GAD67 and rescued recognition memory in PCP-treated animals. Estradiol prevented PCP's amnesic effect in NORT but failed to restore hippocampal GAD67. PCP did not cause significant differences in number of parvalbumin-expressing cells or cortical expression of GAD67. Hippocampal infusions of a GABA-A agonist restored recognition memory in PCP-treated rats. Blocking hippocampal GAD or GABA-A receptors in ovx animals reproduced recognition memory loss similar to PCP and inhibited estradiol's protection of recognition memory in PCP-treated animals. In summary, decreasing LH or increasing E can lessen short-term episodic memory loss, as measured by novel object recognition, in a PCP model of schizophrenia. Alterations in hippocampal GABA may contribute to both PCP's effects on recognition memory and the hormones' ability to prevent or reverse them. Copyright © 2018 Elsevier Ltd. All rights reserved.
Good initialization model with constrained body structure for scene text recognition

NASA Astrophysics Data System (ADS)

Zhu, Anna; Wang, Guoyou; Dong, Yangbo

2016-09-01

Scene text recognition has gained significant attention in the computer vision community. Character detection and recognition are the premise of text recognition and affect the overall performance to a large extent. We propose a good initialization model for scene character recognition from cropped text regions. We use constrained character body structures with deformable part-based models to detect and recognize characters in various backgrounds. The character body structures are obtained by an unsupervised discriminative clustering approach followed by a statistical model and a self-built minimum spanning tree model. Our method utilizes part appearance and location information, and combines character detection and recognition in cropped text regions. The evaluation results on the benchmark datasets demonstrate that our proposed scheme outperforms the state-of-the-art methods on both scene character recognition and word recognition.
Mining Patients' Narratives in Social Media for Pharmacovigilance: Adverse Effects and Misuse of Methylphenidate.

PubMed

Chen, Xiaoyi; Faviez, Carole; Schuck, Stéphane; Lillo-Le-Louët, Agnès; Texier, Nathalie; Dahamna, Badisse; Huot, Charles; Foulquié, Pierre; Pereira, Suzanne; Leroux, Vincent; Karapetiantz, Pierre; Guenegou-Arnoux, Armelle; Katsahian, Sandrine; Bousquet, Cédric; Burgun, Anita

2018-01-01

Background: The Food and Drug Administration (FDA) in the United States and the European Medicines Agency (EMA) have recognized social media as a new data source to strengthen their activities regarding drug safety. Objective: Our objective in the ADR-PRISM project was to provide text mining and visualization tools to explore a corpus of posts extracted from social media. We evaluated this approach on a corpus of 21 million posts from five patient forums, and conducted a qualitative analysis of the data available on methylphenidate in this corpus. Methods: We applied text mining methods based on named entity recognition and relation extraction in the corpus, followed by signal detection using proportional reporting ratio (PRR). We also used topic modeling based on the Correlated Topic Model to obtain the list of thematics in the corpus and classify the messages based on their topics. Results: We automatically identified 3443 posts about methylphenidate published between 2007 and 2016, among which 61 adverse drug reactions (ADR) were automatically detected. Two pharmacovigilance experts manually evaluated the quality of automatic identification, and an F-measure of 0.57 was reached. Patients' reports were mainly neuro-psychiatric effects. Applying PRR, 67% of the ADRs were signals, including most of the neuro-psychiatric symptoms but also palpitations. Topic modeling showed that the most represented topics were related to Childhood and Treatment initiation, but also Side effects. Cases of misuse were also identified in this corpus, including recreational use and abuse. Conclusion: Named entity recognition combined with signal detection and topic modeling have demonstrated their complementarity in mining social media data. An in-depth analysis focused on methylphenidate showed that this approach was able to detect potential signals and to provide better understanding of patients' behaviors regarding drugs, including misuse.
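The signal detection step above uses the proportional reporting ratio (PRR). As a rough illustration, the PRR can be computed from a 2x2 table of post counts as the share of the drug's posts that mention an event divided by the share of all other drugs' posts that mention it. The function name and the example counts below are hypothetical and are not taken from the ADR-PRISM corpus.

```python
def proportional_reporting_ratio(a, b, c, d):
    """PRR from a 2x2 table of counts.

    a: posts with the drug of interest and the event of interest
    b: posts with the drug of interest and any other event
    c: posts with any other drug and the event of interest
    d: posts with any other drug and any other event
    """
    if a + b == 0 or c + d == 0:
        raise ValueError("each row of the table needs at least one post")
    if c == 0:
        # The event is never reported for the comparison drugs.
        return float("inf")
    # (a / (a + b)) / (c / (c + d)), rearranged to limit rounding error.
    return (a * (c + d)) / (c * (a + b))


# Made-up counts: the event appears in 20 of 100 posts for the drug
# and in 100 of 10000 posts for all other drugs.
prr = proportional_reporting_ratio(a=20, b=80, c=100, d=9900)
```

A PRR well above 1 means the event is reported proportionally more often with the drug than with the comparison set; the threshold that turns a ratio into a "signal" is a separate choice that the abstract does not specify.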
Local structure preserving sparse coding for infrared target recognition

PubMed Central

Han, Jing; Yue, Jiang; Zhang, Yi; Bai, Lianfa

2017-01-01

Sparse coding performs well in image classification. However, robust target recognition requires a lot of comprehensive template images and the sparse learning process is complex. We incorporate sparsity into a template matching concept to construct a local sparse structure matching (LSSM) model for general infrared target recognition. A local structure preserving sparse coding (LSPSc) formulation is proposed to simultaneously preserve the local sparse and structural information of objects. By adding a spatial local structure constraint into the classical sparse coding algorithm, LSPSc can improve the stability of sparse representation for targets and inhibit background interference in infrared images. Furthermore, a kernel LSPSc (K-LSPSc) formulation is proposed, which extends LSPSc to the kernel space to weaken the influence of the linear structure constraint in nonlinear natural data. Because of the anti-interference and fault-tolerant capabilities, both LSPSc- and K-LSPSc-based LSSM can implement target identification based on a simple template set, which just needs several images containing enough local sparse structures to learn a sufficient sparse structure dictionary of a target class. Specifically, this LSSM approach has stable performance in the target detection with scene, shape and occlusions variations. High performance is demonstrated on several datasets, indicating robust infrared target recognition in diverse environments and imaging conditions. PMID:28323824

The development of newborn object recognition in fast and slow visual worlds

PubMed Central

Wood, Justin N.; Wood, Samantha M. W.

2016-01-01

Object recognition is central to perception and cognition. Yet relatively little is known about the environmental factors that cause invariant object recognition to emerge in the newborn brain. Is this ability a hardwired property of vision? Or does the development of invariant object recognition require experience with a particular kind of visual environment? Here, we used a high-throughput controlled-rearing method to examine whether newborn chicks (Gallus gallus) require visual experience with slowly changing objects to develop invariant object recognition abilities. When newborn chicks were raised with a slowly rotating virtual object, the chicks built invariant object representations that generalized across novel viewpoints and rotation speeds. In contrast, when newborn chicks were raised with a virtual object that rotated more quickly, the chicks built viewpoint-specific object representations that failed to generalize to novel viewpoints and rotation speeds. Moreover, there was a direct relationship between the speed of the object and the amount of invariance in the chick's object representation. Thus, visual experience with slowly changing objects plays a critical role in the development of invariant object recognition. These results indicate that invariant object recognition is not a hardwired property of vision, but is learned rapidly when newborns encounter a slowly changing visual world. PMID:27097925
Models of Recognition, Repetition Priming, and Fluency : Exploring a New Framework

ERIC Educational Resources Information Center

Berry, Christopher J.; Shanks, David R.; Speekenbrink, Maarten; Henson, Richard N. A.

2012-01-01

We present a new modeling framework for recognition memory and repetition priming based on signal detection theory. We use this framework to specify and test the predictions of 4 models: (a) a single-system (SS) model, in which one continuous memory signal drives recognition and priming; (b) a multiple-systems-1 (MS1) model, in which completely…
Viewpoint dependence in the recognition of non-elongated familiar objects: testing the effects of symmetry, front-back axis, and familiarity.

PubMed

Niimi, Ryosuke; Yokosawa, Kazuhiko

2009-01-01

Visual recognition of three-dimensional (3-D) objects is relatively impaired for some particular views, called accidental views. For most familiar objects, the front and top views are considered to be accidental views. Previous studies have shown that foreshortening of the axes of elongation of objects in these views impairs recognition, but the influence of other possible factors is largely unknown. Using familiar objects without a salient axis of elongation, we found that a foreshortened symmetry plane of the object and low familiarity of the viewpoint accounted for the relatively worse recognition for front views and top views, independently of the effect of a foreshortened axis of elongation. We found no evidence that foreshortened front-back axes impaired recognition in front views. These results suggest that the viewpoint dependence of familiar object recognition is not a unitary phenomenon. The possible role of symmetry (either 2-D or 3-D) in familiar object recognition is also discussed.
Intelligent fault recognition strategy based on adaptive optimized multiple centers

NASA Astrophysics Data System (ADS)

Zheng, Bo; Li, Yan-Feng; Huang, Hong-Zhong

2018-06-01

For recognition based on a single optimized center, one important issue is that data with a nonlinear separatrix cannot be recognized accurately. In order to solve this problem, a novel recognition strategy based on adaptive optimized multiple centers is proposed in this paper. This strategy recognizes data sets with a nonlinear separatrix by using multiple centers. Meanwhile, priority levels are introduced into the multi-objective optimization, which covers recognition accuracy, the quantity of optimized centers, and distance relationship. According to the characteristics of various data, the priority levels are adjusted to set the quantity of optimized centers adaptively and to keep the original accuracy. The proposed method is compared with other methods, including support vector machine (SVM), neural network, and Bayesian classifier. The results demonstrate that the proposed strategy has the same or even better recognition ability on data with different distribution characteristics.
An approach to computing direction relations between separated object groups

NASA Astrophysics Data System (ADS)

Yan, H.; Wang, Z.; Li, J.

2013-06-01

Direction relations between object groups play an important role in qualitative spatial reasoning, spatial computation and spatial recognition. However, none of the existing models can be used to compute direction relations between object groups. To fill this gap, an approach to computing direction relations between separated object groups is proposed in this paper, which is theoretically based on Gestalt principles and the idea of multi-directions. The approach first triangulates the two object groups; then it constructs the Voronoi diagram between the two groups using the triangular network; after this, the normal of each Voronoi edge is calculated, and the quantitative expression of the direction relations is constructed; finally, the quantitative direction relations are transformed into qualitative ones. The psychological experiments show that the proposed approach can obtain direction relations both between two single objects and between two object groups, and the results are correct from the point of view of spatial cognition.
An approach to computing direction relations between separated object groups

NASA Astrophysics Data System (ADS)

Yan, H.; Wang, Z.; Li, J.

2013-09-01

Direction relations between object groups play an important role in qualitative spatial reasoning, spatial computation and spatial recognition. However, none of the existing models can be used to compute direction relations between object groups. To fill this gap, an approach to computing direction relations between separated object groups is proposed in this paper, which is theoretically based on gestalt principles and the idea of multi-directions. The approach first triangulates the two object groups, and then it constructs the Voronoi diagram between the two groups using the triangular network. After this, the normal of each Voronoi edge is calculated, and the quantitative expression of the direction relations is constructed. Finally, the quantitative direction relations are transformed into qualitative ones. The psychological experiments show that the proposed approach can obtain direction relations both between two single objects and between two object groups, and the results are correct from the point of view of spatial cognition.
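The last step of the approach, turning quantitative directions into qualitative ones, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the paper's quantitative expression, so the eight 45-degree sectors, the weighting of each Voronoi edge normal by a length value, and the names `qualitative_direction` and `direction_shares` are all hypothetical.

```python
import math

# Eight compass sectors, counter-clockwise from east (assumed convention:
# x points east, y points north).
DIRECTIONS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]


def qualitative_direction(dx, dy):
    """Map a 2D direction vector to one of eight compass labels."""
    if dx == 0 and dy == 0:
        raise ValueError("zero vector has no direction")
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    # Shift by half a sector so that each label is centred on its axis.
    sector = int(((angle + 22.5) % 360.0) // 45.0)
    return DIRECTIONS[sector]


def direction_shares(weighted_normals):
    """Aggregate (dx, dy, weight) triples into per-direction fractions.

    Each triple stands for one Voronoi edge normal and a weight such as
    the edge length (an assumption made for this sketch).
    """
    totals = {}
    for dx, dy, weight in weighted_normals:
        label = qualitative_direction(dx, dy)
        totals[label] = totals.get(label, 0.0) + weight
    total_weight = sum(totals.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return {label: w / total_weight for label, w in totals.items()}


# Two hypothetical edge normals: a long edge facing east, a short one north.
shares = direction_shares([(1.0, 0.0, 3.0), (0.0, 1.0, 1.0)])
```

Reporting a fraction per direction, instead of a single label, is one way to reflect the multi-direction idea mentioned in the abstract, since a group can lie partly east and partly north of another.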
Towards discrete wavelet transform-based human activity recognition

NASA Astrophysics Data System (ADS)

Khare, Manish; Jeon, Moongu

2017-06-01

Providing accurate recognition of human activities is a challenging problem for visual surveillance applications. In this paper, we present a simple and efficient algorithm for human activity recognition based on a wavelet transform. We adopt discrete wavelet transform (DWT) coefficients as a feature of human objects to obtain advantages of its multiresolution approach. The proposed method is tested on multiple levels of DWT. Experiments are carried out on different standard action datasets including KTH and i3D Post. The proposed method is compared with other state-of-the-art methods in terms of different quantitative performance measures. The proposed method is found to have better recognition accuracy in comparison to the state-of-the-art methods.
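To make the idea of multi-level DWT coefficients as features concrete, here is a minimal NumPy sketch. It assumes the Haar wavelet and a flat concatenation of all subbands, neither of which is stated in the abstract; the function names are hypothetical and this is not the authors' feature pipeline.

```python
import numpy as np


def haar_dwt2(image):
    """Single-level 2D Haar DWT. Returns (ll, lh, hl, hh) subbands,
    each half the size of the input along both axes."""
    x = np.asarray(image, dtype=float)
    if x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("image height and width must be even")
    s = np.sqrt(2.0)
    # Filter along columns (pairs of horizontally adjacent pixels).
    lo = (x[:, 0::2] + x[:, 1::2]) / s
    hi = (x[:, 0::2] - x[:, 1::2]) / s
    # Filter along rows (pairs of vertically adjacent pixels).
    ll = (lo[0::2, :] + lo[1::2, :]) / s
    lh = (lo[0::2, :] - lo[1::2, :]) / s
    hl = (hi[0::2, :] + hi[1::2, :]) / s
    hh = (hi[0::2, :] - hi[1::2, :]) / s
    return ll, lh, hl, hh


def multilevel_dwt_features(image, levels):
    """Concatenate detail coefficients from each level plus the final
    approximation into one feature vector."""
    approx = np.asarray(image, dtype=float)
    parts = []
    for _ in range(levels):
        approx, lh, hl, hh = haar_dwt2(approx)
        parts.extend([lh.ravel(), hl.ravel(), hh.ravel()])
    parts.append(approx.ravel())
    return np.concatenate(parts)


# An 8x8 test patch decomposed over two levels.
patch = np.arange(64, dtype=float).reshape(8, 8)
features = multilevel_dwt_features(patch, levels=2)
```

Because the Haar filters used here are orthonormal, the feature vector has the same length and the same energy as the input patch; deeper levels describe coarser structure, which is the multiresolution property the abstract refers to.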
A Biologically Plausible Transform for Visual Recognition that is Invariant to Translation, Scale, and Rotation.

PubMed

Sountsov, Pavel; Santucci, David M; Lisman, John E

2011-01-01

Visual object recognition occurs easily despite differences in position, size, and rotation of the object, but the neural mechanisms responsible for this invariance are not known. We have found a set of transforms that achieve invariance in a neurally plausible way. We find that a transform based on local spatial frequency analysis of oriented segments and on logarithmic mapping, when applied twice in an iterative fashion, produces an output image that is unique to the object and that remains constant as the input image is shifted, scaled, or rotated.
A Biologically Plausible Transform for Visual Recognition that is Invariant to Translation, Scale, and Rotation

PubMed Central

Sountsov, Pavel; Santucci, David M.; Lisman, John E.

2011-01-01

Visual object recognition occurs easily despite differences in position, size, and rotation of the object, but the neural mechanisms responsible for this invariance are not known. We have found a set of transforms that achieve invariance in a neurally plausible way. We find that a transform based on local spatial frequency analysis of oriented segments and on logarithmic mapping, when applied twice in an iterative fashion, produces an output image that is unique to the object and that remains constant as the input image is shifted, scaled, or rotated. PMID:22125522
Young pigs exhibit differential exploratory behavior during novelty preference tasks in response to age, sex, and delay.

PubMed

Fleming, Stephen A; Dilger, Ryan N

2017-03-15

Novelty preference paradigms have been widely used to study recognition memory and its neural substrates. The piglet model continues to advance the study of neurodevelopment, and as such, tasks that use novelty preference will prove especially useful due to their translatable nature to humans. However, there has been little use of this behavioral paradigm in the pig, and previous studies using the novel object recognition paradigm in piglets have yielded inconsistent results. The current study was conducted to determine if piglets were capable of displaying a novelty preference. Herein a series of experiments was conducted using novel object recognition or location in 3- and 4-week-old piglets. In the novel object recognition task, piglets were able to discriminate between novel and sample objects after delays of 2 min, 1 h, 1 day, and 2 days (all P<0.039) at both ages. Performance was sex-dependent, as females could perform both 1- and 2-day delays (P<0.036) and males could perform the 2-day delay (P=0.008) but not the 1-day delay (P=0.347). Furthermore, 4-week-old piglets and females tended to exhibit greater exploratory behavior compared with males. Such performance did not extend to novel location recognition tasks, as piglets were only able to discriminate between novel and sample locations after a short delay (P>0.046). In conclusion, this study determined that piglets are able to perform the novel object and location recognition tasks at 3-to-4 weeks of age; however, performance was dependent on sex, age, and delay. Copyright © 2016 Elsevier B.V. All rights reserved.
Recognition of 3-D symmetric objects from range images in automated assembly tasks

NASA Technical Reports Server (NTRS)

Alvertos, Nicolas; Dcunha, Ivan

1990-01-01

A new technique is presented for the three dimensional recognition of symmetric objects from range images. Beginning from the implicit representation of quadrics, a set of ten coefficients is determined for symmetric objects like spheres, cones, cylinders, ellipsoids, and parallelepipeds. Instead of using these ten coefficients trying to fit them to smooth surfaces (patches) based on the traditional way of determining curvatures, a new approach based on two dimensional geometry is used. For each symmetric object, a unique set of two dimensional curves is obtained from the various angles at which the object is intersected with a plane. Using the same ten coefficients obtained earlier and based on the discriminant method, each of these curves is classified as a parabola, circle, ellipse, or hyperbola. Each symmetric object is found to possess a unique set of these two dimensional curves whereby it can be differentiated from the others. It is shown that instead of using the three dimensional discriminant which involves evaluation of the rank of its matrix, it is sufficient to use the two dimensional discriminant which only requires three arithmetic operations.
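The two-dimensional discriminant test mentioned above can be illustrated with the standard classification of a conic A x^2 + B xy + C y^2 + D x + E y + F = 0 by the sign of B^2 - 4AC. This is a textbook sketch, not the report's code: the function name and tolerance are assumptions, and degenerate conics (points, line pairs) are not handled.

```python
def classify_conic(A, B, C, tol=1e-9):
    """Classify A x^2 + B xy + C y^2 + D x + E y + F = 0 by the sign of
    its discriminant B^2 - 4AC (non-degenerate conics assumed)."""
    if abs(A) <= tol and abs(B) <= tol and abs(C) <= tol:
        raise ValueError("no quadratic terms: not a conic")
    discriminant = B * B - 4.0 * A * C
    if abs(discriminant) <= tol:
        return "parabola"
    if discriminant > 0:
        return "hyperbola"
    # Negative discriminant: ellipse, with the circle as a special case
    # (equal x^2 and y^2 coefficients and no xy term).
    if abs(B) <= tol and abs(A - C) <= tol:
        return "circle"
    return "ellipse"


# Cross-section curves of the kind obtained by cutting a quadric with a plane.
labels = [
    classify_conic(1, 0, 1),   # x^2 + y^2 = r^2
    classify_conic(1, 0, 4),   # x^2 + 4 y^2 = 1
    classify_conic(1, 0, 0),   # y = x^2
    classify_conic(1, 0, -1),  # x^2 - y^2 = 1
]
```

Only the three second-order coefficients enter the test, which is why it is much cheaper than evaluating the rank of the full three-dimensional discriminant matrix.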
Integrating visual learning within a model-based ATR system

NASA Astrophysics Data System (ADS)

Carlotto, Mark; Nebrich, Mark

2017-05-01

Automatic target recognition (ATR) systems, like human photo-interpreters, rely on a variety of visual information for detecting, classifying, and identifying manmade objects in aerial imagery. We describe the integration of a visual learning component into the Image Data Conditioner (IDC) for target/clutter and other visual classification tasks. The component is based on an implementation of a model of the visual cortex developed by Serre, Wolf, and Poggio. Visual learning in an ATR context requires the ability to recognize objects independent of location, scale, and rotation. Our method uses IDC to extract, rotate, and scale image chips at candidate target locations. A bootstrap learning method effectively extends the operation of the classifier beyond the training set and provides a measure of confidence. We show how the classifier can be used to learn other features that are difficult to compute from imagery such as target direction, and to assess the performance of the visual learning process itself.
Humans and Deep Networks Largely Agree on Which Kinds of Variation Make Object Recognition Harder.

PubMed

Kheradpisheh, Saeed R; Ghodrati, Masoud; Ganjtabesh, Mohammad; Masquelier, Timothée

2016-01-01

View-invariant object recognition is a challenging problem that has attracted much attention among the psychology, neuroscience, and computer vision communities. Humans are notoriously good at it, even if some variations are presumably more difficult to handle than others (e.g., 3D rotations). Humans are thought to solve the problem through hierarchical processing along the ventral stream, which progressively extracts more and more invariant visual features. This feed-forward architecture has inspired a new generation of bio-inspired computer vision systems called deep convolutional neural networks (DCNN), which are currently the best models for object recognition in natural images. Here, for the first time, we systematically compared human feed-forward vision and DCNNs on a view-invariant object recognition task using the same set of images and controlling the kinds of transformation (position, scale, rotation in plane, and rotation in depth) as well as their magnitude, which we call "variation level." We used four object categories: car, ship, motorcycle, and animal. In total, 89 human subjects participated in 10 experiments in which they had to discriminate between two or four categories after rapid presentation with backward masking. We also tested two recent DCNNs (proposed respectively by Hinton's group and Zisserman's group) on the same tasks. We found that humans and DCNNs largely agreed on the relative difficulties of each kind of variation: rotation in depth is by far the hardest transformation to handle, followed by scale, then rotation in plane, and finally position (much easier). This suggests that DCNNs would be reasonable models of human feed-forward vision. In addition, our results show that the variation levels in rotation in depth and scale strongly modulate both humans' and DCNNs' recognition performances. We thus argue that these variations should be controlled in the image datasets used in vision research.
Semantic Image Segmentation with Contextual Hierarchical Models.

PubMed

Seyedhosseini, Mojtaba; Tasdizen, Tolga

2016-05-01

Semantic segmentation is the problem of assigning an object label to each pixel. It unifies the image segmentation and object recognition problems. The importance of using contextual information in semantic segmentation frameworks has been widely realized in the field. We propose a contextual framework, called contextual hierarchical model (CHM), which learns contextual information in a hierarchical framework for semantic segmentation. At each level of the hierarchy, a classifier is trained based on downsampled input images and outputs of previous levels. Our model then incorporates the resulting multi-resolution contextual information into a classifier to segment the input image at original resolution. This training strategy allows for optimization of a joint posterior probability at multiple resolutions through the hierarchy. The contextual hierarchical model is based purely on the input image patches and does not make use of any fragments or shape examples. Hence, it is applicable to a variety of problems such as object segmentation and edge detection. We demonstrate that CHM performs on par with the state of the art on the Stanford Background and Weizmann Horse datasets. It also outperforms state-of-the-art edge detection methods on the NYU Depth dataset and achieves state-of-the-art results on the Berkeley Segmentation Dataset (BSDS 500).
Man-Made Object Extraction from Remote Sensing Imagery by Graph-Based Manifold Ranking

NASA Astrophysics Data System (ADS)

He, Y.; Wang, X.; Hu, X. Y.; Liu, S. H.

2018-04-01

The automatic extraction of man-made objects from remote sensing imagery is useful in many applications. This paper proposes an algorithm for extracting man-made objects automatically by integrating a graph model with the manifold ranking algorithm. Initially, we estimate an a priori value for the man-made objects using symmetry and contrast features. The graph model is established to represent the spatial relationships among pre-segmented superpixels, which are used as the graph nodes. Multiple characteristics, namely colour, texture and main direction, are used to compute the weights of the edges between adjacent nodes. Manifold ranking effectively explores the relationships among all the nodes in the feature space as well as the initial query assignment; thus, it is applied to generate a ranking map, which indicates the scores of the man-made objects. The man-made objects are then segmented on the basis of the ranking map. Two typical segmentation algorithms are compared with the proposed algorithm. Experimental results show that the proposed algorithm can extract man-made objects with a high recognition rate and a low omission rate.
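The ranking step described in this abstract can be sketched with the standard manifold-ranking iteration from the literature, f <- alpha * S * f + (1 - alpha) * y, where S is the symmetrically normalised affinity matrix and y is the initial query assignment. Everything in the sketch is an assumption made for illustration: the function name `manifold_rank`, the parameter values, and the toy four-node graph stand in for the superpixel graph and feature-based weights, which the abstract does not specify in detail.

```python
import math


def manifold_rank(weights, query, alpha=0.9, iterations=200):
    """Rank graph nodes by iterating f <- alpha * S * f + (1 - alpha) * y.

    weights: symmetric, non-negative affinity matrix (list of lists).
    query:   initial scores y, e.g. prior values for the query nodes.
    """
    n = len(weights)
    degree = [sum(row) for row in weights]
    # Symmetric normalisation S = D^(-1/2) W D^(-1/2)
    s = [[weights[i][j] / math.sqrt(degree[i] * degree[j]) if weights[i][j] else 0.0
          for j in range(n)] for i in range(n)]
    f = list(query)
    for _ in range(iterations):
        f = [alpha * sum(s[i][j] * f[j] for j in range(n)) + (1 - alpha) * query[i]
             for i in range(n)]
    return f


if __name__ == "__main__":
    # Hypothetical chain graph 0 - 1 - 2 - 3 with node 0 as the only query
    chain = [[0, 1, 0, 0],
             [1, 0, 1, 0],
             [0, 1, 0, 1],
             [0, 0, 1, 0]]
    print(manifold_rank(chain, [1.0, 0.0, 0.0, 0.0], alpha=0.5))
```

In this toy graph the scores fall off with graph distance from the query node, which is the behaviour a ranking map relies on when it is thresholded into a segmentation.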
Effects of chronic prenatal MK-801 treatment on object recognition, cognitive flexibility, and drug-induced locomotor activity in juvenile and adult rat offspring.

PubMed

Gallant, S; Welch, L; Martone, P; Shalev, U

2017-06-15

Patients with schizophrenia display impaired cognitive functioning and increased sensitivity to psychomimetic drugs. The neurodevelopmental hypothesis of schizophrenia posits that disruption of the developing brain predisposes neural networks to lasting structural and functional abnormalities resulting in the emergence of such symptoms in adulthood. Given the critical role of the glutamatergic system in early brain development, we investigated whether chronic prenatal exposure to the glutamate NMDA receptor antagonist, MK-801, induces schizophrenia-like behavioural and neurochemical changes in juvenile and adult rats. Pregnant Long-Evans rats were administered saline or MK-801 (0.1 mg/kg; s.c.) at gestation day 7-19. Object recognition memory and cognitive flexibility were assessed in the male offspring using a novel object preference task and a maze-based set-shifting procedure, respectively. Locomotor-activating effects of acute amphetamine and MK-801 were also assessed. Adult, but not juvenile, prenatally MK-801-treated rats failed to show novel object preference after a 90 min delay, suggesting that object recognition memory may have been impaired. In addition, the set-shifting task revealed impaired acquisition of a new rule in adult prenatally MK-801-treated rats compared to controls. This deficit appeared to be driven by regression to the previously learned behaviour. There were no significant differences in drug-induced locomotor activity in juvenile offspring or in adult offspring following acute amphetamine challenges. Unexpectedly, MK-801-induced locomotor activity in adult prenatally MK-801-treated rats was lower compared to controls. Glutamate transmission dysfunction during early development may modify behavioural parameters in adulthood, though these parameters do not appear to model deficits observed in schizophrenia.
Finding and recognizing objects in natural scenes: complementary computations in the dorsal and ventral visual systems

PubMed Central

Rolls, Edmund T.; Webb, Tristan J.

2014-01-01

Searching for and recognizing objects in complex natural scenes is implemented by multiple saccades until the eyes reach within the reduced receptive field sizes of inferior temporal cortex (IT) neurons. We analyze and model how the dorsal and ventral visual streams both contribute to this. Saliency detection in the dorsal visual system including area LIP is modeled by graph-based visual saliency, and allows the eyes to fixate potential objects within several degrees. Visual information at the fixated location subtending approximately 9° corresponding to the receptive fields of IT neurons is then passed through a four layer hierarchical model of the ventral cortical visual system, VisNet. We show that VisNet can be trained using a synaptic modification rule with a short-term memory trace of recent neuronal activity to capture both the required view and translation invariances to allow in the model approximately 90% correct object recognition for 4 objects shown in any view across a range of 135° anywhere in a scene. The model was able to generalize correctly within the four trained views and the 25 trained translations. This approach analyses the principles by which complementary computations in the dorsal and ventral visual cortical streams enable objects to be located and recognized in complex natural scenes. PMID:25161619
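The "synaptic modification rule with a short-term memory trace" mentioned in this abstract can be illustrated with a generic trace-Hebbian update, in which the weight change is driven by an exponentially decaying average of recent postsynaptic activity rather than by the current activity alone. The sketch assumes a single linear neuron; the function name `trace_rule_update`, the parameters `eta` and `lr`, and the absence of weight normalisation and competition are simplifications, so this is not the VisNet implementation itself.

```python
def trace_rule_update(weights, inputs_over_time, eta=0.8, lr=0.1):
    """Apply a trace-Hebbian rule to one linear neuron over an input sequence.

    trace_t = (1 - eta) * y_t + eta * trace_(t-1)
    w      += lr * trace_t * x_t
    """
    w = list(weights)
    trace = 0.0
    for x in inputs_over_time:
        y = sum(wi * xi for wi, xi in zip(w, x))  # current firing rate
        trace = (1 - eta) * y + eta * trace       # short-term memory trace
        w = [wi + lr * trace * xi for wi, xi in zip(w, x)]
    return w


if __name__ == "__main__":
    # Two successive "views" of one object that activate different inputs
    views = [[1.0, 0.0], [0.0, 1.0]]
    print(trace_rule_update([1.0, 0.0], views))           # with trace
    print(trace_rule_update([1.0, 0.0], views, eta=0.0))  # plain Hebbian
```

With the trace, the second view gains a positive weight even though it does not yet drive the neuron on its own, which is how temporally adjacent views of one object can come to activate the same unit. With `eta=0.0` the rule reduces to plain Hebbian learning and that weight stays at zero.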
Neural Dynamics of Object-Based Multifocal Visual Spatial Attention and Priming: Object Cueing, Useful-Field-of-View, and Crowding

ERIC Educational Resources Information Center

Foley, Nicholas C.; Grossberg, Stephen; Mingolla, Ennio

2012-01-01

How are spatial and object attention coordinated to achieve rapid object learning and recognition during eye movement search? How do prefrontal priming and parietal spatial mechanisms interact to determine the reaction time costs of intra-object attention shifts, inter-object attention shifts, and shifts between visible objects and covertly cued…
DORSAL HIPPOCAMPAL PROGESTERONE INFUSIONS ENHANCE OBJECT RECOGNITION IN YOUNG FEMALE MICE

PubMed Central

Orr, Patrick T.; Lewis, Michael C.; Frick, Karyn M.

2009-01-01

The effects of progesterone on memory are not nearly as well studied as the effects of estrogens. Although progesterone can reportedly enhance spatial and/or object recognition in female rodents when given immediately after training, previous studies have injected progesterone systemically, and therefore, the brain regions mediating this enhancement are not clear. As such, this study was designed to determine the role of the dorsal hippocampus in mediating the beneficial effect of progesterone on object recognition. Young ovariectomized C57BL/6 mice were trained in a hippocampal-dependent object recognition task utilizing two identical objects, and then immediately or 2 hrs afterwards, received bilateral dorsal hippocampal infusions of vehicle or 0.01, 0.1, or 1.0 μg/μl water-soluble progesterone. Forty-eight hours later, object recognition memory was tested using a previously explored object and a novel object. Relative to the vehicle group, memory for the familiar object was enhanced in all groups receiving immediate infusions of progesterone. Progesterone infusion delayed 2 hrs after training did not affect object recognition. These data suggest that the dorsal hippocampus may play a critical role in progesterone-induced enhancement of object recognition. PMID:19477194
Beyond sensory images: Object-based representation in the human ventral pathway

PubMed Central

Pietrini, Pietro; Furey, Maura L.; Ricciardi, Emiliano; Gobbini, M. Ida; Wu, W.-H. Carolyn; Cohen, Leonardo; Guazzelli, Mario; Haxby, James V.

2004-01-01

We investigated whether the topographically organized, category-related patterns of neural response in the ventral visual pathway are a representation of sensory images or a more abstract representation of object form that is not dependent on sensory modality. We used functional MRI to measure patterns of response evoked during visual and tactile recognition of faces and manmade objects in sighted subjects and during tactile recognition in blind subjects. Results showed that visual and tactile recognition evoked category-related patterns of response in a ventral extrastriate visual area in the inferior temporal gyrus that were correlated across modality for manmade objects. Blind subjects also demonstrated category-related patterns of response in this “visual” area, and in more ventral cortical regions in the fusiform gyrus, indicating that these patterns are not due to visual imagery and, furthermore, that visual experience is not necessary for category-related representations to develop in these cortices. These results demonstrate that the representation of objects in the ventral visual pathway is not simply a representation of visual images but, rather, is a representation of more abstract features of object form. PMID:15064396

Segmentation, classification, and pose estimation of military vehicles in low resolution laser radar images

NASA Astrophysics Data System (ADS)

Neulist, Joerg; Armbruster, Walter

2005-05-01

Model-based object recognition in range imagery typically involves matching the image data to the expected model data for each feasible model and pose hypothesis. Since the matching procedure is computationally expensive, the key to efficient object recognition is the reduction of the set of feasible hypotheses. This is particularly important for military vehicles, which may consist of several large moving parts such as the hull, turret, and gun of a tank, and hence require an eight- or higher-dimensional pose space to be searched. The presented paper outlines techniques for reducing the set of feasible hypotheses based on an estimation of target dimensions and orientation. Furthermore, the presence of a turret and a main gun and their orientations are determined. The dimensions of the vehicle parts and their error estimates restrict the number of model hypotheses, whereas the position and orientation estimates and their error bounds reduce the number of pose hypotheses needing to be verified. The techniques are applied to several hundred laser radar images of eight different military vehicles with various part classifications and orientations. On-target resolution in azimuth, elevation and range is about 30 cm. The range images contain up to 20% dropouts due to atmospheric absorption. Additionally, some target retro-reflectors produce outliers due to signal crosstalk. The presented algorithms are extremely robust with respect to these and other error sources. The hypothesis space for hull orientation is reduced to about 5 degrees, as is the error for turret rotation and gun elevation, provided the main gun is visible.
Graph-Based Object Class Discovery

NASA Astrophysics Data System (ADS)

Xia, Shengping; Hancock, Edwin R.

We are interested in the problem of discovering the set of object classes present in a database of images using a weakly supervised graph-based framework. Rather than making use of the "Bag-of-Features (BoF)" approach widely used in current work on object recognition, we represent each image by a graph using a group of selected local invariant features. Using local feature matching and iterative Procrustes alignment, we perform graph matching and compute a similarity measure. Borrowing the idea of query expansion, we develop a similarity propagation based graph clustering (SPGC) method. Using this method, class-specific clusters of the graphs can be obtained. Such a cluster can generally be represented by a higher-level graph model whose vertices are the clustered graphs and whose edge weights are determined by the pairwise similarity measure. Experiments are performed on a dataset in which the number of images increases from 1 to 50K and the number of objects increases from 1 to over 500. Some objects have been discovered with total recall and a precision of 1 in a single cluster.
A rat in the sewer: How mental imagery interacts with object recognition

PubMed Central

Hamburger, Kai

2018-01-01

The role of mental imagery has been puzzling researchers for more than two millennia. Both positive and negative effects of mental imagery on information processing have been discussed. The aim of this work was to examine how mental imagery affects object recognition and associative learning. Based on different perceptual and cognitive accounts we tested our imagery-induced interaction hypothesis in a series of two experiments. According to that, mental imagery could lead to (1) a superior performance in object recognition and associative learning if these objects are imagery-congruent (semantically) and to (2) an inferior performance if these objects are imagery-incongruent. In the first experiment, we used a static environment and tested associative learning. In the second experiment, subjects encoded object information in a dynamic environment by means of a virtual sewer system. Our results demonstrate that subjects who received a role adoption task (by means of guided mental imagery) performed better when imagery-congruent objects were used and worse when imagery-incongruent objects were used. We finally discuss our findings also with respect to alternative accounts and plead for a multi-methodological approach for future research in order to solve this issue. PMID:29590161
A rat in the sewer: How mental imagery interacts with object recognition.

PubMed

Karimpur, Harun; Hamburger, Kai

2018-01-01

The role of mental imagery has been puzzling researchers for more than two millennia. Both positive and negative effects of mental imagery on information processing have been discussed. The aim of this work was to examine how mental imagery affects object recognition and associative learning. Based on different perceptual and cognitive accounts we tested our imagery-induced interaction hypothesis in a series of two experiments. According to that, mental imagery could lead to (1) a superior performance in object recognition and associative learning if these objects are imagery-congruent (semantically) and to (2) an inferior performance if these objects are imagery-incongruent. In the first experiment, we used a static environment and tested associative learning. In the second experiment, subjects encoded object information in a dynamic environment by means of a virtual sewer system. Our results demonstrate that subjects who received a role adoption task (by means of guided mental imagery) performed better when imagery-congruent objects were used and worse when imagery-incongruent objects were used. We finally discuss our findings also with respect to alternative accounts and plead for a multi-methodological approach for future research in order to solve this issue.
Learning to distinguish similar objects

NASA Astrophysics Data System (ADS)

Seibert, Michael; Waxman, Allen M.; Gove, Alan N.

1995-04-01

This paper describes how the similarities and differences among similar objects can be discovered during learning to facilitate recognition. The application domain is single views of flying model aircraft captured in silhouette by a CCD camera. The approach was motivated by human psychovisual and monkey neurophysiological data. The implementation uses neural net processing mechanisms to build a hierarchy that relates similar objects to superordinate classes, while simultaneously discovering the salient differences between objects within a class. Learning and recognition experiments both with and without the class similarity and difference learning show the effectiveness of the approach on this visual data. To test the approach, the hierarchical approach was compared to a non-hierarchical approach, and was found to improve the average percentage of correctly classified views from 77% to 84%.
Development of visuo-haptic transfer for object recognition in typical preschool and school-aged children.

PubMed

Purpura, Giulia; Cioni, Giovanni; Tinelli, Francesca

2018-07-01

Object recognition is a long and complex adaptive process and its full maturation requires combination of many different sensory experiences as well as cognitive abilities to manipulate previous experiences in order to develop new percepts and subsequently to learn from the environment. It is well recognized that the transfer of visual and haptic information facilitates object recognition in adults, but less is known about development of this ability. In this study, we explored the developmental course of object recognition capacity using unimodal visual information, unimodal haptic information, and visuo-haptic information transfer in children from 4 years to 10 years and 11 months of age. Participants were tested through a clinical protocol, involving visual exploration of black-and-white photographs of common objects, haptic exploration of real objects, and visuo-haptic transfer of these two types of information. Results show an age-dependent development of object recognition abilities for visual, haptic, and visuo-haptic modalities. A significant effect of time on development of unimodal and crossmodal recognition skills was found. Moreover, our data suggest that multisensory processes for common object recognition are active at 4 years of age. They facilitate recognition of common objects, and, although not fully mature, are significant in adaptive behavior from the first years of age. The study of typical development of visuo-haptic processes in childhood is a starting point for future studies regarding object recognition in impaired populations.
Chronic cannabidiol treatment improves social and object recognition in double transgenic APPswe/PS1∆E9 mice.

PubMed

Cheng, David; Low, Jac Kee; Logge, Warren; Garner, Brett; Karl, Tim

2014-08-01

Patients suffering from Alzheimer's disease (AD) exhibit a decline in cognitive abilities including an inability to recognise familiar faces. Hallmark pathological changes in AD include the aggregation of amyloid-β (Aβ), tau protein hyperphosphorylation as well as pronounced neurodegeneration, neuroinflammation, neurotoxicity and oxidative damage. The non-psychoactive phytocannabinoid cannabidiol (CBD) exerts neuroprotective, anti-oxidant and anti-inflammatory effects and promotes neurogenesis. CBD also reverses Aβ-induced spatial memory deficits in rodents. Thus we determined the therapeutic-like effects of chronic CBD treatment (20 mg/kg, daily intraperitoneal injections for 3 weeks) on the APPswe/PS1∆E9 (APPxPS1) transgenic mouse model for AD in a number of cognitive tests, including the social preference test, the novel object recognition task and the fear conditioning paradigm. We also analysed the impact of CBD on anxiety behaviours in the elevated plus maze. Vehicle-treated APPxPS1 mice demonstrated impairments in social recognition and novel object recognition compared to wild type-like mice. Chronic CBD treatment reversed these cognitive deficits in APPxPS1 mice without affecting anxiety-related behaviours. This is the first study to investigate the effect of chronic CBD treatment on cognition in an AD transgenic mouse model. Our findings suggest that CBD may have therapeutic potential for specific cognitive impairments associated with AD.
Mapping parahippocampal systems for recognition and recency memory in the absence of the rat hippocampus

PubMed Central

Kinnavane, L; Amin, E; Horne, M; Aggleton, J P

2014-01-01

The present study examined immediate-early gene expression in the perirhinal cortex of rats with hippocampal lesions. The goal was to test those models of recognition memory which assume that the perirhinal cortex can function independently of the hippocampus. The c-fos gene was targeted, as its expression in the perirhinal cortex is strongly associated with recognition memory. Four groups of rats were examined. Rats with hippocampal lesions and their surgical controls were given either a recognition memory task (novel vs. familiar objects) or a relative recency task (objects with differing degrees of familiarity). Perirhinal Fos expression in the hippocampal-lesioned groups correlated with both recognition and recency performance. The hippocampal lesions, however, had no apparent effect on overall levels of perirhinal or entorhinal cortex c-fos expression in response to novel objects, with only restricted effects being seen in the recency condition. Network analyses showed that whereas the patterns of parahippocampal interactions were differentially affected by novel or familiar objects, these correlated networks were not altered by hippocampal lesions. Additional analyses in control rats revealed two modes of correlated medial temporal activation. Novel stimuli recruited the pathway from the lateral entorhinal cortex (cortical layer II or III) to hippocampal field CA3, and thence to CA1. Familiar stimuli recruited the direct pathway from the lateral entorhinal cortex (principally layer III) to CA1. The present findings not only reveal the independence from the hippocampus of some perirhinal systems associated with recognition memory, but also show how novel stimuli engage hippocampal subfields in qualitatively different ways from familiar stimuli. PMID:25264133
Multivariate fMRI and Eye Tracking Reveal Differential Effects of Visual Interference on Recognition Memory Judgments for Objects and Scenes.

PubMed

O'Neil, Edward B; Watson, Hilary C; Dhillon, Sonya; Lobaugh, Nancy J; Lee, Andy C H

2015-09-01

Recent work has demonstrated that the perirhinal cortex (PRC) supports conjunctive object representations that aid object recognition memory following visual object interference. It is unclear, however, how these representations interact with other brain regions implicated in mnemonic retrieval and how congruent and incongruent interference influences the processing of targets and foils during object recognition. To address this, multivariate partial least squares was applied to fMRI data acquired during an interference match-to-sample task, in which participants made object or scene recognition judgments after object or scene interference. This revealed a pattern of activity sensitive to object recognition following congruent (i.e., object) interference that included PRC, prefrontal, and parietal regions. Moreover, functional connectivity analysis revealed a common pattern of PRC connectivity across interference and recognition conditions. Examination of eye movements during the same task in a separate study revealed that participants gazed more at targets than foils during correct object recognition decisions, regardless of interference congruency. By contrast, participants viewed foils more than targets for incorrect object memory judgments, but only after congruent interference. Our findings suggest that congruent interference makes object foils appear familiar and that a network of regions, including PRC, is recruited to overcome the effects of interference.
Metric invariance in object recognition: a review and further evidence.

PubMed

Cooper, E E; Biederman, I; Hummel, J E

1992-06-01

Phenomenologically, human shape recognition appears to be invariant with changes of orientation in depth (up to parts occlusion), position in the visual field, and size. Recent versions of template theories (e.g., Ullman, 1989; Lowe, 1987) assume that these invariances are achieved through the application of transformations such as rotation, translation, and scaling of the image so that it can be matched metrically to a stored template. Presumably, such transformations would require time for their execution. We describe recent priming experiments in which the effects of a prior brief presentation of an image on its subsequent recognition are assessed. The results of these experiments indicate that the invariance is complete: The magnitude of visual priming (as distinct from name or basic level concept priming) is not affected by a change in position, size, orientation in depth, or the particular lines and vertices present in the image, as long as representations of the same components can be activated. An implemented seven layer neural network model (Hummel & Biederman, 1992) that captures these fundamental properties of human object recognition is described. Given a line drawing of an object, the model activates a viewpoint-invariant structural description of the object, specifying its parts and their interrelations. Visual priming is interpreted as a change in the connection weights for the activation of: a) cells, termed geon feature assemblies (GFAs), that conjoin the output of units that represent invariant, independent properties of a single geon and its relations (such as its type, aspect ratio, relations to other geons), or b) a change in the connection weights by which several GFAs activate a cell representing an object.
Object recognition and localization from 3D point clouds by maximum-likelihood estimation

NASA Astrophysics Data System (ADS)

Dantanarayana, Harshana G.; Huntley, Jonathan M.

2017-08-01

We present an algorithm based on maximum-likelihood analysis for the automated recognition of objects, and estimation of their pose, from 3D point clouds. Surfaces segmented from depth images are used as the features, unlike 'interest point'-based algorithms which normally discard such data. Compared to the 6D Hough transform, it has negligible memory requirements, and is computationally efficient compared to iterative closest point algorithms. The same method is applicable to both the initial recognition/pose estimation problem as well as subsequent pose refinement through appropriate choice of the dispersion of the probability density functions. This single unified approach therefore avoids the usual requirement for different algorithms for these two tasks. In addition to the theoretical description, a simple 2 degrees of freedom (d.f.) example is given, followed by a full 6 d.f. analysis of 3D point cloud data from a cluttered scene acquired by a projected fringe-based scanner, which demonstrated an RMS alignment error as low as 0.3 mm.
Study of the Gray Scale, Polychromatic, Distortion Invariant Neural Networks Using the IPA Model.

NASA Astrophysics Data System (ADS)

Uang, Chii-Maw

Research in the optical neural network field is primarily motivated by the fact that humans recognize objects better than conventional digital computers and by the inherently massively parallel nature of optics. This research represents a continuous effort during the past several years to exploit neurocomputing for pattern recognition. Based on the interpattern association (IPA) model and the Hamming net model, many new systems and applications are introduced. A gray-level discrete associative memory that is based on object decomposition/composition is proposed for recognizing gray-level patterns. This technique extends the processing ability from the binary mode to the gray-level mode, and thus the information capacity is increased. Two polychromatic optical neural networks using color liquid crystal television (LCTV) panels for color pattern recognition are introduced. By introducing a color encoding technique in conjunction with the interpattern associative algorithm, a color associative memory was realized. Based on the color decomposition and composition technique, a color exemplar-based Hamming net was built for color image classification. A shift-invariant neural network is presented through use of the translation-invariant property of the modulus of the Fourier transformation and the hetero-associative interpattern association (IPA) memory. To extract the main features, a quadrantal sampling method is used to sample the data, which then replace the training patterns. The concept of hetero-associative memory is used to recall the distorted object. A shift and rotation invariant neural network using an interpattern hetero-association (IHA) model is presented. To preserve the shift and rotation invariant properties, a set of binarized-encoded circular harmonic expansion (CHE) functions in the Fourier domain is used as the training set. We use the shift and symmetric properties of the modulus of the Fourier spectrum to avoid the problem of centering the CHE functions. Almost all neural networks have both positive and negative weights, which increases the difficulty of optical implementation. A method to construct a unipolar IPA IWM is discussed. By searching the redundant interconnection links, an effective way of removing all negative links is discussed.
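The translation invariance of the Fourier modulus that this abstract relies on can be checked numerically. The sketch below is a one-dimensional illustration with a naive discrete Fourier transform and a circular shift; the function names `dft_modulus` and `circular_shift` and the sample pattern are hypothetical, and the optical systems described above are of course not implemented this way.

```python
import cmath


def dft_modulus(signal):
    """Return |X[k]| for the discrete Fourier transform of a real sequence."""
    n = len(signal)
    return [abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]


def circular_shift(signal, shift):
    """Shift a list to the right by `shift` positions with wrap-around."""
    shift %= len(signal)
    return signal[-shift:] + signal[:-shift]


if __name__ == "__main__":
    pattern = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0]
    moved = circular_shift(pattern, 3)
    # A shift only changes the phase of each coefficient, so the moduli agree
    for a, b in zip(dft_modulus(pattern), dft_modulus(moved)):
        print(round(a, 6), round(b, 6))
```

Because the shift affects only the phase of each Fourier coefficient, a classifier fed with the modulus sees the same input wherever the pattern sits in the field.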
The Role of Perceptual Load in Object Recognition

ERIC Educational Resources Information Center

Lavie, Nilli; Lin, Zhicheng; Zokaei, Nahid; Thoma, Volker

2009-01-01

Predictions from perceptual load theory (Lavie, 1995, 2005) regarding object recognition across the same or different viewpoints were tested. Results showed that high perceptual load reduces distracter recognition levels despite always presenting distracter objects from the same view. They also showed that the levels of distracter recognition were…
Connectionist model-based stereo vision for telerobotics

NASA Technical Reports Server (NTRS)

Hoff, William; Mathis, Donald

1989-01-01

Autonomous stereo vision for range measurement could greatly enhance the performance of telerobotic systems. Stereo vision could be a key component for autonomous object recognition and localization, thus enabling the system to perform low-level tasks, and allowing a human operator to perform a supervisory role. The central difficulty in stereo vision is the ambiguity in matching corresponding points in the left and right images. However, if one has a priori knowledge of the characteristics of the objects in the scene, as is often the case in telerobotics, a model-based approach can be taken. Researchers describe how matching ambiguities can be resolved by ensuring that the resulting three-dimensional points are consistent with surface models of the expected objects. A four-layer neural network hierarchy is used in which surface models of increasing complexity are represented in successive layers. These models are represented using a connectionist scheme called parameter networks, in which a parametrized object (for example, a planar patch p = f(h, m_x, m_y)) is represented by a collection of processing units, each of which corresponds to a distinct combination of parameter values. The activity level of each unit in the parameter network can be thought of as representing the confidence with which the hypothesis represented by that unit is believed. Weights in the network are set so as to implement gradient descent in an energy function.
Chinese License Plates Recognition Method Based on A Robust and Efficient Feature Extraction and BPNN Algorithm

NASA Astrophysics Data System (ADS)

Zhang, Ming; Xie, Fei; Zhao, Jing; Sun, Rui; Zhang, Lei; Zhang, Yue

2018-04-01

The growth of license plate recognition technology has made a great contribution to the development of Intelligent Transport Systems (ITS). In this paper, a robust and efficient license plate recognition method is proposed, based on a combined feature extraction model and the BPNN (Back Propagation Neural Network) algorithm. First, a method for detecting and segmenting the candidate license plate region is developed. Second, a new feature extraction model is designed that combines three sets of features. Third, a license plate classification and recognition method using the combined feature model and the BPNN algorithm is presented. Finally, the experimental results indicate that license plate segmentation and recognition can both be achieved effectively by the proposed algorithm. Compared with three traditional methods, the recognition accuracy of the proposed method has increased to 95.7% and the processing time has decreased to 51.4 ms.
A True-Color Sensor and Suitable Evaluation Algorithm for Plant Recognition

PubMed Central

Schmittmann, Oliver; Schulze Lammers, Peter

2017-01-01

Plant-specific herbicide application requires sensor systems for plant recognition and differentiation. A literature review reveals a lack of sensor systems that are capable of recognizing small weeds in early stages of development (in the two- or four-leaf stage) and crop plants, of making spraying decisions in real time and that, in addition, are inexpensive and ready for practical use in sprayers. The system described in this work is based on freely cascadable and programmable true-color sensors for real-time recognition and identification of individual weed and crop plants. The application of this type of sensor is suitable for municipal areas and farmland with and without crops to perform the site-specific application of herbicides. Initially, databases with reflection properties of plants, natural and artificial backgrounds were created. Crop and weed plants should be recognized by the use of mathematical algorithms and decision models based on these data. They include the characteristic color spectrum, as well as the reflectance characteristics of unvegetated areas and areas with organic material. The CIE-Lab color-space was chosen for color matching because it contains information not only about coloration (a- and b-channel), but also about luminance (L-channel), thus increasing accuracy. Four different decision making algorithms based on different parameters are explained: (i) color similarity (ΔE); (ii) color similarity split in ΔL, Δa and Δb; (iii) a virtual channel ‘d’ and (iv) statistical distribution of the differences of reflection backgrounds and plants. Afterwards, the detection success of the recognition system is described. Furthermore, the minimum weed/plant coverage of the measuring spot was calculated by a mathematical model. Plants with a size of 1–5% of the spot can be recognized, and weeds in the two-leaf stage can be identified with a measuring spot size of 5 cm. By choosing a decision model beforehand, the detection quality can be increased. 
Depending on the characteristics of the background, different models are suitable. Finally, the results of field trials on municipal areas (with models of plants), winter wheat fields (with artificial plants) and grassland (with dock) are shown. In each experimental variant, objects and weeds could be recognized. PMID:28786922
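Decision algorithm (i), color similarity by ΔE, can be illustrated with a short sketch. This assumes the CIE76 definition of ΔE (Euclidean distance in CIE-Lab); the abstract does not say which ΔE formula the authors use, and the reference colors, threshold and function names below are hypothetical placeholders rather than values from the paper's databases.

```python
import math

def delta_e_cie76(lab1, lab2):
    """CIE76 color difference: Euclidean distance between two (L, a, b) triples."""
    d_l, d_a, d_b = (p - q for p, q in zip(lab1, lab2))
    return math.sqrt(d_l * d_l + d_a * d_a + d_b * d_b)

def classify(sample, references, threshold):
    """Return the label of the closest reference color, or None if nothing is
    within the threshold (assumed decision rule, for illustration only)."""
    label, ref = min(references.items(),
                     key=lambda item: delta_e_cie76(sample, item[1]))
    return label if delta_e_cie76(sample, ref) <= threshold else None

# Hypothetical reference colors in CIE-Lab; not measured values.
references = {
    "plant": (50.0, -40.0, 40.0),
    "soil": (35.0, 10.0, 20.0),
}

print(classify((48.0, -35.0, 38.0), references, threshold=10.0))  # close to "plant"
print(classify((90.0, 0.0, 0.0), references, threshold=10.0))     # far from both
```

Algorithm (ii) in the abstract would instead compare ΔL, Δa and Δb against separate limits, which lets luminance changes from shading be tolerated more loosely than changes in coloration.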
A True-Color Sensor and Suitable Evaluation Algorithm for Plant Recognition.

PubMed

Schmittmann, Oliver; Schulze Lammers, Peter

2017-08-08

Plant-specific herbicide application requires sensor systems for plant recognition and differentiation. A literature review reveals a lack of sensor systems that are capable of recognizing small weeds in early stages of development (in the two- or four-leaf stage) and crop plants, of making spraying decisions in real time and that, in addition, are inexpensive and ready for practical use in sprayers. The system described in this work is based on freely cascadable and programmable true-color sensors for real-time recognition and identification of individual weed and crop plants. The application of this type of sensor is suitable for municipal areas and farmland with and without crops to perform the site-specific application of herbicides. Initially, databases with reflection properties of plants, natural and artificial backgrounds were created. Crop and weed plants should be recognized by the use of mathematical algorithms and decision models based on these data. They include the characteristic color spectrum, as well as the reflectance characteristics of unvegetated areas and areas with organic material. The CIE-Lab color-space was chosen for color matching because it contains information not only about coloration (a- and b-channel), but also about luminance (L-channel), thus increasing accuracy. Four different decision making algorithms based on different parameters are explained: (i) color similarity (ΔE); (ii) color similarity split in ΔL, Δa and Δb; (iii) a virtual channel 'd' and (iv) statistical distribution of the differences of reflection backgrounds and plants. Afterwards, the detection success of the recognition system is described. Furthermore, the minimum weed/plant coverage of the measuring spot was calculated by a mathematical model. Plants with a size of 1-5% of the spot can be recognized, and weeds in the two-leaf stage can be identified with a measuring spot size of 5 cm. By choosing a decision model beforehand, the detection quality can be increased. 
Depending on the characteristics of the background, different models are suitable. Finally, the results of field trials on municipal areas (with models of plants), winter wheat fields (with artificial plants) and grassland (with dock) are shown. In each experimental variant, objects and weeds could be recognized.
Maximum mutual information estimation of a simplified hidden MRF for offline handwritten Chinese character recognition

NASA Astrophysics Data System (ADS)

Xiong, Yan; Reichenbach, Stephen E.

1999-01-01

Understanding of hand-written Chinese characters is at such a primitive stage that models include some assumptions about hand-written Chinese characters that are simply false. Maximum Likelihood Estimation (MLE) may therefore not be an optimal method for hand-written Chinese character recognition. This concern motivates the research effort to consider alternative criteria. Maximum Mutual Information Estimation (MMIE) is an alternative method for parameter estimation that does not derive its rationale from presumed model correctness, but instead examines the pattern-modeling problem in automatic recognition systems from an information-theoretic point of view. The objective of MMIE is to find a set of parameters such that the resultant model allows the system to derive from the observed data as much information as possible about the class. We consider MMIE for recognition of hand-written Chinese characters using a simplified hidden Markov Random Field. MMIE provides improved performance over MLE in this application.
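The contrast between the two criteria can be written out in their usual textbook form. The notation is an assumption made for illustration and is not taken from the paper: x_r is the r-th training observation, c_r its true class, θ the model parameters, and P(c) the class prior.

```latex
% MLE: fit each class model to its own data, assuming the model family is correct.
\hat{\theta}_{\mathrm{MLE}}
  = \arg\max_{\theta} \sum_{r=1}^{R} \log p_{\theta}(x_r \mid c_r)

% MMIE: maximize the posterior of the correct class against all competing classes.
\hat{\theta}_{\mathrm{MMIE}}
  = \arg\max_{\theta} \sum_{r=1}^{R}
    \log \frac{p_{\theta}(x_r \mid c_r)\, P(c_r)}
              {\sum_{c} p_{\theta}(x_r \mid c)\, P(c)}
```

The denominator is what makes MMIE discriminative: raising the likelihood of the correct class helps only if the likelihoods of the competing classes do not rise with it, which is why the criterion does not depend on the model being correct.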
Visual Recognition Software for Binary Classification and Its Application to Spruce Pollen Identification

PubMed Central

Tcheng, David K.; Nayak, Ashwin K.; Fowlkes, Charless C.; Punyasena, Surangi W.

2016-01-01

Discriminating between black and white spruce (Picea mariana and Picea glauca) is a difficult palynological classification problem that, if solved, would provide valuable data for paleoclimate reconstructions. We developed an open-source visual recognition software (ARLO, Automated Recognition with Layered Optimization) capable of differentiating between these two species at an accuracy on par with human experts. The system applies pattern recognition and machine learning to the analysis of pollen images and discovers general-purpose image features, defined by simple features of lines and grids of pixels taken at different dimensions, size, spacing, and resolution. It adapts to a given problem by searching for the most effective combination of both feature representation and learning strategy. This results in a powerful and flexible framework for image classification. We worked with images acquired using an automated slide scanner. We first applied a hash-based “pollen spotting” model to segment pollen grains from the slide background. We next tested ARLO’s ability to reconstruct black to white spruce pollen ratios using artificially constructed slides of known ratios. We then developed a more scalable hash-based method of image analysis that was able to distinguish between the pollen of black and white spruce with an estimated accuracy of 83.61%, comparable to human expert performance. Our results demonstrate the capability of machine learning systems to automate challenging taxonomic classifications in pollen analysis, and our success with simple image representations suggests that our approach is generalizable to many other object recognition problems. PMID:26867017
Culture modulates implicit ownership-induced self-bias in memory.

PubMed

Sparks, Samuel; Cunningham, Sheila J; Kritikos, Ada

2016-08-01

The relation of incoming stimuli to the self implicitly determines the allocation of cognitive resources. Cultural variations in the self-concept shape cognition, but the extent is unclear because the majority of studies sample only Western participants. We report cultural differences (Asian versus Western) in ownership-induced self-bias in recognition memory for objects. In two experiments, participants allocated a series of images depicting household objects to self-owned or other-owned virtual baskets based on colour cues before completing a surprise recognition memory test for the objects. The 'other' was either a stranger or a close other. In both experiments, Western participants showed greater recognition memory accuracy for self-owned compared with other-owned objects, consistent with an independent self-construal. In Experiment 1, which required minimal attention to the owned objects, Asian participants showed no such ownership-related bias in recognition accuracy. In Experiment 2, which required attention to owned objects to move them along the screen, Asian participants again showed no overall memory advantage for self-owned items and actually exhibited higher recognition accuracy for mother-owned than self-owned objects, reversing the pattern observed for Westerners. This is consistent with an interdependent self-construal which is sensitive to the particular relationship between the self and other. Overall, our results suggest that the self acts as an organising principle for allocating cognitive resources, but that the way it is constructed depends upon cultural experience. Additionally, the manifestation of these cultural differences in self-representation depends on the allocation of attentional resources to self- and other-associated stimuli. Crown Copyright © 2016. Published by Elsevier B.V. All rights reserved.

Automatic updating and 3D modeling of airport information from high resolution images using GIS and LIDAR data

NASA Astrophysics Data System (ADS)

Lv, Zheng; Sui, Haigang; Zhang, Xilin; Huang, Xianfeng

2007-11-01

As one of the most important types of geo-spatial object and military establishment, the airport is always a key target in the fields of transportation and military affairs. Therefore, automatic recognition and extraction of airports from remote sensing images is very important and urgent for updating civil aviation and military applications. In this paper, a new multi-source data fusion approach to automatic airport information extraction, updating and 3D modeling is addressed. The corresponding key technologies are discussed in detail, including feature extraction of airport information based on a modified Otsu algorithm, automatic change detection based on a new parallel-lines-based buffer detection algorithm, 3D modeling based on a gradual elimination of non-building points algorithm, 3D change detection between the old airport model and LIDAR data, import of typical CAD models, and so on. Finally, based on these technologies, we develop a prototype system, and the results show that our method achieves good results.
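For orientation, the standard Otsu threshold that the feature extraction step builds on can be sketched as follows. This is the classical algorithm, not the authors' modified version (the abstract does not describe the modification), and the pixel values are a made-up bimodal sample.

```python
def otsu_threshold(pixels, levels=256):
    """Classical Otsu method: pick the gray level t that maximizes the
    between-class variance of the classes {p <= t} and {p > t}."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the low class
    sum0 = 0    # sum of gray levels in the low class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue        # low class still empty
        if w1 == 0:
            break           # high class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mu0 - mu1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t

# Hypothetical bimodal image: dark background and bright runway-like pixels.
pixels = [10] * 50 + [12] * 50 + [200] * 40 + [210] * 60
t = otsu_threshold(pixels)
foreground = [p for p in pixels if p > t]
print(t, len(foreground))
```

Any threshold between the two clusters separates them equally well; this sketch returns the lowest such level because it only updates on a strict improvement.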
View-invariant object recognition ability develops after discrimination, not mere exposure, at several viewing angles.

PubMed

Yamashita, Wakayo; Wang, Gang; Tanaka, Keiji

2010-01-01

One usually fails to recognize an unfamiliar object across changes in viewing angle when it has to be discriminated from similar distractor objects. Previous work has demonstrated that after long-term experience in discriminating among a set of objects seen from the same viewing angle, immediate recognition of the objects across 30-60 degree changes in viewing angle becomes possible. The capability for view-invariant object recognition should develop during the within-viewing-angle discrimination, which includes two kinds of experience: seeing individual views and discriminating among the objects. The aim of the present study was to determine the relative contribution of each factor to the development of view-invariant object recognition capability. Monkeys were first extensively trained in a task that required view-invariant object recognition (Object task) with several sets of objects. The animals were then exposed to a new set of objects over 26 days in one of two preparatory tasks: one in which each object view was seen individually, and a second that required discrimination among the objects at each of four viewing angles. After the preparatory period, we measured the monkeys' ability to recognize the objects across changes in viewing angle, by introducing the object set to the Object task. Results indicated significant view-invariant recognition after the second but not the first preparatory task. These results suggest that discrimination of objects from distractors at each of several viewing angles is required for the development of view-invariant recognition of the objects when the distractors are similar to the objects.
3-D World Modeling For An Autonomous Robot

NASA Astrophysics Data System (ADS)

Goldstein, M.; Pin, F. G.; Weisbin, C. R.

1987-01-01

This paper presents a methodology for a concise representation of the 3-D world model for a mobile robot, using range data. The process starts with the segmentation of the scene into "objects" that are given a unique label, based on principles of range continuity. Then the external surface of each object is partitioned into homogeneous surface patches. Contours of surface patches in 3-D space are identified by estimating the normal and curvature associated with each pixel. The resulting surface patches are then classified as planar, convex or concave. Since the world model uses a volumetric representation for the 3-D environment, planar surfaces are represented by thin volumetric polyhedra. Spherical and cylindrical surfaces are extracted and represented by appropriate volumetric primitives. All other surfaces are represented using the boolean union of spherical volumes (as described in a separate paper by the same authors). The result is a general, concise representation of the external 3-D world, which allows for efficient and robust 3-D object recognition.
Thermal-to-visible face recognition using partial least squares.

PubMed

Hu, Shuowen; Choi, Jonghyun; Chan, Alex L; Schwartz, William Robson

2015-03-01

Although visible face recognition has been an active area of research for several decades, cross-modal face recognition has only been explored by the biometrics community relatively recently. Thermal-to-visible face recognition is one of the most difficult cross-modal face recognition challenges, because of the difference in phenomenology between the thermal and visible imaging modalities. We address the cross-modal recognition problem using a partial least squares (PLS) regression-based approach consisting of preprocessing, feature extraction, and PLS model building. The preprocessing and feature extraction stages are designed to reduce the modality gap between the thermal and visible facial signatures, and facilitate the subsequent one-vs-all PLS-based model building. We incorporate multi-modal information into the PLS model building stage to enhance cross-modal recognition. The performance of the proposed recognition algorithm is evaluated on three challenging datasets containing visible and thermal imagery acquired under different experimental scenarios: time-lapse, physical tasks, mental tasks, and subject-to-camera range. These scenarios represent difficult challenges relevant to real-world applications. We demonstrate that the proposed method performs robustly for the examined scenarios.
Automatic textual annotation of video news based on semantic visual object extraction

NASA Astrophysics Data System (ADS)

Boujemaa, Nozha; Fleuret, Francois; Gouet, Valerie; Sahbi, Hichem

2003-12-01

In this paper, we present our work on the automatic generation of textual metadata based on visual content analysis of video news. We present two methods for semantic object detection and recognition from a cross-modal image-text thesaurus. This thesaurus represents a supervised association between models and semantic labels. This paper is concerned with two semantic objects: faces and TV logos. In the first part, we present our work on efficient face detection and recognition with automatic name generation. This method also allows us to suggest textual annotation of shots based on close-up estimation. In the second part, we automatically detect and recognize the different TV logos present in incoming news from different TV channels. This work was done jointly with the French TV channel TF1 within the "MediaWorks" project, which consists of a hybrid text-image indexing and retrieval platform for video news.
A neural network ActiveX based integrated image processing environment.

PubMed

Ciuca, I; Jitaru, E; Alaicescu, M; Moisil, I

2000-01-01

The paper outlines an integrated image processing environment that uses neural network ActiveX technology for object recognition and classification. The image processing environment, which is Windows based, encapsulates a Multiple-Document Interface (MDI) and is menu driven. Object (shape) parameter extraction is focused on features that are invariant under translation, rotation and scale transformations. The neural network models that can be incorporated as ActiveX components into the environment allow both clustering and classification of objects from the analysed image. Mapping neural networks perform an input sensitivity analysis on the extracted feature measurements and thus facilitate the removal of irrelevant features and improvements in the degree of generalisation. The program has been used to evaluate the dimensions of the hydrocephalus in a study for calculating the Evans index and the angle of the frontal horns of the ventricular system modifications.
Medical image segmentation by combining graph cuts and oriented active appearance models.

PubMed

Chen, Xinjian; Udupa, Jayaram K; Bagci, Ulas; Zhuge, Ying; Yao, Jianhua

2012-04-01

In this paper, we propose a novel method based on a strategic combination of the active appearance model (AAM), live wire (LW), and graph cuts (GCs) for abdominal 3-D organ segmentation. The proposed method consists of three main parts: model building, object recognition, and delineation. In the model building part, we construct the AAM and train the LW cost function and GC parameters. In the recognition part, a novel algorithm is proposed for improving the conventional AAM matching method, which effectively combines the AAM and LW methods, resulting in the oriented AAM (OAAM). A multiobject strategy is utilized to help in object initialization. We employ a pseudo-3-D initialization strategy and segment the organs slice by slice via a multiobject OAAM method. For the object delineation part, a 3-D shape-constrained GC method is proposed. The object shape generated from the initialization step is integrated into the GC cost computation, and an iterative GC-OAAM method is used for object delineation. The proposed method was tested in segmenting the liver, kidneys, and spleen on a clinical CT data set and also on the MICCAI 2007 Grand Challenge liver data set. The results show the following: 1) The overall segmentation accuracy of true positive volume fraction TPVF > 94.3% and false positive volume fraction can be achieved; 2) the initialization performance can be improved by combining the AAM and LW; 3) the multiobject strategy greatly facilitates initialization; 4) compared with the traditional 3-D AAM method, the pseudo-3-D OAAM method achieves comparable performance while running 12 times faster; and 5) the performance of the proposed method is comparable to state-of-the-art liver segmentation algorithms. The executable version of the 3-D shape-constrained GC method with a user interface can be downloaded from http://xinjianchen.wordpress.com/research/.
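The two accuracy measures quoted in the results can be made concrete with a small sketch. It assumes the commonly used definitions, which the abstract itself does not spell out: TPVF is the fraction of the true object volume that the segmentation covers, and FPVF is the falsely labeled volume as a fraction of the volume outside the true object. The voxel sets and domain size are toy values.

```python
def volume_fractions(segmented, truth, domain_size):
    """Return (TPVF, FPVF) for voxel sets, under the assumed definitions
    TPVF = |S ∩ G| / |G| and FPVF = |S − G| / (|domain| − |G|)."""
    true_positive = len(segmented & truth)
    false_positive = len(segmented - truth)
    tpvf = true_positive / len(truth)
    fpvf = false_positive / (domain_size - len(truth))
    return tpvf, fpvf

# Toy example: a 10x10 ground-truth slab and a segmentation shifted by one voxel.
truth = {(x, y, 0) for x in range(10) for y in range(10)}
segmented = {(x, y, 0) for x in range(1, 11) for y in range(10)}
domain_size = 10 * 10 * 10  # hypothetical image volume in voxels

tpvf, fpvf = volume_fractions(segmented, truth, domain_size)
print(f"TPVF = {tpvf:.3f}, FPVF = {fpvf:.4f}")
```

Reporting both numbers matters because a segmentation can reach a high TPVF simply by over-segmenting; FPVF penalizes that.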
Fluent, Fast, and Frugal? A Formal Model Evaluation of the Interplay between Memory, Fluency, and Comparative Judgments

ERIC Educational Resources Information Center

Hilbig, Benjamin E.; Erdfelder, Edgar; Pohl, Rudiger F.

2011-01-01

A new process model of the interplay between memory and judgment processes was recently suggested, assuming that retrieval fluency--that is, the speed with which objects are recognized--will determine inferences concerning such objects in a single-cue fashion. This aspect of the fluency heuristic, an extension of the recognition heuristic, has…
Scaling up spike-and-slab models for unsupervised feature learning.

PubMed

Goodfellow, Ian J; Courville, Aaron; Bengio, Yoshua

2013-08-01

We describe the use of two spike-and-slab models for modeling real-valued data, with an emphasis on their applications to object recognition. The first model, which we call spike-and-slab sparse coding (S3C), is a preexisting model for which we introduce a faster approximate inference algorithm. We introduce a deep variant of S3C, which we call the partially directed deep Boltzmann machine (PD-DBM) and extend our S3C inference algorithm for use on this model. We describe learning procedures for each. We demonstrate that our inference procedure for S3C enables scaling the model to unprecedented large problem sizes, and demonstrate that using S3C as a feature extractor results in very good object recognition performance, particularly when the number of labeled examples is low. We show that the PD-DBM generates better samples than its shallow counterpart, and that unlike DBMs or DBNs, the PD-DBM may be trained successfully without greedy layerwise training.
Are face representations depth cue invariant?

PubMed

Dehmoobadsharifabadi, Armita; Farivar, Reza

2016-06-01

The visual system can process three-dimensional depth cues defining surfaces of objects, but it is unclear whether such information contributes to complex object recognition, including face recognition. The processing of different depth cues involves both dorsal and ventral visual pathways. We investigated whether facial surfaces defined by individual depth cues resulted in meaningful face representations, that is, representations that maintain the relationship between the population of faces as defined in a multidimensional face space. We measured face identity aftereffects for facial surfaces defined by individual depth cues (Experiments 1 and 2) and tested whether the aftereffect transfers across depth cues (Experiments 3 and 4). Facial surfaces and their morphs to the average face were defined purely by one of shading, texture, motion, or binocular disparity. We obtained identification thresholds for matched (matched identity between adapting and test stimuli), non-matched (non-matched identity between adapting and test stimuli), and no-adaptation (showing only the test stimuli) conditions for each cue and across different depth cues. We found a robust face identity aftereffect in both experiments. Our results suggest that depth cues do contribute to forming meaningful face representations that are depth cue invariant. Depth cue invariance would require integration of information across different areas and different pathways for object recognition, and this in turn has important implications for cortical models of visual object recognition.
A high-fat high-sugar diet-induced impairment in place-recognition memory is reversible and training-dependent.

PubMed

Tran, Dominic M D; Westbrook, R Frederick

2017-03-01

A high-fat high-sugar (HFHS) diet is associated with cognitive deficits in people and produces spatial learning and memory deficits in rodents. Notably, such diets rapidly impair place-, but not object-recognition memory in rats within one week of exposure. Three experiments examined whether this impairment was reversed by removal of the diet, or prevented by pre-diet training. Experiment 1 showed that rats switched from HFHS to chow recovered from the place-recognition impairment that they displayed while on HFHS. Experiment 2 showed that control rats ("Untrained"), who were exposed to an empty testing arena while on chow, were impaired in place-recognition when switched to HFHS and tested for the first time. However, rats tested ("Trained") on the place and object task while on chow were protected from the diet-induced deficit and maintained good place-recognition when switched to HFHS. Experiment 3 examined the conditions of this protection effect by training rats in a square arena while on chow, and testing them in a rectangular arena while on HFHS. We have previously demonstrated that chow rats, but not HFHS rats, show geometry-based reorientation on a rectangular arena place-recognition task (Tran & Westbrook, 2015). Experiment 3 assessed whether rats switched to the HFHS diet after training on the place and object tasks in a square arena would show geometry-based reorientation in a rectangular arena. The protective benefit of training was replicated in the square arena, but both Untrained and Trained HFHS rats failed to show geometry-based reorientation in the rectangular arena. These findings are discussed in relation to the specificity of the training effect, the role of the hippocampus in diet-induced deficits, and their implications for dietary effects on cognition in people. Copyright © 2016 Elsevier Ltd. All rights reserved.
Deep Neural Networks for Speech Separation With Application to Robust Speech Recognition

DTIC Science & Technology

acoustic-phonetic features. The second objective is integration of spectrotemporal context for improved separation performance. Conditional random fields...will be used to encode contextual constraints. The third objective is to achieve robust ASR in the DNN framework through integrated acoustic modeling
Object instance recognition using motion cues and instance specific appearance models

NASA Astrophysics Data System (ADS)

Schumann, Arne

2014-03-01

In this paper we present an object instance retrieval approach. The baseline approach consists of a pool of image features which are computed on the bounding boxes of a query object track and compared to a database of tracks in order to find additional appearances of the same object instance. We improve over this simple baseline approach in multiple ways: 1) we include motion cues to achieve improved robustness to viewpoint and rotation changes, 2) we include operator feedback to iteratively re-rank the resulting retrieval lists and 3) we use operator feedback and location constraints to train classifiers and learn an instance specific appearance model. We use these classifiers to further improve the retrieval results. The approach is evaluated on two popular public datasets for two different applications. We evaluate person re-identification on the CAVIAR shopping mall surveillance dataset and vehicle instance recognition on the VIVID aerial dataset and achieve significant improvements over our baseline results.
The hierarchical brain network for face recognition.

PubMed

Zhen, Zonglei; Fang, Huizhen; Liu, Jia

2013-01-01

Numerous functional magnetic resonance imaging (fMRI) studies have identified multiple cortical regions that are involved in face processing in the human brain. However, few studies have characterized the face-processing network as a functioning whole. In this study, we used fMRI to identify face-selective regions in the entire brain and then explored the hierarchical structure of the face-processing network by analyzing functional connectivity among these regions. We identified twenty-five regions, mainly in the occipital, temporal and frontal cortex, that showed a reliable response selective to faces (versus objects) across participants and across scan sessions. Furthermore, these regions were clustered into three relatively independent sub-networks in a face-recognition task on the basis of the strength of functional connectivity among them. The functionality of the sub-networks likely corresponds to the recognition of individual identity, retrieval of semantic knowledge and representation of emotional information. Interestingly, when the task was switched to object recognition from face recognition, the functional connectivity between the inferior occipital gyrus and the rest of the face-selective regions was significantly reduced, suggesting that this region may serve as an entry node in the face-processing network. In sum, our study provides empirical evidence for cognitive and neural models of face recognition and helps elucidate the neural mechanisms underlying face recognition at the network level.
Exploring the association between visual perception abilities and reading of musical notation.

PubMed

Lee, Horng-Yih

2012-06-01

In the reading of music, the acquisition of pitch information depends primarily upon the spatial position of notes as well as upon an individual's spatial processing ability. This study investigated the relationship between the ability to read single notes and visual-spatial ability. Participants with high and low single-note reading abilities were differentiated based upon differences in musical notation-reading abilities, and their spatial processing and object recognition abilities were then assessed. It was found that the group with lower note-reading abilities made more errors than did the group with higher note-reading abilities in the mental rotation task. In contrast, there was no apparent significant difference between the two groups in the object recognition task. These results suggest that note-reading may be related to visual spatial processing abilities, and not to an individual's ability with object recognition.
Distinct roles of basal forebrain cholinergic neurons in spatial and object recognition memory.

PubMed

Okada, Kana; Nishizawa, Kayo; Kobayashi, Tomoko; Sakata, Shogo; Kobayashi, Kazuto

2015-08-06

Recognition memory requires processing of various types of information such as objects and locations. Impairment in recognition memory is a prominent feature of amnesia and a symptom of Alzheimer's disease (AD). Basal forebrain cholinergic neurons contain two major groups, one localized in the medial septum (MS)/vertical diagonal band of Broca (vDB), and the other in the nucleus basalis magnocellularis (NBM). The roles of these cell groups in recognition memory have been debated, and it remains unclear how they contribute to it. We use a genetic cell targeting technique to selectively eliminate cholinergic cell groups and then test spatial and object recognition memory through different behavioural tasks. Eliminating MS/vDB neurons impairs spatial but not object recognition memory in the reference and working memory tasks, whereas NBM elimination undermines only object recognition memory in the working memory task. These impairments are restored by treatment with acetylcholinesterase inhibitors, anti-dementia drugs for AD. Our results highlight that MS/vDB and NBM cholinergic neurons are not only implicated in recognition memory but also have essential roles in different types of recognition memory.
Thoracic lymph node station recognition on CT images based on automatic anatomy recognition with an optimal parent strategy

NASA Astrophysics Data System (ADS)

Xu, Guoping; Udupa, Jayaram K.; Tong, Yubing; Cao, Hanqiang; Odhner, Dewey; Torigian, Drew A.; Wu, Xingyu

2018-03-01

Many papers have been published on the detection and segmentation of lymph nodes from medical images. However, it is still a challenging problem owing to low contrast with surrounding soft tissues and the variations of lymph node size and shape on computed tomography (CT) images. This is particularly difficult on low-dose CT of PET/CT acquisitions. In this study, we utilize our previous automatic anatomy recognition (AAR) framework to recognize the thoracic lymph node stations defined by the International Association for the Study of Lung Cancer (IASLC) lymph node map. The lymph node stations themselves are viewed as anatomic objects and are localized by using a one-shot method in the AAR framework. Two strategies are adopted in this paper for integration into the AAR framework. The first is to combine some lymph node stations into composite lymph node stations according to their geometrical nearness. The other is to find the optimal parent (organ or union of organs) as an anchor for each lymph node station based on the recognition error and thereby find an overall optimal hierarchy to arrange anchor organs and lymph node stations. Based on 28 contrast-enhanced thoracic CT image data sets for model building and 12 independent data sets for testing, our results show that thoracic lymph node stations can be localized within 2-3 voxels compared to the ground truth.
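The optimal parent strategy described above amounts to choosing, for each lymph node station, the candidate anchor with the smallest recognition error. A minimal sketch, assuming the error for every station and candidate parent has already been measured on the model-building data sets; the function name, the station and organ labels, and the error values below are all hypothetical and are not taken from the paper.

```python
def optimal_parents(errors):
    """Pick, for each station, the parent with the lowest mean localization error.

    errors maps station -> {candidate parent -> list of errors in voxels}.
    Returns station -> (best parent, its mean error).
    """
    best = {}
    for station, by_parent in errors.items():
        mean_error = {parent: sum(vals) / len(vals) for parent, vals in by_parent.items()}
        parent = min(mean_error, key=mean_error.get)
        best[station] = (parent, mean_error[parent])
    return best


# Hypothetical error measurements, in voxels.
errors = {
    "station_A": {"trachea": [2.0, 3.0], "heart": [4.0, 5.0]},
    "station_B": {"trachea": [6.0], "heart": [1.0, 2.0]},
}
hierarchy = optimal_parents(errors)
```

This selects each parent independently. Arranging the anchors and stations into one overall hierarchy, as the abstract describes, may need additional constraints that this sketch does not model.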
Data-centric method for object observation through scattering media

NASA Astrophysics Data System (ADS)

Tanida, Jun; Horisaki, Ryoichi

2018-03-01

A data-centric method is introduced for object observation through scattering media. A large number of training pairs are used to characterize the relation between the object and the observation signals based on machine learning. Using the method, object information can be retrieved even from strongly disturbed signals. As potential applications, object recognition, imaging, and focusing through scattering media were demonstrated.
Dentate gyrus supports slope recognition memory, shades of grey-context pattern separation and recognition memory, and CA3 supports pattern completion for object memory.

PubMed

Kesner, Raymond P; Kirk, Ryan A; Yu, Zhenghui; Polansky, Caitlin; Musso, Nick D

2016-03-01

In order to examine the role of the dorsal dentate gyrus (dDG) in slope (vertical space) recognition and possible pattern separation, various slope (vertical space) degrees were used in a novel exploratory paradigm to measure novelty detection for changes in slope (vertical space) recognition memory and slope memory pattern separation in Experiment 1. The results of the experiment indicate that control rats displayed a slope recognition memory function with a pattern separation process for slope memory that is dependent upon the magnitude of change in slope between study and test phases. In contrast, the dDG lesioned rats displayed an impairment in slope recognition memory, though because there was no significant interaction between the two groups and slope memory, a reliable pattern separation impairment for slope could not be firmly established in the DG lesioned rats. In Experiment 2, in order to determine whether the dDG plays a role in shades of grey spatial context recognition and possible pattern separation, shades of grey were used in a novel exploratory paradigm to measure novelty detection for changes in the shades of grey context environment. The results of the experiment indicate that control rats displayed a shades of grey-context pattern separation effect across levels of separation of context (shades of grey). In contrast, the DG lesioned rats displayed a significant interaction between the two groups and levels of shades of grey suggesting impairment in a pattern separation function for levels of shades of grey. In Experiment 3, in order to determine whether the dorsal CA3 (dCA3) plays a role in object pattern completion, a new task requiring less training and using a choice that was based on choosing the correct set of objects on a two-choice discrimination task was used. The results indicated that control rats displayed a pattern completion function based on the availability of one, two, three or four cues.
In contrast, the dCA3 lesioned rats displayed a significant interaction between the two groups and the number of available objects suggesting impairment in a pattern completion function for object cues.
Visual object recognition for mobile tourist information systems

NASA Astrophysics Data System (ADS)

Paletta, Lucas; Fritz, Gerald; Seifert, Christin; Luley, Patrick; Almer, Alexander

2005-03-01

We describe a mobile vision system that is capable of automated object identification using images captured from a PDA or a camera phone. We present a solution for the enabling technology of outdoor vision-based object recognition that will extend state-of-the-art location and context aware services towards object based awareness in urban environments. In the proposed application scenario, tourist pedestrians are equipped with GPS, W-LAN and a camera attached to a PDA or a camera phone. They are interested in whether their field of view contains tourist sights that would point to more detailed information. Multimedia type data about related history, the architecture, or other related cultural context of historic or artistic relevance might be explored by a mobile user who is intending to learn within the urban environment. Learning from ambient cues is in this way achieved by pointing the device towards the urban sight, capturing an image, and consequently getting information about the object on site and within the focus of attention, i.e., the user's current field of view.

View-invariant object category learning, recognition, and search: how spatial and object attention are coordinated using surface-based attentional shrouds.

PubMed

Fazl, Arash; Grossberg, Stephen; Mingolla, Ennio

2009-02-01

How does the brain learn to recognize an object from multiple viewpoints while scanning a scene with eye movements? How does the brain avoid the problem of erroneously classifying parts of different objects together? How are attention and eye movements intelligently coordinated to facilitate object learning? A neural model provides a unified mechanistic explanation of how spatial and object attention work together to search a scene and learn what is in it. The ARTSCAN model predicts how an object's surface representation generates a form-fitting distribution of spatial attention, or "attentional shroud". All surface representations dynamically compete for spatial attention to form a shroud. The winning shroud persists during active scanning of the object. The shroud maintains sustained activity of an emerging view-invariant category representation while multiple view-specific category representations are learned and are linked through associative learning to the view-invariant object category. The shroud also helps to restrict scanning eye movements to salient features on the attended object. Object attention plays a role in controlling and stabilizing the learning of view-specific object categories. Spatial attention hereby coordinates the deployment of object attention during object category learning. Shroud collapse releases a reset signal that inhibits the active view-invariant category in the What cortical processing stream. Then a new shroud, corresponding to a different object, forms in the Where cortical processing stream, and search using attention shifts and eye movements continues to learn new objects throughout a scene. The model mechanistically clarifies basic properties of attention shifts (engage, move, disengage) and inhibition of return. It simulates human reaction time data about object-based spatial attention shifts, and learns with 98.1% accuracy and a compression of 430 on a letter database whose letters vary in size, position, and orientation. 
The model provides a powerful framework for unifying many data about spatial and object attention, and their interactions during perception, cognition, and action.
Leader/Follower Behaviour Using the SIFT Algorithm for Object Recognition

DTIC Science & Technology

2006-06-01

more complex convoy operations that would use machine vision based on the detection of a leader. Future work: Given the...Systems: A Virtual Trailer Link Model, In Proceedings of IEEE/RSJ Conference on Intelligent Robots and Systems. [4] Hong, P., Sahli, H., Colon, E., and... Intelligent Robots and Systems. [6] Nguyen, H., Kogut, G., Barua, R., and Burmeister, A. (2004), A Segway RMP-based Robotic Transport System, In In
Measuring the Speed of Newborn Object Recognition in Controlled Visual Worlds

ERIC Educational Resources Information Center

Wood, Justin N.; Wood, Samantha M. W.

2017-01-01

How long does it take for a newborn to recognize an object? Adults can recognize objects rapidly, but measuring object recognition speed in newborns has not previously been possible. Here we introduce an automated controlled-rearing method for measuring the speed of newborn object recognition in controlled visual worlds. We raised newborn chicks…
Deletion of the GluA1 AMPA receptor subunit impairs recency-dependent object recognition memory

PubMed Central

Sanderson, David J.; Hindley, Emma; Smeaton, Emily; Denny, Nick; Taylor, Amy; Barkus, Chris; Sprengel, Rolf; Seeburg, Peter H.; Bannerman, David M.

2011-01-01

Deletion of the GluA1 AMPA receptor subunit impairs short-term spatial recognition memory. It has been suggested that short-term recognition depends upon memory caused by the recent presentation of a stimulus that is independent of contextual–retrieval processes. The aim of the present set of experiments was to test whether the role of GluA1 extends to nonspatial recognition memory. Wild-type and GluA1 knockout mice were tested on the standard object recognition task and a context-independent recognition task that required recency-dependent memory. In a first set of experiments it was found that GluA1 deletion failed to impair performance on either of the object recognition or recency-dependent tasks. However, GluA1 knockout mice displayed increased levels of exploration of the objects in both the sample and test phases compared to controls. In contrast, when the time that GluA1 knockout mice spent exploring the objects was yoked to control mice during the sample phase, it was found that GluA1 deletion now impaired performance on both the object recognition and the recency-dependent tasks. GluA1 deletion failed to impair performance on a context-dependent recognition task regardless of whether object exposure in knockout mice was yoked to controls or not. These results demonstrate that GluA1 is necessary for nonspatial as well as spatial recognition memory and plays an important role in recency-dependent memory processes. PMID:21378100
Attention-Based Recurrent Temporal Restricted Boltzmann Machine for Radar High Resolution Range Profile Sequence Recognition.

PubMed

Zhang, Yifan; Gao, Xunzhang; Peng, Xuan; Ye, Jiaqi; Li, Xiang

2018-05-16

High Resolution Range Profile (HRRP) recognition has attracted great attention in the field of Radar Automatic Target Recognition (RATR). However, traditional HRRP recognition methods fail to model high dimensional sequential data efficiently and have poor anti-noise ability. To deal with these problems, a novel stochastic neural network model named Attention-based Recurrent Temporal Restricted Boltzmann Machine (ARTRBM) is proposed in this paper. RTRBM is utilized to extract discriminative features and the attention mechanism is adopted to select major features. RTRBM models high dimensional HRRP sequences efficiently because it can extract the information of temporal and spatial correlation between adjacent HRRPs. The attention mechanism is used in sequential data recognition tasks including machine translation and relation classification, where it makes the model pay more attention to the major features for recognition. Therefore, the combination of RTRBM and the attention mechanism makes our model effective at extracting more internally related features and choosing the important parts of the extracted features. Additionally, the model performs well with noise corrupted HRRP data. Experimental results on the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset show that our proposed model outperforms other traditional methods, which indicates that ARTRBM extracts, selects, and utilizes the correlation information between adjacent HRRPs effectively and is suitable for high dimensional data or noise corrupted data.
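The feature-selection role of the attention mechanism can be illustrated with a generic softmax attention over a sequence of per-step feature vectors. This is a minimal sketch of the general technique, not the ARTRBM formulation: the scoring rule (a dot product with a fixed `score_vector`) and all names here are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features, score_vector):
    """Weight each time step's feature vector by its attention weight.

    features: list of equal-length feature vectors, one per time step.
    score_vector: hypothetical learned vector; a step's score is its dot product with it.
    Returns (weights, context), where context is the weighted sum of the features.
    """
    scores = [sum(w * x for w, x in zip(score_vector, f)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [sum(wt * f[i] for wt, f in zip(weights, features)) for i in range(dim)]
    return weights, context


# With a zero score vector every step is weighted equally.
uniform_weights, uniform_context = attend([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
# A score vector favouring the first feature shifts weight to the first step.
skewed_weights, _ = attend([[1.0, 0.0], [0.0, 1.0]], [2.0, 0.0])
```

In a trained model the score vector would be learned jointly with the feature extractor, so the steps that matter most for recognition receive the largest weights.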
Modeling the effect of channel number and interaction on consonant recognition in a cochlear implant peak-picking strategy.

PubMed

Verschuur, Carl

2009-03-01

Difficulties in speech recognition experienced by cochlear implant users may be attributed both to information loss caused by signal processing and to information loss associated with the interface between the electrode array and auditory nervous system, including cross-channel interaction. The objective of the work reported here was to attempt to partial out the relative contribution of these different factors to consonant recognition. This was achieved by comparing patterns of consonant feature recognition as a function of channel number and presence/absence of background noise in users of the Nucleus 24 device with normal hearing subjects listening to acoustic models that mimicked processing of that device. Additionally, in the acoustic model experiment, a simulation of cross-channel spread of excitation, or "channel interaction," was varied. Results showed that acoustic model experiments were highly correlated with patterns of performance in better-performing cochlear implant users. Deficits to consonant recognition in this subgroup could be attributed to cochlear implant processing, whereas channel interaction played a much smaller role in determining performance errors. The study also showed that large changes to channel number in the Advanced Combination Encoder signal processing strategy led to no substantial changes in performance.
A stable biologically motivated learning mechanism for visual feature extraction to handle facial categorization.

PubMed

Rajaei, Karim; Khaligh-Razavi, Seyed-Mahdi; Ghodrati, Masoud; Ebrahimpour, Reza; Shiri Ahmad Abadi, Mohammad Ebrahim

2012-01-01

The brain mechanism of extracting visual features for recognizing various objects has consistently been a controversial issue in computational models of object recognition. To extract visual features, we introduce a new, biologically motivated model for facial categorization, which is an extension of the Hubel and Wiesel simple-to-complex cell hierarchy. To address the synaptic stability versus plasticity dilemma, we apply the Adaptive Resonance Theory (ART) for extracting informative intermediate level visual features during the learning process, which also makes this model stable against the destruction of previously learned information while learning new information. Such a mechanism has been suggested to be embedded within known laminar microcircuits of the cerebral cortex. To reveal the strength of the proposed visual feature learning mechanism, we show that when we use this mechanism in the training process of a well-known biologically motivated object recognition model (the HMAX model), it performs better than the HMAX model in face/non-face classification tasks. Furthermore, we demonstrate that our proposed mechanism is capable of following similar trends in performance as humans in a psychophysical experiment using a face versus non-face rapid categorization task.
An Intelligent Systems Approach to Automated Object Recognition: A Preliminary Study

USGS Publications Warehouse

Maddox, Brian G.; Swadley, Casey L.

2002-01-01

Attempts at fully automated object recognition systems have met with varying levels of success over the years. However, none of the systems have achieved high enough accuracy rates to be run unattended. One of the reasons for this may be that they are designed from the computer's point of view and rely mainly on image-processing methods. A better solution to this problem may be to make use of modern advances in computational intelligence and distributed processing to try to mimic how the human brain is thought to recognize objects. As humans combine cognitive processes with detection techniques, such a system would combine traditional image-processing techniques with computer-based intelligence to determine the identity of various objects in a scene.
Object location and object recognition memory impairments, motivation deficits and depression in a model of Gulf War illness.

PubMed

Hattiangady, Bharathi; Mishra, Vikas; Kodali, Maheedhar; Shuai, Bing; Rao, Xiolan; Shetty, Ashok K

2014-01-01

Memory and mood deficits are the enduring brain-related symptoms in Gulf War illness (GWI). Both animal model and epidemiological investigations have indicated that these impairments in a majority of GW veterans are linked to exposures to chemicals such as pyridostigmine bromide (PB, an antinerve gas drug), permethrin (PM, an insecticide) and DEET (a mosquito repellant) encountered during the Persian Gulf War-1. Our previous study in a rat model has shown that combined exposures to low doses of GWI-related (GWIR) chemicals PB, PM, and DEET with or without 5-min of restraint stress (a mild stress paradigm) causes hippocampus-dependent spatial memory dysfunction in a water maze test (WMT) and increased depressive-like behavior in a forced swim test (FST). In this study, using a larger cohort of rats exposed to GWIR-chemicals and stress, we investigated whether the memory deficiency identified earlier in a WMT is reproducible with an alternative and stress free hippocampus-dependent memory test such as the object location test (OLT). We also ascertained the possible co-existence of hippocampus-independent memory dysfunction using a novel object recognition test (NORT), and alterations in mood function with additional tests for motivation and depression. Our results provide new evidence that exposure to low doses of GWIR-chemicals and mild stress for 4 weeks causes deficits in hippocampus-dependent object location memory and perirhinal cortex-dependent novel object recognition memory. An open field test performed prior to other behavioral analyses revealed that memory impairments were not associated with increased anxiety or deficits in general motor ability. However, behavioral tests for mood function such as a voluntary physical exercise paradigm and a novelty suppressed feeding test (NSFT) demonstrated decreased motivation levels and depression. 
Thus, exposure to GWIR-chemicals and stress causes both hippocampus-dependent and hippocampus-independent memory impairments as well as mood dysfunction in a rat model.
Real-time unconstrained object recognition: a processing pipeline based on the mammalian visual system.

PubMed

Aguilar, Mario; Peot, Mark A; Zhou, Jiangying; Simons, Stephen; Liao, Yuwei; Metwalli, Nader; Anderson, Mark B

2012-03-01

The mammalian visual system is still the gold standard for recognition accuracy, flexibility, efficiency, and speed. Ongoing advances in our understanding of function and mechanisms in the visual system can now be leveraged to pursue the design of computer vision architectures that will revolutionize the state of the art in computer vision.
Insular Cortex Is Involved in Consolidation of Object Recognition Memory

ERIC Educational Resources Information Center

Bermudez-Rattoni, Federico; Okuda, Shoki; Roozendaal, Benno; McGaugh, James L.

2005-01-01

Extensive evidence indicates that the insular cortex (IC), also termed gustatory cortex, is critically involved in conditioned taste aversion and taste recognition memory. Although most studies of the involvement of the IC in memory have investigated taste, there is some evidence that the IC is involved in memory that is not based on taste. In…
The limited use of the fluency heuristic: Converging evidence across different procedures.

PubMed

Pohl, Rüdiger F; Erdfelder, Edgar; Michalkiewicz, Martha; Castela, Marta; Hilbig, Benjamin E

2016-10-01

In paired comparisons based on which of two objects has the larger criterion value, decision makers could use the subjectively experienced difference in retrieval fluency of the objects as a cue. According to the fluency heuristic (FH) theory, decision makers use fluency, as indexed by recognition speed, as the only cue for pairs of recognized objects, and infer that the object retrieved more speedily has the larger criterion value (ignoring all other cues and information). Model-based analyses, however, have previously revealed that only a small portion of such inferences are indeed based on fluency alone. In the majority of cases, other information enters the decision process. However, due to the specific experimental procedures, the estimates of FH use are potentially biased: Some procedures may have led to an overestimated and others to an underestimated, or even to actually reduced, FH use. In the present article, we discuss and test the impacts of such procedural variations by reanalyzing 21 data sets. The results show noteworthy consistency across the procedural variations revealing low FH use. We discuss potential explanations and implications of this finding.
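The decision rule that the FH theory attributes to decision makers can be written down directly. A minimal sketch, assuming recognition latencies in milliseconds are available for each object; the function name and the example items and values are hypothetical. It models only the pure heuristic, which the abstract reports is used in a small portion of inferences.

```python
def fluency_heuristic(a, b, recognized, latency_ms):
    """Infer which of two objects has the larger criterion value from fluency alone.

    Applies only when both objects are recognized. Returns the object retrieved
    faster, or None when the heuristic makes no prediction.
    """
    if not (recognized.get(a) and recognized.get(b)):
        return None  # FH is defined only for pairs of recognized objects
    if latency_ms[a] == latency_ms[b]:
        return None  # no fluency difference to exploit
    return a if latency_ms[a] < latency_ms[b] else b


# Hypothetical recognition data for three items.
recognized = {"city_x": True, "city_y": True, "city_z": False}
latency_ms = {"city_x": 540, "city_y": 810, "city_z": 1200}
choice = fluency_heuristic("city_x", "city_y", recognized, latency_ms)
no_choice = fluency_heuristic("city_x", "city_z", recognized, latency_ms)
```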
3D interactive augmented reality-enhanced digital learning systems for mobile devices

NASA Astrophysics Data System (ADS)

Feng, Kai-Ten; Tseng, Po-Hsuan; Chiu, Pei-Shuan; Yang, Jia-Lin; Chiu, Chun-Jie

2013-03-01

With enhanced processing capability of mobile platforms, augmented reality (AR) has been considered a promising technology for achieving enhanced user experiences (UX). Augmented reality imposes virtual information, e.g., videos and images, onto a live-view digital display. UX on real-world environment via the display can be effectively enhanced with the adoption of interactive AR technology. Enhancement on UX can be beneficial for digital learning systems. There are existing research works based on AR targeting the design of e-learning systems. However, none of these works focuses on providing three-dimensional (3-D) object modeling for enhanced UX based on interactive AR techniques. In this paper, the 3-D interactive augmented reality-enhanced learning (IARL) systems will be proposed to provide enhanced UX for digital learning. The proposed IARL systems consist of two major components, including the markerless pattern recognition (MPR) for 3-D models and velocity-based object tracking (VOT) algorithms. Realistic implementation of proposed IARL system is conducted on Android-based mobile platforms. UX on digital learning can be greatly improved with the adoption of proposed IARL systems.
Fixation and saliency during search of natural scenes: the case of visual agnosia.

PubMed

Foulsham, Tom; Barton, Jason J S; Kingstone, Alan; Dewhurst, Richard; Underwood, Geoffrey

2009-07-01

Models of eye movement control in natural scenes often distinguish between stimulus-driven processes (which guide the eyes to visually salient regions) and those based on task and object knowledge (which depend on expectations or identification of objects and scene gist). In the present investigation, the eye movements of a patient with visual agnosia were recorded while she searched for objects within photographs of natural scenes and compared to those made by students and age-matched controls. Agnosia is assumed to disrupt the top-down knowledge available in this task, and so may increase the reliance on bottom-up cues. The patient's deficit in object recognition was seen in poor search performance and inefficient scanning. The low-level saliency of target objects had an effect on responses in visual agnosia, and the most salient region in the scene was more likely to be fixated by the patient than by controls. An analysis of model-predicted saliency at fixation locations indicated a closer match between fixations and low-level saliency in agnosia than in controls. These findings are discussed in relation to saliency-map models and the balance between high and low-level factors in eye guidance.
SU-D-201-05: On the Automatic Recognition of Patient Safety Hazards in a Radiotherapy Setup Using a Novel 3D Camera System and a Deep Learning Framework

DOE Office of Scientific and Technical Information (OSTI.GOV)

Santhanam, A; Min, Y; Beron, P

Purpose: Patient safety hazards such as a wrong patient/site getting treated can lead to catastrophic results. The purpose of this project is to automatically detect potential patient safety hazards during the radiotherapy setup and alert the therapist before the treatment is initiated. Methods: We employed a set of co-located and co-registered 3D cameras placed inside the treatment room. Each camera provided a point-cloud of fraxels (fragment pixels with 3D depth information). Each of the cameras was calibrated using a custom-built calibration target to provide 3D information with less than 2 mm error in the 500 mm neighborhood around the isocenter. To identify potential patient safety hazards, the treatment room components and the patient's body needed to be identified and tracked in real-time. For feature recognition purposes, we used a graph-cut based feature recognition with principal component analysis (PCA) based feature-to-object correlation to segment the objects in real-time. Changes in the object's position were tracked using the CamShift algorithm. The 3D object information was then stored for each classified object (e.g. gantry, couch). A deep learning framework was then used to analyze all the classified objects in both 2D and 3D and was then used to fine-tune a convolutional network for object recognition. The number of network layers was optimized to identify the tracked objects with >95% accuracy. Results: Our systematic analyses showed that the system was effectively able to recognize wrong patient setups and wrong patient accessories. The combined usage of 2D camera information (color + depth) enabled a topology-preserving approach to verify patient safety hazards in an automatic manner and even in scenarios where the depth information is partially available. Conclusion: By utilizing the 3D cameras inside the treatment room and a deep learning based image classification, potential patient safety hazards can be effectively avoided.
Dopamine D1 receptor stimulation modulates the formation and retrieval of novel object recognition memory: Role of the prelimbic cortex

PubMed Central

Pezze, Marie A.; Marshall, Hayley J.; Fone, Kevin C.F.; Cassaday, Helen J.

2015-01-01

Previous studies have shown that dopamine D1 receptor antagonists impair novel object recognition memory but the effects of dopamine D1 receptor stimulation remain to be determined. This study investigated the effects of the selective dopamine D1 receptor agonist SKF81297 on acquisition and retrieval in the novel object recognition task in male Wistar rats. SKF81297 (0.4 and 0.8 mg/kg s.c.) given 15 min before the sampling phase impaired novel object recognition evaluated 10 min or 24 h later. The same treatments also reduced novel object recognition memory tested 24 h after the sampling phase and when given 15 min before the choice session. These data indicate that D1 receptor stimulation modulates both the encoding and retrieval of object recognition memory. Microinfusion of SKF81297 (0.025 or 0.05 μg/side) into the prelimbic sub-region of the medial prefrontal cortex (mPFC) in this case 10 min before the sampling phase also impaired novel object recognition memory, suggesting that the mPFC is one important site mediating the effects of D1 receptor stimulation on visual recognition memory. PMID:26277743
Case-Based Learning in Athletic Training

ERIC Educational Resources Information Center

Berry, David C.

2013-01-01

The National Athletic Trainers' Association (NATA) Executive Committee for Education has emphasized the need for proper recognition and management of orthopaedic and general medical conditions through their support of numerous learning objectives and the clinical integrated proficiencies. These learning objectives and integrated clinical…
Optimization of Visual Information Presentation for Visual Prosthesis.

PubMed

Guo, Fei; Yang, Yuan; Gao, Yong

2018-01-01

Visual prosthesis applying electrical stimulation to restore visual function for the blind has promising prospects. However, due to the low resolution, limited visual field, and the low dynamic range of the visual perception, huge loss of information occurred when presenting daily scenes. The ability of object recognition in real-life scenarios is severely restricted for prosthetic users. To overcome the limitations, optimizing the visual information in the simulated prosthetic vision has been the focus of research. This paper proposes two image processing strategies based on a salient object detection technique. The two processing strategies enable the prosthetic implants to focus on the object of interest and suppress the background clutter. Psychophysical experiments show that techniques such as foreground zooming with background clutter removal and foreground edge detection with background reduction have positive impacts on the task of object recognition in simulated prosthetic vision. By using edge detection and zooming technique, the two processing strategies significantly improve the recognition accuracy of objects. We can conclude that the visual prosthesis using our proposed strategy can assist the blind to improve their ability to recognize objects. The results will provide effective solutions for the further development of visual prosthesis.
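One of the strategies named above, foreground zooming with background clutter removal, can be sketched on a toy grayscale image. This is an illustration under assumptions: the salient-object mask is taken as given (the detection step is not modelled), the image is a list of rows, and the function name is hypothetical. The sketch crops to the object's bounding box and blanks background pixels; resampling the crop up to the prosthetic display resolution is omitted.

```python
def zoom_foreground(image, mask):
    """Crop the image to the salient object's bounding box and blank the background.

    image: list of rows of pixel intensities.
    mask: same shape, truthy where the salient object was detected.
    Returns the cropped image, or an empty list when the mask is empty.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return []
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    return [
        [image[r][c] if mask[r][c] else 0 for c in range(left, right + 1)]
        for r in range(top, bottom + 1)
    ]


# Hypothetical 3x3 image with an L-shaped salient region.
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
mask = [[0, 0, 0], [0, 1, 1], [0, 0, 1]]
zoomed = zoom_foreground(image, mask)
```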
Optimization of Visual Information Presentation for Visual Prosthesis

PubMed Central

Gao, Yong

2018-01-01

Visual prosthesis applying electrical stimulation to restore visual function for the blind has promising prospects. However, due to the low resolution, limited visual field, and the low dynamic range of the visual perception, huge loss of information occurred when presenting daily scenes. The ability of object recognition in real-life scenarios is severely restricted for prosthetic users. To overcome the limitations, optimizing the visual information in the simulated prosthetic vision has been the focus of research. This paper proposes two image processing strategies based on a salient object detection technique. The two processing strategies enable the prosthetic implants to focus on the object of interest and suppress the background clutter. Psychophysical experiments show that techniques such as foreground zooming with background clutter removal and foreground edge detection with background reduction have positive impacts on the task of object recognition in simulated prosthetic vision. By using edge detection and zooming technique, the two processing strategies significantly improve the recognition accuracy of objects. We can conclude that the visual prosthesis using our proposed strategy can assist the blind to improve their ability to recognize objects. The results will provide effective solutions for the further development of visual prosthesis. PMID:29731769
Object recognition and pose estimation of planar objects from range data

NASA Technical Reports Server (NTRS)

Pendleton, Thomas W.; Chien, Chiun Hong; Littlefield, Mark L.; Magee, Michael

1994-01-01

The Extravehicular Activity Helper/Retriever (EVAHR) is a robotic device currently under development at the NASA Johnson Space Center that is designed to fetch objects or to assist in retrieving an astronaut who may have become inadvertently de-tethered. The EVAHR will be required to exhibit a high degree of intelligent autonomous operation and will base much of its reasoning upon information obtained from one or more three-dimensional sensors that it will carry and control. At the highest level of visual cognition and reasoning, the EVAHR will be required to detect objects, recognize them, and estimate their spatial orientation and location. The recognition phase and estimation of spatial pose will depend on the ability of the vision system to reliably extract geometric features of the objects such as whether the surface topologies observed are planar or curved and the spatial relationships between the component surfaces. In order to achieve these tasks, three-dimensional sensing of the operational environment and objects in the environment will therefore be essential. One of the sensors being considered to provide image data for object recognition and pose estimation is a phase-shift laser scanner. The characteristics of the data provided by this scanner have been studied and algorithms have been developed for segmenting range images into planar surfaces, extracting basic features such as surface area, and recognizing the object based on the characteristics of extracted features. Also, an approach has been developed for estimating the spatial orientation and location of the recognized object based on orientations of extracted planes and their intersection points. 
This paper presents some of the algorithms that have been developed for the purpose of recognizing and estimating the pose of objects as viewed by the laser scanner, and characterizes the desirability and utility of these algorithms within the context of the scanner itself, considering data quality and noise.
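The abstract mentions locating an object from the orientations of extracted planes and their intersection points. The following is a minimal sketch of that one geometric step only, assuming each segmented plane is already available as a normal vector and an offset; the function names and the plane representation are illustrative assumptions, not the authors' implementation.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def intersect_three_planes(planes, eps=1e-9):
    """Return the single point shared by three planes, or None if none exists.

    Each plane is (normal, offset), meaning normal . p = offset.
    The 3x3 linear system is solved with Cramer's rule.
    """
    normals = [list(n) for n, _ in planes]
    offsets = [d for _, d in planes]
    det = det3(normals)
    if abs(det) < eps:
        return None  # planes are parallel or meet in a line, not a point
    point = []
    for axis in range(3):
        # Replace one column of the normal matrix with the offsets.
        m = [row[:] for row in normals]
        for r in range(3):
            m[r][axis] = offsets[r]
        point.append(det3(m) / det)
    return tuple(point)


# Hypothetical planes x = 1, y = 2, z = 3 meeting at one corner.
corner = intersect_three_planes(
    [((1, 0, 0), 1.0), ((0, 1, 0), 2.0), ((0, 0, 1), 3.0)]
)
# Two parallel planes never define a unique corner.
no_corner = intersect_three_planes(
    [((1, 0, 0), 1.0), ((1, 0, 0), 2.0), ((0, 0, 1), 3.0)]
)
```

A corner found this way fixes the object's location, while the three plane normals constrain its orientation.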

Performance improvement of multi-class detection using greedy algorithm for Viola-Jones cascade selection

NASA Astrophysics Data System (ADS)

Tereshin, Alexander A.; Usilin, Sergey A.; Arlazarov, Vladimir V.

2018-04-01

This paper studies the problem of multi-class object detection in a video stream with Viola-Jones cascades. An adaptive algorithm for selecting the Viola-Jones cascade, based on a greedy choice strategy for solving the N-armed bandit problem, is proposed. The efficiency of the algorithm is shown on the problem of detecting and recognizing bank card logos in a video stream. The proposed algorithm can be used effectively for document localization and identification, recognition of road scene elements, localization and tracking of lengthy objects, and other problems of rigid object detection in heterogeneous data flows. The computational efficiency of the algorithm makes it usable both on personal computers and on mobile devices based on processors with low power consumption.
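To make the greedy bandit idea concrete, here is a minimal sketch in which each cascade is treated as one arm and the reward is 1 when the chosen cascade detects its object in the current frame. The class name, the reward definition, and the "try each arm once" initialisation are assumptions for illustration; the abstract does not specify these details.

```python
class GreedyCascadeSelector:
    """Pick which cascade to run on the next frame (one arm per cascade)."""

    def __init__(self, n_cascades):
        self.counts = [0] * n_cascades    # how often each cascade was run
        self.values = [0.0] * n_cascades  # running mean reward per cascade

    def select(self):
        # Run every cascade once, then always take the best running mean.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental mean: new_mean = old_mean + (reward - old_mean) / n.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


# Toy video stream in which only cascade 2 ever fires (invented data).
selector = GreedyCascadeSelector(3)
for _ in range(10):
    arm = selector.select()
    reward = 1.0 if arm == 2 else 0.0
    selector.update(arm, reward)
```

A purely greedy rule can lock onto a cascade that merely looked good early on; an epsilon-greedy variant that occasionally runs a random cascade is a common remedy when the object class in the stream can change.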
Segment-based acoustic models for continuous speech recognition

NASA Astrophysics Data System (ADS)

Ostendorf, Mari; Rohlicek, J. R.

1993-07-01

This research aims to develop new and more accurate stochastic models for speaker-independent continuous speech recognition, by extending previous work in segment-based modeling and by introducing a new hierarchical approach to representing intra-utterance statistical dependencies. These techniques, which are more costly than traditional approaches because of the large search space associated with higher order models, are made feasible through rescoring a set of HMM-generated N-best sentence hypotheses. We expect these different modeling techniques to result in improved recognition performance over that achieved by current systems, which handle only frame-based observations and assume that these observations are independent given an underlying state sequence. In the fourth quarter of the project, we have completed the following: (1) ported our recognition system to the Wall Street Journal task, a standard task in the ARPA community; (2) developed an initial dependency-tree model of intra-utterance observation correlation; and (3) implemented baseline language model estimation software. Our initial results on the Wall Street Journal task are quite good and represent significantly improved performance over most HMM systems reporting on the Nov. 1992 5k vocabulary test set.
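The N-best rescoring step described above can be sketched as follows: the HMM produces a short list of sentence hypotheses, and a more expensive model only has to score that list. The scores, the linear interpolation, and the `weight` parameter below are invented for illustration and are not taken from the report.

```python
def rescore_nbest(hypotheses, weight=0.5):
    """Reorder an N-best list by a weighted sum of two log scores.

    hypotheses: list of (text, hmm_log_score, segment_log_score).
    weight: share given to the segment-model score (hypothetical knob).
    """
    def combined(hyp):
        _, hmm_score, segment_score = hyp
        return (1.0 - weight) * hmm_score + weight * segment_score

    return sorted(hypotheses, key=combined, reverse=True)


# Invented scores: the HMM alone would prefer the second hypothesis.
nbest = [
    ("recognize speech", -10.0, -14.0),
    ("wreck a nice beach", -9.5, -20.0),
    ("recognise peach", -12.0, -13.0),
]
rescored = rescore_nbest(nbest, weight=0.5)
```

Because the segment model is applied to only N candidates, its higher cost per hypothesis stays affordable, which is the feasibility argument made in the abstract.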
Effect of (+)-methamphetamine on path integration learning, novel object recognition, and neurotoxicity in rats.

PubMed

Herring, Nicole R; Schaefer, Tori L; Gudelsky, Gary A; Vorhees, Charles V; Williams, Michael T

2008-09-01

Methamphetamine (MA) has been implicated in cognitive deficits in humans after chronic use. Animal models of neurotoxic MA exposure reveal persistent damage to monoaminergic systems but few associated cognitive effects. Since questions have been raised about the typical neurotoxic dosing regimen used in animals and whether it adequately models human cumulative drug exposure, these experiments examined two different dosing regimens. Rats were treated with one of the two regimens: one based on the typical neurotoxic regimen (4 x 10 mg/kg every 2 h) and one based on pharmacokinetic modeling (Cho AK, Melega WP, Kuczenski R, Segal DS Synapse 39:161-166, 2001) designed to better represent accumulating plasma concentrations of MA as seen in human users (24 x 1.67 mg/kg once every 15 min) matched for total daily dose. In two separate experiments, dosing regimens were compared for their effects on markers of neurotoxicity or on behavior. On markers of neurotoxicity, MA showed decreased dopamine (DA) and 5-HT, increased glial fibrillary acidic protein, and increased corticosterone levels regardless of dosing regimen 3 days post-treatment. Behaviorally, MA-treated groups, regardless of dosing regimen, showed hypoactivity, increased initial hyperactivity to a subsequent MA challenge, impaired novel object recognition, impaired learning in a multiple T water maze test of path integration, and no differences on spatial navigation or reference memory in the Morris water maze. After behavioral testing, reductions of DA and 5-HT remained. MA treatment induces an effect on path integration learning not previously reported. Dosing regimen had no differential effects on behavior or neurotoxicity.
Clozapine and glycinamide prevent MK-801-induced deficits in the novel object recognition (NOR) test in the domestic rabbit (Oryctolagus cuniculus).

PubMed

Hoffman, Kurt L; Basurto, Enrique

2014-09-01

Studies in humans indicate that acute administration of sub-anesthetic doses of ketamine, an NMDA receptor antagonist, provokes schizophrenic-like symptoms in healthy volunteers, and exacerbates existing symptoms in individuals with schizophrenia. These and other findings suggest that NMDA receptor hypofunction might participate in the pathophysiology of schizophrenia, and have prompted the development of rodent pharmacological models for this disorder based on acute or subchronic treatment with NMDA receptor antagonists, as well as the development of novel pharmacotherapies based on increasing extrasynaptic glycine concentrations. In the present study, we tested whether acute hyperlocomotory behavior and/or deficits in the novel object recognition (NOR) task, induced in male rabbits by the acute subcutaneous (s.c.) administration of MK-801 (0.025 and 0.037 mg/kg s.c., respectively), were prevented by prior administration of the atypical antipsychotic, clozapine (0.2 mg/kg, s.c.), or the glycine pro-drug glycinamide (56 mg/kg, s.c.). We found that clozapine fully prevented the MK-801-induced hyperlocomotion, and both clozapine and glycinamide prevented MK-801-induced deficits in the NOR task. The present results show that MK-801-induced hyperlocomotion and deficits in the NOR task in the domestic rabbit demonstrate predictive validity as an alternative animal model for symptoms of schizophrenia. Moreover, these results indicate that glycinamide should be investigated in pre-clinical models of neuropsychiatric disorders such as schizophrenia, obsessive compulsive disorder and anxiety disorders, where augmentation of extrasynaptic glycine concentrations may have therapeutic utility. Copyright © 2014 Elsevier B.V. All rights reserved.
Application of morphological associative memories and Fourier descriptors for classification of noisy subsurface signatures

NASA Astrophysics Data System (ADS)

Ortiz, Jorge L.; Parsiani, Hamed; Tolstoy, Leonid

2004-02-01

This paper presents a method for recognizing noisy subsurface images using Morphological Associative Memories (MAM). MAM are a type of associative memory built on a new kind of neural network based on the algebraic system known as a semi-ring. The operations performed in this algebraic system are highly nonlinear, providing additional strength when compared to other transformations, and MAM perform robustly with noisy inputs. Two representations of morphological associative memories are used, called the M and W matrices. The M associative memory provides robust association for input patterns corrupted by dilative random noise, while the W associative matrix performs robust recognition of patterns corrupted by erosive random noise. The robust performance of MAM is used in combination with Fourier descriptors for the recognition of underground objects in Ground Penetrating Radar (GPR) images. Multiple 2-D GPR images of a site were made available by the NASA-SSC center. The buried objects in these images appear in the form of hyperbolas, which result from radar backscatter from the artifacts or objects. The Fourier descriptors of the prototype hyperbola-like shapes and non-hyperbola shapes in the subsurface images are used to make these shapes scale-, shift-, and rotation-invariant. Typical hyperbola-like and non-hyperbola shapes are used to calculate the morphological associative memories. The trained MAMs are used to process other noisy images to detect the presence of these underground objects. The outputs from the MAM using the noisy patterns may be equal to the training prototypes, providing a positive identification of the artifacts. The results are images with recognized hyperbolas, which indicate the presence of buried artifacts. A model using MATLAB has been developed and results are presented.
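The W and M matrices can be illustrated with a small autoassociative sketch. It follows the standard textbook construction of morphological memories (entry-wise minimum or maximum of pattern differences, recalled with max-plus or min-plus products); the toy patterns and function names are assumptions, and the paper's own model was written in MATLAB, not Python.

```python
def build_memories(patterns):
    """Build autoassociative W (min) and M (max) memories from patterns."""
    n = len(patterns[0])
    W = [[min(p[i] - p[j] for p in patterns) for j in range(n)] for i in range(n)]
    M = [[max(p[i] - p[j] for p in patterns) for j in range(n)] for i in range(n)]
    return W, M


def recall_w(W, x):
    """Max-plus product: y_i = max_j (W[i][j] + x[j])."""
    return [max(w + xj for w, xj in zip(row, x)) for row in W]


def recall_m(M, x):
    """Min-plus product: y_i = min_j (M[i][j] + x[j])."""
    return [min(m + xj for m, xj in zip(row, x)) for row in M]


# Toy integer feature vectors standing in for Fourier descriptors.
patterns = [[0, 3, 5], [4, 1, 2], [6, 6, 0]]
W, M = build_memories(patterns)
recalled_w = [recall_w(W, p) for p in patterns]
recalled_m = [recall_m(M, p) for p in patterns]
```

Both memories return every stored pattern unchanged when it is presented without noise; the different tolerance of W and M to erosive and dilative noise is the property the abstract relies on and is not demonstrated here.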
Continuous recognition of spatial and nonspatial stimuli in hippocampal-lesioned rats.

PubMed

Jackson-Smith, P; Kesner, R P; Chiba, A A

1993-03-01

The present experiments compared the performance of hippocampal-lesioned rats to control rats on a spatial continuous recognition task and an analogous nonspatial task with similar processing demands. Daily sessions for Experiment 1 involved sequential presentation of individual arms on a 12-arm radial maze. Each arm contained a Froot Loop reinforcement the first time it was presented, and latency to traverse the arm was measured. A subset of the arms were repeated, but did not contain reinforcement. Repeated arms were presented with lags ranging from 0 to 6 (0 to 6 different arm presentations occurred between the first and the repeated presentation). Difference scores were computed by subtracting the latency on first presentations from the latency on repeated presentations, and these scores were high in all rats prior to surgery, with a decreasing function across lag. There were no differences in performance following cortical control or sham surgery. However, there was a total deficit in performance following large electrolytic lesions of the hippocampus. The second experiment employed the same continuous recognition memory procedure, but used three-dimensional visual objects (toys, junk items, etc., in various shapes, sizes, and textures) as stimuli on a flat runway. As in Experiment 1, the stimuli were presented successively and latency to run to and move the object was measured. Objects were repeated with lags ranging from 0 to 4. Performance on this task following surgery did not differ from performance prior to surgery for either the control group or the hippocampal lesion group. These results provide support for Kesner's attribute model of hippocampal function in that the hippocampus is assumed to mediate data-based memory for spatial locations, but not three-dimensional visual objects.
A Longitudinal Investigation of Visual Event-Related Potentials in the First Year of Life

ERIC Educational Resources Information Center

Webb, Sara J.; Long, Jeffrey D.; Nelson, Charles A.

2005-01-01

The goal of the current study was to assess general maturational changes in the ERP in the same sample of infants from 4 to 12 months of age. All participants were tested in two experimental manipulations at each age: a test of facial recognition and one of object recognition. Two sets of analyses were undertaken. First, growth curve modeling with…
Visual Persons Behavior Diary Generation Model based on Trajectories and Pose Estimation

NASA Astrophysics Data System (ADS)

Gang, Chen; Bin, Chen; Yuming, Liu; Hui, Li

2018-03-01

The behavior pattern of persons is an important output of surveillance analysis. This paper focuses on a generation model for a visual person behavior diary. The pipeline includes person detection, tracking, and person behavior classification. The deep convolutional neural model YOLO (You Only Look Once) V2 is adopted for the person detection module. Multi-person tracking is based on the detection framework, with the Hungarian assignment algorithm used for matching. The person appearance model integrates an HSV color model and a hash code model, and person motion is estimated by a Kalman filter. Detected objects are matched with existing tracklets by the Hungarian assignment method, using appearance distance and motion location distance. A long continuous trajectory for one person is obtained by a spatial-temporal continual linking algorithm, and face recognition information is used to identify the trajectory. The trajectories with identification information can be used to generate the visual diary of person behavior based on scene context information and person action estimation. The relevant modules are tested on public data sets and on our own captured video sets. The test results show that the method can be used to generate the visual person behavior pattern diary with certain accuracy.
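The matching of detections to tracklets can be sketched as a minimum-cost assignment over a combined appearance and motion distance. To stay dependency-free, the sketch below finds the optimum by brute force over permutations, which is a stand-in for the Hungarian algorithm named in the abstract (same optimum, but only practical for a handful of objects). The cost definition, the dictionary fields, and `motion_weight` are illustrative assumptions.

```python
import math
from itertools import permutations


def match_detections(tracks, detections, motion_weight=1.0):
    """Assign each track to a distinct detection at minimum total cost.

    tracks / detections: dicts with 'pos' (x, y) and 'hist' (appearance vector).
    Assumes len(tracks) <= len(detections).
    Returns a list of (track_index, detection_index) pairs.
    """
    def cost(track, det):
        appearance = sum(abs(a - b) for a, b in zip(track["hist"], det["hist"]))
        motion = math.dist(track["pos"], det["pos"])
        return appearance + motion_weight * motion

    best, best_cost = (), float("inf")
    for perm in permutations(range(len(detections)), len(tracks)):
        total = sum(cost(tracks[i], detections[j]) for i, j in enumerate(perm))
        if total < best_cost:
            best, best_cost = perm, total
    return list(enumerate(best))


# Invented example: 'pos' would be the Kalman-predicted track position.
tracks = [
    {"pos": (0.0, 0.0), "hist": [1.0, 0.0]},
    {"pos": (10.0, 10.0), "hist": [0.0, 1.0]},
]
detections = [
    {"pos": (11.0, 10.0), "hist": [0.0, 1.0]},
    {"pos": (1.0, 0.0), "hist": [1.0, 0.0]},
]
matches = match_detections(tracks, detections)
```

For realistic numbers of people per frame, a polynomial-time solver for the same assignment problem would replace the permutation loop.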
Acquiring Semantically Meaningful Models for Robotic Localization, Mapping and Target Recognition

DTIC Science & Technology

2014-12-21

...Representations • Point features tracking • Recovery of relative motion, visual odometry • Loop closure • Environment models, sparse clouds of points...that co-occur with the object of interest Chair-Background Table-Background Object Level Segmentation Jaccard Index Silber .[5] 15.12 RenFox[4
Bridging automatic speech recognition and psycholinguistics: Extending Shortlist to an end-to-end model of human speech recognition (L)

NASA Astrophysics Data System (ADS)

Scharenborg, Odette; ten Bosch, Louis; Boves, Lou; Norris, Dennis

2003-12-01

This letter evaluates potential benefits of combining human speech recognition (HSR) and automatic speech recognition by building a joint model of an automatic phone recognizer (APR) and a computational model of HSR, viz., Shortlist [Norris, Cognition 52, 189-234 (1994)]. Experiments based on "real-life" speech highlight critical limitations posed by some of the simplifying assumptions made in models of human speech recognition. These limitations could be overcome by avoiding hard phone decisions at the output side of the APR, and by using a match between the input and the internal lexicon that flexibly copes with deviations from canonical phonemic representations.
Detecting Inspection Objects of Power Line from Cable Inspection Robot LiDAR Data

PubMed Central

Qin, Xinyan; Wu, Gongping; Fan, Fei

2018-01-01

Power lines are extending to complex environments (e.g., lakes and forests), and the distribution of power lines in a tower is becoming complicated (e.g., multi-loop and multi-bundle). Additionally, power line inspection is becoming heavier and more difficult. Advanced LiDAR technology is increasingly being used to solve these difficulties. Based on precise cable inspection robot (CIR) LiDAR data and the distinctive position and orientation system (POS) data, we propose a novel methodology to detect inspection objects surrounding power lines. The proposed method mainly includes four steps: firstly, the original point cloud is divided into single-span data as a processing unit; secondly, the optimal elevation threshold is constructed to remove ground points without the existing filtering algorithm, improving data processing efficiency and extraction accuracy; thirdly, a single power line and its surrounding data can be respectively extracted by a structured partition based on a POS data (SPPD) algorithm from “layer” to “block” according to power line distribution; finally, a partition recognition method is proposed based on the distribution characteristics of inspection objects, highlighting the feature information and improving the recognition effect. The local neighborhood statistics and the 3D region growing method are used to recognize different inspection objects surrounding power lines in a partition. Three datasets were collected by two CIR LIDAR systems in our study. The experimental results demonstrate that an average 90.6% accuracy and average 98.2% precision at the point cloud level can be achieved. The successful extraction indicates that the proposed method is feasible and promising. Our study can be used to obtain precise dimensions of fittings for modeling, as well as automatic detection and location of security risks, so as to improve the intelligence level of power line inspection. PMID:29690560
Detecting Inspection Objects of Power Line from Cable Inspection Robot LiDAR Data.

PubMed

Qin, Xinyan; Wu, Gongping; Lei, Jin; Fan, Fei; Ye, Xuhui

2018-04-22

Power lines are extending to complex environments (e.g., lakes and forests), and the distribution of power lines in a tower is becoming complicated (e.g., multi-loop and multi-bundle). Additionally, power line inspection is becoming heavier and more difficult. Advanced LiDAR technology is increasingly being used to solve these difficulties. Based on precise cable inspection robot (CIR) LiDAR data and the distinctive position and orientation system (POS) data, we propose a novel methodology to detect inspection objects surrounding power lines. The proposed method mainly includes four steps: firstly, the original point cloud is divided into single-span data as a processing unit; secondly, the optimal elevation threshold is constructed to remove ground points without the existing filtering algorithm, improving data processing efficiency and extraction accuracy; thirdly, a single power line and its surrounding data can be respectively extracted by a structured partition based on a POS data (SPPD) algorithm from "layer" to "block" according to power line distribution; finally, a partition recognition method is proposed based on the distribution characteristics of inspection objects, highlighting the feature information and improving the recognition effect. The local neighborhood statistics and the 3D region growing method are used to recognize different inspection objects surrounding power lines in a partition. Three datasets were collected by two CIR LIDAR systems in our study. The experimental results demonstrate that an average 90.6% accuracy and average 98.2% precision at the point cloud level can be achieved. The successful extraction indicates that the proposed method is feasible and promising. Our study can be used to obtain precise dimensions of fittings for modeling, as well as automatic detection and location of security risks, so as to improve the intelligence level of power line inspection.
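The second step of the method, removing ground points with an elevation threshold, can be sketched as a simple filter on point height. How the authors construct the optimal threshold is not stated in the abstract, so the sketch takes it as a given parameter; the point layout and the sample values are assumptions.

```python
def remove_ground(points, elevation_threshold):
    """Keep only points strictly above the elevation threshold.

    points: iterable of (x, y, z) tuples for one span of the point cloud.
    elevation_threshold: height in the same units as z (assumed given).
    """
    return [p for p in points if p[2] > elevation_threshold]


# Invented single-span sample: two ground returns and two elevated returns.
span_points = [
    (0.0, 0.0, 0.2),
    (1.0, 0.5, 0.4),
    (1.0, 0.0, 18.5),
    (2.0, 0.1, 18.7),
]
above_ground = remove_ground(span_points, elevation_threshold=5.0)
```

A single comparison per point is what makes this cheaper than a general ground filtering algorithm, at the cost of depending on a well-chosen threshold for each span.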
Rats Fed a Diet Rich in Fats and Sugars Are Impaired in the Use of Spatial Geometry.

PubMed

Tran, Dominic M D; Westbrook, R Frederick

2015-12-01

A diet rich in fats and sugars is associated with cognitive deficits in people, and rodent models have shown that such a diet produces deficits on tasks assessing spatial learning and memory. Spatial navigation is guided by two distinct types of information: geometrical, such as distance and direction, and featural, such as luminance and pattern. To clarify the nature of diet-induced spatial impairments, we provided rats with standard chow supplemented with sugar water and a range of energy-rich foods eaten by people, and then we assessed their place- and object-recognition memory. Rats exposed to this diet performed comparably with control rats fed only chow on object recognition but worse on place recognition. This impairment on the place-recognition task was present after only a few days on the diet and persisted across tests. Critically, this spatial impairment was specific to the processing of distance and direction. © The Author(s) 2015.
Superordinate Level Processing Has Priority Over Basic-Level Processing in Scene Gist Recognition

PubMed Central

Sun, Qi; Zheng, Yang; Sun, Mingxia; Zheng, Yuanjie

2016-01-01

By combining a perceptual discrimination task and a visuospatial working memory task, the present study examined the effects of visuospatial working memory load on the hierarchical processing of scene gist. In the perceptual discrimination task, two scene images from the same (manmade–manmade pairing or natural–natural pairing) or different superordinate level categories (manmade–natural pairing) were presented simultaneously, and participants were asked to judge whether these two images belonged to the same basic-level category (e.g., street–street pairing) or not (e.g., street–highway pairing). In the concurrent working memory task, spatial load (position-based load in Experiment 1) and object load (figure-based load in Experiment 2) were manipulated. The results were as follows: (a) spatial load and object load have stronger effects on discrimination of same basic-level scene pairing than same superordinate level scene pairing; (b) spatial load has a larger impact on the discrimination of scene pairings at early stages than at later stages; in contrast, object information has a larger influence at later stages than at early stages. It follows that superordinate level processing has priority over basic-level processing in scene gist recognition, and that spatial information contributes to the earlier stages and object information to the later stages of scene gist recognition. PMID:28382195
Timing, timing, timing: Fast decoding of object information from intracranial field potentials in human visual cortex

PubMed Central

Liu, Hesheng; Agam, Yigal; Madsen, Joseph R.; Kreiman, Gabriel

2010-01-01

Summary The difficulty of visual recognition stems from the need to achieve high selectivity while maintaining robustness to object transformations within hundreds of milliseconds. Theories of visual recognition differ in whether the neuronal circuits invoke recurrent feedback connections or not. The timing of neurophysiological responses in visual cortex plays a key role in distinguishing between bottom-up and top-down theories. Here we quantified at millisecond resolution the amount of visual information conveyed by intracranial field potentials from 912 electrodes in 11 human subjects. We could decode object category information from human visual cortex in single trials as early as 100 ms post-stimulus. Decoding performance was robust to depth rotation and scale changes. The results suggest that physiological activity in the temporal lobe can account for key properties of visual recognition. The fast decoding in single trials is compatible with feed-forward theories and provides strong constraints for computational models of human vision. PMID:19409272
Kamikihi-to (KKT) rescues axonal and synaptic degeneration associated with memory impairment in a mouse model of Alzheimer's disease, 5XFAD.

PubMed

Tohda, Chihiro; Nakada, Rie; Urano, Takuya; Okonogi, Akira; Kuboyama, Tomoharu

2011-12-01

Alzheimer's disease (AD) is a chronic progressive neurodegenerative disorder. Current agents for AD are employed for symptomatic therapy and insufficient to cure. We consider that this is quite necessary for AD treatment and have investigated axon/synapse formation-promoting activity. The aim of this study is to investigate the effects of Kamikihi-to [KKT; traditional Japanese (Kampo) medicine] on memory deficits in an AD model, 5XFAD. KKT (200 mg/kg, p.o.) was administered for 15 days to 5XFAD mice. Object recognition memory was tested in vehicle-treated wild-type and 5XFAD mice and KKT-treated 5XFAD mice. KKT-treated 5XFAD mice showed significant improvement of object recognition memory. KKT treatment significantly reduced the number of amyloid plaques in the frontal cortex and hippocampus. Only inside of amyloid plaques were abnormal structures such as bulb-like axons and swollen presynaptic boutons observed. These degenerated axons and presynaptic terminals were significantly reduced by KKT treatment in the frontal cortex. In primary cortical neurons, KKT treatment significantly increased axon length when applied after Aβ(25-35)-induced axonal atrophy had progressed. In conclusion, KKT improved object recognition memory deficit in an AD model 5XFAD mice. Restoration of degenerated axons and synapses may be associated with the memory recovery by KKT.
Learning and recognition of on-premise signs from weakly labeled street view images.

PubMed

Tsai, Tsung-Hung; Cheng, Wen-Huang; You, Chuang-Wen; Hu, Min-Chun; Tsui, Arvin Wen; Chi, Heng-Yu

2014-03-01

Camera-enabled mobile devices are commonly used as interaction platforms for linking the user's virtual and physical worlds in numerous research and commercial applications, such as serving an augmented reality interface for mobile information retrieval. The various application scenarios give rise to a key technique of daily life visual object recognition. On-premise signs (OPSs), a popular form of commercial advertising, are widely used in daily life. The OPSs often exhibit great visual diversity (e.g., appearing in arbitrary size), accompanied with complex environmental conditions (e.g., foreground and background clutter). Observing that such real-world characteristics are lacking in most of the existing image data sets, in this paper, we first proposed an OPS data set, namely OPS-62, in which a total of 4649 OPS images of 62 different businesses are collected from Google's Street View. Further, for addressing the problem of real-world OPS learning and recognition, we developed a probabilistic framework based on the distributional clustering, in which we proposed to exploit the distributional information of each visual feature (the distribution of its associated OPS labels) as a reliable selection criterion for building discriminative OPS models. Experiments on the OPS-62 data set demonstrated that our approach outperforms the state-of-the-art probabilistic latent semantic analysis models, with more accurate recognition and fewer false alarms, and a significant 151.28% relative improvement in the average recognition rate. Meanwhile, our approach is simple, linear, and can be executed in a parallel fashion, making it practical and scalable for large-scale multimedia applications.
Effect of (+)-Methamphetamine on Path Integration Learning, Novel Object Recognition, and Neurotoxicity in Rats

PubMed Central

Herring, Nicole R.; Schaefer, Tori L.; Gudelsky, Gary A.; Vorhees, Charles V.; Williams, Michael T.

2008-01-01

Rationale: Methamphetamine (MA) has been implicated in cognitive deficits in humans after chronic use. Animal models of neurotoxic MA exposure reveal persistent damage to monoaminergic systems, but few associated cognitive effects. Objectives: Since questions have been raised about the typical neurotoxic dosing regimen used in animals and whether it adequately models human cumulative drug exposure, these experiments examined two different dosing regimens. Methods: Rats were treated with one of two regimens, one the typical neurotoxic regimen (4 × 10 mg/kg every 2 h) and one based on pharmacokinetic modeling (Cho et al. 2001) designed to better represent accumulating plasma concentrations of MA as seen in human users (24 × 1.67 mg/kg once every 15 min), matched for total daily dose. In two separate experiments, dosing regimens were compared for their effects on markers of neurotoxicity or on behavior. Results: On markers of neurotoxicity, MA showed decreased DA and 5-HT, and increased glial fibrillary acidic protein and increased corticosterone levels regardless of dosing regimen 3 days post-treatment. Behaviorally, MA-treated groups, regardless of dosing regimen, showed hypoactivity, increased initial hyperactivity to a subsequent MA challenge, impaired novel object recognition, impaired learning in a multiple-T water maze test of path integration, and no differences on spatial navigation or reference memory in the Morris water maze. After behavioral testing, reductions of DA and 5-HT remained. Conclusions: MA treatment induces an effect on path integration learning not previously reported. Dosing regimen had no differential effects on behavior or neurotoxicity. PMID:18509623
A novel rotational invariants target recognition method for rotating motion blurred images

NASA Astrophysics Data System (ADS)

Lan, Jinhui; Gong, Meiling; Dong, Mingwei; Zeng, Yiliang; Zhang, Yuzhen

2017-11-01

Images from the sensor are blurred by the rotational motion of the carrier, which greatly reduces the target recognition rate. Although the traditional approach of restoring the image first and then identifying the target can improve the recognition rate, it takes a long time. To solve this problem, a model that extracts rotation-blur invariants and recognizes the target directly was constructed. The model includes three metric layers. The metric algorithms in the three layers, namely a gray value statistical algorithm, an improved round projection transformation algorithm, and rotation-convolution moment invariants, range from low to high in object description capability, and the metric layer with the lowest description ability serves as the input, which gradually eliminates non-target pixels from the degraded image. Experimental results show that the proposed model can improve the correct target recognition rate for blurred images and achieves an optimal allocation between computational complexity and function of region.
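As background for the round projection layer, the sketch below shows a plain ring (circular) projection: pixel values are summed over concentric rings around the image centre, so the resulting profile does not change when the image is rotated about that centre. This is the generic textbook transform, not the authors' improved version, and the function names and toy image are assumptions.

```python
import math


def ring_projection(image):
    """Sum pixel values over concentric rings around the centre pixel.

    image: square list of lists with odd side length.
    Returns ring sums ordered by increasing integer radius.
    """
    centre = len(image) // 2
    sums = {}
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            squared = (r - centre) ** 2 + (c - centre) ** 2
            radius = round(math.sqrt(squared))
            sums[radius] = sums.get(radius, 0) + value
    return [sums[k] for k in sorted(sums)]


def rotate90(image):
    """Rotate a square image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


# Toy 5x5 image with pixel values 1..25.
image = [[5 * r + c + 1 for c in range(5)] for r in range(5)]
profile = ring_projection(image)
rotated_profile = ring_projection(rotate90(image))
```

For a 90-degree rotation the two profiles match exactly; for arbitrary angles on a pixel grid the match is only approximate because of resampling.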
Optical character recognition of handwritten Arabic using hidden Markov models

NASA Astrophysics Data System (ADS)

Aulama, Mohannad M.; Natsheh, Asem M.; Abandah, Gheith A.; Olama, Mohammed M.

2011-04-01

The problem of optical character recognition (OCR) of handwritten Arabic has not received a satisfactory solution yet. In this paper, an Arabic OCR algorithm is developed based on Hidden Markov Models (HMMs) combined with the Viterbi algorithm, which results in an improved and more robust recognition of characters at the sub-word level. Integrating the HMMs represents another step of the overall OCR trends being currently researched in the literature. The proposed approach exploits the structure of characters in the Arabic language in addition to their extracted features to achieve improved recognition rates. Useful statistical information of the Arabic language is initially extracted and then used to estimate the probabilistic parameters of the mathematical HMM. A new custom implementation of the HMM is developed in this study, where the transition matrix is built based on the collected large corpus, and the emission matrix is built based on the results obtained via the extracted character features. The recognition process is triggered using the Viterbi algorithm which employs the most probable sequence of sub-words. The model was implemented to recognize the sub-word unit of Arabic text raising the recognition rate from being linked to the worst recognition rate for any character to the overall structure of the Arabic language. Numerical results show that there is a potentially large recognition improvement by using the proposed algorithms.
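The Viterbi decoding step mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two states, the observation codes `x` and `y`, and all probabilities below are hypothetical toy values, whereas the abstract describes a transition matrix estimated from a large corpus and an emission matrix derived from extracted character features. The sketch assumes every probability is nonzero so that logarithms are defined.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path and its log-probability."""
    # best[t][s] = (log-probability of the best path ending in s at time t,
    #               predecessor state on that path)
    first = observations[0]
    best = [{s: (math.log(start_p[s]) + math.log(emit_p[s][first]), None)
             for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that maximises the path score into s.
            score, prev = max(
                (best[-1][p][0] + math.log(trans_p[p][s]), p) for p in states
            )
            column[s] = (score + math.log(emit_p[s][obs]), prev)
        best.append(column)
    # Backtrack from the best final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]

# Hypothetical toy model: two hidden units "A" and "B", two feature codes.
states = ["A", "B"]
start_p = {"A": 0.6, "B": 0.4}
trans_p = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit_p = {"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}}

path, log_prob = viterbi(["x", "y", "y"], states, start_p, trans_p, emit_p)
print(path, math.exp(log_prob))
```

Working in log space avoids numerical underflow on long observation sequences, which is why the sketch sums logarithms instead of multiplying probabilities.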

Optical character recognition of handwritten Arabic using hidden Markov models

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aulama, Mohannad M.; Natsheh, Asem M.; Abandah, Gheith A.

2011-01-01

The problem of optical character recognition (OCR) of handwritten Arabic has not received a satisfactory solution yet. In this paper, an Arabic OCR algorithm is developed based on Hidden Markov Models (HMMs) combined with the Viterbi algorithm, which results in an improved and more robust recognition of characters at the sub-word level. Integrating the HMMs represents another step of the overall OCR trends being currently researched in the literature. The proposed approach exploits the structure of characters in the Arabic language in addition to their extracted features to achieve improved recognition rates. Useful statistical information of the Arabic language is initially extracted and then used to estimate the probabilistic parameters of the mathematical HMM. A new custom implementation of the HMM is developed in this study, where the transition matrix is built based on the collected large corpus, and the emission matrix is built based on the results obtained via the extracted character features. The recognition process is triggered using the Viterbi algorithm which employs the most probable sequence of sub-words. The model was implemented to recognize the sub-word unit of Arabic text raising the recognition rate from being linked to the worst recognition rate for any character to the overall structure of the Arabic language. Numerical results show that there is a potentially large recognition improvement by using the proposed algorithms.
EPO improved neurologic outcome in rat pups late after traumatic brain injury.

PubMed

Schober, Michelle E; Requena, Daniela F; Rodesch, Christopher K

2018-05-01

In adult rats, erythropoietin improved outcomes early and late after traumatic brain injury, associated with increased levels of Brain Derived Neurotrophic Factor. Using our model of pediatric traumatic brain injury, controlled cortical impact in 17-day old rats, we previously showed that erythropoietin increased hippocampal neuronal fraction in the first two days after injury. Erythropoietin also decreased activation of caspase3, an apoptotic enzyme modulated by Brain Derived Neurotrophic Factor, and improved Novel Object Recognition testing 14 days after injury. Data on long-term effects of erythropoietin on Brain Derived Neurotrophic Factor expression, histology and cognitive function after developmental traumatic brain injury are lacking. We hypothesized that erythropoietin would increase Brain Derived Neurotrophic Factor and improve long-term object recognition in rat pups after controlled cortical impact, associated with increased neuronal fraction in the hippocampus. Rats pups received erythropoietin or vehicle at 1, 24, and 48 h and 7 days after injury or sham surgery followed by histology at 35 days, Novel Object Recognition testing at adulthood, and Brain Derived Neurotrophic Factor measurements early and late after injury. Erythropoietin improved Novel Object Recognition performance and preserved hippocampal volume, but not neuronal fraction, late after injury. Improved object recognition in erythropoietin treated rats was associated with preserved hippocampal volume late after traumatic brain injury. Erythropoietin is approved to treat various pediatric conditions. Coupled with exciting experimental and clinical studies suggesting it is beneficial after neonatal hypoxic ischemic brain injury, our preliminary findings support further study of erythropoietin use after developmental traumatic brain injury. Copyright © 2018 The Japanese Society of Child Neurology. Published by Elsevier B.V. All rights reserved.
Recognition of abstract objects via neural oscillators: interaction among topological organization, associative memory and gamma band synchronization.

PubMed

Ursino, Mauro; Magosso, Elisa; Cuppini, Cristiano

2009-02-01

Synchronization of neural activity in the gamma band is assumed to play a significant role not only in perceptual processing, but also in higher cognitive functions. Here, we propose a neural network of Wilson-Cowan oscillators to simulate recognition of abstract objects, each represented as a collection of four features. Features are ordered in topological maps of oscillators connected via excitatory lateral synapses, to implement a similarity principle. Experience on previous objects is stored in long-range synapses connecting the different topological maps, and trained via timing dependent Hebbian learning (previous knowledge principle). Finally, a downstream decision network detects the presence of a reliable object representation, when all features are oscillating in synchrony. Simulations performed giving various simultaneous objects to the network (from 1 to 4), with some missing and/or modified properties suggest that the network can reconstruct objects, and segment them from the other simultaneously present objects, even in case of deteriorated information, noise, and moderate correlation among the inputs (one common feature). The balance between sensitivity and specificity depends on the strength of the Hebbian learning. Achieving a correct reconstruction in all cases, however, requires ad hoc selection of the oscillation frequency. The model represents an attempt to investigate the interactions among topological maps, autoassociative memory, and gamma-band synchronization, for recognition of abstract objects.
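As a rough illustration of the building block named in this abstract, the sketch below integrates a single Wilson-Cowan excitatory/inhibitory pair with the forward Euler method. It is a simplified textbook form, not the authors' network: time constants are set to one, there is no refractory term, no lateral or long-range synapses, and every weight, gain, and drive value is a hypothetical choice. Whether the pair actually oscillates depends on those parameters.

```python
import math

def sigmoid(x, gain=4.0, threshold=1.0):
    """Sigmoidal activation mapping net input to a rate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-gain * (x - threshold)))

def simulate_wilson_cowan(steps=2000, dt=0.01, drive=1.5,
                          w_ee=12.0, w_ei=10.0, w_ie=10.0, w_ii=2.0):
    """Euler-integrate one excitatory (e) / inhibitory (i) population pair.

    Simplified equations (unit time constants, assumed for illustration):
        de/dt = -e + S(w_ee * e - w_ei * i + drive)
        di/dt = -i + S(w_ie * e - w_ii * i)
    """
    e, i = 0.1, 0.1  # arbitrary initial activities
    trace = []
    for _ in range(steps):
        de = -e + sigmoid(w_ee * e - w_ei * i + drive)
        di = -i + sigmoid(w_ie * e - w_ii * i)
        e += dt * de
        i += dt * di
        trace.append((e, i))
    return trace

trace = simulate_wilson_cowan()
print(trace[-1])  # final (excitatory, inhibitory) activity
```

In the model described above, many such units would be coupled within and across feature maps; this sketch only shows the dynamics of one isolated unit.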
A Computational Model of Semantic Memory Impairment: Modality- Specificity and Emergent Category-Specificity

DTIC Science & Technology

1991-09-01

just one modality (e.g. visual or auditory agnosia) or impaired manipulation of objects with specific uses, despite intact recognition of them (apraxia...Neurosurgery and Psychiatry, 51, 1201-1207. Farah, M. J. (1991) Patterns of co-occurrence among the associative agnosias: Implications for visual object
Target Recognition Using Neural Networks for Model Deformation Measurements

NASA Technical Reports Server (NTRS)

Ross, Richard W.; Hibler, David L.

1999-01-01

Optical measurements provide a non-invasive method for measuring deformation of wind tunnel models. Model deformation systems use targets mounted or painted on the surface of the model to identify known positions, and photogrammetric methods are used to calculate 3-D positions of the targets on the model from digital 2-D images. Under ideal conditions, the reflective targets are placed against a dark background and provide high-contrast images, aiding in target recognition. However, glints of light reflecting from the model surface, or reduced contrast caused by light source or model smoothness constraints, can compromise accurate target determination using current algorithmic methods. This paper describes a technique using a neural network and image processing technologies which increases the reliability of target recognition systems. Unlike algorithmic methods, the neural network can be trained to identify the characteristic patterns that distinguish targets from other objects of similar size and appearance and can adapt to changes in lighting and environmental conditions.
Object recognition of ladar with support vector machine

NASA Astrophysics Data System (ADS)

Sun, Jian-Feng; Li, Qi; Wang, Qi

2005-01-01

Intensity, range and Doppler images can be obtained by using laser radar. Laser radar can capture much more object information than other sensors, such as passive infrared imaging and synthetic aperture radar (SAR), so it is well suited as a sensor for object recognition. The traditional method of laser radar object recognition is to extract target features, which can be influenced by noise. In this paper, a laser radar recognition method based on the Support Vector Machine (SVM) is introduced. SVM has become a new hotspot of recognition research after neural networks. It performs well on handwritten digit and face recognition. Two series of SVM experiments, one on preprocessed and one on non-preprocessed samples, are performed on real laser radar images, and the experimental results are compared.
Individual differences in forced-choice recognition memory: partitioning contributions of recollection and familiarity.

PubMed

Migo, Ellen M; Quamme, Joel R; Holmes, Selina; Bendell, Andrew; Norman, Kenneth A; Mayes, Andrew R; Montaldi, Daniela

2014-01-01

In forced-choice recognition memory, two different testing formats are possible under conditions of high target-foil similarity: Each target can be presented alongside foils similar to itself (forced-choice corresponding; FCC), or alongside foils similar to other targets (forced-choice noncorresponding; FCNC). Recent behavioural and neuropsychological studies suggest that FCC performance can be supported by familiarity whereas FCNC performance is supported primarily by recollection. In this paper, we corroborate this finding from an individual differences perspective. A group of older adults were given a test of FCC and FCNC recognition for object pictures, as well as standardized tests of recall, recognition, and IQ. Recall measures were found to predict FCNC, but not FCC performance, consistent with a critical role for recollection in FCNC only. After the common influence of recall was removed, standardized tests of recognition predicted FCC, but not FCNC performance. This is consistent with a contribution of only familiarity in FCC. Simulations show that a two-process model, where familiarity and recollection make separate contributions to recognition, is 10 times more likely to give these results than a single-process model. This evidence highlights the importance of recognition memory test design when examining the involvement of recollection and familiarity.
Changing predictions, stable recognition: Children's representations of downward incline motion.

PubMed

Hast, Michael; Howe, Christine

2017-11-01

Various studies to-date have demonstrated children hold ill-conceived expressed beliefs about the physical world such as that one ball will fall faster than another because it is heavier. At the same time, they also demonstrate accurate recognition of dynamic events. How these representations relate is still unresolved. This study examined 5- to 11-year-olds' (N = 130) predictions and recognition of motion down inclines. Predictions were typically in error, matching previous work, but children largely recognized correct events as correct and rejected incorrect ones. The results also demonstrate while predictions change with increasing age, recognition shows signs of stability. The findings provide further support for a hybrid model of object representations and argue in favour of stable core cognition existing alongside developmental changes. Statement of contribution What is already known on this subject? Children's predictions of physical events show limitations in accuracy Their recognition of such events suggests children may use different knowledge sources in their reasoning What the present study adds? Predictions fluctuate more strongly than recognition, suggesting stable core cognition But recognition also shows some fluctuation, arguing for a hybrid model of knowledge representation. © 2017 The British Psychological Society.
The 4-D approach to visual control of autonomous systems

NASA Technical Reports Server (NTRS)

Dickmanns, Ernst D.

1994-01-01

Development of a 4-D approach to dynamic machine vision is described. Core elements of this method are spatio-temporal models oriented towards objects and laws of perspective projection in a forward mode. Integration of multi-sensory measurement data was achieved through spatio-temporal models as invariants for object recognition. Situation assessment and long term predictions were allowed through maintenance of a symbolic 4-D image of processes involving objects. Behavioral capabilities were easily realized by state feedback and feed-forward control.
Continuous statistical modelling for rapid detection of adulteration of extra virgin olive oil using mid infrared and Raman spectroscopic data.

PubMed

Georgouli, Konstantia; Martinez Del Rincon, Jesus; Koidis, Anastasios

2017-02-15

The main objective of this work was to develop a novel dimensionality reduction technique as a part of an integrated pattern recognition solution capable of identifying adulterants such as hazelnut oil in extra virgin olive oil at low percentages based on spectroscopic chemical fingerprints. A novel Continuous Locality Preserving Projections (CLPP) technique is proposed which allows the modelling of the continuous nature of the produced in-house admixtures as data series instead of discrete points. The maintenance of the continuous structure of the data manifold enables the better visualisation of this examined classification problem and facilitates the more accurate utilisation of the manifold for detecting the adulterants. The performance of the proposed technique is validated with two different spectroscopic techniques (Raman and Fourier transform infrared, FT-IR). In all cases studied, CLPP accompanied by k-Nearest Neighbors (kNN) algorithm was found to outperform any other state-of-the-art pattern recognition techniques. Copyright © 2016 Elsevier Ltd. All rights reserved.
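The classification step that follows the dimensionality reduction in this abstract can be sketched with a plain k-nearest-neighbours vote. The sketch does not implement CLPP itself: it assumes the spectra have already been projected to low-dimensional coordinates, and the 2-D points and the labels `pure` and `adulterated` are hypothetical illustration data.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    # Sort training samples by Euclidean distance to the query.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical embedded coordinates (e.g. after a manifold projection).
train_points = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0),
                (1.0, 1.1), (0.9, 1.0), (1.1, 0.9)]
train_labels = ["pure", "pure", "pure",
                "adulterated", "adulterated", "adulterated"]

print(knn_predict(train_points, train_labels, (0.15, 0.1)))
print(knn_predict(train_points, train_labels, (0.95, 1.0)))
```

An odd `k` avoids tied votes in a two-class problem; with more classes or an even `k`, a tie-breaking rule would be needed.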
An adaptive Hidden Markov Model for activity recognition based on a wearable multi-sensor device

USDA-ARS's Scientific Manuscript database

Human activity recognition is important in the study of personal health, wellness and lifestyle. In order to acquire human activity information from the personal space, many wearable multi-sensor devices have been developed. In this paper, a novel technique for automatic activity recognition based o...
Testing Theories of Recognition Memory by Predicting Performance Across Paradigms

ERIC Educational Resources Information Center

Smith, David G.; Duncan, Matthew J. J.

2004-01-01

Signal-detection theory (SDT) accounts of recognition judgments depend on the assumption that recognition decisions result from a single familiarity-based process. However, fits of a hybrid SDT model, called dual-process theory (DPT), have provided evidence for the existence of a second, recollection-based process. In 2 experiments, the authors…
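For readers unfamiliar with the single-process account referred to here, the sketch below computes the two standard equal-variance SDT statistics from a hit rate and a false-alarm rate. It illustrates textbook SDT only, not the hybrid dual-process model the authors fit, and the function names are chosen for this example. Rates of exactly 0 or 1 are not handled, because the inverse normal CDF is undefined there.

```python
from statistics import NormalDist

_z = NormalDist().inv_cdf  # inverse of the standard normal CDF

def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity: distance between old and new familiarity distributions."""
    return _z(hit_rate) - _z(false_alarm_rate)

def criterion(hit_rate, false_alarm_rate):
    """Response bias c: 0 is unbiased, positive values are conservative."""
    return -0.5 * (_z(hit_rate) + _z(false_alarm_rate))

print(d_prime(0.8, 0.2), criterion(0.8, 0.2))
```

A dual-process account would add a separate recollection parameter on top of this familiarity-based sensitivity.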
Recognition of human activity characteristics based on state transitions modeling technique

NASA Astrophysics Data System (ADS)

Elangovan, Vinayak; Shirkhodaie, Amir

2012-06-01

Human Activity Discovery & Recognition (HADR) is a complex, diverse and challenging task, yet an active area of ongoing research in the Department of Defense. By detecting, tracking, and characterizing cohesive human interactional activity patterns, potential threats can be identified, which can significantly improve situation awareness, particularly in Persistent Surveillance Systems (PSS). Understanding the nature of such dynamic activities inevitably involves interpretation of a collection of spatiotemporally correlated activities with respect to a known context. In this paper, we present a State Transition model for recognizing the characteristics of human activities with a link to a prior context-based ontology. Modeling the state transitions between successive evidential events determines the activities' temperament. The proposed state transition model poses six categories of state transitions: Human state transitions of Object handling, Visibility, Entity-entity relation, Human Postures, Human Kinematics and Distance to Target. The proposed state transition model generates semantic annotations describing the human interactional activities via a technique called Casual Event State Inference (CESI). The proposed approach uses a low-cost Kinect depth camera for indoor and a normal optical camera for outdoor monitoring activities. Experimental results are presented here to demonstrate the effectiveness and efficiency of the proposed technique.
Computer-assisted visual interactive recognition and its prospects of implementation over the Internet

NASA Astrophysics Data System (ADS)

Zou, Jie; Gattani, Abhishek

2005-01-01

When completely automated systems don't yield acceptable accuracy, many practical pattern recognition systems involve the human either at the beginning (pre-processing) or towards the end (handling rejects). We believe that it may be more useful to involve the human throughout the recognition process rather than just at the beginning or end. We describe a methodology of interactive visual recognition for human-centered low-throughput applications, Computer Assisted Visual InterActive Recognition (CAVIAR), and discuss the prospects of implementing CAVIAR over the Internet. The novelty of CAVIAR is image-based interaction through a domain-specific parameterized geometrical model, which reduces the semantic gap between humans and computers. The user may interact with the computer anytime that she considers its response unsatisfactory. The interaction improves the accuracy of the classification features by improving the fit of the computer-proposed model. The computer makes subsequent use of the parameters of the improved model to refine not only its own statistical model-fitting process, but also its internal classifier. The CAVIAR methodology was applied to implement a flower recognition system. The principal conclusions from the evaluation of the system include: 1) the average recognition time of the CAVIAR system is significantly shorter than that of the unaided human; 2) its accuracy is significantly higher than that of the unaided machine; 3) it can be initialized with as few as one training sample per class and still achieve high accuracy; and 4) it demonstrates a self-learning ability. We have also implemented a Mobile CAVIAR system, where a pocket PC, as a client, connects to a server through wireless communication. The motivation behind a mobile platform for CAVIAR is to apply the methodology in a human-centered pervasive environment, where the user can seamlessly interact with the system for classifying field-data. 
Deploying CAVIAR to a networked mobile platform poses the challenge of classifying field images and programming under constraints of display size, network bandwidth, processor speed, and memory size. Editing of the computer-proposed model is performed on the handheld while statistical model fitting and classification take place on the server. The possibility that the user can easily take several photos of the object poses an interesting information fusion problem. The advantage of the Internet is that the patterns identified by different users can be pooled together to benefit all peer users. When users identify patterns with CAVIAR in a networked setting, they also collect training samples and provide opportunities for machine learning from their intervention. CAVIAR implemented over the Internet provides a perfect test bed for, and extends, the concept of Open Mind Initiative proposed by David Stork. Our experimental evaluation focuses on human time, machine and human accuracy, and machine learning. We devoted much effort to evaluating the use of our image-based user interface and on developing principles for the evaluation of interactive pattern recognition system. The Internet architecture and Mobile CAVIAR methodology have many applications. We are exploring in the directions of teledermatology, face recognition, and education.
A standardization model based on image recognition for performance evaluation of an oral scanner.

PubMed

Seo, Sang-Wan; Lee, Wan-Sun; Byun, Jae-Young; Lee, Kyu-Bok

2017-12-01

Accurate information is essential in dentistry. The image information of missing teeth is used in optically based medical equipment in prosthodontic treatment. To evaluate oral scanners, the standardized model was examined from cases of image recognition errors of linear discriminant analysis (LDA), and a model that combines the variables with reference to ISO 12836:2015 was designed. The basic model was fabricated by applying 4 factors to the tooth profile (chamfer, groove, curve, and square) and the bottom surface. Photo-type and video-type scanners were used to analyze 3D images after image capture. The scans were performed several times according to the prescribed sequence to distinguish the model from the one that did not form, and the results confirmed it to be the best. In the case of the initial basic model, a 3D shape could not be obtained by scanning even if several shots were taken. Subsequently, the recognition rate of the image was improved with every variable factor, and the difference depends on the tooth profile and the pattern of the floor surface. Based on the recognition error of the LDA, the recognition rate decreases when the model has a similar pattern. Therefore, to obtain the accurate 3D data, the difference of each class needs to be provided when developing a standardized model.
Online graphic symbol recognition using neural network and ARG matching

NASA Astrophysics Data System (ADS)

Yang, Bing; Li, Changhua; Xie, Weixing

2001-09-01

This paper proposes a novel method for on-line recognition of line-based graphic symbols. The input strokes are usually warped into a cursive form due to varied drawing styles, and classifying them is very difficult. To deal with this, an ART-2 neural network is used to classify the input strokes. It has the advantages of a high recognition rate, short recognition time, and forming classes in a self-organized manner. The symbol recognition is achieved by an Attribute Relational Graph (ARG) matching algorithm. The ARG is very efficient for representing complex objects, but its computation cost is very high. To overcome this, we suggest a fast graph matching algorithm using symbol structure information. The experimental results show that the proposed method is effective for recognition of symbols with hierarchical structure.
Identification of Alfalfa Leaf Diseases Using Image Recognition Technology

PubMed Central

Qin, Feng; Liu, Dongxia; Sun, Bingda; Ruan, Liu; Ma, Zhanhong; Wang, Haiguang

2016-01-01

Common leaf spot (caused by Pseudopeziza medicaginis), rust (caused by Uromyces striatus), Leptosphaerulina leaf spot (caused by Leptosphaerulina briosiana) and Cercospora leaf spot (caused by Cercospora medicaginis) are the four common types of alfalfa leaf diseases. Timely and accurate diagnoses of these diseases are critical for disease management, alfalfa quality control and the healthy development of the alfalfa industry. In this study, the identification and diagnosis of the four types of alfalfa leaf diseases were investigated using pattern recognition algorithms based on image-processing technology. A sub-image with one or multiple typical lesions was obtained by artificial cutting from each acquired digital disease image. Then the sub-images were segmented using twelve lesion segmentation methods integrated with clustering algorithms (including K_means clustering, fuzzy C-means clustering and K_median clustering) and supervised classification algorithms (including logistic regression analysis, Naive Bayes algorithm, classification and regression tree, and linear discriminant analysis). After a comprehensive comparison, the segmentation method integrating the K_median clustering algorithm and linear discriminant analysis was chosen to obtain lesion images. After the lesion segmentation using this method, a total of 129 texture, color and shape features were extracted from the lesion images. Based on the features selected using three methods (ReliefF, 1R and correlation-based feature selection), disease recognition models were built using three supervised learning methods, including the random forest, support vector machine (SVM) and K-nearest neighbor methods. A comparison of the recognition results of the models was conducted. The results showed that when the ReliefF method was used for feature selection, the SVM model built with the most important 45 features (selected from a total of 129 features) was the optimal model. 
For this SVM model, the recognition accuracies of the training set and the testing set were 97.64% and 94.74%, respectively. Semi-supervised models for disease recognition were built based on the 45 effective features that were used for building the optimal SVM model. For the optimal semi-supervised models built with three ratios of labeled to unlabeled samples in the training set, the recognition accuracies of the training set and the testing set were both approximately 80%. The results indicated that image recognition of the four alfalfa leaf diseases can be implemented with high accuracy. This study provides a feasible solution for lesion image segmentation and image recognition of alfalfa leaf disease. PMID:27977767
Identification of Alfalfa Leaf Diseases Using Image Recognition Technology.

PubMed

Qin, Feng; Liu, Dongxia; Sun, Bingda; Ruan, Liu; Ma, Zhanhong; Wang, Haiguang

2016-01-01

Common leaf spot (caused by Pseudopeziza medicaginis), rust (caused by Uromyces striatus), Leptosphaerulina leaf spot (caused by Leptosphaerulina briosiana) and Cercospora leaf spot (caused by Cercospora medicaginis) are the four common types of alfalfa leaf diseases. Timely and accurate diagnoses of these diseases are critical for disease management, alfalfa quality control and the healthy development of the alfalfa industry. In this study, the identification and diagnosis of the four types of alfalfa leaf diseases were investigated using pattern recognition algorithms based on image-processing technology. A sub-image with one or multiple typical lesions was obtained by artificial cutting from each acquired digital disease image. Then the sub-images were segmented using twelve lesion segmentation methods integrated with clustering algorithms (including K_means clustering, fuzzy C-means clustering and K_median clustering) and supervised classification algorithms (including logistic regression analysis, Naive Bayes algorithm, classification and regression tree, and linear discriminant analysis). After a comprehensive comparison, the segmentation method integrating the K_median clustering algorithm and linear discriminant analysis was chosen to obtain lesion images. After the lesion segmentation using this method, a total of 129 texture, color and shape features were extracted from the lesion images. Based on the features selected using three methods (ReliefF, 1R and correlation-based feature selection), disease recognition models were built using three supervised learning methods, including the random forest, support vector machine (SVM) and K-nearest neighbor methods. A comparison of the recognition results of the models was conducted. The results showed that when the ReliefF method was used for feature selection, the SVM model built with the most important 45 features (selected from a total of 129 features) was the optimal model. 
For this SVM model, the recognition accuracies of the training set and the testing set were 97.64% and 94.74%, respectively. Semi-supervised models for disease recognition were built based on the 45 effective features that were used for building the optimal SVM model. For the optimal semi-supervised models built with three ratios of labeled to unlabeled samples in the training set, the recognition accuracies of the training set and the testing set were both approximately 80%. The results indicated that image recognition of the four alfalfa leaf diseases can be implemented with high accuracy. This study provides a feasible solution for lesion image segmentation and image recognition of alfalfa leaf disease.
Fundamentals of thinking, patterns

NASA Astrophysics Data System (ADS)

Gafurov, O. M.; Gafurov, D. O.; Syryamkin, V. I.

2018-05-01

The authors analyze the fundamentals of thinking and propose to consider a model of the brain based on the presence of magnetic properties of gliacytes (Schwann cells) because of their oxygen saturation (oxygen has paramagnetic properties). The authors also propose to take into account the motion of electrical discharges through synapses causing electric and magnetic fields as well as additional effects such as paramagnetic resonance, which allows combining multisensory object-related information located in different parts of the brain. Therefore, the events of the surrounding world are reflected and remembered in the cortex columns, thus, creating isolated subnets with altered magnetic properties (patterns) and subsequently participate in recognition of objects, form a memory, and so on. The possibilities for the pattern-based thinking are based on the practical experience of applying methods and technologies of artificial neural networks in the form of a neuroemulator and neuromorphic computing devices.
A Scientific Workflow Platform for Generic and Scalable Object Recognition on Medical Images

NASA Astrophysics Data System (ADS)

Möller, Manuel; Tuot, Christopher; Sintek, Michael

In the research project THESEUS MEDICO we aim at a system combining medical image information with semantic background knowledge from ontologies to give clinicians fully cross-modal access to biomedical image repositories. Therefore joint efforts have to be made in more than one dimension: Object detection processes have to be specified in which an abstraction is performed starting from low-level image features across landmark detection utilizing abstract domain knowledge up to high-level object recognition. We propose a system based on a client-server extension of the scientific workflow platform Kepler that assists the collaboration of medical experts and computer scientists during development and parameter learning.
