Can Humans Fly? Action Understanding with Multiple Classes of Actors
2015-06-08
recognition using structure from motion point clouds. In European Conference on Computer Vision, 2008. [5] R. Caruana. Multitask learning. Machine Learning...tonomous driving ? the kitti vision benchmark suite. In IEEE Conference on Computer Vision and Pattern Recognition, 2012. [12] L. Gorelick, M. Blank
Automatic micropropagation of plants--the vision-system: graph rewriting as pattern recognition
NASA Astrophysics Data System (ADS)
Schwanke, Joerg; Megnet, Roland; Jensch, Peter F.
1993-03-01
The automation of plant-micropropagation is necessary to produce high amounts of biomass. Plants have to be dissected on particular cutting-points. A vision-system is needed for the recognition of the cutting-points on the plants. With this background, this contribution is directed to the underlying formalism to determine cutting-points on abstract-plant models. We show the usefulness of pattern recognition by graph-rewriting along with some examples in this context.
NASA Technical Reports Server (NTRS)
Rahman, Zia-ur; Jobson, Daniel J.; Woodell, Glenn A.
2010-01-01
New foundational ideas are used to define a novel approach to generic visual pattern recognition. These ideas proceed from the starting point of the intrinsic equivalence of noise reduction and pattern recognition when noise reduction is taken to its theoretical limit of explicit matched filtering. This led us to think of the logical extension of sparse coding using basis function transforms for both de-noising and pattern recognition to the full pattern specificity of a lexicon of matched filter pattern templates. A key hypothesis is that such a lexicon can be constructed and is, in fact, a generic visual alphabet of spatial vision. Hence it provides a tractable solution for the design of a generic pattern recognition engine. Here we present the key scientific ideas, the basic design principles which emerge from these ideas, and a preliminary design of the Spatial Vision Tree (SVT). The latter is based upon a cryptographic approach whereby we measure a large aggregate estimate of the frequency of occurrence (FOO) for each pattern. These distributions are employed together with Hamming distance criteria to design a two-tier tree. Then using information theory, these same FOO distributions are used to define a precise method for pattern representation. Finally the experimental performance of the preliminary SVT on computer generated test images and complex natural images is assessed.
Liquid lens: advances in adaptive optics
NASA Astrophysics Data System (ADS)
Casey, Shawn Patrick
2010-12-01
'Liquid lens' technologies promise significant advancements in machine vision and optical communications systems. Adaptations for machine vision, human vision correction, and optical communications are used to exemplify the versatile nature of this technology. Utilization of liquid lens elements allows the cost effective implementation of optical velocity measurement. The project consists of a custom image processor, camera, and interface. The images are passed into customized pattern recognition and optical character recognition algorithms. A single camera would be used for both speed detection and object recognition.
Spatial-frequency cutoff requirements for pattern recognition in central and peripheral vision
Kwon, MiYoung; Legge, Gordon E.
2011-01-01
It is well known that object recognition requires spatial frequencies exceeding some critical cutoff value. People with central scotomas who rely on peripheral vision have substantial difficulty with reading and face recognition. Deficiencies of pattern recognition in peripheral vision might result in higher cutoff requirements and may contribute to the functional problems of people with central-field loss. Here we asked about differences in spatial-cutoff requirements in central and peripheral vision for letter and face recognition. The stimuli were the 26 letters of the English alphabet and 26 celebrity faces. Each image was blurred using a low-pass filter in the spatial frequency domain. Critical cutoffs (defined as the minimum low-pass filter cutoff yielding 80% accuracy) were obtained by measuring recognition accuracy as a function of cutoff (in cycles per object). Our data showed that critical cutoffs increased from central to peripheral vision by 20% for letter recognition and by 50% for face recognition. We asked whether these differences could be accounted for by central/peripheral differences in the contrast sensitivity function (CSF). We addressed this question by implementing an ideal-observer model which incorporates empirical CSF measurements and tested the model on letter and face recognition. The success of the model indicates that central/peripheral differences in the cutoff requirements for letter and face recognition can be accounted for by the information content of the stimulus limited by the shape of the human CSF, combined with a source of internal noise and followed by an optimal decision rule. PMID:21854800
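The definition of the critical cutoff above (the minimum low-pass cutoff that yields 80% accuracy) can be illustrated with a small sketch. It assumes linear interpolation between measured points, which is an assumption made here for simplicity; the abstract does not say how the authors estimated the 80% point. The function name `critical_cutoff` and the sample data are hypothetical.

```python
def critical_cutoff(cutoffs, accuracies, criterion=0.80):
    """Return the smallest cutoff at which accuracy reaches the criterion.

    Linear interpolation between neighboring measurements is an assumption
    of this sketch, not a method stated in the abstract.
    """
    points = sorted(zip(cutoffs, accuracies))
    # The criterion is already met at the lowest tested cutoff.
    if points[0][1] >= criterion:
        return points[0][0]
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 < criterion <= a1:
            # Interpolate linearly between the two bracketing measurements.
            return c0 + (criterion - a0) * (c1 - c0) / (a1 - a0)
    return None  # the criterion was never reached


# Hypothetical measurements: cutoff in cycles per object, proportion correct.
cutoffs = [1.0, 2.0, 4.0, 8.0]
accuracies = [0.30, 0.60, 0.90, 0.98]
print(critical_cutoff(cutoffs, accuracies))  # roughly 3.33 cycles per object
```

With such a function, the reported central-to-peripheral increase would correspond to comparing the value returned for each viewing condition.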
ICPR-2016 - International Conference on Pattern Recognition
Learning for Scene Understanding" Speakers ICPR2016 PAPER AWARDS Best Piero Zamperoni Student Paper -Paced Dictionary Learning for Cross-Domain Retrieval and Recognition Xu, Dan; Song, Jingkuan; Alameda discussions on recent advances in the fields of Pattern Recognition, Machine Learning and Computer Vision, and
Image-based automatic recognition of larvae
NASA Astrophysics Data System (ADS)
Sang, Ru; Yu, Guiying; Fan, Weijun; Guo, Tiantai
2010-08-01
Research on quarantine pest recognition has so far focused mainly on imagoes (adult insects). However, pests in their larval stage are latent, and larvae spread easily with the circulation of agricultural and forest products. This paper presents larvae as new research objects, recognized by means of machine vision, image processing and pattern recognition. Applying color image segmentation to images of larvae preserves more visual information and improves the recognition rate. Because of its affine invariance, perspective invariance and brightness invariance, the scale-invariant feature transform (SIFT) is adopted for feature extraction. A neural network algorithm is used for pattern recognition, and the automatic identification of larvae images is achieved with satisfactory results.
Wijeyekoon, Skanda; Kharicha, Kalpa; Iliffe, Steve
2015-09-01
To evaluate heuristics (rules of thumb) for recognition of undetected vision loss in older patients in primary care. Vision loss is associated with ageing, and its prevalence is increasing. Visual impairment has a broad impact on health, functioning and well-being. Unrecognised vision loss remains common, and screening interventions have yet to reduce its prevalence. An alternative approach is to enhance practitioners' skills in recognising undetected vision loss, by having a more detailed picture of those who are likely not to act on vision changes, report symptoms or have eye tests. This paper describes a qualitative technology development study to evaluate heuristics for recognition of undetected vision loss in older patients in primary care. Using a previous modelling study, two heuristics in the form of mnemonics were developed to aid pattern recognition and allow general practitioners to identify potential cases of unreported vision loss. These heuristics were then analysed with experts. It was concluded that their implementation in modern general practice was unsuitable and that an alternative solution should be sought.
Pattern recognition neural-net by spatial mapping of biology visual field
NASA Astrophysics Data System (ADS)
Lin, Xin; Mori, Masahiko
2000-05-01
The method of spatial mapping found in the biological visual field is applied to artificial neural networks for pattern recognition. Using a coordinate transform called the complex-logarithm mapping, together with the Fourier transform, the input images are transformed into scale-, rotation-, and shift-invariant patterns and then fed into a multilayer neural network for learning and recognition. The results of computer simulation and an optical experimental system are described.
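A minimal sketch of why the complex-logarithm mapping helps: scaling and rotating a point about the origin becomes a pure shift in the mapped coordinates (log radius, angle), and a shift can then be removed by taking a Fourier magnitude. The helper names below are hypothetical and the sketch works on a single point rather than a full image; it is not the authors' implementation.

```python
import cmath
import math


def complex_log(x, y):
    """Map an image-plane point (x, y) to (log radius, angle)."""
    w = cmath.log(complex(x, y))
    return w.real, w.imag


def scale_rotate(x, y, scale, angle):
    """Scale and rotate a point about the origin."""
    z = complex(x, y) * scale * cmath.exp(1j * angle)
    return z.real, z.imag


p = (3.0, 4.0)
q = scale_rotate(*p, scale=2.0, angle=math.pi / 6)

u0, v0 = complex_log(*p)
u1, v1 = complex_log(*q)

# Scaling by s shifts the first coordinate by log(s);
# rotating by phi shifts the second coordinate by phi.
print(u1 - u0, math.log(2.0))
print(v1 - v0, math.pi / 6)
```

Note that the angle coordinate wraps around at plus or minus pi, so on a real image the rotation appears as a circular shift along the angle axis.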
Fusion of Multiple Sensing Modalities for Machine Vision
1994-05-31
Modeling of Non-Homogeneous 3-D Objects for Thermal and Visual Image Synthesis," Pattern Recognition, in press. U [11] Nair, Dinesh , and J. K. Aggarwal...20th AIPR Workshop: Computer Vision--Meeting the Challenges, McLean, Virginia, October 1991. Nair, Dinesh , and J. K. Aggarwal, "An Object Recognition...Computer Engineering August 1992 Sunil Gupta Ph.D. Student Mohan Kumar M.S. Student Sandeep Kumar M.S. Student Xavier Lebegue Ph.D., Computer
Agnosic vision is like peripheral vision, which is limited by crowding.
Strappini, Francesca; Pelli, Denis G; Di Pace, Enrico; Martelli, Marialuisa
2017-04-01
Visual agnosia is a neuropsychological impairment of visual object recognition despite near-normal acuity and visual fields. A century of research has provided only a rudimentary account of the functional damage underlying this deficit. We find that the object-recognition ability of agnosic patients viewing an object directly is like that of normally-sighted observers viewing it indirectly, with peripheral vision. Thus, agnosic vision is like peripheral vision. We obtained 14 visual-object-recognition tests that are commonly used for diagnosis of visual agnosia. Our "standard" normal observer took these tests at various eccentricities in his periphery. Analyzing the published data of 32 apperceptive agnosia patients and a group of 14 posterior cortical atrophy (PCA) patients on these tests, we find that each patient's pattern of object recognition deficits is well characterized by one number, the equivalent eccentricity at which our standard observer's peripheral vision is like the central vision of the agnosic patient. In other words, each agnosic patient's equivalent eccentricity is conserved across tests. Across patients, equivalent eccentricity ranges from 4 to 40 deg, which rates severity of the visual deficit. In normal peripheral vision, the required size to perceive a simple image (e.g., an isolated letter) is limited by acuity, and that for a complex image (e.g., a face or a word) is limited by crowding. In crowding, adjacent simple objects appear unrecognizably jumbled unless their spacing exceeds the crowding distance, which grows linearly with eccentricity. Besides conservation of equivalent eccentricity across object-recognition tests, we also find conservation, from eccentricity to agnosia, of the relative susceptibility of recognition of ten visual tests. These findings show that agnosic vision is like eccentric vision. Whence crowding? 
Peripheral vision, strabismic amblyopia, and possibly apperceptive agnosia are all limited by crowding, making it urgent to know what drives crowding. Acuity does not (Song et al., 2014), but neural density might: neurons per deg² in the crowding-relevant cortical area.
An overview of computer vision
NASA Technical Reports Server (NTRS)
Gevarter, W. B.
1982-01-01
An overview of computer vision is provided. Image understanding and scene analysis are emphasized, and pertinent aspects of pattern recognition are treated. The basic approach to computer vision systems, the techniques utilized, applications, the current existing systems and state-of-the-art issues and research requirements, who is doing it and who is funding it, and future trends and expectations are reviewed.
NASA Technical Reports Server (NTRS)
Glass, Charles E.; Boyd, Richard V.; Sternberg, Ben K.
1991-01-01
The overall aim is to provide base technology for an automated vision system for on-board interpretation of geophysical data. During the first year's work, it was demonstrated that geophysical data can be treated as patterns and interpreted using single neural networks. Current research is developing an integrated vision system comprising neural networks, algorithmic preprocessing, and expert knowledge. This system is to be tested incrementally using synthetic geophysical patterns, laboratory generated geophysical patterns, and field geophysical patterns.
Nguyen, Dat Tien; Kim, Ki Wan; Hong, Hyung Gil; Koo, Ja Hyung; Kim, Min Cheol; Park, Kang Ryoung
2017-01-01
Extracting powerful image features plays an important role in computer vision systems. Many methods have previously been proposed to extract image features for various computer vision applications, such as the scale-invariant feature transform (SIFT), speeded-up robust features (SURF), local binary patterns (LBP), histogram of oriented gradients (HOG), and weighted HOG. Recently, the convolutional neural network (CNN) method for image feature extraction and classification in computer vision has been used in various applications. In this research, we propose a new gender recognition method for recognizing males and females in observation scenes of surveillance systems based on feature extraction from visible-light and thermal camera videos through CNN. Experimental results confirm the superiority of our proposed method over state-of-the-art recognition methods for the gender recognition problem using human body images. PMID:28335510
Higher-order neural network software for distortion invariant object recognition
NASA Technical Reports Server (NTRS)
Reid, Max B.; Spirkovska, Lilly
1991-01-01
The state-of-the-art in pattern recognition for such applications as automatic target recognition and industrial robotic vision relies on digital image processing. We present a higher-order neural network model and software which performs the complete feature extraction-pattern classification paradigm required for automatic pattern recognition. Using a third-order neural network, we demonstrate complete, 100 percent accurate invariance to distortions of scale, position, and in-plane rotation. In a higher-order neural network, feature extraction is built into the network, and does not have to be learned. Only the relatively simple classification step must be learned. This is key to achieving very rapid training. The training set is much smaller than with standard neural network software because the higher-order network only has to be shown one view of each object to be learned, not every possible view. The software and graphical user interface run on any Sun workstation. Results of the use of the neural software in autonomous robotic vision systems are presented. Such a system could have extensive application in robotic manufacturing.
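The abstract does not say how invariance is built into the third-order network. One geometric fact that such designs can exploit (an assumption of this sketch, not a statement of the authors' method) is that the interior angles of a triangle formed by three image points do not change under scaling, translation, or in-plane rotation. The function names below are hypothetical.

```python
import math


def interior_angles(p, q, r):
    """Return the three interior angles (radians) of triangle pqr, sorted."""
    def angle_at(a, b, c):
        # Angle at vertex a between the rays a->b and a->c.
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - a[0], c[1] - a[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        return abs(math.atan2(cross, dot))
    return sorted([angle_at(p, q, r), angle_at(q, r, p), angle_at(r, p, q)])


def similarity(point, scale, theta, shift):
    """Apply scaling, in-plane rotation, and translation to a point."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (scale * (c * x - s * y) + shift[0],
            scale * (s * x + c * y) + shift[1])


triangle = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]
moved = [similarity(p, 2.5, 0.7, (10.0, -3.0)) for p in triangle]

before = interior_angles(*triangle)
after = interior_angles(*moved)
# The angle triple is unchanged by the transformation (up to rounding).
print(all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(before, after)))
```

A feature keyed on such angle triples needs no training to become invariant, which is consistent with the abstract's point that only the classification step must be learned.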
On the role of spatial phase and phase correlation in vision, illusion, and cognition
Gladilin, Evgeny; Eils, Roland
2015-01-01
Numerous findings indicate that spatial phase bears important cognitive information. Distortion of phase affects the topology of edge structures and makes images unrecognizable. In turn, appropriately phase-structured patterns give rise to various illusions of virtual image content and apparent motion. Despite a large body of phenomenological evidence, not much is known yet about the role of phase information in neural mechanisms of visual perception and cognition. Here, we are concerned with analysis of the role of spatial phase in computational and biological vision, the emergence of visual illusions, and pattern recognition. We hypothesize that the fundamental importance of phase information for invariant retrieval of structural image features and motion detection promoted the development of phase-based mechanisms of neural image processing in the course of the evolution of biological vision. Using an extension of the Fourier phase correlation technique, we show that core functions of the visual system such as motion detection and pattern recognition can be facilitated by the same basic mechanism. Our analysis suggests that the emergence of visual illusions can be attributed to the presence of coherently phase-shifted repetitive patterns as well as the effects of acuity compensation by saccadic eye movements. We speculate that biological vision relies on perceptual mechanisms effectively similar to phase correlation, and predict neural features of visual pattern (dis)similarity that can be used for experimental validation of our hypothesis of “cognition by phase correlation.” PMID:25954190
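For readers unfamiliar with the basic Fourier phase correlation technique that the abstract extends, here is a minimal sketch of the standard version for estimating a circular shift between two small images. It uses a naive pure-Python transform for self-containment (a real application would use an FFT library); all function names are hypothetical and this is not the authors' extension.

```python
import cmath


def dft2(img, inverse=False):
    """Naive 2-D discrete Fourier transform of a square list of lists."""
    n = len(img)
    sign = 1 if inverse else -1
    out = [[0j] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0j
            for y in range(n):
                for x in range(n):
                    angle = sign * 2 * cmath.pi * (u * y + v * x) / n
                    total += img[y][x] * cmath.exp(1j * angle)
            out[u][v] = total / (n * n) if inverse else total
    return out


def phase_correlation(a, b):
    """Estimate the circular shift (dy, dx) that maps image a onto image b."""
    n = len(a)
    fa, fb = dft2(a), dft2(b)
    cross = [[0j] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            prod = fb[u][v] * fa[u][v].conjugate()
            mag = abs(prod)
            # Keep only the phase difference; drop the magnitude.
            cross[u][v] = prod / mag if mag > 1e-12 else 0j
    corr = dft2(cross, inverse=True)
    # The correlation surface peaks at the displacement.
    _, dy, dx = max((corr[y][x].real, y, x) for y in range(n) for x in range(n))
    return dy, dx


def circular_shift(img, dy, dx):
    """Shift an image down by dy and right by dx with wrap-around."""
    n = len(img)
    return [[img[(y - dy) % n][(x - dx) % n] for x in range(n)] for y in range(n)]


# Small hypothetical test image: three bright pixels on a dark background.
image = [[0.0] * 8 for _ in range(8)]
image[1][2], image[2][2], image[2][3] = 1.0, 2.0, 4.0

print(phase_correlation(image, circular_shift(image, 3, 5)))  # (3, 5)
```

Because only phase differences are kept, the peak location depends on the displacement and not on image contrast, which is the property that makes the same mechanism usable for motion detection.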
Image-plane processing of visual information
NASA Technical Reports Server (NTRS)
Huck, F. O.; Fales, C. L.; Park, S. K.; Samms, R. W.
1984-01-01
Shannon's theory of information is used to optimize the optical design of sensor-array imaging systems which use neighborhood image-plane signal processing for enhancing edges and compressing dynamic range during image formation. The resultant edge-enhancement, or band-pass-filter, response is found to be very similar to that of human vision. Comparisons of traits in human vision with results from information theory suggest that: (1) Image-plane processing, like preprocessing in human vision, can improve visual information acquisition for pattern recognition when resolving power, sensitivity, and dynamic range are constrained. Improvements include reduced sensitivity to changes in light levels, reduced signal dynamic range, reduced data transmission and processing, and reduced aliasing and photosensor noise degradation. (2) Information content can be an appropriate figure of merit for optimizing the optical design of imaging systems when visual information is acquired for pattern recognition. The design trade-offs involve spatial response, sensitivity, and sampling interval.
Pattern recognition for passive polarimetric data using nonparametric classifiers
NASA Astrophysics Data System (ADS)
Thilak, Vimal; Saini, Jatinder; Voelz, David G.; Creusere, Charles D.
2005-08-01
Passive polarization based imaging is a useful tool in computer vision and pattern recognition. A passive polarization imaging system forms a polarimetric image from the reflection of ambient light that contains useful information for computer vision tasks such as object detection (classification) and recognition. Applications of polarization based pattern recognition include material classification and automatic shape recognition. In this paper, we present two target detection algorithms for images captured by a passive polarimetric imaging system. The proposed detection algorithms are based on Bayesian decision theory. In these approaches, an object can belong to one of any given number of classes, and classification involves making decisions that minimize the average probability of making incorrect decisions. This minimum is achieved by assigning an object to the class that maximizes the a posteriori probability. Computing a posteriori probabilities requires estimates of class conditional probability density functions (likelihoods) and prior probabilities. A probabilistic neural network (PNN), which is a nonparametric method that can compute Bayes optimal boundaries, and a k-nearest neighbor (KNN) classifier are used for density estimation and classification. The proposed algorithms are applied to polarimetric image data gathered in the laboratory with a liquid crystal-based system. The experimental results validate the effectiveness of the above algorithms for target detection from polarimetric data.
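The decision rule described above (assign the object to the class with the largest a posteriori probability) can be sketched with a k-nearest-neighbor estimate, in which the posterior of a class is approximated by its share of the k nearest training samples. The two-dimensional features, class labels, and function names are hypothetical; the sketch does not reproduce the authors' PNN or their polarimetric features.

```python
import math
from collections import Counter


def knn_posteriors(train, query, k):
    """Estimate P(class | query) as class fractions among the k nearest samples."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return {label: count / k for label, count in votes.items()}


def classify(train, query, k=3):
    """Maximum a posteriori decision: pick the class with the largest estimate."""
    posteriors = knn_posteriors(train, query, k)
    return max(posteriors, key=posteriors.get)


# Hypothetical two-dimensional feature vectors with class labels.
train = [
    ((0.10, 0.20), "background"),
    ((0.20, 0.10), "background"),
    ((0.15, 0.25), "background"),
    ((0.80, 0.90), "target"),
    ((0.90, 0.80), "target"),
    ((0.85, 0.75), "target"),
]

print(classify(train, (0.70, 0.80)))  # target
print(classify(train, (0.20, 0.20)))  # background
```

Choosing the class with the largest estimated posterior is what minimizes the average probability of an incorrect decision, provided the estimates are accurate.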
Invariant visual object recognition and shape processing in rats
Zoccolan, Davide
2015-01-01
Invariant visual object recognition is the ability to recognize visual objects despite the vastly different images that each object can project onto the retina during natural vision, depending on its position and size within the visual field, its orientation relative to the viewer, etc. Achieving invariant recognition represents such a formidable computational challenge that it is often assumed to be a unique hallmark of primate vision. Historically, this has limited the invasive investigation of its neuronal underpinnings to monkey studies, in spite of the narrow range of experimental approaches that these animal models allow. Meanwhile, rodents have been largely neglected as models of object vision, because of the widespread belief that they are incapable of advanced visual processing. However, the powerful array of experimental tools that have been developed to dissect neuronal circuits in rodents has made these species very attractive to vision scientists too, promoting a new tide of studies that have started to systematically explore visual functions in rats and mice. Rats, in particular, have been the subjects of several behavioral studies, aimed at assessing how advanced object recognition and shape processing is in this species. Here, I review these recent investigations, as well as earlier studies of rat pattern vision, to provide an historical overview and a critical summary of the status of the knowledge about rat object vision. The picture emerging from this survey is very encouraging with regard to the possibility of using rats as complementary models to monkeys in the study of higher-level vision. PMID:25561421
DOE Office of Scientific and Technical Information (OSTI.GOV)
Uhr, L.
1987-01-01
This book is written by research scientists involved in the development of massively parallel, but hierarchically structured, algorithms, architectures, and programs for image processing, pattern recognition, and computer vision. The book gives an integrated picture of the programs and algorithms that are being developed, and also of the multi-computer hardware architectures for which these systems are designed.
NASA Astrophysics Data System (ADS)
Mishra, Deependra K.; Umbaugh, Scott E.; Lama, Norsang; Dahal, Rohini; Marino, Dominic J.; Sackman, Joseph
2016-09-01
CVIPtools is a software package for the exploration of computer vision and image processing developed in the Computer Vision and Image Processing Laboratory at Southern Illinois University Edwardsville. CVIPtools is available in three variants: a) the CVIPtools Graphical User Interface, b) the CVIPtools C library and c) the CVIPtools MATLAB toolbox, which makes it accessible to a variety of different users. It offers students, faculty, researchers and any user a free and easy way to explore computer vision and image processing techniques. Many functions have been implemented and are updated on a regular basis, and the library has reached a level of sophistication that makes it suitable for both educational and research purposes. In this paper, a detailed list of the functions available in the CVIPtools MATLAB toolbox is presented, along with how these functions can be used in image analysis and computer vision applications. The CVIPtools MATLAB toolbox allows the user to gain practical experience to better understand underlying theoretical problems in image processing and pattern recognition. As an example application, the algorithm for the automatic creation of masks for veterinary thermographic images is presented.
NASA Technical Reports Server (NTRS)
Liu, Hua-Kuang (Editor); Schenker, Paul (Editor)
1987-01-01
The papers presented in this volume provide an overview of current research in both optical and digital pattern recognition, with a theme of identifying overlapping research problems and methodologies. Topics discussed include image analysis and low-level vision, optical system design, object analysis and recognition, real-time hybrid architectures and algorithms, high-level image understanding, and optical matched filter design. Papers are presented on synthetic estimation filters for a control system; white-light correlator character recognition; optical AI architectures for intelligent sensors; interpreting aerial photographs by segmentation and search; and optical information processing using a new photopolymer.
NASA Technical Reports Server (NTRS)
Tescher, Andrew G. (Editor)
1989-01-01
Various papers on image compression and automatic target recognition are presented. Individual topics addressed include: target cluster detection in cluttered SAR imagery, model-based target recognition using laser radar imagery, Smart Sensor front-end processor for feature extraction of images, object attitude estimation and tracking from a single video sensor, symmetry detection in human vision, analysis of high resolution aerial images for object detection, obscured object recognition for an ATR application, neural networks for adaptive shape tracking, statistical mechanics and pattern recognition, detection of cylinders in aerial range images, moving object tracking using local windows, new transform method for image data compression, quad-tree product vector quantization of images, predictive trellis encoding of imagery, reduced generalized chain code for contour description, compact architecture for a real-time vision system, use of human visibility functions in segmentation coding, color texture analysis and synthesis using Gibbs random fields.
Motion Based Target Acquisition and Evaluation in an Adaptive Machine Vision System
1995-05-01
paths in facial recognition and learning. Annals of Neurology, 22, 41-45. Tolman, E.C. (1932) Purposive behavior in Animals and Men. New York: Appleton...Learned scan paths are the active processes of perception. Rizzo et al. (1987) studied the fixation patterns of two patients with impaired facial ... recognition and learning and found an increase in the randomness of the scan patterns compared to controls, indicating that the cortex was failing to direct
The recognition of graphical patterns invariant to geometrical transformation of the models
NASA Astrophysics Data System (ADS)
Ileană, Ioan; Rotar, Corina; Muntean, Maria; Ceuca, Emilian
2010-11-01
When a pattern recognition system is used for image recognition (in robot vision, handwriting recognition, etc.), the system must be able to identify an object regardless of its size or position in the image. The problem of invariant recognition can be approached in several fundamental ways. One may apply the similarity criterion used in associative recall. Alternatively, the original pattern is replaced by a mathematical transform that ensures some invariance (e.g. the magnitude of the two-dimensional Fourier transform is translation invariant, and the magnitude of the Mellin transform is scale invariant). In a different approach, the original pattern is represented through a set of features, each of them coded independently of the position, orientation or size of the pattern. Generally speaking, it is easy to obtain invariance with respect to one transformation group, but difficult to obtain simultaneous invariance to rotation, translation and scale. In this paper we analyze some methods to achieve invariant recognition of images, particularly for digit images. A large number of experiments were carried out, and the conclusions are outlined in the paper.
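The translation invariance of the Fourier transform mentioned above holds for its magnitude and can be checked on a toy signal. The sketch below uses a single image row and a naive one-dimensional transform as a stand-in for the two-dimensional case, where the same property holds along each axis; the function name and data are hypothetical.

```python
import cmath


def dft_magnitudes(signal):
    """Return the magnitudes of the discrete Fourier transform of a sequence."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


row = [0, 0, 1, 3, 2, 0, 0, 0]
shifted = row[-3:] + row[:-3]  # the same pattern moved 3 samples (circularly)

original_mag = dft_magnitudes(row)
shifted_mag = dft_magnitudes(shifted)

# A shift changes only the phase of each coefficient, not its magnitude.
print(all(abs(a - b) < 1e-9 for a, b in zip(original_mag, shifted_mag)))
```

The price of this invariance is that discarding the phase also discards information about the pattern's structure, so magnitude spectra alone may not distinguish all patterns.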
Remote Video Monitor of Vehicles in Cooperative Information Platform
NASA Astrophysics Data System (ADS)
Qin, Guofeng; Wang, Xiaoguo; Wang, Li; Li, Yang; Li, Qiyan
Detection of vehicles plays an important role in modern intelligent traffic management, and pattern recognition is an active topic in computer vision. An auto-recognition system in a cooperative information platform is studied. The cooperative platform integrates 3G wireless networks, including GPS, GPRS (CDMA), Internet (Intranet), remote video monitor and M-DMB networks. Remote video information is captured by the terminals and sent to the cooperative platform, where it is processed by the auto-recognition system. The images are preprocessed and segmented, followed by feature extraction, template matching and pattern recognition. The system identifies different vehicle models and gathers vehicular traffic statistics. Finally, the implementation of the system is introduced.
Recognition of plant parts with problem-specific algorithms
NASA Astrophysics Data System (ADS)
Schwanke, Joerg; Brendel, Thorsten; Jensch, Peter F.; Megnet, Roland
1994-06-01
Automatic micropropagation is necessary for the cost-effective production of large amounts of biomass. Juvenile plants are dissected in a clean-room environment at particular points on the stem or the leaves. A vision-system detects possible cutting points and controls a specialized robot. This contribution is directed to the pattern-recognition algorithms to detect structural parts of the plant.
Computer vision for microscopy diagnosis of malaria.
Tek, F Boray; Dempster, Andrew G; Kale, Izzet
2009-07-13
This paper reviews computer vision and image analysis studies aiming at automated diagnosis or screening of malaria infection in microscope images of thin blood film smears. Existing works interpret the diagnosis problem differently or propose partial solutions to the problem. A critique of these works is furnished. In addition, a general pattern recognition framework to perform diagnosis, which includes image acquisition, pre-processing, segmentation, and pattern classification components, is described. The open problems are addressed and a perspective of the future work for realization of automated microscopy diagnosis of malaria is provided.
Common constraints limit Korean and English character recognition in peripheral vision
He, Yingchen; Kwon, MiYoung; Legge, Gordon E.
2018-01-01
The visual span refers to the number of adjacent characters that can be recognized in a single glance. It is viewed as a sensory bottleneck in reading for both normal and clinical populations. In peripheral vision, the visual span for English characters can be enlarged after training with a letter-recognition task. Here, we examined the transfer of training from Korean to English characters for a group of bilingual Korean native speakers. In the pre- and posttests, we measured visual spans for Korean characters and English letters. Training (1.5 hours × 4 days) consisted of repetitive visual-span measurements for Korean trigrams (strings of three characters). Our training enlarged the visual spans for Korean single characters and trigrams, and the benefit transferred to untrained English symbols. The improvement was largely due to a reduction of within-character and between-character crowding in Korean recognition, as well as between-letter crowding in English recognition. We also found a negative correlation between the size of the visual span and the average pattern complexity of the symbol set. Together, our results showed that the visual span is limited by common sensory (crowding) and physical (pattern complexity) factors regardless of the language script, providing evidence that the visual span reflects a universal bottleneck for text recognition. PMID:29327041
Han, Wuxiao; Zhang, Linlin; He, Haoxuan; Liu, Hongmin; Xing, Lili; Xue, Xinyu
2018-06-22
The development of multifunctional electronic-skin that establishes human-machine interfaces, enhances perception abilities or has other distinct biomedical applications is the key to the realization of artificial intelligence. In this paper, a new self-powered (battery-free) flexible vision electronic-skin has been realized from pixel-patterned matrix of piezo-photodetecting PVDF/Ppy film. The electronic-skin under applied deformation can actively output piezoelectric voltage, and the outputting signal can be significantly influenced by UV illumination. The piezoelectric output can act as both the photodetecting signal and electricity power. The reliability is demonstrated over 200 light on-off cycles. The sensing unit matrix of 6 × 6 pixels on the electronic-skin can realize image recognition through mapping multi-point UV stimuli. This self-powered vision electronic-skin that simply mimics human retina may have potential application in vision substitution.
Mars Rover imaging systems and directional filtering
NASA Technical Reports Server (NTRS)
Wang, Paul P.
1989-01-01
Computer literature searches were carried out at Duke University and NASA Langley Research Center. The purpose is to enhance personal knowledge based on the technical problems of pattern recognition and image understanding which must be solved for the Mars Rover and Sample Return Mission. Intensive study effort of a large collection of relevant literature resulted in a compilation of all important documents in one place. Furthermore, the documents are being classified into: Mars Rover; computer vision (theory); imaging systems; pattern recognition methodologies; and other smart techniques (AI, neural networks, fuzzy logic, etc).
Dual Use of Image Based Tracking Techniques: Laser Eye Surgery and Low Vision Prosthesis
NASA Technical Reports Server (NTRS)
Juday, Richard D.; Barton, R. Shane
1994-01-01
With a concentration on Fourier optics pattern recognition, we have developed several methods of tracking objects in dynamic imagery to automate certain space applications such as orbital rendezvous and spacecraft capture, or planetary landing. We are developing two of these techniques for Earth applications in real-time medical image processing. The first is warping of a video image, developed to evoke shift invariance to scale and rotation in correlation pattern recognition. The technology is being applied to compensation for certain field defects in low vision humans. The second is using the optical joint Fourier transform to track the translation of unmodeled scenes. Developed as an image fixation tool to assist in calculating shape from motion, it is being applied to tracking motions of the eyeball quickly enough to keep a laser photocoagulation spot fixed on the retina, thus avoiding collateral damage.
Mechanisms and neural basis of object and pattern recognition: a study with chess experts.
Bilalić, Merim; Langner, Robert; Erb, Michael; Grodd, Wolfgang
2010-11-01
Comparing experts with novices offers unique insights into the functioning of cognition, based on the maximization of individual differences. Here we used this expertise approach to disentangle the mechanisms and neural basis behind two processes that contribute to everyday expertise: object and pattern recognition. We compared chess experts and novices performing chess-related and -unrelated (visual) search tasks. As expected, the superiority of experts was limited to the chess-specific task, as there were no differences in a control task that used the same chess stimuli but did not require chess-specific recognition. The analysis of eye movements showed that experts immediately and exclusively focused on the relevant aspects in the chess task, whereas novices also examined irrelevant aspects. With random chess positions, when pattern knowledge could not be used to guide perception, experts nevertheless maintained an advantage. Experts' superior domain-specific parafoveal vision, a consequence of their knowledge about individual domain-specific symbols, enabled improved object recognition. Functional magnetic resonance imaging corroborated this differentiation between object and pattern recognition and showed that chess-specific object recognition was accompanied by bilateral activation of the occipitotemporal junction, whereas chess-specific pattern recognition was related to bilateral activations in the middle part of the collateral sulci. Using the expertise approach together with carefully chosen controls and multiple dependent measures, we identified object and pattern recognition as two essential cognitive processes in expert visual cognition, which may also help to explain the mechanisms of everyday perception.
A Feasibility Study of View-independent Gait Identification
2012-03-01
ice skates . For walking, the footprint records for single pixels form clusters that are well separated in space and time. (Any overlap of contact...Pattern Recognition 2007, 1-8. Cheng M-H, Ho M-F & Huang C-L (2008), "Gait Analysis for Human Identification Through Manifold Learning and HMM... Learning and Cybernetics 2005, 4516-4521 Moeslund T B & Granum E (2001), "A Survey of Computer Vision-Based Human Motion Capture", Computer Vision
NASA Astrophysics Data System (ADS)
Fernández, Ariel; Ferrari, José A.
2017-05-01
Pattern recognition and feature extraction are image processing applications of great interest in defect inspection and robot vision, among others. In comparison to purely digital methods, the attractiveness of optical processors for pattern recognition lies in their highly parallel operation and real-time processing capability. This work presents an optical implementation of the generalized Hough transform (GHT), a well-established technique for recognition of geometrical features in binary images. Detection of a geometric feature under the GHT is accomplished by mapping the original image to an accumulator space; the large computational requirements for this mapping make the optical implementation an attractive alternative to digital-only methods. We explore an optical setup where the transformation is obtained, and the size and orientation parameters can be controlled, allowing for dynamic scale and orientation-variant pattern recognition. A compact system for the above purposes results from the use of an electrically tunable lens for scale control and a pupil mask implemented on a high-contrast spatial light modulator for orientation/shape variation of the template. Real-time operation can also be achieved. In addition, by thresholding the GHT and optically inverse transforming it, the previously detected features of interest can be extracted.
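The mapping to accumulator space mentioned in this abstract can be illustrated digitally. The following is a minimal sketch, assuming a translation-only GHT on a binary point set with a single-bin R-table; it is not the optical setup described by the authors, and the template, scene and function names are hypothetical.

```python
from collections import Counter

def build_r_table(template_points, reference):
    """Store the displacement from each template point to the reference point."""
    rx, ry = reference
    return [(rx - x, ry - y) for x, y in template_points]

def ght_accumulate(image_points, r_table):
    """Each image point casts one vote per R-table entry in accumulator space."""
    acc = Counter()
    for x, y in image_points:
        for dx, dy in r_table:
            acc[(x + dx, y + dy)] += 1
    return acc

# Hypothetical L-shaped template with its reference point at the corner (0, 0).
template = [(0, 0), (1, 0), (2, 0), (0, 1), (0, 2)]
r_table = build_r_table(template, reference=(0, 0))

# Scene: the same shape shifted by (5, 3), plus two clutter points.
scene = [(x + 5, y + 3) for x, y in template] + [(9, 9), (1, 7)]
acc = ght_accumulate(scene, r_table)

# The accumulator peak marks the most likely position of the reference point.
best_location, votes = acc.most_common(1)[0]
print(best_location, votes)
```

Scale and orientation would add further accumulator dimensions; this sketch leaves them out to keep the voting step visible.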
NASA Astrophysics Data System (ADS)
Megherbi, Dalila B.; Yan, Yin; Tanmay, Parikh; Khoury, Jed; Woods, C. L.
2004-11-01
Surveillance and Automatic Target Recognition (ATR) applications have recently been increasing as the cost of the computing power needed to process the massive amount of information continues to fall. This computing power has been made possible partly by the latest advances in FPGAs and SOPCs. In particular, to design and implement state-of-the-art electro-optical imaging systems to provide advanced surveillance capabilities, there is a need to integrate several technologies (e.g. telescope, precise optics, cameras, image/computer vision algorithms, which can be geographically distributed or sharing distributed resources) into a programmable system and DSP systems. Additionally, pattern recognition techniques and fast information retrieval are often important components of intelligent systems. The aim of this work is to use an embedded FPGA as a fast, configurable and synthesizable search engine for fast image pattern recognition/retrieval in a distributed hardware/software co-design environment. In particular, we propose and show a low cost Content Addressable Memory (CAM)-based distributed embedded FPGA hardware architecture solution with real time recognition capabilities and computing for pattern look-up, pattern recognition, and image retrieval. We show how the distributed CAM-based architecture offers a performance advantage of an order-of-magnitude over RAM-based architecture (Random Access Memory) search for implementing high speed pattern recognition for image retrieval. The methods of designing, implementing, and analyzing the proposed CAM based embedded architecture are described here. Other SOPC solutions/design issues are covered. Finally, experimental results, hardware verification, and performance evaluations using both the Xilinx Virtex-II and the Altera Apex20k are provided to show the potential and power of the proposed method for low cost reconfigurable fast image pattern recognition/retrieval at the hardware/software co-design level.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moody, Daniela I.; Brumby, Steven P.; Rowland, Joel C.
2014-10-01
Neuromimetic machine vision and pattern recognition algorithms are of great interest for landscape characterization and change detection in satellite imagery in support of global climate change science and modeling. We present results from an ongoing effort to extend machine vision methods to the environmental sciences, using adaptive sparse signal processing combined with machine learning. A Hebbian learning rule is used to build multispectral, multiresolution dictionaries from regional satellite normalized band difference index data. Land cover labels are automatically generated via our CoSA algorithm: Clustering of Sparse Approximations, using a clustering distance metric that combines spectral and spatial textural characteristics to help separate geologic, vegetative, and hydrologic features. We demonstrate our method on example Worldview-2 satellite images of an Arctic region, and use CoSA labels to detect seasonal surface changes. In conclusion, our results suggest that neuroscience-based models are a promising approach to practical pattern recognition and change detection problems in remote sensing.
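For readers unfamiliar with normalized band difference indices, the general form is (a - b) / (a + b) for two spectral bands a and b. The sketch below is only an illustration of that formula: the abstract does not say which band pairs were used, so the NIR/red pairing (the familiar NDVI) and the sample reflectances are assumptions.

```python
def normalized_difference(band_a, band_b):
    """Per-pixel (a - b) / (a + b); returns 0.0 where both bands are zero."""
    out = []
    for a, b in zip(band_a, band_b):
        total = a + b
        out.append(0.0 if total == 0 else (a - b) / total)
    return out

# Hypothetical reflectance values for three pixels.
nir = [0.50, 0.30, 0.0]
red = [0.10, 0.30, 0.0]

# Assumed band pairing: NIR and red give the NDVI vegetation index.
ndvi = normalized_difference(nir, red)
print(ndvi)
```

The index is bounded to [-1, 1] for non-negative inputs, which makes values comparable across scenes with different overall brightness.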
Extracting semantics from audio-visual content: the final frontier in multimedia retrieval.
Naphade, M R; Huang, T S
2002-01-01
Multimedia understanding is a fast emerging interdisciplinary research area. There is tremendous potential for effective use of multimedia content through intelligent analysis. Diverse application areas are increasingly relying on multimedia understanding systems. Advances in multimedia understanding are related directly to advances in signal processing, computer vision, pattern recognition, multimedia databases, and smart sensors. We review the state-of-the-art techniques in multimedia retrieval. In particular, we discuss how multimedia retrieval can be viewed as a pattern recognition problem. We discuss how reliance on powerful pattern recognition and machine learning techniques is increasing in the field of multimedia retrieval. We review the state-of-the-art multimedia understanding systems with particular emphasis on a system for semantic video indexing centered around multijects and multinets. We discuss how semantic retrieval is centered around concepts and context and the various mechanisms for modeling concepts and context.
Mobile Diagnostics Based on Motion? A Close Look at Motility Patterns in the Schistosome Life Cycle
Linder, Ewert; Varjo, Sami; Thors, Cecilia
2016-01-01
Imaging at high resolution and subsequent image analysis with modified mobile phones have the potential to solve problems related to microscopy-based diagnostics of parasitic infections in many endemic regions. Diagnostics using the computing power of “smartphones” is not restricted by limited expertise or limitations set by visual perception of a microscopist. Thus diagnostics currently almost exclusively dependent on recognition of morphological features of pathogenic organisms could be based on additional properties, such as motility characteristics recognizable by computer vision. Of special interest are infectious larval stages and “micro swimmers” of e.g., the schistosome life cycle, which infect the intermediate and definitive hosts, respectively. The ciliated miracidium emerges from the excreted egg upon its contact with water. This means that for diagnostics, recognition of a swimming miracidium is equivalent to recognition of an egg. The motility pattern of miracidia could be defined by computer vision and used as a diagnostic criterion. To develop motility pattern-based diagnostics of schistosomiasis using simple imaging devices, we analyzed Paramecium as a model for the schistosome miracidium. As a model for invasive nematodes, such as strongyloids and filaria, we examined a different type of motility in the apathogenic nematode Turbatrix, the “vinegar eel.” The results of motion time and frequency analysis suggest that target motility may be expressed as specific spectrograms serving as “diagnostic fingerprints.” PMID:27322330
HOTS: A Hierarchy of Event-Based Time-Surfaces for Pattern Recognition.
Lagorce, Xavier; Orchard, Garrick; Galluppi, Francesco; Shi, Bertram E; Benosman, Ryad B
2017-07-01
This paper describes novel event-based spatio-temporal features called time-surfaces and how they can be used to create a hierarchical event-based pattern recognition architecture. Unlike existing hierarchical architectures for pattern recognition, the presented model relies on a time oriented approach to extract spatio-temporal features from the asynchronously acquired dynamics of a visual scene. These dynamics are acquired using biologically inspired frameless asynchronous event-driven vision sensors. Similarly to cortical structures, subsequent layers in our hierarchy extract increasingly abstract features using increasingly large spatio-temporal windows. The central concept is to use the rich temporal information provided by events to create contexts in the form of time-surfaces which represent the recent temporal activity within a local spatial neighborhood. We demonstrate that this concept can robustly be used at all stages of an event-based hierarchical model. First layer feature units operate on groups of pixels, while subsequent layer feature units operate on the output of lower level feature units. We report results on a previously published 36 class character recognition task and a four class canonical dynamic card pip task, achieving near 100 percent accuracy on each. We introduce a new seven class moving face recognition task, achieving 79 percent accuracy.
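A time-surface, as described in this abstract, summarizes recent temporal activity in a local spatial neighborhood around an event. The following is a minimal sketch, assuming an exponential decay of the time elapsed since the last event at each neighboring pixel; the decay form, the radius, the time constant `tau` and the event list are illustrative assumptions, and event polarity is ignored.

```python
import math

def time_surface(last_times, x, y, t, radius=1, tau=50.0):
    """Decayed recency of the last event at each pixel around (x, y) at time t."""
    rows = []
    for dy in range(-radius, radius + 1):
        row = []
        for dx in range(-radius, radius + 1):
            t_last = last_times.get((x + dx, y + dy))
            # Pixels that have never fired contribute 0.0 (assumed convention).
            row.append(0.0 if t_last is None else math.exp(-(t - t_last) / tau))
        rows.append(row)
    return rows

# Hypothetical event stream of (x, y, timestamp) tuples, in time order.
events = [(2, 2, 0.0), (3, 2, 10.0), (2, 3, 40.0), (2, 2, 50.0)]

last_times = {}
for x, y, t in events:
    last_times[(x, y)] = t          # remember the latest event time per pixel
    surface = time_surface(last_times, x, y, t)

# `surface` now holds the 3x3 time-surface of the final event at (2, 2).
for row in surface:
    print(row)
```

In a hierarchy of the kind the abstract outlines, such surfaces would be matched against learned prototypes; that step is omitted here.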
Intelligent Scene Analysis and Recognition
2010-03-30
Database, 1998, pp. 42–51. [9] I. Biederman , Aspects and extension of a theory of human image understanding, Z. Pylyshyn, Ed. Ablex Publishing Corporation...geometry in the visual system,” Biological Cybernetics, vol. 55, no. 6, pp. 367–375, 1987 . [30] W. T. Freeman and E. H. Adelson, “The design and use of...Computer Vision and Pattern Recognition, 2009, pp. 1980– 1987 . [47] M. Leordeanu and M. Hebert, “A spectral technique for correspondence problems using
Compensation for Blur Requires Increase in Field of View and Viewing Time
Kwon, MiYoung; Liu, Rong; Chien, Lillian
2016-01-01
Spatial resolution is an important factor for human pattern recognition. In particular, low resolution (blur) is a defining characteristic of low vision. Here, we examined spatial (field of view) and temporal (stimulus duration) requirements for blurry object recognition. The spatial resolution of an image, such as a letter or face, was manipulated with a low-pass filter. In experiment 1, studying spatial requirement, observers viewed a fixed-size object through a window of varying sizes, which was repositioned until object identification (moving window paradigm). Field of view requirement, quantified as the number of “views” (window repositions) for correct recognition, was obtained for three blur levels, including no blur. In experiment 2, studying temporal requirement, we determined threshold viewing time, the stimulus duration yielding criterion recognition accuracy, at six blur levels, including no blur. For letter and face recognition, we found blur significantly increased the number of views, suggesting a larger field of view is required to recognize blurry objects. We also found blur significantly increased threshold viewing time, suggesting longer temporal integration is necessary to recognize blurry objects. The temporal integration reflects the tradeoff between stimulus intensity and time. While humans excel at recognizing blurry objects, our findings suggest compensating for blur requires increased field of view and viewing time. The need for larger spatial and longer temporal integration for recognizing blurry objects may further challenge object recognition in low vision. Thus, interactions between blur and field of view should be considered for developing low vision rehabilitation or assistive aids. PMID:27622710
Multi-texture local ternary pattern for face recognition
NASA Astrophysics Data System (ADS)
Essa, Almabrok; Asari, Vijayan
2017-05-01
In the imagery and pattern analysis domain, a variety of descriptors have been proposed and employed for different computer vision applications such as face detection and recognition. Many of them are affected by conditions during the image acquisition process, such as variations in illumination and the presence of noise, because they rely entirely on the image intensity values to encode the image information. To overcome these problems, a novel technique named Multi-Texture Local Ternary Pattern (MTLTP) is proposed in this paper. MTLTP combines the edges and corners based on the local ternary pattern strategy to extract the local texture features of the input image. It then returns a spatial histogram feature vector, which serves as the descriptor of each image and is used to recognize a person. Experimental results using a k-nearest neighbors classifier (k-NN) on two publicly available datasets justify our algorithm for efficient face recognition in the presence of extreme variations of illumination/lighting environments and slight variation of pose conditions.
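The local ternary pattern strategy that MTLTP builds on can be sketched for a single 3x3 patch. This is the basic ternary coding only, assuming the common formulation with a fixed tolerance around the center pixel; the edge and corner combination specific to MTLTP is not given in the abstract and is not reproduced. The patch values, threshold and neighbor ordering are illustrative.

```python
def ltp_codes(patch, threshold=5):
    """Ternary codes of the 8 neighbours of a 3x3 patch, clockwise from top-left."""
    center = patch[1][1]
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    codes = []
    for r, c in offsets:
        value = patch[r][c]
        if value >= center + threshold:
            codes.append(1)       # clearly brighter than the centre
        elif value <= center - threshold:
            codes.append(-1)      # clearly darker than the centre
        else:
            codes.append(0)       # within the tolerance band
    return codes

def split_ltp(codes):
    """Split the ternary code into 'upper' and 'lower' binary patterns."""
    upper = sum(1 << i for i, code in enumerate(codes) if code == 1)
    lower = sum(1 << i for i, code in enumerate(codes) if code == -1)
    return upper, lower

# Hypothetical grey-level patch with centre value 50.
patch = [[54, 57, 70],
         [40, 50, 62],
         [30, 48, 45]]

codes = ltp_codes(patch)
upper, lower = split_ltp(codes)
print(codes, upper, lower)
```

Histograms of the upper and lower codes over image regions would then form a spatial histogram feature vector of the kind the abstract mentions.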
A low-cost machine vision system for the recognition and sorting of small parts
NASA Astrophysics Data System (ADS)
Barea, Gustavo; Surgenor, Brian W.; Chauhan, Vedang; Joshi, Keyur D.
2018-04-01
An automated machine vision-based system for the recognition and sorting of small parts was designed, assembled and tested. The system was developed to address a need to expose engineering students to the issues of machine vision and assembly automation technology, with readily available and relatively low-cost hardware and software. This paper outlines the design of the system and presents experimental performance results. Three different styles of plastic gears, together with three different styles of defective gears, were used to test the system. A pattern matching tool was used for part classification. Nine experiments were conducted to demonstrate the effects of changing various hardware and software parameters, including: conveyor speed, gear feed rate, classification, and identification score thresholds. It was found that the system could achieve a maximum system accuracy of 95% at a feed rate of 60 parts/min, for a given set of parameter settings. Future work will be looking at the effect of lighting.
Unification of automatic target tracking and automatic target recognition
NASA Astrophysics Data System (ADS)
Schachter, Bruce J.
2014-06-01
The subject being addressed is how an automatic target tracker (ATT) and an automatic target recognizer (ATR) can be fused together so tightly and so well that their distinctiveness becomes lost in the merger. This has historically not been the case outside of biology and a few academic papers. The biological model of ATT∪ATR arises from dynamic patterns of activity distributed across many neural circuits and structures (including retina). The information that the brain receives from the eyes is "old news" at the time that it receives it. The eyes and brain forecast a tracked object's future position, rather than relying on received retinal position. Anticipation of the next moment - building up a consistent perception - is accomplished under difficult conditions: motion (eyes, head, body, scene background, target) and processing limitations (neural noise, delays, eye jitter, distractions). Not only does the human vision system surmount these problems, but it has innate mechanisms to exploit motion in support of target detection and classification. Biological vision doesn't normally operate on snapshots. Feature extraction, detection and recognition are spatiotemporal. When vision is viewed as a spatiotemporal process, target detection, recognition, tracking, event detection and activity recognition do not seem as distinct as they are in current ATT and ATR designs. They appear as similar mechanisms taking place at varying time scales. A framework is provided for unifying ATT and ATR.
The Poetics of "Pattern Recognition": William Gibson's Shifting Technological Subject
ERIC Educational Resources Information Center
Wetmore, Alex
2007-01-01
William Gibson's 1984 cyberpunk novel "Neuromancer" continues to be a touchstone in cultural representations of the impact of new information and communication technologies on the self. As critics have noted, the posthumanist, capital-driven, urban landscape of "Neuromancer" resembles a Foucaultian vision of a panoptically engineered social space…
A Monocular SLAM Method to Estimate Relative Pose During Satellite Proximity Operations
2015-03-26
localization and mapping with efficient outlier handling. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2013. 5. Herbert Bay...S.H. Spencer . Next generation advanced video guidance sensor. In Aerospace Conference, 2008 IEEE, pages 1–8, March 2008. 12. Michael Calonder, Vincent
2001-09-01
diagnosis natural language understanding circuit fault diagnosis pattern recognition machine vision financial auditing map learning sensor... ACCA ACCB A flights degree of command and control FCC value is assumed to be the average of all the ACC values of the aircraft in the
Neural architectures for stereo vision.
Parker, Andrew J; Smith, Jackson E T; Krug, Kristine
2016-06-19
Stereoscopic vision delivers a sense of depth based on binocular information but additionally acts as a mechanism for achieving correspondence between patterns arriving at the left and right eyes. We analyse quantitatively the cortical architecture for stereoscopic vision in two areas of macaque visual cortex. For primary visual cortex V1, the result is consistent with a module that is isotropic in cortical space with a diameter of at least 3 mm in surface extent. This implies that the module for stereo is larger than the repeat distance between ocular dominance columns in V1. By contrast, in the extrastriate cortical area V5/MT, which has a specialized architecture for stereo depth, the module for representation of stereo is about 1 mm in surface extent, so the representation of stereo in V5/MT is more compressed than V1 in terms of neural wiring of the neocortex. The surface extent estimated for stereo in V5/MT is consistent with measurements of its specialized domains for binocular disparity. Within V1, we suggest that long-range horizontal, anatomical connections form functional modules that serve both binocular and monocular pattern recognition: this common function may explain the distortion and disruption of monocular pattern vision observed in amblyopia. This article is part of the themed issue 'Vision in our three-dimensional world'. © 2016 The Authors.
Robot Command Interface Using an Audio-Visual Speech Recognition System
NASA Astrophysics Data System (ADS)
Ceballos, Alexánder; Gómez, Juan; Prieto, Flavio; Redarce, Tanneguy
In recent years audio-visual speech recognition has emerged as an active field of research thanks to advances in pattern recognition, signal processing and machine vision. Its ultimate goal is to allow human-computer communication using voice, taking into account the visual information contained in the audio-visual speech signal. This document presents an automatic command recognition system using audio-visual information. The system is expected to control the da Vinci laparoscopic robot. The audio signal is processed using the Mel Frequency Cepstral Coefficients parametrization method. In addition, features based on the points that define the mouth's outer contour according to the MPEG-4 standard are used to extract the visual speech information.
Static facial expression recognition with convolution neural networks
NASA Astrophysics Data System (ADS)
Zhang, Feng; Chen, Zhong; Ouyang, Chao; Zhang, Yifei
2018-03-01
Facial expression recognition is a currently active research topic in the fields of computer vision, pattern recognition and artificial intelligence. In this paper, we have developed a convolutional neural network (CNN) for classifying human emotions from static facial expressions into one of seven facial emotion categories. We pre-train our CNN model on the combined FER2013 dataset formed by the training, validation and test sets and fine-tune it on the extended Cohn-Kanade database. To reduce overfitting of the models, we utilized different techniques including dropout and batch normalization in addition to data augmentation. According to the experimental results, our CNN model has excellent classification performance and robustness for facial expression recognition.
Intelligent data processing of an ultrasonic sensor system for pattern recognition improvements
NASA Astrophysics Data System (ADS)
Na, Seung You; Park, Min-Sang; Hwang, Won-Gul; Kee, Chang-Doo
1999-05-01
Though conventional time-of-flight ultrasonic sensor systems are popular due to the advantages of low cost and simplicity, the usage of the sensors is rather narrowly restricted to object detection and distance readings. There is a strong need to enlarge the amount of environmental information for mobile applications to provide intelligent autonomy. Wide sectors of such neighboring object recognition problems can be satisfactorily handled with coarse vision data such as sonar maps instead of accurate laser or optic measurements. For object pattern recognition, ultrasonic sensors have inherent shortcomings of poor directionality and specularity, which result in low spatial resolution and indistinctiveness of object patterns. To resolve these problems, an array with an increased number of sensor elements has been used for large objects. In this paper we propose a sensor array system with improved recognition capability using electronic circuits accompanying the sensor array and neuro-fuzzy processing of data fusion. The circuit changes the transmitter output voltages of array elements in several steps. Relying upon the known sensor characteristics, a set of different return signals from neighboring sensors is manipulated to provide enhanced pattern recognition in the aspects of inclination angle, size and shift as well as distance of objects. The results show improved resolution of the measurements for smaller targets.
Multiple Optical Filter Design Simulation Results
NASA Astrophysics Data System (ADS)
Mendelsohn, J.; Englund, D. C.
1986-10-01
In this paper we continue our investigation of the application of matched filters to robotic vision problems. Specifically, we are concerned with the tray-picking problem. Our principal interest in this paper is the examination of summation effects which arise from attempting to reduce the matched filter memory size by averaging of matched filters. While the implementation of matched filtering theory to applications in pattern recognition or machine vision is ideally through the use of optics and optical correlators, in this paper the results were obtained through a digital simulation of the optical process.
Gottschlich, Carsten
2016-01-01
We present a new type of local image descriptor which yields binary patterns from small image patches. For the application to fingerprint liveness detection, we achieve rotation invariant image patches by taking the fingerprint segmentation and orientation field into account. We compute the discrete cosine transform (DCT) for these rotation invariant patches and attain binary patterns by comparing pairs of two DCT coefficients. These patterns are summarized into one or more histograms per image. Each histogram comprises the relative frequencies of pattern occurrences. Multiple histograms are concatenated and the resulting feature vector is used for image classification. We name this novel type of descriptor convolution comparison pattern (CCP). Experimental results show the usefulness of the proposed CCP descriptor for fingerprint liveness detection. CCP outperforms other local image descriptors such as LBP, LPQ and WLD on the LivDet 2013 benchmark. The CCP descriptor is a general type of local image descriptor which we expect to prove useful in areas beyond fingerprint liveness detection such as biological and medical image processing, texture recognition, face recognition and iris recognition, liveness detection for face and iris images, and machine vision for surface inspection and material classification. PMID:26844544
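The descriptor pipeline described above (DCT of a patch, one bit per compared coefficient pair, histogram of the resulting patterns) can be illustrated with a minimal sketch. This is not the authors' implementation: the coefficient pairs in `PAIRS`, the 3 × 3 toy patches, and all function names are hypothetical, and the rotation-invariant patch extraction from the fingerprint orientation field is omitted.

```python
import math
from collections import Counter

def dct2(patch):
    """Naive, unnormalised 2-D DCT-II of a square patch (list of rows)."""
    n = len(patch)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (patch[x][y]
                          * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                          * math.cos(math.pi * (2 * y + 1) * v / (2 * n)))
            out[u][v] = s
    return out

def binary_pattern(coeffs, pairs):
    """One bit per coefficient pair: 1 if the first coefficient is larger."""
    code = 0
    for a, b in pairs:
        bit = 1 if coeffs[a[0]][a[1]] > coeffs[b[0]][b[1]] else 0
        code = (code << 1) | bit
    return code

def pattern_histogram(patches, pairs):
    """Relative frequency of each possible binary pattern over all patches."""
    counts = Counter(binary_pattern(dct2(p), pairs) for p in patches)
    total = len(patches)
    return [counts.get(c, 0) / total for c in range(2 ** len(pairs))]

# Hypothetical coefficient pairs, chosen only for illustration.
PAIRS = [((0, 1), (1, 0)), ((0, 0), (1, 1))]

patches = [
    [[1, 2, 3], [1, 2, 3], [1, 2, 3]],  # intensity varies along columns
    [[1, 1, 1], [2, 2, 2], [3, 3, 3]],  # intensity varies along rows
]
hist = pattern_histogram(patches, PAIRS)
```

With two pairs there are four possible patterns, so `hist` has four bins; several such histograms could then be concatenated into a feature vector for a classifier, as the abstract describes.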
Survey of computer vision-based natural disaster warning systems
NASA Astrophysics Data System (ADS)
Ko, ByoungChul; Kwak, Sooyeong
2012-07-01
With the rapid development of information technology, natural disaster prevention is growing as a new research field dealing with surveillance systems. To forecast and prevent the damage caused by natural disasters, the development of systems to analyze natural disasters using remote sensing, geographic information systems (GIS), and vision sensors has been receiving widespread interest over the last decade. This paper provides an up-to-date review of five different types of natural disasters and their corresponding warning systems using computer vision and pattern recognition techniques such as wildfire smoke and flame detection, water level detection for flood prevention, coastal zone monitoring, and landslide detection. Finally, we conclude with some thoughts about future research directions.
Local intensity area descriptor for facial recognition in ideal and noise conditions
NASA Astrophysics Data System (ADS)
Tran, Chi-Kien; Tseng, Chin-Dar; Chao, Pei-Ju; Ting, Hui-Min; Chang, Liyun; Huang, Yu-Jie; Lee, Tsair-Fwu
2017-03-01
We propose a local texture descriptor, local intensity area descriptor (LIAD), which is applied for human facial recognition in ideal and noisy conditions. Each facial image is divided into small regions from which LIAD histograms are extracted and concatenated into a single feature vector to represent the facial image. The recognition is performed using a nearest neighbor classifier with histogram intersection and chi-square statistics as dissimilarity measures. Experiments were conducted with LIAD using the ORL database of faces (Olivetti Research Laboratory, Cambridge), the Face94 face database, the Georgia Tech face database, and the FERET database. The results demonstrated the improvement in accuracy of our proposed descriptor compared to conventional descriptors [local binary pattern (LBP), uniform LBP, local ternary pattern, histogram of oriented gradients, and local directional pattern]. Moreover, the proposed descriptor was less sensitive to noise and had low histogram dimensionality. Thus, it is expected to be a powerful texture descriptor that can be used for various computer vision problems.
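The classification step described above, a nearest neighbor classifier over concatenated histograms with histogram intersection and chi-square as dissimilarity measures, can be sketched as follows. The LIAD feature extraction itself is not reproduced; the gallery labels and histogram values are made up, and the chi-square form shown is one common variant, assumed here rather than taken from the paper.

```python
def chi_square(h1, h2, eps=1e-12):
    """Chi-square dissimilarity between two histograms (one common form)."""
    return sum((a - b) ** 2 / (a + b + eps) for a, b in zip(h1, h2))

def intersection_dissimilarity(h1, h2):
    """1 minus histogram intersection, for histograms that each sum to 1."""
    return 1.0 - sum(min(a, b) for a, b in zip(h1, h2))

def nearest_neighbor(query, gallery, dissimilarity):
    """Return the label of the gallery histogram closest to the query."""
    label, _ = min(gallery, key=lambda item: dissimilarity(query, item[1]))
    return label

# Hypothetical gallery of (label, normalised histogram) pairs.
gallery = [
    ("person_a", [0.7, 0.2, 0.1]),
    ("person_b", [0.1, 0.3, 0.6]),
]
query = [0.6, 0.3, 0.1]

match_chi = nearest_neighbor(query, gallery, chi_square)
match_int = nearest_neighbor(query, gallery, intersection_dissimilarity)
```

Both measures pick the same gallery entry for this toy query; on real data they can disagree, which is presumably why the abstract reports both.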
Comparing visual representations across human fMRI and computational vision
Leeds, Daniel D.; Seibert, Darren A.; Pyles, John A.; Tarr, Michael J.
2013-01-01
Feedforward visual object perception recruits a cortical network that is assumed to be hierarchical, progressing from basic visual features to complete object representations. However, the nature of the intermediate features related to this transformation remains poorly understood. Here, we explore how well different computer vision recognition models account for neural object encoding across the human cortical visual pathway as measured using fMRI. These neural data, collected during the viewing of 60 images of real-world objects, were analyzed with a searchlight procedure as in Kriegeskorte, Goebel, and Bandettini (2006): Within each searchlight sphere, the obtained patterns of neural activity for all 60 objects were compared to model responses for each computer recognition algorithm using representational dissimilarity analysis (Kriegeskorte et al., 2008). Although each of the computer vision methods significantly accounted for some of the neural data, among the different models, the scale invariant feature transform (Lowe, 2004), encoding local visual properties gathered from “interest points,” was best able to accurately and consistently account for stimulus representations within the ventral pathway. More generally, when present, significance was observed in regions of the ventral-temporal cortex associated with intermediate-level object perception. Differences in model effectiveness and the neural location of significant matches may be attributable to the fact that each model implements a different featural basis for representing objects (e.g., more holistic or more parts-based). Overall, we conclude that well-known computer vision recognition systems may serve as viable proxies for theories of intermediate visual object representation. PMID:24273227
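Representational dissimilarity analysis, as used above, compares a neural and a model representation through their stimulus-by-stimulus dissimilarity matrices. The sketch below is a toy illustration under stated assumptions: dissimilarity is taken as 1 minus Pearson correlation, the two matrices are compared by Pearson correlation of their upper triangles (rank correlation is a common alternative), and the activity patterns are invented numbers for three stimuli rather than the study's 60 objects.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - correlation per stimulus pair."""
    n = len(patterns)
    return [[0.0 if i == j else 1.0 - pearson(patterns[i], patterns[j])
             for j in range(n)] for i in range(n)]

def upper_triangle(m):
    """Entries above the diagonal, row by row."""
    return [m[i][j] for i in range(len(m)) for j in range(i + 1, len(m))]

def compare_rdms(a, b):
    """Similarity of two RDMs via correlation of their upper triangles."""
    return pearson(upper_triangle(a), upper_triangle(b))

# Toy data: one row per stimulus (values are invented).
neural = [[1, 2, 3, 4], [2, 1, 4, 3], [4, 3, 2, 1]]      # e.g. voxels in one sphere
model = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]  # e.g. model features

neural_rdm = rdm(neural)
score = compare_rdms(neural_rdm, rdm(model))
```

In a searchlight analysis this comparison would be repeated for every sphere position, giving a map of where each model's dissimilarity structure matches the neural data.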
New color vision tests to evaluate faulty color recognition.
Nakamura, Kaoru; Okajima, Osamu; Nishio, Yoshiteru; Kitahara, Kenji
2002-01-01
To develop and assess new color vision tests to be used in evaluating faulty color recognition. We developed new color vision tests to evaluate faulty color recognition. The two types of color vision tests, designed to assess faulty color recognition in color vision deficiencies, are based on principles that are different from those of the conventional color vision tests. In the first test plate, the subject is asked to choose either a red, green, or gray line from among 10 lines that are randomly colored red, green, gray, yellow, or blue. The score is the difference between the number of correct answers and the number of incorrect answers. In the second test plate, the subject is asked to identify a total of 10 red azalea blossoms, which are dispersed among numerous green leaves. Seventy-five persons with congenital color deficiencies and 20 subjects with normal color vision were examined using these new test plates. The scores differed significantly between dichromats and anomalous trichromats, and between anomalous trichromats and subjects with normal color vision. The new tests are easy to use, sensitive, and have good reproducibility for use in discriminating subjects with color vision anomalies. These tests reveal the faulty color recognition that occurs unconsciously in persons with color deficiencies, and are useful in judging the quantification of color vision required in their daily life and occupations.
Robust Indoor Human Activity Recognition Using Wireless Signals.
Wang, Yi; Jiang, Xinli; Cao, Rongyu; Wang, Xiyang
2015-07-15
Wireless signals-based activity detection and recognition technology may be complementary to the existing vision-based methods, especially under the circumstance of occlusions, viewpoint change, complex background, lighting condition change, and so on. This paper explores the properties of the channel state information (CSI) of Wi-Fi signals, and presents a robust indoor daily human activity recognition framework with only one pair of transmission points (TP) and access points (AP). First of all, some indoor human actions are selected as primitive actions forming a training set. Then, an online filtering method is designed to make actions' CSI curves smooth and allow them to contain enough pattern information. Each primitive action pattern can be segmented from the outliers of its multi-input multi-output (MIMO) signals by a proposed segmentation method. Lastly, in online activities recognition, by selecting proper features and Support Vector Machine (SVM) based multi-classification, activities constituted by primitive actions can be recognized insensitive to the locations, orientations, and speeds.
R, Elakkiya; K, Selvamani
2017-09-22
Subunit segmenting and modelling in medical sign language is one of the important studies in linguistic-oriented and vision-based Sign Language Recognition (SLR). Many earlier efforts focused on functional subunits from the viewpoint of linguistic syllables, but implementing such syllable-based subunit extraction is not feasible with real-world computer vision techniques. In addition, present recognition systems are designed so that they detect signer-dependent actions only under restricted laboratory conditions. This research paper aims at solving these two important issues: (1) subunit extraction and (2) signer-independent action in visual sign language recognition. Subunit extraction involves the sequential and parallel breakdown of sign gestures without any prior knowledge of syllables or the number of subunits. A novel Bayesian Parallel Hidden Markov Model (BPaHMM) is introduced for subunit extraction to combine the features of manual and non-manual parameters to yield better results in classification and recognition of signs. Signer-independent action aims at using a single web camera for different signer behaviour patterns and for cross-signer validation. Experimental results show that the proposed signer-independent subunit-level modelling for sign language classification and recognition improves on other existing works.
A self-learning camera for the validation of highly variable and pseudorandom patterns
NASA Astrophysics Data System (ADS)
Kelley, Michael
2004-05-01
Reliable and productive manufacturing operations have depended on people to quickly detect and solve problems whenever they appear. Over the last 20 years, more and more manufacturing operations have embraced machine vision systems to increase productivity, reliability and cost-effectiveness, including reducing the number of human operators required. Although machine vision technology has long been capable of solving simple problems, it has still not been broadly implemented. The reason is that until now, no machine vision system has been designed to meet the unique demands of complicated pattern recognition. The ZiCAM family was specifically developed to be the first practical hardware to meet these needs. To be able to address non-traditional applications, the machine vision industry must include smart camera technology that meets its users' demands for lower costs, better performance and the ability to address applications of irregular lighting, patterns and color. The next-generation smart cameras will need to evolve as a fundamentally different kind of sensor, with new technology that behaves like a human but performs like a computer. Neural network based systems, coupled with self-taught, n-space, non-linear modeling, promise to be the enabler of the next generation of machine vision equipment. Image processing technology is now available that enables a system to match an operator's subjectivity. A Zero-Instruction-Set-Computer (ZISC) powered smart camera allows high-speed fuzzy-logic processing, without the need for computer programming. This can address applications of validating highly variable and pseudo-random patterns. A hardware-based implementation of a neural network, Zero-Instruction-Set-Computer, enables a vision system to "think" and "inspect" like a human, with the speed and reliability of a machine.
Neural network classification technique and machine vision for bread crumb grain evaluation
NASA Astrophysics Data System (ADS)
Zayas, Inna Y.; Chung, O. K.; Caley, M.
1995-10-01
Bread crumb grain was studied to develop a model for pattern recognition of bread baked at Hard Winter Wheat Quality Laboratory (HWWQL), Grain Marketing and Production Research Center (GMPRC). Images of bread slices were acquired with a scanner in a 512 × 512 format. Subimages in the central part of the slices were evaluated by several features such as mean, determinant, eigenvalues, shape of a slice and other crumb features. Derived features were used to describe slices and loaves. Neural network programs of the MATLAB package were used for data analysis. Learning vector quantization and multivariate discriminant analysis were applied to bread slices from wheat of different sources. Training and test sets of different bread crumb texture classes were obtained. The ranking of subimages was well correlated with visual judgement. The performance of different models on slice recognition rate was studied to choose the best model. The recognition of classes created according to human judgement with image features was low. Recognition of arbitrarily created classes, according to porosity patterns, with several feature patterns was approximately 90%. The correlation coefficient was approximately 0.7 between slice shape features and loaf volume.
NASA Technical Reports Server (NTRS)
1989-01-01
Texas Instruments Programmable Remapper is a research tool used to determine how to best utilize the part of a patient's visual field still usable by mapping onto his field of vision with manipulated imagery. It is an offshoot of a NASA program for speeding up and improving the accuracy of pattern recognition in video imagery. The Remapper enables an image to be "pushed around" so more of it falls into the functional portions of the retina of a low vision person. It works at video rates, and researchers hope to significantly reduce its size and cost, creating a wearable prosthesis for visually impaired people.
Kitada, Ryo; Johnsrude, Ingrid S; Kochiyama, Takanori; Lederman, Susan J
2009-10-01
Humans can recognize common objects by touch extremely well whenever vision is unavailable. Despite its importance to a thorough understanding of human object recognition, the neuroscientific study of this topic has been relatively neglected. To date, the few published studies have addressed the haptic recognition of nonbiological objects. We now focus on haptic recognition of the human body, a particularly salient object category for touch. Neuroimaging studies demonstrate that regions of the occipito-temporal cortex are specialized for visual perception of faces (fusiform face area, FFA) and other body parts (extrastriate body area, EBA). Are the same category-sensitive regions activated when these components of the body are recognized haptically? Here, we use fMRI to compare brain organization for haptic and visual recognition of human body parts. Sixteen subjects identified exemplars of faces, hands, feet, and nonbiological control objects using vision and haptics separately. We identified two discrete regions within the fusiform gyrus (FFA and the haptic face region) that were each sensitive to both haptically and visually presented faces; however, these two regions differed significantly in their response patterns. Similarly, two regions within the lateral occipito-temporal area (EBA and the haptic body region) were each sensitive to body parts in both modalities, although the response patterns differed. Thus, although the fusiform gyrus and the lateral occipito-temporal cortex appear to exhibit modality-independent, category-sensitive activity, our results also indicate a degree of functional specialization related to sensory modality within these structures.
Infrared Cephalic-Vein to Assist Blood Extraction Tasks: Automatic Projection and Recognition
NASA Astrophysics Data System (ADS)
Lagüela, S.; Gesto, M.; Riveiro, B.; González-Aguilera, D.
2017-05-01
The thermal infrared band is not commonly used in photogrammetric and computer vision algorithms, mainly due to the low spatial resolution of this type of imagery. However, this band captures sub-superficial information, increasing the capabilities of visible bands regarding applications. This fact is especially important in biomedicine and biometrics, allowing the geometric characterization of interior organs and pathologies with photogrammetric principles, as well as the automatic identification and labelling using computer vision algorithms. This paper presents advances of close-range photogrammetry and computer vision applied to thermal infrared imagery, with the final application of Augmented Reality in order to widen its application in the biomedical field. In this case, the thermal infrared image of the arm is acquired and simultaneously projected on the arm, together with the identification label of the cephalic vein. This way, blood analysts are assisted in finding the vein for blood extraction, especially in those cases where identification by the human eye is a complex task. Vein recognition is performed based on the Gaussian temperature distribution in the area of the vein, while the calibration between projector and thermographic camera is developed through feature extraction and pattern recognition. The method is validated through its application to a set of volunteers of different ages and genders, so that different conditions of body temperature and vein depth are covered for the applicability and reproducibility of the method.
Stereo vision with distance and gradient recognition
NASA Astrophysics Data System (ADS)
Kim, Soo-Hyun; Kang, Suk-Bum; Yang, Tae-Kyu
2007-12-01
Robot vision technology is needed for stable walking, object recognition, and movement to a target spot. With sensors that use infrared rays and ultrasound, a robot can cope with urgent or dangerous situations, but stereo vision of three-dimensional space would give a robot far more capable artificial intelligence. In this paper we consider stereo vision for stable and correct movement of a biped robot. When a robot is confronted with an inclined plane or steps, particular algorithms are needed for it to proceed without failure. This study developed an algorithm that recognizes the distance and gradient of the environment through a stereo matching process.
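As a rough illustration of how distance and gradient can follow from stereo matching, the sketch below uses the standard rectified pinhole-stereo relation (depth = focal length × baseline / disparity) and estimates a slope angle from two ground points. This is a generic textbook relation, not the algorithm from the paper; the function names, camera parameters, and the way heights are obtained are all assumptions.

```python
import math

def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Distance along the optical axis for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

def slope_angle_deg(z_near, height_near, z_far, height_far):
    """Inclination of the ground between two points, in degrees."""
    return math.degrees(math.atan2(height_far - height_near, z_far - z_near))

# Hypothetical camera: 700 px focal length, 10 cm baseline.
z_near = depth_from_disparity(700.0, 0.10, 35.0)   # larger disparity, closer point
z_far = depth_from_disparity(700.0, 0.10, 28.0)    # smaller disparity, farther point

# Assumed heights (in metres) of the same two ground points.
angle = slope_angle_deg(z_near, 0.0, z_far, 0.5)
```

Because depth is inversely proportional to disparity, matching errors matter more for distant points, which is one reason a walking robot would care most about the terrain immediately ahead.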
Laser Opto-Electronic Correlator for Robotic Vision Automated Pattern Recognition
NASA Technical Reports Server (NTRS)
Marzwell, Neville
1995-01-01
A compact laser opto-electronic correlator for pattern recognition has been designed, fabricated, and tested. Specifically it is a translation sensitivity adjustable compact optical correlator (TSACOC) utilizing convergent laser beams for the holographic filter. Its properties and performance, including the location of the correlation peak and the effects of lateral and longitudinal displacements for both filters and input images, are systematically analyzed based on the nonparaxial approximation for the reference beam. The theoretical analyses have been verified in experiments. In applying the TSACOC to important practical problems including fingerprint identification, we have found that the tolerance of the system to the input lateral displacement can be conveniently increased by changing a geometric factor of the system. The system can be compactly packaged using the miniature laser diode sources and can be used in space by the National Aeronautics and Space Administration (NASA) and ground commercial applications which include robotic vision, and industrial inspection of automated quality control operations. The personnel of Standard International will work closely with the Jet Propulsion Laboratory (JPL) to transfer the technology to the commercial market. Prototype systems will be fabricated to test the market and perfect the product. Large production will follow after successful results are achieved.
Neural correlates of virtual route recognition in congenital blindness.
Kupers, Ron; Chebat, Daniel R; Madsen, Kristoffer H; Paulson, Olaf B; Ptito, Maurice
2010-07-13
Despite the importance of vision for spatial navigation, blind subjects retain the ability to represent spatial information and to move independently in space to localize and reach targets. However, the neural correlates of navigation in subjects lacking vision remain elusive. We therefore used functional MRI (fMRI) to explore the cortical network underlying successful navigation in blind subjects. We first trained congenitally blind and blindfolded sighted control subjects to perform a virtual navigation task with the tongue display unit (TDU), a tactile-to-vision sensory substitution device that translates a visual image into electrotactile stimulation applied to the tongue. After training, participants repeated the navigation task during fMRI. Although both groups successfully learned to use the TDU in the virtual navigation task, the brain activation patterns showed substantial differences. Blind but not blindfolded sighted control subjects activated the parahippocampus and visual cortex during navigation, areas that are recruited during topographical learning and spatial representation in sighted subjects. When the navigation task was performed under full vision in a second group of sighted participants, the activation pattern strongly resembled the one obtained in the blind when using the TDU. This suggests that in the absence of vision, cross-modal plasticity permits the recruitment of the same cortical network used for spatial navigation tasks in sighted subjects.
Creating a meaningful visual perception in blind volunteers by optic nerve stimulation
NASA Astrophysics Data System (ADS)
Brelén, M. E.; Duret, F.; Gérard, B.; Delbeke, J.; Veraart, C.
2005-03-01
A blind volunteer, suffering from retinitis pigmentosa, has been chronically implanted with an optic nerve visual prosthesis. Vision rehabilitation with this volunteer has concentrated on the development of a stimulation strategy according to which video camera images are converted into stimulation pulses. The aim is to convey as much information as possible about the visual scene within the limits of the device's capabilities. Pattern recognition tasks were used to assess the effectiveness of the stimulation strategy. The results demonstrate how even a relatively basic algorithm can efficiently convey useful information regarding the visual scene. By increasing the number of phosphenes used in the algorithm, better performance is observed but a longer training period is required. After a learning period, the volunteer achieved a pattern recognition score of 85% at 54 s on average per pattern. After nine evaluation sessions, when using a stimulation strategy exploiting all available phosphenes, no saturation effect has yet been observed.
Intelligent Vision On The SM90 Mini-Computer Basis And Applications
NASA Astrophysics Data System (ADS)
Hawryszkiw, J.
1985-02-01
A distinction has to be made between image processing and vision. Image processing finds its roots in the strong tradition of linear signal processing and promotes geometrical transform techniques, such as filtering, compression, and restoration. Its purpose is to transform an image so that a human observer can easily extract from it the information significant for him, for example edges after a gradient operator, or a specific direction after a directional filtering operation. Image processing consists in fact of a set of local or global space-time transforms. The interpretation of the final image is done by the human observer. The purpose of vision is to extract the semantic content of the image. The machine can then understand that content and run a decision process, which turns into an action. Thus, intelligent vision depends on image processing, pattern recognition, and artificial intelligence.
Pattern recognition and feature extraction with an optical Hough transform
NASA Astrophysics Data System (ADS)
Fernández, Ariel
2016-09-01
Pattern recognition and localization along with feature extraction are image processing applications of great interest in defect inspection and robot vision among others. In comparison to purely digital methods, the attractiveness of optical processors for pattern recognition lies in their highly parallel operation and real-time processing capability. This work presents an optical implementation of the generalized Hough transform (GHT), a well-established technique for the recognition of geometrical features in binary images. Detection of a geometric feature under the GHT is accomplished by mapping the original image to an accumulator space; the large computational requirements for this mapping make the optical implementation an attractive alternative to digital-only methods. Starting from the integral representation of the GHT, it is possible to devise an optical setup where the transformation is obtained, and the size and orientation parameters can be controlled, allowing for dynamic scale and orientation-variant pattern recognition. A compact system for the above purposes results from the use of an electrically tunable lens for scale control and a rotating pupil mask for orientation variation, implemented on a high-contrast spatial light modulator (SLM). Real-time operation (as limited by the frame rate of the device used to capture the GHT) can also be achieved, allowing for the processing of video sequences. Besides, by thresholding of the GHT (with the aid of another SLM) and inverse transforming (which is optically achieved in the incoherent system under appropriate focusing setting), the previously detected features of interest can be extracted.
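The mapping from image to accumulator space mentioned above can be illustrated digitally with a deliberately simplified GHT. In this sketch every feature point of a binary image votes for all reference-point locations consistent with a template; scale and orientation are fixed, and the usual indexing of the displacement table by gradient direction is left out. The template shape, point coordinates, and function names are illustrative assumptions, and nothing here models the optical setup.

```python
from collections import defaultdict

def build_r_table(template_points, reference):
    """Displacements from each template point to the reference point."""
    rx, ry = reference
    return [(rx - x, ry - y) for x, y in template_points]

def ght_accumulate(image_points, r_table):
    """Each image point votes for every reference location it could imply."""
    acc = defaultdict(int)
    for x, y in image_points:
        for dx, dy in r_table:
            acc[(x + dx, y + dy)] += 1
    return acc

def best_location(acc):
    """Accumulator cell with the most votes, as ((x, y), votes)."""
    return max(acc.items(), key=lambda kv: kv[1])

# Hypothetical L-shaped template with its corner as the reference point.
template = [(0, 0), (1, 0), (2, 0), (0, 1), (0, 2)]
r_table = build_r_table(template, reference=(0, 0))

# Image: the template shifted by (5, 3), plus one unrelated point.
image = [(x + 5, y + 3) for x, y in template] + [(9, 9)]

location, votes = best_location(ght_accumulate(image, r_table))
```

The nested voting loop is the costly part in a digital implementation, since it scales with the product of feature points and template points (and again with each scale and orientation tried), which is the workload the abstract proposes to hand to optics.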
Reading recognition of pointer meter based on pattern recognition and dynamic three-points on a line
NASA Astrophysics Data System (ADS)
Zhang, Yongqiang; Ding, Mingli; Fu, Wuyifang; Li, Yongqiang
2017-03-01
Pointer meters are frequently applied in industrial production because they are directly readable. They should be calibrated regularly to ensure the precision of the readings. Currently, manual calibration is the method most frequently adopted to verify pointer meters, and its dependence on professional skills and subjective judgment may lead to large measurement errors, poor reliability and low efficiency. In the past decades, with the development of computer technology, machine vision and digital image processing techniques have been applied to recognize the reading of dial instruments. In the existing recognition methods, all the parameters of dial instruments are assumed to be the same, which is not the case in practice. In this work, recognition of the pointer meter reading is regarded as a pattern recognition problem. We obtain the features of a small area around the detected point, make those features into a pattern, divide the certified images based on the Gradient Pyramid Algorithm, train a classifier with the support vector machine (SVM) and complete the pattern matching of the divided images. Then we get the reading of the pointer meter precisely under the theory of dynamic three points make a line (DTPML), which eliminates the error caused by tiny differences between the panels. Finally, the experimental results indicate that the proposed method is superior to state-of-the-art works.
Aging and solid shape recognition: Vision and haptics.
Norman, J Farley; Cheeseman, Jacob R; Adkins, Olivia C; Cox, Andrea G; Rogers, Connor E; Dowell, Catherine J; Baxter, Michael W; Norman, Hideko F; Reyes, Cecia M
2015-10-01
The ability of 114 younger and older adults to recognize naturally-shaped objects was evaluated in three experiments. The participants viewed or haptically explored six randomly-chosen bell peppers (Capsicum annuum) in a study session and were later required to judge whether each of twelve bell peppers was "old" (previously presented during the study session) or "new" (not presented during the study session). When recognition memory was tested immediately after study, the younger adults' (Experiment 1) performance for vision and haptics was identical when the individual study objects were presented once. Vision became superior to haptics, however, when the individual study objects were presented multiple times. When 10- and 20-min delays (Experiment 2) were inserted in between study and test sessions, no significant differences occurred between vision and haptics: recognition performance in both modalities was comparable. When the recognition performance of older adults was evaluated (Experiment 3), a negative effect of age was found for visual shape recognition (younger adults' overall recognition performance was 60% higher). There was no age effect, however, for haptic shape recognition. The results of the present experiments indicate that the visual recognition of natural object shape is different from haptic recognition in multiple ways: visual shape recognition can be superior to that of haptics and is affected by aging, while haptic shape recognition is less accurate and unaffected by aging. Copyright © 2015 Elsevier Ltd. All rights reserved.
Training improves reading speed in peripheral vision: is it due to attention?
Lee, Hye-Won; Kwon, Miyoung; Legge, Gordon E; Gefroh, Joshua J
2010-06-01
Previous research has shown that perceptual training in peripheral vision, using a letter-recognition task, increases reading speed and letter recognition (S. T. L. Chung, G. E. Legge, & S. H. Cheung, 2004). We tested the hypothesis that enhanced deployment of spatial attention to peripheral vision explains this training effect. Subjects were pre- and post-tested with 3 tasks at 10° above and below fixation: RSVP reading speed, trigram letter recognition (used to construct visual-span profiles), and deployment of spatial attention (measured as the benefit of a pre-cue for target position in a lexical-decision task). Groups of five normally sighted young adults received 4 days of trigram letter-recognition training in upper or lower visual fields, or central vision. A control group received no training. Our measure of deployment of spatial attention revealed visual-field anisotropies: better deployment of attention in the lower field than the upper, and in the lower-right quadrant compared with the other three quadrants. All subject groups exhibited slight improvement in deployment of spatial attention to peripheral vision in the post-test, but this improvement was not correlated with training-related increases in reading speed and the size of visual-span profiles. Our results indicate that improved deployment of spatial attention to peripheral vision does not account for improved reading speed and letter recognition in peripheral vision.
Research of Daily Conversation Transmitting System Based on Mouth Part Pattern Recognition
NASA Astrophysics Data System (ADS)
Watanabe, Mutsumi; Nishi, Natsuko
The authors are developing a vision-based intention transfer technique that recognizes the user's facial expressions and movements, to support free and convenient communication with aged or disabled persons who have difficulty talking, discriminating small printed characters, or operating keyboards by hand. In this paper we report a prototype system in which layered daily conversations are successively selected by recognizing the transitions in the shape of the user's mouth parts, using image sequences from a camera placed in front of the user. Four mouth part patterns are used in the system. A method that automatically recognizes these patterns by analyzing the intensity histogram data around the mouth region is newly developed. Confirmation of a selection along the way is performed by detecting the opening and shutting movements of the mouth through the temporal change in the intensity histogram data. The method has been implemented on a desktop PC in VC++ programs. Experimental results of mouth shape pattern recognition with twenty-five persons have shown the effectiveness of the method.
Vision-based obstacle recognition system for automated lawn mower robot development
NASA Astrophysics Data System (ADS)
Mohd Zin, Zalhan; Ibrahim, Ratnawati
2011-06-01
Digital image processing (DIP) techniques have recently been widely used in various types of applications. Classification and recognition of a specific object using a vision system involve challenging tasks in the fields of image processing and artificial intelligence. The ability and efficiency of a vision system to capture and process images is very important for any intelligent system such as an autonomous robot. This paper addresses the development of a vision system that could contribute to the development of an automated vision-based lawn mower robot. The work involves the implementation of DIP techniques to detect and recognize three different types of obstacles that usually exist on a football field. The focus is on the study of different types and sizes of obstacles, the development of a vision-based obstacle recognition system, and the evaluation of the system's performance. Image processing techniques such as image filtering, segmentation, enhancement, and edge detection have been applied in the system. The results show that the developed system is able to detect and recognize various types of obstacles on a football field with a recognition rate of more than 80%.
Container-code recognition system based on computer vision and deep neural networks
NASA Astrophysics Data System (ADS)
Liu, Yi; Li, Tianjian; Jiang, Li; Liang, Xiaoyao
2018-04-01
Automatic container-code recognition has become a crucial requirement for the ship transportation industry in recent years. In this paper, an automatic container-code recognition system based on computer vision and deep neural networks is proposed. The system consists of two modules, a detection module and a recognition module. The detection module applies algorithms based on both computer vision and neural networks, and generates a better detection result by combining them to avoid the drawbacks of the two methods. The combined detection results are also collected for online training of the neural networks. The recognition module exploits both character segmentation and end-to-end recognition, and outputs the recognition result that passes verification. When the recognition module generates a false recognition, the result is corrected and collected for online training of the end-to-end recognition sub-module. By combining several algorithms, the system is able to deal with more situations, and the online training mechanism can improve the performance of the neural networks at runtime. The proposed system achieves an overall recognition accuracy of 93%.
Action Recognition in a Crowded Environment
Nieuwenhuis, Judith; Bülthoff, Isabelle; Barraclough, Nick; de la Rosa, Stephan
2017-01-01
So far, action recognition has been mainly examined with small point-light human stimuli presented alone within a narrow central area of the observer’s visual field. Yet, we need to recognize the actions of life-size humans viewed alone or surrounded by bystanders, whether they are seen in central or peripheral vision. Here, we examined the mechanisms in central vision and far periphery (40° eccentricity) involved in the recognition of the actions of a life-size actor (target) and their sensitivity to the presence of a crowd surrounding the target. In Experiment 1, we used an action adaptation paradigm to probe whether static or idly moving crowds might interfere with the recognition of a target’s action (hug or clap). We found that this type of crowd, whose movements were dissimilar to the target action, hardly affected action recognition in central and peripheral vision. In Experiment 2, we examined whether crowd actions that were more similar to the target actions affected action recognition. Indeed, the presence of that crowd diminished adaptation aftereffects in central vision as well as in the periphery. We replicated Experiment 2 using a recognition task instead of an adaptation paradigm. With this task, we found evidence of decreased action recognition accuracy, but this was significant in peripheral vision only. Our results suggest that the presence of a crowd carrying out actions similar to that of the target affects its recognition. We outline how these results can be understood in terms of high-level crowding effects that operate on action-sensitive perceptual channels. PMID:29308177
Digital Images and Human Vision
NASA Technical Reports Server (NTRS)
Watson, Andrew B.; Null, Cynthia H. (Technical Monitor)
1997-01-01
Processing of digital images destined for visual consumption raises many interesting questions regarding human visual sensitivity. This talk will survey some of these questions, including some that have been answered and some that have not. There will be an emphasis upon visual masking, and a distinction will be drawn between masking due to contrast gain control processes, and due to processes such as hypothesis testing, pattern recognition, and visual search.
Identifying and detecting facial expressions of emotion in peripheral vision.
Smith, Fraser W; Rossit, Stephanie
2018-01-01
Facial expressions of emotion are signals of high biological value. Whilst recognition of facial expressions has been much studied in central vision, the ability to perceive these signals in peripheral vision has only seen limited research to date, despite the potential adaptive advantages of such perception. In the present experiment, we investigate facial expression recognition and detection performance for each of the basic emotions (plus neutral) at up to 30 degrees of eccentricity. We demonstrate, as expected, a decrease in recognition and detection performance with increasing eccentricity, with happiness and surprised being the best recognized expressions in peripheral vision. In detection however, while happiness and surprised are still well detected, fear is also a well detected expression. We show that fear is a better detected than recognized expression. Our results demonstrate that task constraints shape the perception of expression in peripheral vision and provide novel evidence that detection and recognition rely on partially separate underlying mechanisms, with the latter more dependent on the higher spatial frequency content of the face stimulus. PMID:29847562
Relationship between slow visual processing and reading speed in people with macular degeneration
Cheong, Allen MY; Legge, Gordon E; Lawrence, Mary G; Cheung, Sing-Hang; Ruff, Mary A
2007-01-01
Purpose: People with macular degeneration (MD) often read slowly even with adequate magnification to compensate for acuity loss. Oculomotor deficits may affect reading in MD, but cannot fully explain the substantial reduction in reading speed. Central-field loss (CFL) is often a consequence of macular degeneration, necessitating the use of peripheral vision for reading. We hypothesized that slower temporal processing of visual patterns in peripheral vision is a factor contributing to slow reading performance in MD patients. Methods: Fifteen subjects with MD, including 12 with CFL, and five age-matched control subjects were recruited. Maximum reading speed and critical print size were measured with RSVP (Rapid Serial Visual Presentation). Temporal processing speed was studied by measuring letter-recognition accuracy for strings of three randomly selected letters centered at fixation for a range of exposure times. Temporal threshold was defined as the exposure time yielding 80% recognition accuracy for the central letter. Results: Temporal thresholds for the MD subjects ranged from 159 to 5881 ms, much longer than values for age-matched controls in central vision (13 ms, p<0.01). The mean temporal threshold for the 11 MD subjects who used eccentric fixation (1555.8 ± 1708.4 ms) was much longer than the mean temporal threshold (97.0 ms ± 34.2 ms, p<0.01) for the age-matched controls at 10° in the lower visual field. Individual temporal thresholds accounted for 30% of the variance in reading speed (p<0.05). Conclusion: The significant association between increased temporal threshold for letter recognition and reduced reading speed is consistent with the hypothesis that slower visual processing of letter recognition is one of the factors limiting reading speed in MD subjects. PMID:17881032
NASA Technical Reports Server (NTRS)
Knasel, T. Michael
1996-01-01
The primary goal of the Adaptive Vision Laboratory Research project was to develop advanced computer vision systems for automatic target recognition. The approach used in this effort combined several machine learning paradigms including evolutionary learning algorithms, neural networks, and adaptive clustering techniques to develop the E-MORPH system. This system is capable of generating pattern recognition systems to solve a wide variety of complex recognition tasks. A series of simulation experiments were conducted using E-MORPH to solve problems in OCR, military target recognition, industrial inspection, and medical image analysis. The bulk of the funds provided through this grant were used to purchase computer hardware and software to support these computationally intensive simulations. The payoff from this effort is the reduced need for human involvement in the design and implementation of recognition systems. We have shown that the techniques used in E-MORPH are generic and readily transition to other problem domains. Specifically, E-MORPH is a multi-phase evolutionary learning system that evolves cooperative sets of feature detectors and combines their responses using an adaptive classifier to form a complete pattern recognition system. The system can operate on binary or grayscale images. In our most recent experiments, we used multi-resolution images that are formed by applying a Gabor wavelet transform to a set of grayscale input images. To begin the learning process, candidate chips are extracted from the multi-resolution images to form a training set and a test set. A population of detector sets is randomly initialized to start the evolutionary process. Using a combination of evolutionary programming and genetic algorithms, the feature detectors are enhanced to solve a recognition problem. The design of E-MORPH and recognition results for a complex problem in medical image analysis are described at the end of this report.
The specific task involves the identification of vertebrae in x-ray images of human spinal columns. This problem is extremely challenging because the individual vertebrae exhibit variation in shape, scale, orientation, and contrast. E-MORPH generated several accurate recognition systems to solve this task. This dual use of this ATR technology clearly demonstrates the flexibility and power of our approach.
Hotspot detection using image pattern recognition based on higher-order local auto-correlation
NASA Astrophysics Data System (ADS)
Maeda, Shimon; Matsunawa, Tetsuaki; Ogawa, Ryuji; Ichikawa, Hirotaka; Takahata, Kazuhiro; Miyairi, Masahiro; Kotani, Toshiya; Nojima, Shigeki; Tanaka, Satoshi; Nakagawa, Kei; Saito, Tamaki; Mimotogi, Shoji; Inoue, Soichi; Nosato, Hirokazu; Sakanashi, Hidenori; Kobayashi, Takumi; Murakawa, Masahiro; Higuchi, Tetsuya; Takahashi, Eiichi; Otsu, Nobuyuki
2011-04-01
Below the 40nm design node, systematic variation due to lithography must be taken into consideration during the early stage of design. So far, litho-aware design using lithography simulation models has been widely applied to assure that designs are printed on silicon without any error. However, the lithography simulation approach is very time consuming, and under time-to-market pressure, repetitive redesign by this approach may cause the market window to be missed. This paper proposes a fast hotspot detection support method using flexible and intelligent vision system image pattern recognition based on Higher-Order Local Autocorrelation. Our method learns the geometrical properties of the given design data without any defects as normal patterns, and automatically detects the design patterns with hotspots from the test data as abnormal patterns. The Higher-Order Local Autocorrelation method can extract features from the graphic image of a design pattern, and the computational cost of the extraction is constant regardless of the number of design pattern polygons. This approach can reduce turnaround time (TAT) dramatically even on a single CPU, compared with the conventional simulation-based approach, and with distributed processing it has proven to deliver linear scalability with each additional CPU.
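The feature extraction step named in this abstract can be illustrated with a minimal sketch, assuming a binary raster image of the layout and the textbook definition of a Higher-Order Local Autocorrelation feature as the sum over pixels r of f(r) f(r+a1) ... f(r+aN) with displacements inside a 3x3 window. The function names and the choice to stop at order 1 are illustrative assumptions, not the authors' implementation; their normal/abnormal pattern learning step is not shown.

```python
def hlac_feature(img, displacements):
    """Sum over interior pixels r of f(r) * prod_i f(r + a_i).

    img is a 2-D list of 0/1 values; displacements is a tuple of
    (d_row, d_col) offsets that stay inside a 3x3 window.
    """
    height, width = len(img), len(img[0])
    total = 0
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            product = img[r][c]
            for d_row, d_col in displacements:
                product *= img[r + d_row][c + d_col]
            total += product
    return total


# Order 0 (no displacement) plus the four order-1 displacements that are
# distinct up to translation. Order-2 masks are omitted in this sketch.
ORDER_0_AND_1 = [(), ((0, 1),), ((-1, 1),), ((-1, 0),), ((-1, -1),)]


def hlac_vector(img):
    """Return the order-0 and order-1 HLAC features of a binary image."""
    return [hlac_feature(img, d) for d in ORDER_0_AND_1]


if __name__ == "__main__":
    square = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    print(hlac_vector(square))
```

Because the loops run over pixels of the rasterized image, the cost in this sketch depends on the image size and the number of masks rather than on how many polygons were drawn into the image, which is consistent with the constant-cost claim in the abstract.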
Exploring Techniques for Vision Based Human Activity Recognition: Methods, Systems, and Evaluation
Xu, Xin; Tang, Jinshan; Zhang, Xiaolong; Liu, Xiaoming; Zhang, Hong; Qiu, Yimin
2013-01-01
With the wide applications of vision based intelligent systems, image and video analysis technologies have attracted the attention of researchers in the computer vision field. In image and video analysis, human activity recognition is an important research direction. By interpreting and understanding human activities, we can recognize and predict the occurrence of crimes and help the police or other agencies react immediately. In the past, a large number of papers have been published on human activity recognition in video and image sequences. In this paper, we provide a comprehensive survey of the recent development of the techniques, including methods, systems, and quantitative evaluation of the performance of human activity recognition. PMID:23353144
Milestones on the road to independence for the blind
NASA Astrophysics Data System (ADS)
Reed, Kenneth
1997-02-01
Ken will talk about his experiences as an end user of technology. Even moderate technological progress in the field of pattern recognition and artificial intelligence can be, often surprisingly, of great help to the blind. An example is the providing of portable bar code scanners so that a blind person knows what he is buying and what color it is. In this age of microprocessors controlling everything, how can a blind person find out what his VCR is doing? Is there some technique that will allow a blind musician to convert print music into midi files to drive a synthesizer? Can computer vision help the blind cross a road including predictions of where oncoming traffic will be located? Can computer vision technology provide spoken description of scenes so a blind person can figure out where doors and entrances are located, and what the signage on the building says? He asks 'can computer vision help me flip a pancake?' His challenge to those in the computer vision field is 'where can we go from here?'
Using an Augmented Reality Device as a Distance-based Vision Aid-Promise and Limitations.
Kinateder, Max; Gualtieri, Justin; Dunn, Matt J; Jarosz, Wojciech; Yang, Xing-Dong; Cooper, Emily A
2018-06-06
For people with limited vision, wearable displays hold the potential to digitally enhance visual function. As these display technologies advance, it is important to understand their promise and limitations as vision aids. The aim of this study was to test the potential of a consumer augmented reality (AR) device for improving the functional vision of people with near-complete vision loss. An AR application that translates spatial information into high-contrast visual patterns was developed. Two experiments assessed the efficacy of the application to improve vision: an exploratory study with four visually impaired participants and a main controlled study with participants with simulated vision loss (n = 48). In both studies, performance was tested on a range of visual tasks (identifying the location, pose and gesture of a person, identifying objects, and moving around in an unfamiliar space). Participants' accuracy and confidence were compared on these tasks with and without augmented vision, as well as their subjective responses about ease of mobility. In the main study, the AR application was associated with substantially improved accuracy and confidence in object recognition (all P < .001) and to a lesser degree in gesture recognition (P < .05). There was no significant change in performance on identifying body poses or in subjective assessments of mobility, as compared with a control group. Consumer AR devices may soon be able to support applications that improve the functional vision of users for some tasks. In our study, both artificially impaired participants and participants with near-complete vision loss performed tasks that they could not do without the AR system. 
Current limitations in system performance and form factor, as well as the risk of overconfidence, will need to be overcome.
NASA Astrophysics Data System (ADS)
Zamora Ramos, Ernesto
Artificial Intelligence is a big part of automation and with today's technological advances, artificial intelligence has taken great strides towards positioning itself as the technology of the future to control, enhance and perfect automation. Computer vision includes pattern recognition and classification and machine learning. Computer vision is at the core of decision making and it is a vast and fruitful branch of artificial intelligence. In this work, we expose novel algorithms and techniques built upon existing technologies to improve pattern recognition and neural network training, initially motivated by a multidisciplinary effort to build a robot that helps maintain and optimize solar panel energy production. Our contributions detail an improved non-linear pre-processing technique to enhance poorly illuminated images based on modifications to the standard histogram equalization for an image. While the original motivation was to improve nocturnal navigation, the results have applications in surveillance, search and rescue, medical imaging enhancing, and many others. We created a vision system for precise camera distance positioning motivated to correctly locate the robot for capture of solar panel images for classification. The classification algorithm marks solar panels as clean or dirty for later processing. Our algorithm extends past image classification and, based on historical and experimental data, it identifies the optimal moment in which to perform maintenance on marked solar panels as to minimize the energy and profit loss. In order to improve upon the classification algorithm, we delved into feedforward neural networks because of their recent advancements, proven universal approximation and classification capabilities, and excellent recognition rates. 
We explore state-of-the-art neural network training techniques, offering pointers and insights, culminating in the implementation of a complete library with support for modern deep learning architectures, multilayer perceptrons, and convolutional neural networks. Our research with neural networks has encountered a great deal of difficulty regarding hyperparameter estimation for good training convergence rate and accuracy. Most hyperparameters, including architecture, learning rate, regularization, trainable parameter (or weight) initialization, and so on, are chosen via a trial and error process with some educated guesses. However, we developed the first quantitative method to compare weight initialization strategies, a critical hyperparameter choice during training, to estimate which among a group of candidate strategies would, with high probability, make the network converge to the highest classification accuracy fastest. Our method provides a quick, objective measure to compare initialization strategies and select the best among them beforehand, without having to complete multiple training sessions for each candidate strategy to compare final results.
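For context on the pre-processing contribution, the following is a minimal sketch of standard histogram equalization, the baseline that the abstract says was modified. It assumes an 8-bit grayscale image stored as a 2-D list of integers; the function name is hypothetical, and the author's non-linear modifications for poorly illuminated images are not reproduced here.

```python
def equalize_histogram(img, levels=256):
    """Standard histogram equalization of a 2-D list of integer gray levels."""
    flat = [v for row in img for v in row]

    # Histogram of gray levels.
    hist = [0] * levels
    for v in flat:
        hist[v] += 1

    # Cumulative distribution (as raw counts).
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = min(c for c in cdf if c > 0)
    span = len(flat) - cdf_min
    if span == 0:
        # Constant image: nothing to stretch.
        return [row[:] for row in img]

    # Map each level so that the occupied levels spread over [0, levels - 1];
    # integer arithmetic with rounding to the nearest level.
    lut = [max(0, ((c - cdf_min) * (levels - 1) + span // 2) // span) for c in cdf]
    return [[lut[v] for v in row] for row in img]


if __name__ == "__main__":
    dark = [[50, 50, 50], [51, 52, 53]]
    print(equalize_histogram(dark))
```

In the small example, four closely spaced dark levels are spread over the full 0 to 255 range, which is the contrast-stretching effect that makes equalization a natural starting point for low-light enhancement.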
Improved word recognition for observers with age-related maculopathies using compensation filters
NASA Technical Reports Server (NTRS)
Lawton, Teri B.
1988-01-01
A method for improving word recognition for people with age-related maculopathies, which cause a loss of central vision, is discussed. It is found that the use of individualized compensation filters based on a person's normalized contrast sensitivity function can improve word recognition for people with age-related maculopathies. It is shown that 27-70 percent more magnification is needed for unfiltered words compared to filtered words. The improvement in word recognition is positively correlated with the severity of vision loss.
Fast Legendre moment computation for template matching
NASA Astrophysics Data System (ADS)
Li, Bing C.
2017-05-01
Normalized cross correlation (NCC) based template matching is insensitive to intensity changes and has many applications in image processing, object detection, video tracking, and pattern recognition. However, implementing normalized cross correlation is computationally expensive since it involves both correlation computation and normalization. In this paper, we propose a Legendre moment approach for fast normalized cross correlation and show that the computational cost of the proposed approach is independent of template mask size, making it significantly faster than traditional mask-size-dependent approaches, especially for large mask templates. Legendre polynomials have been widely used in solving the Laplace equation in electrodynamics in spherical coordinate systems, and in solving the Schrodinger equation in quantum mechanics. In this paper, we extend Legendre polynomials from physics to the fields of computer vision and pattern recognition, and demonstrate that Legendre polynomials can help to reduce the computational cost of NCC based template matching significantly.
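To make the baseline concrete, here is a minimal sketch of the traditional, mask-size-dependent NCC template matching that the abstract compares against. It is not the Legendre moment method, and the function names are illustrative assumptions. Images are plain 2-D lists so the sketch needs only the standard library.

```python
import math


def ncc(patch, template):
    """Zero-mean normalized cross correlation of two equal-size 2-D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p = sum(p) / len(p)
    mean_t = sum(t) / len(t)
    numerator = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    denominator = math.sqrt(
        sum((a - mean_p) ** 2 for a in p) * sum((b - mean_t) ** 2 for b in t)
    )
    # A flat patch or template has no structure to correlate with.
    return numerator / denominator if denominator > 0 else 0.0


def match_template(image, template):
    """Exhaustive search; returns (row, col, score) of the best NCC match."""
    t_h, t_w = len(template), len(template[0])
    best = (0, 0, -2.0)
    for r in range(len(image) - t_h + 1):
        for c in range(len(image[0]) - t_w + 1):
            patch = [row[c:c + t_w] for row in image[r:r + t_h]]
            score = ncc(patch, template)
            if score > best[2]:
                best = (r, c, score)
    return best


if __name__ == "__main__":
    template = [[1, 2], [3, 4]]
    # The template appears at row 1, column 2 with gain 2 and offset 10.
    image = [
        [10, 10, 10, 10, 10],
        [10, 10, 12, 14, 10],
        [10, 10, 16, 18, 10],
        [10, 10, 10, 10, 10],
    ]
    print(match_template(image, template))
```

The example embeds a brightened, contrast-scaled copy of the template, and the match is still found with a score of 1, which illustrates the insensitivity to intensity changes. The work per candidate position grows with the template area, which is the cost the abstract reports removing.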
Generic decoding of seen and imagined objects using hierarchical visual features.
Horikawa, Tomoyasu; Kamitani, Yukiyasu
2017-05-22
Object recognition is a key function in both human and machine vision. While brain decoding of seen and imagined objects has been achieved, the prediction is limited to training examples. We present a decoding approach for arbitrary objects using the machine vision principle that an object category is represented by a set of features rendered invariant through hierarchical processing. We show that visual features, including those derived from a deep convolutional neural network, can be predicted from fMRI patterns, and that greater accuracy is achieved for low-/high-level features with lower-/higher-level visual areas, respectively. Predicted features are used to identify seen/imagined object categories (extending beyond decoder training) from a set of computed features for numerous object images. Furthermore, decoding of imagined objects reveals progressive recruitment of higher-to-lower visual representations. Our results demonstrate a homology between human and machine vision and its utility for brain-based information retrieval.
Sparse Representation of Multimodality Sensing Databases for Data Mining and Retrieval
2015-04-09
Savarese. Estimating the Aspect Layout of Object Categories, IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 19-JUN-12...Time Equivalent (FTE) support provided by this agreement, and total for each category): (a) Graduate Students Liang Mei, 50% FTE, EE : systems...PhD candidate Min Sun, 50% FTE, EE : systems, PhD candidate Yu Xiang, 50% FTE, EE : systems, PhD candidate Dae Yon Jung, 50% FTE, EE : systems, PhD
Computer Vision for Artificially Intelligent Robotic Systems
NASA Astrophysics Data System (ADS)
Ma, Chialo; Ma, Yung-Lung
1987-04-01
In this paper, an Acoustic Imaging Recognition System (AIRS) is introduced, which is installed on an intelligent robotic system and can recognize different types of hand tools by dynamic pattern recognition. The dynamic pattern recognition is approached with a look-up table method in this case; the method saves a lot of calculation time and is practicable. The AIRS consists of four parts: a position control unit, a pulse-echo signal processing unit, a pattern recognition unit, and a main control unit. The position control of the AIRS can rotate through an angle of ±5 degrees horizontally and vertically, separately; the purpose of the rotation is to find the area of maximum reflection intensity. From the distance, angles, and intensity of the target we can decide the characteristics of the target; all decisions are processed by the main control unit. In the pulse-echo signal processing unit, we utilize the correlation method to overcome the limitation of short ultrasonic bursts, because the correlation system can transmit large time-bandwidth signals and obtain their resolution and increased intensity through pulse compression in the correlation receiver. The output of the correlator is sampled and converted into digital data by the μ-law coding method, and this data, together with the delay time T and the angle information θH, θV, is sent to the main control unit for further analysis. For the recognition process in this paper, we use a dynamic look-up table method: first we set up several recognition pattern tables, and then the new pattern scanned by the transducer array is divided into several stages and compared with the sampling table. The comparison is implemented by dynamic programming and a Markovian process.
All the hardware control signals, such as the optimum delay time for the correlator receiver and the horizontal and vertical rotation angles for the transducer plate, are controlled by the main control unit, which also handles the pattern recognition process. The distance from the target to the transducer plate is limited by the power and beam angle of the transducer elements; in this AIRS model, we use a narrow-beam transducer whose input voltage is 50 V p-p. A robot equipped with AIRS can not only measure the distance to the target but also recognize a three-dimensional image of the target from the image lab of the robot memory. Index items: Acoustic system, Supersonic transducer, Dynamic programming, Look-up table, Image processing, Pattern recognition, Quad tree, Quad approach.
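The correlation receiver described in this abstract can be illustrated with a minimal sketch of delay estimation by cross-correlation. The abstract does not specify the transmitted waveform, so the 7-element Barker code used here is an assumption chosen only because its autocorrelation has a sharp peak; the function names are likewise hypothetical.

```python
def cross_correlate(received, reference):
    """Sliding dot product of the reference against the received samples."""
    n, m = len(received), len(reference)
    return [
        sum(received[lag + k] * reference[k] for k in range(m))
        for lag in range(n - m + 1)
    ]


def estimate_delay(received, reference):
    """Return the lag (in samples) at which the correlator output peaks."""
    corr = cross_correlate(received, reference)
    return max(range(len(corr)), key=lambda lag: corr[lag])


if __name__ == "__main__":
    # Assumed transmit waveform: the length-7 Barker code.
    barker7 = [1, 1, 1, -1, -1, 1, -1]

    # Simulated echo: attenuated copy of the code arriving 6 samples late.
    received = [0.0] * 20
    for k, chip in enumerate(barker7):
        received[6 + k] = 0.5 * chip

    print(estimate_delay(received, barker7))
```

The energy of the whole coded burst is concentrated into a single correlator peak, which is the pulse-compression effect the abstract relies on. Converting the estimated lag to a distance would additionally require the sample rate and the speed of sound, neither of which is given in the abstract.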
Development of a battery of functional tests for low vision.
Dougherty, Bradley E; Martin, Scott R; Kelly, Corey B; Jones, Lisa A; Raasch, Thomas W; Bullimore, Mark A
2009-08-01
We describe the development and evaluation of a battery of tests of functional visual performance of everyday tasks intended to be suitable for assessment of low vision patients. The functional test battery comprises: Reading rate: reading aloud 20 unrelated words for each of four print sizes (8, 4, 2, & 1 M); Telephone book: finding a name and reading the telephone number; Medicine bottle label: reading the name and dosing; Utility bill: reading the due date and amount due; Cooking instructions: reading cooking time on a food package; Coin sorting: making a specified amount from coins placed on a table; Playing card recognition: identifying denomination and suit; and Face recognition: identifying expressions of printed, life-size faces at 1 and 3 m. All tests were timed except face and playing card recognition. Fourteen normally sighted and 24 low vision subjects were assessed with the functional test battery. Visual acuity, contrast sensitivity, and quality of life (National Eye Institute Visual Function Questionnaire 25 [NEI-VFQ 25]) were measured and the functional tests repeated. Subsequently, 23 low vision patients participated in a pilot randomized clinical trial with half receiving low vision rehabilitation and half a delayed intervention. The functional tests were administered at enrollment and 3 months later. Normally sighted subjects could perform all tasks but the proportion of trials performed correctly by the low vision subjects ranged from 35% for face recognition at 3 m, to 95% for the playing card identification. On average, low vision subjects performed three times slower than the normally sighted subjects. Timed tasks with a visual search component showed poorer repeatability. In the pilot clinical trial, low vision rehabilitation produced the greatest improvement for the medicine bottle and cooking instruction tasks. Performance of patients on these functional tests has been assessed. Some appear responsive to low vision rehabilitation.
NASA Astrophysics Data System (ADS)
Anwer, Rao Muhammad; Khan, Fahad Shahbaz; van de Weijer, Joost; Molinier, Matthieu; Laaksonen, Jorma
2018-04-01
Designing discriminative powerful texture features robust to realistic imaging conditions is a challenging computer vision problem with many applications, including material recognition and analysis of satellite or aerial imagery. In the past, most texture description approaches were based on dense orderless statistical distribution of local features. However, most recent approaches to texture recognition and remote sensing scene classification are based on Convolutional Neural Networks (CNNs). The de facto practice when learning these CNN models is to use RGB patches as input with training performed on large amounts of labeled data (ImageNet). In this paper, we show that Local Binary Patterns (LBP) encoded CNN models, codenamed TEX-Nets, trained using mapped coded images with explicit LBP based texture information provide complementary information to the standard RGB deep models. Additionally, two deep architectures, namely early and late fusion, are investigated to combine the texture and color information. To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification. We perform comprehensive experiments on four texture recognition datasets and four remote sensing scene classification benchmarks: UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with 7 categories and the recently introduced large scale aerial image dataset (AID) with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary information to standard RGB deep model of the same network architecture. Our late fusion TEX-Net architecture always improves the overall performance compared to the standard RGB network on both recognition problems. Furthermore, our final combination leads to consistent improvement over the state-of-the-art for remote sensing scene classification.
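For readers unfamiliar with Local Binary Patterns, the following is a minimal sketch of the basic 8-neighbour LBP operator on a grey-level image. It is not the TEX-Net pipeline: the abstract does not specify how the coded images are mapped before being fed to the CNN, so the function names, the neighbour ordering, and the `>=` threshold convention here are illustrative assumptions.

```python
def lbp_code(patch):
    """Return the 8-bit LBP code of the centre pixel of a 3x3 patch."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner
    # (the ordering is a convention chosen for this sketch).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_image(img):
    """Map a 2D grey-level image (list of lists) to LBP codes of its interior pixels."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            patch = [img[y + dy][x - 1:x + 2] for dy in (-1, 0, 1)]
            row.append(lbp_code(patch))
        out.append(row)
    return out
```

Each interior pixel is replaced by a code in the range 0 to 255 that depends only on the sign of local intensity differences, which is why LBP-coded inputs can carry texture cues that differ from raw RGB values.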
Al-Marri, Faraj; Reza, Faruque; Begum, Tahamina; Hitam, Wan Hazabbah Wan; Jin, Goh Khean; Xiang, Jing
2017-10-25
Visual cognitive function is important to build up executive function in daily life. Perception of visual Number form (e.g., Arabic digit) and numerosity (magnitude of the Number) is of interest to cognitive neuroscientists. Neural correlates and the functional measurement of Number representations are complex occurrences when their semantic categories are assimilated with other concepts of shape and colour. Colour perception can be processed further to modulate visual cognition. The Ishihara pseudoisochromatic plates are one of the best and most common screening tools for basic red-green colour vision testing. However, there is a lack of study of visual cognitive function assessment using these pseudoisochromatic plates. We recruited 25 healthy normal trichromat volunteers and extended these studies using a 128-sensor net to record event-related EEG. Subjects were asked to respond by pressing Numbered buttons when they saw the Number and Non-number plates of the Ishihara colour vision test. Amplitudes and latencies of N100 and P300 event related potential (ERP) components were analysed from 19 electrode sites in the international 10-20 system. A brain topographic map, cortical activation patterns and Granger causation (effective connectivity) were analysed from 128 electrode sites. No major significant differences between N100 ERP components in either stimulus indicate early selective attention processing was similar for Number and Non-number plate stimuli, but Non-number plate stimuli evoked significantly higher amplitudes, longer latencies of the P300 ERP component with a slower reaction time compared to Number plate stimuli imply the allocation of attentional load was more in Non-number plate processing. A different pattern of asymmetric scalp voltage map was noticed for P300 components with a higher intensity in the left hemisphere for Number plate tasks and higher intensity in the right hemisphere for Non-number plate tasks. 
Asymmetric cortical activation and connectivity patterns revealed that Number recognition occurred in the occipital and left frontal areas whereas the consequence was limited to the occipital area during the Non-number plate processing. Finally, the results showed that the visual recognition of Numbers dissociates from the recognition of Non-numbers at the level of defined neural networks. Number recognition was not only a process of visual perception and attention, but it was also related to a higher level of cognitive function, that of language.
Velocity and Structure Estimation of a Moving Object Using a Moving Monocular Camera
2006-01-01
map the Euclidean position of static landmarks or visual features in the environment. Recent applications of this technique include aerial...From Motion in a Piecewise Planar Environment,” International Journal of Pattern Recognition and Artificial Intelligence, Vol. 2, No. 3, pp. 485-508...1988. [9] J. M. Ferryman, S. J. Maybank, and A. D. Worrall, “Visual Surveillance for Moving Vehicles,” Intl. Journal of Computer Vision, Vol. 37, No
Scheirer, Walter J; de Rezende Rocha, Anderson; Sapkota, Archana; Boult, Terrance E
2013-07-01
To date, almost all experimental evaluations of machine learning-based recognition algorithms in computer vision have taken the form of "closed set" recognition, whereby all testing classes are known at training time. A more realistic scenario for vision applications is "open set" recognition, where incomplete knowledge of the world is present at training time, and unknown classes can be submitted to an algorithm during testing. This paper explores the nature of open set recognition and formalizes its definition as a constrained minimization problem. The open set recognition problem is not well addressed by existing algorithms because it requires strong generalization. As a step toward a solution, we introduce a novel "1-vs-set machine," which sculpts a decision space from the marginal distances of a 1-class or binary SVM with a linear kernel. This methodology applies to several different applications in computer vision where open set recognition is a challenging problem, including object recognition and face verification. We consider both in this work, with large scale cross-dataset experiments performed over the Caltech 256 and ImageNet sets, as well as face matching experiments performed over the Labeled Faces in the Wild set. The experiments highlight the effectiveness of machines adapted for open set evaluation compared to existing 1-class and binary SVMs for the same tasks.
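The difference between a closed-set half-space decision and an open-set bounded decision can be illustrated with a small sketch. It assumes a linear scorer with weights `w` and bias `b` has already been trained, and it bounds the accepted region by a near and a far threshold on the score. The threshold-fitting rule used here (the range of positive training scores, optionally widened by `margin`) is a simplification introduced for illustration; the abstract does not describe how the 1-vs-set machine places its boundaries, and all function names are hypothetical.

```python
def linear_score(w, b, x):
    """Signed score of sample x under a linear model (w, b)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def fit_slab(w, b, positives, margin=0.0):
    """Pick near/far thresholds from the scores of known positive samples.

    Assumption for this sketch: the thresholds are simply the score range
    of the positives, widened by `margin` on both sides.
    """
    scores = [linear_score(w, b, x) for x in positives]
    return min(scores) - margin, max(scores) + margin


def slab_predict(w, b, x, near, far):
    """Accept x as the known class only if its score lies inside [near, far]."""
    return near <= linear_score(w, b, x) <= far


# Usage: a sample far beyond the training data is rejected, whereas a plain
# half-space rule (score >= near) would have accepted it.
w, b = [1.0, 0.0], 0.0
near, far = fit_slab(w, b, [[1.0, 0.0], [2.0, 5.0], [3.0, -1.0]])
inside = slab_predict(w, b, [2.0, 0.0], near, far)
far_away = slab_predict(w, b, [100.0, 0.0], near, far)
```

Bounding the positive region on both sides limits how much unexplored feature space is labelled as the known class, which is the intuition behind open set evaluation.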
Identification and location of catenary insulator in complex background based on machine vision
NASA Astrophysics Data System (ADS)
Yao, Xiaotong; Pan, Yingli; Liu, Li; Cheng, Xiao
2018-04-01
Locating the insulator precisely is an important prerequisite for fault detection. Because current algorithms for locating insulators in catenary inspection images are not accurate, a target recognition and localization method based on binocular vision combined with SURF features is proposed. First, because the insulator is located in a complex environment, SURF features are used to recognize the target and achieve coarse positioning. Then the binocular vision principle is used to calculate the 3D coordinates of the coarsely located object, achieving target recognition and fine localization. Finally, the 3D coordinates of the object's center of mass are stored and transferred to the inspection robot to control its detection position. Experimental results demonstrate that the proposed method has better recognition efficiency and accuracy, can successfully identify the target, and has definite application value.
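The binocular step can be illustrated with the standard pinhole triangulation for a rectified stereo pair, where depth is Z = f·B/d for focal length f (pixels), baseline B, and disparity d. This is a minimal sketch under those assumptions, not the authors' implementation; the function name, parameter names, and the camera values in the usage lines are hypothetical.

```python
def triangulate(u_left, v_left, u_right, focal_px, baseline_m, cx, cy):
    """Recover (X, Y, Z) in metres, in the left camera frame.

    Assumes a rectified stereo pair: the matched point lies on the same
    image row in both views, so only the column difference matters.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    z = focal_px * baseline_m / disparity   # depth from similar triangles
    x = (u_left - cx) * z / focal_px        # back-project the column offset
    y = (v_left - cy) * z / focal_px        # back-project the row offset
    return x, y, z


# Usage with made-up camera parameters: a 20-pixel disparity at
# f = 800 px and B = 0.1 m gives a depth of 4 m.
point = triangulate(340, 260, 320, focal_px=800, baseline_m=0.1, cx=320, cy=240)
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters more for distant objects, which is one reason a coarse feature-based localization is usually followed by a finer stereo estimate.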
Preserved Haptic Shape Processing after Bilateral LOC Lesions.
Snow, Jacqueline C; Goodale, Melvyn A; Culham, Jody C
2015-10-07
The visual and haptic perceptual systems are understood to share a common neural representation of object shape. A region thought to be critical for recognizing visual and haptic shape information is the lateral occipital complex (LOC). We investigated whether LOC is essential for haptic shape recognition in humans by studying behavioral responses and brain activation for haptically explored objects in a patient (M.C.) with bilateral lesions of the occipitotemporal cortex, including LOC. Despite severe deficits in recognizing objects using vision, M.C. was able to accurately recognize objects via touch. M.C.'s psychophysical response profile to haptically explored shapes was also indistinguishable from controls. Using fMRI, M.C. showed no object-selective visual or haptic responses in LOC, but her pattern of haptic activation in other brain regions was remarkably similar to healthy controls. Although LOC is routinely active during visual and haptic shape recognition tasks, it is not essential for haptic recognition of object shape. The lateral occipital complex (LOC) is a brain region regarded to be critical for recognizing object shape, both in vision and in touch. However, causal evidence linking LOC with haptic shape processing is lacking. We studied recognition performance, psychophysical sensitivity, and brain response to touched objects, in a patient (M.C.) with extensive lesions involving LOC bilaterally. Despite being severely impaired in visual shape recognition, M.C. was able to identify objects via touch and she showed normal sensitivity to a haptic shape illusion. M.C.'s brain response to touched objects in areas of undamaged cortex was also very similar to that observed in neurologically healthy controls. These results demonstrate that LOC is not necessary for recognizing objects via touch. Copyright © 2015 the authors 0270-6474/15/3513745-16$15.00/0.
Proceedings of the 1986 IEEE international conference on systems, man and cybernetics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
1986-01-01
This book presents the papers given at a conference on man-machine systems. Topics considered at the conference included neural model-based cognitive theory and engineering, user interfaces, adaptive and learning systems, human interaction with robotics, decision making, the testing and evaluation of expert systems, software development, international conflict resolution, intelligent interfaces, automation in man-machine system design aiding, knowledge acquisition in expert systems, advanced architectures for artificial intelligence, pattern recognition, knowledge bases, and machine vision.
Tensor Rank Preserving Discriminant Analysis for Facial Recognition.
Tao, Dapeng; Guo, Yanan; Li, Yaotang; Gao, Xinbo
2017-10-12
Facial recognition, one of the basic topics in computer vision and pattern recognition, has received substantial attention in recent years. However, for those traditional facial recognition algorithms, the facial images are reshaped to a long vector, thereby losing part of the original spatial constraints of each pixel. In this paper, a new tensor-based feature extraction algorithm termed tensor rank preserving discriminant analysis (TRPDA) for facial image recognition is proposed; the proposed method involves two stages: in the first stage, the low-dimensional tensor subspace of the original input tensor samples was obtained; in the second stage, discriminative locality alignment was utilized to obtain the ultimate vector feature representation for subsequent facial recognition. On the one hand, the proposed TRPDA algorithm fully utilizes the natural structure of the input samples, and it applies an optimization criterion that can directly handle the tensor spectral analysis problem, thereby decreasing the computation cost compared with those traditional tensor-based feature selection algorithms. On the other hand, the proposed TRPDA algorithm extracts features by finding a tensor subspace that preserves most of the rank order information of the intra-class input samples. Experiments on the three facial databases are performed here to determine the effectiveness of the proposed TRPDA algorithm.
Aguilar, Mario; Peot, Mark A; Zhou, Jiangying; Simons, Stephen; Liao, Yuwei; Metwalli, Nader; Anderson, Mark B
2012-03-01
The mammalian visual system is still the gold standard for recognition accuracy, flexibility, efficiency, and speed. Ongoing advances in our understanding of function and mechanisms in the visual system can now be leveraged to pursue the design of computer vision architectures that will revolutionize the state of the art in computer vision.
Vision requirements for Space Station applications
NASA Technical Reports Server (NTRS)
Crouse, K. R.
1985-01-01
Problems which will be encountered by computer vision systems in Space Station operations are discussed, along with solutions being examined at Johnson Space Station. Lighting cannot be controlled in space, nor can the random presence of reflective surfaces. Task-oriented capabilities are to include docking to moving objects, identification of unexpected objects during autonomous flights to different orbits, and diagnoses of damage and repair requirements for autonomous Space Station inspection robots. The approaches being examined to provide these and other capabilities are television IR sensors, advanced pattern recognition programs feeding on data from laser probes, laser radar for robot eyesight and arrays of SMART sensors for automated location and tracking of target objects. Attention is also being given to liquid crystal light valves for optical processing of images for comparisons with on-board electronic libraries of images.
Feature extraction inspired by V1 in visual cortex
NASA Astrophysics Data System (ADS)
Lv, Chao; Xu, Yuelei; Zhang, Xulei; Ma, Shiping; Li, Shuai; Xin, Peng; Zhu, Mingning; Ma, Hongqiang
2018-04-01
Target feature extraction plays an important role in pattern recognition. It is the most complicated activity in the brain mechanism of biological vision. Inspired by the properties of the primary visual cortex (V1) in extracting dynamic and static features, a visual perception model was proposed. Firstly, 28 spatial-temporal filters with different orientations, a half-squaring operation and divisive normalization were adopted to obtain the responses of V1 simple cells; then, an adjustable parameter was added to the output weight so that the response of complex cells was obtained. Experimental results indicate that the proposed V1 model can perceive motion information well. Besides, it has a good edge detection capability. The model inspired by V1 has good performance in feature extraction and effectively combines brain-inspired intelligence with computer vision.
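The two nonlinearities named in the abstract can be sketched in a few lines. Half-squaring rectifies a linear filter output and squares it, and divisive normalization divides each unit's energy by the pooled energy of the population plus a constant. The exact pooling set and constants used in the paper are not given in the abstract, so pooling over all units and the `sigma` parameter below are assumptions made for illustration.

```python
def half_square(x):
    """Half-wave rectify, then square: negative filter outputs give zero."""
    return max(0.0, x) ** 2


def divisive_normalization(linear_responses, sigma=1.0):
    """Normalize each half-squared response by the pooled population energy.

    Assumption for this sketch: the normalization pool is the sum over all
    supplied units, and `sigma` is a semi-saturation constant.
    """
    energies = [half_square(r) for r in linear_responses]
    pool = sigma ** 2 + sum(energies)
    return [e / pool for e in energies]


# Usage: three hypothetical filter outputs; the negative one is silenced.
normalized = divisive_normalization([3.0, -2.0, 1.0], sigma=1.0)
```

With this form, every output stays below 1 however strong the input, so the relative pattern across orientations is preserved while the overall gain is compressed.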
Log-Gabor Weber descriptor for face recognition
NASA Astrophysics Data System (ADS)
Li, Jing; Sang, Nong; Gao, Changxin
2015-09-01
The Log-Gabor transform, which is suitable for analyzing gradually changing data such as in iris and face images, has been widely used in image processing, pattern recognition, and computer vision. In most cases, only the magnitude or phase information of the Log-Gabor transform is considered. However, the complementary effect taken by combining magnitude and phase information simultaneously for an image-feature extraction problem has not been systematically explored in the existing works. We propose a local image descriptor for face recognition, called Log-Gabor Weber descriptor (LGWD). The novelty of our LGWD is twofold: (1) to fully utilize the information from the magnitude or phase feature of multiscale and orientation Log-Gabor transform, we apply the Weber local binary pattern operator to each transform response. (2) The encoded Log-Gabor magnitude and phase information are fused at the feature level by utilizing kernel canonical correlation analysis strategy, considering that feature level information fusion is effective when the modalities are correlated. Experimental results on the AR, Extended Yale B, and UMIST face databases, compared with those available from recent experiments reported in the literature, show that our descriptor yields a better performance than state-of-the art methods.
Quality detection system and method of micro-accessory based on microscopic vision
NASA Astrophysics Data System (ADS)
Li, Dongjie; Wang, Shiwei; Fu, Yu
2017-10-01
Considering that the traditional manual detection of micro-accessory has some problems, such as heavy workload, low efficiency and large artificial error, a kind of quality inspection system of micro-accessory has been designed. Micro-vision technology has been used to inspect quality, which optimizes the structure of the detection system. The stepper motor is used to drive the rotating micro-platform to transfer quarantine device and the microscopic vision system is applied to get graphic information of micro-accessory. The methods of image processing and pattern matching, the variable scale Sobel differential edge detection algorithm and the improved Zernike moments sub-pixel edge detection algorithm are combined in the system in order to achieve a more detailed and accurate edge of the defect detection. The grade at the edge of the complex signal can be achieved accurately by extracting through the proposed system, and then it can distinguish the qualified products and unqualified products with high precision recognition.
Deep hierarchies in the primate visual cortex: what can we learn for computer vision?
Krüger, Norbert; Janssen, Peter; Kalkan, Sinan; Lappe, Markus; Leonardis, Ales; Piater, Justus; Rodríguez-Sánchez, Antonio J; Wiskott, Laurenz
2013-08-01
Computational modeling of the primate visual system yields insights of potential relevance to some of the challenges that computer vision is facing, such as object recognition and categorization, motion detection and activity recognition, or vision-based navigation and manipulation. This paper reviews some functional principles and structures that are generally thought to underlie the primate visual cortex, and attempts to extract biological principles that could further advance computer vision research. Organized for a computer vision audience, we present functional principles of the processing hierarchies present in the primate visual system considering recent discoveries in neurophysiology. The hierarchical processing in the primate visual system is characterized by a sequence of different levels of processing (on the order of 10) that constitute a deep hierarchy in contrast to the flat vision architectures predominantly used in today's mainstream computer vision. We hope that the functional description of the deep hierarchies realized in the primate visual system provides valuable insights for the design of computer vision algorithms, fostering increasingly productive interaction between biological and computer vision research.
Deep Learning for Computer Vision: A Brief Review
Doulamis, Nikolaos; Doulamis, Anastasios; Protopapadakis, Eftychios
2018-01-01
Over the last years deep learning methods have been shown to outperform previous state-of-the-art machine learning techniques in several fields, with computer vision being one of the most prominent cases. This review paper provides a brief overview of some of the most significant deep learning schemes used in computer vision problems, that is, Convolutional Neural Networks, Deep Boltzmann Machines and Deep Belief Networks, and Stacked Denoising Autoencoders. A brief account of their history, structure, advantages, and limitations is given, followed by a description of their applications in various computer vision tasks, such as object detection, face recognition, action and activity recognition, and human pose estimation. Finally, a brief overview is given of future directions in designing deep learning schemes for computer vision problems and the challenges involved therein. PMID:29487619
Artificial vision by multi-layered neural networks: neocognitron and its advances.
Fukushima, Kunihiko
2013-01-01
The neocognitron is a neural network model proposed by Fukushima (1980). Its architecture was suggested by neurophysiological findings on the visual systems of mammals. It is a hierarchical multi-layered network. It acquires the ability to robustly recognize visual patterns through learning. Although the neocognitron has a long history, modifications of the network to improve its performance are still going on. For example, a recent neocognitron uses a new learning rule, named add-if-silent, which makes the learning process much simpler and more stable. Nevertheless, a high recognition rate can be kept with a smaller scale of the network. Referring to the history of the neocognitron, this paper discusses recent advances in the neocognitron. We also show that various new functions can be realized by, for example, introducing top-down connections to the neocognitron: mechanism of selective attention, recognition and completion of partly occluded patterns, restoring occluded contours, and so on. Copyright © 2012 Elsevier Ltd. All rights reserved.
Artificial Neural Networks for Processing Graphs with Application to Image Understanding: A Survey
NASA Astrophysics Data System (ADS)
Bianchini, Monica; Scarselli, Franco
In graphical pattern recognition, each data is represented as an arrangement of elements, that encodes both the properties of each element and the relations among them. Hence, patterns are modelled as labelled graphs where, in general, labels can be attached to both nodes and edges. Artificial neural networks able to process graphs are a powerful tool for addressing a great variety of real-world problems, where the information is naturally organized in entities and relationships among entities and, in fact, they have been widely used in computer vision, f.i. in logo recognition, in similarity retrieval, and for object detection. In this chapter, we propose a survey of neural network models able to process structured information, with a particular focus on those architectures tailored to address image understanding applications. Starting from the original recursive model (RNNs), we subsequently present different ways to represent images - by trees, forests of trees, multiresolution trees, directed acyclic graphs with labelled edges, general graphs - and, correspondingly, neural network architectures appropriate to process such structures.
Bernard, Jean-Baptiste; Aguilar, Carlos; Castet, Eric
2016-01-01
Reading speed is dramatically reduced when readers cannot use their central vision. This is because low visual acuity and crowding negatively impact letter recognition in the periphery. In this study, we designed a new font (referred to as the Eido font) in order to reduce inter-letter similarity and consequently to increase peripheral letter recognition performance. We tested this font by running five experiments that compared the Eido font with the standard Courier font. Letter spacing and x-height were identical for the two monospaced fonts. Six normally-sighted subjects used exclusively their peripheral vision to run two aloud reading tasks (with eye movements), a letter recognition task (without eye movements), a word recognition task (without eye movements) and a lexical decision task. Results show that reading speed was not significantly different between the Eido and the Courier font when subjects had to read single sentences with a round simulated gaze-contingent central scotoma (10° diameter). In contrast, Eido significantly decreased perceptual errors in peripheral crowded letter recognition (-30% errors on average for letters briefly presented at 6° eccentricity) and in peripheral word recognition (-32% errors on average for words briefly presented at 6° eccentricity). PMID:27074013
Capturing specific abilities as a window into human individuality: the example of face recognition.
Wilmer, Jeremy B; Germine, Laura; Chabris, Christopher F; Chatterjee, Garga; Gerbasi, Margaret; Nakayama, Ken
2012-01-01
Proper characterization of each individual's unique pattern of strengths and weaknesses requires good measures of diverse abilities. Here, we advocate combining our growing understanding of neural and cognitive mechanisms with modern psychometric methods in a renewed effort to capture human individuality through a consideration of specific abilities. We articulate five criteria for the isolation and measurement of specific abilities, then apply these criteria to face recognition. We cleanly dissociate face recognition from more general visual and verbal recognition. This dissociation stretches across ability as well as disability, suggesting that specific developmental face recognition deficits are a special case of a broader specificity that spans the entire spectrum of human face recognition performance. Item-by-item results from 1,471 web-tested participants, included as supplementary information, fuel item analyses, validation, norming, and item response theory (IRT) analyses of our three tests: (a) the widely used Cambridge Face Memory Test (CFMT); (b) an Abstract Art Memory Test (AAMT), and (c) a Verbal Paired-Associates Memory Test (VPMT). The availability of this data set provides a solid foundation for interpreting future scores on these tests. We argue that the allied fields of experimental psychology, cognitive neuroscience, and vision science could fuel the discovery of additional specific abilities to add to face recognition, thereby providing new perspectives on human individuality.
Complete Vision-Based Traffic Sign Recognition Supported by an I2V Communication System
García-Garrido, Miguel A.; Ocaña, Manuel; Llorca, David F.; Arroyo, Estefanía; Pozuelo, Jorge; Gavilán, Miguel
2012-01-01
This paper presents a complete traffic sign recognition system based on vision sensor onboard a moving vehicle which detects and recognizes up to one hundred of the most important road signs, including circular and triangular signs. A restricted Hough transform is used as detection method from the information extracted in contour images, while the proposed recognition system is based on Support Vector Machines (SVM). A novel solution to the problem of discarding detected signs that do not pertain to the host road is proposed. For that purpose infrastructure-to-vehicle (I2V) communication and a stereo vision sensor are used. Furthermore, the outputs provided by the vision sensor and the data supplied by the CAN Bus and a GPS sensor are combined to obtain the global position of the detected traffic signs, which is used to identify a traffic sign in the I2V communication. This paper presents plenty of tests in real driving conditions, both day and night, in which an average detection rate over 95% and an average recognition rate around 93% were obtained with an average runtime of 35 ms that allows real-time performance. PMID:22438704
A Computer Vision Approach to Identify Einstein Rings and Arcs
NASA Astrophysics Data System (ADS)
Lee, Chien-Hsiu
2017-03-01
Einstein rings are rare gems of strong lensing phenomena; the ring images can be used to probe the underlying lens gravitational potential at every position angle, tightly constraining the lens mass profile. In addition, the magnified images also enable us to probe high-z galaxies with enhanced resolution and signal-to-noise ratios. However, only a handful of Einstein rings have been reported, either from serendipitous discoveries or visual inspections of hundreds of thousands of massive galaxies or galaxy clusters. In the era of large sky surveys, an automated approach to identify ring patterns in the big data to come is in high demand. Here, we present an Einstein ring recognition approach based on computer vision techniques. The workhorse is the circle Hough transform, which recognises circular patterns or arcs in the images. We propose a two-tier approach by first pre-selecting massive galaxies associated with multiple blue objects as possible lenses, then using the Hough transform to identify circular patterns. As a proof-of-concept, we apply our approach to SDSS, with a high completeness, albeit with low purity. We also apply our approach to other lenses in DES, HSC-SSP, and UltraVISTA survey, illustrating the versatility of our approach.
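The circle Hough transform can be sketched as a voting scheme: every edge pixel votes for all centres that would place it on a circle of a given radius, and the most-voted cell is the detected centre. The sketch below assumes the radius is known and the edge pixels are already extracted; a survey pipeline would also scan over radii and handle partial arcs, details the abstract does not give. Function names and the grid size are hypothetical.

```python
import math


def hough_circle_centers(edge_points, radius, width, height, n_angles=360):
    """Accumulate centre votes for circles of a fixed radius."""
    acc = {}
    for (x, y) in edge_points:
        seen = set()  # each edge point votes at most once per cell
        for k in range(n_angles):
            theta = 2 * math.pi * k / n_angles
            a = int(round(x - radius * math.cos(theta)))
            b = int(round(y - radius * math.sin(theta)))
            if 0 <= a < width and 0 <= b < height and (a, b) not in seen:
                seen.add((a, b))
                acc[(a, b)] = acc.get((a, b), 0) + 1
    return acc


def best_center(acc):
    """Return the accumulator cell with the most votes."""
    return max(acc, key=acc.get)


# Usage: synthetic edge pixels on a circle of radius 8 centred at (20, 15).
ring = sorted({
    (int(round(20 + 8 * math.cos(math.radians(t)))),
     int(round(15 + 8 * math.sin(math.radians(t)))))
    for t in range(0, 360, 5)
})
votes = hough_circle_centers(ring, radius=8, width=40, height=30)
center = best_center(votes)
```

Because an arc still contributes votes to the true centre, the same accumulator can flag incomplete rings, at the cost of more false positives from unrelated structure, which is consistent with the high completeness and low purity reported above.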
Modeling peripheral vision for moving target search and detection.
Yang, Ji Hyun; Huston, Jesse; Day, Michael; Balogh, Imre
2012-06-01
Most target search and detection models focus on foveal vision. In reality, peripheral vision plays a significant role, especially in detecting moving objects. There were 23 subjects who participated in experiments simulating target detection tasks in urban and rural environments while their gaze parameters were tracked. Button responses associated with foveal object and peripheral object (PO) detection and recognition were recorded. In an urban scenario, pedestrians appearing in the periphery holding guns were threats and pedestrians with empty hands were non-threats. In a rural scenario, non-U.S. unmanned aerial vehicles (UAVs) were considered threats and U.S. UAVs non-threats. On average, subjects missed detecting 2.48 POs among 50 POs in the urban scenario and 5.39 POs in the rural scenario. Both saccade reaction time and button reaction time can be predicted by peripheral angle and entrance speed of POs. Fast moving objects were detected faster than slower objects and POs appearing at wider angles took longer to detect than those closer to the gaze center. A second-order mixed-effect model was applied to provide each subject's prediction model for peripheral target detection performance as a function of eccentricity angle and speed. About half the subjects used active search patterns while the other half used passive search patterns. An interactive 3-D visualization tool was developed to provide a representation of macro-scale head and gaze movement in the search and target detection task. An experimentally validated stochastic model of peripheral vision in realistic target detection scenarios was developed.
Object Recognition in Flight: How Do Bees Distinguish between 3D Shapes?
Werner, Annette; Stürzl, Wolfgang; Zanker, Johannes
2016-01-01
Honeybees (Apis mellifera) discriminate multiple object features such as colour, pattern and 2D shape, but it remains unknown whether and how bees recover three-dimensional shape. Here we show that bees can recognize objects by their three-dimensional form, whereby they employ an active strategy to uncover the depth profiles. We trained individual, free flying honeybees to collect sugar water from small three-dimensional objects made of styrofoam (sphere, cylinder, cuboids) or folded paper (convex, concave, planar) and found that bees can easily discriminate between these stimuli. We also tested possible strategies employed by the bees to uncover the depth profiles. For the card stimuli, we excluded overall shape and pictorial features (shading, texture gradients) as cues for discrimination. Lacking sufficient stereo vision, bees are known to use speed gradients in optic flow to detect edges; could the bees apply this strategy also to recover the fine details of a surface depth profile? Analysing the bees' flight tracks in front of the stimuli revealed specific combinations of flight maneuvers (lateral translations in combination with yaw rotations), which are particularly suitable to extract depth cues from motion parallax. We modelled the generated optic flow and found characteristic patterns of angular displacement corresponding to the depth profiles of our stimuli: optic flow patterns from pure translations successfully recovered depth relations from the magnitude of angular displacements, while additional rotation provided robust depth information based on the direction of the displacements; thus, the bees' flight maneuvers may reflect an optimized visuo-motor strategy to extract depth structure from motion signals. The robustness and simplicity of this strategy offer an efficient solution for 3D-object-recognition without stereo vision, and could be employed by other flying insects, or mobile robots. PMID:26886006
Development of a written music-recognition system using Java and open source technologies
NASA Astrophysics Data System (ADS)
Loibner, Gernot; Schwarzl, Andreas; Kovač, Matthias; Paulus, Dietmar; Pölzleitner, Wolfgang
2005-10-01
We report on the development of a software system to recognize and interpret printed music. The overall goal is to scan printed music sheets, analyze and recognize the notes, timing, and written text, and derive all the information necessary to use the computer's MIDI sound system to play the music. This function is primarily useful for musicians who want to digitize printed music for editing purposes. There exist a number of commercial systems that offer such functionality. However, on testing these systems, we were astonished at how weakly they perform in their pattern recognition parts. Although we submitted very clear and rather flawless scanning input, none of these systems was able to recognize, for example, all notes, staff lines, and systems. They all require a high degree of interaction, post-processing, and editing to get a decent digital version of the hard copy material. In this paper we focus on the pattern recognition area. In a first approach we tested more or less standard methods of adaptive thresholding, blob detection, line detection, and corner detection to find the notes, staff lines, and candidate objects subject to OCR. Many of the objects on this type of material can be learned in a training phase. None of the commercial systems we saw offers the option to train special characters or unusual signatures. A second goal in this project is to use a modern software engineering platform. We were interested in how well suited Java and open source technologies are to pattern recognition and machine vision. The scanning of music served as a case study.
Visual Word Recognition Across the Adult Lifespan
Cohen-Shikora, Emily R.; Balota, David A.
2016-01-01
The current study examines visual word recognition in a large sample (N = 148) across the adult lifespan and across a large set of stimuli (N = 1187) in three different lexical processing tasks (pronunciation, lexical decision, and animacy judgments). Although the focus of the present study is on the influence of word frequency, a diverse set of other variables are examined as the system ages and acquires more experience with language. Computational models and conceptual theories of visual word recognition and aging make differing predictions for age-related changes in the system. However, these have been difficult to assess because prior studies have produced inconsistent results, possibly due to sample differences, analytic procedures, and/or task-specific processes. The current study confronts these potential differences by using three different tasks, treating age and word variables as continuous, and exploring the influence of individual differences such as vocabulary, vision, and working memory. The primary finding is remarkable stability in the influence of a diverse set of variables on visual word recognition across the adult age spectrum. This pattern is discussed in reference to previous inconsistent findings in the literature and implications for current models of visual word recognition. PMID:27336629
Parallel Architectures and Parallel Algorithms for Integrated Vision Systems. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Choudhary, Alok Nidhi
1989-01-01
Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is a system that uses vision algorithms from all levels of processing to perform for a high level application (e.g., object recognition). An IVS normally involves algorithms from low level, intermediate level, and high level vision. Designing parallel architectures for vision systems is of tremendous interest to researchers. Several issues are addressed in parallel architectures and parallel algorithms for integrated vision systems.
Research into the Architecture of CAD Based Robot Vision Systems
1988-02-09
Vision and "Automatic Generation of Recognition Features for Computer Vision," Mudge, Turney and Volz, published in Robotica (1987). All of the...Occluded Parts," (T.N. Mudge, J.L. Turney, and R.A. Volz), Robotica, vol. 5, 1987, pp. 117-127. 5. "Vision Algorithms for Hypercube Machines," (T.N. Mudge
A computerized recognition system for the home-based physiotherapy exercises using an RGBD camera.
Ar, Ilktan; Akgul, Yusuf Sinan
2014-11-01
Computerized recognition of home-based physiotherapy exercises has many benefits and has attracted considerable interest in the computer vision community. However, most methods in the literature view this task as a special case of motion recognition. In contrast, we propose to employ the three main components of a physiotherapy exercise (the motion patterns, the stance knowledge, and the exercise object) as different recognition tasks and embed them separately into the recognition system. The low level information about each component is gathered using machine learning methods. Then, we use a generative Bayesian network to recognize the exercise types by combining the information from these sources at an abstract level, which takes advantage of domain knowledge for a more robust system. Finally, a novel postprocessing step is employed to estimate the exercise repetition counts. The performance evaluation of the system is conducted with a new dataset which contains RGB (red, green, and blue) and depth videos of home-based exercise sessions for commonly applied shoulder and knee exercises. The proposed system works without any body-part segmentation, body-part tracking, joint detection, or temporal segmentation methods. In the end, favorable exercise recognition rates and encouraging results on the estimation of repetition counts are obtained.
Human movement activity classification approaches that use wearable sensors and mobile devices
NASA Astrophysics Data System (ADS)
Kaghyan, Sahak; Sarukhanyan, Hakob; Akopian, David
2013-03-01
Cell phones and other mobile devices have become part of human culture and change activity and lifestyle patterns. Mobile phone technology continuously evolves and incorporates more and more sensors for enabling advanced applications. The latest generations of smart phones incorporate GPS and WLAN location finding modules, vision cameras, microphones, accelerometers, temperature sensors, etc. The availability of these sensors in mass-market communication devices creates exciting new opportunities for data mining applications. Healthcare applications exploiting built-in sensors are particularly promising. This paper reviews different approaches to human activity recognition.
IEEE 1982. Proceedings of the international conference on cybernetics and society
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
1982-01-01
The following topics were dealt with: knowledge-based systems; risk analysis; man-machine interactions; human information processing; metaphor, analogy and problem-solving; manual control modelling; transportation systems; simulation; adaptive and learning systems; biocybernetics; cybernetics; mathematical programming; robotics; decision support systems; analysis, design and validation of models; computer vision; systems science; energy systems; environmental modelling and policy; pattern recognition; nuclear warfare; technological forecasting; artificial intelligence; the Turin shroud; optimisation; workloads. Abstracts of individual papers can be found under the relevant classification codes in this or future issues.
Caetano, Tibério S; McAuley, Julian J; Cheng, Li; Le, Quoc V; Smola, Alex J
2009-06-01
As a fundamental problem in pattern recognition, graph matching has applications in a variety of fields, from computer vision to computational biology. In graph matching, patterns are modeled as graphs and pattern recognition amounts to finding a correspondence between the nodes of different graphs. Many formulations of this problem can be cast in general as a quadratic assignment problem, where a linear term in the objective function encodes node compatibility and a quadratic term encodes edge compatibility. The main research focus in this theme is about designing efficient algorithms for approximately solving the quadratic assignment problem, since it is NP-hard. In this paper we turn our attention to a different question: how to estimate compatibility functions such that the solution of the resulting graph matching problem best matches the expected solution that a human would manually provide. We present a method for learning graph matching: the training examples are pairs of graphs and the 'labels' are matches between them. Our experimental results reveal that learning can substantially improve the performance of standard graph matching algorithms. In particular, we find that simple linear assignment with such a learning scheme outperforms Graduated Assignment with bistochastic normalisation, a state-of-the-art quadratic assignment relaxation algorithm.
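The quadratic assignment formulation described in the abstract can be written out as follows. The symbols are notation assumed here for illustration: y_{ii'} is 1 when node i of the first graph is matched to node i' of the second, c_{ii'} is the node compatibility, and d_{ii'jj'} is the edge compatibility.

```latex
\max_{y}\; \sum_{i,i'} c_{ii'}\, y_{ii'} \;+\; \sum_{i,i',j,j'} d_{ii'jj'}\, y_{ii'}\, y_{jj'}
\quad \text{s.t.} \quad
\sum_{i} y_{ii'} \le 1 \;\; \forall i', \qquad
\sum_{i'} y_{ii'} \le 1 \;\; \forall i, \qquad
y_{ii'} \in \{0, 1\}
```

Dropping the quadratic term leaves a linear assignment problem, which can be solved exactly in polynomial time. Learning then amounts to choosing c and d so that the maximiser agrees with the human-provided matches.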
Multidimensional brain activity dictated by winner-take-all mechanisms.
Tozzi, Arturo; Peters, James F
2018-06-21
A novel demon-based architecture is introduced to elucidate brain functions such as pattern recognition during human perception and mental interpretation of visual scenes. Starting from the topological concepts of invariance and persistence, we introduce a Selfridge pandemonium variant of brain activity that takes into account a novel feature, namely, demons that recognize short straight-line segments, curved lines and scene shapes, such as shape interior, density and texture. Low-level representations of objects can be mapped to higher-level views (our mental interpretations): a series of transformations can be gradually applied to a pattern in a visual scene, without affecting its invariant properties. This makes it possible to construct a symbolic multi-dimensional representation of the environment. These representations can be projected continuously to an object that we have seen and continue to see, thanks to the mapping from shapes in our memory to shapes in Euclidean space. Although perceived shapes are 3-dimensional (plus time), the evaluation of shape features (volume, color, contour, closeness, texture, and so on) leads to n-dimensional brain landscapes. Here we discuss the advantages of our parallel, hierarchical model in pattern recognition, computer vision and biological nervous system's evolution. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Babayan, Pavel; Smirnov, Sergey; Strotov, Valery
2017-10-01
This paper describes an aerial object recognition algorithm for on-board and stationary vision systems. The suggested algorithm is intended to recognize objects of a specific kind using a set of reference objects defined by 3D models. The proposed algorithm is based on building an outer contour descriptor. The algorithm consists of two stages: learning and recognition. The learning stage is devoted to exploring the reference objects. Using 3D models, we can build a database of training images by rendering each 3D model from viewpoints evenly distributed on a sphere. The sphere points are distributed according to the geosphere principle. The gathered training image set is used for calculating descriptors, which are then used in the recognition stage of the algorithm. The recognition stage focuses on estimating the similarity of the captured object and the reference objects by matching the observed image descriptor against the reference object descriptors. The experimental research was performed using a set of models of aircraft of different types (airplanes, helicopters, UAVs). The proposed orientation estimation algorithm showed good accuracy in all case studies. The real-time performance of the algorithm in an FPGA-based vision system was demonstrated.
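Generating roughly even viewpoints on a sphere can be sketched as below. The abstract states that the authors use a geosphere principle (commonly a subdivided icosahedron). This sketch swaps in a Fibonacci lattice, a different and simpler technique, purely for illustration; the function name and the point count are assumptions.

```python
import math


def fibonacci_sphere(n):
    """Return n approximately evenly spaced unit vectors (camera directions)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        # z is spread uniformly over (-1, 1); equal z-bands have equal area.
        z = 1.0 - (2.0 * i + 1.0) / n
        radius = math.sqrt(1.0 - z * z)
        phi = i * golden_angle
        points.append((radius * math.cos(phi), radius * math.sin(phi), z))
    return points


viewpoints = fibonacci_sphere(200)
```

Each returned vector could serve as a rendering direction for one training image of a reference 3D model.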
Engel, Annerose; Bangert, Marc; Horbank, David; Hijmans, Brenda S; Wilkens, Katharina; Keller, Peter E; Keysers, Christian
2012-11-01
To investigate the cross-modal transfer of movement patterns necessary to perform melodies on the piano, 22 non-musicians learned to play short sequences on a piano keyboard by (1) merely listening and replaying (vision of own fingers occluded) or (2) merely observing silent finger movements and replaying (on a silent keyboard). After training, participants recognized with above chance accuracy (1) audio-motor learned sequences upon visual presentation (89±17%), and (2) visuo-motor learned sequences upon auditory presentation (77±22%). The recognition rates for visual presentation significantly exceeded those for auditory presentation (p<.05). fMRI revealed that observing finger movements corresponding to audio-motor trained melodies is associated with stronger activation in the left rolandic operculum than observing untrained sequences. This region was also involved in silent execution of sequences, suggesting that a link to motor representations may play a role in cross-modal transfer from audio-motor training condition to visual recognition. No significant differences in brain activity were found during listening to visuo-motor trained compared to untrained melodies. Cross-modal transfer was stronger from the audio-motor training condition to visual recognition and this is discussed in relation to the fact that non-musicians are familiar with how their finger movements look (motor-to-vision transformation), but not with how they sound on a piano (motor-to-sound transformation). Copyright © 2012 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Lucio Rapoport, Diego
2013-04-01
We present a unified principle for science that surmounts dualism, in terms of torsion fields and the non-orientable surfaces, notably the Klein Bottle and its logic, the Möbius strip and the projective plane. We apply it to the complex numbers and cosmology, to non-linear systems integrating the issue of hyperbolic divergences with the change of orientability, to the biomechanics of vision and the mammal heart, to the morphogenesis of crustal shapes on Earth in connection to the wavefronts of gravitation, elasticity and electromagnetism, to pattern recognition of artificial images and visual recognition, to neurology and the topographic maps of the sensorium, to perception, in particular of music. We develop it in terms of the fundamental 2:1 resonance inherent to the Möbius strip and the Klein Bottle, the minimal surfaces representation of the wavefronts, and the non-dual Klein Bottle logic inherent to pattern recognition, to the harmonic functions and vector fields that lay at the basis of geophysics and physics at large. We discuss the relation between the topographic maps of the sensorium, and the issue of turning inside-out of the visual world as a general principle for cognition, topological chemistry, cell biology and biological morphogenesis in particular in embryology
Capturing specific abilities as a window into human individuality: The example of face recognition
Wilmer, Jeremy B.; Germine, Laura; Chabris, Christopher F.; Chatterjee, Garga; Gerbasi, Margaret; Nakayama, Ken
2013-01-01
Proper characterization of each individual's unique pattern of strengths and weaknesses requires good measures of diverse abilities. Here, we advocate combining our growing understanding of neural and cognitive mechanisms with modern psychometric methods in a renewed effort to capture human individuality through a consideration of specific abilities. We articulate five criteria for the isolation and measurement of specific abilities, then apply these criteria to face recognition. We cleanly dissociate face recognition from more general visual and verbal recognition. This dissociation stretches across ability as well as disability, suggesting that specific developmental face recognition deficits are a special case of a broader specificity that spans the entire spectrum of human face recognition performance. Item-by-item results from 1,471 web-tested participants, included as supplementary information, fuel item analyses, validation, norming, and item response theory (IRT) analyses of our three tests: (a) the widely used Cambridge Face Memory Test (CFMT); (b) an Abstract Art Memory Test (AAMT), and (c) a Verbal Paired-Associates Memory Test (VPMT). The availability of this data set provides a solid foundation for interpreting future scores on these tests. We argue that the allied fields of experimental psychology, cognitive neuroscience, and vision science could fuel the discovery of additional specific abilities to add to face recognition, thereby providing new perspectives on human individuality. PMID:23428079
Low, slow, small target recognition based on spatial vision network
NASA Astrophysics Data System (ADS)
Cheng, Zhao; Guo, Pei; Qi, Xin
2018-03-01
Traditional photoelectric monitoring relies on a large number of identical cameras. To ensure full coverage of the monitored area, this approach requires many cameras, which leads to overlapping coverage, higher costs, and waste. To reduce the monitoring cost and address the difficult problem of finding, identifying and tracking low-altitude, slow-speed and small targets, this paper presents a spatial vision network for low-slow-small target recognition. Based on the camera imaging principle and a monitoring model, the spatial vision network is modeled and optimized. Simulation results demonstrate that the proposed method performs well.
What can neuromorphic event-driven precise timing add to spike-based pattern recognition?
Akolkar, Himanshu; Meyer, Cedric; Clady, Zavier; Marre, Olivier; Bartolozzi, Chiara; Panzeri, Stefano; Benosman, Ryad
2015-03-01
This letter introduces a study to precisely measure what an increase in spike timing precision can add to spike-driven pattern recognition algorithms. The concept of generating spikes from images by converting gray levels into spike timings is currently at the basis of almost every spike-based modeling of biological visual systems. The use of images naturally leads to generating incorrect artificial and redundant spike timings and, more importantly, also contradicts biological findings indicating that visual processing is massively parallel and asynchronous, with high temporal resolution. A new concept for acquiring visual information through pixel-individual asynchronous level-crossing sampling has been proposed in a recent generation of asynchronous neuromorphic visual sensors. Unlike conventional cameras, these sensors acquire data not at fixed points in time for the entire array but at fixed amplitude changes of their input, resulting in data that are optimally sparse in space and time: each pixel is sampled individually and precisely timed, and only if new (previously unknown) information is available (event based). This letter uses the high temporal resolution spiking output of neuromorphic event-based visual sensors to show that lowering time precision degrades performance on several recognition tasks, specifically when reaching the conventional range of machine vision acquisition frequencies (30-60 Hz). The use of information theory to characterize separability between classes for each temporal resolution shows that high temporal acquisition provides up to 70% more information than conventional spikes generated from frame-based acquisition as used in standard artificial vision, thus drastically increasing the separability between classes of objects. Experiments on real data show that the amount of information loss is correlated with temporal precision.
Our information-theoretic study highlights the potentials of neuromorphic asynchronous visual sensors for both practical applications and theoretical investigations. Moreover, it suggests that representing visual information as a precise sequence of spike times as reported in the retina offers considerable advantages for neuro-inspired visual computations.
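Level-crossing sampling for a single pixel can be sketched as follows. This is a simplified illustration, not the sensor's actual circuit behaviour; the function name, the threshold `delta`, and the sample values are assumptions.

```python
def level_crossing_events(samples, delta):
    """Emit (time, polarity) events whenever the signal moves by `delta`.

    samples: list of (time, value) pairs for one pixel.
    The reference level steps by `delta` per event, so a large change
    between two samples produces several events with the same timestamp.
    """
    events = []
    ref = samples[0][1]
    for t, value in samples[1:]:
        while value - ref >= delta:
            ref += delta
            events.append((t, +1))
        while ref - value >= delta:
            ref -= delta
            events.append((t, -1))
    return events


pixel = [(0, 0.0), (1, 0.4), (2, 1.1), (3, 1.2), (4, -0.1)]
events = level_crossing_events(pixel, delta=0.5)
static = level_crossing_events([(0, 3.0), (1, 3.0), (2, 3.0)], delta=0.5)
```

A pixel whose input does not change produces no events at all, which is the source of the sparsity that the abstract contrasts with frame-based acquisition.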
Robust Pedestrian Tracking and Recognition from FLIR Video: A Unified Approach via Sparse Coding
Li, Xin; Guo, Rui; Chen, Chao
2014-01-01
Sparse coding is an emerging method that has been successfully applied to both robust object tracking and recognition in the vision literature. In this paper, we propose a sparse coding-based approach toward joint object tracking-and-recognition and explore its potential in the analysis of forward-looking infrared (FLIR) video to support nighttime machine vision systems. A key technical contribution of this work is to unify existing sparse coding-based approaches toward tracking and recognition under the same framework, so that they can benefit from each other in a closed loop. On the one hand, tracking the same object through temporal frames allows us to achieve improved recognition performance through dynamic updating of the template/dictionary and combining multiple recognition results; on the other hand, the recognition of individual objects facilitates the tracking of multiple objects (i.e., walking pedestrians), especially in the presence of occlusion within a crowded environment. We report experimental results on both the CASIA Pedestrian Database and our own collected FLIR video database to demonstrate the effectiveness of the proposed joint tracking-and-recognition approach. PMID:24961216
Nguyen, Dat Tien; Pham, Tuyen Danh; Baek, Na Rae; Park, Kang Ryoung
2018-01-01
Although face recognition systems have wide application, they are vulnerable to presentation attack samples (fake samples). Therefore, a presentation attack detection (PAD) method is required to enhance the security level of face recognition systems. Most of the previously proposed PAD methods for face recognition systems have focused on using handcrafted image features, which are designed by expert knowledge of designers, such as Gabor filter, local binary pattern (LBP), local ternary pattern (LTP), and histogram of oriented gradients (HOG). As a result, the extracted features reflect limited aspects of the problem, yielding a detection accuracy that is low and varies with the characteristics of presentation attack face images. The deep learning method has been developed in the computer vision research community, which is proven to be suitable for automatically training a feature extractor that can be used to enhance the ability of handcrafted features. To overcome the limitations of previously proposed PAD methods, we propose a new PAD method that uses a combination of deep and handcrafted features extracted from the images by visible-light camera sensor. Our proposed method uses the convolutional neural network (CNN) method to extract deep image features and the multi-level local binary pattern (MLBP) method to extract skin detail features from face images to discriminate the real and presentation attack face images. By combining the two types of image features, we form a new type of image features, called hybrid features, which has stronger discrimination ability than single image features. Finally, we use the support vector machine (SVM) method to classify the image features into real or presentation attack class. Our experimental results indicate that our proposed method outperforms previous PAD methods by yielding the smallest error rates on the same image databases. PMID:29495417
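The basic local binary pattern (LBP) descriptor mentioned in the abstract can be sketched as below. This shows only the plain 8-neighbour LBP on a grayscale image stored as a list of lists. It is not the authors' multi-level LBP, CNN, or SVM pipeline, and the function names and bit ordering are assumptions.

```python
# Clockwise neighbour offsets starting at the top-left pixel (assumed ordering).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """8-bit code: bit k is set when neighbour k is >= the centre pixel."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
top_bright = [[9, 9, 9], [0, 5, 0], [0, 0, 0]]
patch = [[(r * 7 + c * 3) % 11 for c in range(6)] for r in range(5)]
hist = lbp_histogram(patch)
```

The histogram is the feature vector. In a pipeline like the one the abstract describes, such handcrafted features would be combined with learned features before classification.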
Husk, Jesse S.; Yu, Deyue
2017-01-01
Patients with central vision loss must rely on their peripheral vision for reading. Unfortunately, limitations of peripheral vision, such as crowding, pose significant challenges to letter recognition. As a result, there is a need for developing effective training methods for improving crowded letter recognition in the periphery. Several studies have shown that extensive practice with letter stimuli is beneficial to peripheral letter recognition. Here, we explore stimulus-related factors that might influence the effectiveness of peripheral letter recognition training. Specifically, we examined letter exposure (number of letter occurrences), frequency of letter use in English print, and letter complexity and evaluated their contributions to the amount of improvement observed in crowded letter recognition following training. We analyzed data collected across a range of training protocols. Using linear regression, we identified the best-fitting model and observed that all three stimulus-related factors contributed to improvement in peripheral letter recognition with letter exposure being the most important factor. As an important explanatory variable, pretest accuracy was included in the model as well to avoid estimate biases and was shown to have influence on the relationship between training improvement and letter exposure. When developing training protocols for peripheral letter recognition, it may be beneficial to not only consider the overall length of training, but also to tailor the number of stimulus occurrences for each letter according to its initial performance level, frequency, and complexity. PMID:28265651
Convolutional neural networks and face recognition task
NASA Astrophysics Data System (ADS)
Sochenkova, A.; Sochenkov, I.; Makovetskii, A.; Vokhmintsev, A.; Melnikov, A.
2017-09-01
Computer vision tasks have remained very important over the last few years. One of the most complicated problems in computer vision is face recognition, which can be used in security systems to provide safety and to identify a person among others. There is a variety of approaches to this task, but there is still no universal solution that gives adequate results in every case. The current paper presents the following approach. First, we extract an area containing a face, then we apply the Canny edge detector. In the next stage, we use convolutional neural networks (CNN) to solve the face recognition and person identification task.
Sensor-Aware Recognition and Tracking for Wide-Area Augmented Reality on Mobile Phones
Chen, Jing; Cao, Ruochen; Wang, Yongtian
2015-01-01
Wide-area registration in outdoor environments on mobile phones is a challenging task in mobile augmented reality fields. We present a sensor-aware large-scale outdoor augmented reality system for recognition and tracking on mobile phones. GPS and gravity information is used to improve the VLAD performance for recognition. A kind of sensor-aware VLAD algorithm, which is self-adaptive to different scale scenes, is utilized to recognize complex scenes. Considering vision-based registration algorithms are too fragile and tend to drift, data coming from inertial sensors and vision are fused together by an extended Kalman filter (EKF) to achieve considerable improvements in tracking stability and robustness. Experimental results show that our method greatly enhances the recognition rate and eliminates the tracking jitters. PMID:26690439
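The fusion step described in this abstract can be illustrated with a much simpler filter. The sketch below is a scalar, linear Kalman filter and not the authors' extended Kalman filter: it assumes a single orientation angle, an inertial rate reading used for prediction, and a vision-based angle measurement used for correction. The function name, parameters, and noise values are hypothetical.

```python
def kalman_step(angle, variance, gyro_rate, dt, vision_angle,
                process_noise, vision_noise):
    """One predict/correct cycle for a single angle (linear simplification)."""
    # Predict: integrate the inertial rate; uncertainty grows by the process noise.
    predicted_angle = angle + gyro_rate * dt
    predicted_variance = variance + process_noise

    # Correct: blend in the vision measurement, weighted by the Kalman gain.
    gain = predicted_variance / (predicted_variance + vision_noise)
    new_angle = predicted_angle + gain * (vision_angle - predicted_angle)
    new_variance = (1.0 - gain) * predicted_variance
    return new_angle, new_variance


# Hypothetical usage: no rotation reported by the gyro, vision reports 10 degrees.
angle, variance = kalman_step(0.0, 1.0, gyro_rate=0.0, dt=1.0,
                              vision_angle=10.0,
                              process_noise=0.0, vision_noise=1.0)
```

A larger `vision_noise` makes the filter trust the inertial prediction more, which is one way to damp the jitter of a fragile vision-based estimate.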
ROBOSIGHT: Robotic Vision System For Inspection And Manipulation
NASA Astrophysics Data System (ADS)
Trivedi, Mohan M.; Chen, ChuXin; Marapane, Suresh
1989-02-01
Vision is an important sensory modality that can be used for deriving information critical to the proper, efficient, flexible, and safe operation of an intelligent robot. Vision systems are utilized for developing higher level interpretation of the nature of a robotic workspace using images acquired by cameras mounted on a robot. Such information can be useful for tasks such as object recognition, object location, object inspection, obstacle avoidance and navigation. In this paper we describe efforts directed towards developing a vision system useful for performing various robotic inspection and manipulation tasks. The system utilizes gray scale images and can be viewed as a model-based system. It includes general purpose image analysis modules as well as special purpose, task dependent object status recognition modules. Experiments are described to verify the robust performance of the integrated system using a robotic testbed.
Machine vision system for inspecting characteristics of hybrid rice seed
NASA Astrophysics Data System (ADS)
Cheng, Fang; Ying, Yibin
2004-03-01
Obtaining clear images advantaged of improving the classification accuracy involves many factors, light source, lens extender and background were discussed in this paper. The analysis of rice seed reflectance curves showed that the wavelength of light source for discrimination of the diseased seeds from normal rice seeds in the monochromic image recognition mode was about 815nm for jinyou402 and shanyou10. To determine optimizing conditions for acquiring digital images of rice seed using a computer vision system, an adjustable color machine vision system was developed. The machine vision system with 20mm to 25mm lens extender produce close-up images which made it easy to object recognition of characteristics in hybrid rice seeds. White background was proved to be better than black background for inspecting rice seeds infected by disease and using the algorithms based on shape. Experimental results indicated good classification for most of the characteristics with the machine vision system. The same algorithm yielded better results in optimizing condition for quality inspection of rice seed. Specifically, the image processing can correct for details such as fine fissure with the machine vision system.
Humans and Deep Networks Largely Agree on Which Kinds of Variation Make Object Recognition Harder.
Kheradpisheh, Saeed R; Ghodrati, Masoud; Ganjtabesh, Mohammad; Masquelier, Timothée
2016-01-01
View-invariant object recognition is a challenging problem that has attracted much attention among the psychology, neuroscience, and computer vision communities. Humans are notoriously good at it, even if some variations are presumably more difficult to handle than others (e.g., 3D rotations). Humans are thought to solve the problem through hierarchical processing along the ventral stream, which progressively extracts more and more invariant visual features. This feed-forward architecture has inspired a new generation of bio-inspired computer vision systems called deep convolutional neural networks (DCNN), which are currently the best models for object recognition in natural images. Here, for the first time, we systematically compared human feed-forward vision and DCNNs at view-invariant object recognition task using the same set of images and controlling the kinds of transformation (position, scale, rotation in plane, and rotation in depth) as well as their magnitude, which we call "variation level." We used four object categories: car, ship, motorcycle, and animal. In total, 89 human subjects participated in 10 experiments in which they had to discriminate between two or four categories after rapid presentation with backward masking. We also tested two recent DCNNs (proposed respectively by Hinton's group and Zisserman's group) on the same tasks. We found that humans and DCNNs largely agreed on the relative difficulties of each kind of variation: rotation in depth is by far the hardest transformation to handle, followed by scale, then rotation in plane, and finally position (much easier). This suggests that DCNNs would be reasonable models of human feed-forward vision. In addition, our results show that the variation levels in rotation in depth and scale strongly modulate both humans' and DCNNs' recognition performances. We thus argue that these variations should be controlled in the image datasets used in vision research.
The use of higher-order statistics in rapid object categorization in natural scenes.
Banno, Hayaki; Saiki, Jun
2015-02-04
We can rapidly and efficiently recognize many types of objects embedded in complex scenes. What information supports this object recognition is a fundamental question for understanding our visual processing. We investigated the eccentricity-dependent role of shape and statistical information for ultrarapid object categorization, using the higher-order statistics proposed by Portilla and Simoncelli (2000). Synthesized textures computed by their algorithms have the same higher-order statistics as the originals, while the global shapes were destroyed. We used the synthesized textures to manipulate the availability of shape information separately from the statistics. We hypothesized that shape makes a greater contribution to central vision than to peripheral vision and that statistics show the opposite pattern. Results did not show contributions clearly biased by eccentricity. Statistical information demonstrated a robust contribution not only in peripheral but also in central vision. For shape, the results supported the contribution in both central and peripheral vision. Further experiments revealed some interesting properties of the statistics. They are available for a limited time, attributable to the presence or absence of animals without shape, and predict how easily humans detect animals in original images. Our data suggest that when facing the time constraint of categorical processing, higher-order statistics underlie our significant performance for rapid categorization, irrespective of eccentricity. © 2015 ARVO.
Cognitive aspects of haptic form recognition by blind and sighted subjects.
Bailes, S M; Lambert, R M
1986-11-01
Studies using haptic form recognition tasks have generally concluded that the adventitiously blind perform better than the congenitally blind, implicating the importance of early visual experience in improved spatial functioning. The hypothesis was tested that the adventitiously blind have retained some ability to encode successive information obtained haptically in terms of a global visual representation, while the congenitally blind use a coding system based on successive inputs. Eighteen blind (adventitiously and congenitally) and 18 sighted (blindfolded and performing with vision) subjects were tested on their recognition of raised line patterns when the standard was presented in segments: in immediate succession, or with unfilled intersegmental delays of 5, 10, or 15 seconds. The results did not support the above hypothesis. Three main findings were obtained: normally sighted subjects were both faster and more accurate than the other groups; all groups improved in accuracy of recognition as a function of length of interstimulus interval; sighted subjects tended to report using strategies with a strong verbal component while the blind tended to rely on imagery coding. These results are explained in terms of information-processing theory consistent with dual encoding systems in working memory.
Four not six: Revealing culturally common facial expressions of emotion.
Jack, Rachael E; Sun, Wei; Delis, Ioannis; Garrod, Oliver G B; Schyns, Philippe G
2016-06-01
As a highly social species, humans generate complex facial expressions to communicate a diverse range of emotions. Since Darwin's work, identifying among these complex patterns which are common across cultures and which are culture-specific has remained a central question in psychology, anthropology, philosophy, and more recently machine vision and social robotics. Classic approaches to addressing this question typically tested the cross-cultural recognition of theoretically motivated facial expressions representing 6 emotions, and reported universality. Yet, variable recognition accuracy across cultures suggests a narrower cross-cultural communication supported by sets of simpler expressive patterns embedded in more complex facial expressions. We explore this hypothesis by modeling the facial expressions of over 60 emotions across 2 cultures, and segregating out the latent expressive patterns. Using a multidisciplinary approach, we first map the conceptual organization of a broad spectrum of emotion words by building semantic networks in 2 cultures. For each emotion word in each culture, we then model and validate its corresponding dynamic facial expression, producing over 60 culturally valid facial expression models. We then apply to the pooled models a multivariate data reduction technique, revealing 4 latent and culturally common facial expression patterns that each communicates specific combinations of valence, arousal, and dominance. We then reveal the face movements that accentuate each latent expressive pattern to create complex facial expressions. Our data questions the widely held view that 6 facial expression patterns are universal, instead suggesting 4 latent expressive patterns with direct implications for emotion communication, social psychology, cognitive neuroscience, and social robotics. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Cortical visual dysfunction in children: a clinical study.
Dutton, G; Ballantyne, J; Boyd, G; Bradnam, M; Day, R; McCulloch, D; Mackie, R; Phillips, S; Saunders, K
1996-01-01
Damage to the cerebral cortex was responsible for impairment in vision in 90 of 130 consecutive children referred to the Vision Assessment Clinic in Glasgow. Cortical blindness was seen in 16 children. Only 2 were mobile, but both showed evidence of navigational blind-sight. Cortical visual impairment, in which it was possible to estimate visual acuity but generalised severe brain damage precluded estimation of cognitive visual function, was observed in 9 children. Complex disorders of cognitive vision were seen in 20 children. These could be divided into five categories and involved impairment of: (1) recognition, (2) orientation, (3) depth perception, (4) perception of movement and (5) simultaneous perception. These disorders were observed in a variety of combinations. The remaining children showed evidence of reduced visual acuity and/or visual field loss, but without detectable disorders of cognitive visual function. Early recognition of disorders of cognitive vision is required if active training and remediation are to be implemented.
Cherry recognition in natural environment based on the vision of picking robot
NASA Astrophysics Data System (ADS)
Zhang, Qirong; Chen, Shanxiong; Yu, Tingzhong; Wang, Yan
2017-04-01
In order to realize the automatic recognition of cherry in the natural environment, this paper designed a robot vision system recognition method. The first step of this method is to pre-process the cherry image by median filtering. The second step is to identify the colour of the cherry through the 0.9R-G colour difference formula, and then use the Otsu algorithm for threshold segmentation. The third step is to remove noise by using the area threshold. The fourth step is to remove the holes in the cherry image by morphological closed and open operation. The fifth step is to obtain the centroid and contour of cherry by using the smallest external rectangular and the Hough transform. Through this recognition process, we can successfully identify 96% of the cherry without blocking and adhesion.
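The second step of the pipeline above (colour difference followed by Otsu thresholding) can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: images are nested lists of (R, G, B) tuples with 8-bit values, the difference map is clipped to 0..255, and the function names are hypothetical. Median filtering, area thresholding, morphology, and the Hough transform are omitted.

```python
def colour_difference(rgb_image, r_weight=0.9):
    """Map each (R, G, B) pixel to r_weight*R - G, clipped to integers in 0..255."""
    return [[max(0, min(255, int(round(r_weight * r - g))))
             for (r, g, _b) in row]
            for row in rgb_image]


def otsu_threshold(grey):
    """Return the threshold t that maximises the between-class variance.

    Pixels with value > t are treated as foreground.
    """
    hist = [0] * 256
    for row in grey:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        score = w0 * w1 * (mean0 - mean1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def segment_red_fruit(rgb_image):
    """Return a boolean mask of pixels whose colour difference exceeds the Otsu threshold."""
    grey = colour_difference(rgb_image)
    t = otsu_threshold(grey)
    return [[value > t for value in row] for row in grey]
```

Reddish pixels give a large positive 0.9R-G value while green foliage is clipped to zero, so the histogram tends to be bimodal, which is the situation Otsu's method is designed for.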
Li, Heng; Su, Xiaofan; Wang, Jing; Kan, Han; Han, Tingting; Zeng, Yajie; Chai, Xinyu
2018-01-01
Current retinal prostheses can only generate low-resolution visual percepts constituted of limited phosphenes which are elicited by an electrode array and with uncontrollable color and restricted grayscale. Under this visual perception, prosthetic recipients can just complete some simple visual tasks, but more complex tasks like face identification/object recognition are extremely difficult. Therefore, it is necessary to investigate and apply image processing strategies for optimizing the visual perception of the recipients. This study focuses on recognition of the object of interest employing simulated prosthetic vision. We used a saliency segmentation method based on a biologically plausible graph-based visual saliency model and a grabCut-based self-adaptive-iterative optimization framework to automatically extract foreground objects. Based on this, two image processing strategies, Addition of Separate Pixelization and Background Pixel Shrink, were further utilized to enhance the extracted foreground objects. i) The results showed by verification of psychophysical experiments that under simulated prosthetic vision, both strategies had marked advantages over Direct Pixelization in terms of recognition accuracy and efficiency. ii) We also found that recognition performance under two strategies was tied to the segmentation results and was affected positively by the paired-interrelated objects in the scene. The use of the saliency segmentation method and image processing strategies can automatically extract and enhance foreground objects, and significantly improve object recognition performance towards recipients implanted a high-density implant. Copyright © 2017 Elsevier B.V. All rights reserved.
Conventional and Non-Conventional Drosophila Toll Signaling
Lindsay, Scott A.; Wasserman, Steven A.
2013-01-01
The discovery of Toll in Drosophila and of the remarkable conservation in pathway composition and organization catalyzed a transformation in our understanding of innate immune recognition and response. At the center of that picture is a cascade of interactions in which specific microbial cues activate Toll receptors, which then transmit signals driving transcription factor nuclear localization and activity. Experiments gave substance to the vision of pattern recognition receptors, linked phenomena in development, gene regulation, and immunity into a coherent whole, and revealed a rich set of variations for identifying non-self and responding effectively. More recently, research in Drosophila has illuminated the positive and negative regulation of Toll activation, the organization of signaling events at and beneath membranes, the sorting of information flow, and the existence of non-conventional signaling via Toll-related receptors. Here, we provide an overview of the Toll pathway of flies and highlight these ongoing realms of research. PMID:23632253
Comparison of Object Recognition Behavior in Human and Monkey
Rajalingham, Rishi; Schmidt, Kailyn
2015-01-01
Although the rhesus monkey is used widely as an animal model of human visual processing, it is not known whether invariant visual object recognition behavior is quantitatively comparable across monkeys and humans. To address this question, we systematically compared the core object recognition behavior of two monkeys with that of human subjects. To test true object recognition behavior (rather than image matching), we generated several thousand naturalistic synthetic images of 24 basic-level objects with high variation in viewing parameters and image background. Monkeys were trained to perform binary object recognition tasks on a match-to-sample paradigm. Data from 605 human subjects performing the same tasks on Mechanical Turk were aggregated to characterize “pooled human” object recognition behavior, as well as 33 separate Mechanical Turk subjects to characterize individual human subject behavior. Our results show that monkeys learn each new object in a few days, after which they not only match mean human performance but show a pattern of object confusion that is highly correlated with pooled human confusion patterns and is statistically indistinguishable from individual human subjects. Importantly, this shared human and monkey pattern of 3D object confusion is not shared with low-level visual representations (pixels, V1+; models of the retina and primary visual cortex) but is shared with a state-of-the-art computer vision feature representation. Together, these results are consistent with the hypothesis that rhesus monkeys and humans share a common neural shape representation that directly supports object perception. SIGNIFICANCE STATEMENT To date, several mammalian species have shown promise as animal models for studying the neural mechanisms underlying high-level visual processing in humans. 
In light of this diversity, making tight comparisons between nonhuman and human primates is particularly critical in determining the best use of nonhuman primates to further the goal of the field of translating knowledge gained from animal models to humans. To the best of our knowledge, this study is the first systematic attempt at comparing a high-level visual behavior of humans and macaque monkeys. PMID:26338324
Vision-guided gripping of a cylinder
NASA Technical Reports Server (NTRS)
Nicewarner, Keith E.; Kelley, Robert B.
1991-01-01
The motivation for vision-guided servoing is taken from tasks in automated or telerobotic space assembly and construction. Vision-guided servoing requires the ability to perform rapid pose estimates and provide predictive feature tracking. Monocular information from a gripper-mounted camera is used to servo the gripper to grasp a cylinder. The procedure is divided into recognition and servo phases. The recognition stage verifies the presence of a cylinder in the camera field of view. Then an initial pose estimate is computed and uncluttered scan regions are selected. The servo phase processes only the selected scan regions of the image. Given the knowledge, from the recognition phase, that there is a cylinder in the image and knowing the radius of the cylinder, 4 of the 6 pose parameters can be estimated with minimal computation. The relative motion of the cylinder is obtained by using the current pose and prior pose estimates. The motion information is then used to generate a predictive feature-based trajectory for the path of the gripper.
3D visual mechinism by neural networkings
NASA Astrophysics Data System (ADS)
Sugiyama, Shigeki
2007-04-01
There are some computer vision systems that are available on a market but those are quite far from a real usage of our daily life in a sense of security guard or in a sense of a usage of recognition of a target object behaviour. Because those surroundings' sensing might need to recognize a detail description of an object, like "the distance to an object" and "an object detail figure" and "its figure of edging", which are not possible to have a clear picture of the mechanisms of them with the present recognition system. So for doing this, here studies on mechanisms of how a pair of human eyes can recognize a distance apart, an object edging, and an object in order to get basic essences of vision mechanisms. And those basic mechanisms of object recognition are simplified and are extended logically for applying to a computer vision system. Some of the results of these studies are introduced on this paper.
Tejeria, L; Harper, R A; Artes, P H; Dickinson, C M
2002-09-01
(1) To explore the relation between performance on tasks of familiar face recognition (FFR) and face expression difference discrimination (FED) with both perceived disability in face recognition and clinical measures of visual function in subjects with age related macular degeneration (AMD). (2) To quantify the gain in performance for face recognition tasks when subjects use a bioptic telescopic low vision device. 30 subjects with AMD (age range 66-90 years; visual acuity 0.4-1.4 logMAR) were recruited for the study. Perceived (self rated) disability in face recognition was assessed by an eight item questionnaire covering a range of issues relating to face recognition. Visual functions measured were distance visual acuity (ETDRS logMAR charts), continuous text reading acuity (MNRead charts), contrast sensitivity (Pelli-Robson chart), and colour vision (large panel D-15). In the FFR task, images of famous people had to be identified. FED was assessed by a forced choice test where subjects had to decide which one of four images showed a different facial expression. These tasks were repeated with subjects using a bioptic device. Overall perceived disability in face recognition did not correlate with performance on either task, although a specific item on difficulty recognising familiar faces did correlate with FFR (r = 0.49, p<0.05). FFR performance was most closely related to distance acuity (r = -0.69, p<0.001), while FED performance was most closely related to continuous text reading acuity (r = -0.79, p<0.001). In multiple regression, neither contrast sensitivity nor colour vision significantly increased the explained variance. When using a bioptic telescope, FFR performance improved in 86% of subjects (median gain = 49%; p<0.001), while FED performance increased in 79% of subjects (median gain = 50%; p<0.01). Distance and reading visual acuity are closely associated with measured task performance in FFR and FED. 
A bioptic low vision device can offer a significant improvement in performance for face recognition tasks, and may be useful in reducing the handicap associated with this disability. There is, however, little evidence for a correlation between self rated difficulty in face recognition and measured performance for either task. Further work is needed to explore the complex relation between the perception of disability and measured performance.
Tejeria, L; Harper, R A; Artes, P H; Dickinson, C M
2002-01-01
Aims: (1) To explore the relation between performance on tasks of familiar face recognition (FFR) and face expression difference discrimination (FED) with both perceived disability in face recognition and clinical measures of visual function in subjects with age related macular degeneration (AMD). (2) To quantify the gain in performance for face recognition tasks when subjects use a bioptic telescopic low vision device. Methods: 30 subjects with AMD (age range 66–90 years; visual acuity 0.4–1.4 logMAR) were recruited for the study. Perceived (self rated) disability in face recognition was assessed by an eight item questionnaire covering a range of issues relating to face recognition. Visual functions measured were distance visual acuity (ETDRS logMAR charts), continuous text reading acuity (MNRead charts), contrast sensitivity (Pelli-Robson chart), and colour vision (large panel D-15). In the FFR task, images of famous people had to be identified. FED was assessed by a forced choice test where subjects had to decide which one of four images showed a different facial expression. These tasks were repeated with subjects using a bioptic device. Results: Overall perceived disability in face recognition did not correlate with performance on either task, although a specific item on difficulty recognising familiar faces did correlate with FFR (r = 0.49, p<0.05). FFR performance was most closely related to distance acuity (r = −0.69, p<0.001), while FED performance was most closely related to continuous text reading acuity (r = −0.79, p<0.001). In multiple regression, neither contrast sensitivity nor colour vision significantly increased the explained variance. When using a bioptic telescope, FFR performance improved in 86% of subjects (median gain = 49%; p<0.001), while FED performance increased in 79% of subjects (median gain = 50%; p<0.01). Conclusion: Distance and reading visual acuity are closely associated with measured task performance in FFR and FED. 
A bioptic low vision device can offer a significant improvement in performance for face recognition tasks, and may be useful in reducing the handicap associated with this disability. There is, however, little evidence for a correlation between self rated difficulty in face recognition and measured performance for either task. Further work is needed to explore the complex relation between the perception of disability and measured performance. PMID:12185131
Simulated Prosthetic Vision: The Benefits of Computer-Based Object Recognition and Localization.
Macé, Marc J-M; Guivarch, Valérian; Denis, Grégoire; Jouffrais, Christophe
2015-07-01
Clinical trials with blind patients implanted with a visual neuroprosthesis showed that even the simplest tasks were difficult to perform with the limited vision restored with current implants. Simulated prosthetic vision (SPV) is a powerful tool to investigate the putative functions of the upcoming generations of visual neuroprostheses. Recent studies based on SPV showed that several generations of implants will be required before usable vision is restored. However, none of these studies relied on advanced image processing. High-level image processing could significantly reduce the amount of information required to perform visual tasks and help restore visuomotor behaviors, even with current low-resolution implants. In this study, we simulated a prosthetic vision device based on object localization in the scene. We evaluated the usability of this device for object recognition, localization, and reaching. We showed that a very low number of electrodes (e.g., nine) are sufficient to restore visually guided reaching movements with fair timing (10 s) and high accuracy. In addition, performance, both in terms of accuracy and speed, was comparable with 9 and 100 electrodes. Extraction of high level information (object recognition and localization) from video images could drastically enhance the usability of current visual neuroprosthesis. We suggest that this method, that is, localization of targets of interest in the scene, may restore various visuomotor behaviors. This method could prove functional on current low-resolution implants. The main limitation resides in the reliability of the vision algorithms, which are improving rapidly. Copyright © 2015 International Center for Artificial Organs and Transplantation and Wiley Periodicals, Inc.
Handwritten-word spotting using biologically inspired features.
van der Zant, Tijn; Schomaker, Lambert; Haak, Koen
2008-11-01
For quick access to new handwritten collections, current handwriting recognition methods are too cumbersome. They cannot deal with the lack of labeled data and would require extensive laboratory training for each individual script, style, language and collection. We propose a biologically inspired whole-word recognition method which is used to incrementally elicit word labels in a live, web-based annotation system, named Monk. Since human labor should be minimized given the massive amount of image data, it becomes important to rely on robust perceptual mechanisms in the machine. Recent computational models of the neuro-physiology of vision are applied to isolated word classification. A primate cortex-like mechanism allows to classify text-images that have a low frequency of occurrence. Typically these images are the most difficult to retrieve and often contain named entities and are regarded as the most important to people. Usually standard pattern-recognition technology cannot deal with these text-images if there are not enough labeled instances. The results of this retrieval system are compared to normalized word-image matching and appear to be very promising.
EEG based topography analysis in string recognition task
NASA Astrophysics Data System (ADS)
Ma, Xiaofei; Huang, Xiaolin; Shen, Yuxiaotong; Qin, Zike; Ge, Yun; Chen, Ying; Ning, Xinbao
2017-03-01
Vision perception and recognition is a complex process, during which different parts of brain are involved depending on the specific modality of the vision target, e.g. face, character, or word. In this study, brain activities in string recognition task compared with idle control state are analyzed through topographies based on multiple measurements, i.e. sample entropy, symbolic sample entropy and normalized rhythm power, extracted from simultaneously collected scalp EEG. Our analyses show that, for most subjects, both symbolic sample entropy and normalized gamma power in string recognition task are significantly higher than those in idle state, especially at locations of P4, O2, T6 and C4. It implies that these regions are highly involved in string recognition task. Since symbolic sample entropy measures complexity, from the perspective of new information generation, and normalized rhythm power reveals the power distributions in frequency domain, complementary information about the underlying dynamics can be provided through the two types of indices.
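For readers unfamiliar with sample entropy, the measure named in this abstract, a minimal sketch of the standard definition follows. It assumes a one-dimensional signal, an embedding length `m`, and an absolute tolerance `r` (in practice `r` is often set relative to the signal's standard deviation); the function name and defaults are hypothetical, and the symbolic variant used in the study is not reproduced here.

```python
import math


def sample_entropy(series, m=2, r=0.2):
    """Return -ln(A / B) for a 1-D sequence.

    B counts pairs of length-m templates within tolerance r (Chebyshev distance),
    A counts the same for length m + 1; self-matches are excluded.
    """
    n = len(series)

    def count_matches(length):
        # Use the same number of templates (n - m) for both lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                distance = max(abs(a - b)
                               for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined for this m and r; reported as infinity here
    return -math.log(a / b)
```

A perfectly regular signal yields a value of zero, and less predictable signals yield larger values, which is why the measure is read as an index of complexity.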
Wolff, J Gerard
2014-01-01
The SP theory of intelligence aims to simplify and integrate concepts in computing and cognition, with information compression as a unifying theme. This article is about how the SP theory may, with advantage, be applied to the understanding of natural vision and the development of computer vision. Potential benefits include an overall simplification of concepts in a universal framework for knowledge and seamless integration of vision with other sensory modalities and other aspects of intelligence. Low level perceptual features such as edges or corners may be identified by the extraction of redundancy in uniform areas in the manner of the run-length encoding technique for information compression. The concept of multiple alignment in the SP theory may be applied to the recognition of objects, and to scene analysis, with a hierarchy of parts and sub-parts, at multiple levels of abstraction, and with family-resemblance or polythetic categories. The theory has potential for the unsupervised learning of visual objects and classes of objects, and suggests how coherent concepts may be derived from fragments. As in natural vision, both recognition and learning in the SP system are robust in the face of errors of omission, commission and substitution. The theory suggests how, via vision, we may piece together a knowledge of the three-dimensional structure of objects and of our environment, it provides an account of how we may see things that are not objectively present in an image, how we may recognise something despite variations in the size of its retinal image, and how raster graphics and vector graphics may be unified. And it has things to say about the phenomena of lightness constancy and colour constancy, the role of context in recognition, ambiguities in visual perception, and the integration of vision with other senses and other aspects of intelligence.
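The link this abstract draws between run-length encoding and edge detection can be made concrete with a small sketch. It assumes a single row of pixel values and treats the positions where one run ends and the next begins as candidate edges; this illustrates the general idea only, not the SP system itself, and the function names are hypothetical.

```python
def run_length_encode(row):
    """Collapse a sequence into (value, run_length) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [(value, length) for value, length in runs]


def run_boundaries(runs):
    """Return the indices at which a new run starts (candidate edge positions)."""
    positions = []
    index = 0
    for _value, length in runs[:-1]:
        index += length
        positions.append(index)
    return positions


# Hypothetical usage: a dark row with one bright stripe.
runs = run_length_encode([0, 0, 0, 9, 9, 0])
edges = run_boundaries(runs)
```

Uniform areas compress into single runs, so the information that survives compression is concentrated at the run boundaries.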
Liu, Dong; Wang, Shengsheng; Huang, Dezhi; Deng, Gang; Zeng, Fantao; Chen, Huiling
2016-05-01
Medical image recognition is an important task in both computer vision and computational biology. In the field of medical image classification, representing an image based on the local binary patterns (LBP) descriptor has become popular. However, most existing LBP-based methods encode the binary patterns in a fixed neighborhood radius and ignore the spatial relationships among local patterns. Ignoring these spatial relationships leads to poor performance in capturing discriminative features for complex samples, such as medical images obtained by microscope. To address this problem, in this paper we propose a novel method to improve local binary patterns by assigning an adaptive neighborhood radius for each pixel. Based on these adaptive local binary patterns, we further propose a spatial adjacent histogram strategy to encode the micro-structures for image representation. An extensive set of evaluations is performed on four medical datasets, which shows that the proposed method significantly improves on standard LBP and compares favorably with several other prevailing approaches.
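As background for the descriptor discussed above, the standard fixed-radius LBP code can be sketched as below. This is the baseline that the abstract sets out to improve, not the proposed adaptive method: the `radius` argument is simply passed in by the caller, sampling uses eight unscaled grid directions without interpolation, and all names are assumptions of this example.

```python
from collections import Counter

# Eight neighbour directions, clockwise from the top-left.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col, radius=1):
    """Return the 8-bit LBP code of one pixel in a 2-D list of grey values."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        neighbour = image[row + dr * radius][col + dc * radius]
        if neighbour >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image, radius=1):
    """Count LBP codes over all pixels that have a full neighbourhood."""
    rows, cols = len(image), len(image[0])
    return Counter(
        lbp_code(image, r, c, radius)
        for r in range(radius, rows - radius)
        for c in range(radius, cols - radius)
    )
```

The histogram of codes is what is typically used as the image representation; an adaptive scheme would choose `radius` per pixel rather than globally.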
NASA Astrophysics Data System (ADS)
Mehrübeoğlu, Mehrübe; McLauchlan, Lifford
2006-02-01
The goal of this project was to detect the intensity of traffic on a road at different times during the daytime. Although the work presented utilized images from a section of a highway, the results of this project are intended for making decisions on the type of intervention necessary on any given road at different times for traffic control, such as installation of traffic signals, duration of red, green and yellow lights at intersections, and assignment of traffic control officers near school zones or other relevant locations. In this project, directional patterns are used to detect and count the number of cars in traffic images over a fixed area of the road to determine local traffic intensity. Directional patterns are chosen because they are simple and common to almost all moving vehicles. Perspective vision effects specific to each camera orientation have to be considered, as they affect the size and direction of patterns to be recognized. In this work, a simple and fast algorithm has been developed based on horizontal directional pattern matching and perspective vision adjustment. The results of the algorithm under various conditions are presented and compared in this paper. Using the developed algorithm, the traffic intensity can accurately be determined on clear days with average sized cars. The accuracy is reduced on rainy days when the camera lens contains raindrops, when there are very long vehicles, such as trucks or tankers, in the view, and when there is very low light around dusk or dawn.
Enhanced tactile encoding and memory recognition in congenital blindness.
D'Angiulli, Amedeo; Waraich, Paul
2002-06-01
Several behavioural studies have shown that early-blind persons possess superior tactile skills. Since neurophysiological data show that early-blind persons recruit visual as well as somatosensory cortex to carry out tactile processing (cross-modal plasticity), blind persons' sharper tactile skills may be related to cortical re-organisation resulting from loss of vision early in their life. To examine the nature of blind individuals' tactile superiority and its implications for cross-modal plasticity, we compared the tactile performance of congenitally totally blind, low-vision and sighted children on raised-line picture identification test and re-test, assessing effects of task familiarity, exploratory strategy and memory recognition. What distinguished the blind from the other children was higher memory recognition and higher tactile encoding associated with efficient exploration. These results suggest that enhanced perceptual encoding and recognition memory may be two cognitive correlates of cross-modal plasticity in congenital blindness.
Episodic Reasoning for Vision-Based Human Action Recognition
Martinez-del-Rincon, Jesus
2014-01-01
Smart Spaces, Ambient Intelligence, and Ambient Assisted Living are environmental paradigms that strongly depend on their capability to recognize human actions. While most solutions rest on sensor value interpretations and video analysis applications, few have realized the importance of incorporating common-sense capabilities to support the recognition process. Unfortunately, human action recognition cannot be successfully accomplished by only analyzing body postures. On the contrary, this task should be supported by profound knowledge of human agency nature and its tight connection to the reasons and motivations that explain it. The combination of this knowledge and the knowledge about how the world works is essential for recognizing and understanding human actions without committing common-senseless mistakes. This work demonstrates the impact that episodic reasoning has in improving the accuracy of a computer vision system for human action recognition. This work also presents formalization, implementation, and evaluation details of the knowledge model that supports the episodic reasoning. PMID:24959602
Automatic decoding of facial movements reveals deceptive pain expressions
Bartlett, Marian Stewart; Littlewort, Gwen C.; Frank, Mark G.; Lee, Kang
2014-01-01
In highly social species such as humans, faces have evolved to convey rich information for social interaction, including expressions of emotions and pain [1–3]. Two motor pathways control facial movement [4–7]. A subcortical extrapyramidal motor system drives spontaneous facial expressions of felt emotions. A cortical pyramidal motor system controls voluntary facial expressions. The pyramidal system enables humans to simulate facial expressions of emotions not actually experienced. Their simulation is so successful that they can deceive most observers [8–11]. Machine vision may, however, be able to distinguish deceptive from genuine facial signals by identifying the subtle differences between pyramidally and extrapyramidally driven movements. Here we show that human observers could not discriminate real from faked expressions of pain better than chance, and after training, improved accuracy to a modest 55%. However, a computer vision system that automatically measures facial movements and performs pattern recognition on those movements attained 85% accuracy. The machine system’s superiority is attributable to its ability to differentiate the dynamics of genuine from faked expressions. Thus by revealing the dynamics of facial action through machine vision systems, our approach has the potential to elucidate behavioral fingerprints of neural control systems involved in emotional signaling. PMID:24656830
Image jitter enhances visual performance when spatial resolution is impaired.
Watson, Lynne M; Strang, Niall C; Scobie, Fraser; Love, Gordon D; Seidel, Dirk; Manahilov, Velitchko
2012-09-06
Visibility of low-spatial frequency stimuli improves when their contrast is modulated at 5 to 10 Hz compared with stationary stimuli. Therefore, temporal modulations of visual objects could enhance the performance of low vision patients who primarily perceive images of low-spatial frequency content. We investigated the effect of retinal-image jitter on word recognition speed and facial emotion recognition in subjects with central visual impairment. Word recognition speed and accuracy of facial emotion discrimination were measured in volunteers with AMD under stationary and jittering conditions. Computer-driven and optoelectronic approaches were used to induce retinal-image jitter with duration of 100 or 166 ms and amplitude within the range of 0.5 to 2.6° visual angle. Word recognition speed was also measured for participants with simulated (Bangerter filters) visual impairment. Text jittering markedly enhanced word recognition speed for people with severe visual loss (101 ± 25%), while for those with moderate visual impairment, this effect was weaker (19 ± 9%). The ability of low vision patients to discriminate the facial emotions of jittering images improved by a factor of 2. A prototype of optoelectronic jitter goggles produced similar improvement in facial emotion discrimination. Word recognition speed in participants with simulated visual impairment was enhanced for interjitter intervals over 100 ms and reduced for shorter intervals. Results suggest that retinal-image jitter with optimal frequency and amplitude is an effective strategy for enhancing visual information processing in the absence of spatial detail. These findings will enable the development of novel tools to improve the quality of life of low vision patients.
NASA Astrophysics Data System (ADS)
Sultana, Maryam; Bhatti, Naeem; Javed, Sajid; Jung, Soon Ki
2017-09-01
Facial expression recognition (FER) is an important task for various computer vision applications. The task becomes challenging when it requires the detection and encoding of macro- and micropatterns of facial expressions. We present a two-stage texture feature extraction framework based on the local binary pattern (LBP) variants and evaluate its significance in recognizing posed and nonposed facial expressions. We focus on the parametric limitations of the LBP variants and investigate their effects for optimal FER. The size of the local neighborhood is an important parameter of the LBP technique for its extraction in images. To make the LBP adaptive, we exploit the granulometric information of the facial images to find the local neighborhood size for the extraction of center-symmetric LBP (CS-LBP) features. Our two-stage texture representations consist of an LBP variant and the adaptive CS-LBP features. Among the presented two-stage texture feature extractions, the binarized statistical image features and adaptive CS-LBP features were found to show high FER rates. Evaluation of the adaptive texture features shows competitive and higher performance than the nonadaptive features and other state-of-the-art approaches, respectively.
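The center-symmetric variant named in the abstract can be sketched as follows. This is the standard CS-LBP definition (comparing opposite neighbour pairs rather than each neighbour against the center), shown with a caller-supplied `radius`; the granulometry-based choice of neighbourhood size described above is not reproduced, and the names and default threshold are assumptions of this example.

```python
# Four directions; each is paired with its opposite across the center pixel.
HALF_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1)]


def cs_lbp_code(image, row, col, radius=1, threshold=0):
    """Return the 4-bit CS-LBP code of one pixel in a 2-D list of grey values."""
    code = 0
    for bit, (dr, dc) in enumerate(HALF_OFFSETS):
        first = image[row + dr * radius][col + dc * radius]
        opposite = image[row - dr * radius][col - dc * radius]
        # A bit is set when the pair differs by more than the threshold.
        if first - opposite > threshold:
            code |= 1 << bit
    return code
```

Because only four pairs are compared, the code has 16 possible values instead of 256, which gives a much shorter histogram than plain LBP.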
NASA Astrophysics Data System (ADS)
Moody, Daniela I.; Wilson, Cathy J.; Rowland, Joel C.; Altmann, Garrett L.
2015-06-01
Advanced pattern recognition and computer vision algorithms are of great interest for landscape characterization, change detection, and change monitoring in satellite imagery, in support of global climate change science and modeling. We present results from an ongoing effort to extend neuroscience-inspired models for feature extraction to the environmental sciences, and we demonstrate our work using Worldview-2 multispectral satellite imagery. We use a Hebbian learning rule to derive multispectral, multiresolution dictionaries directly from regional satellite normalized band difference index data. These feature dictionaries are used to build sparse scene representations, from which we automatically generate land cover labels via our CoSA algorithm: Clustering of Sparse Approximations. These data adaptive feature dictionaries use joint spectral and spatial textural characteristics to help separate geologic, vegetative, and hydrologic features. Land cover labels are estimated in example Worldview-2 satellite images of Barrow, Alaska, taken at two different times, and are used to detect and discuss seasonal surface changes. Our results suggest that an approach that learns from both spectral and spatial features is promising for practical pattern recognition problems in high resolution satellite imagery.
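The input to the dictionary learning described above is normalized band difference index data. A minimal sketch of such an index is given below, assuming the usual form (a - b) / (a + b) for two co-registered bands; which Worldview-2 bands are paired is not stated in the abstract, so the band arguments and function names here are placeholders.

```python
def normalized_band_difference(band_a, band_b):
    """Return (a - b) / (a + b) for two reflectance values, or 0.0 if a + b == 0."""
    total = band_a + band_b
    if total == 0:
        return 0.0
    return (band_a - band_b) / total


def index_image(image_a, image_b):
    """Apply the index pixel by pixel to two equally sized 2-D lists."""
    return [
        [normalized_band_difference(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]
```

The result lies in the range [-1, 1] for non-negative inputs, which makes values comparable across scenes with different overall brightness.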
Jiang, Nanfeng; Song, Weiran; Wang, Hui; Guo, Gongde; Liu, Yuanyuan
2018-05-23
As the expectation for higher quality of life increases, consumers have higher demands for quality food. Food authentication is the technical means of ensuring food is what it says it is. A popular approach to food authentication is based on spectroscopy, which has been widely used for identifying and quantifying the chemical components of an object. This approach is non-destructive and effective but expensive. This paper presents a computer vision-based sensor system for food authentication, i.e., differentiating organic from non-organic apples. This sensor system consists of low-cost hardware and pattern recognition software. We use a flashlight to illuminate apples and capture their images through a diffraction grating. These diffraction images are then converted into a data matrix for classification by pattern recognition algorithms, including k-nearest neighbors (k-NN), support vector machine (SVM) and three partial least squares discriminant analysis (PLS-DA)-based methods. We carry out experiments on a reasonable collection of apple samples and employ suitable pre-processing, resulting in a highest classification accuracy of 94%. Our studies conclude that this sensor system has the potential to provide a viable solution to empower consumers in food authentication.
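Of the classifiers listed, k-NN is the simplest to illustrate. The sketch below classifies one row of a data matrix by majority vote among its k nearest training rows; it is a generic k-NN with Euclidean distance, not the authors' pipeline, and the feature vectors and label strings used with it are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train_rows, train_labels, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    distances = sorted(
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

With two well-separated groups of rows labelled, say, "organic" and "non-organic", a query near either group receives that group's label; in practice each row would be a pre-processed diffraction image flattened into a vector.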
Color defective vision and the recognition of aviation color signal light flashes.
DOT National Transportation Integrated Search
1971-06-01
A previous study reported on the efficiency with which various tests of color defective vision can predict performance during daylight conditions on a practical test of ability to discriminate aviation signal red, white, and green. In the current stu...
A vision of the future for BMC Medicine: serving science, medicine and authors.
Cassady-Cain, Robin L; Appleford, Joanne M; Patel, Jigisha; Aulakh, Mick; Norton, Melissa L
2009-10-07
In June 2009, BMC Medicine received its first official impact factor of 3.28 from Thomson Reuters. In recognition of this landmark event, the BMC Medicine editorial team present and discuss the vision and aims of the journal.
Recent advances in the development and transfer of machine vision technologies for space
NASA Technical Reports Server (NTRS)
Defigueiredo, Rui J. P.; Pendleton, Thomas
1991-01-01
Recent work concerned with real-time machine vision is briefly reviewed. This work includes methodologies and techniques for optimal illumination, shape-from-shading of general (non-Lambertian) 3D surfaces, laser vision devices and technology, high level vision, sensor fusion, real-time computing, artificial neural network design and use, and motion estimation. Two new methods that are currently being developed for object recognition in clutter and for 3D attitude tracking based on line correspondence are discussed.
CT Image Sequence Analysis for Object Recognition - A Rule-Based 3-D Computer Vision System
Dongping Zhu; Richard W. Conners; Daniel L. Schmoldt; Philip A. Araman
1991-01-01
Research is now underway to create a vision system for hardwood log inspection using a knowledge-based approach. In this paper, we present a rule-based, 3-D vision system for locating and identifying wood defects using topological, geometric, and statistical attributes. A number of different features can be derived from the 3-D input scenes. These features and evidence...
Oyedotun, Oyebade K; Khashman, Adnan
2017-02-01
Humans are adept at recognizing patterns and discovering even abstract features which are sometimes embedded therein. Our ability to use the banknotes in circulation for business transactions lies in the effortlessness with which we can recognize the different banknote denominations after seeing them over a period of time. More significant is that we can usually recognize these banknote denominations irrespective of what parts of the banknotes are exposed to us visually. Furthermore, our recognition ability is largely unaffected even when these banknotes are partially occluded. By analogy, the robustness of intelligent systems performing the task of banknote recognition should not collapse under some minimum level of partial occlusion. Artificial neural networks are intelligent systems which from inception have taken many important cues related to structure and learning rules from the human nervous/cognition processing system. Likewise, it has been shown that advances in artificial neural network simulations can help us understand the human nervous/cognition system even further. In this paper, we investigate three hypothetical cognition frameworks for vision-based recognition of banknote denominations using competitive neural networks. In order to make the task more challenging and stress-test the investigated hypotheses, we also consider the recognition of occluded banknotes. The implemented hypothetical systems are tasked to perform fast recognition of banknotes with up to 75% occlusion. The investigated hypothetical systems are trained on Nigeria's Naira banknotes and several experiments are performed to demonstrate the findings presented within this work.
Poth, Christian H; Schneider, Werner X
2016-09-01
Rapid saccadic eye movements bring the foveal region of the eye's retina onto objects for high-acuity vision. Saccades change the location and resolution of objects' retinal images. To perceive objects as visually stable across saccades, correspondence between the objects before and after the saccade must be established. We have previously shown that breaking object correspondence across the saccade causes a decrement in object recognition (Poth, Herwig, & Schneider, 2015). Color and luminance can establish object correspondence, but it is unknown how these surface features contribute to transsaccadic visual processing. Here, we investigated whether changing the surface features color-and-luminance and color alone across saccades impairs postsaccadic object recognition. Participants made saccades to peripheral objects, which either maintained or changed their surface features across the saccade. After the saccade, participants briefly viewed a letter within the saccade target object (terminated by a pattern mask). Postsaccadic object recognition was assessed as participants' accuracy in reporting the letter. Experiment A used the colors green and red with different luminances as surface features, Experiment B blue and yellow with approximately the same luminances. Changing the surface features across the saccade deteriorated postsaccadic object recognition in both experiments. These findings reveal a link between object recognition and object correspondence relying on the surface features colors and luminance, which is currently not addressed in theories of transsaccadic perception. We interpret the findings within a recent theory ascribing this link to visual attention (Schneider, 2013).
The research of edge extraction and target recognition based on inherent feature of objects
NASA Astrophysics Data System (ADS)
Xie, Yu-chan; Lin, Yu-chi; Huang, Yin-guo
2008-03-01
Current research on computer vision often needs specific techniques for particular problems. Little use has been made of high-level aspects of computer vision, such as three-dimensional (3D) object recognition, that are appropriate for large classes of problems and situations. In particular, high-level vision often focuses mainly on the extraction of symbolic descriptions and pays little attention to processing speed. To extract and recognize targets intelligently and rapidly, in this paper we develop a new 3D target recognition method based on inherent features of objects, taking a cuboid as the model. Based on an analysis of the cuboid's natural contour and grey-level distribution characteristics, an overall fuzzy evaluation technique is used to recognize and segment the target. The Hough transform is then used to extract and match the model's main edges, and finally the target edges are reconstructed by stereo techniques. There are three major contributions in this paper. Firstly, the correspondences between the parameters of the cuboid model's straight edge lines in the image domain and in the transform domain are summarized. Using these, aimless computation and searching in Hough transform processing can be greatly reduced, improving efficiency. Secondly, since prior knowledge about the geometry of the cuboid's contour is available, the intersections of the extracted edges are computed, and candidate edge matches are assessed based on these intersections rather than on the extracted edges themselves. As a result, the outlines are enhanced and noise is suppressed. Finally, a 3D target recognition method is proposed. Compared with other recognition methods, the new method has a quick response time and can be achieved with high-level computer vision.
The method presented here can be used widely in vision-guided techniques to strengthen their intelligence and generalization, and it can also play an important role in object tracking, port AGVs and robotics. Simulation experiments and theoretical analysis demonstrate that the proposed method suppresses noise effectively, extracts target edges robustly, and meets real-time requirements. Theoretical analysis and experiments show that the method is reasonable and efficient.
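The edge-extraction step relies on the Hough transform, which can be sketched in its plain form as below. Each edge point votes for every line (rho, theta) passing through it, and peaks in the accumulator correspond to straight edges. This is the unconstrained textbook transform; the paper's use of cuboid parameter relations to prune the search is not reproduced, and the function names and one-degree angular resolution are assumptions.

```python
import math
from collections import Counter


def hough_lines(points, theta_steps=180):
    """Accumulate votes for lines rho = x*cos(theta) + y*sin(theta).

    Keys are (rho rounded to the nearest integer, theta index); theta runs
    over [0, pi) in theta_steps equal steps.
    """
    accumulator = Counter()
    for x, y in points:
        for t in range(theta_steps):
            theta = math.pi * t / theta_steps
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            accumulator[(rho, t)] += 1
    return accumulator


def strongest_line(points, theta_steps=180):
    """Return (rho, theta in degrees, votes) for the highest accumulator peak."""
    (rho, t), votes = hough_lines(points, theta_steps).most_common(1)[0]
    return rho, 180.0 * t / theta_steps, votes
```

Ten points on the vertical line x = 5 all vote for the cell with rho = 5 and theta = 0, so that cell collects ten votes. Restricting the theta range to orientations consistent with a known model is one way the exhaustive inner loop could be shortened.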
Bio-inspired approach for intelligent unattended ground sensors
NASA Astrophysics Data System (ADS)
Hueber, Nicolas; Raymond, Pierre; Hennequin, Christophe; Pichler, Alexander; Perrot, Maxime; Voisin, Philippe; Moeglin, Jean-Pierre
2015-05-01
Improving the surveillance capacity over wide zones requires a set of smart battery-powered Unattended Ground Sensors capable of issuing an alarm to a decision-making center. Only high-level information has to be sent when a relevant suspicious situation occurs. In this paper we propose an innovative bio-inspired approach that mimics the human bi-modal vision mechanism and the parallel processing ability of the human brain. The designed prototype exploits two levels of analysis: a low-level panoramic motion analysis, the peripheral vision, and a high-level event-focused analysis, the foveal vision. By tracking moving objects and fusing multiple criteria (size, speed, trajectory, etc.), the peripheral vision module acts as a fast relevant event detector. The foveal vision module focuses on the detected events to extract more detailed features (texture, color, shape, etc.) in order to improve the recognition efficiency. The implemented recognition core is able to acquire human knowledge and to classify in real-time a huge amount of heterogeneous data thanks to its natively parallel hardware structure. This UGS prototype validates our system approach under laboratory tests. The peripheral analysis module demonstrates a low false alarm rate whereas the foveal vision correctly focuses on the detected events. A parallel FPGA implementation of the recognition core succeeds in fulfilling the embedded application requirements. These results are paving the way for future reconfigurable virtual field agents. By locally processing the data and sending only high-level information, their energy requirements and electromagnetic signature are optimized. Moreover, the embedded Artificial Intelligence core enables these bio-inspired systems to recognize and learn new significant events. By duplicating human expertise in potentially hazardous places, our miniature visual event detector will allow early warning and contribute to better human decision making.
NASA Astrophysics Data System (ADS)
Kuvich, Gary
2004-08-01
Vision is only a part of a system that converts visual information into knowledge structures. These structures drive the vision process, resolving ambiguity and uncertainty via feedback, and provide image understanding, which is an interpretation of visual information in terms of these knowledge models. These mechanisms provide reliable recognition even if the object is occluded or cannot be recognized as a whole. It is hard to split the entire system apart, and reliable solutions to target recognition problems are possible only within the solution of a more generic Image Understanding Problem. The brain reduces informational and computational complexity using implicit symbolic coding of features, hierarchical compression, and selective processing of visual information. A biologically inspired Network-Symbolic representation, where both systematic structural/logical methods and neural/statistical methods are parts of a single mechanism, is the most feasible for such models. It converts visual information into relational Network-Symbolic structures, avoiding artificial precise computations of 3-dimensional models. Network-Symbolic Transformations derive abstract structures, which allows for invariant recognition of an object as an exemplar of a class. Active vision helps create consistent models. Attention, separation of figure from ground and perceptual grouping are special kinds of network-symbolic transformations. Such Image/Video Understanding Systems will be able to recognize targets reliably.
Automatic Estimation of Volcanic Ash Plume Height using WorldView-2 Imagery
NASA Technical Reports Server (NTRS)
McLaren, David; Thompson, David R.; Davies, Ashley G.; Gudmundsson, Magnus T.; Chien, Steve
2012-01-01
We explore the use of machine learning, computer vision, and pattern recognition techniques to automatically identify volcanic ash plumes and plume shadows in WorldView-2 imagery. Using the classification together with information on the relative position of the sun and spacecraft, and terrain information in the form of a digital elevation map, the height of the ash plume can also be inferred. We present the results from applying this approach to six scenes acquired on two separate days in April and May of 2010 of the Eyjafjallajokull eruption in Iceland. These results show rough agreement with ash plume height estimates from visual and radar based measurements.
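The geometric idea behind inferring height from a shadow can be shown with a deliberately simplified sketch. It assumes flat terrain and a nadir-looking sensor, so that height is the shadow length times the tangent of the sun elevation angle; the abstract's method also accounts for spacecraft position and a digital elevation map, which this example omits. The function name and units are assumptions.

```python
import math


def plume_height_from_shadow(shadow_length_m, sun_elevation_deg):
    """Estimate height (m) from ground shadow length and sun elevation.

    Simplifying assumptions: flat terrain, sensor looking straight down.
    """
    if not 0 < sun_elevation_deg < 90:
        raise ValueError("sun elevation must be between 0 and 90 degrees")
    return shadow_length_m * math.tan(math.radians(sun_elevation_deg))
```

With the sun 45 degrees above the horizon, a shadow displaced 1000 m from the plume corresponds to a height of about 1000 m; lower sun angles produce longer shadows for the same height.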
Fast linear feature detection using multiple directional non-maximum suppression.
Sun, C; Vallotton, P
2009-05-01
The capacity to detect linear features is central to image analysis, computer vision and pattern recognition and has practical applications in areas such as neurite outgrowth detection, retinal vessel extraction, skin hair removal, plant root analysis and road detection. Linear feature detection often represents the starting point for image segmentation and image interpretation. In this paper, we present a new algorithm for linear feature detection using multiple directional non-maximum suppression with symmetry checking and gap linking. Given its low computational complexity, the algorithm is very fast. We show in several examples that it performs very well in terms of both sensitivity and continuity of detected linear features.
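A rough sketch of directional non-maximum suppression is given below. A pixel is kept if its response exceeds both neighbours along at least one of four directions. How the published algorithm combines directions, and its symmetry checking and gap linking steps, are not described in the abstract, so the "any direction" rule, the names and the threshold parameter are assumptions of this example.

```python
# Horizontal, vertical and the two diagonal directions.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def directional_nms(response, min_response=0):
    """Return a boolean mask marking local maxima along any direction.

    response is a 2-D list of filter responses; border pixels are skipped.
    """
    rows, cols = len(response), len(response[0])
    keep = [[False] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            value = response[r][c]
            if value <= min_response:
                continue
            for dr, dc in DIRECTIONS:
                before = response[r - dr][c - dc]
                after = response[r + dr][c + dc]
                if value > before and value > after:
                    keep[r][c] = True
                    break
    return keep
```

Each pixel needs only a handful of comparisons, which is consistent with the low computational cost the abstract emphasizes. On an image with a bright vertical ridge, the ridge pixels survive because they are maxima in the horizontal direction, while their flat neighbours are suppressed.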
Robust Feature Matching in Terrestrial Image Sequences
NASA Astrophysics Data System (ADS)
Abbas, A.; Ghuffar, S.
2018-04-01
Over the last decade, feature detection, description and matching techniques have been widely exploited in photogrammetric and computer vision applications, including 3D reconstruction of scenes, image stitching for panorama creation, image classification and object recognition. However, terrestrial imagery of urban scenes poses various issues, including duplicate and identical structures (i.e. repeated windows and doors), that cause problems in the feature matching phase and ultimately lead to failed results, especially in camera pose and scene structure estimation. In this paper, we address the issue of ambiguous feature matching in urban environments due to repeating patterns.
Atzori, Manfredo; Cognolato, Matteo; Müller, Henning
2016-01-01
Natural control methods based on surface electromyography (sEMG) and pattern recognition are promising for hand prosthetics. However, the control robustness offered by scientific research is still not sufficient for many real life applications, and commercial prostheses are capable of offering natural control for only a few movements. In recent years deep learning has revolutionized several fields of machine learning, including computer vision and speech recognition. Our objective is to test its methods for natural control of robotic hands via sEMG using a large number of intact subjects and amputees. We tested convolutional networks for the classification of an average of 50 hand movements in 67 intact subjects and 11 transradial amputees. The simple architecture of the neural network allowed us to run several tests in order to evaluate the effect of pre-processing, layer architecture, data augmentation and optimization. The classification results are compared with a set of classical classification methods applied on the same datasets. The classification accuracy obtained with convolutional neural networks using the proposed architecture is higher than the average results obtained with the classical classification methods, but lower than the results obtained with the best reference methods in our tests. The results show that convolutional neural networks with a very simple architecture can produce accurate results comparable to the average classical classification methods. They show that several factors (including pre-processing, the architecture of the net and the optimization parameters) can be fundamental for the analysis of sEMG data. Larger networks can achieve higher accuracy on computer vision and object recognition tasks. This fact suggests that it may be interesting to evaluate if larger networks can increase sEMG classification accuracy too. PMID:27656140
Chen, Yibing; Ogata, Taiki; Ueyama, Tsuyoshi; Takada, Toshiyuki; Ota, Jun
2018-01-01
Machine vision is playing an increasingly important role in industrial applications, and the automated design of image recognition systems has been a subject of intense research. This study proposes a system for automatically designing the field-of-view (FOV) of a camera, the illumination strength and the parameters in a recognition algorithm. We formulated the design problem as an optimisation problem and used an experiment based on a hierarchical algorithm to solve it. The evaluation experiments using translucent plastic objects showed that the use of the proposed system resulted in an effective solution with a wide FOV, recognition of all objects and 0.32 mm and 0.4° maximal positional and angular errors when all the RGB (red, green and blue) for illumination and R channel image for recognition were used. Though all the RGB illumination and grey-scale images also provided recognition of all the objects, only a narrow FOV was selected. Moreover, full recognition was not achieved by using only G illumination and a grey-scale image. The results showed that the proposed method can automatically design the FOV, illumination and parameters in the recognition algorithm and that tuning all the RGB illumination is desirable even when single-channel or grey-scale images are used for recognition. PMID:29786665
Chen, Yibing; Ogata, Taiki; Ueyama, Tsuyoshi; Takada, Toshiyuki; Ota, Jun
2018-05-22
Machine vision is playing an increasingly important role in industrial applications, and the automated design of image recognition systems has been a subject of intense research. This study has proposed a system for automatically designing the field-of-view (FOV) of a camera, the illumination strength and the parameters in a recognition algorithm. We formulated the design problem as an optimisation problem and used an experiment based on a hierarchical algorithm to solve it. The evaluation experiments using translucent plastic objects showed that the use of the proposed system resulted in an effective solution with a wide FOV, recognition of all objects and 0.32 mm and 0.4° maximal positional and angular errors when all the RGB (red, green and blue) channels were used for illumination and the R-channel image was used for recognition. Although using all the RGB illumination with grey-scale images also provided recognition of all the objects, only a narrow FOV was selected. Moreover, full recognition was not achieved by using only G illumination and a grey-scale image. The results showed that the proposed method can automatically design the FOV, illumination and parameters in the recognition algorithm and that tuning all the RGB illumination is desirable even when single-channel or grey-scale images are used for recognition.
Color defective vision and day and night recognition of aviation color signal light flashes.
DOT National Transportation Integrated Search
1971-07-01
A previous study reported on the efficiency with which various tests of color defective vision can predict performance during daylight conditions on a practical test of ability to discriminate aviation signal red, white, and green. In the current stu...
A vision of the future for BMC Medicine: serving science, medicine and authors
Cassady-Cain, Robin L; Appleford, Joanne M; Patel, Jigisha; Aulakh, Mick; Norton, Melissa L
2009-01-01
In June 2009, BMC Medicine received its first official impact factor of 3.28 from Thomson Reuters. In recognition of this landmark event, the BMC Medicine editorial team present and discuss the vision and aims of the journal. PMID:19811626
Do we understand high-level vision?
Cox, David Daniel
2014-04-01
'High-level' vision lacks a single, agreed upon definition, but it might usefully be defined as those stages of visual processing that transition from analyzing local image structure to analyzing structure of the external world that produced those images. Much work in the last several decades has focused on object recognition as a framing problem for the study of high-level visual cortex, and much progress has been made in this direction. This approach presumes that the operational goal of the visual system is to read out the identity of an object (or objects) in a scene, in spite of variation in the position, size, lighting and the presence of other nearby objects. However, while object recognition as an operational framing of high-level vision is intuitively appealing, it is by no means the only task that visual cortex might do, and the study of object recognition is beset by challenges in building stimulus sets that adequately sample the infinite space of possible stimuli. Here I review the successes and limitations of this work, and ask whether we should reframe our approaches to understanding high-level vision. Copyright © 2014. Published by Elsevier Ltd.
Chuk, Tim; Chan, Antoni B; Hsiao, Janet H
2017-12-01
The hidden Markov model (HMM)-based approach for eye movement analysis is able to reflect individual differences in both spatial and temporal aspects of eye movements. Here we used this approach to understand the relationship between eye movements during face learning and recognition, and its association with recognition performance. We discovered holistic (i.e., mainly looking at the face center) and analytic (i.e., specifically looking at the two eyes in addition to the face center) patterns during both learning and recognition. Although for both learning and recognition, participants who adopted analytic patterns had better recognition performance than those with holistic patterns, a significant positive correlation between the likelihood of participants' patterns being classified as analytic and their recognition performance was only observed during recognition. Significantly more participants adopted holistic patterns during learning than recognition. Interestingly, about 40% of the participants used different patterns between learning and recognition, and among them 90% switched their patterns from holistic at learning to analytic at recognition. In contrast to the scan path theory, which posits that eye movements during learning have to be recapitulated during recognition for the recognition to be successful, participants who used the same or different patterns during learning and recognition did not differ in recognition performance. The similarity between their learning and recognition eye movement patterns also did not correlate with their recognition performance. These findings suggested that perceptuomotor memory elicited by eye movement patterns during learning does not play an important role in recognition. In contrast, the retrieval of diagnostic information for recognition, such as the eyes for face recognition, is a better predictor for recognition performance. Copyright © 2017 Elsevier Ltd. All rights reserved.
Age and visual impairment decrease driving performance as measured on a closed-road circuit.
Wood, Joanne M
2002-01-01
In this study the effects of visual impairment and age on driving were investigated and related to visual function. Participants were 139 licensed drivers (young, middle-aged, and older participants with normal vision, and older participants with ocular disease). Driving performance was assessed during the daytime on a closed-road driving circuit. Visual performance was assessed using a vision testing battery. Age and visual impairment had a significant detrimental effect on recognition tasks (detection and recognition of signs and hazards), time to complete driving tasks (overall course time, reversing, and maneuvering), maneuvering ability, divided attention, and an overall driving performance index. All vision measures were significantly affected by group membership. A combination of motion sensitivity, useful field of view (UFOV), Pelli-Robson letter contrast sensitivity, and dynamic acuity could predict 50% of the variance in overall driving scores. These results indicate that older drivers with either normal vision or visual impairment had poorer driving performance compared with younger or middle-aged drivers with normal vision. The inclusion of tests such as motion sensitivity and the UFOV significantly improves the predictive power of vision tests for driving performance. Although such measures may not be practical for widespread screening, their application in selected cases should be considered.
Modeling Interval Temporal Dependencies for Complex Activities Understanding
2013-10-11
ORGANIZATION NAMES AND ADDRESSES U.S. Army Research Office P.O. Box 12211 Research Triangle Park, NC 27709-2211 15. SUBJECT TERMS Human activity modeling...computer vision applications: human activity recognition and facial activity recognition. The results demonstrate the superior performance of the
Image Classification for Web Genre Identification
2012-01-01
recognition and landscape detection using the computer vision toolkit OpenCV. For facial recognition, we researched the possibilities of using the...method for connecting these names with a face/personal photo and logo respectively. [2] METHODOLOGY For this project, we focused primarily on facial
The development of newborn object recognition in fast and slow visual worlds
Wood, Justin N.; Wood, Samantha M. W.
2016-01-01
Object recognition is central to perception and cognition. Yet relatively little is known about the environmental factors that cause invariant object recognition to emerge in the newborn brain. Is this ability a hardwired property of vision? Or does the development of invariant object recognition require experience with a particular kind of visual environment? Here, we used a high-throughput controlled-rearing method to examine whether newborn chicks (Gallus gallus) require visual experience with slowly changing objects to develop invariant object recognition abilities. When newborn chicks were raised with a slowly rotating virtual object, the chicks built invariant object representations that generalized across novel viewpoints and rotation speeds. In contrast, when newborn chicks were raised with a virtual object that rotated more quickly, the chicks built viewpoint-specific object representations that failed to generalize to novel viewpoints and rotation speeds. Moreover, there was a direct relationship between the speed of the object and the amount of invariance in the chick's object representation. Thus, visual experience with slowly changing objects plays a critical role in the development of invariant object recognition. These results indicate that invariant object recognition is not a hardwired property of vision, but is learned rapidly when newborns encounter a slowly changing visual world. PMID:27097925
Bioelectric Control of a 757 Class High Fidelity Aircraft Simulation
NASA Technical Reports Server (NTRS)
Jorgensen, Charles; Wheeler, Kevin; Stepniewski, Slawomir; Norvig, Peter (Technical Monitor)
2000-01-01
This paper presents results of a recent experiment in fine grain Electromyographic (EMG) signal recognition. We demonstrate bioelectric flight control of a 757 class simulation aircraft landing at San Francisco International Airport. The physical instrumentality of a pilot control stick is not used. A pilot closes a fist in empty air and performs control movements which are captured by a dry electrode array on the arm, analyzed and routed through a flight director permitting full pilot outer loop control of the simulation. A Vision Dome immersive display is used to create a VR world for the aircraft body mechanics and flight changes to pilot movements. Inner loop surfaces and differential aircraft thrust are controlled using a hybrid neural network architecture that combines a damage adaptive controller (Jorgensen 1998, Totah 1998) with a propulsion only based control system (Bull & Kaneshige 1997). Thus the 757 aircraft is not only being flown bioelectrically at the pilot level but also demonstrates damage adaptive neural network control permitting adaptation to severe changes in the physical flight characteristics of the aircraft at the inner loop level. To compensate for accident scenarios, the aircraft uses remaining control surface authority and differential thrust from the engines. To the best of our knowledge this is the first time real time bioelectric fine-grained control, differential thrust based control, and neural network damage adaptive control have been integrated into a single flight demonstration. The paper describes the EMG pattern recognition system and the bioelectric pattern recognition methodology.
Eye-tracking novice and expert geologist groups in the field and laboratory
NASA Astrophysics Data System (ADS)
Cottrell, R. D.; Evans, K. M.; Jacobs, R. A.; May, B. B.; Pelz, J. B.; Rosen, M. R.; Tarduno, J. A.; Voronov, J.
2010-12-01
We are using an Active Vision approach to learn how novices and expert geologists acquire visual information in the field. The Active Vision approach emphasizes that visual perception is an active process wherein new information is acquired about a particular environment through exploratory eye movements. Eye movements are not only influenced by physical stimuli, but are also strongly influenced by high-level perceptual and cognitive processes. Eye-tracking data were collected on ten novices (undergraduate geology students) and 3 experts during a 10-day field trip across California focused on neotectonics. In addition, high-resolution panoramic images were captured at each key locality for use in a semi-immersive laboratory environment. Examples of each data type will be presented. The number of observers will be increased in subsequent field trips, but expert/novice differences are already apparent in the first set of individual eye-tracking records, including gaze time, gaze pattern and object recognition. We will review efforts to quantify these patterns, and development of semi-immersive environments to display geologic scenes. The research is a collaborative effort between Earth scientists, Cognitive scientists and Imaging scientists at the University of Rochester and the Rochester Institute of Technology and with funding from the National Science Foundation.
NASA Astrophysics Data System (ADS)
Balbin, Jessie R.; Pinugu, Jasmine Nadja J.; Bautista, Joshua Ian C.; Nebres, Pauline D.; Rey Hipolito, Cipriano M.; Santella, Jose Anthony A.
2017-06-01
Visual processing skills are used to gather visual information from the environment; however, there are cases in which Visual Processing Disorder (VPD) occurs. So-called visual figure-ground discrimination is a type of VPD in which color is one of the contributing factors. Color plays a vital role in everyday living, but individuals with limited and inaccurate color perception suffer from Color Vision Deficiency (CVD) and may still be unaware of their condition. To address this, the study focuses on the design of KULAY, a Head-Mounted Display (HMD) device that can assess whether a user has CVD through the standard Hardy-Rand-Rittler (HRR) test. This test uses pattern recognition to evaluate the user. The research also covers color vision deficiency simulation and color correction through color transformation, which let people with normal color vision see how people with CVD perceive color, and vice versa. To check the accuracy of the simulated HRR assessment, its results were validated against an actual assessment performed by a doctor. To check the precision of the color transformation, the Structural Similarity Index Method (SSIM) was used to compare the simulated CVD images and the color-corrected images with other reference sources. The outputs of the simulated HRR assessment and the color transformation show very promising results, indicating the effectiveness and efficiency of the study. Owing to its form factor and portability, the device is beneficial in the fields of medicine and technology.
Molecular patterns of X chromosome-linked color vision genes among 134 men of European ancestry
DOE Office of Scientific and Technical Information (OSTI.GOV)
Drummond-Borg, M.; Deeb, S.S.; Motulsky, A.G.
1989-02-01
The authors used Southern blot hybridization to study X chromosome-linked color vision genes encoding the apoproteins of red and green visual pigments in 134 unselected Caucasian men. One hundred and thirteen individuals (84.3%) had a normal arrangement of their color vision pigment genes. All had one red pigment gene; the number of green pigment genes ranged from one to five with a mode of two. The frequency of molecular genotypes indicative of normal color vision (84.3%) was significantly lower than had been observed in previous studies of color vision phenotypes. Color vision defects can be due to deletions of red or green pigment genes or due to formation of hybrid genes comprising portions of both red and green pigment genes. Characteristic anomalous patterns were seen in 15 (11.2%) individuals: 7 (5.2%) had patterns characteristic of deuteranomaly, 2 (1.5%) had patterns characteristic of deuteranopia, and 6 (4.5%) had protan patterns. Previously undescribed hybrid gene patterns consisting of both green and red pigment gene fragments in addition to normal red and green genes were observed in another 6 individuals (4.5%). Thus, DNA testing detected anomalous color vision pigment genes at a higher frequency than expected from phenotypic color vision tests.
Robotic space simulation integration of vision algorithms into an orbital operations simulation
NASA Technical Reports Server (NTRS)
Bochsler, Daniel C.
1987-01-01
In order to successfully plan and analyze future space activities, computer-based simulations of activities in low earth orbit will be required to model and integrate vision and robotic operations with vehicle dynamics and proximity operations procedures. The orbital operations simulation (OOS) is configured and enhanced as a testbed for robotic space operations. Vision integration algorithms are being developed in three areas: preprocessing, recognition, and attitude/attitude rates. The vision program (Rice University) was modified for use in the OOS. Systems integration testing is now in progress.
NASA Technical Reports Server (NTRS)
Gennery, D.; Cunningham, R.; Saund, E.; High, J.; Ruoff, C.
1981-01-01
The field of computer vision is surveyed and assessed, key research issues are identified, and possibilities for a future vision system are discussed. The problems of descriptions of two and three dimensional worlds are discussed. The representation of such features as texture, edges, curves, and corners are detailed. Recognition methods are described in which cross correlation coefficients are maximized or numerical values for a set of features are measured. Object tracking is discussed in terms of the robust matching algorithms that must be devised. Stereo vision, camera control and calibration, and the hardware and systems architecture are discussed.
AstroCV: Astronomy computer vision library
NASA Astrophysics Data System (ADS)
González, Roberto E.; Muñoz, Roberto P.; Hernández, Cristian A.
2018-04-01
AstroCV processes and analyzes big astronomical datasets, and is intended to provide a community repository of high performance Python and C++ algorithms used for image processing and computer vision. The library offers methods for object recognition, segmentation and classification, with emphasis on the automatic detection and classification of galaxies.
NASA Astrophysics Data System (ADS)
Yu, Francis T. S.; Jutamulia, Suganda
2008-10-01
Contributors; Preface; 1. Pattern recognition with optics Francis T. S. Yu and Don A. Gregory; 2. Hybrid neural networks for nonlinear pattern recognition Taiwei Lu; 3. Wavelets, optics, and pattern recognition Yao Li and Yunglong Sheng; 4. Applications of the fractional Fourier transform to optical pattern recognition David Mendlovic, Zeev Zalevsky and Haldun M. Ozaktas; 5. Optical implementation of mathematical morphology Tien-Hsin Chao; 6. Nonlinear optical correlators with improved discrimination capability for object location and recognition Leonid P. Yaroslavsky; 7. Distortion-invariant quadratic filters Gregory Gheen; 8. Composite filter synthesis as applied to pattern recognition Shizhou Yin and Guowen Lu; 9. Iterative procedures in electro-optical pattern recognition Joseph Shamir; 10. Optoelectronic hybrid system for three-dimensional object pattern recognition Guoguang Mu, Mingzhe Lu and Ying Sun; 11. Applications of photorefractive devices in optical pattern recognition Ziangyang Yang; 12. Optical pattern recognition with microlasers Eung-Gi Paek; 13. Optical properties and applications of bacteriorhodopsin Q. Wang Song and Yu-He Zhang; 14. Liquid-crystal spatial light modulators Aris Tanone and Suganda Jutamulia; 15. Representations of fully complex functions on real-time spatial light modulators Robert W. Cohn and Laurence G. Hassbrook; Index.
Optical character recognition reading aid for the visually impaired.
Grandin, Juan Carlos; Cremaschi, Fabian; Lombardo, Elva; Vitu, Ed; Dujovny, Manuel
2008-06-01
An optical character recognition (OCR) reading machine is a significant help for visually impaired patients. An OCR reading machine is used. This instrument can provide a significant help in order to improve the quality of life of patients with low vision or blindness.
Molecular patterns of X chromosome-linked color vision genes among 134 men of European ancestry.
Drummond-Borg, M; Deeb, S S; Motulsky, A G
1989-01-01
We used Southern blot hybridization to study X chromosome-linked color vision genes encoding the apoproteins of red and green visual pigments in 134 unselected Caucasian men. One hundred and thirteen individuals (84.3%) had a normal arrangement of their color vision pigment genes. All had one red pigment gene; the number of green pigment genes ranged from one to five with a mode of two. The frequency of molecular genotypes indicative of normal color vision (84.3%) was significantly lower than had been observed in previous studies of color vision phenotypes. Color vision defects can be due to deletions of red or green pigment genes or due to formation of hybrid genes comprising portions of both red and green pigment genes [Nathans, J., Piantanida, T.P., Eddy, R.L., Shows, T.B., Jr., & Hogness, D.S. (1986) Science 232, 203-210]. Characteristic anomalous patterns were seen in 15 (11.2%) individuals: 7 (5.2%) had patterns characteristic of deuteranomaly (mild defect in green color perception), 2 (1.5%) had patterns characteristic of deuteranopia (severe defect in green color perception), and 6 (4.5%) had protan patterns (the red perception defects protanomaly and protanopia cannot be differentiated by current molecular methods). Previously undescribed hybrid gene patterns consisting of both green and red pigment gene fragments in addition to normal red and green genes were observed in another 6 individuals (4.5%). Only 2 of these patterns were considered as deuteranomalous. Thus, DNA testing detected anomalous color vision pigment genes at a higher frequency than expected from phenotypic color vision tests. Some color vision gene arrays associated with hybrid genes are likely to mediate normal color vision. Images PMID:2915991
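The percentages quoted in this abstract can be checked directly against the cohort size of 134. The short sketch below is only an arithmetic check; the dictionary labels are illustrative names, not terms defined by the authors.

```python
# Arithmetic check of the genotype frequencies reported in the abstract.
# The labels are illustrative; only the counts and the total come from the text.
TOTAL_MEN = 134

counts = {
    "normal arrangement": 113,
    "characteristic anomalous patterns": 15,
    "deuteranomaly": 7,
    "deuteranopia": 2,
    "protan": 6,
    "previously undescribed hybrid patterns": 6,
}

# Percentage of the cohort for each group, rounded to one decimal place.
percentages = {label: round(100 * n / TOTAL_MEN, 1) for label, n in counts.items()}

# The three anomalous subgroups should add up to the anomalous total,
# and normal + anomalous + new hybrid patterns should cover the whole cohort.
anomalous_subtotal = counts["deuteranomaly"] + counts["deuteranopia"] + counts["protan"]
cohort_total = (
    counts["normal arrangement"]
    + counts["characteristic anomalous patterns"]
    + counts["previously undescribed hybrid patterns"]
)

for label, pct in percentages.items():
    print(f"{label}: {counts[label]} of {TOTAL_MEN} = {pct}%")
```

The three reported groups (113 normal, 15 with characteristic anomalous patterns, 6 with previously undescribed hybrid patterns) account for all 134 men.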
Wang, Jing; Li, Heng; Fu, Weizhen; Chen, Yao; Li, Liming; Lyu, Qing; Han, Tingting; Chai, Xinyu
2016-01-01
Retinal prostheses have the potential to restore partial vision. Object recognition in scenes of daily life is one of the essential tasks for implant wearers. Because retinal prostheses still provide only low-resolution visual percepts, it is important to investigate and apply image processing methods that convey more useful visual information to the wearers. We proposed two image processing strategies based on Itti's visual saliency map, region of interest (ROI) extraction, and image segmentation. Itti's saliency model generated a saliency map from the original image, in which salient regions were grouped into ROI by fuzzy c-means clustering. Then Grabcut generated a proto-object from the ROI labeled image which was recombined with background and enhanced in two ways--8-4 separated pixelization (8-4 SP) and background edge extraction (BEE). Results showed that both 8-4 SP and BEE had significantly higher recognition accuracy in comparison with direct pixelization (DP). Each saliency-based image processing strategy was subject to the performance of image segmentation. Under good and perfect segmentation conditions, BEE and 8-4 SP obtained noticeably higher recognition accuracy than DP, and under bad segmentation conditions, only BEE boosted the performance. The application of saliency-based image processing strategies was verified to be beneficial to object recognition in daily scenes under simulated prosthetic vision. It is hoped that they will help the development of the image processing module for future retinal prostheses, and thus provide more benefit for the patients. Copyright © 2015 International Center for Artificial Organs and Transplantation and Wiley Periodicals, Inc.
CIFAR10-DVS: An Event-Stream Dataset for Object Classification
Li, Hongmin; Liu, Hanchao; Ji, Xiangyang; Li, Guoqi; Shi, Luping
2017-01-01
Neuromorphic vision research requires high-quality and appropriately challenging event-stream datasets to support continuous improvement of algorithms and methods. However, creating event-stream datasets is a time-consuming task, because the data need to be recorded using neuromorphic cameras. Currently, there are limited event-stream datasets available. In this work, by utilizing the popular computer vision dataset CIFAR-10, we converted 10,000 frame-based images into 10,000 event streams using a dynamic vision sensor (DVS), providing an event-stream dataset of intermediate difficulty in 10 different classes, named “CIFAR10-DVS.” The conversion of the event-stream dataset was implemented by a repeated closed-loop smooth (RCLS) movement of frame-based images. Unlike the conversion of frame-based images by moving the camera, the image movement is more realistic in respect of its practical applications. The repeated closed-loop image movement generates rich local intensity changes in continuous time which are quantized by each pixel of the DVS camera to generate events. Furthermore, a performance benchmark in event-driven object classification is provided based on state-of-the-art classification algorithms. This work provides a large event-stream dataset and an initial benchmark for comparison, which may boost algorithm developments in event-driven pattern recognition and object classification. PMID:28611582
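The abstract says that local intensity changes are quantized by each DVS pixel to generate events. A minimal sketch of that idea follows, using a commonly used simplified pixel model in which a pixel fires whenever its log intensity has moved by a fixed contrast threshold since its last event. The function name `generate_events`, the log-intensity model, the threshold value and the frame-based input are all assumptions for illustration; they are not taken from the CIFAR10-DVS recording setup, which used a physical sensor.

```python
import math

def generate_events(frames, timestamps, threshold=0.2, eps=1e-3):
    """Simplified DVS pixel model (an assumption, not the authors' pipeline).

    frames: list of 2-D lists of non-negative intensities, all the same shape.
    Returns a list of (timestamp, x, y, polarity) tuples.
    """
    height, width = len(frames[0]), len(frames[0][0])
    # Per-pixel reference log intensity, moved one threshold step per event.
    ref = [[math.log(frames[0][y][x] + eps) for x in range(width)]
           for y in range(height)]
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        for y in range(height):
            for x in range(width):
                delta = math.log(frame[y][x] + eps) - ref[y][x]
                polarity = 1 if delta > 0 else -1
                # One event per full threshold step of log-intensity change.
                for _ in range(int(abs(delta) / threshold)):
                    ref[y][x] += polarity * threshold
                    events.append((t, x, y, polarity))
    return events

# Usage: a 1x2 image whose left pixel brightens, then whose right pixel dims.
frames = [[[1.0, 1.0]], [[2.0, 1.0]], [[2.0, 0.5]]]
events = generate_events(frames, timestamps=[0, 1, 2], threshold=0.5)
print(events)
```

In this model a static image produces no events at all, which is consistent with the abstract's point that the images had to be moved to generate an event stream.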
CIFAR10-DVS: An Event-Stream Dataset for Object Classification.
Li, Hongmin; Liu, Hanchao; Ji, Xiangyang; Li, Guoqi; Shi, Luping
2017-01-01
Neuromorphic vision research requires high-quality and appropriately challenging event-stream datasets to support continuous improvement of algorithms and methods. However, creating event-stream datasets is a time-consuming task, because the data need to be recorded using neuromorphic cameras. Currently, there are limited event-stream datasets available. In this work, by utilizing the popular computer vision dataset CIFAR-10, we converted 10,000 frame-based images into 10,000 event streams using a dynamic vision sensor (DVS), providing an event-stream dataset of intermediate difficulty in 10 different classes, named "CIFAR10-DVS." The conversion of the event-stream dataset was implemented by a repeated closed-loop smooth (RCLS) movement of frame-based images. Unlike the conversion of frame-based images by moving the camera, the image movement is more realistic in respect of its practical applications. The repeated closed-loop image movement generates rich local intensity changes in continuous time which are quantized by each pixel of the DVS camera to generate events. Furthermore, a performance benchmark in event-driven object classification is provided based on state-of-the-art classification algorithms. This work provides a large event-stream dataset and an initial benchmark for comparison, which may boost algorithm developments in event-driven pattern recognition and object classification.
First Results of an “Artificial Retina” Processor Prototype
Cenci, Riccardo; Bedeschi, Franco; Marino, Pietro; ...
2016-11-15
We report on the performance of a specialized processor capable of reconstructing charged particle tracks in a realistic LHC silicon tracker detector, at the same speed as the readout and with sub-microsecond latency. The processor is based on an innovative pattern-recognition algorithm, called “artificial retina algorithm”, inspired by the vision system of mammals. A prototype of the processor has been designed, simulated, and implemented on Tel62 boards equipped with high-bandwidth Altera Stratix III FPGA devices. Also, the prototype is the first step towards a real-time track reconstruction device aimed at processing complex events of high-luminosity LHC experiments at 40 MHz crossing rate.
Ubiquitous computing technology for just-in-time motivation of behavior change.
Intille, Stephen S
2004-01-01
This paper describes a vision of health care where "just-in-time" user interfaces are used to transform people from passive to active consumers of health care. Systems that use computational pattern recognition to detect points of decision, behavior, or consequences automatically can present motivational messages to encourage healthy behavior at just the right time. Further, new ubiquitous computing and mobile computing devices permit information to be conveyed to users at just the right place. In combination, computer systems that present messages at the right time and place can be developed to motivate physical activity and healthy eating. Computational sensing technologies can also be used to measure the impact of the motivational technology on behavior.
First Results of an “Artificial Retina” Processor Prototype
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cenci, Riccardo; Bedeschi, Franco; Marino, Pietro
We report on the performance of a specialized processor capable of reconstructing charged particle tracks in a realistic LHC silicon tracker detector, at the same speed as the readout and with sub-microsecond latency. The processor is based on an innovative pattern-recognition algorithm, called “artificial retina algorithm”, inspired by the vision system of mammals. A prototype of the processor has been designed, simulated, and implemented on Tel62 boards equipped with high-bandwidth Altera Stratix III FPGA devices. Also, the prototype is the first step towards a real-time track reconstruction device aimed at processing complex events of high-luminosity LHC experiments at 40 MHz crossing rate.
Wang, Rui; Zhou, Yongquan; Zhao, Chengyan; Wu, Haizhou
2015-01-01
Multi-threshold image segmentation is a powerful image processing technique that is used for the preprocessing of pattern recognition and computer vision. However, traditional multilevel thresholding methods are computationally expensive because they involve exhaustively searching the optimal thresholds to optimize the objective functions. To overcome this drawback, this paper proposes a flower pollination algorithm with a randomized location modification. The proposed algorithm is used to find optimal threshold values for maximizing Otsu's objective functions with regard to eight medical grayscale images. When benchmarked against other state-of-the-art evolutionary algorithms, the new algorithm proves itself to be robust and effective through numerical experimental results including Otsu's objective values and standard deviations.
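To make the cost argument concrete, the sketch below implements the exhaustive baseline that the abstract describes as expensive: it scores every combination of k thresholds with Otsu's between-class variance and keeps the best one. It is not the flower pollination algorithm proposed in the paper; the function names and the toy 16-level histogram are assumptions for illustration.

```python
from itertools import combinations

def between_class_variance(hist, thresholds):
    """Otsu's objective: sum over classes of w_k * (mu_k - mu_total)^2.

    hist[i] is the number of pixels at grey level i; thresholds are sorted
    grey levels that split the histogram into len(thresholds) + 1 classes.
    """
    total = sum(hist)
    mu_total = sum(level * n for level, n in enumerate(hist)) / total
    bounds = [0, *thresholds, len(hist)]
    variance = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        class_count = sum(hist[lo:hi])
        if class_count == 0:
            continue  # an empty class contributes nothing
        mean = sum(level * hist[level] for level in range(lo, hi)) / class_count
        variance += (class_count / total) * (mean - mu_total) ** 2
    return variance

def exhaustive_otsu(hist, k):
    """Try every combination of k thresholds; cost grows combinatorially in k."""
    candidates = combinations(range(1, len(hist)), k)
    return max(candidates, key=lambda t: between_class_variance(hist, t))

# Toy histogram with three well-separated clusters of grey levels.
hist = [0] * 16
for level in (1, 2, 7, 8, 13, 14):
    hist[level] = 10

best = exhaustive_otsu(hist, k=2)
print(best)
```

For a 256-level image the candidate set contains every k-subset of 255 grey levels, which is why metaheuristic searches such as the one proposed in the abstract are used in place of enumeration.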
NASA Technical Reports Server (NTRS)
Hung, Stephen H. Y.
1989-01-01
A fast 3-D object recognition algorithm that can be used as a quick-look subsystem to the vision system for the Special-Purpose Dexterous Manipulator (SPDM) is described. Global features that can be easily computed from range data are used to characterize the images of a viewer-centered model of an object. This algorithm will speed up the processing by eliminating the low level processing whenever possible. It may identify the object, reject a set of bad data in the early stage, or create a better environment for a more powerful algorithm to carry the work further.
Fulfilling a European Vision through Flexible Learning and Choice
ERIC Educational Resources Information Center
Harris, Margaret S. G.
2012-01-01
This article considers the value of flexibility and free choice in learning, and examines the increasing recognition of the evolving and wide range of appropriate environments for learning, such as the workplace, the home, the community, and the virtual world. This "Lifeplace Learning" is compared to the requirements and visions of the…
Computer vision cracks the leaf code
Wilf, Peter; Zhang, Shengping; Chikkerur, Sharat; Little, Stefan A.; Wing, Scott L.; Serre, Thomas
2016-01-01
Understanding the extremely variable, complex shape and venation characters of angiosperm leaves is one of the most challenging problems in botany. Machine learning offers opportunities to analyze large numbers of specimens, to discover novel leaf features of angiosperm clades that may have phylogenetic significance, and to use those characters to classify unknowns. Previous computer vision approaches have primarily focused on leaf identification at the species level. It remains an open question whether learning and classification are possible among major evolutionary groups such as families and orders, which usually contain hundreds to thousands of species each and exhibit many times the foliar variation of individual species. Here, we tested whether a computer vision algorithm could use a database of 7,597 leaf images from 2,001 genera to learn features of botanical families and orders, then classify novel images. The images are of cleared leaves, specimens that are chemically bleached, then stained to reveal venation. Machine learning was used to learn a codebook of visual elements representing leaf shape and venation patterns. The resulting automated system learned to classify images into families and orders with a success rate many times greater than chance. Of direct botanical interest, the responses of diagnostic features can be visualized on leaf images as heat maps, which are likely to prompt recognition and evolutionary interpretation of a wealth of novel morphological characters. With assistance from computer vision, leaves are poised to make numerous new contributions to systematic and paleobotanical studies. PMID:26951664
Real-time optical multiple object recognition and tracking system and method
NASA Technical Reports Server (NTRS)
Chao, Tien-Hsin (Inventor); Liu, Hua-Kuang (Inventor)
1990-01-01
System for optically recognizing and tracking a plurality of objects within a field of vision. Laser (46) produces a coherent beam (48). Beam splitter (24) splits the beam into object (26) and reference (28) beams. Beam expanders (50) and collimators (52) transform the beams (26, 28) into coherent collimated light beams (26', 28'). A two-dimensional SLM (54), disposed in the object beam (26'), modulates the object beam with optical information as a function of signals from a first camera (16) which develops X and Y signals reflecting the contents of its field of vision. A hololens (38), positioned in the object beam (26') subsequent to the modulator (54), focuses the object beam at a plurality of focal points (42). A planar transparency-forming film (32), disposed with the focal points on an exposable surface, forms a multiple position interference filter (62) upon exposure of the surface and development processing of the film (32). A reflector (53) directing the reference beam (28') onto the film (32), exposes the surface, with images focused by the hololens (38), to form interference patterns on the surface. There is apparatus (16', 64) for sensing and indicating light passage through respective ones of the positions of the filter (62), whereby recognition of objects corresponding to respective ones of the positions of the filter (62) is effected. For tracking, apparatus (64) focuses light passing through the filter (62) onto a matrix of CCDs in a second camera (16') to form a two-dimensional display of the recognized objects.
Language Education and Multilingualism in Colombia: Crossing the Divide
ERIC Educational Resources Information Center
de Mejía, Anne-Marie
2017-01-01
Despite Colombia's official recognition of its ethnic and cultural diversity, it has yet to develop in practice an inclusive educational vision involving the recognition of diversity, as well as promoting the country's insertion within the global market. Garcia et al. acknowledge the importance of "cultivating" students' diverse…
1988-04-30
side if necessary and identify by block number) haptic hand, touch, vision, robot, object recognition, categorization 20. ABSTRACT (Continue on...established that the haptic system has remarkable capabilities for object recognition. We define haptics as purposive touch. The basic tactual system...gathered ratings of the importance of dimensions for categorizing common objects by touch. Texture and hardness ratings strongly co-vary, which is
Comparing the minimum spatial-frequency content for recognizing Chinese and alphabet characters
Wang, Hui; Legge, Gordon E.
2018-01-01
Visual blur is a common problem that causes difficulty in pattern recognition for normally sighted people under degraded viewing conditions (e.g., near the acuity limit, when defocused, or in fog) and also for people with impaired vision. For reliable identification, the spatial frequency content of an object needs to extend up to or exceed a minimum value in units of cycles per object, referred to as the critical spatial frequency. In this study, we investigated the critical spatial frequency for alphabet and Chinese characters, and examined the effect of pattern complexity. The stimuli were divided into seven categories based on their perimetric complexity, including the lowercase and uppercase alphabet letters, and five groups of Chinese characters. We found that the critical spatial frequency significantly increased with complexity, from 1.01 cycles per character for the simplest group to 2.00 cycles per character for the most complex group of Chinese characters. A second goal of the study was to test a space-bandwidth invariance hypothesis that would represent a tradeoff between the critical spatial frequency and the number of adjacent patterns that can be recognized at one time. We tested this hypothesis by comparing the critical spatial frequencies in cycles per character from the current study and visual-span sizes in number of characters (measured by Wang, He, & Legge, 2014) for sets of characters with different complexities. For the character size (1.2°) we used in the study, we found an invariant product of approximately 10 cycles, which may represent a capacity limitation on visual pattern recognition. PMID:29297056
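The space-bandwidth tradeoff described in this abstract can be checked with simple arithmetic. The sketch below assumes the invariant product is exactly 10 cycles (the abstract says "approximately 10") and derives the visual-span sizes that such a product would imply for the simplest and most complex character groups. The implied spans are illustrative only; they are not the values measured by Wang, He, & Legge (2014), and the function name is hypothetical.

```python
# Assumption: the product (critical spatial frequency x visual span) is taken
# as exactly 10 cycles; the abstract only says "approximately 10 cycles".
INVARIANT_PRODUCT_CYCLES = 10.0

# Critical spatial frequencies reported in the abstract (cycles per character).
critical_sf = {"simplest group": 1.01, "most complex group": 2.00}


def implied_visual_span(cycles_per_character, product=INVARIANT_PRODUCT_CYCLES):
    """Visual span (in characters) implied by a constant product."""
    return product / cycles_per_character


implied_spans = {name: implied_visual_span(sf) for name, sf in critical_sf.items()}

for name, span in implied_spans.items():
    print(f"{name}: about {span:.1f} characters")
```

Under this assumption, doubling the critical spatial frequency halves the number of adjacent characters that can be recognized at one time, which is the tradeoff the hypothesis describes.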
Monovision techniques for telerobots
NASA Technical Reports Server (NTRS)
Goode, P. W.; Carnils, K.
1987-01-01
The primary task of the vision sensor in a telerobotic system is to provide information about the position of the system's effector relative to objects of interest in its environment. The subtasks required to perform the primary task include image segmentation, object recognition, and object location and orientation in some coordinate system. The accomplishment of the vision task requires the appropriate processing tools and the system methodology to effectively apply the tools to the subtasks. The functional structure of the telerobotic vision system used in the Langley Research Center's Intelligent Systems Research Laboratory is discussed as well as two monovision techniques for accomplishing the vision subtasks.
Dynamic programming and graph algorithms in computer vision.
Felzenszwalb, Pedro F; Zabih, Ramin
2011-04-01
Optimization is a powerful paradigm for expressing and solving problems in a wide range of areas, and has been successfully applied to many vision problems. Discrete optimization techniques are especially interesting since, by carefully exploiting problem structure, they often provide nontrivial guarantees concerning solution quality. In this paper, we review dynamic programming and graph algorithms, and discuss representative examples of how these discrete optimization techniques have been applied to some classical vision problems. We focus on the low-level vision problem of stereo, the mid-level problem of interactive object segmentation, and the high-level problem of model-based recognition.
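As an illustration of how dynamic programming applies to the low-level stereo problem mentioned in this abstract, the following is a minimal sketch of a generic scanline formulation. It is not taken from the review itself: the absolute-difference data cost, the linear smoothness penalty, and the names `scanline_stereo`, `max_disp`, and `smooth` are all assumptions made for the example.

```python
def scanline_stereo(left, right, max_disp, smooth=1.0):
    """Pick one disparity per pixel of a single image row.

    Minimizes sum_x |left[x] - right[x - d(x)]| + smooth * |d(x) - d(x-1)|
    exactly, using a Viterbi-style dynamic program over the row.
    """
    n = len(left)
    inf = float("inf")
    labels = range(max_disp + 1)

    def data_cost(x, d):
        # Pixel x in the left row is matched to pixel x - d in the right row.
        return abs(left[x] - right[x - d]) if x - d >= 0 else inf

    cost = [[inf] * (max_disp + 1) for _ in range(n)]
    back = [[0] * (max_disp + 1) for _ in range(n)]
    for d in labels:
        cost[0][d] = data_cost(0, d)

    for x in range(1, n):
        for d in labels:
            dc = data_cost(x, d)
            if dc == inf:
                continue  # disparity would point outside the right row
            # Best predecessor disparity for this (x, d) state.
            best_prev = min(labels, key=lambda p: cost[x - 1][p] + smooth * abs(d - p))
            cost[x][d] = dc + cost[x - 1][best_prev] + smooth * abs(d - best_prev)
            back[x][d] = best_prev

    # Backtrack from the cheapest final state.
    d = min(labels, key=lambda k: cost[n - 1][k])
    disparity = [0] * n
    for x in range(n - 1, -1, -1):
        disparity[x] = d
        d = back[x][d]
    return disparity


# Toy rows: the right row is the left row shifted by one pixel.
print(scanline_stereo([0, 0, 5, 9, 0, 0], [0, 5, 9, 0, 0, 0], max_disp=2))
```

Because each row is a chain, the dynamic program finds the global minimum of this row-wise cost, which is the kind of solution-quality guarantee the abstract refers to; consistency between neighboring rows is not modeled in this sketch.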
Use of Biometrics within Sub-Saharan Refugee Communities
2013-12-01
fingerprint patterns, iris pattern recognition, and facial recognition as a means of establishing an individual’s identity. Biometrics creates and...Biometrics typically comprises fingerprint patterns, iris pattern recognition, and facial recognition as a means of establishing an individual’s identity...authentication because it identifies an individual based on mathematical analysis of the random pattern visible within the iris. Facial recognition is
Three-dimensional object recognition based on planar images
NASA Astrophysics Data System (ADS)
Mital, Dinesh P.; Teoh, Eam-Khwang; Au, K. C.; Chng, E. K.
1993-01-01
This paper presents the development and realization of a robotic vision system for the recognition of 3-dimensional (3-D) objects. The system can recognize a single object from among a group of known regular convex polyhedron objects that is constrained to lie on a calibrated flat platform. The approach adopted comprises a series of image processing operations on a single 2-dimensional (2-D) intensity image to derive an image line drawing. Subsequently, a feature matching technique is employed to determine 2-D spatial correspondences of the image line drawing with the model in the database. Besides its identification ability, the system can also provide important position and orientation information of the recognized object. The system was implemented on an IBM-PC AT machine executing at 8 MHz without the 80287 Maths Co-processor. In our overall performance evaluation based on a 600 recognition cycles test, the system demonstrated an accuracy of above 80% with recognition time well within 10 seconds. The recognition time is, however, indirectly dependent on the number of models in the database. The reliability of the system is also affected by illumination conditions which must be clinically controlled as in any industrial robotic vision system.
Rotation-invariant neural pattern recognition system with application to coin recognition.
Fukumi, M; Omatu, S; Takeda, F; Kosaka, T
1992-01-01
In pattern recognition, it is often necessary to classify a transformed pattern. A neural pattern recognition system that is insensitive to rotation of the input pattern by various degrees is proposed. The system consists of a fixed invariance network with many slabs and a trainable multilayered network. The system was used in a rotation-invariant coin recognition problem to distinguish between a 500 yen coin and a 500 won coin. The results show that the approach works well for variable rotation pattern recognition.
ATR applications of minimax entropy models of texture and shape
NASA Astrophysics Data System (ADS)
Zhu, Song-Chun; Yuille, Alan L.; Lanterman, Aaron D.
2001-10-01
Concepts from information theory have recently found favor in both the mainstream computer vision community and the military automatic target recognition community. In the computer vision literature, the principles of minimax entropy learning theory have been used to generate rich probabilistic models of texture and shape. In addition, the method of types and large deviation theory has permitted the difficulty of various texture and shape recognition tasks to be characterized by 'order parameters' that determine how fundamentally vexing a task is, independent of the particular algorithm used. These information-theoretic techniques have been demonstrated using traditional visual imagery in applications such as simulating cheetah skin textures and finding roads in aerial imagery. We discuss their application to problems in the specific application domain of automatic target recognition using infrared imagery. We also review recent theoretical and algorithmic developments which permit learning minimax entropy texture models for infrared textures in reasonable timeframes.
NASA Astrophysics Data System (ADS)
Moriwaki, Katsumi; Koike, Issei; Sano, Tsuyoshi; Fukunaga, Tetsuya; Tanaka, Katsuyuki
We propose a new method of environmental recognition around an autonomous vehicle using a dual vision sensor, together with navigation control based on binocular images. As an application of these techniques, we consider developing a guide robot that can play the role of a guide dog for people such as the visually impaired or the aged. This paper presents a recognition algorithm that uses binocular images obtained from a pair of parallel-arrayed CCD cameras to find the line of a series of Braille blocks and the boundary line between a sidewalk and a roadway where a difference in level exists. This paper also presents a tracking algorithm with which the guide robot follows a series of Braille blocks and avoids obstacles and unsafe areas in the path of the person accompanied by the robot.
Component Pin Recognition Using Algorithms Based on Machine Learning
NASA Astrophysics Data System (ADS)
Xiao, Yang; Hu, Hong; Liu, Ze; Xu, Jiangchang
2018-04-01
The purpose of machine vision for a plug-in machine is to improve the machine’s stability and accuracy, and recognition of the component pin is an important part of the vision system. This paper focuses on component pin recognition using three different techniques. The first technique involves traditional image processing, with binary large object (BLOB) analysis as the core algorithm. The second technique uses the histogram of oriented gradients (HOG) to experimentally compare the performance of the support vector machine (SVM) and adaptive boosting (AdaBoost) classifiers. The third technique uses a deep learning method, the convolutional neural network (CNN), which identifies the pin by comparing a sample to its training data. The main purpose of the research presented in this paper is to increase the knowledge of learning methods used in the plug-in machine industry in order to achieve better results.
Fast cat-eye effect target recognition based on saliency extraction
NASA Astrophysics Data System (ADS)
Li, Li; Ren, Jianlin; Wang, Xingbin
2015-09-01
Background complexity is a major cause of false detections in cat-eye target recognition. Human vision has a selective attention property that helps it find salient targets in complex unknown scenes quickly and precisely. In this paper, we propose a novel cat-eye effect target recognition method named Multi-channel Saliency Processing before Fusion (MSPF). This method combines traditional cat-eye target recognition with the selective characteristics of visual attention. Furthermore, parallel processing enables it to achieve fast recognition. Experimental results show that the proposed method performs better in accuracy, robustness and speed compared to other methods.
A sensor and video based ontology for activity recognition in smart environments.
Mitchell, D; Morrow, Philip J; Nugent, Chris D
2014-01-01
Activity recognition is used in a wide range of applications including healthcare and security. In a smart environment activity recognition can be used to monitor and support the activities of a user. There have been a range of methods used in activity recognition including sensor-based approaches, vision-based approaches and ontological approaches. This paper presents a novel approach to activity recognition in a smart home environment which combines sensor and video data through an ontological framework. The ontology describes the relationships and interactions between activities, the user, objects, sensors and video data.
2006-01-01
vision may enhance recognition of conspecifics or be used in mating. While mating in moths is thought to be entirely mediated by olfaction, most tasks are...time, unambiguous evidence for true color vision under scotopic conditions has only recently been acquired (Kelber et al., 2002; Roth and Kelber, 2004...color under starlight and dim moonlight, respectively, raise at least two issues. First, what is the selective advantage of color vision in these
Newton, Jenny; Barrett, Steven F; Wilcox, Michael J; Popp, Stephanie
2002-01-01
Machine vision for navigational purposes is a rapidly growing field. Many abilities such as object recognition and target tracking rely on vision. Autonomous vehicles must be able to navigate in dynamic environments and simultaneously locate a target position. Traditional machine vision often fails to react in real time because of large computational requirements whereas the fly achieves complex orientation and navigation with a relatively small and simple brain. Understanding how the fly extracts visual information and how neurons encode and process information could lead us to a new approach for machine vision applications. Photoreceptors in the Musca domestica eye that share the same spatial information converge into a structure called the cartridge. The cartridge consists of the photoreceptor axon terminals and monopolar cells L1, L2, and L4. It is thought that L1 and L2 cells encode edge related information relative to a single cartridge. These cells are thought to be equivalent to vertebrate bipolar cells, producing contrast enhancement and reduction of information sent to L4. Monopolar cell L4 is thought to perform image segmentation on the information input from L1 and L2 and also enhance edge detection. A mesh of interconnected L4's would correlate the output from L1 and L2 cells of adjacent cartridges and provide a parallel network for segmenting an object's edges. The focus of this research is to excite photoreceptors of the common housefly, Musca domestica, with different visual patterns. The electrical response of monopolar cells L1, L2, and L4 will be recorded using intracellular recording techniques. Signal analysis will determine the neurocircuitry to detect and segment images.
Image remapping strategies applied as protheses for the visually impaired
NASA Technical Reports Server (NTRS)
Johnson, Curtis D.
1993-01-01
Maculopathy and retinitis pigmentosa (rp) are two vision defects which render the afflicted person with impaired ability to read and recognize visual patterns. For some time there has been interest and work on the use of image remapping techniques to provide a visual aid for individuals with these impairments. The basic concept is to remap an image according to some mathematical transformation such that the image is warped around a maculopathic defect (scotoma) or within the rp foveal region of retinal sensitivity. NASA/JSC has been pursuing this research using angle invariant transformations with testing of the resulting remapping using subjects and facilities of the University of Houston, College of Optometry. Testing is facilitated by use of a hardware device, the Programmable Remapper, to provide the remapping of video images. This report presents the results of studies of alternative remapping transformations with the objective of improving subject reading rates and pattern recognition. In particular a form of conformal transformation was developed which provides for a smooth warping of an image around a scotoma. In such a case it is shown that distortion of characters and lines of characters is minimized which should lead to enhanced character recognition. In addition studies were made of alternative transformations which, although not conformal, provide for similar low character distortion remapping. A second, non-conformal transformation was studied for remapping of images to aid rp impairments. In this case a transformation was investigated which allows remapping of a vision field into a circular area representing the foveal retina region. The size and spatial representation of the image are selectable. It is shown that parametric adjustments allow for a wide variation of how a visual field is presented to the sensitive retina. 
This study also presents some preliminary considerations of how a prosthetic device could be implemented in a practical sense, vis-a-vis, size, weight and portability.
The role of external features in face recognition with central vision loss: A pilot study
Bernard, Jean-Baptiste; Chung, Susana T.L.
2016-01-01
Purpose: We evaluated how the performance for recognizing familiar face images depends on the internal (eyebrows, eyes, nose, mouth) and external face features (chin, outline of face, hairline) in individuals with central vision loss. Methods: In Experiment 1, we measured eye movements for four observers with central vision loss to determine whether they fixated more often on the internal or the external features of face images while attempting to recognize the images. We then measured the accuracy for recognizing face images that contained only the internal, only the external, or both internal and external features (Experiment 2), and for hybrid images where the internal and external features came from two different source images (Experiment 3), for five observers with central vision loss and four age-matched control observers. Results: When recognizing familiar face images, approximately 40% of the fixations of observers with central vision loss were centered on the external features of faces. The recognition accuracy was higher for images containing only external features (66.8±3.3% correct) than for images containing only internal features (35.8±15.0%), a finding contradicting that of control observers. For hybrid face images, observers with central vision loss responded more accurately to the external features (50.4±17.8%) than to the internal features (9.3±4.9%), while control observers did not show the same bias toward responding to the external features. Conclusions: Contrary to people with normal vision who rely more on the internal features of face images for recognizing familiar faces, individuals with central vision loss show a higher dependence on using external features of face images. PMID:26829260
The Role of External Features in Face Recognition with Central Vision Loss.
Bernard, Jean-Baptiste; Chung, Susana T L
2016-05-01
We evaluated how the performance of recognizing familiar face images depends on the internal (eyebrows, eyes, nose, mouth) and external face features (chin, outline of face, hairline) in individuals with central vision loss. In experiment 1, we measured eye movements for four observers with central vision loss to determine whether they fixated more often on the internal or the external features of face images while attempting to recognize the images. We then measured the accuracy for recognizing face images that contained only the internal, only the external, or both internal and external features (experiment 2) and for hybrid images where the internal and external features came from two different source images (experiment 3) for five observers with central vision loss and four age-matched control observers. When recognizing familiar face images, approximately 40% of the fixations of observers with central vision loss were centered on the external features of faces. The recognition accuracy was higher for images containing only external features (66.8 ± 3.3% correct) than for images containing only internal features (35.8 ± 15.0%), a finding contradicting that of control observers. For hybrid face images, observers with central vision loss responded more accurately to the external features (50.4 ± 17.8%) than to the internal features (9.3 ± 4.9%), whereas control observers did not show the same bias toward responding to the external features. Contrary to people with normal vision who rely more on the internal features of face images for recognizing familiar faces, individuals with central vision loss show a higher dependence on using external features of face images.
Selective Attention in Vision: Recognition Memory for Superimposed Line Drawings.
ERIC Educational Resources Information Center
Goldstein, E. Bruce; Fink, Susan I.
1981-01-01
Four experiments show that observers can selectively attend to one of two stationary superimposed pictures. Selective recognition occurred with large displays in which observers were free to make eye movements during a 3-sec exposure and with small displays in which observers were instructed to fixate steadily on a point. (Author/RD)
Selection of Norway spruce somatic embryos by computer vision
NASA Astrophysics Data System (ADS)
Hamalainen, Jari J.; Jokinen, Kari J.
1993-05-01
A computer vision system was developed for the classification of plant somatic embryos. The embryos are in a Petri dish that is transferred with constant speed and they are recognized as they pass a line scan camera. A classification algorithm needs to be installed for every plant species. This paper describes an algorithm for the recognition of Norway spruce (Picea abies) embryos. A short review of conifer micropropagation by somatic embryogenesis is also given. The recognition algorithm is based on features calculated from the boundary of the object. Only part of the boundary corresponding to the developing cotyledons (2 - 15) and the straight sides of the embryo are used for recognition. An index of the length of the cotyledons describes the developmental stage of the embryo. The testing set for classifier performance consisted of 118 embryos and 478 nonembryos. With the classification tolerances chosen, 69% of the objects classified as embryos by a human classifier were selected and 31% rejected. Less than 1% of the nonembryos were classified as embryos. The basic features developed can probably be easily adapted for the recognition of other conifer somatic embryos.
Face recognition in newly hatched chicks at the onset of vision.
Wood, Samantha M W; Wood, Justin N
2015-04-01
How does face recognition emerge in the newborn brain? To address this question, we used an automated controlled-rearing method with a newborn animal model: the domestic chick (Gallus gallus). This automated method allowed us to examine chicks' face recognition abilities at the onset of both face experience and object experience. In the first week of life, newly hatched chicks were raised in controlled-rearing chambers that contained no objects other than a single virtual human face. In the second week of life, we used an automated forced-choice testing procedure to examine whether chicks could distinguish that familiar face from a variety of unfamiliar faces. Chicks successfully distinguished the familiar face from most of the unfamiliar faces; for example, chicks were sensitive to changes in the face's age, gender, and orientation (upright vs. inverted). Thus, chicks can build an accurate representation of the first face they see in their life. These results show that the initial state of face recognition is surprisingly powerful: Newborn visual systems can begin encoding and recognizing faces at the onset of vision. (c) 2015 APA, all rights reserved.
Identifying and Tracking Pedestrians Based on Sensor Fusion and Motion Stability Predictions
Musleh, Basam; García, Fernando; Otamendi, Javier; Armingol, José Mª; de la Escalera, Arturo
2010-01-01
The lack of trustworthy sensors makes development of Advanced Driver Assistance System (ADAS) applications a tough task. It is necessary to develop intelligent systems by combining reliable sensors and real-time algorithms to send the proper, accurate messages to the drivers. In this article, an application to detect and predict the movement of pedestrians in order to prevent an imminent collision has been developed and tested under real conditions. The proposed application, first, accurately measures the position of obstacles using a two-sensor hybrid fusion approach: a stereo camera vision system and a laser scanner. Second, it correctly identifies pedestrians using intelligent algorithms based on polylines and pattern recognition related to leg positions (laser subsystem) and dense disparity maps and u-v disparity (vision subsystem). Third, it uses statistical validation gates and confidence regions to track the pedestrian within the detection zones of the sensors and predict their position in the upcoming frames. The intelligent sensor application has been experimentally tested with success while tracking pedestrians that cross and move in zigzag fashion in front of a vehicle. PMID:22163639
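The "statistical validation gates" mentioned in this abstract can be illustrated with a common textbook construction. The sketch below is an assumption, not the authors' formulation: it treats the difference between a measured and a predicted 2-D pedestrian position as Gaussian with covariance `S`, and accepts the measurement when its squared Mahalanobis distance falls inside a chi-square confidence region. All names and the numeric values in the usage lines are hypothetical.

```python
import math


def gate_threshold(prob):
    """Chi-square quantile for 2 degrees of freedom.

    For 2 degrees of freedom the CDF is 1 - exp(-t / 2), so the quantile
    has the closed form below and needs no statistics library.
    """
    return -2.0 * math.log(1.0 - prob)


def mahalanobis_sq(z, z_pred, S):
    """Squared Mahalanobis distance of a 2-D innovation with covariance S."""
    dx, dy = z[0] - z_pred[0], z[1] - z_pred[1]
    (a, b), (c, d) = S
    det = a * d - b * c
    # Uses the explicit inverse of a 2x2 matrix: (1/det) * [[d, -b], [-c, a]].
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def in_gate(z, z_pred, S, prob=0.99):
    """True if measurement z lies inside the confidence region around z_pred."""
    return mahalanobis_sq(z, z_pred, S) <= gate_threshold(prob)


# Hypothetical numbers: predicted position (10 m, 2 m), 0.5 m standard deviation.
S = [[0.25, 0.0], [0.0, 0.25]]
print(in_gate((10.5, 2.5), (10.0, 2.0), S))  # close to the prediction
print(in_gate((12.0, 2.0), (10.0, 2.0), S))  # 2 m away along one axis
```

A tracker would typically associate only gated measurements with an existing pedestrian track, so the gate probability trades missed associations against false ones.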
Crowding by Invisible Flankers
Ho, Cristy; Cheung, Sing-Hang
2011-01-01
Background: Human object recognition degrades sharply as the target object moves from central vision into peripheral vision. In particular, one's ability to recognize a peripheral target is severely impaired by the presence of flanking objects, a phenomenon known as visual crowding. Recent studies on how visual awareness of flanker existence influences crowding had shown mixed results. More importantly, it is not known whether conscious awareness of the existence of both the target and flankers is necessary for crowding to occur. Methodology/Principal Findings: Here we show that crowding persists even when people are completely unaware of the flankers, which are rendered invisible through the continuous flash suppression technique. Contrast threshold for identifying the orientation of a grating pattern was elevated in the flanked condition, even when the subjects reported that they were unaware of the perceptually suppressed flankers. Moreover, we find that orientation-specific adaptation is attenuated by flankers even when both the target and flankers are invisible. Conclusions: These findings complement the suggested correlation between crowding and visual awareness. What's more, our results demonstrate that conscious awareness and attention are not prerequisite for crowding. PMID:22194919
NASA Astrophysics Data System (ADS)
Hashimoto, Manabu; Fujino, Yozo
Image sensing technologies are expected to be a useful and effective way to reduce damage from crime and disasters in a safe and secure society. In this paper, we describe current important subjects, required functions, technical trends, and a couple of real examples of developed systems. For video surveillance, recognition of human trajectories and human behavior using image processing techniques is introduced, with real examples of violence detection in elevators. In the field of facility monitoring for civil engineering, useful machine vision applications are shown, such as automatic detection of concrete cracks on building walls and recognition of crowds on bridges for effective guidance in emergencies.
A comparison of algorithms for inference and learning in probabilistic graphical models.
Frey, Brendan J; Jojic, Nebojsa
2005-09-01
Research into methods for reasoning under uncertainty is currently one of the most exciting areas of artificial intelligence, largely because it has recently become possible to record, store, and process large amounts of data. While impressive achievements have been made in pattern classification problems such as handwritten character recognition, face detection, speaker identification, and prediction of gene function, it is even more exciting that researchers are on the verge of introducing systems that can perform large-scale combinatorial analyses of data, decomposing the data into interacting components. For example, computational methods for automatic scene analysis are now emerging in the computer vision community. These methods decompose an input image into its constituent objects, lighting conditions, motion patterns, etc. Two of the main challenges are finding effective representations and models in specific applications and finding efficient algorithms for inference and learning in these models. In this paper, we advocate the use of graph-based probability models and their associated inference and learning algorithms. We review exact techniques and various approximate, computationally efficient techniques, including iterated conditional modes, the expectation maximization (EM) algorithm, Gibbs sampling, the mean field method, variational techniques, structured variational techniques and the sum-product algorithm ("loopy" belief propagation). We describe how each technique can be applied in a vision model of multiple, occluding objects and contrast the behaviors and performances of the techniques using a unifying cost function, free energy.
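Of the approximate techniques listed in this abstract, iterated conditional modes (ICM) is the simplest to show in a few lines. The sketch below is a generic illustration on a binary image with an Ising-style smoothness prior. It is not the occluding-objects vision model used in the paper, and the energy weights `data_weight` and `beta` are assumed values chosen for the example.

```python
def neighbors(r, c, rows, cols):
    """Yield the 4-connected neighbors of pixel (r, c) that lie inside the grid."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < rows and 0 <= cc < cols:
            yield rr, cc


def local_energy(labels, observed, r, c, value, data_weight, beta):
    """Energy terms that involve pixel (r, c) when it takes label `value`."""
    rows, cols = len(labels), len(labels[0])
    data = data_weight * (value != observed[r][c])
    smooth = beta * sum(value != labels[rr][cc] for rr, cc in neighbors(r, c, rows, cols))
    return data + smooth


def icm(observed, data_weight=1.0, beta=1.0, max_sweeps=10):
    """Greedy coordinate-wise energy minimization for a binary label image."""
    labels = [row[:] for row in observed]  # start from the noisy observation
    rows, cols = len(labels), len(labels[0])
    for _ in range(max_sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                current = labels[r][c]
                flipped = 1 - current
                e_cur = local_energy(labels, observed, r, c, current, data_weight, beta)
                e_flip = local_energy(labels, observed, r, c, flipped, data_weight, beta)
                if e_flip < e_cur:  # accept only strict improvements
                    labels[r][c] = flipped
                    changed = True
        if not changed:
            break  # reached a local minimum of the energy
    return labels


noisy = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
print(icm(noisy))
```

Each accepted flip strictly lowers the total energy, so ICM always terminates, but only at a local minimum; this limitation is one reason to compare it against methods such as Gibbs sampling or the sum-product algorithm.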
Dynamic Programming and Graph Algorithms in Computer Vision*
Felzenszwalb, Pedro F.; Zabih, Ramin
2013-01-01
Optimization is a powerful paradigm for expressing and solving problems in a wide range of areas, and has been successfully applied to many vision problems. Discrete optimization techniques are especially interesting, since by carefully exploiting problem structure they often provide non-trivial guarantees concerning solution quality. In this paper we briefly review dynamic programming and graph algorithms, and discuss representative examples of how these discrete optimization techniques have been applied to some classical vision problems. We focus on the low-level vision problem of stereo; the mid-level problem of interactive object segmentation; and the high-level problem of model-based recognition. PMID:20660950
Image segmentation for enhancing symbol recognition in prosthetic vision.
Horne, Lachlan; Barnes, Nick; McCarthy, Chris; He, Xuming
2012-01-01
Current and near-term implantable prosthetic vision systems offer the potential to restore some visual function, but suffer from poor resolution and dynamic range of induced phosphenes. This can make it difficult for users of prosthetic vision systems to identify symbolic information (such as signs) except in controlled conditions. Using image segmentation techniques from computer vision, we show it is possible to improve the clarity of such symbolic information for users of prosthetic vision implants in uncontrolled conditions. We use image segmentation to automatically divide a natural image into regions, and using a fixation point controlled by the user, select a region to phosphenize. This technique improves the apparent contrast and clarity of symbolic information over traditional phosphenization approaches.
Avola, Danilo; Spezialetti, Matteo; Placidi, Giuseppe
2013-06-01
Rehabilitation is often required after stroke, surgery, or degenerative diseases. It has to be specific for each patient and can be easily calibrated if assisted by human-computer interfaces and virtual reality. Recognition and tracking of different human body landmarks represent the basic features for the design of the next generation of human-computer interfaces. The most advanced systems for capturing human gestures are focused on vision-based techniques which, on the one hand, may require compromises from real-time and spatial precision and, on the other hand, ensure natural interaction experience. The integration of vision-based interfaces with thematic virtual environments encourages the development of novel applications and services regarding rehabilitation activities. The algorithmic processes involved during gesture recognition activity, as well as the characteristics of the virtual environments, can be developed with different levels of accuracy. This paper describes the architectural aspects of a framework supporting real-time vision-based gesture recognition and virtual environments for fast prototyping of customized exercises for rehabilitation purposes. The goal is to provide the therapist with a tool for fast implementation and modification of specific rehabilitation exercises for specific patients, during functional recovery. Pilot examples of designed applications and preliminary system evaluation are reported and discussed. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Kardava, Irakli; Tadyszak, Krzysztof; Gulua, Nana; Jurga, Stefan
2017-02-01
More flexible environmental perception by artificial intelligence requires supporting software modules that can automate the creation of language-specific syntax and carry out further analysis to reach relevant decisions based on semantic functions. With our proposed approach, it is possible to create pairs of formal rules for given sentences (in the case of natural languages) or statements (in the case of special languages) with the help of computer vision, speech recognition, or an editable text conversion system, for further automatic improvement. In other words, we have developed an approach that can significantly improve the automation of the training process of artificial intelligence, which as a result will give it a higher level of self-developing skills, independent of us (the users). On the basis of our approach we have developed a demo version of the software, which includes the algorithm and software code implementing all of the above-mentioned components (computer vision, speech recognition, and editable text conversion system). The program can work in multi-stream mode and simultaneously create a syntax based on information received from several sources.
Real-Time (Vision-Based) Road Sign Recognition Using an Artificial Neural Network.
Islam, Kh Tohidul; Raj, Ram Gopal
2017-04-13
Road sign recognition is a driver support function that can be used to notify and warn the driver by showing the restrictions that may be effective on the current stretch of road. Examples for such regulations are 'traffic light ahead' or 'pedestrian crossing' indications. The present investigation targets the recognition of Malaysian road and traffic signs in real-time. Real-time video is taken by a digital camera from a moving vehicle and real world road signs are then extracted using vision-only information. The system is based on two stages, one performs the detection and another one is for recognition. In the first stage, a hybrid color segmentation algorithm has been developed and tested. In the second stage, an introduced robust custom feature extraction method is used for the first time in a road sign recognition approach. Finally, a multilayer artificial neural network (ANN) has been created to recognize and interpret various road signs. It is robust because it has been tested on both standard and non-standard road signs with significant recognition accuracy. This proposed system achieved an average of 99.90% accuracy with 99.90% of sensitivity, 99.90% of specificity, 99.90% of f-measure, and 0.001 of false positive rate (FPR) with 0.3 s computational time. This low FPR can increase the system stability and dependability in real-time applications.
Real-Time (Vision-Based) Road Sign Recognition Using an Artificial Neural Network
Islam, Kh Tohidul; Raj, Ram Gopal
2017-01-01
Road sign recognition is a driver support function that can be used to notify and warn the driver by showing the restrictions that may be effective on the current stretch of road. Examples for such regulations are ‘traffic light ahead’ or ‘pedestrian crossing’ indications. The present investigation targets the recognition of Malaysian road and traffic signs in real-time. Real-time video is taken by a digital camera from a moving vehicle and real world road signs are then extracted using vision-only information. The system is based on two stages, one performs the detection and another one is for recognition. In the first stage, a hybrid color segmentation algorithm has been developed and tested. In the second stage, an introduced robust custom feature extraction method is used for the first time in a road sign recognition approach. Finally, a multilayer artificial neural network (ANN) has been created to recognize and interpret various road signs. It is robust because it has been tested on both standard and non-standard road signs with significant recognition accuracy. This proposed system achieved an average of 99.90% accuracy with 99.90% of sensitivity, 99.90% of specificity, 99.90% of f-measure, and 0.001 of false positive rate (FPR) with 0.3 s computational time. This low FPR can increase the system stability and dependability in real-time applications. PMID:28406471
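The evaluation measures quoted in this abstract are all derived from the four counts of a binary confusion matrix. The sketch below shows the standard definitions; the counts in the usage line are hypothetical and are not the paper's data. Note that the false positive rate equals one minus the specificity, which is consistent with the reported 99.90% specificity and 0.001 FPR.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard measures computed from true/false positive/negative counts."""
    sensitivity = tp / (tp + fn)            # recall, true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": 2 * precision * sensitivity / (precision + sensitivity),
        "fpr": fp / (fp + tn),              # equals 1 - specificity
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
print(binary_metrics(tp=999, fp=1, tn=999, fn=1))
```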
Simulation Test of a Head-Worn Display with Ambient Vision Display for Unusual Attitude Recovery
NASA Technical Reports Server (NTRS)
Arthur, Jarvis (Trey) J., III; Nicholas, Stephanie N.; Shelton, Kevin J.; Ballard, Kathryn; Prinzel, Lawrence J., III; Ellis, Kyle E.; Bailey, Randall E.; Williams, Steven P.
2017-01-01
Head-Worn Displays (HWDs) are envisioned as a possible equivalent to a Head-Up Display (HUD) in commercial and general aviation. A simulation experiment was conducted to evaluate whether the HWD can provide an equivalent or better level of performance to a HUD in terms of unusual attitude recognition and recovery. A prototype HWD was tested with an ambient vision capability, which was varied (on/off) as an independent variable in the experiment testing for attitude awareness. The simulation experiment was conducted in two parts: 1) short unusual attitude recovery scenarios where the aircraft is placed in an unusual attitude and a single-pilot crew recovered the aircraft; and, 2) a two-pilot crew operating in a realistic flight environment with "off-nominal" events to induce unusual attitudes. The data showed few differences in unusual attitude recognition and recovery performance between the tested head-down, head-up, and head-worn display concepts. The effect of the presence or absence of ambient vision stimulation was inconclusive. The ergonomic influences of the head-worn display, necessary to implement the ambient vision experimentation, may have influenced the pilot ratings and acceptance of the concepts.
NASA Astrophysics Data System (ADS)
Duclos, D.; Lonnoy, J.; Guillerm, Q.; Jurie, F.; Herbin, S.; D'Angelo, E.
2008-04-01
The last five years have seen a renewal of Automatic Target Recognition applications, mainly because of the latest advances in machine learning techniques. In this context, large collections of image datasets are essential for training algorithms as well as for their evaluation. Indeed, the recent proliferation of recognition algorithms, generally applied to slightly different problems, makes their comparison through clean evaluation campaigns necessary. The ROBIN project tries to fulfil these two needs by putting unclassified datasets, ground truths, competitions and metrics for the evaluation of ATR algorithms at the disposal of the scientific community. The scope of this project includes single and multi-class generic target detection and generic target recognition, in military and security contexts. To our knowledge, it is the first time that a database of this importance (several hundred thousand visible and infrared hand-annotated images) has been publicly released. Funded by the French Ministry of Defence (DGA) and by the French Ministry of Research, ROBIN is one of the ten Techno-vision projects. Techno-vision is a large and ambitious government initiative for building evaluation means for computer vision technologies, for various application contexts. ROBIN's consortium includes major companies and research centres involved in Computer Vision R&D in the field of defence: Bertin Technologies, CNES, ECA, DGA, EADS, INRIA, ONERA, MBDA, SAGEM, THALES. This paper, which first gives an overview of the whole project, is focused on one of ROBIN's key competitions, the SAGEM Defence Security database. This dataset contains more than eight hundred ground and aerial infrared images of six different vehicles in cluttered scenes including distracters. Two different sets of data are available for each target. The first set includes different views of each vehicle at close range in a "simple" background, and can be used to train algorithms. The second set contains many views of the same vehicle in different contexts and situations simulating operational scenarios.
A Decade of Neural Networks: Practical Applications and Prospects
NASA Technical Reports Server (NTRS)
Kemeny, Sabrina E.
1994-01-01
The Jet Propulsion Laboratory Neural Network Workshop, sponsored by NASA and DOD, brings together sponsoring agencies, active researchers, and the user community to formulate a vision for the next decade of neural network research and application prospects. While the speed and computing power of microprocessors continue to grow at an ever-increasing pace, the demand to intelligently and adaptively deal with the complex, fuzzy, and often ill-defined world around us remains to a large extent unaddressed. Powerful, highly parallel computing paradigms such as neural networks promise to have a major impact in addressing these needs. Papers in the workshop proceedings highlight benefits of neural networks in real-world applications compared to conventional computing techniques. Topics include fault diagnosis, pattern recognition, and multiparameter optimization.
Spoof Detection for Finger-Vein Recognition System Using NIR Camera.
Nguyen, Dat Tien; Yoon, Hyo Sik; Pham, Tuyen Danh; Park, Kang Ryoung
2017-10-01
Finger-vein recognition, a new and advanced biometrics recognition method, is attracting the attention of researchers because of its advantages such as high recognition performance and lesser likelihood of theft and inaccuracies occurring on account of skin condition defects. However, as reported by previous researchers, it is possible to attack a finger-vein recognition system by using presentation attack (fake) finger-vein images. As a result, spoof detection, named as presentation attack detection (PAD), is necessary in such recognition systems. Previous attempts to establish PAD methods primarily focused on designing feature extractors by hand (handcrafted feature extractor) based on the observations of the researchers about the difference between real (live) and presentation attack finger-vein images. Therefore, the detection performance was limited. Recently, the deep learning framework has been successfully applied in computer vision and delivered superior results compared to traditional handcrafted methods on various computer vision applications such as image-based face recognition, gender recognition and image classification. In this paper, we propose a PAD method for near-infrared (NIR) camera-based finger-vein recognition system using convolutional neural network (CNN) to enhance the detection ability of previous handcrafted methods. Using the CNN method, we can derive a more suitable feature extractor for PAD than the other handcrafted methods using a training procedure. We further process the extracted image features to enhance the presentation attack finger-vein image detection ability of the CNN method using principal component analysis method (PCA) for dimensionality reduction of feature space and support vector machine (SVM) for classification. 
Through extensive experimental results, we confirm that our proposed method is adequate for presentation attack finger-vein image detection and it can deliver superior detection results compared to CNN-based methods and other previous handcrafted methods.
Spoof Detection for Finger-Vein Recognition System Using NIR Camera
Nguyen, Dat Tien; Yoon, Hyo Sik; Pham, Tuyen Danh; Park, Kang Ryoung
2017-01-01
Finger-vein recognition, a new and advanced biometrics recognition method, is attracting the attention of researchers because of its advantages such as high recognition performance and lesser likelihood of theft and inaccuracies occurring on account of skin condition defects. However, as reported by previous researchers, it is possible to attack a finger-vein recognition system by using presentation attack (fake) finger-vein images. As a result, spoof detection, named as presentation attack detection (PAD), is necessary in such recognition systems. Previous attempts to establish PAD methods primarily focused on designing feature extractors by hand (handcrafted feature extractor) based on the observations of the researchers about the difference between real (live) and presentation attack finger-vein images. Therefore, the detection performance was limited. Recently, the deep learning framework has been successfully applied in computer vision and delivered superior results compared to traditional handcrafted methods on various computer vision applications such as image-based face recognition, gender recognition and image classification. In this paper, we propose a PAD method for near-infrared (NIR) camera-based finger-vein recognition system using convolutional neural network (CNN) to enhance the detection ability of previous handcrafted methods. Using the CNN method, we can derive a more suitable feature extractor for PAD than the other handcrafted methods using a training procedure. We further process the extracted image features to enhance the presentation attack finger-vein image detection ability of the CNN method using principal component analysis method (PCA) for dimensionality reduction of feature space and support vector machine (SVM) for classification. 
Through extensive experimental results, we confirm that our proposed method is adequate for presentation attack finger-vein image detection and it can deliver superior detection results compared to CNN-based methods and other previous handcrafted methods. PMID:28974031
Component-based target recognition inspired by human vision
NASA Astrophysics Data System (ADS)
Zheng, Yufeng; Agyepong, Kwabena
2009-05-01
In contrast with machine vision, humans can recognize an object against a complex background with great flexibility. For example, given the task of finding and circling all cars (no further information) in a picture, you may build a virtual image in mind from the task (or target) description before looking at the picture. Specifically, the virtual car image may be composed of key components such as the driver cabin and wheels. In this paper, we propose a component-based target recognition method that simulates the human recognition process. The component templates (equivalent to the virtual image in mind) of the target (car) are manually decomposed from the target feature image. Meanwhile, the edges of the testing image can be extracted by using a difference of Gaussian (DOG) model that simulates the spatiotemporal response in the visual process. A phase correlation matching algorithm is then applied to match the templates with the testing edge image. If all key component templates are matched with the object under examination, then this object is recognized as the target. Besides the recognition accuracy, we will also investigate whether this method works with partial targets (half cars). In our experiments, several natural pictures taken on streets were used to test the proposed method. The preliminary results show that the component-based recognition method is very promising.
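The phase correlation step named in this abstract can be sketched in one dimension. This is a simplified illustration, not the authors' implementation: the paper matches two-dimensional component templates against edge images, whereas the code below recovers a circular shift between two short 1D signals using a naive discrete Fourier transform. All function names are hypothetical.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse is scaled by 1/n)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n))
        for m in range(n)
    ]
    return [v / n for v in out] if inverse else out


def phase_correlation_shift(template, signal):
    """Return the circular shift that best aligns template with signal."""
    F, G = dft(template), dft(signal)
    cross = []
    for f, g in zip(F, G):
        c = g * f.conjugate()
        mag = abs(c)
        # Keep only the phase; bins with no energy carry no information.
        cross.append(c / mag if mag > 1e-12 else 0j)
    corr = [v.real for v in dft(cross, inverse=True)]
    return max(range(len(corr)), key=corr.__getitem__)


template = [0, 1, 3, 1, 0, 0, 0, 0]
signal = [0, 0, 0, 0, 1, 3, 1, 0]  # template shifted right by three samples
print(phase_correlation_shift(template, signal))
```

Normalising the cross-power spectrum to unit magnitude discards amplitude and keeps only phase differences, so the inverse transform concentrates into a sharp peak at the displacement.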
Low-cost real-time automatic wheel classification system
NASA Astrophysics Data System (ADS)
Shabestari, Behrouz N.; Miller, John W. V.; Wedding, Victoria
1992-11-01
This paper describes the design and implementation of a low-cost machine vision system for identifying various types of automotive wheels which are manufactured in several styles and sizes. In this application, a variety of wheels travel on a conveyor in random order through a number of processing steps. One of these processes requires the identification of the wheel type, which was previously performed manually by an operator. A vision system was designed to provide the required identification. The system consisted of an annular illumination source, a CCD TV camera, frame grabber, and 386-compatible computer. Statistical pattern recognition techniques were used to provide robust classification as well as a simple means for adding new wheel designs to the system. Maintenance of the system can be performed by plant personnel with minimal training. The basic steps for identification include image acquisition, segmentation of the regions of interest, extraction of selected features, and classification. The vision system has been installed in a plant and has proven to be extremely effective. The system correctly identifies wheels at rates of up to 30 wheels per minute regardless of rotational orientation in the camera's field of view. Correct classification can even be achieved if a portion of the wheel is blocked off from the camera. Significant cost savings have been achieved by a reduction in scrap associated with incorrect manual classification as well as a reduction of labor in a tedious task.
García-Hernández, Alejandra; Galván-Tejada, Carlos E; Galván-Tejada, Jorge I; Celaya-Padilla, José M; Gamboa-Rosales, Hamurabi; Velasco-Elizondo, Perla; Cárdenas-Vargas, Rogelio
2017-11-21
Human Activity Recognition (HAR) is one of the main subjects of study in the areas of computer vision and machine learning due to the great benefits that can be achieved. Examples of the study areas are: health prevention, security and surveillance, automotive research, and many others. The proposed approaches are carried out using machine learning techniques and present good results. However, it is difficult to observe how the descriptors of human activities are grouped. In order to obtain a better understanding of the behavior of descriptors, it is important to improve the abilities to recognize the human activities. This paper proposes a novel approach for the HAR based on acoustic data and similarity networks. In this approach, we were able to characterize the sound of the activities and identify those activities by looking for similarity in the sound pattern. We evaluated the similarity of the sounds considering mainly two features: the sound location and the materials that were used. As a result, the materials are a better reference for classifying the human activities than the location.
Gestonurse: a robotic surgical nurse for handling surgical instruments in the operating room.
Jacob, Mithun; Li, Yu-Ting; Akingba, George; Wachs, Juan P
2012-03-01
While surgeon-scrub nurse collaboration provides a fast, straightforward and inexpensive method of delivering surgical instruments to the surgeon, it often results in "mistakes" (e.g. missing information, ambiguity of instructions and delays). It has been shown that these errors can have a negative impact on the outcome of the surgery. These errors could potentially be reduced or eliminated by introducing robotics into the operating room. Gesture control is a natural and fundamentally sound alternative that allows interaction without disturbing the normal flow of surgery. This paper describes the development of a robotic scrub nurse Gestonurse to support surgeons by passing surgical instruments during surgery as required. The robot responds to recognized hand signals detected through sophisticated computer vision and pattern recognition techniques. Experimental results show that 95% of the gestures were recognized correctly. The gesture recognition algorithm presented is robust to changes in scale and rotation of the hand gestures. The system was compared to human task performance and was found to be only 0.83 s slower on average.
Bank note recognition for the vision impaired.
Hinwood, A; Preston, P; Suaning, G J; Lovell, N H
2006-06-01
Blind Australians find great difficulty in recognising bank notes. Each note has the same feel, with no Braille markings, irregular edges or other tangible features. In Australia, there is only one device available that can assist blind people recognise their notes. Internationally, there are devices available; however they are expensive, complex and have not been developed to cater for Australian currency. This paper discusses a new device, the MoneyTalker that takes advantage of the largely different colours and patterns on each Australian bank note and recognises the notes electronically, using the reflection and transmission properties of light. Different coloured lights are transmitted through the inserted note and the corresponding sensors detect distinct ranges of values depending on the colour of the note. Various classification algorithms were studied and the final algorithm was chosen based on accuracy and speed of recognition. The MoneyTalker has shown an accuracy of more than 99%. A blind subject has tested the device and believes that it is usable, compact and affordable. Based on the devices that are available currently in Australia, the MoneyTalker is an effective alternative in terms of accuracy and usability.
García-Hernández, Alejandra; Galván-Tejada, Jorge I.; Celaya-Padilla, José M.; Velasco-Elizondo, Perla; Cárdenas-Vargas, Rogelio
2017-01-01
Human Activity Recognition (HAR) is one of the main subjects of study in the areas of computer vision and machine learning due to the great benefits that can be achieved. Examples of the study areas are: health prevention, security and surveillance, automotive research, and many others. The proposed approaches are carried out using machine learning techniques and present good results. However, it is difficult to observe how the descriptors of human activities are grouped. In order to obtain a better understanding of the behavior of descriptors, it is important to improve the abilities to recognize the human activities. This paper proposes a novel approach for the HAR based on acoustic data and similarity networks. In this approach, we were able to characterize the sound of the activities and identify those activities by looking for similarity in the sound pattern. We evaluated the similarity of the sounds considering mainly two features: the sound location and the materials that were used. As a result, the materials are a better reference for classifying the human activities than the location. PMID:29160799
Data-driven indexing mechanism for the recognition of polyhedral objects
NASA Astrophysics Data System (ADS)
McLean, Stewart; Horan, Peter; Caelli, Terry M.
1992-02-01
This paper is concerned with the problem of searching large model databases. To date, most object recognition systems have concentrated on the problem of matching using simple searching algorithms. This is quite acceptable when the number of object models is small. However, in the future, general purpose computer vision systems will be required to recognize hundreds or perhaps thousands of objects and, in such circumstances, efficient searching algorithms will be needed. The problem of searching a large model database is one which must be addressed if future computer vision systems are to be at all effective. In this paper we present a method we call data-driven feature-indexed hypothesis generation as one solution to the problem of searching large model databases.
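The general idea of feature-indexed hypothesis generation can be sketched as an inverted index from features to the models that contain them, so that observed features vote for candidate models instead of every model being matched in turn. This is an assumption about the broad technique, not the paper's actual data structure; the model names and feature labels below are hypothetical.

```python
from collections import Counter, defaultdict


def build_index(models):
    """Map each feature to the set of models in which it occurs."""
    index = defaultdict(set)
    for name, features in models.items():
        for feature in features:
            index[feature].add(name)
    return index


def hypotheses(index, observed, top=3):
    """Rank candidate models by how many observed features they explain."""
    votes = Counter()
    for feature in observed:
        for name in index.get(feature, ()):
            votes[name] += 1
    return votes.most_common(top)


# Hypothetical polyhedral models described by coarse feature labels.
models = {
    "cube": {"L-junction", "Y-junction", "4-sided-face"},
    "wedge": {"L-junction", "3-sided-face", "4-sided-face"},
    "pyramid": {"3-sided-face", "arrow-junction"},
}
index = build_index(models)
print(hypotheses(index, ["3-sided-face", "4-sided-face", "L-junction"]))
```

Lookup cost depends on the number of observed features and the models sharing them rather than on the total size of the database, which is the property that matters when the model base grows to hundreds or thousands of objects.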
A Review on Human Activity Recognition Using Vision-Based Method.
Zhang, Shugang; Wei, Zhiqiang; Nie, Jie; Huang, Lei; Wang, Shuang; Li, Zhen
2017-01-01
Human activity recognition (HAR) aims to recognize activities from a series of observations on the actions of subjects and the environmental conditions. The vision-based HAR research is the basis of many applications including video surveillance, health care, and human-computer interaction (HCI). This review highlights the advances of state-of-the-art activity recognition approaches, especially for the activity representation and classification methods. For the representation methods, we sort out a chronological research trajectory from global representations to local representations, and recent depth-based representations. For the classification methods, we conform to the categorization of template-based methods, discriminative models, and generative models and review several prevalent methods. Next, representative and available datasets are introduced. Aiming to provide an overview of those methods and a convenient way of comparing them, we classify existing literatures with a detailed taxonomy including representation and classification methods, as well as the datasets they used. Finally, we investigate the directions for future research.
Spatiotemporal dynamics underlying object completion in human ventral visual cortex.
Tang, Hanlin; Buia, Calin; Madhavan, Radhika; Crone, Nathan E; Madsen, Joseph R; Anderson, William S; Kreiman, Gabriel
2014-08-06
Natural vision often involves recognizing objects from partial information. Recognition of objects from parts presents a significant challenge for theories of vision because it requires spatial integration and extrapolation from prior knowledge. Here we recorded intracranial field potentials of 113 visually selective electrodes from epilepsy patients in response to whole and partial objects. Responses along the ventral visual stream, particularly the inferior occipital and fusiform gyri, remained selective despite showing only 9%-25% of the object areas. However, these visually selective signals emerged ∼100 ms later for partial versus whole objects. These processing delays were particularly pronounced in higher visual areas within the ventral stream. This latency difference persisted when controlling for changes in contrast, signal amplitude, and the strength of selectivity. These results argue against a purely feedforward explanation of recognition from partial information, and provide spatiotemporal constraints on theories of object recognition that involve recurrent processing. Copyright © 2014 Elsevier Inc. All rights reserved.
A Review on Human Activity Recognition Using Vision-Based Method
Nie, Jie
2017-01-01
Human activity recognition (HAR) aims to recognize activities from a series of observations on the actions of subjects and the environmental conditions. The vision-based HAR research is the basis of many applications including video surveillance, health care, and human-computer interaction (HCI). This review highlights the advances of state-of-the-art activity recognition approaches, especially for the activity representation and classification methods. For the representation methods, we sort out a chronological research trajectory from global representations to local representations, and recent depth-based representations. For the classification methods, we conform to the categorization of template-based methods, discriminative models, and generative models and review several prevalent methods. Next, representative and available datasets are introduced. Aiming to provide an overview of those methods and a convenient way of comparing them, we classify existing literatures with a detailed taxonomy including representation and classification methods, as well as the datasets they used. Finally, we investigate the directions for future research. PMID:29065585
Activity Recognition in Egocentric video using SVM, kNN and Combined SVMkNN Classifiers
NASA Astrophysics Data System (ADS)
Sanal Kumar, K. P.; Bhavani, R., Dr.
2017-08-01
Egocentric vision is a unique, human-centric perspective in computer vision. The recognition of egocentric actions is a challenging task that can help in assisting elderly people, disabled patients and so on. In this work, life-logging activity videos are taken as input. Activities are organized into two categories: a top level and a second level. Recognition uses features such as Histogram of Oriented Gradients (HOG), Motion Boundary Histogram (MBH) and Trajectory. The features are fused together to act as a single feature, and the fused features are reduced using Principal Component Analysis (PCA). The reduced features are provided as input to classifiers: Support Vector Machine (SVM), k Nearest Neighbor (kNN), and a combination of the two (combined SVMkNN). These classifiers are evaluated, and the combined SVMkNN provided better results than other classifiers in the literature.
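Two of the steps described in this abstract, fusing several descriptors into a single feature vector and classifying with k nearest neighbours, can be sketched as follows. This is a minimal illustration under assumptions: fusion is taken to be plain concatenation, the PCA and SVM stages are omitted, and the tiny descriptor values and activity labels are hypothetical rather than data from the paper.

```python
import math
from collections import Counter


def fuse(*descriptors):
    """Concatenate per-descriptor vectors (e.g. HOG, MBH, trajectory) into one."""
    return [value for descriptor in descriptors for value in descriptor]


def knn_predict(train, query, k=3):
    """Majority label among the k training vectors closest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical two-value descriptors per clip, fused into four-value vectors.
train = [
    (fuse([0.9, 0.1], [0.8, 0.2]), "cooking"),
    (fuse([0.8, 0.2], [0.9, 0.1]), "cooking"),
    (fuse([0.7, 0.3], [0.7, 0.2]), "cooking"),
    (fuse([0.1, 0.9], [0.2, 0.8]), "reading"),
    (fuse([0.2, 0.8], [0.1, 0.9]), "reading"),
]
print(knn_predict(train, fuse([0.85, 0.15], [0.8, 0.1])))
```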
Good initialization model with constrained body structure for scene text recognition
NASA Astrophysics Data System (ADS)
Zhu, Anna; Wang, Guoyou; Dong, Yangbo
2016-09-01
Scene text recognition has gained significant attention in the computer vision community. Character detection and recognition are the premise of text recognition and affect the overall performance to a large extent. We propose a good initialization model for scene character recognition from cropped text regions. We use constrained character body structures with deformable part-based models to detect and recognize characters in various backgrounds. The character body structures are obtained by an unsupervised discriminative clustering approach followed by a statistical model and a self-built minimum spanning tree model. Our method utilizes part appearance and location information, and combines character detection and recognition in cropped text regions. The evaluation results on the benchmark datasets demonstrate that our proposed scheme outperforms the state-of-the-art methods in both scene character recognition and word recognition.
Scene and human face recognition in the central vision of patients with glaucoma
Aptel, Florent; Attye, Arnaud; Guyader, Nathalie; Boucart, Muriel; Chiquet, Christophe; Peyrin, Carole
2018-01-01
Primary open-angle glaucoma (POAG) initially affects mainly peripheral vision. Current behavioral studies support the idea that visual defects of patients with POAG extend into parts of the central visual field classified as normal by static automated perimetry analysis. This is particularly true for visual tasks involving processes of a higher level than mere detection. The purpose of this study was to assess visual abilities of POAG patients in central vision. Patients were assigned to two groups following a visual field examination (Humphrey 24–2 SITA-Standard test). Patients with both peripheral and central defects and patients with peripheral but no central defect, as well as age-matched controls, participated in the experiment. All participants had to perform two visual tasks where low-contrast stimuli were presented in the central 6° of the visual field. A categorization task of scene images and human face images assessed high-level visual recognition abilities. In contrast, a detection task using the same stimuli assessed low-level visual function. The difference in performance between detection and categorization revealed the cost of high-level visual processing. Compared to controls, patients with a central visual defect showed a deficit in both detection and categorization of all low-contrast images. This is consistent with the abnormal retinal sensitivity as assessed by perimetry. However, the deficit was greater for categorization than detection. Patients without a central defect showed similar performances to the controls concerning the detection and categorization of faces. However, while the detection of scene images was well-maintained, these patients showed a deficit in their categorization. This suggests that the simple loss of peripheral vision could be detrimental to scene recognition, even when the information is displayed in central vision. This study revealed subtle defects in the central visual field of POAG patients that cannot be predicted by static automated perimetry assessment using Humphrey 24–2 SITA-Standard test. PMID:29481572
THE EFFECT OF WORD ASSOCIATIONS ON THE RECOGNITION OF FLASHED WORDS.
ERIC Educational Resources Information Center
SAMUELS, S. JAY
THE HYPOTHESIS THAT WHEN ASSOCIATED PAIRS OF WORDS ARE PRESENTED, SPEED OF RECOGNITION WILL BE FASTER THAN WHEN NONASSOCIATED WORD PAIRS ARE PRESENTED OR WHEN A TARGET WORD IS PRESENTED BY ITSELF WAS TESTED. TWENTY UNIVERSITY STUDENTS, INITIALLY SCREENED FOR VISION, WERE ASSIGNED RANDOMLY TO ROWS OF A 5 X 5 REPEATED-MEASURES LATIN SQUARE DESIGN.…
A Vision-Based Counting and Recognition System for Flying Insects in Intelligent Agriculture.
Zhong, Yuanhong; Gao, Junyuan; Lei, Qilun; Zhou, Yao
2018-05-09
Rapid and accurate counting and recognition of flying insects are of great importance, especially for pest control. Traditional manual identification and counting of flying insects is labor intensive and inefficient. In this study, a vision-based counting and classification system for flying insects is designed and implemented. The system is constructed as follows: firstly, a yellow sticky trap is installed in the surveillance area to trap flying insects and a camera is set up to collect real-time images. Then the detection and coarse counting method based on You Only Look Once (YOLO) object detection, the classification method and fine counting based on Support Vector Machines (SVM) using global features are designed. Finally, the insect counting and recognition system is implemented on Raspberry PI. Six species of flying insects including bee, fly, mosquito, moth, chafer and fruit fly are selected to assess the effectiveness of the system. Compared with the conventional methods, the test results show promising performance. The average counting accuracy is 92.50% and average classifying accuracy is 90.18% on Raspberry PI. The proposed system is easy-to-use and provides efficient and accurate recognition data, therefore, it can be used for intelligent agriculture applications.
NASA Astrophysics Data System (ADS)
Xu, Weidong; Lei, Zhu; Yuan, Zhang; Gao, Zhenqing
2018-03-01
The application of visual recognition technology to the grasping and placing operations of industrial robots is one of the key tasks in the field of robot research. In order to improve the efficiency and intelligence of material sorting in the production line, and especially to realize the sorting of scattered items, a robot target recognition and positioning platform for grasping based on binocular vision is researched and developed. The images were collected by a binocular camera and preprocessed. The Harris operator was used to identify the corners in the images, and the Canny operator was used to detect their edges. Hough-chain code recognition was used to identify the target in the image, obtain the coordinates of each of its vertices, calculate the spatial position and posture of the target item, and determine the information needed for the grasping movement, which is transmitted to the robot controller for the grasping operation. Finally, we use this method in an experiment on the wrapping problem in the express sorting process. The experimental results show that the platform can effectively solve the problem of sorting loose parts, so as to achieve the purpose of efficient and intelligent sorting.
A Vision-Based Counting and Recognition System for Flying Insects in Intelligent Agriculture
Zhong, Yuanhong; Gao, Junyuan; Lei, Qilun; Zhou, Yao
2018-01-01
Rapid and accurate counting and recognition of flying insects are of great importance, especially for pest control. Traditional manual identification and counting of flying insects is labor intensive and inefficient. In this study, a vision-based counting and classification system for flying insects is designed and implemented. The system is constructed as follows: firstly, a yellow sticky trap is installed in the surveillance area to trap flying insects and a camera is set up to collect real-time images. Then the detection and coarse counting method based on You Only Look Once (YOLO) object detection, the classification method and fine counting based on Support Vector Machines (SVM) using global features are designed. Finally, the insect counting and recognition system is implemented on Raspberry PI. Six species of flying insects including bee, fly, mosquito, moth, chafer and fruit fly are selected to assess the effectiveness of the system. Compared with the conventional methods, the test results show promising performance. The average counting accuracy is 92.50% and average classifying accuracy is 90.18% on Raspberry PI. The proposed system is easy-to-use and provides efficient and accurate recognition data, therefore, it can be used for intelligent agriculture applications. PMID:29747429
Vision Systems with the Human in the Loop
NASA Astrophysics Data System (ADS)
Bauckhage, Christian; Hanheide, Marc; Wrede, Sebastian; Käster, Thomas; Pfeiffer, Michael; Sagerer, Gerhard
2005-12-01
The emerging cognitive vision paradigm deals with vision systems that apply machine learning and automatic reasoning in order to learn from what they perceive. Cognitive vision systems can rate the relevance and consistency of newly acquired knowledge; they can adapt to their environment and thus will exhibit high robustness. This contribution presents vision systems that aim at flexibility and robustness. One is tailored for content-based image retrieval; the others are cognitive vision systems that constitute prototypes of visual active memories which evaluate, gather, and integrate contextual knowledge for visual analysis. All three systems are designed to interact with human users. After discussing adaptive content-based image retrieval and object and action recognition in an office environment, we raise the issue of assessing cognitive systems. Experiences from psychologically evaluated human-machine interactions will be reported and the promising potential of psychologically-based usability experiments will be stressed.
Feedforward object-vision models only tolerate small image variations compared to human
Ghodrati, Masoud; Farzmahdi, Amirhossein; Rajaei, Karim; Ebrahimpour, Reza; Khaligh-Razavi, Seyed-Mahdi
2014-01-01
Invariant object recognition is a remarkable ability of the primate visual system whose underlying mechanism has been under intense investigation. Computational modeling is a valuable tool toward understanding the processes involved in invariant object recognition. Although recent computational models have shown outstanding performances on challenging image databases, they fail to perform well in image categorization under more complex image variations. Studies have shown that making sparse representation of objects by extracting more informative visual features through a feedforward sweep can lead to higher recognition performances. Here, however, we show that when the complexity of image variations is high, even this approach results in poor performance compared to humans. To assess the performance of models and humans in invariant object recognition tasks, we built a parametrically controlled image database consisting of several object categories varied in different dimensions and levels, rendered from 3D planes. Comparing the performance of several object recognition models with human observers shows that the models perform similarly to humans in categorization tasks only under low-level image variations. Furthermore, the results of our behavioral experiments demonstrate that, even under difficult experimental conditions (i.e., briefly presented masked stimuli with complex image variations), human observers performed outstandingly well, suggesting that the models are still far from resembling humans in invariant object recognition. Taken together, we suggest that learning sparse informative visual features, although desirable, is not a complete solution for future progress in object-vision modeling. We show that this approach is not of significant help in solving the computational crux of object recognition (i.e., invariant object recognition) when the identity-preserving image variations become more complex. PMID:25100986
Night Vision Laboratory Static Performance Model for Thermal Viewing Systems
1975-04-01
Research and Development Technical Report ECOM- Night Vision Laboratory Static Performance Model for Thermal Viewing...resolvable temperature; Infrared imaging; Minimum detectable temperature; Detection and recognition performance; Night vision; Noise equivalent temperature...modulation transfer function (MTF). The noise characteristics are specified by the noise equivalent temperature difference (NEΔT). The next sections
Face recognition system and method using face pattern words and face pattern bytes
Zheng, Yufeng
2014-12-23
The present invention provides a novel system and method for identifying individuals and for face recognition utilizing facial features for face identification. The system and method of the invention comprise creating facial features or face patterns called face pattern words and face pattern bytes for face identification. The invention also provides for pattern recognitions for identification other than face recognition. The invention further provides a means for identifying individuals based on visible and/or thermal images of those individuals by utilizing computer software implemented by instructions on a computer or computer system and a computer readable medium containing instructions on a computer system for face recognition and identification.
A computer vision system for the recognition of trees in aerial photographs
NASA Technical Reports Server (NTRS)
Pinz, Axel J.
1991-01-01
Increasing problems of forest damage in Central Europe set the demand for an appropriate forest damage assessment tool. The Vision Expert System (VES) is presented which is capable of finding trees in color infrared aerial photographs. Concept and architecture of VES are discussed briefly. The system is applied to a multisource test data set. The processing of this multisource data set leads to a multiple interpretation result for one scene. An integration of these results will provide a better scene description by the vision system. This is achieved by an implementation of Steven's correlation algorithm.
Patterns of functional vision loss in glaucoma determined with archetypal analysis
Elze, Tobias; Pasquale, Louis R.; Shen, Lucy Q.; Chen, Teresa C.; Wiggs, Janey L.; Bex, Peter J.
2015-01-01
Glaucoma is an optic neuropathy accompanied by vision loss which can be mapped by visual field (VF) testing, revealing characteristic patterns related to the retinal nerve fibre layer anatomy. While detailed knowledge about these patterns is important to understand the anatomic and genetic aspects of glaucoma, current classification schemes are predominantly derived qualitatively. Here, we classify glaucomatous vision loss quantitatively by statistically learning prototypical patterns on the convex hull of the data space. In contrast to component-based approaches, this method emphasizes distinct aspects of the data and provides patterns that are easier to interpret for clinicians. Based on 13 231 reliable Humphrey VFs from a large clinical glaucoma practice, we identify an optimal solution with 17 glaucomatous vision loss prototypes which fit well with previously described qualitative patterns from a large clinical study. We illustrate relations of our patterns to retinal structure by a previously developed mathematical model. In contrast to the qualitative clinical approaches, our results can serve as a framework to quantify the various subtypes of glaucomatous visual field loss. PMID:25505132
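As a rough illustration of the idea behind prototypes on the convex hull, the Python sketch below expresses one visual field as a convex combination of archetypes, that is, with non-negative weights that sum to 1. It does not fit the archetypes, which is the hard part of archetypal analysis. The function name `mix_archetypes`, the four-location fields, and the dB values are hypothetical toy choices and are not taken from the study.

```python
# Toy sketch: a visual field as a convex combination of archetypes.
# All names and numbers here are illustrative assumptions.

# Two hypothetical archetypes over four test locations (deviation in dB).
ARCHETYPES = [
    [0.0, 0.0, 0.0, 0.0],      # "no loss" prototype
    [-20.0, -20.0, 0.0, 0.0],  # "loss in the first two locations" prototype
]


def mix_archetypes(weights, archetypes):
    """Return the field given by a convex combination of the archetypes."""
    # Convexity constraint: weights are non-negative and sum to 1.
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n_locations = len(archetypes[0])
    return [
        sum(w * a[i] for w, a in zip(weights, archetypes))
        for i in range(n_locations)
    ]


# A field that is 75% "no loss" and 25% "loss in the first two locations".
field = mix_archetypes([0.75, 0.25], ARCHETYPES)
```

Because the weights are constrained to a simplex, each one can be read directly as the share of a prototype in a patient's field, which is one reason such decompositions are easier to interpret than signed component loadings.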
The Perception of Multiple Images
ERIC Educational Resources Information Center
Goldstein, E. Bruce
1975-01-01
A discussion of visual field, foveal and peripheral vision, eye fixations, recognition and recall of pictures, memory for meaning of pictures, and the relation between speed of presentation and memory. (Editor)
Artificial intelligence and signal processing for infrastructure assessment
NASA Astrophysics Data System (ADS)
Assaleh, Khaled; Shanableh, Tamer; Yehia, Sherif
2015-04-01
The Ground Penetrating Radar (GPR) is being recognized as an effective nondestructive evaluation technique to improve the inspection process. However, data interpretation and complexity of the results impose some limitations on the practicality of using this technique. This is mainly due to the need for a trained, experienced person to interpret images obtained by the GPR system. In this paper, an algorithm to classify and assess the condition of infrastructures utilizing image processing and pattern recognition techniques is discussed. Features extracted from a dataset of images of defected and healthy slabs are used to train a computer vision based system while another dataset is used to evaluate the proposed algorithm. Initial results show that the proposed algorithm is able to detect the existence of defects with about 77% success rate.
NASA Astrophysics Data System (ADS)
Daneshmend, L. K.; Pak, H. A.
1984-02-01
On-line monitoring of the cutting process in a CNC lathe is desirable to ensure unattended fault-free operation in an automated environment. The state of the cutting tool is one of the most important parameters which characterises the cutting process. Direct monitoring of the cutting tool or workpiece is not feasible during machining. However, several variables related to the state of the tool can be measured on-line. A novel monitoring technique is presented which uses cutting torque as the variable for on-line monitoring. A classifier is designed on the basis of the empirical relationship between cutting torque and flank wear. The empirical model required by the on-line classifier is established during an automated training cycle using machine vision for off-line direct inspection of the tool.
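The two-phase scheme described above (an off-line training cycle that relates torque to measured wear, then an on-line classifier driven by torque alone) can be sketched in Python as follows. The abstract does not give the form of the empirical relationship, so the straight-line fit, the function names `fit_line` and `classify_tool`, the units, and all numbers are assumptions made purely for illustration.

```python
# Minimal sketch, assuming a linear torque-to-wear relationship.
# Training pairs are hypothetical: torque in N*m, flank wear in mm
# (the wear values stand in for off-line machine-vision measurements).
TRAIN_TORQUE = [10.0, 12.0, 14.0, 16.0]
TRAIN_WEAR = [0.1, 0.2, 0.3, 0.4]


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def classify_tool(torque, slope, intercept, wear_limit):
    """Classify the tool state on-line from a torque reading alone."""
    estimated_wear = slope * torque + intercept
    return "worn" if estimated_wear >= wear_limit else "usable"


# Training cycle (off-line), then a single on-line decision.
SLOPE, INTERCEPT = fit_line(TRAIN_TORQUE, TRAIN_WEAR)
state = classify_tool(15.0, SLOPE, INTERCEPT, wear_limit=0.3)
```

The design point is that the expensive direct measurement (vision inspection of the tool) is needed only while the model is being built; during machining, the classifier needs nothing but the torque signal.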
ERIC Educational Resources Information Center
Evans, Karen M.; Federmeier, Kara D.
2007-01-01
We examined the nature and timecourse of hemispheric asymmetries in verbal memory by recording event-related potentials (ERPs) in a continuous recognition task. Participants made overt recognition judgments to test words presented in central vision that were either novel (new words) or had been previously presented in the left or right visual…
Behavioral model of visual perception and recognition
NASA Astrophysics Data System (ADS)
Rybak, Ilya A.; Golovan, Alexander V.; Gusakova, Valentina I.
1993-09-01
In the processes of visual perception and recognition human eyes actively select essential information by way of successive fixations at the most informative points of the image. A behavioral program defining a scanpath of the image is formed at the stage of learning (object memorizing) and consists of sequential motor actions, which are shifts of attention from one to another point of fixation, and sensory signals expected to arrive in response to each shift of attention. In the modern view of the problem, invariant object recognition is provided by the following: (1) separated processing of `what' (object features) and `where' (spatial features) information at high levels of the visual system; (2) mechanisms of visual attention using `where' information; (3) representation of `what' information in an object-based frame of reference (OFR). However, most recent models of vision based on OFR have demonstrated the ability of invariant recognition of only simple objects like letters or binary objects without background, i.e. objects to which a frame of reference is easily attached. In contrast, we use not OFR, but a feature-based frame of reference (FFR), connected with the basic feature (edge) at the fixation point. This gives our model the ability to represent complex objects in gray-level images invariantly, but requires implementing the behavioral aspects of vision described above. The developed model contains a neural network subsystem of low-level vision, which extracts a set of primary features (edges) in each fixation, and a high-level subsystem consisting of `what' (Sensory Memory) and `where' (Motor Memory) modules. The resolution of primary feature extraction decreases with distance from the point of fixation. FFR provides both the invariant representation of object features in Sensory Memory and shifts of attention in Motor Memory.
Object recognition consists in successive recall (from Motor Memory) and execution of shifts of attention and successive verification of the expected sets of features (stored in Sensory Memory). The model shows the ability of recognition of complex objects (such as faces) in gray-level images invariant with respect to shift, rotation, and scale.
Generalization in visual recognition by the honeybee (Apis mellifera): a review and explanation.
Horridge, Adrian
2009-06-01
During a century of studies on honeybee vision, generalization was the word for the acceptance of an unfamiliar pattern in the place of the training pattern, or the ability to learn a common factor in a group of related patterns. The ideas that bees generalize one pattern for another, detect similarity and differences, or form categories, were derived from the use of the same terms in the human cognitive sciences. Recent work now reveals a mechanistic explanation for bees. Small groups of ommatidia converge upon feature detectors that respond selectively to certain parameters that are in the pattern: modulation in the receptors, edge orientations, or to areas of black or colour. Within each local region of the eye the responses of each type of feature detector are summed to form a cue. The cues are therefore not in the pattern, but are local totals in the bee. Each cue has a quality, a quantity and a position on the eye, like a neuron response. This summation of edge detector responses destroys the local pattern based on edge orientation but preserves a coarse, sparse and simplified version of the panorama. In order of preference, the cues are: local receptor modulation, positions of well-separated black areas, a small black spot, colour and positions of the centres of each cue, radial edges, the averaged edge orientation and tangential edges. A pattern is always accepted by a trained bee that detects the expected cues in the expected places and no unexpected cues. The actual patterns are irrelevant. Therefore we have an explanation of generalization that is based on experimental testing of trained bees, not by analogy with other animals. Historically, generalization appeared when the training patterns were regularly interchanged to make the bees examine them. This strategy forced the bees to ignore parameters outside the training pattern, so that learning was restricted to one local eye region. 
This in turn limited the memory to one cue of each type, so that recognition was ambiguous because the cues were insufficient to distinguish all patterns. On the other hand, bees trained on very large targets, or by landing on the pattern, learned cues in several eye regions, and were able to recognize the coarse configural layout.
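The acceptance rule stated in this abstract (expected cues in the expected places, and no unexpected cues) can be illustrated with a deliberately simplified Python sketch. Representing a cue as a (region, cue type) pair with a summed quantity, ignoring that quantity at decision time, and the names `cues_from_responses` and `accepts` are all assumptions of this toy model, not part of the reviewed work.

```python
# Toy model: cues are local totals of feature-detector responses.
# Each response is (eye region, cue type, response strength); the
# region and cue names below are hypothetical.

def cues_from_responses(responses):
    """Sum detector responses per (region, cue type) to form cues."""
    totals = {}
    for region, cue_type, value in responses:
        key = (region, cue_type)
        totals[key] = totals.get(key, 0) + value
    return totals


def accepts(learned_cues, observed_cues):
    """Accept when the same cues occur in the same places and nothing extra.

    Simplification: only the presence of a cue in a region is compared,
    not its summed quantity.
    """
    return set(learned_cues) == set(observed_cues)


TRAINING = cues_from_responses([
    ("front", "black_area", 3),
    ("front", "radial_edge", 2),
])
# A different pattern whose detector responses sum to the same cues.
NOVEL = cues_from_responses([
    ("front", "black_area", 1),
    ("front", "black_area", 2),
    ("front", "radial_edge", 2),
])
# The same pattern plus an unexpected cue in another eye region.
EXTRA = cues_from_responses([
    ("front", "black_area", 3),
    ("front", "radial_edge", 2),
    ("left", "colour", 1),
])
```

In this sketch `NOVEL` is accepted although its layout of individual responses differs from training, while `EXTRA` is rejected because of the additional cue, which mirrors the claim that the actual patterns are irrelevant once the cues match.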
Owls see in stereo much like humans do.
van der Willigen, Robert F
2011-06-10
While 3D experiences through binocular disparity sensitivity have acquired special status in the understanding of human stereo vision, much remains to be learned about how binocularity is put to use in animals. The owl provides an exceptional model to study stereo vision as it displays one of the highest degrees of binocular specialization throughout the animal kingdom. In a series of six behavioral experiments, equivalent to hallmark human psychophysical studies, I compiled an extensive body of stereo performance data from two trained owls. Computer-generated, binocular random-dot patterns were used to ensure pure stereo performance measurements. In all cases, I found that owls perform much like humans do, viz.: (1) disparity alone can evoke figure-ground segmentation; (2) selective use of "relative" rather than "absolute" disparity; (3) hyperacute sensitivity; (4) disparity processing allows for the avoidance of monocular feature detection prior to object recognition; (5) large binocular disparities are not tolerated; (6) disparity guides the perceptual organization of 2D shape. The robustness and very nature of these binocular disparity-based perceptual phenomena bear out that owls, like humans, exploit the third dimension to facilitate early figure-ground segmentation of tangible objects.
Ontological Representation of Light Wave Camera Data to Support Vision-Based AmI
Serrano, Miguel Ángel; Gómez-Romero, Juan; Patricio, Miguel Ángel; García, Jesús; Molina, José Manuel
2012-01-01
Recent advances in technologies for capturing video data have opened a vast number of new application areas in visual sensor networks. Among them, the incorporation of light wave cameras in Ambient Intelligence (AmI) environments provides more accurate tracking capabilities for activity recognition. Although the performance of tracking algorithms has quickly improved, symbolic models used to represent the resulting knowledge have not yet been adapted to smart environments. This lack of representation makes it impossible to take advantage of the semantic quality of the information provided by new sensors. This paper advocates the introduction of a part-based representational level in cognitive-based systems in order to accurately represent the novel sensors' knowledge. The paper also reviews the theoretical and practical issues in part-whole relationships, proposing a specific taxonomy for computer vision approaches. General part-based patterns for the human body and transitive part-based representation and inference are incorporated into a previous ontology-based framework to enhance scene interpretation in the area of video-based AmI. The advantages and new features of the model are demonstrated in a Social Signal Processing (SSP) application for the elaboration of live market research.
Pattern Recognition Using Artificial Neural Network: A Review
NASA Astrophysics Data System (ADS)
Kim, Tai-Hoon
Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, artificial neural network techniques and theory have been receiving increasing attention. The design of a recognition system requires careful attention to the following issues: definition of pattern classes, sensing environment, pattern representation, feature extraction and selection, cluster analysis, classifier design and learning, selection of training and test samples, and performance evaluation. In spite of almost 50 years of research and development in this field, the general problem of recognizing complex patterns with arbitrary orientation, location, and scale remains unsolved. New and emerging applications, such as data mining, web searching, retrieval of multimedia data, face recognition, and cursive handwriting recognition, require robust and efficient pattern recognition techniques. The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system using ANN and identify research topics and applications which are at the forefront of this exciting and challenging field.
Relevance feedback-based building recognition
NASA Astrophysics Data System (ADS)
Li, Jing; Allinson, Nigel M.
2010-07-01
Building recognition is a nontrivial task in computer vision research which can be utilized in robot localization, mobile navigation, etc. However, existing building recognition systems usually encounter the following two problems: 1) extracted low level features cannot reveal the true semantic concepts; and 2) they usually involve high dimensional data which require heavy computational costs and memory. Relevance feedback (RF), widely applied in multimedia information retrieval, is able to bridge the gap between the low level visual features and high level concepts; while dimensionality reduction methods can mitigate the high-dimensional problem. In this paper, we propose a building recognition scheme which integrates the RF and subspace learning algorithms. Experimental results undertaken on our own building database show that the newly proposed scheme appreciably enhances the recognition accuracy.
Auditory Pattern Recognition and Brief Tone Discrimination of Children with Reading Disorders
ERIC Educational Resources Information Center
Walker, Marianna M.; Givens, Gregg D.; Cranford, Jerry L.; Holbert, Don; Walker, Letitia
2006-01-01
Auditory pattern recognition skills in children with reading disorders were investigated using perceptual tests involving discrimination of frequency and duration tonal patterns. A behavioral test battery involving recognition of the pattern of presentation of tone triads was used in which individual components differed in either frequency or…
Image pattern recognition supporting interactive analysis and graphical visualization
NASA Technical Reports Server (NTRS)
Coggins, James M.
1992-01-01
Image Pattern Recognition attempts to infer properties of the world from image data. Such capabilities are crucial for making measurements from satellite or telescope images related to Earth and space science problems. Such measurements can be the required product itself, or the measurements can be used as input to a computer graphics system for visualization purposes. At present, the field of image pattern recognition lacks a unified scientific structure for developing and evaluating image pattern recognition applications. The overall goal of this project is to begin developing such a structure. This report summarizes results of a 3-year research effort in image pattern recognition addressing the following three principal aims: (1) to create a software foundation for the research and identify image pattern recognition problems in Earth and space science; (2) to develop image measurement operations based on Artificial Visual Systems; and (3) to develop multiscale image descriptions for use in interactive image analysis.
Understanding eye movements in face recognition using hidden Markov models.
Chuk, Tim; Chan, Antoni B; Hsiao, Janet H
2014-09-16
We use a hidden Markov model (HMM) based approach to analyze eye movement data in face recognition. HMMs are statistical models that are specialized in handling time-series data. We conducted a face recognition task with Asian participants, and model each participant's eye movement pattern with an HMM, which summarized the participant's scan paths in face recognition with both regions of interest and the transition probabilities among them. By clustering these HMMs, we showed that participants' eye movements could be categorized into holistic or analytic patterns, demonstrating significant individual differences even within the same culture. Participants with the analytic pattern had longer response times, but did not differ significantly in recognition accuracy from those with the holistic pattern. We also found that correct and wrong recognitions were associated with distinctive eye movement patterns; the difference between the two patterns lies in the transitions rather than locations of the fixations alone. © 2014 ARVO.
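To make concrete how an HMM scores a scan path, here is a minimal Python sketch of the forward algorithm, which computes the likelihood of a fixation sequence under a given model. It is a simplification: the observations are discrete face regions (0 = eyes, 1 = nose, 2 = mouth), whereas the study models regions of interest over fixation locations. The two hidden states and every probability below are made-up illustrative values.

```python
# Minimal forward-algorithm sketch for a discrete-observation HMM.
# All parameters are hypothetical, chosen only so that rows sum to 1.

START = [0.6, 0.4]              # P(first hidden state)
TRANS = [[0.7, 0.3],            # TRANS[i][j] = P(next state j | state i)
         [0.4, 0.6]]
EMIT = [[0.5, 0.4, 0.1],        # EMIT[i][o] = P(observed region o | state i)
        [0.1, 0.3, 0.6]]


def sequence_likelihood(fixations, start, trans, emit):
    """Return P(fixation sequence | HMM) using the forward recursion."""
    n_states = len(start)
    # alpha[s] = P(observations so far, current hidden state is s)
    alpha = [start[s] * emit[s][fixations[0]] for s in range(n_states)]
    for obs in fixations[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs]
            for s in range(n_states)
        ]
    return sum(alpha)


# Likelihood of the scan path eyes -> eyes -> mouth under this toy model.
likelihood = sequence_likelihood([0, 0, 2], START, TRANS, EMIT)
```

Comparing such likelihoods across models is one way to decide which participant's (or which cluster's) HMM best explains a new scan path; the transition matrix is what lets the model distinguish two paths that visit the same regions in a different order.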
Geometry-based ensembles: toward a structural characterization of the classification boundary.
Pujol, Oriol; Masip, David
2009-06-01
This paper introduces a novel binary discriminative learning technique based on the approximation of the nonlinear decision boundary by a piecewise linear smooth additive model. The decision border is geometrically defined by means of the characterizing boundary points, i.e., points that belong to the optimal boundary under a certain notion of robustness. Based on these points, a set of locally robust linear classifiers is defined and assembled by means of a Tikhonov regularized optimization procedure in an additive model to create a final lambda-smooth decision rule. As a result, a very simple and robust classifier with a strong geometrical meaning and nonlinear behavior is obtained. The simplicity of the method allows its extension to cope with some of today's machine learning challenges, such as online learning, large-scale learning or parallelization, with linear computational complexity. We validate our approach on the UCI database, comparing with several state-of-the-art classification techniques. Finally, we apply our technique in online and large-scale scenarios and in six real-life computer vision and pattern recognition problems: gender recognition based on face images, intravascular ultrasound tissue classification, speed traffic sign detection, Chagas' disease myocardial damage severity detection, old musical scores clef classification, and action recognition using 3D accelerometer data from a wearable device. The results are promising and this paper opens a line of research that deserves further attention.
Accommodative spasm in siblings: A unique finding
Rutstein, Robert P
2010-01-01
Accommodative spasm is a rare condition occurring in children, adolescents, and young adults. A familial tendency for this binocular vision disorder has not been reported. I describe accommodative spasm occurring in a brother and sister. Both children presented on the same day with complaints of headaches and blurred vision. Treatment included cycloplegia drops and bifocals. Siblings of patients having accommodative spasm should receive a detailed eye exam with emphasis on recognition of accommodative spasm. PMID:20534925
Vision-based object detection and recognition system for intelligent vehicles
NASA Astrophysics Data System (ADS)
Ran, Bin; Liu, Henry X.; Martono, Wilfung
1999-01-01
Recently, a proactive crash mitigation system was proposed to enhance the crash avoidance and survivability of Intelligent Vehicles. An accurate object detection and recognition system is a prerequisite for a proactive crash mitigation system, as system component deployment algorithms rely on accurate hazard detection, recognition, and tracking information. In this paper, we present a vision-based approach to detect and recognize vehicles and traffic signs, obtain their information, and track multiple objects by using a sequence of color images taken from a moving vehicle. The entire system consists of two sub-systems, the vehicle detection and recognition sub-system and the traffic sign detection and recognition sub-system. Both of the sub-systems consist of four models: object detection model, object recognition model, object information model, and object tracking model. In order to detect potential objects on the road, several features of the objects are investigated, which include symmetrical shape and aspect ratio of a vehicle and color and shape information of the signs. A two-layer neural network is trained to recognize different types of vehicles and a parameterized traffic sign model is established in the process of recognizing a sign. Tracking is accomplished by combining the analysis of single image frame with the analysis of consecutive image frames. The analysis of the single image frame is performed every ten full-size images. The information model will obtain the information related to the object, such as time to collision for the object vehicle and relative distance from the traffic signs. Experimental results demonstrated a robust and accurate system in real time object detection and recognition over thousands of image frames.
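The abstract names time to collision as one output of the information model but does not say how it is computed. A common first-order estimate, sketched below in Python, divides the current distance by the closing speed obtained from two successive distance measurements. The constant-closing-speed assumption, the function names, and the example numbers are all illustrative and not taken from the paper.

```python
# Minimal sketch, assuming constant closing speed between measurements.
# Function names and values are hypothetical.

def closing_speed(prev_distance_m, curr_distance_m, dt_s):
    """Rate at which the gap shrinks, in m/s (positive means approaching)."""
    return (prev_distance_m - curr_distance_m) / dt_s


def time_to_collision(distance_m, closing_speed_mps):
    """Seconds until contact, or None if the gap is not shrinking."""
    if closing_speed_mps <= 0:
        return None
    return distance_m / closing_speed_mps


# Gap shrinks from 30 m to 28 m in 0.5 s.
speed = closing_speed(30.0, 28.0, 0.5)
ttc = time_to_collision(28.0, speed)
```

Returning `None` when the gap is constant or growing avoids reporting a negative or infinite time, which a downstream deployment algorithm would otherwise have to special-case.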
High-accuracy microassembly by intelligent vision systems and smart sensor integration
NASA Astrophysics Data System (ADS)
Schilp, Johannes; Harfensteller, Mark; Jacob, Dirk; Schilp, Michael
2003-10-01
Innovative production processes and strategies from batch production to high volume scale are playing a decisive role in generating microsystems economically. In particular, assembly processes are crucial operations during the production of microsystems. Due to large batch sizes many microsystems can be produced economically by conventional assembly techniques using specialized and highly automated assembly systems. At laboratory stage microsystems are mostly assembled by hand. Between these extremes there is a wide field of small and medium-sized batch production for which common automated solutions are rarely profitable. For assembly processes at these batch sizes a flexible automated assembly system has been developed at the iwb. It is based on a modular design. Actuators like grippers, dispensers or other process tools can easily be attached due to a special tool changing system. Therefore new joining techniques can easily be implemented. A force-sensor and a vision system are integrated into the tool head. The automated assembly processes are based on different optical sensors and smart actuators like high-accuracy robots or linear-motors. A fiber optic sensor is integrated in the dispensing module to measure, without contact, the clearance between the dispense needle and the substrate. Robot vision systems using the strategy of optical pattern recognition are also implemented as modules. In combination with relative positioning strategies, an assembly accuracy of less than 3 μm can be realized. A laser system is used for manufacturing processes like soldering.
A traffic situation analysis system
NASA Astrophysics Data System (ADS)
Sidla, Oliver; Rosner, Marcin
2011-01-01
The observation and monitoring of traffic with smart vision systems for the purpose of improving traffic safety has great potential. For example, embedded vision systems built into vehicles can be used as early warning systems, or stationary camera systems can modify the switching frequency of signals at intersections. Today the automated analysis of traffic situations is still in its infancy - the patterns of vehicle motion and pedestrian flow in an urban environment are too complex to be fully understood by a vision system. We present steps towards such a traffic monitoring system which is designed to detect potentially dangerous traffic situations, especially incidents in which the interaction of pedestrians and vehicles might develop into safety critical encounters. The proposed system is field-tested at a real pedestrian crossing in the City of Vienna for the duration of one year. It consists of a cluster of 3 smart cameras, each of which is built from a very compact PC hardware system in an outdoor capable housing. Two cameras run vehicle detection software including license plate detection and recognition, and one camera runs a complex pedestrian detection and tracking module based on the HOG detection principle. As a supplement, all 3 cameras use additional optical flow computation in a low-resolution video stream in order to estimate the motion path and speed of objects. This work describes the foundation for all 3 different object detection modalities (pedestrians, vehicles, license plates), and explains the system setup and its design.
Almabruk, Abubaker A. A.; Paterson, Kevin B.; McGowan, Victoria; Jordan, Timothy R.
2011-01-01
Background: Previous studies have claimed that a precise split at the vertical midline of each fovea causes all words to the left and right of fixation to project to the opposite, contralateral hemisphere, and this division in hemispheric processing has considerable consequences for foveal word recognition. However, research in this area is dominated by the use of stimuli from Latinate languages, which may induce specific effects on performance. Consequently, we report two experiments using stimuli from a fundamentally different, non-Latinate language (Arabic) that offers an alternative way of revealing effects of split-foveal processing, if they exist. Methods and Findings: Words (and pseudowords) were presented to the left or right of fixation, either close to fixation and entirely within foveal vision, or further from fixation and entirely within extrafoveal vision. Fixation location and stimulus presentations were carefully controlled using an eye-tracker linked to a fixation-contingent display. To assess word recognition, Experiment 1 used the Reicher-Wheeler task and Experiment 2 used the lexical decision task. Results: Performance in both experiments indicated a functional division in hemispheric processing for words in extrafoveal locations (in recognition accuracy in Experiment 1 and in reaction times and error rates in Experiment 2) but no such division for words in foveal locations. Conclusions: These findings from a non-Latinate language provide new evidence that although a functional division in hemispheric processing exists for word recognition outside the fovea, this division does not extend up to the point of fixation. Some implications for word recognition and reading are discussed. PMID:21559084
Atoms of recognition in human and computer vision.
Ullman, Shimon; Assif, Liav; Fetaya, Ethan; Harari, Daniel
2016-03-08
Discovering the visual features and representations used by the brain to recognize objects is a central problem in the study of vision. Recently, neural network models of visual object recognition, including biological and deep network models, have shown remarkable progress and have begun to rival human performance in some challenging tasks. These models are trained on image examples and learn to extract features and representations and to use them for categorization. It remains unclear, however, whether the representations and learning processes discovered by current models are similar to those used by the human visual system. Here we show, by introducing and using minimal recognizable images, that the human visual system uses features and processes that are not used by current models and that are critical for recognition. We found by psychophysical studies that at the level of minimal recognizable images a minute change in the image can have a drastic effect on recognition, thus identifying features that are critical for the task. Simulations then showed that current models cannot explain this sensitivity to precise feature configurations and, more generally, do not learn to recognize minimal images at a human level. The role of the features shown here is revealed uniquely at the minimal level, where the contribution of each feature is essential. A full understanding of the learning and use of such features will extend our understanding of visual recognition and its cortical mechanisms and will enhance the capacity of computational models to learn from visual experience and to deal with recognition and detailed image interpretation.
Biometrics: Facing Up to Terrorism
2001-10-01
ment committee appointed by Secretary of Transportation Norman Y. Mineta to review airport security measures will recommend that facial recognition...on the Role Facial Recognition Technology Can Play in Enhancing Airport Security.” Joseph Atick, the CEO of Visionics, testified before the government...system at a U.S. airport. This deployment is believed to be the first-in-the-nation use of face-recognition technology for airport security. The sys
Paz, Concepción; Conde, Marcos; Porteiro, Jacobo; Concheiro, Miguel
2017-01-01
This work introduces the use of machine vision in the massive bubble recognition process, which supports the validation of boiling models involving bubble dynamics, as well as nucleation frequency, active site density and size of the bubbles. The two algorithms presented are meant to be run on quite standard images of the bubbling process, recorded in general-purpose boiling facilities. The recognition routines are easily adaptable to other facilities if a minimum number of precautions are taken in the setup and in the treatment of the information. Both the side and front projections of the subcooled flow-boiling phenomenon over a plain plate are covered. Once all of the intended bubbles have been located in space and time, proper post-processing of the recorded data makes it possible to track each of the recognized bubbles, sketch their trajectories and size evolution, locate the nucleation sites, compute their diameters, and so on. After validating the algorithm’s output against the human eye and data from other researchers, machine vision systems have been demonstrated to be a very valuable option to successfully perform the recognition process, even though the optical analysis of bubbles has not been set as the main goal of the experimental facility. PMID:28632158
Pattern activation/recognition theory of mind
du Castel, Bertrand
2015-01-01
In his 2012 book How to Create a Mind, Ray Kurzweil defines a “Pattern Recognition Theory of Mind” that states that the brain uses millions of pattern recognizers, plus modules to check, organize, and augment them. In this article, I further the theory to go beyond pattern recognition and include also pattern activation, thus encompassing both sensory and motor functions. In addition, I treat checking, organizing, and augmentation as patterns of patterns instead of separate modules, therefore handling them the same as patterns in general. Henceforth I put forward a unified theory I call “Pattern Activation/Recognition Theory of Mind.” While the original theory was based on hierarchical hidden Markov models, this evolution is based on their precursor: stochastic grammars. I demonstrate that a class of self-describing stochastic grammars allows for unifying pattern activation, recognition, organization, consistency checking, metaphor, and learning, into a single theory that expresses patterns throughout. I have implemented the model as a probabilistic programming language specialized in activation/recognition grammatical and neural operations. I use this prototype to compute and present diagrams for each stochastic grammar and corresponding neural circuit. I then discuss the theory as it relates to artificial network developments, common coding, neural reuse, and unity of mind, concluding by proposing potential paths to validation. PMID:26236228
Reinforcement learning in computer vision
NASA Astrophysics Data System (ADS)
Bernstein, A. V.; Burnaev, E. V.
2018-04-01
Nowadays, machine learning has become one of the basic technologies used in solving various computer vision tasks such as feature detection, image segmentation, object recognition and tracking. In many applications, various complex systems such as robots are equipped with visual sensors from which they learn the state of the surrounding environment by solving corresponding computer vision tasks. Solutions of these tasks are used for making decisions about possible future actions. It is not surprising that when solving computer vision tasks we should take into account special aspects of their subsequent application in model-based predictive control. Reinforcement learning is one of the modern machine learning technologies in which learning is carried out through interaction with the environment. In recent years, reinforcement learning has been used both for solving such applied tasks as processing and analysis of visual information, and for solving specific computer vision problems such as filtering, extracting image features, localizing objects in scenes, and many others. The paper briefly describes the reinforcement learning technology and its use for solving computer vision problems.
Volumetric segmentation of range images for printed circuit board inspection
NASA Astrophysics Data System (ADS)
Van Dop, Erik R.; Regtien, Paul P. L.
1996-10-01
Conventional computer vision approaches towards object recognition and pose estimation employ 2D grey-value or color imaging. As a consequence these images contain information about projections of a 3D scene only. The subsequent image processing will then be difficult, because the object coordinates are represented with just image coordinates. Only complicated low-level vision modules like depth from stereo or depth from shading can recover some of the surface geometry of the scene. Recent advances in fast range imaging have, however, paved the way towards 3D computer vision, since range data of the scene can now be obtained with sufficient accuracy and speed for object recognition and pose estimation purposes. This article proposes the coded-light range-imaging method together with superquadric segmentation to approach this task. Superquadric segments are volumetric primitives that describe global object properties with 5 parameters, which provide the main features for object recognition. In addition, the principal axes of a superquadric segment determine the pose of an object in the scene. The volumetric segmentation of a range image can be used to detect missing, false or badly placed components on assembled printed circuit boards. Furthermore, this approach will be useful to recognize and extract valuable or toxic electronic components from printed circuit board scrap that currently burdens the environment during electronic waste processing. Results on synthetic range images with errors constructed according to a verified noise model illustrate the capabilities of this approach.
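As an illustration of how a volumetric primitive can be described by five parameters, the sketch below evaluates the standard superquadric inside-outside function. It assumes the common textbook parameterization (three axis lengths a1, a2, a3 and two shape exponents e1, e2); the abstract does not state which formulation the authors use, and the function name is hypothetical.

```python
def superquadric_inside_outside(point, a1, a2, a3, e1, e2):
    """Evaluate the standard superquadric inside-outside function F.

    Assumed textbook form (not taken from the abstract):
        F = (|x/a1|^(2/e2) + |y/a2|^(2/e2))^(e2/e1) + |z/a3|^(2/e1)
    F < 1: point is inside, F == 1: on the surface, F > 1: outside.
    The point is expressed in the superquadric's own (principal-axis) frame.
    """
    x, y, z = point
    # Combine the x and y terms first; e2 controls the shape in the xy-plane.
    xy_term = abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)
    # e1 controls the shape along z.
    return xy_term ** (e2 / e1) + abs(z / a3) ** (2.0 / e1)


if __name__ == "__main__":
    # With e1 = e2 = 1 the primitive reduces to an ellipsoid (here a unit sphere).
    for p in [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]:
        print(p, superquadric_inside_outside(p, 1.0, 1.0, 1.0, 1.0, 1.0))
```

Fitting such a function to range data would additionally require a pose (rotation and translation) for each segment, which this sketch leaves out.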
A neural network based artificial vision system for licence plate recognition.
Draghici, S
1997-02-01
This paper presents a neural network based artificial vision system able to analyze the image of a car given by a camera, locate the registration plate and recognize the registration number of the car. The paper describes in detail various practical problems encountered in implementing this particular application and the solutions used to solve them. The main features of the system presented are: controlled stability-plasticity behavior, controlled reliability threshold, both off-line and on-line learning, self assessment of the output reliability and high reliability based on high level multiple feedback. The system has been designed using a modular approach. Sub-modules can be upgraded and/or substituted independently, thus making the system potentially suitable in a large variety of vision applications. The OCR engine was designed as an interchangeable plug-in module. This allows the user to choose an OCR engine which is suited to the particular application and to upgrade it easily in the future. At present, there are several versions of this OCR engine. One of them is based on a fully connected feedforward artificial neural network with sigmoidal activation functions. This network can be trained with various training algorithms such as error backpropagation. An alternative OCR engine is based on the constraint based decomposition (CBD) training architecture. The system has shown the following average performance on real-world data: successful plate location and segmentation about 99%, successful character recognition about 98% and successful recognition of complete registration plates about 80%.
Vision technology/algorithms for space robotics applications
NASA Technical Reports Server (NTRS)
Krishen, Kumar; Defigueiredo, Rui J. P.
1987-01-01
The thrust of automation and robotics for space applications has been proposed for increased productivity, improved reliability, increased flexibility, higher safety, and for the performance of automating time-consuming tasks, increasing productivity/performance of crew-accomplished tasks, and performing tasks beyond the capability of the crew. This paper provides a review of efforts currently in progress in the area of robotic vision. Both systems and algorithms are discussed. The evolution of future vision/sensing is projected to include the fusion of multisensors ranging from microwave to optical with multimode capability to include position, attitude, recognition, and motion parameters. The key feature of the overall system design will be small size and weight, fast signal processing, robust algorithms, and accurate parameter determination. These aspects of vision/sensing are also discussed.
What aspects of vision facilitate haptic processing?
Millar, Susanna; Al-Attar, Zainab
2005-12-01
We investigate how vision affects haptic performance when task-relevant visual cues are reduced or excluded. The task was to remember the spatial location of six landmarks that were explored by touch in a tactile map. Here, we use specially designed spectacles that simulate residual peripheral vision, tunnel vision, diffuse light perception, and total blindness. Results for target locations differed, suggesting additional effects from adjacent touch cues. These are discussed. Touch with full vision was most accurate, as expected. Peripheral and tunnel vision, which reduce visuo-spatial cues, differed in error pattern. Both were less accurate than full vision, and significantly more accurate than touch with diffuse light perception, and touch alone. The important finding was that touch with diffuse light perception, which excludes spatial cues, did not differ from touch without vision in performance accuracy, nor in location error pattern. The contrast between spatially relevant versus spatially irrelevant vision provides new, rather decisive, evidence against the hypothesis that vision affects haptic processing even if it does not add task-relevant information. The results support optimal integration theories, and suggest that spatial and non-spatial aspects of vision need explicit distinction in bimodal studies and theories of spatial integration.
NASA Technical Reports Server (NTRS)
Juday, Richard D. (Editor)
1988-01-01
The present conference discusses topics in pattern-recognition correlator architectures, digital stereo systems, geometric image transformations and their applications, topics in pattern recognition, filter algorithms, object detection and classification, shape representation techniques, and model-based object recognition methods. Attention is given to edge-enhancement preprocessing using liquid crystal TVs, massively-parallel optical data base management, three-dimensional sensing with polar exponential sensor arrays, the optical processing of imaging spectrometer data, hybrid associative memories and metric data models, the representation of shape primitives in neural networks, and the Monte Carlo estimation of moment invariants for pattern recognition.
Facial recognition using simulated prosthetic pixelized vision.
Thompson, Robert W; Barnett, G David; Humayun, Mark S; Dagnelie, Gislin
2003-11-01
To evaluate a model of simulated pixelized prosthetic vision using noncontiguous circular phosphenes, to test the effects of phosphene and grid parameters on facial recognition. A video headset was used to view a reference set of four faces, followed by a partially averted image of one of those faces viewed through a square pixelizing grid that contained 10x10 to 32x32 dots separated by gaps. The grid size, dot size, gap width, dot dropout rate, and gray-scale resolution were varied separately about a standard test condition, for a total of 16 conditions. All tests were first performed at 99% contrast and then repeated at 12.5% contrast. Discrimination speed and performance were influenced by all stimulus parameters. The subjects achieved highly significant facial recognition accuracy for all high-contrast tests except for grids with 70% random dot dropout and two gray levels. In low-contrast tests, significant facial recognition accuracy was achieved for all but the most adverse grid parameters: total grid area less than 17% of the target image, 70% dropout, four or fewer gray levels, and a gap of 40.5 arcmin. For difficult test conditions, a pronounced learning effect was noticed during high-contrast trials, and a more subtle practice effect on timing was evident during subsequent low-contrast trials. These findings suggest that reliable face recognition with crude pixelized grids can be learned and may be possible, even with a crude visual prosthesis.
Design of compactly supported wavelet to match singularities in medical images
NASA Astrophysics Data System (ADS)
Fung, Carrson C.; Shi, Pengcheng
2002-11-01
Analysis and understanding of medical images has important clinical value for patient diagnosis and treatment, as well as technical implications for computer vision and pattern recognition. One of the most fundamental issues is the detection of object boundaries or singularities, which is often the basis for further processes such as organ/tissue recognition, image registration, motion analysis, measurement of anatomical and physiological parameters, etc. The focus of this work involved taking a correlation based approach toward edge detection, by exploiting some of the desirable properties of wavelet analysis. This leads to the possibility of constructing a bank of detectors, consisting of multiple wavelet basis functions of different scales which are optimal for specific types of edges, in order to optimally detect all the edges in an image. Our work involved developing a set of wavelet functions which match the shape of the ramp and pulse edges. The matching algorithm used focuses on matching the edges in the frequency domain. It was proven that this technique could create matching wavelets applicable at all scales. Results have shown that matching wavelets can be obtained for the pulse edge while the ramp edge requires another matching algorithm.
Swartz, R. Andrew
2013-01-01
This paper investigates the time series representation methods and similarity measures for sensor data feature extraction and structural damage pattern recognition. Both model-based time series representation and dimensionality reduction methods are studied to compare the effectiveness of feature extraction for damage pattern recognition. The evaluation of feature extraction methods is performed by examining the separation of feature vectors among different damage patterns and the pattern recognition success rate. In addition, the impact of similarity measures on the pattern recognition success rate and the metrics for damage localization are also investigated. The test data used in this study are from the System Identification to Monitor Civil Engineering Structures (SIMCES) Z24 Bridge damage detection tests, a rigorous instrumentation campaign that recorded the dynamic performance of a concrete box-girder bridge under progressively increasing damage scenarios. A number of progressive damage test case datasets and damage test data with different damage modalities are used. The simulation results show that both time series representation methods and similarity measures have significant impact on the pattern recognition success rate. PMID:24191136
Automatic Mexican sign language and digits recognition using normalized central moments
NASA Astrophysics Data System (ADS)
Solís, Francisco; Martínez, David; Espinosa, Oscar; Toxqui, Carina
2016-09-01
This work presents a framework for automatic Mexican sign language and digit recognition based on a computer vision system using normalized central moments and artificial neural networks. Images are captured with a digital IP camera, four LED reflectors and a green background in order to reduce computational costs and avoid the use of special gloves. For each frame, 42 normalized central moments are computed and used in a Multi-Layer Perceptron to recognize each database. Four versions of each sign and digit were used in the training phase. Recognition rates of 93% and 95% were achieved for Mexican sign language and digits, respectively.
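For readers unfamiliar with the feature used here, the following minimal sketch computes a single normalized central moment from a grayscale image stored as a list of rows. It uses the standard definition eta_pq = mu_pq / mu_00^(1 + (p + q)/2); the abstract does not say which 42 orders (p, q) were selected, so the orders in the usage example are arbitrary, and the function name is an assumption.

```python
def normalized_central_moment(image, p, q):
    """Return eta_pq for a 2D list of non-negative pixel intensities.

    Standard definition (an assumption about the paper's exact variant):
        mu_pq  = sum over pixels of (x - xbar)^p * (y - ybar)^q * I(x, y)
        eta_pq = mu_pq / mu_00 ** (1 + (p + q) / 2)
    Central moments make the feature translation invariant; the
    normalization by mu_00 makes it approximately scale invariant.
    """
    m00 = sum(sum(row) for row in image)
    if m00 == 0:
        raise ValueError("image has zero total intensity")
    # Intensity-weighted centroid (x is the column index, y the row index).
    xbar = sum(x * v for row in image for x, v in enumerate(row)) / m00
    ybar = sum(y * v for y, row in enumerate(image) for v in row) / m00
    mu_pq = sum(
        ((x - xbar) ** p) * ((y - ybar) ** q) * v
        for y, row in enumerate(image)
        for x, v in enumerate(row)
    )
    return mu_pq / m00 ** (1 + (p + q) / 2)


if __name__ == "__main__":
    cross = [
        [0, 1, 0],
        [1, 1, 1],
        [0, 1, 0],
    ]
    # A small feature vector; a real system would stack many orders per frame.
    features = [normalized_central_moment(cross, p, q) for p, q in [(2, 0), (0, 2), (1, 1)]]
    print(features)
```

A vector of such values per frame is the kind of input a Multi-Layer Perceptron classifier would receive.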
NASA Astrophysics Data System (ADS)
Millán, María S.
2012-10-01
On the verge of the 50th anniversary of Vander Lugt’s formulation for pattern matching based on matched filtering and optical correlation, we acknowledge the very intense research activity developed in the field of correlation-based pattern recognition during this period of time. The paper reviews some domains that appeared as emerging fields in the last years of the 20th century and have been developed later on in the 21st century. Such is the case of three-dimensional (3D) object recognition, biometric pattern matching, optical security and hybrid optical-digital processors. 3D object recognition is a challenging case of multidimensional image recognition because of its implications in the recognition of real-world objects independent of their perspective. Biometric recognition is essentially pattern recognition for which the personal identification is based on the authentication of a specific physiological characteristic possessed by the subject (e.g. fingerprint, face, iris, retina, and multifactor combinations). Biometric recognition often appears combined with encryption-decryption processes to secure information. The optical implementations of correlation-based pattern recognition processes still rely on the 4f-correlator, the joint transform correlator, or some of their variants. But the many applications developed in the field have been pushing the systems for a continuous improvement of their architectures and algorithms, thus leading towards merged optical-digital solutions.
A novel parallel architecture for local histogram equalization
NASA Astrophysics Data System (ADS)
Ohannessian, Mesrob I.; Choueiter, Ghinwa F.; Diab, Hassan
2005-07-01
Local histogram equalization is an image enhancement algorithm that has found wide application in the pre-processing stage of areas such as computer vision, pattern recognition and medical imaging. The computationally intensive nature of the procedure, however, is a main limitation when real time interactive applications are in question. This work explores the possibility of performing parallel local histogram equalization, using an array of special purpose elementary processors, through an HDL implementation that targets FPGA or ASIC platforms. A novel parallelization scheme is presented and the corresponding architecture is derived. The algorithm is reduced to pixel-level operations. Processing elements are assigned image blocks, to maintain a reasonable performance-cost ratio. To further simplify both processor and memory organizations, a bit-serial access scheme is used. A brief performance assessment is provided to illustrate and quantify the merit of the approach.
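To make the computational cost concrete, here is a naive software sketch of sliding-window local histogram equalization. It is the textbook per-pixel formulation, not the bit-serial parallel architecture of the paper; the function name, the window radius, and the border handling (windows clipped at the image edge) are all assumptions.

```python
def local_histogram_equalization(image, radius=1, levels=256):
    """Naive local histogram equalization on a 2D list of integer pixels.

    Each output pixel is the local cumulative distribution value of the
    input pixel within its (2*radius+1)^2 neighbourhood, rescaled to
    0..levels-1. Windows are clipped at the image border (an assumption).
    """
    height, width = len(image), len(image[0])
    output = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            center = image[y][x]
            count_le = 0  # neighbours with intensity <= the centre pixel
            total = 0     # neighbours actually inside the image
            for wy in range(max(0, y - radius), min(height, y + radius + 1)):
                for wx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += 1
                    if image[wy][wx] <= center:
                        count_le += 1
            # Local CDF value mapped to the output range (integer floor).
            output[y][x] = (levels - 1) * count_le // total
    return output


if __name__ == "__main__":
    img = [
        [10, 10, 12],
        [10, 11, 12],
        [11, 12, 13],
    ]
    for row in local_histogram_equalization(img, radius=1):
        print(row)
```

Every output pixel depends only on its own neighbourhood, so the per-pixel work is independent; that independence is what makes assigning image blocks to separate processing elements plausible.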
Robust autoassociative memory with coupled networks of Kuramoto-type oscillators
NASA Astrophysics Data System (ADS)
Heger, Daniel; Krischer, Katharina
2016-08-01
Uncertain recognition success, unfavorable scaling of connection complexity, or dependence on complex external input impair the usefulness of current oscillatory neural networks for pattern recognition or restrict technical realizations to small networks. We propose a network architecture of coupled oscillators for pattern recognition which shows none of the mentioned flaws. Furthermore we illustrate the recognition process with simulation results and analyze the dynamics analytically: Possible output patterns are isolated attractors of the system. Additionally, simple criteria for recognition success are derived from a lower bound on the basins of attraction.
Artificial intelligence, expert systems, computer vision, and natural language processing
NASA Technical Reports Server (NTRS)
Gevarter, W. B.
1984-01-01
An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.
Mid-Level Vision and Recognition of Non-Rigid Objects.
1993-01-01
and the author perhaps asked to account for its lack of rigor. In computer vision, the critic often requires that the author provide particular runs ...shown here were run at 4 x 1.5 deg. Note that it is unclear though if only even symmetric filters are needed for Contour Texture as proposed there for 2D...the contrast is low. However, coloring runs into problems if the contour is not fully connected or if the inner side of the contour is hard to
Increasing the object recognition distance of compact open air on board vision system
NASA Astrophysics Data System (ADS)
Kirillov, Sergey; Kostkin, Ivan; Strotov, Valery; Dmitriev, Vladimir; Berdnikov, Vadim; Akopov, Eduard; Elyutin, Aleksey
2016-10-01
The aim of this work was to develop an algorithm that eliminates atmospheric distortion and improves image quality. The proposed algorithm is implemented entirely in software, without additional photographic hardware. The algorithm does not require preliminary calibration. It works equally effectively with images obtained at distances from 1 to 500 meters. An algorithm for improving open-air images, designed for on-board vision systems based on the Raspberry Pi Model B, is proposed. The results of experimental examination are given.
Locating the cortical bottleneck for slow reading in peripheral vision
Yu, Deyue; Jiang, Yi; Legge, Gordon E.; He, Sheng
2015-01-01
Yu, Legge, Park, Gage, and Chung (2010) suggested that the neural bottleneck for slow peripheral reading is located in nonretinotopic areas. We investigated the potential rate-limiting neural site for peripheral reading using fMRI, and contrasted peripheral reading with recognition of peripherally presented line drawings of common objects. We measured the BOLD responses to both text (three-letter words/nonwords) and line-drawing objects presented either in foveal or peripheral vision (10° lower right visual field) at three presentation rates (2, 4, and 8/second). The statistically significant interaction effect of visual field × presentation rate on the BOLD response for text but not for line drawings provides evidence for distinctive processing of peripheral text. This pattern of results was obtained in all five regions of interest (ROIs). At the early retinotopic cortical areas, the BOLD signal slightly increased with increasing presentation rate for foveal text, and remained fairly constant for peripheral text. In the Occipital Word-Responsive Area (OWRA), Visual Word Form Area (VWFA), and object sensitive areas (LO and PHA), the BOLD responses to text decreased with increasing presentation rate for peripheral but not foveal presentation. In contrast, there was no rate-dependent reduction in BOLD response for line-drawing objects in all the ROIs for either foveal or peripheral presentation. Only peripherally presented text showed a distinctive rate-dependence pattern. Although it is possible that the differentiation starts to emerge at the early retinotopic cortical representation, the neural bottleneck for slower reading of peripherally presented text may be a special property of peripheral text processing in object category selective cortex. PMID:26237299
ERIC Educational Resources Information Center
Lazzaro, Joseph J.
1993-01-01
Describes adaptive technology for personal computers that accommodate disabled users and may require special equipment including hardware, memory, expansion slots, and ports. Highlights include vision aids, including speech synthesizers, magnification, braille, and optical character recognition (OCR); hearing adaptations; motor-impaired…
The anatomical and functional specialization of the fusiform gyrus
Weiner, Kevin S.; Zilles, Karl
2015-01-01
The fusiform gyrus (FG) is commonly included in anatomical atlases and is considered a key structure for functionally-specialized computations of high-level vision such as face perception, object recognition, and reading. However, it is not widely known that the FG has a contentious history. In this review, we first provide a historical analysis of the discovery of the FG and why certain features, such as the mid-fusiform sulcus, were discovered and then forgotten. We then discuss how observer-independent methods for identifying cytoarchitectonical boundaries of the cortex revolutionized our understanding of cytoarchitecture and the correspondence between those boundaries and cortical folding patterns of the FG. We further explain that the co-occurrence between cortical folding patterns and cytoarchitectonical boundaries are more common than classically thought and also, are functionally meaningful especially on the FG and probably in high-level visual cortex more generally. We conclude by proposing a series of alternatives for how the anatomical organization of the FG can accommodate seemingly different theoretical aspects of functional processing, such as domain specificity and perceptual expertise. PMID:26119921
Bringing UAVs to the fight: recent army autonomy research and a vision for the future
NASA Astrophysics Data System (ADS)
Moorthy, Jay; Higgins, Raymond; Arthur, Keith
2008-04-01
The Unmanned Autonomous Collaborative Operations (UACO) program was initiated in recognition of the high operational burden associated with utilizing unmanned systems by both mounted and dismounted, ground and airborne warfighters. The program was previously introduced at the 62nd Annual Forum of the American Helicopter Society in May of 2006. This paper presents the three technical approaches taken and results obtained in UACO. All three approaches were validated extensively in contractor simulations, two were validated in government simulation, one was flight tested outside the UACO program, and one was flight tested in Part 2 of UACO. Results and recommendations are discussed regarding diverse areas such as user training and human-machine interface, workload distribution, UAV flight safety, data link bandwidth, user interface constructs, adaptive algorithms, air vehicle system integration, and target recognition. Finally, a vision for UAV As A Wingman is presented.
van den Berg, Ronald; Roerdink, Jos B T M; Cornelissen, Frans W
2010-01-22
An object in the peripheral visual field is more difficult to recognize when surrounded by other objects. This phenomenon is called "crowding". Crowding places a fundamental constraint on human vision that limits performance on numerous tasks. It has been suggested that crowding results from spatial feature integration necessary for object recognition. However, in the absence of convincing models, this theory has remained controversial. Here, we present a quantitative and physiologically plausible model for spatial integration of orientation signals, based on the principles of population coding. Using simulations, we demonstrate that this model coherently accounts for fundamental properties of crowding, including critical spacing, "compulsory averaging", and a foveal-peripheral anisotropy. Moreover, we show that the model predicts increased responses to correlated visual stimuli. Altogether, these results suggest that crowding has little immediate bearing on object recognition but is a by-product of a general, elementary integration mechanism in early vision aimed at improving signal quality.
Fragrant pear sexuality recognition with machine vision
NASA Astrophysics Data System (ADS)
Ma, Benxue; Ying, Yibin
2006-10-01
In this research, a method to identify Kuler fragrant pear's sexuality with machine vision was developed. Kuler fragrant pears are either male or female, and the two differ markedly in flavor. To detect the sexuality of Kuler fragrant pear, images of fragrant pears were acquired with a CCD color camera. Before feature extraction, the acquired images were preprocessed to remove noise and unnecessary content. Color, perimeter, and area features of the image of the pear's bottom were extracted with digital image processing techniques, and the pear's sexuality was determined from a complexity measure computed from perimeter and area. Using 128 Kurle fragrant pears as samples, a good recognition rate between male and female pears was obtained (82.8%). The results show that this method can distinguish male pears from female pears with good accuracy.
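The abstract does not say how "complexity" is computed from perimeter and area. One common shape measure built from those two quantities is P^2 / (4*pi*A), which equals 1 for a circle and grows as the outline becomes less round. The sketch below assumes that definition; the function names, the threshold, and the output labels are hypothetical and are not taken from the paper.

```python
import math


def shape_complexity(perimeter: float, area: float) -> float:
    """Return P^2 / (4*pi*A): 1.0 for a perfect circle, larger for irregular outlines."""
    if area <= 0:
        raise ValueError("area must be positive")
    return perimeter ** 2 / (4.0 * math.pi * area)


def classify_by_complexity(perimeter: float, area: float, threshold: float = 1.2) -> str:
    """Split outlines into two groups by complexity.

    The threshold value is a placeholder; a real system would learn it
    from labeled samples.
    """
    return "irregular" if shape_complexity(perimeter, area) > threshold else "round"
```

A circle of radius 2 has perimeter and area both equal to 4*pi, so its complexity is exactly 1; a unit square gives 16 / (4*pi), about 1.27.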
NASA Astrophysics Data System (ADS)
Madokoro, H.; Tsukada, M.; Sato, K.
2013-07-01
This paper presents an unsupervised learning-based object category formation and recognition method for mobile robot vision. Our method has the following features: detection of feature points and description of features using a scale-invariant feature transform (SIFT), selection of target feature points using one class support vector machines (OC-SVMs), generation of visual words using self-organizing maps (SOMs), formation of labels using adaptive resonance theory 2 (ART-2), and creation and classification of categories on a category map of counter propagation networks (CPNs) for visualizing spatial relations between categories. Classification results of dynamic images using time-series images obtained using two different-size robots and according to movements respectively demonstrate that our method can visualize spatial relations of categories while maintaining time-series characteristics. Moreover, we emphasize the effectiveness of our method for category formation of appearance changes of objects.
NASA Technical Reports Server (NTRS)
Schulte, Erin
2017-01-01
As augmented and virtual reality grows in popularity, and more researchers focus on its development, other fields of technology have grown in the hopes of integrating with the up-and-coming hardware currently on the market. Namely, there has been a focus on how to make an intuitive, hands-free human-computer interaction (HCI) utilizing AR and VR that allows users to control their technology with little to no physical interaction with hardware. Computer vision, which is utilized in devices such as the Microsoft Kinect, webcams and other similar hardware has shown potential in assisting with the development of a HCI system that requires next to no human interaction with computing hardware and software. Object and facial recognition are two subsets of computer vision, both of which can be applied to HCI systems in the fields of medicine, security, industrial development and other similar areas.
Benchmarking Spike-Based Visual Recognition: A Dataset and Evaluation
Liu, Qian; Pineda-García, Garibaldi; Stromatias, Evangelos; Serrano-Gotarredona, Teresa; Furber, Steve B.
2016-01-01
Today, increasing attention is being paid to research into spike-based neural computation both to gain a better understanding of the brain and to explore biologically-inspired computation. Within this field, the primate visual pathway and its hierarchical organization have been extensively studied. Spiking Neural Networks (SNNs), inspired by the understanding of observed biological structure and function, have been successfully applied to visual recognition and classification tasks. In addition, implementations on neuromorphic hardware have enabled large-scale networks to run in (or even faster than) real time, making spike-based neural vision processing accessible on mobile robots. Neuromorphic sensors such as silicon retinas are able to feed such mobile systems with real-time visual stimuli. A new set of vision benchmarks for spike-based neural processing is now needed to measure progress quantitatively within this rapidly advancing field. We propose that a large dataset of spike-based visual stimuli is needed to provide meaningful comparisons between different systems, and a corresponding evaluation methodology is also required to measure the performance of SNN models and their hardware implementations. In this paper we first propose an initial NE (Neuromorphic Engineering) dataset that is based on standard computer vision benchmarks and uses digits from the MNIST database. This dataset is compatible with the state of current research on spike-based image recognition. The corresponding spike trains are produced using a range of techniques: rate-based Poisson spike generation, rank order encoding, and recorded output from a silicon retina with both flashing and oscillating input stimuli. In addition, a complementary evaluation methodology is presented to assess both model-level and hardware-level performance.
Finally, we demonstrate the use of the dataset and the evaluation methodology using two SNN models to validate the performance of the models and their hardware implementations. With this dataset we hope to (1) promote meaningful comparison between algorithms in the field of neural computation, (2) allow comparison with conventional image recognition methods, (3) provide an assessment of the state of the art in spike-based visual recognition, and (4) help researchers identify future directions and advance the field. PMID:27853419
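Of the encodings listed in the abstract above, rate-based Poisson spike generation is the simplest to illustrate. The sketch below is a generic textbook version, not the dataset's actual generator: it assumes an 8-bit pixel intensity is mapped linearly to a firing rate, and the maximum rate, duration, and time step are hypothetical defaults.

```python
import random


def poisson_spike_train(intensity, max_rate_hz=100.0, duration_s=1.0,
                        dt_s=0.001, rng=None):
    """Return spike times in seconds for one pixel with intensity in [0, 255].

    Each time bin of width dt_s emits a spike with probability rate * dt_s,
    which approximates a Poisson process when that probability is small.
    """
    rng = rng or random.Random()
    rate_hz = max_rate_hz * (intensity / 255.0)
    p_spike = rate_hz * dt_s
    steps = int(round(duration_s / dt_s))
    return [i * dt_s for i in range(steps) if rng.random() < p_spike]
```

Passing a seeded `random.Random` makes a run repeatable, which matters when spike trains are used as a benchmark input.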
NASA Astrophysics Data System (ADS)
Acciarri, R.; Adams, C.; An, R.; Anthony, J.; Asaadi, J.; Auger, M.; Bagby, L.; Balasubramanian, S.; Baller, B.; Barnes, C.; Barr, G.; Bass, M.; Bay, F.; Bishai, M.; Blake, A.; Bolton, T.; Camilleri, L.; Caratelli, D.; Carls, B.; Castillo Fernandez, R.; Cavanna, F.; Chen, H.; Church, E.; Cianci, D.; Cohen, E.; Collin, G. H.; Conrad, J. M.; Convery, M.; Crespo-Anadón, J. I.; Del Tutto, M.; Devitt, D.; Dytman, S.; Eberly, B.; Ereditato, A.; Escudero Sanchez, L.; Esquivel, J.; Fadeeva, A. A.; Fleming, B. T.; Foreman, W.; Furmanski, A. P.; Garcia-Gamez, D.; Garvey, G. T.; Genty, V.; Goeldi, D.; Gollapinni, S.; Graf, N.; Gramellini, E.; Greenlee, H.; Grosso, R.; Guenette, R.; Hackenburg, A.; Hamilton, P.; Hen, O.; Hewes, J.; Hill, C.; Ho, J.; Horton-Smith, G.; Hourlier, A.; Huang, E.-C.; James, C.; Jan de Vries, J.; Jen, C.-M.; Jiang, L.; Johnson, R. A.; Joshi, J.; Jostlein, H.; Kaleko, D.; Karagiorgi, G.; Ketchum, W.; Kirby, B.; Kirby, M.; Kobilarcik, T.; Kreslo, I.; Laube, A.; Li, Y.; Lister, A.; Littlejohn, B. R.; Lockwitz, S.; Lorca, D.; Louis, W. C.; Luethi, M.; Lundberg, B.; Luo, X.; Marchionni, A.; Mariani, C.; Marshall, J.; Martinez Caicedo, D. A.; Meddage, V.; Miceli, T.; Mills, G. B.; Moon, J.; Mooney, M.; Moore, C. D.; Mousseau, J.; Murrells, R.; Naples, D.; Nienaber, P.; Nowak, J.; Palamara, O.; Paolone, V.; Papavassiliou, V.; Pate, S. F.; Pavlovic, Z.; Piasetzky, E.; Porzio, D.; Pulliam, G.; Qian, X.; Raaf, J. L.; Rafique, A.; Rochester, L.; Rudolf von Rohr, C.; Russell, B.; Schmitz, D. W.; Schukraft, A.; Seligman, W.; Shaevitz, M. H.; Sinclair, J.; Smith, A.; Snider, E. L.; Soderberg, M.; Söldner-Rembold, S.; Soleti, S. R.; Spentzouris, P.; Spitz, J.; St. John, J.; Strauss, T.; Szelc, A. M.; Tagg, N.; Terao, K.; Thomson, M.; Toups, M.; Tsai, Y.-T.; Tufanli, S.; Usher, T.; Van De Pontseele, W.; Van de Water, R. G.; Viren, B.; Weber, M.; Wickremasinghe, D. A.; Wolbers, S.; Wongjirad, T.; Woodruff, K.; Yang, T.; Yates, L.; Zeller, G. P.; Zennamo, J.; Zhang, C.
2018-01-01
The development and operation of liquid-argon time-projection chambers for neutrino physics has created a need for new approaches to pattern recognition in order to fully exploit the imaging capabilities offered by this technology. Whereas the human brain can excel at identifying features in the recorded events, it is a significant challenge to develop an automated, algorithmic solution. The Pandora Software Development Kit provides functionality to aid the design and implementation of pattern-recognition algorithms. It promotes the use of a multi-algorithm approach to pattern recognition, in which individual algorithms each address a specific task in a particular topology. Many tens of algorithms then carefully build up a picture of the event and, together, provide a robust automated pattern-recognition solution. This paper describes details of the chain of over one hundred Pandora algorithms and tools used to reconstruct cosmic-ray muon and neutrino events in the MicroBooNE detector. Metrics that assess the current pattern-recognition performance are presented for simulated MicroBooNE events, using a selection of final-state event topologies.
Vertically integrated photonic multichip module architecture for vision applications
NASA Astrophysics Data System (ADS)
Tanguay, Armand R., Jr.; Jenkins, B. Keith; von der Malsburg, Christoph; Mel, Bartlett; Holt, Gary; O'Brien, John D.; Biederman, Irving; Madhukar, Anupam; Nasiatka, Patrick; Huang, Yunsong
2000-05-01
The development of a truly smart camera, with inherent capability for low latency semi-autonomous object recognition, tracking, and optimal image capture, has remained an elusive goal notwithstanding tremendous advances in the processing power afforded by VLSI technologies. These features are essential for a number of emerging multimedia-based applications, including enhanced augmented reality systems. Recent advances in understanding of the mechanisms of biological vision systems, together with similar advances in hybrid electronic/photonic packaging technology, offer the possibility of artificial biologically-inspired vision systems with significantly different, yet complementary, strengths and weaknesses. We describe herein several system implementation architectures based on spatial and temporal integration techniques within a multilayered structure, as well as the corresponding hardware implementation of these architectures based on the hybrid vertical integration of multiple silicon VLSI vision chips by means of dense 3D photonic interconnections.
The Pandora multi-algorithm approach to automated pattern recognition in LAr TPC detectors
NASA Astrophysics Data System (ADS)
Marshall, J. S.; Blake, A. S. T.; Thomson, M. A.; Escudero, L.; de Vries, J.; Weston, J.;
2017-09-01
The development and operation of Liquid Argon Time Projection Chambers (LAr TPCs) for neutrino physics has created a need for new approaches to pattern recognition, in order to fully exploit the superb imaging capabilities offered by this technology. The Pandora Software Development Kit provides functionality to aid the process of designing, implementing and running pattern recognition algorithms. It promotes the use of a multi-algorithm approach to pattern recognition: individual algorithms each address a specific task in a particular topology; a series of many tens of algorithms then carefully builds up a picture of the event. The input to the Pandora pattern recognition is a list of 2D Hits. The output from the chain of over 70 algorithms is a hierarchy of reconstructed 3D Particles, each with an identified particle type, vertex and direction.
Posture recognition based on fuzzy logic for home monitoring of the elderly.
Brulin, Damien; Benezeth, Yannick; Courtial, Estelle
2012-09-01
We propose in this paper a computer vision-based posture recognition method for home monitoring of the elderly. The proposed system performs human detection prior to the posture analysis; posture recognition is performed only on a human silhouette. The human detection approach has been designed to be robust to different environmental stimuli. Thus, posture is analyzed with simple and efficient features that are not designed to manage constraints related to the environment but only designed to describe human silhouettes. The posture recognition method, based on fuzzy logic, identifies four static postures and is robust to variation in the distance between the camera and the person, and to the person's morphology. With an accuracy of 74.29% of satisfactory posture recognition, this approach can detect emergency situations such as a fall within a health smart home.
Perception and Processing of Faces in the Human Brain Is Tuned to Typical Feature Locations
Schwarzkopf, D. Samuel; Alvarez, Ivan; Lawson, Rebecca P.; Henriksson, Linda; Kriegeskorte, Nikolaus; Rees, Geraint
2016-01-01
Faces are salient social stimuli whose features attract a stereotypical pattern of fixations. The implications of this gaze behavior for perception and brain activity are largely unknown. Here, we characterize and quantify a retinotopic bias implied by typical gaze behavior toward faces, which leads to eyes and mouth appearing most often in the upper and lower visual field, respectively. We found that the adult human visual system is tuned to these contingencies. In two recognition experiments, recognition performance for isolated face parts was better when they were presented at typical, rather than reversed, visual field locations. The recognition cost of reversed locations was equal to ∼60% of that for whole face inversion in the same sample. Similarly, an fMRI experiment showed that patterns of activity evoked by eye and mouth stimuli in the right inferior occipital gyrus could be separated with significantly higher accuracy when these features were presented at typical, rather than reversed, visual field locations. Our findings demonstrate that human face perception is determined not only by the local position of features within a face context, but by whether features appear at the typical retinotopic location given normal gaze behavior. Such location sensitivity may reflect fine-tuning of category-specific visual processing to retinal input statistics. Our findings further suggest that retinotopic heterogeneity might play a role for face inversion effects and for the understanding of conditions affecting gaze behavior toward faces, such as autism spectrum disorders and congenital prosopagnosia. SIGNIFICANCE STATEMENT Faces attract our attention and trigger stereotypical patterns of visual fixations, concentrating on inner features, like eyes and mouth. Here we show that the visual system represents face features better when they are shown at retinal positions where they typically fall during natural vision. 
When facial features were shown at typical (rather than reversed) visual field locations, they were discriminated better by humans and could be decoded with higher accuracy from brain activity patterns in the right occipital face area. This suggests that brain representations of face features do not cover the visual field uniformly. It may help us understand the well-known face-inversion effect and conditions affecting gaze behavior toward faces, such as prosopagnosia and autism spectrum disorders. PMID:27605606
Laskowska-Macios, Karolina; Zapasnik, Monika; Hu, Tjing-Tjing; Kossut, Malgorzata; Arckens, Lutgarde; Burnat, Kalina
2015-10-01
Pattern vision deprivation (BD) can induce permanent deficits in global motion perception. The impact of timing and duration of BD on the maturation of the central and peripheral visual field representations in cat primary visual areas 17 and 18 remains unknown. We compared early BD, from eye opening for 2, 4, or 6 months, with late onset BD, after 2 months of normal vision, using the expression pattern of the visually driven activity reporter gene zif268 as readout. Decreasing zif268 mRNA levels between months 2 and 4 characterized the normal maturation of the (supra)granular layers of the central and peripheral visual field representations in areas 17 and 18. In general, all BD conditions had higher than normal zif268 levels. In area 17, early BD induced a delayed decrease, beginning later in peripheral than in central area 17. In contrast, the decrease occurred between months 2 and 4 throughout area 18. Lack of pattern vision stimulation during the first 4 months of life therefore has a different impact on the development of areas 17 and 18. A high zif268 expression level at a time when normal vision is restored seems to predict the capacity of a visual area to compensate for BD. © The Author 2014. Published by Oxford University Press.
X-Eye: a novel wearable vision system
NASA Astrophysics Data System (ADS)
Wang, Yuan-Kai; Fan, Ching-Tang; Chen, Shao-Ang; Chen, Hou-Ye
2011-03-01
This paper proposes a smart portable device, named the X-Eye, which provides a gesture interface in a small device with a large display for photo capture and management. The wearable vision system is implemented with embedded systems and achieves real-time performance. The hardware includes an asymmetric dual-core processor with an ARM core and a DSP core. The display device is a pico projector, which is small in volume but can project a large screen. A triple buffering mechanism is designed for efficient memory management. Software functions are partitioned and pipelined for effective parallel execution. Gesture recognition begins with a color classification based on the expectation-maximization algorithm and a Gaussian mixture model (GMM). To improve the performance of the GMM, we devise a look-up table (LUT) technique. Fingertips are then extracted, and geometrical features of the fingertip shape are matched to recognize the user's gesture commands. To verify the accuracy of the gesture recognition module, experiments are conducted in eight scenes with 400 test videos that include the challenges of colorful backgrounds, low illumination, and flickering. The whole system, including gesture recognition, runs at a frame rate of 22.9 FPS. Experimental results give a 99% recognition rate and demonstrate that this small-size, large-screen wearable system offers an effective gesture interface with real-time performance.
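The abstract above does not describe how its LUT speeds up the GMM. A plausible reading is that the per-pixel mixture evaluation is replaced by a table indexed by quantized color. The sketch below illustrates that idea only; the diagonal-covariance mixtures, the 16-bin quantization, and all names are assumptions rather than details of the X-Eye system.

```python
import math


def gaussian_diag(x, mean, var):
    """Density of a diagonal-covariance Gaussian at point x."""
    density = 1.0
    for xi, mi, vi in zip(x, mean, var):
        density *= math.exp(-0.5 * (xi - mi) ** 2 / vi) / math.sqrt(2.0 * math.pi * vi)
    return density


def gmm_density(x, components):
    """Mixture density; components is a list of (weight, mean, var) tuples."""
    return sum(w * gaussian_diag(x, m, v) for w, m, v in components)


def build_lut(target_gmm, background_gmm, bins=16):
    """Precompute the target/background decision for every quantized RGB cell."""
    step = 256 // bins
    lut = {}
    for r in range(bins):
        for g in range(bins):
            for b in range(bins):
                # Evaluate both mixtures once, at the centre of the cell.
                centre = tuple(c * step + step / 2 for c in (r, g, b))
                lut[(r, g, b)] = (gmm_density(centre, target_gmm)
                                  > gmm_density(centre, background_gmm))
    return lut


def classify_pixel(lut, rgb, bins=16):
    """Classify a pixel with one table lookup instead of a GMM evaluation."""
    step = 256 // bins
    return lut[tuple(c // step for c in rgb)]
```

The table costs bins**3 mixture evaluations up front, after which each pixel needs only three integer divisions and one lookup. Coarser bins shrink the table but blur the decision boundary.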
Real Time Large Memory Optical Pattern Recognition.
1984-06-01
AD-Ri58 023. Real Time Large Memory Optical Pattern Recognition (U). Don A. Gregory, Research Directorate, US Army Missile Laboratory, Army Missile Command, Redstone Arsenal, AL. Technical Report RR-84-9, June 1984.
Classification and machine recognition of severe weather patterns
NASA Technical Reports Server (NTRS)
Wang, P. P.; Burns, R. C.
1976-01-01
Forecasting and warning of severe weather conditions are treated from the vantage point of pattern recognition by machine. Pictorial patterns and waveform patterns are distinguished. Time series data on sferics are dealt with by considering waveform patterns. A severe storm patterns recognition machine is described, along with schemes for detection via cross-correlation of time series (same channel or different channels). Syntactic and decision-theoretic approaches to feature extraction are discussed. Active and decayed tornados and thunderstorms, lightning discharges, and funnels and their related time series data are studied.
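Detection by cross-correlation of time series, as mentioned in the abstract above, can be illustrated with a normalized cross-correlation over a range of lags. This is a generic sketch, not the scheme from the report: the function names are hypothetical, the two series are assumed to have equal length, and the plain double loop favors clarity over speed.

```python
def cross_correlation(x, y, max_lag):
    """Normalized cross-correlation of two equal-length series.

    Returns a dict mapping each lag in [-max_lag, max_lag] to the
    correlation between x[i] and y[i + lag].
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    norm = (sum(v * v for v in dx) * sum(v * v for v in dy)) ** 0.5
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                total += dx[i] * dy[j]
        result[lag] = total / norm if norm else 0.0
    return result


def best_lag(x, y, max_lag):
    """Lag at which y best matches x."""
    correlations = cross_correlation(x, y, max_lag)
    return max(correlations, key=correlations.get)
```

Correlating a channel with itself corresponds to the same-channel case, and correlating two different channels to the different-channel case; a peak above a chosen threshold would flag a matching waveform.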
Vision-based posture recognition using an ensemble classifier and a vote filter
NASA Astrophysics Data System (ADS)
Ji, Peng; Wu, Changcheng; Xu, Xiaonong; Song, Aiguo; Li, Huijun
2016-10-01
Posture recognition is an important mode of Human-Robot Interaction (HRI). To segment the posture from an image, we propose an improved region growing algorithm combined with the Single Gauss Color Model. Experiments show that the improved region growing algorithm extracts a more complete and accurate posture than the traditional Single Gauss Model and region growing algorithms, while also eliminating similar regions from the background. For posture recognition, we propose a CNN ensemble classifier to improve the recognition rate, and a vote filter, applied to the sequence of recognition results, to reduce misjudgments during continuous gesture control. The proposed CNN ensemble classifier yields a 96.27% recognition rate, which is better than that of a single CNN classifier, and the proposed vote filter improves the recognition result and reduces misjudgments during consecutive gesture switches.
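The abstract does not specify how the vote filter works. A minimal sketch of one natural interpretation is a sliding-window majority vote over the most recent classifier outputs; the window length and the tie-breaking rule here are assumptions, not details from the paper.

```python
from collections import Counter, deque


def vote_filter(labels, window=5):
    """Replace each label by the majority label among the last `window` labels.

    Ties are resolved in favor of the label that appears earliest in the
    current window, which follows from Counter's insertion ordering.
    """
    recent = deque(maxlen=window)
    smoothed = []
    for label in labels:
        recent.append(label)
        smoothed.append(Counter(recent).most_common(1)[0][0])
    return smoothed
```

With a window of 3, a single spurious "B" inside a run of "A" results is suppressed, while a genuine switch from "A" to "B" is reported one frame late. A longer window rejects more noise at the cost of a longer delay.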
A real time mobile-based face recognition with fisherface methods
NASA Astrophysics Data System (ADS)
Arisandi, D.; Syahputra, M. F.; Putri, I. L.; Purnamawati, S.; Rahmat, R. F.; Sari, P. P.
2018-03-01
Face recognition is a research field in computer vision that studies how to learn faces and determine the identity of a face from a picture sent to the system. With face recognition technology, learning the identity of fellow students in a university becomes simpler: a student no longer needs to browse the student directory on the university's server and look for a person with certain facial traits. To achieve this goal, the face recognition application uses image processing methods in two phases, a pre-processing phase and a recognition phase. In the pre-processing phase, the system processes the input image into the best image for the recognition phase; the purpose of this phase is to reduce noise and increase the signal in the image. In the recognition phase, we use the Fisherface method, which was chosen because it copes well with limited data. In our experiment, the accuracy of face recognition using Fisherface is 90%.
Fuzzy Logic-Based Audio Pattern Recognition
NASA Astrophysics Data System (ADS)
Malcangi, M.
2008-11-01
Audio and audio-pattern recognition is becoming one of the most important technologies to automatically control embedded systems. Fuzzy logic may be the most important enabling methodology due to its ability to rapidly and economically model such applications. An audio and audio-pattern recognition engine based on fuzzy logic has been developed for use in very low-cost and deeply embedded systems to automate human-to-machine and machine-to-machine interaction. This engine consists of simple digital signal-processing algorithms for feature extraction and normalization, and a set of pattern-recognition rules manually tuned or automatically tuned by a self-learning process.
New Optical Transforms For Statistical Image Recognition
NASA Astrophysics Data System (ADS)
Lee, Sing H.
1983-12-01
In optical implementation of statistical image recognition, new optical transforms on large images for real-time recognition are of special interest. Several important linear transformations frequently used in statistical pattern recognition have now been optically implemented, including the Karhunen-Loeve transform (KLT), the Fukunaga-Koontz transform (FKT) and the least-squares linear mapping technique (LSLMT). The KLT performs principal components analysis on one class of patterns for feature extraction. The FKT performs feature extraction for separating two classes of patterns. The LSLMT separates multiple classes of patterns by maximizing the interclass differences and minimizing the intraclass variations.
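As a digital counterpart to the optically implemented KLT described above, the leading KLT basis vector is the dominant eigenvector of the class covariance matrix. The sketch below finds it by power iteration in plain Python; it is a generic numerical illustration, unrelated to the optical implementation, and the function names are hypothetical.

```python
def covariance(samples):
    """Return (mean, covariance matrix) of a list of equal-length vectors."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(d)]
    cov = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / n
            for j in range(d)] for i in range(d)]
    return mean, cov


def leading_klt_basis(cov, iterations=200):
    """Dominant eigenvector of cov via power iteration.

    The all-ones start vector is an arbitrary choice; the iteration fails
    only if that vector is exactly orthogonal to the dominant eigenvector.
    """
    d = len(cov)
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def project(sample, mean, basis):
    """KLT feature: coordinate of the centered sample along the basis vector."""
    return sum((s - m) * b for s, m, b in zip(sample, mean, basis))
```

For samples lying on the line y = 2x, the returned basis vector points along (1, 2), and projecting onto it keeps all of the variance in a single feature.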
Optimal pattern synthesis for speech recognition based on principal component analysis
NASA Astrophysics Data System (ADS)
Korsun, O. N.; Poliyev, A. V.
2018-02-01
An algorithm for building an optimal pattern for automatic speech recognition, which increases the probability of correct recognition, is developed and presented in this work. The optimal pattern is formed by decomposing an initial pattern into principal components, which reduces the dimension of the multi-parameter optimization problem. At the next step, the training samples are introduced and optimal estimates of the principal component decomposition coefficients are obtained by a numerical parameter optimization algorithm. Finally, we consider experimental results that show the improvement in speech recognition achieved by the proposed optimization algorithm.
Supervised linear dimensionality reduction with robust margins for object recognition
NASA Astrophysics Data System (ADS)
Dornaika, F.; Assoum, A.
2013-01-01
Linear Dimensionality Reduction (LDR) techniques have become increasingly important in computer vision and pattern recognition since they permit a relatively simple mapping of data onto a lower dimensional subspace, leading to simple and computationally efficient classification strategies. Recently, many linear discriminant methods have been developed in order to reduce the dimensionality of visual data and to enhance the discrimination between different groups or classes. Many existing linear embedding techniques rely on local margins to obtain good discrimination performance. However, outliers and within-class diversity have not been addressed by margin-based embedding methods. In this paper, we explore the use of different margin-based linear embedding methods. More precisely, we propose to use the concepts of Median miss and Median hit for building robust margin-based criteria. Based on such margins, we seek the projection directions (linear embedding) such that the sum of local margins is maximized. Our proposed approach has been applied to the problem of appearance-based face recognition. Experiments performed on four public face databases show that the proposed approach can give better generalization performance than the classic Average Neighborhood Margin Maximization (ANMM). Moreover, thanks to the use of robust margins, the proposed method degrades gracefully when label outliers contaminate the training data set. In particular, we show that the concept of Median hit is crucial for robust performance in the presence of outliers.
Kim, Min Young; Lee, Hyunkee; Cho, Hyungsuck
2008-04-10
One major research issue associated with 3D perception by robotic systems is the creation of efficient sensor systems that can generate dense range maps reliably. A visual sensor system for robotic applications is developed that is inherently equipped with two types of sensor: active trinocular vision and passive stereo vision. Conventional active vision systems use a large number of images with varying projected patterns to acquire a dense range map, and conventional passive vision systems work well only in specific environments with sufficient feature information. In contrast, a cooperative bidirectional sensor fusion method for this visual sensor system enables us to acquire a reliable dense range map using active and passive information simultaneously. The fusion algorithms are composed of two parts, one in which the passive stereo vision helps active vision and the other in which the active trinocular vision helps the passive one. The first part matches the laser patterns in stereo laser images with the help of intensity images; the second part utilizes an information fusion technique using the dynamic programming method in which image regions between laser patterns are matched pixel-by-pixel with the help of the fusion results obtained in the first part. To determine how the proposed sensor system and fusion algorithms can work in real applications, the sensor system is implemented on a robotic system, and the proposed algorithms are applied. A series of experimental tests is performed for a variety of configurations of robot and environments. The performance of the sensor system is discussed in detail.
The 3D Recognition, Generation, Fusion, Update and Refinement (RG4) Concept
NASA Technical Reports Server (NTRS)
Maluf, David A.; Cheeseman, Peter; Smelyanskyi, Vadim N.; Kuehnel, Frank; Morris, Robin D.; Norvig, Peter (Technical Monitor)
2001-01-01
This paper describes an active (real time) recognition strategy whereby information is inferred iteratively across several viewpoints in descent imagery. We will show how we use inverse theory within the context of parametric model generation, namely height and spectral reflection functions, to generate model assertions. Using this strategy in an active context implies that, from every viewpoint, the proposed system must refine its hypotheses taking into account the image and the effect of uncertainties as well. The proposed system employs probabilistic solutions to the problem of iteratively merging information (images) from several viewpoints. This involves feeding the posterior distribution from all previous images as a prior for the next view. Novel approaches will be developed to accelerate the inversion search using novel statistic implementations and reducing the model complexity using foveated vision. Foveated vision refers to imagery where the resolution varies across the image. In this paper, we allow the model to be foveated where the highest resolution region is called the foveation region. Typically, the images will have dynamic control of the location of the foveation region. For descent imagery in the Entry, Descent, and Landing (EDL) process, it is possible to have more than one foveation region. This research initiative is directed towards descent imagery in connection with NASA's EDL applications. Three-Dimensional Model Recognition, Generation, Fusion, Update, and Refinement (RGFUR or RG4) for height and the spectral reflection characteristics are in focus for various reasons, one of which is the prospect that their interpretation will provide for real time active vision for automated EDL.
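The idea of feeding the posterior from all previous images in as the prior for the next view can be sketched with a discrete Bayes update. This is only an illustration under stated assumptions: the hypothesis names, the likelihood values, and the function names are hypothetical, and the paper's actual models (parametric height and spectral reflection functions) are far richer than a two-hypothesis table.

```python
def bayes_update(prior, likelihood):
    """Return the normalized posterior over the same hypotheses as `prior`."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("likelihood rules out every hypothesis")
    return {h: v / total for h, v in unnormalized.items()}


def fuse_views(prior, view_likelihoods):
    """Fold in one view at a time; each posterior becomes the next prior."""
    belief = dict(prior)
    for likelihood in view_likelihoods:
        belief = bayes_update(belief, likelihood)
    return belief


# Hypothetical terrain hypotheses and per-view likelihoods P(image | hypothesis).
prior = {"flat": 0.5, "crater": 0.5}
views = [
    {"flat": 0.4, "crater": 0.6},
    {"flat": 0.2, "crater": 0.8},
]
belief = fuse_views(prior, views)
print(belief)
```

Because each update only needs the current belief and the newest likelihood, earlier images do not have to be revisited when a new viewpoint arrives.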
NASA Astrophysics Data System (ADS)
Li, Heng; Zeng, Yajie; Lu, Zhuofan; Cao, Xiaofei; Su, Xiaofan; Sui, Xiaohong; Wang, Jing; Chai, Xinyu
2018-04-01
Objective. Retinal prosthesis devices have shown great value in restoring some sight for individuals with profoundly impaired vision, but the visual acuity and visual field provided by prostheses greatly limit recipients’ visual experience. In this paper, we employ computer vision approaches to expand the perceptible visual field in patients who may be implanted with a high-density retinal prosthesis while maintaining visual acuity as much as possible. Approach. We propose an optimized content-aware image retargeting method, introducing salient object detection based on color and intensity-difference contrast, that aims to remap the important information of a scene into a small visual field and preserve its original scale as much as possible. It may improve prosthetic recipients’ perceived visual field and aid in performing some visual tasks (e.g. object detection and object recognition). To verify our method, psychophysical experiments on counting objects and recognizing objects are conducted under simulated prosthetic vision. As control, we use three other image retargeting techniques: Cropping, Scaling, and seam-assisted shrinkability. Main results. Results show that our method preserves more key features and achieves significantly higher recognition accuracy than the other three image retargeting methods under conditions of small visual field and low resolution. Significance. The proposed method helps expand the perceived visual field of prosthesis recipients and improve their object detection and recognition performance. It suggests that our method may provide an effective option for the image processing module in future high-density retinal implants.
Camuñas-Mesa, Luis A; Domínguez-Cordero, Yaisel L; Linares-Barranco, Alejandro; Serrano-Gotarredona, Teresa; Linares-Barranco, Bernabé
2018-01-01
Convolutional Neural Networks (ConvNets) are a particular type of neural network often used for many applications like image recognition, video analysis or natural language processing. They are inspired by the human brain, following a specific organization of the connectivity pattern between layers of neurons known as receptive field. These networks have been traditionally implemented in software, but they are becoming more computationally expensive as they scale up, having limitations for real-time processing of high-speed stimuli. On the other hand, hardware implementations show difficulties to be used for different applications, due to their reduced flexibility. In this paper, we propose a fully configurable event-driven convolutional node with rate saturation mechanism that can be used to implement arbitrary ConvNets on FPGAs. This node includes a convolutional processing unit and a routing element which allows to build large 2D arrays where any multilayer structure can be implemented. The rate saturation mechanism emulates the refractory behavior in biological neurons, guaranteeing a minimum separation in time between consecutive events. A 4-layer ConvNet with 22 convolutional nodes trained for poker card symbol recognition has been implemented in a Spartan6 FPGA. This network has been tested with a stimulus where 40 poker cards were observed by a Dynamic Vision Sensor (DVS) in 1 s time. Different slow-down factors were applied to characterize the behavior of the system for high speed processing. For slow stimulus play-back, a 96% recognition rate is obtained with a power consumption of 0.85 mW. At maximum play-back speed, a traffic control mechanism downsamples the input stimulus, obtaining a recognition rate above 63% when less than 20% of the input events are processed, demonstrating the robustness of the network.
PMID:29515349
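The rate saturation mechanism, which guarantees a minimum separation in time between consecutive events, can be sketched in software as a refractory filter over event timestamps. This is a simplified illustration, not the FPGA implementation described in the abstract: the function name, the integer microsecond timestamps, and the choice to drop (rather than delay) events inside the refractory window are assumptions.

```python
def rate_saturate(event_times_us, refractory_us):
    """Keep only events at least `refractory_us` after the last kept event.

    Timestamps are integers in microseconds (assumed unit). Events that fall
    inside the refractory window of the previously emitted event are dropped.
    """
    emitted = []
    last = None
    for t in sorted(event_times_us):
        if last is None or t - last >= refractory_us:
            emitted.append(t)
            last = t
    return emitted


# Hypothetical event stream for one neuron, with a 1000 microsecond refractory period.
events = [0, 400, 1000, 1200, 2500]
kept = rate_saturate(events, refractory_us=1000)
print(kept)
```

With a refractory period of 1000 microseconds, the output rate of this sketch can never exceed 1000 events per second, whatever the input rate.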
Optimization of Visual Information Presentation for Visual Prosthesis.
Guo, Fei; Yang, Yuan; Gao, Yong
2018-01-01
Visual prosthesis applying electrical stimulation to restore visual function for the blind has promising prospects. However, due to the low resolution, limited visual field, and the low dynamic range of the visual perception, huge loss of information occurred when presenting daily scenes. The ability of object recognition in real-life scenarios is severely restricted for prosthetic users. To overcome the limitations, optimizing the visual information in the simulated prosthetic vision has been the focus of research. This paper proposes two image processing strategies based on a salient object detection technique. The two processing strategies enable the prosthetic implants to focus on the object of interest and suppress the background clutter. Psychophysical experiments show that techniques such as foreground zooming with background clutter removal and foreground edge detection with background reduction have positive impacts on the task of object recognition in simulated prosthetic vision. By using edge detection and zooming technique, the two processing strategies significantly improve the recognition accuracy of objects. We can conclude that the visual prosthesis using our proposed strategy can assist the blind to improve their ability to recognize objects. The results will provide effective solutions for the further development of visual prosthesis.
PMID:29731769
Neural system applied on an invariant industrial character recognition
NASA Astrophysics Data System (ADS)
Lecoeuche, Stephane; Deguillemont, Denis; Dubus, Jean-Paul
1997-04-01
Besides the variety of fonts, character recognition systems for the industrial world are confronted with specific problems such as the variety of supports (metal, wood, paper, ceramics, ...), the variety of markings (printing, engraving, ...) and the lighting conditions. We present a system that is able to solve a part of this problem. It implements a collaboration between two neural networks. The first network, specialized in vision, allows the system to extract the character from an image. Besides this capability, we have equipped our system with characteristics allowing it to obtain an invariant model from the presented character. Thus, whatever the position, size and orientation of the character during capture, the model presented to the input of the second network will be identical. The second network, thanks to a learning phase, permits us to obtain a character recognition system independent of the type of fonts used. Furthermore, its capabilities of generalization permit us to recognize degraded and/or distorted characters. A feedback loop between the two networks permits the first one to modify the quality of vision. The cooperation between these two networks allows us to recognize characters whatever the support and the marking.
On-Chip Imaging of Schistosoma haematobium Eggs in Urine for Diagnosis by Computer Vision
Linder, Ewert; Grote, Anne; Varjo, Sami; Linder, Nina; Lebbad, Marianne; Lundin, Mikael; Diwan, Vinod; Hannuksela, Jari; Lundin, Johan
2013-01-01
Background Microscopy, being relatively easy to perform at low cost, is the universal diagnostic method for detection of most globally important parasitic infections. As quality control is hard to maintain, misdiagnosis is common, which affects both estimates of parasite burdens and patient care. Novel techniques for high-resolution imaging and image transfer over data networks may offer solutions to these problems through provision of education, quality assurance and diagnostics. Imaging can be done directly on image sensor chips, a technique possible to exploit commercially for the development of inexpensive “mini-microscopes”. Images can be transferred for analysis both visually and by computer vision both at point-of-care and at remote locations. Methods/Principal Findings Here we describe imaging of helminth eggs using mini-microscopes constructed from webcams and mobile phone cameras. The results show that an inexpensive webcam, stripped off its optics to allow direct application of the test sample on the exposed surface of the sensor, yields images of Schistosoma haematobium eggs, which can be identified visually. Using a highly specific image pattern recognition algorithm, 4 out of 5 eggs observed visually could be identified. Conclusions/Significance As proof of concept we show that an inexpensive imaging device, such as a webcam, may be easily modified into a microscope, for the detection of helminth eggs based on on-chip imaging. Furthermore, algorithms for helminth egg detection by machine vision can be generated for automated diagnostics. The results can be exploited for constructing simple imaging devices for low-cost diagnostics of urogenital schistosomiasis and other neglected tropical infectious diseases. PMID:24340107
A digital retina-like low-level vision processor.
Mertoguno, S; Bourbakis, N G
2003-01-01
This correspondence presents the basic design and the simulation of a low level multilayer vision processor that emulates to some degree the functional behavior of a human retina. This retina-like multilayer processor is the lower part of an autonomous self-organized vision system, called Kydon, that could be used on visually impaired people with a damaged visual cerebral cortex. The Kydon vision system, however, is not presented in this paper. The retina-like processor consists of four major layers, where each of them is an array processor based on hexagonal, autonomous processing elements that perform a certain set of low level vision tasks, such as smoothing and light adaptation, edge detection, segmentation, line recognition and region-graph generation. At each layer, the array processor is a 2D array of k × m identical hexagonal autonomous cells that simultaneously execute certain low level vision tasks. Thus, the hardware design and the simulation at the transistor level of the processing elements (PEs) of the retina-like processor and its simulated functionality with illustrative examples are provided in this paper.
Spatial-frequency requirements for reading revisited
Kwon, MiYoung; Legge, Gordon E.
2012-01-01
Blur is one of many visual factors that can limit reading in both normal and low vision. Legge et al. [Legge, G. E., Pelli, D. G., Rubin, G. S., & Schleske, M. M. (1985). Psychophysics of reading. I. Normal vision. Vision Research, 25, 239–252.] measured reading speed for text that was low-pass filtered with a range of cutoff spatial frequencies. Above 2 cycles per letter (CPL) reading speed was constant at its maximum level, but decreased rapidly for lower cutoff frequencies. It remains unknown why the critical cutoff for reading speed is near 2 CPL. The goal of the current study was to ask whether the spatial-frequency requirement for rapid reading is related to the effects of cutoff frequency on letter recognition and the size of the visual span. Visual span profiles were measured by asking subjects to recognize letters in trigrams (random strings of three letters) flashed for 150 ms at varying letter positions left and right of the fixation point. Reading speed was measured with Rapid Serial Visual Presentation (RSVP). The size of the visual span and reading speed were measured for low-pass filtered stimuli with cutoff frequencies from 0.8 to 8 CPL. Low-pass letter recognition data, obtained under similar testing conditions, were available from our previous study (Kwon & Legge, 2011). We found that the spatial-frequency requirement for reading is very similar to the spatial-frequency requirements for the size of the visual span and single letter recognition. The critical cutoff frequencies for reading speed, the size of the visual span and a contrast-invariant measure of letter recognition were all near 1.4 CPL, which is lower than the previous estimate of 2 CPL for reading speed. Although correlational in nature, these results are consistent with the hypothesis that the size of the visual span is closely linked to reading speed. PMID:22521659
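Cutoff frequencies in cycles per letter (CPL) are object-based: the equivalent retinal frequency in cycles per degree depends on how large the letter is in visual angle. A small worked conversion, assuming the letter size is given in degrees of visual angle (the sizes below are hypothetical and are not taken from the study):

```python
def cutoff_cycles_per_degree(cutoff_cpl, letter_size_deg):
    """Convert a cutoff in cycles per letter to cycles per degree.

    cycles/degree = (cycles/letter) / (degrees/letter)
    """
    if letter_size_deg <= 0:
        raise ValueError("letter size must be positive")
    return cutoff_cpl / letter_size_deg


# The 1.4 CPL critical cutoff reported above, at three hypothetical letter sizes.
for size_deg in (0.5, 1.0, 2.0):
    cpd = cutoff_cycles_per_degree(1.4, size_deg)
    print(f"{size_deg} deg letters: {cpd:.2f} cycles/degree")
```

The same CPL cutoff therefore corresponds to a lower retinal frequency as letters get larger.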
The Need for Careful Data Collection for Pattern Recognition in Digital Pathology.
Marée, Raphaël
2017-01-01
Effective pattern recognition requires carefully designed ground-truth datasets. In this technical note, we first summarize potential data collection issues in digital pathology and then propose guidelines to build more realistic ground-truth datasets and to control their quality. We hope our comments will foster the effective application of pattern recognition approaches in digital pathology.
Pattern recognition: A basis for remote sensing data analysis
NASA Technical Reports Server (NTRS)
Swain, P. H.
1973-01-01
The theoretical basis for the pattern-recognition-oriented algorithms used in the multispectral data analysis software system is discussed. A model of a general pattern recognition system is presented. The receptor or sensor is usually a multispectral scanner. For each ground resolution element the receptor produces n numbers or measurements corresponding to the n channels of the scanner.
Optical Pattern Recognition With Self-Amplification
NASA Technical Reports Server (NTRS)
Liu, Hua-Kuang
1994-01-01
In optical pattern recognition system with self-amplification, no reference beam used in addressing mode. Polarization of laser beam and orientation of photorefractive crystal chosen to maximize photorefractive effect. Intensity of recognition signal is orders of magnitude greater than other optical correlators. Apparatus regarded as real-time or quasi-real-time optical pattern recognizer with memory and reprogrammability.
33 CFR 334.390 - Atlantic Ocean south of entrance to Chesapeake Bay; firing range.
Code of Federal Regulations, 2014 CFR
2014-07-01
... Fleet Combat Center, Atlantic, Dam Neck, Virginia Beach, Virginia. After darkness, night vision systems... firing on the range during periods of low visibility which would prevent the recognition of a vessel (to...
33 CFR 334.390 - Atlantic Ocean south of entrance to Chesapeake Bay; firing range.
Code of Federal Regulations, 2011 CFR
2011-07-01
... Fleet Combat Center, Atlantic, Dam Neck, Virginia Beach, Virginia. After darkness, night vision systems... firing on the range during periods of low visibility which would prevent the recognition of a vessel (to...
33 CFR 334.390 - Atlantic Ocean south of entrance to Chesapeake Bay; firing range.
Code of Federal Regulations, 2012 CFR
2012-07-01
... Fleet Combat Center, Atlantic, Dam Neck, Virginia Beach, Virginia. After darkness, night vision systems... firing on the range during periods of low visibility which would prevent the recognition of a vessel (to...
33 CFR 334.390 - Atlantic Ocean south of entrance to Chesapeake Bay; firing range.
Code of Federal Regulations, 2013 CFR
2013-07-01
... Fleet Combat Center, Atlantic, Dam Neck, Virginia Beach, Virginia. After darkness, night vision systems... firing on the range during periods of low visibility which would prevent the recognition of a vessel (to...
33 CFR 334.390 - Atlantic Ocean south of entrance to Chesapeake Bay; firing range.
Code of Federal Regulations, 2010 CFR
2010-07-01
... Fleet Combat Center, Atlantic, Dam Neck, Virginia Beach, Virginia. After darkness, night vision systems... firing on the range during periods of low visibility which would prevent the recognition of a vessel (to...
Deployment of spatial attention to words in central and peripheral vision.
Ducrot, Stéphanie; Grainger, Jonathan
2007-05-01
Four perceptual identification experiments examined the influence of spatial cues on the recognition of words presented in central vision (with fixation on either the first or last letter of the target word) and in peripheral vision (displaced left or right of a central fixation point). Stimulus location had a strong effect on word identification accuracy in both central and peripheral vision, showing a strong right visual field superiority that did not depend on eccentricity. Valid spatial cues improved word identification for peripherally presented targets but were largely ineffective for centrally presented targets. Effects of spatial cuing interacted with visual field effects in Experiment 1, with valid cues reducing the right visual field superiority for peripherally located targets, but this interaction was shown to depend on the type of neutral cue. These results provide further support for the role of attentional factors in visual field asymmetries obtained with targets in peripheral vision but not with centrally presented targets.
Christensen, Michael
2015-01-01
Voter studies conducted in the United States during the first decades after World War II transformed social scientific research on democracy. Especially important were the rapid innovations in survey research methods developed by two prominent research centers at Columbia University and the University of Michigan. This article argues that the Columbia and Michigan voter studies presented two visions for research on democracy. Where the Michigan research produced quantitative measures expressing the 'political behavior' of the electorate, the Columbia studies, and especially Paul F. Lazarsfeld, presented an alternative vision for qualitative research on political choice. Largely ignored by later voter studies, this vision prefigured much contemporary research on democracy that embraces a qualitative or interpretive approach. This article reconstructs Lazarsfeld's alternative vision, describes the institutional context in which scholars disregarded it in favor of formal quantitative models, and argues for its recognition as a forerunner to qualitative research on democratic processes. © 2015 Wiley Periodicals, Inc.
Enhanced computer vision with Microsoft Kinect sensor: a review.
Han, Jungong; Shao, Ling; Xu, Dong; Shotton, Jamie
2013-10-01
With the invention of the low-cost Microsoft Kinect sensor, high-resolution depth and visual (RGB) sensing has become available for widespread use. The complementary nature of the depth and visual information provided by the Kinect sensor opens up new opportunities to solve fundamental problems in computer vision. This paper presents a comprehensive review of recent Kinect-based computer vision algorithms and applications. The reviewed approaches are classified according to the type of vision problems that can be addressed or enhanced by means of the Kinect sensor. The covered topics include preprocessing, object tracking and recognition, human activity analysis, hand gesture analysis, and indoor 3-D mapping. For each category of methods, we outline their main algorithmic contributions and summarize their advantages/differences compared to their RGB counterparts. Finally, we give an overview of the challenges in this field and future research trends. This paper is expected to serve as a tutorial and source of references for Kinect-based computer vision researchers.
Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades.
Orchard, Garrick; Jayawant, Ajinkya; Cohen, Gregory K; Thakor, Nitish
2015-01-01
Creating datasets for Neuromorphic Vision is a challenging task. A lack of available recordings from Neuromorphic Vision sensors means that data must typically be recorded specifically for dataset creation rather than collecting and labeling existing data. The task is further complicated by a desire to simultaneously provide traditional frame-based recordings to allow for direct comparison with traditional Computer Vision algorithms. Here we propose a method for converting existing Computer Vision static image datasets into Neuromorphic Vision datasets using an actuated pan-tilt camera platform. Moving the sensor rather than the scene or image is a more biologically realistic approach to sensing and eliminates timing artifacts introduced by monitor updates when simulating motion on a computer monitor. We present conversion of two popular image datasets (MNIST and Caltech101) which have played important roles in the development of Computer Vision, and we provide performance metrics on these datasets using spike-based recognition algorithms. This work contributes datasets for future use in the field, as well as results from spike-based algorithms against which future works can compare. Furthermore, by converting datasets already popular in Computer Vision, we enable more direct comparison with frame-based approaches.
Sparsey™: event recognition via deep hierarchical sparse distributed codes
Rinkus, Gerard J.
2014-01-01
The visual cortex's hierarchical, multi-level organization is captured in many biologically inspired computational vision models, the general idea being that progressively larger scale (spatially/temporally) and more complex visual features are represented in progressively higher areas. However, most earlier models use localist representations (codes) in each representational field (which we equate with the cortical macrocolumn, “mac”), at each level. In localism, each represented feature/concept/event (hereinafter “item”) is coded by a single unit. The model we describe, Sparsey, is hierarchical as well but crucially, it uses sparse distributed coding (SDC) in every mac in all levels. In SDC, each represented item is coded by a small subset of the mac's units. The SDCs of different items can overlap and the size of overlap between items can be used to represent their similarity. The difference between localism and SDC is crucial because SDC allows the two essential operations of associative memory, storing a new item and retrieving the best-matching stored item, to be done in fixed time for the life of the model. Since the model's core algorithm, which does both storage and retrieval (inference), makes a single pass over all macs on each time step, the overall model's storage/retrieval operation is also fixed-time, a criterion we consider essential for scalability to the huge (“Big Data”) problems. A 2010 paper described a nonhierarchical version of this model in the context of purely spatial pattern processing. Here, we elaborate a fully hierarchical model (arbitrary numbers of levels and macs per level), describing novel model principles like progressive critical periods, dynamic modulation of principal cells' activation functions based on a mac-level familiarity measure, representation of multiple simultaneously active hypotheses, a novel method of time warp invariant recognition, and we report results showing learning/recognition of spatiotemporal patterns. 
PMID:25566046
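The contrast between localist codes and sparse distributed codes (SDCs) can be made concrete with sets of active unit indices, where the size of the overlap between two codes stands in for similarity. This is only a sketch of the representation: the unit indices, item names, and helper functions are hypothetical, and the linear scan in `best_match` is not Sparsey's fixed-time storage and retrieval algorithm.

```python
def overlap(code_a, code_b):
    """Number of units active in both codes."""
    return len(code_a & code_b)


def best_match(query, stored):
    """Name of the stored item whose code shares the most units with `query`.

    Plain linear scan over stored items, used here only for illustration.
    """
    return max(stored, key=lambda name: overlap(query, stored[name]))


# Each item is coded by a small subset (4 units) of a hypothetical 16-unit mac.
stored = {
    "A": frozenset({1, 5, 9, 12}),
    "B": frozenset({1, 5, 9, 14}),   # similar to A: shares 3 of 4 units
    "C": frozenset({2, 6, 10, 13}),  # dissimilar to A: shares none
}

query = frozenset({1, 5, 8, 14})     # a noisy version of B
print(overlap(stored["A"], stored["B"]), overlap(stored["A"], stored["C"]))
print(best_match(query, stored))
```

In a localist scheme each item would occupy exactly one unit, so two distinct items could never share units and the code itself would carry no graded similarity.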
Spencer, Rand
2006-01-01
The goal is to analyze the long-term visual outcome of extremely low-birth-weight children. This is a retrospective analysis of eyes of extremely low-birth-weight children on whom vision testing was performed. Visual outcomes were studied by analyzing acuity outcomes at ≥36 months of adjusted age, correlating early acuity testing with final visual outcome and evaluating adverse risk factors for vision. Data from 278 eyes are included. Mean birth weight was 731 g, and mean gestational age at birth was 26 weeks. 248 eyes had grating acuity outcomes measured at 73 ± 36 months, and 183 eyes had recognition acuity testing at 76 ± 39 months. 54% had below normal grating acuities, and 66% had below normal recognition acuities. 27% of grating outcomes and 17% of recognition outcomes were ≤20/200. Abnormal early grating acuity testing was predictive of abnormal grating (P < .0001) and recognition (P = .0001) acuity testing at ≥3 years of age. A slower-than-normal rate of early visual development was predictive of abnormal grating acuity (P < .0001) and abnormal recognition acuity (P < .0001) at ≥3 years of age. Eyes diagnosed with maximal retinopathy of prematurity in zone I had lower acuity outcomes (P = .0002) than did those with maximal retinopathy of prematurity in zone II/III. Eyes of children born at ≤28 weeks gestational age had 4.1 times greater risk for abnormal recognition acuity than did those of children born at >28 weeks gestational age. Eyes of children with poorer general health after premature birth had a 5.3 times greater risk of abnormal recognition acuity. Long-term visual development in extremely low-birth-weight infants is problematic and associated with a high risk of subnormal acuity. Early acuity testing is useful in identifying children at greatest risk for long-term visual abnormalities. Gestational age at birth of ≤28 weeks was associated with a higher risk of an abnormal long-term outcome.
Gella, Laxmi; Raman, Rajiv; Kulothungan, Vaitheeswaran; Pal, Swakshyar Saumya; Ganesan, Suganeswari; Srinivasan, Sangeetha; Sharma, Tarun
2017-01-01
Purpose: The purpose of this study is to assess color vision abnormalities in a cohort of subjects with type II diabetes and elucidate associated risk factors. Methods: Subjects were recruited from follow-up cohort of Sankara Nethralaya Diabetic Retinopathy Epidemiology and Molecular Genetics Study I. Six hundred and seventy-three eyes of 343 subjects were included from this population-based study. All subjects underwent detailed ophthalmic evaluation, including the Farnsworth-Munsell 100 hue test. Results: The prevalence of impaired color vision (ICV) was 43% (CI: 39.2–46.7). Risk factors for ICV were higher heart rate (odds ratio [OR]: 1.043, [1.023–1.064]) and a higher intraocular pressure (IOP) (OR: 1.086, [1.012–1.165]). Subjects with clinically significant macular edema (CSME) had three times higher chance of having ICV. C1, C2, and C3 are the commonly found Early Treatment Diabetic Retinopathy Study (ETDRS) patterns. The moment of inertia method showed that the angle did not reveal any specific pattern of color vision defect. Although the major and minor radii were high in those with ICV, we did not observe polarity. Confusion index was high in subjects with ICV, indicating a severe color vision defect. Conclusions: The prevalence of ICV was 43% among subjects with type II diabetes. The most commonly observed patterns were increasing severities of the blue–yellow defect on ETDRS patterns, but no specific pattern was observed at the moment of inertia analysis. The presence of CSME, a higher heart rate, and IOP was significant risk factors for ICV. This functional impairment in color vision could significantly contribute to morbidity among subjects with diabetes. PMID:29044066
Gella, Laxmi; Raman, Rajiv; Kulothungan, Vaitheeswaran; Pal, Swakshyar Saumya; Ganesan, Suganeswari; Srinivasan, Sangeetha; Sharma, Tarun
2017-10-01
The purpose of this study is to assess color vision abnormalities in a cohort of subjects with type II diabetes and elucidate associated risk factors. Subjects were recruited from the follow-up cohort of the Sankara Nethralaya Diabetic Retinopathy Epidemiology and Molecular Genetics Study I. Six hundred and seventy-three eyes of 343 subjects were included from this population-based study. All subjects underwent detailed ophthalmic evaluation, including the Farnsworth-Munsell 100 hue test. The prevalence of impaired color vision (ICV) was 43% (CI: 39.2-46.7). Risk factors for ICV were higher heart rate (odds ratio [OR]: 1.043, [1.023-1.064]) and a higher intraocular pressure (IOP) (OR: 1.086, [1.012-1.165]). Subjects with clinically significant macular edema (CSME) had a three times higher chance of having ICV. C1, C2, and C3 are the commonly found Early Treatment Diabetic Retinopathy Study (ETDRS) patterns. The moment of inertia method showed that the angle did not reveal any specific pattern of color vision defect. Although the major and minor radii were high in those with ICV, we did not observe polarity. Confusion index was high in subjects with ICV, indicating a severe color vision defect. The prevalence of ICV was 43% among subjects with type II diabetes. The most commonly observed patterns were increasing severities of the blue-yellow defect on ETDRS patterns, but no specific pattern was observed in the moment of inertia analysis. The presence of CSME, a higher heart rate, and a higher IOP were significant risk factors for ICV. This functional impairment in color vision could significantly contribute to morbidity among subjects with diabetes.
An Effective 3D Shape Descriptor for Object Recognition with RGB-D Sensors
Liu, Zhong; Zhao, Changchen; Wu, Xingming; Chen, Weihai
2017-01-01
RGB-D sensors have been widely used in various areas of computer vision and graphics. A good descriptor will effectively improve the performance of operation. This article further analyzes the recognition performance of shape features extracted from multi-modality source data using RGB-D sensors. A hybrid shape descriptor is proposed as a representation of objects for recognition. We first extracted five 2D shape features from contour-based images and five 3D shape features over point cloud data to capture the global and local shape characteristics of an object. The recognition performance was tested for category recognition and instance recognition. Experimental results show that the proposed shape descriptor outperforms several common global-to-global shape descriptors and is comparable to some partial-to-global shape descriptors that achieved the best accuracies in category and instance recognition. Contribution of partial features and computational complexity were also analyzed. The results indicate that the proposed shape features are strong cues for object recognition and can be combined with other features to boost accuracy. PMID:28245553
ERIC Educational Resources Information Center
Annett, John
An experienced person, in such tasks as sonar detection and recognition, has a considerable superiority over a machine recognition system in auditory pattern recognition. However, people require extensive exposure to auditory patterns before achieving a high level of performance. In an attempt to discover a method of training people to recognize…
Degraded character recognition based on gradient pattern
NASA Astrophysics Data System (ADS)
Babu, D. R. Ramesh; Ravishankar, M.; Kumar, Manish; Wadera, Kevin; Raj, Aakash
2010-02-01
Degraded character recognition is a challenging problem in the field of Optical Character Recognition (OCR). The performance of an OCR system depends on the print quality of the input documents. Many OCR systems have been designed that correctly identify finely printed documents, but very little work has been reported on the recognition of degraded documents. The efficiency of an OCR system decreases if the input image is degraded. In this paper, a novel approach based on gradient patterns for recognizing degraded printed characters is proposed. The approach makes use of the gradient pattern of an individual character for recognition. Experiments were conducted on character images that were either digitally written or degraded characters extracted from historical documents, and the results were found to be satisfactory.
2015-10-02
ratio or physical layout than the training sample, or new vs old bananas . For our system, this is similar the multimodal case mentioned above; however...different modes. Foods with multiple “types” such as green, yellow, and brown bananas are seamlessly handled as well. Secondly, with hundreds or thousands...Recognition and Classification of Food Grains, Fruits and Flowers Using Machine Vision. INTERNATIONAL JOURNAL OF FOOD ENGINEERING, 5(4), 2009. [155] T. E
Automatic Target Recognition Based on Cross-Plot
Wong, Kelvin Kian Loong; Abbott, Derek
2011-01-01
Automatic target recognition that relies on rapid feature extraction of real-time target from photo-realistic imaging will enable efficient identification of target patterns. To achieve this objective, Cross-plots of binary patterns are explored as potential signatures for the observed target by high-speed capture of the crucial spatial features using minimal computational resources. Target recognition was implemented based on the proposed pattern recognition concept and tested rigorously for its precision and recall performance. We conclude that Cross-plotting is able to produce a digital fingerprint of a target that correlates efficiently and effectively to signatures of patterns having its identity in a target repository. PMID:21980508
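The abstract reports precision and recall but does not say how they were computed for the cross-plot signatures. As a reminder of the standard definitions only, here is a minimal sketch; the function name `precision_recall` and the boolean match lists are hypothetical and are not taken from the paper.

```python
def precision_recall(predicted, actual):
    """Standard precision and recall for binary match decisions.

    predicted[i] is truthy if the recognizer reported a target match,
    actual[i] is truthy if a target was really present (ground truth).
    """
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)        # true positives
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)    # false positives
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)    # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Precision penalizes false alarms against the repository, while recall penalizes missed targets, so the two are normally reported together.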
DOE Office of Scientific and Technical Information (OSTI.GOV)
Acciarri, R.; Adams, C.; An, R.
The development and operation of Liquid-Argon Time-Projection Chambers for neutrino physics has created a need for new approaches to pattern recognition in order to fully exploit the imaging capabilities offered by this technology. Whereas the human brain can excel at identifying features in the recorded events, it is a significant challenge to develop an automated, algorithmic solution. The Pandora Software Development Kit provides functionality to aid the design and implementation of pattern-recognition algorithms. It promotes the use of a multi-algorithm approach to pattern recognition, in which individual algorithms each address a specific task in a particular topology. Many tens of algorithms then carefully build up a picture of the event and, together, provide a robust automated pattern-recognition solution. This paper describes details of the chain of over one hundred Pandora algorithms and tools used to reconstruct cosmic-ray muon and neutrino events in the MicroBooNE detector. Metrics that assess the current pattern-recognition performance are presented for simulated MicroBooNE events, using a selection of final-state event topologies.
Acciarri, R.; Adams, C.; An, R.; ...
2018-01-29
The development and operation of Liquid-Argon Time-Projection Chambers for neutrino physics has created a need for new approaches to pattern recognition in order to fully exploit the imaging capabilities offered by this technology. Whereas the human brain can excel at identifying features in the recorded events, it is a significant challenge to develop an automated, algorithmic solution. The Pandora Software Development Kit provides functionality to aid the design and implementation of pattern-recognition algorithms. It promotes the use of a multi-algorithm approach to pattern recognition, in which individual algorithms each address a specific task in a particular topology. Many tens of algorithms then carefully build up a picture of the event and, together, provide a robust automated pattern-recognition solution. This paper describes details of the chain of over one hundred Pandora algorithms and tools used to reconstruct cosmic-ray muon and neutrino events in the MicroBooNE detector. Metrics that assess the current pattern-recognition performance are presented for simulated MicroBooNE events, using a selection of final-state event topologies.
Random-Profiles-Based 3D Face Recognition System
Joongrock, Kim; Sunjin, Yu; Sangyoun, Lee
2014-01-01
In this paper, a novel nonintrusive three-dimensional (3D) face modeling system for random-profile-based 3D face recognition is presented. Although recent two-dimensional (2D) face recognition systems can achieve a reliable recognition rate under certain conditions, their performance is limited by internal and external changes, such as illumination and pose variation. To address these issues, 3D face recognition, which uses 3D face data, has recently received much attention. However, the performance of 3D face recognition depends heavily on the precision of the acquired 3D face data, and it requires more computational power and storage capacity than 2D face recognition systems. In this paper, we present a nonintrusive 3D face modeling system composed of a stereo vision system and an invisible near-infrared line laser, which can be directly applied to profile-based 3D face recognition. We further propose a novel random-profile-based 3D face recognition method that is memory-efficient and pose-invariant. The experimental results demonstrate that the reconstructed 3D face data consists of more than 50 k 3D point clouds and that the method achieves a reliable recognition rate under pose variation. PMID:24691101
Fast neuromimetic object recognition using FPGA outperforms GPU implementations.
Orchard, Garrick; Martin, Jacob G; Vogelstein, R Jacob; Etienne-Cummings, Ralph
2013-08-01
Recognition of objects in still images has traditionally been regarded as a difficult computational problem. Although modern automated methods for visual object recognition have achieved steadily increasing recognition accuracy, even the most advanced computational vision approaches are unable to obtain performance equal to that of humans. This has led to the creation of many biologically inspired models of visual object recognition, among them the hierarchical model and X (HMAX) model. HMAX is traditionally known to achieve high accuracy in visual object recognition tasks at the expense of significant computational complexity. Increasing complexity, in turn, increases computation time, reducing the number of images that can be processed per unit time. In this paper we describe how the computationally intensive and biologically inspired HMAX model for visual object recognition can be modified for implementation on a commercial field-programmable gate array, specifically the Xilinx Virtex 6 ML605 evaluation board with XC6VLX240T FPGA. We show that with minor modifications to the traditional HMAX model we can perform recognition on images of size 128 × 128 pixels at a rate of 190 images per second with a less than 1% loss in recognition accuracy in both binary and multiclass visual object recognition tasks.
Finger Vein Recognition Based on Local Directional Code
Meng, Xianjing; Yang, Gongping; Yin, Yilong; Xiao, Rongyang
2012-01-01
Finger vein patterns are considered one of the most promising biometric authentication methods because of their security and convenience. Most of the currently available finger vein recognition methods utilize features from a segmented blood vessel network. As an improperly segmented network may degrade the recognition accuracy, binary pattern based methods have been proposed, such as Local Binary Pattern (LBP), Local Derivative Pattern (LDP) and Local Line Binary Pattern (LLBP). However, the rich directional information hidden in the finger vein pattern has not been fully exploited by the existing local patterns. Inspired by the Weber Local Descriptor (WLD), this paper presents a new direction based local descriptor called Local Directional Code (LDC) and applies it to finger vein recognition. In LDC, the local gradient orientation information is coded as an octonary decimal number. Experimental results show that the proposed method using LDC achieves better performance than methods using LLBP. PMID:23202194
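The abstract says only that LDC codes the local gradient orientation as an octonary (base-8) digit; it does not give the gradient operator or the sector layout. The sketch below is therefore an illustration under stated assumptions, not the authors' definition: it assumes central-difference gradients and eight equal 45-degree sectors, and the names `directional_code` and `code_image` are hypothetical.

```python
import math

def directional_code(gx, gy, bins=8):
    """Quantize a gradient direction into one of `bins` equal angular sectors.

    Assumption: sector 0 starts at angle 0 (positive x) and sectors run
    counter-clockwise. A zero gradient falls into sector 0.
    """
    angle = math.atan2(gy, gx) % (2 * math.pi)   # map to [0, 2*pi)
    width = 2 * math.pi / bins
    return int(angle // width) % bins

def code_image(img):
    """Return one directional code per interior pixel of a 2D intensity grid.

    Assumption: gradients are simple central differences.
    """
    height, width = len(img), len(img[0])
    codes = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            row.append(directional_code(gx, gy))
        codes.append(row)
    return codes
```

A descriptor built this way would typically be compared between images by matching or histogramming the per-pixel codes; how the paper does this is not stated in the abstract.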
Finger vein recognition based on local directional code.
Meng, Xianjing; Yang, Gongping; Yin, Yilong; Xiao, Rongyang
2012-11-05
Finger vein patterns are considered one of the most promising biometric authentication methods because of their security and convenience. Most of the currently available finger vein recognition methods utilize features from a segmented blood vessel network. As an improperly segmented network may degrade the recognition accuracy, binary pattern based methods have been proposed, such as Local Binary Pattern (LBP), Local Derivative Pattern (LDP) and Local Line Binary Pattern (LLBP). However, the rich directional information hidden in the finger vein pattern has not been fully exploited by the existing local patterns. Inspired by the Weber Local Descriptor (WLD), this paper presents a new direction based local descriptor called Local Directional Code (LDC) and applies it to finger vein recognition. In LDC, the local gradient orientation information is coded as an octonary decimal number. Experimental results show that the proposed method using LDC achieves better performance than methods using LLBP.
Uniform Local Binary Pattern Based Texture-Edge Feature for 3D Human Behavior Recognition.
Ming, Yue; Wang, Guangchao; Fan, Chunxiao
2015-01-01
With the rapid development of 3D somatosensory technology, human behavior recognition has become an important research field. Human behavior feature analysis has evolved from traditional 2D features to 3D features. In order to improve the performance of human activity recognition, a human behavior recognition method is proposed, which is based on a hybrid texture-edge local pattern coding feature extraction and integration of RGB and depth videos information. The paper mainly focuses on background subtraction on RGB and depth video sequences of behaviors, extracting and integrating historical images of the behavior outlines, feature extraction and classification. The new method of 3D human behavior recognition has achieved the rapid and efficient recognition of behavior videos. A large number of experiments show that the proposed method has faster speed and higher recognition rate. The recognition method has good robustness for different environmental colors, lightings and other factors. Meanwhile, the feature of mixed texture-edge uniform local binary pattern can be used in most 3D behavior recognition.
ERIC Educational Resources Information Center
Butler, Judy; Grier, Terry B.
2000-01-01
A few years ago, the Williamson County (Tennessee) School District developed a strategic plan to encourage volunteers' participation. The plan includes a vision, goals, and objectives; strategies for increasing community involvement; recognition for all volunteers (via the Shining Apple Award); and program evaluation. (MLH)
ERIC Educational Resources Information Center
Macmillan, C. J. B.
1985-01-01
The recognition of teaching as a special relationship among individuals is currently being overlooked in much contemporary educational research and policymaking. The author examines the philosophy of rationality in teaching and relates it to the educational vision presented in George Orwell's novel, "Nineteen Eighty-Four." (CB)
Traffic Sign Recognition with Invariance to Lighting in Dual-Focal Active Camera System
NASA Astrophysics Data System (ADS)
Gu, Yanlei; Panahpour Tehrani, Mehrdad; Yendo, Tomohiro; Fujii, Toshiaki; Tanimoto, Masayuki
In this paper, we present an automatic vision-based traffic sign recognition system, which can detect and classify traffic signs at long distance under different lighting conditions. To realize this purpose, the traffic sign recognition is developed in an originally proposed dual-focal active camera system. In this system, a telephoto camera is equipped as an assistant of a wide angle camera. The telephoto camera can capture a high accuracy image for an object of interest in the view field of the wide angle camera. The image from the telephoto camera provides enough information for recognition when the accuracy of traffic sign is low from the wide angle camera. In the proposed system, the traffic sign detection and classification are processed separately for different images from the wide angle camera and telephoto camera. Besides, in order to detect traffic sign from complex background in different lighting conditions, we propose a type of color transformation which is invariant to light changing. This color transformation is conducted to highlight the pattern of traffic signs by reducing the complexity of background. Based on the color transformation, a multi-resolution detector with cascade mode is trained and used to locate traffic signs at low resolution in the image from the wide angle camera. After detection, the system actively captures a high accuracy image of each detected traffic sign by controlling the direction and exposure time of the telephoto camera based on the information from the wide angle camera. Moreover, in classification, a hierarchical classifier is constructed and used to recognize the detected traffic signs in the high accuracy image from the telephoto camera. Finally, based on the proposed system, a set of experiments in the domain of traffic sign recognition is presented. The experimental results demonstrate that the proposed system can effectively recognize traffic signs at low resolution in different lighting conditions.
Kriegeskorte, Nikolaus
2015-11-24
Recent advances in neural network modeling have enabled major strides in computer vision and other artificial intelligence applications. Human-level visual recognition abilities are coming within reach of artificial systems. Artificial neural networks are inspired by the brain, and their computations could be implemented in biological neurons. Convolutional feedforward networks, which now dominate computer vision, take further inspiration from the architecture of the primate visual hierarchy. However, the current models are designed with engineering goals, not to model brain computations. Nevertheless, initial studies comparing internal representations between these models and primate brains find surprisingly similar representational spaces. With human-level performance no longer out of reach, we are entering an exciting new era, in which we will be able to build biologically faithful feedforward and recurrent computational models of how biological brains perform high-level feats of intelligence, including vision.
ClimateNet: A Machine Learning dataset for Climate Science Research
NASA Astrophysics Data System (ADS)
Prabhat, M.; Biard, J.; Ganguly, S.; Ames, S.; Kashinath, K.; Kim, S. K.; Kahou, S.; Maharaj, T.; Beckham, C.; O'Brien, T. A.; Wehner, M. F.; Williams, D. N.; Kunkel, K.; Collins, W. D.
2017-12-01
Deep Learning techniques have revolutionized commercial applications in Computer vision, speech recognition and control systems. The key for all of these developments was the creation of a curated, labeled dataset ImageNet, for enabling multiple research groups around the world to develop methods, benchmark performance and compete with each other. The success of Deep Learning can be largely attributed to the broad availability of this dataset. Our empirical investigations have revealed that Deep Learning is similarly poised to benefit the task of pattern detection in climate science. Unfortunately, labeled datasets, a key pre-requisite for training, are hard to find. Individual research groups are typically interested in specialized weather patterns, making it hard to unify, and share datasets across groups and institutions. In this work, we are proposing ClimateNet: a labeled dataset that provides labeled instances of extreme weather patterns, as well as associated raw fields in model and observational output. We develop a schema in NetCDF to enumerate weather pattern classes/types, store bounding boxes, and pixel-masks. We are also working on a TensorFlow implementation to natively import such NetCDF datasets, and are providing a reference convolutional architecture for binary classification tasks. Our hope is that researchers in Climate Science, as well as ML/DL, will be able to use (and extend) ClimateNet to make rapid progress in the application of Deep Learning for Climate Science research.
NASA Astrophysics Data System (ADS)
Chang, Wen-Li
2010-01-01
We investigate the influence of different ways of blurring patterns on pattern recognition in a Barabási-Albert scale-free Hopfield neural network (SFHN) with a small number of errors. Pattern recognition is an important function of information processing in the brain. Because of the heterogeneous degree distribution of a scale-free network, different blurred ways have different influences on pattern recognition for the same number of errors. Simulations show that, in partial recognition, the larger the loading ratio (the number of patterns relative to the average degree, P/⟨k⟩), the smaller the overlap of the SFHN. The influence of the directed (large) way is the largest and that of the directed (small) way is the smallest, while the random way is intermediate between them. When the ratio of the number of stored patterns to the size of the network, P/N, is less than 0.1, there are three families of overlap curves corresponding to the directed (small), random and directed (large) blurred ways, and these curves do not depend on the size of the network or the number of patterns. This phenomenon occurs only in the SFHN. These conclusions are helpful for understanding the relation between neural network structure and brain function.
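The overlap used above is the usual Hopfield measure of how close the network state is to a stored pattern, m = (1/N) Σ ξ_i s_i for ±1 units. The following minimal sketch shows that quantity with Hebbian storage and one synchronous update. It is an illustration only: the optional `adjacency` argument stands in for the scale-free wiring, the function names are hypothetical, and the paper's exact update rule and blurring procedures are not given in the abstract.

```python
def overlap(pattern, state):
    """Overlap m = (1/N) * sum_i pattern[i] * state[i] for +/-1 units."""
    return sum(p * s for p, s in zip(pattern, state)) / len(pattern)

def hebbian_weights(patterns, adjacency=None):
    """Hebbian couplings w[i][j] = sum over patterns of p[i] * p[j].

    Assumption: if `adjacency` is given, only connected pairs get a weight;
    with adjacency=None the network is fully connected.
    """
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and (adjacency is None or adjacency[i][j]):
                w[i][j] = float(sum(p[i] * p[j] for p in patterns))
    return w

def update(w, state):
    """One synchronous update; a unit with zero local field keeps its value."""
    new_state = []
    for i, row in enumerate(w):
        field = sum(wij * s for wij, s in zip(row, state))
        new_state.append(1 if field > 0 else -1 if field < 0 else state[i])
    return new_state

stored = [1, -1, 1, 1, -1, -1, 1, -1]
blurred = [-1, 1, 1, 1, -1, -1, 1, -1]   # first two units flipped
weights = hebbian_weights([stored])
recalled = update(weights, blurred)
```

In this toy case the overlap rises from 0.5 for the blurred input to 1.0 after one update. Choosing which units to flip by degree (hubs first or low-degree nodes first) would correspond loosely to the directed blurred ways described in the abstract.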
The graph neural network model.
Scarselli, Franco; Gori, Marco; Tsoi, Ah Chung; Hagenbuchner, Markus; Monfardini, Gabriele
2009-01-01
Many underlying relationships among data in several areas of science and engineering, e.g., computer vision, molecular chemistry, molecular biology, pattern recognition, and data mining, can be represented in terms of graphs. In this paper, we propose a new neural network model, called graph neural network (GNN) model, that extends existing neural network methods for processing the data represented in graph domains. This GNN model, which can directly process most of the practically useful types of graphs, e.g., acyclic, cyclic, directed, and undirected, implements a function $\tau(G, n) \in \mathbb{R}^m$ that maps a graph $G$ and one of its nodes $n$ into an $m$-dimensional Euclidean space. A supervised learning algorithm is derived to estimate the parameters of the proposed GNN model. The computational cost of the proposed algorithm is also considered. Some experimental results are shown to validate the proposed learning algorithm, and to demonstrate its generalization capabilities.
An Automatic Registration Algorithm for 3D Maxillofacial Model
NASA Astrophysics Data System (ADS)
Qiu, Luwen; Zhou, Zhongwei; Guo, Jixiang; Lv, Jiancheng
2016-09-01
3D image registration aims at aligning two 3D data sets in a common coordinate system and has been widely used in computer vision, pattern recognition and computer assisted surgery. One challenging problem in 3D registration is that point-wise correspondences between two point sets are often unknown a priori. In this work, we develop an automatic algorithm for the registration of 3D maxillofacial models, including facial surface models and skull models. Our proposed registration algorithm can achieve a good alignment between partial and whole maxillofacial models in spite of ambiguous matching, which has a potential application in oral and maxillofacial reparative and reconstructive surgery. The proposed algorithm includes three steps: (1) 3D-SIFT feature extraction and FPFH descriptor construction; (2) feature matching using SAC-IA; (3) coarse rigid alignment and refinement by ICP. Experiments on facial surfaces and mandible skull models demonstrate the efficiency and robustness of our algorithm.
Resolving human object recognition in space and time
Cichy, Radoslaw Martin; Pantazis, Dimitrios; Oliva, Aude
2014-01-01
A comprehensive picture of object processing in the human brain requires combining both spatial and temporal information about brain activity. Here, we acquired human magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI) responses to 92 object images. Multivariate pattern classification applied to MEG revealed the time course of object processing: whereas individual images were discriminated by visual representations early, ordinate and superordinate category levels emerged relatively later. Using representational similarity analysis, we combine human fMRI and MEG to show content-specific correspondence between early MEG responses and primary visual cortex (V1), and later MEG responses and inferior temporal (IT) cortex. We identified transient and persistent neural activities during object processing, with sources in V1 and IT. Finally, human MEG signals were correlated to single-unit responses in monkey IT. Together, our findings provide an integrated space- and time-resolved view of human object categorization during the first few hundred milliseconds of vision. PMID:24464044
Multiscale vector fields for image pattern recognition
NASA Technical Reports Server (NTRS)
Low, Kah-Chan; Coggins, James M.
1990-01-01
A uniform processing framework for low-level vision computing in which a bank of spatial filters maps the image intensity structure at each pixel into an abstract feature space is proposed. Some properties of the filters and the feature space are described. Local orientation is measured by a vector sum in the feature space as follows: each filter's preferred orientation along with the strength of the filter's output determine the orientation and the length of a vector in the feature space; the vectors for all filters are summed to yield a resultant vector for a particular pixel and scale. The orientation of the resultant vector indicates the local orientation, and the magnitude of the vector indicates the strength of the local orientation preference. Limitations of the vector sum method are discussed. Investigations show that the processing framework provides a useful, redundant representation of image structure across orientation and scale.
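The vector sum described above can be sketched in a few lines. This is a minimal illustration of the stated rule (each filter contributes a vector whose direction is its preferred orientation and whose length is its output strength), not the authors' implementation; the name `local_orientation` and the list-of-pairs input format are assumptions.

```python
import math

def local_orientation(responses):
    """Sum per-filter vectors for one pixel at one scale.

    responses: iterable of (preferred_angle_in_radians, output_strength).
    Returns (angle, magnitude) of the resultant vector: the angle is the
    local orientation estimate, the magnitude its strength.
    """
    x = sum(strength * math.cos(angle) for angle, strength in responses)
    y = sum(strength * math.sin(angle) for angle, strength in responses)
    return math.atan2(y, x), math.hypot(x, y)

# Two filters with equal output, tuned to 0 and 90 degrees.
angle, magnitude = local_orientation([(0.0, 1.0), (math.pi / 2, 1.0)])
```

Here the resultant points at 45 degrees with length √2. One general property of a plain vector sum is that equal responses at opposite angles cancel to zero magnitude; whether this is among the limitations the authors discuss is not stated in the abstract.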
Adaptive learning compressive tracking based on Markov location prediction
NASA Astrophysics Data System (ADS)
Zhou, Xingyu; Fu, Dongmei; Yang, Tao; Shi, Yanan
2017-03-01
Object tracking is an interdisciplinary research topic in image processing, pattern recognition, and computer vision which has theoretical and practical application value in video surveillance, virtual reality, and automatic navigation. Compressive tracking (CT) has many advantages, such as efficiency and accuracy. However, in the presence of object occlusion, abrupt motion and blur, similar objects, or scale changes, CT suffers from tracking drift. We propose Markov object location prediction to obtain the initial position of the object. CT is then used to locate the object accurately, and an adaptive updating strategy for the classifier parameters is given based on the confidence map. At the same time, scale features are extracted according to the object location, which allows object scale variations to be handled effectively. Experimental results show that the proposed algorithm has better tracking accuracy and robustness than current advanced algorithms and achieves real-time performance.
CulTO: An Ontology-Based Annotation Tool for Data Curation in Cultural Heritage
NASA Astrophysics Data System (ADS)
Garozzo, R.; Murabito, F.; Santagati, C.; Pino, C.; Spampinato, C.
2017-08-01
This paper proposes CulTO, a software tool relying on a computational ontology for Cultural Heritage domain modelling, with a specific focus on religious historical buildings, for supporting cultural heritage experts in their investigations. It is specifically designed to support annotation, automatic indexing, classification and curation of photographic data and text documents of historical buildings. CulTO also serves as a useful tool for Historical Building Information Modeling (H-BIM) by enabling semantic 3D data modeling and further enrichment with non-geometrical information of historical buildings through the inclusion of new concepts about historical documents, images, decay or deformation evidence as well as decorative elements into BIM platforms. CulTO is the result of a joint research effort between the Laboratory of Surveying and Architectural Photogrammetry "Luigi Andreozzi" and the PeRCeiVe Lab (Pattern Recognition and Computer Vision Lab) of the University of Catania,
Principal curve detection in complicated graph images
NASA Astrophysics Data System (ADS)
Liu, Yuncai; Huang, Thomas S.
2001-09-01
Finding principal curves in an image is an important low-level processing task in computer vision and pattern recognition. Principal curves are those curves in an image that represent boundaries or contours of objects of interest. In general, a principal curve should be smooth, satisfy a certain length constraint, and allow either smooth or sharp turning. In this paper, we present a method that can efficiently detect principal curves in complicated map images. A given feature image, obtained from edge detection of an intensity image or from a thinning operation on a pictorial map image, is first converted to a graph representation. In the graph image domain, principal curve detection is performed to identify useful image features. The shortest path and directional deviation schemes are used in our algorithm of principal curve detection, which proves to be very efficient when working with real graph images.
Crowd motion segmentation and behavior recognition fusing streak flow and collectiveness
NASA Astrophysics Data System (ADS)
Gao, Mingliang; Jiang, Jun; Shen, Jin; Zou, Guofeng; Fu, Guixia
2018-04-01
Crowd motion segmentation and crowd behavior recognition are two hot issues in computer vision. A number of methods have been proposed to tackle these two problems. Among the methods, flow dynamics is utilized to model the crowd motion, with little consideration of collective property. Moreover, the traditional crowd behavior recognition methods treat the local feature and dynamic feature separately and overlook the interconnection of topological and dynamical heterogeneity in complex crowd processes. A crowd motion segmentation method and a crowd behavior recognition method are proposed based on streak flow and crowd collectiveness. The streak flow is adopted to reveal the dynamical property of crowd motion, and the collectiveness is incorporated to reveal the structure property. Experimental results show that the proposed methods improve the crowd motion segmentation accuracy and the crowd recognition rates compared with the state-of-the-art methods.
Scene recognition based on integrating active learning with dictionary learning
NASA Astrophysics Data System (ADS)
Wang, Chengxi; Yin, Xueyan; Yang, Lin; Gong, Chengrong; Zheng, Caixia; Yi, Yugen
2018-04-01
Scene recognition is a significant topic in the field of computer vision. Most of the existing scene recognition models require a large number of labeled training samples to achieve good performance. However, labeling images manually is a time-consuming task and often unrealistic in practice. In order to obtain satisfactory recognition results when labeled samples are insufficient, this paper proposes a scene recognition algorithm named Integrating Active Learning and Dictionary Learning (IALDL). IALDL adopts projective dictionary pair learning (DPL) as the classifier and introduces an active learning mechanism into DPL to improve its performance. When constructing the sampling criterion in active learning, IALDL considers both uncertainty and representativeness in order to effectively select useful unlabeled samples from a given sample set for expanding the training dataset. Experimental results on three standard databases demonstrate the feasibility and validity of the proposed IALDL.
Constructive autoassociative neural network for facial recognition.
Fernandes, Bruno J T; Cavalcanti, George D C; Ren, Tsang I
2014-01-01
Autoassociative artificial neural networks have been used in many different computer vision applications. However, it is difficult to define the most suitable neural network architecture because this definition is based on previous knowledge and depends on the problem domain. To address this problem, we propose a constructive autoassociative neural network called CANet (Constructive Autoassociative Neural Network). CANet integrates the concepts of receptive fields and autoassociative memory in a dynamic architecture that changes the configuration of the receptive fields by adding new neurons in the hidden layer, while a pruning algorithm removes neurons from the output layer. Neurons in the CANet output layer present lateral inhibitory connections that improve the recognition rate. Experiments in face recognition and facial expression recognition show that the CANet outperforms other methods presented in the literature.
Body-Based Gender Recognition Using Images from Visible and Thermal Cameras
Nguyen, Dat Tien; Park, Kang Ryoung
2016-01-01
Gender information has many useful applications in computer vision systems, such as surveillance systems, counting the number of males and females in a shopping mall, access control systems in restricted areas, or any human-computer interaction system. In most previous studies, researchers attempted to recognize gender by using visible light images of the human face or body. However, shadow, illumination, and time of day greatly affect the performance of these methods. To overcome this problem, we propose a new gender recognition method based on the combination of visible light and thermal camera images of the human body. Experimental results, obtained with various feature extraction and fusion methods, show that our approach is efficient for gender recognition when its recognition rates are compared with those of conventional systems. PMID:26828487
NASA Technical Reports Server (NTRS)
Hong, J. P.
1971-01-01
Technique operates regardless of pattern rotation, translation or magnification and successfully detects out-of-register patterns. It improves accuracy and reduces cost of various optical character recognition devices and page readers and provides data input to computer.
A color-coded vision scheme for robotics
NASA Technical Reports Server (NTRS)
Johnson, Kelley Tina
1991-01-01
Most vision systems for robotic applications rely entirely on the extraction of information from gray-level images. Humans, however, regularly depend on color to discriminate between objects. Therefore, the inclusion of color in a robot vision system seems a natural extension of the existing gray-level capabilities. A method for robot object recognition using a color-coding classification scheme is discussed. The scheme is based on an algebraic system in which a two-dimensional color image is represented as a polynomial of two variables. The system is then used to find the color contour of objects. In a controlled environment, such as that of the in-orbit space station, a particular class of objects can thus be quickly recognized by its color.
Vision in laboratory rodents-Tools to measure it and implications for behavioral research.
Leinonen, Henri; Tanila, Heikki
2017-07-29
Mice and rats are nocturnal mammals and their vision is specialized for detection of motion and contrast in dim light conditions. These species possess a large proportion of UV-sensitive cones in their retinas and the majority of their optic nerve axons target superior colliculus rather than visual cortex. Therefore, it was a widely held belief that laboratory rodents hardly utilize vision during day-time behavior. This dogma is being questioned as accumulating evidence suggests that laboratory rodents are able to perform complex visual functions, such as perceiving subjective contours, and that declined vision may affect their performance in many behavioral tasks. For instance, genetic engineering may have unexpected consequences on vision, as mouse models of Alzheimer's and Huntington's diseases have declined visual function. Rodent vision can be tested in numerous ways using operant training or reflex-based behavioral tasks, or alternatively using electrophysiological recordings. In this article, we first provide a summary of the visual system and explain the characteristics unique to rodents. Then, we present well-established techniques to test rodent vision, with an emphasis on pattern vision: visual water test, optomotor reflex test, pattern electroretinography and pattern visual evoked potentials. Finally, we highlight the importance of visual phenotyping in rodents. As the number of genetically engineered rodent models and the volume of behavioral testing increase simultaneously, the possibility of visual dysfunctions needs to be addressed. Neglect in this matter potentially leads to crude biases in the field of neuroscience and beyond.
2014-01-01
Myoelectric control has been used for decades to control powered upper limb prostheses. Conventional, amplitude-based control has been employed to control a single prosthesis degree of freedom (DOF) such as closing and opening of the hand. Within the last decade, new and advanced arm and hand prostheses have been constructed that are capable of actuating numerous DOFs. Pattern recognition control has been proposed to control a greater number of DOFs than conventional control, but has traditionally been limited to sequentially controlling DOFs one at a time. However, able-bodied individuals use multiple DOFs simultaneously, and it may be beneficial to provide amputees the ability to perform simultaneous movements. In this study, four amputees who had undergone targeted motor reinnervation (TMR) surgery and had previous training with myoelectric prostheses were set up to use three control strategies: 1) conventional amplitude-based myoelectric control, 2) sequential (one-DOF) pattern recognition control, 3) simultaneous pattern recognition control. Simultaneous pattern recognition was enabled by having amputees train each simultaneous movement as a separate motion class. For tasks that required control over just one DOF, sequential pattern recognition based control performed the best, with the lowest average completion times and length error and the highest completion rates. For tasks that required control over 2 DOFs, the simultaneous pattern recognition controller performed the best, with the lowest average completion times and length error and the highest completion rates compared to the other control strategies. In the two strategies in which users could employ simultaneous movements (conventional and simultaneous pattern recognition), amputees chose to use simultaneous movements 78% of the time with simultaneous pattern recognition and 64% of the time with conventional control for tasks that required two DOF motions to reach the target.
These results suggest that when amputees are given the ability to control multiple DOFs simultaneously, they choose to perform tasks that utilize multiple DOFs with simultaneous movements. Additionally, they were able to perform these tasks with higher performance (faster speed, lower length error and higher completion rates) without losing substantial performance in 1 DOF tasks. PMID:24410948
Effects of vision on head-putter coordination in golf.
Gonzalez, David Antonio; Kegel, Stefan; Ishikura, Tadao; Lee, Tim
2012-07-01
Low-skill golfers coordinate the movements of their head and putter with an allocentric, isodirectional coupling, which is opposite to the allocentric, antidirectional coordination pattern used by experts (Lee, Ishikura, Kegel, Gonzalez, & Passmore, 2008). The present study investigated the effects of four vision conditions (full vision, no vision, target focus, and ball focus) on head-putter coupling in low-skill golfers. Performance in the absence of vision resulted in a high level of isodirectional coupling that was similar to the full vision condition. However, when instructed to focus on the target during the putt, or to focus on the ball through a restricted viewing angle, low-skill golfers significantly decoupled the head-putter coordination pattern. Outcome measures nonetheless demonstrated that target focus resulted in poorer performance compared with the other visual conditions, thereby providing overall support for use of a ball focus strategy to enhance coordination and outcome performance. Focus of attention and reduced visual tracking were hypothesized as potential reasons for the decoupling.
A Vision-Based System for Intelligent Monitoring: Human Behaviour Analysis and Privacy by Context
Chaaraoui, Alexandros Andre; Padilla-López, José Ramón; Ferrández-Pastor, Francisco Javier; Nieto-Hidalgo, Mario; Flórez-Revuelta, Francisco
2014-01-01
Due to progress and demographic change, society is facing a crucial challenge related to increased life expectancy and a higher number of people in situations of dependency. As a consequence, there exists a significant demand for support systems for personal autonomy. This article outlines the vision@home project, whose goal is to extend independent living at home for elderly and impaired people, providing care and safety services by means of vision-based monitoring. Different kinds of ambient-assisted living services are supported, from the detection of home accidents, to telecare services. In this contribution, the specification of the system is presented, and novel contributions are made regarding human behaviour analysis and privacy protection. By means of a multi-view setup of cameras, people's behaviour is recognised based on human action recognition. For this purpose, a weighted feature fusion scheme is proposed to learn from multiple views. In order to protect the right to privacy of the inhabitants when a remote connection occurs, a privacy-by-context method is proposed. The experimental results of the behaviour recognition method show an outstanding performance, as well as support for multi-view scenarios and real-time execution, which are required in order to provide the proposed services. PMID:24854209
Recognition of simulated cyanosis by color-vision-normal and color-vision-deficient subjects.
Dain, Stephen J
2014-04-01
There are anecdotal reports that the recognition of cyanosis is difficult for some color-deficient observers. The chromaticity changes of blood with oxygenation in vitro lie close to the dichromatic confusion lines. The chromaticity changes of lips and nail beds measured in vivo are also generally aligned in the same way. Experiments involving visual assessment of cyanosis in vivo are fraught with technical and ethical difficulties. A single lower face image of a healthy individual was digitally altered to produce levels of simulated cyanosis. The color change is essentially one of saturation. Some images with other color changes were also included to ensure that there was no propensity to identify those as cyanosed. The images were assessed for realism by a panel of four instructors from the NSW Ambulance Service training section. The images were displayed singly and the observer was required to identify whether the person was cyanosed or not. Color normal subjects comprised 32 experienced ambulance officers and 27 new recruits. Twenty-seven color deficient subjects (non-NSW Ambulance Service) were examined. The recruits were less accurate and slower at identifying the cyanosed images, and the color vision deficient subjects were less accurate and slower still. The identification of cyanosis is a skill that improves with training and is adversely affected in color deficient observers.
Quick, Accurate, Smart: 3D Computer Vision Technology Helps Assessing Confined Animals’ Behaviour
Barnard, Shanis; Calderara, Simone; Pistocchi, Simone; Cucchiara, Rita; Podaliri-Vulpiani, Michele; Messori, Stefano; Ferri, Nicola
2016-01-01
Mankind directly controls the environment and lifestyles of several domestic species for purposes ranging from production and research to conservation and companionship. These environments and lifestyles may not offer these animals the best quality of life. Behaviour is a direct reflection of how the animal is coping with its environment. Behavioural indicators are thus among the preferred parameters to assess welfare. However, behavioural recording (usually from video) can be very time consuming and the accuracy and reliability of the output rely on the experience and background of the observers. The outburst of new video technology and computer image processing gives the basis for promising solutions. In this pilot study, we present a new prototype software able to automatically infer the behaviour of dogs housed in kennels from 3D visual data and through structured machine learning frameworks. Depth information acquired through 3D features, body part detection and training are the key elements that allow the machine to recognise postures, trajectories inside the kennel and patterns of movement that can be later labelled at convenience. The main innovation of the software is its ability to automatically cluster frequently observed temporal patterns of movement without any pre-set ethogram. Conversely, when common patterns are defined through training, a deviation from normal behaviour in time or between individuals could be assessed. The software accuracy in correctly detecting the dogs’ behaviour was checked through a validation process. An automatic behaviour recognition system, independent from human subjectivity, could add scientific knowledge on animals’ quality of life in confinement as well as saving time and resources. This 3D framework was designed to be invariant to the dog’s shape and size and could be extended to farm, laboratory and zoo quadrupeds in artificial housing. 
The computer vision technique applied to this software is innovative in non-human animal behaviour science. Further improvements and validation are needed, and future applications and limitations are discussed. PMID:27415814
Automatic recognition of lactating sow behaviors through depth image processing
USDA-ARS's Scientific Manuscript database
Manual observation and classification of animal behaviors is laborious, time-consuming, and of limited ability to process large amount of data. A computer vision-based system was developed that automatically recognizes sow behaviors (lying, sitting, standing, kneeling, feeding, drinking, and shiftin...
Making a Statement with Philanthropy.
ERIC Educational Resources Information Center
Legon, Richard D.
2001-01-01
Discusses how a policy statement on board philanthropy can clarify fundraising expectations of all governing and foundation board members. Describes essential components of such a policy statement: mission and vision, recognition of board responsibility for fundraising, specific expectations, and commitment to project and campaign goals. Also…
On Assisting a Visual-Facial Affect Recognition System with Keyboard-Stroke Pattern Information
NASA Astrophysics Data System (ADS)
Stathopoulou, I.-O.; Alepis, E.; Tsihrintzis, G. A.; Virvou, M.
Towards realizing a multimodal affect recognition system, we are considering the advantages of assisting a visual-facial expression recognition system with keyboard-stroke pattern information. Our work is based on the assumption that the visual-facial and keyboard modalities are complementary to each other and that their combination can significantly improve the accuracy in affective user models. Specifically, we present and discuss the development and evaluation process of two corresponding affect recognition subsystems, with emphasis on the recognition of 6 basic emotional states, namely happiness, sadness, surprise, anger and disgust as well as the emotion-less state which we refer to as neutral. We find that emotion recognition by the visual-facial modality can be aided greatly by keyboard-stroke pattern information and the combination of the two modalities can lead to better results towards building a multimodal affect recognition system.
Basics of identification measurement technology
NASA Astrophysics Data System (ADS)
Klikushin, Yu N.; Kobenko, V. Yu; Stepanov, P. P.
2018-01-01
No available pattern recognition algorithm gives a 100% guarantee, so research in this direction remains relevant. It is proposed to extend existing pattern recognition technologies by applying identification measurements. The purpose of the study is to determine whether images can be recognized using identification measurement technologies. Pattern recognition problems are mainly solved with neural networks and hidden Markov models. A fundamentally new approach to pattern recognition, based on the technology of identification signal measurements (IIS), is proposed. The essence of IIS technology is the quantitative evaluation of the shape of images using special tools and algorithms.
NASA Astrophysics Data System (ADS)
Cruz-Roa, Angel; Arevalo, John; Basavanhally, Ajay; Madabhushi, Anant; González, Fabio
2015-01-01
Learning data representations directly from the data itself is an approach that has shown great success in different pattern recognition problems, outperforming state-of-the-art feature extraction schemes for different tasks in computer vision, speech recognition and natural language processing. Representation learning applies unsupervised and supervised machine learning methods to large amounts of data to find building blocks that better represent the information in it. Digitized histopathology images represent a very good testbed for representation learning since they involve large amounts of highly complex visual data. This paper presents a comparative evaluation of different supervised and unsupervised representation learning architectures to specifically address open questions on which type of learning architecture (deep or shallow) and which type of learning (unsupervised or supervised) is optimal. In this paper we limit ourselves to addressing these questions in the context of distinguishing between anaplastic and non-anaplastic medulloblastomas from routine haematoxylin and eosin stained images. The unsupervised approaches evaluated were sparse autoencoders and topographic reconstruct independent component analysis, and the supervised approach was convolutional neural networks. Experimental results show that shallow architectures with more neurons are better than deeper architectures without taking into account local space invariances and that topographic constraints provide useful invariant features in scale and rotations for efficient tumor differentiation.
Design of a Borescope for Extravehicular Non-Destructive Applications
NASA Technical Reports Server (NTRS)
Bachnak, Rafic
2003-01-01
Anomalies such as corrosion, structural damage, misalignment, cracking, stress fractures, pitting, or wear can be detected and monitored with the aid of a borescope. A borescope requires a source of light for proper operation. Today's lighting technology market consists of incandescent lamps, fluorescent lamps and other types of electric arc and electric discharge vapor lamps. Recent advances in LED technology have made LEDs viable for a number of applications, including vehicle stoplights, traffic lights, machine-vision inspection, illumination, and street signs. LEDs promise a significant reduction in power consumption compared to other sources of light. This project focused on comparing images taken by the Olympus IPLEX using two different light sources. One of the sources is the 50 W internal metal halide lamp and the other is a 1 W LED placed at the tip of the insertion tube. Images acquired using these two light sources were quantitatively compared using their histogram, intensity profile along a line segment, and edge detection. Images were also qualitatively compared using image registration and transformation [1]. The gray-level histogram, edge detection, image profile and image registration do not offer conclusive results. The LED light source, however, produces good images for visual inspection by an operator. Analysis based on pattern recognition techniques used in face recognition, such as Eigenfaces and Gaussian pyramids, may be more useful.
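The three quantitative comparisons named above (gray-level histogram, intensity profile along a line, and edge detection) can be illustrated with a minimal sketch. The report does not say how each measure was computed, so the histogram intersection score, the first-difference edge measure, the tiny synthetic images, and all function names here are assumptions chosen for illustration.

```python
def histogram(image, bins=4, max_value=256):
    """Gray-level histogram of a 2D list of pixel values in [0, max_value)."""
    counts = [0] * bins
    width = max_value / bins
    for row in image:
        for px in row:
            counts[min(int(px / width), bins - 1)] += 1
    return counts

def histogram_intersection(h1, h2):
    """Overlap of two normalized histograms; 1.0 means identical distributions."""
    t1, t2 = sum(h1), sum(h2)
    return sum(min(a / t1, b / t2) for a, b in zip(h1, h2))

def row_profile(image, row):
    """Intensity profile along one horizontal line of the image."""
    return list(image[row])

def edge_strength(profile):
    """Absolute first differences along a profile (a crude 1D edge detector)."""
    return [abs(b - a) for a, b in zip(profile, profile[1:])]

# Synthetic stand-ins for the same scene under the two light sources.
halide = [[10, 10, 200, 200], [10, 10, 200, 200]]
led = [[40, 40, 120, 120], [40, 40, 120, 120]]

overlap = histogram_intersection(histogram(halide), histogram(led))
halide_edges = edge_strength(row_profile(halide, 0))
led_edges = edge_strength(row_profile(led, 0))
```

Here the edge sits at the same position in both profiles but with different contrast, which is the kind of difference such measures can expose even when the histograms overlap only partly.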
Cichy, Radoslaw Martin; Pantazis, Dimitrios; Oliva, Aude
2016-01-01
Every human cognitive function, such as visual object recognition, is realized in a complex spatio-temporal activity pattern in the brain. Current brain imaging techniques in isolation cannot resolve the brain's spatio-temporal dynamics, because they provide either high spatial or temporal resolution but not both. To overcome this limitation, we developed an integration approach that uses representational similarities to combine measurements of magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI) to yield a spatially and temporally integrated characterization of neuronal activation. Applying this approach to 2 independent MEG–fMRI data sets, we observed that neural activity first emerged in the occipital pole at 50–80 ms, before spreading rapidly and progressively in the anterior direction along the ventral and dorsal visual streams. Further region-of-interest analyses established that dorsal and ventral regions showed MEG–fMRI correspondence in representations later than early visual cortex. Together, these results provide a novel and comprehensive, spatio-temporally resolved view of the rapid neural dynamics during the first few hundred milliseconds of object vision. They further demonstrate the feasibility of spatially unbiased representational similarity-based fusion of MEG and fMRI, promising new insights into how the brain computes complex cognitive functions. PMID:27235099
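The core of similarity-based fusion can be sketched as follows: build a representational dissimilarity matrix (RDM) from the MEG patterns at each time point and from the fMRI patterns of one region, then correlate the two. This is a simplified illustration, not the authors' pipeline; the use of 1 minus Pearson correlation as the dissimilarity, Pearson correlation between RDMs, the toy data, and all names are assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rdm(patterns):
    """Upper triangle of the RDM: 1 - r for every pair of condition patterns."""
    n = len(patterns)
    return [1 - pearson(patterns[i], patterns[j])
            for i in range(n) for j in range(i + 1, n)]

def fusion_time_course(meg_by_time, fmri_patterns):
    """Similarity between the MEG RDM at each time point and one fMRI region's RDM."""
    fmri_rdm = rdm(fmri_patterns)
    return [pearson(rdm(meg_patterns), fmri_rdm) for meg_patterns in meg_by_time]

# Three conditions; each pattern is a short vector of voxel or sensor values.
fmri_region = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.5], [3.0, 2.0, 1.0]]
meg_by_time = [
    [[1.0, 2.0, 3.0], [1.0, 2.0, 3.5], [3.0, 2.0, 1.0]],  # same geometry as the region
    [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [1.0, 2.0, 3.5]],  # different geometry
]
course = fusion_time_course(meg_by_time, fmri_region)
```

A peak in such a time course would indicate when the MEG representation most resembles that region's fMRI representation.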
Acceleration of spiking neural network based pattern recognition on NVIDIA graphics processors.
Han, Bing; Taha, Tarek M
2010-04-01
There is currently a strong push in the research community to develop biological scale implementations of neuron based vision models. Systems at this scale are computationally demanding and generally utilize more accurate neuron models, such as the Izhikevich and the Hodgkin-Huxley models, rather than the more popular integrate-and-fire model. We examine the feasibility of using graphics processing units (GPUs) to accelerate a spiking neural network based character recognition network to enable such large scale systems. Two versions of the network utilizing the Izhikevich and Hodgkin-Huxley models are implemented. Three NVIDIA general-purpose (GP) GPU platforms are examined, including the GeForce 9800 GX2, the Tesla C1060, and the Tesla S1070. Our results show that the GPGPUs can provide significant speedup over conventional processors. In particular, the fastest GPGPU utilized, the Tesla S1070, provided speedups of 5.6 and 84.4 over highly optimized implementations on the fastest central processing unit (CPU) tested, a quad-core 2.67 GHz Xeon processor, for the Izhikevich and the Hodgkin-Huxley models, respectively. The CPU implementation utilized all four cores and the vector data parallelism offered by the processor. The results indicate that GPUs are well suited for this application domain.
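For readers unfamiliar with the Izhikevich model mentioned above, a single-neuron update can be sketched in a few lines. This is the standard textbook form of the model with its usual regular-spiking parameters, integrated with a simple Euler step; it is a CPU illustration only and not the authors' GPU implementation. The step size, duration, and function name are assumptions.

```python
def simulate_izhikevich(current, duration_ms=1000.0, dt=0.5,
                        a=0.02, b=0.2, c=-65.0, d=8.0):
    """Euler-integrate one Izhikevich neuron and return the number of spikes.

    v is the membrane potential (mV) and u the recovery variable:
        dv/dt = 0.04 v^2 + 5 v + 140 - u + I
        du/dt = a (b v - u)
    When v reaches 30 mV, v is reset to c and u is increased by d.
    """
    v = c
    u = b * v
    spikes = 0
    for _ in range(int(duration_ms / dt)):
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + current)
        u += dt * a * (b * v - u)
        if v >= 30.0:
            v = c
            u += d
            spikes += 1
    return spikes

spikes_driven = simulate_izhikevich(current=10.0)  # constant input drives repeated spiking
spikes_rest = simulate_izhikevich(current=0.0)     # no input: the neuron settles to rest
```

Each neuron's update depends only on its own state and input current, which is why a large population of such neurons maps naturally onto one GPU thread per neuron.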
NASA Technical Reports Server (NTRS)
Lewandowski, Leon; Struckman, Keith
1994-01-01
Microwave Vision (MV), a concept originally developed in 1985, could play a significant role in the solution to robotic vision problems. Originally our Microwave Vision concept was based on a pattern matching approach employing computer based stored replica correlation processing. Artificial Neural Network (ANN) processor technology offers an attractive alternative to the correlation processing approach, namely the ability to learn and to adapt to changing environments. This paper describes the Microwave Vision concept, some initial ANN-MV experiments, and the design of an ANN-MV system that has led to a second patent disclosure in the robotic vision field.
NASA Astrophysics Data System (ADS)
Nair, Binu M.; Diskin, Yakov; Asari, Vijayan K.
2012-10-01
We present an autonomous system capable of performing security check routines. The surveillance machine, the Clearpath Husky robotic platform, is equipped with three IP cameras with different orientations for the surveillance tasks of face recognition, human activity recognition, autonomous navigation and 3D reconstruction of its environment. Combining the computer vision algorithms on a robotic machine has given birth to the Robust Artificial Intelligence-based Defense Electro-Robot (RAIDER). The end purpose of the RAIDER is to conduct a patrolling routine on a single floor of a building several times a day. As the RAIDER travels down the corridors, off-line algorithms use two of the RAIDER's side-mounted cameras to perform a 3D reconstruction from monocular vision, updating a 3D model to the most current state of the indoor environment. Using frames from the front-mounted camera, positioned at human eye level, the system performs face recognition with real-time training of unknown subjects. A human activity recognition algorithm will also be implemented, in which each detected person is assigned to a set of action classes picked to classify ordinary and harmful student activities in a hallway setting. The system is designed to detect changes and irregularities within an environment as well as to familiarize itself with regular faces and actions in order to distinguish potentially dangerous behavior. In this paper, we present the various algorithms and their modifications which, when implemented on the RAIDER, serve the purpose of indoor surveillance.
Pinto, Nicolas; Doukhan, David; DiCarlo, James J; Cox, David D
2009-11-01
While many models of biological object recognition share a common set of "broad-stroke" properties, the performance of any one model depends strongly on the choice of parameters in a particular instantiation of that model--e.g., the number of units per layer, the size of pooling kernels, exponents in normalization operations, etc. Since the number of such parameters (explicit or implicit) is typically large and the computational cost of evaluating one particular parameter set is high, the space of possible model instantiations goes largely unexplored. Thus, when a model fails to approach the abilities of biological visual systems, we are left uncertain whether this failure is because we are missing a fundamental idea or because the correct "parts" have not been tuned correctly, assembled at sufficient scale, or provided with enough training. Here, we present a high-throughput approach to the exploration of such parameter sets, leveraging recent advances in stream processing hardware (high-end NVIDIA graphic cards and the PlayStation 3's IBM Cell Processor). In analogy to high-throughput screening approaches in molecular biology and genetics, we explored thousands of potential network architectures and parameter instantiations, screening those that show promising object recognition performance for further analysis. We show that this approach can yield significant, reproducible gains in performance across an array of basic object recognition tasks, consistently outperforming a variety of state-of-the-art purpose-built vision systems from the literature. As the scale of available computational power continues to expand, we argue that this approach has the potential to greatly accelerate progress in both artificial vision and our understanding of the computational underpinning of biological vision.
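The screening idea described above (sample many parameter sets, evaluate each, keep the most promising for further analysis) can be sketched as a plain random search. Everything specific here is hypothetical: the search space, the parameter names, and in particular `toy_score`, which merely stands in for the expensive step of building and evaluating a model on an object recognition task.

```python
import random

# Hypothetical search space; the real one in such a study would be far larger.
SEARCH_SPACE = {
    "units_per_layer": [16, 32, 64, 128],
    "pool_size": [3, 5, 7, 9],
    "norm_exponent": [1.0, 2.0],
}

def sample_params(rng):
    """Draw one random parameter set from the search space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

def toy_score(params):
    """Hypothetical stand-in for training and evaluating one model instantiation."""
    return (params["units_per_layer"] / 128
            - abs(params["pool_size"] - 5) / 10
            + 0.1 * (params["norm_exponent"] == 2.0))

def screen(n_candidates, keep, seed=0):
    """Evaluate n_candidates random parameter sets and return the best `keep`."""
    rng = random.Random(seed)
    candidates = [sample_params(rng) for _ in range(n_candidates)]
    return sorted(candidates, key=toy_score, reverse=True)[:keep]

top = screen(n_candidates=200, keep=5)
```

Because each candidate is evaluated independently, the loop is trivially parallel, which is what makes stream processing hardware attractive for this kind of search.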
Azzopardi, George; Petkov, Nicolai
2014-01-01
The remarkable abilities of the primate visual system have inspired the construction of computational models of some visual neurons. We propose a trainable hierarchical object recognition model, which we call S-COSFIRE (S stands for Shape and COSFIRE stands for Combination Of Shifted FIlter REsponses) and use it to localize and recognize objects of interest embedded in complex scenes. It is inspired by the visual processing in the ventral stream (V1/V2 → V4 → TEO). Recognition and localization of objects embedded in complex scenes is important for many computer vision applications. Most existing methods require prior segmentation of the objects from the background, which in turn requires recognition. An S-COSFIRE filter is automatically configured to be selective for an arrangement of contour-based features that belong to a prototype shape specified by an example. The configuration comprises selecting relevant vertex detectors and determining certain blur and shift parameters. The response is computed as the weighted geometric mean of the blurred and shifted responses of the selected vertex detectors. S-COSFIRE filters share similar properties with some neurons in inferotemporal cortex, which provided inspiration for this work. We demonstrate the effectiveness of S-COSFIRE filters in two applications: letter and keyword spotting in handwritten manuscripts and object spotting in complex scenes for the computer vision system of a domestic robot. S-COSFIRE filters are effective at recognizing and localizing (deformable) objects in images of complex scenes without requiring prior segmentation. They are versatile trainable shape detectors, conceptually simple and easy to implement. The presented hierarchical shape representation contributes to a better understanding of the brain and to more robust computer vision algorithms. PMID:25126068
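The abstract above states that the filter response is a weighted geometric mean of the selected detector responses. A minimal sketch of that one operation follows; the function name, the example responses and the weights are hypothetical, and the blurring and shifting steps of the actual model are not reproduced here.

```python
import math


def weighted_geometric_mean(responses, weights):
    """Return (prod r_i ** w_i) ** (1 / sum w_i) for non-negative responses."""
    if len(responses) != len(weights):
        raise ValueError("responses and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    if any(r < 0 for r in responses):
        raise ValueError("responses must be non-negative")
    # A single zero response with positive weight drives the product to zero.
    if any(r == 0 and w > 0 for r, w in zip(responses, weights)):
        return 0.0
    # Work in log space to avoid overflow or underflow in the product.
    log_sum = sum(w * math.log(r) for r, w in zip(responses, weights) if w > 0)
    return math.exp(log_sum / total_weight)


# Hypothetical detector responses and weights, for illustration only.
strong = weighted_geometric_mean([4.0, 9.0], [1.0, 1.0])
missing_part = weighted_geometric_mean([4.0, 0.0, 9.0], [1.0, 1.0, 1.0])
```

One consequence of using a geometric rather than an arithmetic mean is visible in the second call: if any contributing detector is silent, the combined response is zero, so the combination behaves like a soft AND over the parts of the shape.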
Chen, Yen-Lin; Liang, Wen-Yew; Chiang, Chuan-Yen; Hsieh, Tung-Ju; Lee, Da-Cheng; Yuan, Shyan-Ming; Chang, Yang-Lang
2011-01-01
This study presents efficient vision-based finger detection, tracking, and event identification techniques and a low-cost hardware framework for multi-touch sensing and display applications. The proposed approach uses a fast bright-blob segmentation process based on automatic multilevel histogram thresholding to extract the pixels of touch blobs obtained from scattered infrared lights captured by a video camera. The advantage of this automatic multilevel thresholding approach is its robustness and adaptability when dealing with various ambient lighting conditions and spurious infrared noises. To extract the connected components of these touch blobs, a connected-component analysis procedure is applied to the bright pixels acquired by the previous stage. After extracting the touch blobs from each of the captured image frames, a blob tracking and event recognition process analyzes the spatial and temporal information of these touch blobs from consecutive frames to determine the possible touch events and actions performed by users. This process also refines the detection results and corrects for errors and occlusions caused by noise and errors during the blob extraction process. The proposed blob tracking and touch event recognition process includes two phases. First, the phase of blob tracking associates the motion correspondence of blobs in succeeding frames by analyzing their spatial and temporal features. The touch event recognition process can identify meaningful touch events based on the motion information of touch blobs, such as finger moving, rotating, pressing, hovering, and clicking actions. Experimental results demonstrate that the proposed vision-based finger detection, tracking, and event identification system is feasible and effective for multi-touch sensing applications in various operational environments and conditions. PMID:22163990
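The blob extraction stage described above (threshold the bright pixels, then group them into connected components) can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' implementation: a single fixed threshold stands in for the automatic multilevel histogram thresholding, connectivity is 4-neighbour, the frame is a non-empty list of rows, and the function names and pixel values are hypothetical.

```python
from collections import deque


def find_blobs(frame, threshold):
    """Group pixels with intensity >= threshold into 4-connected blobs."""
    rows, cols = len(frame), len(frame[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or frame[r][c] < threshold:
                continue
            # Breadth-first flood fill from this unvisited bright pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and frame[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(pixels)
    return blobs


def centroid(pixels):
    """Mean (row, column) position of a blob, usable as a tracking feature."""
    n = len(pixels)
    return (sum(y for y, _ in pixels) / n, sum(x for _, x in pixels) / n)


# Hypothetical 4x5 infrared frame containing two touch blobs.
frame = [
    [0, 200, 210, 0, 0],
    [0, 220, 0, 0, 0],
    [0, 0, 0, 0, 250],
    [0, 0, 0, 240, 255],
]
blobs = find_blobs(frame, threshold=128)
centers = [centroid(b) for b in blobs]
```

The centroids are the kind of per-frame spatial feature that a tracking stage could associate across consecutive frames.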
Pinto, Nicolas; Doukhan, David; DiCarlo, James J.; Cox, David D.
2009-01-01
While many models of biological object recognition share a common set of “broad-stroke” properties, the performance of any one model depends strongly on the choice of parameters in a particular instantiation of that model—e.g., the number of units per layer, the size of pooling kernels, exponents in normalization operations, etc. Since the number of such parameters (explicit or implicit) is typically large and the computational cost of evaluating one particular parameter set is high, the space of possible model instantiations goes largely unexplored. Thus, when a model fails to approach the abilities of biological visual systems, we are left uncertain whether this failure is because we are missing a fundamental idea or because the correct “parts” have not been tuned correctly, assembled at sufficient scale, or provided with enough training. Here, we present a high-throughput approach to the exploration of such parameter sets, leveraging recent advances in stream processing hardware (high-end NVIDIA graphic cards and the PlayStation 3's IBM Cell Processor). In analogy to high-throughput screening approaches in molecular biology and genetics, we explored thousands of potential network architectures and parameter instantiations, screening those that show promising object recognition performance for further analysis. We show that this approach can yield significant, reproducible gains in performance across an array of basic object recognition tasks, consistently outperforming a variety of state-of-the-art purpose-built vision systems from the literature. As the scale of available computational power continues to expand, we argue that this approach has the potential to greatly accelerate progress in both artificial vision and our understanding of the computational underpinning of biological vision. PMID:19956750
NASA Astrophysics Data System (ADS)
Cherkasov, Kirill V.; Gavrilova, Irina V.; Chernova, Elena V.; Dokolin, Andrey S.
2018-05-01
The article is devoted to separate aspects of the development of an intelligent gesture recognition system. The peculiarity of the system is its intelligent block, which is completely based on open technologies: the OpenCV library and the Microsoft Cognitive Toolkit (CNTK) platform. The article presents the rationale for the choice of this set of tools, as well as the functional scheme of the system and the hierarchy of its modules. Experiments have shown that the system correctly recognizes about 85% of images received from sensors. The authors assume that the improvement of the algorithmic block of the system will increase the accuracy of gesture recognition up to 95%.
33 CFR 106.215 - Company or OCS facility personnel with security duties.
Code of Federal Regulations, 2011 CFR
2011-07-01
... appropriate: (a) Knowledge of current and anticipated security threats and patterns. (b) Recognition and detection of dangerous substances and devices; (c) Recognition of characteristics and behavioral patterns of persons who are likely to threaten security; (d) Recognition of techniques used to circumvent security...
33 CFR 106.215 - Company or OCS facility personnel with security duties.
Code of Federal Regulations, 2010 CFR
2010-07-01
... appropriate: (a) Knowledge of current and anticipated security threats and patterns. (b) Recognition and detection of dangerous substances and devices; (c) Recognition of characteristics and behavioral patterns of persons who are likely to threaten security; (d) Recognition of techniques used to circumvent security...
McCulloch, Kyle J; Yuan, Furong; Zhen, Ying; Aardema, Matthew L; Smith, Gilbert; Llorente-Bousquets, Jorge; Andolfatto, Peter; Briscoe, Adriana D
2017-09-01
Numerous animal lineages have expanded and diversified the opsin-based photoreceptors in their eyes underlying color vision behavior. However, the selective pressures giving rise to new photoreceptors and their spectral tuning remain mostly obscure. Previously, we identified a violet receptor (UV2) that is the result of a UV opsin gene duplication specific to Heliconius butterflies. At the same time the violet receptor evolved, Heliconius evolved UV-yellow coloration on their wings, due to the pigment 3-hydroxykynurenine (3-OHK) and the nanostructure architecture of the scale cells. In order to better understand the selective pressures giving rise to the violet receptor, we characterized opsin expression patterns using immunostaining (14 species) and RNA-Seq (18 species), and reconstructed evolutionary histories of visual traits in five major lineages within Heliconius and one species from the genus Eueides. Opsin expression patterns are hyperdiverse within Heliconius. We identified six unique retinal mosaics and three distinct forms of sexual dimorphism based on ommatidial types within the genus Heliconius. Additionally, phylogenetic analysis revealed independent losses of opsin expression, pseudogenization events, and relaxation of selection on UVRh2 in one lineage. Despite this diversity, the newly evolved violet receptor is retained across most species and sexes surveyed. Discriminability modeling of behaviorally preferred 3-OHK yellow wing coloration suggests that the violet receptor may facilitate Heliconius color vision in the context of conspecific recognition. Our observations give insights into the selective pressures underlying the origins of new visual receptors. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Facial expression recognition based on improved local ternary pattern and stacked auto-encoder
NASA Astrophysics Data System (ADS)
Wu, Yao; Qiu, Weigen
2017-08-01
In order to enhance the robustness of facial expression recognition, we propose a method of facial expression recognition based on an improved Local Ternary Pattern (LTP) combined with a Stacked Auto-Encoder (SAE). This method uses the improved LTP to extract features, and then uses an improved deep belief network as the detector and classifier operating on the LTP features. The combination of LTP and the improved deep belief network is realized in facial expression recognition. The recognition rate on the CK+ database has improved significantly.
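For readers unfamiliar with the Local Ternary Pattern, the sketch below computes the standard (not the improved) LTP code of a single 3x3 patch. Each neighbour is compared with the centre pixel using a tolerance t, and the resulting ternary code is split into an "upper" and a "lower" binary pattern. The neighbour ordering, the function name and the pixel values are assumptions made for illustration.

```python
def ltp_codes(patch, t):
    """Return (upper, lower) 8-bit LTP codes for a 3x3 patch with tolerance t."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    upper = 0  # bits set where the neighbour is clearly brighter (+1)
    lower = 0  # bits set where the neighbour is clearly darker (-1)
    for bit, (r, c) in enumerate(offsets):
        value = patch[r][c]
        if value >= center + t:
            upper |= 1 << bit
        elif value <= center - t:
            lower |= 1 << bit
        # Otherwise the neighbour is within the tolerance band and codes as 0.
    return upper, lower


# Hypothetical grey-level patch; centre pixel is 54, tolerance is 5.
patch = [
    [54, 57, 50],
    [60, 54, 52],
    [70, 49, 45],
]
upper, lower = ltp_codes(patch, t=5)
```

Histograms of the upper and lower codes over an image region are the usual way such codes are turned into a feature vector for a classifier.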
Heliophysics Data Environment: What's next? (Invited)
NASA Astrophysics Data System (ADS)
Martens, P.
2010-12-01
In the last two decades the Heliophysics community has witnessed the societal recognition of the importance of space weather and space climate for our technology and ecology, resulting in a renewed priority for and investment in Heliophysics. As a result of that and the explosive development of information technology, Heliophysics has experienced an exponential growth in the amount and variety of data acquired, as well as the easy electronic storage and distribution of these data. The Heliophysics community has responded well to these challenges. The first, most obvious and most needed response, was the development of Virtual Heliophysics Observatories. While the VxOs of Heliophysics still need a lot of work with respect to the expansion of search options and interoperability, I believe the basic structures and functionalities have been established, and that they meet the needs of the community. In the future we'll see a refinement, completion, and integration of VxOs, not a fundamentally different approach -- in my opinion. The challenge posed by the huge increase in amount of data is not met by VxOs alone. No individual scientist or group, even with the assistance of tons of graduate students, can analyze the torrent of data currently coming down from the fleet of heliospheric observatories. Once more information technology provides an opportunity: Automated feature recognition of solar imagery is feasible, has been implemented in a number of instances, and is strongly supported by NASA. For example, the SDO Feature Finding Team is developing a suite of 16 feature recognition modules for SDO imagery that operates in near-real time, produces space-weather warnings, and populates on-line event catalogs. 
Automated feature recognition -- "computer vision" -- not only saves enormous amounts of time in the analysis of events, it also allows for a shift from the analysis of single events to that of sets of features and events -- the latter being by far the most important implication of computer vision. Consider some specific examples of possibilities here: From the on-line SDO metadata a user can produce with a few IDL line commands information that previously would have taken years to compile, e.g.: - Draw a butterfly diagram for Active Regions, - Find all filaments that coincide with sigmoids and correlate the automatically detected sigmoid handedness with filament chirality, - Correlate EUV jets with small scale flux emergence in coronal holes only, - Draw PIL maps with regions of high shear and large magnetic field gradients overlaid, to pinpoint potential flaring regions. Then correlate with actual flare occurrence. I emphasize that the access to those metadata will be provided by VxOs, and that the interplay between computer vision codes and data will be facilitated by VxOs. My vision for the near and medium future for the VxOs is then to provide a simple and seamless interface between data, cataloged metadata, and computer vision software, either existing or newly developed by the user. Heliospheric virtual observatories and computer vision systems will work together to constantly monitor the Sun, provide space weather warnings, populate catalogs of metadata, analyze trends, and produce real-time on-line imagery of current events.
NASA Astrophysics Data System (ADS)
Yu, Zhijing; Ma, Kai; Wang, Zhijun; Wu, Jun; Wang, Tao; Zhuge, Jingchang
2018-03-01
A blade is one of the most important components of an aircraft engine. Due to its high manufacturing costs, it is indispensable to come up with methods for repairing damaged blades. In order to obtain a surface model of the blades, this paper proposes a modeling method by using speckle patterns based on the virtual stereo vision system. Firstly, blades are sprayed evenly creating random speckle patterns and point clouds from blade surfaces can be calculated by using speckle patterns based on the virtual stereo vision system. Secondly, boundary points are obtained in the way of varied step lengths according to curvature and are fitted to get a blade surface envelope with a cubic B-spline curve. Finally, the surface model of blades is established with the envelope curves and the point clouds. Experimental results show that the surface model of aircraft engine blades is fair and accurate.
Patterns recognition of electric brain activity using artificial neural networks
NASA Astrophysics Data System (ADS)
Musatov, V. Yu.; Pchelintseva, S. V.; Runnova, A. E.; Hramov, A. E.
2017-04-01
We present an approach for the recognition of various cognitive processes in brain activity during the perception of ambiguous images. On the basis of the developed theoretical background and the experimental data, we propose a new classification of oscillating patterns in the human EEG using an artificial neural network approach. After training, the artificial neural network reliably identified cube recognition processes, for example, a left-oriented or right-oriented Necker cube with different intensities of its edges. We construct an artificial neural network based on the Perceptron architecture and demonstrate its effectiveness in the pattern recognition of the experimental EEG.
NASA Astrophysics Data System (ADS)
Obozov, A. A.; Serpik, I. N.; Mihalchenko, G. S.; Fedyaeva, G. A.
2017-01-01
In the article, the application of pattern recognition (a relatively young area of engineering cybernetics) to the analysis of complicated technical systems is examined. It is shown that a statistical approach can be the most effective for hard-to-distinguish situations. The different recognition algorithms are based on the Bayes approach, which estimates the posterior probabilities of a certain event and an assumed error. The statistical approach to pattern recognition can be applied to the technical diagnosis of complicated systems, and particularly of high-power marine diesel engines.
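The Bayes approach mentioned above can be illustrated with a small posterior calculation. The sketch assumes two engine states and one observed symptom; the state names, prior probabilities and likelihoods are hypothetical numbers chosen for the example, not values from the article.

```python
def posterior(priors, likelihoods):
    """Bayes' rule: P(state | observation) for each state.

    priors[s]      -- P(state s) before the observation
    likelihoods[s] -- P(observation | state s)
    """
    joint = {state: priors[state] * likelihoods[state] for state in priors}
    evidence = sum(joint.values())  # P(observation)
    if evidence == 0:
        raise ValueError("observation has zero probability under every state")
    return {state: value / evidence for state, value in joint.items()}


# Hypothetical diagnosis problem: is the engine healthy or faulty,
# given that an abnormal pressure trace was observed?
priors = {"healthy": 0.9, "faulty": 0.1}
likelihoods = {"healthy": 0.05, "faulty": 0.8}

post = posterior(priors, likelihoods)
decision = max(post, key=post.get)        # state with the highest posterior
error_probability = 1.0 - post[decision]  # chance that this decision is wrong
```

With these numbers the posterior probability of a fault is 0.64 even though the prior was only 0.1, and the probability that the maximum-posterior decision is wrong is 0.36, which is the kind of "assumed error" a Bayes classifier makes explicit.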
Luo, Jiebo; Boutell, Matthew
2005-05-01
Automatic image orientation detection for natural images is a useful, yet challenging research topic. Humans use scene context and semantic object recognition to identify the correct image orientation. However, it is difficult for a computer to perform the task in the same way because current object recognition algorithms are extremely limited in their scope and robustness. As a result, existing orientation detection methods were built upon low-level vision features such as spatial distributions of color and texture. Discrepant detection rates have been reported for these methods in the literature. We have developed a probabilistic approach to image orientation detection via confidence-based integration of low-level and semantic cues within a Bayesian framework. Our current accuracy is 90 percent for unconstrained consumer photos, impressive given the findings of a psychophysical study conducted recently. The proposed framework is an attempt to bridge the gap between computer and human vision systems and is applicable to other problems involving semantic scene content understanding.
Jordan, Timothy R; Paterson, Kevin B; Kurtev, Stoyan
2009-03-01
Many studies have claimed that hemispheric projections are split precisely at the foveal midline and so hemispheric asymmetry affects word recognition right up to the point of fixation. To investigate this claim, four-letter words and nonwords were presented to the left or right of fixation, either close to fixation in foveal vision or farther from fixation in extrafoveal vision. Presentation accuracy was controlled using an eyetracker linked to a fixation-contingent display. Words presented foveally produced identical performance on each side of fixation, but words presented extrafoveally showed a clear left-hemisphere (LH) advantage. Nonwords produced no evidence of hemispheric asymmetry in any location. Foveal stimuli also produced an identical word-nonword effect on each side of fixation, whereas extrafoveal stimuli produced a word-nonword effect only for LH (not right-hemisphere) displays. These findings indicate that functional unilateral projections to contralateral hemispheres exist in extrafoveal locations but provide no evidence of a functional division in hemispheric processing at fixation.
van den Berg, Ronald; Roerdink, Jos B. T. M.; Cornelissen, Frans W.
2010-01-01
An object in the peripheral visual field is more difficult to recognize when surrounded by other objects. This phenomenon is called “crowding”. Crowding places a fundamental constraint on human vision that limits performance on numerous tasks. It has been suggested that crowding results from spatial feature integration necessary for object recognition. However, in the absence of convincing models, this theory has remained controversial. Here, we present a quantitative and physiologically plausible model for spatial integration of orientation signals, based on the principles of population coding. Using simulations, we demonstrate that this model coherently accounts for fundamental properties of crowding, including critical spacing, “compulsory averaging”, and a foveal-peripheral anisotropy. Moreover, we show that the model predicts increased responses to correlated visual stimuli. Altogether, these results suggest that crowding has little immediate bearing on object recognition but is a by-product of a general, elementary integration mechanism in early vision aimed at improving signal quality. PMID:20098499
Report on the status of vision care in Israel.
Levinson, A; Scheiman, M
1981-07-01
Recognition of optometry in Israel has not been achieved due to various factors: the lack of recognition by the Ministry of Health in proposing a law of optometry to the Knesset (Parliament); the opposition of organized ophthalmology; the internal conflict between the various associations of opticians and optometrists within the country, which leads to a lack of unification for propagation of a law of optometry; the absence of an academic institution of learning in optometry. The establishment of the Optometric Centre by the American Friends of Israel Optometry has instituted low vision clinics and post-graduate courses which have helped to advance optometry. Optometry must become organized in order to apply pressure on the Ministry of Health to recognize optometry as an independent profession. There is a need for the establishment of operating courses to equalize the standard of eye-care in the profession, and for the founding of an academic school of optometry to maintain a constant supply of qualified optometrists.
Visual object recognition for mobile tourist information systems
NASA Astrophysics Data System (ADS)
Paletta, Lucas; Fritz, Gerald; Seifert, Christin; Luley, Patrick; Almer, Alexander
2005-03-01
We describe a mobile vision system that is capable of automated object identification using images captured from a PDA or a camera phone. We present a solution for the enabling technology of outdoor vision-based object recognition that will extend state-of-the-art location and context aware services towards object based awareness in urban environments. In the proposed application scenario, tourist pedestrians are equipped with GPS, W-LAN and a camera attached to a PDA or a camera phone. They are interested in whether their field of view contains tourist sights that would point to more detailed information. Multimedia type data about related history, the architecture, or other related cultural context of historic or artistic relevance might be explored by a mobile user who is intending to learn within the urban environment. Learning from ambient cues is in this way achieved by pointing the device towards the urban sight, capturing an image, and consequently getting information about the object on site and within the focus of attention, i.e., the user's current field of view.
Position estimation and driving of an autonomous vehicle by monocular vision
NASA Astrophysics Data System (ADS)
Hanan, Jay C.; Kayathi, Pavan; Hughlett, Casey L.
2007-04-01
Automatic adaptive tracking in real-time for target recognition provided autonomous control of a scale model electric truck. The two-wheel drive truck was modified as an autonomous rover test-bed for vision based guidance and navigation. Methods were implemented to monitor tracking error and ensure a safe, accurate arrival at the intended science target. Some methods are situation independent relying only on the confidence error of the target recognition algorithm. Other methods take advantage of the scenario of combined motion and tracking to filter out anomalies. In either case, only a single calibrated camera was needed for position estimation. Results from real-time autonomous driving tests on the JPL simulated Mars yard are presented. Recognition error was often situation dependent. For the rover case, the background was in motion and may be characterized to provide visual cues on rover travel such as rate, pitch, roll, and distance to objects of interest or hazards. Objects in the scene may be used as landmarks, or waypoints, for such estimations. As objects are approached, their scale increases and their orientation may change. In addition, particularly on rough terrain, these orientation and scale changes may be unpredictable. Feature extraction combined with the neural network algorithm was successful in providing visual odometry in the simulated Mars environment.
Hybrid Feature Extraction-based Approach for Facial Parts Representation and Recognition
NASA Astrophysics Data System (ADS)
Rouabhia, C.; Tebbikh, H.
2008-06-01
Face recognition is a specialized image processing task which has attracted considerable attention in computer vision. In this article, we develop a new facial recognition system for images from video sequences, dedicated to the identification of persons whose face is partly occluded. This system is based on a hybrid image feature extraction technique called ACPDL2D (Rouabhia et al. 2007), which combines two-dimensional principal component analysis and two-dimensional linear discriminant analysis with a neural network. We performed the feature extraction task on the eyes and the nose images separately; then a Multi-Layer Perceptron classifier is used. Compared to the whole face, the results of simulation are in favor of the facial parts in terms of memory capacity and recognition (99.41% for the eyes part, 98.16% for the nose part and 97.25% for the whole face).
Deep Supervised, but Not Unsupervised, Models May Explain IT Cortical Representation
Khaligh-Razavi, Seyed-Mahdi; Kriegeskorte, Nikolaus
2014-01-01
Inferior temporal (IT) cortex in human and nonhuman primates serves visual object recognition. Computational object-vision models, although continually improving, do not yet reach human performance. It is unclear to what extent the internal representations of computational models can explain the IT representation. Here we investigate a wide range of computational model representations (37 in total), testing their categorization performance and their ability to account for the IT representational geometry. The models include well-known neuroscientific object-recognition models (e.g. HMAX, VisNet) along with several models from computer vision (e.g. SIFT, GIST, self-similarity features, and a deep convolutional neural network). We compared the representational dissimilarity matrices (RDMs) of the model representations with the RDMs obtained from human IT (measured with fMRI) and monkey IT (measured with cell recording) for the same set of stimuli (not used in training the models). Better performing models were more similar to IT in that they showed greater clustering of representational patterns by category. In addition, better performing models also more strongly resembled IT in terms of their within-category representational dissimilarities. Representational geometries were significantly correlated between IT and many of the models. However, the categorical clustering observed in IT was largely unexplained by the unsupervised models. The deep convolutional network, which was trained by supervision with over a million category-labeled images, reached the highest categorization performance and also best explained IT, although it did not fully explain the IT data. Combining the features of this model with appropriate weights and adding linear combinations that maximize the margin between animate and inanimate objects and between faces and other objects yielded a representation that fully explained our IT data. 
Overall, our results suggest that explaining IT requires computational features trained through supervised learning to emphasize the behaviorally important categorical divisions prominently reflected in IT. PMID:25375136
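The representational dissimilarity matrix (RDM) at the centre of this abstract can be sketched in a few lines. The example assumes correlation distance (1 minus the Pearson correlation) between the response patterns for each pair of stimuli, which is a common choice but is not stated in the abstract; the function names and response values are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length response patterns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def rdm(patterns):
    """Square matrix of 1 - r for every pair of stimulus response patterns."""
    n = len(patterns)
    return [
        [0.0 if i == j else 1.0 - pearson(patterns[i], patterns[j])
         for j in range(n)]
        for i in range(n)
    ]


def upper_triangle(matrix):
    """Off-diagonal entries above the diagonal, the part compared across RDMs."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


# Hypothetical responses of three units (columns) to three stimuli (rows).
model_patterns = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [3.0, 2.0, 1.0],
]
model_rdm = rdm(model_patterns)
model_vector = upper_triangle(model_rdm)
```

Because an RDM depends only on pairwise dissimilarities between stimuli, a model RDM and a brain RDM can be compared directly (for example by correlating their upper triangles) even though the model units and the measured channels have no one-to-one correspondence.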
Hopfield's Model of Patterns Recognition and Laws of Artistic Perception
NASA Astrophysics Data System (ADS)
Yevin, Igor; Koblyakov, Alexander
The model of pattern recognition, or attractor network model of associative memory, proposed by J. Hopfield in 1982, is the best-known model in theoretical neuroscience. This paper aims to show that such well-known laws of art perception as the Wundt curve, the perception of visual ambiguity in art, and the model of perception of musical tonalities are nothing other than special cases of Hopfield’s model of pattern recognition.
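As background for the attractor-network idea, a minimal Hopfield network can be sketched as follows: patterns of +1/-1 values are stored with a Hebbian rule, and recall repeatedly updates each unit until the state stops changing. The two stored patterns and the corrupted probe are hypothetical, and the sketch says nothing about how the paper maps art perception onto the model.

```python
def train(patterns):
    """Hebbian weights: w[i][j] = (1/n) * sum over patterns of p[i] * p[j]."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j] / n
    return weights


def recall(weights, probe, max_sweeps=10):
    """Asynchronous updates until a fixed point (an attractor) is reached."""
    state = list(probe)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Two hypothetical, mutually orthogonal stored patterns.
pattern_a = [1, 1, 1, 1, -1, -1, -1, -1]
pattern_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train([pattern_a, pattern_b])

# A probe equal to pattern_a with its first element flipped.
noisy_a = [-1, 1, 1, 1, -1, -1, -1, -1]
restored = recall(weights, noisy_a)
```

Each stored pattern acts as an attractor: a probe that starts close to one of them falls back into it, which is the sense in which the network performs pattern recognition.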
Computer discrimination procedures applicable to aerial and ERTS multispectral data
NASA Technical Reports Server (NTRS)
Richardson, A. J.; Torline, R. J.; Allen, W. A.
1970-01-01
Two statistical models are compared in the classification of crops recorded on color aerial photographs. A theory of error ellipses is applied to the pattern recognition problem. An elliptical boundary condition classification model (EBC), useful for recognition of candidate patterns, evolves out of error ellipse theory. The EBC model is compared with the minimum distance to the mean (MDM) classification model in terms of pattern recognition ability. The pattern recognition results of both models are interpreted graphically using scatter diagrams to represent measurement space. Measurement space, for this report, is determined by optical density measurements collected from Kodak Ektachrome Infrared Aero Film 8443 (EIR). The EBC model is shown to be a significant improvement over the MDM model.
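The contrast between the two classifiers can be illustrated with a small sketch. The minimum-distance-to-the-mean rule below is standard; the elliptical test is only a simplified stand-in (an axis-aligned ellipse scaled by per-band standard deviations with a hypothetical size factor k), since the report's exact EBC formulation is not given in this abstract. Class names and density values are invented for the example.

```python
import math


def class_stats(samples):
    """Per-band mean and sample standard deviation of one training class."""
    n, bands = len(samples), len(samples[0])
    means = [sum(s[b] for s in samples) / n for b in range(bands)]
    stds = [
        math.sqrt(sum((s[b] - means[b]) ** 2 for s in samples) / (n - 1))
        for b in range(bands)
    ]
    return means, stds


def mdm_classify(x, stats):
    """Minimum distance to the mean: always returns the nearest class."""
    return min(stats, key=lambda label: math.dist(x, stats[label][0]))


def inside_ellipse(x, means, stds, k=2.0):
    """Assumed elliptical boundary: normalised squared distance <= k**2."""
    return sum(((xi - m) / s) ** 2 for xi, m, s in zip(x, means, stds)) <= k * k


# Hypothetical optical-density measurements in two bands.
training = {
    "cotton": [(1.0, 2.0), (1.2, 2.2), (0.8, 1.8)],
    "sorghum": [(3.0, 1.0), (3.2, 1.2), (2.8, 0.8)],
}
stats = {label: class_stats(samples) for label, samples in training.items()}

typical = (1.1, 1.9)   # close to the cotton mean
outlier = (1.9, 1.5)   # between the classes, far from both clusters

typical_label = mdm_classify(typical, stats)
outlier_label = mdm_classify(outlier, stats)
typical_in = inside_ellipse(typical, *stats["cotton"])
outlier_in = inside_ellipse(outlier, *stats["cotton"])
```

In this toy example the distance-to-mean rule assigns the outlier to a class anyway, while the elliptical test rejects it as not belonging to that class, which is one way a boundary-based model can reduce false assignments.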
Heterogeneous compute in computer vision: OpenCL in OpenCV
NASA Astrophysics Data System (ADS)
Gasparakis, Harris
2014-02-01
We explore the relevance of Heterogeneous System Architecture (HSA) in Computer Vision, both as a long-term vision, and as a near-term emerging reality via the recently ratified OpenCL 2.0 Khronos standard. After a brief review of OpenCL 1.2 and 2.0, including HSA features such as Shared Virtual Memory (SVM) and platform atomics, we identify what genres of Computer Vision workloads stand to benefit by leveraging those features, and we suggest a new mental framework that replaces GPU compute with hybrid HSA APU compute. As a case in point, we discuss, in some detail, popular object recognition algorithms (part-based models), emphasizing the interplay and concurrent collaboration between the GPU and CPU. We conclude by describing how OpenCL has been incorporated in OpenCV, a popular open source computer vision library, emphasizing recent work on the Transparent API, to appear in OpenCV 3.0, which unifies the native CPU and OpenCL execution paths under a single API, allowing the same code to execute either on CPU or on an OpenCL-enabled device, without even recompiling.
Sub-pattern based multi-manifold discriminant analysis for face recognition
NASA Astrophysics Data System (ADS)
Dai, Jiangyan; Guo, Changlu; Zhou, Wei; Shi, Yanjiao; Cong, Lin; Yi, Yugen
2018-04-01
In this paper, we present a Sub-pattern based Multi-manifold Discriminant Analysis (SpMMDA) algorithm for face recognition. Unlike the existing Multi-manifold Discriminant Analysis (MMDA) approach, which is based on holistic information of the face image for recognition, SpMMDA operates on sub-images partitioned from the original face image and then extracts the discriminative local features from the sub-images separately. Moreover, the structure information of different sub-images from the same face image is considered in the proposed method with the aim of further improving the recognition performance. Extensive experiments on three standard face databases (Extended YaleB, CMU PIE and AR) demonstrate that the proposed method is effective and outperforms some other sub-pattern based face recognition methods.
Fine-grained recognition of plants from images.
Šulc, Milan; Matas, Jiří
2017-01-01
Fine-grained recognition of plants from images is a challenging computer vision task, due to the diverse appearance and complex structure of plants, high intra-class variability and small inter-class differences. We review the state of the art and discuss plant recognition tasks, from identification of plants from specific plant organs to general plant recognition "in the wild". We propose texture analysis and deep learning methods for different plant recognition tasks. The methods are evaluated and compared to the state of the art. Texture analysis is only applied to images with unambiguous segmentation (bark and leaf recognition), whereas CNNs are only applied when sufficiently large datasets are available. The results provide an insight into the complexity of different plant recognition tasks. The proposed methods outperform the state of the art in leaf and bark classification and achieve very competitive results in plant recognition "in the wild". The results suggest that recognition of segmented leaves is practically a solved problem when high volumes of training data are available. The generality and higher capacity of state-of-the-art CNNs make them suitable for plant recognition "in the wild", where the views on plant organs or plants vary significantly and the difficulty is increased by occlusions and background clutter.
Art critic: Multisignal vision and speech interaction system in a gaming context.
Reale, Michael J; Liu, Peng; Yin, Lijun; Canavan, Shaun
2013-12-01
True immersion of a player within a game can only occur when the world simulated looks and behaves as close to reality as possible. This implies that the game must correctly read and understand, among other things, the player's focus, attitude toward the objects/persons in focus, gestures, and speech. In this paper, we proposed a novel system that integrates eye gaze estimation, head pose estimation, facial expression recognition, speech recognition, and text-to-speech components for use in real-time games. Both the eye gaze and head pose components utilize underlying 3-D models, and our novel head pose estimation algorithm uniquely combines scene flow with a generic head model. The facial expression recognition module uses the local binary patterns with three orthogonal planes approach on the 2-D shape index domain rather than the pixel domain, resulting in improved classification. Our system has also been extended to use a pan-tilt-zoom camera driven by the Kinect, allowing us to track a moving player. A test game, Art Critic, is also presented, which not only demonstrates the utility of our system but also provides a template for player/non-player character (NPC) interaction in a gaming context. The player alters his/her view of the 3-D world using head pose, looks at paintings/NPCs using eye gaze, and makes an evaluation based on the player's expression and speech. The NPC artist will respond with facial expression and synthetic speech based on its personality. Both qualitative and quantitative evaluations of the system are performed to illustrate the system's effectiveness.
Head pose estimation in computer vision: a survey.
Murphy-Chutorian, Erik; Trivedi, Mohan Manubhai
2009-04-01
The capacity to estimate the head pose of another person is a common human ability that presents a unique challenge for computer vision systems. Compared to face detection and recognition, which have been the primary foci of face-related vision research, identity-invariant head pose estimation has fewer rigorously evaluated systems or generic solutions. In this paper, we discuss the inherent difficulties in head pose estimation and present an organized survey describing the evolution of the field. Our discussion focuses on the advantages and disadvantages of each approach and spans 90 of the most innovative and characteristic papers that have been published on this topic. We compare these systems by focusing on their ability to estimate coarse and fine head pose, highlighting approaches that are well suited for unconstrained environments.
Laptop Computer - Based Facial Recognition System Assessment
DOE Office of Scientific and Technical Information (OSTI.GOV)
R. A. Cain; G. B. Singleton
2001-03-01
The objective of this project was to assess the performance of the leading commercial-off-the-shelf (COTS) facial recognition software package when used as a laptop application. We performed the assessment to determine the system's usefulness for enrolling facial images in a database from remote locations and conducting real-time searches against a database of previously enrolled images. The assessment involved creating a database of 40 images and conducting 2 series of tests to determine the product's ability to recognize and match subject faces under varying conditions. This report describes the test results and includes a description of the factors affecting the results. After an extensive market survey, we selected Visionics' FaceIt® software package for evaluation and a review of the Facial Recognition Vendor Test 2000 (FRVT 2000). This test was co-sponsored by the US Department of Defense (DOD) Counterdrug Technology Development Program Office, the National Institute of Justice, and the Defense Advanced Research Projects Agency (DARPA). Administered in May-June 2000, the FRVT 2000 assessed the capabilities of facial recognition systems that were currently available for purchase on the US market. Our selection of this Visionics product does not indicate that it is the "best" facial recognition software package for all uses. It was the most appropriate package based on the specific applications and requirements for this specific application. In this assessment, the system configuration was evaluated for effectiveness in identifying individuals by searching for facial images captured from video displays against those stored in a facial image database. An additional criterion was that the system be capable of operating discreetly.
For this application, an operational facial recognition system would consist of one central computer hosting the master image database with multiple standalone systems configured with duplicates of the master operating in remote locations. Remote users could perform real-time searches where network connectivity is not available. As images are enrolled at the remote locations, periodic database synchronization is necessary.
NASA Astrophysics Data System (ADS)
Wang, Bingjie; Sun, Qi; Pi, Shaohua; Wu, Hongyan
2014-09-01
In this paper, feature extraction and pattern recognition of the distributed optical fiber sensing signal have been studied. We adopt Mel-Frequency Cepstral Coefficient (MFCC) feature extraction, wavelet packet energy feature extraction and wavelet packet Shannon entropy feature extraction methods to obtain characteristic vectors of the sensing signals (such as speech, wind, thunder and rain signals), and then perform pattern recognition via an RBF neural network. Performances of these three feature extraction methods are compared according to the results. We choose the MFCC characteristic vector to be 12-dimensional. For wavelet packet feature extraction, signals are decomposed into six layers by the Daubechies wavelet packet transform, from which 64 frequency constituents are extracted as the characteristic vector. In the process of pattern recognition, the value of the diffusion coefficient is introduced to increase the recognition accuracy, while keeping the samples for testing the algorithm the same. Recognition results show that the wavelet packet Shannon entropy feature extraction method yields the best recognition accuracy, which is up to 97%; the performance of the 12-dimensional MFCC feature extraction method is less satisfactory; the performance of the wavelet packet energy feature extraction method is the worst.
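As a rough illustration of the wavelet packet features described in this abstract, the sketch below builds a full six-level packet tree (2^6 = 64 terminal bands) and computes one energy value and one Shannon entropy value per band. It is only a minimal sketch under stated assumptions: it uses the Haar filter instead of the Daubechies filter named in the abstract, the test signal is synthetic, the entropy normalization is one common convention rather than the authors' definition, and all function names are hypothetical.

```python
import math

def haar_split(x):
    """One orthonormal Haar step: return (approximation, detail) halves.
    Assumption: Haar stands in for the Daubechies filter used in the paper."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def wavelet_packet_leaves(x, levels):
    """Full packet tree: every node is split at every level,
    so the result holds 2**levels terminal bands."""
    nodes = [list(x)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes

def band_energy(leaf):
    """Energy feature: sum of squared coefficients in one band."""
    return sum(c * c for c in leaf)

def band_entropy(leaf):
    """Shannon entropy of the normalized squared coefficients in one band
    (an assumed normalization; the abstract does not give the formula)."""
    total = band_energy(leaf)
    if total == 0:
        return 0.0
    probs = [c * c / total for c in leaf if c != 0]
    return -sum(p * math.log(p) for p in probs)

# Synthetic two-tone signal; its length must be a multiple of 2**6.
signal = [math.sin(0.3 * n) + 0.5 * math.sin(1.7 * n) for n in range(256)]
leaves = wavelet_packet_leaves(signal, 6)
energies = [band_energy(leaf) for leaf in leaves]    # 64-dimensional vector
entropies = [band_entropy(leaf) for leaf in leaves]  # 64-dimensional vector
print(len(energies), len(entropies))
```

Either 64-dimensional vector could then be passed to a classifier; the RBF network itself is not sketched here.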
Visual adaptation dominates bimodal visual-motor action adaptation
de la Rosa, Stephan; Ferstl, Ylva; Bülthoff, Heinrich H.
2016-01-01
A long-standing debate revolves around the question of whether visual action recognition primarily relies on visual or motor action information. Previous studies mainly examined the contribution of either visual or motor information to action recognition. Yet, the interaction of visual and motor action information is particularly important for understanding action recognition in social interactions, where humans often observe and execute actions at the same time. Here, we behaviourally examined the interaction of visual and motor action recognition processes when participants simultaneously observe and execute actions. We took advantage of behavioural action adaptation effects to investigate behavioural correlates of neural action recognition mechanisms. In line with previous results, we find that prolonged visual exposure (visual adaptation) and prolonged execution of the same action with closed eyes (non-visual motor adaptation) influence action recognition. However, when participants simultaneously adapted visually and motorically, akin to simultaneous execution and observation of actions in social interactions, adaptation effects were only modulated by visual but not motor adaptation. Action recognition, therefore, relies primarily on vision-based action recognition mechanisms in situations that require simultaneous action observation and execution, such as social interactions. The results suggest caution when associating social behaviour in social interactions with motor based information. PMID:27029781
Image quality assessment for video stream recognition systems
NASA Astrophysics Data System (ADS)
Chernov, Timofey S.; Razumnuy, Nikita P.; Kozharinov, Alexander S.; Nikolaev, Dmitry P.; Arlazarov, Vladimir V.
2018-04-01
Recognition and machine vision systems have long been widely used in many disciplines to automate various processes of life and industry. Input images of optical recognition systems can be subjected to a large number of different distortions, especially in uncontrolled or natural shooting conditions, which leads to unpredictable results of recognition systems, making it impossible to assess their reliability. For this reason, it is necessary to perform quality control of the input data of recognition systems, which is facilitated by modern progress in the field of image quality evaluation. In this paper, we investigate the approach to designing optical recognition systems with built-in input image quality estimation modules and feedback, for which the necessary definitions are introduced and a model for describing such systems is constructed. The efficiency of this approach is illustrated by the example of solving the problem of selecting the best frames for recognition in a video stream for a system with limited resources. Experimental results are presented for the system for identity documents recognition, showing a significant increase in the accuracy and speed of the system under simulated conditions of automatic camera focusing, leading to blurring of frames.
Pattern association--a key to recognition of shark attacks.
Cirillo, G; James, H
2004-12-01
Investigation of a number of shark attacks in South Australian waters has led to recognition of pattern similarities on equipment recovered from the scene of such attacks. Six cases are presented in which a common pattern of striations has been noted.
International Visions of Excellence for Children with Disabilities.
ERIC Educational Resources Information Center
Mittler, Peter
1992-01-01
This paper reviews the status of children with disabilities throughout the world. It summarizes United Nations information on the prevalence of disability and on prevention efforts. Progress is noted in the areas of immunization, increased early intervention services, community-based rehabilitation, and increased recognition of governmental…
ERIC Educational Resources Information Center
Doty, Keith L.
1999-01-01
Research on neural networks and hippocampal function demonstrating how mammals construct mental maps and develop navigation strategies is being used to create Intelligent Autonomous Mobile Robots (IAMRs). Such robots are able to recognize landmarks and navigate without "vision." (SK)
US Policy approaches for assessing soil health
USDA-ARS's Scientific Manuscript database
There is worldwide recognition for a more holistic vision of soil health and tools to guide soil conservation policy, management and restoration. To meet this need, U.S. conservation programs in the US Food, Conservation, and Energy Act of 2008 (the farm bill), including the Conservation Stewardship...
ERIC Educational Resources Information Center
Horenstein, Mary Ann
This booklet describes 12 Blue Ribbon secondary schools chosen during the 1991 federal recognition program. Representing a cross-section of America, these schools are characterized by a clear vision; empowered, hard-working teachers; a school environment that supports learning; use of local resources; and programs other than classroom learning.…
Recognition vs Reverse Engineering in Boolean Concepts Learning
ERIC Educational Resources Information Center
Shafat, Gabriel; Levin, Ilya
2012-01-01
This paper deals with two types of logical problems--recognition problems and reverse engineering problems, and with the interrelations between these types of problems. The recognition problems are modeled in the form of a visual representation of various objects in a common pattern, with a composition of represented objects in the pattern.…
Neuromorphic Hardware Architecture Using the Neural Engineering Framework for Pattern Recognition.
Wang, Runchun; Thakur, Chetan Singh; Cohen, Gregory; Hamilton, Tara Julia; Tapson, Jonathan; van Schaik, Andre
2017-06-01
We present a hardware architecture that uses the neural engineering framework (NEF) to implement large-scale neural networks on field programmable gate arrays (FPGAs) for performing massively parallel real-time pattern recognition. NEF is a framework that is capable of synthesising large-scale cognitive systems from subnetworks and we have previously presented an FPGA implementation of the NEF that successfully performs nonlinear mathematical computations. That work was developed based on a compact digital neural core, which consists of 64 neurons that are instantiated by a single physical neuron using a time-multiplexing approach. We have now scaled this approach up to build a pattern recognition system by combining identical neural cores together. As a proof of concept, we have developed a handwritten digit recognition system using the MNIST database and achieved a recognition rate of 96.55%. The system is implemented on a state-of-the-art FPGA and can process 5.12 million digits per second. The architecture and hardware optimisations presented offer high-speed and resource-efficient means for performing high-speed, neuromorphic, and massively parallel pattern recognition and classification tasks.
Finger vein recognition based on personalized weight maps.
Yang, Gongping; Xiao, Rongyang; Yin, Yilong; Yang, Lu
2013-09-10
Finger vein recognition is a promising biometric recognition technology, which verifies identities via the vein patterns in the fingers. Binary pattern based methods were thoroughly studied in order to cope with the difficulties of extracting the blood vessel network. However, current binary pattern based finger vein matching methods treat every bit of feature codes derived from different images of various individuals as equally important and assign the same weight value to them. In this paper, we propose a finger vein recognition method based on personalized weight maps (PWMs). The different bits have different weight values according to their stabilities in a certain number of training samples from an individual. First, we present the concept of PWM, and then propose the finger vein recognition framework, which mainly consists of preprocessing, feature extraction, and matching. Finally, we design extensive experiments to evaluate the effectiveness of our proposal. Experimental results show that PWM achieves not only better performance, but also high robustness and reliability. In addition, PWM can be used as a general framework for binary pattern based recognition.
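The idea of weighting each bit by its stability across one individual's training samples can be sketched as follows. This is a hypothetical reading of the abstract, not the authors' formulation: the stability measure (how far the fraction of 1-bits is from an even split), the majority-vote template, and the function names `personalized_weight_map` and `weighted_hamming` are all assumptions made for illustration.

```python
def personalized_weight_map(codes):
    """Build a template and a per-bit weight from one person's training codes.
    Assumed stability measure: |2 * ones - n| / n, which is 1.0 when a bit
    never changes across samples and 0.0 when it flips half of the time."""
    n = len(codes)
    template, weights = [], []
    for j in range(len(codes[0])):
        ones = sum(code[j] for code in codes)
        template.append(1 if 2 * ones >= n else 0)  # majority bit
        weights.append(abs(2 * ones - n) / n)
    return template, weights

def weighted_hamming(probe, template, weights):
    """Weighted Hamming distance in [0, 1]: unstable bits count for less."""
    total = sum(weights)
    if total == 0:
        return 0.0
    mismatch = sum(w for p, t, w in zip(probe, template, weights) if p != t)
    return mismatch / total

# Four toy 4-bit codes from the same (hypothetical) finger.
training_codes = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
    [1, 0, 1, 1],
]
template, weights = personalized_weight_map(training_codes)
unstable_bit_differs = weighted_hamming([1, 0, 0, 1], template, weights)
stable_bit_differs = weighted_hamming([0, 0, 1, 1], template, weights)
print(template, weights, unstable_bit_differs, stable_bit_differs)
```

In this toy case a mismatch on a bit that never varied in training is penalized twice as heavily as a mismatch on a bit that varied in one of four samples, which is the behaviour an unweighted Hamming distance cannot express.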
Finger Vein Recognition Based on Personalized Weight Maps
Yang, Gongping; Xiao, Rongyang; Yin, Yilong; Yang, Lu
2013-01-01
Finger vein recognition is a promising biometric recognition technology, which verifies identities via the vein patterns in the fingers. Binary pattern based methods were thoroughly studied in order to cope with the difficulties of extracting the blood vessel network. However, current binary pattern based finger vein matching methods treat every bit of feature codes derived from different images of various individuals as equally important and assign the same weight value to them. In this paper, we propose a finger vein recognition method based on personalized weight maps (PWMs). The different bits have different weight values according to their stabilities in a certain number of training samples from an individual. First, we present the concept of PWM, and then propose the finger vein recognition framework, which mainly consists of preprocessing, feature extraction, and matching. Finally, we design extensive experiments to evaluate the effectiveness of our proposal. Experimental results show that PWM achieves not only better performance, but also high robustness and reliability. In addition, PWM can be used as a general framework for binary pattern based recognition. PMID:24025556
Exploring Spatio-temporal Dynamics of Cellular Automata for Pattern Recognition in Networks.
Miranda, Gisele Helena Barboni; Machicao, Jeaneth; Bruno, Odemir Martinez
2016-11-22
Network science is an interdisciplinary field which provides an integrative approach for the study of complex systems. In recent years, network modeling has been used for the study of emergent phenomena in many real-world applications. Pattern recognition in networks has been drawing attention to the importance of network characterization, which may lead to understanding the topological properties that are related to the network model. In this paper, the Life-Like Network Automata (LLNA) method is introduced, which was designed for pattern recognition in networks. LLNA uses the network topology as a tessellation of Cellular Automata (CA), whose dynamics produces a spatio-temporal pattern used to extract the feature vector for network characterization. The method was evaluated using synthetic and real-world networks. In the latter, three pattern recognition applications were used: (i) identifying organisms from distinct domains of life through their metabolic networks, (ii) identifying online social networks and (iii) classifying stomata distribution patterns varying according to different lighting conditions. LLNA was compared to structural measurements and surpasses them in real-world applications, achieving improvement in the classification rate as high as 23%, 4% and 7% respectively. Therefore, the proposed method is a good choice for pattern recognition applications using networks and demonstrates potential for general applicability.
Exploring Spatio-temporal Dynamics of Cellular Automata for Pattern Recognition in Networks
Miranda, Gisele Helena Barboni; Machicao, Jeaneth; Bruno, Odemir Martinez
2016-01-01
Network science is an interdisciplinary field which provides an integrative approach for the study of complex systems. In recent years, network modeling has been used for the study of emergent phenomena in many real-world applications. Pattern recognition in networks has been drawing attention to the importance of network characterization, which may lead to understanding the topological properties that are related to the network model. In this paper, the Life-Like Network Automata (LLNA) method is introduced, which was designed for pattern recognition in networks. LLNA uses the network topology as a tessellation of Cellular Automata (CA), whose dynamics produces a spatio-temporal pattern used to extract the feature vector for network characterization. The method was evaluated using synthetic and real-world networks. In the latter, three pattern recognition applications were used: (i) identifying organisms from distinct domains of life through their metabolic networks, (ii) identifying online social networks and (iii) classifying stomata distribution patterns varying according to different lighting conditions. LLNA was compared to structural measurements and surpasses them in real-world applications, achieving improvement in the classification rate as high as 23%, 4% and 7% respectively. Therefore, the proposed method is a good choice for pattern recognition applications using networks and demonstrates potential for general applicability. PMID:27874024
Exploring Spatio-temporal Dynamics of Cellular Automata for Pattern Recognition in Networks
NASA Astrophysics Data System (ADS)
Miranda, Gisele Helena Barboni; Machicao, Jeaneth; Bruno, Odemir Martinez
2016-11-01
Network science is an interdisciplinary field which provides an integrative approach for the study of complex systems. In recent years, network modeling has been used for the study of emergent phenomena in many real-world applications. Pattern recognition in networks has been drawing attention to the importance of network characterization, which may lead to understanding the topological properties that are related to the network model. In this paper, the Life-Like Network Automata (LLNA) method is introduced, which was designed for pattern recognition in networks. LLNA uses the network topology as a tessellation of Cellular Automata (CA), whose dynamics produces a spatio-temporal pattern used to extract the feature vector for network characterization. The method was evaluated using synthetic and real-world networks. In the latter, three pattern recognition applications were used: (i) identifying organisms from distinct domains of life through their metabolic networks, (ii) identifying online social networks and (iii) classifying stomata distribution patterns varying according to different lighting conditions. LLNA was compared to structural measurements and surpasses them in real-world applications, achieving improvement in the classification rate as high as 23%, 4% and 7% respectively. Therefore, the proposed method is a good choice for pattern recognition applications using networks and demonstrates potential for general applicability.
System of technical vision for autonomous unmanned aerial vehicles
NASA Astrophysics Data System (ADS)
Bondarchuk, A. S.
2018-05-01
This paper is devoted to the implementation of an image recognition algorithm using the LabVIEW software. The created virtual instrument is designed to detect objects in the frames from a camera mounted on the UAV. The trained classifier is invariant to changes in rotation, as well as to small changes in the camera's viewing angle. Finding objects in the image using particle analysis allows regions of different sizes to be classified. This method allows the system of technical vision to more accurately determine the location of the objects of interest and their movement relative to the camera.
Active vision in satellite scene analysis
NASA Technical Reports Server (NTRS)
Naillon, Martine
1994-01-01
In earth observation or planetary exploration it is necessary to have more and more autonomous systems, able to adapt to unpredictable situations. This imposes the use, in artificial systems, of new concepts in cognition, based on the fact that perception should not be separated from recognition and decision making levels. This means that low level signal processing (perception level) should interact with symbolic and high level processing (decision level). This paper describes the new concept of active vision, implemented in Distributed Artificial Intelligence by Dassault Aviation following a 'structuralist' principle. An application to spatial image interpretation is given, oriented toward flexible robotics.
International VLBI Service for Geodesy and Astrometry 2004 General Meeting Proceedings
NASA Technical Reports Server (NTRS)
Vandenberg, Nancy R. (Editor); Baver, Karen D. (Editor)
2004-01-01
This volume is the proceedings of the third General Meeting of the International VLBI Service for Geodesy and Astrometry (IVS), held in Ottawa, Canada, February 9-11, 2004. The keynote of the third GM was visions for the next decade following the main theme of "Today's Results and Tomorrow's Vision", with a recognition that the outstanding VLBI results available today are the foundation and motivation for the next generation VLBI system requirements. The goal of the meeting was to provide an interesting and informative program for a wide cross section of IVS members, including station operators, program managers, and analysts.
Neural Encoding and Decoding with Deep Learning for Dynamic Natural Vision.
Wen, Haiguang; Shi, Junxing; Zhang, Yizhen; Lu, Kun-Han; Cao, Jiayue; Liu, Zhongming
2017-10-20
Convolutional neural network (CNN) driven by image recognition has been shown to be able to explain cortical responses to static pictures at ventral-stream areas. Here, we further showed that such CNN could reliably predict and decode functional magnetic resonance imaging data from humans watching natural movies, despite its lack of any mechanism to account for temporal dynamics or feedback processing. Using separate data, encoding and decoding models were developed and evaluated for describing the bi-directional relationships between the CNN and the brain. Through the encoding models, the CNN-predicted areas covered not only the ventral stream, but also the dorsal stream, albeit to a lesser degree; single-voxel response was visualized as the specific pixel pattern that drove the response, revealing the distinct representation of individual cortical location; cortical activation was synthesized from natural images with high-throughput to map category representation, contrast, and selectivity. Through the decoding models, fMRI signals were directly decoded to estimate the feature representations in both visual and semantic spaces, for direct visual reconstruction and semantic categorization, respectively. These results corroborate, generalize, and extend previous findings, and highlight the value of using deep learning, as an all-in-one model of the visual cortex, to understand and decode natural vision. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
A bacterial tyrosine phosphatase inhibits plant pattern recognition receptor activation
USDA-ARS's Scientific Manuscript database
Perception of pathogen-associated molecular patterns (PAMPs) by surface-localised pattern-recognition receptors (PRRs) is a key component of plant innate immunity. Most known plant PRRs are receptor kinases and initiation of PAMP-triggered immunity (PTI) signalling requires phosphorylation of the PR...
Realism and Perspectivism: a Reevaluation of Rival Theories of Spatial Vision.
NASA Astrophysics Data System (ADS)
Thro, E. Broydrick
1990-01-01
My study reevaluates two theories of human space perception, a trigonometric surveying theory I call perspectivism and a "scene recognition" theory I call realism. Realists believe that retinal image geometry can supply no unambiguous information about an object's size and distance--and that, as a result, viewers can locate objects in space only by making discretionary interpretations based on familiar experience of object types. Perspectivists, in contrast, think viewers can disambiguate object sizes/distances on the basis of retinal image information alone. More specifically, they believe the eye responds to perspective image geometry with an automatic trigonometric calculation that not only fixes the directions and shapes, but also roughly fixes the sizes and distances of scene elements in space. Today this surveyor theory has been largely superseded by the realist approach, because most vision scientists believe retinal image geometry is ambiguous about the scale of space. However, I show that there is a considerable body of neglected evidence, both past and present, tending to call this scale ambiguity claim into question. I maintain that this evidence against scale ambiguity could hardly be more important, if one considers its subversive implications for the scene recognition theory that is not only today's reigning approach to spatial vision, but also the foundation for computer scientists' efforts to create space-perceiving robots. If viewers were deemed to be capable of automatic surveying calculations, the discretionary scene recognition theory would lose its main justification. Clearly, it would be difficult for realists to maintain that we viewers rely on scene recognition for space perception in spite of our ability to survey.
And in reality, as I show, the surveyor theory does a much better job of describing the everyday space we viewers actually see--a space featuring stable, unambiguous relationships among scene elements, and a single horizon and vanishing point for (meter-scale) receding objects. In addition, I argue, the surveyor theory raises fewer philosophical difficulties, because it is more in harmony with our everyday concepts of material objects, human agency and the self.
33 CFR 104.210 - Company Security Officer (CSO).
Code of Federal Regulations, 2011 CFR
2011-07-01
... threats and patterns; (ix) Recognition and detection of dangerous substances and devices; (x) Recognition of characteristics and behavioral patterns of persons who are likely to threaten security; (xi...
33 CFR 104.210 - Company Security Officer (CSO).
Code of Federal Regulations, 2010 CFR
2010-07-01
... threats and patterns; (ix) Recognition and detection of dangerous substances and devices; (x) Recognition of characteristics and behavioral patterns of persons who are likely to threaten security; (xi...
Infrared face recognition based on LBP histogram and KW feature selection
NASA Astrophysics Data System (ADS)
Xie, Zhihua
2014-07-01
The conventional LBP-based feature, as represented by the local binary pattern (LBP) histogram, still has room for performance improvements. This paper focuses on the dimension reduction of LBP micro-patterns and proposes an improved infrared face recognition method based on LBP histogram representation. To extract the local robust features in infrared face images, LBP is chosen to get the composition of micro-patterns of sub-blocks. Based on statistical test theory, a Kruskal-Wallis (KW) feature selection method is proposed to get the LBP patterns which are suitable for infrared face recognition. The experimental results show that the combination of LBP and KW feature selection improves the performance of infrared face recognition, and that the proposed method outperforms the traditional methods based on LBP histogram, discrete cosine transform (DCT) or principal component analysis (PCA).
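For readers unfamiliar with the LBP histogram that this abstract starts from, the sketch below computes the basic 8-neighbour LBP code for each interior pixel of a small grayscale image and accumulates the 256-bin histogram. It is a minimal illustration only: the neighbour ordering and the `>=` comparison are one common convention, the toy images are made up, and neither the sub-block partitioning nor the Kruskal-Wallis selection step from the paper is shown.

```python
# Clockwise 8-neighbourhood starting at the top-left pixel (assumed ordering).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(img, r, c):
    """Basic 3x3 LBP: set a bit for every neighbour >= the centre pixel."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist

# Toy images: a flat patch and a single bright spot.
flat = [[5] * 4 for _ in range(4)]
spot = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
flat_hist = lbp_histogram(flat)
spot_hist = lbp_histogram(spot)
print(flat_hist[255], spot_hist[0])
```

A feature selection step such as the one the abstract proposes would then keep only a subset of these 256 bins per sub-block.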
Demas, James A.; Payne, Hannah; Cline, Hollis T.
2011-01-01
Developing amphibians need vision to avoid predators and locate food before visual system circuits fully mature. Xenopus tadpoles can respond to visual stimuli as soon as retinal ganglion cells (RGCs) innervate the brain; however, in mammals, chicks and turtles, RGCs reach their central targets many days, or even weeks, before their retinas are capable of vision. In the absence of vision, activity-dependent refinement in these amniote species is mediated by waves of spontaneous activity that periodically spread across the retina, correlating the firing of action potentials in neighboring RGCs. Theory suggests that retinorecipient neurons in the brain use patterned RGC activity to sharpen the retinotopy first established by genetic cues. We find that in both wild type and albino Xenopus tadpoles, RGCs are spontaneously active at all stages of tadpole development studied, but their population activity never coalesces into waves. Even at the earliest stages recorded, visual stimulation dominates over spontaneous activity and can generate patterns of RGC activity similar to the locally correlated spontaneous activity observed in amniotes. In addition, we show that blocking AMPA and NMDA type glutamate receptors significantly decreases spontaneous activity in young Xenopus retina, but that blocking GABAA receptors does not. Our findings indicate that vision drives correlated activity required for topographic map formation. They further suggest that developing retinal circuits in the two major subdivisions of tetrapods, amphibians and amniotes, evolved different strategies to supply appropriately patterned RGC activity to drive visual circuit refinement. PMID:21312343
The impact of privacy protection filters on gender recognition
NASA Astrophysics Data System (ADS)
Ruchaud, Natacha; Antipov, Grigory; Korshunov, Pavel; Dugelay, Jean-Luc; Ebrahimi, Touradj; Berrani, Sid-Ahmed
2015-09-01
Deep learning-based algorithms have become increasingly efficient in recognition and detection tasks, especially when they are trained on large-scale datasets. Such recent success has led to speculation that deep learning methods are comparable to or even outperform the human visual system in its ability to detect and recognize objects and their features. In this paper, we focus on the specific task of gender recognition in images when they have been processed by privacy protection filters (e.g., blurring, masking, and pixelization) applied at different strengths. Assuming a privacy protection scenario, we compare the performance of state-of-the-art deep learning algorithms with a subjective evaluation obtained via crowdsourcing to understand how privacy protection filters affect both machine and human vision.
Multi-task learning with group information for human action recognition
NASA Astrophysics Data System (ADS)
Qian, Li; Wu, Song; Pu, Nan; Xu, Shulin; Xiao, Guoqiang
2018-04-01
Human action recognition is an important and challenging task in computer vision research, due to the variations in human motion performance, interpersonal differences and recording settings. In this paper, we propose a novel multi-task learning framework with group information (MTL-GI) for accurate and efficient human action recognition. Specifically, we first obtain group information by calculating the mutual information according to the latent relationship between Gaussian components and action categories, and by clustering similar action categories into the same group using affinity propagation clustering. Additionally, in order to explore the relationships of related tasks, we incorporate group information into multi-task learning. Experimental results evaluated on two popular benchmarks (UCF50 and HMDB51 datasets) demonstrate the superiority of our proposed MTL-GI framework.
NASA Technical Reports Server (NTRS)
Arthur, Jarvis J., III; Shelton, Kevin J.; Prinzel, Lawrence J., III; Bailey, Randall E.
2016-01-01
During the flight trials known as Gulfstream-V Synthetic Vision Systems Integrated Technology Evaluation (GV-SITE), a Speech Recognition System (SRS) was used by the evaluation pilots. The SRS was intended to be an intuitive interface for display control (rather than knobs, buttons, etc.). This paper describes the performance of the current "state of the art" SRS. The commercially available technology was evaluated as an application for possible inclusion in commercial aircraft flight decks as a crew-to-vehicle interface. Specifically, the technology is to be used as an interface from aircrew to the onboard displays, controls, and flight management tasks. A flight test of an SRS as well as a laboratory test was conducted.
High Tech Aids Low Vision: A Review of Image Processing for the Visually Impaired.
Moshtael, Howard; Aslam, Tariq; Underwood, Ian; Dhillon, Baljean
2015-08-01
Recent advances in digital image processing provide promising methods for maximizing the residual vision of the visually impaired. This paper seeks to introduce this field to the readership and describe its current state as found in the literature. A systematic search revealed 37 studies that measure the value of image processing techniques for subjects with low vision. The techniques used are categorized according to their effect and the principal findings are summarized. The majority of participants preferred enhanced images over the original for a wide range of enhancement types. Adapting the contrast and spatial frequency content often improved performance at object recognition and reading speed, as did techniques that attenuate the image background and a technique that induced jitter. A lack of consistency in preference and performance measures was found, as well as a lack of independent studies. Nevertheless, the promising results should encourage further research in order to allow their widespread use in low-vision aids.
Calvo, Manuel G; Nummenmaa, Lauri
2009-12-01
Happy, surprised, disgusted, angry, sad, fearful, and neutral faces were presented extrafoveally, with fixations on faces allowed or not. The faces were preceded by a cue word that designated the face to be saccaded in a two-alternative forced-choice discrimination task (2AFC; Experiments 1 and 2), or were followed by a probe word for recognition (Experiment 3). Eye tracking was used to decompose the recognition process into stages. Relative to the other expressions, happy faces (1) were identified faster (as early as 160 msec from stimulus onset) in extrafoveal vision, as revealed by shorter saccade latencies in the 2AFC task; (2) required less encoding effort, as indexed by shorter first fixations and dwell times; and (3) required less decision-making effort, as indicated by fewer refixations on the face after the recognition probe was presented. This reveals a happy-face identification advantage both prior to and during overt attentional processing. The results are discussed in relation to prior neurophysiological findings on latencies in facial expression recognition.
Spencer, Rand
2006-01-01
Purpose: The goal is to analyze the long-term visual outcome of extremely low-birth-weight children. Methods: This is a retrospective analysis of eyes of extremely low-birth-weight children on whom vision testing was performed. Visual outcomes were studied by analyzing acuity outcomes at ≥36 months of adjusted age, correlating early acuity testing with final visual outcome and evaluating adverse risk factors for vision. Results: Data from 278 eyes are included. Mean birth weight was 731 g, and mean gestational age at birth was 26 weeks. 248 eyes had grating acuity outcomes measured at 73 ± 36 months, and 183 eyes had recognition acuity testing at 76 ± 39 months. 54% had below normal grating acuities, and 66% had below normal recognition acuities. 27% of grating outcomes and 17% of recognition outcomes were ≤20/200. Abnormal early grating acuity testing was predictive of abnormal grating (P < .0001) and recognition (P = .0001) acuity testing at ≥3 years of age. A slower-than-normal rate of early visual development was predictive of abnormal grating acuity (P < .0001) and abnormal recognition acuity (P < .0001) at ≥3 years of age. Eyes diagnosed with maximal retinopathy of prematurity in zone I had lower acuity outcomes (P = .0002) than did those with maximal retinopathy of prematurity in zone II/III. Eyes of children born at ≤28 weeks gestational age had 4.1 times greater risk for abnormal recognition acuity than did those of children born at >28 weeks gestational age. Eyes of children with poorer general health after premature birth had a 5.3 times greater risk of abnormal recognition acuity. Conclusions: Long-term visual development in extremely low-birth-weight infants is problematic and associated with a high risk of subnormal acuity. Early acuity testing is useful in identifying children at greatest risk for long-term visual abnormalities. Gestational age at birth of ≤28 weeks was associated with a higher risk of an abnormal long-term outcome. PMID:17471358
2D DOST based local phase pattern for face recognition
NASA Astrophysics Data System (ADS)
Moniruzzaman, Md.; Alam, Mohammad S.
2017-05-01
A new two-dimensional (2-D) Discrete Orthogonal Stockwell Transform (DOST) based Local Phase Pattern (LPP) technique has been proposed for efficient face recognition. The proposed technique uses the 2-D DOST as preliminary preprocessing and the local phase pattern to form a robust feature signature that can effectively accommodate various 3D facial distortions and illumination variations. The S-transform, an extension of the ideas of the continuous wavelet transform (CWT), is also known for its local spectral phase properties in time-frequency representation (TFR). It provides a frequency-dependent resolution of the time-frequency space and absolutely referenced local phase information while maintaining a direct relationship with the Fourier spectrum, which is unique in TFR. Using the 2-D S-transform as preprocessing and building the local phase pattern from the extracted phase information yields a fast and efficient technique for face recognition. The proposed technique shows better correlation discrimination compared to alternate pattern recognition techniques such as wavelet or Gabor based face recognition. The performance of the proposed method has been tested using the Yale and extended Yale facial databases under different conditions such as illumination variation and 3D changes in facial expressions. Test results show that the proposed technique yields better performance compared to alternate TFR-based face recognition techniques.
Higher-Order Neural Networks Applied to 2D and 3D Object Recognition
NASA Technical Reports Server (NTRS)
Spirkovska, Lilly; Reid, Max B.
1994-01-01
A Higher-Order Neural Network (HONN) can be designed to be invariant to geometric transformations such as scale, translation, and in-plane rotation. Invariances are built directly into the architecture of a HONN and do not need to be learned. Thus, for 2D object recognition, the network needs to be trained on just one view of each object class, not numerous scaled, translated, and rotated views. Because the 2D object recognition task is a component of the 3D object recognition task, built-in 2D invariance also decreases the size of the training set required for 3D object recognition. We present results for 2D object recognition both in simulation and within a robotic vision experiment and for 3D object recognition in simulation. We also compare our method to other approaches and show that HONNs have distinct advantages for position, scale, and rotation-invariant object recognition. The major drawback of HONNs is that the size of the input field is limited due to the memory required for the large number of interconnections in a fully connected network. We present partial connectivity strategies and a coarse-coding technique for overcoming this limitation and increasing the input field to that required by practical object recognition problems.
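The abstract does not say how the invariances are built in. One formulation often described for third-order networks is to share a weight among all pixel triplets that form similar triangles, since a triangle's interior angles do not change under scaling, translation, or in-plane rotation. The sketch below only checks that geometric property; the helper names `interior_angles` and `similarity` are assumptions for this example, and it is not a network implementation.

```python
import math


def interior_angles(p, q, r):
    """Return the sorted interior angles (radians) of the triangle p, q, r."""
    def angle_at(a, b, c):
        # Unsigned angle at vertex a between the edges a->b and a->c.
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - a[0], c[1] - a[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        return abs(math.atan2(cross, dot))

    return sorted([angle_at(p, q, r), angle_at(q, r, p), angle_at(r, p, q)])


def similarity(point, scale, theta, shift):
    """Scale a 2D point, rotate it in-plane by theta, then translate it."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (scale * (c * x - s * y) + shift[0],
            scale * (s * x + c * y) + shift[1])


# Hypothetical pixel triplet and an arbitrary similarity transform.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
moved = [similarity(pt, 2.5, 0.7, (10.0, -3.0)) for pt in triangle]

before = interior_angles(*triangle)
after = interior_angles(*moved)  # matches `before` up to rounding error
```

If weights are indexed by such angle triples, every transformed view of an object activates the same weights, which is consistent with the abstract's claim that a single training view per class suffices for 2D recognition.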
ERIC Educational Resources Information Center
National Academy of Sciences - National Research Council, Washington, DC. Committee on Prosthetics Research and Development.
The problems of providing sensory aids for the blind are presented and a report on the present status of aids discusses direct translation and recognition reading machines as well as mobility aids. Aspects of required research considered are the following: assessment of needs; vision, audition, taction, and multimodal communication; reading aids,…
Geometric Invariants and Object Recognition.
1992-08-01
University of Chicago Press. Maybank, S.J. [1992], "The Projection of Two Non-coplanar Conics", in Geometric Invariance in Machine Vision, eds. J.L...J.L. Mundy and A. Zisserman, MIT Press, Cambridge, MA. Mundy, J.L., Kapur, .. , Maybank, S.J., and Quan, L. [1992a] "Geometric Interpretation of
The Critical Difference: Identifying the Dyslexic.
ERIC Educational Resources Information Center
Burgett, Russell; King, James
A study compared peripheral vision applied to letter-pair and Dolch word recognition. Subjects, 6 normal readers, 12 Chapter 1 students, and 34 learning disabled (and assumed dyslexic) students from grades one through three enrolled in a parochial school, a public school, and a university summer reading clinic, completed a test designed to…
Demonstration of a 3D vision algorithm for space applications
NASA Technical Reports Server (NTRS)
Defigueiredo, Rui J. P. (Editor)
1987-01-01
This paper reports an extension of the MIAG algorithm for recognition and motion parameter determination of general 3-D polyhedral objects based on model matching techniques and using movement invariants as features of object representation. Results of tests conducted on the algorithm under conditions simulating space conditions are presented.
Reflecting Visions. New Perspectives on Adult Education for Indigenous Peoples.
ERIC Educational Resources Information Center
King, Linda, Ed.
This book contains 14 papers: "Indigenous Peoples and Adult Education: A Growing Challenge" (Rodolfo Stavenhagen); "Indigenous Peoples: Progress in the International Recognition of Human Rights and the Role of Education" (Julian Burger); "Adult Learning in the Context of Indigenous Societies" (Linda King); "Linguistic Rights and the Role of…
ERIC Educational Resources Information Center
Minarovic, Rosanne E.; Mueller, J. Paul
2000-01-01
Responses from 369 of 500 extension professionals reflected a shared vision for sustainable agriculture and recognition of a need for environmentally sound farming practices. There was less unanimity about endorsing the social aspects of sustainable agriculture, though they agreed on the need for more systems research. (SK)
Optical Pattern Recognition for Missile Guidance.
1982-11-15
directed to novel pattern recognition algorithms (that allow pattern recognition and object classification in the face of various geometrical and...I wats EF5 = 50) p.j/t’ni 2 (for btith image pat tern recognitio itas a preproicessing oiperatiton. Ini devices). TIhe rt’ad light intensity (0.33t mW...electrodes on its large faces. This Priz light modulator and the motivation for its devel- SLM is known as the Prom (Pockels real-time optical opment. In Sec
Localization and recognition of traffic signs for automated vehicle control systems
NASA Astrophysics Data System (ADS)
Zadeh, Mahmoud M.; Kasvand, T.; Suen, Ching Y.
1998-01-01
We present a computer vision system for detection and recognition of traffic signs. Such systems are required to assist drivers and for guidance and control of autonomous vehicles on roads and city streets. For experiments we use sequences of digitized photographs and off-line analysis. The system contains four stages. First, region segmentation based on color pixel classification, called SRSM, limits the search to regions of interest in the scene. Second, we use edge tracing to find parts of outer edges of signs which are circular or straight, corresponding to the geometrical shapes of traffic signs. The third step is geometrical analysis of the outer edge and preliminary recognition of each candidate region, which may be a potential traffic sign. The final step in recognition uses color combinations within each region and model matching. This system may be used for recognition of other types of objects, provided that the geometrical shape and color content remain reasonably constant. The method is reliable, easy to implement, and fast. This differs from the road sign recognition method in PROMETHEUS. The overall structure of the approach is sketched.
Effects of age and illumination on night driving: a road test.
Owens, D Alfred; Wood, Joanne M; Owens, Justin M
2007-12-01
This study investigated the effects of drivers' age and low light on speed, lane keeping, and visual recognition of typical roadway stimuli. Poor visibility, which is exacerbated by age-related changes in vision, is a leading contributor to fatal nighttime crashes. There is little evidence, however, concerning the extent to which drivers recognize and compensate for their visual limitations at night. Young, middle-aged, and elder participants drove on a closed road course in day and night conditions at a "comfortable" speed without speedometer information. During night tests, headlight intensity was varied over a range of 1.5 log units using neutral density filters. Average speed and recognition of road signs decreased significantly as functions of increased age and reduced illumination. Recognition of pedestrians at night was significantly enhanced by retroreflective markings of limb joints as compared with markings of the torso, and this benefit was greater for middle-aged and elder drivers. Lane keeping showed nonlinear effects of lighting, which interacted with task conditions and drivers' lateral bias, indicating that older drivers drove more cautiously in low light. Consistent with the hypothesis that drivers misjudge their visual abilities at night, participants of all age groups failed to compensate fully for diminished visual recognition abilities in low light, although older drivers behaved more cautiously than the younger groups. These findings highlight the importance of educating all road users about the limitations of night vision and provide new evidence that retroreflective markings of the limbs can be of great benefit to pedestrians' safety at night.
Recognition as Support for Reasoning about Horizontal Motion: A Further Resource for School Science?
ERIC Educational Resources Information Center
Howe, Christine; Taylor Tavares, Joana; Devine, Amy
2016-01-01
Background: Even infants can recognize whether patterns of motion are or are not natural, yet an acknowledged challenge for science education is to promote adequate reasoning about such patterns. Since research indicates linkage between the conceptual bases of recognition and reasoning, it seems possible that recognition can be engaged to support…
Recognition Of Complex Three Dimensional Objects Using Three Dimensional Moment Invariants
NASA Astrophysics Data System (ADS)
Sadjadi, Firooz A.
1985-01-01
A technique for the recognition of complex three dimensional objects is presented. The complex 3-D objects are represented in terms of their 3-D moment invariants, algebraic expressions that remain invariant independent of the 3-D objects' orientations and locations in the field of view. The technique of 3-D moment invariants has been used successfully for simple 3-D object recognition in the past. In this work we have extended this method for the representation of more complex objects. Two complex objects are represented digitally; their 3-D moment invariants have been calculated, and then the invariancy of these 3-D invariant moment expressions is verified by changing the orientation and the location of the objects in the field of view. The results of this study have significant impact on 3-D robotic vision, 3-D target recognition, scene analysis and artificial intelligence.
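The abstract does not list the invariant expressions it uses. As a minimal sketch of the idea, the code below computes three standard second-order quantities, the coefficients of the characteristic polynomial of the matrix of central second moments, and applies a rotation and translation to a point set so the values can be compared. Treating these as the invariants meant here is an assumption, and the function names and sample points are hypothetical.

```python
import math


def central_second_moments(points):
    """Central second-order moments of a 3D point set.
    Subtracting the centroid makes them independent of object location."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    mxx = myy = mzz = mxy = mxz = myz = 0.0
    for x, y, z in points:
        dx, dy, dz = x - cx, y - cy, z - cz
        mxx += dx * dx
        myy += dy * dy
        mzz += dz * dz
        mxy += dx * dy
        mxz += dx * dz
        myz += dy * dz
    return mxx, myy, mzz, mxy, mxz, myz


def moment_invariants(points):
    """Trace, sum of principal 2x2 minors, and determinant of the symmetric
    second-moment matrix. All three are unchanged by rotation."""
    mxx, myy, mzz, mxy, mxz, myz = central_second_moments(points)
    j1 = mxx + myy + mzz
    j2 = mxx * myy + mxx * mzz + myy * mzz - mxy ** 2 - mxz ** 2 - myz ** 2
    j3 = (mxx * myy * mzz + 2 * mxy * mxz * myz
          - mxx * myz ** 2 - myy * mxz ** 2 - mzz * mxy ** 2)
    return j1, j2, j3


def rotate_and_shift(point, alpha, beta, shift):
    """Rotate about the z axis by alpha, then about the x axis by beta,
    then translate by shift."""
    x, y, z = point
    x, y = (math.cos(alpha) * x - math.sin(alpha) * y,
            math.sin(alpha) * x + math.cos(alpha) * y)
    y, z = (math.cos(beta) * y - math.sin(beta) * z,
            math.sin(beta) * y + math.cos(beta) * z)
    return (x + shift[0], y + shift[1], z + shift[2])


# Hypothetical object sampled as a handful of 3D points.
shape = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 1.0, 0.0),
         (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
moved = [rotate_and_shift(p, 0.9, -0.4, (5.0, -2.0, 7.0)) for p in shape]

original = moment_invariants(shape)
transformed = moment_invariants(moved)  # agrees with `original` up to rounding
```

Comparing such values before and after a change of pose mirrors the verification step the abstract describes, in which the objects' orientation and location in the field of view are varied.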
Towards NIRS-based hand movement recognition.
Paleari, Marco; Luciani, Riccardo; Ariano, Paolo
2017-07-01
This work reports preliminary results on hand movement recognition with Near InfraRed Spectroscopy (NIRS) and surface ElectroMyoGraphy (sEMG). Whether based on physical contact (touchscreens, data-gloves, etc.), vision techniques (Microsoft Kinect, Sony PlayStation Move, etc.), or other modalities, hand movement recognition is a pervasive function in today's environment and is at the base of many gaming, social, and medical applications. Although in recent years the use of muscle information extracted by sEMG has spread from medical applications into the consumer world, this technique still falls short when dealing with movements of the hand. We tested NIRS as a technique to get another point of view on the muscle phenomena and showed that, within a specific selection of movements, NIRS can be used to recognize movements and return information regarding muscles at different depths. Furthermore, we propose here three different multimodal movement recognition approaches and compare their performance.
Learning discriminative features from RGB-D images for gender and ethnicity identification
NASA Astrophysics Data System (ADS)
Azzakhnini, Safaa; Ballihi, Lahoucine; Aboutajdine, Driss
2016-11-01
The development of sophisticated sensor technologies gave rise to an interesting variety of data. With the appearance of affordable devices, such as the Microsoft Kinect, depth-maps and three-dimensional data became easily accessible. This attracted many computer vision researchers seeking to exploit this information in classification and recognition tasks. In this work, the problem of face classification in the context of RGB images and depth information (RGB-D images) is addressed. The purpose of this paper is to study and compare some popular techniques for gender recognition and ethnicity classification to understand how much depth data can improve the quality of recognition. Furthermore, we investigate which combination of face descriptors, feature selection methods, and learning techniques is best suited to better exploit RGB-D images. The experimental results show that depth data improve the recognition accuracy for gender and ethnicity classification applications in many use cases.
The Last Meter: Blind Visual Guidance to a Target.
Manduchi, Roberto; Coughlan, James M
2014-01-01
Smartphone apps can use object recognition software to provide information to blind or low vision users about objects in the visual environment. A crucial challenge for these users is aiming the camera properly to take a well-framed picture of the desired target object. We investigate the effects of two fundamental constraints of object recognition - frame rate and camera field of view - on a blind person's ability to use an object recognition smartphone app. The app was used by 18 blind participants to find visual targets beyond arm's reach and approach them to within 30 cm. While we expected that a faster frame rate or wider camera field of view should always improve search performance, our experimental results show that in many cases increasing the field of view does not help, and may even hurt, performance. These results have important implications for the design of object recognition systems for blind users.
33 CFR 105.210 - Facility personnel with security duties.
Code of Federal Regulations, 2011 CFR
2011-07-01
...: (a) Knowledge of current security threats and patterns; (b) Recognition and detection of dangerous substances and devices; (c) Recognition of characteristics and behavioral patterns of persons who are likely...