Sample records for recognition computer vision

  1. Can Humans Fly? Action Understanding with Multiple Classes of Actors

    DTIC Science & Technology

    2015-06-08

    recognition using structure from motion point clouds. In European Conference on Computer Vision, 2008. [5] R. Caruana. Multitask learning. Machine Learning...tonomous driving? The KITTI vision benchmark suite. In IEEE Conference on Computer Vision and Pattern Recognition, 2012. [12] L. Gorelick, M. Blank

  2. Deep hierarchies in the primate visual cortex: what can we learn for computer vision?

    PubMed

    Krüger, Norbert; Janssen, Peter; Kalkan, Sinan; Lappe, Markus; Leonardis, Ales; Piater, Justus; Rodríguez-Sánchez, Antonio J; Wiskott, Laurenz

    2013-08-01

    Computational modeling of the primate visual system yields insights of potential relevance to some of the challenges that computer vision is facing, such as object recognition and categorization, motion detection and activity recognition, or vision-based navigation and manipulation. This paper reviews some functional principles and structures that are generally thought to underlie the primate visual cortex, and attempts to extract biological principles that could further advance computer vision research. Organized for a computer vision audience, we present functional principles of the processing hierarchies present in the primate visual system considering recent discoveries in neurophysiology. The hierarchical processing in the primate visual system is characterized by a sequence of different levels of processing (on the order of 10) that constitute a deep hierarchy in contrast to the flat vision architectures predominantly used in today's mainstream computer vision. We hope that the functional description of the deep hierarchies realized in the primate visual system provides valuable insights for the design of computer vision algorithms, fostering increasingly productive interaction between biological and computer vision research.

  3. Container-code recognition system based on computer vision and deep neural networks

    NASA Astrophysics Data System (ADS)

    Liu, Yi; Li, Tianjian; Jiang, Li; Liang, Xiaoyao

    2018-04-01

    Automatic container-code recognition has become a crucial requirement for the ship transportation industry in recent years. In this paper, an automatic container-code recognition system based on computer vision and deep neural networks is proposed. The system consists of two modules: a detection module and a recognition module. The detection module applies both computer-vision-based and neural-network-based algorithms, and combines their results to obtain better detections that avoid the drawbacks of either method alone. The combined detection results are also collected for online training of the neural networks. The recognition module exploits both character segmentation and end-to-end recognition, and outputs the recognition result that passes verification. When the recognition module generates a false recognition, the result is corrected and collected for online training of the end-to-end recognition sub-module. By combining several algorithms, the system is able to deal with more situations, and the online training mechanism improves the performance of the neural networks at runtime. The proposed system achieves an overall recognition accuracy of 93%.
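
    The two-module design above (a classical detector and a neural detector whose outputs are combined, with agreed detections recycled as training data) can be illustrated with a minimal sketch. The box-merging rule, function names, and IoU threshold below are illustrative assumptions, not the paper's actual implementation.

    ```python
    # Hypothetical sketch: merge detections from a classical CV detector and a
    # neural-network detector, keeping boxes the two methods agree on (high IoU).
    def iou(a, b):
        """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / float(area_a + area_b - inter)

    def combine_detections(cv_boxes, nn_boxes, thresh=0.5):
        """Keep NN boxes confirmed by a classical detection (illustrative rule)."""
        return [nb for nb in nn_boxes
                if any(iou(nb, cb) >= thresh for cb in cv_boxes)]

    # Confirmed boxes could then be queued as extra training samples,
    # mirroring the paper's online-training idea.
    ```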

  4. Deep Learning for Computer Vision: A Brief Review

    PubMed Central

    Doulamis, Nikolaos; Doulamis, Anastasios; Protopapadakis, Eftychios

    2018-01-01

    Over the last few years, deep learning methods have been shown to outperform previous state-of-the-art machine learning techniques in several fields, with computer vision being one of the most prominent cases. This review paper provides a brief overview of some of the most significant deep learning schemes used in computer vision problems, that is, Convolutional Neural Networks, Deep Boltzmann Machines and Deep Belief Networks, and Stacked Denoising Autoencoders. A brief account of their history, structure, advantages, and limitations is given, followed by a description of their applications in various computer vision tasks, such as object detection, face recognition, action and activity recognition, and human pose estimation. Finally, a brief overview is given of future directions in designing deep learning schemes for computer vision problems and the challenges involved therein. PMID:29487619

  5. Real-time unconstrained object recognition: a processing pipeline based on the mammalian visual system.

    PubMed

    Aguilar, Mario; Peot, Mark A; Zhou, Jiangying; Simons, Stephen; Liao, Yuwei; Metwalli, Nader; Anderson, Mark B

    2012-03-01

    The mammalian visual system is still the gold standard for recognition accuracy, flexibility, efficiency, and speed. Ongoing advances in our understanding of function and mechanisms in the visual system can now be leveraged to pursue the design of computer vision architectures that will revolutionize the state of the art in computer vision.

  6. Gender Recognition from Human-Body Images Using Visible-Light and Thermal Camera Videos Based on a Convolutional Neural Network for Image Feature Extraction

    PubMed Central

    Nguyen, Dat Tien; Kim, Ki Wan; Hong, Hyung Gil; Koo, Ja Hyung; Kim, Min Cheol; Park, Kang Ryoung

    2017-01-01

    Extracting powerful image features plays an important role in computer vision systems. Many methods have previously been proposed to extract image features for various computer vision applications, such as the scale-invariant feature transform (SIFT), speed-up robust feature (SURF), local binary patterns (LBP), histogram of oriented gradients (HOG), and weighted HOG. Recently, the convolutional neural network (CNN) method for image feature extraction and classification in computer vision has been used in various applications. In this research, we propose a new gender recognition method for recognizing males and females in observation scenes of surveillance systems based on feature extraction from visible-light and thermal camera videos through CNN. Experimental results confirm the superiority of our proposed method over state-of-the-art recognition methods for the gender recognition problem using human body images. PMID:28335510

  7. Gender Recognition from Human-Body Images Using Visible-Light and Thermal Camera Videos Based on a Convolutional Neural Network for Image Feature Extraction.

    PubMed

    Nguyen, Dat Tien; Kim, Ki Wan; Hong, Hyung Gil; Koo, Ja Hyung; Kim, Min Cheol; Park, Kang Ryoung

    2017-03-20

    Extracting powerful image features plays an important role in computer vision systems. Many methods have previously been proposed to extract image features for various computer vision applications, such as the scale-invariant feature transform (SIFT), speed-up robust feature (SURF), local binary patterns (LBP), histogram of oriented gradients (HOG), and weighted HOG. Recently, the convolutional neural network (CNN) method for image feature extraction and classification in computer vision has been used in various applications. In this research, we propose a new gender recognition method for recognizing males and females in observation scenes of surveillance systems based on feature extraction from visible-light and thermal camera videos through CNN. Experimental results confirm the superiority of our proposed method over state-of-the-art recognition methods for the gender recognition problem using human body images.

  8. Fusion of Multiple Sensing Modalities for Machine Vision

    DTIC Science & Technology

    1994-05-31

    Modeling of Non-Homogeneous 3-D Objects for Thermal and Visual Image Synthesis," Pattern Recognition, in press. [11] Nair, Dinesh, and J. K. Aggarwal...20th AIPR Workshop: Computer Vision--Meeting the Challenges, McLean, Virginia, October 1991. Nair, Dinesh, and J. K. Aggarwal, "An Object Recognition...Computer Engineering, August 1992. Sunil Gupta, Ph.D. Student; Mohan Kumar, M.S. Student; Sandeep Kumar, M.S. Student; Xavier Lebegue, Ph.D., Computer

  9. An overview of computer vision

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1982-01-01

    An overview of computer vision is provided. Image understanding and scene analysis are emphasized, and pertinent aspects of pattern recognition are treated. Reviewed are the basic approach to computer vision systems, the techniques utilized, applications, existing systems and state-of-the-art issues and research requirements, who is doing the work and who is funding it, and future trends and expectations.

  10. Parallel Architectures and Parallel Algorithms for Integrated Vision Systems. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Choudhary, Alok Nidhi

    1989-01-01

    Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is a system that uses vision algorithms from all levels of processing for a high-level application (e.g., object recognition). An IVS normally involves algorithms from low-level, intermediate-level, and high-level vision. Designing parallel architectures for vision systems is of tremendous interest to researchers. Several issues in parallel architectures and parallel algorithms for integrated vision systems are addressed.

  11. Parallel computer vision

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Uhr, L.

    1987-01-01

    This book is written by research scientists involved in the development of massively parallel, but hierarchically structured, algorithms, architectures, and programs for image processing, pattern recognition, and computer vision. The book gives an integrated picture of the programs and algorithms that are being developed, and also of the multi-computer hardware architectures for which these systems are designed.

  12. Exploring Techniques for Vision Based Human Activity Recognition: Methods, Systems, and Evaluation

    PubMed Central

    Xu, Xin; Tang, Jinshan; Zhang, Xiaolong; Liu, Xiaoming; Zhang, Hong; Qiu, Yimin

    2013-01-01

    With the wide applications of vision based intelligent systems, image and video analysis technologies have attracted the attention of researchers in the computer vision field. In image and video analysis, human activity recognition is an important research direction. By interpreting and understanding human activities, we can recognize and predict the occurrence of crimes and help the police or other agencies react immediately. In the past, a large number of papers have been published on human activity recognition in video and image sequences. In this paper, we provide a comprehensive survey of the recent development of the techniques, including methods, systems, and quantitative evaluation of the performance of human activity recognition. PMID:23353144

  13. Convolutional neural networks and face recognition task

    NASA Astrophysics Data System (ADS)

    Sochenkova, A.; Sochenkov, I.; Makovetskii, A.; Vokhmintsev, A.; Melnikov, A.

    2017-09-01

    Computer vision tasks have remained very important for the last couple of years. One of the most complicated problems in computer vision is face recognition, which can be used in security systems to provide safety and to identify a person among others. There is a variety of different approaches to solving this task, but there is still no universal solution that gives adequate results in all cases. The current paper presents the following approach. First, we extract the area containing the face; then we apply the Canny edge detector. At the next stage we use convolutional neural networks (CNN) to finally solve the face recognition and person identification task.
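
    A rough illustration of the pipeline described above (face extraction, then Canny edges, then a CNN) follows; the cascade detector, file names, and thresholds are assumptions for illustration, and the CNN stage is only indicated.

    ```python
    import cv2

    # Stage 1: extract the area containing the face (Haar cascade as one option).
    img = cv2.imread("subject.jpg")  # hypothetical input image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    for (x, y, w, h) in faces:
        face = gray[y:y + h, x:x + w]
        # Stage 2: Canny edge detection on the cropped face region.
        edges = cv2.Canny(face, 100, 200)
        # Stage 3 (not shown): the edge map would be resized and fed to a CNN
        # for identification, as the paper describes.
    ```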

  14. Toward open set recognition.

    PubMed

    Scheirer, Walter J; de Rezende Rocha, Anderson; Sapkota, Archana; Boult, Terrance E

    2013-07-01

    To date, almost all experimental evaluations of machine learning-based recognition algorithms in computer vision have taken the form of "closed set" recognition, whereby all testing classes are known at training time. A more realistic scenario for vision applications is "open set" recognition, where incomplete knowledge of the world is present at training time, and unknown classes can be submitted to an algorithm during testing. This paper explores the nature of open set recognition and formalizes its definition as a constrained minimization problem. The open set recognition problem is not well addressed by existing algorithms because it requires strong generalization. As a step toward a solution, we introduce a novel "1-vs-set machine," which sculpts a decision space from the marginal distances of a 1-class or binary SVM with a linear kernel. This methodology applies to several different applications in computer vision where open set recognition is a challenging problem, including object recognition and face verification. We consider both in this work, with large scale cross-dataset experiments performed over the Caltech 256 and ImageNet sets, as well as face matching experiments performed over the Labeled Faces in the Wild set. The experiments highlight the effectiveness of machines adapted for open set evaluation compared to existing 1-class and binary SVMs for the same tasks.
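
    The "1-vs-set machine" sculpts the decision space of a linear SVM; as a loose, simplified illustration (not the authors' algorithm), the sketch below bounds the positive region between two parallel hyperplanes so that samples outside the slab are rejected as unknown. Data and thresholds are synthetic placeholders.

    ```python
    import numpy as np
    from sklearn.svm import LinearSVC

    # Train a linear SVM with one "known" positive class vs. known negatives.
    rng = np.random.RandomState(0)
    X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
    y = np.array([0] * 50 + [1] * 50)
    svm = LinearSVC(C=1.0).fit(X, y)

    def slab_predict(X_test, near=0.0, far=2.0):
        """Accept the positive class only between two parallel hyperplanes.

        Loosely mimics the 1-vs-set idea of bounding the positive half-space;
        samples outside the slab (including far-away unknowns) get label -1.
        """
        scores = svm.decision_function(X_test)
        return np.where((scores > near) & (scores < far), 1, -1)

    print(slab_predict(np.array([[4, 4], [40, 40], [0, 0]])))
    ```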

  15. AstroCV: Astronomy computer vision library

    NASA Astrophysics Data System (ADS)

    González, Roberto E.; Muñoz, Roberto P.; Hernández, Cristian A.

    2018-04-01

    AstroCV processes and analyzes big astronomical datasets, and is intended to provide a community repository of high performance Python and C++ algorithms used for image processing and computer vision. The library offers methods for object recognition, segmentation and classification, with emphasis in the automatic detection and classification of galaxies.

  16. Image processing and pattern recognition with CVIPtools MATLAB toolbox: automatic creation of masks for veterinary thermographic images

    NASA Astrophysics Data System (ADS)

    Mishra, Deependra K.; Umbaugh, Scott E.; Lama, Norsang; Dahal, Rohini; Marino, Dominic J.; Sackman, Joseph

    2016-09-01

    CVIPtools is a software package for the exploration of computer vision and image processing, developed in the Computer Vision and Image Processing Laboratory at Southern Illinois University Edwardsville. CVIPtools is available in three variants - a) the CVIPtools Graphical User Interface, b) the CVIPtools C library and c) the CVIPtools MATLAB toolbox - which makes it accessible to a variety of different users. It offers students, faculty, researchers and any other user a free and easy way to explore computer vision and image processing techniques. Many functions have been implemented and are updated on a regular basis, and the library has reached a level of sophistication that makes it suitable for both educational and research purposes. In this paper, a detailed list of the functions available in the CVIPtools MATLAB toolbox is presented, along with how these functions can be used in image analysis and computer vision applications. The CVIPtools MATLAB toolbox allows the user to gain practical experience to better understand underlying theoretical problems in image processing and pattern recognition. As an example application, the algorithm for the automatic creation of masks for veterinary thermographic images is presented.

  17. Comparing visual representations across human fMRI and computational vision

    PubMed Central

    Leeds, Daniel D.; Seibert, Darren A.; Pyles, John A.; Tarr, Michael J.

    2013-01-01

    Feedforward visual object perception recruits a cortical network that is assumed to be hierarchical, progressing from basic visual features to complete object representations. However, the nature of the intermediate features related to this transformation remains poorly understood. Here, we explore how well different computer vision recognition models account for neural object encoding across the human cortical visual pathway as measured using fMRI. These neural data, collected during the viewing of 60 images of real-world objects, were analyzed with a searchlight procedure as in Kriegeskorte, Goebel, and Bandettini (2006): Within each searchlight sphere, the obtained patterns of neural activity for all 60 objects were compared to model responses for each computer recognition algorithm using representational dissimilarity analysis (Kriegeskorte et al., 2008). Although each of the computer vision methods significantly accounted for some of the neural data, among the different models, the scale invariant feature transform (Lowe, 2004), encoding local visual properties gathered from “interest points,” was best able to accurately and consistently account for stimulus representations within the ventral pathway. More generally, when present, significance was observed in regions of the ventral-temporal cortex associated with intermediate-level object perception. Differences in model effectiveness and the neural location of significant matches may be attributable to the fact that each model implements a different featural basis for representing objects (e.g., more holistic or more parts-based). Overall, we conclude that well-known computer vision recognition systems may serve as viable proxies for theories of intermediate visual object representation. PMID:24273227
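
    The representational dissimilarity analysis used here compares a neural RDM with a model RDM; a minimal sketch of that comparison, with random arrays standing in for the 60-stimulus voxel patterns and model features, is shown below.

    ```python
    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import spearmanr

    n_stimuli = 60
    neural = np.random.rand(n_stimuli, 200)  # stand-in: voxel pattern per image
    model = np.random.rand(n_stimuli, 128)   # stand-in: model features per image

    # RDM: pairwise dissimilarity (1 - Pearson r) between stimulus patterns.
    neural_rdm = pdist(neural, metric="correlation")
    model_rdm = pdist(model, metric="correlation")

    # Compare the two RDMs with Spearman rank correlation, as is common in RSA.
    rho, p = spearmanr(neural_rdm, model_rdm)
    print(f"model-neural RDM correlation: rho={rho:.3f}, p={p:.3g}")
    ```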

  18. Computer vision

    NASA Technical Reports Server (NTRS)

    Gennery, D.; Cunningham, R.; Saund, E.; High, J.; Ruoff, C.

    1981-01-01

    The field of computer vision is surveyed and assessed, key research issues are identified, and possibilities for a future vision system are discussed. The problems of descriptions of two and three dimensional worlds are discussed. The representation of such features as texture, edges, curves, and corners are detailed. Recognition methods are described in which cross correlation coefficients are maximized or numerical values for a set of features are measured. Object tracking is discussed in terms of the robust matching algorithms that must be devised. Stereo vision, camera control and calibration, and the hardware and systems architecture are discussed.

  19. Reinforcement learning in computer vision

    NASA Astrophysics Data System (ADS)

    Bernstein, A. V.; Burnaev, E. V.

    2018-04-01

    Nowadays, machine learning has become one of the basic technologies used in solving various computer vision tasks such as feature detection, image segmentation, object recognition and tracking. In many applications, complex systems such as robots are equipped with visual sensors from which they learn the state of the surrounding environment by solving the corresponding computer vision tasks. The solutions of these tasks are used for making decisions about possible future actions. It is not surprising that when solving computer vision tasks we should take into account special aspects of their subsequent application in model-based predictive control. Reinforcement learning is one of the modern machine learning technologies in which learning is carried out through interaction with the environment. In recent years, reinforcement learning has been used both for solving applied tasks such as the processing and analysis of visual information, and for solving specific computer vision problems such as filtering, extracting image features, localizing objects in scenes, and many others. The paper briefly describes reinforcement learning technology and its use for solving computer vision problems.

  20. Vision-guided gripping of a cylinder

    NASA Technical Reports Server (NTRS)

    Nicewarner, Keith E.; Kelley, Robert B.

    1991-01-01

    The motivation for vision-guided servoing is taken from tasks in automated or telerobotic space assembly and construction. Vision-guided servoing requires the ability to perform rapid pose estimates and provide predictive feature tracking. Monocular information from a gripper-mounted camera is used to servo the gripper to grasp a cylinder. The procedure is divided into recognition and servo phases. The recognition stage verifies the presence of a cylinder in the camera field of view. Then an initial pose estimate is computed and uncluttered scan regions are selected. The servo phase processes only the selected scan regions of the image. Given the knowledge, from the recognition phase, that there is a cylinder in the image and knowing the radius of the cylinder, 4 of the 6 pose parameters can be estimated with minimal computation. The relative motion of the cylinder is obtained by using the current pose and prior pose estimates. The motion information is then used to generate a predictive feature-based trajectory for the path of the gripper.

  1. 3D visual mechanism by neural networkings

    NASA Astrophysics Data System (ADS)

    Sugiyama, Shigeki

    2007-04-01

    Some computer vision systems are available on the market, but they are quite far from real use in daily life, whether for security guarding or for recognizing the behaviour of a target object. Such sensing of the surroundings may require recognizing a detailed description of an object, such as the distance to the object, its detailed figure, and its edges, and the present recognition systems cannot give a clear picture of the mechanisms behind these. For this purpose, this paper studies the mechanisms by which a pair of human eyes recognizes distance, object edges, and objects themselves, in order to extract the basic essences of vision mechanisms. These basic mechanisms of object recognition are then simplified and extended logically for application to a computer vision system. Some of the results of these studies are introduced in this paper.

  2. ICPR-2016 - International Conference on Pattern Recognition

    Science.gov Websites

    Website snippet: ICPR 2016 paper awards (Best Piero Zamperoni Student Paper: "...-Paced Dictionary Learning for Cross-Domain Retrieval and Recognition", Xu, Dan; Song, Jingkuan; Alameda...), invited speakers ("...Learning for Scene Understanding"), and discussions on recent advances in the fields of Pattern Recognition, Machine Learning and Computer Vision.

  3. The research of edge extraction and target recognition based on inherent feature of objects

    NASA Astrophysics Data System (ADS)

    Xie, Yu-chan; Lin, Yu-chi; Huang, Yin-guo

    2008-03-01

    Current research on computer vision often needs specific techniques for particular problems. Little use has been made of high-level aspects of computer vision, such as three-dimensional (3D) object recognition, that are appropriate for large classes of problems and situations. In particular, high-level vision often focuses mainly on the extraction of symbolic descriptions, and pays little attention to the speed of processing. In order to extract and recognize targets intelligently and rapidly, in this paper we developed a new 3D target recognition method based on the inherent features of objects, in which a cuboid was taken as the model. On the basis of an analysis of the cuboid's natural contour and gray-level distribution characteristics, an overall fuzzy evaluation technique was utilized to recognize and segment the target. Then the Hough transform was used to extract and match the model's main edges, and finally we reconstruct the target edges by stereo techniques. There are three major contributions in this paper. First, the corresponding relations between the parameters of the cuboid model's straight edge lines in the image field and in the transform field were summarized. With these, the aimless computations and searches in Hough transform processing can be reduced greatly and the efficiency improved. Second, as the a priori knowledge about the geometric character of the cuboid's contour is already known, the intersections of the extracted component edges are taken, and the geometry of candidate edge matches is assessed based on the intersections rather than on the extracted edges themselves. The outlines are thereby enhanced and the noise is suppressed. Finally, a 3D target recognition method is proposed. Compared with other recognition methods, this new method has a quick response time and can be achieved with high-level computer vision. The method presented here can be used widely in vision-guided techniques to strengthen their intelligence and generalization, and can also play an important role in object tracking, port AGVs, and robotics. The results of simulation experiments and theoretical analysis demonstrate that the proposed method can suppress noise effectively, extract target edges robustly, and meet real-time needs.
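
    For the Hough-transform step, edge pixels vote in a parameter space so that collinear points accumulate on the same line; the generic OpenCV sketch below shows plain line extraction (the paper's contribution, constraining the search using the cuboid's edge-parameter relations, is not reproduced). The file name and thresholds are illustrative.

    ```python
    import cv2
    import numpy as np

    img = cv2.imread("cuboid.png")  # hypothetical input scene
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)

    # Probabilistic Hough transform: returns line segments as (x1, y1, x2, y2).
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                            minLineLength=40, maxLineGap=5)
    if lines is not None:
        for x1, y1, x2, y2 in lines[:, 0]:
            cv2.line(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
    # The paper restricts this search using known relations between a cuboid's
    # edge parameters, which a general-purpose sketch cannot show.
    ```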

  4. Artificial intelligence, expert systems, computer vision, and natural language processing

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1984-01-01

    An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.

  5. Pattern recognition for passive polarimetric data using nonparametric classifiers

    NASA Astrophysics Data System (ADS)

    Thilak, Vimal; Saini, Jatinder; Voelz, David G.; Creusere, Charles D.

    2005-08-01

    Passive polarization-based imaging is a useful tool in computer vision and pattern recognition. A passive polarization imaging system forms a polarimetric image from the reflection of ambient light that contains useful information for computer vision tasks such as object detection (classification) and recognition. Applications of polarization-based pattern recognition include material classification and automatic shape recognition. In this paper, we present two target detection algorithms for images captured by a passive polarimetric imaging system. The proposed detection algorithms are based on Bayesian decision theory. In these approaches, an object can belong to one of any given number of classes, and classification involves making decisions that minimize the average probability of making incorrect decisions. This minimum is achieved by assigning an object to the class that maximizes the a posteriori probability. Computing a posteriori probabilities requires estimates of class-conditional probability density functions (likelihoods) and prior probabilities. A probabilistic neural network (PNN), which is a nonparametric method that can compute Bayes-optimal boundaries, and a k-nearest neighbor (KNN) classifier are used for density estimation and classification. The proposed algorithms are applied to polarimetric image data gathered in the laboratory with a liquid crystal-based system. The experimental results validate the effectiveness of the above algorithms for target detection from polarimetric data.
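
    The decision rule underlying both classifiers is the MAP rule: assign x to the class c maximizing p(x|c)P(c). A minimal sketch using a Gaussian kernel density estimate as the nonparametric likelihood (a Parzen-window scheme closely related to the PNN) follows; the data are synthetic placeholders.

    ```python
    import numpy as np
    from sklearn.neighbors import KernelDensity

    rng = np.random.RandomState(1)
    classes = {
        0: rng.normal(0.0, 1.0, (100, 2)),  # synthetic polarimetric features
        1: rng.normal(3.0, 1.0, (100, 2)),
    }
    priors = {c: len(X) / 200.0 for c, X in classes.items()}
    kdes = {c: KernelDensity(bandwidth=0.5).fit(X) for c, X in classes.items()}

    def bayes_classify(x):
        """Pick the class maximizing log p(x|c) + log P(c) (MAP decision)."""
        x = np.atleast_2d(x)
        scores = {c: kdes[c].score_samples(x)[0] + np.log(priors[c])
                  for c in classes}
        return max(scores, key=scores.get)

    print(bayes_classify([2.5, 2.0]))  # closer to class 1's density -> 1
    ```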

  6. Application of the SP theory of intelligence to the understanding of natural vision and the development of computer vision.

    PubMed

    Wolff, J Gerard

    2014-01-01

    The SP theory of intelligence aims to simplify and integrate concepts in computing and cognition, with information compression as a unifying theme. This article is about how the SP theory may, with advantage, be applied to the understanding of natural vision and the development of computer vision. Potential benefits include an overall simplification of concepts in a universal framework for knowledge and seamless integration of vision with other sensory modalities and other aspects of intelligence. Low level perceptual features such as edges or corners may be identified by the extraction of redundancy in uniform areas in the manner of the run-length encoding technique for information compression. The concept of multiple alignment in the SP theory may be applied to the recognition of objects, and to scene analysis, with a hierarchy of parts and sub-parts, at multiple levels of abstraction, and with family-resemblance or polythetic categories. The theory has potential for the unsupervised learning of visual objects and classes of objects, and suggests how coherent concepts may be derived from fragments. As in natural vision, both recognition and learning in the SP system are robust in the face of errors of omission, commission and substitution. The theory suggests how, via vision, we may piece together a knowledge of the three-dimensional structure of objects and of our environment, it provides an account of how we may see things that are not objectively present in an image, how we may recognise something despite variations in the size of its retinal image, and how raster graphics and vector graphics may be unified. And it has things to say about the phenomena of lightness constancy and colour constancy, the role of context in recognition, ambiguities in visual perception, and the integration of vision with other senses and other aspects of intelligence.
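
    The run-length encoding mentioned above compresses uniform regions by storing each value once with a repeat count, so the residual structure marks region boundaries (edges). A tiny sketch over one image row:

    ```python
    def run_length_encode(row):
        """Encode a sequence as (value, run_length) pairs."""
        runs = []
        for v in row:
            if runs and runs[-1][0] == v:
                runs[-1][1] += 1
            else:
                runs.append([v, 1])
        return [tuple(r) for r in runs]

    # A uniform image row compresses to a handful of runs; in SP terms, the
    # redundancy removed within uniform areas leaves the boundaries (edges).
    print(run_length_encode([0, 0, 0, 0, 255, 255, 0, 0]))
    # -> [(0, 4), (255, 2), (0, 2)]
    ```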

  7. Modeling Interval Temporal Dependencies for Complex Activities Understanding

    DTIC Science & Technology

    2013-10-11

    SUBJECT TERMS: Human activity modeling...computer vision applications: human activity recognition and facial activity recognition. The results demonstrate the superior performance of the

  8. Image Classification for Web Genre Identification

    DTIC Science & Technology

    2012-01-01

    recognition and landscape detection using the computer vision toolkit OpenCV. For facial recognition, we researched the possibilities of using the...method for connecting these names with a face/personal photo and logo respectively. METHODOLOGY: For this project, we focused primarily on facial

  9. Automatic Mexican sign language and digits recognition using normalized central moments

    NASA Astrophysics Data System (ADS)

    Solís, Francisco; Martínez, David; Espinosa, Oscar; Toxqui, Carina

    2016-09-01

    This work presents a framework for automatic Mexican sign language and digit recognition based on a computer vision system using normalized central moments and artificial neural networks. Images are captured by a digital IP camera, with four LED reflectors and a green background in order to reduce computational costs and avoid the use of special gloves. 42 normalized central moments are computed per frame and used in a multi-layer perceptron to recognize each database. Four versions per sign and digit were used in the training phase. Recognition rates of 93% and 95% were achieved for Mexican sign language and digits, respectively.
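
    Normalized central moments are translation- and scale-invariant shape statistics, η_pq = μ_pq / μ_00^((p+q)/2 + 1), where μ_pq are central moments about the centroid. A brief NumPy sketch of the computation (illustrative only; the paper computes 42 such moments per frame):

    ```python
    import numpy as np

    def normalized_central_moment(img, p, q):
        """eta_pq = mu_pq / mu_00**((p+q)/2 + 1) for a 2-D intensity image."""
        img = img.astype(float)
        y, x = np.mgrid[:img.shape[0], :img.shape[1]]
        m00 = img.sum()
        xc, yc = (x * img).sum() / m00, (y * img).sum() / m00  # centroid
        mu_pq = ((x - xc) ** p * (y - yc) ** q * img).sum()    # central moment
        return mu_pq / m00 ** ((p + q) / 2.0 + 1.0)

    # Example on a toy binary mask (placeholder for a segmented hand sign).
    mask = np.zeros((32, 32))
    mask[8:24, 10:20] = 1
    print(normalized_central_moment(mask, 2, 0))
    ```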

  10. Deep Neural Networks: A New Framework for Modeling Biological Vision and Brain Information Processing.

    PubMed

    Kriegeskorte, Nikolaus

    2015-11-24

    Recent advances in neural network modeling have enabled major strides in computer vision and other artificial intelligence applications. Human-level visual recognition abilities are coming within reach of artificial systems. Artificial neural networks are inspired by the brain, and their computations could be implemented in biological neurons. Convolutional feedforward networks, which now dominate computer vision, take further inspiration from the architecture of the primate visual hierarchy. However, the current models are designed with engineering goals, not to model brain computations. Nevertheless, initial studies comparing internal representations between these models and primate brains find surprisingly similar representational spaces. With human-level performance no longer out of reach, we are entering an exciting new era, in which we will be able to build biologically faithful feedforward and recurrent computational models of how biological brains perform high-level feats of intelligence, including vision.

  11. CT Image Sequence Analysis for Object Recognition - A Rule-Based 3-D Computer Vision System

    Treesearch

    Dongping Zhu; Richard W. Conners; Daniel L. Schmoldt; Philip A. Araman

    1991-01-01

    Research is now underway to create a vision system for hardwood log inspection using a knowledge-based approach. In this paper, we present a rule-based, 3-D vision system for locating and identifying wood defects using topological, geometric, and statistical attributes. A number of different features can be derived from the 3-D input scenes. These features and evidence...

  12. Heterogeneous compute in computer vision: OpenCL in OpenCV

    NASA Astrophysics Data System (ADS)

    Gasparakis, Harris

    2014-02-01

    We explore the relevance of Heterogeneous System Architecture (HSA) in Computer Vision, both as a long-term vision and as a near-term emerging reality via the recently ratified OpenCL 2.0 Khronos standard. After a brief review of OpenCL 1.2 and 2.0, including HSA features such as Shared Virtual Memory (SVM) and platform atomics, we identify which genres of Computer Vision workloads stand to benefit by leveraging those features, and we suggest a new mental framework that replaces GPU compute with hybrid HSA APU compute. As a case in point, we discuss, in some detail, popular object recognition algorithms (part-based models), emphasizing the interplay and concurrent collaboration between the GPU and CPU. We conclude by describing how OpenCL has been incorporated into OpenCV, a popular open source computer vision library, emphasizing recent work on the Transparent API, to appear in OpenCV 3.0, which unifies the native CPU and OpenCL execution paths under a single API, allowing the same code to execute either on the CPU or on an OpenCL-enabled device, without even recompiling.
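
    OpenCV exposes the Transparent API through cv2.UMat: wrapping an image in a UMat routes the same calls through OpenCL when a device is available. A small sketch (input file name assumed):

    ```python
    import cv2

    # Wrapping the image in a UMat routes subsequent calls through OpenCL
    # (if available); the code is otherwise identical to the CPU path.
    src = cv2.UMat(cv2.imread("input.png"))
    gray = cv2.cvtColor(src, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    result = edges.get()  # download back to an ordinary NumPy array

    # The CPU-only version differs only in the first line:
    # src = cv2.imread("input.png")
    ```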

  13. Computers for the Disabled.

    ERIC Educational Resources Information Center

    Lazzaro, Joseph J.

    1993-01-01

    Describes adaptive technology for personal computers that accommodate disabled users and may require special equipment including hardware, memory, expansion slots, and ports. Highlights include vision aids, including speech synthesizers, magnification, braille, and optical character recognition (OCR); hearing adaptations; motor-impaired…

  14. Object and Facial Recognition in Augmented and Virtual Reality: Investigation into Software, Hardware and Potential Uses

    NASA Technical Reports Server (NTRS)

    Schulte, Erin

    2017-01-01

    As augmented and virtual reality grows in popularity, and more researchers focus on its development, other fields of technology have grown in the hopes of integrating with the up-and-coming hardware currently on the market. Namely, there has been a focus on how to make an intuitive, hands-free human-computer interaction (HCI) system utilizing AR and VR that allows users to control their technology with little to no physical interaction with hardware. Computer vision, which is utilized in devices such as the Microsoft Kinect, webcams, and other similar hardware, has shown potential in assisting with the development of an HCI system that requires next to no human interaction with computing hardware and software. Object and facial recognition are two subsets of computer vision, both of which can be applied to HCI systems in the fields of medicine, security, industrial development, and other similar areas.

  15. Design of an efficient framework for fast prototyping of customized human-computer interfaces and virtual environments for rehabilitation.

    PubMed

    Avola, Danilo; Spezialetti, Matteo; Placidi, Giuseppe

    2013-06-01

    Rehabilitation is often required after stroke, surgery, or degenerative diseases. It has to be specific for each patient and can be easily calibrated if assisted by human-computer interfaces and virtual reality. Recognition and tracking of different human body landmarks represent the basic features for the design of the next generation of human-computer interfaces. The most advanced systems for capturing human gestures are focused on vision-based techniques which, on the one hand, may require compromises from real-time and spatial precision and, on the other hand, ensure natural interaction experience. The integration of vision-based interfaces with thematic virtual environments encourages the development of novel applications and services regarding rehabilitation activities. The algorithmic processes involved during gesture recognition activity, as well as the characteristics of the virtual environments, can be developed with different levels of accuracy. This paper describes the architectural aspects of a framework supporting real-time vision-based gesture recognition and virtual environments for fast prototyping of customized exercises for rehabilitation purposes. The goal is to provide the therapist with a tool for fast implementation and modification of specific rehabilitation exercises for specific patients, during functional recovery. Pilot examples of designed applications and preliminary system evaluation are reported and discussed. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  16. The software for automatic creation of the formal grammars used by speech recognition, computer vision, editable text conversion systems, and some new functions

    NASA Astrophysics Data System (ADS)

    Kardava, Irakli; Tadyszak, Krzysztof; Gulua, Nana; Jurga, Stefan

    2017-02-01

    For more flexible environmental perception by artificial intelligence, supporting software modules are needed that can automate the creation of a specific language syntax and perform further analysis for relevant decisions based on semantic functions. With our proposed approach, implemented here, it is possible to create pairs of formal rules from given sentences (in the case of natural languages) or statements (in the case of special languages) with the help of a computer vision, speech recognition, or editable text conversion system, for further automatic improvement. In other words, we have developed an approach that can significantly improve the automation of the training process of an artificial intelligence, which as a result will give it a higher level of self-development skills, independent of us (the users). On the basis of our approach we have developed a demo version of the software, which includes the algorithm and source code implementing all of the above-mentioned components (computer vision, speech recognition, and an editable text conversion system). The program is able to work in multi-stream mode and simultaneously create a syntax based on information received from several sources.

  17. Target recognition and scene interpretation in image/video understanding systems based on network-symbolic models

    NASA Astrophysics Data System (ADS)

    Kuvich, Gary

    2004-08-01

    Vision is only a part of a system that converts visual information into knowledge structures. These structures drive the vision process, resolving ambiguity and uncertainty via feedback, and provide image understanding, which is an interpretation of visual information in terms of these knowledge models. These mechanisms provide reliable recognition when an object is occluded or cannot be recognized as a whole. It is hard to split the entire system apart, and reliable solutions to target recognition problems are possible only within the solution of a more generic Image Understanding Problem. The brain reduces informational and computational complexities using implicit symbolic coding of features, hierarchical compression, and selective processing of visual information. Biologically inspired Network-Symbolic representation, in which both systematic structural/logical methods and neural/statistical methods are parts of a single mechanism, is the most feasible basis for such models. It converts visual information into relational Network-Symbolic structures, avoiding artificial precise computations of 3-dimensional models. Network-Symbolic Transformations derive abstract structures, which allows for invariant recognition of an object as an exemplar of a class. Active vision helps create consistent models. Attention, separation of figure from ground, and perceptual grouping are special kinds of network-symbolic transformations. Such Image/Video Understanding Systems will recognize targets reliably.

  18. ATR applications of minimax entropy models of texture and shape

    NASA Astrophysics Data System (ADS)

    Zhu, Song-Chun; Yuille, Alan L.; Lanterman, Aaron D.

    2001-10-01

    Concepts from information theory have recently found favor in both the mainstream computer vision community and the military automatic target recognition community. In the computer vision literature, the principles of minimax entropy learning theory have been used to generate rich probabilitistic models of texture and shape. In addition, the method of types and large deviation theory has permitted the difficulty of various texture and shape recognition tasks to be characterized by 'order parameters' that determine how fundamentally vexing a task is, independent of the particular algorithm used. These information-theoretic techniques have been demonstrated using traditional visual imagery in applications such as simulating cheetah skin textures and such as finding roads in aerial imagery. We discuss their application to problems in the specific application domain of automatic target recognition using infrared imagery. We also review recent theoretical and algorithmic developments which permit learning minimax entropy texture models for infrared textures in reasonable timeframes.

  19. A Feasibility Study of View-independent Gait Identification

    DTIC Science & Technology

    2012-03-01

    ice skates. For walking, the footprint records for single pixels form clusters that are well separated in space and time. (Any overlap of contact...Pattern Recognition 2007, 1-8. Cheng M-H, Ho M-F & Huang C-L (2008), "Gait Analysis for Human Identification Through Manifold Learning and HMM"... Learning and Cybernetics 2005, 4516-4521. Moeslund T B & Granum E (2001), "A Survey of Computer Vision-Based Human Motion Capture", Computer Vision

  20. Invariant visual object recognition and shape processing in rats

    PubMed Central

    Zoccolan, Davide

    2015-01-01

    Invariant visual object recognition is the ability to recognize visual objects despite the vastly different images that each object can project onto the retina during natural vision, depending on its position and size within the visual field, its orientation relative to the viewer, etc. Achieving invariant recognition represents such a formidable computational challenge that it is often assumed to be a unique hallmark of primate vision. Historically, this has limited the invasive investigation of its neuronal underpinnings to monkey studies, in spite of the narrow range of experimental approaches that these animal models allow. Meanwhile, rodents have been largely neglected as models of object vision, because of the widespread belief that they are incapable of advanced visual processing. However, the powerful array of experimental tools that have been developed to dissect neuronal circuits in rodents has made these species very attractive to vision scientists too, promoting a new tide of studies that have started to systematically explore visual functions in rats and mice. Rats, in particular, have been the subjects of several behavioral studies aimed at assessing how advanced object recognition and shape processing is in this species. Here, I review these recent investigations, as well as earlier studies of rat pattern vision, to provide a historical overview and a critical summary of the status of the knowledge about rat object vision. The picture emerging from this survey is very encouraging with regard to the possibility of using rats as complementary models to monkeys in the study of higher-level vision. PMID:25561421

  1. Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades.

    PubMed

    Orchard, Garrick; Jayawant, Ajinkya; Cohen, Gregory K; Thakor, Nitish

    2015-01-01

    Creating datasets for Neuromorphic Vision is a challenging task. A lack of available recordings from Neuromorphic Vision sensors means that data must typically be recorded specifically for dataset creation rather than collecting and labeling existing data. The task is further complicated by a desire to simultaneously provide traditional frame-based recordings to allow for direct comparison with traditional Computer Vision algorithms. Here we propose a method for converting existing Computer Vision static image datasets into Neuromorphic Vision datasets using an actuated pan-tilt camera platform. Moving the sensor rather than the scene or image is a more biologically realistic approach to sensing and eliminates timing artifacts introduced by monitor updates when simulating motion on a computer monitor. We present conversion of two popular image datasets (MNIST and Caltech101) which have played important roles in the development of Computer Vision, and we provide performance metrics on these datasets using spike-based recognition algorithms. This work contributes datasets for future use in the field, as well as results from spike-based algorithms against which future works can compare. Furthermore, by converting datasets already popular in Computer Vision, we enable more direct comparison with frame-based approaches.

  2. Fast neuromimetic object recognition using FPGA outperforms GPU implementations.

    PubMed

    Orchard, Garrick; Martin, Jacob G; Vogelstein, R Jacob; Etienne-Cummings, Ralph

    2013-08-01

    Recognition of objects in still images has traditionally been regarded as a difficult computational problem. Although modern automated methods for visual object recognition have achieved steadily increasing recognition accuracy, even the most advanced computational vision approaches are unable to obtain performance equal to that of humans. This has led to the creation of many biologically inspired models of visual object recognition, among them the hierarchical model and X (HMAX) model. HMAX is traditionally known to achieve high accuracy in visual object recognition tasks at the expense of significant computational complexity. Increasing complexity, in turn, increases computation time, reducing the number of images that can be processed per unit time. In this paper we describe how the computationally intensive and biologically inspired HMAX model for visual object recognition can be modified for implementation on a commercial field-programmable gate array (FPGA), specifically the Xilinx Virtex-6 ML605 evaluation board with XC6VLX240T FPGA. We show that with minor modifications to the traditional HMAX model we can perform recognition on images of size 128 × 128 pixels at a rate of 190 images per second with a less than 1% loss in recognition accuracy in both binary and multiclass visual object recognition tasks.

  3. Recent advances in the development and transfer of machine vision technologies for space

    NASA Technical Reports Server (NTRS)

    Defigueiredo, Rui J. P.; Pendleton, Thomas

    1991-01-01

    Recent work concerned with real-time machine vision is briefly reviewed. This work includes methodologies and techniques for optimal illumination, shape-from-shading of general (non-Lambertian) 3D surfaces, laser vision devices and technology, high level vision, sensor fusion, real-time computing, artificial neural network design and use, and motion estimation. Two new methods that are currently being developed for object recognition in clutter and for 3D attitude tracking based on line correspondence are discussed.

  4. Humans and Deep Networks Largely Agree on Which Kinds of Variation Make Object Recognition Harder.

    PubMed

    Kheradpisheh, Saeed R; Ghodrati, Masoud; Ganjtabesh, Mohammad; Masquelier, Timothée

    2016-01-01

    View-invariant object recognition is a challenging problem that has attracted much attention among the psychology, neuroscience, and computer vision communities. Humans are notoriously good at it, even if some variations are presumably more difficult to handle than others (e.g., 3D rotations). Humans are thought to solve the problem through hierarchical processing along the ventral stream, which progressively extracts more and more invariant visual features. This feed-forward architecture has inspired a new generation of bio-inspired computer vision systems called deep convolutional neural networks (DCNN), which are currently the best models for object recognition in natural images. Here, for the first time, we systematically compared human feed-forward vision and DCNNs at view-invariant object recognition task using the same set of images and controlling the kinds of transformation (position, scale, rotation in plane, and rotation in depth) as well as their magnitude, which we call "variation level." We used four object categories: car, ship, motorcycle, and animal. In total, 89 human subjects participated in 10 experiments in which they had to discriminate between two or four categories after rapid presentation with backward masking. We also tested two recent DCNNs (proposed respectively by Hinton's group and Zisserman's group) on the same tasks. We found that humans and DCNNs largely agreed on the relative difficulties of each kind of variation: rotation in depth is by far the hardest transformation to handle, followed by scale, then rotation in plane, and finally position (much easier). This suggests that DCNNs would be reasonable models of human feed-forward vision. In addition, our results show that the variation levels in rotation in depth and scale strongly modulate both humans' and DCNNs' recognition performances. We thus argue that these variations should be controlled in the image datasets used in vision research.

  5. Data-driven indexing mechanism for the recognition of polyhedral objects

    NASA Astrophysics Data System (ADS)

    McLean, Stewart; Horan, Peter; Caelli, Terry M.

    1992-02-01

    This paper is concerned with the problem of searching large model databases. To date, most object recognition systems have concentrated on the problem of matching using simple searching algorithms. This is quite acceptable when the number of object models is small. However, in the future, general purpose computer vision systems will be required to recognize hundreds or perhaps thousands of objects and, in such circumstances, efficient searching algorithms will be needed. The problem of searching a large model database is one which must be addressed if future computer vision systems are to be at all effective. In this paper we present a method we call data-driven feature-indexed hypothesis generation as one solution to the problem of searching large model databases.
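
    Feature indexing replaces a linear scan over every model with a table lookup keyed on quantized features, so only indexed models receive hypothesis votes. The sketch below shows the generic idea (not the paper's specific index structure); the model names and feature encodings are invented for illustration.

    ```python
    from collections import defaultdict

    # Build an index: quantized feature -> models containing that feature.
    model_features = {
        "bracket": [(30, 90), (60, 90)],  # e.g., (edge length, angle) pairs
        "wedge":   [(30, 45), (50, 45)],
    }

    def quantize(f, step=10):
        return tuple(v // step for v in f)

    index = defaultdict(set)
    for model, feats in model_features.items():
        for f in feats:
            index[quantize(f)].add(model)

    # Recognition: scene features vote for candidate models; only the
    # top-voted hypotheses would then be verified in detail.
    scene = [(31, 91), (62, 91)]
    votes = defaultdict(int)
    for f in scene:
        for m in index.get(quantize(f), ()):
            votes[m] += 1
    print(max(votes, key=votes.get))  # -> "bracket"
    ```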

  6. Enhanced computer vision with Microsoft Kinect sensor: a review.

    PubMed

    Han, Jungong; Shao, Ling; Xu, Dong; Shotton, Jamie

    2013-10-01

    With the invention of the low-cost Microsoft Kinect sensor, high-resolution depth and visual (RGB) sensing has become available for widespread use. The complementary nature of the depth and visual information provided by the Kinect sensor opens up new opportunities to solve fundamental problems in computer vision. This paper presents a comprehensive review of recent Kinect-based computer vision algorithms and applications. The reviewed approaches are classified according to the type of vision problems that can be addressed or enhanced by means of the Kinect sensor. The covered topics include preprocessing, object tracking and recognition, human activity analysis, hand gesture analysis, and indoor 3-D mapping. For each category of methods, we outline their main algorithmic contributions and summarize their advantages/differences compared to their RGB counterparts. Finally, we give an overview of the challenges in this field and future research trends. This paper is expected to serve as a tutorial and source of references for Kinect-based computer vision researchers.

  7. Selection of Norway spruce somatic embryos by computer vision

    NASA Astrophysics Data System (ADS)

    Hamalainen, Jari J.; Jokinen, Kari J.

    1993-05-01

    A computer vision system was developed for the classification of plant somatic embryos. The embryos are in a Petri dish that is transferred at constant speed, and they are recognized as they pass a line-scan camera. A classification algorithm needs to be installed for every plant species. This paper describes an algorithm for the recognition of Norway spruce (Picea abies) embryos. A short review of conifer micropropagation by somatic embryogenesis is also given. The recognition algorithm is based on features calculated from the boundary of the object. Only the part of the boundary corresponding to the developing cotyledons (2-15) and the straight sides of the embryo is used for recognition. An index of the length of the cotyledons describes the developmental stage of the embryo. The testing set for classifier performance consisted of 118 embryos and 478 nonembryos. With the classification tolerances chosen, 69% of the objects classified as embryos by a human classifier were selected and 31% rejected. Less than 1% of the nonembryos were classified as embryos. The basic features developed can probably be easily adapted for the recognition of other conifer somatic embryos.
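
    Boundary-based selection of this kind can be sketched with OpenCV contours: extract the object boundary, compute a shape feature, and accept or reject by tolerance. The circularity feature and thresholds below are illustrative stand-ins for the paper's cotyledon-length index, not its actual features.

    ```python
    import cv2
    import numpy as np

    def accept_object(binary_mask, circ_range=(0.2, 0.6)):
        """Toy stand-in for embryo selection: accept elongated boundary shapes.

        The real system measures cotyledon length along part of the boundary;
        here a generic circularity feature 4*pi*A/P^2 plays that role.
        """
        contours, _ = cv2.findContours(binary_mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_NONE)
        if not contours:
            return False
        c = max(contours, key=cv2.contourArea)
        area, perimeter = cv2.contourArea(c), cv2.arcLength(c, True)
        if perimeter == 0:
            return False
        circularity = 4 * np.pi * area / perimeter ** 2
        return circ_range[0] <= circularity <= circ_range[1]

    mask = np.zeros((64, 64), np.uint8)
    mask[20:44, 28:36] = 255  # elongated blob as a placeholder object
    print(accept_object(mask))
    ```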

  8. Infrared Cephalic-Vein to Assist Blood Extraction Tasks: Automatic Projection and Recognition

    NASA Astrophysics Data System (ADS)

    Lagüela, S.; Gesto, M.; Riveiro, B.; González-Aguilera, D.

    2017-05-01

    The thermal infrared band is not commonly used in photogrammetric and computer vision algorithms, mainly due to the low spatial resolution of this type of imagery. However, this band captures sub-superficial information, extending the capabilities of the visible bands with regard to applications. This fact is especially important in biomedicine and biometrics, allowing the geometric characterization of internal organs and pathologies with photogrammetric principles, as well as automatic identification and labelling using computer vision algorithms. This paper presents advances in close-range photogrammetry and computer vision applied to thermal infrared imagery, with the final application of augmented reality in order to widen its use in the biomedical field. In this case, the thermal infrared image of the arm is acquired and simultaneously projected on the arm, together with the identification label of the cephalic vein. In this way, blood analysts are assisted in finding the vein for blood extraction, especially in those cases where identification by the human eye is a complex task. Vein recognition is performed based on the Gaussian temperature distribution in the area of the vein, while the calibration between projector and thermographic camera is achieved through feature extraction and pattern recognition. The method is validated through its application to a set of volunteers of different ages and genders, in such a way that different conditions of body temperature and vein depth are covered, demonstrating the applicability and reproducibility of the method.
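
    The Gaussian temperature distribution assumption can be illustrated by fitting a Gaussian to the values along one thermal image row crossing the vein, taking the fitted peak as the vein centre. The sketch below uses synthetic data and shows only the fitting idea, not the authors' full recognition pipeline.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def gaussian(x, amp, mu, sigma, base):
        return base + amp * np.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

    # Synthetic thermal row: skin at ~33 C with a warmer vein near pixel 40.
    x = np.arange(80)
    row = 33.0 + 1.2 * np.exp(-((x - 40) ** 2) / (2 * 4.0 ** 2))
    row += np.random.normal(0, 0.05, x.size)

    # Fit; p0 seeds the optimizer with rough guesses.
    (amp, mu, sigma, base), _ = curve_fit(gaussian, x, row, p0=[1, 35, 5, 33])
    print(f"estimated vein centre: pixel {mu:.1f}, width ~{sigma:.1f} px")
    ```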

  9. Pattern recognition neural-net by spatial mapping of biology visual field

    NASA Astrophysics Data System (ADS)

    Lin, Xin; Mori, Masahiko

    2000-05-01

    The method of spatial mapping in the biological visual field is applied to artificial neural networks for pattern recognition. By a coordinate transform known as complex-logarithm mapping, followed by a Fourier transform, the input images are transformed into scale-, rotation-, and shift-invariant patterns, and then fed into a multilayer neural network for learning and recognition. The results of a computer simulation and an optical experimental system are described.
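
    Complex-logarithm (log-polar) mapping turns rotation and scaling about the centre into translations, and the Fourier magnitude is invariant to translation; chaining the two yields the scale-, rotation-, and shift-invariant patterns described. A brief OpenCV/NumPy sketch with illustrative parameters and an assumed input file:

    ```python
    import cv2
    import numpy as np

    img = cv2.imread("pattern.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
    h, w = img.shape
    center = (w / 2.0, h / 2.0)

    # 1) FFT magnitude removes sensitivity to translation of the input.
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(img)))

    # 2) Log-polar mapping: rotation -> vertical shift, scale -> horizontal shift.
    logpolar = cv2.warpPolar(spectrum, (w, h), center,
                             maxRadius=min(h, w) / 2,
                             flags=cv2.WARP_POLAR_LOG)

    # 3) A second FFT magnitude removes those residual shifts, giving an
    #    invariant pattern to feed to the neural network for matching.
    invariant = np.abs(np.fft.fft2(logpolar))
    ```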

  10. Relevance feedback-based building recognition

    NASA Astrophysics Data System (ADS)

    Li, Jing; Allinson, Nigel M.

    2010-07-01

    Building recognition is a nontrivial task in computer vision research which can be utilized in robot localization, mobile navigation, etc. However, existing building recognition systems usually encounter the following two problems: 1) extracted low level features cannot reveal the true semantic concepts; and 2) they usually involve high dimensional data which require heavy computational costs and memory. Relevance feedback (RF), widely applied in multimedia information retrieval, is able to bridge the gap between the low level visual features and high level concepts; while dimensionality reduction methods can mitigate the high-dimensional problem. In this paper, we propose a building recognition scheme which integrates the RF and subspace learning algorithms. Experimental results undertaken on our own building database show that the newly proposed scheme appreciably enhances the recognition accuracy.

  11. Robotic space simulation integration of vision algorithms into an orbital operations simulation

    NASA Technical Reports Server (NTRS)

    Bochsler, Daniel C.

    1987-01-01

    In order to successfully plan and analyze future space activities, computer-based simulations of activities in low earth orbit will be required to model and integrate vision and robotic operations with vehicle dynamics and proximity operations procedures. The orbital operations simulation (OOS) is configured and enhanced as a testbed for robotic space operations. Vision integration algorithms are being developed in three areas: preprocessing, recognition, and attitude/attitude rates. The vision program (Rice University) was modified for use in the OOS. Systems integration testing is now in progress.

  12. Feedforward object-vision models only tolerate small image variations compared to human

    PubMed Central

    Ghodrati, Masoud; Farzmahdi, Amirhossein; Rajaei, Karim; Ebrahimpour, Reza; Khaligh-Razavi, Seyed-Mahdi

    2014-01-01

    Invariant object recognition is a remarkable ability of the primate visual system whose underlying mechanisms have constantly been under intense investigation. Computational modeling is a valuable tool toward understanding the processes involved in invariant object recognition. Although recent computational models have shown outstanding performance on challenging image databases, they fail to perform well in image categorization under more complex image variations. Studies have shown that making sparse representations of objects by extracting more informative visual features through a feedforward sweep can lead to higher recognition performance. Here, however, we show that when the complexity of image variations is high, even this approach results in poor performance compared to humans. To assess the performance of models and humans in invariant object recognition tasks, we built a parametrically controlled image database consisting of several object categories varied in different dimensions and levels, rendered from 3D planes. Comparing the performance of several object recognition models with human observers shows that only under low-level image variations do the models perform similarly to humans in categorization tasks. Furthermore, the results of our behavioral experiments demonstrate that, even under difficult experimental conditions (i.e., briefly presented masked stimuli with complex image variations), human observers performed outstandingly well, suggesting that the models are still far from resembling humans in invariant object recognition. Taken together, we suggest that learning sparse informative visual features, although desirable, is not a complete solution for future progress in object-vision modeling. We show that this approach is not of significant help in solving the computational crux of object recognition (i.e., invariant object recognition) when the identity-preserving image variations become more complex. PMID:25100986

  13. Ventral-stream-like shape representation: from pixel intensity values to trainable object-selective COSFIRE models

    PubMed Central

    Azzopardi, George; Petkov, Nicolai

    2014-01-01

    The remarkable abilities of the primate visual system have inspired the construction of computational models of some visual neurons. We propose a trainable hierarchical object recognition model, which we call S-COSFIRE (S stands for Shape and COSFIRE stands for Combination Of Shifted FIlter REsponses) and use it to localize and recognize objects of interest embedded in complex scenes. It is inspired by the visual processing in the ventral stream (V1/V2 → V4 → TEO). Recognition and localization of objects embedded in complex scenes is important for many computer vision applications. Most existing methods require prior segmentation of the objects from the background, which in turn requires recognition. An S-COSFIRE filter is automatically configured to be selective for an arrangement of contour-based features that belong to a prototype shape specified by an example. The configuration comprises selecting relevant vertex detectors and determining certain blur and shift parameters. The response is computed as the weighted geometric mean of the blurred and shifted responses of the selected vertex detectors. S-COSFIRE filters share similar properties with some neurons in inferotemporal cortex, which provided inspiration for this work. We demonstrate the effectiveness of S-COSFIRE filters in two applications: letter and keyword spotting in handwritten manuscripts, and object spotting in complex scenes for the computer vision system of a domestic robot. S-COSFIRE filters are effective at recognizing and localizing (deformable) objects in images of complex scenes without requiring prior segmentation. They are versatile trainable shape detectors, conceptually simple and easy to implement. The presented hierarchical shape representation contributes to a better understanding of the brain and to more robust computer vision algorithms. PMID:25126068
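
    The response computation has a direct numerical form: a weighted geometric mean, conveniently evaluated in the log domain. A minimal sketch, assuming the vertex-detector response maps are precomputed and using illustrative blur and shift parameters:

        # Weighted geometric mean of blurred, shifted vertex-detector responses
        # (a sketch of the response formula, not the full S-COSFIRE configuration).
        import numpy as np
        from scipy.ndimage import gaussian_filter, shift as nd_shift

        def s_cosfire_response(detector_maps, shifts, sigmas, weights):
            """detector_maps: list of 2-D response maps of selected vertex detectors."""
            acc, wsum = None, float(np.sum(weights))
            for resp, (dy, dx), sigma, w in zip(detector_maps, shifts, sigmas, weights):
                blurred = gaussian_filter(resp, sigma)         # tolerance to deformation
                aligned = nd_shift(blurred, (dy, dx))          # move part to filter center
                term = w * np.log(np.maximum(aligned, 1e-12))  # log-domain product
                acc = term if acc is None else acc + term
            return np.exp(acc / wsum)                          # weighted geometric mean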

  14. Milestones on the road to independence for the blind

    NASA Astrophysics Data System (ADS)

    Reed, Kenneth

    1997-02-01

    Ken will talk about his experiences as an end user of technology. Even moderate technological progress in the fields of pattern recognition and artificial intelligence can be, often surprisingly, of great help to the blind. An example is providing portable bar-code scanners so that a blind person knows what he is buying and what color it is. In this age of microprocessors controlling everything, how can a blind person find out what his VCR is doing? Is there some technique that will allow a blind musician to convert print music into MIDI files to drive a synthesizer? Can computer vision help the blind cross a road, including predicting where oncoming traffic will be? Can computer vision technology provide spoken descriptions of scenes so a blind person can figure out where doors and entrances are located, and what the signage on the building says? He asks 'can computer vision help me flip a pancake?' His challenge to those in the computer vision field is 'where can we go from here?'

  15. Machine vision system for inspecting characteristics of hybrid rice seed

    NASA Astrophysics Data System (ADS)

    Cheng, Fang; Ying, Yibin

    2004-03-01

    Obtaining clear images, which improves classification accuracy, involves many factors; light source, lens extender, and background are discussed in this paper. Analysis of rice-seed reflectance curves showed that the light-source wavelength for discriminating diseased seeds from normal seeds in the monochromatic image recognition mode was about 815 nm for the jinyou402 and shanyou10 varieties. To determine optimal conditions for acquiring digital images of rice seed using a computer vision system, an adjustable color machine vision system was developed. The machine vision system with a 20 mm to 25 mm lens extender produced close-up images that made it easier to recognize the characteristics of hybrid rice seeds. A white background proved better than a black background for inspecting disease-infected seeds and for shape-based algorithms. Experimental results indicated good classification for most of the characteristics with the machine vision system. The same algorithms yielded better results under the optimized conditions for rice-seed quality inspection; in particular, the image processing could resolve fine details such as small fissures.

  16. Episodic Reasoning for Vision-Based Human Action Recognition

    PubMed Central

    Martinez-del-Rincon, Jesus

    2014-01-01

    Smart Spaces, Ambient Intelligence, and Ambient Assisted Living are environmental paradigms that strongly depend on their capability to recognize human actions. While most solutions rest on sensor value interpretations and video analysis applications, few have realized the importance of incorporating common-sense capabilities to support the recognition process. Unfortunately, human action recognition cannot be successfully accomplished by analyzing body postures alone. On the contrary, this task should be supported by profound knowledge of the nature of human agency and its tight connection to the reasons and motivations that explain it. The combination of this knowledge with knowledge about how the world works is essential for recognizing and understanding human actions without committing common-senseless mistakes. This work demonstrates the impact that episodic reasoning has on improving the accuracy of a computer vision system for human action recognition. It also presents formalization, implementation, and evaluation details of the knowledge model that supports the episodic reasoning. PMID:24959602

  17. Survey of computer vision-based natural disaster warning systems

    NASA Astrophysics Data System (ADS)

    Ko, ByoungChul; Kwak, Sooyeong

    2012-07-01

    With the rapid development of information technology, natural disaster prevention is growing as a new research field dealing with surveillance systems. To forecast and prevent the damage caused by natural disasters, the development of systems to analyze natural disasters using remote sensing, geographic information systems (GIS), and vision sensors has been receiving widespread interest over the last decade. This paper provides an up-to-date review of five different types of natural disasters and their corresponding warning systems using computer vision and pattern recognition techniques, such as wildfire smoke and flame detection, water level detection for flood prevention, coastal zone monitoring, and landslide detection. Finally, we conclude with some thoughts about future research directions.

  18. Computer vision for microscopy diagnosis of malaria.

    PubMed

    Tek, F Boray; Dempster, Andrew G; Kale, Izzet

    2009-07-13

    This paper reviews computer vision and image analysis studies aiming at automated diagnosis or screening of malaria infection in microscope images of thin blood film smears. Existing works interpret the diagnosis problem differently or propose partial solutions to the problem. A critique of these works is furnished. In addition, a general pattern recognition framework to perform diagnosis, which includes image acquisition, pre-processing, segmentation, and pattern classification components, is described. The open problems are addressed and a perspective of the future work for realization of automated microscopy diagnosis of malaria is provided.

  19. Computer vision syndrome-A common cause of unexplained visual symptoms in the modern era.

    PubMed

    Munshi, Sunil; Varghese, Ashley; Dhar-Munshi, Sushma

    2017-07-01

    The aim of this study was to assess the evidence and available literature on the clinical, pathogenetic, prognostic and therapeutic aspects of Computer vision syndrome. Information was collected from Medline, Embase & National Library of Medicine over the last 30 years up to March 2016. The bibliographies of relevant articles were searched for additional references. Patients with Computer vision syndrome present to a variety of different specialists, including General Practitioners, Neurologists, Stroke physicians and Ophthalmologists. While the condition is common, there is a poor awareness in the public and among health professionals. Recognising this condition in the clinic or in emergency situations like the TIA clinic is crucial. The implications are potentially huge in view of the extensive and widespread use of computers and visual display units. Greater public awareness of Computer vision syndrome and education of health professionals is vital. Preventive strategies should form part of work place ergonomics routinely. Prompt and correct recognition is important to allow management and avoid unnecessary treatments. © 2017 John Wiley & Sons Ltd.

  20. Dynamic programming and graph algorithms in computer vision.

    PubMed

    Felzenszwalb, Pedro F; Zabih, Ramin

    2011-04-01

    Optimization is a powerful paradigm for expressing and solving problems in a wide range of areas, and has been successfully applied to many vision problems. Discrete optimization techniques are especially interesting since, by carefully exploiting problem structure, they often provide nontrivial guarantees concerning solution quality. In this paper, we review dynamic programming and graph algorithms, and discuss representative examples of how these discrete optimization techniques have been applied to some classical vision problems. We focus on the low-level vision problem of stereo, the mid-level problem of interactive object segmentation, and the high-level problem of model-based recognition.
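
    The stereo example is the classic showcase for dynamic programming in vision: per scanline, DP finds the minimum-cost disparity labeling under a smoothness penalty. A minimal sketch with illustrative costs and parameters (not the specific formulations surveyed in the paper):

        # Viterbi-style dynamic programming over one stereo scanline.
        import numpy as np

        def scanline_stereo(left_row, right_row, max_disp=16, smooth=0.1):
            """Return one row of integer disparities for a rectified stereo pair."""
            left = np.asarray(left_row, dtype=float)
            right = np.asarray(right_row, dtype=float)
            n = left.size
            cost = np.full((n, max_disp), np.inf)      # data term per (pixel, disparity)
            for d in range(max_disp):
                cost[d:, d] = np.abs(left[d:] - right[:n - d])
            penalty = smooth * np.abs(np.subtract.outer(np.arange(max_disp),
                                                        np.arange(max_disp)))
            dp, back = cost.copy(), np.zeros((n, max_disp), dtype=int)
            for x in range(1, n):                      # forward pass
                trans = dp[x - 1][None, :] + penalty   # [current d, previous d']
                back[x] = trans.argmin(axis=1)
                dp[x] += trans.min(axis=1)
            disp = np.zeros(n, dtype=int)              # backtrack the optimal path
            disp[-1] = dp[-1].argmin()
            for x in range(n - 2, -1, -1):
                disp[x] = back[x + 1][disp[x + 1]]
            return disp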

  1. Strategic Computing. New-Generation Computing Technology: A Strategic Plan for Its Development and Application to Critical Problems in Defense

    DTIC Science & Technology

    1983-10-28

    Computing. By seizing an opportunity to leverage recent advances in artificial intelligence, computer science, and microelectronics, the Agency plans...occurred in many separated areas of artificial intelligence, computer science, and microelectronics. Advances in "expert system" technology now...and expert knowledge o Advances in Artificial Intelligence: Mechanization of speech recognition, vision, and natural language understanding. o

  2. A fast 3-D object recognition algorithm for the vision system of a special-purpose dexterous manipulator

    NASA Technical Reports Server (NTRS)

    Hung, Stephen H. Y.

    1989-01-01

    A fast 3-D object recognition algorithm that can be used as a quick-look subsystem to the vision system for the Special-Purpose Dexterous Manipulator (SPDM) is described. Global features that can be easily computed from range data are used to characterize the images of a viewer-centered model of an object. This algorithm will speed up the processing by eliminating the low level processing whenever possible. It may identify the object, reject a set of bad data in the early stage, or create a better environment for a more powerful algorithm to carry the work further.

  3. Benchmarking Spike-Based Visual Recognition: A Dataset and Evaluation

    PubMed Central

    Liu, Qian; Pineda-García, Garibaldi; Stromatias, Evangelos; Serrano-Gotarredona, Teresa; Furber, Steve B.

    2016-01-01

    Today, increasing attention is being paid to research into spike-based neural computation both to gain a better understanding of the brain and to explore biologically-inspired computation. Within this field, the primate visual pathway and its hierarchical organization have been extensively studied. Spiking Neural Networks (SNNs), inspired by the understanding of observed biological structure and function, have been successfully applied to visual recognition and classification tasks. In addition, implementations on neuromorphic hardware have enabled large-scale networks to run in (or even faster than) real time, making spike-based neural vision processing accessible on mobile robots. Neuromorphic sensors such as silicon retinas are able to feed such mobile systems with real-time visual stimuli. A new set of vision benchmarks for spike-based neural processing is now needed to measure progress quantitatively within this rapidly advancing field. We propose that a large dataset of spike-based visual stimuli is needed to provide meaningful comparisons between different systems, and a corresponding evaluation methodology is also required to measure the performance of SNN models and their hardware implementations. In this paper we first propose an initial NE (Neuromorphic Engineering) dataset based on standard computer vision benchmarks and that uses digits from the MNIST database. This dataset is compatible with the state of current research on spike-based image recognition. The corresponding spike trains are produced using a range of techniques: rate-based Poisson spike generation, rank order encoding, and recorded output from a silicon retina with both flashing and oscillating input stimuli. In addition, a complementary evaluation methodology is presented to assess both model-level and hardware-level performance. Finally, we demonstrate the use of the dataset and the evaluation methodology using two SNN models to validate the performance of the models and their hardware implementations. With this dataset we hope to (1) promote meaningful comparison between algorithms in the field of neural computation, (2) allow comparison with conventional image recognition methods, (3) provide an assessment of the state of the art in spike-based visual recognition, and (4) help researchers identify future directions and advance the field. PMID:27853419

  4. Benchmarking Spike-Based Visual Recognition: A Dataset and Evaluation.

    PubMed

    Liu, Qian; Pineda-García, Garibaldi; Stromatias, Evangelos; Serrano-Gotarredona, Teresa; Furber, Steve B

    2016-01-01

    Today, increasing attention is being paid to research into spike-based neural computation both to gain a better understanding of the brain and to explore biologically-inspired computation. Within this field, the primate visual pathway and its hierarchical organization have been extensively studied. Spiking Neural Networks (SNNs), inspired by the understanding of observed biological structure and function, have been successfully applied to visual recognition and classification tasks. In addition, implementations on neuromorphic hardware have enabled large-scale networks to run in (or even faster than) real time, making spike-based neural vision processing accessible on mobile robots. Neuromorphic sensors such as silicon retinas are able to feed such mobile systems with real-time visual stimuli. A new set of vision benchmarks for spike-based neural processing is now needed to measure progress quantitatively within this rapidly advancing field. We propose that a large dataset of spike-based visual stimuli is needed to provide meaningful comparisons between different systems, and a corresponding evaluation methodology is also required to measure the performance of SNN models and their hardware implementations. In this paper we first propose an initial NE (Neuromorphic Engineering) dataset based on standard computer vision benchmarks and that uses digits from the MNIST database. This dataset is compatible with the state of current research on spike-based image recognition. The corresponding spike trains are produced using a range of techniques: rate-based Poisson spike generation, rank order encoding, and recorded output from a silicon retina with both flashing and oscillating input stimuli. In addition, a complementary evaluation methodology is presented to assess both model-level and hardware-level performance. Finally, we demonstrate the use of the dataset and the evaluation methodology using two SNN models to validate the performance of the models and their hardware implementations. With this dataset we hope to (1) promote meaningful comparison between algorithms in the field of neural computation, (2) allow comparison with conventional image recognition methods, (3) provide an assessment of the state of the art in spike-based visual recognition, and (4) help researchers identify future directions and advance the field.
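
    Of the encodings named in the abstract, rate-based Poisson spike generation is the simplest to illustrate: each pixel's intensity sets the firing rate of one input neuron. A minimal sketch with assumed rates and timestep (not the dataset's exact generation code):

        # Poisson spike trains from one 8-bit image; rates and dt are assumptions.
        import numpy as np

        def poisson_spike_trains(image, duration_ms=1000, dt_ms=1.0, max_rate_hz=100.0):
            """Return a (num_pixels, num_steps) boolean spike raster for one image."""
            rates = image.astype(float).ravel() / 255.0 * max_rate_hz  # Hz per pixel
            p_spike = rates * dt_ms / 1000.0       # spike probability per timestep
            steps = int(duration_ms / dt_ms)
            return np.random.rand(p_spike.size, steps) < p_spike[:, None]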

  5. Mars Rover imaging systems and directional filtering

    NASA Technical Reports Server (NTRS)

    Wang, Paul P.

    1989-01-01

    Computer literature searches were carried out at Duke University and NASA Langley Research Center. The purpose is to build up knowledge of the technical problems in pattern recognition and image understanding that must be solved for the Mars Rover and Sample Return Mission. An intensive study of a large collection of relevant literature resulted in a compilation of all the important documents in one place. Furthermore, the documents are being classified into: Mars Rover; computer vision (theory); imaging systems; pattern recognition methodologies; and other smart techniques (AI, neural networks, fuzzy logic, etc.).

  6. Body-Based Gender Recognition Using Images from Visible and Thermal Cameras

    PubMed Central

    Nguyen, Dat Tien; Park, Kang Ryoung

    2016-01-01

    Gender information has many useful applications in computer vision systems, such as surveillance systems, counting the number of males and females in a shopping mall, access control systems in restricted areas, or any human-computer interaction system. In most previous studies, researchers attempted to recognize gender by using visible light images of the human face or body. However, shadow, illumination, and time of day greatly affect the performance of these methods. To overcome this problem, we propose a new gender recognition method based on the combination of visible light and thermal camera images of the human body. Experimental results, through various kinds of feature extraction and fusion methods, show that our approach is efficient for gender recognition through a comparison of recognition rates with conventional systems. PMID:26828487

  7. Body-Based Gender Recognition Using Images from Visible and Thermal Cameras.

    PubMed

    Nguyen, Dat Tien; Park, Kang Ryoung

    2016-01-27

    Gender information has many useful applications in computer vision systems, such as surveillance systems, counting the number of males and females in a shopping mall, access control systems in restricted areas, or any human-computer interaction system. In most previous studies, researchers attempted to recognize gender by using visible light images of the human face or body. However, shadow, illumination, and time of day greatly affect the performance of these methods. To overcome this problem, we propose a new gender recognition method based on the combination of visible light and thermal camera images of the human body. Experimental results, through various kinds of feature extraction and fusion methods, show that our approach is efficient for gender recognition through a comparison of recognition rates with conventional systems.

  8. Dynamic Programming and Graph Algorithms in Computer Vision*

    PubMed Central

    Felzenszwalb, Pedro F.; Zabih, Ramin

    2013-01-01

    Optimization is a powerful paradigm for expressing and solving problems in a wide range of areas, and has been successfully applied to many vision problems. Discrete optimization techniques are especially interesting, since by carefully exploiting problem structure they often provide non-trivial guarantees concerning solution quality. In this paper we briefly review dynamic programming and graph algorithms, and discuss representative examples of how these discrete optimization techniques have been applied to some classical vision problems. We focus on the low-level vision problem of stereo; the mid-level problem of interactive object segmentation; and the high-level problem of model-based recognition. PMID:20660950

  9. Spoof Detection for Finger-Vein Recognition System Using NIR Camera.

    PubMed

    Nguyen, Dat Tien; Yoon, Hyo Sik; Pham, Tuyen Danh; Park, Kang Ryoung

    2017-10-01

    Finger-vein recognition, a new and advanced biometric recognition method, is attracting the attention of researchers because of its advantages, such as high recognition performance and a lower likelihood of theft or of inaccuracies caused by skin-condition defects. However, as reported by previous researchers, it is possible to attack a finger-vein recognition system using presentation attack (fake) finger-vein images. As a result, spoof detection, known as presentation attack detection (PAD), is necessary in such recognition systems. Previous attempts to establish PAD methods primarily focused on designing feature extractors by hand (handcrafted feature extractors) based on the researchers' observations of the differences between real (live) and presentation attack finger-vein images; the detection performance was therefore limited. Recently, the deep learning framework has been successfully applied in computer vision, delivering superior results compared to traditional handcrafted methods on various applications such as image-based face recognition, gender recognition, and image classification. In this paper, we propose a PAD method for a near-infrared (NIR) camera-based finger-vein recognition system using a convolutional neural network (CNN) to enhance the detection ability of previous handcrafted methods. Using the CNN method, we can derive a more suitable feature extractor for PAD than the other handcrafted methods through a training procedure. We further process the extracted image features using principal component analysis (PCA) for dimensionality reduction of the feature space and a support vector machine (SVM) for classification. Through extensive experiments, we confirm that our proposed method is adequate for presentation attack finger-vein image detection and delivers superior detection results compared to CNN-based methods and other previous handcrafted methods.

  10. Spoof Detection for Finger-Vein Recognition System Using NIR Camera

    PubMed Central

    Nguyen, Dat Tien; Yoon, Hyo Sik; Pham, Tuyen Danh; Park, Kang Ryoung

    2017-01-01

    Finger-vein recognition, a new and advanced biometric recognition method, is attracting the attention of researchers because of its advantages, such as high recognition performance and a lower likelihood of theft or of inaccuracies caused by skin-condition defects. However, as reported by previous researchers, it is possible to attack a finger-vein recognition system using presentation attack (fake) finger-vein images. As a result, spoof detection, known as presentation attack detection (PAD), is necessary in such recognition systems. Previous attempts to establish PAD methods primarily focused on designing feature extractors by hand (handcrafted feature extractors) based on the researchers' observations of the differences between real (live) and presentation attack finger-vein images; the detection performance was therefore limited. Recently, the deep learning framework has been successfully applied in computer vision, delivering superior results compared to traditional handcrafted methods on various applications such as image-based face recognition, gender recognition, and image classification. In this paper, we propose a PAD method for a near-infrared (NIR) camera-based finger-vein recognition system using a convolutional neural network (CNN) to enhance the detection ability of previous handcrafted methods. Using the CNN method, we can derive a more suitable feature extractor for PAD than the other handcrafted methods through a training procedure. We further process the extracted image features using principal component analysis (PCA) for dimensionality reduction of the feature space and a support vector machine (SVM) for classification. Through extensive experiments, we confirm that our proposed method is adequate for presentation attack finger-vein image detection and delivers superior detection results compared to CNN-based methods and other previous handcrafted methods. PMID:28974031
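
    The post-processing stage described above reduces CNN-derived features with PCA and classifies them with an SVM. A minimal sketch with scikit-learn, assuming the CNN features are extracted elsewhere; the dimensions and kernel choice are illustrative, not the authors' tuned configuration:

        # PCA + SVM over precomputed CNN features for live vs. attack images.
        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        def train_pad_classifier(cnn_features: np.ndarray, labels: np.ndarray):
            """cnn_features: (n_samples, n_dims); labels: 1 = live, 0 = attack."""
            model = make_pipeline(
                StandardScaler(),
                PCA(n_components=0.95),   # keep components explaining 95% variance
                SVC(kernel="rbf", C=1.0),
            )
            return model.fit(cnn_features, labels)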

  11. Computer vision in roadway transportation systems: a survey

    NASA Astrophysics Data System (ADS)

    Loce, Robert P.; Bernal, Edgar A.; Wu, Wencheng; Bala, Raja

    2013-10-01

    There is a worldwide effort to apply 21st century intelligence to evolving our transportation networks. The goals of smart transportation networks are quite noble and manifold, including safety, efficiency, law enforcement, energy conservation, and emission reduction. Computer vision is playing a key role in this transportation evolution. Video imaging scientists are providing intelligent sensing and processing technologies for a wide variety of applications and services. There are many interesting technical challenges including imaging under a variety of environmental and illumination conditions, data overload, recognition and tracking of objects at high speed, distributed network sensing and processing, energy sources, as well as legal concerns. This paper presents a survey of computer vision techniques related to three key problems in the transportation domain: safety, efficiency, and security and law enforcement. A broad review of the literature is complemented by detailed treatment of a few selected algorithms and systems that the authors believe represent the state-of-the-art.

  12. The Spatial Vision Tree: A Generic Pattern Recognition Engine- Scientific Foundations, Design Principles, and Preliminary Tree Design

    NASA Technical Reports Server (NTRS)

    Rahman, Zia-ur; Jobson, Daniel J.; Woodell, Glenn A.

    2010-01-01

    New foundational ideas are used to define a novel approach to generic visual pattern recognition. These ideas proceed from the starting point of the intrinsic equivalence of noise reduction and pattern recognition when noise reduction is taken to its theoretical limit of explicit matched filtering. This led us to think of the logical extension of sparse coding using basis function transforms for both de-noising and pattern recognition to the full pattern specificity of a lexicon of matched filter pattern templates. A key hypothesis is that such a lexicon can be constructed and is, in fact, a generic visual alphabet of spatial vision. Hence it provides a tractable solution for the design of a generic pattern recognition engine. Here we present the key scientific ideas, the basic design principles which emerge from these ideas, and a preliminary design of the Spatial Vision Tree (SVT). The latter is based upon a cryptographic approach whereby we measure a large aggregate estimate of the frequency of occurrence (FOO) for each pattern. These distributions are employed together with Hamming distance criteria to design a two-tier tree. Then using information theory, these same FOO distributions are used to define a precise method for pattern representation. Finally the experimental performance of the preliminary SVT on computer generated test images and complex natural images is assessed.
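
    The FOO-plus-Hamming design can be sketched compactly: count how often each small binary pattern occurs, keep the most frequent as first-tier templates, and route an incoming patch to the template at minimum Hamming distance. Patch size and tier size below are illustrative assumptions, not the preliminary SVT's actual design values:

        # Frequency-of-occurrence template lexicon with Hamming-distance routing.
        import numpy as np
        from collections import Counter

        def build_first_tier(binary_patches, tier_size=256):
            """binary_patches: iterable of flattened 0/1 patches (e.g., 5x5 -> 25 bits)."""
            foo = Counter(tuple(p) for p in binary_patches)   # FOO distribution
            return np.array([p for p, _ in foo.most_common(tier_size)])

        def route_patch(patch, templates):
            """Index of the first-tier template nearest in Hamming distance."""
            return int(np.sum(templates != patch, axis=1).argmin())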

  13. ROBIN: a platform for evaluating automatic target recognition algorithms: I. Overview of the project and presentation of the SAGEM DS competition

    NASA Astrophysics Data System (ADS)

    Duclos, D.; Lonnoy, J.; Guillerm, Q.; Jurie, F.; Herbin, S.; D'Angelo, E.

    2008-04-01

    The last five years have seen a renewal of Automatic Target Recognition applications, mainly because of the latest advances in machine learning techniques. In this context, large collections of image datasets are essential for training algorithms as well as for their evaluation. Indeed, the recent proliferation of recognition algorithms, generally applied to slightly different problems, makes their comparison through clean evaluation campaigns necessary. The ROBIN project tries to fulfil these two needs by putting unclassified datasets, ground truths, competitions, and metrics for the evaluation of ATR algorithms at the disposal of the scientific community. The scope of this project includes single- and multi-class generic target detection and generic target recognition, in military and security contexts. To our knowledge, it is the first time that a database of this importance (several hundred thousand visible and infrared hand-annotated images) has been publicly released. Funded by the French Ministry of Defence (DGA) and by the French Ministry of Research, ROBIN is one of the ten Techno-vision projects. Techno-vision is a large and ambitious government initiative for building evaluation means for computer vision technologies in various application contexts. ROBIN's consortium includes major companies and research centres involved in computer vision R&D in the field of defence: Bertin Technologies, CNES, ECA, DGA, EADS, INRIA, ONERA, MBDA, SAGEM, THALES. This paper, which first gives an overview of the whole project, focuses on one of ROBIN's key competitions, the SAGEM Defence Security database. This dataset contains more than eight hundred ground and aerial infrared images of six different vehicles in cluttered scenes, including distracters. Two different sets of data are available for each target. The first set includes different views of each vehicle at close range against a "simple" background and can be used to train algorithms. The second set contains many views of the same vehicle in different contexts and situations simulating operational scenarios.

  14. Image segmentation for enhancing symbol recognition in prosthetic vision.

    PubMed

    Horne, Lachlan; Barnes, Nick; McCarthy, Chris; He, Xuming

    2012-01-01

    Current and near-term implantable prosthetic vision systems offer the potential to restore some visual function, but suffer from poor resolution and dynamic range of induced phosphenes. This can make it difficult for users of prosthetic vision systems to identify symbolic information (such as signs) except in controlled conditions. Using image segmentation techniques from computer vision, we show it is possible to improve the clarity of such symbolic information for users of prosthetic vision implants in uncontrolled conditions. We use image segmentation to automatically divide a natural image into regions, and using a fixation point controlled by the user, select a region to phosphenize. This technique improves the apparent contrast and clarity of symbolic information over traditional phosphenization approaches.
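
    The region-selection step can be sketched as follows: segment the image, then keep only the region under the user's fixation point for phosphenization. The sketch uses scikit-image's SLIC as a stand-in for the paper's segmentation method, which is an assumption:

        # Select the segmented region containing the user-controlled fixation point.
        from skimage.segmentation import slic

        def region_at_fixation(image, fixation_xy, n_segments=100):
            """Binary mask of the region containing the fixation point (RGB input)."""
            labels = slic(image, n_segments=n_segments, compactness=10.0)
            fx, fy = fixation_xy
            return labels == labels[fy, fx]   # rows index y, columns index x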

  15. Head pose estimation in computer vision: a survey.

    PubMed

    Murphy-Chutorian, Erik; Trivedi, Mohan Manubhai

    2009-04-01

    The capacity to estimate the head pose of another person is a common human ability that presents a unique challenge for computer vision systems. Compared to face detection and recognition, which have been the primary foci of face-related vision research, identity-invariant head pose estimation has fewer rigorously evaluated systems or generic solutions. In this paper, we discuss the inherent difficulties in head pose estimation and present an organized survey describing the evolution of the field. Our discussion focuses on the advantages and disadvantages of each approach and spans 90 of the most innovative and characteristic papers that have been published on this topic. We compare these systems by focusing on their ability to estimate coarse and fine head pose, highlighting approaches that are well suited for unconstrained environments.

  16. Parametric Representation of the Speaker's Lips for Multimodal Sign Language and Speech Recognition

    NASA Astrophysics Data System (ADS)

    Ryumin, D.; Karpov, A. A.

    2017-05-01

    In this article, we propose a new method for the parametric representation of the human lip region. The functional diagram of the method is described, and implementation details are given with an explanation of its key stages and features. The results of automatic detection of the regions of interest are illustrated. The speed of the method on several computers with different performance levels is reported. This universal method allows applying a parametric representation of the speaker's lips to the tasks of biometrics, computer vision, machine learning, and automatic recognition of faces, elements of sign languages, and audio-visual speech, including lip-reading.

  17. A computer vision system for the recognition of trees in aerial photographs

    NASA Technical Reports Server (NTRS)

    Pinz, Axel J.

    1991-01-01

    Increasing forest damage in Central Europe has created demand for an appropriate forest damage assessment tool. The Vision Expert System (VES) is presented, which is capable of finding trees in color infrared aerial photographs. The concept and architecture of VES are discussed briefly. The system is applied to a multisource test data set. The processing of this multisource data set leads to multiple interpretation results for one scene. An integration of these results will provide a better scene description by the vision system. This is achieved by an implementation of Steven's correlation algorithm.

  18. CT Image Sequence Processing For Wood Defect Recognition

    Treesearch

    Dongping Zhu; R.W. Conners; Philip A. Araman

    1991-01-01

    The research reported in this paper explores a non-destructive testing application of x-ray computed tomography (CT) in the forest products industry. This application involves a computer vision system that uses CT to locate and identify internal defects in hardwood logs. The knowledge of log defects is critical in deciding whether to veneer or to saw up a log, and how...

  19. Simulated Prosthetic Vision: The Benefits of Computer-Based Object Recognition and Localization.

    PubMed

    Macé, Marc J-M; Guivarch, Valérian; Denis, Grégoire; Jouffrais, Christophe

    2015-07-01

    Clinical trials with blind patients implanted with a visual neuroprosthesis showed that even the simplest tasks were difficult to perform with the limited vision restored with current implants. Simulated prosthetic vision (SPV) is a powerful tool to investigate the putative functions of the upcoming generations of visual neuroprostheses. Recent studies based on SPV showed that several generations of implants will be required before usable vision is restored. However, none of these studies relied on advanced image processing. High-level image processing could significantly reduce the amount of information required to perform visual tasks and help restore visuomotor behaviors, even with current low-resolution implants. In this study, we simulated a prosthetic vision device based on object localization in the scene. We evaluated the usability of this device for object recognition, localization, and reaching. We showed that a very low number of electrodes (e.g., nine) are sufficient to restore visually guided reaching movements with fair timing (10 s) and high accuracy. In addition, performance, both in terms of accuracy and speed, was comparable with 9 and 100 electrodes. Extraction of high level information (object recognition and localization) from video images could drastically enhance the usability of current visual neuroprosthesis. We suggest that this method-that is, localization of targets of interest in the scene-may restore various visuomotor behaviors. This method could prove functional on current low-resolution implants. The main limitation resides in the reliability of the vision algorithms, which are improving rapidly. Copyright © 2015 International Center for Artificial Organs and Transplantation and Wiley Periodicals, Inc.

  20. Automatic recognition of lactating sow behaviors through depth image processing

    USDA-ARS?s Scientific Manuscript database

    Manual observation and classification of animal behaviors is laborious, time-consuming, and of limited ability to process large amount of data. A computer vision-based system was developed that automatically recognizes sow behaviors (lying, sitting, standing, kneeling, feeding, drinking, and shiftin...

  1. Review On Applications Of Neural Network To Computer Vision

    NASA Astrophysics Data System (ADS)

    Li, Wei; Nasrabadi, Nasser M.

    1989-03-01

    Neural network models have many potential applications to computer vision due to their parallel structures, learnability, implicit representation of domain knowledge, fault tolerance, and ability to handle statistical data. This paper demonstrates the basic principles, typical models, and their applications in this field. A variety of neural models, such as associative memory, multilayer back-propagation perceptrons, self-stabilized adaptive resonance networks, the hierarchically structured neocognitron, high-order correlators, networks with gating control, and other models, can be applied to visual signal recognition, reinforcement, recall, stereo vision, motion, object tracking, and other vision processes. Most of the algorithms have been simulated on computers. Some have been implemented with special hardware. Some systems use features of images, such as edges and profiles, as the input data form. Other systems use raw data as input signals to the networks. We present some novel ideas contained in these approaches and provide a comparison of the methods. Some unsolved problems are mentioned, such as extracting the intrinsic properties of the input information, integrating low-level functions into a high-level cognitive system, and achieving invariances. Prospects for the application of human vision models and neural network models are analyzed.

  2. An Effective 3D Shape Descriptor for Object Recognition with RGB-D Sensors

    PubMed Central

    Liu, Zhong; Zhao, Changchen; Wu, Xingming; Chen, Weihai

    2017-01-01

    RGB-D sensors have been widely used in various areas of computer vision and graphics. A good descriptor can substantially improve recognition performance. This article further analyzes the recognition performance of shape features extracted from multi-modality source data using RGB-D sensors. A hybrid shape descriptor is proposed as a representation of objects for recognition. We first extract five 2D shape features from contour-based images and five 3D shape features over point cloud data to capture the global and local shape characteristics of an object. The recognition performance was tested for category recognition and instance recognition. Experimental results show that the proposed shape descriptor outperforms several common global-to-global shape descriptors and is comparable to some partial-to-global shape descriptors that achieved the best accuracies in category and instance recognition. The contribution of partial features and the computational complexity were also analyzed. The results indicate that the proposed shape features are strong cues for object recognition and can be combined with other features to boost accuracy. PMID:28245553

  3. A high-throughput screening approach to discovering good forms of biologically inspired visual representation.

    PubMed

    Pinto, Nicolas; Doukhan, David; DiCarlo, James J; Cox, David D

    2009-11-01

    While many models of biological object recognition share a common set of "broad-stroke" properties, the performance of any one model depends strongly on the choice of parameters in a particular instantiation of that model--e.g., the number of units per layer, the size of pooling kernels, exponents in normalization operations, etc. Since the number of such parameters (explicit or implicit) is typically large and the computational cost of evaluating one particular parameter set is high, the space of possible model instantiations goes largely unexplored. Thus, when a model fails to approach the abilities of biological visual systems, we are left uncertain whether this failure is because we are missing a fundamental idea or because the correct "parts" have not been tuned correctly, assembled at sufficient scale, or provided with enough training. Here, we present a high-throughput approach to the exploration of such parameter sets, leveraging recent advances in stream processing hardware (high-end NVIDIA graphic cards and the PlayStation 3's IBM Cell Processor). In analogy to high-throughput screening approaches in molecular biology and genetics, we explored thousands of potential network architectures and parameter instantiations, screening those that show promising object recognition performance for further analysis. We show that this approach can yield significant, reproducible gains in performance across an array of basic object recognition tasks, consistently outperforming a variety of state-of-the-art purpose-built vision systems from the literature. As the scale of available computational power continues to expand, we argue that this approach has the potential to greatly accelerate progress in both artificial vision and our understanding of the computational underpinning of biological vision.

  4. A High-Throughput Screening Approach to Discovering Good Forms of Biologically Inspired Visual Representation

    PubMed Central

    Pinto, Nicolas; Doukhan, David; DiCarlo, James J.; Cox, David D.

    2009-01-01

    While many models of biological object recognition share a common set of “broad-stroke” properties, the performance of any one model depends strongly on the choice of parameters in a particular instantiation of that model—e.g., the number of units per layer, the size of pooling kernels, exponents in normalization operations, etc. Since the number of such parameters (explicit or implicit) is typically large and the computational cost of evaluating one particular parameter set is high, the space of possible model instantiations goes largely unexplored. Thus, when a model fails to approach the abilities of biological visual systems, we are left uncertain whether this failure is because we are missing a fundamental idea or because the correct “parts” have not been tuned correctly, assembled at sufficient scale, or provided with enough training. Here, we present a high-throughput approach to the exploration of such parameter sets, leveraging recent advances in stream processing hardware (high-end NVIDIA graphic cards and the PlayStation 3's IBM Cell Processor). In analogy to high-throughput screening approaches in molecular biology and genetics, we explored thousands of potential network architectures and parameter instantiations, screening those that show promising object recognition performance for further analysis. We show that this approach can yield significant, reproducible gains in performance across an array of basic object recognition tasks, consistently outperforming a variety of state-of-the-art purpose-built vision systems from the literature. As the scale of available computational power continues to expand, we argue that this approach has the potential to greatly accelerate progress in both artificial vision and our understanding of the computational underpinning of biological vision. PMID:19956750
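
    The screening methodology itself is easy to sketch: sample many parameter instantiations, evaluate each briefly, and keep the top performers for closer analysis. The search space and evaluation callable below are placeholders, not the parameter ranges explored in the paper:

        # Generic random-search screening over model hyperparameters.
        import random

        SEARCH_SPACE = {                     # placeholder ranges, not the paper's
            "units_per_layer": [64, 128, 256, 512],
            "pool_size": [3, 5, 7, 9],
            "norm_exponent": [1.0, 1.5, 2.0],
        }

        def screen(evaluate, n_candidates=1000, keep=10):
            """evaluate: callable mapping a parameter dict to a score in [0, 1]."""
            candidates = [
                {k: random.choice(v) for k, v in SEARCH_SPACE.items()}
                for _ in range(n_candidates)
            ]
            ranked = sorted(candidates, key=evaluate, reverse=True)
            return ranked[:keep]    # promising instantiations for further study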

  5. Improved dense trajectories for action recognition based on random projection and Fisher vectors

    NASA Astrophysics Data System (ADS)

    Ai, Shihui; Lu, Tongwei; Xiong, Yudian

    2018-03-01

    As an important application of intelligent monitoring systems, action recognition in video has become a major research area of computer vision. To improve the accuracy of action recognition in video using improved dense trajectories, an advanced encoding method is introduced that combines Fisher vectors with random projection. Random projection maps the high-dimensional trajectory descriptors into a low-dimensional subspace, over which a Gaussian mixture model is defined and analyzed. A GMM-FV hybrid model then encodes the trajectory feature vectors; because random projection shrinks the Fisher coding vectors, the computational complexity is reduced. Finally, a linear SVM classifier predicts the action labels. We tested the algorithm on the UCF101 and KTH datasets. Compared with existing algorithms, the results show that the method not only reduces computational complexity but also improves the accuracy of action recognition.
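
    A minimal sketch of the encoding pipeline described above, with scikit-learn: random projection of trajectory descriptors, a diagonal-covariance GMM in the reduced space, a mean-gradient Fisher vector, and a linear SVM. The dimensions and component counts are illustrative assumptions:

        # Random projection + GMM Fisher-vector encoding of trajectory descriptors.
        import numpy as np
        from sklearn.random_projection import GaussianRandomProjection
        from sklearn.mixture import GaussianMixture
        from sklearn.svm import LinearSVC

        rp = GaussianRandomProjection(n_components=64)   # descriptor reduction
        gmm = GaussianMixture(n_components=16, covariance_type="diag")
        clf = LinearSVC()                                # trained on Fisher vectors

        def fit_encoder(train_descriptors):
            """Fit the projection and the GMM vocabulary on training descriptors."""
            gmm.fit(rp.fit_transform(train_descriptors))

        def fisher_vector(video_descriptors):
            """Encode one video's descriptors as a mean-gradient Fisher vector."""
            z = rp.transform(video_descriptors)            # (n, 64)
            q = gmm.predict_proba(z)                       # (n, 16) soft assignments
            diff = z[:, None, :] - gmm.means_[None]        # (n, 16, 64)
            fv = (q[:, :, None] * diff / np.sqrt(gmm.covariances_)[None]).mean(axis=0)
            return fv.ravel() / np.sqrt(gmm.weights_).repeat(64)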

  6. Volumetric segmentation of range images for printed circuit board inspection

    NASA Astrophysics Data System (ADS)

    Van Dop, Erik R.; Regtien, Paul P. L.

    1996-10-01

    Conventional computer vision approaches to object recognition and pose estimation employ 2D grey-value or color imaging. As a consequence, these images contain information only about projections of a 3D scene. The subsequent image processing is then difficult, because object coordinates are represented with just image coordinates. Only complicated low-level vision modules, like depth from stereo or depth from shading, can recover some of the surface geometry of the scene. Recent advances in fast range imaging have, however, paved the way toward 3D computer vision, since range data of the scene can now be obtained with sufficient accuracy and speed for object recognition and pose estimation purposes. This article proposes the coded-light range-imaging method together with superquadric segmentation to approach this task. Superquadric segments are volumetric primitives that describe global object properties with 5 parameters, which provide the main features for object recognition. Besides, the principal axes of a superquadric segment determine the pose of an object in the scene. The volumetric segmentation of a range image can be used to detect missing, false, or badly placed components on assembled printed circuit boards. Furthermore, this approach will be useful for recognizing and extracting valuable or toxic electronic components in printed circuit board scrap that currently burdens the environment during electronic waste processing. Results on synthetic range images, with errors constructed according to a verified noise model, illustrate the capabilities of this approach.
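
    The five-parameter primitive referred to above is the superquadric, whose inside-outside function makes the parameterization concrete: F < 1 inside, F = 1 on the surface, F > 1 outside. A minimal evaluation in Python:

        # Superquadric inside-outside function; a1, a2, a3 are axis lengths and
        # eps1, eps2 the shape exponents (the five parameters of the primitive).
        import numpy as np

        def superquadric_f(x, y, z, a1, a2, a3, eps1, eps2):
            """Evaluate the inside-outside function at points (x, y, z)."""
            xy = (np.abs(x / a1) ** (2 / eps2)
                  + np.abs(y / a2) ** (2 / eps2)) ** (eps2 / eps1)
            return xy + np.abs(z / a3) ** (2 / eps1)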

  7. Heliophysics Data Environment: What's next? (Invited)

    NASA Astrophysics Data System (ADS)

    Martens, P.

    2010-12-01

    In the last two decades the Heliophysics community has witnessed the societal recognition of the importance of space weather and space climate for our technology and ecology, resulting in a renewed priority for, and investment in, Heliophysics. As a result of that, and of the explosive development of information technology, Heliophysics has experienced exponential growth in the amount and variety of data acquired, as well as easy electronic storage and distribution of these data. The Heliophysics community has responded well to these challenges. The first, most obvious, and most needed response was the development of Virtual Heliophysics Observatories. While the VxOs of Heliophysics still need a lot of work with respect to the expansion of search options and interoperability, I believe the basic structures and functionalities have been established and meet the needs of the community. In the future we'll see a refinement, completion, and integration of VxOs, not a fundamentally different approach -- in my opinion. The challenge posed by the huge increase in the amount of data is not met by VxOs alone. No individual scientist or group, even with the assistance of tons of graduate students, can analyze the torrent of data currently coming down from the fleet of heliospheric observatories. Once more, information technology provides an opportunity: automated feature recognition of solar imagery is feasible, has been implemented in a number of instances, and is strongly supported by NASA. For example, the SDO Feature Finding Team is developing a suite of 16 feature recognition modules for SDO imagery that operates in near-real time, produces space-weather warnings, and populates on-line event catalogs. Automated feature recognition -- "computer vision" -- not only saves enormous amounts of time in the analysis of events, it also allows for a shift from the analysis of single events to that of sets of features and events, the latter being by far the most important implication of computer vision. Consider some specific examples: from the on-line SDO metadata a user can produce, with a few IDL commands, information that previously would have taken years to compile, e.g.:
    - Draw a butterfly diagram for Active Regions,
    - Find all filaments that coincide with sigmoids and correlate the automatically detected sigmoid handedness with filament chirality,
    - Correlate EUV jets with small-scale flux emergence in coronal holes only,
    - Draw PIL maps with regions of high shear and large magnetic field gradients overlaid, to pinpoint potential flaring regions, then correlate with actual flare occurrence.
    I emphasize that access to those metadata will be provided by VxOs, and that the interplay between computer vision codes and data will be facilitated by VxOs. My vision for the near and medium term is for the VxOs to provide a simple and seamless interface between data, cataloged metadata, and computer vision software, either existing or newly developed by the user. Heliospheric virtual observatories and computer vision systems will work together to constantly monitor the Sun, provide space weather warnings, populate catalogs of metadata, analyze trends, and produce real-time on-line imagery of current events.

  8. Atoms of recognition in human and computer vision.

    PubMed

    Ullman, Shimon; Assif, Liav; Fetaya, Ethan; Harari, Daniel

    2016-03-08

    Discovering the visual features and representations used by the brain to recognize objects is a central problem in the study of vision. Recently, neural network models of visual object recognition, including biological and deep network models, have shown remarkable progress and have begun to rival human performance in some challenging tasks. These models are trained on image examples and learn to extract features and representations and to use them for categorization. It remains unclear, however, whether the representations and learning processes discovered by current models are similar to those used by the human visual system. Here we show, by introducing and using minimal recognizable images, that the human visual system uses features and processes that are not used by current models and that are critical for recognition. We found by psychophysical studies that at the level of minimal recognizable images a minute change in the image can have a drastic effect on recognition, thus identifying features that are critical for the task. Simulations then showed that current models cannot explain this sensitivity to precise feature configurations and, more generally, do not learn to recognize minimal images at a human level. The role of the features shown here is revealed uniquely at the minimal level, where the contribution of each feature is essential. A full understanding of the learning and use of such features will extend our understanding of visual recognition and its cortical mechanisms and will enhance the capacity of computational models to learn from visual experience and to deal with recognition and detailed image interpretation.

  9. Automatic image orientation detection via confidence-based integration of low-level and semantic cues.

    PubMed

    Luo, Jiebo; Boutell, Matthew

    2005-05-01

    Automatic image orientation detection for natural images is a useful, yet challenging research topic. Humans use scene context and semantic object recognition to identify the correct image orientation. However, it is difficult for a computer to perform the task in the same way because current object recognition algorithms are extremely limited in their scope and robustness. As a result, existing orientation detection methods were built upon low-level vision features such as spatial distributions of color and texture. Discrepant detection rates have been reported for these methods in the literature. We have developed a probabilistic approach to image orientation detection via confidence-based integration of low-level and semantic cues within a Bayesian framework. Our current accuracy is 90 percent for unconstrained consumer photos, impressive given the findings of a psychophysical study conducted recently. The proposed framework is an attempt to bridge the gap between computer and human vision systems and is applicable to other problems involving semantic scene content understanding.
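
    The confidence-based integration can be sketched generically: each cue contributes a likelihood over the four candidate orientations (0, 90, 180, 270 degrees), downweighted toward uniform as its confidence drops, and the cues are combined in a Bayesian product. The weighting scheme below is a generic assumption, not the paper's exact model:

        # Confidence-weighted Bayesian fusion of orientation cues.
        import numpy as np

        def integrate_cues(cue_likelihoods, cue_confidences, prior=None):
            """cue_likelihoods: (n_cues, 4) rows summing to 1; confidences in [0, 1]."""
            log_post = np.log(prior if prior is not None else np.full(4, 0.25))
            for lik, conf in zip(cue_likelihoods, cue_confidences):
                # blend each cue toward uninformative as its confidence drops
                blended = conf * lik + (1.0 - conf) * 0.25
                log_post += np.log(blended)
            post = np.exp(log_post - log_post.max())
            return post / post.sum()    # posterior over the four orientations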

  10. Image jitter enhances visual performance when spatial resolution is impaired.

    PubMed

    Watson, Lynne M; Strang, Niall C; Scobie, Fraser; Love, Gordon D; Seidel, Dirk; Manahilov, Velitchko

    2012-09-06

    Visibility of low-spatial frequency stimuli improves when their contrast is modulated at 5 to 10 Hz compared with stationary stimuli. Therefore, temporal modulations of visual objects could enhance the performance of low vision patients who primarily perceive images of low-spatial frequency content. We investigated the effect of retinal-image jitter on word recognition speed and facial emotion recognition in subjects with central visual impairment. Word recognition speed and accuracy of facial emotion discrimination were measured in volunteers with AMD under stationary and jittering conditions. Computer-driven and optoelectronic approaches were used to induce retinal-image jitter with duration of 100 or 166 ms and amplitude within the range of 0.5 to 2.6° visual angle. Word recognition speed was also measured for participants with simulated (Bangerter filters) visual impairment. Text jittering markedly enhanced word recognition speed for people with severe visual loss (101 ± 25%), while for those with moderate visual impairment, this effect was weaker (19 ± 9%). The ability of low vision patients to discriminate the facial emotions of jittering images improved by a factor of 2. A prototype of optoelectronic jitter goggles produced similar improvement in facial emotion discrimination. Word recognition speed in participants with simulated visual impairment was enhanced for interjitter intervals over 100 ms and reduced for shorter intervals. Results suggest that retinal-image jitter with optimal frequency and amplitude is an effective strategy for enhancing visual information processing in the absence of spatial detail. These findings will enable the development of novel tools to improve the quality of life of low vision patients.

  11. Mobile Diagnostics Based on Motion? A Close Look at Motility Patterns in the Schistosome Life Cycle

    PubMed Central

    Linder, Ewert; Varjo, Sami; Thors, Cecilia

    2016-01-01

    Imaging at high resolution and subsequent image analysis with modified mobile phones have the potential to solve problems related to microscopy-based diagnostics of parasitic infections in many endemic regions. Diagnostics using the computing power of "smartphones" is not restricted by limited expertise or by limitations set by the visual perception of a microscopist. Thus, diagnostics that currently depend almost exclusively on recognition of the morphological features of pathogenic organisms could be based on additional properties, such as motility characteristics recognizable by computer vision. Of special interest are infectious larval stages and "micro swimmers" of, e.g., the schistosome life cycle, which infect the intermediate and definitive hosts, respectively. The ciliated miracidium emerges from the excreted egg upon its contact with water. This means that, for diagnostics, recognition of a swimming miracidium is equivalent to recognition of an egg. The motility pattern of miracidia could be defined by computer vision and used as a diagnostic criterion. To develop motility-pattern-based diagnostics of schistosomiasis using simple imaging devices, we analyzed Paramecium as a model for the schistosome miracidium. As a model for invasive nematodes, such as strongyloids and filaria, we examined a different type of motility in the apathogenic nematode Turbatrix, the "vinegar eel." The results of motion time and frequency analysis suggest that target motility may be expressed as specific spectrograms serving as "diagnostic fingerprints." PMID:27322330
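
    A minimal sketch of how such a motility "fingerprint" could be computed, assuming a tracked speed signal sampled at the camera frame rate. The synthetic 6 Hz trace and the frame rate are illustrative stand-ins for real tracking output, and scipy's spectrogram substitutes for the authors' own motion time and frequency analysis.

      import numpy as np
      from scipy.signal import spectrogram

      fps = 30.0                       # assumed camera frame rate
      t = np.arange(0, 10, 1 / fps)
      # Stand-in motility trace: centroid speed of a tracked organism,
      # synthesized here as a 6 Hz beat typical of a fast "micro swimmer".
      speed = 1.0 + 0.5 * np.sin(2 * np.pi * 6 * t)

      # Short-time frequency content serves as the diagnostic fingerprint.
      f, seg_t, Sxx = spectrogram(speed, fs=fps, nperseg=64)
      dominant = f[Sxx.mean(axis=1).argmax()]
      print(f"dominant motility frequency ~ {dominant:.1f} Hz")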

  12. Mid-Level Vision and Recognition of Non-Rigid Objects.

    DTIC Science & Technology

    1993-01-01


  13. Intelligent Scene Analysis and Recognition

    DTIC Science & Technology

    2010-03-30


  14. Good initialization model with constrained body structure for scene text recognition

    NASA Astrophysics Data System (ADS)

    Zhu, Anna; Wang, Guoyou; Dong, Yangbo

    2016-09-01

    Scene text recognition has gained significant attention in the computer vision community. Character detection and recognition are the foundation of text recognition and largely determine its overall performance. We propose a good initialization model for scene character recognition from cropped text regions. We use constrained character body structures with deformable part-based models to detect and recognize characters against various backgrounds. The character body structures are obtained by an unsupervised discriminative clustering approach followed by a statistical model and a self-built minimum spanning tree model. Our method utilizes part appearance and location information and combines character detection and recognition within cropped text regions. Evaluation on benchmark datasets demonstrates that the proposed scheme outperforms state-of-the-art methods in both scene character recognition and word recognition.

  15. [Ophthalmologist and "computer vision syndrome"].

    PubMed

    Barar, A; Apatachioaie, Ioana Daniela; Apatachioaie, C; Marceanu-Brasov, L

    2007-01-01

    The authors collected the data available on the Internet about a subject that we consider largely ignored in the Romanian scientific literature and unexpectedly under-treated in the specialized ophthalmologic literature. Known in the specialty literature under the generic name of "computer vision syndrome", it is defined by the American Optometric Association as a complex of eye and vision problems related to activities that stress near vision and that are experienced during, or in relation to, computer use. During consultations we hear frequent complaints of eye strain (asthenopia), headaches, blurred distance and/or near vision, dry and irritated eyes, slow refocusing, neck and backache, photophobia, and double vision, but because of the lack of information we overlook them too easily, without going thoroughly into the real causes. In most developed countries, renowned medical associations have issued recommendations on the definition, diagnosis, prevention, treatment, and periodic monitoring of the symptoms found in computer users, in conjunction with extremely detailed ergonomic legislation. We found that these problems attract far too little interest in our country. We would like to rouse the interest of our ophthalmologist colleagues in understanding and recognizing these symptoms and in treating, or at least alleviating, them, through specialized measures or through cooperation with our colleagues in occupational medicine.

  16. Fast Legendre moment computation for template matching

    NASA Astrophysics Data System (ADS)

    Li, Bing C.

    2017-05-01

    Normalized cross correlation (NCC) based template matching is insensitive to intensity changes and has many applications in image processing, object detection, video tracking, and pattern recognition. However, NCC is computationally expensive since it involves both correlation computation and normalization. In this paper, we propose a Legendre moment approach for fast NCC implementation and show that its computational cost is independent of the template mask size, making it significantly faster than traditional mask-size-dependent approaches, especially for large templates. Legendre polynomials have been widely used in solving the Laplace equation in spherical coordinates in electrodynamics and in solving the Schrödinger equation in quantum mechanics. In this paper, we extend Legendre polynomials from physics to the computer vision and pattern recognition fields and demonstrate that they can significantly reduce the computational cost of NCC-based template matching.
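
    The paper's fast-NCC scheme itself is not reproduced here; the sketch below only illustrates the building block it relies on, computing the Legendre moments of an image patch mapped onto the orthogonality domain [-1, 1] x [-1, 1]. The function name and the discretization are our own assumptions.

      import numpy as np
      from numpy.polynomial import legendre as leg

      def legendre_moments(img, order):
          """Legendre moments lambda_pq of a patch, for p, q <= order.

          The patch is mapped onto [-1, 1] x [-1, 1], the domain on which
          Legendre polynomials are orthogonal.
          """
          h, w = img.shape
          y = np.linspace(-1, 1, h)
          x = np.linspace(-1, 1, w)
          Py = leg.legvander(y, order)        # (h, order+1) values of P_p(y)
          Px = leg.legvander(x, order)        # (w, order+1) values of P_q(x)
          dx, dy = 2 / (w - 1), 2 / (h - 1)
          raw = Py.T @ img @ Px * dx * dy     # discretized double integral
          p = np.arange(order + 1)
          norm = np.outer(2 * p + 1, 2 * p + 1) / 4.0
          return norm * raw

      moments = legendre_moments(np.random.rand(32, 32), order=4)
      print(moments.shape)   # (5, 5); low-order moments summarize the patch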

  17. The Next Wave: Humans, Computers, and Redefining Reality

    NASA Technical Reports Server (NTRS)

    Little, William

    2018-01-01

    The Augmented/Virtual Reality (AVR) Lab at KSC is dedicated to "exploration into the growing computer fields of Extended Reality and the Natural User Interface... (it is) a proving ground for new technologies that can be integrated into future NASA projects and programs." The topics of Human Computer Interface, Human Computer Interaction, Augmented Reality, Virtual Reality, and Mixed Reality are defined, and examples of work being done in these fields in the AVR Lab are given. Current and future work in Computer Vision, Speech Recognition, and Artificial Intelligence is also outlined.

  18. A Monocular SLAM Method to Estimate Relative Pose During Satellite Proximity Operations

    DTIC Science & Technology

    2015-03-26


  19. Remote Video Monitor of Vehicles in Cooperative Information Platform

    NASA Astrophysics Data System (ADS)

    Qin, Guofeng; Wang, Xiaoguo; Wang, Li; Li, Yang; Li, Qiyan

    Detection of vehicles plays an important role in modern intelligent traffic management, and pattern recognition is a hot issue in computer vision. An auto-recognition system in a cooperative information platform is studied. In the cooperative platform, 3G wireless networks, including GPS, GPRS (CDMA), Internet (Intranet), remote video monitoring, and M-DMB networks, are integrated. Remote video is captured at the terminals and sent to the cooperative platform, where it is processed by the auto-recognition system. The images are preprocessed and segmented, followed by feature extraction, template matching, and pattern recognition. The system identifies different vehicle models and gathers vehicular traffic statistics. Finally, the implementation of the system is introduced.

  20. Robot Command Interface Using an Audio-Visual Speech Recognition System

    NASA Astrophysics Data System (ADS)

    Ceballos, Alexánder; Gómez, Juan; Prieto, Flavio; Redarce, Tanneguy

    In recent years, audio-visual speech recognition has emerged as an active field of research thanks to advances in pattern recognition, signal processing, and machine vision. Its ultimate goal is to allow human-computer communication using voice, taking into account the visual information contained in the audio-visual speech signal. This document presents an automatic command-recognition system using audio-visual information. The system is intended to control the da Vinci laparoscopic robot. The audio signal is parametrized using the Mel Frequency Cepstral Coefficients method. In addition, features based on the points that define the mouth's outer contour according to the MPEG-4 standard are used to extract the visual speech information.
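
    A minimal sketch of the audio half of such a pipeline, assuming the librosa library and an illustrative WAV path; the visual MPEG-4 lip features are not covered here.

      import librosa

      # Load a mono command utterance (the path is illustrative).
      y, sr = librosa.load("command.wav", sr=16000)

      # 13 Mel Frequency Cepstral Coefficients per analysis frame, the
      # standard parametrization named in the abstract.
      mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
      print(mfcc.shape)   # (13, n_frames)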

  1. Static facial expression recognition with convolution neural networks

    NASA Astrophysics Data System (ADS)

    Zhang, Feng; Chen, Zhong; Ouyang, Chao; Zhang, Yifei

    2018-03-01

    Facial expression recognition is currently an active research topic in computer vision, pattern recognition, and artificial intelligence. In this paper, we develop a convolutional neural network (CNN) for classifying human emotions from static facial expressions into one of seven emotion categories. We pre-train our CNN model on the combined FER2013 dataset (its training, validation, and test sets) and fine-tune it on the extended Cohn-Kanade database. To reduce overfitting, we use dropout and batch normalization in addition to data augmentation. Experimental results show that our CNN model achieves excellent classification performance and robustness for facial expression recognition.
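
    A minimal Keras sketch of a CNN that uses the regularization techniques named in the abstract, dropout and batch normalization; the layer sizes and depth are illustrative, not the authors' architecture.

      import tensorflow as tf
      from tensorflow.keras import layers, models

      # 48x48 grayscale inputs as in FER2013; 7 emotion classes.
      model = models.Sequential([
          layers.Input(shape=(48, 48, 1)),
          layers.Conv2D(32, 3, padding="same", activation="relu"),
          layers.BatchNormalization(),
          layers.MaxPooling2D(),
          layers.Conv2D(64, 3, padding="same", activation="relu"),
          layers.BatchNormalization(),
          layers.MaxPooling2D(),
          layers.Flatten(),
          layers.Dense(128, activation="relu"),
          layers.Dropout(0.5),           # regularization named in the abstract
          layers.Dense(7, activation="softmax"),
      ])
      model.compile(optimizer="adam", loss="categorical_crossentropy",
                    metrics=["accuracy"])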

  2. A Review on Human Activity Recognition Using Vision-Based Method.

    PubMed

    Zhang, Shugang; Wei, Zhiqiang; Nie, Jie; Huang, Lei; Wang, Shuang; Li, Zhen

    2017-01-01

    Human activity recognition (HAR) aims to recognize activities from a series of observations on the actions of subjects and the environmental conditions. The vision-based HAR research is the basis of many applications including video surveillance, health care, and human-computer interaction (HCI). This review highlights the advances of state-of-the-art activity recognition approaches, especially for the activity representation and classification methods. For the representation methods, we sort out a chronological research trajectory from global representations to local representations, and recent depth-based representations. For the classification methods, we conform to the categorization of template-based methods, discriminative models, and generative models and review several prevalent methods. Next, representative and available datasets are introduced. Aiming to provide an overview of those methods and a convenient way of comparing them, we classify the existing literature with a detailed taxonomy including representation and classification methods, as well as the datasets they used. Finally, we investigate the directions for future research.

  3. A Review on Human Activity Recognition Using Vision-Based Method

    PubMed Central

    Nie, Jie

    2017-01-01

    Human activity recognition (HAR) aims to recognize activities from a series of observations on the actions of subjects and the environmental conditions. The vision-based HAR research is the basis of many applications including video surveillance, health care, and human-computer interaction (HCI). This review highlights the advances of state-of-the-art activity recognition approaches, especially for the activity representation and classification methods. For the representation methods, we sort out a chronological research trajectory from global representations to local representations, and recent depth-based representations. For the classification methods, we conform to the categorization of template-based methods, discriminative models, and generative models and review several prevalent methods. Next, representative and available datasets are introduced. Aiming to provide an overview of those methods and a convenient way of comparing them, we classify the existing literature with a detailed taxonomy including representation and classification methods, as well as the datasets they used. Finally, we investigate the directions for future research. PMID:29065585

  4. Activity Recognition in Egocentric video using SVM, kNN and Combined SVMkNN Classifiers

    NASA Astrophysics Data System (ADS)

    Sanal Kumar, K. P.; Bhavani, R., Dr.

    2017-08-01

    Egocentric vision is a unique, human-centric perspective in computer vision. Recognizing egocentric actions is a challenging task that can help in assisting elderly people, disabled patients, and so on. In this work, life-logging activity videos are taken as input and annotated at two levels, a top level and a second level. Recognition uses features such as Histogram of Oriented Gradients (HOG), Motion Boundary Histogram (MBH), and trajectories. The features are fused into a single descriptor and reduced using Principal Component Analysis (PCA). The reduced features are provided as input to classifiers: a Support Vector Machine (SVM), a k-nearest-neighbor classifier (kNN), and a combined SVM-kNN classifier. These classifiers are evaluated, and the combined SVM-kNN provides better results than other classifiers in the literature.
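
    One plausible reading of the combined SVM-kNN classifier is score-level fusion of the two base models. The scikit-learn sketch below assumes that reading, with random stand-in data in place of the fused HOG+MBH+trajectory descriptors.

      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.ensemble import VotingClassifier
      from sklearn.neighbors import KNeighborsClassifier
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.svm import SVC

      # Stand-in data: rows are videos, columns the fused descriptor (real
      # features would come from a video feature-extraction pipeline).
      X = np.random.rand(200, 500)
      y = np.random.randint(0, 4, 200)      # 4 illustrative activity labels

      clf = make_pipeline(
          StandardScaler(),
          PCA(n_components=50),             # dimensionality reduction step
          VotingClassifier([("svm", SVC(probability=True)),
                            ("knn", KNeighborsClassifier(5))],
                           voting="soft"),  # soft-vote the two classifiers
      )
      clf.fit(X[:150], y[:150])
      print(clf.score(X[150:], y[150:]))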

  5. New color vision tests to evaluate faulty color recognition.

    PubMed

    Nakamura, Kaoru; Okajima, Osamu; Nishio, Yoshiteru; Kitahara, Kenji

    2002-01-01

    To develop and assess new color vision tests to be used in evaluating faulty color recognition. We developed new color vision tests to evaluate faulty color recognition. The two types of color vision tests, designed to assess faulty color recognition in color vision deficiencies, are based on principles that are different from those of the conventional color vision tests. In the first test plate, the subject is asked to choose either a red, green, or gray line from among 10 lines that are randomly colored red, green, gray, yellow, or blue. The score is the difference between the number of correct answers and the number of incorrect answers. In the second test plate, the subject is asked to identify a total of 10 red azalea blossoms, which are dispersed among numerous green leaves. Seventy-five persons with congenital color deficiencies and 20 subjects with normal color vision were examined using these new test plates. The scores differed significantly between dichromats and anomalous trichromats, and between anomalous trichromats and subjects with normal color vision. The new tests are easy to use, sensitive, and have good reproducibility for use in discriminating subjects with color vision anomalies. These tests reveal the faulty color recognition that occurs unconsciously in persons with color deficiencies, and are useful in judging the quantification of color vision required in their daily life and occupations.

  6. Guidance of visual attention by semantic information in real-world scenes

    PubMed Central

    Wu, Chia-Chien; Wick, Farahnaz Ahmed; Pomplun, Marc

    2014-01-01

    Recent research on attentional guidance in real-world scenes has focused on object recognition within the context of a scene. This approach has been valuable for determining some factors that drive the allocation of visual attention and determine visual selection. This article provides a review of experimental work on how different components of context, especially semantic information, affect attentional deployment. We review work from the areas of object recognition, scene perception, and visual search, highlighting recent studies examining semantic structure in real-world scenes. A better understanding of how humans parse scene representations will not only improve current models of visual attention but also advance next-generation computer vision systems and human-computer interfaces. PMID:24567724

  7. The Ilac-Project Supporting Ancient Coin Classification by Means of Image Analysis

    NASA Astrophysics Data System (ADS)

    Kavelar, A.; Zambanini, S.; Kampel, M.; Vondrovec, K.; Siegl, K.

    2013-07-01

    This paper presents the ILAC project, which aims at the development of an automated image-based classification system for ancient Roman Republican coins. The benefits of such a system are manifold: operating at the interface between computer vision and numismatics, ILAC can reduce the day-to-day workload of numismatists by assisting them in classification tasks and providing a preselection of suitable coin classes. This is especially helpful for large coin hoard findings comprising several thousands of coins. Furthermore, the system could be implemented in an online platform for hobby numismatists, allowing them to access background information about their coin collection by simply uploading a photo of the obverse and reverse of the coin of interest. ILAC explores different computer vision techniques, and their combinations, for image-based coin recognition. Some of these methods, such as image matching, use the entire coin image in the classification process, while symbol or legend recognition exploit certain characteristics of the coin imagery. An overview of the methods explored so far and the respective experiments is given, as well as an outlook on the next steps of the project.
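
    A minimal sketch of the image-matching route mentioned above, using SIFT keypoints and Lowe's ratio test in OpenCV; the file names are illustrative, and the project's legend and symbol recognition methods are not covered.

      import cv2

      query = cv2.imread("coin_query.jpg", cv2.IMREAD_GRAYSCALE)
      reference = cv2.imread("coin_class_12.jpg", cv2.IMREAD_GRAYSCALE)

      # Local keypoint matching between a query coin photo and one
      # reference image per candidate class.
      sift = cv2.SIFT_create()
      kq, dq = sift.detectAndCompute(query, None)
      kr, dr = sift.detectAndCompute(reference, None)

      matcher = cv2.BFMatcher()
      matches = matcher.knnMatch(dq, dr, k=2)
      # Lowe's ratio test keeps only distinctive correspondences; the
      # class with the most surviving matches would win.
      good = [m for m, n in matches if m.distance < 0.75 * n.distance]
      print(f"{len(good)} good matches to this coin class")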

  8. Development of Collaborative Research Initiatives to Advance the Aerospace Sciences-via the Communications, Electronics, Information Systems Focus Group

    NASA Technical Reports Server (NTRS)

    Knasel, T. Michael

    1996-01-01

    The primary goal of the Adaptive Vision Laboratory Research project was to develop advanced computer vision systems for automatic target recognition. The approach used in this effort combined several machine learning paradigms, including evolutionary learning algorithms, neural networks, and adaptive clustering techniques, to develop the E-MORPH system. This system is capable of generating pattern recognition systems to solve a wide variety of complex recognition tasks. A series of simulation experiments was conducted using E-MORPH to solve problems in OCR, military target recognition, industrial inspection, and medical image analysis. The bulk of the funds provided through this grant were used to purchase computer hardware and software to support these computationally intensive simulations. The payoff from this effort is the reduced need for human involvement in the design and implementation of recognition systems. We have shown that the techniques used in E-MORPH are generic and readily transition to other problem domains. Specifically, E-MORPH is a multi-phase evolutionary learning system that evolves cooperative sets of feature detectors and combines their responses using an adaptive classifier to form a complete pattern recognition system. The system can operate on binary or grayscale images. In our most recent experiments, we used multi-resolution images formed by applying a Gabor wavelet transform to a set of grayscale input images. To begin the learning process, candidate chips are extracted from the multi-resolution images to form a training set and a test set. A population of detector sets is randomly initialized to start the evolutionary process. Using a combination of evolutionary programming and genetic algorithms, the feature detectors are enhanced to solve a recognition problem. The design of E-MORPH and recognition results for a complex problem in medical image analysis are described at the end of this report. The specific task involves the identification of vertebrae in x-ray images of human spinal columns. This problem is extremely challenging because the individual vertebrae exhibit variation in shape, scale, orientation, and contrast. E-MORPH generated several accurate recognition systems to solve this task. This dual use of the ATR technology clearly demonstrates the flexibility and power of our approach.

  9. Compact VLSI neural computer integrated with active pixel sensor for real-time ATR applications

    NASA Astrophysics Data System (ADS)

    Fang, Wai-Chi; Udomkesmalee, Gabriel; Alkalai, Leon

    1997-04-01

    A compact VLSI neural computer integrated with an active pixel sensor has been under development to mimic what is inherent in biological vision systems. This electronic eye-brain computer is targeted at real-time machine vision applications that require both high-bandwidth communication and high-performance computing for data sensing, the synergy of multiple types of sensory information, feature extraction, target detection, target recognition, and control functions. The neural computer is based on a composite structure combining an Annealing Cellular Neural Network (ACNN) and a Hierarchical Self-Organization Neural Network (HSONN). The ACNN architecture is a programmable and scalable multi-dimensional array of annealing neurons, each locally connected with its neighboring neurons, while the HSONN adopts a hierarchical structure with nonlinear basis functions. The ACNN+HSONN neural computer is designed to perform programmable machine vision processing functions at all levels with its embedded host processor. It provides a two-order-of-magnitude increase in computing power over state-of-the-art microcomputer and DSP microelectronics. The current-mode VLSI design feasibility of the ACNN+HSONN neural computer is demonstrated by a 3D 16X8X9-cube neural processor chip design in a 2-micrometer CMOS technology. Integration of this neural computer as one slice of a 4'X4' multichip module into the 3D MCM-based avionics architecture for NASA's New Millennium Program is also described.

  10. Heuristics in primary care for recognition of unreported vision loss in older people: a technology development study.

    PubMed

    Wijeyekoon, Skanda; Kharicha, Kalpa; Iliffe, Steve

    2015-09-01

    To evaluate heuristics (rules of thumb) for recognition of undetected vision loss in older patients in primary care. Vision loss is associated with ageing, and its prevalence is increasing. Visual impairment has a broad impact on health, functioning, and well-being. Unrecognised vision loss remains common, and screening interventions have yet to reduce its prevalence. An alternative approach is to enhance practitioners' skills in recognising undetected vision loss by giving them a more detailed picture of those who are likely not to act on vision changes, report symptoms, or have eye tests. This paper describes a qualitative technology development study to evaluate heuristics for recognition of undetected vision loss in older patients in primary care. Using a previous modelling study, two heuristics in the form of mnemonics were developed to aid pattern recognition and allow general practitioners to identify potential cases of unreported vision loss. These heuristics were then analysed with experts. It was concluded that their implementation in modern general practice was unsuitable and that an alternative solution should be sought.

  11. Real-Time (Vision-Based) Road Sign Recognition Using an Artificial Neural Network.

    PubMed

    Islam, Kh Tohidul; Raj, Ram Gopal

    2017-04-13

    Road sign recognition is a driver support function that can be used to notify and warn the driver by showing the restrictions that may be effective on the current stretch of road. Examples of such regulations are 'traffic light ahead' or 'pedestrian crossing' indications. The present investigation targets the recognition of Malaysian road and traffic signs in real time. Real-time video is taken by a digital camera from a moving vehicle, and real-world road signs are then extracted using vision-only information. The system is based on two stages: one performs detection and the other recognition. In the first stage, a hybrid color segmentation algorithm has been developed and tested. In the second stage, a newly introduced robust custom feature extraction method is used for the first time in a road sign recognition approach. Finally, a multilayer artificial neural network (ANN) has been created to recognize and interpret various road signs. It is robust because it has been tested on both standard and non-standard road signs with significant recognition accuracy. The proposed system achieved an average of 99.90% accuracy, with 99.90% sensitivity, 99.90% specificity, 99.90% f-measure, and a 0.001 false positive rate (FPR) with 0.3 s computational time. This low FPR can increase the system's stability and dependability in real-time applications.
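
    A minimal sketch of the color-segmentation idea behind such a detection stage, assuming OpenCV and illustrative HSV thresholds for red signs; the paper's hybrid segmentation algorithm is not reproduced.

      import cv2
      import numpy as np

      frame = cv2.imread("frame.jpg")             # illustrative input frame
      hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

      # Red wraps around the hue axis, so combine two ranges; the
      # thresholds here are illustrative, not the paper's tuned values.
      lower = cv2.inRange(hsv, (0, 100, 80), (10, 255, 255))
      upper = cv2.inRange(hsv, (170, 100, 80), (180, 255, 255))
      mask = cv2.bitwise_or(lower, upper)

      # Candidate sign regions become contours for the recognition stage.
      contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                     cv2.CHAIN_APPROX_SIMPLE)
      candidates = [cv2.boundingRect(c) for c in contours
                    if cv2.contourArea(c) > 400]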

  12. Extricating Manual and Non-Manual Features for Subunit Level Medical Sign Modelling in Automatic Sign Language Classification and Recognition.

    PubMed

    R, Elakkiya; K, Selvamani

    2017-09-22

    Subunit segmentation and modelling in medical sign language is one of the important studies in linguistics-oriented and vision-based Sign Language Recognition (SLR). Many previous efforts focused on functional subunits from the viewpoint of linguistic syllables, but such syllable-based subunit extraction is not feasible with real-world computer vision techniques. Moreover, present recognition systems are designed to detect signer-dependent actions only under restricted laboratory conditions. This research paper aims at solving these two important issues: (1) subunit extraction and (2) signer-independent action in visual sign language recognition. Subunit extraction involves the sequential and parallel breakdown of sign gestures without any prior knowledge of syllables or of the number of subunits. A novel Bayesian Parallel Hidden Markov Model (BPaHMM) is introduced for subunit extraction, combining the features of manual and non-manual parameters to yield better results in the classification and recognition of signs. Signer-independent operation aims at using a single web camera for different signer behaviour patterns and for cross-signer validation. Experimental results show that the proposed signer-independent subunit-level modelling for sign language classification and recognition improves on other existing works.

  13. Real-Time (Vision-Based) Road Sign Recognition Using an Artificial Neural Network

    PubMed Central

    Islam, Kh Tohidul; Raj, Ram Gopal

    2017-01-01

    Road sign recognition is a driver support function that can be used to notify and warn the driver by showing the restrictions that may be effective on the current stretch of road. Examples of such regulations are ‘traffic light ahead’ or ‘pedestrian crossing’ indications. The present investigation targets the recognition of Malaysian road and traffic signs in real time. Real-time video is taken by a digital camera from a moving vehicle, and real-world road signs are then extracted using vision-only information. The system is based on two stages: one performs detection and the other recognition. In the first stage, a hybrid color segmentation algorithm has been developed and tested. In the second stage, a newly introduced robust custom feature extraction method is used for the first time in a road sign recognition approach. Finally, a multilayer artificial neural network (ANN) has been created to recognize and interpret various road signs. It is robust because it has been tested on both standard and non-standard road signs with significant recognition accuracy. The proposed system achieved an average of 99.90% accuracy, with 99.90% sensitivity, 99.90% specificity, 99.90% f-measure, and a 0.001 false positive rate (FPR) with 0.3 s computational time. This low FPR can increase the system's stability and dependability in real-time applications. PMID:28406471

  14. Agnosic vision is like peripheral vision, which is limited by crowding.

    PubMed

    Strappini, Francesca; Pelli, Denis G; Di Pace, Enrico; Martelli, Marialuisa

    2017-04-01

    Visual agnosia is a neuropsychological impairment of visual object recognition despite near-normal acuity and visual fields. A century of research has provided only a rudimentary account of the functional damage underlying this deficit. We find that the object-recognition ability of agnosic patients viewing an object directly is like that of normally-sighted observers viewing it indirectly, with peripheral vision. Thus, agnosic vision is like peripheral vision. We obtained 14 visual-object-recognition tests that are commonly used for diagnosis of visual agnosia. Our "standard" normal observer took these tests at various eccentricities in his periphery. Analyzing the published data of 32 apperceptive agnosia patients and a group of 14 posterior cortical atrophy (PCA) patients on these tests, we find that each patient's pattern of object recognition deficits is well characterized by one number, the equivalent eccentricity at which our standard observer's peripheral vision is like the central vision of the agnosic patient. In other words, each agnosic patient's equivalent eccentricity is conserved across tests. Across patients, equivalent eccentricity ranges from 4 to 40 deg, which rates the severity of the visual deficit. In normal peripheral vision, the required size to perceive a simple image (e.g., an isolated letter) is limited by acuity, and that for a complex image (e.g., a face or a word) is limited by crowding. In crowding, adjacent simple objects appear unrecognizably jumbled unless their spacing exceeds the crowding distance, which grows linearly with eccentricity. Besides conservation of equivalent eccentricity across object-recognition tests, we also find conservation, from eccentricity to agnosia, of the relative susceptibility of recognition across ten visual tests. These findings show that agnosic vision is like eccentric vision. Whence crowding? Peripheral vision, strabismic amblyopia, and possibly apperceptive agnosia are all limited by crowding, making it urgent to know what drives crowding. Acuity does not (Song et al., 2014), but neural density might: neurons per deg² in the crowding-relevant cortical area.

  15. Posture recognition based on fuzzy logic for home monitoring of the elderly.

    PubMed

    Brulin, Damien; Benezeth, Yannick; Courtial, Estelle

    2012-09-01

    We propose in this paper a computer vision-based posture recognition method for home monitoring of the elderly. The proposed system performs human detection prior to the posture analysis; posture recognition is performed only on a human silhouette. The human detection approach has been designed to be robust to different environmental stimuli. Thus, posture is analyzed with simple and efficient features that are not designed to manage constraints related to the environment but only to describe human silhouettes. The posture recognition method, based on fuzzy logic, identifies four static postures and is robust to variation in the distance between the camera and the person, and to the person's morphology. With a satisfactory posture recognition accuracy of 74.29%, this approach can detect emergency situations such as a fall within a health smart home.
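
    A minimal sketch of fuzzy posture classification from a silhouette's bounding box, assuming triangular membership functions over the height-to-width ratio; the breakpoints and the posture set are illustrative, not the paper's tuned system.

      def tri(x, a, b, c):
          """Triangular fuzzy membership function peaking at b."""
          if x <= a or x >= c:
              return 0.0
          return (x - a) / (b - a) if x < b else (c - x) / (c - b)

      def classify_posture(width, height):
          """Fuzzy classification of a silhouette's bounding box."""
          ratio = height / width
          memberships = {
              "lying":    tri(ratio, 0.0, 0.4, 0.8),
              "sitting":  tri(ratio, 0.6, 1.1, 1.6),
              "bending":  tri(ratio, 1.2, 1.7, 2.2),
              "standing": tri(ratio, 1.8, 2.6, 10.0),
          }
          return max(memberships, key=memberships.get), memberships

      print(classify_posture(60, 170))   # tall, narrow box -> "standing"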

  16. Tensor Rank Preserving Discriminant Analysis for Facial Recognition.

    PubMed

    Tao, Dapeng; Guo, Yanan; Li, Yaotang; Gao, Xinbo

    2017-10-12

    Facial recognition, one of the basic topics in computer vision and pattern recognition, has received substantial attention in recent years. However, in traditional facial recognition algorithms the facial images are reshaped into a long vector, thereby losing part of the original spatial constraints of each pixel. In this paper, a new tensor-based feature extraction algorithm termed tensor rank preserving discriminant analysis (TRPDA) for facial image recognition is proposed. The method involves two stages: in the first stage, the low-dimensional tensor subspace of the original input tensor samples is obtained; in the second stage, discriminative locality alignment is utilized to obtain the ultimate vector feature representation for subsequent facial recognition. On the one hand, the proposed TRPDA algorithm fully utilizes the natural structure of the input samples and applies an optimization criterion that can directly handle the tensor spectral analysis problem, thereby decreasing the computation cost compared with traditional tensor-based feature selection algorithms. On the other hand, the proposed TRPDA algorithm extracts features by finding a tensor subspace that preserves most of the rank-order information of the intra-class input samples. Experiments on three facial databases are performed to determine the effectiveness of the proposed TRPDA algorithm.

  17. Evaluation of Available Software for Reconstruction of a Structure from its Imagery

    DTIC Science & Technology

    2017-04-01


  18. Adaptive Machine Vision.

    DTIC Science & Technology

    1992-06-18

    Examines the properties and computational requirements of the Neocognitron, a pattern recognition neural network developed by Fukushima; the RADONN effort builds upon this network. The system has potential use for SDI target/decoy discrimination, with simulated angle-angle and range-Doppler imagery used for testing purposes.

  19. Toward a Computer Vision-based Wayfinding Aid for Blind Persons to Access Unfamiliar Indoor Environments.

    PubMed

    Tian, Yingli; Yang, Xiaodong; Yi, Chucai; Arditi, Aries

    2013-04-01

    Independent travel is a well known challenge for blind and visually impaired persons. In this paper, we propose a proof-of-concept computer vision-based wayfinding aid for blind people to independently access unfamiliar indoor environments. In order to find different rooms (e.g. an office, a lab, or a bathroom) and other building amenities (e.g. an exit or an elevator), we incorporate object detection with text recognition. First we develop a robust and efficient algorithm to detect doors, elevators, and cabinets based on their general geometric shape, by combining edges and corners. The algorithm is general enough to handle large intra-class variations of objects with different appearances among different indoor environments, as well as small inter-class differences between different objects such as doors and door-like cabinets. Next, in order to distinguish intra-class objects (e.g. an office door from a bathroom door), we extract and recognize text information associated with the detected objects. For text recognition, we first extract text regions from signs with multiple colors and possibly complex backgrounds, and then apply character localization and topological analysis to filter out background interference. The extracted text is recognized using off-the-shelf optical character recognition (OCR) software products. The object type, orientation, location, and text information are presented to the blind traveler as speech.
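
    A minimal sketch of the text-recognition step, assuming OpenCV plus the pytesseract wrapper around the Tesseract OCR engine; the input path and the Otsu binarization are illustrative choices.

      import cv2
      import pytesseract

      sign = cv2.imread("door_sign.jpg")        # illustrative detected region
      gray = cv2.cvtColor(sign, cv2.COLOR_BGR2GRAY)

      # Binarize to suppress the complex background before OCR; Otsu picks
      # the threshold automatically.
      _, binary = cv2.threshold(gray, 0, 255,
                                cv2.THRESH_BINARY + cv2.THRESH_OTSU)

      # Off-the-shelf OCR, as in the paper; the extracted string would then
      # be passed to a text-to-speech engine for the traveler.
      text = pytesseract.image_to_string(binary)
      print(text.strip())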

  20. Recognizing Materials using Perceptually Inspired Features

    PubMed Central

    Sharan, Lavanya; Liu, Ce; Rosenholtz, Ruth; Adelson, Edward H.

    2013-01-01

    Our world consists not only of objects and scenes but also of materials of various kinds. Being able to recognize the materials that surround us (e.g., plastic, glass, concrete) is important for humans as well as for computer vision systems. Unfortunately, materials have received little attention in the visual recognition literature, and very few computer vision systems have been designed specifically to recognize materials. In this paper, we present a system for recognizing material categories from single images. We propose a set of low and mid-level image features that are based on studies of human material recognition, and we combine these features using an SVM classifier. Our system outperforms a state-of-the-art system [Varma and Zisserman, 2009] on a challenging database of real-world material categories [Sharan et al., 2009]. When the performance of our system is compared directly to that of human observers, humans outperform our system quite easily. However, when we account for the local nature of our image features and the surface properties they measure (e.g., color, texture, local shape), our system rivals human performance. We suggest that future progress in material recognition will come from: (1) a deeper understanding of the role of non-local surface properties (e.g., extended highlights, object identity); and (2) efforts to model such non-local surface properties in images. PMID:23914070

  1. Toward a Computer Vision-based Wayfinding Aid for Blind Persons to Access Unfamiliar Indoor Environments

    PubMed Central

    Tian, YingLi; Yang, Xiaodong; Yi, Chucai; Arditi, Aries

    2012-01-01

    Independent travel is a well known challenge for blind and visually impaired persons. In this paper, we propose a proof-of-concept computer vision-based wayfinding aid for blind people to independently access unfamiliar indoor environments. In order to find different rooms (e.g. an office, a lab, or a bathroom) and other building amenities (e.g. an exit or an elevator), we incorporate object detection with text recognition. First we develop a robust and efficient algorithm to detect doors, elevators, and cabinets based on their general geometric shape, by combining edges and corners. The algorithm is general enough to handle large intra-class variations of objects with different appearances among different indoor environments, as well as small inter-class differences between different objects such as doors and door-like cabinets. Next, in order to distinguish intra-class objects (e.g. an office door from a bathroom door), we extract and recognize text information associated with the detected objects. For text recognition, we first extract text regions from signs with multiple colors and possibly complex backgrounds, and then apply character localization and topological analysis to filter out background interference. The extracted text is recognized using off-the-shelf optical character recognition (OCR) software products. The object type, orientation, location, and text information are presented to the blind traveler as speech. PMID:23630409

  2. A self-learning camera for the validation of highly variable and pseudorandom patterns

    NASA Astrophysics Data System (ADS)

    Kelley, Michael

    2004-05-01

    Reliable and productive manufacturing operations have depended on people to quickly detect and solve problems whenever they appear. Over the last 20 years, more and more manufacturing operations have embraced machine vision systems to increase productivity, reliability, and cost-effectiveness, including reducing the number of human operators required. Although machine vision technology has long been capable of solving simple problems, it has still not been broadly implemented. The reason is that, until now, no machine vision system has been designed to meet the unique demands of complicated pattern recognition. The ZiCAM family was specifically developed to be the first practical hardware to meet these needs. To address non-traditional applications, the machine vision industry must include smart camera technology that meets its users' demands for lower costs, better performance, and the ability to handle irregular lighting, patterns, and color. Next-generation smart cameras will need to evolve as a fundamentally different kind of sensor, with new technology that behaves like a human but performs like a computer. Neural-network-based systems, coupled with self-taught, n-space, non-linear modeling, promise to be the enabler of the next generation of machine vision equipment. Image processing technology is now available that enables a system to match an operator's subjectivity. A Zero-Instruction-Set-Computer (ZISC) powered smart camera allows high-speed fuzzy-logic processing without the need for computer programming, and can address the validation of highly variable and pseudo-random patterns. As a hardware-based implementation of a neural network, the Zero-Instruction-Set-Computer enables a vision system to "think" and "inspect" like a human, with the speed and reliability of a machine.

  3. What makes a cell face-selective: the importance of contrast

    PubMed Central

    Ohayon, Shay; Freiwald, Winrich A; Tsao, Doris Y

    2012-01-01

    Faces are robustly detected by computer vision algorithms that search for characteristic coarse contrast features. Here, we investigated whether face-selective cells in the primate brain exploit contrast features as well. We recorded from face-selective neurons in macaque inferotemporal cortex, while presenting a face-like collage of regions whose luminances were changed randomly. Modulating contrast combinations between regions induced activity changes ranging from no response to a response greater than that to a real face in 50% of cells. The critical stimulus factor determining response magnitude was contrast polarity, e.g., nose region brighter than left eye. Contrast polarity preferences were consistent across cells, suggesting a common computational strategy across the population, and matched features used by computer vision algorithms for face detection. Furthermore, most cells were tuned both for contrast polarity and for the geometry of facial features, suggesting cells encode information useful both for detection and recognition. PMID:22578507

  4. PCI bus content-addressable-memory (CAM) implementation on FPGA for pattern recognition/image retrieval in a distributed environment

    NASA Astrophysics Data System (ADS)

    Megherbi, Dalila B.; Yan, Yin; Tanmay, Parikh; Khoury, Jed; Woods, C. L.

    2004-11-01

    Recently, surveillance and Automatic Target Recognition (ATR) applications have been increasing as the cost of the computing power needed to process the massive amount of information continues to fall. This computing power has been made possible partly by the latest advances in FPGAs and SOPCs. In particular, to design and implement state-of-the-art electro-optical imaging systems that provide advanced surveillance capabilities, there is a need to integrate several technologies (e.g., telescopes, precise optics, cameras, and image/computer vision algorithms, which can be geographically distributed or share distributed resources) into programmable and DSP systems. Additionally, pattern recognition techniques and fast information retrieval are often important components of intelligent systems. The aim of this work is to use an embedded FPGA as a fast, configurable, and synthesizable search engine for fast image pattern recognition/retrieval in a distributed hardware/software co-design environment. In particular, we propose a low-cost Content Addressable Memory (CAM)-based distributed embedded FPGA hardware architecture with real-time recognition capabilities for pattern look-up, pattern recognition, and image retrieval. We show how the distributed CAM-based architecture offers an order-of-magnitude performance advantage over Random Access Memory (RAM)-based search for implementing high-speed pattern recognition for image retrieval. The methods of designing, implementing, and analyzing the proposed CAM-based embedded architecture are described, and other SOPC solutions and design issues are covered. Finally, experimental results, hardware verification, and performance evaluations using both the Xilinx Virtex-II and the Altera Apex20k are provided to show the potential and power of the proposed method for low-cost reconfigurable fast image pattern recognition/retrieval at the hardware/software co-design level.

  5. Constructive autoassociative neural network for facial recognition.

    PubMed

    Fernandes, Bruno J T; Cavalcanti, George D C; Ren, Tsang I

    2014-01-01

    Autoassociative artificial neural networks have been used in many different computer vision applications. However, it is difficult to define the most suitable neural network architecture because this definition is based on previous knowledge and depends on the problem domain. To address this problem, we propose a constructive autoassociative neural network called CANet (Constructive Autoassociative Neural Network). CANet integrates the concepts of receptive fields and autoassociative memory in a dynamic architecture that changes the configuration of the receptive fields by adding new neurons in the hidden layer, while a pruning algorithm removes neurons from the output layer. Neurons in the CANet output layer present lateral inhibitory connections that improve the recognition rate. Experiments in face recognition and facial expression recognition show that the CANet outperforms other methods presented in the literature.

  6. Stereo vision with distance and gradient recognition

    NASA Astrophysics Data System (ADS)

    Kim, Soo-Hyun; Kang, Suk-Bum; Yang, Tae-Kyu

    2007-12-01

    Robot vision technology is needed for stable walking, object recognition, and movement to a target spot. With sensors that use infrared rays or ultrasound, a robot can cope with urgent or dangerous situations, but stereo vision of three-dimensional space gives a robot much more powerful artificial intelligence. In this paper we consider stereo vision for the stable and correct movement of a biped robot. When a robot confronts an inclined plane or steps, particular algorithms are needed to proceed without failure. This study developed an algorithm for recognizing the distance and gradient of the environment through a stereo matching process.
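
    A minimal sketch of distance recognition from a rectified stereo pair, using OpenCV block matching and the similar-triangles depth relation Z = fB/d; the matcher parameters and calibration values are illustrative assumptions.

      import cv2
      import numpy as np

      left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # rectified pair
      right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

      # Block-matching stereo; parameters are illustrative starting points.
      stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
      disparity = stereo.compute(left, right).astype(np.float32) / 16.0

      # Depth from similar triangles: Z = f * B / d, with f the focal
      # length in pixels and B the baseline in meters (assumed calibration
      # values here).
      f_px, baseline_m = 700.0, 0.12
      valid = disparity > 0
      depth = np.zeros_like(disparity)
      depth[valid] = f_px * baseline_m / disparity[valid]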

  7. Spatial-frequency cutoff requirements for pattern recognition in central and peripheral vision

    PubMed Central

    Kwon, MiYoung; Legge, Gordon E.

    2011-01-01

    It is well known that object recognition requires spatial frequencies exceeding some critical cutoff value. People with central scotomas who rely on peripheral vision have substantial difficulty with reading and face recognition. Deficiencies of pattern recognition in peripheral vision, might result in higher cutoff requirements, and may contribute to the functional problems of people with central-field loss. Here we asked about differences in spatial-cutoff requirements in central and peripheral vision for letter and face recognition. The stimuli were the 26 letters of the English alphabet and 26 celebrity faces. Each image was blurred using a low-pass filter in the spatial frequency domain. Critical cutoffs (defined as the minimum low-pass filter cutoff yielding 80% accuracy) were obtained by measuring recognition accuracy as a function of cutoff (in cycles per object). Our data showed that critical cutoffs increased from central to peripheral vision by 20% for letter recognition and by 50% for face recognition. We asked whether these differences could be accounted for by central/peripheral differences in the contrast sensitivity function (CSF). We addressed this question by implementing an ideal-observer model which incorporates empirical CSF measurements and tested the model on letter and face recognition. The success of the model indicates that central/peripheral differences in the cutoff requirements for letter and face recognition can be accounted for by the information content of the stimulus limited by the shape of the human CSF, combined with a source of internal noise and followed by an optimal decision rule. PMID:21854800
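
    A minimal numpy sketch of the low-pass filtering used to impose a spatial-frequency cutoff, assuming a square image that the object fills so that cycles per image equal cycles per object; the ideal (hard) filter is an illustrative choice.

      import numpy as np

      def low_pass(img, cutoff_cycles):
          """Ideal low-pass filter with the cutoff in cycles per image."""
          n = img.shape[0]
          f = np.fft.fftfreq(n) * n               # frequency in cycles/image
          fx, fy = np.meshgrid(f, f)
          keep = np.sqrt(fx ** 2 + fy ** 2) <= cutoff_cycles
          spectrum = np.fft.fft2(img)
          return np.real(np.fft.ifft2(spectrum * keep))

      # Blur a stand-in stimulus down to 8 cycles per object.
      blurred = low_pass(np.random.rand(128, 128), cutoff_cycles=8)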

  8. Aging and solid shape recognition: Vision and haptics.

    PubMed

    Norman, J Farley; Cheeseman, Jacob R; Adkins, Olivia C; Cox, Andrea G; Rogers, Connor E; Dowell, Catherine J; Baxter, Michael W; Norman, Hideko F; Reyes, Cecia M

    2015-10-01

    The ability of 114 younger and older adults to recognize naturally-shaped objects was evaluated in three experiments. The participants viewed or haptically explored six randomly-chosen bell peppers (Capsicum annuum) in a study session and were later required to judge whether each of twelve bell peppers was "old" (previously presented during the study session) or "new" (not presented during the study session). When recognition memory was tested immediately after study, the younger adults' (Experiment 1) performance for vision and haptics was identical when the individual study objects were presented once. Vision became superior to haptics, however, when the individual study objects were presented multiple times. When 10- and 20-min delays (Experiment 2) were inserted in between study and test sessions, no significant differences occurred between vision and haptics: recognition performance in both modalities was comparable. When the recognition performance of older adults was evaluated (Experiment 3), a negative effect of age was found for visual shape recognition (younger adults' overall recognition performance was 60% higher). There was no age effect, however, for haptic shape recognition. The results of the present experiments indicate that the visual recognition of natural object shape is different from haptic recognition in multiple ways: visual shape recognition can be superior to that of haptics and is affected by aging, while haptic shape recognition is less accurate and unaffected by aging.

  9. Laptop Computer - Based Facial Recognition System Assessment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    R. A. Cain; G. B. Singleton

    2001-03-01

    The objective of this project was to assess the performance of the leading commercial-off-the-shelf (COTS) facial recognition software package when used as a laptop application. We performed the assessment to determine the system's usefulness for enrolling facial images in a database from remote locations and conducting real-time searches against a database of previously enrolled images. The assessment involved creating a database of 40 images and conducting 2 series of tests to determine the product's ability to recognize and match subject faces under varying conditions. This report describes the test results and includes a description of the factors affecting the results. After an extensive market survey, we selected Visionics' FaceIt® software package for evaluation and a review of the Facial Recognition Vendor Test 2000 (FRVT 2000). This test was co-sponsored by the US Department of Defense (DOD) Counterdrug Technology Development Program Office, the National Institute of Justice, and the Defense Advanced Research Projects Agency (DARPA). Administered in May-June 2000, the FRVT 2000 assessed the capabilities of facial recognition systems that were currently available for purchase on the US market. Our selection of this Visionics product does not indicate that it is the ''best'' facial recognition software package for all uses. It was the most appropriate package based on the specific applications and requirements for this specific application. In this assessment, the system configuration was evaluated for effectiveness in identifying individuals by searching for facial images captured from video displays against those stored in a facial image database. An additional criterion was that the system be capable of operating discretely. For this application, an operational facial recognition system would consist of one central computer hosting the master image database with multiple standalone systems configured with duplicates of the master operating in remote locations. Remote users could perform real-time searches where network connectivity is not available. As images are enrolled at the remote locations, periodic database synchronization is necessary.

  10. A real time mobile-based face recognition with fisherface methods

    NASA Astrophysics Data System (ADS)

    Arisandi, D.; Syahputra, M. F.; Putri, I. L.; Purnamawati, S.; Rahmat, R. F.; Sari, P. P.

    2018-03-01

    Face recognition is a research field in computer vision that studies how to learn faces and determine the identity of a face from a picture sent to the system. By utilizing face recognition technology, learning the identities of fellow students at a university becomes simpler: a student will not need to browse the student directory on the university's server site looking for a person with certain facial traits. To achieve this goal, the face recognition application uses image processing methods in two phases, a pre-processing phase and a recognition phase. In the pre-processing phase, the system processes the input image into the best image for the recognition phase; the purpose is to reduce noise and increase signal in the image. For the recognition phase, we use the Fisherface method, chosen because it performs well with limited data. In our experiments, the accuracy of face recognition using Fisherface is 90%.
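
    A minimal sketch of the recognition phase, assuming the FisherFaceRecognizer shipped with opencv-contrib-python; the random stand-in gallery replaces real, preprocessed student face crops.

      import cv2
      import numpy as np

      # Gray, equally sized face crops with integer identity labels; these
      # stand in for an enrolled student gallery.
      faces = [np.random.randint(0, 256, (100, 100), np.uint8)
               for _ in range(8)]
      labels = [0, 0, 1, 1, 2, 2, 3, 3]

      # FisherFaceRecognizer requires the opencv-contrib-python package.
      model = cv2.face.FisherFaceRecognizer_create()
      model.train(faces, np.array(labels))

      probe = faces[0]                     # preprocessed query image
      identity, distance = model.predict(probe)
      print(identity, distance)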

  11. Training improves reading speed in peripheral vision: is it due to attention?

    PubMed

    Lee, Hye-Won; Kwon, Miyoung; Legge, Gordon E; Gefroh, Joshua J

    2010-06-01

    Previous research has shown that perceptual training in peripheral vision, using a letter-recognition task, increases reading speed and letter recognition (S. T. L. Chung, G. E. Legge, & S. H. Cheung, 2004). We tested the hypothesis that enhanced deployment of spatial attention to peripheral vision explains this training effect. Subjects were pre- and post-tested with three tasks at 10° above and below fixation: RSVP reading speed, trigram letter recognition (used to construct visual-span profiles), and deployment of spatial attention (measured as the benefit of a pre-cue for target position in a lexical-decision task). Groups of five normally sighted young adults received four days of trigram letter-recognition training in the upper or lower visual field, or in central vision; a control group received no training. Our measure of deployment of spatial attention revealed visual-field anisotropies: better deployment of attention in the lower field than the upper, and in the lower-right quadrant compared with the other three quadrants. All subject groups exhibited slight improvement in deployment of spatial attention to peripheral vision in the post-test, but this improvement was not correlated with training-related increases in reading speed or the size of visual-span profiles. Our results indicate that improved deployment of spatial attention to peripheral vision does not account for improved reading speed and letter recognition in peripheral vision.

  12. Intelligent Vision On The SM9O Mini-Computer Basis And Applications

    NASA Astrophysics Data System (ADS)

    Hawryszkiw, J.

    1985-02-01

    A distinction has to be made between image processing and vision. Image processing finds its roots in the strong tradition of linear signal processing and promotes geometrical transform techniques such as filtering, compression, and restoration. Its purpose is to transform an image so that a human observer can easily extract information significant to him, for example edges after a gradient operator, or a specific direction after a directional filtering operation. Image processing is, in fact, a set of local or global space-time transforms, and the interpretation of the final image is done by the human observer. The purpose of vision, by contrast, is to extract the semantic content of the image. The machine can then understand that content and run a decision process, which turns into an action. Thus, intelligent vision depends on image processing, pattern recognition, and artificial intelligence.

  13. Understanding human visual systems and its impact on our intelligent instruments

    NASA Astrophysics Data System (ADS)

    Strojnik Scholl, Marija; Páez, Gonzalo; Scholl, Michelle K.

    2013-09-01

    We review the evolution of machine vision and comment on the cross-fertilization from the neural sciences onto the flourishing fields of neural processing, parallel processing, and associative memory in optical sciences and computing. We then examine how the intensive efforts in mapping the human brain have been influenced by concepts from computer science, control theory, and electronic circuits. We discuss two neural paths that employ input from the vision sense for navigation and object recognition: the ventral temporal pathway for object recognition ("what") and the dorsal parietal pathway for navigation ("where"). We describe the reflexive and conscious decision centers in the cerebral cortex involved in visual attention and gaze control; interestingly, these require a return path through the midbrain for ocular muscle control. We find that cognitive psychologists currently study the human brain using low-spatial-resolution fMRI with a temporal response on the order of a second, while in recent years life scientists have concentrated on insect brains to study neural processes. We discuss how reflexive and conscious gaze-control decisions are made in the frontal eye field and inferior parietal lobe, constituting the fronto-parietal attention network. We note that ethical and experiential learning impacts our conscious decisions.

  14. Crowd motion segmentation and behavior recognition fusing streak flow and collectiveness

    NASA Astrophysics Data System (ADS)

    Gao, Mingliang; Jiang, Jun; Shen, Jin; Zou, Guofeng; Fu, Guixia

    2018-04-01

    Crowd motion segmentation and crowd behavior recognition are two active problems in computer vision, and a number of methods have been proposed to tackle them. Among these methods, flow dynamics is often used to model crowd motion, with little consideration of collective properties. Moreover, traditional crowd behavior recognition methods treat local features and dynamic features separately and overlook the interconnection of topological and dynamical heterogeneity in complex crowd processes. We propose a crowd motion segmentation method and a crowd behavior recognition method based on streak flow and crowd collectiveness: streak flow is adopted to capture the dynamical properties of crowd motion, and collectiveness is incorporated to capture its structural properties. Experimental results show that the proposed methods improve crowd motion segmentation accuracy and crowd recognition rates compared with state-of-the-art methods.

  15. Scene recognition based on integrating active learning with dictionary learning

    NASA Astrophysics Data System (ADS)

    Wang, Chengxi; Yin, Xueyan; Yang, Lin; Gong, Chengrong; Zheng, Caixia; Yi, Yugen

    2018-04-01

    Scene recognition is a significant topic in the field of computer vision. Most existing scene recognition models require a large number of labeled training samples to achieve good performance, but labeling images manually is time consuming and often unrealistic in practice. To obtain satisfying recognition results when labeled samples are insufficient, this paper proposes a scene recognition algorithm named Integrating Active Learning and Dictionary Learning (IALDL). IALDL adopts projective dictionary pair learning (DPL) as its classifier and introduces an active learning mechanism into DPL to improve its performance. In its sampling criterion, IALDL considers both uncertainty and representativeness in order to effectively select useful unlabeled samples from a given sample set and expand the training dataset. Experimental results on three standard databases demonstrate the feasibility and validity of the proposed IALDL.
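
    The sampling idea can be sketched generically: score each unlabeled sample by a weighted sum of uncertainty (here, entropy of predicted class probabilities) and representativeness (here, mean cosine similarity to the rest of the pool). This is a hedged stand-in for IALDL's DPL-based criterion; the weight alpha and both concrete measures are assumptions, not the paper's formulation.

    ```python
    import numpy as np

    def select_samples(proba, features, k=10, alpha=0.5):
        """proba: (n, c) predicted class probabilities for the unlabeled pool;
        features: (n, d) feature vectors for the same samples."""
        # Uncertainty: entropy of the predicted class distribution.
        entropy = -np.sum(proba * np.log(proba + 1e-12), axis=1)
        # Representativeness: mean cosine similarity to the rest of the pool.
        normed = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-12)
        representativeness = (normed @ normed.T).mean(axis=1)
        score = alpha * entropy + (1 - alpha) * representativeness
        return np.argsort(score)[-k:]   # indices of the k highest-scoring samples

    # Toy usage with random numbers standing in for a real model's outputs.
    rng = np.random.default_rng(0)
    proba = rng.dirichlet(np.ones(5), size=100)      # 100 samples, 5 classes
    features = rng.standard_normal((100, 64))
    print(select_samples(proba, features, k=5))
    ```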

  16. On the Application of Image Processing Methods for Bubble Recognition to the Study of Subcooled Flow Boiling of Water in Rectangular Channels

    PubMed Central

    Paz, Concepción; Conde, Marcos; Porteiro, Jacobo; Concheiro, Miguel

    2017-01-01

    This work introduces the use of machine vision for massive bubble recognition, which supports the validation of boiling models involving bubble dynamics, nucleation frequency, active site density, and bubble size. The two algorithms presented are meant to be run on quite standard images of the bubbling process, recorded in general-purpose boiling facilities, and the recognition routines are easily adaptable to other facilities if a minimum number of precautions are taken in the setup and in the treatment of the information. Both the side and front projections of the subcooled flow-boiling phenomenon over a plain plate are covered. Once all of the intended bubbles have been located in space and time, proper post-processing of the recorded data makes it possible to track each of the recognized bubbles, sketch their trajectories and size evolution, locate the nucleation sites, compute their diameters, and so on. After validating the algorithm's output against the human eye and data from other researchers, machine vision systems have been demonstrated to be a very valuable option for performing the recognition process, even when the optical analysis of bubbles was not set as the main goal of the experimental facility. PMID:28632158

  17. Liquid lens: advances in adaptive optics

    NASA Astrophysics Data System (ADS)

    Casey, Shawn Patrick

    2010-12-01

    'Liquid lens' technologies promise significant advancements in machine vision and optical communications systems. Adaptations for machine vision, human vision correction, and optical communications are used to exemplify the versatile nature of this technology. Utilization of liquid lens elements allows the cost-effective implementation of optical velocity measurement. The project consists of a custom image processor, camera, and interface. The images are passed into customized pattern recognition and optical character recognition algorithms. A single camera would be used for both speed detection and object recognition.

  18. Vision-based obstacle recognition system for automated lawn mower robot development

    NASA Astrophysics Data System (ADS)

    Mohd Zin, Zalhan; Ibrahim, Ratnawati

    2011-06-01

    Digital image processing (DIP) techniques have recently been widely used in various types of applications. Classification and recognition of a specific object using a vision system involve challenging tasks in the fields of image processing and artificial intelligence, and the ability of a vision system to capture and process images efficiently is very important for any intelligent system such as an autonomous robot. This paper focuses on the development of a vision system that could contribute to an automated vision-based lawn mower robot. The work involves implementing DIP techniques to detect and recognize three different types of obstacles that usually exist on a football field. The focus is on the study of different types and sizes of obstacles, the development of the vision-based obstacle recognition system, and the evaluation of the system's performance. Image processing techniques such as image filtering, segmentation, enhancement, and edge detection have been applied in the system. The results show that the developed system is able to detect and recognize various types of obstacles on a football field with a recognition rate of more than 80%.
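
    A minimal sketch of such a filtering, segmentation, and edge-detection chain, using standard OpenCV calls; the file name, kernel size, and area threshold are illustrative, not the paper's settings.

    ```python
    import cv2

    img = cv2.imread("field.jpg")                    # hypothetical input frame
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)      # filtering: suppress noise
    _, mask = cv2.threshold(blurred, 0, 255,         # segmentation: Otsu threshold
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    edges = cv2.Canny(blurred, 50, 150)              # edge detection

    # Candidate obstacles: contours of segmented regions above a size threshold.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    obstacles = [c for c in contours if cv2.contourArea(c) > 500]
    print(f"{len(obstacles)} candidate obstacles")
    ```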

  19. Ubiquitous computing technology for just-in-time motivation of behavior change.

    PubMed

    Intille, Stephen S

    2004-01-01

    This paper describes a vision of health care where "just-in-time" user interfaces are used to transform people from passive to active consumers of health care. Systems that use computational pattern recognition to detect points of decision, behavior, or consequences automatically can present motivational messages to encourage healthy behavior at just the right time. Further, new ubiquitous computing and mobile computing devices permit information to be conveyed to users at just the right place. In combination, computer systems that present messages at the right time and place can be developed to motivate physical activity and healthy eating. Computational sensing technologies can also be used to measure the impact of the motivational technology on behavior.

  20. Action Recognition in a Crowded Environment

    PubMed Central

    Nieuwenhuis, Judith; Bülthoff, Isabelle; Barraclough, Nick; de la Rosa, Stephan

    2017-01-01

    So far, action recognition has been mainly examined with small point-light human stimuli presented alone within a narrow central area of the observer's visual field. Yet, we need to recognize the actions of life-size humans viewed alone or surrounded by bystanders, whether they are seen in central or peripheral vision. Here, we examined the mechanisms in central vision and the far periphery (40° eccentricity) involved in recognizing the actions of a life-size actor (target) and their sensitivity to the presence of a crowd surrounding the target. In Experiment 1, we used an action adaptation paradigm to probe whether static or idly moving crowds might interfere with the recognition of a target's action (hug or clap). We found that crowds of this type, whose movements were dissimilar to the target action, hardly affected action recognition in central or peripheral vision. In Experiment 2, we examined whether crowd actions that were more similar to the target actions affected action recognition. Indeed, the presence of such a crowd diminished adaptation aftereffects in central vision as well as in the periphery. We replicated Experiment 2 using a recognition task instead of an adaptation paradigm. With this task, we found evidence of decreased action recognition accuracy, but the effect was significant in peripheral vision only. Our results suggest that the presence of a crowd carrying out actions similar to that of the target affects its recognition. We outline how these results can be understood in terms of high-level crowding effects that operate on action-sensitive perceptual channels. PMID:29308177

  1. Human gait recognition by pyramid of HOG feature on silhouette images

    NASA Astrophysics Data System (ADS)

    Yang, Guang; Yin, Yafeng; Park, Jeanrok; Man, Hong

    2013-03-01

    As an uncommon biometric modality, human gait recognition has the great advantage of identifying people at a distance without requiring high-resolution images. It has attracted much attention in recent years, especially in the fields of computer vision and remote sensing. In this paper, we propose a human gait recognition framework that consists of a reliable background subtraction method, followed by pyramid of Histogram of Gradient (pHOG) feature extraction on the silhouette image, and a Hidden Markov Model (HMM) based classifier. Through background subtraction, the silhouette of the human gait in each frame is extracted and normalized from the raw video sequence. After removing shadow and noise in each region of interest (ROI), pHOG features are computed on the silhouette images, and the pHOG features of each gait class are used to train a corresponding HMM. In the test stage, pHOG features are extracted from each test sequence and used to calculate the posterior probability under each trained HMM. Experimental results on the CASIA Gait Dataset B1 demonstrate that our proposed method achieves a very competitive recognition rate.
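
    The pipeline can be sketched with common Python libraries: per-frame HOG features on normalized silhouettes, one HMM per gait class, and classification by maximum log-likelihood. Plain HOG from scikit-image and hmmlearn's GaussianHMM stand in for the paper's pyramid HOG and its HMM implementation; all parameters are illustrative.

    ```python
    import numpy as np
    from skimage.feature import hog
    from hmmlearn.hmm import GaussianHMM

    def sequence_features(silhouettes):
        # silhouettes: list of 2D binary frames, already cropped and normalized.
        return np.array([hog(f, pixels_per_cell=(16, 16), cells_per_block=(1, 1))
                         for f in silhouettes])

    def train_models(sequences_per_class, n_states=5):
        # One HMM per gait class, trained on the concatenated frame features.
        models = {}
        for label, seqs in sequences_per_class.items():
            feats = [sequence_features(s) for s in seqs]
            models[label] = GaussianHMM(n_components=n_states).fit(
                np.vstack(feats), lengths=[len(f) for f in feats])
        return models

    def classify(models, silhouettes):
        # Assign the class whose HMM gives the highest log-likelihood.
        X = sequence_features(silhouettes)
        return max(models, key=lambda lbl: models[lbl].score(X))
    ```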

  2. Task-oriented situation recognition

    NASA Astrophysics Data System (ADS)

    Bauer, Alexander; Fischer, Yvonne

    2010-04-01

    Advances in computer vision methods for the detection, tracking, and recognition of objects in video streams open new opportunities for video surveillance: in the future, automated video surveillance systems will be able to detect critical situations early enough for an operator to take preventive action, instead of video material being used merely for forensic investigations. However, problems such as limited computational resources, privacy regulations, and constant change in potential threats have to be addressed by a practical automated video surveillance system. In this paper, we show how these problems can be addressed using a task-oriented approach. The system architecture of the task-oriented video surveillance system NEST and an algorithm for the detection of abnormal behavior within the system are presented and illustrated for the surveillance of guests inside a video-monitored building.

  3. Computer vision cracks the leaf code

    PubMed Central

    Wilf, Peter; Zhang, Shengping; Chikkerur, Sharat; Little, Stefan A.; Wing, Scott L.; Serre, Thomas

    2016-01-01

    Understanding the extremely variable, complex shape and venation characters of angiosperm leaves is one of the most challenging problems in botany. Machine learning offers opportunities to analyze large numbers of specimens, to discover novel leaf features of angiosperm clades that may have phylogenetic significance, and to use those characters to classify unknowns. Previous computer vision approaches have primarily focused on leaf identification at the species level. It remains an open question whether learning and classification are possible among major evolutionary groups such as families and orders, which usually contain hundreds to thousands of species each and exhibit many times the foliar variation of individual species. Here, we tested whether a computer vision algorithm could use a database of 7,597 leaf images from 2,001 genera to learn features of botanical families and orders, then classify novel images. The images are of cleared leaves, specimens that are chemically bleached, then stained to reveal venation. Machine learning was used to learn a codebook of visual elements representing leaf shape and venation patterns. The resulting automated system learned to classify images into families and orders with a success rate many times greater than chance. Of direct botanical interest, the responses of diagnostic features can be visualized on leaf images as heat maps, which are likely to prompt recognition and evolutionary interpretation of a wealth of novel morphological characters. With assistance from computer vision, leaves are poised to make numerous new contributions to systematic and paleobotanical studies. PMID:26951664

  4. Identifying and detecting facial expressions of emotion in peripheral vision.

    PubMed

    Smith, Fraser W; Rossit, Stephanie

    2018-01-01

    Facial expressions of emotion are signals of high biological value. Whilst recognition of facial expressions has been much studied in central vision, the ability to perceive these signals in peripheral vision has seen only limited research to date, despite the potential adaptive advantages of such perception. In the present experiment, we investigate facial expression recognition and detection performance for each of the basic emotions (plus neutral) at up to 30 degrees of eccentricity. We demonstrate, as expected, a decrease in recognition and detection performance with increasing eccentricity, with happiness and surprise being the best-recognized expressions in peripheral vision. In detection, however, while happiness and surprise are still well detected, fear is also well detected; in fact, we show that fear is better detected than recognized. Our results demonstrate that task constraints shape the perception of expression in peripheral vision and provide novel evidence that detection and recognition rely on partially separate underlying mechanisms, with the latter more dependent on the higher spatial frequency content of the face stimulus.

  5. Identifying and detecting facial expressions of emotion in peripheral vision

    PubMed Central

    Rossit, Stephanie

    2018-01-01

    Facial expressions of emotion are signals of high biological value. Whilst recognition of facial expressions has been much studied in central vision, the ability to perceive these signals in peripheral vision has seen only limited research to date, despite the potential adaptive advantages of such perception. In the present experiment, we investigate facial expression recognition and detection performance for each of the basic emotions (plus neutral) at up to 30 degrees of eccentricity. We demonstrate, as expected, a decrease in recognition and detection performance with increasing eccentricity, with happiness and surprise being the best-recognized expressions in peripheral vision. In detection, however, while happiness and surprise are still well detected, fear is also well detected; in fact, we show that fear is better detected than recognized. Our results demonstrate that task constraints shape the perception of expression in peripheral vision and provide novel evidence that detection and recognition rely on partially separate underlying mechanisms, with the latter more dependent on the higher spatial frequency content of the face stimulus. PMID:29847562

  6. Multi-task learning with group information for human action recognition

    NASA Astrophysics Data System (ADS)

    Qian, Li; Wu, Song; Pu, Nan; Xu, Shulin; Xiao, Guoqiang

    2018-04-01

    Human action recognition is an important and challenging task in computer vision research, due to variations in human motion performance, interpersonal differences, and recording settings. In this paper, we propose a novel multi-task learning framework with group information (MTL-GI) for accurate and efficient human action recognition. Specifically, we first obtain group information by calculating the mutual information between Gaussian components and action categories, based on their latent relationship, and then cluster similar action categories into the same group using affinity propagation clustering. In addition, in order to exploit the relationships among related tasks, we incorporate this group information into multi-task learning. Experimental results on two popular benchmarks (the UCF50 and HMDB51 datasets) demonstrate the superiority of our proposed MTL-GI framework.

  7. Hybrid Feature Extraction-based Approach for Facial Parts Representation and Recognition

    NASA Astrophysics Data System (ADS)

    Rouabhia, C.; Tebbikh, H.

    2008-06-01

    Face recognition is a specialized image processing task that has attracted considerable attention in computer vision. In this article, we develop a new facial recognition system, dedicated to identifying persons whose faces are partly occluded, from video sequence images. The system is based on a hybrid image feature extraction technique called ACPDL2D (Rouabhia et al. 2007), which combines two-dimensional principal component analysis and two-dimensional linear discriminant analysis with a neural network. We perform feature extraction on the eye and nose images separately, then use a multi-layer perceptron classifier. Compared to the whole face, the simulation results favor the facial parts in terms of memory capacity and recognition rate (99.41% for the eyes, 98.16% for the nose, and 97.25% for the whole face).
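
    One half of the hybrid scheme, two-dimensional PCA, can be sketched as follows: the image covariance matrix is built directly from 2D images (no vectorization into long vectors), and each image is projected onto its leading eigenvectors. This is a generic 2D-PCA sketch, not the authors' ACPDL2D code; shapes and the component count are illustrative.

    ```python
    import numpy as np

    def two_d_pca(images, n_components=5):
        """images: (n, h, w) array of aligned grayscale facial-part images."""
        mean = images.mean(axis=0)
        centered = images - mean
        # Image covariance: average of A^T A over the centered images (w x w).
        G = np.mean(np.einsum("nhw,nhv->nwv", centered, centered), axis=0)
        eigvals, eigvecs = np.linalg.eigh(G)           # ascending eigenvalues
        W = eigvecs[:, ::-1][:, :n_components]         # leading eigenvectors
        # Each image becomes an (h, n_components) feature matrix.
        return np.array([img @ W for img in centered]), W

    features, W = two_d_pca(np.random.rand(40, 32, 32))
    print(features.shape)  # (40, 32, 5)
    ```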

  8. Classification of Normal and Pathological Gait in Young Children Based on Foot Pressure Data.

    PubMed

    Guo, Guodong; Guffey, Keegan; Chen, Wenbin; Pergami, Paola

    2017-01-01

    Human gait recognition, an active research topic in computer vision, is generally based on data obtained from images/videos. We applied computer vision technology to classify pathology-related changes in gait in young children using a foot-pressure database collected using the GAITRite walkway system. As foot positioning changes with children's development, we also investigated the possibility of age estimation based on this data. Our results demonstrate that the data collected by the GAITRite system can be used for normal/pathological gait classification. Combining age information and normal/pathological gait classification increases the accuracy of the classifier. This novel approach could support the development of an accurate, real-time, and economic measure of gait abnormalities in children, able to provide important feedback to clinicians regarding the effect of rehabilitation interventions, and to support targeted treatment modifications.

  9. VideoWeb Dataset for Multi-camera Activities and Non-verbal Communication

    NASA Astrophysics Data System (ADS)

    Denina, Giovanni; Bhanu, Bir; Nguyen, Hoang Thanh; Ding, Chong; Kamal, Ahmed; Ravishankar, Chinya; Roy-Chowdhury, Amit; Ivers, Allen; Varda, Brenda

    Human-activity recognition is one of the most challenging problems in computer vision. Researchers from around the world have tried to solve this problem and have come a long way in recognizing simple motions and atomic activities. As the computer vision community heads toward fully recognizing human activities, a challenging and labeled dataset is needed. To respond to that need, we collected a dataset of realistic scenarios in a multi-camera network environment (VideoWeb) involving multiple persons performing dozens of different repetitive and non-repetitive activities. This chapter describes the details of the dataset. We believe that this VideoWeb Activities dataset is unique and it is one of the most challenging datasets available today. The dataset is publicly available online at http://vwdata.ee.ucr.edu/ along with the data annotation.

  10. State-Estimation Algorithm Based on Computer Vision

    NASA Technical Reports Server (NTRS)

    Bayard, David; Brugarolas, Paul

    2007-01-01

    An algorithm and software to implement the algorithm are being developed as a means of estimating the state (that is, the position and velocity) of an autonomous vehicle relative to a visible nearby target object, to provide guidance for maneuvering the vehicle. In the original intended application, the autonomous vehicle would be a spacecraft and the nearby object a small astronomical body (typically, a comet or asteroid) to be explored by the spacecraft. The algorithm could also be used on Earth in analogous applications -- for example, for guiding underwater robots near such objects of interest as sunken ships, mineral deposits, or submerged mines. It is assumed that the robot would be equipped with a vision system that would include one or more electronic cameras, image-digitizing circuitry, and an image-data-processing computer that would generate feature-recognition data products.

  11. Object recognition based on Google's reverse image search and image similarity

    NASA Astrophysics Data System (ADS)

    Horváth, András.

    2015-12-01

    Image classification is one of the most challenging tasks in computer vision, and a general multiclass classifier could solve many different problems in image processing. Classification is usually done by shallow learning over predefined objects; this is a difficult task and very different from human vision, which is based on continuous learning of object classes: a person needs years to learn a large taxonomy of objects, which are neither disjoint nor independent. In this paper I present a system based on Google's image similarity algorithm and image database, which can classify a large set of different objects in a human-like manner, identifying related classes and taxonomies.

  12. NETRA: A parallel architecture for integrated vision systems. 1: Architecture and organization

    NASA Technical Reports Server (NTRS)

    Choudhary, Alok N.; Patel, Janak H.; Ahuja, Narendra

    1989-01-01

    Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is considered to be a system that uses vision algorithms from all levels of processing for a high level application (such as object recognition). A model of computation is presented for parallel processing for an IVS. Using the model, desired features and capabilities of a parallel architecture suitable for IVSs are derived. Then a multiprocessor architecture (called NETRA) is presented. This architecture is highly flexible without the use of complex interconnection schemes. The topology of NETRA is recursively defined and hence is easily scalable from small to large systems. Homogeneity of NETRA permits fault tolerance and graceful degradation under faults. It is a recursively defined tree-type hierarchical architecture where each of the leaf nodes consists of a cluster of processors connected with a programmable crossbar with selective broadcast capability to provide for desired flexibility. A qualitative evaluation of NETRA is presented. Then general schemes are described to map parallel algorithms onto NETRA. Algorithms are classified according to their communication requirements for parallel processing. An extensive analysis of inter-cluster communication strategies in NETRA is presented, and parameters affecting performance of parallel algorithms when mapped on NETRA are discussed. Finally, a methodology to evaluate performance of algorithms on NETRA is described.

  13. On the role of spatial phase and phase correlation in vision, illusion, and cognition

    PubMed Central

    Gladilin, Evgeny; Eils, Roland

    2015-01-01

    Numerous findings indicate that spatial phase carries important cognitive information. Distortion of phase affects the topology of edge structures and makes images unrecognizable; in turn, appropriately phase-structured patterns give rise to various illusions of virtual image content and apparent motion. Despite a large body of phenomenological evidence, not much is yet known about the role of phase information in the neural mechanisms of visual perception and cognition. Here, we analyze the role of spatial phase in computational and biological vision, the emergence of visual illusions, and pattern recognition. We hypothesize that the fundamental importance of phase information for the invariant retrieval of structural image features and motion detection promoted the development of phase-based mechanisms of neural image processing in the course of the evolution of biological vision. Using an extension of the Fourier phase correlation technique, we show that core functions of the visual system, such as motion detection and pattern recognition, can be facilitated by the same basic mechanism. Our analysis suggests that the emergence of visual illusions can be attributed to the presence of coherently phase-shifted repetitive patterns as well as to the effects of acuity compensation by saccadic eye movements. We speculate that biological vision relies on perceptual mechanisms effectively similar to phase correlation, and predict neural features of visual pattern (dis)similarity that can be used for experimental validation of our hypothesis of “cognition by phase correlation.” PMID:25954190

  14. On the role of spatial phase and phase correlation in vision, illusion, and cognition.

    PubMed

    Gladilin, Evgeny; Eils, Roland

    2015-01-01

    Numerous findings indicate that spatial phase carries important cognitive information. Distortion of phase affects the topology of edge structures and makes images unrecognizable; in turn, appropriately phase-structured patterns give rise to various illusions of virtual image content and apparent motion. Despite a large body of phenomenological evidence, not much is yet known about the role of phase information in the neural mechanisms of visual perception and cognition. Here, we analyze the role of spatial phase in computational and biological vision, the emergence of visual illusions, and pattern recognition. We hypothesize that the fundamental importance of phase information for the invariant retrieval of structural image features and motion detection promoted the development of phase-based mechanisms of neural image processing in the course of the evolution of biological vision. Using an extension of the Fourier phase correlation technique, we show that core functions of the visual system, such as motion detection and pattern recognition, can be facilitated by the same basic mechanism. Our analysis suggests that the emergence of visual illusions can be attributed to the presence of coherently phase-shifted repetitive patterns as well as to the effects of acuity compensation by saccadic eye movements. We speculate that biological vision relies on perceptual mechanisms effectively similar to phase correlation, and predict neural features of visual pattern (dis)similarity that can be used for experimental validation of our hypothesis of "cognition by phase correlation."
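
    The underlying technique is easy to state concretely: the normalized cross-power spectrum of two images keeps only phase, and its inverse transform peaks at the images' relative translation. A minimal NumPy sketch of plain phase correlation (not the authors' extended method):

    ```python
    import numpy as np

    def phase_correlation(a, b):
        """Estimate the (row, col) shift that maps image b onto image a."""
        Fa, Fb = np.fft.fft2(a), np.fft.fft2(b)
        cross_power = Fa * np.conj(Fb)
        cross_power /= np.abs(cross_power) + 1e-12   # keep phase, drop magnitude
        corr = np.fft.ifft2(cross_power).real
        peak = np.unravel_index(np.argmax(corr), corr.shape)
        # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
        return tuple(p if p <= s // 2 else p - s for p, s in zip(peak, a.shape))

    # Example: a synthetic (5, -3)-pixel shift is recovered from phase alone.
    img = np.random.rand(64, 64)
    shifted = np.roll(img, (5, -3), axis=(0, 1))
    print(phase_correlation(shifted, img))  # (5, -3)
    ```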

  15. The Invariance Hypothesis Implies Domain-Specific Regions in Visual Cortex

    PubMed Central

    Leibo, Joel Z.; Liao, Qianli; Anselmi, Fabio; Poggio, Tomaso

    2015-01-01

    Is visual cortex made up of general-purpose information processing machinery, or does it consist of a collection of specialized modules? If prior knowledge, acquired from learning a set of objects is only transferable to new objects that share properties with the old, then the recognition system’s optimal organization must be one containing specialized modules for different object classes. Our analysis starts from a premise we call the invariance hypothesis: that the computational goal of the ventral stream is to compute an invariant-to-transformations and discriminative signature for recognition. The key condition enabling approximate transfer of invariance without sacrificing discriminability turns out to be that the learned and novel objects transform similarly. This implies that the optimal recognition system must contain subsystems trained only with data from similarly-transforming objects and suggests a novel interpretation of domain-specific regions like the fusiform face area (FFA). Furthermore, we can define an index of transformation-compatibility, computable from videos, that can be combined with information about the statistics of natural vision to yield predictions for which object categories ought to have domain-specific regions in agreement with the available data. The result is a unifying account linking the large literature on view-based recognition with the wealth of experimental evidence concerning domain-specific regions. PMID:26496457

  16. A Linked List-Based Algorithm for Blob Detection on Embedded Vision-Based Sensors.

    PubMed

    Acevedo-Avila, Ricardo; Gonzalez-Mendoza, Miguel; Garcia-Garcia, Andres

    2016-05-28

    Blob detection is a common task in vision-based applications. Most existing algorithms are aimed at execution on general-purpose computers, and very few can be adapted to the computing restrictions present in embedded platforms. This paper focuses on the design of an algorithm capable of real-time blob detection that minimizes system memory consumption. The proposed algorithm detects objects in one image scan; it is based on a linked-list data structure tree used to label blobs depending on their shape and node information. An example application showing the results of a blob detection co-processor has been built on low-power field-programmable gate array hardware as a step toward developing a smart video surveillance system. The detection method is intended for general-purpose application, and several test cases focused on character recognition are also examined. The results present a fair trade-off between accuracy and memory requirements, and prove the validity of the proposed approach for real-time implementation on resource-constrained computing platforms.
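
    The paper's one-scan linked-list labelling is not reproduced here, but the task it performs (labelling blobs and reporting per-blob statistics) can be illustrated with SciPy as a functional reference for what the embedded algorithm outputs:

    ```python
    import numpy as np
    from scipy import ndimage

    binary = np.random.rand(120, 160) > 0.7        # stand-in for a thresholded frame
    labels, n_blobs = ndimage.label(binary)        # connected-component labelling
    sizes = ndimage.sum(binary, labels, range(1, n_blobs + 1))  # pixels per blob
    boxes = ndimage.find_objects(labels)           # bounding box per blob
    print(f"{n_blobs} blobs; largest has {int(sizes.max())} pixels")
    ```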

  17. Random-Profiles-Based 3D Face Recognition System

    PubMed Central

    Joongrock, Kim; Sunjin, Yu; Sangyoun, Lee

    2014-01-01

    In this paper, a novel nonintrusive three-dimensional (3D) face modeling system for random-profile-based 3D face recognition is presented. Although recent two-dimensional (2D) face recognition systems can achieve a reliable recognition rate under certain conditions, their performance is limited by internal and external changes, such as illumination and pose variation. To address these issues, 3D face recognition, which uses 3D face data, has recently received much attention. However, the performance of 3D face recognition depends highly on the precision of the acquired 3D face data, and it requires more computational power and storage capacity than 2D face recognition systems. In this paper, we present a nonintrusive 3D face modeling system composed of a stereo vision system and an invisible near-infrared line laser, which can be directly applied to profile-based 3D face recognition. We further propose a novel random-profile-based 3D face recognition method that is memory-efficient and pose-invariant. The experimental results demonstrate that the reconstructed 3D face data consist of more than 50k 3D points and that the method achieves a reliable recognition rate under pose variation. PMID:24691101

  18. The use of interactive computer vision and robot hand controllers for enhancing manufacturing safety

    NASA Technical Reports Server (NTRS)

    Marzwell, Neville I.; Jacobus, Charles J.; Peurach, Thomas M.; Mitchell, Brian T.

    1994-01-01

    Currently available robotic systems provide limited support for CAD-based model-driven visualization, sensing algorithm development and integration, and automated graphical planning systems. This paper describes ongoing work which provides the functionality necessary to apply advanced robotics to automated manufacturing and assembly operations. An interface has been built which incorporates 6-DOF tactile manipulation, displays for three-dimensional graphical models, and automated tracking functions which depend on automated machine vision. A set of tools for single and multiple focal plane sensor image processing and understanding has been demonstrated which utilizes object recognition models. The resulting tool will enable sensing and planning from computationally simple graphical objects. A synergistic interplay between human and operator vision is created from programmable feedback received from the controller. This approach can be used as the basis for implementing enhanced safety in automated robotics manufacturing, assembly, repair and inspection tasks in both ground and space applications. Thus, an interactive capability has been developed to match the modeled environment to the real task environment for safe and predictable task execution.

  19. The use of interactive computer vision and robot hand controllers for enhancing manufacturing safety

    NASA Astrophysics Data System (ADS)

    Marzwell, Neville I.; Jacobus, Charles J.; Peurach, Thomas M.; Mitchell, Brian T.

    1994-02-01

    Currently available robotic systems provide limited support for CAD-based model-driven visualization, sensing algorithm development and integration, and automated graphical planning systems. This paper describes ongoing work which provides the functionality necessary to apply advanced robotics to automated manufacturing and assembly operations. An interface has been built which incorporates 6-DOF tactile manipulation, displays for three-dimensional graphical models, and automated tracking functions which depend on automated machine vision. A set of tools for single and multiple focal plane sensor image processing and understanding has been demonstrated which utilizes object recognition models. The resulting tool will enable sensing and planning from computationally simple graphical objects. A synergistic interplay between human and operator vision is created from programmable feedback received from the controller. This approach can be used as the basis for implementing enhanced safety in automated robotics manufacturing, assembly, repair and inspection tasks in both ground and space applications. Thus, an interactive capability has been developed to match the modeled environment to the real task environment for safe and predictable task execution.

  20. Vision based assistive technology for people with dementia performing activities of daily living (ADLs): an overview

    NASA Astrophysics Data System (ADS)

    As'ari, M. A.; Sheikh, U. U.

    2012-04-01

    The rapid development of intelligent assistive technology for replacing a human caregiver in assisting people with dementia to perform activities of daily living (ADLs) promises a reduction in the cost of care, especially in training and hiring human caregivers. The main problem, however, is that the kind of sensing agent used in such a system depends on the intent (the type of ADL) and the environment in which the activity is performed. In this paper, we present an overview of the potential of computer-vision-based sensing agents in assistive systems and of how they can be generalized to be invariant to various kinds of ADLs and environments. We find that a gap exists between existing vision-based human action recognition methods and the design of such a system, due to the cognitive and physical impairments of people with dementia.

  1. Learning discriminative features from RGB-D images for gender and ethnicity identification

    NASA Astrophysics Data System (ADS)

    Azzakhnini, Safaa; Ballihi, Lahoucine; Aboutajdine, Driss

    2016-11-01

    The development of sophisticated sensor technologies gave rise to an interesting variety of data. With the appearance of affordable devices, such as the Microsoft Kinect, depth-maps and three-dimensional data became easily accessible. This attracted many computer vision researchers seeking to exploit this information in classification and recognition tasks. In this work, the problem of face classification in the context of RGB images and depth information (RGB-D images) is addressed. The purpose of this paper is to study and compare some popular techniques for gender recognition and ethnicity classification to understand how much depth data can improve the quality of recognition. Furthermore, we investigate which combination of face descriptors, feature selection methods, and learning techniques is best suited to better exploit RGB-D images. The experimental results show that depth data improve the recognition accuracy for gender and ethnicity classification applications in many use cases.

  2. Recognition Using Hybrid Classifiers.

    PubMed

    Osadchy, Margarita; Keren, Daniel; Raviv, Dolev

    2016-04-01

    A canonical problem in computer vision is category recognition (e.g., find all instances of human faces, cars etc., in an image). Typically, the input for training a binary classifier is a relatively small sample of positive examples, and a huge sample of negative examples, which can be very diverse, consisting of images from a large number of categories. The difficulty of the problem sharply increases with the dimension and size of the negative example set. We propose to alleviate this problem by applying a "hybrid" classifier, which replaces the negative samples by a prior, and then finds a hyperplane which separates the positive samples from this prior. The method is extended to kernel space and to an ensemble-based approach. The resulting binary classifiers achieve an identical or better classification rate than SVM, while requiring far smaller memory and lower computational complexity to train and apply.

  3. Hand-gesture-based sterile interface for the operating room using contextual cues for the navigation of radiological images

    PubMed Central

    Jacob, Mithun George; Wachs, Juan Pablo; Packer, Rebecca A

    2013-01-01

    This paper presents a method to improve the navigation and manipulation of radiological images through a sterile hand gesture recognition interface based on attentional contextual cues. Computer vision algorithms were developed to extract intention and attention cues from the surgeon's behavior and combine them with sensory data from a commodity depth camera. The developed interface was tested in a usability experiment to assess the effectiveness of the new interface. An image navigation and manipulation task was performed, and the gesture recognition accuracy, false positives and task completion times were computed to evaluate system performance. Experimental results show that gesture interaction and surgeon behavior analysis can be used to accurately navigate, manipulate and access MRI images, and therefore this modality could replace the use of keyboard and mice-based interfaces. PMID:23250787

  4. Hand-gesture-based sterile interface for the operating room using contextual cues for the navigation of radiological images.

    PubMed

    Jacob, Mithun George; Wachs, Juan Pablo; Packer, Rebecca A

    2013-06-01

    This paper presents a method to improve the navigation and manipulation of radiological images through a sterile hand gesture recognition interface based on attentional contextual cues. Computer vision algorithms were developed to extract intention and attention cues from the surgeon's behavior and combine them with sensory data from a commodity depth camera. The developed interface was tested in a usability experiment to assess the effectiveness of the new interface. An image navigation and manipulation task was performed, and the gesture recognition accuracy, false positives and task completion times were computed to evaluate system performance. Experimental results show that gesture interaction and surgeon behavior analysis can be used to accurately navigate, manipulate and access MRI images, and therefore this modality could replace the use of keyboard and mice-based interfaces.

  5. Brain-computer interaction research at the Computer Vision and Multimedia Laboratory, University of Geneva.

    PubMed

    Pun, Thierry; Alecu, Teodor Iulian; Chanel, Guillaume; Kronegg, Julien; Voloshynovskiy, Sviatoslav

    2006-06-01

    This paper describes the work being conducted in the domain of brain-computer interaction (BCI) at the Multimodal Interaction Group, Computer Vision and Multimedia Laboratory, University of Geneva, Geneva, Switzerland. The application focus of this work is on multimodal interaction rather than on rehabilitation, that is how to augment classical interaction by means of physiological measurements. Three main research topics are addressed. The first one concerns the more general problem of brain source activity recognition from EEGs. In contrast with classical deterministic approaches, we studied iterative robust stochastic based reconstruction procedures modeling source and noise statistics, to overcome known limitations of current techniques. We also developed procedures for optimal electroencephalogram (EEG) sensor system design in terms of placement and number of electrodes. The second topic is the study of BCI protocols and performance from an information-theoretic point of view. Various information rate measurements have been compared for assessing BCI abilities. The third research topic concerns the use of EEG and other physiological signals for assessing a user's emotional status.

  6. Proceedings of the Second Joint Technology Workshop on Neural Networks and Fuzzy Logic, volume 2

    NASA Technical Reports Server (NTRS)

    Lea, Robert N. (Editor); Villarreal, James A. (Editor)

    1991-01-01

    Documented here are papers presented at the Neural Networks and Fuzzy Logic Workshop sponsored by NASA and the University of Texas, Houston. Topics addressed included adaptive systems, learning algorithms, network architectures, vision, robotics, neurobiological connections, speech recognition and synthesis, fuzzy set theory and application, control and dynamics processing, space applications, fuzzy logic and neural network computers, approximate reasoning, and multiobject decision making.

  7. GPU-based real-time trinocular stereo vision

    NASA Astrophysics Data System (ADS)

    Yao, Yuanbin; Linton, R. J.; Padir, Taskin

    2013-01-01

    Most stereovision applications are binocular, using information from a two-camera array to perform stereo matching and compute the depth image. Trinocular stereovision with a three-camera array has been shown to provide higher accuracy in stereo matching, which benefits applications such as distance finding, object recognition, and detection. This paper presents a real-time stereovision algorithm implemented on a GPGPU (general-purpose graphics processing unit) using a trinocular stereovision camera array. The algorithm employs a winner-take-all method to fuse the disparities computed in different directions, following various image processing techniques, to obtain the depth information. The goal is to achieve real-time processing speed with the help of a GPGPU, using the Open Source Computer Vision Library (OpenCV) in C++ and the NVidia CUDA GPGPU solution. The results are compared in accuracy and speed to verify the improvement.
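
    A crude CPU sketch of trinocular fusion: disparities are computed for the horizontal and vertical camera pairs with OpenCV's block matcher (the vertical pair is rotated so its baseline becomes horizontal), then fused per pixel. Calibration and rectification are omitted, the validity-based fusion below is a simplification of a cost-based winner-take-all, and the file names are hypothetical; the paper's actual contribution, the CUDA port, is not shown.

    ```python
    import cv2
    import numpy as np

    center = cv2.imread("cam_center.png", cv2.IMREAD_GRAYSCALE)  # hypothetical
    right = cv2.imread("cam_right.png", cv2.IMREAD_GRAYSCALE)
    top = cv2.imread("cam_top.png", cv2.IMREAD_GRAYSCALE)

    bm = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    d_h = bm.compute(center, right).astype(np.float32) / 16.0    # horizontal pair
    # Rotate so the vertical baseline becomes horizontal for the block matcher.
    d_v = bm.compute(cv2.rotate(center, cv2.ROTATE_90_CLOCKWISE),
                     cv2.rotate(top, cv2.ROTATE_90_CLOCKWISE)).astype(np.float32) / 16.0
    d_v = cv2.rotate(d_v, cv2.ROTATE_90_COUNTERCLOCKWISE)

    # Simple fusion: prefer valid (positive) disparities; average where both agree.
    fused = np.where(d_h > 0, d_h, d_v)
    both = (d_h > 0) & (d_v > 0)
    fused[both] = (d_h[both] + d_v[both]) / 2.0
    ```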

  8. Exploration of available feature detection and identification systems and their performance on radiographs

    NASA Astrophysics Data System (ADS)

    Wantuch, Andrew C.; Vita, Joshua A.; Jimenez, Edward S.; Bray, Iliana E.

    2016-10-01

    Despite object detection, recognition, and identification being very active areas of computer vision research, many of the available tools to aid in these processes are designed with only photographs in mind. Although some algorithms used specifically for feature detection and identification may not take explicit advantage of the colors available in an image, they still underperform on radiographs, which are grayscale images. We are especially interested in the robustness of these algorithms, specifically their performance on a preexisting database of X-ray radiographs in compressed JPEG form, with multiple ways of describing pixel information. We review various aspects of the performance of available feature detection and identification systems, including MATLAB's Computer Vision Toolbox, VLFeat, and OpenCV, on our non-ideal database. In the process, we explore possible reasons for the algorithms' reduced ability to detect and identify features in X-ray radiographs.
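
    A minimal sketch of this kind of evaluation: run an off-the-shelf detector/descriptor (ORB here, one of OpenCV's options) on a pair of grayscale radiographs and count cross-checked matches. The file names and the distance threshold are hypothetical, and the snippet assumes both images yield descriptors.

    ```python
    import cv2

    img1 = cv2.imread("radiograph_a.jpg", cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread("radiograph_b.jpg", cv2.IMREAD_GRAYSCALE)

    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    good = [m for m in matches if m.distance < 40]   # illustrative threshold
    print(f"{len(kp1)} / {len(kp2)} keypoints, {len(good)} cross-checked matches")
    ```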

  9. Improved word recognition for observers with age-related maculopathies using compensation filters

    NASA Technical Reports Server (NTRS)

    Lawton, Teri B.

    1988-01-01

    A method for improving word recognition for people with age-related maculopathies, which cause a loss of central vision, is discussed. It is found that the use of individualized compensation filters, based on a person's normalized contrast sensitivity function, can improve word recognition for people with age-related maculopathies: 27-70 percent more magnification is needed for unfiltered words compared to filtered words. The improvement in word recognition is positively correlated with the severity of vision loss.

  10. Embedded wavelet-based face recognition under variable position

    NASA Astrophysics Data System (ADS)

    Cotret, Pascal; Chevobbe, Stéphane; Darouich, Mehdi

    2015-02-01

    For several years, face recognition has been a hot topic in the image processing field: the technique is applied in several domains such as CCTV and electronic device unlocking. In this context, this work studies the efficiency of a wavelet-based face recognition method in terms of subject position robustness and performance on various systems. The use of the wavelet transform has a limited impact on the position robustness of PCA-based face recognition. This work shows, for a well-known database (Yale Face Database B*), that subject position in 3D space can vary by up to 10% of the original ROI size without decreasing recognition rates. Face recognition is performed on the approximation coefficients of the image wavelet transform, and results are still satisfying after three levels of decomposition. Furthermore, the face database size can be divided by a factor of 64 (2^(2K) with K = 3). In the context of ultra-embedded vision systems, memory footprint is one of the key points to be addressed, which is why compression techniques such as the wavelet transform are interesting; the transform also leads to a low-complexity face detection stage compliant with the limited computation resources available on such systems. The approach described in this work is tested on three platforms, ranging from a standard x86-based computer to nanocomputers such as the Raspberry Pi and SECO boards. For K = 3 and a database of 40 faces, the mean execution time per frame is 0.64 ms on an x86-based computer, 9 ms on a SECO board, and 26 ms on a Raspberry Pi (model B).
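
    The storage saving is easy to reproduce: a K-level 2D wavelet decomposition has an approximation band 2^(2K) times smaller than the source image. A minimal sketch using PyWavelets (the paper's wavelet family and ROI size may differ):

    ```python
    import numpy as np
    import pywt

    face = np.random.rand(128, 128)                 # stand-in for a face ROI
    coeffs = pywt.wavedec2(face, "haar", level=3)   # K = 3 decomposition levels
    approx = coeffs[0]                              # approximation coefficients
    print(face.size / approx.size)                  # 64.0, i.e. 2^(2K) with K = 3
    ```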

  11. Image ratio features for facial expression recognition application.

    PubMed

    Song, Mingli; Tao, Dacheng; Liu, Zicheng; Li, Xuelong; Zhou, Mengchu

    2010-06-01

    Video-based facial expression recognition is a challenging problem in computer vision and human-computer interaction. To address this problem, texture features have been extracted and widely used, because they can capture image intensity changes caused by skin deformation. However, existing texture features encounter problems with albedo and lighting variations. To solve both problems, we propose a new texture feature called image ratio features. Compared with previously proposed texture features, e.g., high gradient component features, image ratio features are more robust to albedo and lighting variations. In addition, to further improve facial expression recognition accuracy based on image ratio features, we combine image ratio features with facial animation parameters (FAPs), which describe the geometric motions of facial feature points. The performance evaluation is based on the Carnegie Mellon University Cohn-Kanade database, our own database, and the Japanese Female Facial Expression database. Experimental results show that the proposed image ratio features are more robust to albedo and lighting variations, and that the combination of image ratio features and FAPs outperforms each feature alone. In addition, we study asymmetric facial expressions based on our own facial expression database and demonstrate the superior performance of our combined expression recognition system.
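
    The albedo-cancellation intuition behind ratio features can be shown in a toy model where each image is the product of an unknown per-pixel albedo and a shading term: dividing an expressive frame by a neutral frame of the same face removes the albedo. This illustrates the general idea only, not the paper's exact feature definition.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    albedo = rng.uniform(0.5, 1.5, (64, 64))            # unknown per-pixel reflectance
    shading = rng.uniform(0.8, 1.2, (64, 64))           # shared illumination/shading
    deformation = 1.0 + 0.1 * rng.standard_normal((64, 64))  # skin deformation

    neutral = albedo * shading                          # neutral-expression image
    expressive = albedo * shading * deformation         # expressive image

    ratio = expressive / (neutral + 1e-9)
    # Albedo and shared shading cancel: the ratio isolates the deformation term.
    print(np.max(np.abs(ratio - deformation)))          # ~0
    ```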

  12. Bayesian Face Recognition and Perceptual Narrowing in Face-Space

    PubMed Central

    Balas, Benjamin

    2012-01-01

    During the first year of life, infants’ face recognition abilities are subject to “perceptual narrowing,” the end result of which is that observers lose the ability to distinguish previously discriminable faces (e.g. other-race faces) from one another. Perceptual narrowing has been reported for faces of different species and different races, in developing humans and primates. Though the phenomenon is highly robust and replicable, there have been few efforts to model the emergence of perceptual narrowing as a function of the accumulation of experience with faces during infancy. The goal of the current study is to examine how perceptual narrowing might manifest as statistical estimation in “face space,” a geometric framework for describing face recognition that has been successfully applied to adult face perception. Here, I use a computer vision algorithm for Bayesian face recognition to study how the acquisition of experience in face space and the presence of race categories affect performance for own and other-race faces. Perceptual narrowing follows from the establishment of distinct race categories, suggesting that the acquisition of category boundaries for race is a key computational mechanism in developing face expertise. PMID:22709406

  13. Multi-Stage System for Automatic Target Recognition

    NASA Technical Reports Server (NTRS)

    Chao, Tien-Hsin; Lu, Thomas T.; Ye, David; Edens, Weston; Johnson, Oliver

    2010-01-01

    A multi-stage automated target recognition (ATR) system has been designed to perform computer vision tasks with adequate proficiency in mimicking human vision. The system is able to detect, identify, and track targets of interest. Potential regions of interest (ROIs) are first identified by the detection stage using an Optimum Trade-off Maximum Average Correlation Height (OT-MACH) filter combined with a wavelet transform. False positives are then eliminated by the verification stage using feature extraction methods in conjunction with neural networks. Feature extraction transforms the ROIs using filtering and binning algorithms to create feature vectors, and a feedforward back-propagation neural network (NN) is trained to classify each feature vector and remove false positives. A system parameter optimization process has been developed to adapt to various targets and datasets. The objective was to design an efficient computer vision system that can learn to detect multiple targets in large images with unknown backgrounds. Because the target size is small relative to the image size in this problem, there are many regions of the image that could potentially contain the target. A cursory analysis of every region can be computationally efficient but may yield too many false positives; a detailed analysis of every region can yield better results but may be computationally inefficient. The multi-stage ATR system was designed to achieve an optimal balance between accuracy and computational efficiency by incorporating both models. The detection stage first identifies potential ROIs where the target may be present by performing a fast Fourier-domain OT-MACH filter-based correlation. Because the threshold for this stage is chosen with the goal of detecting all true positives, a number of false positives are also detected as ROIs. The verification stage then transforms the regions of interest into feature space and eliminates false positives using an artificial neural network classifier. The multi-stage system allows the detection sensitivity and the identification specificity to be tuned individually in each stage, making it easier to optimize ATR operation for a specific goal. The test results show that the system substantially reduced the false positive rate when tested on sonar and video image datasets.
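
    The detection stage's frequency-domain correlation can be sketched with a generic matched filter applied via the FFT, with thresholded peaks as candidate ROIs. Synthesizing the actual OT-MACH filter from training imagery is not shown here, and the threshold is illustrative.

    ```python
    import numpy as np

    def correlate_and_detect(image, template, threshold=0.99):
        h, w = image.shape
        F_img = np.fft.fft2(image)
        F_tpl = np.fft.fft2(template, s=(h, w))    # zero-pad filter to image size
        corr = np.fft.ifft2(F_img * np.conj(F_tpl)).real
        corr /= corr.max() + 1e-12                 # normalize to the strongest peak
        return np.argwhere(corr > threshold)       # candidate ROI corner positions

    # Toy check: a patch cut from the image is found back at its own location.
    image = np.random.rand(256, 256)
    template = image[100:116, 50:66].copy()        # known target patch at (100, 50)
    print(correlate_and_detect(image, template)[:3])
    ```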

  14. Automated design of image operators that detect interest points.

    PubMed

    Trujillo, Leonardo; Olague, Gustavo

    2008-01-01

    This work describes how evolutionary computation can be used to synthesize low-level image operators that detect interesting points on digital images. Interest point detection is an essential part of many modern computer vision systems that solve tasks such as object recognition, stereo correspondence, and image indexing, to name but a few. The design of the specialized operators is posed as an optimization/search problem that is solved with genetic programming (GP), a strategy still mostly unexplored by the computer vision community. The proposed approach automatically synthesizes operators that are competitive with state-of-the-art designs, taking into account an operator's geometric stability and the global separability of detected points during fitness evaluation. The GP search space is defined using simple primitive operations that are commonly found in point detectors proposed by the vision community. The experiments described in this paper extend previous results (Trujillo and Olague, 2006a,b) by presenting 15 new operators that were synthesized through the GP-based search. Some of the synthesized operators can be regarded as improved manmade designs because they employ well-known image processing techniques and achieve highly competitive performance. On the other hand, since the GP search also generates what can be considered as unconventional operators for point detection, these results provide a new perspective to feature extraction research.

  15. Development of a battery of functional tests for low vision.

    PubMed

    Dougherty, Bradley E; Martin, Scott R; Kelly, Corey B; Jones, Lisa A; Raasch, Thomas W; Bullimore, Mark A

    2009-08-01

    We describe the development and evaluation of a battery of tests of functional visual performance on everyday tasks, intended to be suitable for the assessment of low vision patients. The functional test battery comprises eight tests. Reading rate: reading aloud 20 unrelated words at each of four print sizes (8, 4, 2, and 1 M); Telephone book: finding a name and reading the telephone number; Medicine bottle label: reading the name and dosing; Utility bill: reading the due date and amount due; Cooking instructions: reading the cooking time on a food package; Coin sorting: making a specified amount from coins placed on a table; Playing card recognition: identifying denomination and suit; and Face recognition: identifying expressions of printed, life-size faces at 1 and 3 m. All tests were timed except face and playing card recognition. Fourteen normally sighted and 24 low vision subjects were assessed with the functional test battery. Visual acuity, contrast sensitivity, and quality of life (National Eye Institute Visual Function Questionnaire 25 [NEI-VFQ 25]) were measured and the functional tests repeated. Subsequently, 23 low vision patients participated in a pilot randomized clinical trial, with half receiving low vision rehabilitation and half a delayed intervention. The functional tests were administered at enrollment and 3 months later. Normally sighted subjects could perform all tasks, but the proportion of trials performed correctly by the low vision subjects ranged from 35% for face recognition at 3 m to 95% for playing card identification. On average, low vision subjects performed three times slower than the normally sighted subjects. Timed tasks with a visual search component showed poorer repeatability. In the pilot clinical trial, low vision rehabilitation produced the greatest improvement for the medicine bottle and cooking instruction tasks. Performance of patients on these functional tests has been assessed, and some tests appear responsive to low vision rehabilitation.

  16. An optimized content-aware image retargeting method: toward expanding the perceived visual field of the high-density retinal prosthesis recipients

    NASA Astrophysics Data System (ADS)

    Li, Heng; Zeng, Yajie; Lu, Zhuofan; Cao, Xiaofei; Su, Xiaofan; Sui, Xiaohong; Wang, Jing; Chai, Xinyu

    2018-04-01

    Objective. Retinal prosthesis devices have shown great value in restoring some sight to individuals with profoundly impaired vision, but the visual acuity and visual field provided by prostheses greatly limit recipients' visual experience. In this paper, we employ computer vision approaches to expand the perceptible visual field in patients potentially implanted with a high-density retinal prosthesis while maintaining visual acuity as much as possible. Approach. We propose an optimized content-aware image retargeting method that introduces salient object detection based on color and intensity-difference contrast, aiming to remap the important information of a scene into a small visual field while preserving its original scale as much as possible. It may improve prosthetic recipients' perceived visual field and aid in performing some visual tasks (e.g. object detection and object recognition). To verify our method, psychophysical experiments, detecting the number of objects and recognizing objects, were conducted under simulated prosthetic vision. As controls, we used three other image retargeting techniques: cropping, scaling, and seam-assisted shrinkability. Main results. Results show that our method preserves more key features and yields significantly higher recognition accuracy than the other three image retargeting methods under conditions of small visual field and low resolution. Significance. The proposed method can expand the perceived visual field of prosthesis recipients and improve their object detection and recognition performance, suggesting that it may provide an effective option for the image processing module in future high-density retinal implants.
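
    The salience cue named above, contrast in color and intensity, can be roughed out with standard OpenCV calls. The sketch below is a generic global-contrast saliency map under our own assumptions (Lab color space, a small Gaussian blur), not the authors' detector.

    ```python
    # Rough global-contrast saliency sketch: per-pixel color/intensity
    # distance from the mean image color. Illustrative only.
    import cv2
    import numpy as np

    def contrast_saliency(bgr):
        lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
        blurred = cv2.GaussianBlur(lab, (5, 5), 0)    # suppress fine texture
        mean_color = lab.reshape(-1, 3).mean(axis=0)  # global mean in Lab
        saliency = np.linalg.norm(blurred - mean_color, axis=2)
        return cv2.normalize(saliency, None, 0.0, 1.0, cv2.NORM_MINMAX)
    ```

    Thresholding such a map gives candidate salient regions that a retargeting operator can protect while the rest of the scene is shrunk.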

  17. Localization and recognition of traffic signs for automated vehicle control systems

    NASA Astrophysics Data System (ADS)

    Zadeh, Mahmoud M.; Kasvand, T.; Suen, Ching Y.

    1998-01-01

    We present a computer vision system for the detection and recognition of traffic signs. Such systems are required to assist drivers and for the guidance and control of autonomous vehicles on roads and city streets. For experiments we use sequences of digitized photographs and off-line analysis. The system contains four stages. First, region segmentation based on color pixel classification, called SRSM, limits the search to regions of interest in the scene. Second, edge tracing finds parts of the outer edges of signs that are circular or straight, corresponding to the geometrical shapes of traffic signs. The third step is geometrical analysis of the outer edge and preliminary recognition of each candidate region, which may be a potential traffic sign. The final recognition step uses color combinations within each region and model matching. This system may be used for recognition of other types of objects, provided that the geometrical shape and color content remain reasonably constant. The method is reliable, easy to implement, and fast, and it differs from the road sign recognition method in the PROMETHEUS project. The overall structure of the approach is sketched.
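
    As a concrete illustration of color-based region segmentation, the sketch below thresholds in HSV space for the red rims common on warning and prohibition signs. It is a generic stand-in under hand-picked thresholds, not the SRSM classifier itself.

    ```python
    # Generic color segmentation for red-rimmed signs; thresholds are
    # hand-picked and illustrative, not taken from SRSM.
    import cv2
    import numpy as np

    def red_sign_mask(bgr):
        hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
        # Red wraps around hue 0 in OpenCV's 0-180 hue range, so two
        # bands are combined.
        low = cv2.inRange(hsv, (0, 80, 60), (10, 255, 255))
        high = cv2.inRange(hsv, (170, 80, 60), (180, 255, 255))
        mask = low | high
        return cv2.morphologyEx(mask, cv2.MORPH_OPEN,
                                np.ones((3, 3), np.uint8))

    def candidate_regions(mask, min_area=100):
        """Bounding boxes of red regions large enough to be signs."""
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        return [cv2.boundingRect(c) for c in contours
                if cv2.contourArea(c) >= min_area]
    ```

    Each returned box would then go to the edge-tracing and geometrical-analysis stages for shape confirmation.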

  18. Knowledge-based vision for space station object motion detection, recognition, and tracking

    NASA Technical Reports Server (NTRS)

    Symosek, P.; Panda, D.; Yalamanchili, S.; Wehner, W., III

    1987-01-01

    Computer vision, especially color image analysis and understanding, has much to offer in the area of the automation of Space Station tasks such as construction, satellite servicing, rendezvous and proximity operations, inspection, experiment monitoring, data management and training. Knowledge-based techniques improve the performance of vision algorithms for unstructured environments because of their ability to deal with imprecise a priori information or inaccurately estimated feature data and still produce useful results. Conventional techniques using statistical and purely model-based approaches lack flexibility in dealing with the variabilities anticipated in the unstructured viewing environment of space. Algorithms developed under NASA sponsorship for Space Station applications to demonstrate the value of a hypothesized architecture for a Video Image Processor (VIP) are presented. Approaches to the enhancement of the performance of these algorithms with knowledge-based techniques and the potential for deployment of highly-parallel multi-processor systems for these algorithms are discussed.

  19. Pre-Capture Privacy for Small Vision Sensors.

    PubMed

    Pittaluga, Francesco; Koppal, Sanjeev Jagannatha

    2017-11-01

    The next wave of micro and nano devices will create a world with trillions of small networked cameras. This will lead to increased concerns about privacy and security. Most privacy preserving algorithms for computer vision are applied after image/video data has been captured. We propose to use privacy preserving optics that filter or block sensitive information directly from the incident light-field before sensor measurements are made, adding a new layer of privacy. In addition to balancing the privacy and utility of the captured data, we address trade-offs unique to miniature vision sensors, such as achieving high-quality field-of-view and resolution within the constraints of mass and volume. Our privacy preserving optics enable applications such as depth sensing, full-body motion tracking, people counting, blob detection and privacy preserving face recognition. While we demonstrate applications on macro-scale devices (smartphones, webcams, etc.) our theory has impact for smaller devices.

  20. Generic decoding of seen and imagined objects using hierarchical visual features.

    PubMed

    Horikawa, Tomoyasu; Kamitani, Yukiyasu

    2017-05-22

    Object recognition is a key function in both human and machine vision. While brain decoding of seen and imagined objects has been achieved, the prediction is limited to training examples. We present a decoding approach for arbitrary objects using the machine vision principle that an object category is represented by a set of features rendered invariant through hierarchical processing. We show that visual features, including those derived from a deep convolutional neural network, can be predicted from fMRI patterns, and that greater accuracy is achieved for low-/high-level features with lower-/higher-level visual areas, respectively. Predicted features are used to identify seen/imagined object categories (extending beyond decoder training) from a set of computed features for numerous object images. Furthermore, decoding of imagined objects reveals progressive recruitment of higher-to-lower visual representations. Our results demonstrate a homology between human and machine vision and its utility for brain-based information retrieval.
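
    The decoding approach, predicting hierarchical visual features from brain activity and matching them against a feature library, reduces to two linear steps. The sketch below, with illustrative names and shapes, uses ridge regression and correlation-based identification as generic stand-ins for the paper's decoders.

    ```python
    # Schematic feature decoding: voxels -> visual features -> category.
    # Shapes and names are illustrative; not the authors' exact pipeline.
    import numpy as np
    from sklearn.linear_model import Ridge

    def train_feature_decoder(fmri_train, feats_train, alpha=1.0):
        """Fit a linear map from fMRI voxel patterns to feature units."""
        return Ridge(alpha=alpha).fit(fmri_train, feats_train)

    def identify_category(decoder, fmri_trial, category_feats):
        """Pick the category whose average feature vector best correlates
        with the features predicted from a single fMRI trial."""
        predicted = decoder.predict(fmri_trial[None, :])[0]
        corrs = [np.corrcoef(predicted, cf)[0, 1] for cf in category_feats]
        return int(np.argmax(corrs))
    ```

    Because `category_feats` can hold features computed for categories never shown during decoder training, identification extends beyond the training examples, which is the key point of the approach.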

  1. Sparse Representation of Multimodality Sensing Databases for Data Mining and Retrieval

    DTIC Science & Technology

    2015-04-09

    Savarese. Estimating the Aspect Layout of Object Categories, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 19-JUN-12. ...Time Equivalent (FTE) support provided by this agreement, and total for each category: (a) Graduate Students: Liang Mei, 50% FTE, EE: systems, PhD candidate; Min Sun, 50% FTE, EE: systems, PhD candidate; Yu Xiang, 50% FTE, EE: systems, PhD candidate; Dae Yon Jung, 50% FTE, EE: systems, PhD

  2. Fine-grained recognition of plants from images.

    PubMed

    Šulc, Milan; Matas, Jiří

    2017-01-01

    Fine-grained recognition of plants from images is a challenging computer vision task, due to the diverse appearance and complex structure of plants, high intra-class variability, and small inter-class differences. We review the state-of-the-art and discuss plant recognition tasks, from the identification of plants from specific plant organs to general plant recognition "in the wild". We propose texture analysis and deep learning methods for different plant recognition tasks, and evaluate and compare them to the state-of-the-art. Texture analysis is only applied to images with unambiguous segmentation (bark and leaf recognition), whereas CNNs are only applied when sufficiently large datasets are available. The results provide insight into the complexity of different plant recognition tasks. The proposed methods outperform the state-of-the-art in leaf and bark classification and achieve very competitive results in plant recognition "in the wild". The results suggest that recognition of segmented leaves is practically a solved problem when high volumes of training data are available. The generality and higher capacity of state-of-the-art CNNs make them suitable for plant recognition "in the wild", where the views of plant organs or plants vary significantly and the difficulty is increased by occlusions and background clutter.

  3. Multi-modal low cost mobile indoor surveillance system on the Robust Artificial Intelligence-based Defense Electro Robot (RAIDER)

    NASA Astrophysics Data System (ADS)

    Nair, Binu M.; Diskin, Yakov; Asari, Vijayan K.

    2012-10-01

    We present an autonomous system capable of performing security check routines. The surveillance machine, built on the Clearpath Husky robotic platform, is equipped with three IP cameras with different orientations for the surveillance tasks of face recognition, human activity recognition, autonomous navigation, and 3D reconstruction of its environment. Combining the computer vision algorithms with a robotic machine has given birth to the Robust Artificial Intelligence-based Defense Electro-Robot (RAIDER). The end purpose of the RAIDER is to conduct a patrolling routine on a single floor of a building several times a day. As the RAIDER travels down the corridors, off-line algorithms use two of the RAIDER's side-mounted cameras to perform 3D reconstruction from monocular vision, updating a 3D model to the most current state of the indoor environment. Using frames from the front-mounted camera, positioned at human eye level, the system performs face recognition with real-time training of unknown subjects. A human activity recognition algorithm will also be implemented, in which each detected person is assigned to a set of action classes picked to classify ordinary and harmful student activities in a hallway setting. The system is designed to detect changes and irregularities within an environment as well as to familiarize itself with regular faces and actions in order to distinguish potentially dangerous behavior. In this paper, we present the various algorithms and their modifications which, when implemented on the RAIDER, serve the purpose of indoor surveillance.

  4. Advanced biologically plausible algorithms for low-level image processing

    NASA Astrophysics Data System (ADS)

    Gusakova, Valentina I.; Podladchikova, Lubov N.; Shaposhnikov, Dmitry G.; Markin, Sergey N.; Golovan, Alexander V.; Lee, Seong-Whan

    1999-08-01

    At present, in computer vision, the approach based on modeling biological vision mechanisms is being extensively developed. However, up to now, real-world image processing has had no effective solution in the framework of either biologically inspired or conventional approaches. Evidently, new algorithms and system architectures based on advanced biological motivation should be developed to solve the computational problems related to this visual task. A basic problem that must be solved to create an effective artificial visual system for real-world images is the search for new low-level image processing algorithms, which to a great extent determine system performance. In the present paper, the results of psychophysical experiments and several advanced biologically motivated algorithms for low-level processing are presented. These algorithms are based on local space-variant filtering, contextual encoding of the visual information presented at the center of the input window, and automatic detection of perceptually important image fragments. The core of the latter algorithm is the use of local feature conjunctions, such as non-collinear oriented segments, and the formation of composite feature maps. The developed algorithms were integrated into the foveal active vision model MARR. It is expected that the proposed algorithms may significantly improve model performance in real-world image processing during memorization, search, and recognition.

  5. Automatic micropropagation of plants--the vision-system: graph rewriting as pattern recognition

    NASA Astrophysics Data System (ADS)

    Schwanke, Joerg; Megnet, Roland; Jensch, Peter F.

    1993-03-01

    The automation of plant micropropagation is necessary to produce large amounts of biomass. Plants have to be dissected at particular cutting-points, and a vision system is needed to recognize the cutting-points on the plants. With this background, this contribution is directed at the underlying formalism for determining cutting-points on abstract plant models. We show the usefulness of pattern recognition by graph rewriting, along with some examples in this context.

  6. A Linked List-Based Algorithm for Blob Detection on Embedded Vision-Based Sensors

    PubMed Central

    Acevedo-Avila, Ricardo; Gonzalez-Mendoza, Miguel; Garcia-Garcia, Andres

    2016-01-01

    Blob detection is a common task in vision-based applications. Most existing algorithms are aimed at execution on general purpose computers; while very few can be adapted to the computing restrictions present in embedded platforms. This paper focuses on the design of an algorithm capable of real-time blob detection that minimizes system memory consumption. The proposed algorithm detects objects in one image scan; it is based on a linked-list data structure tree used to label blobs depending on their shape and node information. An example application showing the results of a blob detection co-processor has been built on a low-powered field programmable gate array hardware as a step towards developing a smart video surveillance system. The detection method is intended for general purpose application. As such, several test cases focused on character recognition are also examined. The results obtained present a fair trade-off between accuracy and memory requirements; and prove the validity of the proposed approach for real-time implementation on resource-constrained computing platforms. PMID:27240382
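
    A single-scan, run-based labeling in the same spirit (though not the authors' exact linked-list structure) can be written compactly: runs of foreground pixels are recorded per row and merged with overlapping runs on the previous row.

    ```python
    # One-scan, run-based blob labeling sketch; a union-find over runs
    # plays the role of the paper's linked-list merging. Illustrative,
    # not the authors' exact data structure.
    def label_blobs(binary):
        """binary: sequence of rows of 0/1 values.
        Returns blobs as lists of (row, col_start, col_end) runs."""
        runs, parent, prev_row = [], {}, []

        def find(a):                        # union-find with path halving
            while parent[a] != a:
                parent[a] = parent[parent[a]]
                a = parent[a]
            return a

        for r, row in enumerate(binary):
            current_row, c, width = [], 0, len(row)
            while c < width:
                if row[c]:
                    start = c
                    while c < width and row[c]:
                        c += 1
                    rid = len(runs)         # id of the new run
                    runs.append((r, start, c - 1))
                    parent[rid] = rid
                    # Merge with 8-connected runs on the previous row.
                    for pid in prev_row:
                        _, ps, pe = runs[pid]
                        if ps <= c and pe >= start - 1:
                            ra, rb = find(rid), find(pid)
                            if ra != rb:
                                parent[ra] = rb
                    current_row.append(rid)
                else:
                    c += 1
            prev_row = current_row

        blobs = {}
        for rid in range(len(runs)):
            blobs.setdefault(find(rid), []).append(runs[rid])
        return list(blobs.values())
    ```

    Storing runs rather than per-pixel labels is what keeps memory consumption low on embedded targets, and the image is never revisited after the single scan.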

  7. Recognizing sights, smells, and sounds with gnostic fields.

    PubMed

    Kanan, Christopher

    2013-01-01

    Mammals rely on vision, audition, and olfaction to remotely sense stimuli in their environment. Determining how the mammalian brain uses this sensory information to recognize objects has been one of the major goals of psychology and neuroscience. Likewise, researchers in computer vision, machine audition, and machine olfaction have endeavored to discover good algorithms for stimulus classification. Almost 50 years ago, the neuroscientist Jerzy Konorski proposed a theoretical model in his final monograph in which competing sets of "gnostic" neurons sitting atop sensory processing hierarchies enabled stimuli to be robustly categorized, despite variations in their presentation. Much of what Konorski hypothesized has been remarkably accurate, and neurons with gnostic-like properties have been discovered in visual, aural, and olfactory brain regions. Surprisingly, there have not been any attempts to directly transform his theoretical model into a computational one. Here, I describe the first computational implementation of Konorski's theory. The model is not domain specific, and it surpasses the best machine learning algorithms on challenging image, music, and olfactory classification tasks, while also being simpler. My results suggest that criticisms of exemplar-based models of object recognition as being computationally intractable due to limited neural resources are unfounded.

  8. Recognizing Sights, Smells, and Sounds with Gnostic Fields

    PubMed Central

    Kanan, Christopher

    2013-01-01

    Mammals rely on vision, audition, and olfaction to remotely sense stimuli in their environment. Determining how the mammalian brain uses this sensory information to recognize objects has been one of the major goals of psychology and neuroscience. Likewise, researchers in computer vision, machine audition, and machine olfaction have endeavored to discover good algorithms for stimulus classification. Almost 50 years ago, the neuroscientist Jerzy Konorski proposed a theoretical model in his final monograph in which competing sets of “gnostic” neurons sitting atop sensory processing hierarchies enabled stimuli to be robustly categorized, despite variations in their presentation. Much of what Konorski hypothesized has been remarkably accurate, and neurons with gnostic-like properties have been discovered in visual, aural, and olfactory brain regions. Surprisingly, there have not been any attempts to directly transform his theoretical model into a computational one. Here, I describe the first computational implementation of Konorski's theory. The model is not domain specific, and it surpasses the best machine learning algorithms on challenging image, music, and olfactory classification tasks, while also being simpler. My results suggest that criticisms of exemplar-based models of object recognition as being computationally intractable due to limited neural resources are unfounded. PMID:23365648

  9. Computer-Vision-Assisted Palm Rehabilitation With Supervised Learning.

    PubMed

    Vamsikrishna, K M; Dogra, Debi Prosad; Desarkar, Maunendra Sankar

    2016-05-01

    Physical rehabilitation supported by a computer-assisted interface is gaining popularity among the health-care fraternity. In this paper, we have proposed a computer-vision-assisted contactless methodology to facilitate palm and finger rehabilitation. A Leap Motion controller has been interfaced with a computing device to record parameters describing 3-D movements of the palm of a user undergoing rehabilitation. We have proposed an interface using the Unity3D development platform. Our interface is capable of analyzing intermediate steps of rehabilitation without the help of an expert, and it can provide online feedback to the user. Isolated gestures are classified using linear discriminant analysis (DA) and support vector machines (SVM). Finally, a set of discrete hidden Markov models (HMM) has been used to classify gesture sequences performed during rehabilitation. Experimental validation using a large number of samples collected from healthy volunteers reveals that DA and SVM perform similarly when applied to isolated gesture recognition. We have compared the results of HMM-based sequence classification with conditional random field (CRF) based techniques. Our results confirm that both HMM and CRF perform quite similarly when tested on gesture sequences. The proposed system can be used for home-based palm or finger rehabilitation in the absence of experts.
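
    HMM-based sequence classification of the kind described above is commonly set up as one generative model per gesture class. The sketch below uses the third-party hmmlearn package with Gaussian emissions as a generic stand-in for the paper's discrete HMMs, and assumes per-frame feature vectors are already extracted; names and the number of hidden states are illustrative.

    ```python
    # One HMM per gesture class; a test sequence is assigned to the class
    # whose model yields the highest log-likelihood. hmmlearn stands in
    # for the paper's discrete HMMs.
    import numpy as np
    from hmmlearn.hmm import GaussianHMM

    def fit_class_models(sequences_by_class, n_states=4):
        """sequences_by_class: {label: [seq, ...]}, seq (n_frames, n_feats)."""
        models = {}
        for label, seqs in sequences_by_class.items():
            X = np.vstack(seqs)                 # frames stacked for training
            lengths = [len(s) for s in seqs]    # per-sequence frame counts
            models[label] = GaussianHMM(n_components=n_states).fit(X, lengths)
        return models

    def classify_sequence(models, seq):
        """Return the label whose model scores the sequence highest."""
        return max(models, key=lambda label: models[label].score(seq))
    ```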

  10. Multi-Frame Object Detection

    DTIC Science & Technology

    2012-09-01

    ensures that the trainer will produce a cascade that achieves a 0.9044 hit rate (= 0.99^10) or better, or it will fail trying. The Viola-Jones...by the user. Thus, a final cascade cannot be produced, and the trainer has failed at the specific hit and FA rate requirements. ...International Journal of Computer Vision, vol. 63, no. 2, pp. 153-161, July 2005. [3] L. Lee, "Gait dynamics for recognition and classification," in AI Memo

  11. Velocity and Structure Estimation of a Moving Object Using a Moving Monocular Camera

    DTIC Science & Technology

    2006-01-01

    map the Euclidean position of static landmarks or visual features in the environment. Recent applications of this technique include aerial...From Motion in a Piecewise Planar Environment," International Journal of Pattern Recognition and Artificial Intelligence, Vol. 2, No. 3, pp. 485-508, 1988. [9] J. M. Ferryman, S. J. Maybank, and A. D. Worrall, "Visual Surveillance for Moving Vehicles," Intl. Journal of Computer Vision, Vol. 37, No

  12. Robust Point Set Matching for Partial Face Recognition.

    PubMed

    Weng, Renliang; Lu, Jiwen; Tan, Yap-Peng

    2016-03-01

    Over the past three decades, a number of face recognition methods have been proposed in computer vision, and most of them use holistic face images for person identification. In many real-world scenarios, especially in unconstrained environments, human faces may be occluded by other objects, and it is difficult to obtain fully holistic face images for recognition. To address this, we propose a new partial face recognition approach to recognize persons of interest from their partial faces. Given a gallery image and a probe face patch, we first detect keypoints and extract their local textural features. Then, we propose a robust point set matching method to discriminatively match these two extracted local feature sets, where both the textural information and the geometrical information of local features are used explicitly and simultaneously for matching. Finally, the similarity of two faces is computed as the distance between these two aligned feature sets. Experimental results on four public face data sets show the effectiveness of the proposed approach.
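
    The matching idea, local descriptors pruned by a geometric consistency check, can be sketched with standard OpenCV tools. Below, ORB stands in for the unspecified local features and a RANSAC affine fit stands in for the paper's joint textural-geometrical point set matching; the similarity score and inlier threshold are illustrative.

    ```python
    # Hedged sketch: descriptor matching followed by a geometric
    # consistency check. Not the authors' point set matching method.
    import cv2
    import numpy as np

    def match_partial_face(gallery_gray, probe_gray, min_matches=12):
        orb = cv2.ORB_create(1000)
        kg, dg = orb.detectAndCompute(gallery_gray, None)
        kp, dp = orb.detectAndCompute(probe_gray, None)
        if dg is None or dp is None:
            return 0.0
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = matcher.match(dg, dp)
        if len(matches) < min_matches:
            return 0.0
        src = np.float32([kg[m.queryIdx].pt for m in matches])
        dst = np.float32([kp[m.trainIdx].pt for m in matches])
        # RANSAC affine fit rejects geometrically inconsistent matches.
        _, inliers = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC)
        n_inliers = int(inliers.sum()) if inliers is not None else 0
        return n_inliers / len(matches)   # similarity as inlier ratio
    ```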

  13. Identification and location of catenary insulator in complex background based on machine vision

    NASA Astrophysics Data System (ADS)

    Yao, Xiaotong; Pan, Yingli; Liu, Li; Cheng, Xiao

    2018-04-01

    Precise localization of the insulator is an important prerequisite for fault detection. Because current algorithms for locating insulators in catenary inspection images are not accurate, a target recognition and localization method based on binocular vision combined with SURF features is proposed. First, because the insulator lies in a complex environment, SURF features are used to achieve coarse positioning of the target; then, the binocular vision principle is used to calculate the 3D coordinates of the coarsely located object, achieving recognition and fine localization of the target; finally, the 3D coordinates of the object's center of mass are stored and transferred to the inspection robot to control its detection position. Experimental results demonstrate that the proposed method has better recognition efficiency and accuracy, successfully identifies the target, and has definite application value.

  14. Deep Learning with Convolutional Neural Networks Applied to Electromyography Data: A Resource for the Classification of Movements for Prosthetic Hands

    PubMed Central

    Atzori, Manfredo; Cognolato, Matteo; Müller, Henning

    2016-01-01

    Natural control methods based on surface electromyography (sEMG) and pattern recognition are promising for hand prosthetics. However, the control robustness offered by scientific research is still not sufficient for many real-life applications, and commercial prostheses are capable of offering natural control for only a few movements. In recent years deep learning has revolutionized several fields of machine learning, including computer vision and speech recognition. Our objective is to test its methods for natural control of robotic hands via sEMG using a large number of intact subjects and amputees. We tested convolutional networks for the classification of an average of 50 hand movements in 67 intact subjects and 11 transradial amputees. The simple architecture of the neural network allowed us to run several tests evaluating the effects of pre-processing, layer architecture, data augmentation, and optimization. The classification results are compared with a set of classical classification methods applied to the same datasets. The classification accuracy obtained with convolutional neural networks using the proposed architecture is higher than the average results obtained with the classical classification methods, but lower than the results obtained with the best reference methods in our tests. The results show that convolutional neural networks with a very simple architecture can produce accurate results comparable to the average classical classification methods, and that several factors (including pre-processing, the architecture of the net, and the optimization parameters) can be fundamental for the analysis of sEMG data. Larger networks achieve higher accuracy on computer vision and object recognition tasks, which suggests that it may be worth evaluating whether larger networks can increase sEMG classification accuracy as well. PMID:27656140

  15. Deep Learning with Convolutional Neural Networks Applied to Electromyography Data: A Resource for the Classification of Movements for Prosthetic Hands.

    PubMed

    Atzori, Manfredo; Cognolato, Matteo; Müller, Henning

    2016-01-01

    Natural control methods based on surface electromyography (sEMG) and pattern recognition are promising for hand prosthetics. However, the control robustness offered by scientific research is still not sufficient for many real-life applications, and commercial prostheses are capable of offering natural control for only a few movements. In recent years deep learning has revolutionized several fields of machine learning, including computer vision and speech recognition. Our objective is to test its methods for natural control of robotic hands via sEMG using a large number of intact subjects and amputees. We tested convolutional networks for the classification of an average of 50 hand movements in 67 intact subjects and 11 transradial amputees. The simple architecture of the neural network allowed us to run several tests evaluating the effects of pre-processing, layer architecture, data augmentation, and optimization. The classification results are compared with a set of classical classification methods applied to the same datasets. The classification accuracy obtained with convolutional neural networks using the proposed architecture is higher than the average results obtained with the classical classification methods, but lower than the results obtained with the best reference methods in our tests. The results show that convolutional neural networks with a very simple architecture can produce accurate results comparable to the average classical classification methods, and that several factors (including pre-processing, the architecture of the net, and the optimization parameters) can be fundamental for the analysis of sEMG data. Larger networks achieve higher accuracy on computer vision and object recognition tasks, which suggests that it may be worth evaluating whether larger networks can increase sEMG classification accuracy as well.

  16. ROBIN: a platform for evaluating automatic target recognition algorithms: II. Protocols used for evaluating algorithms and results obtained on the SAGEM DS database

    NASA Astrophysics Data System (ADS)

    Duclos, D.; Lonnoy, J.; Guillerm, Q.; Jurie, F.; Herbin, S.; D'Angelo, E.

    2008-04-01

    Over the past five years, the computer vision community has explored many different avenues of research for automatic target recognition. Noticeable advances have been made, and we are now in a situation where large-scale evaluations of ATR technologies have to be carried out to determine the limitations of recently proposed methods and the best directions for future work. ROBIN, a project funded by the French Ministry of Defence and the French Ministry of Research, has the ambition of being a new reference for benchmarking ATR algorithms in operational contexts. This project, headed by major companies and research centers involved in computer vision R&D in the field of defense (Bertin Technologies, CNES, ECA, DGA, EADS, INRIA, ONERA, MBDA, SAGEM, THALES), recently released a large dataset of several thousand hand-annotated infrared and RGB images of different targets in different situations. Setting up an evaluation campaign requires defining, accurately and carefully, the sets of data (both for training ATR algorithms and for their evaluation), the tasks to be evaluated, and finally the protocols and metrics for the evaluation. ROBIN offers interesting contributions to each of these three points. This paper first describes, justifies, and defines the set of functions used in the ROBIN competitions and relevant for evaluating ATR algorithms (detection, localization, recognition, and identification). It also defines the metrics and the protocol used for evaluating these functions. In the second part of the paper, the results obtained by several state-of-the-art algorithms on the SAGEM DS database (a subpart of ROBIN) are presented and discussed.

  17. Extracting semantics from audio-visual content: the final frontier in multimedia retrieval.

    PubMed

    Naphade, M R; Huang, T S

    2002-01-01

    Multimedia understanding is a fast emerging interdisciplinary research area. There is tremendous potential for effective use of multimedia content through intelligent analysis. Diverse application areas are increasingly relying on multimedia understanding systems. Advances in multimedia understanding are related directly to advances in signal processing, computer vision, pattern recognition, multimedia databases, and smart sensors. We review the state-of-the-art techniques in multimedia retrieval. In particular, we discuss how multimedia retrieval can be viewed as a pattern recognition problem. We discuss how reliance on powerful pattern recognition and machine learning techniques is increasing in the field of multimedia retrieval. We review the state-of-the-art multimedia understanding systems with particular emphasis on a system for semantic video indexing centered around multijects and multinets. We discuss how semantic retrieval is centered around concepts and context and the various mechanisms for modeling concepts and context.

  18. Image-algebraic design of multispectral target recognition algorithms

    NASA Astrophysics Data System (ADS)

    Schmalz, Mark S.; Ritter, Gerhard X.

    1994-06-01

    In this paper, we discuss methods for multispectral ATR (Automated Target Recognition) of small targets that are sensed under suboptimal conditions, such as haze, smoke, and low light levels. In particular, we discuss our ongoing development of algorithms and software that effect intelligent object recognition by selecting ATR filter parameters according to ambient conditions. Our algorithms are expressed in terms of IA (image algebra), a concise, rigorous notation that unifies linear and nonlinear mathematics in the image processing domain. IA has been implemented on a variety of parallel computers, with preprocessors available for the Ada and FORTRAN languages. An image algebra C++ class library has recently been made available. Thus, our algorithms are both feasible implementationally and portable to numerous machines. Analyses emphasize the aspects of image algebra that aid the design of multispectral vision algorithms, such as parameterized templates that facilitate the flexible specification of ATR filters.

  19. Learning Weight Uncertainty with Stochastic Gradient MCMC for Shape Classification

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Chunyuan; Stevens, Andrew J.; Chen, Changyou

    2016-08-10

    Learning the representation of shape cues in 2D & 3D objects for recognition is a fundamental task in computer vision. Deep neural networks (DNNs) have shown promising performance on this task. Due to the large variability of shapes, accurate recognition relies on good estimates of model uncertainty, which are ignored in the traditional training of DNNs, typically learned via stochastic optimization. This paper leverages recent advances in stochastic gradient Markov Chain Monte Carlo (SG-MCMC) to learn weight uncertainty in DNNs. It yields principled Bayesian interpretations for the commonly used Dropout/DropConnect techniques and incorporates them into the SG-MCMC framework. Extensive experiments on 2D & 3D shape datasets and various DNN models demonstrate the superiority of the proposed approach over stochastic optimization. Our approach yields higher recognition accuracy when used in conjunction with Dropout and Batch-Normalization.
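
    The simplest SG-MCMC sampler, stochastic gradient Langevin dynamics (SGLD), makes the idea concrete: a weight update is an ordinary gradient step plus Gaussian noise calibrated to the step size, so the trajectory samples the posterior instead of collapsing to a point estimate. The PyTorch sketch below is a generic SGLD step under our own conventions, not the paper's Dropout/DropConnect-integrated samplers; the prior term is omitted for brevity.

    ```python
    # One SGLD step: theta <- theta - lr * grad(U) + N(0, 2*lr).
    # `loss` is the minibatch mean negative log-likelihood; scaling by
    # the dataset size approximates the full-data gradient. Prior omitted.
    import torch

    def sgld_step(params, loss, lr, n_data):
        grads = torch.autograd.grad(n_data * loss, params)
        with torch.no_grad():
            for p, g in zip(params, grads):
                noise = torch.randn_like(p) * (2.0 * lr) ** 0.5
                p.add_(-lr * g + noise)
    ```

    Collecting the parameter values visited over many such steps gives an ensemble whose prediction spread serves as the weight-uncertainty estimate.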

  20. A computer vision system for diagnosing scoliosis using moiré images.

    PubMed

    Batouche, M; Benlamri, R; Kholladi, M K

    1996-07-01

    For young people, scoliosis deformities are an evolving process that must be detected and treated as early as possible. The moiré technique is simple, inexpensive, non-invasive, and especially convenient for detecting spinal deformations. Doctors make their diagnosis by analysing the symmetry of the fringes obtained by such techniques. In this paper, we present a computer vision system to help diagnose spinal deformations from noisy moiré images of the human back. The approach adopted in this paper consists of extracting fringe contours from moiré images, then localizing anatomical features (the spinal column, lumbar hollow, and shoulder blades) that are crucial for the 3D surface generation carried out using Mota's relaxation operator. Finally, rules furnished by doctors are used to derive the kind of spinal deformation and to yield the diagnosis. The proposed system has been tested on a set of noisy moiré images, and the experimental results have shown its robustness and reliability for the recognition of most scoliosis deformities.

  1. Neuro-inspired smart image sensor: analog Hmax implementation

    NASA Astrophysics Data System (ADS)

    Paindavoine, Michel; Dubois, Jérôme; Musa, Purnawarman

    2015-03-01

    The neuro-inspired vision approach, based on models from biology, makes it possible to reduce computational complexity. One of these models, the Hmax model, shows that the recognition of an object in the visual cortex mobilizes the V1, V2, and V4 areas. From the computational point of view, V1 corresponds to directional filtering (for example Sobel filters, Gabor filters, or wavelet filters). This information is then processed in area V2 in order to obtain local maxima. This new information is then sent to an artificial neural network. This neural processing module corresponds to area V4 of the visual cortex and is intended to categorize objects present in the scene. In order to realize autonomous vision systems (consuming a few milliwatts) with such processing on chip, we designed and fabricated, in 0.35 μm CMOS technology, prototypes of two image sensors that perform the V1 and V2 processing of the Hmax model.

  2. Computer Vision Malaria Diagnostic Systems-Progress and Prospects.

    PubMed

    Pollak, Joseph Joel; Houri-Yafin, Arnon; Salpeter, Seth J

    2017-01-01

    Accurate malaria diagnosis is critical to prevent malaria fatalities, curb overuse of antimalarial drugs, and promote appropriate management of other causes of fever. While several diagnostic tests exist, the need for a rapid and highly accurate malaria assay remains. Microscopy and rapid diagnostic tests are the main diagnostic modalities available, yet they can demonstrate poor performance and accuracy. Automated microscopy platforms have the potential to significantly improve and standardize malaria diagnosis. Based on image recognition and machine learning algorithms, these systems maintain the benefits of light microscopy and provide improvements such as quicker scanning time, greater scanning area, and increased consistency brought by automation. While these applications have been in development for over a decade, recently several commercial platforms have emerged. In this review, we discuss the most advanced computer vision malaria diagnostic technologies and investigate several of their features which are central to field use. Additionally, we discuss the technological and policy barriers to implementing these technologies in low-resource settings world-wide.

  3. Robust and Effective Component-based Banknote Recognition for the Blind

    PubMed Central

    Hasanuzzaman, Faiz M.; Yang, Xiaodong; Tian, YingLi

    2012-01-01

    We develop a novel camera-based computer vision technology to automatically recognize banknotes to assist visually impaired people. Our banknote recognition system is robust and effective with the following features: 1) high accuracy: a high true recognition rate and a low false recognition rate; 2) robustness: it handles a variety of currency designs and bills in various conditions; 3) high efficiency: it recognizes banknotes quickly; and 4) ease of use: it helps blind users aim at the target for image capture. To make the system robust to a variety of conditions including occlusion, rotation, scaling, cluttered background, illumination change, viewpoint variation, and worn or wrinkled bills, we propose a component-based framework using Speeded Up Robust Features (SURF). Furthermore, we employ the spatial relationship of matched SURF features to detect whether there is a bill in the camera view. This process largely alleviates false recognition and can guide the user to correctly aim at the bill to be recognized. The robustness and generalizability of the proposed system are evaluated on a dataset including both positive images (with U.S. banknotes) and negative images (no U.S. banknotes) collected under a variety of conditions. The proposed algorithm achieves a 100% true recognition rate and a 0% false recognition rate. Our banknote recognition system has also been tested by blind users. PMID:22661884

  4. Fast linear feature detection using multiple directional non-maximum suppression.

    PubMed

    Sun, C; Vallotton, P

    2009-05-01

    The capacity to detect linear features is central to image analysis, computer vision and pattern recognition and has practical applications in areas such as neurite outgrowth detection, retinal vessel extraction, skin hair removal, plant root analysis and road detection. Linear feature detection often represents the starting point for image segmentation and image interpretation. In this paper, we present a new algorithm for linear feature detection using multiple directional non-maximum suppression with symmetry checking and gap linking. Given its low computational complexity, the algorithm is very fast. We show in several examples that it performs very well in terms of both sensitivity and continuity of detected linear features.
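
    The central operation, non-maximum suppression along several directions, can be sketched directly: a pixel survives if it is a local maximum along at least one of a small set of orientations. The NumPy sketch below is a simplified rendering under our own choices of window radius and final threshold; the paper's symmetry checking and gap linking are omitted, and `np.roll` wraps at image borders, which a real implementation would handle explicitly.

    ```python
    # Simplified multiple-directional non-maximum suppression: keep pixels
    # that are maxima over a short window perpendicular to some candidate
    # line orientation. Radius and threshold are illustrative.
    import numpy as np

    def directional_nms(response, n_dirs=4, radius=2):
        keep = np.zeros(response.shape, dtype=bool)
        for theta in np.pi * np.arange(n_dirs) / n_dirs:
            # Unit step orthogonal to a line at angle theta.
            dy, dx = np.cos(theta), -np.sin(theta)
            is_max = np.ones(response.shape, dtype=bool)
            for k in range(1, radius + 1):
                for sign in (-1, 1):
                    oy = int(round(sign * k * dy))
                    ox = int(round(sign * k * dx))
                    shifted = np.roll(response, (oy, ox), axis=(0, 1))
                    is_max &= response >= shifted
            keep |= is_max
        # Illustrative global threshold to discard weak maxima.
        return keep & (response > response.mean() + 2 * response.std())
    ```

    Because each direction needs only a handful of shifted comparisons, the whole pass stays linear in the number of pixels, which is where the speed comes from.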

  5. An improved silhouette for human pose estimation

    NASA Astrophysics Data System (ADS)

    Hawes, Anthony H.; Iftekharuddin, Khan M.

    2017-08-01

    We propose a novel method for analyzing images that exploits the natural lines of a human pose to find areas where self-occlusion could be present. Errors caused by self-occlusion cause several modern human pose estimation methods to misidentify body parts, which reduces the performance of most action recognition algorithms. Our method is motivated by the observation that, in several cases, occlusion can be reasoned about using only the boundary lines of limbs. An intelligent edge detection algorithm based on the above principle could be used to augment the silhouette with information useful for pose estimation algorithms and push forward progress on occlusion handling for human action recognition. The algorithm described is applicable to computer vision scenarios involving 2D images and (appropriately flattened) 3D images.

  6. iFER: facial expression recognition using automatically selected geometric eye and eyebrow features

    NASA Astrophysics Data System (ADS)

    Oztel, Ismail; Yolcu, Gozde; Oz, Cemil; Kazan, Serap; Bunyak, Filiz

    2018-03-01

    Facial expressions have an important role in interpersonal communications and the estimation of emotional states or intentions. Automatic recognition of facial expressions has led to many practical applications and has become one of the important topics in computer vision. We present a facial expression recognition system that relies on geometry-based features extracted from the eye and eyebrow regions of the face. The proposed system detects keypoints on frontal face images and forms a feature set using geometric relationships among groups of detected keypoints. The obtained feature set is refined and reduced using the sequential forward selection (SFS) algorithm and fed to a support vector machine classifier to recognize five facial expression classes. The proposed system, iFER (eye-eyebrow only facial expression recognition), is robust to lower-face occlusions that may be caused by beards, mustaches, scarves, etc., and to lower-face motion during speech production. Preliminary experiments on benchmark datasets produced promising results, outperforming previous facial expression recognition studies that use partial face features, and results comparable to studies using whole-face information, only about 2.5% lower than the best whole-face system while using only about one third of the facial region.
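
    Sequential forward selection of the kind used here is a simple greedy wrapper: starting from an empty set, it repeatedly adds the feature whose inclusion yields the best cross-validated classifier score. The scikit-learn sketch below is a generic rendering with illustrative names; the geometric eye/eyebrow features are assumed to be precomputed columns of `X`.

    ```python
    # Greedy sequential forward selection wrapped around an SVM.
    # Assumes X holds precomputed geometric features, one per column.
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    def sequential_forward_selection(X, y, max_features):
        selected = []
        remaining = list(range(X.shape[1]))
        while remaining and len(selected) < max_features:
            # Score each candidate added to the current subset.
            scored = [
                (cross_val_score(SVC(kernel="rbf"), X[:, selected + [f]],
                                 y, cv=5).mean(), f)
                for f in remaining
            ]
            best_score, best_feature = max(scored)
            selected.append(best_feature)
            remaining.remove(best_feature)
        return selected
    ```

    A practical variant stops early once the best candidate no longer improves the cross-validated score, which keeps the final feature set small.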

  7. Face sketch recognition based on edge enhancement via deep learning

    NASA Astrophysics Data System (ADS)

    Xie, Zhenzhu; Yang, Fumeng; Zhang, Yuming; Wu, Congzhong

    2017-11-01

    In this paper, we address the face sketch recognition problem. First, we utilize the eigenface algorithm to convert a sketch image into a synthesized face image. Then, considering the low-level vision problem in the synthesized face image, a super-resolution reconstruction algorithm based on a CNN (convolutional neural network) is employed to improve the visual quality. Specifically, we use a lightweight super-resolution structure to learn a residual mapping instead of directly mapping the feature maps from the low-level space to high-level patch representations, which makes the network easier to optimize and lowers its computational complexity. Finally, we adopt the LDA (linear discriminant analysis) algorithm to perform face sketch recognition on the synthesized face images before and after super resolution, respectively. Extensive experiments on the face sketch database (CUFS) from CUHK demonstrate that the recognition rate of the SVM (support vector machine) algorithm improves from 65% to 69% and that of the LDA algorithm improves from 69% to 75%. What is more, the synthesized face image after super resolution not only better describes image details such as hair, nose, and mouth, but also improves the recognition accuracy effectively.
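
    Learning a residual mapping, as described above, means the network only has to predict the missing high-frequency detail on top of a coarse upsampled input. The PyTorch sketch below shows that structure with illustrative layer sizes; it is not the authors' architecture.

    ```python
    # Minimal residual super-resolution network: the body predicts only
    # the residual detail added to the (already upsampled) input, which
    # is easier to optimize than a direct mapping. Sizes illustrative.
    import torch.nn as nn

    class ResidualSR(nn.Module):
        def __init__(self, channels=1, width=32, depth=5):
            super().__init__()
            layers = [nn.Conv2d(channels, width, 3, padding=1),
                      nn.ReLU(inplace=True)]
            for _ in range(depth - 2):
                layers += [nn.Conv2d(width, width, 3, padding=1),
                           nn.ReLU(inplace=True)]
            layers.append(nn.Conv2d(width, channels, 3, padding=1))
            self.body = nn.Sequential(*layers)

        def forward(self, upsampled):
            # Output = coarse input + learned residual detail.
            return upsampled + self.body(upsampled)
    ```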

  8. Proceedings of the Second Joint Technology Workshop on Neural Networks and Fuzzy Logic, volume 1

    NASA Technical Reports Server (NTRS)

    Lea, Robert N. (Editor); Villarreal, James (Editor)

    1991-01-01

    Documented here are papers presented at the Neural Networks and Fuzzy Logic Workshop sponsored by NASA and the University of Houston, Clear Lake. The workshop was held April 11 to 13 at the Johnson Space Center. Technical topics addressed included adaptive systems, learning algorithms, network architectures, vision, robotics, neurobiological connections, speech recognition and synthesis, fuzzy set theory and application, control and dynamics processing, space applications, fuzzy logic and neural network computers, approximate reasoning, and multiobject decision making.

  9. Computer vision system: a tool for evaluating the quality of wheat in a grain tank

    NASA Astrophysics Data System (ADS)

    Minkin, Uryi Igorevish; Panchenko, Aleksei Vladimirovich; Shkanaev, Aleksandr Yurievich; Konovalenko, Ivan Andreevich; Putintsev, Dmitry Nikolaevich; Sadekov, Rinat Nailevish

    2018-04-01

    The paper describes a technology for automating the evaluation of grain quality in the grain tank of a combine harvester. A special recognition algorithm analyzes photographic images taken by the camera and provides automatic estimates of the total mass fraction of broken grains and of the presence of non-grain material. The paper also presents the operating details of the tank prototype and assesses the accuracy of the designed algorithms.

  10. Appearance-Based Vision and the Automatic Generation of Object Recognition Programs

    DTIC Science & Technology

    1992-07-01

    ...are grouped into equivalence classes with respect to visible features; the equivalence classes are called aspects. A recognition strategy is generated from...illustrates the concept. [Residue of Table 1, "Summary of Sensors", relating vertex, edge, and face features to active and passive sensors such as an edge detector and shape-from-shading.] ...example of the detectability computation for a light-stripe range finder is shown in Figure 2 ("Detectability of a face for a light-stripe range finder").

  11. Remote logo detection using angle-distance histograms

    NASA Astrophysics Data System (ADS)

    Youn, Sungwook; Ok, Jiheon; Baek, Sangwook; Woo, Seongyoun; Lee, Chulhee

    2016-05-01

    Among the many computer vision applications, automatic logo recognition has drawn great interest from industry as well as academic institutions. In this paper, we propose an angle-distance map, which we use to develop a robust logo detection algorithm. The proposed angle-distance histogram is invariant to scale and rotation. The proposed method first uses shape information and color characteristics to find candidate regions and then applies the angle-distance histogram. Experiments show that the proposed method detects logos of various sizes and orientations.
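
    An angle-distance histogram of this kind can be built directly from contour points: each point contributes the angle and distance of its offset from the shape centroid, with distances normalized by their maximum for scale invariance. The sketch below is our own generic rendering with illustrative bin counts, not necessarily the authors' exact construction.

    ```python
    # Angle-distance histogram of a contour around its centroid. Distance
    # normalization gives scale invariance; comparing histograms under
    # circular shifts of the angle axis gives rotation tolerance.
    import numpy as np

    def angle_distance_histogram(points, n_angle=36, n_dist=8):
        pts = np.asarray(points, dtype=np.float64)
        offsets = pts - pts.mean(axis=0)          # offsets from centroid
        dist = np.hypot(offsets[:, 0], offsets[:, 1])
        dist = dist / (dist.max() + 1e-9)         # scale-invariant radius
        angle = np.mod(np.arctan2(offsets[:, 1], offsets[:, 0]), 2 * np.pi)
        hist, _, _ = np.histogram2d(angle, dist, bins=(n_angle, n_dist),
                                    range=((0, 2 * np.pi), (0, 1)))
        return hist / max(hist.sum(), 1.0)        # normalized descriptor
    ```

    Matching a candidate region against a logo template then reduces to comparing two such histograms, e.g. by taking the minimum distance over circular shifts of the angle bins.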

  12. Image understanding and the man-machine interface II; Proceedings of the Meeting, Los Angeles, CA, Jan. 17, 18, 1989

    NASA Technical Reports Server (NTRS)

    Barrett, Eamon B. (Editor); Pearson, James J. (Editor)

    1989-01-01

    Image understanding concepts and models, image understanding systems and applications, advanced digital processors and software tools, and advanced man-machine interfaces are among the topics discussed. Particular papers are presented on such topics as neural networks for computer vision, object-based segmentation and color recognition in multispectral images, the application of image algebra to image measurement and feature extraction, and the integration of modeling and graphics to create an infrared signal processing test bed.

  13. Recognizing 3 D Objects from 2D Images Using Structural Knowledge Base of Genetic Views

    DTIC Science & Technology

    1988-08-31

    technical report. [BIE85] I. Biederman, "Human image understanding: Recent research and a theory", Computer Vision, Graphics, and Image Processing, vol...model bases", Technical Report 87-85, COINS Dept., University of Massachusetts, Amherst, MA 01003, August 1987. [BUR87b] Burns, J. B. and L. J. Kitchen..."Recognition in 2D images of 3D objects from large model bases using prediction hierarchies", Proc. IJCAI-10, 1987. [BUR89] J. B. Burns, forthcoming

  14. A New Font, Specifically Designed for Peripheral Vision, Improves Peripheral Letter and Word Recognition, but Not Eye-Mediated Reading Performance

    PubMed Central

    Bernard, Jean-Baptiste; Aguilar, Carlos; Castet, Eric

    2016-01-01

    Reading speed is dramatically reduced when readers cannot use their central vision. This is because low visual acuity and crowding negatively impact letter recognition in the periphery. In this study, we designed a new font (referred to as the Eido font) in order to reduce inter-letter similarity and consequently to increase peripheral letter recognition performance. We tested this font by running five experiments that compared the Eido font with the standard Courier font. Letter spacing and x-height were identical for the two monospaced fonts. Six normally-sighted subjects used exclusively their peripheral vision to run two aloud reading tasks (with eye movements), a letter recognition task (without eye movements), a word recognition task (without eye movements) and a lexical decision task. Results show that reading speed was not significantly different between the Eido and the Courier font when subjects had to read single sentences with a round simulated gaze-contingent central scotoma (10° diameter). In contrast, Eido significantly decreased perceptual errors in peripheral crowded letter recognition (-30% errors on average for letters briefly presented at 6° eccentricity) and in peripheral word recognition (-32% errors on average for words briefly presented at 6° eccentricity). PMID:27074013

  15. A New Font, Specifically Designed for Peripheral Vision, Improves Peripheral Letter and Word Recognition, but Not Eye-Mediated Reading Performance.

    PubMed

    Bernard, Jean-Baptiste; Aguilar, Carlos; Castet, Eric

    2016-01-01

    Reading speed is dramatically reduced when readers cannot use their central vision. This is because low visual acuity and crowding negatively impact letter recognition in the periphery. In this study, we designed a new font (referred to as the Eido font) in order to reduce inter-letter similarity and consequently to increase peripheral letter recognition performance. We tested this font by running five experiments that compared the Eido font with the standard Courier font. Letter spacing and x-height were identical for the two monospaced fonts. Six normally-sighted subjects used exclusively their peripheral vision to run two aloud reading tasks (with eye movements), a letter recognition task (without eye movements), a word recognition task (without eye movements) and a lexical decision task. Results show that reading speed was not significantly different between the Eido and the Courier font when subjects had to read single sentences with a round simulated gaze-contingent central scotoma (10° diameter). In contrast, Eido significantly decreased perceptual errors in peripheral crowded letter recognition (-30% errors on average for letters briefly presented at 6° eccentricity) and in peripheral word recognition (-32% errors on average for words briefly presented at 6° eccentricity).

  16. A Decade of Neural Networks: Practical Applications and Prospects

    NASA Technical Reports Server (NTRS)

    Kemeny, Sabrina E.

    1994-01-01

    The Jet Propulsion Laboratory Neural Network Workshop, sponsored by NASA and DOD, brings together sponsoring agencies, active researchers, and the user community to formulate a vision for the next decade of neural network research and application prospects. While the speed and computing power of microprocessors continue to grow at an ever-increasing pace, the demand to intelligently and adaptively deal with the complex, fuzzy, and often ill-defined world around us remains to a large extent unaddressed. Powerful, highly parallel computing paradigms such as neural networks promise to have a major impact in addressing these needs. Papers in the workshop proceedings highlight benefits of neural networks in real-world applications compared to conventional computing techniques. Topics include fault diagnosis, pattern recognition, and multiparameter optimization.

  17. Sign language recognition and translation: a multidisciplined approach from the field of artificial intelligence.

    PubMed

    Parton, Becky Sue

    2006-01-01

    In recent years, research has progressed steadily in regard to the use of computers to recognize and render sign language. This paper reviews significant projects in the field beginning with finger-spelling hands such as "Ralph" (robotics), CyberGloves (virtual reality sensors to capture isolated and continuous signs), camera-based projects such as the CopyCat interactive American Sign Language game (computer vision), and sign recognition software (Hidden Markov Modeling and neural network systems). Avatars such as "Tessa" (Text and Sign Support Assistant; three-dimensional imaging) and spoken language to sign language translation systems such as Poland's project entitled "THETOS" (Text into Sign Language Automatic Translator, which operates in Polish; natural language processing) are addressed. The application of this research to education is also explored. The "ICICLE" (Interactive Computer Identification and Correction of Language Errors) project, for example, uses intelligent computer-aided instruction to build a tutorial system for deaf or hard-of-hearing children that analyzes their English writing and makes tailored lessons and recommendations. Finally, the article considers synthesized sign, which is being added to educational material and has the potential to be developed by students themselves.

  18. Image-based automatic recognition of larvae

    NASA Astrophysics Data System (ADS)

    Sang, Ru; Yu, Guiying; Fan, Weijun; Guo, Tiantai

    2010-08-01

    To date, research on quarantine pest recognition has mainly targeted imagoes (adult insects). However, pests in their larval stage are latent, and larvae spread easily with the circulation of agricultural and forest products. In this paper, larvae are taken as new research objects and recognized by means of machine vision, image processing, and pattern recognition. More visual information is retained and the recognition rate is improved when color image segmentation is applied to images of larvae. Owing to its affine, perspective, and brightness invariance, the scale-invariant feature transform (SIFT) is adopted for feature extraction. A neural network is used for pattern recognition, and automatic identification of larva images is achieved with satisfactory results.

  19. Visual recognition and inference using dynamic overcomplete sparse learning.

    PubMed

    Murray, Joseph F; Kreutz-Delgado, Kenneth

    2007-09-01

    We present a hierarchical architecture and learning algorithm for visual recognition and other visual inference tasks such as imagination, reconstruction of occluded images, and expectation-driven segmentation. Using properties of biological vision for guidance, we posit a stochastic generative world model and from it develop a simplified world model (SWM) based on a tractable variational approximation that is designed to enforce sparse coding. Recent developments in computational methods for learning overcomplete representations (Lewicki & Sejnowski, 2000; Teh, Welling, Osindero, & Hinton, 2003) suggest that overcompleteness can be useful for visual tasks, and we use an overcomplete dictionary learning algorithm (Kreutz-Delgado, et al., 2003) as a preprocessing stage to produce accurate, sparse codings of images. Inference is performed by constructing a dynamic multilayer network with feedforward, feedback, and lateral connections, which is trained to approximate the SWM. Learning is done with a variant of the back-propagation-through-time algorithm, which encourages convergence to desired states within a fixed number of iterations. Vision tasks require large networks, and to make learning efficient, we take advantage of the sparsity of each layer to update only a small subset of elements in a large weight matrix at each iteration. Experiments on a set of rotated objects demonstrate various types of visual inference and show that increasing the degree of overcompleteness improves recognition performance in difficult scenes with occluded objects in clutter.

  20. Complete Vision-Based Traffic Sign Recognition Supported by an I2V Communication System

    PubMed Central

    García-Garrido, Miguel A.; Ocaña, Manuel; Llorca, David F.; Arroyo, Estefanía; Pozuelo, Jorge; Gavilán, Miguel

    2012-01-01

    This paper presents a complete traffic sign recognition system based on a vision sensor onboard a moving vehicle, which detects and recognizes up to one hundred of the most important road signs, including circular and triangular signs. A restricted Hough transform is used as the detection method on information extracted from contour images, while the proposed recognition stage is based on Support Vector Machines (SVM). A novel solution is proposed to the problem of discarding detected signs that do not pertain to the host road; for that purpose, infrastructure-to-vehicle (I2V) communication and a stereo vision sensor are used. Furthermore, the outputs provided by the vision sensor are combined with data supplied by the CAN bus and a GPS sensor to obtain the global position of the detected traffic signs, which is used to identify a traffic sign in the I2V communication. The paper presents extensive tests in real driving conditions, both day and night, in which an average detection rate over 95% and an average recognition rate around 93% were obtained, with an average runtime of 35 ms that allows real-time performance. PMID:22438704
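
    The detect-then-classify structure described above can be sketched as follows; this minimal version uses OpenCV's standard circular Hough transform and a scikit-learn SVM in place of the paper's restricted Hough transform and trained classifier, and all parameter values are illustrative:

      import cv2
      import numpy as np
      from sklearn.svm import SVC

      def circular_sign_candidates(bgr):
          """Find circle candidates in an edge (contour) image with a Hough transform."""
          gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
          edges = cv2.Canny(gray, 80, 160)
          circles = cv2.HoughCircles(edges, cv2.HOUGH_GRADIENT, dp=1.2, minDist=30,
                                     param1=160, param2=40, minRadius=8, maxRadius=60)
          return [] if circles is None else circles[0]   # rows of (x, y, radius)

      # An SVM over cropped, resized candidate patches (training data assumed):
      # clf = SVC(kernel="rbf").fit(train_patches_flat, train_sign_labels)
      # for x, y, r in circular_sign_candidates(frame):
      #     patch = frame[int(y - r):int(y + r), int(x - r):int(x + r)]
      #     label = clf.predict(cv2.resize(patch, (32, 32)).reshape(1, -1))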

  2. Vision requirements for Space Station applications

    NASA Technical Reports Server (NTRS)

    Crouse, K. R.

    1985-01-01

    Problems which will be encountered by computer vision systems in Space Station operations are discussed, along with solutions being examined at the Johnson Space Center. Lighting cannot be controlled in space, nor can the random presence of reflective surfaces. Task-oriented capabilities are to include docking to moving objects, identification of unexpected objects during autonomous flights to different orbits, and diagnosis of damage and repair requirements by autonomous Space Station inspection robots. The approaches being examined to provide these and other capabilities are television and IR sensors, advanced pattern recognition programs fed by data from laser probes, laser radar for robot eyesight, and arrays of SMART sensors for automated location and tracking of target objects. Attention is also being given to liquid crystal light valves for optical processing of images for comparison with on-board electronic libraries of images.

  3. Feature extraction inspired by V1 in visual cortex

    NASA Astrophysics Data System (ADS)

    Lv, Chao; Xu, Yuelei; Zhang, Xulei; Ma, Shiping; Li, Shuai; Xin, Peng; Zhu, Mingning; Ma, Hongqiang

    2018-04-01

    Target feature extraction plays an important role in pattern recognition and is among the most complicated activities in the brain mechanisms of biological vision. Inspired by the strong ability of the primary visual cortex (V1) to extract dynamic and static features, a visual perception model is proposed. First, 28 spatial-temporal filters with different orientations, a half-squaring operation, and divisive normalization are adopted to obtain the responses of V1 simple cells; then, an adjustable parameter is added to the output weight to obtain the response of complex cells. Experimental results indicate that the proposed V1 model perceives motion information well and has good edge detection capability. The model thus performs well in feature extraction and effectively combines brain-inspired intelligence with computer vision.
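
    The simple-cell stage described here (oriented filtering, half-squaring, divisive normalization) can be sketched with standard components. The Gabor filter bank and constants below are illustrative, spatial-only substitutes for the paper's 28 spatial-temporal filters:

      import cv2
      import numpy as np

      def v1_simple_cell_responses(gray, n_orient=8, sigma=2.0, eps=1e-3):
          """Oriented filtering -> half-squaring -> divisive normalization."""
          img = gray.astype(np.float32) / 255.0
          responses = []
          for k in range(n_orient):
              theta = k * np.pi / n_orient
              kernel = cv2.getGaborKernel((15, 15), sigma, theta, lambd=6.0, gamma=0.5)
              r = cv2.filter2D(img, cv2.CV_32F, kernel)
              responses.append(np.maximum(r, 0.0) ** 2)   # half-squaring rectification
          responses = np.stack(responses)                  # (orientations, H, W)
          pool = responses.sum(axis=0, keepdims=True)      # pooled energy per pixel
          return responses / (eps + pool)                  # divisive normalization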

  4. Robot Vision

    NASA Technical Reports Server (NTRS)

    Sutro, L. L.; Lerman, J. B.

    1973-01-01

    The operation of a system is described that was built both to model the vision of primates, including man, and to serve as a pre-prototype of a possible object recognition system. It was employed in a series of experiments to determine the practicability of matching left and right images of a scene to determine the range and form of objects. The experiments started with computer-generated random-dot stereograms as inputs and progressed through random-square stereograms to a real scene. The major problems were the elimination of spurious matches between the left and right views, and the interpretation of ambiguous regions on the left side of an object that can be viewed only by the left camera, and on the right side of an object that can be viewed only by the right camera.
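
    The left-right matching problem described in this report is what block-matching stereo now solves routinely. A minimal sum-of-squared-differences sketch follows; the window size and disparity range are arbitrary illustrative choices:

      import numpy as np

      def disparity_map(left, right, max_disp=32, win=5):
          """Brute-force SSD block matching: for each left-image pixel, find the
          horizontal shift that best aligns a small window with the right image."""
          left = left.astype(np.float32); right = right.astype(np.float32)
          h, w = left.shape
          half = win // 2
          disp = np.zeros((h, w), dtype=np.int32)
          for y in range(half, h - half):
              for x in range(half + max_disp, w - half):
                  patch = left[y - half:y + half + 1, x - half:x + half + 1]
                  costs = [np.sum((patch - right[y - half:y + half + 1,
                                                 x - d - half:x - d + half + 1]) ** 2)
                           for d in range(max_disp)]
                  disp[y, x] = int(np.argmin(costs))  # spurious matches remain a failure mode
          return disp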

  5. PRoViScout: a planetary scouting rover demonstrator

    NASA Astrophysics Data System (ADS)

    Paar, Gerhard; Woods, Mark; Gimkiewicz, Christiane; Labrosse, Frédéric; Medina, Alberto; Tyler, Laurence; Barnes, David P.; Fritz, Gerald; Kapellos, Konstantinos

    2012-01-01

    Mobile systems exploring planetary surfaces will in future require more autonomy than they have today. The EU FP7-SPACE project PRoViScout (2010-2012) establishes the building blocks of such autonomous exploration systems in terms of robotic vision, through a decision-based combination of navigation and scientific target selection, and integrates them into a framework ready for, and exposed to, field demonstration. The PRoViScout on-board system consists of mission management components such as an Executive, a Mars Mission On-Board Planner and Scheduler, a Science Assessment Module, and Navigation & Vision Processing modules. The platform hardware consists of the rover with its sensors and pointing devices. We report on the major building blocks and their functions and interfaces, with emphasis on the computer vision parts such as image acquisition (using a novel zoomed 3D Time-of-Flight & RGB camera), mapping from 3D-TOF data, panoramic image and stereo reconstruction, hazard and slope maps, visual odometry, and the recognition of potentially scientifically interesting targets.

  6. Viewpoint Integration for Hand-Based Recognition of Social Interactions from a First-Person View.

    PubMed

    Bambach, Sven; Crandall, David J; Yu, Chen

    2015-11-01

    Wearable devices are becoming part of everyday life, from first-person cameras (GoPro, Google Glass), to smart watches (Apple Watch), to activity trackers (FitBit). These devices are often equipped with advanced sensors that gather data about the wearer and the environment. These sensors enable new ways of recognizing and analyzing the wearer's everyday personal activities, which could be used for intelligent human-computer interfaces and other applications. We explore one possible application by investigating how egocentric video data collected from head-mounted cameras can be used to recognize social activities between two interacting partners (e.g. playing chess or cards). In particular, we demonstrate that just the positions and poses of hands within the first-person view are highly informative for activity recognition, and present a computer vision approach that detects hands to automatically estimate activities. While hand pose detection is imperfect, we show that combining evidence across first-person views from the two social partners significantly improves activity recognition accuracy. This result highlights how integrating weak but complementary sources of evidence from social partners engaged in the same task can help to recognize the nature of their interaction.
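
    The key finding, that combining weak hand-pose evidence from both partners' views improves recognition, amounts to late fusion of per-view classifier outputs. A minimal sketch, assuming two already-trained scikit-learn-style classifiers; probability averaging is one simple fusion rule, not necessarily the paper's:

      import numpy as np

      def fused_activity_prediction(clf_a, clf_b, feats_view_a, feats_view_b):
          """Average per-class probabilities from each wearer's first-person view."""
          p_a = clf_a.predict_proba(feats_view_a)      # (n_frames, n_activities)
          p_b = clf_b.predict_proba(feats_view_b)
          return np.argmax((p_a + p_b) / 2.0, axis=1)  # jointly most probable activity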

  8. Deep Supervised, but Not Unsupervised, Models May Explain IT Cortical Representation

    PubMed Central

    Khaligh-Razavi, Seyed-Mahdi; Kriegeskorte, Nikolaus

    2014-01-01

    Inferior temporal (IT) cortex in human and nonhuman primates serves visual object recognition. Computational object-vision models, although continually improving, do not yet reach human performance. It is unclear to what extent the internal representations of computational models can explain the IT representation. Here we investigate a wide range of computational model representations (37 in total), testing their categorization performance and their ability to account for the IT representational geometry. The models include well-known neuroscientific object-recognition models (e.g. HMAX, VisNet) along with several models from computer vision (e.g. SIFT, GIST, self-similarity features, and a deep convolutional neural network). We compared the representational dissimilarity matrices (RDMs) of the model representations with the RDMs obtained from human IT (measured with fMRI) and monkey IT (measured with cell recording) for the same set of stimuli (not used in training the models). Better performing models were more similar to IT in that they showed greater clustering of representational patterns by category. In addition, better performing models also more strongly resembled IT in terms of their within-category representational dissimilarities. Representational geometries were significantly correlated between IT and many of the models. However, the categorical clustering observed in IT was largely unexplained by the unsupervised models. The deep convolutional network, which was trained by supervision with over a million category-labeled images, reached the highest categorization performance and also best explained IT, although it did not fully explain the IT data. Combining the features of this model with appropriate weights and adding linear combinations that maximize the margin between animate and inanimate objects and between faces and other objects yielded a representation that fully explained our IT data. Overall, our results suggest that explaining IT requires computational features trained through supervised learning to emphasize the behaviorally important categorical divisions prominently reflected in IT. PMID:25375136
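
    The model-to-IT comparison described here is representational similarity analysis: build a representational dissimilarity matrix (RDM) per representation, then correlate RDMs. A minimal sketch, using the conventional correlation distance and Spearman comparison; the paper's full pipeline has additional steps:

      import numpy as np
      from scipy.spatial.distance import pdist
      from scipy.stats import spearmanr

      def rdm(activations):
          """Condensed RDM: correlation distance between all pairs of stimuli.
          activations: (n_stimuli, n_features) for one model layer or brain region."""
          return pdist(activations, metric="correlation")

      def rdm_similarity(model_activations, it_activations):
          """Spearman correlation between a model RDM and the IT RDM."""
          rho, _ = spearmanr(rdm(model_activations), rdm(it_activations))
          return rho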

  9. Deep Learning: A Primer for Radiologists.

    PubMed

    Chartrand, Gabriel; Cheng, Phillip M; Vorontsov, Eugene; Drozdzal, Michal; Turcotte, Simon; Pal, Christopher J; Kadoury, Samuel; Tang, An

    2017-01-01

    Deep learning is a class of machine learning methods that are gaining success and attracting interest in many domains, including computer vision, speech recognition, natural language processing, and playing games. Deep learning methods produce a mapping from raw inputs to desired outputs (eg, image classes). Unlike traditional machine learning methods, which require hand-engineered feature extraction from inputs, deep learning methods learn these features directly from data. With the advent of large datasets and increased computing power, these methods can produce models with exceptional performance. These models are multilayer artificial neural networks, loosely inspired by biologic neural systems. Weighted connections between nodes (neurons) in the network are iteratively adjusted based on example pairs of inputs and target outputs by back-propagating a corrective error signal through the network. For computer vision tasks, convolutional neural networks (CNNs) have proven to be effective. Recently, several clinical applications of CNNs have been proposed and studied in radiology for classification, detection, and segmentation tasks. This article reviews the key concepts of deep learning for clinical radiologists, discusses technical requirements, describes emerging applications in clinical radiology, and outlines limitations and future directions in this field. Radiologists should become familiar with the principles and potential applications of deep learning in medical imaging. © RSNA, 2017.
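
    As a concrete illustration of the CNN architecture the review describes (stacked learned convolutions followed by a classifier layer, trained by back-propagating an error signal), here is a minimal PyTorch sketch; the layer sizes are arbitrary and not taken from the article:

      import torch
      import torch.nn as nn

      class TinyCNN(nn.Module):
          """Two convolution blocks and a linear classifier; weights are adjusted by
          back-propagating the error between predictions and target labels."""
          def __init__(self, n_classes=2):
              super().__init__()
              self.features = nn.Sequential(
                  nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                  nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
              )
              self.classifier = nn.Linear(32 * 16 * 16, n_classes)  # for 64x64 inputs

          def forward(self, x):
              return self.classifier(self.features(x).flatten(1))

      # model = TinyCNN()
      # loss = nn.CrossEntropyLoss()(model(torch.randn(4, 1, 64, 64)),
      #                              torch.tensor([0, 1, 0, 1]))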

  10. Improving Pattern Recognition and Neural Network Algorithms with Applications to Solar Panel Energy Optimization

    NASA Astrophysics Data System (ADS)

    Zamora Ramos, Ernesto

    Artificial intelligence is a big part of automation, and with today's technological advances, artificial intelligence has taken great strides toward positioning itself as the technology of the future to control, enhance, and perfect automation. Computer vision includes pattern recognition, classification, and machine learning; it is at the core of decision making and is a vast and fruitful branch of artificial intelligence. In this work, we present novel algorithms and techniques built upon existing technologies to improve pattern recognition and neural network training, initially motivated by a multidisciplinary effort to build a robot that helps maintain and optimize solar panel energy production. Our contributions detail an improved non-linear pre-processing technique to enhance poorly illuminated images, based on modifications to standard histogram equalization. While the original motivation was to improve nocturnal navigation, the results have applications in surveillance, search and rescue, medical image enhancement, and many other areas. We created a vision system for precise camera distance positioning, motivated by the need to correctly locate the robot for capturing solar panel images for classification. The classification algorithm marks solar panels as clean or dirty for later processing; it extends past simple image classification and, based on historical and experimental data, identifies the optimal moment at which to perform maintenance on marked solar panels so as to minimize energy and profit loss. To improve upon the classification algorithm, we delved into feedforward neural networks because of their recent advancements, proven universal approximation and classification capabilities, and excellent recognition rates. We explore state-of-the-art neural network training techniques, offering pointers and insights, culminating in the implementation of a complete library with support for modern deep learning architectures, multilayer perceptrons, and convolutional neural networks. Our research with neural networks encountered a great deal of difficulty regarding hyperparameter estimation for good training convergence and accuracy. Most hyperparameters, including the architecture, learning rate, regularization, and initialization of trainable parameters (weights), are chosen via a trial-and-error process with some educated guesses. However, we developed the first quantitative method for comparing weight initialization strategies, a critical hyperparameter choice during training, to estimate which of a group of candidate strategies would make the network converge to the highest classification accuracy fastest with high probability. Our method provides a quick, objective measure for comparing initialization strategies so that the best among them can be selected beforehand, without having to complete multiple training sessions for each candidate strategy and compare final results.

  11. Using advanced computer vision algorithms on small mobile robots

    NASA Astrophysics Data System (ADS)

    Kogut, G.; Birchmore, F.; Biagtan Pacis, E.; Everett, H. R.

    2006-05-01

    The Technology Transfer project employs a spiral development process to enhance the functionality and autonomy of mobile robot systems in the Joint Robotics Program (JRP) Robotic Systems Pool by converging existing component technologies onto a transition platform for optimization. An example of this approach is the implementation of advanced computer vision algorithms on small mobile robots. We demonstrate the implementation and testing of two algorithms useful on mobile robots: 1) object classification using a boosted cascade of classifiers trained with the AdaBoost algorithm, and 2) human presence detection from a moving platform. Object classification is performed with an AdaBoost training system developed at the University of California, San Diego (UCSD) Computer Vision Lab. This classification algorithm has been used to successfully detect the license plates of automobiles in motion in real time. While working toward a solution that increases the robustness of this system for generic object recognition, this paper demonstrates an extension of the application by detecting soda cans in a cluttered indoor environment. The human presence detection system uses a data fusion algorithm that combines results from a scanning laser and a thermal imager, and is able to detect the presence of humans while both the humans and the robot are moving simultaneously. Both algorithms were implemented on embedded hardware and optimized for real-time use. Test results are shown for a variety of environments.

  12. Neutral face classification using personalized appearance models for fast and robust emotion detection.

    PubMed

    Chiranjeevi, Pojala; Gopalakrishnan, Viswanath; Moogi, Pratibha

    2015-09-01

    Facial expression recognition is one of the open problems in computer vision. Robust neutral face recognition in real time is a major challenge for various supervised learning-based facial expression recognition methods, because supervised methods cannot accommodate all appearance variability across faces with respect to race, pose, lighting, facial biases, and so on, within a limited amount of training data. Moreover, processing each and every frame to classify emotions is not required, as the user stays neutral for the majority of the time in typical applications such as video chat or photo album/web browsing. Detecting the neutral state at an early stage and bypassing those frames in emotion classification saves computational power. In this paper, we propose a light-weight neutral versus emotion classification engine, which acts as a pre-processor to traditional supervised emotion classification approaches. It dynamically learns neutral appearance at key emotion (KE) points using a statistical texture model, constructed from a set of reference neutral frames for each user. The proposed method is made robust to various types of user head motion by accounting for affine distortions based on the statistical texture model. Robustness to dynamic shifts of the KE points is achieved by evaluating similarities on a subset of neighborhood patches around each KE point, using prior information about the directionality of the specific facial action units acting on the respective KE point. As a result, the proposed method improves emotion recognition (ER) accuracy and simultaneously reduces the computational complexity of the ER system, as validated on multiple databases.

  13. Research into the Architecture of CAD Based Robot Vision Systems

    DTIC Science & Technology

    1988-02-09

    Vision '86 and "Automatic Generation of Recognition Features for Computer Vision," Mudge, Turney and Volz, published in Robotica (1987). All of the...Occluded Parts," (T.N. Mudge, J.L. Turney, and R.A. Volz), Robotica, vol. 5, 1987, pp. 117-127. 5. "Vision Algorithms for Hypercube Machines," (T.N. Mudge

  14. Deep learning architecture for recognition of abnormal activities

    NASA Astrophysics Data System (ADS)

    Khatrouch, Marwa; Gnouma, Mariem; Ejbali, Ridha; Zaied, Mourad

    2018-04-01

    Video surveillance is one of the key areas of computer vision research. The scientific challenge in this field involves the implementation of automatic systems to obtain detailed information about the behavior of individuals and groups. In particular, the detection of abnormal movements of groups or individuals requires fine analysis of frames in the video stream. In this article, we propose a new method to detect anomalies in crowded scenes. We categorize the video in a supervised mode accompanied by unsupervised learning based on the autoencoder principle. To construct an informative concept for the recognition of these behaviors, we use a representation technique based on the superposition of human silhouettes. Evaluation on the UMN dataset demonstrates the effectiveness of the proposed approach.
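
    The anomaly-detection principle invoked here, training an autoencoder on normal behavior and flagging frames it reconstructs poorly, can be sketched compactly. The feature dimension and thresholding are illustrative, and the paper's silhouette-superposition features are assumed to be precomputed:

      import torch
      import torch.nn as nn

      def train_autoencoder(normal_feats, dim, epochs=100):
          """Fit a small autoencoder on features of normal behavior only."""
          model = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, dim))
          opt = torch.optim.Adam(model.parameters(), lr=1e-3)
          for _ in range(epochs):
              opt.zero_grad()
              loss = nn.functional.mse_loss(model(normal_feats), normal_feats)
              loss.backward()
              opt.step()
          return model

      def is_abnormal(model, feats, threshold):
          """Flag frames whose reconstruction error exceeds the threshold."""
          with torch.no_grad():
              err = ((model(feats) - feats) ** 2).mean(dim=1)
          return err > threshold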

  15. Skeleton-based human action recognition using multiple sequence alignment

    NASA Astrophysics Data System (ADS)

    Ding, Wenwen; Liu, Kai; Cheng, Fei; Zhang, Jin; Li, YunSong

    2015-05-01

    Human action recognition and analysis has been an active research topic in computer vision for many years. This paper presents a method to represent human actions based on trajectories consisting of 3D joint positions. The method first decomposes an action into a sequence of meaningful atomic actions (actionlets) and then labels the actionlets with letters of the English alphabet according to their Davies-Bouldin index values. An action can therefore be represented as a sequence of actionlet symbols, which preserves the temporal order in which the actionlets occur. Finally, we classify actions by sequence comparison, using string-matching algorithms (Needleman-Wunsch). The effectiveness of the proposed method is evaluated on datasets captured by commodity depth cameras. Experiments on three challenging 3D action datasets show promising results.
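
    The classification step compares actionlet strings by global alignment. A minimal Needleman-Wunsch scoring sketch follows; the match/mismatch/gap scores are illustrative choices, not the paper's:

      def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
          """Global alignment score between two actionlet strings, e.g. 'ABCD' vs 'ABD'."""
          n, m = len(a), len(b)
          score = [[0] * (m + 1) for _ in range(n + 1)]
          for i in range(1, n + 1):
              score[i][0] = i * gap                  # prefix of a aligned against gaps
          for j in range(1, m + 1):
              score[0][j] = j * gap
          for i in range(1, n + 1):
              for j in range(1, m + 1):
                  diag = score[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
                  score[i][j] = max(diag, score[i-1][j] + gap, score[i][j-1] + gap)
          return score[n][m]

      # Classify a query action by its best-aligned labeled example (names illustrative):
      # best = max(training_set, key=lambda ex: needleman_wunsch_score(query, ex.symbols))
      # label = best.label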

  16. A hybrid flower pollination algorithm based modified randomized location for multi-threshold medical image segmentation.

    PubMed

    Wang, Rui; Zhou, Yongquan; Zhao, Chengyan; Wu, Haizhou

    2015-01-01

    Multi-threshold image segmentation is a powerful image processing technique that is used for the preprocessing of pattern recognition and computer vision. However, traditional multilevel thresholding methods are computationally expensive because they involve exhaustively searching the optimal thresholds to optimize the objective functions. To overcome this drawback, this paper proposes a flower pollination algorithm with a randomized location modification. The proposed algorithm is used to find optimal threshold values for maximizing Otsu's objective functions with regard to eight medical grayscale images. When benchmarked against other state-of-the-art evolutionary algorithms, the new algorithm proves itself to be robust and effective through numerical experimental results including Otsu's objective values and standard deviations.
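
    The objective being maximized, Otsu's between-class variance generalized to multiple thresholds, can be written directly; an optimizer such as the paper's flower pollination algorithm would repeatedly call a function like this (the objective itself, not the metaheuristic, is shown):

      import numpy as np

      def otsu_objective(hist, thresholds):
          """Between-class variance of a grayscale histogram split at the thresholds.
          hist: 256-bin pixel counts; thresholds: sorted cut points, e.g. [85, 170]."""
          p = hist / hist.sum()                         # gray-level probabilities
          levels = np.arange(len(hist))
          mu_total = (levels * p).sum()                 # global mean gray level
          bounds = [0, *sorted(thresholds), len(hist)]
          variance = 0.0
          for lo, hi in zip(bounds[:-1], bounds[1:]):
              w = p[lo:hi].sum()                        # class probability
              if w > 0:
                  mu = (levels[lo:hi] * p[lo:hi]).sum() / w
                  variance += w * (mu - mu_total) ** 2
          return variance                               # to be maximized over thresholds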

  17. Research on improving image recognition robustness by combining multiple features with associative memory

    NASA Astrophysics Data System (ADS)

    Guo, Dongwei; Wang, Zhe

    2018-05-01

    Convolutional neural networks (CNNs) have achieved great success in computer vision; they can learn hierarchical representations from raw pixels and have outstanding performance in various image recognition tasks [1]. However, CNNs are easily fooled: it is possible to produce images totally unrecognizable to human eyes that CNNs nevertheless classify with near certainty as familiar objects [2]. In this paper, an associative memory model based on multiple features is proposed. Within this model, feature extraction and classification are carried out by a CNN, t-SNE, and an exponential bidirectional associative memory neural network (EBAM). The geometric features extracted by the CNN and the embedded features extracted by t-SNE are associated through the EBAM, ensuring recognition robustness via a comprehensive assessment of the two feature types. Our model attains only an 8% error rate on fraudulent data. In systems that require a high safety factor and in other critical areas, strong robustness is extremely important; if image recognition robustness can be ensured, network security and production efficiency will be greatly improved.

  18. Incoherent optical generalized Hough transform: pattern recognition and feature extraction applications

    NASA Astrophysics Data System (ADS)

    Fernández, Ariel; Ferrari, José A.

    2017-05-01

    Pattern recognition and feature extraction are image processing applications of great interest in defect inspection and robot vision, among others. In comparison to purely digital methods, the attractiveness of optical processors for pattern recognition lies in their highly parallel operation and real-time processing capability. This work presents an optical implementation of the generalized Hough transform (GHT), a well-established technique for the recognition of geometrical features in binary images. Detection of a geometric feature under the GHT is accomplished by mapping the original image to an accumulator space; the large computational requirements of this mapping make an optical implementation an attractive alternative to digital-only methods. We explore an optical setup in which the transformation is obtained and the size and orientation parameters can be controlled, allowing for dynamic scale- and orientation-variant pattern recognition. A compact system for these purposes results from the use of an electrically tunable lens for scale control and a pupil mask, implemented on a high-contrast spatial light modulator, for orientation/shape variation of the template. Real-time operation can also be achieved. In addition, by thresholding the GHT and optically inverse-transforming, the previously detected features of interest can be extracted.
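
    For reference, the digital mapping that the optical processor parallelizes can be sketched as the classical R-table GHT; this simplified variant omits gradient-orientation indexing and the rotation/scale loops, and assumes binary edge images:

      import numpy as np

      def build_r_table(template_edges, center):
          """R-table: offset of every template edge pixel from the reference center."""
          ys, xs = np.nonzero(template_edges)
          return np.stack([center[0] - ys, center[1] - xs], axis=1)

      def ght_accumulate(image_edges, r_table):
          """Each image edge pixel votes for candidate template centers."""
          acc = np.zeros(image_edges.shape, dtype=np.int32)
          ys, xs = np.nonzero(image_edges)
          for dy, dx in r_table:
              cy, cx = ys + dy, xs + dx
              ok = (cy >= 0) & (cy < acc.shape[0]) & (cx >= 0) & (cx < acc.shape[1])
              np.add.at(acc, (cy[ok], cx[ok]), 1)
          return acc    # peaks mark detected instances of the template shape

      # peak = np.unravel_index(np.argmax(ght_accumulate(edges, r_table)), edges.shape)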

  19. Visual Word Recognition Across the Adult Lifespan

    PubMed Central

    Cohen-Shikora, Emily R.; Balota, David A.

    2016-01-01

    The current study examines visual word recognition in a large sample (N = 148) across the adult lifespan and across a large set of stimuli (N = 1187) in three different lexical processing tasks (pronunciation, lexical decision, and animacy judgments). Although the focus of the present study is on the influence of word frequency, a diverse set of other variables are examined as the system ages and acquires more experience with language. Computational models and conceptual theories of visual word recognition and aging make differing predictions for age-related changes in the system. However, these have been difficult to assess because prior studies have produced inconsistent results, possibly due to sample differences, analytic procedures, and/or task-specific processes. The current study confronts these potential differences by using three different tasks, treating age and word variables as continuous, and exploring the influence of individual differences such as vocabulary, vision, and working memory. The primary finding is remarkable stability in the influence of a diverse set of variables on visual word recognition across the adult age spectrum. This pattern is discussed in reference to previous inconsistent findings in the literature and implications for current models of visual word recognition. PMID:27336629

  20. A Method of Three-Dimensional Recording of Mandibular Movement Based on Two-Dimensional Image Feature Extraction

    PubMed Central

    Li, Zhongke; Yang, Huifang; Lü, Peijun; Wang, Yong; Sun, Yuchun

    2015-01-01

    Background and Objective: To develop a real-time recording system based on computer binocular vision and two-dimensional image feature extraction to accurately record mandibular movement in three dimensions. Methods: A computer-based binocular vision device with two digital cameras was used in conjunction with a fixed head retention bracket to track occlusal movement. Software was developed for extracting target spatial coordinates in real time based on two-dimensional image feature recognition. Plaster models of a subject's upper and lower dentition were made using conventional methods. A mandibular occlusal splint was made on the plaster model, and then the occlusal surface was removed. Temporary denture base resin was used to make a 3-cm handle extending outside the mouth, connecting the anterior labial surface of the occlusal splint with a detection target bearing intersecting lines designed for spatial coordinate extraction. The subject's head was firmly fixed in place, and the occlusal splint was fully seated on the mandibular dentition. The subject was then asked to make various mouth movements while the mandibular movement target locus point set was recorded. Comparisons between the recorded coordinate values and the actual values of the 30 intersections on the detection target were then analyzed using paired t-tests. Results: The three-dimensional trajectory curve shapes of the mandibular movements were consistent with the respective subject movements. Mean XYZ coordinate values and paired t-test results were as follows: X axis: -0.0037 ± 0.02953, P = 0.502; Y axis: 0.0037 ± 0.05242, P = 0.704; and Z axis: 0.0007 ± 0.06040, P = 0.952. The t-tests showed no statistically significant differences between the recorded and actual coordinate values of the 30 cross points (P > 0.05). Conclusions: The real-time recording system of three-dimensional mandibular movement based on computer binocular vision and two-dimensional image feature recognition produced a recording accuracy of approximately ± 0.1 mm and is therefore suitable for clinical application. Further research is necessary to confirm the clinical applications of the method. PMID:26375800
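
    The core binocular-vision computation, recovering a 3D target point from its two 2D image coordinates, can be sketched with OpenCV's triangulation routine; the projection matrices from a prior stereo calibration are assumed to be available:

      import cv2
      import numpy as np

      def target_point_3d(P_left, P_right, pt_left, pt_right):
          """Triangulate one tracked target point from a calibrated camera pair.
          P_left, P_right: 3x4 projection matrices; pt_left/pt_right: (x, y) pixels."""
          pl = np.asarray(pt_left, dtype=np.float64).reshape(2, 1)
          pr = np.asarray(pt_right, dtype=np.float64).reshape(2, 1)
          X_h = cv2.triangulatePoints(P_left, P_right, pl, pr)   # homogeneous 4x1
          return (X_h[:3] / X_h[3]).ravel()                      # (X, Y, Z)

      # Repeating this for the tracked target in every frame pair yields the
      # three-dimensional trajectory of mandibular movement.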

  1. The implementation of aerial object recognition algorithm based on contour descriptor in FPGA-based on-board vision system

    NASA Astrophysics Data System (ADS)

    Babayan, Pavel; Smirnov, Sergey; Strotov, Valery

    2017-10-01

    This paper describes an aerial object recognition algorithm for on-board and stationary vision systems. The suggested algorithm is intended to recognize objects of a specific kind using a set of reference objects defined by 3D models, and is based on building an outer contour descriptor. The algorithm consists of two stages: learning and recognition. The learning stage explores the reference objects: using the 3D models, we build a database of training images by rendering each model from viewpoints evenly distributed on a sphere, with the sphere points distributed according to the geosphere principle. The gathered training image set is used to calculate the descriptors employed in the recognition stage. The recognition stage estimates the similarity of the captured object to the reference objects by matching the observed image descriptor against the reference object descriptors. Experimental research was performed using a set of models of aircraft of different types (airplanes, helicopters, UAVs). The proposed orientation estimation algorithm showed good accuracy in all case studies, and real-time performance of the algorithm in an FPGA-based vision system was demonstrated.

  2. AI And Early Vision - Part II

    NASA Astrophysics Data System (ADS)

    Julesz, Bela

    1989-08-01

    A quarter of a century ago I introduced two paradigms into psychology which in the intervening years have had a direct impact on the psychobiology of early vision and an indirect one on artificial intelligence (AI or machine vision). The first, the computer-generated random-dot stereogram (RDS) paradigm (Julesz, 1960), at its very inception posed a strategic question both for AI and neurophysiology. The finding that stereoscopic depth perception (stereopsis) is possible without the many enigmatic cues of monocular form recognition - as assumed previously - demonstrated that stereopsis, with its basic problem of finding matches between corresponding random aggregates of dots in the left and right visual fields, had become ripe for modeling. Indeed, the binocular matching problem of stereopsis opened up an entire field of study, eventually leading to the computational models of David Marr (1982) and his coworkers. The fusion of RDS had an even greater impact on neurophysiologists - including Hubel and Wiesel (1962) - who realized that stereopsis must occur at an early stage, and can be studied more easily than form perception. This insight recently culminated in the studies by Gian Poggio (1984), who found binocular-disparity-tuned neurons in the input stage to the visual cortex (layer IVB in V1) in the monkey that were selectively triggered by dynamic RDS. Thus the first paradigm led to a strategic insight: with stereoscopic vision there is no camouflage, and as such it was advantageous for our primate ancestors to evolve the cortical machinery of stereoscopic vision to capture camouflaged prey (insects) at a standstill. Amazingly, although stereopsis evolved relatively late in primates, it captured the very input stages of the visual cortex. (For a detailed review, see Julesz, 1986a)

  3. A computerized recognition system for the home-based physiotherapy exercises using an RGBD camera.

    PubMed

    Ar, Ilktan; Akgul, Yusuf Sinan

    2014-11-01

    Computerized recognition of home-based physiotherapy exercises has many benefits and has attracted considerable interest in the computer vision community. However, most methods in the literature view this task as a special case of motion recognition. In contrast, we propose to employ the three main components of a physiotherapy exercise (the motion patterns, the stance knowledge, and the exercise object) as different recognition tasks and to embed them separately into the recognition system. The low-level information about each component is gathered using machine learning methods. We then use a generative Bayesian network to recognize the exercise types by combining the information from these sources at an abstract level, which takes advantage of domain knowledge for a more robust system. Finally, a novel postprocessing step is employed to estimate exercise repetition counts. The performance evaluation of the system is conducted on a new dataset containing RGB (red, green, and blue) and depth videos of home-based exercise sessions for commonly applied shoulder and knee exercises. The proposed system works without any body-part segmentation, body-part tracking, joint detection, or temporal segmentation methods. Favorable exercise recognition rates and encouraging results on the estimation of repetition counts are obtained.

  4. Understanding and preventing computer vision syndrome.

    PubMed

    Loh, Ky; Reddy, Sc

    2008-01-01

    The invention of the computer and advances in information technology have revolutionized and benefited society but at the same time have caused symptoms related to computer usage such as ocular strain, irritation, redness, dryness, blurred vision, and double vision. This cluster of symptoms is known as computer vision syndrome, characterized by the visual symptoms that result from interaction with a computer display or its environment. Three major mechanisms lead to computer vision syndrome: the extraocular mechanism, the accommodative mechanism, and the ocular surface mechanism. Visual effects of the computer display such as brightness, resolution, glare, and quality are all known factors that contribute to computer vision syndrome. Prevention is the most important strategy in managing computer vision syndrome. Modification of the ergonomics of the working environment, patient education, and proper eye care are crucial in managing computer vision syndrome.

  5. Image Registration Workshop Proceedings

    NASA Technical Reports Server (NTRS)

    LeMoigne, Jacqueline (Editor)

    1997-01-01

    Automatic image registration has often been considered a preliminary step for higher-level processing, such as object recognition or data fusion. But with the unprecedented amounts of data which are being, and will continue to be, generated by newly developed sensors, automatic image registration has itself become an important research topic. This workshop presents a collection of very high quality work which has been grouped in four main areas: (1) theoretical aspects of image registration; (2) applications to satellite imagery; (3) applications to medical imagery; and (4) image registration for computer vision research.
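
    As a concrete taste of automatic registration, the classical phase-correlation method recovers a pure translation between two images from the normalized cross-power spectrum; this is a textbook technique, not one drawn from the workshop papers:

      import numpy as np

      def phase_correlation_shift(img_a, img_b, eps=1e-8):
          """Estimate the integer (dy, dx) translation aligning img_b to img_a."""
          Fa, Fb = np.fft.fft2(img_a), np.fft.fft2(img_b)
          cross_power = Fa * np.conj(Fb)
          corr = np.fft.ifft2(cross_power / (np.abs(cross_power) + eps))
          dy, dx = np.unravel_index(np.argmax(np.abs(corr)), corr.shape)
          # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
          if dy > img_a.shape[0] // 2:
              dy -= img_a.shape[0]
          if dx > img_a.shape[1] // 2:
              dx -= img_a.shape[1]
          return dy, dx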

  6. IEEE 1982. Proceedings of the international conference on cybernetics and society

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    1982-01-01

    The following topics were dealt with: knowledge-based systems; risk analysis; man-machine interactions; human information processing; metaphor, analogy and problem-solving; manual control modelling; transportation systems; simulation; adaptive and learning systems; biocybernetics; cybernetics; mathematical programming; robotics; decision support systems; analysis, design and validation of models; computer vision; systems science; energy systems; environmental modelling and policy; pattern recognition; nuclear warfare; technological forecasting; artificial intelligence; the Turin shroud; optimisation; workloads. Abstracts of individual papers can be found under the relevant classification codes in this or future issues.

  7. Low, slow, small target recognition based on spatial vision network

    NASA Astrophysics Data System (ADS)

    Cheng, Zhao; Guo, Pei; Qi, Xin

    2018-03-01

    Traditional photoelectric monitoring uses a large number of identical cameras. To ensure full coverage of the monitored area, this approach deploys many cameras, which leads to overlapping coverage, duplicated monitoring, and higher costs, resulting in considerable waste. To reduce the monitoring cost and address the difficult problem of finding, identifying, and tracking low-altitude, slow-speed, small targets, this paper presents a spatial vision network for low-slow-small target recognition. Based on the camera imaging principle and a monitoring model, the spatial vision network is modeled and optimized. Simulation results demonstrate that the proposed method performs well.

  8. Realism and Perspectivism: a Reevaluation of Rival Theories of Spatial Vision.

    NASA Astrophysics Data System (ADS)

    Thro, E. Broydrick

    1990-01-01

    My study reevaluates two theories of human space perception, a trigonometric surveying theory I call perspectivism and a "scene recognition" theory I call realism. Realists believe that retinal image geometry can supply no unambiguous information about an object's size and distance--and that, as a result, viewers can locate objects in space only by making discretionary interpretations based on familiar experience of object types. Perspectivists, in contrast, think viewers can disambiguate object sizes/distances on the basis of retinal image information alone. More specifically, they believe the eye responds to perspective image geometry with an automatic trigonometric calculation that not only fixes the directions and shapes, but also roughly fixes the sizes and distances of scene elements in space. Today this surveyor theory has been largely superseded by the realist approach, because most vision scientists believe retinal image geometry is ambiguous about the scale of space. However, I show that there is a considerable body of neglected evidence, both past and present, tending to call this scale ambiguity claim into question. I maintain that this evidence against scale ambiguity could hardly be more important, if one considers its subversive implications for the scene recognition theory that is not only today's reigning approach to spatial vision, but also the foundation for computer scientists' efforts to create space-perceiving robots. If viewers were deemed to be capable of automatic surveying calculations, the discretionary scene recognition theory would lose its main justification. Clearly, it would be difficult for realists to maintain that we viewers rely on scene recognition for space perception in spite of our ability to survey. And in reality, as I show, the surveyor theory does a much better job of describing the everyday space we viewers actually see--a space featuring stable, unambiguous relationships among scene elements, and a single horizon and vanishing point for (meter-scale) receding objects. In addition, I argue, the surveyor theory raises fewer philosophical difficulties, because it is more in harmony with our everyday concepts of material objects, human agency and the self.

  9. A Scalable Distributed Approach to Mobile Robot Vision

    NASA Technical Reports Server (NTRS)

    Kuipers, Benjamin; Browning, Robert L.; Gribble, William S.

    1997-01-01

    This paper documents our progress during the first year of work on our original proposal entitled 'A Scalable Distributed Approach to Mobile Robot Vision'. We are pursuing a strategy for real-time visual identification and tracking of complex objects which does not rely on specialized image-processing hardware. In this system perceptual schemas represent objects as a graph of primitive features. Distributed software agents identify and track these features, using variable-geometry image subwindows of limited size. Active control of imaging parameters and selective processing makes simultaneous real-time tracking of many primitive features tractable. Perceptual schemas operate independently from the tracking of primitive features, so that real-time tracking of a set of image features is not hurt by latency in recognition of the object that those features make up. The architecture allows semantically significant features to be tracked with limited expenditure of computational resources, and allows the visual computation to be distributed across a network of processors. Early experiments are described which demonstrate the usefulness of this formulation, followed by a brief overview of our more recent progress (after the first year).

  10. Multi-texture local ternary pattern for face recognition

    NASA Astrophysics Data System (ADS)

    Essa, Almabrok; Asari, Vijayan

    2017-05-01

    In the imagery and pattern analysis domain, a variety of descriptors have been proposed and employed for different computer vision applications such as face detection and recognition. Many of them are affected by conditions during the image acquisition process, such as variations in illumination and the presence of noise, because they rely entirely on image intensity values to encode the image information. To overcome these problems, a novel technique named Multi-Texture Local Ternary Pattern (MTLTP) is proposed in this paper. MTLTP combines edges and corners based on the local ternary pattern strategy to extract the local texture features of the input image, and then returns a spatial histogram feature vector as the descriptor for each image, which is used to recognize a person. Experimental results using a k-nearest-neighbors (k-NN) classifier on two publicly available datasets demonstrate the efficiency of our algorithm for face recognition in the presence of extreme variations of illumination/lighting environments and slight variations of pose.
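
    The local ternary pattern strategy underlying MTLTP quantizes each neighbor of a pixel into three states relative to a tolerance band around the center value. A minimal per-pixel sketch follows; the 8-neighbor ordering and threshold are conventional choices, and MTLTP's edge/corner combination is not reproduced here:

      import numpy as np

      def ltp_codes(gray, t=5):
          """Upper/lower local ternary pattern codes for each interior pixel.
          Neighbors above center+t count as +1, below center-t as -1, else 0;
          the ternary pattern is conventionally split into two binary codes."""
          offsets = [(-1,-1), (-1,0), (-1,1), (0,1), (1,1), (1,0), (1,-1), (0,-1)]
          H, W = gray.shape
          c = gray[1:-1, 1:-1].astype(np.int16)
          upper = np.zeros_like(c)
          lower = np.zeros_like(c)
          for bit, (dy, dx) in enumerate(offsets):
              n = gray[1+dy:H-1+dy, 1+dx:W-1+dx].astype(np.int16)
              upper |= (n > c + t).astype(np.int16) << bit   # +1 states
              lower |= (n < c - t).astype(np.int16) << bit   # -1 states
          return upper, lower   # histogram these per region to form the descriptor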

  11. Timing, timing, timing: Fast decoding of object information from intracranial field potentials in human visual cortex

    PubMed Central

    Liu, Hesheng; Agam, Yigal; Madsen, Joseph R.; Kreiman, Gabriel

    2010-01-01

    The difficulty of visual recognition stems from the need to achieve high selectivity while maintaining robustness to object transformations within hundreds of milliseconds. Theories of visual recognition differ in whether the neuronal circuits invoke recurrent feedback connections or not. The timing of neurophysiological responses in visual cortex plays a key role in distinguishing between bottom-up and top-down theories. Here we quantified at millisecond resolution the amount of visual information conveyed by intracranial field potentials from 912 electrodes in 11 human subjects. We could decode object category information from human visual cortex in single trials as early as 100 ms post-stimulus. Decoding performance was robust to depth rotation and scale changes. The results suggest that physiological activity in the temporal lobe can account for key properties of visual recognition. The fast decoding in single trials is compatible with feed-forward theories and provides strong constraints for computational models of human vision. PMID:19409272

  12. Local intensity area descriptor for facial recognition in ideal and noise conditions

    NASA Astrophysics Data System (ADS)

    Tran, Chi-Kien; Tseng, Chin-Dar; Chao, Pei-Ju; Ting, Hui-Min; Chang, Liyun; Huang, Yu-Jie; Lee, Tsair-Fwu

    2017-03-01

    We propose a local texture descriptor, local intensity area descriptor (LIAD), which is applied for human facial recognition in ideal and noisy conditions. Each facial image is divided into small regions from which LIAD histograms are extracted and concatenated into a single feature vector to represent the facial image. The recognition is performed using a nearest neighbor classifier with histogram intersection and chi-square statistics as dissimilarity measures. Experiments were conducted with LIAD using the ORL database of faces (Olivetti Research Laboratory, Cambridge), the Face94 face database, the Georgia Tech face database, and the FERET database. The results demonstrated the improvement in accuracy of our proposed descriptor compared to conventional descriptors [local binary pattern (LBP), uniform LBP, local ternary pattern, histogram of oriented gradients, and local directional pattern]. Moreover, the proposed descriptor was less sensitive to noise and had low histogram dimensionality. Thus, it is expected to be a powerful texture descriptor that can be used for various computer vision problems.
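
    Both dissimilarity measures named here are standard. A minimal sketch of the two, plus the 1-NN decision (histogram intersection is a similarity, so its negation serves as a dissimilarity):

      import numpy as np

      def chi_square(h1, h2, eps=1e-10):
          """Chi-square distance between two L1-normalized histograms."""
          return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

      def neg_histogram_intersection(h1, h2):
          """Negated histogram intersection, so smaller means more similar."""
          return -np.sum(np.minimum(h1, h2))

      def nearest_neighbor_label(query, gallery, labels, dist=chi_square):
          """1-NN over gallery descriptors (e.g., concatenated LIAD histograms)."""
          d = [dist(query, g) for g in gallery]
          return labels[int(np.argmin(d))]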

  13. Facial expression recognition based on weber local descriptor and sparse representation

    NASA Astrophysics Data System (ADS)

    Ouyang, Yan

    2018-03-01

    Automatic facial expression recognition has been one of the research hotspots in computer vision for nearly ten years. During the past decade, many state-of-the-art methods have been proposed that achieve very high accuracy on face images free of interference. Researchers are now challenging the task of classifying facial expression images with corruptions and occlusions, and the sparse-representation-based classification (SRC) framework has been widely used because it is robust to corruptions and occlusions. This paper therefore proposes a novel facial expression recognition method based on the Weber local descriptor (WLD) and sparse representation. The method comprises three parts: first, the face images are divided into many local patches; then the WLD histograms of each patch are extracted; finally, all the WLD histogram features are concatenated into a vector and combined with SRC to classify the facial expressions. Experimental results on the Cohn-Kanade database show that the proposed method is robust to occlusions and corruptions.

  14. Robust Pedestrian Tracking and Recognition from FLIR Video: A Unified Approach via Sparse Coding

    PubMed Central

    Li, Xin; Guo, Rui; Chen, Chao

    2014-01-01

    Sparse coding is an emerging method that has been successfully applied to both robust object tracking and recognition in the vision literature. In this paper, we propose to explore a sparse coding-based approach toward joint object tracking-and-recognition and explore its potential in the analysis of forward-looking infrared (FLIR) video to support nighttime machine vision systems. A key technical contribution of this work is to unify existing sparse coding-based approaches toward tracking and recognition under the same framework, so that they can benefit from each other in a closed-loop. On the one hand, tracking the same object through temporal frames allows us to achieve improved recognition performance through dynamical updating of template/dictionary and combining multiple recognition results; on the other hand, the recognition of individual objects facilitates the tracking of multiple objects (i.e., walking pedestrians), especially in the presence of occlusion within a crowded environment. We report experimental results on both the CASIA Pedestrian Database and our own collected FLIR video database to demonstrate the effectiveness of the proposed joint tracking-and-recognition approach. PMID:24961216

  15. Learning to recognize letters in the periphery: Effects of repeated exposure, letter frequency, and letter complexity

    PubMed Central

    Husk, Jesse S.; Yu, Deyue

    2017-01-01

    Patients with central vision loss must rely on their peripheral vision for reading. Unfortunately, limitations of peripheral vision, such as crowding, pose significant challenges to letter recognition. As a result, there is a need for developing effective training methods for improving crowded letter recognition in the periphery. Several studies have shown that extensive practice with letter stimuli is beneficial to peripheral letter recognition. Here, we explore stimulus-related factors that might influence the effectiveness of peripheral letter recognition training. Specifically, we examined letter exposure (number of letter occurrences), frequency of letter use in English print, and letter complexity and evaluated their contributions to the amount of improvement observed in crowded letter recognition following training. We analyzed data collected across a range of training protocols. Using linear regression, we identified the best-fitting model and observed that all three stimulus-related factors contributed to improvement in peripheral letter recognition with letter exposure being the most important factor. As an important explanatory variable, pretest accuracy was included in the model as well to avoid estimate biases and was shown to have influence on the relationship between training improvement and letter exposure. When developing training protocols for peripheral letter recognition, it may be beneficial to not only consider the overall length of training, but also to tailor the number of stimulus occurrences for each letter according to its initial performance level, frequency, and complexity. PMID:28265651

  16. Enhanced Gender Recognition System Using an Improved Histogram of Oriented Gradient (HOG) Feature from Quality Assessment of Visible Light and Thermal Images of the Human Body.

    PubMed

    Nguyen, Dat Tien; Park, Kang Ryoung

    2016-07-21

    With higher demand from users, surveillance systems are currently being designed to provide more information about the observed scene, such as the appearance of objects, types of objects, and other information extracted from detected objects. Although the recognition of the gender of an observed human can be easily performed using human perception, it remains a difficult task when using computer vision system images. In this paper, we propose a new human gender recognition method that can be applied to surveillance systems based on quality assessment of human areas in visible light and thermal camera images. Our research is novel in the following two ways: First, we utilize the combination of visible light and thermal images of the human body for a recognition task based on quality assessment. We propose a quality measurement method to assess the quality of image regions so as to remove the effects of background regions in the recognition system. Second, by combining the features extracted using the histogram of oriented gradient (HOG) method and the measured qualities of image regions, we form a new image feature, called the weighted HOG (wHOG), which is used for efficient gender recognition. Experimental results show that our method produces more accurate estimation results than the state-of-the-art recognition method that uses human body images.

  18. A method for real-time implementation of HOG feature extraction

    NASA Astrophysics Data System (ADS)

    Luo, Hai-bo; Yu, Xin-rong; Liu, Hong-mei; Ding, Qing-hai

    2011-08-01

    The histogram of oriented gradients (HOG) is an efficient feature extraction scheme, and HOG descriptors are widely used in computer vision and image processing for biometrics, target tracking, automatic target detection (ATD), automatic target recognition (ATR), etc. However, the computation of HOG feature extraction is unsuitable for hardware implementation because it includes complicated operations. In this paper, an optimized design method and theoretical framework for real-time HOG feature extraction based on an FPGA are proposed. The main principles are as follows: first, a parallel gradient computing unit circuit based on a parallel pipeline structure was designed; second, the calculation of the arctangent and square root operations was simplified; finally, a histogram generator based on a parallel pipeline structure was designed to calculate the histogram of each sub-region. Experimental results showed that HOG extraction can be completed within one pixel period by these computing units.
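
    The operations being pipelined, per-pixel gradient magnitude and orientation followed by orientation binning, look like this in a straightforward software reference; the cell size and bin count follow common HOG defaults, which this paper does not necessarily use:

      import numpy as np

      def hog_cell_histograms(gray, cell=8, n_bins=9):
          """Reference HOG front end: gradients -> orientation bins -> cell histograms."""
          g = gray.astype(np.float32)
          gx = np.zeros_like(g); gy = np.zeros_like(g)
          gx[:, 1:-1] = g[:, 2:] - g[:, :-2]            # central differences
          gy[1:-1, :] = g[2:, :] - g[:-2, :]
          mag = np.hypot(gx, gy)                        # the square root an FPGA simplifies
          ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # the arctangent likewise
          bins = np.minimum((ang / (180.0 / n_bins)).astype(int), n_bins - 1)
          H, W = g.shape
          hists = np.zeros((H // cell, W // cell, n_bins), dtype=np.float32)
          for cy in range(H // cell):
              for cx in range(W // cell):
                  sl = np.s_[cy*cell:(cy+1)*cell, cx*cell:(cx+1)*cell]
                  hists[cy, cx] = np.bincount(bins[sl].ravel(),
                                              weights=mag[sl].ravel(),
                                              minlength=n_bins)[:n_bins]
          return hists   # block normalization would follow in a full HOG descriptor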

  19. A self-teaching image processing and voice-recognition-based, intelligent and interactive system to educate visually impaired children

    NASA Astrophysics Data System (ADS)

    Iqbal, Asim; Farooq, Umar; Mahmood, Hassan; Asad, Muhammad Usman; Khan, Akrama; Atiq, Hafiz Muhammad

    2010-02-01

    A self-teaching image processing and voice recognition based system is developed to educate visually impaired children, chiefly in their primary education. The system comprises a computer, a vision camera, an ear speaker and a microphone. The camera, attached to the computer, is mounted on the ceiling opposite (at the required angle) the desk on which the book is placed. Sample images and voices, in the form of instructions and commands for English and Urdu alphabets, numeric digits, operators and shapes, are already stored in the database. A blind child first reads the embossed character (object) with the fingers, then speaks the answer, the name of the character, its shape, etc. into the microphone. On the voice command of the blind child received by the microphone, an image is taken by the camera and processed by a MATLAB® program developed with the Image Acquisition and Image Processing toolboxes, which generates a response or the required set of instructions for the child via the ear speaker, resulting in self-education of the visually impaired child. The speech recognition program is also developed in MATLAB®, with the Data Acquisition and Signal Processing toolboxes, to record and process the commands of the blind child.

  20. Simplification of Visual Rendering in Simulated Prosthetic Vision Facilitates Navigation.

    PubMed

    Vergnieux, Victor; Macé, Marc J-M; Jouffrais, Christophe

    2017-09-01

    Visual neuroprostheses are still limited, and simulated prosthetic vision (SPV) is used to evaluate the potential and forthcoming functionality of these implants. SPV has been used to evaluate the minimum requirements on visual neuroprosthetic characteristics to restore various functions such as reading, object and face recognition, object grasping, etc. Some of these studies focused on obstacle avoidance, but only a few investigated orientation or navigation abilities with prosthetic vision. The resolution of current electrode arrays is not sufficient to allow navigation tasks without additional processing of the visual input. In this study, we simulated a low-resolution array (15 × 18 electrodes, similar to a forthcoming generation of arrays) and evaluated the navigation abilities restored when visual information was processed with various computer vision algorithms to enhance the visual rendering. Three main visual rendering strategies were compared to a control rendering in a wayfinding task within an unknown environment. The control rendering corresponded to a resizing of the original image onto the electrode array size, according to the average brightness of the pixels. In the first rendering strategy, vision distance was limited to 3, 6, or 9 m. In the second strategy, the rendering was based not on the brightness of the image pixels, but on the distance between the user and the elements in the field of view. In the last rendering strategy, only the edges of the environment were displayed, similar to a wireframe rendering. All the tested renderings, except the 3 m limitation of the viewing distance, improved navigation performance and decreased cognitive load. Interestingly, the distance-based and wireframe renderings also improved the cognitive mapping of the unknown environment. These results show that low-resolution implants are usable for wayfinding if specific computer vision algorithms are used to select and display appropriate information regarding the environment. © 2017 International Center for Artificial Organs and Transplantation and Wiley Periodicals, Inc.
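
    Under stated assumptions, the control and enhanced renderings described above reduce to a few standard operations. The sketch below uses OpenCV; the Canny thresholds, the near-is-bright depth mapping, and the grid orientation (15 columns × 18 rows) are our guesses, not the study's exact parameters.

```python
import cv2
import numpy as np

GRID_W, GRID_H = 15, 18    # simulated electrode array from the study

def control_rendering(gray):
    """Resize onto the array using average pixel brightness per electrode."""
    return cv2.resize(gray, (GRID_W, GRID_H), interpolation=cv2.INTER_AREA)

def wireframe_rendering(gray):
    """Keep only scene edges before downsampling (the 'wireframe' strategy)."""
    edges = cv2.Canny(gray, 50, 150)
    return cv2.resize(edges, (GRID_W, GRID_H), interpolation=cv2.INTER_AREA)

def distance_rendering(depth_m, max_dist=6.0):
    """Drive phosphene brightness by proximity rather than pixel brightness
    (assumes a registered depth image in meters is available)."""
    d = np.clip(depth_m, 0.0, max_dist)
    near_bright = ((1.0 - d / max_dist) * 255).astype(np.uint8)
    return cv2.resize(near_bright, (GRID_W, GRID_H),
                      interpolation=cv2.INTER_AREA)
```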

  1. Advances in image compression and automatic target recognition; Proceedings of the Meeting, Orlando, FL, Mar. 30, 31, 1989

    NASA Technical Reports Server (NTRS)

    Tescher, Andrew G. (Editor)

    1989-01-01

    Various papers on image compression and automatic target recognition are presented. Individual topics addressed include: target cluster detection in cluttered SAR imagery, model-based target recognition using laser radar imagery, Smart Sensor front-end processor for feature extraction of images, object attitude estimation and tracking from a single video sensor, symmetry detection in human vision, analysis of high resolution aerial images for object detection, obscured object recognition for an ATR application, neural networks for adaptive shape tracking, statistical mechanics and pattern recognition, detection of cylinders in aerial range images, moving object tracking using local windows, new transform method for image data compression, quad-tree product vector quantization of images, predictive trellis encoding of imagery, reduced generalized chain code for contour description, compact architecture for a real-time vision system, use of human visibility functions in segmentation coding, color texture analysis and synthesis using Gibbs random fields.

  2. Object recognition with hierarchical discriminant saliency networks.

    PubMed

    Han, Sunhyoung; Vasconcelos, Nuno

    2014-01-01

    The benefits of integrating attention and object recognition are investigated. While attention is frequently modeled as a pre-processor for recognition, we investigate the hypothesis that attention is an intrinsic component of recognition and vice-versa. This hypothesis is tested with a recognition model, the hierarchical discriminant saliency network (HDSN), whose layers are top-down saliency detectors, tuned for a visual class according to the principles of discriminant saliency. As a model of neural computation, the HDSN has two possible implementations. In a biologically plausible implementation, all layers comply with the standard neurophysiological model of visual cortex, with sub-layers of simple and complex units that implement a combination of filtering, divisive normalization, pooling, and non-linearities. In a convolutional neural network implementation, all layers are convolutional and implement a combination of filtering, rectification, and pooling. The rectification is performed with a parametric extension of the now popular rectified linear units (ReLUs), whose parameters can be tuned for the detection of target object classes. This enables a number of functional enhancements over neural network models that lack a connection to saliency, including optimal feature denoising mechanisms for recognition, modulation of saliency responses by the discriminant power of the underlying features, and the ability to detect both feature presence and absence. In either implementation, each layer has a precise statistical interpretation, and all parameters are tuned by statistical learning. Each saliency detection layer learns more discriminant saliency templates than its predecessors and higher layers have larger pooling fields. This enables the HDSN to simultaneously achieve high selectivity to target object classes and invariance. The performance of the network in saliency and object recognition tasks is compared to those of models from the biological and computer vision literatures. This demonstrates benefits for all the functional enhancements of the HDSN, the class tuning inherent to discriminant saliency, and saliency layers based on templates of increasing target selectivity and invariance. Altogether, these experiments suggest that there are non-trivial benefits in integrating attention and recognition.
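
    As a concrete reading of the "parametric extension of ReLUs" mentioned above, the function below adds a tunable threshold and a tunable negative-side slope to the standard rectifier; both could be learned per feature channel. This is an illustration of the idea, not the HDSN's exact parametric form.

```python
import numpy as np

def parametric_relu(x, alpha=0.0, theta=0.0):
    """f(x) = max(0, x - theta) + alpha * min(0, x - theta).
    alpha = 0, theta = 0 recovers the standard ReLU; a nonzero alpha lets the
    unit also respond to feature absence, one of the enhancements the
    abstract attributes to tunable rectification."""
    z = x - theta
    return np.maximum(z, 0.0) + alpha * np.minimum(z, 0.0)
```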

  3. Computational approaches to vision

    NASA Technical Reports Server (NTRS)

    Barrow, H. G.; Tenenbaum, J. M.

    1986-01-01

    Vision is examined in terms of a computational process, and the competence, structure, and control of computer vision systems are analyzed. Theoretical and experimental data on the formation of a computer vision system are discussed. Consideration is given to early vision, the recovery of intrinsic surface characteristics, higher levels of interpretation, and system integration and control. A computational visual processing model is proposed and its architecture and operation are described. Examples of state-of-the-art vision systems, which include some of the levels of representation and processing mechanisms, are presented.

  4. Sensor-Aware Recognition and Tracking for Wide-Area Augmented Reality on Mobile Phones

    PubMed Central

    Chen, Jing; Cao, Ruochen; Wang, Yongtian

    2015-01-01

    Wide-area registration in outdoor environments on mobile phones is a challenging task in mobile augmented reality. We present a sensor-aware large-scale outdoor augmented reality system for recognition and tracking on mobile phones. GPS and gravity information is used to improve the VLAD performance for recognition. A sensor-aware VLAD algorithm, self-adaptive to scenes of different scales, is utilized to recognize complex scenes. Because vision-based registration algorithms are fragile and tend to drift, data from inertial sensors and vision are fused by an extended Kalman filter (EKF) to achieve considerable improvements in tracking stability and robustness. Experimental results show that our method greatly enhances the recognition rate and eliminates tracking jitter. PMID:26690439
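
    The inertial/vision fusion described above follows the usual predict-update pattern of Kalman filtering: inertial data propagates the state at high rate, and each vision fix corrects the accumulated drift. The toy filter below tracks position and velocity along one axis; the paper's EKF estimates full camera pose, and the noise values here are placeholders.

```python
import numpy as np

class PoseAxisFilter:
    """Minimal Kalman filter fusing inertial prediction with vision fixes."""
    def __init__(self, q=1e-3, r=1e-2):
        self.x = np.zeros(2)            # state: [position, velocity]
        self.P = np.eye(2)
        self.Q = q * np.eye(2)          # process noise (inertial drift)
        self.R = np.array([[r]])        # measurement noise (vision jitter)

    def predict(self, accel, dt):
        F = np.array([[1.0, dt], [0.0, 1.0]])
        self.x = F @ self.x + np.array([0.5 * dt**2, dt]) * accel
        self.P = F @ self.P @ F.T + self.Q

    def update(self, vision_pos):
        H = np.array([[1.0, 0.0]])      # vision measures position only
        y = vision_pos - H @ self.x                 # innovation
        S = H @ self.P @ H.T + self.R
        K = self.P @ H.T @ np.linalg.inv(S)         # Kalman gain
        self.x = self.x + (K @ y).ravel()
        self.P = (np.eye(2) - K @ H) @ self.P
```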

  5. Sensor-Aware Recognition and Tracking for Wide-Area Augmented Reality on Mobile Phones.

    PubMed

    Chen, Jing; Cao, Ruochen; Wang, Yongtian

    2015-12-10

    Wide-area registration in outdoor environments on mobile phones is a challenging task in mobile augmented reality. We present a sensor-aware large-scale outdoor augmented reality system for recognition and tracking on mobile phones. GPS and gravity information is used to improve the VLAD performance for recognition. A sensor-aware VLAD algorithm, self-adaptive to scenes of different scales, is utilized to recognize complex scenes. Because vision-based registration algorithms are fragile and tend to drift, data from inertial sensors and vision are fused by an extended Kalman filter (EKF) to achieve considerable improvements in tracking stability and robustness. Experimental results show that our method greatly enhances the recognition rate and eliminates tracking jitter.

  6. ROBOSIGHT: Robotic Vision System For Inspection And Manipulation

    NASA Astrophysics Data System (ADS)

    Trivedi, Mohan M.; Chen, ChuXin; Marapane, Suresh

    1989-02-01

    Vision is an important sensory modality that can be used to derive information critical to the proper, efficient, flexible, and safe operation of an intelligent robot. Vision systems are utilized to develop a higher-level interpretation of the nature of a robotic workspace from images acquired by cameras mounted on a robot. Such information can be useful for tasks such as object recognition, object location, object inspection, obstacle avoidance and navigation. In this paper we describe efforts directed towards developing a vision system useful for performing various robotic inspection and manipulation tasks. The system utilizes gray-scale images and can be viewed as a model-based system. It includes general-purpose image analysis modules as well as special-purpose, task-dependent object status recognition modules. Experiments are described that verify the robust performance of the integrated system using a robotic testbed.

  7. SEMI-SUPERVISED OBJECT RECOGNITION USING STRUCTURE KERNEL

    PubMed Central

    Wang, Botao; Xiong, Hongkai; Jiang, Xiaoqian; Ling, Fan

    2013-01-01

    Object recognition is a fundamental problem in computer vision. Part-based models offer a sparse, flexible representation of objects, but suffer from difficulties in training and often use standard kernels. In this paper, we propose a positive definite kernel called the "structure kernel", which measures the similarity of two part-based represented objects. The structure kernel has three terms: 1) a global term that measures the global visual similarity of the two objects; 2) a part term that measures the visual similarity of corresponding parts; 3) a spatial term that measures the spatial similarity of the geometric configuration of the parts. The contribution of this paper is to generalize the discriminative capability of local kernels to complex part-based object models. Experimental results show that the proposed kernel exhibits higher accuracy than state-of-the-art approaches using standard kernels. PMID:23666108
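
    The three-term structure of the kernel can be made concrete with a small sketch. The representation below (a global descriptor, per-part descriptors, and part locations) and the cosine similarities are our own illustrative choices; the paper's actual kernel is constructed so as to be provably positive definite, which this sketch does not guarantee.

```python
import numpy as np

def cos_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def structure_kernel(o1, o2, w=(1.0, 1.0, 1.0)):
    """Global + part + spatial similarity of two part-based objects, each a
    dict {'global': vec, 'parts': [vec, ...], 'layout': [(x, y), ...]} with
    corresponding parts listed in the same order."""
    k_global = cos_sim(o1['global'], o2['global'])
    k_part = np.mean([cos_sim(p, q) for p, q in zip(o1['parts'], o2['parts'])])
    # Spatial term: compare pairwise part offsets (translation invariant).
    d1 = np.diff(np.asarray(o1['layout'], float), axis=0).ravel()
    d2 = np.diff(np.asarray(o2['layout'], float), axis=0).ravel()
    return w[0] * k_global + w[1] * k_part + w[2] * cos_sim(d1, d2)
```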

  8. Recognition of Indian Sign Language in Live Video

    NASA Astrophysics Data System (ADS)

    Singha, Joyeeta; Das, Karen

    2013-05-01

    Sign language recognition has emerged as one of the important areas of research in computer vision. The difficulty faced by researchers is that instances of signs vary in both motion and appearance. In this paper, a novel approach for recognizing various alphabets of Indian Sign Language is proposed, in which continuous video sequences of the signs are considered. The proposed system comprises three stages: preprocessing, feature extraction and classification. The preprocessing stage includes skin filtering and histogram matching. Eigenvalues and eigenvectors were used in the feature extraction stage, and finally the eigenvalue-weighted Euclidean distance is used to recognize the sign. The system deals with bare hands, thus allowing the user to interact with it in a natural way. We considered 24 different alphabets in the video sequences and attained a success rate of 96.25%.
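
    A minimal sketch of the classification stage, under our reading of "eigenvalue-weighted Euclidean distance" (components with larger eigenvalues count more): project frames into a PCA eigenspace and match against class templates. Image sizes are assumed small enough that the covariance matrix is tractable.

```python
import numpy as np

def fit_eigenspace(X, k=20):
    """PCA over flattened training frames (one per row of X)."""
    mu = X.mean(axis=0)
    cov = (X - mu).T @ (X - mu) / len(X)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:k]
    return mu, vecs[:, order], vals[order]

def eig_weighted_dist(a, b, eigvals):
    return np.sqrt(np.sum(eigvals * (a - b) ** 2))

def classify(img, mu, vecs, vals, templates, labels):
    """Match a projected frame to the nearest class template."""
    z = vecs.T @ (img - mu)
    d = [eig_weighted_dist(z, t, vals) for t in templates]
    return labels[int(np.argmin(d))]
```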

  9. Multidisciplinary unmanned technology teammate (MUTT)

    NASA Astrophysics Data System (ADS)

    Uzunovic, Nenad; Schneider, Anne; Lacaze, Alberto; Murphy, Karl; Del Giorno, Mark

    2013-01-01

    The U.S. Army Tank Automotive Research, Development and Engineering Center (TARDEC) held an autonomous robot competition called CANINE in June 2012. The goal of the competition was to develop innovative and natural control methods for robots. This paper describes the winning technology, including the vision system, the operator interaction, and the autonomous mobility. The rules stated that only gestures or voice commands could be used for control. The robots would learn a new object at the start of each phase, find the object after it was thrown into a field, and return the object to the operator. Each of the six phases became more difficult, introducing clutter of the same color or shape as the object, moving and stationary obstacles, and finding the operator, who moved from the starting location to a new location. The Robotic Research Team integrated techniques in computer vision, speech recognition, object manipulation, and autonomous navigation. A multi-filter computer vision solution reliably detected the objects while rejecting objects of similar color or shape, even while the robot was in motion. A speech-based interface with short commands provided close-to-natural communication of complicated commands from the operator to the robot. An innovative gripper design allowed for efficient object pickup. A robust autonomous mobility and navigation solution for ground robotic platforms provided fast and reliable obstacle avoidance and course navigation. The research approach focused on winning the competition while remaining cognizant of and relevant to real-world applications.

  10. Risk factors for computer visual syndrome (CVS) among operators of two call centers in São Paulo, Brazil.

    PubMed

    Sa, Eduardo Costa; Ferreira Junior, Mario; Rocha, Lys Esther

    2012-01-01

    The aims of this study were to investigate work conditions, estimate the prevalence, and describe risk factors associated with computer vision syndrome among operators of two call centers in São Paulo (n = 476). The methods included a quantitative cross-sectional observational study and an ergonomic work analysis, using work observation, interviews and questionnaires. The case definition was the presence of one or more specific ocular symptoms reported as always, often or sometimes. The multiple logistic regression model was created using the stepwise forward likelihood method, retaining variables with significance levels below 5% (p < 0.05). The operators were mainly female and young (15 to 24 years old). The call centers operated 24 hours a day, and the operators worked 36 hours weekly with break times from 21 to 35 minutes per day. The symptoms reported were eye fatigue (73.9%), "weight" in the eyes (68.2%), "burning" eyes (54.6%), tearing (43.9%) and weakening of vision (43.5%). The prevalence of computer vision syndrome was 54.6%. The associations verified were: being female (OR 2.6, 95% CI 1.6 to 4.1), lack of recognition at work (OR 1.4, 95% CI 1.1 to 1.8), organization of work in the call center (OR 1.4, 95% CI 1.1 to 1.7) and high demand at work (OR 1.1, 95% CI 1.0 to 1.3). Organization and psychosocial factors at work should be included in prevention programs for visual syndrome among call center operators.

  11. Automated egg grading system using computer vision: Investigation on weight measure versus shape parameters

    NASA Astrophysics Data System (ADS)

    Nasir, Ahmad Fakhri Ab; Suhaila Sabarudin, Siti; Majeed, Anwar P. P. Abdul; Ghani, Ahmad Shahrizan Abdul

    2018-04-01

    Chicken eggs are a food in high demand by humans. Human operators cannot work perfectly and continuously when conducting egg grading. Instead of grading eggs by weight measurement, an automatic system for egg grading using computer vision (based on egg shape parameters) can be used to improve the productivity of egg grading. However, an early hypothesis indicated that the assigned egg classes would change more when using egg shape parameters than when using weight measurement. This paper presents a comparison of egg classification by the two above-mentioned methods. First, 120 images of chicken eggs of various grades (A–D) produced in Malaysia were captured. Then, the egg images were processed using image pre-processing techniques such as cropping, smoothing and segmentation. Thereafter, eight egg shape features, including area, major axis length, minor axis length, volume, diameter and perimeter, were extracted. Lastly, feature selection (information gain ratio) and feature extraction (principal component analysis) were performed, with a k-nearest neighbour classifier used in the classification process. Two methods, supervised learning (using weight measurement, as graded by the egg supplier) and unsupervised learning (using egg shape parameters, as graded by the authors), were used in the experiment. Clustering results reveal many changes in egg classes after performing shape-based grading. On average, the best recognition result using the shape-based grading labels is 94.16%, while that using the weight-based labels is 44.17%. In conclusion, an automated egg grading system using computer vision works better with shape-based features, since it operates on images, whereas the weight parameter is more suitable for a weight-based grading system.
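
    The shape-based pipeline above can be sketched with OpenCV and scikit-learn. The code assumes OpenCV 4's findContours signature and a pre-segmented binary egg mask; volume is approximated from the fitted ellipse axes as a spheroid, which may differ from the paper's definition.

```python
import cv2
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def egg_shape_features(mask):
    """Shape descriptors from a binary egg mask (a subset of the paper's
    eight features)."""
    cnts, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
    c = max(cnts, key=cv2.contourArea)
    area = cv2.contourArea(c)
    perimeter = cv2.arcLength(c, closed=True)
    _, axes, _ = cv2.fitEllipse(c)
    minor, major = sorted(axes)
    volume = (4.0 / 3.0) * np.pi * (major / 2) * (minor / 2) ** 2  # spheroid
    return [area, major, minor, volume, perimeter]

# Grading: features from labelled masks, then k-NN as in the paper, e.g.
#   X = [egg_shape_features(m) for m in masks]
#   clf = KNeighborsClassifier(n_neighbors=3).fit(X, grades)
```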

  12. Self-organized Evaluation of Dynamic Hand Gestures for Sign Language Recognition

    NASA Astrophysics Data System (ADS)

    Buciu, Ioan; Pitas, Ioannis

    Two main theories exist with respect to face encoding and representation in the human visual system (HVS). The first refers to a dense (holistic) representation of the face, where faces have a "holon"-like appearance. The second claims that a more appropriate face representation is given by a sparse code, where only a small fraction of the neural cells corresponding to face encoding is activated. Theoretical and experimental evidence suggests that the HVS performs face analysis (encoding, storing, face recognition, facial expression recognition) in a structured and hierarchical way, where both representations have their own contribution and goal. According to neuropsychological experiments, it seems that encoding for face recognition relies on a holistic image representation, while a sparse image representation is used for facial expression analysis and classification. From the computer vision perspective, the techniques developed for automatic face and facial expression recognition fall into the same two representation types. As in neuroscience, the techniques that perform better for face recognition yield a holistic image representation, while those suitable for facial expression recognition use a sparse or local image representation. The proposed mathematical models of image formation and encoding try to simulate the efficient storing, organization and coding of data in the human cortex. This is equivalent to embedding constraints in the model design regarding dimensionality reduction, redundant information minimization, mutual information minimization, non-negativity constraints, class information, etc. The presented techniques are applied as a feature extraction step followed by a classification method, which also heavily influences the recognition results.

  13. Task effects, performance levels, features, configurations, and holistic face processing: A reply to Rossion

    PubMed Central

    Riesenhuber, Maximilian; Wolff, Brian S.

    2009-01-01

    Summary A recent article in Acta Psychologica (“Picture-plane inversion leads to qualitative changes of face perception” by B. Rossion, 2008) criticized several aspects of an earlier paper of ours (Riesenhuber et al., “Face processing in humans is compatible with a simple shape-based model of vision”, Proc Biol Sci, 2004). We here address Rossion’s criticisms and correct some misunderstandings. To frame the discussion, we first review our previously presented computational model of face recognition in cortex (Jiang et al., “Evaluation of a shape-based model of human face discrimination using fMRI and behavioral techniques”, Neuron, 2006) that provides a concrete biologically plausible computational substrate for holistic coding, namely a neural representation learned for upright faces, in the spirit of the original simple-to-complex hierarchical model of vision by Hubel and Wiesel. We show that Rossion’s and others’ data support the model, and that there is actually a convergence of views on the mechanisms underlying face recognition, in particular regarding holistic processing. PMID:19665104

  14. Design of a compact low-power human-computer interaction equipment for hand motion

    NASA Astrophysics Data System (ADS)

    Wu, Xianwei; Jin, Wenguang

    2017-01-01

    Human-Computer Interaction (HCI) raises demands for convenience, endurance, responsiveness and naturalness. This paper describes the design of a compact, wearable, low-power HCI device applied to gesture recognition. The system combines multi-modal sensing signals, a vision signal and a motion signal, and is equipped with a depth camera and a motion sensor. The device (40 mm × 30 mm) is compact and portable after tight integration. The system is built on a modular layered framework, which enables real-time collection (60 fps), processing and transmission, combining synchronous fusion with asynchronous concurrent collection and wireless Bluetooth 4.0 transmission. To minimize energy consumption, the system uses low-power components, manages peripheral states dynamically, switches into idle mode intelligently, applies pulse-width modulation (PWM) to the NIR LEDs of the depth camera, and optimizes algorithms using the motion sensor. To test the device's function and performance, a gesture recognition algorithm was run on the system. As the results show, overall energy consumption can be as low as 0.5 W.

  15. Comparison of progressive addition lenses for general purpose and for computer vision: an office field study.

    PubMed

    Jaschinski, Wolfgang; König, Mirjam; Mekontso, Tiofil M; Ohlendorf, Arne; Welscher, Monique

    2015-05-01

    Two types of progressive addition lenses (PALs) were compared in an office field study: 1. General purpose PALs with continuous clear vision between infinity and near reading distances and 2. Computer vision PALs with a wider zone of clear vision at the monitor and in near vision but no clear distance vision. Twenty-three presbyopic participants wore each type of lens for two weeks in a double-masked four-week quasi-experimental procedure that included an adaptation phase (Weeks 1 and 2) and a test phase (Weeks 3 and 4). Questionnaires on visual and musculoskeletal conditions as well as preferences regarding the type of lenses were administered. After eight more weeks of free use of the spectacles, the preferences were assessed again. The ergonomic conditions were analysed from photographs. Head inclination when looking at the monitor was significantly lower by 2.3 degrees with the computer vision PALs than with the general purpose PALs. Vision at the monitor was judged significantly better with computer PALs, while distance vision was judged better with general purpose PALs; however, the reported advantage of computer vision PALs differed in extent between participants. Accordingly, 61 per cent of the participants preferred the computer vision PALs, when asked without information about lens design. After full information about lens characteristics and additional eight weeks of free spectacle use, 44 per cent preferred the computer vision PALs. On average, computer vision PALs were rated significantly better with respect to vision at the monitor during the experimental part of the study. In the final forced-choice ratings, approximately half of the participants preferred either the computer vision PAL or the general purpose PAL. Individual factors seem to play a role in this preference and in the rated advantage of computer vision PALs. © 2015 The Authors. Clinical and Experimental Optometry © 2015 Optometry Australia.

  16. Thoughts turned into high-level commands: Proof-of-concept study of a vision-guided robot arm driven by functional MRI (fMRI) signals.

    PubMed

    Minati, Ludovico; Nigri, Anna; Rosazza, Cristina; Bruzzone, Maria Grazia

    2012-06-01

    Previous studies have demonstrated the possibility of using functional MRI to control a robot arm through a brain-machine interface by directly coupling haemodynamic activity in the sensory-motor cortex to the position of two axes. Here, we extend this work by implementing interaction at a more abstract level, whereby imagined actions deliver structured commands to a robot arm guided by a machine vision system. Rather than extracting signals from a small number of pre-selected regions, the proposed system adaptively determines at the individual level how to map representative brain areas to the input nodes of a classifier network. In this initial study, a median action recognition accuracy of 90% was attained on five volunteers performing a game consisting of collecting randomly positioned coloured pawns and placing them into cups. The "pawn" and "cup" instructions were imparted through four mental imagery tasks, linked to robot arm actions by a state machine. With the current implementation in the MATLAB language, the median action recognition time was 24.3 s and the robot execution time was 17.7 s. We demonstrate the notion of combining haemodynamic brain-machine interfacing with computer vision to implement interaction at the level of high-level commands rather than individual movements, which may find application in future fMRI approaches relevant to brain-lesioned patients, and provide source code supporting further work on larger command sets and real-time processing. Copyright © 2012 IPEM. Published by Elsevier Ltd. All rights reserved.
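
    The state-machine coupling between decoded imagery classes and arm actions can be pictured with a small table-driven sketch. The state names, imagery labels, and actions below are hypothetical; the study's actual command set and transitions are in its released source code.

```python
# Hypothetical (state, decoded class) -> (action, next state) table for a
# pick-and-place protocol like the pawn-and-cup game described above.
ACTIONS = {
    ("await_pawn", "imagery_1"): ("pick_nearest_red_pawn", "await_cup"),
    ("await_pawn", "imagery_2"): ("pick_nearest_blue_pawn", "await_cup"),
    ("await_cup", "imagery_3"): ("place_in_left_cup", "await_pawn"),
    ("await_cup", "imagery_4"): ("place_in_right_cup", "await_pawn"),
}

def step(state, decoded_class):
    """Return (robot_command, next_state); unknown pairs keep the state."""
    return ACTIONS.get((state, decoded_class), (None, state))

state = "await_pawn"
for c in ["imagery_2", "imagery_3"]:       # e.g. successive classifier outputs
    cmd, state = step(state, c)
    print(cmd, "->", state)
```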

  17. Perceptual organization in computer vision - A review and a proposal for a classificatory structure

    NASA Technical Reports Server (NTRS)

    Sarkar, Sudeep; Boyer, Kim L.

    1993-01-01

    The evolution of perceptual organization in biological vision, and its necessity in advanced computer vision systems, arises from the characteristic that perception, the extraction of meaning from sensory input, is an intelligent process. This is particularly so for high order organisms and, analogically, for more sophisticated computational models. The role of perceptual organization in computer vision systems is explored. This is done from four vantage points. First, a brief history of perceptual organization research in both humans and computer vision is offered. Next, a classificatory structure in which to cast perceptual organization research to clarify both the nomenclature and the relationships among the many contributions is proposed. Thirdly, the perceptual organization work in computer vision in the context of this classificatory structure is reviewed. Finally, the array of computational techniques applied to perceptual organization problems in computer vision is surveyed.

  18. Comparison of diagnosis of early retinal lesions of diabetic retinopathy between a computer system and human experts.

    PubMed

    Lee, S C; Lee, E T; Kingsley, R M; Wang, Y; Russell, D; Klein, R; Warn, A

    2001-04-01

    To investigate whether a computer vision system is comparable with humans in detecting early retinal lesions of diabetic retinopathy using color fundus photographs. A computer system has been developed using image processing and pattern recognition techniques to detect early lesions of diabetic retinopathy (hemorrhages and microaneurysms, hard exudates, and cotton-wool spots). Color fundus photographs obtained from American Indians in Oklahoma were used in developing and testing the system. A set of 369 color fundus slides was used to train the computer system using 3 diagnostic categories: lesions present, questionable, or absent (Y/Q/N). A different set of 428 slides was used to test and evaluate the system, and its diagnostic results were compared with those of 2 human experts: the grader at the University of Wisconsin Fundus Photograph Reading Center (Madison) and a general ophthalmologist. The experiments included comparisons using 3 (Y/Q/N) and 2 diagnostic categories (Y/N) (questionable cases excluded in the latter). In the training phase, the agreement rates, sensitivity, and specificity in detecting the 3 lesions between the retinal specialist and the computer system were all above 90%. The kappa statistics were high (0.75-0.97), indicating excellent agreement between the specialist and the computer system. In the testing phase, the results obtained between the computer system and the human experts were consistent with those of the training phase, and they were comparable with those between the human experts. The performance of the computer vision system in diagnosing early retinal lesions was comparable with that of human experts. Therefore, this mobile, easily accessible, and noninvasive computer system could become a mass screening tool and a clinical aid in diagnosing early lesions of diabetic retinopathy.
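
    The kappa statistics quoted above measure agreement beyond chance. For reference, Cohen's kappa can be computed from two raters' label vectors as follows (a standard formula, not code from the study).

```python
import numpy as np

def cohens_kappa(a, b):
    """Agreement beyond chance between two raters (integer labels)."""
    a, b = np.asarray(a, int), np.asarray(b, int)
    k = max(a.max(), b.max()) + 1
    cm = np.zeros((k, k))
    for i, j in zip(a, b):                 # confusion matrix
        cm[i, j] += 1
    n = cm.sum()
    po = np.trace(cm) / n                  # observed agreement
    pe = (cm.sum(axis=1) @ cm.sum(axis=0)) / n**2   # chance agreement
    return (po - pe) / (1 - pe)
```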

  19. The use of higher-order statistics in rapid object categorization in natural scenes.

    PubMed

    Banno, Hayaki; Saiki, Jun

    2015-02-04

    We can rapidly and efficiently recognize many types of objects embedded in complex scenes. What information supports this object recognition is a fundamental question for understanding our visual processing. We investigated the eccentricity-dependent role of shape and statistical information for ultrarapid object categorization, using the higher-order statistics proposed by Portilla and Simoncelli (2000). Synthesized textures computed by their algorithms have the same higher-order statistics as the originals, while the global shapes were destroyed. We used the synthesized textures to manipulate the availability of shape information separately from the statistics. We hypothesized that shape makes a greater contribution to central vision than to peripheral vision and that statistics show the opposite pattern. Results did not show contributions clearly biased by eccentricity. Statistical information demonstrated a robust contribution not only in peripheral but also in central vision. For shape, the results supported the contribution in both central and peripheral vision. Further experiments revealed some interesting properties of the statistics. They are available for a limited time, attributable to the presence or absence of animals without shape, and predict how easily humans detect animals in original images. Our data suggest that when facing the time constraint of categorical processing, higher-order statistics underlie our significant performance for rapid categorization, irrespective of eccentricity. © 2015 ARVO.

  20. Automatic image database generation from CAD for 3D object recognition

    NASA Astrophysics Data System (ADS)

    Sardana, Harish K.; Daemi, Mohammad F.; Ibrahim, Mohammad K.

    1993-06-01

    The development and evaluation of multiple-view 3-D object recognition systems are based on a large set of model images. Due to the various advantages of using CAD, it is becoming more and more practical to use existing CAD data in computer vision systems. Current PC-level CAD systems are capable of providing physical image modelling and rendering involving positional variations in cameras, light sources, etc. We have formulated a modular scheme for the automatic generation of various aspects (views) of the objects in a model-based 3-D object recognition system. These views are generated at desired orientations on the unit Gaussian sphere. With a suitable network file sharing system (NFS), the images can be stored directly in a database located on a file server. This paper presents image modelling solutions using CAD in relation to the multiple-view approach. Our modular scheme for data conversion and automatic image database storage for such a system is discussed. We have used this approach in 3-D polyhedron recognition. An overview of the results, advantages and limitations of using CAD data, and conclusions from using such a scheme are also presented.
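
    Generating views "at desired orientations on the unit Gaussian sphere" amounts to sampling camera directions on a sphere and rendering the CAD model from each. One common choice for approximately uniform sampling is a Fibonacci lattice, sketched below; the paper does not specify this particular scheme.

```python
import numpy as np

def view_directions(n=64):
    """n approximately uniform unit vectors (camera directions) on the sphere."""
    i = np.arange(n)
    phi = np.pi * (3.0 - np.sqrt(5.0)) * i     # golden-angle increments
    z = 1.0 - 2.0 * (i + 0.5) / n              # uniform spacing in z
    r = np.sqrt(1.0 - z * z)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

# Each direction would drive the CAD renderer's camera to produce one aspect.
```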

  1. A robust recognition and accurate locating method for circular coded diagonal target

    NASA Astrophysics Data System (ADS)

    Bao, Yunna; Shang, Yang; Sun, Xiaoliang; Zhou, Jiexin

    2017-10-01

    As a category of special control points which can be automatically identified, artificial coded targets have been widely developed in the fields of computer vision, photogrammetry, augmented reality, etc. In this paper, a new circular coded target designed by RockeTech technology Corp. Ltd is analyzed and studied, called the circular coded diagonal target (CCDT). A novel detection and recognition method with good robustness is proposed and implemented in Visual Studio. In this algorithm, the ellipse features of the center circle are first used for rough positioning. Then, according to the characteristics of the center diagonal target, a circular frequency filter is designed to choose the correct center circle and eliminate non-target noise. The precise positioning of the coded target is achieved by fitting the extremum of the correlation coefficient. Finally, target recognition is achieved by decoding the binary sequence in the outer ring of the extracted target. To test the proposed algorithm, simulation experiments and real experiments were carried out. The results show that the CCDT recognition and accurate locating method proposed in this paper can robustly recognize and accurately locate targets against complex and noisy backgrounds.
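
    The final decoding step, reading a binary sequence from the target's outer ring, can be sketched as sampling one intensity per code sector and normalizing away the unknown rotation. The number of bits, sampling radius, and threshold below are design-specific guesses, not the published CCDT layout.

```python
import numpy as np

def decode_ring(gray, center, radius, n_bits=12, thresh=128):
    """Sample the outer code ring of a located target and return a
    rotation-invariant code (minimum over all cyclic bit shifts)."""
    cx, cy = center
    ang = 2.0 * np.pi * (np.arange(n_bits) + 0.5) / n_bits
    xs = np.round(cx + radius * np.cos(ang)).astype(int)
    ys = np.round(cy + radius * np.sin(ang)).astype(int)
    bits = (gray[ys, xs] > thresh).astype(int)
    value = lambda b: int("".join(str(v) for v in b), 2)
    return min(value(np.roll(bits, s)) for s in range(n_bits))
```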

  2. Cortical visual dysfunction in children: a clinical study.

    PubMed

    Dutton, G; Ballantyne, J; Boyd, G; Bradnam, M; Day, R; McCulloch, D; Mackie, R; Phillips, S; Saunders, K

    1996-01-01

    Damage to the cerebral cortex was responsible for impairment of vision in 90 of 130 consecutive children referred to the Vision Assessment Clinic in Glasgow. Cortical blindness was seen in 16 children. Only 2 were mobile, but both showed evidence of navigational blindsight. Cortical visual impairment, in which it was possible to estimate visual acuity but generalised severe brain damage precluded estimation of cognitive visual function, was observed in 9 children. Complex disorders of cognitive vision were seen in 20 children. These could be divided into five categories, involving impairment of: (1) recognition, (2) orientation, (3) depth perception, (4) perception of movement and (5) simultaneous perception. These disorders were observed in a variety of combinations. The remaining children showed evidence of reduced visual acuity and/or visual field loss, but without detectable disorders of cognitive visual function. Early recognition of disorders of cognitive vision is required if active training and remediation are to be implemented.

  3. Cherry recognition in natural environment based on the vision of picking robot

    NASA Astrophysics Data System (ADS)

    Zhang, Qirong; Chen, Shanxiong; Yu, Tingzhong; Wang, Yan

    2017-04-01

    In order to realize automatic recognition of cherries in the natural environment, this paper presents a recognition method for the vision system of a picking robot. The first step of this method is to pre-process the cherry image by median filtering. The second step is to identify the colour of the cherry through the 0.9R-G colour difference formula and then apply the Otsu algorithm for threshold segmentation. The third step is to remove noise using an area threshold. The fourth step is to remove holes in the cherry image by morphological closing and opening operations. The fifth step is to obtain the centroid and contour of the cherry using the minimum enclosing rectangle and the Hough transform. Through this recognition process, we can successfully identify 96% of cherries that are free of occlusion and adhesion.
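
    The five steps above map almost one-to-one onto OpenCV calls. The sketch below follows that pipeline with placeholder parameters (kernel sizes, the area threshold); centroids are taken from contour moments, with the minimum enclosing rectangle and Hough transform of the paper left out for brevity.

```python
import cv2
import numpy as np

def find_cherries(bgr, min_area=200):
    img = cv2.medianBlur(bgr, 5)                         # 1) median filter
    b, g, r = cv2.split(img.astype(np.float32))
    diff = np.clip(0.9 * r - g, 0, 255).astype(np.uint8) # 2) 0.9R - G
    _, mask = cv2.threshold(diff, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # 4) fill holes
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    cnts, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
    cnts = [c for c in cnts if cv2.contourArea(c) > min_area]  # 3) area filter
    centers = []                                          # 5) centroids
    for c in cnts:
        m = cv2.moments(c)
        centers.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
    return centers, cnts
```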

  4. Image processing strategies based on saliency segmentation for object recognition under simulated prosthetic vision.

    PubMed

    Li, Heng; Su, Xiaofan; Wang, Jing; Kan, Han; Han, Tingting; Zeng, Yajie; Chai, Xinyu

    2018-01-01

    Current retinal prostheses can only generate low-resolution visual percepts constituted of limited phosphenes which are elicited by an electrode array, with uncontrollable color and restricted grayscale. Under this visual perception, prosthetic recipients can complete only simple visual tasks; more complex tasks like face identification/object recognition are extremely difficult. Therefore, it is necessary to investigate and apply image processing strategies for optimizing the visual perception of the recipients. This study focuses on recognition of the object of interest employing simulated prosthetic vision. We used a saliency segmentation method based on a biologically plausible graph-based visual saliency model and a grabCut-based self-adaptive iterative optimization framework to automatically extract foreground objects. Based on this, two image processing strategies, Addition of Separate Pixelization and Background Pixel Shrink, were further utilized to enhance the extracted foreground objects. Psychophysical experiments verified that, under simulated prosthetic vision, both strategies have marked advantages over Direct Pixelization in terms of recognition accuracy and efficiency. We also found that recognition performance under the two strategies was tied to the segmentation results and was affected positively by paired, interrelated objects in the scene. The use of the saliency segmentation method and these image processing strategies can automatically extract and enhance foreground objects, and significantly improve object recognition performance for recipients implanted with a high-density array. Copyright © 2017 Elsevier B.V. All rights reserved.
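
    The grabCut-based foreground extraction followed by pixelization can be sketched as below. Here grabCut is seeded with a hand-supplied rectangle for simplicity, whereas the paper seeds it from a saliency map inside a self-adaptive iterative framework; the phosphene grid size is also arbitrary.

```python
import cv2
import numpy as np

def foreground_pixelized(bgr, rect, grid=(32, 32)):
    """Extract the foreground with grabCut, suppress the background, and
    pixelize to a coarse phosphene-like grid. rect = (x, y, w, h)."""
    mask = np.zeros(bgr.shape[:2], np.uint8)
    bgd = np.zeros((1, 65), np.float64)
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(bgr, mask, rect, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
    fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD),
                  255, 0).astype(np.uint8)
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    gray[fg == 0] = 0                                 # remove background
    small = cv2.resize(gray, grid, interpolation=cv2.INTER_AREA)
    return cv2.resize(small, gray.shape[::-1],
                      interpolation=cv2.INTER_NEAREST)
```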

  5. Robust image matching via ORB feature and VFC for mismatch removal

    NASA Astrophysics Data System (ADS)

    Ma, Tao; Fu, Wenxing; Fang, Bin; Hu, Fangyu; Quan, Siwen; Ma, Jie

    2018-03-01

    Image matching underlies many image processing and computer vision problems, such as object recognition or structure from motion. Current methods rely on good feature descriptors and mismatch removal strategies for detection and matching. In this paper, we propose a robust image matching approach based on the ORB feature and VFC for mismatch removal. ORB (Oriented FAST and Rotated BRIEF) is an outstanding feature; it offers performance comparable to SIFT at lower computational cost. VFC (Vector Field Consensus) is a state-of-the-art mismatch removal method. The experimental results demonstrate that our method is efficient and robust.
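
    An ORB matching front end is a few lines in OpenCV. Since VFC is not part of OpenCV, the sketch below substitutes RANSAC homography fitting as the mismatch removal stage, which plays the same role but is not the paper's VFC algorithm.

```python
import cv2
import numpy as np

def match_orb(img1, img2):
    orb = cv2.ORB_create(nfeatures=2000)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(bf.match(d1, d2), key=lambda m: m.distance)
    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    # Mismatch removal stand-in: keep matches consistent with one homography.
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    if inliers is None:
        return matches
    return [m for m, ok in zip(matches, inliers.ravel()) if ok]
```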

  6. Proceedings of the Third International Workshop on Neural Networks and Fuzzy Logic, volume 2

    NASA Technical Reports Server (NTRS)

    Culbert, Christopher J. (Editor)

    1993-01-01

    Papers presented at the Neural Networks and Fuzzy Logic Workshop sponsored by the National Aeronautics and Space Administration and cosponsored by the University of Houston, Clear Lake, held 1-3 Jun. 1992 at the Lyndon B. Johnson Space Center in Houston, Texas are included. During the three days approximately 50 papers were presented. Technical topics addressed included adaptive systems; learning algorithms; network architectures; vision; robotics; neurobiological connections; speech recognition and synthesis; fuzzy set theory and application, control and dynamics processing; space applications; fuzzy logic and neural network computers; approximate reasoning; and multiobject decision making.

  7. Face recognition in age related macular degeneration: perceived disability, measured disability, and performance with a bioptic device.

    PubMed

    Tejeria, L; Harper, R A; Artes, P H; Dickinson, C M

    2002-09-01

    (1) To explore the relation between performance on tasks of familiar face recognition (FFR) and face expression difference discrimination (FED) with both perceived disability in face recognition and clinical measures of visual function in subjects with age related macular degeneration (AMD). (2) To quantify the gain in performance for face recognition tasks when subjects use a bioptic telescopic low vision device. 30 subjects with AMD (age range 66-90 years; visual acuity 0.4-1.4 logMAR) were recruited for the study. Perceived (self rated) disability in face recognition was assessed by an eight item questionnaire covering a range of issues relating to face recognition. Visual functions measured were distance visual acuity (ETDRS logMAR charts), continuous text reading acuity (MNRead charts), contrast sensitivity (Pelli-Robson chart), and colour vision (large panel D-15). In the FFR task, images of famous people had to be identified. FED was assessed by a forced choice test where subjects had to decide which one of four images showed a different facial expression. These tasks were repeated with subjects using a bioptic device. Overall perceived disability in face recognition did not correlate with performance on either task, although a specific item on difficulty recognising familiar faces did correlate with FFR (r = 0.49, p<0.05). FFR performance was most closely related to distance acuity (r = -0.69, p<0.001), while FED performance was most closely related to continuous text reading acuity (r = -0.79, p<0.001). In multiple regression, neither contrast sensitivity nor colour vision significantly increased the explained variance. When using a bioptic telescope, FFR performance improved in 86% of subjects (median gain = 49%; p<0.001), while FED performance increased in 79% of subjects (median gain = 50%; p<0.01). Distance and reading visual acuity are closely associated with measured task performance in FFR and FED. A bioptic low vision device can offer a significant improvement in performance for face recognition tasks, and may be useful in reducing the handicap associated with this disability. There is, however, little evidence for a correlation between self rated difficulty in face recognition and measured performance for either task. Further work is needed to explore the complex relation between the perception of disability and measured performance.

  8. Face recognition in age related macular degeneration: perceived disability, measured disability, and performance with a bioptic device

    PubMed Central

    Tejeria, L; Harper, R A; Artes, P H; Dickinson, C M

    2002-01-01

    Aims: (1) To explore the relation between performance on tasks of familiar face recognition (FFR) and face expression difference discrimination (FED) with both perceived disability in face recognition and clinical measures of visual function in subjects with age related macular degeneration (AMD). (2) To quantify the gain in performance for face recognition tasks when subjects use a bioptic telescopic low vision device. Methods: 30 subjects with AMD (age range 66–90 years; visual acuity 0.4–1.4 logMAR) were recruited for the study. Perceived (self rated) disability in face recognition was assessed by an eight item questionnaire covering a range of issues relating to face recognition. Visual functions measured were distance visual acuity (ETDRS logMAR charts), continuous text reading acuity (MNRead charts), contrast sensitivity (Pelli-Robson chart), and colour vision (large panel D-15). In the FFR task, images of famous people had to be identified. FED was assessed by a forced choice test where subjects had to decide which one of four images showed a different facial expression. These tasks were repeated with subjects using a bioptic device. Results: Overall perceived disability in face recognition did not correlate with performance on either task, although a specific item on difficulty recognising familiar faces did correlate with FFR (r = 0.49, p<0.05). FFR performance was most closely related to distance acuity (r = −0.69, p<0.001), while FED performance was most closely related to continuous text reading acuity (r = −0.79, p<0.001). In multiple regression, neither contrast sensitivity nor colour vision significantly increased the explained variance. When using a bioptic telescope, FFR performance improved in 86% of subjects (median gain = 49%; p<0.001), while FED performance increased in 79% of subjects (median gain = 50%; p<0.01). Conclusion: Distance and reading visual acuity are closely associated with measured task performance in FFR and FED. A bioptic low vision device can offer a significant improvement in performance for face recognition tasks, and may be useful in reducing the handicap associated with this disability. There is, however, little evidence for a correlation between self rated difficulty in face recognition and measured performance for either task. Further work is needed to explore the complex relation between the perception of disability and measured performance. PMID:12185131

  9. EEG based topography analysis in string recognition task

    NASA Astrophysics Data System (ADS)

    Ma, Xiaofei; Huang, Xiaolin; Shen, Yuxiaotong; Qin, Zike; Ge, Yun; Chen, Ying; Ning, Xinbao

    2017-03-01

    Visual perception and recognition is a complex process, during which different parts of the brain are involved depending on the specific modality of the vision target, e.g. a face, character, or word. In this study, brain activities in a string recognition task, compared with an idle control state, are analyzed through topographies based on multiple measures, i.e. sample entropy, symbolic sample entropy and normalized rhythm power, extracted from simultaneously collected scalp EEG. Our analyses show that, for most subjects, both symbolic sample entropy and normalized gamma power in the string recognition task are significantly higher than those in the idle state, especially at locations P4, O2, T6 and C4. This implies that these regions are highly involved in the string recognition task. Since symbolic sample entropy measures complexity, from the perspective of new information generation, and normalized rhythm power reveals the power distribution in the frequency domain, complementary information about the underlying dynamics can be provided by the two types of indices.
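
    Of the measures above, sample entropy is the least self-explanatory; for reference, a direct (O(N²), textbook-style) implementation is sketched below. The parameter conventions (m = 2, r = 0.2 × SD) are common defaults, not necessarily the study's.

```python
import numpy as np

def sample_entropy(x, m=2, r=None):
    """SampEn(m, r): -log of the probability that sequences matching for m
    points (within tolerance r, Chebyshev distance) also match for m + 1."""
    x = np.asarray(x, dtype=float)
    r = 0.2 * x.std() if r is None else r
    def pair_count(mm):
        emb = np.array([x[i:i + mm] for i in range(len(x) - mm + 1)])
        d = np.max(np.abs(emb[:, None] - emb[None, :]), axis=2)
        return (np.sum(d <= r) - len(emb)) / 2.0   # exclude self-matches
    return -np.log(pair_count(m + 1) / pair_count(m))
```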

  10. Fusion of 3D laser scanner and depth images for obstacle recognition in mobile applications

    NASA Astrophysics Data System (ADS)

    Budzan, Sebastian; Kasprzyk, Jerzy

    2016-02-01

    The problem of obstacle detection and recognition or, generally, scene mapping is one of the most investigated problems in computer vision, especially in mobile applications. In this paper a fused optical system using depth information with color images gathered from the Microsoft Kinect sensor and 3D laser range scanner data is proposed for obstacle detection and ground estimation in real-time mobile systems. The algorithm consists of feature extraction in the laser range images, processing of the depth information from the Kinect sensor, fusion of the sensor information, and classification of the data into two separate categories: road and obstacle. Exemplary results are presented and it is shown that fusion of information gathered from different sources increases the effectiveness of the obstacle detection in different scenarios, and it can be used successfully for road surface mapping.

  11. Mechanisms of face perception

    PubMed Central

    Tsao, Doris Y.

    2009-01-01

    Faces are among the most informative stimuli we ever perceive: Even a split-second glimpse of a person's face tells us their identity, sex, mood, age, race, and direction of attention. The specialness of face processing is acknowledged in the artificial vision community, where contests for face recognition algorithms abound. Neurological evidence strongly implicates a dedicated machinery for face processing in the human brain, to explain the double dissociability of face and object recognition deficits. Furthermore, it has recently become clear that macaques too have specialized neural machinery for processing faces. Here we propose a unifying hypothesis, deduced from computational, neurological, fMRI, and single-unit experiments: that what makes face processing special is that it is gated by an obligatory detection process. We will clarify this idea in concrete algorithmic terms, and show how it can explain a variety of phenomena associated with face processing. PMID:18558862

  12. Basic level scene understanding: categories, attributes and structures

    PubMed Central

    Xiao, Jianxiong; Hays, James; Russell, Bryan C.; Patterson, Genevieve; Ehinger, Krista A.; Torralba, Antonio; Oliva, Aude

    2013-01-01

    A longstanding goal of computer vision is to build a system that can automatically understand a 3D scene from a single image. This requires extracting semantic concepts and 3D information from 2D images which can depict an enormous variety of environments that comprise our visual world. This paper summarizes our recent efforts toward these goals. First, we describe the richly annotated SUN database which is a collection of annotated images spanning 908 different scene categories with object, attribute, and geometric labels for many scenes. This database allows us to systematically study the space of scenes and to establish a benchmark for scene and object recognition. We augment the categorical SUN database with 102 scene attributes for every image and explore attribute recognition. Finally, we present an integrated system to extract the 3D structure of the scene and objects depicted in an image. PMID:24009590

  13. A Computer Vision Approach to Identify Einstein Rings and Arcs

    NASA Astrophysics Data System (ADS)

    Lee, Chien-Hsiu

    2017-03-01

    Einstein rings are rare gems of strong lensing phenomena; the ring images can be used to probe the underlying lens gravitational potential at every position angle, tightly constraining the lens mass profile. In addition, the magnified images also enable us to probe high-z galaxies with enhanced resolution and signal-to-noise ratios. However, only a handful of Einstein rings have been reported, either from serendipitous discoveries or from visual inspections of hundreds of thousands of massive galaxies or galaxy clusters. In the era of large sky surveys, an automated approach to identify ring patterns in the big data to come is in high demand. Here, we present an Einstein ring recognition approach based on computer vision techniques. The workhorse is the circle Hough transform, which recognises circular patterns or arcs in images. We propose a two-tier approach: first pre-select massive galaxies associated with multiple blue objects as possible lenses, then use the Hough transform to identify circular patterns. As a proof of concept, we apply our approach to SDSS, with high completeness, albeit with low purity. We also apply our approach to other lenses in the DES, HSC-SSP, and UltraVISTA surveys, illustrating the versatility of our approach.
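
    OpenCV ships a gradient-based circle Hough transform, which makes the second tier of the approach easy to prototype on image cutouts centred on pre-selected galaxies. The radii and accumulator thresholds below are survey-specific guesses.

```python
import cv2
import numpy as np

def ring_candidates(gray_cutout):
    """Circular patterns in a cutout around a candidate lens galaxy."""
    blur = cv2.GaussianBlur(gray_cutout, (5, 5), 1.5)
    circles = cv2.HoughCircles(blur, cv2.HOUGH_GRADIENT, dp=1, minDist=20,
                               param1=80, param2=25,
                               minRadius=5, maxRadius=40)
    if circles is None:
        return []
    return np.uint16(np.around(circles[0]))   # rows of (x, y, radius)
```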

  14. Remote hardware-reconfigurable robotic camera

    NASA Astrophysics Data System (ADS)

    Arias-Estrada, Miguel; Torres-Huitzil, Cesar; Maya-Rueda, Selene E.

    2001-10-01

    In this work, a camera with integrated image processing capabilities is discussed. The camera is based on an imager coupled to an FPGA device (Field Programmable Gate Array) which contains an architecture for real-time computer vision low-level processing. The architecture can be reprogrammed remotely for application-specific purposes. The system is intended for rapid modification and adaptation for inspection and recognition applications, with the flexibility of hardware and software reprogrammability. FPGA reconfiguration makes a hardware upgrade as easy as a software upgrade process. The camera is composed of a digital imager coupled to an FPGA device, two memory banks, and a microcontroller. The microcontroller is used for communication tasks and FPGA programming. The system implements a software architecture to handle multiple FPGA architectures in the device, and offers the possibility of downloading a software/hardware object from the host computer into its internal context memory. System advantages are: small size, low power consumption, and a library of hardware/software functionalities that can be exchanged during run time. The system has been validated with an edge detection and a motion processing architecture, which are presented in the paper. The applications targeted are in robotics, mobile robotics, and vision-based quality control.

  15. Deep recurrent neural network reveals a hierarchy of process memory during dynamic natural vision.

    PubMed

    Shi, Junxing; Wen, Haiguang; Zhang, Yizhen; Han, Kuan; Liu, Zhongming

    2018-05-01

    The human visual cortex extracts both spatial and temporal visual features to support perception and guide behavior. Deep convolutional neural networks (CNNs) provide a computational framework to model cortical representation and organization for spatial visual processing, but are unable to explain how the brain processes temporal information. To overcome this limitation, we extended a CNN by adding recurrent connections to different layers of the CNN, allowing spatial representations to be remembered and accumulated over time. The extended model, the recurrent neural network (RNN), embodied a hierarchical and distributed model of process memory as an integral part of visual processing. Unlike the CNN, the RNN learned spatiotemporal features from videos to enable action recognition. The RNN better predicted cortical responses to natural movie stimuli than the CNN in all visual areas, especially those along the dorsal stream. As a fully observable model of visual processing, the RNN also revealed a cortical hierarchy of temporal receptive windows, the dynamics of process memory, and spatiotemporal representations. These results support the hypothesis of process memory and demonstrate the potential of using the RNN for in-depth computational understanding of dynamic natural vision. © 2018 Wiley Periodicals, Inc.
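
    The CNN-plus-recurrence extension can be sketched in PyTorch: per-frame convolutional features are accumulated by a recurrent layer before classification. The paper adds recurrent connections inside multiple CNN layers; the single GRU and the layer sizes below are simplifications.

```python
import torch
import torch.nn as nn

class RecurrentCNN(nn.Module):
    """Frame-wise CNN features accumulated over time by a GRU."""
    def __init__(self, n_classes=10, feat=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, feat))
        self.rnn = nn.GRU(feat, feat, batch_first=True)
        self.head = nn.Linear(feat, n_classes)

    def forward(self, video):                # video: (B, T, 3, H, W)
        b, t = video.shape[:2]
        f = self.cnn(video.flatten(0, 1)).view(b, t, -1)
        h, _ = self.rnn(f)                   # temporal accumulation (memory)
        return self.head(h[:, -1])           # classify from the last state
```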

  16. Handwritten-word spotting using biologically inspired features.

    PubMed

    van der Zant, Tijn; Schomaker, Lambert; Haak, Koen

    2008-11-01

    For quick access to new handwritten collections, current handwriting recognition methods are too cumbersome. They cannot deal with the lack of labeled data and would require extensive laboratory training for each individual script, style, language and collection. We propose a biologically inspired whole-word recognition method which is used to incrementally elicit word labels in a live, web-based annotation system named Monk. Since human labor should be minimized given the massive amount of image data, it becomes important to rely on robust perceptual mechanisms in the machine. Recent computational models of the neurophysiology of vision are applied to isolated word classification. A primate cortex-like mechanism allows the classification of text images that have a low frequency of occurrence. Typically these images are the most difficult to retrieve, often contain named entities, and are regarded as the most important to people. Standard pattern recognition technology usually cannot deal with these text images when there are not enough labeled instances. The results of this retrieval system are compared to normalized word-image matching and appear to be very promising.

  17. Study on road sign recognition in LabVIEW

    NASA Astrophysics Data System (ADS)

    Panoiu, M.; Rat, C. L.; Panoiu, C.

    2016-02-01

    Road and traffic sign identification is a field of study that can be used to aid the development of in-car advisory systems. It uses computer vision and artificial intelligence to extract road signs from outdoor images acquired by a camera in uncontrolled lighting conditions, where the signs may be occluded by other objects or may suffer from problems such as color fading, disorientation, variations in shape and size, etc. An automatic means of identifying traffic signs under these conditions can make a significant contribution to the development of Intelligent Transport Systems (ITS) that continuously monitor the driver, the vehicle, and the road. Road and traffic signs are characterized by a number of features which make them recognizable in the environment: they are located in standard positions and have standard shapes, standard colors, and known pictograms. These characteristics make them suitable for image identification. Traffic sign identification covers two problems: traffic sign detection and traffic sign recognition. Traffic sign detection is meant for the accurate localization of traffic signs in the image space, while traffic sign recognition handles the labeling of such detections into specific traffic sign types or subcategories [1].
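
    Because signs have standard colors and shapes, the detection stage is often seeded by color thresholding followed by a rough shape test. The sketch below shows one common way to do this in OpenCV; the HSV thresholds and area/aspect limits are assumptions for illustration, not values from the paper.

        import cv2

        def detect_red_signs(bgr):
            """Candidate red-sign regions via HSV colour thresholding plus
            a rough shape test (illustrative only)."""
            hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
            # red wraps around the hue axis, so combine two hue ranges
            m1 = cv2.inRange(hsv, (0, 80, 60), (10, 255, 255))
            m2 = cv2.inRange(hsv, (170, 80, 60), (180, 255, 255))
            mask = cv2.bitwise_or(m1, m2)
            contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                           cv2.CHAIN_APPROX_SIMPLE)
            boxes = []
            for c in contours:
                if cv2.contourArea(c) < 200:           # drop small blobs
                    continue
                x, y, w, h = cv2.boundingRect(c)
                if 0.7 < w / float(h) < 1.4:           # signs are roughly square/round
                    boxes.append((x, y, w, h))
            return boxes                               # pass these to the recognition stage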

  18. Enhanced tactile encoding and memory recognition in congenital blindness.

    PubMed

    D'Angiulli, Amedeo; Waraich, Paul

    2002-06-01

    Several behavioural studies have shown that early-blind persons possess superior tactile skills. Since neurophysiological data show that early-blind persons recruit visual as well as somatosensory cortex to carry out tactile processing (cross-modal plasticity), blind persons' sharper tactile skills may be related to cortical re-organisation resulting from loss of vision early in their life. To examine the nature of blind individuals' tactile superiority and its implications for cross-modal plasticity, we compared the tactile performance of congenitally totally blind, low-vision and sighted children on a raised-line picture identification test and re-test, assessing the effects of task familiarity, exploratory strategy and memory recognition. What distinguished the blind from the other children was higher memory recognition and higher tactile encoding associated with efficient exploration. These results suggest that enhanced perceptual encoding and recognition memory may be two cognitive correlates of cross-modal plasticity in congenital blindness.

  19. Geometry-based ensembles: toward a structural characterization of the classification boundary.

    PubMed

    Pujol, Oriol; Masip, David

    2009-06-01

    This paper introduces a novel binary discriminative learning technique based on the approximation of the nonlinear decision boundary by a piecewise linear smooth additive model. The decision border is geometrically defined by means of the characterizing boundary points, i.e., points that belong to the optimal boundary under a certain notion of robustness. Based on these points, a set of locally robust linear classifiers is defined and assembled by means of a Tikhonov-regularized optimization procedure in an additive model to create a final lambda-smooth decision rule. As a result, a very simple and robust classifier with a strong geometrical meaning and nonlinear behavior is obtained. The simplicity of the method allows its extension to cope with some of today's machine learning challenges, such as online learning, large-scale learning or parallelization, with linear computational complexity. We validate our approach on the UCI database, comparing with several state-of-the-art classification techniques. Finally, we apply our technique in online and large-scale scenarios and in six real-life computer vision and pattern recognition problems: gender recognition based on face images, intravascular ultrasound tissue classification, speed traffic sign detection, Chagas' disease myocardial damage severity detection, old musical scores clef classification, and action recognition using 3D accelerometer data from a wearable device. The results are promising and this paper opens a line of research that deserves further attention.

  20. Learning and Recognition of Clothing Genres From Full-Body Images.

    PubMed

    Hidayati, Shintami C; You, Chuang-Wen; Cheng, Wen-Huang; Hua, Kai-Lung

    2018-05-01

    According to the theory of clothing design, the genres of clothes can be recognized based on a set of visually differentiable style elements, which exhibit salient features of visual appearance and reflect high-level fashion styles for better describing clothing genres. Instead of using less-discriminative low-level features or ambiguous keywords to identify clothing genres, we proposed a novel approach for automatically classifying clothing genres based on the visually differentiable style elements. A set of style elements that are crucial for recognizing specific visual styles of clothing genres was identified based on clothing design theory. In addition, the corresponding salient visual features of each style element were identified and formulated with variables that can be computationally derived with various computer vision algorithms. To evaluate the performance of our algorithm, a dataset containing 3250 full-body shots crawled from popular online stores was built. Recognition results show that our proposed algorithms achieved promising overall precision, recall, and F-scores of 88.76%, 88.53%, and 88.64% for recognizing upperwear genres, and 88.21%, 88.17%, and 88.19% for recognizing lowerwear genres, respectively. The effectiveness of each style element and its visual features on recognizing clothing genres was demonstrated through a set of experiments involving different sets of style elements or features. In summary, our experimental results demonstrate the effectiveness of the proposed method in clothing genre recognition.

  1. Computer vision syndrome: a review.

    PubMed

    Blehm, Clayton; Vishnu, Seema; Khattak, Ashbala; Mitra, Shrabanee; Yee, Richard W

    2005-01-01

    As computers become part of our everyday life, more and more people are experiencing a variety of ocular symptoms related to computer use. These include eyestrain, tired eyes, irritation, redness, blurred vision, and double vision, collectively referred to as computer vision syndrome. This article describes both the characteristics and the treatment modalities that are available at this time. Computer vision syndrome symptoms may arise from ocular (ocular-surface abnormalities or accommodative spasms) and/or extraocular (ergonomic) etiologies. However, the major contributor to computer vision syndrome symptoms by far appears to be dry eye. The visual effects of various display characteristics such as lighting, glare, display quality, refresh rates, and radiation are also discussed. Treatment requires a multidirectional approach combining ocular therapy with adjustment of the workstation. Proper lighting, anti-glare filters, ergonomic positioning of the computer monitor and regular work breaks may help improve visual comfort. Lubricating eye drops and special computer glasses help relieve ocular surface-related symptoms. More work needs to be done to specifically define the processes that cause computer vision syndrome and to develop and improve effective treatments that successfully address these causes.

  2. SU-C-209-06: Improving X-Ray Imaging with Computer Vision and Augmented Reality

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    MacDougall, R.D.; Scherrer, B; Don, S

    Purpose: To determine the feasibility of using a computer vision algorithm and augmented reality interface to reduce repeat rates and improve consistency of image quality and patient exposure in general radiography. Methods: A prototype device, designed for use with commercially available hardware (Microsoft Kinect 2.0) capable of depth sensing and high resolution/frame rate video, was mounted to the x-ray tube housing as part of a Philips DigitalDiagnost digital radiography room. Depth data and video were streamed to a Windows 10 PC. Proprietary software created an augmented reality interface where overlays displayed selectable information projected over real-time video of the patient. The information displayed prior to and during x-ray acquisition included: recognition and position of the ordered body part, position of the image receptor, thickness of anatomy, location of AEC cells, collimated x-ray field, degree of patient motion and suggested x-ray technique. Pre-clinical data were collected in a volunteer study to validate patient thickness measurements; x-ray images were not acquired. Results: The proprietary software correctly identified the ordered body part, measured patient motion, and calculated thickness of anatomy. Pre-clinical data demonstrated accuracy and precision of body part thickness measurement when compared with other methods (e.g. laser measurement tool). Thickness measurements provided the basis for developing a database of thickness-based technique charts that can be automatically displayed to the technologist. Conclusion: The utilization of computer vision and commercial hardware to create an augmented reality view of the patient and imaging equipment has the potential to drastically improve the quality and safety of x-ray imaging by reducing repeats and optimizing technique based on patient thickness. Society of Pediatric Radiology Pilot Grant; Washington University Bear Cub Fund.

  3. Quaternions in computer vision and robotics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pervin, E.; Webb, J.A.

    1982-01-01

    Computer vision and robotics suffer from not having good tools for manipulating three-dimensional objects. Vectors, coordinate geometry, and trigonometry all have deficiencies. Quaternions can be used to solve many of these problems. Many properties of quaternions that are relevant to computer vision and robotics are developed. Examples are given showing how quaternions can be used to simplify derivations in computer vision and robotics.
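
    As a concrete illustration of the machinery the paper advocates, the sketch below implements the textbook Hamilton product and the rotation v' = q v q*; these are generic formulas, not code from the paper.

        import numpy as np

        def quat_mult(q, r):
            """Hamilton product of quaternions given as (w, x, y, z)."""
            w1, x1, y1, z1 = q
            w2, x2, y2, z2 = r
            return np.array([
                w1*w2 - x1*x2 - y1*y2 - z1*z2,
                w1*x2 + x1*w2 + y1*z2 - z1*y2,
                w1*y2 - x1*z2 + y1*w2 + z1*x2,
                w1*z2 + x1*y2 - y1*x2 + z1*w2,
            ])

        def rotate(v, axis, angle):
            """Rotate 3-vector v by `angle` radians about unit `axis`: v' = q v q*."""
            axis = np.asarray(axis) / np.linalg.norm(axis)
            q = np.concatenate([[np.cos(angle / 2)], np.sin(angle / 2) * axis])
            q_conj = q * np.array([1.0, -1.0, -1.0, -1.0])
            p = np.concatenate([[0.0], v])             # embed v as a pure quaternion
            return quat_mult(quat_mult(q, p), q_conj)[1:]

        # rotating (1,0,0) by 90 degrees about z gives approximately (0,1,0)
        print(rotate(np.array([1.0, 0.0, 0.0]), [0, 0, 1], np.pi / 2))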

  4. Compensation for Blur Requires Increase in Field of View and Viewing Time

    PubMed Central

    Kwon, MiYoung; Liu, Rong; Chien, Lillian

    2016-01-01

    Spatial resolution is an important factor for human pattern recognition. In particular, low resolution (blur) is a defining characteristic of low vision. Here, we examined spatial (field of view) and temporal (stimulus duration) requirements for blurry object recognition. The spatial resolution of an image, such as a letter or face, was manipulated with a low-pass filter. In experiment 1, studying the spatial requirement, observers viewed a fixed-size object through a window of varying sizes, which was repositioned until object identification (moving window paradigm). The field of view requirement, quantified as the number of “views” (window repositions) for correct recognition, was obtained for three blur levels, including no blur. In experiment 2, studying the temporal requirement, we determined threshold viewing time, the stimulus duration yielding criterion recognition accuracy, at six blur levels, including no blur. For letter and face recognition, we found blur significantly increased the number of views, suggesting a larger field of view is required to recognize blurry objects. We also found blur significantly increased threshold viewing time, suggesting longer temporal integration is necessary to recognize blurry objects. The temporal integration reflects the tradeoff between stimulus intensity and time. While humans excel at recognizing blurry objects, our findings suggest compensating for blur requires increased field of view and viewing time. The need for larger spatial and longer temporal integration for recognizing blurry objects may further challenge object recognition in low vision. Thus, interactions between blur and field of view should be considered for developing low vision rehabilitation or assistive aids. PMID:27622710
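
    The low-pass manipulation can be approximated with Gaussian filtering; the sketch below generates several blur levels from assumed cutoff frequencies (the study's exact filter parameters are not reproduced here).

        import numpy as np
        from scipy import ndimage

        def blur_levels(img, cutoffs=(8, 4, 2)):
            """Approximate low-pass filtering at several strengths.
            `cutoffs` (cycles per image, assumed values) set how much
            high spatial frequency content survives."""
            n = max(img.shape)
            out = {}
            for c in cutoffs:
                # heuristic: a Gaussian with sigma ~ n/(2*pi*c) strongly
                # attenuates frequencies above roughly c cycles per image
                sigma = n / (2 * np.pi * c)
                out[c] = ndimage.gaussian_filter(img.astype(float), sigma)
            return out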

  5. Color defective vision and the recognition of aviation color signal light flashes.

    DOT National Transportation Integrated Search

    1971-06-01

    A previous study reported on the efficiency with which various tests of color defective vision can predict performance during daylight conditions on a practical test of ability to discriminate aviation signal red, white, and green. In the current stu...

  6. A vision of the future for BMC Medicine: serving science, medicine and authors.

    PubMed

    Cassady-Cain, Robin L; Appleford, Joanne M; Patel, Jigisha; Aulakh, Mick; Norton, Melissa L

    2009-10-07

    In June 2009, BMC Medicine received its first official impact factor of 3.28 from Thomson Reuters. In recognition of this landmark event, the BMC Medicine editorial team present and discuss the vision and aims of the journal.

  7. Benchmarking neuromorphic vision: lessons learnt from computer vision

    PubMed Central

    Tan, Cheston; Lallee, Stephane; Orchard, Garrick

    2015-01-01

    Neuromorphic Vision sensors have improved greatly since the first silicon retina was presented almost three decades ago. They have recently matured to the point where they are commercially available and can be operated by laymen. However, despite improved availability of sensors, there remains a lack of good datasets, while algorithms for processing spike-based visual data are still in their infancy. On the other hand, frame-based computer vision algorithms are far more mature, thanks in part to widely accepted datasets which allow direct comparison between algorithms and encourage competition. We are presented with a unique opportunity to shape the development of Neuromorphic Vision benchmarks and challenges by leveraging what has been learnt from the use of datasets in frame-based computer vision. Taking advantage of this opportunity, in this paper we review the role that benchmarks and challenges have played in the advancement of frame-based computer vision, and suggest guidelines for the creation of Neuromorphic Vision benchmarks and challenges. We also discuss the unique challenges faced when benchmarking Neuromorphic Vision algorithms, particularly when attempting to provide direct comparison with frame-based computer vision. PMID:26528120

  8. Bio-inspired approach for intelligent unattended ground sensors

    NASA Astrophysics Data System (ADS)

    Hueber, Nicolas; Raymond, Pierre; Hennequin, Christophe; Pichler, Alexander; Perrot, Maxime; Voisin, Philippe; Moeglin, Jean-Pierre

    2015-05-01

    Improving the surveillance capacity over wide zones requires a set of smart battery-powered Unattended Ground Sensors capable of issuing an alarm to a decision-making center. Only high-level information has to be sent when a relevant suspicious situation occurs. In this paper we propose an innovative bio-inspired approach that mimics the human bi-modal vision mechanism and the parallel processing ability of the human brain. The designed prototype exploits two levels of analysis: a low-level panoramic motion analysis, the peripheral vision, and a high-level event-focused analysis, the foveal vision. By tracking moving objects and fusing multiple criteria (size, speed, trajectory, etc.), the peripheral vision module acts as a fast relevant-event detector. The foveal vision module focuses on the detected events to extract more detailed features (texture, color, shape, etc.) in order to improve the recognition efficiency. The implemented recognition core is able to acquire human knowledge and to classify in real time a huge amount of heterogeneous data thanks to its natively parallel hardware structure. This UGS prototype validates our system approach under laboratory tests. The peripheral analysis module demonstrates a low false alarm rate, whereas the foveal vision correctly focuses on the detected events. A parallel FPGA implementation of the recognition core succeeds in fulfilling the embedded application requirements. These results pave the way for future reconfigurable virtual field agents. By locally processing the data and sending only high-level information, their energy requirements and electromagnetic signature are optimized. Moreover, the embedded Artificial Intelligence core enables these bio-inspired systems to recognize and learn new significant events. By duplicating human expertise in potentially hazardous places, our miniature visual event detector will allow early warning and contribute to better human decision making.

  9. Automatic decoding of facial movements reveals deceptive pain expressions

    PubMed Central

    Bartlett, Marian Stewart; Littlewort, Gwen C.; Frank, Mark G.; Lee, Kang

    2014-01-01

    In highly social species such as humans, faces have evolved to convey rich information for social interaction, including expressions of emotions and pain [1–3]. Two motor pathways control facial movement [4–7]: a subcortical extrapyramidal motor system drives spontaneous facial expressions of felt emotions, while a cortical pyramidal motor system controls voluntary facial expressions. The pyramidal system enables humans to simulate facial expressions of emotions not actually experienced. Their simulation is so successful that they can deceive most observers [8–11]. Machine vision may, however, be able to distinguish deceptive from genuine facial signals by identifying the subtle differences between pyramidally and extrapyramidally driven movements. Here we show that human observers could not discriminate real from faked expressions of pain better than chance, and that even after training they improved accuracy only to a modest 55%. However, a computer vision system that automatically measures facial movements and performs pattern recognition on those movements attained 85% accuracy. The machine system's superiority is attributable to its ability to differentiate the dynamics of genuine from faked expressions. Thus, by revealing the dynamics of facial action through machine vision systems, our approach has the potential to elucidate behavioral fingerprints of neural control systems involved in emotional signaling. PMID:24656830

  10. Convolutional networks for fast, energy-efficient neuromorphic computing

    PubMed Central

    Esser, Steven K.; Merolla, Paul A.; Arthur, John V.; Cassidy, Andrew S.; Appuswamy, Rathinakumar; Andreopoulos, Alexander; Berg, David J.; McKinstry, Jeffrey L.; Melano, Timothy; Barch, Davis R.; di Nolfo, Carmelo; Datta, Pallab; Amir, Arnon; Taba, Brian; Flickner, Myron D.; Modha, Dharmendra S.

    2016-01-01

    Deep networks are now able to achieve human-level performance on a broad spectrum of recognition tasks. Independently, neuromorphic computing has now demonstrated unprecedented energy-efficiency through a new chip architecture based on spiking neurons, low precision synapses, and a scalable communication network. Here, we demonstrate that neuromorphic computing, despite its novel architectural primitives, can implement deep convolution networks that (i) approach state-of-the-art classification accuracy across eight standard datasets encompassing vision and speech, (ii) perform inference while preserving the hardware’s underlying energy-efficiency and high throughput, running on the aforementioned datasets at between 1,200 and 2,600 frames/s and using between 25 and 275 mW (effectively >6,000 frames/s per Watt), and (iii) can be specified and trained using backpropagation with the same ease-of-use as contemporary deep learning. This approach allows the algorithmic power of deep learning to be merged with the efficiency of neuromorphic processors, bringing the promise of embedded, intelligent, brain-inspired computing one step closer. PMID:27651489

  11. Convolutional networks for fast, energy-efficient neuromorphic computing.

    PubMed

    Esser, Steven K; Merolla, Paul A; Arthur, John V; Cassidy, Andrew S; Appuswamy, Rathinakumar; Andreopoulos, Alexander; Berg, David J; McKinstry, Jeffrey L; Melano, Timothy; Barch, Davis R; di Nolfo, Carmelo; Datta, Pallab; Amir, Arnon; Taba, Brian; Flickner, Myron D; Modha, Dharmendra S

    2016-10-11

    Deep networks are now able to achieve human-level performance on a broad spectrum of recognition tasks. Independently, neuromorphic computing has now demonstrated unprecedented energy-efficiency through a new chip architecture based on spiking neurons, low precision synapses, and a scalable communication network. Here, we demonstrate that neuromorphic computing, despite its novel architectural primitives, can implement deep convolution networks that (i) approach state-of-the-art classification accuracy across eight standard datasets encompassing vision and speech, (ii) perform inference while preserving the hardware's underlying energy-efficiency and high throughput, running on the aforementioned datasets at between 1,200 and 2,600 frames/s and using between 25 and 275 mW (effectively >6,000 frames/s per Watt), and (iii) can be specified and trained using backpropagation with the same ease-of-use as contemporary deep learning. This approach allows the algorithmic power of deep learning to be merged with the efficiency of neuromorphic processors, bringing the promise of embedded, intelligent, brain-inspired computing one step closer.

  12. Deep learning in the small sample size setting: cascaded feed forward neural networks for medical image segmentation

    NASA Astrophysics Data System (ADS)

    Gaonkar, Bilwaj; Hovda, David; Martin, Neil; Macyszyn, Luke

    2016-03-01

    Deep Learning refers to a large set of neural-network-based algorithms that have emerged as promising machine-learning tools in the general imaging and computer vision domains. Convolutional neural networks (CNNs), a specific class of deep learning algorithms, have been extremely effective in object recognition and localization in natural images. A characteristic feature of CNNs is the use of a locally connected multi-layer topology inspired by the animal visual cortex (the most powerful vision system in existence). While CNNs perform admirably in object identification and localization tasks, they typically require training on extremely large datasets. Unfortunately, in medical image analysis, large datasets are either unavailable or extremely expensive to obtain. Further, the primary tasks in medical imaging are organ identification and segmentation from 3D scans, which differ from the standard computer vision tasks of object recognition. Thus, in order to translate the advantages of deep learning to medical image analysis, there is a need to develop deep network topologies and training methodologies that are geared towards medical imaging tasks and can work in settings where dataset sizes are relatively small. In this paper, we present a technique for stacked supervised training of deep feed-forward neural networks for segmenting organs from medical scans. Each `neural network layer' in the stack is trained to identify a subregion of the original image that contains the organ of interest. By layering several such stacks together, a very deep neural network is constructed. Such a network can be used to identify extremely small regions of interest in extremely large images, in spite of a lack of clear contrast in the signal or easily identifiable shape characteristics. What is even more intriguing is that the network stack achieves accurate segmentation even when it is trained on a single image with manually labelled ground truth. We validate this approach using a publicly available head and neck CT dataset. We also show that a deep neural network of similar depth, if trained directly using backpropagation, cannot achieve the tasks achieved using our layer-wise training paradigm.

  13. Chinese Herbal Medicine Image Recognition and Retrieval by Convolutional Neural Network

    PubMed Central

    Sun, Xin; Qian, Huinan

    2016-01-01

    Chinese herbal medicine image recognition and retrieval have great potential for practical applications. Several previous studies have focused on recognition with hand-crafted image features, but these have two limitations. Firstly, most hand-crafted features are low-level image representations, which are easily affected by noise and background. Secondly, the medicine images in previous datasets are very clean, without any backgrounds, which makes the resulting methods difficult to use in practical applications. Therefore, designing high-level image representations for recognition and retrieval in real-world medicine images is a great challenge. Inspired by the recent progress of deep learning in computer vision, we realize that deep learning methods may provide robust medicine image representations. In this paper, we propose to use the Convolutional Neural Network (CNN) for Chinese herbal medicine image recognition and retrieval. For the recognition problem, we use the softmax loss to optimize the recognition network; for the retrieval problem, we then fine-tune the recognition network by adding a triplet loss to search for the most similar medicine images. To evaluate our method, we construct a public database of herbal medicine images with cluttered backgrounds, which has in total 5523 images in 95 popular Chinese medicine categories. Experimental results show that our method achieves an average recognition precision of 71% and an average retrieval precision of 53% over all 95 medicine categories, which is quite promising given that the real-world images contain multiple occluded herbal pieces and cluttered backgrounds. Besides, our proposed method achieves state-of-the-art performance, improving on previous studies by a large margin. PMID:27258404
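
    The two-loss recipe described above, softmax training for recognition followed by triplet-based fine-tuning for retrieval, can be sketched in PyTorch as follows; the tiny backbone and hyperparameters are placeholders, not the paper's network.

        import torch
        import torch.nn as nn

        backbone = nn.Sequential(                 # stand-in for the trained recognition CNN
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 64),
        )
        triplet = nn.TripletMarginLoss(margin=0.2)
        opt = torch.optim.SGD(backbone.parameters(), lr=1e-3)

        def retrieval_step(anchor, positive, negative):
            """One fine-tuning step on a triplet: anchor and positive share a
            medicine category, negative comes from a different one."""
            za, zp, zn = backbone(anchor), backbone(positive), backbone(negative)
            loss = triplet(za, zp, zn)            # pull positives in, push negatives out
            opt.zero_grad(); loss.backward(); opt.step()
            return loss.item()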

  14. A comparison of algorithms for inference and learning in probabilistic graphical models.

    PubMed

    Frey, Brendan J; Jojic, Nebojsa

    2005-09-01

    Research into methods for reasoning under uncertainty is currently one of the most exciting areas of artificial intelligence, largely because it has recently become possible to record, store, and process large amounts of data. While impressive achievements have been made in pattern classification problems such as handwritten character recognition, face detection, speaker identification, and prediction of gene function, it is even more exciting that researchers are on the verge of introducing systems that can perform large-scale combinatorial analyses of data, decomposing the data into interacting components. For example, computational methods for automatic scene analysis are now emerging in the computer vision community. These methods decompose an input image into its constituent objects, lighting conditions, motion patterns, etc. Two of the main challenges are finding effective representations and models in specific applications and finding efficient algorithms for inference and learning in these models. In this paper, we advocate the use of graph-based probability models and their associated inference and learning algorithms. We review exact techniques and various approximate, computationally efficient techniques, including iterated conditional modes, the expectation maximization (EM) algorithm, Gibbs sampling, the mean field method, variational techniques, structured variational techniques and the sum-product algorithm ("loopy" belief propagation). We describe how each technique can be applied in a vision model of multiple, occluding objects and contrast the behaviors and performances of the techniques using a unifying cost function, free energy.
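
    As a concrete instance of one reviewed technique, the sketch below runs the EM algorithm on a one-dimensional Gaussian mixture; it is a minimal illustration, not the multi-object vision model discussed in the paper.

        import numpy as np

        def em_gmm_1d(x, k=2, iters=50):
            """Minimal EM for a 1-D Gaussian mixture, returning the means,
            variances and mixing weights."""
            rng = np.random.default_rng(0)
            mu = rng.choice(x, k)
            var = np.full(k, x.var())
            pi = np.full(k, 1.0 / k)
            for _ in range(iters):
                # E-step: responsibilities r[n, j] proportional to pi_j N(x_n | mu_j, var_j)
                r = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(var)
                r /= r.sum(axis=1, keepdims=True)
                # M-step: re-estimate parameters from responsibility-weighted data
                nk = r.sum(axis=0)
                mu = (r * x[:, None]).sum(axis=0) / nk
                var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
                pi = nk / len(x)
            return mu, var, pi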

  15. Automated Field-of-View, Illumination, and Recognition Algorithm Design of a Vision System for Pick-and-Place Considering Colour Information in Illumination and Images

    PubMed Central

    Chen, Yibing; Ogata, Taiki; Ueyama, Tsuyoshi; Takada, Toshiyuki; Ota, Jun

    2018-01-01

    Machine vision is playing an increasingly important role in industrial applications, and the automated design of image recognition systems has been a subject of intense research. This study has proposed a system for automatically designing the field-of-view (FOV) of a camera, the illumination strength and the parameters in a recognition algorithm. We formulated the design problem as an optimisation problem and used an experiment based on a hierarchical algorithm to solve it. The evaluation experiments using translucent plastics objects showed that the use of the proposed system resulted in an effective solution with a wide FOV, recognition of all objects and 0.32 mm and 0.4° maximal positional and angular errors when all the RGB (red, green and blue) for illumination and R channel image for recognition were used. Though all the RGB illumination and grey scale images also provided recognition of all the objects, only a narrow FOV was selected. Moreover, full recognition was not achieved by using only G illumination and a grey-scale image. The results showed that the proposed method can automatically design the FOV, illumination and parameters in the recognition algorithm and that tuning all the RGB illumination is desirable even when single-channel or grey-scale images are used for recognition. PMID:29786665

  16. Automated Field-of-View, Illumination, and Recognition Algorithm Design of a Vision System for Pick-and-Place Considering Colour Information in Illumination and Images.

    PubMed

    Chen, Yibing; Ogata, Taiki; Ueyama, Tsuyoshi; Takada, Toshiyuki; Ota, Jun

    2018-05-22

    Machine vision is playing an increasingly important role in industrial applications, and the automated design of image recognition systems has been a subject of intense research. This study has proposed a system for automatically designing the field-of-view (FOV) of a camera, the illumination strength and the parameters in a recognition algorithm. We formulated the design problem as an optimisation problem and used an experiment based on a hierarchical algorithm to solve it. The evaluation experiments using translucent plastics objects showed that the use of the proposed system resulted in an effective solution with a wide FOV, recognition of all objects and 0.32 mm and 0.4° maximal positional and angular errors when all the RGB (red, green and blue) for illumination and R channel image for recognition were used. Though all the RGB illumination and grey scale images also provided recognition of all the objects, only a narrow FOV was selected. Moreover, full recognition was not achieved by using only G illumination and a grey-scale image. The results showed that the proposed method can automatically design the FOV, illumination and parameters in the recognition algorithm and that tuning all the RGB illumination is desirable even when single-channel or grey-scale images are used for recognition.

  17. Color defective vision and day and night recognition of aviation color signal light flashes.

    DOT National Transportation Integrated Search

    1971-07-01

    A previous study reported on the efficiency with which various tests of color defective vision can predict performance during daylight conditions on a practical test of ability to discriminate aviation signal red, white, and green. In the current stu...

  18. A vision of the future for BMC Medicine: serving science, medicine and authors

    PubMed Central

    Cassady-Cain, Robin L; Appleford, Joanne M; Patel, Jigisha; Aulakh, Mick; Norton, Melissa L

    2009-01-01

    In June 2009, BMC Medicine received its first official impact factor of 3.28 from Thomson Reuters. In recognition of this landmark event, the BMC Medicine editorial team present and discuss the vision and aims of the journal. PMID:19811626

  19. A comparative study of deep learning models for medical image classification

    NASA Astrophysics Data System (ADS)

    Dutta, Suvajit; Manideep, B. C. S.; Rai, Shalva; Vijayarajan, V.

    2017-11-01

    Deep Learning (DL) techniques are overtaking the prevailing traditional neural network approaches when it comes to huge datasets and applications requiring complex functions, demanding increased accuracy with lower time complexity. Neuroscience has already exploited DL techniques and has thus portrayed itself as an inspirational source for researchers exploring the domain of machine learning. DL enthusiasts cover the areas of vision, speech recognition, motion planning and NLP as well, moving back and forth among fields. This concerns building models that can successfully solve a variety of tasks requiring intelligence and distributed representation. The accessibility of faster CPUs, the introduction of GPUs performing complex vector and matrix computations, agile network connectivity, and enhanced software infrastructures for distributed computing all strengthened the case for DL methodologies. This paper compares DL procedures with traditional approaches, which are performed manually, for classifying medical images. The medical images used for the study are Diabetic Retinopathy (DR) and computed tomography (CT) emphysema data; diagnosis from both DR and CT data is a difficult task for standard image classification methods. Initial work was carried out with basic image processing along with K-means clustering for the identification of image severity levels. After determining image severity levels, an ANN was applied to the data to obtain a baseline classification result, which was then compared with the results of DNNs (Deep Neural Networks). DNNs performed more efficiently because their multiple hidden layers increase accuracy, but the vanishing-gradient problem in DNNs motivated considering Convolutional Neural Networks (CNNs) as well for better results. CNNs were found to provide better outcomes than the other learning models aimed at image classification; they are favoured as they provide better visual processing models and successfully classify noisy data as well. The work centres on the detection of Diabetic Retinopathy (loss of vision) and the recognition of computed tomography (CT) emphysema data, measuring the severity levels in both cases. The paper explores how various machine learning algorithms can be implemented following a supervised approach, so as to obtain accurate results with the least complexity possible.

  20. Do we understand high-level vision?

    PubMed

    Cox, David Daniel

    2014-04-01

    'High-level' vision lacks a single, agreed-upon definition, but it might usefully be defined as those stages of visual processing that transition from analyzing local image structure to analyzing the structure of the external world that produced those images. Much work in the last several decades has focused on object recognition as a framing problem for the study of high-level visual cortex, and much progress has been made in this direction. This approach presumes that the operational goal of the visual system is to read out the identity of an object (or objects) in a scene, in spite of variation in position, size, lighting and the presence of other nearby objects. However, while object recognition as an operational framing of high-level vision is intuitively appealing, it is by no means the only task that visual cortex might perform, and the study of object recognition is beset by challenges in building stimulus sets that adequately sample the infinite space of possible stimuli. Here I review the successes and limitations of this work, and ask whether we should reframe our approaches to understanding high-level vision. Copyright © 2014. Published by Elsevier Ltd.

  1. (Computer) Vision without Sight

    PubMed Central

    Manduchi, Roberto; Coughlan, James

    2012-01-01

    Computer vision holds great promise for helping persons with blindness or visual impairments (VI) to interpret and explore the visual world. To this end, it is worthwhile to assess the situation critically by understanding the actual needs of the VI population and which of these needs might be addressed by computer vision. This article reviews the types of assistive technology application areas that have already been developed for VI, and the possible roles that computer vision can play in facilitating these applications. We discuss how appropriate user interfaces are designed to translate the output of computer vision algorithms into information that the user can quickly and safely act upon, and how system-level characteristics affect the overall usability of an assistive technology. Finally, we conclude by highlighting a few novel and intriguing areas of application of computer vision to assistive technology. PMID:22815563

  2. Microscope self-calibration based on micro laser line imaging and soft computing algorithms

    NASA Astrophysics Data System (ADS)

    Apolinar Muñoz Rodríguez, J.

    2018-06-01

    A technique to perform microscope self-calibration via micro laser line and soft computing algorithms is presented. In this technique, the microscope vision parameters are computed by means of soft computing algorithms based on laser line projection. To implement the self-calibration, a microscope vision system is constructed by means of a CCD camera and a 38 μm laser line. From this arrangement, the microscope vision parameters are represented via Bezier approximation networks, which are accomplished through the laser line position. In this procedure, a genetic algorithm determines the microscope vision parameters by means of laser line imaging. Also, the approximation networks compute the three-dimensional vision by means of the laser line position. Additionally, the soft computing algorithms re-calibrate the vision parameters when the microscope vision system is modified during the vision task. The proposed self-calibration improves accuracy of the traditional microscope calibration, which is accomplished via external references to the microscope system. The capability of the self-calibration based on soft computing algorithms is determined by means of the calibration accuracy and the micro-scale measurement error. This contribution is corroborated by an evaluation based on the accuracy of the traditional microscope calibration.

  3. Learning graph matching.

    PubMed

    Caetano, Tibério S; McAuley, Julian J; Cheng, Li; Le, Quoc V; Smola, Alex J

    2009-06-01

    As a fundamental problem in pattern recognition, graph matching has applications in a variety of fields, from computer vision to computational biology. In graph matching, patterns are modeled as graphs and pattern recognition amounts to finding a correspondence between the nodes of different graphs. Many formulations of this problem can be cast in general as a quadratic assignment problem, where a linear term in the objective function encodes node compatibility and a quadratic term encodes edge compatibility. The main research focus in this theme is about designing efficient algorithms for approximately solving the quadratic assignment problem, since it is NP-hard. In this paper we turn our attention to a different question: how to estimate compatibility functions such that the solution of the resulting graph matching problem best matches the expected solution that a human would manually provide. We present a method for learning graph matching: the training examples are pairs of graphs and the 'labels' are matches between them. Our experimental results reveal that learning can substantially improve the performance of standard graph matching algorithms. In particular, we find that simple linear assignment with such a learning scheme outperforms Graduated Assignment with bistochastic normalisation, a state-of-the-art quadratic assignment relaxation algorithm.
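
    In standard textbook notation (not necessarily the authors' symbols), the quadratic assignment formulation referred to above reads:

        % Graph matching as a quadratic assignment problem: y is a 0/1
        % node-assignment matrix between the two graphs, c encodes node
        % compatibility and d encodes edge compatibility; the paper's
        % learning problem is to estimate c and d from matched pairs.
        \begin{aligned}
          y^{*} = \operatorname*{arg\,max}_{y}\;
            & \sum_{i,a} c_{ia}\, y_{ia}
              \;+\; \sum_{i,j,a,b} d_{ijab}\, y_{ia}\, y_{jb} \\
          \text{subject to } & \sum_{a} y_{ia} \le 1, \quad
                               \sum_{i} y_{ia} \le 1, \quad
                               y_{ia} \in \{0,1\}.
        \end{aligned}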

  4. Age and visual impairment decrease driving performance as measured on a closed-road circuit.

    PubMed

    Wood, Joanne M

    2002-01-01

    In this study the effects of visual impairment and age on driving were investigated and related to visual function. Participants were 139 licensed drivers (young, middle-aged, and older participants with normal vision, and older participants with ocular disease). Driving performance was assessed during the daytime on a closed-road driving circuit. Visual performance was assessed using a vision testing battery. Age and visual impairment had a significant detrimental effect on recognition tasks (detection and recognition of signs and hazards), time to complete driving tasks (overall course time, reversing, and maneuvering), maneuvering ability, divided attention, and an overall driving performance index. All vision measures were significantly affected by group membership. A combination of motion sensitivity, useful field of view (UFOV), Pelli-Robson letter contrast sensitivity, and dynamic acuity could predict 50% of the variance in overall driving scores. These results indicate that older drivers with either normal vision or visual impairment had poorer driving performance compared with younger or middle-aged drivers with normal vision. The inclusion of tests such as motion sensitivity and the UFOV significantly improve the predictive power of vision tests for driving performance. Although such measures may not be practical for widespread screening, their application in selected cases should be considered.

  5. Proceedings of the Third International Workshop on Neural Networks and Fuzzy Logic, volume 1

    NASA Technical Reports Server (NTRS)

    Culbert, Christopher J. (Editor)

    1993-01-01

    Documented here are papers presented at the Neural Networks and Fuzzy Logic Workshop sponsored by the National Aeronautics and Space Administration and cosponsored by the University of Houston, Clear Lake. The workshop was held June 1-3, 1992 at the Lyndon B. Johnson Space Center in Houston, Texas. During the three days approximately 50 papers were presented. Technical topics addressed included adaptive systems; learning algorithms; network architectures; vision; robotics; neurobiological connections; speech recognition and synthesis; fuzzy set theory and application, control, and dynamics processing; space applications; fuzzy logic and neural network computers; approximate reasoning; and multiobject decision making.

  6. The Application of Leap Motion in Astronaut Virtual Training

    NASA Astrophysics Data System (ADS)

    Qingchao, Xie; Jiangang, Chao

    2017-03-01

    With the development of computer vision, virtual reality has been applied to astronaut virtual training. As an advanced optical device for hand tracking, Leap Motion provides precise and fluid tracking of the hands, making it suitable as a gesture input device in astronaut virtual training. This paper builds an astronaut virtual training system based on Leap Motion and establishes a mathematical model of hand occlusion. Finally, the ability of Leap Motion to handle occlusion is analysed. A virtual assembly simulation platform was developed for astronaut training, in which occluded gestures influence the recognition process. The experimental results can guide astronaut virtual training.

  7. Application of DBNs for concerned internet information detecting

    NASA Astrophysics Data System (ADS)

    Wang, Yanfang; Gao, Song

    2017-03-01

    In recent years, deep learning has achieved great success in many fields, ranging from voice recognition and image classification to computer vision. In this study we apply DBNs to the problem of detecting internet information of concern in Chinese, since there are inherent differences between English and Chinese. Contrastive divergence (CD) is employed in the DBNs to learn a multi-layer generative model from numerous unlabeled data. The features obtained by this model are used to initialize a feed-forward neural network, which can then be fine-tuned with backpropagation. Experimental results indicate that the proposed model and training method can detect the concerned internet information effectively and accurately.
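
    For reference, a single contrastive-divergence (CD-1) update for one binary RBM layer of a DBN looks roughly like the following numpy sketch; this is the standard textbook update, not the authors' code.

        import numpy as np

        rng = np.random.default_rng(0)
        sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

        def cd1_step(v0, W, b, c, lr=0.01):
            """One CD-1 update for a binary RBM.
            v0: (n, d) batch of visible vectors; W: (d, h); b, c: biases."""
            ph0 = sigmoid(v0 @ W + c)                    # P(h=1 | v0)
            h0 = (rng.random(ph0.shape) < ph0).astype(float)
            pv1 = sigmoid(h0 @ W.T + b)                  # one Gibbs step down
            ph1 = sigmoid(pv1 @ W + c)                   # and back up
            W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)
            b += lr * (v0 - pv1).mean(axis=0)
            c += lr * (ph0 - ph1).mean(axis=0)
            return W, b, c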

  8. Face Detection and Modeling for Recognition

    DTIC Science & Technology

    2002-01-01

    ...registered range and color images. Figure 1.12: system diagram. ...with and without the transform are shown; for each example, the images shown in the first column are skin regions. Cited references include <http://.../software/products/perflib/ipl/index.htm> and [187] Intel Open Source Computer Vision Library, <http://developer.intel.com/software/opensource/cvfl/opencv>.

  9. Automatic Estimation of Volcanic Ash Plume Height using WorldView-2 Imagery

    NASA Technical Reports Server (NTRS)

    McLaren, David; Thompson, David R.; Davies, Ashley G.; Gudmundsson, Magnus T.; Chien, Steve

    2012-01-01

    We explore the use of machine learning, computer vision, and pattern recognition techniques to automatically identify volcanic ash plumes and plume shadows in WorldView-2 imagery. Using the classification together with information on the relative positions of the sun and spacecraft and terrain information in the form of a digital elevation map, the height of the ash plume can also be inferred. We present the results of applying this approach to six scenes acquired on two separate days in April and May of 2010 of the Eyjafjallajokull eruption in Iceland. These results show rough agreement with ash plume height estimates from visual and radar-based measurements.

  10. Robust Feature Matching in Terrestrial Image Sequences

    NASA Astrophysics Data System (ADS)

    Abbas, A.; Ghuffar, S.

    2018-04-01

    Over the last decade, feature detection, description and matching techniques have been widely exploited in various photogrammetric and computer vision applications, including 3D reconstruction of scenes, image stitching for panorama creation, image classification, and object recognition. However, terrestrial imagery of urban scenes contains various issues, including duplicate and identical structures (i.e. repeated windows and doors), that cause problems in the feature matching phase and ultimately lead to failures, especially in camera pose and scene structure estimation. In this paper, we address the issue of ambiguous feature matching in urban environments due to repeating patterns.
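
    A common first line of defense against such ambiguity is Lowe's ratio test, which rejects a putative match when the best and second-best candidates are nearly equidistant, as happens on repeated facades. A minimal OpenCV sketch follows; the function name and threshold are illustrative, not from the paper.

        import cv2

        def match_ratio_test(img1, img2, ratio=0.75):
            """SIFT matching with Lowe's ratio test to discard the ambiguous
            matches that repeated structures tend to produce."""
            sift = cv2.SIFT_create()
            k1, d1 = sift.detectAndCompute(img1, None)
            k2, d2 = sift.detectAndCompute(img2, None)
            knn = cv2.BFMatcher().knnMatch(d1, d2, k=2)
            # keep a match only if it is clearly better than the runner-up;
            # on repeated windows/doors the two distances are similar, so it is dropped
            good = [p[0] for p in knn
                    if len(p) == 2 and p[0].distance < ratio * p[1].distance]
            return k1, k2, good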

  11. On the performances of computer vision algorithms on mobile platforms

    NASA Astrophysics Data System (ADS)

    Battiato, S.; Farinella, G. M.; Messina, E.; Puglisi, G.; Ravì, D.; Capra, A.; Tomaselli, V.

    2012-01-01

    Computer Vision enables mobile devices to extract the meaning of the observed scene from the information acquired with the onboard sensor cameras. Nowadays, there is a growing interest in Computer Vision algorithms able to work on mobile platforms (e.g., phone cameras, point-and-shoot cameras, etc.). Indeed, bringing Computer Vision capabilities to mobile devices opens new opportunities in different application contexts. The implementation of vision algorithms on mobile devices is still a challenging task, since these devices have poor image sensors and optics as well as limited processing power. In this paper we consider different algorithms covering classic Computer Vision tasks: keypoint extraction, face detection, and image segmentation. Several tests have been done to compare the performances of the involved mobile platforms: Nokia N900, LG Optimus One, Samsung Galaxy SII.

  12. The development of newborn object recognition in fast and slow visual worlds

    PubMed Central

    Wood, Justin N.; Wood, Samantha M. W.

    2016-01-01

    Object recognition is central to perception and cognition. Yet relatively little is known about the environmental factors that cause invariant object recognition to emerge in the newborn brain. Is this ability a hardwired property of vision? Or does the development of invariant object recognition require experience with a particular kind of visual environment? Here, we used a high-throughput controlled-rearing method to examine whether newborn chicks (Gallus gallus) require visual experience with slowly changing objects to develop invariant object recognition abilities. When newborn chicks were raised with a slowly rotating virtual object, the chicks built invariant object representations that generalized across novel viewpoints and rotation speeds. In contrast, when newborn chicks were raised with a virtual object that rotated more quickly, the chicks built viewpoint-specific object representations that failed to generalize to novel viewpoints and rotation speeds. Moreover, there was a direct relationship between the speed of the object and the amount of invariance in the chick's object representation. Thus, visual experience with slowly changing objects plays a critical role in the development of invariant object recognition. These results indicate that invariant object recognition is not a hardwired property of vision, but is learned rapidly when newborns encounter a slowly changing visual world. PMID:27097925

  13. Neural Network Target Identification System for False Alarm Reduction

    NASA Technical Reports Server (NTRS)

    Ye, David; Edens, Weston; Lu, Thomas T.; Chao, Tien-Hsin

    2009-01-01

    A multi-stage automated target recognition (ATR) system has been designed to perform computer vision tasks with adequate proficiency in mimicking human vision. The system is able to detect, identify, and track targets of interest. Potential regions of interest (ROIs) are first identified by the detection stage using an Optimum Trade-off Maximum Average Correlation Height (OT-MACH) filter combined with a wavelet transform. False positives are then eliminated by the verification stage using feature extraction methods in conjunction with neural networks. Feature extraction transforms the ROIs using filtering and binning algorithms to create feature vectors. A feed-forward back-propagation neural network (NN) is then trained to classify each feature vector and remove false positives. This paper discusses tests of the system's performance and the parameter optimization process that adapts the system to various targets and datasets. The test results show that the system was successful in substantially reducing the false positive rate when tested on a sonar image dataset.
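
    The detection stage is built around correlation filtering. As a much-simplified stand-in for the OT-MACH filter (which is trained to optimize a trade-off criterion and is not reproduced here), the sketch below thresholds a normalized cross-correlation map to propose candidate ROIs.

        import numpy as np
        from scipy.signal import fftconvolve

        def correlation_detect(image, template, thresh=0.7):
            """Simplified correlation-based detection: cross-correlate with a
            template and report peaks above `thresh` as candidate ROI centres.
            A real system would add non-maximum suppression."""
            t = (template - template.mean()) / (template.std() + 1e-9)
            i = image - image.mean()
            corr = fftconvolve(i, t[::-1, ::-1], mode='same')  # flipped kernel = correlation
            corr /= np.abs(corr).max() + 1e-9                  # crude normalisation
            ys, xs = np.where(corr > thresh)
            return list(zip(ys.tolist(), xs.tolist()))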

  14. A novel parallel architecture for local histogram equalization

    NASA Astrophysics Data System (ADS)

    Ohannessian, Mesrob I.; Choueiter, Ghinwa F.; Diab, Hassan

    2005-07-01

    Local histogram equalization is an image enhancement algorithm that has found wide application in the pre-processing stage of areas such as computer vision, pattern recognition and medical imaging. The computationally intensive nature of the procedure, however, is a major limitation for real-time interactive applications. This work explores the possibility of performing parallel local histogram equalization using an array of special-purpose elementary processors, through an HDL implementation that targets FPGA or ASIC platforms. A novel parallelization scheme is presented and the corresponding architecture is derived. The algorithm is reduced to pixel-level operations. Processing elements are assigned image blocks to maintain a reasonable performance-cost ratio. To further simplify both the processor and memory organizations, a bit-serial access scheme is used. A brief performance assessment is provided to illustrate and quantify the merit of the approach.
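
    A serial reference implementation of the block-wise operation being parallelized may help: each tile below is equalized independently through its own cumulative histogram, which is exactly why the work can be distributed one block per processing element. The block size and 8-bit depth are assumptions, not the paper's parameters.

        import numpy as np

        def local_hist_eq(img, block=32):
            """Block-wise histogram equalisation of an 8-bit grayscale image.
            Blocks are independent, so they parallelise trivially."""
            out = np.empty_like(img)
            h, w = img.shape
            for y in range(0, h, block):
                for x in range(0, w, block):
                    tile = img[y:y + block, x:x + block]
                    hist = np.bincount(tile.ravel(), minlength=256)
                    cdf = hist.cumsum()
                    lut = np.round(255.0 * cdf / cdf[-1]).astype(np.uint8)
                    out[y:y + block, x:x + block] = lut[tile]   # remap via the tile's CDF
            return out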

  15. Deep learning for EEG-Based preference classification

    NASA Astrophysics Data System (ADS)

    Teo, Jason; Hou, Chew Lin; Mountstephens, James

    2017-10-01

    Electroencephalogram (EEG)-based emotion classification is rapidly becoming one of the most intensely studied areas of brain-computer interfacing (BCI). The ability to passively yet accurately correlate brainwaves with our immediate emotions opens up truly meaningful and previously unattainable human-computer interactions, such as in forensic neuroscience, rehabilitative medicine, affective entertainment and neuro-marketing. One particularly useful yet rarely explored area of EEG-based emotion classification is preference recognition [1], which is simply the detection of like versus dislike. Within the limited investigations into preference classification, all reported studies were based on musically induced stimuli except for a single study which used 2D images. The main objective of this study is to apply deep learning, which has been shown to produce state-of-the-art results in diverse hard problems such as computer vision, natural language processing and audio recognition, to 3D object preference classification over a larger group of test subjects. A cohort of 16 users was shown 60 bracelet-like objects as rotating visual stimuli on a computer display while their preferences and EEGs were recorded. After training a variety of machine learning approaches, which included deep neural networks, we then attempted to classify the users' preferences for the 3D visual stimuli based on their EEGs. Here, we show that deep learning outperforms a variety of other machine learning classifiers for this EEG-based preference classification task, particularly on a highly challenging dataset with large inter- and intra-subject variability.

  16. Optical character recognition reading aid for the visually impaired.

    PubMed

    Grandin, Juan Carlos; Cremaschi, Fabian; Lombardo, Elva; Vitu, Ed; Dujovny, Manuel

    2008-06-01

    An optical character recognition (OCR) reading machine is a significant help for visually impaired patients. An OCR reading machine was used in this work; this instrument can provide significant help in improving the quality of life of patients with low vision or blindness.

  17. Automatic recognition of 3D GGO CT imaging signs through the fusion of hybrid resampling and layer-wise fine-tuning CNNs.

    PubMed

    Han, Guanghui; Liu, Xiabi; Zheng, Guangyuan; Wang, Murong; Huang, Shan

    2018-06-06

    Ground-glass opacity (GGO) is a common CT imaging sign on high-resolution CT, which means the lesion is more likely to be malignant compared to common solid lung nodules. The automatic recognition of GGO CT imaging signs is of great importance for early diagnosis and possible cure of lung cancers. Present GGO recognition methods employ traditional low-level features, and system performance improves slowly. Considering the high performance of CNN models in the computer vision field, we propose an automatic recognition method for 3D GGO CT imaging signs through the fusion of hybrid resampling and layer-wise fine-tuning CNN models. Our hybrid resampling is performed on multiple views and multiple receptive fields, which reduces the risk of missing small or large GGOs by adopting representative sampling panels and processing GGOs at multiple scales simultaneously. The layer-wise fine-tuning strategy has the ability to obtain the optimal fine-tuned model. The multi-CNN-model fusion strategy obtains better performance than any single trained model. We evaluated our method on the GGO nodule samples in the publicly available LIDC-IDRI dataset of chest CT scans. The experimental results show that our method yields excellent results with 96.64% sensitivity, 71.43% specificity, and a 0.83 F1 score. Our method is a promising approach for applying deep learning to the computer-aided analysis of specific CT imaging signs with insufficient labeled images.

  18. Unification of automatic target tracking and automatic target recognition

    NASA Astrophysics Data System (ADS)

    Schachter, Bruce J.

    2014-06-01

    The subject being addressed is how an automatic target tracker (ATT) and an automatic target recognizer (ATR) can be fused together so tightly and so well that their distinctiveness becomes lost in the merger. This has historically not been the case outside of biology and a few academic papers. The biological model of ATT∪ATR arises from dynamic patterns of activity distributed across many neural circuits and structures (including the retina). The information that the brain receives from the eyes is "old news" at the time that it receives it. The eyes and brain forecast a tracked object's future position rather than relying on received retinal position. Anticipation of the next moment - building up a consistent perception - is accomplished under difficult conditions: motion (eyes, head, body, scene background, target) and processing limitations (neural noise, delays, eye jitter, distractions). Not only does the human vision system surmount these problems, but it has innate mechanisms to exploit motion in support of target detection and classification. Biological vision doesn't normally operate on snapshots. Feature extraction, detection and recognition are spatiotemporal. When vision is viewed as a spatiotemporal process, target detection, recognition, tracking, event detection and activity recognition do not seem as distinct as they are in current ATT and ATR designs. They appear as similar mechanisms taking place at varying time scales. A framework is provided for unifying ATT and ATR.

  19. Pyramidal neurovision architecture for vision machines

    NASA Astrophysics Data System (ADS)

    Gupta, Madan M.; Knopf, George K.

    1993-08-01

    The vision system employed by an intelligent robot must be active; active in the sense that it must be capable of selectively acquiring the minimal amount of relevant information for a given task. An efficient active vision system architecture that is based loosely upon the parallel-hierarchical (pyramidal) structure of the biological visual pathway is presented in this paper. Although the computational architecture of the proposed pyramidal neuro-vision system is far less sophisticated than the architecture of the biological visual pathway, it does retain some essential features such as the converging multilayered structure of its biological counterpart. In terms of visual information processing, the neuro-vision system is constructed from a hierarchy of several interactive computational levels, where each level contains one or more nonlinear parallel processors. Computationally efficient vision machines can be developed by utilizing both parallel and serial information processing techniques within the pyramidal computing architecture. A computer simulation of a pyramidal vision system for active scene surveillance is presented.

  20. Looking inside the Ocean: Toward an Autonomous Imaging System for Monitoring Gelatinous Zooplankton

    PubMed Central

    Corgnati, Lorenzo; Marini, Simone; Mazzei, Luca; Ottaviani, Ennio; Aliani, Stefano; Conversi, Alessandra; Griffa, Annalisa

    2016-01-01

    Marine plankton abundance and dynamics in the open and interior ocean remain largely unknown. Knowledge of gelatinous zooplankton distribution is especially hard to obtain, because this type of plankton has a very fragile structure and cannot be sampled directly with traditional net-based techniques. To overcome this shortcoming, computer vision techniques can be successfully used for the automatic monitoring of this group. This paper presents the GUARD1 imaging system, a low-cost stand-alone instrument for underwater image acquisition and recognition of gelatinous zooplankton, and discusses the performance of three different methodologies, Tikhonov Regularization, Support Vector Machines and Genetic Programming, that were compared in order to select the one to be run onboard the system for the automatic recognition of gelatinous zooplankton. The performance comparison highlights the high accuracy of the three methods in gelatinous zooplankton identification, showing their good capability in robustly selecting relevant features. In particular, the Genetic Programming technique achieves the same performance as the other two methods while using a smaller set of features, thus being the most efficient in avoiding computationally expensive preprocessing stages, which is a crucial requirement for running on an autonomous imaging system designed for long-lasting deployments, like the GUARD1. The Genetic Programming algorithm has been installed onboard the system, which has been operationally tested in a two-month survey in the Ligurian Sea, providing satisfactory results in terms of monitoring and recognition performance. PMID:27983638

  1. Supervised linear dimensionality reduction with robust margins for object recognition

    NASA Astrophysics Data System (ADS)

    Dornaika, F.; Assoum, A.

    2013-01-01

    Linear Dimensionality Reduction (LDR) techniques have become increasingly important in computer vision and pattern recognition, since they permit a relatively simple mapping of data onto a lower-dimensional subspace, leading to simple and computationally efficient classification strategies. Recently, many linear discriminant methods have been developed in order to reduce the dimensionality of visual data and to enhance the discrimination between different groups or classes. Many existing linear embedding techniques rely on local margins to obtain good discrimination performance; however, dealing with outliers and within-class diversity has not been addressed by margin-based embedding methods. In this paper, we explore the use of different margin-based linear embedding methods. More precisely, we propose to use the concepts of Median miss and Median hit to build robust margin-based criteria. Based on such margins, we seek the projection directions (linear embedding) such that the sum of local margins is maximized. Our proposed approach has been applied to the problem of appearance-based face recognition. Experiments performed on four public face databases show that the proposed approach can give better generalization performance than the classic Average Neighborhood Margin Maximization (ANMM). Moreover, thanks to the use of robust margins, the proposed method degrades gracefully when label outliers contaminate the training data set. In particular, we show that the concept of Median hit was crucial for obtaining robust performance in the presence of outliers.
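
    The following is a hedged numpy sketch of the robust-margin idea: for each sample, the Median hit is the median distance to same-class samples and the Median miss the median distance to other-class samples, so a robust local margin is their difference. The exact embedding criterion optimized by the paper is not reproduced here.

      import numpy as np

      def median_margins(X: np.ndarray, y: np.ndarray) -> np.ndarray:
          """Return the robust local margin (Median miss - Median hit) per sample."""
          dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
          margins = np.empty(len(X))
          for i in range(len(X)):
              same = (y == y[i])
              same[i] = False                              # exclude the sample itself
              margins[i] = (np.median(dists[i, y != y[i]])  # Median miss
                            - np.median(dists[i, same]))    # Median hit
          return margins

      X = np.random.randn(40, 5)
      y = np.repeat([0, 1], 20)
      print(median_margins(X, y).sum())  # the sum of local margins to be maximized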

  2. Computer vision in the poultry industry

    USDA-ARS?s Scientific Manuscript database

    Computer vision is becoming increasingly important in the poultry industry due to increasing use and speed of automation in processing operations. Growing awareness of food safety concerns has helped add food safety inspection to the list of tasks that automated computer vision can assist. Researc...

  3. Feasibility Testing of a Wearable Behavioral Aid for Social Learning in Children with Autism.

    PubMed

    Daniels, Jena; Haber, Nick; Voss, Catalin; Schwartz, Jessey; Tamura, Serena; Fazel, Azar; Kline, Aaron; Washington, Peter; Phillips, Jennifer; Winograd, Terry; Feinstein, Carl; Wall, Dennis P

    2018-01-01

    Recent advances in computer vision and wearable technology have created an opportunity to introduce mobile therapy systems for autism spectrum disorders (ASD) that can respond to the increasing demand for therapeutic interventions; however, feasibility questions must be answered first. We studied the feasibility of a prototype therapeutic tool for children with ASD using Google Glass, examining whether children with ASD would wear such a device, whether providing the emotion classification would improve emotion recognition, and how emotion recognition differs between ASD participants and neurotypical controls (NC). We ran a controlled laboratory experiment with 43 children: 23 with ASD and 20 NC. Children identified static facial images on a computer screen as one of 7 emotions in 3 successive batches: the first with no information about emotion provided to the child, the second with the correct classification from the Glass labeling the emotion, and the third again without emotion information. We then trained a logistic regression classifier on the emotion confusion matrices generated by the two information-free batches to predict ASD versus NC. All 43 children were comfortable wearing the Glass. ASD and NC participants who completed the computer task with Glass providing audible emotion labeling (n = 33) showed increased accuracies in emotion labeling, and the logistic regression classifier achieved an accuracy of 72.7%. Further analysis suggests that the ability to recognize surprise, fear, and neutrality may distinguish ASD cases from NC. This feasibility study supports the utility of a wearable device for social affective learning in ASD children and demonstrates subtle differences in how ASD and NC children perform on an emotion recognition task. Schattauer GmbH Stuttgart.

  4. Image Processing Strategies Based on a Visual Saliency Model for Object Recognition Under Simulated Prosthetic Vision.

    PubMed

    Wang, Jing; Li, Heng; Fu, Weizhen; Chen, Yao; Li, Liming; Lyu, Qing; Han, Tingting; Chai, Xinyu

    2016-01-01

    Retinal prostheses have the potential to restore partial vision. Object recognition in scenes of daily life is one of the essential tasks for implant wearers. Because the visual percepts provided by current retinal prostheses are still low-resolution, it is important to investigate and apply image processing methods that convey more useful visual information to the wearers. We proposed two image processing strategies based on Itti's visual saliency map, region-of-interest (ROI) extraction, and image segmentation. Itti's saliency model generated a saliency map from the original image, in which salient regions were grouped into an ROI by fuzzy c-means clustering. GrabCut then generated a proto-object from the ROI-labeled image, which was recombined with the background and enhanced in two ways: 8-4 separated pixelization (8-4 SP) and background edge extraction (BEE). Results showed that both 8-4 SP and BEE had significantly higher recognition accuracy than direct pixelization (DP). Each saliency-based image processing strategy depended on the performance of image segmentation. Under good and perfect segmentation conditions, BEE and 8-4 SP obtained noticeably higher recognition accuracy than DP; under bad segmentation conditions, only BEE boosted performance. The application of saliency-based image processing strategies was verified to be beneficial to object recognition in daily scenes under simulated prosthetic vision. We hope they will aid the development of the image processing module for future retinal prostheses and thus provide more benefit to patients. Copyright © 2015 International Center for Artificial Organs and Transplantation and Wiley Periodicals, Inc.
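
    A minimal sketch of such a saliency-driven pipeline is given below, assuming OpenCV; spectral-residual saliency stands in for Itti's model, simple thresholding replaces fuzzy c-means clustering, and the file name scene.jpg is hypothetical.

      import cv2
      import numpy as np

      def spectral_residual_saliency(gray: np.ndarray) -> np.ndarray:
          """Cheap saliency map used here in place of Itti's model."""
          small = cv2.resize(gray, (64, 64)).astype(np.float32)
          spec = np.fft.fft2(small)
          log_amp = np.log(np.abs(spec) + 1e-8).astype(np.float32)
          residual = log_amp - cv2.blur(log_amp, (3, 3))
          sal = np.abs(np.fft.ifft2(np.exp(residual) * np.exp(1j * np.angle(spec)))) ** 2
          sal = cv2.GaussianBlur(sal, (9, 9), 2.5)
          return cv2.resize(sal / sal.max(), gray.shape[::-1])

      img = cv2.imread("scene.jpg")                  # hypothetical input image
      sal = spectral_residual_saliency(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))
      ys, xs = np.where(sal > 0.5)                   # salient pixels form the ROI
      rect = (int(xs.min()), int(ys.min()),
              int(xs.max() - xs.min()), int(ys.max() - ys.min()))
      mask = np.zeros(img.shape[:2], np.uint8)
      bgd, fgd = np.zeros((1, 65), np.float64), np.zeros((1, 65), np.float64)
      cv2.grabCut(img, mask, rect, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
      proto = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 255, 0)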

  5. [Comparison study between biological vision and computer vision].

    PubMed

    Liu, W; Yuan, X G; Yang, C X; Liu, Z Q; Wang, R

    2001-08-01

    The development of biological vision, in terms of both structure and mechanism, is discussed, especially regarding the anatomical structure of biological vision, a tentative classification of receptive fields, the parallel processing of visual information, and feedback and integration effects in the visual cortex. New advances in the field, drawn from studies of the morphology of biological vision, are introduced. In addition, a comparison between biological vision and computer vision is made, and their similarities and differences are pointed out.

  6. On-Chip Imaging of Schistosoma haematobium Eggs in Urine for Diagnosis by Computer Vision

    PubMed Central

    Linder, Ewert; Grote, Anne; Varjo, Sami; Linder, Nina; Lebbad, Marianne; Lundin, Mikael; Diwan, Vinod; Hannuksela, Jari; Lundin, Johan

    2013-01-01

    Background Microscopy, being relatively easy to perform at low cost, is the universal diagnostic method for detection of most globally important parasitic infections. As quality control is hard to maintain, misdiagnosis is common, which affects both estimates of parasite burdens and patient care. Novel techniques for high-resolution imaging and image transfer over data networks may offer solutions to these problems through provision of education, quality assurance and diagnostics. Imaging can be done directly on image sensor chips, a technique that can be exploited commercially for the development of inexpensive “mini-microscopes”. Images can be analyzed both visually and by computer vision, at the point of care as well as at remote locations. Methods/Principal Findings Here we describe imaging of helminth eggs using mini-microscopes constructed from webcams and mobile phone cameras. The results show that an inexpensive webcam, stripped of its optics to allow direct application of the test sample on the exposed surface of the sensor, yields images of Schistosoma haematobium eggs which can be identified visually. Using a highly specific image pattern recognition algorithm, 4 out of 5 eggs observed visually could be identified. Conclusions/Significance As proof of concept we show that an inexpensive imaging device, such as a webcam, may be easily modified into a microscope for the detection of helminth eggs based on on-chip imaging. Furthermore, algorithms for helminth egg detection by machine vision can be generated for automated diagnostics. The results can be exploited for constructing simple imaging devices for low-cost diagnostics of urogenital schistosomiasis and other neglected tropical infectious diseases. PMID:24340107

  7. Treelets Binary Feature Retrieval for Fast Keypoint Recognition.

    PubMed

    Zhu, Jianke; Wu, Chenxia; Chen, Chun; Cai, Deng

    2015-10-01

    Fast keypoint recognition is essential to many vision tasks. In contrast to classification-based approaches, we directly formulate keypoint recognition as an image patch retrieval problem, which enjoys the merit of finding the matched keypoint and its pose simultaneously. To effectively extract binary features from each patch surrounding the keypoint, we make use of the treelets transform, which groups highly correlated data together and reduces noise through local analysis. Treelets are a multiresolution analysis tool that provides an orthogonal basis reflecting the geometry of the noise-free data. To facilitate real-world applications, we propose two novel approaches. One is the convolutional treelets, which capture the image patch information locally and globally while reducing the computational cost. The other is the higher-order treelets, which reflect the relationship between the rows and columns within an image patch. An efficient sub-signature-based locality sensitive hashing scheme is employed for fast approximate nearest neighbor search in patch retrieval. Experimental evaluations on both synthetic data and the real-world Oxford dataset have shown that our proposed treelets binary feature retrieval methods outperform state-of-the-art feature descriptors and classification-based approaches.

  8. Can surgical simulation be used to train detection and classification of neural networks?

    PubMed

    Zisimopoulos, Odysseas; Flouty, Evangello; Stacey, Mark; Muscroft, Sam; Giataganas, Petros; Nehme, Jean; Chow, Andre; Stoyanov, Danail

    2017-10-01

    Computer-assisted interventions (CAI) aim to increase the effectiveness, precision and repeatability of procedures to improve surgical outcomes. The presence and motion of surgical tools is a key information input for CAI surgical phase recognition algorithms. Vision-based tool detection and recognition approaches are an attractive solution and can be designed to take advantage of the powerful deep learning paradigm that is rapidly advancing image recognition and classification. The challenge for such algorithms is the availability and quality of labelled data used for training. In this Letter, surgical simulation is used to train tool detection and segmentation based on deep convolutional neural networks and generative adversarial networks. The authors experiment with two network architectures for image segmentation in tool classes commonly encountered during cataract surgery. A commercially available simulator is used to create a simulated cataract dataset for training models prior to performing transfer learning on real surgical data. To the best of the authors' knowledge, this is the first attempt to train deep learning models for surgical instrument detection on simulated data while demonstrating promising generalisation to real data. Results indicate that simulated data does have some potential for training advanced classification methods for CAI systems.

  9. Convolution Comparison Pattern: An Efficient Local Image Descriptor for Fingerprint Liveness Detection

    PubMed Central

    Gottschlich, Carsten

    2016-01-01

    We present a new type of local image descriptor which yields binary patterns from small image patches. For the application to fingerprint liveness detection, we achieve rotation invariant image patches by taking the fingerprint segmentation and orientation field into account. We compute the discrete cosine transform (DCT) for these rotation invariant patches and attain binary patterns by comparing pairs of two DCT coefficients. These patterns are summarized into one or more histograms per image. Each histogram comprises the relative frequencies of pattern occurrences. Multiple histograms are concatenated and the resulting feature vector is used for image classification. We name this novel type of descriptor convolution comparison pattern (CCP). Experimental results show the usefulness of the proposed CCP descriptor for fingerprint liveness detection. CCP outperforms other local image descriptors such as LBP, LPQ and WLD on the LivDet 2013 benchmark. The CCP descriptor is a general type of local image descriptor which we expect to prove useful in areas beyond fingerprint liveness detection such as biological and medical image processing, texture recognition, face recognition and iris recognition, liveness detection for face and iris images, and machine vision for surface inspection and material classification. PMID:26844544
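
    To make the construction concrete, here is a hedged numpy/scipy sketch of a CCP-style feature: the 2-D DCT of each (already rotation-normalized) patch is binarized by comparing pairs of coefficients, and the resulting codes are histogrammed. The specific coefficient pairing is not given in the abstract, so adjacent low-frequency coefficients are paired purely for illustration.

      import numpy as np
      from scipy.fft import dctn

      def ccp_histogram(patches: np.ndarray, n_bits: int = 8) -> np.ndarray:
          """patches: (N, h, w) rotation-invariant patches; returns a
          normalized histogram of binary pattern codes."""
          codes = np.zeros(len(patches), dtype=np.int64)
          for k, patch in enumerate(patches):
              c = dctn(patch.astype(np.float64), norm="ortho").ravel()
              # each comparison of a coefficient pair contributes one bit
              bits = (c[1:n_bits + 1] > c[2:n_bits + 2]).astype(np.int64)
              codes[k] = int((bits << np.arange(n_bits)).sum())
          hist = np.bincount(codes, minlength=2 ** n_bits)
          return hist / hist.sum()   # relative frequencies of pattern occurrences

      patches = np.random.rand(500, 16, 16)    # stand-in for fingerprint patches
      feature_vector = ccp_histogram(patches)  # fed to a classifier (e.g., an SVM)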

  10. Log-Gabor Weber descriptor for face recognition

    NASA Astrophysics Data System (ADS)

    Li, Jing; Sang, Nong; Gao, Changxin

    2015-09-01

    The Log-Gabor transform, which is suitable for analyzing gradually changing data such as iris and face images, has been widely used in image processing, pattern recognition, and computer vision. In most cases, only the magnitude or the phase information of the Log-Gabor transform is considered. However, the complementary effect of combining magnitude and phase information simultaneously for image-feature extraction has not been systematically explored in existing works. We propose a local image descriptor for face recognition, called the Log-Gabor Weber descriptor (LGWD). The novelty of our LGWD is twofold: (1) To fully utilize the information in the magnitude and phase of the multiscale, multi-orientation Log-Gabor transform, we apply the Weber local binary pattern operator to each transform response. (2) The encoded Log-Gabor magnitude and phase information is fused at the feature level using a kernel canonical correlation analysis strategy, since feature-level fusion is effective when the modalities are correlated. Experimental results on the AR, Extended Yale B, and UMIST face databases, compared with results available from recent experiments reported in the literature, show that our descriptor yields better performance than state-of-the-art methods.
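
    For reference, a radial Log-Gabor filter is easy to construct in the frequency domain; the numpy sketch below uses the standard transfer function G(f) = exp(-ln(f/f0)^2 / (2 ln(s)^2)) with illustrative parameter values that are not taken from the paper.

      import numpy as np

      def log_gabor(shape, f0=0.1, sigma_ratio=0.55):
          """Radial Log-Gabor transfer function on an FFT frequency grid."""
          fy = np.fft.fftfreq(shape[0])[:, None]
          fx = np.fft.fftfreq(shape[1])[None, :]
          radius = np.sqrt(fx ** 2 + fy ** 2)
          radius[0, 0] = 1.0                 # avoid log(0) at the DC term
          g = np.exp(-np.log(radius / f0) ** 2 / (2 * np.log(sigma_ratio) ** 2))
          g[0, 0] = 0.0                      # a Log-Gabor filter has no DC response
          return g

      img = np.random.rand(64, 64)           # stand-in for a face image
      response = np.fft.ifft2(np.fft.fft2(img) * log_gabor(img.shape))
      magnitude, phase = np.abs(response), np.angle(response)  # the two LGWD inputs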

  11. Image-plane processing of visual information

    NASA Technical Reports Server (NTRS)

    Huck, F. O.; Fales, C. L.; Park, S. K.; Samms, R. W.

    1984-01-01

    Shannon's theory of information is used to optimize the optical design of sensor-array imaging systems which use neighborhood image-plane signal processing for enhancing edges and compressing dynamic range during image formation. The resultant edge-enhancement, or band-pass-filter, response is found to be very similar to that of human vision. Comparisons of traits in human vision with results from information theory suggest that: (1) Image-plane processing, like preprocessing in human vision, can improve visual information acquisition for pattern recognition when resolving power, sensitivity, and dynamic range are constrained. Improvements include reduced sensitivity to changes in light levels, reduced signal dynamic range, reduced data transmission and processing, and reduced aliasing and photosensor noise degradation. (2) Information content can be an appropriate figure of merit for optimizing the optical design of imaging systems when visual information is acquired for pattern recognition. The design trade-offs involve spatial response, sensitivity, and sampling interval.
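
    The band-pass, edge-enhancing response described above can be approximated by a center-surround difference of Gaussians; the short sketch below is illustrative, with scales chosen arbitrarily rather than derived from the paper's information-theoretic design.

      import numpy as np
      from scipy.ndimage import gaussian_filter

      def dog_bandpass(image: np.ndarray, s_center=1.0, s_surround=3.0) -> np.ndarray:
          """Center-surround (band-pass) filtering akin to retinal preprocessing."""
          return gaussian_filter(image, s_center) - gaussian_filter(image, s_surround)

      img = np.random.rand(128, 128)   # stand-in for a sensor-array image
      edges = dog_bandpass(img)        # enhanced edges, compressed dynamic range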

  12. Fulfilling a European Vision through Flexible Learning and Choice

    ERIC Educational Resources Information Center

    Harris, Margaret S. G.

    2012-01-01

    This article considers the value of flexibility and free choice in learning, and examines the increasing recognition of the evolving and wide range of appropriate environments for learning, such as the workplace, the home, the community, and the virtual world. This "Lifeplace Learning" is compared to the requirements and visions of the…

  13. Stereo Vision Inside Tire

    DTIC Science & Technology

    2015-08-21

    using the Open Computer Vision (OpenCV) libraries [6] for computer vision and the Qt library [7] for the user interface. The software has the...depth. The software application calibrates the cameras using the plane-based calibration model from the OpenCV calib3d module and allows the... [6] OpenCV. 2015. OpenCV Open Source Computer Vision. [Online]. Available at: opencv.org [Accessed]: 09/01/2015. [7] Qt. 2015. Qt Project home

  14. Neural Networks for Computer Vision: A Framework for Specifications of a General Purpose Vision System

    NASA Astrophysics Data System (ADS)

    Skrzypek, Josef; Mesrobian, Edmond; Gungner, David J.

    1989-03-01

    The development of autonomous land vehicles (ALV) capable of operating in an unconstrained environment has proven to be a formidable research effort. The unpredictability of events in such an environment calls for the design of a robust perceptual system, an impossible task requiring the programming of a system based on the expectation of future, unconstrained events. Hence the need for a "general purpose" machine vision system that is capable of perceiving and understanding images in an unconstrained environment in real time. The research undertaken at the UCLA Machine Perception Laboratory addresses this need by focusing on two specific issues: 1) the long-term goals for machine vision research as a joint effort between the neurosciences and computer science; and 2) a framework for evaluating progress in machine vision. In the past, vision research has been carried out independently within different fields including neurosciences, psychology, computer science, and electrical engineering. Our interdisciplinary approach to vision research is based on the rigorous combination of computational neuroscience, as derived from neurophysiology and neuropsychology, with computer science and electrical engineering. The primary motivation behind our approach is that the human visual system is the only existing example of a "general purpose" vision system and, using a neurally based computing substrate, it can complete all necessary visual tasks in real time.

  15. Multispectral image analysis for object recognition and classification

    NASA Astrophysics Data System (ADS)

    Viau, C. R.; Payeur, P.; Cretu, A.-M.

    2016-05-01

    Computer and machine vision applications are used in numerous fields to analyze static and dynamic imagery in order to assist or automate decision-making processes. Advancements in sensor technologies now make it possible to capture and visualize imagery at various wavelengths (or bands) of the electromagnetic spectrum. Multispectral imaging has countless applications in various fields including (but not limited to) security, defense, space, medical, manufacturing and archeology. The development of advanced algorithms to process and extract salient information from the imagery is a critical component of the overall system performance. The fundamental objective of this research project was to investigate the benefits of combining imagery from the visual and thermal bands of the electromagnetic spectrum to improve the recognition rates and accuracy of commonly found objects in an office setting. A multispectral dataset (visual and thermal) was captured and features from the visual and thermal images were extracted and used to train support vector machine (SVM) classifiers. The SVM's class prediction ability was evaluated separately on the visual, thermal and multispectral testing datasets.
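
    A hedged sklearn sketch of this evaluation protocol follows: features from the two bands are concatenated and an SVM is trained on the fused representation, to be compared against the per-band baselines. The arrays are random stand-ins for the captured dataset.

      import numpy as np
      from sklearn.model_selection import train_test_split
      from sklearn.svm import SVC

      rng = np.random.default_rng(0)
      visual = rng.normal(size=(200, 64))     # visual-band feature vectors
      thermal = rng.normal(size=(200, 64))    # thermal-band feature vectors
      labels = rng.integers(0, 5, size=200)   # five hypothetical object classes

      fused = np.hstack([visual, thermal])    # multispectral feature fusion
      Xtr, Xte, ytr, yte = train_test_split(fused, labels, test_size=0.3,
                                            random_state=0)
      clf = SVC(kernel="rbf", C=1.0).fit(Xtr, ytr)
      print("multispectral accuracy:", clf.score(Xte, yte))
      # applying the same split to `visual` or `thermal` alone gives the
      # single-band baselines for comparison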

  16. Neural networks for data compression and invariant image recognition

    NASA Technical Reports Server (NTRS)

    Gardner, Sheldon

    1989-01-01

    An approach to invariant image recognition (I2R), based upon a model of biological vision in the mammalian visual system (MVS), is described. The complete I2R model incorporates several biologically inspired features: exponential mapping of retinal images, Gabor spatial filtering, and a neural network associative memory. In the I2R model, exponentially mapped retinal images are filtered by a hierarchical set of Gabor spatial filters (GSF) which provide compression of the information contained within a pixel-based image. A neural network associative memory (AM) is used to process the GSF-coded images. We describe a 1-D shape function method for coding scale- and rotation-invariant shape information. This method reduces image shape information to a periodic waveform suitable for coding as an input vector to a neural network AM. The shape function method is suitable for near-term applications on conventional computing architectures equipped with VLSI FFT chips to provide a rapid image search capability.

  17. A top-down manner-based DCNN architecture for semantic image segmentation.

    PubMed

    Qiao, Kai; Chen, Jian; Wang, Linyuan; Zeng, Lei; Yan, Bin

    2017-01-01

    Given their powerful feature representations for recognition, deep convolutional neural networks (DCNNs) have been driving rapid advances in high-level computer vision tasks. However, their performance in semantic image segmentation is still not satisfactory. Based on an analysis of the visual mechanism, we conclude that purely bottom-up DCNNs are not sufficient, because the semantic image segmentation task requires not only recognition but also visual attention capability. In this study, superpixels containing visual attention information are introduced in a top-down manner, and an extensible architecture is proposed to improve the segmentation results of current DCNN-based methods. We employ the current state-of-the-art fully convolutional network (FCN) and FCN with conditional random field (DeepLab-CRF) as baselines to validate our architecture. Experimental results on the PASCAL VOC segmentation task qualitatively show that coarse edges and erroneous segmentations are substantially improved. We also obtain a quantitative improvement of about 2-3% in intersection-over-union (IoU) accuracy on the PASCAL VOC 2011 and 2012 test sets.

  18. Insights from Classifying Visual Concepts with Multiple Kernel Learning

    PubMed Central

    Binder, Alexander; Nakajima, Shinichi; Kloft, Marius; Müller, Christina; Samek, Wojciech; Brefeld, Ulf; Müller, Klaus-Robert; Kawanabe, Motoaki

    2012-01-01

    Combining information from various image features has become a standard technique in concept recognition tasks. However, the optimal way of fusing the resulting kernel functions is usually unknown in practical applications. Multiple kernel learning (MKL) techniques make it possible to determine an optimal linear combination of such similarity matrices. Classical approaches to MKL promote sparse mixtures. Unfortunately, 1-norm regularized MKL variants are often observed to be outperformed by an unweighted sum kernel. The main contributions of this paper are the following: we apply a recently developed non-sparse MKL variant to state-of-the-art concept recognition tasks from the application domain of computer vision. We provide insights on the benefits and limits of non-sparse MKL and compare it against its direct competitors, the sum-kernel SVM and sparse MKL. We report empirical results for the PASCAL VOC 2009 Classification and ImageCLEF2010 Photo Annotation challenge data sets. Data sets (kernel matrices) as well as further information are available at http://doc.ml.tu-berlin.de/image_mkl/ (Accessed 2012 Jun 25). PMID:22936970
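
    For orientation, the sketch below shows the sum-kernel baseline the paper compares against: an SVM on a fixed linear combination of precomputed kernel matrices (equal weights give the unweighted sum kernel). True MKL would learn the weights; here they are set by hand, and the feature channels are random stand-ins.

      import numpy as np
      from sklearn.metrics.pairwise import rbf_kernel
      from sklearn.svm import SVC

      rng = np.random.default_rng(1)
      color = rng.normal(size=(150, 30))      # one feature channel per image
      texture = rng.normal(size=(150, 40))    # a second feature channel
      y = rng.integers(0, 2, size=150)

      kernels = [rbf_kernel(color), rbf_kernel(texture)]
      weights = [0.5, 0.5]                    # fixed weights; MKL would learn these
      K = sum(w * k for w, k in zip(weights, kernels))

      clf = SVC(kernel="precomputed").fit(K, y)
      print("training accuracy:", clf.score(K, y))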

  19. Artificial Neural Networks for Processing Graphs with Application to Image Understanding: A Survey

    NASA Astrophysics Data System (ADS)

    Bianchini, Monica; Scarselli, Franco

    In graphical pattern recognition, each pattern is represented as an arrangement of elements that encodes both the properties of each element and the relations among them. Hence, patterns are modelled as labelled graphs where, in general, labels can be attached to both nodes and edges. Artificial neural networks able to process graphs are a powerful tool for addressing a great variety of real-world problems, where the information is naturally organized in entities and relationships among entities; in fact, they have been widely used in computer vision, e.g. in logo recognition, in similarity retrieval, and for object detection. In this chapter, we propose a survey of neural network models able to process structured information, with a particular focus on those architectures tailored to address image understanding applications. Starting from the original recursive model (RNNs), we subsequently present different ways to represent images - by trees, forests of trees, multiresolution trees, directed acyclic graphs with labelled edges, general graphs - and, correspondingly, neural network architectures appropriate to process such structures.

  20. Language Education and Multilingualism in Colombia: Crossing the Divide

    ERIC Educational Resources Information Center

    de Mejía, Anne-Marie

    2017-01-01

    Despite Colombia's official recognition of its ethnic and cultural diversity, it has yet to develop in practice an inclusive educational vision involving the recognition of diversity, as well as promoting the country's insertion within the global market. Garcia et al. acknowledge the importance of "cultivating" students' diverse…

  1. Haptic Exploration in Humans and Machines: Attribute Integration and Machine Recognition/Implementation.

    DTIC Science & Technology

    1988-04-30

    Keywords: haptic, hand, touch, vision, robot, object recognition, categorization. ...established that the haptic system has remarkable capabilities for object recognition. We define haptics as purposive touch. The basic tactual system... gathered ratings of the importance of dimensions for categorizing common objects by touch. Texture and hardness ratings strongly co-vary, which is...

  2. Multiple template-based image matching using alpha-rooted quaternion phase correlation

    NASA Astrophysics Data System (ADS)

    DelMarco, Stephen

    2010-04-01

    In computer vision applications, image matching performed on quality-degraded imagery is difficult due to image content distortion and noise effects. State-of-the-art keypoint-based matchers, such as SURF and SIFT, work very well on clean imagery. However, performance can degrade significantly in the presence of high noise and clutter levels. Noise and clutter cause the formation of false features which can degrade recognition performance. To address this problem, we previously developed an extension to the classical amplitude and phase correlation forms which provides improved robustness and tolerance to image geometric misalignments and noise. This extension, called Alpha-Rooted Phase Correlation (ARPC), combines Fourier-domain alpha-rooting enhancement with classical phase correlation. ARPC provides tunable parameters to control the alpha-rooting enhancement. These parameter values can be optimized to trade off between high, narrow correlation peaks and wider, smaller, but more robust peaks. Previously, we applied ARPC in the Radon transform domain for logo image recognition in the presence of rotational image misalignments. In this paper, we extend ARPC to incorporate quaternion Fourier transforms, thereby creating Alpha-Rooted Quaternion Phase Correlation (ARQPC), and apply ARQPC to the logo image recognition problem. We use ARQPC to perform multiple-reference logo template matching by representing multiple same-class reference templates as quaternion-valued images. We generate recognition performance results on publicly available logo imagery and compare them with results from standard approaches. We show that small deviations in reference templates of same-class logos can lead to improved recognition performance using the joint matching inherent in ARQPC.
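
    A numpy sketch of the scalar (non-quaternion) core of this idea is shown below: the magnitude of the cross-power spectrum is raised to a power alpha in (0, 1], trading peak sharpness against robustness, with alpha approaching 0 recovering classical phase-only correlation. The quaternion extension for multiple same-class templates is omitted.

      import numpy as np

      def alpha_rooted_phase_correlation(a: np.ndarray, b: np.ndarray, alpha=0.5):
          """Return the correlation surface and the estimated (row, col) shift."""
          cross = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
          mag = np.abs(cross)
          mag[mag == 0] = 1e-12                    # guard against division by zero
          surface = np.real(np.fft.ifft2((mag ** alpha) * (cross / mag)))
          shift = np.unravel_index(np.argmax(surface), surface.shape)
          return surface, shift

      ref = np.random.rand(64, 64)                 # stand-in reference logo
      moved = np.roll(ref, (5, 9), axis=(0, 1))    # shifted test image
      _, shift = alpha_rooted_phase_correlation(moved, ref)
      print(shift)                                 # expected: (5, 9)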

  3. Pattern recognition and feature extraction with an optical Hough transform

    NASA Astrophysics Data System (ADS)

    Fernández, Ariel

    2016-09-01

    Pattern recognition and localization, along with feature extraction, are image processing applications of great interest in defect inspection and robot vision, among others. In comparison to purely digital methods, the attractiveness of optical processors for pattern recognition lies in their highly parallel operation and real-time processing capability. This work presents an optical implementation of the generalized Hough transform (GHT), a well-established technique for the recognition of geometrical features in binary images. Detection of a geometric feature under the GHT is accomplished by mapping the original image to an accumulator space; the large computational requirements of this mapping make the optical implementation an attractive alternative to digital-only methods. Starting from the integral representation of the GHT, it is possible to devise an optical setup where the transformation is obtained and the size and orientation parameters can be controlled, allowing for dynamic scale- and orientation-variant pattern recognition. A compact system for the above purposes results from the use of an electrically tunable lens for scale control and a rotating pupil mask for orientation variation, implemented on a high-contrast spatial light modulator (SLM). Real-time operation (limited by the frame rate of the device used to capture the GHT) can also be achieved, allowing for the processing of video sequences. Moreover, by thresholding the GHT (with the aid of another SLM) and inverse transforming (achieved optically in the incoherent system under an appropriate focusing setting), the previously detected features of interest can be extracted.

  4. Reading recognition of pointer meter based on pattern recognition and dynamic three-points on a line

    NASA Astrophysics Data System (ADS)

    Zhang, Yongqiang; Ding, Mingli; Fu, Wuyifang; Li, Yongqiang

    2017-03-01

    Pointer meters are frequently used in industrial production because they are directly readable, and they should be calibrated regularly to ensure the precision of their readings. Currently, manual calibration is the method most frequently adopted for verifying pointer meters; its dependence on professional skill and subjective judgment can lead to large measurement errors, poor reliability and low efficiency. In the past decades, with the development of computer technology, machine vision and digital image processing techniques have been applied to recognize the readings of dial instruments. Existing recognition methods assume that all dial instruments share the same parameters, which is not the case in practice. In this work, recognition of the pointer meter reading is treated as a pattern recognition problem. We obtain the features of a small area around each detected point and treat those features as a pattern, segment the certified images using a gradient pyramid algorithm, train a classifier with a support vector machine (SVM), and complete the pattern matching of the segmented images. We then obtain the reading of the pointer meter precisely using the dynamic three-points-on-a-line (DTPML) principle, which eliminates errors caused by tiny differences between panels. Experimental results show that the proposed method is superior to state-of-the-art works.

  5. Computer vision for foreign body detection and removal in the food industry

    USDA-ARS?s Scientific Manuscript database

    Computer vision inspection systems are often used for quality control, product grading, defect detection and other product evaluation issues. This chapter focuses on the use of computer vision inspection systems that detect foreign bodies and remove them from the product stream. Specifically, we wi...

  6. Chapter 11. Quality evaluation of apple by computer vision

    USDA-ARS?s Scientific Manuscript database

    Apple is one of the most consumed fruits in the world, and there is a critical need for enhanced computer vision technology for quality assessment of apples. This chapter gives a comprehensive review on recent advances in various computer vision techniques for detecting surface and internal defects ...

  7. CIFAR10-DVS: An Event-Stream Dataset for Object Classification

    PubMed Central

    Li, Hongmin; Liu, Hanchao; Ji, Xiangyang; Li, Guoqi; Shi, Luping

    2017-01-01

    Neuromorphic vision research requires high-quality and appropriately challenging event-stream datasets to support continuous improvement of algorithms and methods. However, creating event-stream datasets is a time-consuming task, since the data must be recorded with neuromorphic cameras, and only a limited number of event-stream datasets are currently available. In this work, by utilizing the popular computer vision dataset CIFAR-10, we converted 10,000 frame-based images into 10,000 event streams using a dynamic vision sensor (DVS), providing an event-stream dataset of intermediate difficulty in 10 different classes, named “CIFAR10-DVS.” The conversion was implemented by a repeated closed-loop smooth (RCLS) movement of the frame-based images. Unlike conversion by moving the camera over static images, this image movement is more realistic with respect to practical applications. The repeated closed-loop image movement generates rich local intensity changes in continuous time, which are quantized by each pixel of the DVS camera to generate events. Furthermore, a performance benchmark in event-driven object classification is provided based on state-of-the-art classification algorithms. This work provides a large event-stream dataset and an initial benchmark for comparison, which may boost algorithm development in event-driven pattern recognition and object classification. PMID:28611582
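
    The quantization step described above (per-pixel log-intensity changes emitting events) can be illustrated with a toy numpy sketch; this is not the RCLS apparatus itself, and the contrast threshold is arbitrary.

      import numpy as np

      def frames_to_events(frames: np.ndarray, threshold: float = 0.15):
          """frames: (T, H, W) intensities in (0, 1]; returns (t, y, x, polarity)."""
          log_ref = np.log(frames[0] + 1e-6)       # last event level per pixel
          events = []
          for t in range(1, len(frames)):
              log_now = np.log(frames[t] + 1e-6)
              diff = log_now - log_ref
              for polarity, mask in ((1, diff > threshold), (-1, diff < -threshold)):
                  ys, xs = np.nonzero(mask)
                  events += [(t, y, x, polarity) for y, x in zip(ys, xs)]
                  log_ref[mask] = log_now[mask]    # reset the pixels that fired
          return events

      frames = np.clip(np.random.rand(10, 32, 32), 0.05, 1.0)  # stand-in sequence
      print(len(frames_to_events(frames)), "events")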

  8. Robust algebraic image enhancement for intelligent control systems

    NASA Technical Reports Server (NTRS)

    Lerner, Bao-Ting; Morrelli, Michael

    1993-01-01

    Robust vision capability for intelligent control systems has been an elusive goal in image processing. The computationally intensive techniques necessary for conventional image processing make real-time applications, such as object tracking and collision avoidance, difficult. In order to endow an intelligent control system with the needed vision robustness, an adequate image enhancement subsystem capable of compensating for the wide variety of real-world degradations must exist between the image-capturing and object recognition subsystems. This enhancement stage must be adaptive and must operate with consistency in the presence of both statistical and shape-based noise. To deal with this problem, we have developed an innovative algebraic approach which provides a sound mathematical framework for image representation and manipulation. Our image model provides a natural platform from which to pursue dynamic scene analysis, and its incorporation into a vision system would serve as the front end to an intelligent control system. We have developed a unique polynomial representation of gray-level imagery and applied this representation to develop polynomial operators on complex gray-level scenes. This approach is highly advantageous since polynomials can be manipulated very easily and are readily understood, thus providing a very convenient environment for image processing. Our model presents a highly structured and compact algebraic representation of gray-level images which can be viewed as fuzzy sets.

  9. CIFAR10-DVS: An Event-Stream Dataset for Object Classification.

    PubMed

    Li, Hongmin; Liu, Hanchao; Ji, Xiangyang; Li, Guoqi; Shi, Luping

    2017-01-01

    Neuromorphic vision research requires high-quality and appropriately challenging event-stream datasets to support continuous improvement of algorithms and methods. However, creating event-stream datasets is a time-consuming task, since the data must be recorded with neuromorphic cameras, and only a limited number of event-stream datasets are currently available. In this work, by utilizing the popular computer vision dataset CIFAR-10, we converted 10,000 frame-based images into 10,000 event streams using a dynamic vision sensor (DVS), providing an event-stream dataset of intermediate difficulty in 10 different classes, named "CIFAR10-DVS." The conversion was implemented by a repeated closed-loop smooth (RCLS) movement of the frame-based images. Unlike conversion by moving the camera over static images, this image movement is more realistic with respect to practical applications. The repeated closed-loop image movement generates rich local intensity changes in continuous time, which are quantized by each pixel of the DVS camera to generate events. Furthermore, a performance benchmark in event-driven object classification is provided based on state-of-the-art classification algorithms. This work provides a large event-stream dataset and an initial benchmark for comparison, which may boost algorithm development in event-driven pattern recognition and object classification.

  10. Monovision techniques for telerobots

    NASA Technical Reports Server (NTRS)

    Goode, P. W.; Carnils, K.

    1987-01-01

    The primary task of the vision sensor in a telerobotic system is to provide information about the position of the system's effector relative to objects of interest in its environment. The subtasks required to perform the primary task include image segmentation, object recognition, and object location and orientation in some coordinate system. The accomplishment of the vision task requires the appropriate processing tools and the system methodology to effectively apply the tools to the subtasks. The functional structure of the telerobotic vision system used in the Langley Research Center's Intelligent Systems Research Laboratory is discussed as well as two monovision techniques for accomplishing the vision subtasks.

  11. Three-dimensional object recognition based on planar images

    NASA Astrophysics Data System (ADS)

    Mital, Dinesh P.; Teoh, Eam-Khwang; Au, K. C.; Chng, E. K.

    1993-01-01

    This paper presents the development and realization of a robotic vision system for the recognition of 3-dimensional (3-D) objects. The system can recognize a single object from among a group of known regular convex polyhedron objects constrained to lie on a calibrated flat platform. The approach adopted comprises a series of image processing operations on a single 2-dimensional (2-D) intensity image to derive an image line drawing. Subsequently, a feature matching technique is employed to determine 2-D spatial correspondences of the image line drawing with the model in the database. Besides its identification ability, the system can also provide important position and orientation information about the recognized object. The system was implemented on an IBM-PC AT machine executing at 8 MHz without the 80287 maths co-processor. In an overall performance evaluation based on 600 recognition cycles, the system demonstrated an accuracy above 80%, with recognition times well within 10 seconds. The recognition time is, however, indirectly dependent on the number of models in the database. The reliability of the system is also affected by illumination conditions, which must be strictly controlled as in any industrial robotic vision system.

  12. Trends and developments in industrial machine vision: 2013

    NASA Astrophysics Data System (ADS)

    Niel, Kurt; Heinzl, Christoph

    2014-03-01

    When following current advancements and implementations in the field of machine vision, there seem to be no borders for future developments: calculating power constantly increases, new ideas are spreading, and previously challenging approaches are being introduced into the mass market. Within the past decades these advances have had dramatic impacts on our lives. Consumer electronics, e.g. computers or telephones, which once occupied large volumes, now fit in the palm of a hand. To note just a few examples: face recognition was adopted by the consumer market, 3D capturing became cheap, and, thanks to a huge community, software coding got easier using sophisticated development platforms. However, there is still a remaining gap between consumer and industrial applications. While the former have to be entertaining, the latter have to be reliable. Recent studies (e.g. VDMA [1], Germany) show a moderately increasing market for machine vision in industry. When industry is asked about its needs, the main challenges for industrial machine vision are simple usage, reliability for the process, quick support, full automation, self/easy adjustment to changing process parameters, "forget it in the line". A further big challenge is supporting quality control: nowadays the operator has to accurately define the tested features for checking the probes. There is also an emerging effort to let automated machine vision applications determine the essential parameters at a more abstract level (top-down). In this work we focus on three current and future topics for industrial machine vision: metrology supporting automation, quality control (inline/atline/offline), and the visualization and analysis of datasets with steadily growing sizes. Finally, the general trend from pixel-oriented toward object-oriented evaluation is addressed. We do not directly address the field of robotics taking advantage of machine vision; this is a fast-changing area which is worth its own contribution.

  13. Computer vision challenges and technologies for agile manufacturing

    NASA Astrophysics Data System (ADS)

    Molley, Perry A.

    1996-02-01

    Sandia National Laboratories, a Department of Energy laboratory, is responsible for maintaining the safety, security, reliability, and availability of the nuclear weapons stockpile for the United States. Because of the changing national and global political climates and inevitable budget cuts, Sandia is changing the methods and processes it has traditionally used in the product realization cycle for weapon components. Because of the increasing age of the nuclear stockpile, it is certain that the reliability of these weapons will degrade with time unless eventual action is taken to repair, requalify, or renew them. Furthermore, due to the downsizing of the DOE weapons production sites and the loss of technical personnel, the new product realization process is focused on developing and deploying advanced automation technologies in order to maintain the capability for producing new components. The goal of Sandia's technology development program is to create a product realization environment that is cost-effective and offers improved quality and reduced cycle times for small lot sizes. The new environment will rely less on the expertise of humans and more on intelligent systems and automation to perform the production processes. The systems will be robust in order to provide maximum flexibility and responsiveness for rapidly changing component or product mixes. An integrated enterprise will allow ready access to and use of information for effective and efficient product and process design. Concurrent engineering methods will allow a speedup of the product realization cycle, reduce costs, and dramatically lessen the dependency on creating and testing physical prototypes. Virtual manufacturing will allow production processes to be designed, integrated, and programmed off-line before a piece of hardware ever moves. The overriding goal is to be able to build a large variety of new weapons parts on short notice. Many of the technologies being developed are also applicable to commercial production processes and applications. Computer vision will play a critical role in the new agile production environment for the automation of processes such as inspection, assembly, welding, material dispensing and other process control tasks. Although many academic and commercial solutions have been developed, none has seen widespread adoption, considering the huge number of potential applications that could benefit from this technology. The reason for this slow adoption is that the advantages of computer vision for automation can be a double-edged sword. The benefits can be lost if the vision system requires an inordinate amount of time for reprogramming by a skilled operator to account for different parts, changes in lighting conditions, background clutter, changes in optics, etc. Commercially available solutions typically require an operator to manually program the vision system with the features used for recognition. In a recent survey, we asked a number of commercial manufacturers and machine vision companies the question, 'What prevents machine vision systems from being more useful in factories?' The number one (and unanimous) response was that vision systems require too much skill to set up and program to be cost effective.

  14. Importance of balanced architectures in the design of high-performance imaging systems

    NASA Astrophysics Data System (ADS)

    Sgro, Joseph A.; Stanton, Paul C.

    1999-03-01

    Imaging systems employed in demanding military and industrial applications, such as automatic target recognition and computer vision, typically require real-time high-performance computing resources. While high-performance computing systems have traditionally relied on proprietary architectures and custom components, recent advances in high-performance general-purpose microprocessor technology have produced an abundance of low-cost components suitable for use in high-performance computing systems. A common pitfall in the design of high-performance imaging systems, particularly systems employing scalable multiprocessor architectures, is the failure to balance computational and memory bandwidth. The performance of standard cluster designs, for example, in which several processors share a common memory bus, is typically constrained by memory bandwidth. The symptom characteristic of this problem is the failure of the system's performance to scale as more processors are added. The problem is exacerbated if I/O and memory functions share the same bus. The recent introduction of microprocessors with large internal caches and high-performance external memory interfaces makes it practical to design high-performance imaging systems with balanced computational and memory bandwidth. Real-world examples of such designs will be presented, along with a discussion of adapting algorithm design to best utilize available memory bandwidth.

  15. A computer vision for animal ecology.

    PubMed

    Weinstein, Ben G

    2018-05-01

    A central goal of animal ecology is to observe species in the natural world. The cost and challenge of data collection often limit the breadth and scope of ecological study. Ecologists often use image capture to bolster data collection in time and space. However, the ability to process these images remains a bottleneck. Computer vision can greatly increase the efficiency, repeatability and accuracy of image review. Computer vision uses image features, such as colour, shape and texture to infer image content. I provide a brief primer on ecological computer vision to outline its goals, tools and applications to animal ecology. I reviewed 187 existing applications of computer vision and divided articles into ecological description, counting and identity tasks. I discuss recommendations for enhancing the collaboration between ecologists and computer scientists and highlight areas for future growth of automated image analysis. © 2017 The Author. Journal of Animal Ecology © 2017 British Ecological Society.

  16. Environmental Recognition and Guidance Control for Autonomous Vehicles using Dual Vision Sensor and Applications

    NASA Astrophysics Data System (ADS)

    Moriwaki, Katsumi; Koike, Issei; Sano, Tsuyoshi; Fukunaga, Tetsuya; Tanaka, Katsuyuki

    We propose a new method of environmental recognition around an autonomous vehicle using a dual vision sensor, and navigation control based on binocular images. As an application of these techniques, we aim to develop a guide robot that can play the role of a guide dog as an aid to people such as the visually impaired or the aged. This paper presents a recognition algorithm that finds the line of a series of Braille blocks and the boundary between a sidewalk and a roadway where a difference in level exists, using binocular images obtained from a pair of parallel-arrayed CCD cameras. This paper also presents a tracking algorithm with which the guide robot traces along a series of Braille blocks and avoids obstacles and unsafe areas in the path of the person it guides.

  17. Component Pin Recognition Using Algorithms Based on Machine Learning

    NASA Astrophysics Data System (ADS)

    Xiao, Yang; Hu, Hong; Liu, Ze; Xu, Jiangchang

    2018-04-01

    The purpose of machine vision for a plug-in machine is to improve the machine's stability and accuracy, and recognition of the component pin is an important part of the vision task. This paper focuses on component pin recognition using three different techniques. The first technique involves traditional image processing using the core algorithm for binary large object (BLOB) analysis. The second technique uses histogram of oriented gradients (HOG) features to experimentally compare support vector machine (SVM) and adaptive boosting (AdaBoost) classifiers. The third technique uses a deep learning method, the convolutional neural network (CNN), which identifies the pin by comparing a sample against its trained model. The main purpose of the research presented in this paper is to increase the knowledge of learning methods used in the plug-in machine industry in order to achieve better results.
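
    As a brief illustration of the second technique, the sketch below trains a linear SVM on HOG features using skimage and sklearn; the pin images are random stand-ins and the HOG parameters are illustrative.

      import numpy as np
      from skimage.feature import hog
      from sklearn.model_selection import train_test_split
      from sklearn.svm import LinearSVC

      rng = np.random.default_rng(2)
      images = rng.random((200, 32, 32))       # stand-in pin / background patches
      labels = rng.integers(0, 2, size=200)    # 1 = pin present, 0 = background

      features = np.array([
          hog(im, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
          for im in images
      ])
      Xtr, Xte, ytr, yte = train_test_split(features, labels, random_state=0)
      clf = LinearSVC().fit(Xtr, ytr)          # AdaBoost could be swapped in here
      print("held-out accuracy:", clf.score(Xte, yte))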

  18. Fast cat-eye effect target recognition based on saliency extraction

    NASA Astrophysics Data System (ADS)

    Li, Li; Ren, Jianlin; Wang, Xingbin

    2015-09-01

    Background complexity is a main cause of false detections in cat-eye target recognition. Human vision has a selective attention property that helps locate salient targets in complex unknown scenes quickly and precisely. In this paper, we propose a novel cat-eye effect target recognition method named Multi-channel Saliency Processing before Fusion (MSPF). This method combines traditional cat-eye target recognition with the selective character of visual attention. Furthermore, parallel processing enables it to achieve fast recognition. Experimental results show that the proposed method performs better in accuracy, robustness and speed compared to other methods.

  19. A sensor and video based ontology for activity recognition in smart environments.

    PubMed

    Mitchell, D; Morrow, Philip J; Nugent, Chris D

    2014-01-01

    Activity recognition is used in a wide range of applications including healthcare and security. In a smart environment, activity recognition can be used to monitor and support the activities of a user. A range of methods has been used in activity recognition, including sensor-based approaches, vision-based approaches and ontological approaches. This paper presents a novel approach to activity recognition in a smart home environment which combines sensor and video data through an ontological framework. The ontology describes the relationships and interactions between activities, the user, objects, sensors and video data.

  20. Neural associative memories for the integration of language, vision and action in an autonomous agent.

    PubMed

    Markert, H; Kaufmann, U; Kara Kayikci, Z; Palm, G

    2009-03-01

    Language understanding is a long-standing problem in computer science. However, the human brain is capable of processing complex languages with seemingly no difficulty. This paper shows a model for language understanding using biologically plausible neural networks composed of associative memories. The model is able to deal with ambiguities at the single-word and grammatical levels. The language system is embedded into a robot in order to demonstrate correct semantic understanding of the input sentences by letting the robot perform corresponding actions. For that purpose, a simple neural action planning system has been combined with neural networks for visual object recognition and visual attention control mechanisms.

  1. SAVA 3: A testbed for integration and control of visual processes

    NASA Technical Reports Server (NTRS)

    Crowley, James L.; Christensen, Henrik

    1994-01-01

    The development of an experimental test-bed to investigate the integration and control of perception in a continuously operating vision system is described. The test-bed integrates a 12-axis robotic stereo camera head mounted on a mobile robot, dedicated computer boards for real-time image acquisition and processing, and a distributed system for image description. The architecture was designed to: (1) be continuously operating; (2) integrate software contributions from geographically dispersed laboratories; (3) integrate description of the environment with 2D measurements, 3D models, and recognition of objects; (4) be capable of supporting diverse experiments in gaze control, visual servoing, navigation, and object surveillance; and (5) be dynamically reconfigurable.

  2. The graph neural network model.

    PubMed

    Scarselli, Franco; Gori, Marco; Tsoi, Ah Chung; Hagenbuchner, Markus; Monfardini, Gabriele

    2009-01-01

    Many underlying relationships among data in several areas of science and engineering, e.g., computer vision, molecular chemistry, molecular biology, pattern recognition, and data mining, can be represented in terms of graphs. In this paper, we propose a new neural network model, called graph neural network (GNN) model, that extends existing neural network methods for processing the data represented in graph domains. This GNN model, which can directly process most of the practically useful types of graphs, e.g., acyclic, cyclic, directed, and undirected, implements a function τ(G, n) ∈ ℝ^m that maps a graph G and one of its nodes n into an m-dimensional Euclidean space. A supervised learning algorithm is derived to estimate the parameters of the proposed GNN model. The computational cost of the proposed algorithm is also considered. Some experimental results are shown to validate the proposed learning algorithm, and to demonstrate its generalization capabilities.
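
    A minimal numpy sketch of the core idea, assuming a toy graph and random weights: node states are iterated from neighbor states toward an approximate fixed point, then an output map sends each state into ℝ^m, playing the role of τ(G, n). The paper's supervised training procedure is not reproduced here.

    # Fixed-point message passing and per-node readout (toy GNN sketch).
    import numpy as np

    rng = np.random.default_rng(0)
    A = np.array([[0, 1, 1, 0],            # adjacency of a 4-node toy graph
                  [1, 0, 1, 0],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]])
    d, m = 8, 2                            # state and output dimensions
    W = rng.normal(scale=0.1, size=(d, d)) # small weights keep the map contractive
    V = rng.normal(size=(m, d))            # output (readout) weights
    x = rng.normal(size=(4, d))            # node input features

    h = np.zeros((4, d))
    for _ in range(50):                    # iterate the state toward a fixed point
        h = np.tanh(A @ h @ W.T + x)       # aggregate neighbor states + input
    out = h @ V.T                          # tau(G, n) for every node n
    print(out.shape)                       # -> (4, 2): one point in R^m per node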

  3. An Automatic Registration Algorithm for 3D Maxillofacial Model

    NASA Astrophysics Data System (ADS)

    Qiu, Luwen; Zhou, Zhongwei; Guo, Jixiang; Lv, Jiancheng

    2016-09-01

    3D image registration aims at aligning two 3D data sets in a common coordinate system and has been widely used in computer vision, pattern recognition and computer-assisted surgery. One challenging problem in 3D registration is that point-wise correspondences between two point sets are often unknown a priori. In this work, we develop an automatic algorithm for the registration of 3D maxillofacial models, including facial surface models and skull models. Our proposed registration algorithm can achieve a good alignment between partial and whole maxillofacial models in spite of ambiguous matching, which has a potential application in oral and maxillofacial reparative and reconstructive surgery. The proposed algorithm includes three steps: (1) 3D-SIFT feature extraction and FPFH descriptor construction; (2) feature matching using SAC-IA; (3) coarse rigid alignment and refinement by ICP. Experiments on facial surfaces and mandible skull models demonstrate the efficiency and robustness of our algorithm.
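
    Step (3), the ICP refinement, can be sketched as a plain point-to-point ICP with nearest-neighbor matching and an SVD-based rigid-transform estimate. This is a generic sketch on synthetic points, not the authors' implementation.

    # Point-to-point ICP refinement (sketch).
    import numpy as np
    from scipy.spatial import cKDTree

    def icp(source, target, iters=30):
        """Rigidly align source to target; returns rotation R and translation t."""
        R, t = np.eye(3), np.zeros(3)
        tree = cKDTree(target)
        src = source.copy()
        for _ in range(iters):
            _, idx = tree.query(src)               # nearest-neighbor matches
            matched = target[idx]
            mu_s, mu_m = src.mean(0), matched.mean(0)
            H = (src - mu_s).T @ (matched - mu_m)  # cross-covariance
            U, _, Vt = np.linalg.svd(H)
            if np.linalg.det(Vt.T @ U.T) < 0:      # avoid reflections
                Vt[-1] *= -1
            R_step = Vt.T @ U.T
            t_step = mu_m - R_step @ mu_s
            src = src @ R_step.T + t_step
            R, t = R_step @ R, R_step @ t + t_step
        return R, t

    pts = np.random.default_rng(0).random((100, 3))
    R, t = icp(pts + 0.01, pts)                    # undo a 0.01 offset
    print(np.round(t, 3))                          # ~ [-0.01 -0.01 -0.01]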

  4. From Phonomechanocardiography to Computer-Aided Phonocardiography

    NASA Astrophysics Data System (ADS)

    Granados, J.; Tavera, F.; López, G.; Velázquez, J. M.; Hernández, R. T.; López, G. A.

    2017-01-01

    Because it is difficult to train doctors to identify many heart disorders by conventional auscultation alone, it is necessary to add an objective, methodological analysis to support this technique. In order to obtain information on the performance of the heart and to be able to diagnose heart disease through a simple, cost-effective procedure based on a data acquisition system, we have obtained phonocardiograms (PCG), which are graphic records of the sounds emitted by the heart. A program of acoustic, visual and artificial-vision recognition was developed to interpret them. Based on the results of previous research by cardiologists, a code for interpreting PCGs and their associated diseases was elaborated. A site for experimental sampling of cardiac data was also created within the university campus. Computer-aided phonocardiography is a viable, low-cost procedure that provides additional medical information for diagnosing complex heart diseases. We show some preliminary results.

  5. Culto: AN Ontology-Based Annotation Tool for Data Curation in Cultural Heritage

    NASA Astrophysics Data System (ADS)

    Garozzo, R.; Murabito, F.; Santagati, C.; Pino, C.; Spampinato, C.

    2017-08-01

    This paper proposes CulTO, a software tool relying on a computational ontology for Cultural Heritage domain modelling, with a specific focus on religious historical buildings, for supporting cultural heritage experts in their investigations. It is specifically designed to support annotation, automatic indexing, classification and curation of photographic data and text documents of historical buildings. CulTO also serves as a useful tool for Historical Building Information Modeling (H-BIM) by enabling semantic 3D data modeling and further enrichment of historical buildings with non-geometrical information, through the inclusion of new concepts about historical documents, images, decay or deformation evidence, as well as decorative elements, into BIM platforms. CulTO is the result of a joint research effort between the Laboratory of Surveying and Architectural Photogrammetry "Luigi Andreozzi" and the PeRCeiVe Lab (Pattern Recognition and Computer Vision Lab) of the University of Catania.

  6. Efficient LIDAR Point Cloud Data Managing and Processing in a Hadoop-Based Distributed Framework

    NASA Astrophysics Data System (ADS)

    Wang, C.; Hu, F.; Sha, D.; Han, X.

    2017-10-01

    Light Detection and Ranging (LiDAR) is one of the most promising technologies in surveying and mapping, city management, forestry, object recognition, computer vision engineering, and other fields. However, it is challenging to efficiently store, query and analyze high-resolution 3D LiDAR data, due to its volume and complexity. In order to improve the productivity of LiDAR data processing, this study proposes a Hadoop-based framework to efficiently manage and process LiDAR data in a distributed and parallel manner, taking advantage of Hadoop's storage and computing ability. At the same time, the Point Cloud Library (PCL), an open-source project for 2D/3D image and point cloud processing, is integrated with HDFS and MapReduce to run the LiDAR data analysis algorithms provided by PCL in a parallel fashion. The experiment results show that the proposed framework can efficiently manage and process big LiDAR data.
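
    A minimal sketch of the distribution idea, assuming Hadoop Streaming and whitespace-separated "x y z" text records: the mapper keys each point by a spatial tile so that reducers can process tiles in parallel. The 100 m tile size is hypothetical, and the PCL-based analysis itself is not shown.

    #!/usr/bin/env python3
    # Hadoop Streaming mapper: bin LiDAR points into spatial tiles.
    import sys

    TILE = 100.0  # tile edge length, in the data's coordinate units (assumed)

    for line in sys.stdin:
        try:
            x, y, z = map(float, line.split()[:3])
        except ValueError:
            continue                        # skip malformed records
        key = f"{int(x // TILE)}_{int(y // TILE)}"
        print(f"{key}\t{x} {y} {z}")        # tile id -> point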

  7. Crepuscular and Nocturnal Illumination and Its Effects on Color Perception by the Nocturnal Hawkmoth Deilephila elpenor

    DTIC Science & Technology

    2006-01-01

    vision may enhance recognition of conspecifics or be used in mating. While mating in moths is thought to be entirely mediated by olfaction, most tasks are...time, unambiguous evidence for true color vision under scotopic conditions has only recently been acquired (Kelber et al., 2002; Roth and Kelber, 2004...color under starlight and dim moonlight, respectively, raise at least two issues. First, what is the selective advantage of color vision in these

  8. A Human Activity Recognition System Based on Dynamic Clustering of Skeleton Data.

    PubMed

    Manzi, Alessandro; Dario, Paolo; Cavallo, Filippo

    2017-05-11

    Human activity recognition is an important area in computer vision, with a wide range of applications including ambient assisted living. In this paper, an activity recognition system based on skeleton data extracted from a depth camera is presented. The system makes use of machine learning techniques to classify the actions, which are described with a set of a few basic postures. The training phase creates several models related to the number of clustered postures by means of a multiclass Support Vector Machine (SVM), trained with Sequential Minimal Optimization (SMO). The classification phase adopts the X-means algorithm to find the optimal number of clusters dynamically. The contribution of the paper is twofold: first, to perform activity recognition employing features based on a small number of informative postures, extracted independently from each activity instance; and second, to assess the minimum number of frames needed for adequate classification. The system is evaluated on two publicly available datasets, the Cornell Activity Dataset (CAD-60) and the Telecommunication Systems Team (TST) Fall detection dataset. The number of clusters needed to model each instance ranges from two to four elements. The proposed approach reaches excellent performance using only about 4 s of input data (~100 frames) and outperforms the state of the art when it uses approximately 500 frames on the CAD-60 dataset. The results are promising for tests in real contexts.
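
    A rough sketch of the pipeline on synthetic data: per-frame skeleton features are clustered into a few basic postures, each activity instance becomes a posture histogram, and a multiclass SVM classifies it. KMeans with a fixed k stands in for the X-means step, and all dimensions and labels are assumptions.

    # Posture clustering + histogram features + multiclass SVM (sketch).
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    n_clips, frames, dims, k = 40, 100, 45, 4   # e.g. 15 joints x 3 coords
    clips = rng.random((n_clips, frames, dims)) # stand-in skeleton sequences
    acts = rng.integers(0, 3, size=n_clips)     # 3 activity classes

    def posture_histogram(clip):
        """Cluster one instance's frames into k postures; return their histogram."""
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(clip)
        return np.bincount(km.labels_, minlength=k) / len(clip)

    X = np.array([posture_histogram(c) for c in clips])
    clf = SVC(kernel="linear").fit(X, acts)     # libsvm uses an SMO-type solver
    print("training accuracy:", clf.score(X, acts))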

  9. A Case Study on Attribute Recognition of Heated Metal Mark Image Using Deep Convolutional Neural Networks.

    PubMed

    Mao, Keming; Lu, Duo; E, Dazhi; Tan, Zhenhua

    2018-06-07

    Heated metal marks are an important trace for identifying the cause of a fire. However, traditional methods mainly rely on knowledge of physics and chemistry for qualitative analysis, which makes this a challenging problem. This paper presents a case study on attribute recognition of heated metal mark images using computer vision and machine learning technologies. The proposed work is composed of three parts. The material is first generated. According to national standards, actual needs and feasibility, seven attributes are selected for research. Data generation and organization are conducted, and a small benchmark dataset is constructed. A recognition model is then implemented. Feature representation and classifier construction methods are introduced based on deep convolutional neural networks. Finally, an experimental evaluation is carried out. Multi-aspect testing is performed with various model structures, data augmentations, training modes, optimization methods and batch sizes. The influence of parameters, recognition efficiency and execution time are also analyzed. The results show that with a fine-tuned model, the recognition rates for the attributes metal type, heating mode, heating temperature, heating duration, cooling mode, placing duration and relative humidity are 0.925, 0.908, 0.835, 0.917, 0.928, 0.805 and 0.92, respectively. The proposed method recognizes the attributes of heated metal marks with good accuracy, and it can be used in practical applications.
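
    A minimal PyTorch stand-in for one attribute classifier: a small CNN trained with cross-entropy, rather than the paper's fine-tuned deep models. The 64x64 input size, the 4-class output and the synthetic batch are assumptions.

    # Tiny CNN for a single mark attribute (sketch).
    import torch
    import torch.nn as nn

    class MarkNet(nn.Module):
        def __init__(self, n_classes=4):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.head = nn.Linear(32 * 16 * 16, n_classes)

        def forward(self, x):                 # x: (batch, 3, 64, 64)
            return self.head(self.features(x).flatten(1))

    model = MarkNet()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x = torch.randn(8, 3, 64, 64)             # synthetic image batch
    y = torch.randint(0, 4, (8,))             # synthetic attribute labels
    opt.zero_grad()
    loss = nn.CrossEntropyLoss()(model(x), y)
    loss.backward()
    opt.step()
    print(float(loss))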

  10. Computer vision-based sorting of Atlantic salmon (Salmo salar) fillets according to their color level.

    PubMed

    Misimi, E; Mathiassen, J R; Erikson, U

    2007-01-01

    A computer vision method was used to evaluate the color of Atlantic salmon (Salmo salar) fillets. Computer vision-based sorting of fillets according to their color was studied on 2 separate groups of salmon fillets. The images of fillets were captured using a high-resolution digital camera. Images of salmon fillets were then segmented into regions of interest, analyzed in the red, green, and blue (RGB) and CIE lightness, redness, and yellowness (Lab) color spaces, and classified according to the Roche color card industrial standard. Comparisons between visual evaluations of fillet color, made by a panel of human inspectors according to the Roche SalmoFan lineal standard, and the color scores generated by the computer vision algorithm showed that there were no significant differences between the methods. Overall, computer vision can be used as a powerful tool to sort fillets by color in a fast and nondestructive manner. The low cost of implementing computer vision solutions creates the potential to replace manual labor in fish processing plants with automation.

  11. The role of external features in face recognition with central vision loss: A pilot study

    PubMed Central

    Bernard, Jean-Baptiste; Chung, Susana T.L.

    2016-01-01

    Purpose We evaluated how the performance for recognizing familiar face images depends on the internal (eyebrows, eyes, nose, mouth) and external face features (chin, outline of face, hairline) in individuals with central vision loss. Methods In Experiment 1, we measured eye movements for four observers with central vision loss to determine whether they fixated more often on the internal or the external features of face images while attempting to recognize the images. We then measured the accuracy for recognizing face images that contained only the internal, only the external, or both internal and external features (Experiment 2), and for hybrid images where the internal and external features came from two different source images (Experiment 3), for five observers with central vision loss and four age-matched control observers. Results When recognizing familiar face images, approximately 40% of the fixations of observers with central vision loss were centered on the external features of faces. The recognition accuracy was higher for images containing only external features (66.8±3.3% correct) than for images containing only internal features (35.8±15.0%), a finding contradicting that of control observers. For hybrid face images, observers with central vision loss responded more accurately to the external features (50.4±17.8%) than to the internal features (9.3±4.9%), while control observers did not show the same bias toward responding to the external features. Conclusions Contrary to people with normal vision who rely more on the internal features of face images for recognizing familiar faces, individuals with central vision loss show a higher dependence on using external features of face images. PMID:26829260

  12. The Role of External Features in Face Recognition with Central Vision Loss.

    PubMed

    Bernard, Jean-Baptiste; Chung, Susana T L

    2016-05-01

    We evaluated how the performance of recognizing familiar face images depends on the internal (eyebrows, eyes, nose, mouth) and external face features (chin, outline of face, hairline) in individuals with central vision loss. In experiment 1, we measured eye movements for four observers with central vision loss to determine whether they fixated more often on the internal or the external features of face images while attempting to recognize the images. We then measured the accuracy for recognizing face images that contained only the internal, only the external, or both internal and external features (experiment 2) and for hybrid images where the internal and external features came from two different source images (experiment 3) for five observers with central vision loss and four age-matched control observers. When recognizing familiar face images, approximately 40% of the fixations of observers with central vision loss were centered on the external features of faces. The recognition accuracy was higher for images containing only external features (66.8 ± 3.3% correct) than for images containing only internal features (35.8 ± 15.0%), a finding contradicting that of control observers. For hybrid face images, observers with central vision loss responded more accurately to the external features (50.4 ± 17.8%) than to the internal features (9.3 ± 4.9%), whereas control observers did not show the same bias toward responding to the external features. Contrary to people with normal vision who rely more on the internal features of face images for recognizing familiar faces, individuals with central vision loss show a higher dependence on using external features of face images.

  13. Selective Attention in Vision: Recognition Memory for Superimposed Line Drawings.

    ERIC Educational Resources Information Center

    Goldstein, E. Bruce; Fink, Susan I.

    1981-01-01

    Four experiments show that observers can selectively attend to one of two stationary superimposed pictures. Selective recognition occurred with large displays in which observers were free to make eye movements during a 3-sec exposure and with small displays in which observers were instructed to fixate steadily on a point. (Author/RD)

  14. Face recognition in newly hatched chicks at the onset of vision.

    PubMed

    Wood, Samantha M W; Wood, Justin N

    2015-04-01

    How does face recognition emerge in the newborn brain? To address this question, we used an automated controlled-rearing method with a newborn animal model: the domestic chick (Gallus gallus). This automated method allowed us to examine chicks' face recognition abilities at the onset of both face experience and object experience. In the first week of life, newly hatched chicks were raised in controlled-rearing chambers that contained no objects other than a single virtual human face. In the second week of life, we used an automated forced-choice testing procedure to examine whether chicks could distinguish that familiar face from a variety of unfamiliar faces. Chicks successfully distinguished the familiar face from most of the unfamiliar faces; for example, chicks were sensitive to changes in the face's age, gender, and orientation (upright vs. inverted). Thus, chicks can build an accurate representation of the first face they see in their life. These results show that the initial state of face recognition is surprisingly powerful: Newborn visual systems can begin encoding and recognizing faces at the onset of vision. (c) 2015 APA, all rights reserved.

  15. Higher-order neural network software for distortion invariant object recognition

    NASA Technical Reports Server (NTRS)

    Reid, Max B.; Spirkovska, Lilly

    1991-01-01

    The state-of-the-art in pattern recognition for such applications as automatic target recognition and industrial robotic vision relies on digital image processing. We present a higher-order neural network model and software which performs the complete feature extraction-pattern classification paradigm required for automatic pattern recognition. Using a third-order neural network, we demonstrate complete, 100 percent accurate invariance to distortions of scale, position, and in-plane rotation. In a higher-order neural network, feature extraction is built into the network, and does not have to be learned. Only the relatively simple classification step must be learned. This is key to achieving very rapid training. The training set is much smaller than with standard neural network software because the higher-order network only has to be shown one view of each object to be learned, not every possible view. The software and graphical user interface run on any Sun workstation. Results of the use of the neural software in autonomous robotic vision systems are presented. Such a system could have extensive application in robotic manufacturing.

  16. Machine Learning, deep learning and optimization in computer vision

    NASA Astrophysics Data System (ADS)

    Canu, Stéphane

    2017-03-01

    As noted at the Large Scale Computer Vision Systems NIPS workshop, computer vision is a mature field with a long tradition of research, but recent advances in machine learning, deep learning, representation learning and optimization have provided models with new capabilities to better understand visual content. The presentation will go through these new developments in machine learning, covering basic motivations, ideas, models and optimization in deep learning for computer vision, and identifying challenges and opportunities. It will focus on issues related to large-scale learning, that is: high-dimensional features, a large variety of visual classes, and a large number of examples.

  17. Deep Learning in Gastrointestinal Endoscopy.

    PubMed

    Patel, Vivek; Armstrong, David; Ganguli, Malika; Roopra, Sandeep; Kantipudi, Neha; Albashir, Siwar; Kamath, Markad V

    2016-01-01

    Gastrointestinal (GI) endoscopy is used to inspect the lumen or interior of the GI tract for several purposes, including (1) making a clinical diagnosis, in real time, based on the visual appearances; (2) taking targeted tissue samples for subsequent histopathological examination; and (3) in some cases, performing therapeutic interventions targeted at specific lesions. GI endoscopy is therefore predicated on the assumption that the operator, the endoscopist, is able to identify and characterize abnormalities or lesions accurately and reproducibly. However, as in other areas of clinical medicine, such as histopathology and radiology, many studies have documented marked interobserver and intraobserver variability in lesion recognition. Thus, there is a clear need and opportunity for techniques or methodologies that will enhance the quality of lesion recognition and diagnosis and improve the outcomes of GI endoscopy. Deep learning models provide a basis to make better clinical decisions in medical image analysis. Biomedical image segmentation, classification, and registration can be improved with deep learning. Recent evidence suggests that the application of deep learning methods to medical image analysis can contribute significantly to computer-aided diagnosis. Deep learning models are usually considered to be more flexible and to provide more reliable solutions for image analysis problems than conventional computer vision models. The use of fast computers offers the possibility of real-time support, which is important for endoscopic diagnosis. Advanced graphics processing units and cloud computing have also favored the use of machine learning, and more particularly deep learning, for patient care. This paper reviews the rapidly evolving literature on the feasibility of applying deep learning algorithms to endoscopic imaging.

  18. Current Technologies and Trends of Machine Vision in the Field of Security and Disaster Prevention

    NASA Astrophysics Data System (ADS)

    Hashimoto, Manabu; Fujino, Yozo

    Image sensing technologies are expected to be useful and effective ways to reduce damage from crime and disasters in a highly safe and secure society. In this paper, we describe the current key topics, required functions, technical trends, and a couple of real examples of developed systems. For video surveillance, recognition of human trajectories and human behavior using image processing techniques is introduced, with real examples of violence detection in elevators. In the field of facility monitoring for civil engineering, useful machine vision applications are shown, such as automatic detection of concrete cracks on building walls and recognition of crowds on bridges for effective guidance in emergencies.

  19. Biologically based machine vision: signal analysis of monopolar cells in the visual system of Musca domestica.

    PubMed

    Newton, Jenny; Barrett, Steven F; Wilcox, Michael J; Popp, Stephanie

    2002-01-01

    Machine vision for navigational purposes is a rapidly growing field. Many abilities such as object recognition and target tracking rely on vision. Autonomous vehicles must be able to navigate in dynamic environments and simultaneously locate a target position. Traditional machine vision often fails to react in real time because of large computational requirements, whereas the fly achieves complex orientation and navigation with a relatively small and simple brain. Understanding how the fly extracts visual information and how its neurons encode and process information could lead us to a new approach for machine vision applications. Photoreceptors in the Musca domestica eye that share the same spatial information converge into a structure called the cartridge. The cartridge consists of the photoreceptor axon terminals and monopolar cells L1, L2, and L4. It is thought that L1 and L2 cells encode edge-related information relative to a single cartridge. These cells are thought to be equivalent to vertebrate bipolar cells, producing contrast enhancement and reduction of the information sent to L4. Monopolar cell L4 is thought to perform image segmentation on the information input from L1 and L2 and also to enhance edge detection. A mesh of interconnected L4's would correlate the output from L1 and L2 cells of adjacent cartridges and provide a parallel network for segmenting an object's edges. The focus of this research is to excite photoreceptors of the common housefly, Musca domestica, with different visual patterns. The electrical responses of monopolar cells L1, L2, and L4 will be recorded using intracellular recording techniques. Signal analysis will determine the neurocircuitry needed to detect and segment images.

  20. 3-D Signal Processing in a Computer Vision System

    Treesearch

    Dongping Zhu; Richard W. Conners; Philip A. Araman

    1991-01-01

    This paper discusses the problem of 3-dimensional image filtering in a computer vision system that would locate and identify internal structural failure. In particular, a 2-dimensional adaptive filter proposed by Unser has been extended to three dimensions. In conjunction with segmentation and labeling, the new filter has been used in the computer vision system to...

  1. Experiences Using an Open Source Software Library to Teach Computer Vision Subjects

    ERIC Educational Resources Information Center

    Cazorla, Miguel; Viejo, Diego

    2015-01-01

    Machine vision is an important subject in computer science and engineering degrees. For laboratory experimentation, it is desirable to have a complete and easy-to-use tool. In this work we present a Java library oriented to teaching computer vision. We have designed and built the library from scratch with emphasis on readability and…

  2. Detecting Motion from a Moving Platform; Phase 3: Unification of Control and Sensing for More Advanced Situational Awareness

    DTIC Science & Technology

    2011-11-01

    RX-TY-TR-2011-0096-01) develops a novel computer vision sensor based upon the biological vision system of the common housefly, Musca domestica...01 summarizes the development of a novel computer vision sensor based upon the biological vision system of the common housefly, Musca domestica

  3. Assistive technology for children and young people with low vision.

    PubMed

    Thomas, Rachel; Barker, Lucy; Rubin, Gary; Dahlmann-Noor, Annegret

    2015-06-18

    Recent technological developments, such as the near universal spread of mobile phones and portable computers and improvements in the accessibility features of these devices, give children and young people with low vision greater independent access to information. Some electronic technologies, such as closed circuit TV, are well established low vision aids, and newer versions, such as electronic readers or off-the-shelf tablet computers, may offer similar functionalities with easier portability and at lower cost. To assess the effect of electronic assistive technologies on reading, educational outcomes and quality of life in children and young people with low vision. We searched CENTRAL (which contains the Cochrane Eyes and Vision Group Trials Register) (2014, Issue 9), Ovid MEDLINE, Ovid MEDLINE In-Process and Other Non-Indexed Citations, Ovid MEDLINE Daily, Ovid OLDMEDLINE (January 1946 to October 2014), EMBASE (January 1980 to October 2014), the Health Technology Assessment Programme (HTA) (www.hta.ac.uk/), the metaRegister of Controlled Trials (mRCT) (www.controlled-trials.com), ClinicalTrials.gov (www.clinicaltrials.gov) and the World Health Organization (WHO) International Clinical Trials Registry Platform (ICTRP) (www.who.int/ictrp/search/en). We did not use any date or language restrictions in the electronic searches for trials. We last searched the electronic databases on 30 October 2014. We intended to include randomised controlled trials (RCTs) and quasi-RCTs in this review. We planned to include trials involving children between the ages of 5 and 16 years with low vision as defined by, or equivalent to, the WHO 1992 definition of low vision. We planned to include studies that explore the use of assistive technologies (ATs). These could include all types of closed circuit television/electronic vision enhancement systems (CCTV/EVES), computer technology including tablet computers, and adaptive technologies such as screen readers, screen magnification and optical character recognition (OCR). We intended to compare the use of ATs with standard optical aids, which include distance refractive correction (with appropriate near addition for aphakic (no lens)/pseudophakic (with lens implant) patients) and monocular/binoculars for distance and brightfield magnifiers for near. We also planned to include studies that compare different types of ATs with each other, without or in addition to conventional optical aids, and those that compare ATs given with or without instructions for use. Independently, two review authors reviewed titles and abstracts for eligibility. They divided studies into 'definitely include', 'definitely exclude' and 'possibly include' categories, and the same two authors made final judgements about inclusion/exclusion by obtaining full-text copies of the studies in the 'possibly include' category. We did not identify any randomised controlled trials in this subject area. High-quality evidence about the usefulness of electronic AT for children and young people with visual impairment is needed to inform the choices that healthcare and education providers and families have to make when selecting a technology. Randomised controlled trials are needed to assess the impact of AT. Research protocols should carefully select outcomes relevant not only to the scientific community, but more importantly to families and teachers. Functional outcomes such as reading accuracy, comprehension and speed should be recorded, as well as the impact of AT on independent learning and quality of life.

  4. Vision-Based UAV Flight Control and Obstacle Avoidance

    DTIC Science & Technology

    2006-01-01

    denoted it by Vb = (Vb1, Vb2, Vb3). Fig. 2 shows the block diagram of the proposed vision-based motion analysis and obstacle avoidance system. We denote...structure analysis often involve computation-intensive computer vision tasks, such as feature extraction and geometric modeling. Computation-intensive...First, we extract a set of features from each block. 2) Second, we compute the distance between these two sets of features. In conventional motion

  5. a Fully Automated Pipeline for Classification Tasks with AN Application to Remote Sensing

    NASA Astrophysics Data System (ADS)

    Suzuki, K.; Claesen, M.; Takeda, H.; De Moor, B.

    2016-06-01

    Nowadays deep learning is intensively in the spotlight owing to its great victories at major competitions, which has undeservedly pushed "shallow" machine learning methods, the relatively simple and handy algorithms commonly used by industrial engineers, into the background in spite of their advantages, such as the small amount of time and data required for training. Taking a practical point of view, we utilized shallow learning algorithms to construct a learning pipeline such that operators can use machine learning without any special knowledge, expensive computation environments, or a large amount of labelled data. The proposed pipeline automates the whole classification process, namely feature selection, feature weighting, and the selection of the most suitable classifier with optimized hyperparameters. The configuration employs particle swarm optimization, a well-known metaheuristic algorithm, for generally fast and fine optimization, which enables us not only to optimize (hyper)parameters but also to determine appropriate features and classifiers for the problem; conventionally, this has been done a priori based on domain knowledge, or has remained untouched or been handled with naive approaches such as grid search. Through experiments with the MNIST and CIFAR-10 datasets, common datasets in the computer vision field for character recognition and object recognition problems respectively, our automated learning approach provides high performance considering its simple setting (i.e., no dataset-specific specialization), small amount of training data, and practical learning time. Moreover, compared to deep learning, the performance stays robust almost without modification even on a remote sensing object recognition problem, which in turn indicates a high possibility that our approach contributes to general classification problems.
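
    The metaheuristic search can be illustrated with a toy particle swarm over two SVM hyperparameters (log10 C, log10 gamma) on the scikit-learn digits data. The real pipeline also optimizes feature selection, feature weights and the choice of classifier, which this sketch omits.

    # Toy PSO over SVM hyperparameters (sketch; runs a few dozen CV fits).
    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    X, y = load_digits(return_X_y=True)
    rng = np.random.default_rng(0)

    def fitness(p):                        # p = (log10 C, log10 gamma)
        return cross_val_score(SVC(C=10 ** p[0], gamma=10 ** p[1]),
                               X, y, cv=3).mean()

    n = 8                                  # particles
    pos = rng.uniform([-2, -5], [3, 0], size=(n, 2))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pval = np.array([fitness(p) for p in pos])
    for _ in range(10):
        gbest = pbest[pval.argmax()]       # swarm-wide best position
        vel = (0.7 * vel
               + 1.5 * rng.random((n, 1)) * (pbest - pos)
               + 1.5 * rng.random((n, 1)) * (gbest - pos))
        pos += vel
        fit = np.array([fitness(p) for p in pos])
        better = fit > pval
        pbest[better], pval[better] = pos[better], fit[better]
    print("best CV accuracy:", pval.max())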

  6. Simulation Test of a Head-Worn Display with Ambient Vision Display for Unusual Attitude Recovery

    NASA Technical Reports Server (NTRS)

    Arthur, Jarvis (Trey) J., III; Nicholas, Stephanie N.; Shelton, Kevin J.; Ballard, Kathryn; Prinzel, Lawrence J., III; Ellis, Kyle E.; Bailey, Randall E.; Williams, Steven P.

    2017-01-01

    Head-Worn Displays (HWDs) are envisioned as a possible equivalent to a Head-Up Display (HUD) in commercial and general aviation. A simulation experiment was conducted to evaluate whether the HWD can provide a level of performance equivalent to or better than a HUD in terms of unusual attitude recognition and recovery. A prototype HWD with an ambient vision capability, which was varied (on/off) as an independent variable, was tested for attitude awareness. The simulation experiment was conducted in two parts: 1) short unusual attitude recovery scenarios where the aircraft is placed in an unusual attitude and a single-pilot crew recovers the aircraft; and 2) a two-pilot crew operating in a realistic flight environment with "off-nominal" events to induce unusual attitudes. The data showed few differences in unusual attitude recognition and recovery performance between the tested head-down, head-up, and head-worn display concepts. The effect of the presence or absence of ambient vision stimulation was inconclusive. The ergonomic influences of the head-worn display, necessary to implement the ambient vision experimentation, may have influenced the pilot ratings and acceptance of the concepts.

  7. Recognition of plant parts with problem-specific algorithms

    NASA Astrophysics Data System (ADS)

    Schwanke, Joerg; Brendel, Thorsten; Jensch, Peter F.; Megnet, Roland

    1994-06-01

    Automatic micropropagation is necessary to produce large amounts of biomass cost-effectively. Juvenile plants are dissected in a clean-room environment at particular points on the stem or the leaves. A vision system detects possible cutting points and controls a specialized robot. This contribution is directed to the pattern-recognition algorithms used to detect structural parts of the plant.

  8. A low-cost machine vision system for the recognition and sorting of small parts

    NASA Astrophysics Data System (ADS)

    Barea, Gustavo; Surgenor, Brian W.; Chauhan, Vedang; Joshi, Keyur D.

    2018-04-01

    An automated machine vision-based system for the recognition and sorting of small parts was designed, assembled and tested. The system was developed to address a need to expose engineering students to the issues of machine vision and assembly automation technology, with readily available and relatively low-cost hardware and software. This paper outlines the design of the system and presents experimental performance results. Three different styles of plastic gears, together with three different styles of defective gears, were used to test the system. A pattern-matching tool was used for part classification. Nine experiments were conducted to demonstrate the effects of changing various hardware and software parameters, including conveyor speed, gear feed rate, and classification and identification score thresholds. It was found that the system could achieve a maximum accuracy of 95% at a feed rate of 60 parts/min for a given set of parameter settings. Future work will look at the effect of lighting.
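
    A minimal sketch of pattern-matching classification in the spirit of the system, using OpenCV's normalized cross-correlation: the best-scoring template wins, and a score threshold flags unclassified parts. The synthetic images, templates and the 0.8 threshold are assumptions.

    # Template-matching part classification (sketch).
    import cv2
    import numpy as np

    rng = np.random.default_rng(0)
    frame = (rng.random((240, 320)) * 255).astype(np.uint8)
    templates = {"gearA": frame[50:90, 60:100].copy(),   # exact crop -> score ~1.0
                 "gearB": (rng.random((40, 40)) * 255).astype(np.uint8)}

    scores = {}
    for name, tmpl in templates.items():
        res = cv2.matchTemplate(frame, tmpl, cv2.TM_CCOEFF_NORMED)
        scores[name] = cv2.minMaxLoc(res)[1]             # best match score

    best = max(scores, key=scores.get)
    print(best if scores[best] >= 0.8 else "unclassified", scores)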

  9. Binary patterns encoded convolutional neural networks for texture recognition and remote sensing scene classification

    NASA Astrophysics Data System (ADS)

    Anwer, Rao Muhammad; Khan, Fahad Shahbaz; van de Weijer, Joost; Molinier, Matthieu; Laaksonen, Jorma

    2018-04-01

    Designing discriminative, powerful texture features robust to realistic imaging conditions is a challenging computer vision problem with many applications, including material recognition and analysis of satellite or aerial imagery. In the past, most texture description approaches were based on dense orderless statistical distributions of local features. However, most recent approaches to texture recognition and remote sensing scene classification are based on Convolutional Neural Networks (CNNs). The de facto practice when learning these CNN models is to use RGB patches as input, with training performed on large amounts of labeled data (ImageNet). In this paper, we show that Local Binary Patterns (LBP) encoded CNN models, codenamed TEX-Nets, trained using mapped coded images with explicit LBP-based texture information, provide complementary information to the standard RGB deep models. Additionally, two deep architectures, namely early and late fusion, are investigated to combine the texture and color information. To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification. We perform comprehensive experiments on four texture recognition datasets and four remote sensing scene classification benchmarks: UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with 7 categories and the recently introduced large scale aerial image dataset (AID) with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary information to a standard RGB deep model of the same network architecture. Our late fusion TEX-Net architecture always improves the overall performance compared to the standard RGB network on both recognition problems. Furthermore, our final combination leads to consistent improvement over the state-of-the-art for remote sensing scene classification.
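
    The texture stream's input can be illustrated by computing a local binary pattern map with scikit-image and stacking it with the RGB channels. TEX-Nets use a specific mapped coding of LBP values, which this sketch does not reproduce.

    # LBP map stacked with RGB as a texture-aware network input (sketch).
    import numpy as np
    from skimage.feature import local_binary_pattern

    rng = np.random.default_rng(0)
    rgb = rng.random((64, 64, 3))               # stand-in RGB patch
    gray = rgb.mean(axis=2)

    lbp = local_binary_pattern(gray, P=8, R=1.0, method="uniform")
    lbp = lbp / lbp.max()                       # scale codes to [0, 1]
    x = np.dstack([rgb, lbp])                   # (64, 64, 4) input tensor
    print(x.shape)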

  10. Common constraints limit Korean and English character recognition in peripheral vision.

    PubMed

    He, Yingchen; Kwon, MiYoung; Legge, Gordon E

    2018-01-01

    The visual span refers to the number of adjacent characters that can be recognized in a single glance. It is viewed as a sensory bottleneck in reading for both normal and clinical populations. In peripheral vision, the visual span for English characters can be enlarged after training with a letter-recognition task. Here, we examined the transfer of training from Korean to English characters for a group of bilingual Korean native speakers. In the pre- and posttests, we measured visual spans for Korean characters and English letters. Training (1.5 hours × 4 days) consisted of repetitive visual-span measurements for Korean trigrams (strings of three characters). Our training enlarged the visual spans for Korean single characters and trigrams, and the benefit transferred to untrained English symbols. The improvement was largely due to a reduction of within-character and between-character crowding in Korean recognition, as well as between-letter crowding in English recognition. We also found a negative correlation between the size of the visual span and the average pattern complexity of the symbol set. Together, our results showed that the visual span is limited by common sensory (crowding) and physical (pattern complexity) factors regardless of the language script, providing evidence that the visual span reflects a universal bottleneck for text recognition.

  11. Common constraints limit Korean and English character recognition in peripheral vision

    PubMed Central

    He, Yingchen; Kwon, MiYoung; Legge, Gordon E.

    2018-01-01

    The visual span refers to the number of adjacent characters that can be recognized in a single glance. It is viewed as a sensory bottleneck in reading for both normal and clinical populations. In peripheral vision, the visual span for English characters can be enlarged after training with a letter-recognition task. Here, we examined the transfer of training from Korean to English characters for a group of bilingual Korean native speakers. In the pre- and posttests, we measured visual spans for Korean characters and English letters. Training (1.5 hours × 4 days) consisted of repetitive visual-span measurements for Korean trigrams (strings of three characters). Our training enlarged the visual spans for Korean single characters and trigrams, and the benefit transferred to untrained English symbols. The improvement was largely due to a reduction of within-character and between-character crowding in Korean recognition, as well as between-letter crowding in English recognition. We also found a negative correlation between the size of the visual span and the average pattern complexity of the symbol set. Together, our results showed that the visual span is limited by common sensory (crowding) and physical (pattern complexity) factors regardless of the language script, providing evidence that the visual span reflects a universal bottleneck for text recognition. PMID:29327041

  12. Component-based target recognition inspired by human vision

    NASA Astrophysics Data System (ADS)

    Zheng, Yufeng; Agyepong, Kwabena

    2009-05-01

    In contrast with machine vision, humans can recognize an object against a complex background with great flexibility. For example, given the task of finding and circling all cars (with no further information) in a picture, you may build a virtual image in your mind from the task (or target) description before looking at the picture. Specifically, the virtual car image may be composed of key components such as the driver cabin and wheels. In this paper, we propose a component-based target recognition method that simulates the human recognition process. The component templates (equivalent to the virtual image in mind) of the target (car) are manually decomposed from the target feature image. Meanwhile, the edges of the testing image are extracted using a difference of Gaussian (DOG) model that simulates the spatiotemporal response of visual processing. A phase correlation matching algorithm is then applied to match the templates with the testing edge image. If all key component templates are matched within the examined object, then this object is recognized as the target. Besides recognition accuracy, we also investigate whether this method works with partial targets (half cars). In our experiments, several natural pictures taken on streets were used to test the proposed method. The preliminary results show that the component-based recognition method is very promising.
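
    The two ingredients, a difference-of-Gaussians (DOG) edge map and phase-correlation template matching, can be sketched as follows on a synthetic image; the sigmas and template are assumptions, and the manual component decomposition is not shown.

    # DOG edge map + FFT phase correlation for template localization (sketch).
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def dog(img, s1=1.0, s2=2.0):
        return gaussian_filter(img, s1) - gaussian_filter(img, s2)

    def phase_correlate(image, template):
        """Return the (row, col) offset of the template inside the image."""
        padded = np.zeros_like(image)
        padded[:template.shape[0], :template.shape[1]] = template
        FA, FB = np.fft.fft2(image), np.fft.fft2(padded)
        cross = FA * np.conj(FB)
        corr = np.fft.ifft2(cross / (np.abs(cross) + 1e-9)).real
        return np.unravel_index(corr.argmax(), corr.shape)

    img = np.zeros((128, 128))
    img[40:60, 70:90] = 1.0                      # a rectangular "component"
    template = img[35:65, 65:95].copy()          # its local neighborhood
    print(phase_correlate(dog(img), dog(template)))  # ~ (35, 65)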

  13. Quality grading of Atlantic salmon (Salmo salar) by computer vision.

    PubMed

    Misimi, E; Erikson, U; Skavhaug, A

    2008-06-01

    In this study, we present a promising method for computer vision-based quality grading of whole Atlantic salmon (Salmo salar). Using computer vision, it was possible to differentiate among different quality grades of Atlantic salmon based on the external geometrical information contained in the fish images. Initially, before the image acquisition, the fish were subjectively graded and labeled into grading classes by a qualified human inspector in the processing plant. Prior to classification, the salmon images were segmented into binary images, and feature extraction was then performed on the geometrical parameters of the fish from the grading classes. The classification algorithm was a threshold-based classifier designed using linear discriminant analysis. The performance of the classifier was tested using the leave-one-out cross-validation method, and the classification results showed good agreement between the classification done by human inspectors and by computer vision. The computer vision-based method correctly classified 90% of the salmon from the data set as compared with the classification by the human inspector. Overall, it was shown that computer vision can be used as a powerful tool to grade Atlantic salmon into quality grades in a fast and nondestructive manner with a relatively simple classifier algorithm. The low cost of implementing today's advanced computer vision solutions makes this method feasible for industrial purposes in fish plants, as it can replace manual labor, on which grading tasks still rely.
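
    A sketch of the classification stage: linear discriminant analysis on geometric features, validated leave-one-out as in the study. The features and grade labels here are random stand-ins, so the reported score is near chance.

    # LDA grading with leave-one-out cross-validation (sketch).
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import LeaveOneOut, cross_val_score

    rng = np.random.default_rng(0)
    feats = rng.random((120, 5))            # 5 geometric features per fish
    grades = rng.integers(0, 3, size=120)   # e.g. superior/ordinary/production

    lda = LinearDiscriminantAnalysis()
    acc = cross_val_score(lda, feats, grades, cv=LeaveOneOut()).mean()
    print("leave-one-out accuracy:", acc)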

  14. Construction, implementation and testing of an image identification system using computer vision methods for fruit flies with economic importance (Diptera: Tephritidae).

    PubMed

    Wang, Jiang-Ning; Chen, Xiao-Lin; Hou, Xin-Wen; Zhou, Li-Bing; Zhu, Chao-Dong; Ji, Li-Qiang

    2017-07-01

    Many species of Tephritidae are damaging to fruit, which might negatively impact international fruit trade. Automatic or semi-automatic identification of fruit flies is greatly needed for diagnosing causes of damage and for quarantine protocols for economically relevant insects. A fruit fly image identification system named AFIS1.0 has been developed using 74 species belonging to six genera, which include the majority of pests in the Tephritidae. The system combines automated image identification and manual verification, balancing operability and accuracy. AFIS1.0 integrates image analysis and an expert system into a content-based image retrieval framework. In the automatic identification module, AFIS1.0 gives candidate identification results. Afterwards, users can make a manual selection by comparing unidentified images with a subset of images corresponding to the automatic identification result. The system uses Gabor surface features in automated identification and yielded an overall classification success rate of 87% to the species level in the Independent Multi-part Image Automatic Identification Test. The system is useful for users with or without specific expertise on Tephritidae in the task of rapid and effective identification of fruit flies. It brings the application of computer vision technology to fruit fly recognition much closer to production level. © 2016 Society of Chemical Industry.
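
    The retrieval idea can be sketched with a small Gabor filter bank and nearest-neighbor search; the exact "Gabor surface features" of AFIS1.0 are not reproduced, and the gallery images are synthetic stand-ins.

    # Gabor filter-bank features + nearest-neighbor retrieval (sketch).
    import numpy as np
    from skimage.filters import gabor

    def gabor_features(img, freqs=(0.1, 0.2), thetas=(0, np.pi / 4, np.pi / 2)):
        feats = []
        for f in freqs:
            for t in thetas:
                real, imag = gabor(img, frequency=f, theta=t)
                mag = np.hypot(real, imag)
                feats += [mag.mean(), mag.std()]   # 2 statistics per filter
        return np.array(feats)

    rng = np.random.default_rng(0)
    gallery = rng.random((10, 32, 32))             # stand-in specimen images
    query = gallery[3] + 0.01 * rng.random((32, 32))
    G = np.array([gabor_features(im) for im in gallery])
    q = gabor_features(query)
    print("best match:", np.argmin(np.linalg.norm(G - q, axis=1)))  # -> 3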

  15. Focal-Plane Sensing-Processing: A Power-Efficient Approach for the Implementation of Privacy-Aware Networked Visual Sensors

    PubMed Central

    Fernández-Berni, Jorge; Carmona-Galán, Ricardo; del Río, Rocío; Kleihorst, Richard; Philips, Wilfried; Rodríguez-Vázquez, Ángel

    2014-01-01

    The capture, processing and distribution of visual information is one of the major challenges for the paradigm of the Internet of Things. Privacy emerges as a fundamental barrier to overcome. The idea of networked image sensors pervasively collecting data generates social rejection in the face of sensitive information being tampered by hackers or misused by legitimate users. Power consumption also constitutes a crucial aspect. Images contain a massive amount of data to be processed under strict timing requirements, demanding high-performance vision systems. In this paper, we describe a hardware-based strategy to concurrently address these two key issues. By conveying processing capabilities to the focal plane in addition to sensing, we can implement privacy protection measures just at the point where sensitive data are generated. Furthermore, such measures can be tailored for efficiently reducing the computational load of subsequent processing stages. As a proof of concept, a full-custom QVGA vision sensor chip is presented. It incorporates a mixed-signal focal-plane sensing-processing array providing programmable pixelation of multiple image regions in parallel. In addition to this functionality, the sensor exploits reconfigurability to implement other processing primitives, namely block-wise dynamic range adaptation, integral image computation and multi-resolution filtering. The proposed circuitry is also suitable to build a granular space, becoming the raw material for subsequent feature extraction and recognition of categorized objects. PMID:25195849

  16. Focal-plane sensing-processing: a power-efficient approach for the implementation of privacy-aware networked visual sensors.

    PubMed

    Fernández-Berni, Jorge; Carmona-Galán, Ricardo; del Río, Rocío; Kleihorst, Richard; Philips, Wilfried; Rodríguez-Vázquez, Ángel

    2014-08-19

    The capture, processing and distribution of visual information is one of the major challenges for the paradigm of the Internet of Things. Privacy emerges as a fundamental barrier to overcome. The idea of networked image sensors pervasively collecting data generates social rejection in the face of sensitive information being tampered by hackers or misused by legitimate users. Power consumption also constitutes a crucial aspect. Images contain a massive amount of data to be processed under strict timing requirements, demanding high-performance vision systems. In this paper, we describe a hardware-based strategy to concurrently address these two key issues. By conveying processing capabilities to the focal plane in addition to sensing, we can implement privacy protection measures just at the point where sensitive data are generated. Furthermore, such measures can be tailored for efficiently reducing the computational load of subsequent processing stages. As a proof of concept, a full-custom QVGA vision sensor chip is presented. It incorporates a mixed-signal focal-plane sensing-processing array providing programmable pixelation of multiple image regions in parallel. In addition to this functionality, the sensor exploits reconfigurability to implement other processing primitives, namely block-wise dynamic range adaptation, integral image computation and multi-resolution filtering. The proposed circuitry is also suitable to build a granular space, becoming the raw material for subsequent feature extraction and recognition of categorized objects.
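
    The privacy primitive, pixelation of selected image regions, is easy to express in software, although the chip performs it in mixed-signal hardware at the focal plane; the region and block size below are illustrative.

    # Block-averaging pixelation of a selected region (software sketch).
    import numpy as np

    def pixelate(img, r0, r1, c0, c1, block=8):
        """Replace img[r0:r1, c0:c1] with block-averaged values."""
        out = img.copy()
        region = out[r0:r1, c0:c1]
        h, w = region.shape
        for i in range(0, h, block):
            for j in range(0, w, block):
                region[i:i + block, j:j + block] = \
                    region[i:i + block, j:j + block].mean()
        return out

    frame = np.random.default_rng(0).random((240, 320))
    print(pixelate(frame, 60, 140, 100, 200).shape)   # region now obscured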

  17. A novel binary shape context for 3D local surface description

    NASA Astrophysics Data System (ADS)

    Dong, Zhen; Yang, Bisheng; Liu, Yuan; Liang, Fuxun; Li, Bijun; Zang, Yufu

    2017-08-01

    3D local surface description is now at the core of many computer vision technologies, such as 3D object recognition, intelligent driving, and 3D model reconstruction. However, most of the existing 3D feature descriptors still suffer from low descriptiveness, weak robustness, and inefficiency in both time and memory. To overcome these challenges, this paper presents a robust and descriptive 3D Binary Shape Context (BSC) descriptor with high efficiency in both time and memory. First, the novel BSC descriptor is generated for 3D local surface description, and its performance under different parameter settings is analyzed. Next, the descriptiveness, robustness, and efficiency in both time and memory of the BSC descriptor are evaluated and compared to those of several state-of-the-art 3D feature descriptors. Finally, the performance of the BSC descriptor for 3D object recognition is evaluated on a number of popular benchmark datasets and on an urban-scene dataset collected by a terrestrial laser scanner system. Comprehensive experiments demonstrate that the proposed BSC descriptor achieves high descriptiveness, strong robustness, and high efficiency in both time and memory, reaching recognition rates of 94.8%, 94.1% and 82.1% on the UWA, Queen, and WHU datasets, respectively.

  18. Retina vascular network recognition

    NASA Astrophysics Data System (ADS)

    Tascini, Guido; Passerini, Giorgio; Puliti, Paolo; Zingaretti, Primo

    1993-09-01

    The analysis of morphological and structural modifications of the retinal vascular network is an interesting investigation method in the study of diabetes and hypertension. Normally this analysis is carried out by qualitative evaluations, according to standardized criteria, though medical research attaches great importance to quantitative analysis of vessel color, shape and dimensions. This paper describes a system which automatically segments and recognizes the ocular fundus circulation and microcirculation network, and extracts a set of features related to morphometric aspects of vessels. For this class of images the classical segmentation methods seem weak. We propose a computer vision system in which the segmentation and recognition phases are strictly connected. The system is hierarchically organized in four modules. First, the Image Enhancement Module (IEM) applies a set of custom image enhancements to remove blur and to prepare data for the subsequent segmentation and recognition processes. Second, the Papilla Border Analysis Module (PBAM) automatically recognizes the number, position and local diameter of blood vessels departing from the optical papilla. Then the Vessel Tracking Module (VTM) analyses vessels by comparing the results of body and edge tracking and detects branches and crossings. Finally, the Feature Extraction Module evaluates PBAM and VTM output data and extracts numerical indexes. The algorithms appear robust and have been successfully tested on various ocular fundus images.
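
    A modern illustrative stand-in for the segmentation stage (not the IEM/PBAM/VTM system described above): a vesselness filter followed by thresholding and skeletonization, run here on a synthetic image.

    # Vessel-network extraction: vesselness filter + threshold + skeleton.
    import numpy as np
    from skimage.filters import frangi
    from skimage.morphology import skeletonize

    rng = np.random.default_rng(0)
    img = rng.random((128, 128)) * 0.1
    img[:, 60:64] += 1.0                 # a synthetic "vessel" stripe

    vesselness = frangi(img)             # enhances tubular structures
    mask = vesselness > vesselness.mean() + 2 * vesselness.std()
    centerlines = skeletonize(mask)      # 1-pixel-wide vessel network
    print(int(centerlines.sum()), "centerline pixels")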

  19. Extraction of edge-based and region-based features for object recognition

    NASA Astrophysics Data System (ADS)

    Coutts, Benjamin; Ravi, Srinivas; Hu, Gongzhu; Shrikhande, Neelima

    1993-08-01

    One of the central problems of computer vision is object recognition. A catalogue of model objects is described as a set of features such as edges and surfaces. The same features are extracted from the scene and matched against the models for object recognition. Edges and surfaces extracted from scenes are often noisy and imperfect. In this paper, algorithms are described for improving low-level edge and surface features. Existing edge extraction algorithms are applied to the intensity image to obtain edge features. Initial edges are traced by following the direction of the current contour. These are improved by using corresponding depth and intensity information for decision making at branch points. Surface fitting routines are applied to the range image to obtain planar surface patches. A region-growing algorithm is developed that starts with a coarse segmentation and uses quadric surface fitting to iteratively merge adjacent regions into quadric surfaces, based on approximate orthogonal distance regression. The surface information obtained is returned to the edge extraction routine to detect and remove false edges. This process repeats until no more merging or edge improvement can take place. Both synthetic (with Gaussian noise) and real images containing multiple-object scenes have been tested using the merging criteria. The results appear quite encouraging.

  20. Using parallel evolutionary development for a biologically-inspired computer vision system for mobile robots.

    PubMed

    Wright, Cameron H G; Barrett, Steven F; Pack, Daniel J

    2005-01-01

    We describe a new approach to attacking the problem of robust computer vision for mobile robots. The overall strategy is to mimic the biological evolution of animal vision systems. Our basic imaging sensor is based upon the eye of the common house fly, Musca domestica. The computational algorithms are a mix of traditional image processing, subspace techniques, and multilayer neural networks.

  1. Progress in computer vision.

    NASA Astrophysics Data System (ADS)

    Jain, A. K.; Dorai, C.

    Computer vision has emerged as a challenging and important area of research, both as an engineering and a scientific discipline. The growing importance of computer vision is evident from the fact that it was identified as one of the "Grand Challenges" and also from its prominent role in the National Information Infrastructure. While the design of a general-purpose vision system continues to be elusive, machine vision systems are being used successfully in specific application domains. Building a practical vision system requires a careful selection of appropriate sensors, extraction and integration of information from available cues in the sensed data, and evaluation of system robustness and performance. The authors discuss and demonstrate advantages of (1) multi-sensor fusion, (2) combination of features and classifiers, (3) integration of visual modules, and (4) admissibility and goal-directed evaluation of vision algorithms. The requirements of several prominent real-world applications such as biometry, document image analysis, image and video database retrieval, and automatic object model construction offer exciting problems and new opportunities to design and evaluate vision algorithms.

  2. What can neuromorphic event-driven precise timing add to spike-based pattern recognition?

    PubMed

    Akolkar, Himanshu; Meyer, Cedric; Clady, Xavier; Marre, Olivier; Bartolozzi, Chiara; Panzeri, Stefano; Benosman, Ryad

    2015-03-01

    This letter introduces a study to precisely measure what an increase in spike timing precision can add to spike-driven pattern recognition algorithms. The concept of generating spikes from images by converting gray levels into spike timings is currently at the basis of almost every spike-based modeling of biological visual systems. The use of images naturally leads to generating incorrect artificial and redundant spike timings and, more importantly, also contradicts biological findings indicating that visual processing is massively parallel and asynchronous with high temporal resolution. A new concept for acquiring visual information through pixel-individual asynchronous level-crossing sampling has been proposed in a recent generation of asynchronous neuromorphic visual sensors. Unlike conventional cameras, these sensors acquire data not at fixed points in time for the entire array but at fixed amplitude changes of their input, resulting in output that is optimally sparse in space and time, pixel-individually and precisely timed only when new (previously unknown) information is available (event based). This letter uses the high temporal resolution spiking output of neuromorphic event-based visual sensors to show that lowering time precision degrades performance on several recognition tasks, specifically when reaching the conventional range of machine vision acquisition frequencies (30-60 Hz). The use of information theory to characterize separability between classes for each temporal resolution shows that high temporal acquisition provides up to 70% more information than conventional spikes generated from frame-based acquisition as used in standard artificial vision, thus drastically increasing the separability between classes of objects. Experiments on real data show that the amount of information loss is correlated with temporal precision. Our information-theoretic study highlights the potential of neuromorphic asynchronous visual sensors for both practical applications and theoretical investigations. Moreover, it suggests that representing visual information as a precise sequence of spike times as reported in the retina offers considerable advantages for neuro-inspired visual computations.
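
    To make level-crossing sampling concrete, the following Python sketch simulates events from a frame sequence; the log-intensity conversion and the contrast threshold are illustrative assumptions. Note that simulating events from frames quantizes event times to the frame clock, which is exactly the temporal-precision loss the letter measures.

        import numpy as np

        def level_crossing_events(frames, timestamps, theta=0.15):
            """Emit (t, x, y, polarity) events whenever the log intensity
            of a pixel changes by more than +/- theta since that pixel's
            last event.

            frames: (T, H, W) float array of intensity images.
            timestamps: length-T array of frame times.
            """
            log_i = np.log(frames.astype(np.float64) + 1e-6)
            ref = log_i[0].copy()      # last level that triggered an event
            events = []
            for t in range(1, len(frames)):
                diff = log_i[t] - ref
                ys, xs = np.nonzero(np.abs(diff) >= theta)
                for y, x in zip(ys, xs):
                    events.append((timestamps[t], x, y,
                                   int(np.sign(diff[y, x]))))
                    ref[y, x] = log_i[t, y, x]
            return events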

  3. Simulation of the «COSMONAUT-ROBOT» System Interaction on the Lunar Surface Based on Methods of Machine Vision and Computer Graphics

    NASA Astrophysics Data System (ADS)

    Kryuchkov, B. I.; Usov, V. M.; Chertopolokhov, V. A.; Ronzhin, A. L.; Karpov, A. A.

    2017-05-01

    Extravehicular activity (EVA) on the lunar surface, necessary for the future exploration of the Moon, involves extensive use of robots. One of the factors of safe EVA is a proper interaction between cosmonauts and robots in extreme environments. This requires a simple and natural man-machine interface, e.g. a multimodal contactless interface based on recognition of gestures and the cosmonaut's poses. When travelling in the "Follow Me" mode (master/slave), a robot uses onboard tools for tracking the cosmonaut's position and movements, and builds its itinerary on the basis of these data. The interaction in the "cosmonaut-robot" system on the lunar surface is significantly different from that on the Earth's surface. For example, a person dressed in a space suit has limited fine motor skills. In addition, EVA is quite tiring for the cosmonauts, and a tired person performs movements less accurately and makes mistakes more often. All this leads to new requirements for the convenient use of the man-machine interface designed for EVA. To improve the reliability and stability of human-robot communication it is necessary to provide options for duplicating commands at the task stages and for gesture recognition. New tools and techniques for space missions must be examined first in laboratory conditions, and then in field tests (proof tests at the site of application). The article analyzes the methods of detection and tracking of movements and gesture recognition of the cosmonaut during EVA, which can be used for the design of the human-machine interface. A scenario for testing these methods by constructing a virtual environment simulating EVA on the lunar surface is proposed. The simulation involves environment visualization and modeling of the use of the robot's "vision" to track a moving cosmonaut dressed in a spacesuit.

  4. A convolutional neural network neutrino event classifier

    DOE PAGES

    Aurisano, A.; Radovic, A.; Rocco, D.; ...

    2016-09-01

    Here, convolutional neural networks (CNNs) have been widely applied in the computer vision community to solve complex problems in image recognition and analysis. We describe an application of the CNN technology to the problem of identifying particle interactions in sampling calorimeters used commonly in high energy physics and high energy neutrino physics in particular. Following a discussion of the core concepts of CNNs and recent innovations in CNN architectures related to the field of deep learning, we outline a specific application to the NOvA neutrino detector. This algorithm, CVN (Convolutional Visual Network), identifies neutrino interactions based on their topology without the need for detailed reconstruction, and outperforms algorithms currently in use by the NOvA collaboration.
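
    As a rough illustration of the kind of network described (not the actual CVN architecture; the layer sizes and class count below are assumptions), a small convolutional classifier for calorimeter hit maps can be sketched in PyTorch:

        import torch
        import torch.nn as nn

        class TopologyCNN(nn.Module):
            """Small CNN that classifies calorimeter hit maps by event
            topology. Illustrative only; not the published CVN."""
            def __init__(self, n_classes=5):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
                    nn.MaxPool2d(2),
                    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
                    nn.MaxPool2d(2),
                )
                self.classifier = nn.Sequential(
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                    nn.Linear(32, n_classes),
                )
            def forward(self, x):      # x: (batch, 1, H, W) hit map
                return self.classifier(self.features(x))

        # Forward pass on a random batch of 80 x 100 hit maps.
        logits = TopologyCNN()(torch.randn(4, 1, 80, 100))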

  5. SVM based colon polyps classifier in a wireless active stereo endoscope.

    PubMed

    Ayoub, J; Granado, B; Mhanna, Y; Romain, O

    2010-01-01

    This work focuses on the recognition of three-dimensional colon polyps captured by an active stereo vision sensor. The detection algorithm consists of an SVM classifier trained on robust feature descriptors. The study is related to Cyclope, a prototype sensor that allows real-time 3D object reconstruction and continues to be optimized technically to improve its classification task: differentiating between hyperplastic and adenomatous polyps. Experimental results were encouraging and show a correct classification rate of approximately 97%. The work contains detailed statistics about the detection rate and the computing complexity. Inspired by the intensity histogram, the work presents a new approach that extracts a set of features based on a depth histogram and combines stereo measurements with SVM classifiers to correctly classify benign and malignant polyps.
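
    A minimal sketch of the classification stage, assuming a depth-histogram global feature and an RBF-kernel SVM; the bin count, depth range, and kernel choice are illustrative, and the data below are random placeholders:

        import numpy as np
        from sklearn.svm import SVC

        def depth_histogram(depth_patch, bins=32, d_range=(0.0, 50.0)):
            """Global feature inspired by the depth-histogram idea: a
            normalized histogram of a reconstructed patch's depth values."""
            hist, _ = np.histogram(depth_patch, bins=bins, range=d_range)
            return hist / max(hist.sum(), 1)

        # X: one feature vector per polyp patch; y: 0 = hyperplastic,
        # 1 = adenomatous (labels here are illustrative).
        X = np.array([depth_histogram(np.random.rand(64, 64) * 50)
                      for _ in range(20)])
        y = np.random.randint(0, 2, size=20)
        clf = SVC(kernel="rbf").fit(X, y)
        print(clf.predict(X[:3]))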

  6. A convolutional neural network neutrino event classifier

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aurisano, A.; Radovic, A.; Rocco, D.

    Here, convolutional neural networks (CNNs) have been widely applied in the computer vision community to solve complex problems in image recognition and analysis. We describe an application of the CNN technology to the problem of identifying particle interactions in sampling calorimeters used commonly in high energy physics and high energy neutrino physics in particular. Following a discussion of the core concepts of CNNs and recent innovations in CNN architectures related to the field of deep learning, we outline a specific application to the NOvA neutrino detector. This algorithm, CVN (Convolutional Visual Network), identifies neutrino interactions based on their topology without the need for detailed reconstruction, and outperforms algorithms currently in use by the NOvA collaboration.

  7. Rapid prototyping of an adaptive light-source for mobile manipulators with EasyKit and EasyLab

    NASA Astrophysics Data System (ADS)

    Wojtczyk, Martin; Barner, Simon; Geisinger, Michael; Knoll, Alois

    2008-08-01

    While still not common in day-to-day business, mobile robot platforms form a growing market in robotics. Mobile platforms equipped with a manipulator for increased flexibility have been used successfully in biotech laboratories for sample management, as shown at the well-known ESACT meetings. Navigation and object recognition are carried out by a mounted machine vision camera. To cope with the different illumination conditions in a large laboratory, development of an adaptive light source was indispensable. We present our approach to rapidly developing a computer-controlled, adaptive LED light within a single business day, utilizing the hardware toolbox EasyKit and its software counterpart EasyLab.

  8. Fully convolutional network with cluster for semantic segmentation

    NASA Astrophysics Data System (ADS)

    Ma, Xiao; Chen, Zhongbi; Zhang, Jianlin

    2018-04-01

    At present, image semantic segmentation technology is an active research topic for scientists in the field of computer vision and artificial intelligence. In particular, the extensive research on deep neural networks for image recognition has greatly promoted the development of semantic segmentation. This paper puts forward a method based on a fully convolutional network combined with the k-means clustering algorithm. The clustering algorithm uses the image's low-level features and initializes the cluster centers from a super-pixel segmentation; it is used to correct the set of points with low reliability, which are likely to be misclassified, using the set of points with high reliability in each cluster region. This method refines the segmentation of the target contour and improves the accuracy of the image segmentation.
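
    A minimal sketch of the clustering step, assuming SLIC super-pixels to initialize the k-means centers on simple color-plus-position features; the number of segments and the feature choice are assumptions:

        import numpy as np
        from skimage import data, segmentation
        from sklearn.cluster import KMeans

        img = data.astronaut() / 255.0
        h, w, _ = img.shape
        yy, xx = np.mgrid[0:h, 0:w]
        # Low-level features per pixel: RGB color plus normalized position.
        feats = np.concatenate(
            [img.reshape(-1, 3),
             (yy / h).reshape(-1, 1),
             (xx / w).reshape(-1, 1)], axis=1)

        # Initialize the k-means centers from super-pixel means.
        sp = segmentation.slic(img, n_segments=50, compactness=10)
        centers = np.array([feats[(sp == s).ravel()].mean(axis=0)
                            for s in np.unique(sp)])
        km = KMeans(n_clusters=len(centers), init=centers, n_init=1).fit(feats)
        regions = km.labels_.reshape(h, w)  # cluster regions that would be
                                            # used to correct low-reliability
                                            # FCN predictions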

  9. Computer vision in cell biology.

    PubMed

    Danuser, Gaudenz

    2011-11-23

    Computer vision refers to the theory and implementation of artificial systems that extract information from images to understand their content. Although computers are widely used by cell biologists for visualization and measurement, interpretation of image content, i.e., the selection of events worth observing and the definition of what they mean in terms of cellular mechanisms, is mostly left to human intuition. This Essay attempts to outline roles computer vision may play and should play in image-based studies of cellular life. Copyright © 2011 Elsevier Inc. All rights reserved.

  10. Automated decoding of facial expressions reveals marked differences in children when telling antisocial versus prosocial lies.

    PubMed

    Zanette, Sarah; Gao, Xiaoqing; Brunet, Megan; Bartlett, Marian Stewart; Lee, Kang

    2016-10-01

    The current study used computer vision technology to examine the nonverbal facial expressions of children (6-11 years old) telling antisocial and prosocial lies. Children in the antisocial lying group completed a temptation resistance paradigm where they were asked not to peek at a gift being wrapped for them. All children peeked at the gift and subsequently lied about their behavior. Children in the prosocial lying group were given an undesirable gift and asked if they liked it. All children lied about liking the gift. Nonverbal behavior was analyzed using the Computer Expression Recognition Toolbox (CERT), which employs the Facial Action Coding System (FACS), to automatically code children's facial expressions while lying. Using CERT, children's facial expressions during antisocial and prosocial lying were differentiated reliably, at significantly above chance-level accuracy. The basic expressions of emotion that distinguished antisocial lies from prosocial lies were joy and contempt. Children expressed joy more in prosocial lying than in antisocial lying. Girls showed more joy and less contempt compared with boys when they told prosocial lies. Boys showed more contempt when they told prosocial lies than when they told antisocial lies. The key action units (AUs) that differentiate children's antisocial and prosocial lies are blink/eye closure, lip pucker, and lip raise on the right side. Together, these findings indicate that children's facial expressions differ while telling antisocial versus prosocial lies. The reliability of CERT in detecting such differences in facial expression suggests the viability of using computer vision technology in deception research. Copyright © 2016 Elsevier Inc. All rights reserved.

  11. The impact on midlevel vision of statistically optimal divisive normalization in V1.

    PubMed

    Coen-Cagli, Ruben; Schwartz, Odelia

    2013-07-15

    The first two areas of the primate visual cortex (V1, V2) provide a paradigmatic example of hierarchical computation in the brain. However, neither the functional properties of V2 nor the interactions between the two areas are well understood. One key aspect is that the statistics of the inputs received by V2 depend on the nonlinear response properties of V1. Here, we focused on divisive normalization, a canonical nonlinear computation that is observed in many neural areas and modalities. We simulated V1 responses with (and without) different forms of surround normalization derived from statistical models of natural scenes, including canonical normalization and a statistically optimal extension that accounted for image nonhomogeneities. The statistics of the V1 population responses differed markedly across models. We then addressed how V2 receptive fields pool the responses of V1 model units with different tuning. We assumed this is achieved by learning without supervision a linear representation that removes correlations, which could be accomplished with principal component analysis. This approach revealed V2-like feature selectivity when we used the optimal normalization and, to a lesser extent, the canonical one but not in the absence of both. We compared the resulting two-stage models on two perceptual tasks; while models encompassing V1 surround normalization performed better at object recognition, only statistically optimal normalization provided systematic advantages in a task more closely matched to midlevel vision, namely figure/ground judgment. Our results suggest that experiments probing midlevel areas might benefit from using stimuli designed to engage the computations that characterize V1 optimality.
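
    The canonical form of divisive normalization the authors build on can be sketched as follows; the exponent, semisaturation constant, and uniform surround pooling are illustrative, and the statistically optimal variant additionally gates the surround pool by the inferred homogeneity of center and surround:

        import numpy as np

        def divisive_normalization(drive, surround, sigma=0.1, n=2.0):
            """Canonical normalization: each unit's response is its driving
            input raised to a power, divided by a pooled surround signal:

                r_i = drive_i**n / (sigma**n + mean_j surround_ij**n)

            drive: (units,) feedforward responses of V1 model units.
            surround: (units, neighbors) responses pooled over each
            unit's surround. sigma and n are illustrative parameters.
            """
            pool = (surround ** n).mean(axis=1)
            return drive ** n / (sigma ** n + pool)

        # The statistically optimal variant would multiply `pool` by the
        # inferred probability that center and surround are statistically
        # homogeneous, leaving responses unnormalized where they are not.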

  12. Rapid quantification of color vision: the cone contrast test.

    PubMed

    Rabin, Jeff; Gooch, John; Ivan, Douglas

    2011-02-09

    To describe the design, specificity, and sensitivity of the cone contrast test (CCT), a computer-based, cone-specific (L, M, S) contrast sensitivity test for diagnosing the type and severity of color vision deficiency (CVD). The CCT presents a randomized series of colored letters visible only to L, M, or S cones in decreasing steps of cone contrast to determine L, M, and S letter-recognition thresholds. Sensitivity and specificity were determined by retrospective comparison of CCT scores to anomaloscope and pseudoisochromatic plate (PIP) results in 1446 applicants for pilot training. CVD was detected in 49 (3.4%) of 1446 applicants, with hereditary red-green (protan or deutan) CVD detected in 47 (3.5%) of 1359 men and blue-yellow (tritan) in 2 of 1446. In agreement with the anomaloscope, the CCT showed 100% sensitivity for detection and categorization of CVD (40 deutan, 7 protan, 2 tritan). PIP testing showed lower sensitivity (80% detected; 20% missed), due in part to the applicants' prior experience and/or pretest preparation. CCT specificity for confirming normal color vision was 100% for L and M cone tests and 99.8% for S cones. The CCT has sensitivity and specificity comparable to anomaloscope testing and exceeds PIP sensitivity in practiced observers. The CCT provides a rapid (6-minute), clinically expedient measure of color vision for quantifying normal color performance, diagnosing the type and severity of hereditary deficiency, and detecting acquired sensitivity loss due to ocular, neurologic, and/or systemic disease, as well as injury and physiological stressors such as altitude and fatigue.

  13. Computer Vision Syndrome.

    PubMed

    Randolph, Susan A

    2017-07-01

    With the increased use of electronic devices with visual displays, computer vision syndrome is becoming a major public health issue. Improving the visual status of workers using computers results in greater productivity in the workplace and improved visual comfort.

  14. Spatiotemporal dynamics underlying object completion in human ventral visual cortex.

    PubMed

    Tang, Hanlin; Buia, Calin; Madhavan, Radhika; Crone, Nathan E; Madsen, Joseph R; Anderson, William S; Kreiman, Gabriel

    2014-08-06

    Natural vision often involves recognizing objects from partial information. Recognition of objects from parts presents a significant challenge for theories of vision because it requires spatial integration and extrapolation from prior knowledge. Here we recorded intracranial field potentials of 113 visually selective electrodes from epilepsy patients in response to whole and partial objects. Responses along the ventral visual stream, particularly the inferior occipital and fusiform gyri, remained selective despite showing only 9%-25% of the object areas. However, these visually selective signals emerged ∼100 ms later for partial versus whole objects. These processing delays were particularly pronounced in higher visual areas within the ventral stream. This latency difference persisted when controlling for changes in contrast, signal amplitude, and the strength of selectivity. These results argue against a purely feedforward explanation of recognition from partial information, and provide spatiotemporal constraints on theories of object recognition that involve recurrent processing. Copyright © 2014 Elsevier Inc. All rights reserved.

  15. Low-cost real-time automatic wheel classification system

    NASA Astrophysics Data System (ADS)

    Shabestari, Behrouz N.; Miller, John W. V.; Wedding, Victoria

    1992-11-01

    This paper describes the design and implementation of a low-cost machine vision system for identifying various types of automotive wheels which are manufactured in several styles and sizes. In this application, a variety of wheels travel on a conveyor in random order through a number of processing steps. One of these processes requires the identification of the wheel type, which was previously performed manually by an operator. A vision system was designed to provide the required identification. The system consisted of an annular illumination source, a CCD TV camera, a frame grabber, and a 386-compatible computer. Statistical pattern recognition techniques were used to provide robust classification as well as a simple means for adding new wheel designs to the system. Maintenance of the system can be performed by plant personnel with minimal training. The basic steps for identification include image acquisition, segmentation of the regions of interest, extraction of selected features, and classification. The vision system has been installed in a plant and has proven to be extremely effective. The system correctly identifies wheels at rates up to 30 wheels per minute, regardless of rotational orientation in the camera's field of view. Correct classification can even be achieved if a portion of the wheel is blocked off from the camera. Significant cost savings have been achieved by a reduction in scrap associated with incorrect manual classification as well as a reduction of labor in a tedious task.

  16. Measuring exertion time, duty cycle and hand activity level for industrial tasks using computer vision.

    PubMed

    Akkas, Oguz; Lee, Cheng Hsien; Hu, Yu Hen; Harris Adamson, Carisa; Rempel, David; Radwin, Robert G

    2017-12-01

    Two computer vision algorithms were developed to automatically estimate exertion time, duty cycle (DC) and hand activity level (HAL) from videos of workers performing 50 industrial tasks. The average DC difference between manual frame-by-frame analysis and the computer vision DC was -5.8% for the Decision Tree (DT) algorithm, and 1.4% for the Feature Vector Training (FVT) algorithm. The average HAL difference was 0.5 for the DT algorithm and 0.3 for the FVT algorithm. A sensitivity analysis, conducted to examine the influence that deviations in DC have on HAL, found that HAL remained unaffected when the DC error was less than 5%; a DC error below 10% shifts HAL by less than 0.5, which is negligible. Automatic computer vision HAL estimates were therefore comparable to manual frame-by-frame estimates. Practitioner Summary: Computer vision was used to automatically estimate exertion time, duty cycle and hand activity level from videos of workers performing industrial tasks.
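
    For concreteness, here is a minimal sketch (not from the paper) of how exertion time and duty cycle follow from a per-frame binary exertion signal; the further mapping from these quantities to a HAL rating is not shown:

        import numpy as np

        def exertion_time_and_duty_cycle(exerting, fps):
            """Compute total exertion time and duty cycle from a per-frame
            binary exertion signal (1 = hand exerting force, 0 = rest), as
            produced by frame-by-frame video analysis.

            Returns (exertion_seconds, duty_cycle_percent).
            """
            exerting = np.asarray(exerting, dtype=bool)
            exertion_s = exerting.sum() / fps
            total_s = len(exerting) / fps
            return exertion_s, 100.0 * exertion_s / total_s

        # Example: a 30 fps clip with roughly 40% of frames flagged.
        sig = (np.random.rand(900) < 0.4).astype(int)
        print(exertion_time_and_duty_cycle(sig, fps=30))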

  17. Reconfigurable vision system for real-time applications

    NASA Astrophysics Data System (ADS)

    Torres-Huitzil, Cesar; Arias-Estrada, Miguel

    2002-03-01

    Recently, a growing community of researchers has used reconfigurable systems to solve computationally intensive problems. Reconfigurability provides optimized processors for system-on-chip designs, and makes it easy to import technology into a new system through reusable modules. The main objective of this work is the investigation of a reconfigurable computer system targeted at computer vision and real-time applications. The system is intended to circumvent the inherent computational load of most window-based computer vision algorithms. It aims to support such tasks by providing an FPGA-based hardware architecture for task-specific vision applications with enough processing power, using as few hardware resources as possible, and a mechanism for building systems with this architecture. Regarding the software part of the system, a library of pre-designed and general-purpose modules that implement common window-based computer vision operations is being investigated. A common generic interface is established for these modules in order to define hardware/software components. These components can be interconnected to develop more complex applications, providing an efficient mechanism for transferring image and result data among modules. Some preliminary results are presented and discussed.

  18. Scene and human face recognition in the central vision of patients with glaucoma

    PubMed Central

    Aptel, Florent; Attye, Arnaud; Guyader, Nathalie; Boucart, Muriel; Chiquet, Christophe; Peyrin, Carole

    2018-01-01

    Primary open-angle glaucoma (POAG) mainly affects peripheral vision at first. Current behavioral studies support the idea that visual defects of patients with POAG extend into parts of the central visual field classified as normal by static automated perimetry analysis. This is particularly true for visual tasks involving processes of a higher level than mere detection. The purpose of this study was to assess the visual abilities of POAG patients in central vision. Patients were assigned to two groups following a visual field examination (Humphrey 24–2 SITA-Standard test). Patients with both peripheral and central defects and patients with peripheral but no central defect, as well as age-matched controls, participated in the experiment. All participants had to perform two visual tasks where low-contrast stimuli were presented in the central 6° of the visual field. A categorization task of scene images and human face images assessed high-level visual recognition abilities. In contrast, a detection task using the same stimuli assessed low-level visual function. The difference in performance between detection and categorization revealed the cost of high-level visual processing. Compared to controls, patients with a central visual defect showed a deficit in both detection and categorization of all low-contrast images. This is consistent with the abnormal retinal sensitivity as assessed by perimetry. However, the deficit was greater for categorization than detection. Patients without a central defect showed performance similar to that of the controls in the detection and categorization of faces. However, while the detection of scene images was well maintained, these patients showed a deficit in their categorization. This suggests that the simple loss of peripheral vision could be detrimental to scene recognition, even when the information is displayed in central vision. This study revealed subtle defects in the central visual field of POAG patients that cannot be predicted by static automated perimetry assessment using the Humphrey 24–2 SITA-Standard test. PMID:29481572

  19. Feasibility Study of a Vision-Based Landing System for Unmanned Fixed-Wing Aircraft

    DTIC Science & Technology

    2017-06-01

    This thesis examines the feasibility of applying computer vision techniques and visual feedback in the control loop for an autonomous system, and their integration into an autonomous aircraft control system. Subject terms: autonomous systems, auto-land, computer vision, image processing.

  20. Surpassing Humans and Computers with JellyBean: Crowd-Vision-Hybrid Counting Algorithms.

    PubMed

    Sarma, Akash Das; Jain, Ayush; Nandi, Arnab; Parameswaran, Aditya; Widom, Jennifer

    2015-11-01

    Counting objects is a fundamental image processing primitive, and has many scientific, health, surveillance, security, and military applications. Existing supervised computer vision techniques typically require large quantities of labeled training data, and even with that, fail to return accurate results in all but the most stylized settings. Using vanilla crowd-sourcing, on the other hand, can lead to significant errors, especially on images with many objects. In this paper, we present our JellyBean suite of algorithms, which combines the best of crowds and computer vision to count objects in images, and uses judicious decomposition of images to greatly improve accuracy at low cost. Our algorithms have several desirable properties: (i) they are theoretically optimal or near-optimal, in that they ask as few questions as possible of humans (under certain intuitively reasonable assumptions that we justify in our paper experimentally); (ii) they operate in stand-alone or hybrid modes, in that they can either work independently of computer vision algorithms or work in concert with them, depending on whether the computer vision techniques are available or useful for the given setting; (iii) they perform very well in practice, returning accurate counts on images that no individual worker or computer vision algorithm can count correctly, while not incurring a high cost.
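
    The decomposition idea can be sketched as follows, with count_segment standing in for a crowd worker and/or a vision algorithm; the recursion threshold is an assumption, and the sketch ignores objects straddling segment boundaries, which the full algorithms must handle:

        import numpy as np

        def count_by_decomposition(image, count_segment, max_side=200):
            """Recursively split an image until each piece is small enough
            to count reliably, then sum the per-piece counts.

            count_segment: callable returning a count for a small image
            (a crowd worker and/or vision algorithm in the paper). The
            function name and size threshold here are illustrative.
            """
            h, w = image.shape[:2]
            if max(h, w) <= max_side:
                return count_segment(image)
            if h >= w:  # split along the longer axis
                return (count_by_decomposition(image[: h // 2], count_segment, max_side)
                        + count_by_decomposition(image[h // 2:], count_segment, max_side))
            return (count_by_decomposition(image[:, : w // 2], count_segment, max_side)
                    + count_by_decomposition(image[:, w // 2:], count_segment, max_side))

        # Trivial usage with a placeholder counter:
        total = count_by_decomposition(np.zeros((400, 600)), lambda seg: 0)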

  1. Biological Basis For Computer Vision: Some Perspectives

    NASA Astrophysics Data System (ADS)

    Gupta, Madan M.

    1990-03-01

    Using biology as a basis for the development of sensors, devices and computer vision systems is a challenge to systems and vision scientists. It is also a field of promising research for engineering applications. Biological sensory systems, such as vision, touch and hearing, sense different physical phenomena from our environment, yet they possess some common mathematical functions. These mathematical functions are cast into the neural layers which are distributed throughout our sensory regions, sensory information transmission channels and in the cortex, the centre of perception. In this paper, we are concerned with the study of the biological vision system and the emulation of some of its mathematical functions, both retinal and visual cortex, for the development of a robust computer vision system. This field of research is not only intriguing, but offers a great challenge to systems scientists in the development of functional algorithms. These functional algorithms can be generalized for further studies in such fields as signal processing, control systems and image processing. Our studies are heavily dependent on the use of fuzzy-neural layers and generalized receptive fields. Building blocks of such neural layers and receptive fields may lead to the design of better sensors and better computer vision systems. It is hoped that these studies will lead to the development of better artificial vision systems with various applications to vision prosthesis for the blind, robotic vision, medical imaging, medical sensors, industrial automation, remote sensing, space stations and ocean exploration.

  2. THE EFFECT OF WORD ASSOCIATIONS ON THE RECOGNITION OF FLASHED WORDS.

    ERIC Educational Resources Information Center

    SAMUELS, S. JAY

    The hypothesis that when associated pairs of words are presented, speed of recognition will be faster than when nonassociated word pairs are presented or when a target word is presented by itself was tested. Twenty university students, initially screened for vision, were assigned randomly to rows of a 5 x 5 repeated-measures Latin square design.…

  3. A Vision-Based Counting and Recognition System for Flying Insects in Intelligent Agriculture.

    PubMed

    Zhong, Yuanhong; Gao, Junyuan; Lei, Qilun; Zhou, Yao

    2018-05-09

    Rapid and accurate counting and recognition of flying insects are of great importance, especially for pest control. Traditional manual identification and counting of flying insects is labor intensive and inefficient. In this study, a vision-based counting and classification system for flying insects is designed and implemented. The system is constructed as follows: firstly, a yellow sticky trap is installed in the surveillance area to trap flying insects and a camera is set up to collect real-time images. Then the detection and coarse counting method based on You Only Look Once (YOLO) object detection, the classification method and fine counting based on Support Vector Machines (SVM) using global features are designed. Finally, the insect counting and recognition system is implemented on Raspberry PI. Six species of flying insects including bee, fly, mosquito, moth, chafer and fruit fly are selected to assess the effectiveness of the system. Compared with the conventional methods, the test results show promising performance. The average counting accuracy is 92.50% and average classifying accuracy is 90.18% on Raspberry PI. The proposed system is easy-to-use and provides efficient and accurate recognition data, therefore, it can be used for intelligent agriculture applications.

  4. Research and Development of Target Recognition and Location Crawling Platform based on Binocular Vision

    NASA Astrophysics Data System (ADS)

    Xu, Weidong; Lei, Zhu; Yuan, Zhang; Gao, Zhenqing

    2018-03-01

    The application of visual recognition technology to industrial robot grasping and placing operations is one of the key tasks in the field of robot research. In order to improve the efficiency and intelligence of material sorting on the production line, and especially to realize the sorting of scattered items, a robot target recognition and positioning grasping platform based on binocular vision is researched and developed. Images are collected by a binocular camera and preprocessed. The Harris operator is used to detect corners in the images, the Canny operator to extract edges, and Hough/chain-code recognition to identify the target in the image. The coordinates of each vertex of the target are obtained, the spatial position and pose of the target item are calculated, and the information needed for the grasping movement is determined and transmitted to the robot to control the grasping operation. Finally, the method is applied to the parcel problem in the express sorting process. The experimental results show that the platform can effectively solve the problem of sorting scattered parts, achieving the goal of efficient and intelligent sorting.
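
    A minimal OpenCV sketch of the corner and edge stage described above; the file name, thresholds, and operator parameters are illustrative:

        import cv2

        img = cv2.imread("parcel_left.png")   # illustrative file name
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

        # Corner detection with the Harris operator.
        corners = cv2.cornerHarris(gray.astype("float32"), blockSize=2,
                                   ksize=3, k=0.04)
        corner_mask = corners > 0.01 * corners.max()

        # Edge extraction with the Canny operator.
        edges = cv2.Canny(gray, threshold1=50, threshold2=150)

        # Contours (chain codes) of the edge map, from which target
        # vertices and, via the stereo pair, the item's pose follow.
        contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)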

  5. A Vision-Based Counting and Recognition System for Flying Insects in Intelligent Agriculture

    PubMed Central

    Zhong, Yuanhong; Gao, Junyuan; Lei, Qilun; Zhou, Yao

    2018-01-01

    Rapid and accurate counting and recognition of flying insects are of great importance, especially for pest control. Traditional manual identification and counting of flying insects is labor intensive and inefficient. In this study, a vision-based counting and classification system for flying insects is designed and implemented. The system is constructed as follows: firstly, a yellow sticky trap is installed in the surveillance area to trap flying insects and a camera is set up to collect real-time images. Then the detection and coarse counting method based on You Only Look Once (YOLO) object detection, the classification method and fine counting based on Support Vector Machines (SVM) using global features are designed. Finally, the insect counting and recognition system is implemented on Raspberry PI. Six species of flying insects including bee, fly, mosquito, moth, chafer and fruit fly are selected to assess the effectiveness of the system. Compared with the conventional methods, the test results show promising performance. The average counting accuracy is 92.50% and average classifying accuracy is 90.18% on Raspberry PI. The proposed system is easy-to-use and provides efficient and accurate recognition data, therefore, it can be used for intelligent agriculture applications. PMID:29747429

  6. Dynamic Vision for Control

    DTIC Science & Technology

    2006-07-27

    The goal of this project was to develop analytical and computational tools to make vision a viable sensor for control. We have proposed the framework of stereoscopic segmentation, where multiple images of the same objects are jointly processed to extract geometry.

  7. Evolution of attention mechanisms for early visual processing

    NASA Astrophysics Data System (ADS)

    Müller, Thomas; Knoll, Alois

    2011-03-01

    Early visual processing as a method to speed up computations on visual input data has long been discussed in the computer vision community. The general aim of such approaches is to filter non-relevant information out before it reaches the costly higher-level visual processing algorithms. By inserting this additional filter layer, the overall approach can be sped up without actually changing the visual processing methodology. Inspired by the layered architecture of the human visual processing apparatus, several approaches for early visual processing have recently been proposed. Most promising in this field is the extraction of a saliency map to determine regions of current attention in the visual field. Such saliency can be computed in a bottom-up manner, i.e. the theory claims that static regions of attention emerge from a certain color footprint, and dynamic regions of attention emerge from connected blobs of texture moving in a uniform way in the visual field. Top-down saliency effects are either unconscious, through inherent mechanisms like inhibition-of-return (i.e. within a period of time the attention level paid to a certain region automatically decreases if the properties of that region do not change), or volitional, through cognitive feedback, e.g. if an object moves consistently in the visual field. These bottom-up and top-down saliency effects were implemented and evaluated in a previous computer vision system for the project JAST. In this paper an extension applying evolutionary processes is proposed. The prior vision system utilized multiple threads to analyze the regions of attention delivered by the early processing mechanism. Here, in addition, multiple saliency units are used to produce these regions of attention. All of these saliency units have different parameter sets. The idea is to let the population of saliency units create regions of attention, then evaluate the results with cognitive feedback and finally apply the genetic mechanism: mutation and cloning of the best performers and extinction of the worst performers with respect to the computation of regions of attention. A fitness function can be derived by evaluating whether relevant objects are found in the regions created. It can be seen from various experiments that the approach significantly speeds up visual processing, especially regarding robust realtime object recognition, compared to an approach not using saliency-based preprocessing. Furthermore, the evolutionary algorithm improves the overall performance of the preprocessing system in terms of quality, as the system automatically and autonomously tunes the saliency parameters. The computational overhead produced by periodical clone/delete/mutation operations can be handled well within the realtime constraints of the experimental computer vision system. Nevertheless, limitations apply whenever the visual field does not contain any significant saliency information for some time but the population still tries to tune the parameters; overfitting prevents generalization in this case, and the evolutionary process may be reset by manual intervention.
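
    A minimal sketch of one plausible reading of the scheme: saliency units parameterized by center and surround scales, with one generation of clone-and-mutate selection driven by an externally supplied fitness function. All parameter choices and names here are assumptions, not the paper's implementation.

        import numpy as np
        from scipy.ndimage import gaussian_filter

        def saliency(img_gray, sigma_c, sigma_s):
            """Bottom-up saliency as a center-surround difference of two
            Gaussian-blurred copies of the image (parameters evolved)."""
            center = gaussian_filter(img_gray, sigma_c)
            surround = gaussian_filter(img_gray, sigma_s)
            return np.abs(center - surround)

        def evolve(units, fitness, keep=0.5, jitter=0.2):
            """One generation: rank saliency units by cognitive feedback,
            clone-and-mutate the best, drop the worst.

            units: list of (sigma_c, sigma_s) parameter pairs.
            fitness: maps a parameter pair to a score, e.g. how many
            relevant objects were found in its regions of attention.
            """
            ranked = sorted(units, key=fitness, reverse=True)
            survivors = ranked[: max(1, int(len(units) * keep))]
            children = [(sc * np.exp(jitter * np.random.randn()),
                         ss * np.exp(jitter * np.random.randn()))
                        for sc, ss in survivors]
            return survivors + children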

  8. Recurrent Convolutional Neural Networks: A Better Model of Biological Object Recognition.

    PubMed

    Spoerer, Courtney J; McClure, Patrick; Kriegeskorte, Nikolaus

    2017-01-01

    Feedforward neural networks provide the dominant model of how the brain performs visual object recognition. However, these networks lack the lateral and feedback connections, and the resulting recurrent neuronal dynamics, of the ventral visual pathway in the human and non-human primate brain. Here we investigate recurrent convolutional neural networks with bottom-up (B), lateral (L), and top-down (T) connections. Combining these types of connections yields four architectures (B, BT, BL, and BLT), which we systematically test and compare. We hypothesized that recurrent dynamics might improve recognition performance in the challenging scenario of partial occlusion. We introduce two novel occluded object recognition tasks to test the efficacy of the models, digit clutter (where multiple target digits occlude one another) and digit debris (where target digits are occluded by digit fragments). We find that recurrent neural networks outperform feedforward control models (approximately matched in parametric complexity) at recognizing objects, both in the absence of occlusion and in all occlusion conditions. Recurrent networks were also found to be more robust to the inclusion of additive Gaussian noise. Recurrent neural networks are better in two respects: (1) they are more neurobiologically realistic than their feedforward counterparts; (2) they are better in terms of their ability to recognize objects, especially under challenging conditions. This work shows that computer vision can benefit from using recurrent convolutional architectures and suggests that the ubiquitous recurrent connections in biological brains are essential for task performance.

  9. A Similarity Analysis of Audio Signal to Develop a Human Activity Recognition Using Similarity Networks.

    PubMed

    García-Hernández, Alejandra; Galván-Tejada, Carlos E; Galván-Tejada, Jorge I; Celaya-Padilla, José M; Gamboa-Rosales, Hamurabi; Velasco-Elizondo, Perla; Cárdenas-Vargas, Rogelio

    2017-11-21

    Human Activity Recognition (HAR) is one of the main subjects of study in the areas of computer vision and machine learning due to the great benefits that can be achieved. Examples of the study areas are: health prevention, security and surveillance, automotive research, and many others. The proposed approaches are carried out using machine learning techniques and present good results. However, it is difficult to observe how the descriptors of human activities are grouped. In order to obtain a better understanding of the behavior of descriptors, it is important to improve the ability to recognize human activities. This paper proposes a novel approach to HAR based on acoustic data and similarity networks. In this approach, we were able to characterize the sound of the activities and identify those activities by looking for similarity in the sound pattern. We evaluated the similarity of the sounds considering mainly two features: the sound location and the materials that were used. As a result, the materials are a better reference for classifying the human activities than the location.

  10. Grading Multiple Choice Exams with Low-Cost and Portable Computer-Vision Techniques

    NASA Astrophysics Data System (ADS)

    Fisteus, Jesus Arias; Pardo, Abelardo; García, Norberto Fernández

    2013-08-01

    Although technology for automatic grading of multiple choice exams has existed for several decades, it is not yet as widely available or affordable as it should be. The main reasons preventing this adoption are the cost and the complexity of the setup procedures. In this paper, Eyegrade, a system for automatic grading of multiple choice exams, is presented. While most current solutions are based on expensive scanners, Eyegrade offers a truly low-cost solution requiring only a regular off-the-shelf webcam. Additionally, Eyegrade performs both mark recognition and optical character recognition of handwritten student identification numbers, which avoids the use of bubbles in the answer sheet. When compared with similar webcam-based systems, the user interface in Eyegrade has been designed to provide a more efficient and error-free data collection procedure. The tool has been validated with a set of experiments that show the ease of use (both setup and operation), the reduction in grading time, and an increase in the reliability of the results when compared with conventional, more expensive systems.

  11. Gestonurse: a robotic surgical nurse for handling surgical instruments in the operating room.

    PubMed

    Jacob, Mithun; Li, Yu-Ting; Akingba, George; Wachs, Juan P

    2012-03-01

    While surgeon-scrub nurse collaboration provides a fast, straightforward and inexpensive method of delivering surgical instruments to the surgeon, it often results in "mistakes" (e.g. missing information, ambiguity of instructions and delays). It has been shown that these errors can have a negative impact on the outcome of the surgery. These errors could potentially be reduced or eliminated by introducing robotics into the operating room. Gesture control is a natural and fundamentally sound alternative that allows interaction without disturbing the normal flow of surgery. This paper describes the development of a robotic scrub nurse Gestonurse to support surgeons by passing surgical instruments during surgery as required. The robot responds to recognized hand signals detected through sophisticated computer vision and pattern recognition techniques. Experimental results show that 95% of the gestures were recognized correctly. The gesture recognition algorithm presented is robust to changes in scale and rotation of the hand gestures. The system was compared to human task performance and was found to be only 0.83 s slower on average.

  12. Detailed 3D representations for object recognition and modeling.

    PubMed

    Zia, M Zeeshan; Stark, Michael; Schiele, Bernt; Schindler, Konrad

    2013-11-01

    Geometric 3D reasoning at the level of objects has received renewed attention recently in the context of visual scene understanding. The level of geometric detail, however, is typically limited to qualitative representations or coarse boxes. This is linked to the fact that today's object class detectors are tuned toward robust 2D matching rather than accurate 3D geometry, encouraged by bounding-box-based benchmarks such as Pascal VOC. In this paper, we revisit ideas from the early days of computer vision, namely, detailed, 3D geometric object class representations for recognition. These representations can recover geometrically far more accurate object hypotheses than just bounding boxes, including continuous estimates of object pose and 3D wireframes with relative 3D positions of object parts. In combination with robust techniques for shape description and inference, we outperform state-of-the-art results in monocular 3D pose estimation. In a series of experiments, we analyze our approach in detail and demonstrate novel applications enabled by such an object class representation, such as fine-grained categorization of cars and bicycles, according to their 3D geometry, and ultrawide baseline matching.

  13. A Similarity Analysis of Audio Signal to Develop a Human Activity Recognition Using Similarity Networks

    PubMed Central

    García-Hernández, Alejandra; Galván-Tejada, Jorge I.; Celaya-Padilla, José M.; Velasco-Elizondo, Perla; Cárdenas-Vargas, Rogelio

    2017-01-01

    Human Activity Recognition (HAR) is one of the main subjects of study in the areas of computer vision and machine learning due to the great benefits that can be achieved. Examples of the study areas are: health prevention, security and surveillance, automotive research, and many others. The proposed approaches are carried out using machine learning techniques and present good results. However, it is difficult to observe how the descriptors of human activities are grouped. In order to obtain a better understanding of the behavior of descriptors, it is important to improve the ability to recognize human activities. This paper proposes a novel approach to HAR based on acoustic data and similarity networks. In this approach, we were able to characterize the sound of the activities and identify those activities by looking for similarity in the sound pattern. We evaluated the similarity of the sounds considering mainly two features: the sound location and the materials that were used. As a result, the materials are a better reference for classifying the human activities than the location. PMID:29160799

  14. Vision Systems with the Human in the Loop

    NASA Astrophysics Data System (ADS)

    Bauckhage, Christian; Hanheide, Marc; Wrede, Sebastian; Käster, Thomas; Pfeiffer, Michael; Sagerer, Gerhard

    2005-12-01

    The emerging cognitive vision paradigm deals with vision systems that apply machine learning and automatic reasoning in order to learn from what they perceive. Cognitive vision systems can rate the relevance and consistency of newly acquired knowledge, they can adapt to their environment and thus will exhibit high robustness. This contribution presents vision systems that aim at flexibility and robustness. One is tailored for content-based image retrieval, the others are cognitive vision systems that constitute prototypes of visual active memories which evaluate, gather, and integrate contextual knowledge for visual analysis. All three systems are designed to interact with human users. After discussing adaptive content-based image retrieval and object and action recognition in an office environment, we raise the issue of assessing cognitive systems. Experiences from psychologically evaluated human-machine interactions are reported, and the promising potential of psychologically-based usability experiments is stressed.

  15. Artificial neural networks using complex numbers and phase encoded weights.

    PubMed

    Michel, Howard E; Awwal, Abdul Ahad S

    2010-04-01

    The model of a simple perceptron using phase-encoded inputs and complex-valued weights is proposed. The aggregation function, activation function, and learning rule for the proposed neuron are derived and applied to Boolean logic functions and simple computer vision tasks. The complex-valued neuron (CVN) is shown to be superior to traditional perceptrons. An improvement of 135% over the theoretical maximum of 104 linearly separable problems (of three variables) solvable by conventional perceptrons is achieved without additional logic, neuron stages, or higher order terms such as those required in polynomial logic gates. The application of CVN in distortion invariant character recognition and image segmentation is demonstrated. Implementation details are discussed, and the CVN is shown to be very attractive for optical implementation since optical computations are naturally complex. The cost of the CVN is less in all cases than the traditional neuron when implemented optically. Therefore, all the benefits of the CVN can be obtained without additional cost. However, on those implementations dependent on standard serial computers, CVN will be more cost effective only in those applications where its increased power can offset the requirement for additional neurons.
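
    A toy sketch in the spirit of the proposal: phase-encoded binary inputs, complex weights, and a phase-sector activation. These specific choices are plausible stand-ins, not the aggregation, activation, and learning rule derived in the paper. With hand-chosen weights, a single such neuron realizes XOR, which no conventional perceptron can:

        import numpy as np

        def phase_encode(x):
            # Map binary inputs {0, 1} to unit complex numbers {1, -1}
            # (phases 0 and pi); the encoding range is illustrative.
            return np.exp(1j * np.pi * np.asarray(x, dtype=float))

        def cvn_output(w, x):
            """Complex-valued neuron: aggregate phase-encoded inputs with
            complex weights, then decide by the phase sector of the sum
            (folding opposite phases together)."""
            z = np.dot(w, phase_encode(x))
            angle = np.angle(z) % np.pi
            return int(angle > np.pi / 2)

        # XOR, which is not linearly separable for a real perceptron:
        w = np.array([1.0, 1.0j])
        for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
            print(x, cvn_output(w, x))   # prints 0, 1, 1, 0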

  16. Using an Augmented Reality Device as a Distance-based Vision Aid-Promise and Limitations.

    PubMed

    Kinateder, Max; Gualtieri, Justin; Dunn, Matt J; Jarosz, Wojciech; Yang, Xing-Dong; Cooper, Emily A

    2018-06-06

    For people with limited vision, wearable displays hold the potential to digitally enhance visual function. As these display technologies advance, it is important to understand their promise and limitations as vision aids. The aim of this study was to test the potential of a consumer augmented reality (AR) device for improving the functional vision of people with near-complete vision loss. An AR application that translates spatial information into high-contrast visual patterns was developed. Two experiments assessed the efficacy of the application to improve vision: an exploratory study with four visually impaired participants and a main controlled study with participants with simulated vision loss (n = 48). In both studies, performance was tested on a range of visual tasks (identifying the location, pose and gesture of a person, identifying objects, and moving around in an unfamiliar space). Participants' accuracy and confidence were compared on these tasks with and without augmented vision, as well as their subjective responses about ease of mobility. In the main study, the AR application was associated with substantially improved accuracy and confidence in object recognition (all P < .001) and to a lesser degree in gesture recognition (P < .05). There was no significant change in performance on identifying body poses or in subjective assessments of mobility, as compared with a control group. Consumer AR devices may soon be able to support applications that improve the functional vision of users for some tasks. In our study, both artificially impaired participants and participants with near-complete vision loss performed tasks that they could not do without the AR system. Current limitations in system performance and form factor, as well as the risk of overconfidence, will need to be overcome.

  17. Computer vision camera with embedded FPGA processing

    NASA Astrophysics Data System (ADS)

    Lecerf, Antoine; Ouellet, Denis; Arias-Estrada, Miguel

    2000-03-01

    Traditional computer vision is based on a camera-computer system in which the image understanding algorithms are embedded in the computer. To circumvent the computational load of vision algorithms, low-level processing and imaging hardware can be integrated in a single compact module where a dedicated architecture is implemented. This paper presents a Computer Vision Camera based on an open architecture implemented in an FPGA. The system is targeted to real-time computer vision tasks where low level processing and feature extraction tasks can be implemented in the FPGA device. The camera integrates a CMOS image sensor, an FPGA device, two memory banks, and an embedded PC for communication and control tasks. The FPGA device is a medium size one equivalent to 25,000 logic gates. The device is connected to two high speed memory banks, an IS interface, and an imager interface. The camera can be accessed for architecture programming, data transfer, and control through an Ethernet link from a remote computer. A hardware architecture can be defined in a Hardware Description Language (like VHDL), simulated and synthesized into digital structures that can be programmed into the FPGA and tested on the camera. The architecture of a classical multi-scale edge detection algorithm based on a Laplacian of Gaussian convolution has been developed to show the capabilities of the system.
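
    The camera's demonstration algorithm, multi-scale Laplacian-of-Gaussian edge detection, can be sketched in software as follows; the scales and the zero-crossing test are illustrative, and on the camera this convolution is what the FPGA architecture implements in hardware:

        import numpy as np
        from scipy.ndimage import gaussian_laplace

        def multiscale_log_edges(img_gray, sigmas=(1.0, 2.0, 4.0)):
            """Multi-scale edge detection via Laplacian-of-Gaussian
            filtering: edges are the zero crossings of the LoG response
            at each scale. Returns one boolean edge map per scale."""
            edges = []
            for s in sigmas:
                log = gaussian_laplace(img_gray.astype(float), sigma=s)
                sign = log > 0
                # A zero crossing occurs where the sign flips between
                # horizontally or vertically adjacent pixels.
                zc = ((sign[:-1, :-1] != sign[1:, :-1])
                      | (sign[:-1, :-1] != sign[:-1, 1:]))
                edges.append(zc)
            return edges

        maps = multiscale_log_edges(np.random.rand(64, 64))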

  18. Research on three-dimensional reconstruction method based on binocular vision

    NASA Astrophysics Data System (ADS)

    Li, Jinlin; Wang, Zhihui; Wang, Minjun

    2018-03-01

    As a hot and difficult issue in computer vision, binocular stereo vision is an important form of computer vision, with broad application prospects in many fields such as aerial mapping, vision navigation, motion analysis and industrial inspection. In this paper, research is carried out into binocular stereo camera calibration, image feature extraction and stereo matching. In the binocular stereo camera calibration module, the intrinsic parameters of a single camera are obtained using Zhang Zhengyou's checkerboard method. For image feature extraction and stereo matching, the SURF operator (a local feature operator) and the SGBM algorithm (a global matching algorithm) are adopted respectively, and their performance is compared. After the feature point matching is completed, the correspondence between matching points and 3D object points can be built using the calibrated camera parameters, which yields the 3D information.
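
    A minimal OpenCV sketch of the SGBM matching stage, assuming rectified input images; the file names and matcher parameters are illustrative:

        import cv2

        left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # illustrative
        right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)  # file names

        # Semi-global block matching; numDisparities must be a multiple
        # of 16, and SGBM returns fixed-point disparities scaled by 16.
        sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64,
                                     blockSize=5)
        disparity = sgbm.compute(left, right).astype("float32") / 16.0

        # With the reprojection matrix Q from calibration/rectification
        # (e.g. cv2.stereoRectify on Zhang-calibrated intrinsics), the
        # disparity map converts to 3D points:
        # points3d = cv2.reprojectImageTo3D(disparity, Q)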

  19. Machine learning and computer vision approaches for phenotypic profiling.

    PubMed

    Grys, Ben T; Lo, Dara S; Sahin, Nil; Kraus, Oren Z; Morris, Quaid; Boone, Charles; Andrews, Brenda J

    2017-01-02

    With recent advances in high-throughput, automated microscopy, there has been an increased demand for effective computational strategies to analyze large-scale, image-based data. To this end, computer vision approaches have been applied to cell segmentation and feature extraction, whereas machine-learning approaches have been developed to aid in phenotypic classification and clustering of data acquired from biological images. Here, we provide an overview of the commonly used computer vision and machine-learning methods for generating and categorizing phenotypic profiles, highlighting the general biological utility of each approach. © 2017 Grys et al.

  20. Machine learning and computer vision approaches for phenotypic profiling

    PubMed Central

    Morris, Quaid

    2017-01-01

    With recent advances in high-throughput, automated microscopy, there has been an increased demand for effective computational strategies to analyze large-scale, image-based data. To this end, computer vision approaches have been applied to cell segmentation and feature extraction, whereas machine-learning approaches have been developed to aid in phenotypic classification and clustering of data acquired from biological images. Here, we provide an overview of the commonly used computer vision and machine-learning methods for generating and categorizing phenotypic profiles, highlighting the general biological utility of each approach. PMID:27940887

  1. A framework for the recognition of high-level surgical tasks from video images for cataract surgeries

    PubMed Central

    Lalys, Florent; Riffaud, Laurent; Bouget, David; Jannin, Pierre

    2012-01-01

    The need for better integration of the new generation of Computer-Assisted-Surgical (CAS) systems has been recently emphasized. One necessity to achieve this objective is to retrieve data from the Operating Room (OR) with different sensors, then to derive models from these data. Recently, the use of videos from cameras in the OR has demonstrated its efficiency. In this paper, we propose a framework to assist in the development of systems for the automatic recognition of high-level surgical tasks using microscope video analysis. We validated its use on cataract procedures. The idea is to combine state-of-the-art computer vision techniques with time series analysis. The first step of the framework consisted of the definition of several visual cues for extracting semantic information, therefore characterizing each frame of the video. Five different image-based classifiers were therefore implemented. A step of pupil segmentation was also applied for dedicated visual cue detection. Time series classification algorithms were then applied to model time-varying data. Dynamic Time Warping (DTW) and Hidden Markov Models (HMM) were tested. This association combined the advantages of all methods for a better understanding of the problem. The framework was finally validated through various studies. Six binary visual cues were chosen, along with 12 phases to detect, obtaining accuracies of 94%. PMID:22203700
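
    A minimal sketch of the DTW component, assuming one visual-cue vector per frame and a Euclidean local cost; the framework's actual DTW variant and classification scheme may differ:

        import numpy as np

        def dtw_distance(a, b):
            """Classic dynamic time warping between two sequences of
            per-frame visual-cue vectors."""
            n, m = len(a), len(b)
            D = np.full((n + 1, m + 1), np.inf)
            D[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    cost = np.linalg.norm(np.asarray(a[i - 1])
                                          - np.asarray(b[j - 1]))
                    D[i, j] = cost + min(D[i - 1, j],      # insertion
                                         D[i, j - 1],      # deletion
                                         D[i - 1, j - 1])  # match
            return D[n, m]

        # A new surgery could then be labeled with the phases of the
        # nearest reference sequence under this distance (illustrative
        # nearest-template classification).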

  2. Possible Computer Vision Systems and Automated or Computer-Aided Edging and Trimming

    Treesearch

    Philip A. Araman

    1990-01-01

    This paper discusses research which is underway to help our industry reduce costs, increase product volume and value recovery, and market more accurately graded and described products. The research is part of a team effort to help the hardwood sawmill industry automate with computer vision systems, and computer-aided or computer controlled processing. This paper...

  3. Night Vision Laboratory Static Performance Model for Thermal Viewing Systems

    DTIC Science & Technology

    1975-04-01

    Research and Development Technical Report ECOM: Night Vision Laboratory Static Performance Model for Thermal Viewing Systems. Subject terms: minimum resolvable temperature, infrared imaging, minimum detectable temperature, detection and recognition performance, night vision, noise equivalent temperature, modulation transfer function (MTF). The noise characteristics are specified by the noise equivalent temperature difference (NEΔT).

  4. The Perception of Multiple Images

    ERIC Educational Resources Information Center

    Goldstein, E. Bruce

    1975-01-01

    A discussion of visual field, foveal and peripheral vision, eye fixations, recognition and recall of pictures, memory for meaning of pictures, and the relation between speed of presentation and memory. (Editor)

  5. Smartphone, tablet computer and e-reader use by people with vision impairment.

    PubMed

    Crossland, Michael D; Silva, Rui S; Macedo, Antonio F

    2014-09-01

    Consumer electronic devices such as smartphones, tablet computers, and e-book readers have become far more widely used in recent years. Many of these devices contain accessibility features such as large print and speech. Anecdotal experience suggests people with vision impairment frequently make use of these systems. Here we survey people with self-identified vision impairment to determine their use of this equipment. An internet-based survey was advertised to people with vision impairment by word of mouth, social media, and online. Respondents were asked for demographic information, what devices they owned, what they used these devices for, and what accessibility features they used. One hundred and thirty-two complete responses were received. Twenty-six percent of the sample reported that they had no vision and the remainder reported they had low vision. One hundred and seven people (81%) reported using a smartphone. Those with no vision were as likely to use a smartphone or tablet as those with low vision. Speech was found useful by 59% of smartphone users. Fifty-one percent of smartphone owners used the camera and screen as a magnifier. Forty-eight percent of the sample used a tablet computer, and 17% used an e-book reader. The most frequently cited reasons for not using these devices were cost and lack of interest. Smartphones, tablet computers, and e-book readers can be used by people with vision impairment. Speech is used by people with low vision as well as those with no vision. Many of our (self-selected) group used their smartphone camera and screen as a magnifier, and others used the camera flash as a spotlight. © 2014 The Authors Ophthalmic & Physiological Optics © 2014 The College of Optometrists.

  6. Optical and digital pattern recognition; Proceedings of the Meeting, Los Angeles, CA, Jan. 13-15, 1987

    NASA Technical Reports Server (NTRS)

    Liu, Hua-Kuang (Editor); Schenker, Paul (Editor)

    1987-01-01

    The papers presented in this volume provide an overview of current research in both optical and digital pattern recognition, with a theme of identifying overlapping research problems and methodologies. Topics discussed include image analysis and low-level vision, optical system design, object analysis and recognition, real-time hybrid architectures and algorithms, high-level image understanding, and optical matched filter design. Papers are presented on synthetic estimation filters for a control system; white-light correlator character recognition; optical AI architectures for intelligent sensors; interpreting aerial photographs by segmentation and search; and optical information processing using a new photopolymer.

  7. Differential Geometry Applied To Least-Square Error Surface Approximations

    NASA Astrophysics Data System (ADS)

    Bolle, Ruud M.; Sabbah, Daniel

    1987-08-01

    This paper focuses on extraction of the parameters of individual surfaces from noisy depth maps. The basis for this are least-square error polynomial approximations to the range data and the curvature properties that can be computed from these approximations. The curvature properties are derived using the invariants of the Weingarten Map evaluated at the origin of local coordinate systems centered at the range points. The Weingarten Map is a well-known concept in differential geometry; a brief treatment of the differential geometry pertinent to surface curvature is given. We use the curvature properties for extracting certain surface parameters from the curvature properties of the approximations. Then we show that curvature properties alone are not enough to obtain all the parameters of the surfaces; higher order properties (information about change of curvature) are needed to obtain full parametric descriptions. This surface parameter estimation problem arises in the design of a vision system to recognize 3D objects whose surfaces are composed of planar patches and patches of quadrics of revolution. (Quadrics of revolution are quadrics that are surfaces of revolution.) A significant portion of man-made objects can be modeled using these surfaces. The actual process of recognition and parameter extraction is framed as a set of stacked parameter space transforms. The transforms are "stacked" in the sense that any one transform computes only a partial geometric description that forms the input to the next transform. Those who are interested in the organization and control of the recognition and parameter recognition process are referred to [Sabbah86], this paper briefly touches upon the organization, but concentrates mainly on geometrical aspects of the parameter extraction.
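
    The curvature computation the paper builds on can be sketched directly: fit a quadric z = ax² + bxy + cy² + dx + ey + f to range points in a local frame by least squares, then evaluate the Gaussian (K) and mean (H) curvatures at the origin from the standard graph-surface formulas, which are the invariants of the Weingarten Map. This is a generic differential-geometry sketch, not the authors' code.

```python
import numpy as np

def local_curvatures(x, y, z):
    """x, y, z: 1-D arrays of range points in a local coordinate system
    centered at the point of interest. Returns (K, H) at the origin."""
    A = np.column_stack([x**2, x*y, y**2, x, y, np.ones_like(x)])
    a, b, c, d, e, _ = np.linalg.lstsq(A, z, rcond=None)[0]
    # At the origin: z_x = d, z_y = e, z_xx = 2a, z_xy = b, z_yy = 2c
    g = 1.0 + d**2 + e**2
    K = (4*a*c - b**2) / g**2                                        # Gaussian curvature
    H = ((1 + e**2)*2*a - 2*d*e*b + (1 + d**2)*2*c) / (2 * g**1.5)   # mean curvature
    return K, H

# K ~ 0 and H ~ 0 suggests a planar patch; K ~ 0 with H != 0 suggests a
# cylinder-like patch: the kind of evidence the stacked parameter-space
# transforms accumulate into a surface description.
```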

  8. The Memory That's Right and the Memory That's Left: Event-Related Potentials Reveal Hemispheric Asymmetries in the Encoding and Retention of Verbal Information

    ERIC Educational Resources Information Center

    Evans, Karen M.; Federmeier, Kara D.

    2007-01-01

    We examined the nature and timecourse of hemispheric asymmetries in verbal memory by recording event-related potentials (ERPs) in a continuous recognition task. Participants made overt recognition judgments to test words presented in central vision that were either novel (new words) or had been previously presented in the left or right visual…

  9. Behavioral model of visual perception and recognition

    NASA Astrophysics Data System (ADS)

    Rybak, Ilya A.; Golovan, Alexander V.; Gusakova, Valentina I.

    1993-09-01

    In the processes of visual perception and recognition, human eyes actively select essential information by way of successive fixations at the most informative points of the image. A behavioral program defining a scanpath of the image is formed at the stage of learning (object memorizing) and consists of sequential motor actions, which are shifts of attention from one point of fixation to another, and sensory signals expected to arrive in response to each shift of attention. In the modern view of the problem, invariant object recognition is provided by the following: (1) separated processing of `what' (object features) and `where' (spatial features) information at high levels of the visual system; (2) mechanisms of visual attention using `where' information; (3) representation of `what' information in an object-based frame of reference (OFR). However, most recent models of vision based on the OFR have demonstrated invariant recognition of only simple objects like letters or binary objects without background, i.e. objects to which a frame of reference is easily attached. In contrast, we use not an OFR but a feature-based frame of reference (FFR), connected with the basic feature (edge) at the fixation point. This has provided our model with the ability to represent complex objects in gray-level images invariantly, but demands realization of the behavioral aspects of vision described above. The developed model contains a neural network subsystem of low-level vision which extracts a set of primary features (edges) in each fixation, and a high-level subsystem consisting of `what' (Sensory Memory) and `where' (Motor Memory) modules. The resolution of primary feature extraction decreases with distance from the point of fixation. The FFR provides both the invariant representation of object features in Sensory Memory and the shifts of attention in Motor Memory. Object recognition consists of the successive recall (from Motor Memory) and execution of shifts of attention, with successive verification of the expected sets of features (stored in Sensory Memory). The model demonstrates recognition of complex objects (such as faces) in gray-level images, invariant with respect to shift, rotation, and scale.

  10. Machine vision for real time orbital operations

    NASA Technical Reports Server (NTRS)

    Vinz, Frank L.

    1988-01-01

    Machine vision for automation and robotic operation of Space Station era systems has the potential for increasing the efficiency of orbital servicing, repair, assembly and docking tasks. A machine vision research project is described in which a TV camera is used for inputting visual data to a computer so that image processing may be achieved for real-time control of these orbital operations. A technique has resulted from this research which reduces computer memory requirements and greatly increases typical computational speed, such that it has the potential for development into a real-time orbital machine vision system. This technique is called AI BOSS (Analysis of Images by Box Scan and Syntax).

  11. Development of a Wireless Computer Vision Instrument to Detect Biotic Stress in Wheat

    PubMed Central

    Casanova, Joaquin J.; O'Shaughnessy, Susan A.; Evett, Steven R.; Rush, Charles M.

    2014-01-01

    Knowledge of crop abiotic and biotic stress is important for optimal irrigation management. While spectral reflectance and infrared thermometry provide a means to quantify crop stress remotely, these measurements can be cumbersome. Computer vision offers an inexpensive way to remotely detect crop stress independent of vegetation cover. This paper presents a technique using computer vision to detect disease stress in wheat. Digital images of differentially stressed wheat were segmented into soil and vegetation pixels using expectation maximization (EM). In the first season, the algorithm to segment vegetation from soil and distinguish between healthy and stressed wheat was developed and tested using digital images taken in the field and later processed on a desktop computer. In the second season, a wireless camera with near real-time computer vision capabilities was tested in conjunction with the conventional camera and desktop computer. For wheat irrigated at different levels and inoculated with wheat streak mosaic virus (WSMV), vegetation hue determined by the EM algorithm showed significant effects from irrigation level and infection. Unstressed wheat had a higher hue (118.32) than stressed wheat (111.34). In the second season, the hue and cover measured by the wireless computer vision sensor showed significant effects from infection (p = 0.0014), as did the conventional camera (p < 0.0001). Vegetation hue obtained through a wireless computer vision system in this study is a viable option for determining biotic crop stress in irrigation scheduling. Such a low-cost system could be suitable for use in the field in automated irrigation scheduling applications. PMID:25251410
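
    A minimal sketch of the EM segmentation step, assuming a two-component Gaussian mixture over HSV pixel values with the greener-hue component taken as vegetation; the paper's exact features, scaling and component-assignment rule may differ.

```python
import numpy as np
from skimage.color import rgb2hsv
from sklearn.mixture import GaussianMixture

def vegetation_hue(rgb):
    """rgb: (H, W, 3) float image in [0, 1].
    Returns (mean vegetation hue in degrees, vegetation cover fraction)."""
    hsv = rgb2hsv(rgb).reshape(-1, 3)
    gm = GaussianMixture(n_components=2, random_state=0).fit(hsv)  # EM fit
    labels = gm.predict(hsv)
    # Assume the component with the larger mean hue is vegetation
    # (green hue ~120 deg exceeds soil's reddish-brown hue).
    veg = int(np.argmax(gm.means_[:, 0]))
    hue_deg = hsv[labels == veg, 0].mean() * 360.0
    cover = float((labels == veg).mean())
    return hue_deg, cover
```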

  12. Relationship between slow visual processing and reading speed in people with macular degeneration

    PubMed Central

    Cheong, Allen MY; Legge, Gordon E; Lawrence, Mary G; Cheung, Sing-Hang; Ruff, Mary A

    2007-01-01

    Purpose People with macular degeneration (MD) often read slowly even with adequate magnification to compensate for acuity loss. Oculomotor deficits may affect reading in MD, but cannot fully explain the substantial reduction in reading speed. Central-field loss (CFL) is often a consequence of macular degeneration, necessitating the use of peripheral vision for reading. We hypothesized that slower temporal processing of visual patterns in peripheral vision is a factor contributing to slow reading performance in MD patients. Methods Fifteen subjects with MD, including 12 with CFL, and five age-matched control subjects were recruited. Maximum reading speed and critical print size were measured with RSVP (Rapid Serial Visual Presentation). Temporal processing speed was studied by measuring letter-recognition accuracy for strings of three randomly selected letters centered at fixation for a range of exposure times. Temporal threshold was defined as the exposure time yielding 80% recognition accuracy for the central letter. Results Temporal thresholds for the MD subjects ranged from 159 to 5881 ms, much longer than values for age-matched controls in central vision (13 ms, p<0.01). The mean temporal threshold for the 11 MD subjects who used eccentric fixation (1555.8 ± 1708.4 ms) was much longer than the mean temporal threshold (97.0 ms ± 34.2 ms, p<0.01) for the age-matched controls at 10° in the lower visual field. Individual temporal thresholds accounted for 30% of the variance in reading speed (p<0.05). Conclusion The significant association between increased temporal threshold for letter recognition and reduced reading speed is consistent with the hypothesis that slower visual processing of letter recognition is one of the factors limiting reading speed in MD subjects. PMID:17881032

  13. Lunar Applications in Reconfigurable Computing

    NASA Technical Reports Server (NTRS)

    Somervill, Kevin

    2008-01-01

    NASA's Constellation Program is developing a lunar surface outpost in which reconfigurable computing will play a significant role. Reconfigurable systems provide a number of benefits over conventional software-based implementations, including performance and power efficiency, while the use of standardized reconfigurable hardware provides opportunities to reduce logistical overhead. The current vision for the lunar surface architecture includes habitation, mobility, and communications systems, each of which greatly benefits from reconfigurable hardware in applications including video processing, natural feature recognition, data formatting, IP offload processing, and embedded control systems. In deploying reprogrammable hardware, considerations similar to those of software systems must be managed. There needs to be a mechanism for discovery enabling applications to locate and utilize the available resources. Also, application interfaces are needed to provide for both configuring the resources and transferring data between the application and the reconfigurable hardware. Each of these topics is explored in the context of deploying reconfigurable resources as an integral aspect of the lunar exploration architecture.

  14. A Novel Interdisciplinary Approach to Socio-Technical Complexity

    NASA Astrophysics Data System (ADS)

    Bassetti, Chiara

    The chapter presents a novel interdisciplinary approach that integrates micro-sociological analysis into computer-vision and pattern-recognition modeling and algorithms, the purpose being to tackle socio-technical complexity at a systemic yet micro-grounded level. The approach is empirically grounded and both theoretically and analytically driven, yet systemic and multidimensional, semi-supervised and computable, and oriented towards large-scale applications. The chapter describes the proposed approach especially as regards its sociological foundations, and as applied to the analysis of a particular setting, i.e. sport-spectator crowds. Crowds, better defined as large gatherings, are almost ever-present in our societies, and capturing their dynamics is crucial. From social sciences to public safety management and emergency response, modeling and predicting large gatherings' presence and dynamics, thus possibly preventing critical situations and being able to properly react to them, is fundamental. This is where semi-automated technologies can make the difference. The work presented in this chapter is intended as a scientific step towards such an objective.

  15. Computer vision-based analysis of foods: a non-destructive colour measurement tool to monitor quality and safety.

    PubMed

    Mogol, Burçe Ataç; Gökmen, Vural

    2014-05-01

    Computer vision-based image analysis has been widely used in food industry to monitor food quality. It allows low-cost and non-contact measurements of colour to be performed. In this paper, two computer vision-based image analysis approaches are discussed to extract mean colour or featured colour information from the digital images of foods. These types of information may be of particular importance as colour indicates certain chemical changes or physical properties in foods. As exemplified here, the mean CIE a* value or browning ratio determined by means of computer vision-based image analysis algorithms can be correlated with acrylamide content of potato chips or cookies. Or, porosity index as an important physical property of breadcrumb can be calculated easily. In this respect, computer vision-based image analysis provides a useful tool for automatic inspection of food products in a manufacturing line, and it can be actively involved in the decision-making process where rapid quality/safety evaluation is needed. © 2013 Society of Chemical Industry.
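
    A minimal sketch of the mean-colour measurement just described, assuming scikit-image's RGB-to-CIELAB conversion and a simple lightness threshold (an assumed rule, not from the paper) to mask the product from the background before averaging a*.

```python
import numpy as np
from skimage.color import rgb2lab

def mean_a_star(rgb, l_min=20.0):
    """rgb: (H, W, 3) float image in [0, 1].
    Returns the mean CIE a* (redness) over the product pixels."""
    lab = rgb2lab(rgb)
    mask = lab[..., 0] > l_min      # drop dark background pixels (assumed rule)
    return float(lab[..., 1][mask].mean())
```

    A mean a* computed this way can then be correlated against a chemical measurement such as acrylamide content, as the abstract describes.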

  16. Job-shop scheduling applied to computer vision

    NASA Astrophysics Data System (ADS)

    Sebastian y Zuniga, Jose M.; Torres-Medina, Fernando; Aracil, Rafael; Reinoso, Oscar; Jimenez, Luis M.; Garcia, David

    1997-09-01

    This paper presents a method for minimizing the total elapsed time spent by n tasks running on m different processors working in parallel. The developed algorithm not only minimizes the total elapsed time but also reduces the idle time and the waiting time of in-process tasks. This condition is very important in some applications of computer vision in which the time to finish the total process is particularly critical: quality control in industrial inspection, real-time computer vision, guided robots. The scheduling algorithm is based on the use of two matrices obtained from the precedence relationships between tasks, and on the data derived from these matrices. The developed scheduling algorithm has been tested in an application of quality control using computer vision. The results obtained have been satisfactory in the application of different image processing algorithms.
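
    In the same spirit, a generic precedence-respecting list scheduler can be sketched as follows: ready tasks are assigned greedily to the earliest-free processor, longest task first, which tends to keep idle and waiting time low. The paper's two matrices are reduced here to a predecessor dictionary and a duration table; this is an illustration of the problem setting, not the published algorithm.

```python
import heapq

def schedule(durations, preds, m):
    """durations: {task: processing time}; preds: {task: set of prerequisites};
    m: number of parallel processors. Returns (schedule, total elapsed time)."""
    finish = {}                                   # task -> completion time
    procs = [(0.0, p) for p in range(m)]          # (free-at time, processor id)
    heapq.heapify(procs)
    done, order = set(), []
    while len(done) < len(durations):
        # Tasks whose prerequisites have all completed
        ready = [t for t in durations if t not in done and preds.get(t, set()) <= done]
        t = min(ready, key=lambda r: -durations[r])       # longest task first
        free_at, p = heapq.heappop(procs)
        start = max(free_at, max((finish[q] for q in preds.get(t, set())), default=0.0))
        finish[t] = start + durations[t]
        heapq.heappush(procs, (finish[t], p))
        done.add(t)
        order.append((t, p, start))
    return order, max(finish.values())
```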

  17. Development of the method of aggregation to determine the current storage area using computer vision and radiofrequency identification

    NASA Astrophysics Data System (ADS)

    Astafiev, A.; Orlov, A.; Privezencev, D.

    2018-01-01

    The article is devoted to the development of technology and software for constructing positioning and control systems in industrial plants, based on the aggregation of computer vision and radio-frequency identification to determine the current storage area. It describes the hardware design of a positioning system for industrial products on the territory of a plant on the basis of a radio-frequency grid, and a corresponding hardware design on the basis of computer vision methods. It then describes the developed method of aggregation that combines the two to determine the current storage area. Experimental studies in laboratory and production conditions have been conducted and are described in the article.

  18. Energy scaling advantages of resistive memory crossbar based computation and its application to sparse coding

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Agarwal, Sapan; Quach, Tu -Thach; Parekh, Ojas

    In this study, the exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational properties of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an N × N crossbar, these two kernels can be O(N) more energy efficient than a conventional digital memory-based architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm when run with finite precision. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning.
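
    The two kernels can be written down in conventional digital form in a few lines; on an analog N × N crossbar each happens in one parallel step across the whole array, which is the source of the O(N) energy advantage the paper analyzes. The values below are purely illustrative.

```python
import numpy as np

N = 256
W = np.random.randn(N, N) * 0.01   # conductance matrix (illustrative values)
x = np.random.randn(N)             # input voltages

y = W.T @ x                        # "parallel read": vector-matrix multiply

eta = 0.01                         # learning rate (illustrative)
post = np.random.randn(N)          # post-synaptic activity (illustrative)
W += eta * np.outer(x, post)       # "parallel write": rank-1 outer-product update

# In a sparse-coding loop these two kernels implement, respectively, the
# membrane-current computation and the dictionary (weight) learning step.
```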

  19. Energy scaling advantages of resistive memory crossbar based computation and its application to sparse coding

    DOE PAGES

    Agarwal, Sapan; Quach, Tu -Thach; Parekh, Ojas; ...

    2016-01-06

    In this study, the exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational properties of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an N × N crossbar, these two kernels can be O(N) more energy efficient than a conventional digital memory-based architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm when run with finite precision. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning.

  20. Texture and art with deep neural networks.

    PubMed

    Gatys, Leon A; Ecker, Alexander S; Bethge, Matthias

    2017-10-01

    Although the study of biological vision and computer vision attempt to understand powerful visual information processing from different angles, they have a long history of informing each other. Recent advances in texture synthesis that were motivated by visual neuroscience have led to a substantial advance in image synthesis and manipulation in computer vision using convolutional neural networks (CNNs). Here, we review these recent advances and discuss how they can in turn inspire new research in visual perception and computational neuroscience. Copyright © 2017. Published by Elsevier Ltd.

  1. Accommodative spasm in siblings: A unique finding

    PubMed Central

    Rutstein, Robert P

    2010-01-01

    Accommodative spasm is a rare condition occurring in children, adolescents, and young adults. A familial tendency for this binocular vision disorder has not been reported. I describe accommodative spasm occurring in a brother and sister. Both children presented on the same day with complaints of headaches and blurred vision. Treatment included cycloplegic drops and bifocals. Siblings of patients having accommodative spasm should receive a detailed eye exam with emphasis on recognition of accommodative spasm. PMID:20534925

  2. Vision-based object detection and recognition system for intelligent vehicles

    NASA Astrophysics Data System (ADS)

    Ran, Bin; Liu, Henry X.; Martono, Wilfung

    1999-01-01

    Recently, a proactive crash mitigation system was proposed to enhance the crash avoidance and survivability of Intelligent Vehicles. An accurate object detection and recognition system is a prerequisite for a proactive crash mitigation system, as system component deployment algorithms rely on accurate hazard detection, recognition, and tracking information. In this paper, we present a vision-based approach to detect and recognize vehicles and traffic signs, obtain their information, and track multiple objects by using a sequence of color images taken from a moving vehicle. The entire system consists of two sub-systems, the vehicle detection and recognition sub-system and the traffic sign detection and recognition sub-system. Both of the sub-systems consist of four models: an object detection model, an object recognition model, an object information model, and an object tracking model. In order to detect potential objects on the road, several features of the objects are investigated, including the symmetrical shape and aspect ratio of a vehicle and the color and shape information of the signs. A two-layer neural network is trained to recognize different types of vehicles, and a parameterized traffic sign model is established in the process of recognizing a sign. Tracking is accomplished by combining the analysis of single image frames with the analysis of consecutive image frames. The analysis of the single image frame is performed every ten full-size images. The information model obtains information related to the object, such as time to collision for the object vehicle and relative distance from the traffic signs. Experimental results demonstrated a robust and accurate system for real-time object detection and recognition over thousands of image frames.

  3. Artificial intelligence and signal processing for infrastructure assessment

    NASA Astrophysics Data System (ADS)

    Assaleh, Khaled; Shanableh, Tamer; Yehia, Sherif

    2015-04-01

    Ground Penetrating Radar (GPR) is recognized as an effective nondestructive evaluation technique for improving the inspection process. However, data interpretation and the complexity of the results impose some limitations on the practicality of using this technique, mainly because a trained, experienced person is needed to interpret the images obtained by the GPR system. In this paper, an algorithm to classify and assess the condition of infrastructure utilizing image processing and pattern recognition techniques is discussed. Features extracted from a dataset of images of defective and healthy slabs are used to train a computer vision based system, while another dataset is used to evaluate the proposed algorithm. Initial results show that the proposed algorithm is able to detect the existence of defects with a success rate of about 77%.
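
    A hedged sketch of such a classification stage: hand-crafted intensity and gradient statistics extracted from GPR B-scan images feed a standard supervised classifier. The feature set and the SVM choice are assumptions for illustration only; the abstract does not specify the exact features or classifier used.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def gpr_features(img):
    """Simple intensity/gradient statistics of one B-scan (2-D array)."""
    gy, gx = np.gradient(img.astype(float))
    return [img.mean(), img.std(), np.abs(gx).mean(), np.abs(gy).mean()]

# X_imgs: list of B-scan arrays; y: 1 = defective, 0 = healthy (hypothetical data)
def train(X_imgs, y):
    X = np.array([gpr_features(im) for im in X_imgs])
    return make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, y)
```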

  4. Tera-Ops Processing for ATR

    NASA Technical Reports Server (NTRS)

    Udomkesmalee, Suraphol; Padgett, Curtis; Zhu, David; Lung, Gerald; Howard, Ayanna

    2000-01-01

    A three-dimensional microelectronic device (3DANN-R) capable of performing general image convolution at a speed of 10^12 operations/second (ops) in a volume of less than 1.5 cubic centimeters has been successfully built under the BMDO/JPL VIGILANTE program. 3DANN-R was developed in partnership with Irvine Sensors Corp., Costa Mesa, California. 3DANN-R is a sugar-cube-sized, low-power image convolution engine whose core computation circuitry is capable of performing 64 image convolutions with large (64x64) windows at video frame rates. This paper explores potential applications of 3DANN-R, such as target recognition, SAR and hyperspectral data processing, and general machine vision using real data, and discusses technical challenges for providing deployable systems for BMDO surveillance and interceptor programs.

  5. Performance Monitoring Of A Computer Numerically Controlled (CNC) Lathe Using Pattern Recognition Techniques

    NASA Astrophysics Data System (ADS)

    Daneshmend, L. K.; Pak, H. A.

    1984-02-01

    On-line monitoring of the cutting process in CNC lathe is desirable to ensure unattended fault-free operation in an automated environment. The state of the cutting tool is one of the most important parameters which characterises the cutting process. Direct monitoring of the cutting tool or workpiece is not feasible during machining. However several variables related to the state of the tool can be measured on-line. A novel monitoring technique is presented which uses cutting torque as the variable for on-line monitoring. A classifier is designed on the basis of the empirical relationship between cutting torque and flank wear. The empirical model required by the on-line classifier is established during an automated training cycle using machine vision for off-line direct inspection of the tool.

  6. When Ultrasonic Sensors and Computer Vision Join Forces for Efficient Obstacle Detection and Recognition

    PubMed Central

    Mocanu, Bogdan; Tapu, Ruxandra; Zaharia, Titus

    2016-01-01

    In the most recent report published by the World Health Organization concerning people with visual disabilities it is highlighted that by the year 2020, worldwide, the number of completely blind people will reach 75 million, while the number of visually impaired (VI) people will rise to 250 million. Within this context, the development of dedicated electronic travel aid (ETA) systems, able to increase the safe displacement of VI people in indoor/outdoor spaces while providing additional cognition of the environment, becomes of utmost importance. This paper introduces a novel wearable assistive device designed to facilitate the autonomous navigation of blind and VI people in highly dynamic urban scenes. The system exploits two independent sources of information: ultrasonic sensors and the video camera embedded in a regular smartphone. The underlying methodology exploits computer vision and machine learning techniques and makes it possible to identify accurately both static and highly dynamic objects present in a scene, regardless of their location, size or shape. In addition, the proposed system is able to acquire information about the environment, semantically interpret it and alert users about possible dangerous situations through acoustic feedback. To determine the performance of the proposed methodology we conducted an extensive objective and subjective experimental evaluation with the help of 21 VI subjects from two blind associations. The users pointed out that our prototype is highly helpful in increasing their mobility, while being friendly and easy to learn. PMID:27801834

  7. When Ultrasonic Sensors and Computer Vision Join Forces for Efficient Obstacle Detection and Recognition.

    PubMed

    Mocanu, Bogdan; Tapu, Ruxandra; Zaharia, Titus

    2016-10-28

    In the most recent report published by the World Health Organization concerning people with visual disabilities it is highlighted that by the year 2020, worldwide, the number of completely blind people will reach 75 million, while the number of visually impaired (VI) people will rise to 250 million. Within this context, the development of dedicated electronic travel aid (ETA) systems, able to increase the safe displacement of VI people in indoor/outdoor spaces while providing additional cognition of the environment, becomes of utmost importance. This paper introduces a novel wearable assistive device designed to facilitate the autonomous navigation of blind and VI people in highly dynamic urban scenes. The system exploits two independent sources of information: ultrasonic sensors and the video camera embedded in a regular smartphone. The underlying methodology exploits computer vision and machine learning techniques and makes it possible to identify accurately both static and highly dynamic objects present in a scene, regardless of their location, size or shape. In addition, the proposed system is able to acquire information about the environment, semantically interpret it and alert users about possible dangerous situations through acoustic feedback. To determine the performance of the proposed methodology we conducted an extensive objective and subjective experimental evaluation with the help of 21 VI subjects from two blind associations. The users pointed out that our prototype is highly helpful in increasing their mobility, while being friendly and easy to learn.

  8. Owls see in stereo much like humans do.

    PubMed

    van der Willigen, Robert F

    2011-06-10

    While 3D experiences through binocular disparity sensitivity have acquired special status in the understanding of human stereo vision, much remains to be learned about how binocularity is put to use in animals. The owl provides an exceptional model to study stereo vision as it displays one of the highest degrees of binocular specialization throughout the animal kingdom. In a series of six behavioral experiments, equivalent to hallmark human psychophysical studies, I compiled an extensive body of stereo performance data from two trained owls. Computer-generated, binocular random-dot patterns were used to ensure pure stereo performance measurements. In all cases, I found that owls perform much like humans do, viz.: (1) disparity alone can evoke figure-ground segmentation; (2) selective use of "relative" rather than "absolute" disparity; (3) hyperacute sensitivity; (4) disparity processing allows for the avoidance of monocular feature detection prior to object recognition; (5) large binocular disparities are not tolerated; (6) disparity guides the perceptual organization of 2D shape. The robustness and very nature of these binocular disparity-based perceptual phenomena bear out that owls, like humans, exploit the third dimension to facilitate early figure-ground segmentation of tangible objects.

  9. Egocentric daily activity recognition via multitask clustering.

    PubMed

    Yan, Yan; Ricci, Elisa; Liu, Gaowen; Sebe, Nicu

    2015-10-01

    Recognizing human activities from videos is a fundamental research problem in computer vision. Recently, there has been a growing interest in analyzing human behavior from data collected with wearable cameras. First-person cameras continuously record several hours of their wearers' life. To cope with this vast amount of unlabeled and heterogeneous data, novel algorithmic solutions are required. In this paper, we propose a multitask clustering framework for activity of daily living analysis from visual data gathered from wearable cameras. Our intuition is that, even if the data are not annotated, it is possible to exploit the fact that the tasks of recognizing everyday activities of multiple individuals are related, since typically people perform the same actions in similar environments (e.g., people working in an office often read and write documents). In our framework, rather than clustering data from different users separately, we propose to look for clustering partitions which are coherent among related tasks. In particular, two novel multitask clustering algorithms, derived from a common optimization problem, are introduced. Our experimental evaluation, conducted both on synthetic data and on publicly available first-person vision data sets, shows that the proposed approach outperforms several single-task and multitask learning methods.

  10. Acting to gain information

    NASA Technical Reports Server (NTRS)

    Rosenchein, Stanley J.; Burns, J. Brian; Chapman, David; Kaelbling, Leslie P.; Kahn, Philip; Nishihara, H. Keith; Turk, Matthew

    1993-01-01

    This report is concerned with agents that act to gain information. In previous work, we developed agent models combining qualitative modeling with real-time control. That work, however, focused primarily on actions that affect physical states of the environment. The current study extends that work by explicitly considering problems of active information-gathering and by exploring specialized aspects of information-gathering in computational perception, learning, and language. In our theoretical investigations, we decomposed agents into their perceptual and action components and identified these with elements of a state-machine model of control. The mathematical properties of each were developed in isolation, and interactions were then studied. We considered the complexity dimension and the uncertainty dimension and related these to intelligent-agent design issues. We also explored active information gathering in visual processing. Working within the active vision paradigm, we developed a concept of 'minimal meaningful measurements' suitable for demand-driven vision. We then developed and tested an architecture for ongoing recognition and interpretation of visual information. In the area of information gathering through learning, we explored techniques for coping with combinatorial complexity. We also explored information gathering through explicit linguistic action by considering the nature of conversational rules, coordination, and situated communication behavior.

  11. Age Differences in the Differentiation of Trait Impressions From Faces

    PubMed Central

    Ng, Stacey Y.; Zebrowitz, Leslie A.; Franklin, Robert G.

    2016-01-01

    Objectives. We investigated whether evidence that older adults (OA) show less differentiation of visual stimuli than younger adults (YA) extends to trait impressions from faces and effects of face age. We also examined whether age differences in mood, vision, or cognition mediated differentiation differences. Finally, we investigated whether age differences in trait differentiation mediated differences in impression positivity. Method. We used a differentiation index adapted from previous work on stereotyping to assess OA and YA likelihood of assigning different faces to different levels on trait scales. We computed scores for ratings of older and younger faces’ competence, health, hostility, and untrustworthiness. Results. OA showed less differentiated trait ratings than YA. Measures of mood, vision, and cognition did not mediate these rater age differences. Hostility was differentiated more for younger than older faces, while health was differentiated more for older faces, but only by OA. Age differences in differentiation mediated age differences in impression positivity. Discussion. Less differentiation of trait impressions from faces in OA is consistent with previous evidence for less differentiation in face and emotion recognition. Results indicated that age-related dedifferentiation does not reflect narrow changes in visual function. They also provide a novel explanation for OA positivity effects. PMID:25194140

  12. Ontological Representation of Light Wave Camera Data to Support Vision-Based AmI

    PubMed Central

    Serrano, Miguel Ángel; Gómez-Romero, Juan; Patricio, Miguel Ángel; García, Jesús; Molina, José Manuel

    2012-01-01

    Recent advances in technologies for capturing video data have opened a vast number of new application areas in visual sensor networks. Among them, the incorporation of light wave cameras into Ambient Intelligence (AmI) environments provides more accurate tracking capabilities for activity recognition. Although the performance of tracking algorithms has quickly improved, the symbolic models used to represent the resulting knowledge have not yet been adapted to smart environments. This lack of representation makes it impossible to take advantage of the semantic quality of the information provided by new sensors. This paper advocates the introduction of a part-based representational level in cognitive-based systems in order to accurately represent the knowledge from the novel sensors. The paper also reviews the theoretical and practical issues in part-whole relationships, proposing a specific taxonomy for computer vision approaches. General part-based patterns for the human body and transitive part-based representation and inference are incorporated into a previous ontology-based framework to enhance scene interpretation in the area of video-based AmI. The advantages and new features of the model are demonstrated in a Social Signal Processing (SSP) application for the elaboration of live market research.

  13. Impact of computer use on children's vision.

    PubMed

    Kozeis, N

    2009-10-01

    Today, millions of children use computers on a daily basis. Extensive viewing of the computer screen can lead to eye discomfort, fatigue, blurred vision and headaches, dry eyes and other symptoms of eyestrain. These symptoms may be caused by poor lighting, glare, an improper work station set-up, vision problems of which the person was not previously aware, or a combination of these factors. Children can experience many of the same symptoms related to computer use as adults. However, some unique aspects of how children use computers may make them more susceptible than adults to the development of these problems. In this study, the most common eye symptoms related to computer use in childhood, the possible causes and ways to avoid them are reviewed.

  14. Evaluating Effects of Divided Hemispheric Processing on Word Recognition in Foveal and Extrafoveal Displays: The Evidence from Arabic

    PubMed Central

    Almabruk, Abubaker A. A.; Paterson, Kevin B.; McGowan, Victoria; Jordan, Timothy R.

    2011-01-01

    Background Previous studies have claimed that a precise split at the vertical midline of each fovea causes all words to the left and right of fixation to project to the opposite, contralateral hemisphere, and this division in hemispheric processing has considerable consequences for foveal word recognition. However, research in this area is dominated by the use of stimuli from Latinate languages, which may induce specific effects on performance. Consequently, we report two experiments using stimuli from a fundamentally different, non-Latinate language (Arabic) that offers an alternative way of revealing effects of split-foveal processing, if they exist. Methods and Findings Words (and pseudowords) were presented to the left or right of fixation, either close to fixation and entirely within foveal vision, or further from fixation and entirely within extrafoveal vision. Fixation location and stimulus presentations were carefully controlled using an eye-tracker linked to a fixation-contingent display. To assess word recognition, Experiment 1 used the Reicher-Wheeler task and Experiment 2 used the lexical decision task. Results Performance in both experiments indicated a functional division in hemispheric processing for words in extrafoveal locations (in recognition accuracy in Experiment 1 and in reaction times and error rates in Experiment 2) but no such division for words in foveal locations. Conclusions These findings from a non-Latinate language provide new evidence that although a functional division in hemispheric processing exists for word recognition outside the fovea, this division does not extend up to the point of fixation. Some implications for word recognition and reading are discussed. PMID:21559084

  15. Biometrics: Facing Up to Terrorism

    DTIC Science & Technology

    2001-10-01

    A government committee appointed by Secretary of Transportation Norman Y. Mineta to review airport security measures will recommend that facial recognition technology be deployed... Joseph Atick, the CEO of Visionics, testified before the government on the role facial recognition technology can play in enhancing airport security... A face-recognition system was deployed at a U.S. airport; this deployment is believed to be the first-in-the-nation use of face-recognition technology for airport security.

  16. Computer vision-based automated peak picking applied to protein NMR spectra.

    PubMed

    Klukowski, Piotr; Walczak, Michal J; Gonczarek, Adam; Boudet, Julien; Wider, Gerhard

    2015-09-15

    A detailed analysis of multidimensional NMR spectra of macromolecules requires the identification of individual resonances (peaks). This task can be tedious and time-consuming and often requires support by experienced users. Automated peak picking algorithms were introduced more than 25 years ago, but major deficiencies still often prevent complete and error-free peak picking of biological macromolecule spectra. The major challenges for automated peak picking algorithms are the distinction of artifacts from real peaks, particularly those with irregular shapes, and the picking of peaks in spectral regions with overlapping resonances, which are very hard to resolve with existing computer algorithms. In both of these cases a visual inspection approach could be more effective than a 'blind' algorithm. We present a novel approach using computer vision (CV) methodology which is better adapted to the problem of peak recognition. After suitable 'training' we successfully applied the CV algorithm to spectra of medium-sized soluble proteins up to molecular weights of 26 kDa and to a 130 kDa complex of a tetrameric membrane protein in detergent micelles. Our CV approach outperforms commonly used programs. With suitable training datasets the application of the presented method can be extended to automated peak picking in multidimensional spectra of nucleic acids or carbohydrates and adapted to solid-state NMR spectra. CV-Peak Picker is available upon request from the authors. gsw@mol.biol.ethz.ch; michal.walczak@mol.biol.ethz.ch; adam.gonczarek@pwr.edu.pl Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
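
    For contrast, a bare-bones classical peak picker, local maxima above a noise-derived threshold, fits in a few lines; this is the baseline idea that the learned computer-vision approach improves upon, and the threshold factor below is an assumption.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def pick_peaks(spectrum, size=5, k=6.0):
    """spectrum: 2-D array of spectral intensities.
    Returns (row, col) indices of candidate peaks."""
    noise = np.median(np.abs(spectrum))                      # crude noise estimate
    is_max = spectrum == maximum_filter(spectrum, size=size) # local maxima
    return np.argwhere(is_max & (spectrum > k * noise))
```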

  17. The impact on midlevel vision of statistically optimal divisive normalization in V1

    PubMed Central

    Coen-Cagli, Ruben; Schwartz, Odelia

    2013-01-01

    The first two areas of the primate visual cortex (V1, V2) provide a paradigmatic example of hierarchical computation in the brain. However, neither the functional properties of V2 nor the interactions between the two areas are well understood. One key aspect is that the statistics of the inputs received by V2 depend on the nonlinear response properties of V1. Here, we focused on divisive normalization, a canonical nonlinear computation that is observed in many neural areas and modalities. We simulated V1 responses with (and without) different forms of surround normalization derived from statistical models of natural scenes, including canonical normalization and a statistically optimal extension that accounted for image nonhomogeneities. The statistics of the V1 population responses differed markedly across models. We then addressed how V2 receptive fields pool the responses of V1 model units with different tuning. We assumed this is achieved by learning without supervision a linear representation that removes correlations, which could be accomplished with principal component analysis. This approach revealed V2-like feature selectivity when we used the optimal normalization and, to a lesser extent, the canonical one but not in the absence of both. We compared the resulting two-stage models on two perceptual tasks; while models encompassing V1 surround normalization performed better at object recognition, only statistically optimal normalization provided systematic advantages in a task more closely matched to midlevel vision, namely figure/ground judgment. Our results suggest that experiments probing midlevel areas might benefit from using stimuli designed to engage the computations that characterize V1 optimality. PMID:23857950
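
    Canonical divisive normalization is compact enough to state directly: each model unit's linear drive is divided by a pooled measure of the surrounding drives. The sketch below gives only the textbook form; the statistically optimal variant in the paper additionally gates the surround pool by inferred image homogeneity, and the constants here are illustrative.

```python
import numpy as np

def normalize(drive, sigma=0.1):
    """drive: (n_units,) linear filter responses for one image patch.
    Returns divisively normalized responses."""
    pool = np.sqrt(np.mean(drive**2))   # normalization pool over the population
    return drive / (sigma + pool)       # canonical divisive normalization

linear = np.random.randn(64)            # illustrative V1 filter outputs
responses = normalize(linear)
```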

  18. Operational Assessment of Color Vision

    DTIC Science & Technology

    2016-06-20

    Subject terms: color vision, aviation, cone contrast test, Colour Assessment & Diagnosis, ColorDx, OBVA. Color-coded symbologies are frequently used to aid or direct critical activities such as aircraft landing approaches or railroad right-of-way designations... Computer-generated display systems have facilitated the development of computer-based, automated tests of color vision [14,15].

  19. Neo-Symbiosis: The Next Stage in the Evolution of Human Information Interaction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Griffith, Douglas; Greitzer, Frank L.

    We re-address the vision of human-computer symbiosis expressed by J. C. R. Licklider nearly a half-century ago, when he wrote: “The hope is that in not too many years, human brains and computing machines will be coupled together very tightly, and that the resulting partnership will think as no human brain has ever thought and process data in a way not approached by the information-handling machines we know today.” (Licklider, 1960). Unfortunately, little progress was made toward this vision over the four decades following Licklider’s challenge, despite significant advancements in the fields of human factors and computer science, and Licklider’s vision was largely forgotten. However, recent advances in information science and technology, psychology, and neuroscience have rekindled the potential of making Licklider’s vision a reality. This paper provides a historical context for and updates the vision, and it argues that such a vision is needed as a unifying framework for advancing IS&T.

  20. Discriminative exemplar coding for sign language recognition with Kinect.

    PubMed

    Sun, Chao; Zhang, Tianzhu; Bao, Bing-Kun; Xu, Changsheng; Mei, Tao

    2013-10-01

    Sign language recognition is a growing research area in the field of computer vision. A challenge within it is to model various signs, which vary in time resolution, visual manual appearance, and so on. In this paper, we propose a discriminative exemplar coding (DEC) approach, utilizing the Kinect sensor, to model various signs. The proposed DEC method can be summarized in three steps. First, a quantity of class-specific candidate exemplars are learned from sign language videos in each sign category by considering their discrimination. Then, every video of all signs is described as a set of similarities between the frames within it and the candidate exemplars. Instead of simply using a heuristic distance measure, the similarities are decided by a set of exemplar-based classifiers through multiple instance learning, in which a positive (or negative) video is treated as a positive (or negative) bag and those frames similar to the given exemplar in Euclidean space as instances. Finally, we formulate the selection of the most discriminative exemplars into a framework and simultaneously produce a sign video classifier to recognize signs. To evaluate our method, we collected an American Sign Language dataset which includes approximately 2000 phrases, each captured by the Kinect sensor with color, depth, and skeleton information. Experimental results on our dataset demonstrate the feasibility and effectiveness of the proposed approach for sign language recognition.
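
    A hedged sketch of the exemplar-coding idea: describe each sign video by its similarities to a bank of exemplar frames, then train a linear classifier on those similarity vectors. The Gaussian similarity and the linear SVM are simplifications for illustration; the paper instead learns the similarities with exemplar-based classifiers under multiple-instance learning.

```python
import numpy as np
from sklearn.svm import LinearSVC

def encode(video, exemplars):
    """video: (T, d) per-frame features; exemplars: (K, d) exemplar frames.
    Returns a K-vector: best match of any frame to each exemplar."""
    d2 = ((video[:, None, :] - exemplars[None, :, :]) ** 2).sum(-1)  # (T, K)
    return np.exp(-d2).max(axis=0)

# videos: list of (T, d) arrays; labels: sign classes (hypothetical data)
def train(videos, labels, exemplars):
    X = np.vstack([encode(v, exemplars) for v in videos])
    return LinearSVC().fit(X, labels)
```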
